Recent rapid progress in artificial intelligence has been driven in large part by advances in reinforcement learning (RL), and RL methods have proven highly effective on many practical problems. Distributed ledger technologies are finding wide application in the Internet of Things (IoT), offering new approaches to problems of traditional IoT systems. Consensus is a fundamental component of distributed ledger technologies: it is responsible for keeping data consistent across nodes and for ensuring the security and accuracy of that data. This paper studies the optimal choice of a blockchain consensus protocol for IoT networks using a combination of multi-criteria decision making (MCDM) and RL methods. It discusses the potential of merging MCDM and RL for this selection task and proposes a combined framework for effective protocol selection and management.