Convolutional neural networks (CNNs) are applied to a wide range of complex real-world tasks, providing effective solutions with high accuracy. Depending on the application's complexity, a CNN demands substantial processing units and memory for effective implementation. Offloading this computation to hardware accelerators speeds up data processing and helps achieve real-time performance. Recent studies have focused on approximate computing methodologies to address this problem. This survey analyzes recent methods for implementing approximate-computing-based processing elements and their use in CNNs. First, it focuses on the multiply-and-accumulate (MAC) unit, which acts as the fundamental processing element of the CNN layers, and on its various approximation methods. Second, it examines CNN hardware-acceleration architectures, the methods used to design their layers, and their wide range of applications. Several recent design methods applied across these application domains are also analyzed. This detailed analysis provides an outlook on effective approximation blocks and CNN architectures for use in various designs, along with areas in which future improvements can be made.
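As a concrete illustration of the kind of approximate processing element the survey covers, the sketch below shows one common approximation technique, operand truncation, applied inside a MAC loop. This is a hypothetical minimal example written for clarity, not a design taken from any specific surveyed work; the function names and the choice of truncating the low-order bits of both operands are our own assumptions.

```python
def approx_mult(a, b, trunc_bits=4):
    """Truncation-based approximate multiplier: zero out the low-order
    `trunc_bits` bits of each operand before multiplying. This shrinks
    the partial-product array in hardware at the cost of a small error."""
    mask = ~((1 << trunc_bits) - 1)
    return (a & mask) * (b & mask)

def approx_mac(weights, activations, trunc_bits=4):
    """Approximate multiply-and-accumulate over a weight/activation pair
    list, as a CNN layer's processing element would compute a dot product."""
    acc = 0
    for w, x in zip(weights, activations):
        acc += approx_mult(w, x, trunc_bits)
    return acc

# With trunc_bits=0 the unit is exact; larger values trade accuracy
# for a cheaper multiplier.
exact = approx_mac([200, 100], [150, 50], trunc_bits=0)   # 200*150 + 100*50
approx = approx_mac([200, 100], [150, 50], trunc_bits=2)
```

In a hardware realization, masking the low bits corresponds to simply not generating those partial-product rows, which is why truncation is a popular low-cost approximation for MAC units.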
Copyright © 2025