The rapid growth of energy-efficient artificial intelligence (AI) systems has attracted substantial interest in neuromorphic computing, which emulates the organization and dynamics of biological neural systems to support low-power, event-driven information processing. In this work, we propose a neuromorphic hardware architecture for energy-efficient AI computing that combines spiking neural networks with monolithic vertical integration to improve performance on a variety of vision tasks. The architecture is evaluated on three benchmark datasets: MNIST, N-MNIST, and DVS128, representing static, spiking, and dynamic input modalities, respectively. Performance metrics, including energy efficiency, inference latency, throughput, classification accuracy, and a unified Energy Efficiency Index (EEI), are compared to characterize the generalization of the system across different processing environments. Experimental results show that the proposed chip delivers sharply lower energy per inference with competitive accuracy relative to conventional AI accelerators, including GPU-based and microcontroller platforms. Additionally, the hardware achieves sub-2 ms inference latency and high throughput, indicating suitability for real-time, embedded AI applications. Comparative analysis with existing neuromorphic platforms highlights the advantage of architectural co-design in balancing energy and performance constraints. While the absence of on-chip learning remains a limitation, the system offers a scalable foundation for edge AI deployments that require efficient, continuous inference. Future directions include integrating adaptive learning mechanisms and extending the evaluation to broader AI domains.