Large-scale smart systems such as smart cities, smart grids, smart healthcare, and IoT-based infrastructures generate massive volumes of complex, heterogeneous data that require intelligent analysis and real-time decision-making. Machine learning (ML) is central to these capabilities, yet the diversity of ML paradigms and the fragmented nature of existing studies make it difficult to determine which approaches are most effective in large-scale environments. This review synthesizes and compares the major ML paradigms (supervised learning, unsupervised learning, reinforcement learning, deep learning, hybrid models, federated learning, and graph neural networks) across a wide range of smart system applications. The findings indicate that deep learning excels at processing high-dimensional and unstructured data, reinforcement learning performs best in autonomous and real-time control tasks, federated learning supports privacy-preserving analytics in distributed IoT ecosystems, and graph-based models perform best in systems with interconnected network structures. The review also identifies key technical challenges, such as data heterogeneity, computational complexity, communication bottlenecks, and privacy concerns, that limit the scalability and deployment of ML in smart environments. By providing a unified comparison of ML paradigms and highlighting emerging trends, performance characteristics, and implementation challenges, this study offers guidance for researchers, system designers, engineers, and policymakers. Finally, it outlines future research directions aimed at improving scalability, robustness, interpretability, and real-time capability in next-generation smart systems.