Konda, Manisha
Unknown Affiliation

Published: 1 Document
Articles

Found 1 Document

Performance Evaluation of Machine Learning Inference Workloads in Containerized Cloud Computing Environments
Konda, Manisha
The Eastasouth Journal of Information System and Computer Science (ESISCS), Vol. 3 No. 02 (2025)
Publisher: Eastasouth Institute

DOI: 10.58812/esiscs.v3i02.949

Abstract

Machine learning (ML) systems are increasingly deployed in cloud-native environments where scalability, portability, and resource efficiency are essential. Containerization with Docker and Kubernetes is a common choice for ML inference services that must scale, move across environments, and use resources efficiently. However, the performance of ML inference services in containerized cloud environments remains underexplored. This study aims to understand how various ML models perform in a containerized cloud environment and to identify the major factors affecting the performance of ML inference services. Several ML models are implemented using Python-based frameworks and deployed as microservices in Docker containers. The experiments send simultaneous prediction requests from multiple users to the deployed models. The study establishes baseline benchmarks that demonstrate the impact of containerization on inference speed and efficiency. The results provide practical guidance for building scalable AI systems and lay the foundation for future work, such as optimizing ML deployment pipelines, incorporating privacy-preserving inference techniques, and improving container orchestration for AI workloads.
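The abstract does not name the serving framework, model, or endpoint, so the following is a minimal sketch under assumptions: a toy scikit-learn classifier wrapped in a FastAPI microservice, one common way to expose a Python-based model as a containerized prediction endpoint. The /predict route, request schema, and model choice are illustrative, not taken from the paper.

```python
# app.py - minimal sketch of a Python inference microservice (illustrative).
# FastAPI, the /predict route, and the toy scikit-learn model are
# assumptions for the example; the paper does not name its frameworks.
from fastapi import FastAPI
from pydantic import BaseModel
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

app = FastAPI()

# Train a toy model at startup so the example is self-contained;
# a real deployment would load a pre-trained model artifact instead.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

class PredictRequest(BaseModel):
    features: list[float]  # one feature vector per request

@app.post("/predict")
def predict(req: PredictRequest):
    label = model.predict([req.features])[0]
    return {"prediction": int(label)}
```

A service like this would typically be started with uvicorn app:app --port 8000 and packaged on a standard Python base image; the paper does not describe its actual container configuration. The experiments are described as sending simultaneous prediction requests from multiple users, which can be approximated with a thread-pool load generator that records per-request latency and overall throughput. The URL, payload, and user/request counts below are likewise assumptions for illustration.

```python
# load_test.py - concurrent load generator for the inference endpoint
# (illustrative). The URL, payload shape, and user/request counts are
# assumptions; adjust them to the service actually under test.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://localhost:8000/predict"          # hypothetical endpoint
PAYLOAD = {"features": [5.1, 3.5, 1.4, 0.2]}   # example iris feature vector
USERS = 20                                     # simulated concurrent users
REQUESTS_PER_USER = 50

def one_user(_: int) -> list[float]:
    """Send a stream of prediction requests, recording per-request latency."""
    latencies = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        resp = requests.post(URL, json=PAYLOAD, timeout=10)
        resp.raise_for_status()
        latencies.append(time.perf_counter() - start)
    return latencies

if __name__ == "__main__":
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=USERS) as pool:
        per_user = list(pool.map(one_user, range(USERS)))
    wall = time.perf_counter() - wall_start

    all_lat = sorted(l for user in per_user for l in user)
    n = len(all_lat)
    print(f"requests: {n}, throughput: {n / wall:.1f} req/s")
    print(f"latency p50: {statistics.median(all_lat) * 1000:.1f} ms, "
          f"p95: {all_lat[int(0.95 * n) - 1] * 1000:.1f} ms")
```

Aggregating latency percentiles and throughput in this way is one straightforward basis for the kind of baseline benchmarks the abstract describes, though the paper's actual metrics and tooling may differ.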