Traditional mobile operating system (OS) schedulers struggle to maintain optimal performance amid increasingly complex user multitasking, often resulting in significant latency and energy waste. This study integrates a Proximal Policy Optimization (PPO)-based Reinforcement Learning (RL) framework into the OS scheduler for predictive and adaptive resource allocation. Methodologically, we formulate the scheduling problem as a Markov Decision Process (MDP) in which states (S) encompass CPU load, memory usage, and workload patterns; actions (A) involve dynamic core affinity, frequency scaling, and cgroup adjustments; and rewards (R) are computed as a weighted trade-off between performance maximization and energy conservation. A PPO actor-critic network is implemented and trained on a modified Android kernel (discount factor γ = 0.99) under simulated high-load scenarios, including simultaneous video conferencing, data downloading, and web browsing. Experimental results demonstrate that the proposed RL mechanism reduces average task latency by 18% and improves system responsiveness by 25%, while also cutting CPU power consumption by 12% relative to the baseline scheduler. These findings pioneer intelligent OS informatics, offering a robust foundation for sustainable multitasking across more than a billion Android devices through scalable, on-device fine-tuning.
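To make the formulation concrete, the sketch below shows how the MDP described above could be expressed as a Gymnasium-style environment trained with the stable-baselines3 PPO implementation. The environment class name, the reward weights (w_perf, w_energy), the combined discrete action encoding, and the placeholder transition dynamics are illustrative assumptions; only the state/action/reward structure and the discount factor γ = 0.99 are taken from the abstract.

```python
# Minimal sketch of the scheduling MDP, assuming a Gymnasium-style environment
# and the stable-baselines3 PPO actor-critic. Names, weights, and dynamics are
# hypothetical placeholders, not the paper's actual implementation.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO


class SchedulerEnv(gym.Env):
    """Toy scheduling environment: state = (CPU load, memory usage, workload feature)."""

    def __init__(self):
        super().__init__()
        # State S: normalized CPU load, memory usage, and a coarse workload-pattern feature.
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(3,), dtype=np.float32)
        # Action A: one discrete choice over (core-affinity mask, frequency step, cgroup tier).
        self.action_space = spaces.Discrete(4 * 3 * 2)  # hypothetical 24 combined settings
        self.w_perf, self.w_energy = 0.7, 0.3           # assumed reward weights

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.np_random.uniform(0.0, 1.0, size=3).astype(np.float32)
        return self.state, {}

    def step(self, action):
        # Placeholder dynamics: in the real system these signals would come from kernel telemetry.
        perf_gain = self.np_random.uniform(0.0, 1.0)
        energy_cost = self.np_random.uniform(0.0, 1.0)
        # Reward R: weighted trade-off between performance maximization and energy conservation.
        reward = self.w_perf * perf_gain - self.w_energy * energy_cost
        self.state = self.np_random.uniform(0.0, 1.0, size=3).astype(np.float32)
        return self.state, reward, False, False, {}


# PPO actor-critic with the discount factor reported above (gamma = 0.99).
model = PPO("MlpPolicy", SchedulerEnv(), gamma=0.99, verbose=0)
model.learn(total_timesteps=10_000)
```

In the deployed system, transitions would be driven by kernel telemetry rather than random placeholders, and the trained policy would be exported for the on-device fine-tuning mentioned above.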