Ichsan, Mulia
Unknown Affiliation

Published: 1 document
Journal : Building of Informatics, Technology and Science

Market-Adaptive Stock Trading through B-WEMA Driven Proximal Policy Optimization
Ichsan, Mulia; Zahra, Amalia
Building of Informatics, Technology and Science (BITS) Vol 7 No 4 (2026): March 2026
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i4.9349

Abstract

Developing automated trading strategies that achieve stable returns while controlling risk remains a central challenge in quantitative finance. Many reinforcement learning-based trading systems focus on reward maximization but provide limited justification for the choice of forecasting indicators and often lack comprehensive benchmarking against alternative strategies and risk measures. This study addresses the problem of integrating a statistically grounded price-smoothing technique with a policy optimization scheme to improve sequential trading decisions under market uncertainty. We propose a hybrid model that combines Brown's Weighted Exponential Moving Average (B-WEMA) as a trend-sensitive forecasting indicator with a Deep Reinforcement Learning agent trained using Proximal Policy Optimization (PPO). The role of B-WEMA is to provide structured price signals that reduce noise sensitivity, while PPO determines buy and sell actions through policy updates constrained for stable learning. The performance of the proposed model is evaluated over a 10-month trading horizon and compared with a buy-and-hold benchmark and an alternative reinforcement learning method, Advantage Actor-Critic (A2C), both implemented under the same experimental conditions. Empirical results show that the proposed B-WEMA-PPO framework achieved a cumulative return of 23.43% over the test period, outperforming both the benchmark and the A2C-based agent. In addition to cumulative return, risk-adjusted performance metrics, namely volatility and maximum drawdown, are reported to provide a balanced assessment of profitability and risk exposure. These findings suggest that incorporating structured exponential smoothing into policy optimization may enhance the stability and effectiveness of reinforcement learning-based trading strategies.
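To make the trend-forecasting component concrete, the sketch below shows Brown's double exponential smoothing, the smoothing core that B-WEMA builds on. The `alpha` value and the one-step-ahead horizon are illustrative assumptions, not parameters taken from the paper, and the paper's exact B-WEMA weighting scheme may differ from this plain Brown's formulation.

```python
def brown_forecast(prices, alpha=0.5):
    """One-step-ahead price forecast via Brown's double exponential
    smoothing (illustrative sketch; B-WEMA's exact weighting is not
    reproduced here). `alpha` in (0, 1) is an assumed smoothing factor."""
    # First and second smoothing passes, both initialized at the first price.
    s1 = s2 = prices[0]
    for p in prices[1:]:
        s1 = alpha * p + (1 - alpha) * s1      # single smoothing
        s2 = alpha * s1 + (1 - alpha) * s2     # double smoothing
    # Level and trend estimates from Brown's method.
    level = 2 * s1 - s2
    trend = alpha / (1 - alpha) * (s1 - s2)
    # Forecast one period ahead (m = 1).
    return level + trend
```

In a setup like the one described, this forecast would be fed to the PPO agent as a denoised trend signal alongside raw prices, so that buy/sell decisions respond to the smoothed direction of the market rather than tick-level noise.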