Bulletin of Computer Science Research
Vol. 6 No. 2 (2026): February 2026

Visual Analysis of Q-Learning and SARSA Agent Behavior on the Cliff Walking Problem with Explainable Reinforcement Learning

Atqiya, Firas
Sholahuddin, Muhammad Rizqi



Article Info

Publish Date
11 Feb 2026

Abstract

Reinforcement Learning (RL) has achieved remarkable success in complex sequential decision tasks. However, modern RL models often lack explainability, which creates a serious "black box" problem, especially in high-stakes domains. This study proposes a Pygame-based real-time visualization architecture for RL and demonstrates its benefits in a Cliff Walking case study using the Q-Learning and SARSA algorithms. Key contributions include: (1) a real-time visualization architecture that decouples training logic from graphics rendering and supports more than 60 FPS, (2) interpretive visualization techniques, including diverging heatmaps, dynamic policy arrows, and Ghost Policies, and (3) a comprehensive empirical study clarifying the distinct characteristics of the two algorithms. Experimental results show that Q-Learning selects an efficient but risky path, consistent with its optimistic off-policy nature, while SARSA converges on a safer path, reflecting an on-policy nature that accounts for the risk of exploratory moves. Quantitatively, Q-Learning reached the optimal 13-step path while accumulating 10,642 falls, whereas SARSA converged to a safe 23-step path that avoids the extreme penalties of the cliff zone, with a significantly higher collision count (232,844).
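The off-policy versus on-policy distinction described in the abstract comes down to which next-state value each algorithm bootstraps from. The following is a minimal sketch of that difference, not the authors' implementation: the 4x12 grid, the -1 and -100 rewards, and the values of `alpha`, `gamma`, and `epsilon` are textbook assumptions that the abstract does not state, and all function names are hypothetical. Rendering is omitted entirely.

```python
import random

# Assumed textbook layout: 4x12 grid, cliff along the bottom row.
N_ROWS, N_COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(state, action):
    """Return (next_state, reward, done); rewards are assumed, not from the paper."""
    r = min(max(state[0] + ACTIONS[action][0], 0), N_ROWS - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), N_COLS - 1)
    if r == 3 and 0 < c < 11:  # fell off the cliff: large penalty, back to start
        return START, -100, False
    return (r, c), -1, (r, c) == GOAL


def epsilon_greedy(Q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = Q[state]
    return values.index(max(values))


def q_learning_target(Q, reward, next_state, done, gamma):
    # Off-policy: bootstrap from the best next action, ignoring exploration.
    return reward + (0.0 if done else gamma * max(Q[next_state]))


def sarsa_target(Q, reward, next_state, next_action, done, gamma):
    # On-policy: bootstrap from the action the agent will actually take next,
    # so exploratory steps near the cliff lower the value of nearby states.
    return reward + (0.0 if done else gamma * Q[next_state][next_action])


def train(algorithm, episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular training loop; hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(N_ROWS) for c in range(N_COLS)}
    for _ in range(episodes):
        state = START
        action = epsilon_greedy(Q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action)
            next_action = epsilon_greedy(Q, next_state, epsilon, rng)
            if algorithm == "q_learning":
                target = q_learning_target(Q, reward, next_state, done, gamma)
            else:
                target = sarsa_target(Q, reward, next_state, next_action,
                                      done, gamma)
            Q[state][action] += alpha * (target - Q[state][action])
            state, action = next_state, next_action
    return Q
```

Calling `train("q_learning")` and `train("sarsa")` returns two Q-tables whose greedy policies can then be compared cell by cell, which is the kind of comparison the heatmaps and policy arrows in the study are meant to make visible.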

Copyrights © 2026






Journal Info

Abbrev

bulletincsr

Publisher

Subject

Computer Science & IT

Description

Bulletin of Computer Science Research covers the whole spectrum of Computer Science, which includes, but is not limited to: • Artificial Immune Systems, Ant Colonies, and Swarm Intelligence • Bayesian Networks and Probabilistic Reasoning • Biologically Inspired Intelligence • Brain-Computer ...