Putra, Rafi Pratama
Unknown Affiliation



Classification and Prediction of Video Game Sales Levels Using the Naive Bayes Algorithm Based on Platform, Genre, and Regional Market Data
Putra, Rafi Pratama; Ramadani, Nevita Cahaya; Nanjar, Agi
International Journal of Informatics and Information Systems Vol 8, No 1: January 2025
Publisher : International Journal of Informatics and Information Systems

DOI: 10.47738/ijiis.v8i1.242

Abstract

The exponential expansion of the video game industry has resulted in a vast accumulation of market data that can be leveraged to analyze and predict sales performance. This study aims to construct a classification model for video game sales levels by applying the Naïve Bayes algorithm, recognized for its simplicity, efficiency, and strong baseline performance in supervised learning tasks. The research employs a public dataset containing over 13,000 video game entries, encompassing key attributes such as genre, platform, publisher, release year, user and critic ratings, and global sales figures. The target variable, global sales, was discretized into three categories, Low (<1 million units), Medium (1–5 million units), and High (>5 million units), to represent distinct tiers of commercial success. Prior to modeling, the dataset underwent a preprocessing pipeline involving duplicate removal, handling of missing data, normalization of numerical attributes, and feature selection. The Multinomial Naïve Bayes classifier was then implemented and assessed using standard evaluation metrics, including accuracy, precision, recall, and F1-score. Experimental results revealed an accuracy of 71.82% and an F1-score of 70.03%, signifying strong predictive capability for a probabilistic model of this simplicity. The classifier effectively identified the low and medium sales categories, though it slightly underperformed on the high sales group due to class imbalance within the dataset. Further analysis of conditional probabilities indicated that game genre, platform popularity (especially PS2 and Wii), and critic scores were the most influential determinants of higher sales outcomes. These findings affirm that the Naïve Bayes algorithm provides a reliable and interpretable foundation for video game sales prediction, serving as a benchmark model in market analytics.
Future studies are encouraged to address data imbalance through oversampling or synthetic data generation, incorporate contextual variables such as marketing strategies and release schedules, and explore ensemble or deep learning approaches to enhance predictive accuracy and robustness.
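
The discretization and classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a small categorical Naive Bayes with add-one smoothing written in plain Python, standing in for the Multinomial variant the abstract names, and only two features (platform and genre). The names `sales_level` and `CategoricalNaiveBayes`, the toy records, and the choice to count exactly 5 million units as Medium are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def sales_level(global_sales_millions):
    """Map global sales (millions of units) to the three tiers in the abstract."""
    if global_sales_millions < 1.0:
        return "Low"
    if global_sales_millions <= 5.0:  # assumption: exactly 5 million is Medium
        return "Medium"
    return "High"


class CategoricalNaiveBayes:
    """Naive Bayes over categorical features with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[(feature index, class)] -> Counter of observed values
        self.value_counts = defaultdict(Counter)
        # vocab[j] -> set of distinct values seen for feature j
        self.vocab = [set() for _ in range(len(rows[0]))]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.value_counts[(j, label)][value] += 1
                self.vocab[j].add(value)
        return self

    def log_posteriors(self, row):
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior P(class)
            for j, value in enumerate(row):
                seen = self.value_counts[(j, label)][value]
                # Smoothing keeps an unseen value from zeroing the product.
                score += math.log((seen + 1) / (count + len(self.vocab[j])))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


# Toy records invented for illustration: (platform, genre, sales in millions).
toy = [
    ("Wii", "Sports", 8.2),
    ("PS2", "Action", 6.1),
    ("PS2", "Racing", 2.4),
    ("Wii", "Platform", 3.0),
    ("PC", "Strategy", 0.3),
    ("PC", "Puzzle", 0.1),
    ("DS", "Puzzle", 0.6),
    ("PS2", "Action", 5.5),
]
rows = [(platform, genre) for platform, genre, _ in toy]
labels = [sales_level(sales) for _, _, sales in toy]
model = CategoricalNaiveBayes().fit(rows, labels)
```

Calling `model.predict(("PS2", "Action"))` on this toy data returns the tier with the highest log posterior, and `model.log_posteriors(...)` exposes the per-class scores. The smoothed per-feature terms inside `log_posteriors` are the kind of conditional probabilities the abstract refers to when it identifies influential attributes.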