Al Ghifari, Nasy'an Taufiq
Unknown Affiliation

A Data-Driven Framework for Analyzing Popularity of Indian Film Adaptations Using K-Means and Random Forest
Al Ghifari, Nasy'an Taufiq; Deni Arif Wibowo
Emerging Information Science and Technology Vol. 6 No. 2 (2025)
Publisher : Universitas Muhammadiyah Yogyakarta

DOI: 10.18196/eist.v6i2.28415

Abstract

This study proposes a machine learning-based approach to predicting the box-office success or failure of Indian film adaptations. Using a dataset of more than 5,000 movies from the Kaggle platform, the study applies the K-Means clustering algorithm to group movies by three numerical features (vote_average, vote_count, and popularity) and a Random Forest classifier to predict popularity. The analysis gives balanced attention to two main categories: popular and unpopular films. The clustering results show that only a small percentage of film adaptations meet the popular criteria, while most fall into the unpopular category. The classification model achieves an accuracy of 82% and an F1-score of 0.79, with high performance in detecting films at risk of failing in the market. The study's main contribution lies in its critical exploration of both sides of film performance, which gives the film industry strategic insights for designing more targeted production and distribution and for avoiding investment mistakes in adaptation projects with low potential.
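The two-stage approach described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic stand-ins for the Kaggle dataset, the choice of two clusters, the feature standardization, and the rule that the cluster with the higher mean popularity is labeled "popular" are all assumptions, and the scores it produces say nothing about the figures reported in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): roughly 20% of films are "hits"
# with higher ratings, vote counts, and popularity scores.
rng = np.random.default_rng(0)
n = 500
is_hit = rng.random(n) < 0.2
vote_average = np.where(is_hit, rng.normal(7.5, 0.6, n), rng.normal(5.5, 0.9, n))
vote_count = np.where(is_hit, rng.normal(3000, 600, n), rng.normal(400, 150, n))
popularity = np.where(is_hit, rng.normal(60, 12, n), rng.normal(12, 5, n))
X = np.clip(np.column_stack([vote_average, vote_count, popularity]), 0, None)

# Stage 1: K-Means on standardized features, with two clusters (assumption).
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)

# Assumption: the cluster with the higher mean popularity is the "popular" one.
cluster_popularity = [X[kmeans.labels_ == k, 2].mean() for k in range(2)]
popular_cluster = int(np.argmax(cluster_popularity))
y = (kmeans.labels_ == popular_cluster).astype(int)  # 1 = popular, 0 = unpopular

# Stage 2: Random Forest trained to predict the cluster-derived label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"share labeled popular: {y.mean():.2f}")
print(f"accuracy: {accuracy:.2f}, F1: {f1:.2f}")
```

Because the labels here come directly from clusters built on the same three features, the classifier's task in this sketch is much easier than it would be on real data with independently defined popularity labels.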