Emerging Information Science and Technology
Vol. 6 No. 2 (2025)

A Data-Driven Framework for Analyzing Popularity of Indian Film Adaptations Using K-Means and Random Forest

Al Ghifari, Nasy'an Taufiq (Unknown)
Deni Arif Wibowo (Unknown)



Article Info

Publish Date
29 Nov 2025

Abstract

This study proposes a machine learning-based approach to predicting the success or failure of Indian film adaptations at the box office. Using a dataset of more than 5,000 movies from the Kaggle platform, the study applies the K-Means Clustering algorithm to group movies by three numerical features (vote_average, vote_count, and popularity) and a Random Forest Classifier to predict popularity. The analysis gives balanced attention to two main categories: popular and unpopular films. The clustering results show that only a small percentage of film adaptations meet the popular criteria, while most fall into the unpopular category. The classification model achieves an accuracy of 82% and an F1-score of 0.79, and performs particularly well at detecting films at risk of failing in the market. The study's main contribution lies in its critical exploration of both sides of film performance, which gives the film industry strategic insights for designing more targeted production and distribution and for avoiding investment mistakes in adaptation projects with low potential.
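The clustering and evaluation steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. Several things here are assumptions rather than details from the paper: the six film rows are made-up values, k = 2 is chosen to match the popular/unpopular split, and the rule "the cluster with the higher mean popularity is the popular one" is a hypothetical way to turn clusters into labels. The Random Forest step itself is not implemented; in practice a library classifier such as scikit-learn's RandomForestClassifier could be trained on the resulting labels, and the metric function shows how accuracy and F1 would then be computed.

```python
import random
from statistics import mean, pstdev

# Hypothetical rows: (vote_average, vote_count, popularity). Not from the paper's dataset.
films = [
    (7.8, 4200, 55.0),
    (8.1, 5100, 62.5),
    (5.2, 120, 4.1),
    (4.9, 80, 3.3),
    (6.0, 300, 6.8),
    (5.5, 150, 5.0),
]

def standardize(rows):
    """Z-score each column so vote_count does not dominate the distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k=2, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        assign = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [mean(c) for c in zip(*members)]
    return assign

def label_popular(rows, assign, pop_col=2):
    """Assumed rule: the cluster with the higher mean popularity is 'popular' (1)."""
    means = {}
    for j in set(assign):
        means[j] = mean(r[pop_col] for r, a in zip(rows, assign) if a == j)
    best = max(means, key=means.get)
    return [1 if a == best else 0 for a in assign]

def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1 for the positive class, computed from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    acc = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    return acc, (2 * tp / denom if denom else 0.0)

clusters = kmeans(standardize(films), k=2)
labels = label_popular(films, clusters)
print(labels)
```

The labels produced this way would serve as the target for a classifier. Given a list of that classifier's predictions, accuracy_and_f1(labels, predictions) returns the two figures of the kind reported in the abstract. Standardizing before clustering is a design choice of this sketch, made because vote_count spans a far larger numeric range than the other two features.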

Copyrights © 2025






Journal Info

Abbrev

eist

Publisher

Subject

Computer Science & IT

Description

Emerging Information Science and Technology is a double-blind peer-reviewed journal that publishes high-quality, state-of-the-art research articles in the area of information science and technology. The articles in this journal cover from theoretical, technical, empirical, and practical ...