Journal of Applied Data Sciences
Vol 7, No 2: May 2026

An Integrated Text Analytics and Ensemble Machine Learning Framework for Fake Review Detection in Online Marketplaces

Eka Praja Wiyata Mandala (Universitas Putra Indonesia YPTK Padang)
Sarjon Defit (Universitas Putra Indonesia YPTK Padang)
Gunadi Widi Nurcahyo (Universitas Putra Indonesia YPTK Padang)



Article Info

Publish Date
05 Apr 2026

Abstract

The increasing prevalence of fake reviews on e-commerce platforms undermines consumer trust and affects purchasing decisions, particularly for local products with limited visibility, such as those from West Sumatra, Indonesia. This study proposes a hybrid approach that combines text analytics and machine learning to improve fake review detection. Four classification models (Naive Bayes, Random Forest, Logistic Regression, and K-Nearest Neighbor) were tested on a dataset of 1,500 labeled product reviews. Among these, Random Forest achieved the highest baseline accuracy, 0.8533. To improve on it, we developed an enhanced algorithm called EKAHypeRFor (Enhanced Knowledge Augmentation of Hyperparameter Random Forest), which combines simple feature engineering with careful hyperparameter tuning using RandomizedSearchCV. The enhanced model reached an accuracy of 0.8778, a gain of 2.45 percentage points over the baseline. The framework also includes a real-time review sorting tool, making it easy to use on online shopping sites. Evaluation with a confusion matrix and feature importance analysis showed that the model performs well and is easy to interpret. The method is simple, fast, and accurate, and helps make online product reviews more trustworthy for small and medium businesses in the region.

Copyright © 2026






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...