Indonesian Applied Research Computing and Informatics
Vol. 1 No. 1: July (2025)

Improving Thesis Title Classification Accuracy Using Ensemble Classifier and Modified Chi-Square Feature Selection Method

Ritzkal (Universitas Ibn Khaldun)
Wahyu Tisno Atmojo (Sistem Informasi, Universitas Pradita)
Panji Novantara (Ilmu Komputer, Universitas Kuningan)
Sabir Rosidin (Doctoral Program of Information Systems)
Ahmad Dedi Jubaedi (Universitas Serang Raya)
Enggar Novianto (Universitas Sebelas Maret)



Article Info

Publish Date
17 Aug 2025

Abstract

Classifying academic documents, particularly thesis titles, is challenging because of high dimensionality, sparsity, and topic heterogeneity. Conventional feature selection techniques, such as the standard Chi-Square, often fail to capture discriminative features. This research aims to improve classification accuracy by proposing a Modified Chi-Square feature selection method that integrates term frequency and class distribution information. The selected features are then classified using decision-tree ensemble algorithms, including Random Forest, Gradient Boosting, and XGBoost. Experiments were conducted on a labeled dataset of thesis titles, with TF-IDF used for vector representation. Model performance was assessed with evaluation metrics including accuracy, precision, recall, F1-score, and AUC. The combination of Modified Chi-Square and XGBoost outperformed the other models, achieving the highest accuracy (93.8%) and an AUC of 0.94. These findings indicate that combining advanced feature selection with ensemble learning can substantially improve academic text classification, with implications for the development of intelligent digital repositories and recommendation systems.
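
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of standard Chi-Square term scoring on a toy corpus. It is not the authors' Modified Chi-Square, whose formula the abstract does not give. The function names (chi_square_scores, top_k_terms), the whitespace tokenization, and the choice to keep each term's maximum score across classes are all assumptions made for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Standard chi-square score per term (max over classes).

    Uses document-presence counts only; this is the textbook baseline,
    NOT the modified variant described in the abstract.
    """
    n = len(docs)
    # Assumption: naive lowercase whitespace tokenization.
    token_sets = [set(doc.lower().split()) for doc in docs]
    class_sizes = Counter(labels)
    term_df = Counter()        # number of documents containing the term
    term_class_df = Counter()  # number of documents of a class containing the term
    for tokens, label in zip(token_sets, labels):
        for term in tokens:
            term_df[term] += 1
            term_class_df[(term, label)] += 1

    scores = {}
    for term, df in term_df.items():
        best = 0.0
        for cls, size in class_sizes.items():
            a = term_class_df[(term, cls)]  # in class, contains term
            b = df - a                      # other classes, contains term
            c = size - a                    # in class, lacks term
            d = n - a - b - c               # other classes, lacks term
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - c * b) ** 2 / denom)
        scores[term] = best
    return scores


def top_k_terms(docs, labels, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    scores = chi_square_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Hypothetical toy titles and labels, not taken from the study's dataset.
    titles = [
        "network security analysis",
        "network intrusion detection",
        "sales information system",
        "inventory information system",
    ]
    classes = ["net", "net", "is", "is"]
    print(top_k_terms(titles, classes, 3))
```

In a pipeline like the one the abstract describes, the retained terms would define the TF-IDF feature space passed to the ensemble classifier; that step is omitted here.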

Copyright © 2025






Journal Info

Abbrev

iarci

Publisher

Subject

Computer Science & IT; Control & Systems Engineering

Description

Focus and Scope: Indonesian Applied Research Computing and Informatics is a scientific journal that publishes applied research in the fields of computing and informatics. The journal aims to serve as a platform for academics, researchers, and ...