Performance Analysis of the C4.5 Algorithm on the Imbalanced Titanic Dataset Using Gain Ratio: A Research Study
Kuncoro Singgih Prasojo; Hasbi Firmansyah; Wahyu Asriyani; Ali Sofyan
Jurnal Pengabdian Masyarakat dan Riset Pendidikan Vol. 4 No. 2 (2025): Jurnal Pengabdian Masyarakat dan Riset Pendidikan Volume 4 Nomor 2 (October 202
Publisher: Lembaga Penelitian dan Pengabdian Masyarakat

DOI: 10.31004/jerkin.v4i2.4402

Abstract

This study aims to analyze the performance of the C4.5 algorithm in classifying passenger survival status using the Titanic dataset, which exhibits an imbalanced class distribution. The research employed a quantitative approach consisting of data preprocessing, manual calculation of entropy, information gain, split information, and gain ratio using Microsoft Excel, followed by model implementation using RapidMiner. The dataset contains 800 passenger records with the survived attribute defined as the class label. Manual calculation results indicate that the Gender attribute has the highest information gain value of 0.955, making it the root node of the decision tree, while other attributes such as Pclass, Age Group, and Fare Group contribute very limited information. The experimental results show that the C4.5 model achieves an accuracy of 62.50%; however, all test instances are predicted as non-survived, resulting in 0% precision and recall for the survived class. In addition, the generated decision tree structure is very shallow with no significant branching. These findings demonstrate that class imbalance in the Titanic dataset strongly affects the performance of the C4.5 algorithm, indicating the need for imbalanced data handling techniques to improve classification results.
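The quantities the abstract lists (entropy, information gain, split information, and gain ratio) can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions, not the authors' Excel workbook or RapidMiner process. The function names and the toy rows are hypothetical and are not taken from the study's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Return (information gain, split information, gain ratio) for one attribute."""
    total = len(rows)
    # Group the class labels by the value each row takes on the attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])

    # Entropy of the class label before splitting.
    base = entropy([row[target] for row in rows])
    # Weighted entropy of the class label after splitting on the attribute.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = base - remainder

    # Split information: entropy of the partition sizes themselves.
    split_info = -sum(
        len(p) / total * log2(len(p) / total) for p in partitions.values()
    )
    # An attribute with a single value has zero split information; report 0.
    ratio = info_gain / split_info if split_info > 0 else 0.0
    return info_gain, split_info, ratio


# Toy rows invented for illustration only (not the study's 800 records).
toy_rows = [
    {"Gender": "female", "Survived": 1},
    {"Gender": "female", "Survived": 1},
    {"Gender": "male", "Survived": 0},
    {"Gender": "male", "Survived": 0},
]
ig, si, gr = gain_ratio(toy_rows, "Gender", "Survived")
```

In this sketch the attribute with the largest gain ratio would be chosen as the split. An attribute that takes only one value is given a ratio of 0 to avoid dividing by zero, which is a design choice of this example rather than something stated in the abstract.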