Jurnal Informatika Global
Vol. 16 No. 2: August 2025

Income Classification Using the Random Forest Algorithm: A Case Study of the Adult Income Dataset

Amanda, Widia
Voutama, Apriade



Article Info

Publish Date
16 May 2025

Abstract

This study classifies individual income from demographic attributes using the Random Forest algorithm, a widely used ensemble learning method in machine learning. The dataset is Adult Income from the UCI Machine Learning Repository, which contains more than 32 thousand records with 15 attributes, including age, gender, education, education level, employment type, and marital status. The research process covers data preprocessing, model pipeline construction, training, and performance evaluation. Preprocessing consisted of removing irrelevant attributes, normalizing numerical data, and applying one-hot encoding to categorical data. The model was trained with default parameters and evaluated using accuracy, precision, recall, F1-score, and a confusion matrix. The model achieved an accuracy of 85.44% and performed better on the ≤50K income class than on the >50K class. The low recall for the >50K class indicates that the model is biased towards the majority class, which may be caused by class imbalance. The model could therefore be improved through hyperparameter tuning, handling of the class imbalance, or exploring other algorithms such as Gradient Boosting. This research is expected to serve as a basis for developing accurate and applicable data-driven prediction systems in economics, policy planning, and decision support systems that require analysis of individual income potential.
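The relationship between the reported metrics can be illustrated with a small sketch. It shows how accuracy, precision, recall, and F1-score follow from a two-class confusion matrix, and how overall accuracy can remain high while recall on the minority class stays low. The function names and the confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the paper's results or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassMetrics:
    precision: float
    recall: float
    f1: float


def per_class_metrics(tp: int, fp: int, fn: int) -> ClassMetrics:
    """Precision, recall and F1 for one class from its confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return ClassMetrics(precision, recall, f1)


def evaluate(confusion: list[list[int]]) -> tuple[float, ClassMetrics, ClassMetrics]:
    """Evaluate a 2x2 confusion matrix.

    confusion[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Class 0 stands for <=50K, class 1 for >50K.
    """
    (n00, n01), (n10, n11) = confusion
    total = n00 + n01 + n10 + n11
    accuracy = (n00 + n11) / total
    # For class 0: false positives are true >50K samples predicted as <=50K.
    low = per_class_metrics(tp=n00, fp=n10, fn=n01)
    # For class 1: false positives are true <=50K samples predicted as >50K.
    high = per_class_metrics(tp=n11, fp=n01, fn=n10)
    return accuracy, low, high


# Hypothetical counts for illustration only (NOT taken from the paper):
# 750 true <=50K samples and 250 true >50K samples, i.e. an imbalanced set.
HYPOTHETICAL_CONFUSION = [[700, 50], [100, 150]]

accuracy, low_income, high_income = evaluate(HYPOTHETICAL_CONFUSION)
print(f"accuracy        = {accuracy:.3f}")
print(f"<=50K recall    = {low_income.recall:.3f}")
print(f">50K  precision = {high_income.precision:.3f}")
print(f">50K  recall    = {high_income.recall:.3f}")
print(f">50K  F1        = {high_income.f1:.3f}")
```

With these illustrative counts the majority class dominates the accuracy figure, so per-class recall is the more informative indicator of how the minority >50K class is handled.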

Copyrights © 2025






Journal Info

Abbrev

IG

Publisher

Subject

Computer Science & IT

Description

Journal of Global Informatics publishes articles on architectures from various perspectives, covering both literature and fieldwork studies. The journal, serving as a forum for the study of informatics, information systems, computer systems, and informatics management, supports focused studies of particular ...