This study addresses the problem that smoking addiction levels among students at Malikussaleh University are usually determined subjectively, and applies the C4.5 algorithm to classify them objectively. C4.5 is a decision-tree data mining method that selects splitting attributes using entropy and gain ratio. Data were gathered from 300 respondents, split into 240 training and 60 testing samples, with attributes including cigarettes per day, smoking duration, and the first cigarette after waking. Cigarettes per day yielded the highest gain ratio (0.2717) and therefore became the root of the decision tree. Across the 300 respondents, the classification identified 95 students with mild, 148 with moderate, 55 with severe, and 2 with very severe addiction. Evaluation with a confusion matrix showed 80% accuracy, 64.5% precision, 56.8% recall, and a 58.9% F1-score. The C4.5 algorithm produced an interpretable model expressed as IF–THEN rules. These findings can inform university health policies, prevention programs, and the early identification of students at high risk of addiction.
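The root-selection step described above, choosing the attribute with the highest gain ratio, can be illustrated with a minimal Python sketch. It is not the study's implementation: the toy records, the attribute names (`cigarettes_per_day`, `smoking_duration`), and the category labels are hypothetical, and every attribute is assumed to be already discretized into categories.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())

def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus the weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own value distribution;
    # dividing by it penalizes attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data (not taken from the study).
labels = ["mild", "mild", "severe", "severe"]
attributes = {
    "cigarettes_per_day": ["low", "low", "high", "high"],
    "smoking_duration": ["short", "long", "short", "long"],
}

ratios = {name: gain_ratio(vals, labels) for name, vals in attributes.items()}
root = max(ratios, key=ratios.get)  # attribute chosen as the tree's root
```

In this toy case `cigarettes_per_day` separates the classes perfectly while `smoking_duration` carries no information, so the former is selected as the root. The same comparison, applied to real training data, is what places the attribute with the highest gain ratio at the top of the tree.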
Copyrights © 2026