Illegal drug abuse remains a serious problem, and identifying users calls for highly accurate analytical approaches. When building classification models, class imbalance is often a major obstacle because it biases the model toward the majority class. This study aims to improve accuracy and balance classification performance in identifying illegal drug users by combining the Synthetic Minority Over-sampling Technique (SMOTE) with the Random Forest algorithm. The study used the Drug Consumption dataset from the UCI Machine Learning Repository, which contains 1,885 records. The target was converted into a binary classification problem: heroin users versus non-users. The data were then randomly split into 75% training data and 25% testing data. To address class imbalance, SMOTE was applied to the training data to generate synthetic samples of the minority class. A Random Forest classifier was then trained on the balanced data. Model performance was evaluated by comparing classification results before and after applying SMOTE, using accuracy, precision, recall, and F1-score. The results show that Random Forest with SMOTE slightly outperforms the model without SMOTE: accuracy increased from 0.90 to 0.91, precision from 0.89 to 0.90, recall from 0.90 to 0.91, and F1-score from 0.89 to 0.90. This improvement suggests that the model detects the minority class better, reducing bias toward the majority class. Based on these results, the combination of SMOTE and Random Forest is effective in handling imbalanced data and improving overall performance in classifying illegal drug users.
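The oversampling step described above can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not specify. It assumes numeric feature vectors and a minority label of 1, and the function names `smote` and `balance` are hypothetical. The sketch shows the core idea of SMOTE: each synthetic sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours, and only the training split is resampled. In practice a library implementation such as `SMOTE` from imbalanced-learn would likely be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative only: each new point lies between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X_train, y_train, minority_label=1, k=5, seed=0):
    """Oversample the minority class of the training split to parity."""
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    n_majority = len(X_train) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X_train + new_points, y_train + [minority_label] * len(new_points)


# Toy training split: six majority samples (0) and two minority samples (1).
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
           [0.1, 0.0], [0.2, 0.2], [1.0, 1.0], [2.0, 3.0]]
y_train = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance(X_train, y_train)
print(y_bal.count(0), y_bal.count(1))  # 6 6
```

The test split is left untouched in this sketch, so evaluation metrics would still be computed on the original class distribution.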
Copyrights © 2026