The rapid growth of digital applications in population administration services has made sentiment analysis increasingly important for understanding how users perceive these services. This study focuses on Digital Population Identity (Identitas Kependudukan Digital, IKD), a digital identity application developed by the Indonesian government. It classifies user reviews of the IKD application into positive, neutral, and negative sentiments using the random forest algorithm. The dataset consisted of 28,134 user reviews collected from the Google Play Store, each with a username, review text, timestamp, and star rating. The research stages comprised data preprocessing, labeling, handling of missing values, and text processing (cleansing, tokenizing, stopword removal, and stemming). The data were split into 80% training and 20% testing sets. The best-performing model used the parameters max_depth=None, max_features=log2, min_samples_leaf=1, min_samples_split=2, and n_estimators=300, and achieved an average accuracy of 83.78%. To address class imbalance, the synthetic minority oversampling technique (SMOTE) was applied, which improved performance to an accuracy of 86.29%. Evaluation metrics before SMOTE were 83.85% accuracy, 80.40% precision, 83.85% recall, and an 81.73% F1 score. After SMOTE, precision increased to 81.22%, while accuracy and recall each decreased to 80.86%, with an F1 score of 81.03%. In addition, sentiment trend analysis using N-gram techniques (unigram, bigram, and trigram) was conducted to identify frequently mentioned topics and user concerns. These insights support the research objective of guiding application improvements that align with user needs and enhance the overall digital service experience.
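The N-gram trend analysis mentioned above can be illustrated with a minimal sketch that counts the most frequent unigrams, bigrams, and trigrams in a set of reviews. This is not the study's implementation: the helper names `ngrams` and `top_ngrams` are hypothetical, whitespace splitting stands in for the full cleansing, stopword removal, and stemming steps, and the sample reviews are invented for illustration rather than taken from the dataset.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return every contiguous window of n tokens from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(
    reviews: Iterable[str], n: int, k: int = 3
) -> list[tuple[tuple[str, ...], int]]:
    """Count n-grams across all reviews and return the k most frequent."""
    counts: Counter = Counter()
    for text in reviews:
        # Assumption: lowercasing plus whitespace splitting replaces the
        # paper's preprocessing pipeline, which is not reproduced here.
        tokens = text.lower().split()
        counts.update(ngrams(tokens, n))
    return counts.most_common(k)


# Hypothetical reviews, invented for this example (not from the IKD dataset).
sample_reviews = [
    "cannot login after update",
    "cannot login otp never arrives",
    "otp never arrives",
]

for n in (1, 2, 3):
    print(n, top_ngrams(sample_reviews, n))
```

Running the same counts separately on the reviews of each sentiment class would show which phrases dominate negative feedback and which dominate positive feedback.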
Copyrights © 2025