Articles

Found 3 Documents

Enhancing Student Sentiment Classification on AI in Education using SMOTE and Naive Bayes
Saekhu, Ahmad; Berlilana, Berlilana; Saputra, Dhanar Intan Surya
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6469

Abstract

This study investigates student sentiment regarding the use of artificial intelligence (AI) in education, employing the Naive Bayes model enhanced with the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance issues. Class imbalance, a common challenge in sentiment classification, often skews model performance toward majority classes, reducing its effectiveness in recognizing minority classes. To mitigate this, SMOTE was applied to generate synthetic samples for minority classes, achieving a more balanced class distribution. The results demonstrate that incorporating SMOTE improved the Naive Bayes model's accuracy from 65% to 78.87% and significantly increased sensitivity to minority classes. Evaluation metrics, including precision, recall, and F1-score, showed satisfactory performance for certain classes, notably classes 2 and 4. However, challenges remained with class 1, where classification accuracy was lower, indicating inherent complexities in its data patterns. While SMOTE successfully enhanced model performance, it also introduced a potential risk of overfitting, particularly with limited original datasets, highlighting the importance of data quality and size. This research offers actionable insights for educators, developers, and policymakers, emphasizing the need for AI systems in education that are adaptive and responsive to student perceptions. The study concludes that Naive Bayes combined with SMOTE is an effective approach for sentiment analysis in imbalanced datasets. Future research should explore more sophisticated models and larger datasets to achieve more comprehensive and representative outcomes.
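A minimal sketch of the SMOTE-plus-Naive-Bayes workflow described in this abstract, assuming scikit-learn and imbalanced-learn with a TF-IDF text representation; the toy responses, labels, and parameter choices below are illustrative assumptions, not the study's actual data or configuration.

```python
# Illustrative sketch of SMOTE + Multinomial Naive Bayes for imbalanced text sentiment,
# as described in the abstract. Assumes scikit-learn and imbalanced-learn;
# the toy responses and labels below are hypothetical, not the study's dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report
from imblearn.over_sampling import SMOTE

# Hypothetical imbalanced data: many positive responses, few negative ones.
texts = (["AI tutoring makes the material easier to follow"] * 40
         + ["the AI tool gives confusing or wrong answers"] * 8)
labels = [1] * 40 + [0] * 8  # 1 = positive, 0 = negative

# Vectorize first: SMOTE interpolates in feature space, not on raw text.
X = TfidfVectorizer().fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, stratify=labels, random_state=42)

# Oversample only the training split so the test set keeps its natural distribution.
X_bal, y_bal = SMOTE(k_neighbors=3, random_state=42).fit_resample(X_train, y_train)

model = MultinomialNB().fit(X_bal, y_bal)
print(classification_report(y_test, model.predict(X_test), zero_division=0))
```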
Analysis of Demographic and Consumer Behavior Factors on Satisfaction with AI Technology Usage in Digital Retail Using the Random Forest Algorithm
Priyanto, Eko; Saekhu, Ahmad; Prasetyo, Priyo Agung
International Journal for Applied Information Management Vol. 4 No. 4 (2024): Regular Issue: December 2024
Publisher : Bright Institute

DOI: 10.47738/ijaim.v4i4.91

Abstract

The rapid integration of artificial intelligence (AI) into digital retail has reshaped consumer interactions, enabling personalized services and operational enhancements. This study investigates the demographic and behavioral factors influencing consumer satisfaction with AI technologies in digital retail, using the Random Forest classification algorithm for predictive modeling. After comprehensive preprocessing and hyperparameter tuning through grid search cross-validation, the Random Forest model achieved an overall accuracy of 83%. While the model showed strong performance in predicting satisfied consumers, yielding a precision of 0.84, a recall of 0.97, and an F1-score of 0.90, it performed poorly in identifying dissatisfied users, with a recall of only 0.27 and an F1-score of 0.39, highlighting a class imbalance issue. Feature importance analysis revealed that experiential factors, particularly enhanced AI experience and preference for online services, significantly influenced satisfaction levels, whereas demographic variables such as age and gender had limited predictive value. These findings emphasize the need for digital retailers to focus on user-centric design and service personalization, rather than demographic segmentation alone, to enhance customer satisfaction and loyalty. Furthermore, the study contributes methodologically by demonstrating the effectiveness of Random Forest in handling complex consumer datasets, and theoretically by validating the Technology Acceptance Model (TAM) and Customer Satisfaction Theory in the context of AI adoption. Despite limitations related to class imbalance and sector-specific data, this research offers actionable insights for retailers, marketers, and system developers aiming to improve AI-driven service quality and consumer engagement. Future studies are encouraged to address these limitations by including emotional and contextual variables and by extending the analysis to other industries for broader applicability.
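A hedged sketch of the grid-search-tuned Random Forest workflow the abstract describes, using scikit-learn's GridSearchCV on a synthetic stand-in dataset; the feature names, parameter grid, and data-generating assumptions are illustrative, not the study's actual setup.

```python
# Illustrative sketch of Random Forest with grid search cross-validation,
# mirroring the workflow described in the abstract; data and grid are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 500
# Hypothetical demographic and behavioral features.
age = rng.integers(18, 65, n)
gender = rng.integers(0, 2, n)          # encoded
ai_experience = rng.uniform(1, 5, n)    # "enhanced AI experience" score
online_pref = rng.uniform(1, 5, n)      # preference for online services
X = np.column_stack([age, gender, ai_experience, online_pref])

# Satisfaction driven mainly by the experiential features, echoing the reported finding.
y = ((ai_experience + online_pref + rng.normal(0, 1, n)) > 6).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    cv=5, scoring="f1")
grid.fit(X_train, y_train)

print("best params:", grid.best_params_)
print(classification_report(y_test, grid.best_estimator_.predict(X_test)))

# Feature importances show which inputs drive the prediction (experiential vs. demographic).
for name, imp in zip(["age", "gender", "ai_experience", "online_pref"],
                     grid.best_estimator_.feature_importances_):
    print(f"{name}: {imp:.3f}")
```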
Comparative Analysis of Data Balancing Techniques for Machine Learning Classification on Imbalanced Student Perception Datasets
Saekhu, Ahmad; Berlilana, Berlilana; Saputra, Dhanar Intan Surya
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 2 (2025): JUTIF Volume 6, Number 2, April 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.2.4286

Abstract

Class imbalance is a common challenge in machine learning classification tasks, often leading to biased predictions toward the majority class. This study evaluates the effectiveness of various machine learning algorithms combined with advanced data balancing techniques in addressing class imbalance in a dataset collected from Class XI students of SMK Ma'arif 1 Kebumen. The dataset, comprising 300 instances and 36 features, includes textual attributes, demographic information, and sentiment labels categorized as Positive, Neutral, and Negative. Preprocessing steps included text cleaning, target encoding, handling missing data, and vectorization. Four sampling techniques (SMOTE, SMOTE + Tomek Links, ADASYN, and SMOTE + ENN) were applied to the training data to create balanced datasets. Nine machine learning algorithms, including CatBoost, Extra Trees, Random Forest, and Gradient Boosting, were evaluated using four train-test splits (60:40, 70:30, 80:20, and 90:10). Model performance was assessed using metrics such as accuracy, precision, recall, F1-score, and AUC-ROC. The results demonstrate that SMOTE + Tomek Links is the most effective balancing technique, achieving the highest accuracy when paired with ensemble algorithms such as Extra Trees and Random Forest. CatBoost also delivered competitive performance, showcasing its adaptability in imbalanced scenarios. The 90:10 train-test split consistently yielded the best results, emphasizing the importance of adequate training data for model generalization. This study highlights the critical role of data balancing techniques and robust algorithms in optimizing classification performance for imbalanced datasets and provides a framework for future research in similar contexts.
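A rough sketch of the balancing-technique comparison described above, assuming imbalanced-learn's implementations of the four samplers and only two of the nine classifiers (Extra Trees and Random Forest) from scikit-learn; the synthetic 300-by-36 dataset is a stand-in, not the students' survey data.

```python
# Illustrative sketch of comparing data balancing techniques with ensemble classifiers,
# in the spirit of the abstract; the synthetic data and the subset of models are assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE, ADASYN
from imblearn.combine import SMOTETomek, SMOTEENN

# Hypothetical imbalanced three-class dataset standing in for the 300 x 36 survey data.
X, y = make_classification(n_samples=300, n_features=36, n_informative=10,
                           n_classes=3, weights=[0.6, 0.3, 0.1],
                           class_sep=0.8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=42)  # 90:10 split, as in the study

samplers = {"SMOTE": SMOTE(random_state=42),
            "SMOTE+Tomek": SMOTETomek(random_state=42),
            "ADASYN": ADASYN(random_state=42),
            "SMOTE+ENN": SMOTEENN(random_state=42)}
models = {"ExtraTrees": ExtraTreesClassifier(random_state=42),
          "RandomForest": RandomForestClassifier(random_state=42)}

for s_name, sampler in samplers.items():
    # Balance only the training split; the test split keeps its natural distribution.
    X_bal, y_bal = sampler.fit_resample(X_train, y_train)
    for m_name, model in models.items():
        pred = model.fit(X_bal, y_bal).predict(X_test)
        print(f"{s_name:>11} + {m_name:<12} macro-F1 = "
              f"{f1_score(y_test, pred, average='macro'):.3f}")
```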