Journal : Journal of Applied Data Sciences

Leveraging K-Nearest Neighbors with SMOTE and Boosting Techniques for Data Imbalance and Accuracy Improvement
Lubis, Adyanata; Irawan, Yuda; Junadhi, Junadhi; Defit, Sarjon
Journal of Applied Data Sciences Vol 5, No 4: December 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.343

Abstract

This research addresses the low accuracy of sentiment analysis of social media posts about Israeli products, where a baseline K-NN classifier initially achieved only 64%. Given the ongoing Israeli-Palestinian conflict, which has garnered widespread international attention and strong opinions, understanding public sentiment towards Israeli products is crucial. To improve accuracy, the study employs SMOTE to handle data imbalance and combines K-NN with the boosting algorithms AdaBoost and XGBoost, which were selected for their effectiveness on imbalanced and complex datasets. AdaBoost was chosen for its ability to enhance model accuracy by focusing on misclassified instances, while XGBoost was selected for its efficiency and robustness in handling large datasets with many features. The research process includes data pre-processing (cleaning, normalization, tokenization, stopword removal, and stemming), labeling using a lexicon-based approach, and feature extraction with CountVectorizer and TF-IDF. SMOTE was applied to oversample the minority class until it matched the number of instances in the majority class, ensuring balanced representation before model training. The dataset of 1,145 records was divided into training and testing data at a ratio of 70:30. Results demonstrate that SMOTE increased K-NN accuracy to 77%. Interestingly, combining K-NN with AdaBoost after SMOTE achieved 72% accuracy, which, although lower than the 77% achieved with SMOTE alone, was higher than the 68% achieved by the same combination without SMOTE. This discrepancy can be attributed to the added complexity introduced by AdaBoost, which may not synergize with SMOTE as effectively as XGBoost does, particularly on this dataset. In contrast, K-NN with XGBoost after SMOTE reached the highest accuracy of 88%, demonstrating a more effective combination. Boosting without SMOTE resulted in lower accuracies: 68% for K-NN+AdaBoost and 64% for K-NN+XGBoost. Overall, the combination of K-NN with SMOTE and XGBoost significantly improves model accuracy and reliability for sentiment analysis on social media.
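The oversampling step described in the abstract can be illustrated with a minimal, standard-library sketch of the SMOTE idea: each synthetic minority point is a random interpolation between a minority sample and one of its nearest same-class neighbours. The function name `smote_oversample`, the toy feature vectors, and the choice `k=5` are assumptions made for illustration; this is not the authors' code, and a real pipeline would more likely rely on an established library implementation. In common practice oversampling is applied to the training split only, so that the test data stays untouched.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Oversample each minority class up to the majority-class count.

    Simplified SMOTE-style sketch (hypothetical helper, not a library API).
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)

    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            continue  # already balanced, or no neighbour to interpolate with
        for _ in range(target - count):
            i = rng.randrange(len(members))
            base = members[i]
            # Same-class neighbours of `base`, nearest first, excluding itself.
            others = [m for j, m in enumerate(members) if j != i]
            others.sort(key=lambda m: math.dist(base, m))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic = [b + gap * (n - b) for b, n in zip(base, neighbour)]
            X_out.append(synthetic)
            y_out.append(label)
    return X_out, y_out


# Toy, made-up feature vectors: six "neg" samples and two "pos" samples.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.1, 0.1],
     [2.0, 2.0], [2.2, 2.1]]
y = ["neg"] * 6 + ["pos"] * 2

X_bal, y_bal = smote_oversample(X, y)
print(Counter(y_bal))
```

After the call both classes have the same number of samples, and every synthetic "pos" point lies on the segment between the two original "pos" points.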
Improving Evaluation Metrics for Text Summarization: A Comparative Study and Proposal of a Novel Metric
Junadhi, Junadhi; Agustin, Agustin; Efrizoni, Lusiana; Okmayura, Finanta; Habibie, Dedi Rahman; Muslim, Muslim
Journal of Applied Data Sciences Vol 6, No 2: May 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.547

Abstract

This research evaluates and compares the effectiveness of various evaluation metrics in text summarization, focusing on the development of a new metric that holistically measures summary quality. Commonly used metrics, including ROUGE, BLEU, METEOR, and BERTScore, were tested on three datasets: CNN/DailyMail, XSum, and PubMed. The analysis revealed that while ROUGE achieved an average score of 0.65, it struggled to capture semantic nuances, particularly for abstractive summarization models. In contrast, BERTScore, which incorporates semantic representation, performed better with an average score of 0.75. To address these limitations, we developed the Proposed Metric, which combines semantic similarity, n-gram overlap, and sentence fluency. The Proposed Metric achieved an average score of 0.78 across datasets, surpassing conventional metrics by providing more accurate assessments of summary quality. This research contributes a novel approach to text summarization evaluation by integrating semantic and structural aspects into a single metric. The findings highlight the Proposed Metric's ability to capture contextual coherence and semantic alignment, making it suitable for real-world applications such as news summarization and medical research. These results emphasize the importance of developing holistic metrics for better evaluation of text summarization models.
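To make concrete why overlap-based metrics can miss semantic nuance, the sketch below computes a simplified unigram-overlap F1 score in the spirit of ROUGE-1. The function `unigram_overlap_f1` and the example sentences are illustrative assumptions; this is neither the official ROUGE implementation nor the Proposed Metric, whose formula the abstract does not give.

```python
from collections import Counter


def unigram_overlap_f1(candidate, reference):
    """F1 of clipped unigram overlap between two whitespace-tokenized texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
extractive = "the cat sat on the mat"       # copies the reference wording
abstractive = "a feline rested upon a rug"  # same meaning, different words

print(unigram_overlap_f1(extractive, reference))
print(unigram_overlap_f1(abstractive, reference))
```

The paraphrase shares no tokens with the reference, so a purely lexical score gives it no credit even though the meaning is preserved. That is the gap embedding-based metrics such as BERTScore aim to close.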
Adaptive Neural Collaborative Filtering with Textual Review Integration for Enhanced User Experience in Digital Platforms
Efrizoni, Lusiana; Ali, Edwar; Asnal, Hadi; Junadhi, Junadhi
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.944

Abstract

This research proposes a hybrid rating prediction model that integrates Neural Collaborative Filtering (NCF), Long Short-Term Memory (LSTM), and semantic analysis through Natural Language Processing (NLP) to enhance recommendation accuracy. The main objective is to improve alignment between system predictions and actual user preferences by leveraging multi-source information from the Amazon Movies and TV dataset, which includes explicit user–item ratings and textual reviews. The core idea is to combine three complementary processing paths—(1) user–item interaction modeling via NCF, (2) temporal dynamics capture through LSTM, and (3) semantic understanding of reviews using NLP—into a unified deep learning-based adaptive architecture. Experimental evaluation demonstrates that this multi-input approach outperforms the baseline collaborative filtering model, with the Mean Absolute Error (MAE) reduced from 1.3201 to 1.2817 (a 2.91% improvement) and the Mean Squared Error (MSE) reduced from 2.2315 to 2.1894 (a 1.89% improvement). Training metrics visualization further shows a stable convergence pattern, with the MAE gap between training and validation consistently below 0.03, indicating minimal overfitting. The findings confirm that integrating cross-dimensional signals significantly enhances predictive performance and can contribute to increased user satisfaction and engagement in recommendation platforms. The novelty of this work lies in the simultaneous integration of interaction, temporal, and semantic dimensions into a single adaptive recommendation framework, a configuration not jointly explored in prior studies. Moreover, the flexible architecture enables adaptation to other domains such as e-commerce, music, or online learning, broadening its practical applicability.
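The reported percentage improvements follow from the relative reduction of each error metric against its baseline. The short sketch below reproduces that arithmetic from the figures in the abstract and, for reference, includes plain definitions of MAE and MSE. The function names and the two-element toy ratings are assumptions for illustration, not part of the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted ratings."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline, improved):
    """Percentage reduction of an error metric relative to its baseline."""
    return (baseline - improved) / baseline * 100


# Figures taken from the abstract.
mae_gain = relative_improvement(1.3201, 1.2817)
mse_gain = relative_improvement(2.2315, 2.1894)
print(f"MAE improvement: {mae_gain:.2f}%")
print(f"MSE improvement: {mse_gain:.2f}%")

# Toy ratings (made up) to show the two metric definitions in use.
print(mean_absolute_error([3, 5], [2, 7]))
print(mean_squared_error([3, 5], [2, 7]))
```

Rounded to two decimals, the two gains match the 2.91% and 1.89% stated in the abstract.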