Found 7 Documents
Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

HANA: An AI Chatbot for Islamic Jurisprudence on Menstruation using SBERT and TF-IDF
Masuzzahra, Tsaura Rafah; Khothibul Umam; Hery Mustofa; Maya Rini Handayani
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9449

Abstract

The advancement of Artificial Intelligence (AI), particularly in Natural Language Processing (NLP), has opened new opportunities for religious technological innovation, especially in addressing practical Islamic jurisprudence issues such as menstruation (fiqh haid). This research proposes and implements HANA, an AI chatbot developed for Telegram, utilizing a hybrid approach combining Term Frequency-Inverse Document Frequency (TF-IDF) and Sentence-BERT (SBERT) models. A curated dataset of over 1000 question-answer pairs from classical and contemporary Islamic literature was used, primarily based on the Shafi'i school of thought. The chatbot matches user queries through a two-stage retrieval: initial keyword matching via TF-IDF and deeper semantic matching via SBERT embeddings. Evaluations were conducted by comparing TF-IDF, SBERT, and hybrid approaches using cosine similarity, precision, recall, and F1-score metrics, focused on top-1 retrieval accuracy. HANA achieved an average cosine similarity score of 0.6581 and a semantic relevance rating of 87% based on expert validation, while User Acceptance Testing (UAT) involving 15 respondents indicated 86.7% satisfaction. Although the system is deployed as a proof-of-concept on Google Colab without persistent hosting, it demonstrates the viability of lightweight AI chatbots for Shariah consultation services. Future improvements include multi-turn conversation handling and integration with large language models for better context understanding. This research contributes to expanding NLP applications within techno-dakwah initiatives, providing a scalable approach to enhance women's access to Islamic jurisprudence knowledge.
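The two-stage retrieval described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes stage 1 shortlists the top-k questions by TF-IDF cosine similarity and stage 2 reranks that shortlist by embedding similarity, which is only one plausible reading of the abstract. The function names, the sample questions, and `toy_embed` (a character-trigram stand-in for a real SBERT encoder) are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
               for toks in tokenized]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def toy_embed(text):
    """Stand-in for an SBERT encoder (assumption): character-trigram counts."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def two_stage_retrieve(query, questions, embed, k=3):
    """Return the index of the best-matching stored question."""
    vectors, idf = tfidf_vectors(questions)
    q_vec = {t: tf * idf.get(t, 0.0)
             for t, tf in Counter(tokenize(query)).items()}
    # Stage 1: keyword shortlist by TF-IDF cosine similarity.
    shortlist = sorted(range(len(questions)),
                       key=lambda i: cosine(q_vec, vectors[i]),
                       reverse=True)[:k]
    # Stage 2: rerank the shortlist by embedding similarity.
    q_emb = embed(query)
    return max(shortlist, key=lambda i: cosine(q_emb, embed(questions[i])))


# Hypothetical sample data, not taken from the paper's dataset.
questions = [
    "how long is the minimum menstruation period",
    "can a woman fast during menstruation",
    "what is the ruling on prayer while traveling",
]
best = two_stage_retrieve("fasting during menstruation", questions, toy_embed, k=2)
print(questions[best])
```

In a real system, `embed` would wrap a sentence-embedding model that returns dense vectors, and `cosine` would be replaced by a dense dot-product version.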
Comparative Study of SVM, KNN, and Naïve Bayes for Sentiment Analysis of Religious Application Reviews
Heti Aprilianti; Khothibul Umam; Maya Rini Handayani
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9482

Abstract

This study aims to evaluate and compare the performance of three machine learning algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbors (K-NN), and Naïve Bayes, for sentiment classification of user reviews on the NU Online application in the Google Play Store. NU Online is a religious digital platform providing Islamic content such as articles, prayers, and worship schedules. A total of 1,500 user reviews were collected using web scraping, and 1,491 were retained after data cleaning. Preprocessing steps included punctuation removal, case folding, normalization, stopword removal, stemming, and tokenization. Sentiment labels (positive or negative) were automatically assigned using a lexicon-based approach. The performance of the models was assessed using accuracy, precision, recall, and F1-score, calculated from a confusion matrix with a training-testing data split. The results show that the SVM with a linear kernel achieved the best accuracy (81.6%), followed by Naïve Bayes (73.2%) and K-NN (66.9%). These findings indicate that SVM is the most effective algorithm in this context, providing practical contributions for developers of the NU Online digital religious platform and contributing to research in Indonesian natural language processing.
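The four evaluation metrics named in the abstract all follow from the cells of a binary confusion matrix. The sketch below shows that calculation; the function name and the sample labels are hypothetical and are not taken from the study's data.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall and F1 from confusion-matrix cells."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration only.
y_true = ["positive", "positive", "positive", "negative", "negative"]
y_pred = ["positive", "positive", "negative", "negative", "positive"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```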
Detecting Fake Reviews in E-Commerce: A Case Study on Shopee Using Support Vector Machine and Random Forest
Khoirotulmuadiba Purifyregalia; Khothibul Umam; Nur Cahyo Hendro Wibowo; Maya Rini Handayani
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9514

Abstract

The increasing popularity of online shopping, particularly on platforms such as Shopee, has made product reviews a significant factor influencing consumer purchasing decisions. However, the presence of fake reviews generated by non-human agents undermines consumer trust and affects platform credibility. This study aims to detect fake reviews on Shopee by applying a text classification approach using Random Forest and Support Vector Machine (SVM) algorithms. A dataset consisting of 3,686 Shopee product reviews was collected and underwent preprocessing steps including data cleaning, normalization, tokenization, and TF-IDF weighting. Review labeling was performed automatically through the Latent Dirichlet Allocation (LDA) method, categorizing reviews into Original (OR) and Computer-Generated (CG). Model performance was evaluated using accuracy, precision, recall, and F1-score metrics. Experimental results show that the SVM algorithm achieved the highest accuracy at 88.84%, outperforming Random Forest which obtained 80.39%. These findings highlight the effectiveness of SVM in handling high-dimensional text data for fake review detection. The study contributes to the application of automated topic modeling (LDA) for labeling e-commerce reviews in the Indonesian context and opens opportunities for further enhancement using larger datasets and deep learning-based models to improve classification accuracy and scalability.
Performance of Machine Learning Algorithms on Imbalanced Sentiment Datasets Without Balancing Techniques
Dina Wulan Yekti rahayu; Khothibul Umam; Maya Rini Handayani
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9584

Abstract

This study explores the performance of five sentiment classification algorithms—Naïve Bayes, Logistic Regression, Support Vector Machine, Decision Tree, and Random Forest—on an imbalanced sentiment dataset, with the SMOTE technique applied as a comparison. The research follows the Knowledge Discovery in Databases (KDD) framework, which includes data selection, preprocessing, transformation, data mining, and evaluation. The evaluation uses metrics such as accuracy, precision, recall, F1-score, and macro average F1-score. Initial results show that all five algorithms performed fairly well even without using a balancing technique, with Naïve Bayes achieving the highest F1-score of 0.84 and recall of 0.81. After applying SMOTE, only small improvements were observed in some models, such as Random Forest (F1-score increased from 0.81 to 0.85), while other models like Naïve Bayes experienced a decrease in performance, dropping to 0.77. This suggests that the effect of balancing techniques like SMOTE can vary depending on the algorithm. Thus, this study provides empirical contributions that highlight the importance of selecting appropriate approaches and the need for a deep understanding of each algorithm's behavior in the context of imbalanced data. Researchers are encouraged to carefully consider these aspects when designing experiments and interpreting results.
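For readers unfamiliar with SMOTE, the core idea is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea on plain numeric tuples; it is a simplified illustration with hypothetical names and data, not the configuration used in the study, which presumably applied an existing implementation to vectorized text features.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Requires at least two minority points. Each synthetic point lies on the
    line segment between a sampled point and one of its k nearest neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (n - b)
                               for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical 2-D minority samples for illustration only.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_points, n_new=5)
print(new_points)
```

Because every synthetic point is a convex combination of two existing minority points, oversampling fills in the region the minority class already occupies rather than extending it, which may help explain why its effect differs from one algorithm to another.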
Sentiment Classification of MyPertamina Reviews Using Naïve Bayes and Logistic Regression
Dwi Yuni Saraswati; Handayani, Maya Rini; Umam, Khothibul; Mustofa, Mokhamad Iklil
Journal of Applied Informatics and Computing Vol. 9 No. 4 (2025): August 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i4.9723

Abstract

This research conducts a comparative evaluation of the effectiveness of the Naïve Bayes and Logistic Regression algorithms in mapping public perceptions of the MyPertamina application on the Google Play Store. The data consists of 2,000 user reviews obtained through a scraping technique. The research steps include labeling the reviews as positive or negative, followed by pre-processing and TF-IDF weighting. The dataset was systematically divided into two parts, with 80% allocated for model training and the remaining 20% for evaluation. The Naïve Bayes and Logistic Regression models were implemented using the Python programming language and evaluated based on accuracy, precision, recall, and F1-score metrics. The analysis shows that Logistic Regression achieved an accuracy of 86%, while Naïve Bayes achieved 81%. Logistic Regression demonstrated superior performance as it effectively captures linear relationships between features in TF-IDF representations and provides a more balanced outcome in terms of precision and recall. In contrast, Naïve Bayes is more influenced by high-frequency word distributions and does not account for feature correlations, which can limit its performance in certain contexts. Therefore, Logistic Regression is considered more suitable for sentiment classification tasks in this study. These findings emphasize the importance of selecting appropriate algorithms for sentiment analysis and suggest opportunities for future research using alternative methods to enhance predictive accuracy.
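The remark that Naïve Bayes "does not account for feature correlations" refers to its independence assumption: each word contributes its own log-probability to the class score, regardless of which other words appear. The sketch below makes that visible with a minimal multinomial Naïve Bayes using Laplace smoothing. It works on raw word counts rather than the TF-IDF weights used in the study, and the function names and sample reviews are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect per-class document counts and per-class word counts."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log-posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label, n_docs in class_docs.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # class prior
        for word in doc.lower().split():
            if word in vocab:
                # Each word adds its own term: the independence assumption.
                score += math.log((word_counts[label][word] + 1)
                                  / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical reviews for illustration only.
train_docs = ["app is great and fast", "great service",
              "login always fails", "app fails to load"]
train_labels = ["positive", "positive", "negative", "negative"]
model = train_nb(train_docs, train_labels)
print(predict_nb(model, "great app"))
```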
Identification of Buzzers in Skincare Reviews Using a Lexicon-Based Sentiment Analysis Method
Pramesti, Arfiana Diah; Umam, Khothibul; Handayani, Maya Rini
Journal of Applied Informatics and Computing Vol. 9 No. 5 (2025): October 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i5.11005

Abstract

Along with the rapid development of digital technology, social media has become the main platform for consumers to share experiences about products, including skincare products. However, user reviews do not always reflect authentic experiences; some are created by certain parties, known as buzzers, to manipulate public perception. The presence of buzzers in skincare reviews is important to consider, as they can affect consumer trust and influence purchasing decisions. This study aims to identify the presence of buzzers in skincare product reviews using lexicon-based sentiment analysis. Of the 529 comments analyzed, 75 comments showed negative sentiment and 454 comments showed positive sentiment. The classification results revealed that 85.8% of the comments belonged to the non-buzzer category, while 14.2% were indicated as buzzers. Evaluation of the classification model showed high accuracy, reaching 93%, but performance in detecting buzzers was limited, with a recall of only 0.50. This shows that the model classifies non-buzzer comments well but still struggles to identify buzzer comments, mostly due to data imbalance. This research emphasizes the importance of a proper analytical approach in detecting inauthentic reviews to ensure the information consumers receive remains accurate, transparent, and accountable.
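Lexicon-based sentiment labeling, in its simplest form, counts how many words of a comment appear in a positive word list and how many appear in a negative one. The sketch below shows that basic mechanism only. The word lists, the tie-breaking rule (ties count as positive), and the function name are assumptions for illustration; the abstract does not describe the lexicon used or how buzzer status is derived from the sentiment labels.

```python
# Hypothetical mini-lexicons; a real study would use a full sentiment dictionary.
POSITIVE_WORDS = {"good", "great", "love", "smooth", "recommended"}
NEGATIVE_WORDS = {"bad", "breakout", "irritated", "sticky", "worse"}


def lexicon_label(comment):
    """Label a comment by counting lexicon hits (ties default to positive)."""
    tokens = [t.strip(".,!?") for t in comment.lower().split()]
    score = (sum(t in POSITIVE_WORDS for t in tokens)
             - sum(t in NEGATIVE_WORDS for t in tokens))
    return "positive" if score >= 0 else "negative"


print(lexicon_label("Love it, skin feels smooth!"))
print(lexicon_label("Sticky and caused a breakout."))
```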
Mapping the Polarity of Tourist Opinions on Indonesian Destinations through Google Maps Reviews Using Supervised Learning Methods
Sa’adah, Siti Miftahus; Umam, Khothibul; Handayani, Maya Rini; Mustofa, Mokhammad Iklil
Journal of Applied Informatics and Computing Vol. 9 No. 5 (2025): October 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i5.9836

Abstract

The advancement of information technology has transformed how individuals seek information and plan their travels, notably through online reviews of tourist attractions on platforms like Google Maps. However, these reviews do not always align with visitors' expectations, necessitating further analysis to comprehend the underlying sentiments. The objective of this research is to examine the performance of multiple machine learning algorithms in performing sentiment analysis on user-generated reviews of tourist attractions in Indonesia. The algorithms examined include Multinomial Naïve Bayes, Random Forest Classifier, Logistic Regression, Support Vector Machine, K-Nearest Neighbors, and Extra Trees Classifier. The research process encompasses data collection and labeling, data preprocessing, exploratory data analysis (EDA), Word Cloud visualization, feature extraction, classification implementation, and performance evaluation. Experimental results indicate that the K-Nearest Neighbors (KNN) algorithm attains the highest accuracy and F1-score, both 97%, indicating its effectiveness in categorizing text-based sentiment reviews sourced from the Google Maps platform.