Articles

Journal : Journal of Applied Data Sciences

Recommender System for Book Review based on Clustering Algorithms
Udariansyah, Devi; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Zakaria, Mohd Zaki; Hanan, Nur Syuhana binti Abd
Journal of Applied Data Sciences Vol 6, No 1: JANUARY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i1.492

Abstract

Book reviews express reviewers' evaluations and describe the book. Today, the number of books is growing rapidly, offering readers many choices. Recommender systems built on book reviews are widely discussed, and this study recommends a book based on a selected keyword. This study highlights two primary objectives. The first objective is to identify the keywords of the book reviews, and the second objective is to design and develop a book review analysis visualization using the result of the k-means clustering algorithm. The methodology of this research consists of ten phases, which start with the preliminary study, knowledge acquisition and analysis phase, data collection phase, data pre-processing phase, and modeling phase. The research then continues with the design and implementation, dashboard development, testing and evaluation, and finally, the documentation phase. The data for this study were scraped from Amazon.com and focus on three genres: Fiction and Fantasy, Mystery and Thriller, and Romance. All data are cleaned before k-means clustering is applied. The clustering results define the keywords for every genre, which are then compared with the keywords collected for each book from Amazon.com.
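As a rough illustration of the clustering step mentioned in this abstract, the sketch below runs a plain k-means loop on toy 2-D points. It is not the authors' pipeline: the function name `kmeans`, the toy data, and the choice to seed centroids with the first k points are assumptions made for the example, and real review text would first have to be turned into numeric vectors.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Plain k-means on a list of equal-length numeric tuples."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical toy data: two well-separated groups of points.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
```

With this toy input the points near (1, 1) share one label and the points near (8.5, 8.5) share the other; in a keyword setting, the terms closest to each centroid would be the candidates for a cluster's keywords.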

Efficient Fruit Grading and Selection System Leveraging Computer Vision and Machine Learning
Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Thinakaran, Rajermani; Batumalay, Malathy; Habib, Shabana; Islam, Muhammad
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.443

Abstract

Automated fruit grading is crucial to overcoming the time and accuracy challenges posed by manual methods, which are often limited by subjective human judgment. This study introduces an intelligent grading system leveraging computer vision and AI to improve speed and consistency in assessing fruit quality. Using high-resolution imaging and advanced feature extraction, including grayscale processing, binarization, and enhancement, the system achieves non-destructive, efficient sorting for fruits like apples, bananas, and oranges. Grayscale processing reduces image complexity while preserving essential details, binarization isolates the fruit from its background, and enhancement highlights critical features. Notably, the Edge Pixel method proved most effective, achieving 79.20% accuracy in grading, while the Grayscale Pixel method reached 93.94% accuracy for fruit types. Edge Pixel also achieved 80.32% in differentiating grading types, showcasing its ability to capture essential shapes and edges. Fruits are classified into four grades: Grade_01 (highest quality), Grade_02 (minor imperfections), Grade_03 (notable defects but consumable), and Grade_04 (unfit for consumption). A specialized dataset supports model training, ensuring practical real-world application. The study concludes that this automated system offers significant improvements over traditional grading, providing a scalable, objective, and reliable solution for the agricultural sector, ultimately enhancing productivity and quality assurance.
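The grayscale and binarization steps named in this abstract can be sketched on a tiny in-memory image. This is only an illustration under stated assumptions: the abstract does not give its conversion formula or thresholding rule, so the ITU-R BT.601 luma weights, the fixed threshold of 128, and the helper names `to_grayscale` and `binarize` are choices made for this example.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) pixels to rows of 0-255 luminance values."""
    # Assumption: ITU-R BT.601 luma weights; the paper does not state its formula.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]

def binarize(gray_image, threshold=128):
    """Map each pixel to 1 (foreground) or 0 (background) by a fixed threshold."""
    # Assumption: one global threshold; the paper's thresholding rule is not given.
    return [[1 if value >= threshold else 0 for value in row] for row in gray_image]

# Hypothetical 2x2 image: bright orange "fruit" pixels on a dark background.
image = [[(255, 200, 0), (0, 0, 0)],
         [(255, 200, 0), (10, 10, 10)]]
gray = to_grayscale(image)
mask = binarize(gray)
```

Here the left column ends up as foreground (1) and the right column as background (0), which is the sense in which binarization isolates the fruit from its background.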

Utilizing Sentiment Analysis for Reflect and Improve Education in Indonesia
Henderi, Henderi; Asro, Asro; Sulaiman, Agus; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; AlQudah, Mashal
Journal of Applied Data Sciences Vol 6, No 1: JANUARY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i1.527

Abstract

This study explores the potential of sentiment analysis in providing valuable insights into education in Indonesia based on comments from the YouTube platform. Utilizing the Naive Bayes Classifier method, this research analyzed 13,386 processed comments out of 17,920 original comments. The results show that 53.8% of comments were negative, while 28.5% were positive, and 17.7% were neutral, reflecting diverse perspectives on existing educational issues. The accuracy of this model reached up to 72.51% when tested on various sample sizes (10%-30%), indicating the model's effectiveness in identifying sentiments. Although the model tends to classify comments as unfavorable, this opens opportunities for introspection and improvement within the educational system. Further analysis with a word cloud revealed dominant keywords, indicating areas that require more attention in public discussions about education. By leveraging this sentiment analysis, the study offers practical and valuable guidance for policymakers to reflect on and enhance educational strategies and policies in Indonesia. This research measures public reactions and aims to foster more constructive and inclusive discussions about the sustainable development of education in Indonesia.

Machine Learning Techniques for Distinguishing Android Malware Variants
Irwansyah, Irwansyah; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Zakaria, Mohd Zaki; Azmi, Nurhafifi Binti
Journal of Applied Data Sciences Vol 6, No 1: JANUARY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i1.493

Abstract

The advancement of portable devices has been quickly and dramatically reshaping the usage trends and consumer preferences of electronic devices. Android, the most common mobile operating system, has a privilege-separated protection system with a complex access control mechanism. Android apps require permission to get access to confidential personal data and device resources. However, studies have shown that various malicious applications can acquire permission and target systems and applications by misleading users. In this study, we suggest a machine-learning approach to classifying Android malware variants by mining requested permissions, real permissions, suspicious calls, and API calls that were obtained and used in Android malware applications. Features were selected using a feature selection method called KBest. Feature selection techniques are used to reduce the number of features and increase performance. Two types of Naïve Bayes classifiers commonly used for text classification, the Multinomial and the multivariate Bernoulli models, are used and compared in malware family classification. Both Naïve Bayes types are evaluated using a confusion matrix based on 4022 Android malware applications belonging to 10 families. Experimental findings show that the Multinomial model offers reliable performance across three test experiments, with an average accuracy of 95%.
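The difference between the two Naïve Bayes variants compared in this abstract can be sketched in a few lines: the Multinomial model scores how often each feature occurs, while the multivariate Bernoulli model scores the presence or absence of every feature in the vocabulary. The code below is a minimal from-scratch illustration, not the authors' implementation; the helper names, the family labels `family_a` and `family_b`, the four-item vocabulary, and the toy samples are all hypothetical, and add-one smoothing is an assumption.

```python
from math import log

def train(docs, labels, vocab):
    """Collect per-class token counts and per-class document frequencies."""
    stats = {}
    for tokens, label in zip(docs, labels):
        s = stats.setdefault(label, {"docs": 0, "count": dict.fromkeys(vocab, 0),
                                     "present": dict.fromkeys(vocab, 0)})
        s["docs"] += 1
        for t in tokens:          # every occurrence counts (multinomial view)
            s["count"][t] += 1
        for t in set(tokens):     # presence counts once per sample (Bernoulli view)
            s["present"][t] += 1
    return stats

def multinomial_score(tokens, s, total_docs, vocab):
    """Log-score from token counts, with add-one smoothing (an assumption)."""
    total = sum(s["count"].values())
    score = log(s["docs"] / total_docs)
    for t in tokens:
        score += log((s["count"][t] + 1) / (total + len(vocab)))
    return score

def bernoulli_score(tokens, s, total_docs, vocab):
    """Log-score from presence or absence of every vocabulary feature."""
    present = set(tokens)
    score = log(s["docs"] / total_docs)
    for t in vocab:
        p = (s["present"][t] + 1) / (s["docs"] + 2)  # smoothed P(feature | class)
        score += log(p if t in present else 1 - p)   # absent features also count
    return score

def predict(tokens, stats, vocab, scorer):
    """Return the class with the highest log-score under the given scorer."""
    total_docs = sum(s["docs"] for s in stats.values())
    return max(stats, key=lambda c: scorer(tokens, stats[c], total_docs, vocab))

# Hypothetical toy data: each sample is the list of permissions an app requests.
# All tokens are assumed to be in the vocabulary.
vocab = ["SEND_SMS", "READ_CONTACTS", "INTERNET", "CAMERA"]
docs = [["SEND_SMS", "SEND_SMS", "INTERNET"], ["SEND_SMS", "READ_CONTACTS"],
        ["CAMERA", "INTERNET"], ["CAMERA", "CAMERA", "INTERNET"]]
labels = ["family_a", "family_a", "family_b", "family_b"]
stats = train(docs, labels, vocab)
sample = ["SEND_SMS", "INTERNET"]
multinomial_guess = predict(sample, stats, vocab, multinomial_score)
bernoulli_guess = predict(sample, stats, vocab, bernoulli_score)
```

On this toy input both variants assign the sample to `family_a`; they can disagree on real data because the Bernoulli model also penalizes features that are absent from a sample.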

A Proposed Model for Detecting Learning Styles Based on the Felder-Silverman Model Using KNN and LR with Electroencephalography (EEG)
Hasibuan, Muhammad Said; Isnanto, R Rizal; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Yeh, Ming-Lang; Wijaya, Adi
Journal of Applied Data Sciences Vol 6, No 2: MAY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.659

Abstract

The identification of learning styles plays a crucial role in enhancing personalized education and optimizing learning outcomes. This research proposes a model for detecting learning styles based on the Felder-Silverman model using two machine learning algorithms: K-Nearest Neighbors (KNN) and Linear Regression (LR). Electroencephalography (EEG) data, known for its ability to capture cognitive and neural activity, serves as the primary dataset for this study. The proposed model was tested on a dataset comprising EEG signals collected during various learning tasks. Feature extraction and preprocessing techniques were employed to ensure high-quality input for the learning algorithms. The experimental results revealed that the LR-based model achieved an accuracy of 96.4%, significantly outperforming the KNN-based model, which obtained an accuracy of 89.9%. These findings highlight the potential of EEG-based models for accurately identifying learning styles, offering valuable insights for educators and researchers aiming to implement adaptive learning systems. This study demonstrates the feasibility and effectiveness of combining EEG data with machine learning techniques for learning style detection, paving the way for more personalized and efficient educational approaches. Future research will explore the integration of additional physiological data and advanced machine learning methods to further improve model accuracy and applicability.

Integrating Convolutional Neural Networks into Mobile Health: A Study on Lung Disease Detection
Hasibuan, Muhammad Said; Isnanto, R Rizal; Dewi, Deshinta Arrova; Triloka, Joko; Aziz, RZ Abdul; Kurniawan, Tri Basuki; Maizary, Ary; Wibaselppa, Anggawidia
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.660

Abstract

This study presents the development and evaluation of a Convolutional Neural Network (CNN) model for lung disease detection from chest X-ray images, complemented by a mobile application for real-time diagnosis. The CNN model was trained on a diverse dataset comprising images labeled as "NORMAL" and "PNEUMONIA," achieving an overall accuracy of 96%. Compared to traditional machine learning methods such as Support Vector Machine (SVM) and Random Forest, which typically achieve accuracies ranging from 85% to 92%, the proposed CNN model demonstrates superior performance in classifying lung conditions. The model achieved high precision (0.98) and recall (0.96) for pneumonia detection, as well as precision (0.89) and recall (0.95) for normal cases, ensuring both sensitivity and specificity in diagnostic performance. These results indicate that the model minimizes false positives and false negatives, which is crucial for reducing misdiagnoses and improving patient outcomes in clinical settings. To enhance accessibility, an Android-based application was developed, allowing users to upload chest X-ray images and receive instant diagnostic results. The application successfully integrated the trained CNN model, offering a user-friendly interface suitable for healthcare professionals and patients alike. User testing demonstrated reliable performance, facilitating timely and accurate lung disease detection, particularly in areas with limited access to radiologists. These findings highlight the potential of CNNs in medical imaging and the critical role of mobile technology in expanding healthcare accessibility. This innovative approach not only improves diagnostic accuracy but also enables real-time disease detection, ultimately supporting clinical decision-making. Future research will focus on expanding the dataset, incorporating additional lung conditions, and optimizing the model for enhanced robustness in diverse clinical scenarios.

Data Science Approaches to Analyzing Aesthetic Strategies in Contemporary Presidential Campaigns
Isnawijaya, Isnawijaya; Lexianingrum, Siti Rahayu Pratami; Taqwa, Dwi Muhammad; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki
Journal of Applied Data Sciences Vol 6, No 2: MAY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.609

Abstract

In today’s digital political landscape, social media platforms play a critical role in shaping voter engagement, especially among youth. This study investigates how aesthetic political strategies were applied in Prabowo Subianto’s 2024 presidential campaign on TikTok and Instagram. It focuses on decoding voter sentiment, optimizing content delivery, and identifying visual elements that resonate with the public. Using machine learning models tailored to various data types, the research analyses over 50,000 comments and 30 million engagements. A BERT-based sentiment analysis model achieved 88% accuracy, revealing 60% positive, 25% neutral, and 15% negative sentiment, reflecting broad public approval. Meanwhile, a Gradient Boosting engagement prediction model reached 85% accuracy in forecasting post performance based on content format, timing, and hashtag use. Posts with videos and trending hashtags had a 78% chance of high engagement, while static images without hashtags scored only 45%. Evening posts performed best, with a 25% higher likelihood of engagement. The findings highlight the value of AI-driven insights in political communication, emphasizing that emotionally and visually rich content—particularly patriotic and relatable themes—enhances audience connection. This study offers a practical framework for political actors to develop adaptive, data-informed strategies that align with voter preferences in an increasingly fragmented and fast-paced digital media environment.

Incorporate Transformer-Based Models for Anomaly Detection
Dewi, Deshinta Arrova; Singh, Harprith Kaur Rajinder; Periasamy, Jeyarani; Kurniawan, Tri Basuki; Henderi, Henderi; Hasibuan, M. Said; Nathan, Yogeswaran
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.762

Abstract

This paper explores the effectiveness of Transformer-based models, specifically the Time-Series Transformer (TST) and Temporal Fusion Transformer (TFT), for anomaly detection in streaming data. We review related work on anomaly detection models, highlighting traditional methods' limitations in speed, accuracy, and scalability. While LSTM Autoencoders are known for their ability to capture temporal patterns, they suffer from high memory consumption and slower inference times. Though efficient in terms of memory usage, the Matrix Profile provides lower performance in detecting anomalies. To address these challenges, we propose using Transformer-based models, which leverage the self-attention mechanism to capture long-range dependencies in data, process sequences in parallel, and achieve superior performance in both accuracy and efficiency. Our experiments show that TFT outperforms the other models with an F1-score of 0.92 and a Precision-Recall AUC of 0.71, demonstrating significant improvements in anomaly detection. The TST model also shows competitive performance with an F1-score of 0.88 and Precision-Recall AUC of 0.68, offering a more efficient alternative to LSTMs. The results underscore that Transformer models, particularly TST and TFT, provide a robust solution for anomaly detection in real-time applications, offering improved performance, faster inference times, and lower memory usage than traditional models. In conclusion, Transformer-based models stand out as the most effective and scalable solution for large-scale, real-time anomaly detection in streaming time-series data, paving the way for their broader application across various industries. Future work will further focus on optimizing these models and exploring hybrid approaches to enhance detection capabilities and real-time performance.
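The self-attention mechanism this abstract refers to can be illustrated with a small scaled dot-product sketch. It is a simplification rather than the TST or TFT architecture: queries, keys, and values are taken to be the raw inputs (real Transformer layers apply learned projections first), and the function names and the toy series are assumptions for the example.

```python
from math import exp, sqrt

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    weights = [exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def self_attention(sequence):
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys, and values are the inputs themselves.
    """
    dim = len(sequence[0])
    outputs, all_weights = [], []
    for query in sequence:
        # Compare this time step with every time step, near or far.
        scores = [sum(q * k for q, k in zip(query, key)) / sqrt(dim)
                  for key in sequence]
        weights = softmax(scores)
        # The output is the attention-weighted average of all value vectors.
        outputs.append([sum(w * value[i] for w, value in zip(weights, sequence))
                        for i in range(dim)])
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical series of 4 time steps with 2 features; steps 0 and 3 are alike.
series = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(series)
```

In this toy series, step 0 attends to the distant but similar step 3 as strongly as to itself and more strongly than to its neighbour step 1, which is the sense in which attention captures long-range dependencies; each row of the loop is independent of the others, so the rows could also be computed in parallel.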

Detecting Gender-Based Violence Discourse Using Deep Learning: A CNN-LSTM Hybrid Model Approach
Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Henderi, Henderi; Hasibuan, M. Said; Zakaria, Mohd Zaki; Ismail, Abdul Azim Bin
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.761

Abstract

Gender-Based Violence (GBV) is a critical social issue impacting millions worldwide. Social media discussions offer valuable insights into public awareness, sentiment, and advocacy, yet manually analyzing such vast textual data is highly challenging. Traditional text classification methods often struggle with contextual understanding and multi-class categorization, making it difficult to accurately identify discussions on Sexual Violence, Physical Violence, and other topics. To address this, the present study proposes a hybrid deep learning approach combining Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. CNN is utilized for extracting key linguistic features, while LSTM enhances the classification process by maintaining sequential dependencies. This hybrid CNN+LSTM model is evaluated against standalone CNN and LSTM models to assess its performance in classifying GBV-related tweets. The dataset was sourced from Kaggle, containing real-world Twitter discussions on GBV. Experimental results demonstrate that the hybrid model surpasses both CNN and LSTM models, achieving an accuracy of 89.6%, precision of 88.4%, recall of 89.1%, and F1-score of 88.7%. Confusion matrix and ROC curve analyses further confirm the hybrid model’s superior performance, correctly identifying Sexual Violence (82%), Physical Violence (15%), and Other (3%) cases with reduced misclassification rates. These results suggest that combining CNN’s feature extraction with LSTM’s contextual learning provides a more balanced and effective classification model for GBV-related text. This work supports the development of AI-based tools for social media monitoring, policy-making, and advocacy, helping stakeholders better understand and respond to GBV discussions. Future research could explore transformer-based models like BERT and real-time classification applications to further improve performance.
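The metrics reported in this abstract can be checked against each other, since the F1-score is the harmonic mean of precision and recall. The short sketch below does that check, assuming the reported precision (88.4%) and recall (89.1%) are aggregate values to which the formula applies directly; the function name `f1_score` is introduced only for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract for the hybrid CNN+LSTM model.
f1 = f1_score(0.884, 0.891)
print(f"{f1:.3f}")  # prints 0.887
```

The result agrees with the reported F1-score of 88.7%.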

Navigating Heart Stroke Terrain: A Cutting-Edge Feed-Forward Neural Network Expedition
Praveen, S Phani; Mantena, Jeevana Sujitha; Sirisha, Uddagiri; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Onn, Choo Wou; Yorman, Yorman
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.763

Abstract

Heart stroke remains one of the leading causes of death worldwide, necessitating early and accurate prediction systems to enable timely medical intervention. While a variety of machine learning approaches have been employed to address this issue, including Logistic Regression, Decision Trees, Random Forests, Support Vector Machines, and K-Nearest Neighbors, these models often suffer from limitations such as overfitting, insufficient generalization, poor performance on imbalanced datasets, and inability to capture complex nonlinear patterns in clinical data. Additionally, many existing works do not comprehensively integrate both clinical and demographic features or lack rigorous evaluation metrics beyond accuracy alone. This study proposes a novel Feed-Forward Neural Network (FFNN) model for heart stroke prediction, designed to overcome the shortcomings of conventional models. Unlike shallow classifiers, the FFNN architecture employed here leverages multiple hidden layers and nonlinear activation functions to learn intricate relationships within the dataset. The dataset used comprises various attributes such as age, hypertension, heart disease, BMI, and smoking status, which were preprocessed through normalization, one-hot encoding, and imputation techniques to ensure data quality and model performance. Experiments were conducted using a stratified train-test split, and the model was trained using the Adam optimizer with carefully tuned hyperparameters. Comparative evaluations against baseline models (Logistic Regression, Random Forest, and SVM) were carried out using precision, recall, F1-score, and ROC-AUC as performance metrics. The proposed FFNN achieved the highest accuracy of 96.47%, along with substantial improvements in recall and F1-score, highlighting its superior capability in identifying potential stroke cases even in imbalanced datasets. 
This work bridges a significant gap in heart stroke prediction by demonstrating the effectiveness of deep learning models—specifically FFNNs—in extracting complex patterns from diverse patient data. It also sets the stage for further exploration of deep learning-based clinical decision support systems.
Co-Authors - Kurniawan, - Adi Wijaya Agus Riyanto Alde Alanda, Alde Alqudah, Mashal Kasem Alqudah, Musab Kasim Andri Andri Antoni, Darius Armoogum, Sheeba Armoogum, Vinaye Asro, Asro Astried, Astried Aziz, RZ. Abdul Azmi, Nurhafifi Binti Bappoo, Soodeshna Batumalay, Malathy Bujang, Nurul Shaira Binti Chandra, Anurag Dedy Syamsuar Dewi, Deshinta Arrova Dewi, Deshinta Arrowa Diana Diana Edi Surya Negara Eko Risdianto Fadly Fadly Fatoni, Fatoni Febriyanti Panjaitan Firosha, Ardian Fuad, Eyna Fahera Binti Eddie Habib, Shabana Hadi Syahputra Hanan, Nur Syuhana binti Abd Hasibuan, M.S. Henderi . Hendra Kurniawan Herdiansyah, M. Izman Hidayani, Nieta Hisham, Putri Aisha Athira binti Irianto, Suhendro Y. Irwansyah Irwansyah Ismail, Abdul Azim Bin Isnawijaya, Isnawijaya Joan Angelina Widians, Joan Angelina Kijsomporn, Jureerat Kurniawan, Dendi Lexianingrum, Siti Rahayu Pratami M Said Hasibuan Madjid, Fadel Muhammad Maizary, Ary Mantena, Jeevana Sujitha Mashal Alqudah Melanie, Nicolas Misinem, Misinem Mohd Salikon, Mohd Zaki Motean, Kezhilen Muhamad Akbar Muhammad Islam, Muhammad Muhammad Nasir Muhayeddin, Abdul Muniif Mohd Nathan, Yogeswaran Nazmi, Che Mohd Alif Oktariansyah Oktariansyah, Oktariansyah Onn, Choo Wou Periasamy, Jeyarani Prahartiningsyah, Anggari Ayu Praveen, S Phani Puspitasari, Novianti Qisthiano, M Riski R Rizal Isnanto Rahmi Rahmi RR. Ella Evrita Hestiandari Saksono, Prihambodo Hendro Saringat, Zainuri Singh, Harprith Kaur Rajinder Sirisha, Uddagiri Sri Karnila Sulaiman, Agus Sunda Ariana, Sunda Suriani, Uci Syaputra, Hadi Taqwa, Dwi Muhammad Thinakaran, Rajermani Triloka, Joko Udariansyah, Devi Usman Ependi Wibaselppa, Anggawidia Yeh, Ming-Lang Yorman Yupika Maryansyah, Yupika Zakari, Mohd Zaki Zakaria, Mohd Zaki