Journal : Journal of Applied Data Sciences

Unveiling Criminal Activity: a Social Media Mining Approach to Crime Prediction
Armoogum, Sheeba; Dewi, Deshinta Arrova; Armoogum, Vinaye; Melanie, Nicolas; Kurniawan, Tri Basuki
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.350

Abstract

Social media platforms have become breeding grounds for abusive comments, necessitating the use of machine learning to detect harmful content. This study aims to predict abusive comments within a Mauritian context, focusing specifically on comments written in Mauritian Kreol, a language with limited natural language processing tools. The objective was to build and evaluate four machine learning models—Decision Tree, Random Forest, Naïve Bayes, and Support Vector Machine (SVM)—to accurately classify comments as abusive or non-abusive. The models were trained and tested using k-fold cross-validation, and the Decision Tree model outperformed others with 100% precision and recall, while Random Forest followed with 99% accuracy. Naïve Bayes and SVM, although achieving 100% precision, had lower recall rates of 35% and 16%, respectively, due to imbalanced data in the training set. Pre-processing steps, including stop-word removal and a custom Kreol spell checker, were key in enhancing model performance. The study provides a novel contribution by applying machine learning in a Mauritian context, demonstrating the potential of AI in detecting abusive language in underrepresented languages. Despite limitations such as the absence of a Kreol lemmatization tool and incomplete coverage of Kreol spelling variations, the models show promise for wider application in social media crime detection. Future research could explore expanding this approach to other languages and domains of social media crimes.
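The gap between precision and recall reported in this abstract can be illustrated with a short sketch. The confusion-matrix counts below are hypothetical, chosen only to reproduce a "100% precision, 16% recall" pattern; they are not taken from the paper.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) computed from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: the classifier flags only 16 of 100 abusive comments
# and raises no false alarms, so every flag is correct but most are missed.
p, r = precision_recall(tp=16, fp=0, fn=84)
print(f"precision={p:.2f} recall={r:.2f}")
```

A model that rarely predicts the minority class can therefore score perfect precision while missing most abusive comments, which is consistent with the imbalance explanation given in the abstract.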
Analyzing Factors that Influence Student Performance in Academic
Hidayani, Nieta; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki
Journal of Applied Data Sciences Vol 5, No 2: MAY 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i2.221

Abstract

Student performance analysis is a complex and popular study area in educational data mining. Multiple factors affect performance in nonlinear ways, making this topic more appealing to academics. The broad availability of educational datasets adds to this interest, particularly in online learning. Although previous studies have focused on analyzing and predicting students' performance based on their classroom activities, they did not take into account students' outside conditions, such as sleep hours, extracurricular activities, and sample question papers that the students had practiced. These three variables are included, among others, in our study. In this paper, we describe an analysis of 10,000 student records, with each record containing information on numerous predictors and a performance index. The analysis is intended to shed light on the relationship between the predictor variables and the performance index. We use both univariate and bivariate analyses, including a correlation heatmap of the variables, to derive a linear equation. Following that, we perform data preprocessing and modeling to facilitate predictive analysis. Finally, we compare actual and predicted student performance using the model we constructed. The findings demonstrate that our prediction model was 98% accurate, with a mean absolute error of 1.62.
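For readers unfamiliar with the error metric quoted above, the following is a minimal sketch of how a mean absolute error is computed. The actual and predicted values are made up for illustration and are not from the study's dataset.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between paired values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical performance-index values (not the paper's data).
actual = [91.0, 65.0, 45.0, 36.0]
predicted = [89.5, 66.0, 47.0, 34.5]
print(mean_absolute_error(actual, predicted))  # 1.5
```

An MAE of 1.62 would mean that, on average, a predicted performance index differs from the actual one by about 1.62 index points.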
Clustering the Unlabeled Data Using a Modified Cat Swarm Optimization
Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Zakaria, Mohd Zaki; Armoogum, Sheeba
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.349

Abstract

This paper presents a modified version of the Cat Swarm Optimization (CSO) algorithm aimed at addressing the limitations of traditional clustering methods in handling complex, high-dimensional datasets. The primary objective of this research is to improve clustering accuracy and stability by eliminating the mixture ratio (MR), setting the counts of dimensions to change (CDC) to 100%, and incorporating a new search equation in the tracing mode of the CSO algorithm. To evaluate the performance of the modified algorithm, five classic datasets from the UCI Machine Learning Repository, namely Iris, Cancer, Glass, Wine, and Contraceptive Method Choice (CMC), were used. The proposed algorithm was compared against K-Means and the original CSO. Performance metrics such as intra-cluster distance, standard deviation, and F-measure were used to assess the quality of clustering. The results demonstrated that the modified CSO consistently outperformed the competing algorithms. For example, on the Iris dataset, the modified CSO achieved a best intra-cluster distance of 96.78 and an F-measure of 0.786, compared to 97.12 and 0.781 for K-Means. Similarly, for the Wine dataset, the modified CSO reached a best intra-cluster distance of 16399, surpassing K-Means which recorded 16768. In conclusion, the modifications introduced to the CSO algorithm significantly enhance its clustering performance across diverse datasets, producing tighter and more accurate clusters with improved stability. These findings suggest that the modified CSO is a robust and effective tool for data clustering tasks, particularly in high-dimensional spaces. Future work will focus on dynamic parameter tuning and testing the scalability of the algorithm on larger and more complex datasets.
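The intra-cluster distance used as a quality metric above can be sketched as follows. This assumes the common definition (the sum of Euclidean distances from each point to the centroid of its assigned cluster), since the abstract does not spell out the formula. The points, centroids, and labels are toy values.

```python
import math


def intra_cluster_distance(points, centroids, labels):
    """Sum of Euclidean distances from each point to its assigned centroid.

    labels[i] is the index into centroids for points[i].
    """
    return sum(math.dist(p, centroids[k]) for p, k in zip(points, labels))


# Toy data (hypothetical): two well-separated clusters in 2-D.
points = [(0, 0), (0, 2), (10, 10), (10, 14)]
centroids = [(0, 1), (10, 12)]
labels = [0, 0, 1, 1]
print(intra_cluster_distance(points, centroids, labels))  # 6.0
```

Lower values indicate tighter clusters, which is why 96.78 for the modified CSO counts as better than 97.12 for K-Means on Iris.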
Gum Disease Identification Using Fuzzy Expert System
Nasir, Muhammad; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Zakaria, Mohd Zaki; Bujang, Nurul Shaira Binti
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.346

Abstract

Gum disease, including Gingivitis and Periodontitis, is among the most common dental conditions, primarily caused by dental plaque, a bacterial biofilm. These conditions are strongly linked to various systemic illnesses, including cancer, atherosclerosis, hypertension, stroke, and respiratory and cardiovascular conditions like aspiration pneumonia, as well as adverse pregnancy outcomes. Gum inflammation is typically characterized by symptoms such as increased redness, swelling (edema), and a loss of surface texture (stippling; gum fiber attachment). These symptoms are site-specific, meaning that an individual can have both healthy and diseased areas within their mouth. In this research, we developed a fuzzy expert system using MATLAB to identify gum diseases. The system was tested on various cases and produced an output value of 0.133, which successfully identified Gingivitis. This value was derived using a fuzzy logic system that processes input data through predefined rules within the Fuzzy Expert System (FES). The system utilizes several input variables such as the frequency of gum bleeding, the extent of plaque accumulation, the depth of gum recession, and the degree of tooth mobility. The key contribution of this study lies in the integration of fuzzy logic to handle the inherent uncertainties in clinical diagnosis, providing a more nuanced assessment compared to traditional methods. The novelty of this research is the application of a fuzzy expert system in dental diagnostics, offering a promising tool for improving the accuracy and efficiency of gum disease identification in clinical settings. This system has the potential to assist dentists in making more informed decisions, ultimately leading to better patient outcomes.
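To make the idea of rule-based fuzzy inference concrete, here is a minimal sketch of one fuzzy rule evaluated with triangular membership functions. Everything specific in it is an assumption for illustration: the triangular shape, the 0-10 input scales, the set boundaries, and the rule itself are hypothetical. The paper's system was built in MATLAB, and its actual rules and membership functions are not given in the abstract.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at or outside [a, c], rising to 1 at peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def rule_strength(bleeding: float, plaque: float) -> float:
    """Hypothetical rule: IF bleeding is frequent AND plaque is moderate.

    The fuzzy AND is taken as the minimum of the two membership degrees.
    """
    frequent = triangular(bleeding, 4, 8, 12)  # assumed fuzzy set
    moderate = triangular(plaque, 2, 5, 8)     # assumed fuzzy set
    return min(frequent, moderate)


# Hypothetical inputs on an assumed 0-10 scale.
print(rule_strength(bleeding=6, plaque=5))  # 0.5
```

In a full system, the firing strengths of all rules would be aggregated and defuzzified into a single crisp output, such as the 0.133 value mentioned in the abstract.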
Fake vs Real Image Detection Using Deep Learning Algorithm
Fatoni, Fatoni; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Zakaria, Mohd Zaki; Muhayeddin, Abdul Muniif Mohd
Journal of Applied Data Sciences Vol 6, No 1: JANUARY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i1.490

Abstract

The purpose of this research project is to address the growing issues presented by modified visual information by developing a deep learning model for distinguishing between real and fake images. To enhance accuracy, this project evaluates the effectiveness of deep learning algorithms such as Residual Neural Network (ResNet), Visual Geometry Group 16 (VGG16), and Convolutional Neural Network (CNN), together with Error Level Analysis (ELA) as a dataset preprocessing step. The CASIA dataset contains 7,492 real images and 5,124 fake images. The images cover a wide range of subjects, including buildings, fruits, animals, and more, providing a comprehensive dataset for model training and validation. The research examined the models' effectiveness through experiments, measuring their training and validation accuracies. The best results for each model were as follows: the CNN reached 94% training accuracy and 92% validation accuracy; VGG16 reached 94% for both training and validation accuracy; and ResNet performed best, with 95% training accuracy and 93% validation accuracy. This project also constructs a system prototype for practical applications, offering an interface for real-world testing. When integrated into the system prototype, only ResNet showed consistency and effectiveness when predicting both fake and real images, which led to the decision to choose ResNet for the system. Furthermore, the project identified several areas for improvement: expanding the model comparison to discover more successful algorithms; improving the dataset preprocessing phase by incorporating filtering or denoising techniques; and refining the system prototype for greater appeal and user-friendliness, which has the potential to attract a larger audience.
Deep Learning Based Face Mask Detection System Using MobileNetV2 for Enhanced Health Protocol Compliance
Fadly, Fadly; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Zakaria, Mohd Zaki; Hisham, Putri Aisha Athira binti
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.476

Abstract

Personal protective equipment (PPE) is crucial in mitigating the spread of infections within the pharmacy industry, manufacturing sectors, and healthcare facilities. Airborne particles and contaminants can be released during the handling of pharmaceuticals, the operation of machinery, or patient care activities. These particles can be transmitted through close contact with an infected individual or by touching contaminated surfaces and then touching one's face (mouth, nose, or eyes). PPE, including face masks, plays a vital role in minimizing the risk of transmission of infectious diseases. Although mandates for wearing face masks might relax as situations improve and vaccination rates increase, staying prepared for potential future outbreaks and the resurgence of infectious diseases remains important. Therefore, an automated system for face mask detection is important for future use. This research proposes real-time face mask detection that identifies whether a person is (i) not wearing a mask or (ii) wearing a mask. It presents a deep-learning approach using a pre-trained model, MobileNet-V2. The model is trained on a dataset of 10,000 images of individuals with and without masks. The results show that the pre-trained MobileNet-V2 model obtained a high accuracy of 98.69% on the testing dataset.
Scalable Machine Learning Approaches for Real-Time Anomaly and Outlier Detection in Streaming Environments
Dewi, Deshinta Arrova; Singh, Harprith Kaur Rajinder; Periasamy, Jeyarani; Kurniawan, Tri Basuki; Henderi, Henderi; Hasibuan, M. Said
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.444

Abstract

The prevalence of streaming data across various sectors poses significant challenges for real-time anomaly detection due to its volume, velocity, and variability. Traditional data processing methods are often inadequate for such dynamic environments, necessitating robust, scalable, and efficient real-time analysis systems. This study compares two advanced machine learning approaches, LSTM autoencoders and Matrix Profile algorithms, to identify the most effective method for anomaly detection in streaming environments using the NYC taxi dataset. Existing literature on anomaly detection in streaming data highlights various methodologies, including statistical tests, window-based techniques, and machine learning models. Traditional methods like the Generalized ESD test have been adapted for streaming data but often require a full historical dataset to function effectively. In contrast, machine learning approaches, particularly those using LSTM networks, are noted for their ability to learn complex patterns and dependencies, offering promising results in real-time applications. In the comparative analysis, LSTM autoencoders significantly outperformed the other methods, achieving an F1-score of 0.22 for anomaly detection, notably higher than the other techniques. This model demonstrated superior capability in capturing temporal dependencies and complex data patterns, making it highly effective for the dynamic and varied data in the NYC taxi dataset. The LSTM autoencoder's advanced pattern recognition and anomaly detection capabilities confirm its suitability for complex, high-velocity streaming data environments. Future research should explore the integration of LSTM autoencoders with other machine-learning techniques to further enhance the accuracy, scalability, and efficiency of anomaly detection systems. This study advances our understanding of scalable machine-learning approaches and underscores the critical importance of selecting appropriate models based on the specific characteristics and challenges of the data involved.
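Autoencoder-based detectors are commonly paired with a thresholding step on reconstruction error. The sketch below shows one such step, flagging points whose error exceeds the mean plus k standard deviations. This rule, the value of k, and the error values are assumptions for illustration; the abstract does not state how the study set its threshold.

```python
import statistics


def flag_anomalies(errors, k=2.0):
    """Return indices whose reconstruction error exceeds mean + k * std.

    Uses the population standard deviation of the given errors.
    """
    mean = statistics.fmean(errors)
    std = statistics.pstdev(errors)
    threshold = mean + k * std
    return [i for i, e in enumerate(errors) if e > threshold]


# Hypothetical per-window reconstruction errors (not from the paper).
errors = [0.10, 0.12, 0.09, 0.11, 0.95, 0.10]
print(flag_anomalies(errors))  # [4]
```

In a streaming setting the mean and standard deviation would typically be maintained over a sliding window rather than recomputed over the full history.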
Deep Learning Incorporated with Augmented Reality Application for Watch Try-On
Andri, Andri; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Alqudah, Mashal Kasem; Alqudah, Musab Kasim; Zakaria, Mohd Zaki; Hisham, Putri Aisha Athira binti
Journal of Applied Data Sciences Vol 6, No 1: JANUARY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i1.529

Abstract

In the dynamic landscape of online shopping, the integration of Augmented Reality (AR) technologies has emerged as a transformative force, redefining the way consumers engage with products in virtual environments. This research project investigates the intersection of deep learning and AR in the context of online shopping, with a particular focus on a Watch Try-On application. The experimentation involves the use of SSD MobileNet models for real-time object detection aimed at enhancing the user experience during online watch shopping. Both the SSD MobileNet V1 and V2 models were trained for 50,000 iterations. SSD MobileNet V1 demonstrated superior results, with a mean average precision (mAP) of 0.9725 and a reduction in total loss from 0.774 to 0.5405. However, its longer training time of 7 hours and 42 minutes prompted the selection of SSD MobileNet V2 for real-time applications due to its faster inference capabilities. Extending beyond traditional online shopping experiences, the research explores the potential of AR technologies to revolutionize product visualization and interaction. The choice of the Vuforia model target for the Watch Try-On application showcases the synergy between deep learning and AR, allowing users to virtually try on watches and visualize them in their real-world environment. The application successfully detects users' hands with high accuracy, creating an immersive and visually enriching experience. In conclusion, this project contributes to the ongoing discourse on the fusion of deep learning and AR for online shopping. The exploration of SSD MobileNet models, coupled with the integration of AR technologies, underscores the potential to elevate the online shopping experience by providing users with dynamic, interactive, and personalized ways to engage with products.
Convolutional Neural Network Based Deep Learning Model for Accurate Classification of Durian Types
Diana, Diana; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Alqudah, Mashal Kasem; Alqudah, Musab Kasim; Zakari, Mohd Zaki; Fuad, Eyna Fahera Binti Eddie
Journal of Applied Data Sciences Vol 6, No 1: JANUARY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i1.480

Abstract

Durian recognition matters to durian enthusiasts, since many people who are not familiar with durian varieties are easily confused, which can expose them to durian fraud. The prototype developed in this research detects and classifies durian fruits into three categories, Musang King, Black Thorn, and D24, which can significantly benefit consumers. The prototype is trained on a dataset of durian images of the Musang King, Black Thorn, and D24 varieties. Preprocessing techniques such as resizing and scaling are applied to enhance the quality and consistency of the dataset. The models chosen to develop this prototype are VGG-16 and Xception, which are compared by their accuracy. The VGG-16 and Xception models achieved accuracies of 56.64% and 92%, respectively. The models used a total of 1,372 durian images across the three classes. Based on the findings, the CNN models for durian classification can be further enhanced by implementing different architectures, techniques, and methods. Moreover, future models can consider real-time image capture and processing capabilities to enhance the practicality of the system for durian consumers. The prototype developed in this study demonstrates the feasibility of using deep learning techniques for accurate and efficient durian classification, paving the way for future advancements in automated fruit grading and quality control systems in the durian industry.
Leveraging Data Analytics for Student Grade Prediction: A Comparative Study of Data Features
Misinem, Misinem; Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Zakaria, Mohd Zaki; Nazmi, Che Mohd Alif
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.442

Abstract

In educational settings, a persistent challenge lies in accurately identifying and supporting students at risk of underperformance or grade retention. Traditional approaches often fall short by applying generalized interventions that fail to address specific academic needs, leading to ineffective outcomes and increased grade repetition. This study advocates for integrating machine learning algorithms into educational assessment practices to address these limitations. By leveraging historical and current performance data, machine learning models can help identify students needing additional support early in their academic journey, allowing for precise and timely interventions. This research examines the effectiveness of three machine learning algorithms: Naive Bayes, Deep Learning, and Decision Trees. Naive Bayes, known for its simplicity and efficiency, is well-suited for initial data screening. Deep Learning excels at uncovering complex patterns in large datasets, making it ideal for nuanced predictions. Decision Trees, with their interpretable and actionable outputs, provide clear decision paths, making them particularly advantageous for educational applications. Among the models tested, the Decision Tree algorithm demonstrated the highest performance, achieving an accuracy rate of 86.68%. This level of accuracy underscores its suitability for educational contexts where decisions need to be based on reliable, interpretable data. The results support the broader application of Decision Tree analysis in educational practices. By implementing this model, educational administrators can better identify at-risk students, tailor interventions to meet individual needs, and ultimately improve student success rates. This study suggests that Decision Trees could become a vital tool in data-driven strategies to enhance student retention and optimize academic outcomes.