Contact Name
Aji Prasetya Wibawa
Contact Email
keds.journal@um.ac.id
Phone
+62818539333
Journal Mail Official
keds.journal@um.ac.id
Editorial Address
Semarang St. No 5, Malang, Indonesia
Location
Kota Malang,
Jawa Timur
INDONESIA
Knowledge Engineering and Data Science
ISSN: - | EISSN: 2597-4637 | DOI: https://doi.org/10.17977
Knowledge Engineering and Data Science (2597-4637), KEDS, brings together researchers, industry practitioners, and potential users, to promote collaborations, exchange ideas and practices, discuss new opportunities, and investigate analytics frameworks on data-driven and knowledge base systems.
Articles 98 Documents
Random Forest Algorithm to Measure the Air Pollution Standard Index
Setiawan, Ariyono; Wibowo, Untung Lestari; Mubarok, Ahmad; Larasati, Khoirunnisa; Hammad, Jehad A.H
Knowledge Engineering and Data Science Vol 7, No 1 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i12024p86-100

Abstract

This study uses the Random Forest algorithm to measure and predict the Air Pollution Standard Index (APSI) at Blimbing Banyuwangi Airport. Air pollution data, including concentrations of O3, CO, NO2, SO2, PM2.5, and PM10, were collected from air monitoring stations at the airport from April 15 to 30, 2024. APSI values were calculated using the formulas established by the relevant authorities. Data analysis utilized statistical approaches and computational algorithms. The findings reveal that air quality at the airport is generally "Moderate," with occasional "Good" days. The Random Forest algorithm effectively predicts APSI based on existing pollution data. These results provide insights for improving air pollution management at the airport and surrounding areas, emphasizing the need for continuous air quality monitoring. Days classified as "Moderate" suggest health risks for sensitive groups, indicating the need for targeted mitigation strategies. Recommendations include increasing green spaces, optimizing flight schedules to reduce peak pollution, and raising public awareness about air quality. The effectiveness of the Random Forest algorithm suggests its potential application at other airports for proactive air quality management. Future research could integrate real-time data and advanced machine learning models for more accurate and timely APSI predictions.
Debtor Eligibility Prediction Using Deep Learning with Chatbot-Based Testing
Noviania, Reski; Sela, Enny Itje; Latumakulita, Luther Alexander; Sentinuwo, Steven R.
Knowledge Engineering and Data Science Vol 7, No 2 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i22024p128-138

Abstract

Predicting debtor eligibility is essential for effective risk management and minimizing bad credit risks. However, financial institutions face challenges such as imbalanced data, inefficient feature selection, and limited user accessibility. This study combines Recursive Feature Elimination (RFE) and Deep Learning (DL) to improve prediction accuracy and integrates a chatbot interface for user-friendly testing. RFE effectively identifies critical features, while the DL model achieves a validation accuracy of 97.62%, surpassing previous studies with less comprehensive methodologies. The chatbot's novel design not only ensures accessibility but also enhances user engagement through flexible input options, such as approximate values, enabling non-experts to interact seamlessly with the system. For financial institutions, this chatbot-based testing approach offers practical benefits by streamlining debtor evaluation processes, reducing dependency on manual assessments, and providing consistent, scalable, and efficient solutions for credit risk management. It allows institutions to handle inquiries outside business hours, ensuring a continuous service flow. Furthermore, the system’s flexibility supports better customer interaction, increasing trust and transparency. By combining advanced machine learning with accessible interfaces, this study offers a scalable solution to improve the precision and practicality of debtor eligibility assessments, making it a valuable tool for modern financial institutions.
Manifold Learning and Undersampling Approaches for Imbalanced Class Sentiment Classification
Jumansyah, L. M. Risman Dwi; Soleh, Agus Mohamad; Syafitri, Utami Dyah
Knowledge Engineering and Data Science Vol 7, No 2 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i22024p139-151

Abstract

Movie reviews are crucial in determining a film's success by influencing audience decisions. Automating sentiment classification is essential for efficient public opinion analysis, but it faces challenges such as high-dimensional data and imbalanced class distributions. This study addresses these issues by applying manifold learning techniques, Principal Component Analysis (PCA) and Laplacian Eigenmaps (LE), to reduce data complexity, and undersampling strategies, Random Undersampling (RUS) and EasyEnsemble, to balance the data and improve predictions for both sentiment classes. On reviews of The Raid 2: Berandal, EasyEnsemble achieved the highest average G-Mean of 0.694 using Term Frequency-Inverse Document Frequency (TF-IDF) features with a linear kernel and no dimensionality reduction. RUS provided balanced but inconsistent results, while Random Oversampling (ROS) combined with PCA (85% cumulative variance) improved predictions for negative reviews. Laplacian Eigenmaps were effective for negative reviews with 500 dimensions but less accurate for positive ones. This study highlights EasyEnsemble's superior performance in addressing class imbalance, though optimization with manifold learning remains challenging.
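The G-Mean reported in the abstract above is commonly defined, for two classes, as the geometric mean of the recall on each class (sensitivity and specificity). The following is a minimal Python sketch under that assumption; the function name `g_mean` and the labels are hypothetical illustrations, not the study's code or data.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of recall on the positive and negative classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical labels: 4 positive reviews, 6 negative reviews.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
score = g_mean(y_true, y_pred)  # sqrt(0.75 * 0.5)
```

Because the two recalls are multiplied, a classifier that ignores one class scores zero regardless of its overall accuracy, which is why the metric suits imbalanced data.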
A Novel Approach to Defect Detection in Arabica Coffee Beans Using Deep Learning: Investigating Data Augmentation and Model Optimization
Ardian, Yusriel; Irawan, Novta Danyel; Sutoko, Sutoko; Astawa, I Nyoman Gede Arya; Purnama, Ida Bagus Irawan; Dwiyanto, Felix Andika
Knowledge Engineering and Data Science Vol 7, No 1 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i12024p117-127

Abstract

Arabica coffee beans command high market value because of their taste and quality, but defects such as wholly and partially black beans can lower product standards, especially in the premium coffee sector. Manual defect detection is slow and inefficient. This study aims to bridge the knowledge gap on automated detection and recognition of defects in Arabica coffee beans by creating and optimizing a CNN model based on a modified VGG16 architecture. The model applies data augmentation, including rotation and cropping, and Bayesian hyperparameter optimization to improve defect detectability and shorten training. During testing, the proposed model detected defects with a 97.29% confidence level, which was higher than that of the modified VGG16 and Slim-CNN models. A second optimization targeted the model's practical applicability and reduced training time by approximately 30%. These findings offer a consistent and effective way to automate quality control in mass coffee production. The model's potential to detect defects in other agricultural items makes it attractive, and it serves as a practical example of how AI can improve the management of inspection processes. The research further enriches the study of deep learning applications in agriculture by demonstrating how an optimized convolutional neural network can efficiently address specific defect detection problems.
Constructing Qur’an Recitation Classification using Alexnet Algorithm
Rosyid, Harits Ar; Abdullah, Dzulkifli; Alqahtani, Mohammed S.
Knowledge Engineering and Data Science Vol 7, No 2 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i22024p152-163

Abstract

The growing demand for accurate and efficient methods of Qur'an recitation classification highlights the limitations of existing models, particularly in assisting the memorization process. This study aims to address these challenges by implementing the AlexNet Convolutional Neural Network architecture, widely recognized for its effectiveness in image classification, to classify Qur'an recitations using the Mel-Frequency Cepstral Coefficient (MFCC) as the feature extraction method. The research involves several stages, including data collection, preprocessing (audio segmentation by verse), data augmentation, feature extraction, and classification using the AlexNet architecture, followed by performance evaluation. Key results demonstrate that the combination of MFCC and AlexNet yields promising accuracy in classifying Surah Al-Ikhlas recitations, suggesting its potential application for automatic reading correction. This approach significantly improves over traditional methods, contributing to more effective tools for Qur'an memorization assistance. Future work could explore its application in other contexts and address potential challenges related to varying audio quality.
Deep Learning Approach for Dental Anomalies X-ray Imaging using YOLOv8
Ismail, Amelia Ritahani; Taseen, Md Salim Sadman
Knowledge Engineering and Data Science Vol 7, No 2 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i22024p164-175

Abstract

Dental X-ray imaging is a critical diagnostic tool for identifying various dental anomalies. However, manual interpretation is time-consuming, prone to human error, and requires specialized expertise. Deep learning models, particularly object detection frameworks like YOLO, have demonstrated promising results in automating medical image analysis. This study aims to develop and evaluate a YOLOv8-based deep learning model for automated detection and classification of 14 dental anomaly categories, including Caries, Crowns, Fillings, Implants, and Periapical lesions. The proposed approach addresses limitations in previous YOLO versions by leveraging anchor-free detection and enhanced feature extraction for improved accuracy. The model was trained on a dataset of annotated dental X-ray images and preprocessed with data augmentation techniques to improve generalization. Performance was evaluated using Precision, Recall, F1-score, and Mean Average Precision (mAP). Additional insights were obtained from confusion matrices, precision-recall curves, and training-validation loss curves. The model achieved high precision in detecting Implants (0.90), Crowns (0.89), and Root Canal Treatment (0.69), demonstrating strong potential for clinical applications. However, Caries (0.30) and Periapical lesions (0.15) were detected with lower accuracy, indicating the need for further optimization. Analysis of training loss curves and label distributions suggested that class imbalance and anomaly co-occurrence influenced detection performance. YOLOv8 presents a promising AI-based solution for dental anomaly detection, capable of improving diagnostic efficiency and accuracy in clinical practice. The model’s integration into dental healthcare systems can reduce radiologists' workload and enhance early disease detection, particularly in resource-limited settings.
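The per-class Precision, Recall, and F1-score named in the abstract above follow from counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard per-class detection metrics from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts for one class: 9 correct detections,
# 1 spurious detection, 3 missed instances.
p, r, f1 = precision_recall_f1(tp=9, fp=1, fn=3)
```

A class can show high precision while recall stays low if many instances are missed, so the two values are best read together.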
Optimal Strategy for Handling Unbalanced Medical Datasets: Performance Evaluation of K-NN Algorithm Using Sampling Techniques
Salim, Yulita; Utami, Aulia Putri; Manga’, Abdul Rachman; Aziz, Huzain; Admojo, Fadhila Tangguh
Knowledge Engineering and Data Science Vol 7, No 2 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i22024p176-186

Abstract

This study addresses the critical role of medical image classification in enhancing healthcare effectiveness and tackling the challenges of imbalanced medical datasets. It focuses on optimizing classification performance by integrating Canny edge detection for segmentation and Hu-moment feature extraction and applying oversampling and undersampling techniques. Five diverse medical datasets were utilized, covering Alzheimer’s and Parkinson’s diseases, COVID-19, brain tumours, and lung cancer. The K-Nearest Neighbors (K-NN) algorithm was implemented to enhance classification accuracy, aiming to develop a more robust framework for medical image analysis. The evaluation, conducted using cross-validation, demonstrated notable improvements in key metrics. Specifically, oversampling significantly enhanced lung cancer detection accuracy, while undersampling contributed to balanced performance gains in the COVID-19 class. Metrics, including accuracy, precision, recall, and F1-score, provided insights into the model’s effectiveness. These findings highlight the positive impact of data balancing techniques on K-NN performance in imbalanced medical image classification. Continued research is essential to refine these techniques and improve medical diagnostics.
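As an illustration of the oversampling and undersampling mentioned in the abstract above, the sketch below implements plain random resampling in Python. It assumes the simplest variant (random duplication or random removal per class); the abstract does not state which specific techniques the authors used, and the `resample` helper and its data are hypothetical.

```python
import random
from collections import defaultdict

def resample(samples, labels, strategy="under", seed=0):
    """Balance classes by random under- or oversampling."""
    if strategy not in ("under", "over"):
        raise ValueError("strategy must be 'under' or 'over'")
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    sizes = [len(items) for items in by_class.values()]
    # Undersampling shrinks every class to the smallest class size;
    # oversampling grows every class to the largest class size.
    target = min(sizes) if strategy == "under" else max(sizes)
    out_x, out_y = [], []
    for y, items in by_class.items():
        if len(items) >= target:
            chosen = rng.sample(items, target)  # without replacement
        else:
            # Keep all originals and add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        out_x.extend(chosen)
        out_y.extend([y] * target)
    return out_x, out_y

# Hypothetical imbalanced data: 8 samples of class 0, 2 of class 1.
data = list(range(10))
labels = [0] * 8 + [1] * 2
under_x, under_y = resample(data, labels, strategy="under")
over_x, over_y = resample(data, labels, strategy="over")
```

When cross-validation is used, resampling is normally applied to the training folds only, so that duplicated samples do not leak into the evaluation fold.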
A Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) Approach for Identifying Potential Villages in Buleleng Regency
Amalina, Dina Nur; Fauzan, Achmad
Knowledge Engineering and Data Science Vol 7, No 2 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i22024p187-199

Abstract

Buleleng Regency, located in Bali Province, possesses diverse village potential, including agricultural production and tourist attractions. However, this potential has not been fully optimized. Therefore, it is important to enhance village potential by clustering villages based on their specific characteristics to identify and prioritize those requiring special attention. This approach aims to promote equitable village development and reduce poverty levels. This study clusters villages in Buleleng Regency based on their potential using the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) method. The data comprise village potential data obtained from the Buleleng Regency Statistics Office (BPS) for all districts and the Statistical Service Information System. The variables cover population, communication, tourism, trade, health, religion, social affairs, and public welfare. Parameter tuning yielded an optimal minimum cluster size of 5 and minimum samples of 2, which produced two main clusters. The first cluster comprises six villages, while the second includes 118 villages. Additionally, a noise cluster representing outliers, consisting of 24 villages, was identified. The findings indicate that the first cluster exhibits higher village potential than the second cluster. Based on these results, it is recommended that the government prioritize the second cluster when designing and implementing targeted programs and policies to reduce poverty by developing village potential.
