Hybrid Approach-RSMOTE for Handling Class Imbalance with Label Noise
Hartono Hartono; Erianto Ongko
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol 8, No 3 (2022): September
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v8i3.23684

Abstract

Class imbalance is a major problem in classification. It arises because real-world datasets are frequently imbalanced, with one class containing more instances than the others. In handling class imbalance, a Hybrid Approach that blends data-level and algorithm-level approaches produces good results. However, classification accuracy is reduced not only by class imbalance but also by the complexity of the data. This complexity gives rise to noisy minority samples that lie between the minority and majority classes. To determine how close minority samples are to their homogeneous and heterogeneous nearest neighbors, it is necessary to calculate the relative density. The closer a minority sample is to its homogeneous nearest neighbors, the greater its relative density and the more likely it is to be in a safe state; otherwise, it is categorized as a noisy sample. This research combines the Hybrid Approach with a self-adaptive Robust SMOTE (RSMOTE), an adaptive variant of SMOTE that applies the concept of relative density when over-sampling minority samples. The research contribution is to implement the Hybrid Approach-RSMOTE for handling class imbalance with noise by using relative density in over-sampling, and to improve classification performance. The results showed that both the Hybrid Approach-RSMOTE and the Hybrid Approach-SMOTE handled class imbalance well. However, the Hybrid Approach-RSMOTE gave better Precision, Recall, F1-Measure, and G-Mean, and the differences were significant. Based on these results, the performance of the Hybrid Approach in handling class imbalance is influenced by the choice of over-sampling method, and RSMOTE can be considered as an over-sampling method within the Hybrid Approach.
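
The relative-density idea described in this abstract can be illustrated with a small sketch. The abstract does not give the formula, so the definition used here (summed distance to the k nearest heterogeneous neighbors divided by the summed distance to the k nearest homogeneous neighbors) is an assumption, as are the function name `relative_density`, the value of `k`, and the toy points.

```python
import math

def relative_density(sample, minority, majority, k=3):
    """Assumed definition: summed distance to the k nearest majority
    (heterogeneous) neighbors divided by the summed distance to the
    k nearest minority (homogeneous) neighbors."""
    def k_nearest_sum(points):
        # Skip the sample itself, then add up the k smallest distances.
        dists = sorted(math.dist(sample, p) for p in points if p != sample)
        return sum(dists[:k])

    homo = k_nearest_sum(minority)
    hetero = k_nearest_sum(majority)
    return hetero / homo if homo > 0 else math.inf

# Toy data (hypothetical): four minority points form a cluster,
# one minority point sits inside the majority cluster.
minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (2.0, 2.0)]
majority = [(2.0, 2.1), (2.1, 2.0), (1.9, 2.0), (2.0, 1.9)]

safe = relative_density((0.0, 0.0), minority, majority)
noisy = relative_density((2.0, 2.0), minority, majority)
print(safe > noisy)  # the clustered sample has the higher relative density
```

Under this assumed definition, a sample surrounded by its own class scores high and a sample embedded in the other class scores low, which matches the safe/noisy distinction the abstract describes.
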
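Classification of Leaf Diseases in Corn Plants Using the Support Vector Machine, K-Nearest Neighbors, and Multilayer Perceptron Algorithms [Klasifikasi Penyakit Daun Pada Tanaman Jagung Menggunakan Algoritma Support Vector Machine, K-Nearest Neighbors dan Multilayer Perceptron]
Jaka Kusuma; Rubianto; Rika Rosnelly; Hartono; B. Herawan Hayadi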
Journal of Applied Computer Science and Technology Vol 4 No 1 (2023): June 2023
Publisher : Indonesian Society of Applied Science

DOI: 10.52158/jacost.v4i1.484

Abstract

Corn is one of the substitute staple foods in Indonesia after rice. Maize crops grown in Indonesia often suffer considerable losses due to plant diseases. Generally, plant diseases first appear as morphological changes in the leaves. Accurate detection and classification of the diseases that appear on the leaves helps prevent the disease from spreading widely. This study compares three classification algorithms (Support Vector Machine, K-Nearest Neighbors, and Multilayer Perceptron) to find the best algorithm for classifying leaf diseases in corn plants, namely Cercospora leaf spot (gray leaf spot), common rust, and northern leaf blight, with the VGG-16 deep learning model used for image feature extraction. The results showed that the Multilayer Perceptron algorithm produced the best values, with accuracy, precision, and recall of 97.4% each.
Classification of Basurek Batik Using Pre-Trained VGG16 and Support Vector Machine
Meli Handayani; Rika Rosnelly; Hartono Hartono
Proceeding of International Conference on Information Science and Technology Innovation (ICoSTEC) Vol. 2 No. 1 (2023): Proceeding of International Conference on Information Science and Technology In
Publisher : Universitas Respati Yogyakarta

DOI: 10.35842/icostec.v2i1.31

Abstract

Among Indonesian batik motifs, the island of Sumatra, especially Bengkulu and Jambi provinces, has a distinctive batik called Basurek batik. This research aims to classify the two batik motifs using the Support Vector Machine (SVM) algorithm. First, we extract features from the batik motif images with a pre-trained VGG-16 model and then use them as the dataset for the SVM classification process. The classification process itself uses linear, polynomial, and sigmoid kernels. We divided the data 90:10 and used 10-fold cross-validation to analyze each training and testing data classification result. For the training data classification, the linear kernel produced the highest accuracy, precision, and recall: 76.4%, 76.5%, and 76.4%. For the testing data classification, both the linear and polynomial kernels produced the best accuracy, precision, and recall: 87.5%, 90%, and 85.5%. On average, combining the training and testing classification results, we found that the linear kernel is the best function for classifying the Basurek batik motif using images collected from the internet.
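
For readers unfamiliar with the three kernels named in this abstract, the sketch below writes out their standard textbook forms. The abstract does not state any hyperparameters, so the parameter names `gamma`, `coef0`, and `degree` and their default values are illustrative assumptions, not the authors' settings.

```python
import math

def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    # k(x, y) = <x, y>
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    # k(x, y) = (gamma * <x, y> + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # k(x, y) = tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)

x, y = (1.0, 2.0), (3.0, 4.0)  # hypothetical two-dimensional features
print(linear_kernel(x, y))      # 11.0
print(polynomial_kernel(x, y))  # 1728.0
```

In the study itself the inputs would be VGG-16 feature vectors rather than two-dimensional toy points.
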
Analysis of Machine Learning Algorithms in Predicting the Flood Status of Jakarta City
Irwan Daniel; Hartono Hartono; Zakarias Situmorang
Proceeding of International Conference on Information Science and Technology Innovation (ICoSTEC) Vol. 2 No. 1 (2023): Proceeding of International Conference on Information Science and Technology In
Publisher : Universitas Respati Yogyakarta

DOI: 10.35842/icostec.v2i1.38

Abstract

By mining the information in a dataset, we can solve prediction problems such as flood status prediction based on floodgate levels using machine learning algorithms. This research employs three machine learning algorithms (K-Nearest Neighbor, Naive Bayes, and Support Vector Machine) to predict flood status using a dataset of DKI Jakarta's floodgate levels. Using 5-fold, 10-fold, and 20-fold cross-validation, we get the highest accuracy (85.096%), f-score (85.1%), precision (85.641%), and recall (85.096%) from the model using the SVM algorithm with a polynomial kernel. On average, the K-NN algorithm performs better than the other algorithms, with an average accuracy of 83.147%, an average f-score of 83.156%, an average precision of 83.566%, and an average recall of 83.147%.
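
The k-fold evaluation mentioned in this abstract can be sketched as an index-splitting routine. This is a minimal illustration with contiguous, unshuffled folds; the function name `k_fold_indices` and the sample count of 100 are hypothetical and are not taken from the paper.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.
    The first n_samples % k folds get one extra sample."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# The three settings named in the abstract, on a hypothetical 100 samples.
for k in (5, 10, 20):
    folds = list(k_fold_indices(100, k))
    print(k, "folds, test-fold size:", len(folds[0][1]))
```

Each sample lands in exactly one test fold, so every model is scored on data it was not trained on; a larger k leaves more data for training in each round at the cost of more training runs.
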
Predicting Children's Talent Based On Hobby Using C4.5 Algorithm And Random Forest
Sugeng Riyadi; Hartono Hartono; Wanayumini Wanayumini
Proceeding of International Conference on Information Science and Technology Innovation (ICoSTEC) Vol. 2 No. 1 (2023): Proceeding of International Conference on Information Science and Technology In
Publisher : Universitas Respati Yogyakarta

DOI: 10.35842/icostec.v2i1.54

Abstract

A person's talent is closely related to intelligence, hobbies, and interests. These factors are the best features to use in a dataset for predicting a child's talent, such as in academics, arts, or sports. This research uses the C4.5 and random forest algorithms in 8 different models to predict a child's talent based on a dataset gained from a survey involving 1601 parents. Each model is evaluated at four training-testing data ratios: 50:50, 60:40, 70:30, and 80:20. We calculate each model's prediction performance using 10-fold and 20-fold cross-validation, with the accuracy, f-score, precision, and recall values as a comparison. The best result for the training evaluation we get is 91.5% for each comparison value from the random forest model (70:30 ratio) using a 20-fold cross-validation. For the testing evaluation, we get 92.7%, 92.8%, 92.8%, and 92.7% from the random forest model (50:50 ratio). The worst testing evaluation we get is 81.7% for each comparison value from the C4.5 model (50:50 ratio) using a 20-fold cross-validation. For the testing evaluation, we get 89.2%, 89.2%, 89.3%, and 89.2% from the C4.5 model (50:50 ratio).
Applying Smart Farming as an Effort to Modernize Chili Farming [Penerapan Smart Farming Sebagai Upaya Modernisasi Pertanian Cabai]
Rahman, Sayuti; Indrawati, Asmah; Sembiring, Arnes; Hartono, Hartono; Zuhanda, Muhammad Khahfi; Ongko, Erianto
Prioritas: Jurnal Pengabdian Kepada Masyarakat Vol. 6 No. 02 (2024): September 2024 Edition
Publisher : Universitas Harapan Medan

DOI: 10.35447/prioritas.v6i02.1050

Abstract

Chili is a horticultural commodity with high economic value, but its productivity is often disrupted by various leaf diseases caused by pests, such as leaf spot, Fusarium wilt, powdery mildew, and yellow virus. These diseases not only affect harvest quality but also cause significant economic losses for farmers. To address this problem, a community service program was carried out that implemented Convolutional Neural Network (CNN) technology for fast and accurate classification of chili leaf diseases. The method involved field observation to identify the problems faced by farmers in Lubuk Cuik Village, Batu Bara, North Sumatra. Images of infected chili leaves were collected and used to train the CNN model. The resulting model, efficientChiliNet, classified chili leaf diseases with a training accuracy of 99.8% and a validation accuracy of 96.5%. Web-based and desktop applications were then built to make it easier for farmers to diagnose chili leaf diseases on their own. The application was also introduced to farmers through training to ensure optimal adoption of the technology. The results of this program show that CNN-based technology can provide an effective solution for identifying chili leaf diseases and can help farmers improve agricultural productivity. The recommended next steps are to develop additional application features that provide pest-management guidance and to integrate Internet of Things (IoT) technology for real-time environmental monitoring. This approach is expected to bring about sustainable agricultural modernization based on smart farming.
Challenges and Strategies in Forensic Investigation: Leveraging Technology for Digital Security Using Log/Event Analysis Method
Ammar Yasir Nasution; Hartono Hartono; Rika Rosnelly
JURNAL TEKNIK INFORMATIKA Vol 18, No 1: JURNAL TEKNIK INFORMATIKA
Publisher : Department of Informatics, Universitas Islam Negeri Syarif Hidayatullah

DOI: 10.15408/jti.v18i1.42815

Abstract

Cybersecurity threats continue to evolve, necessitating advanced techniques for network anomaly detection. This study developed a comprehensive methodology for detecting network anomalies by leveraging sophisticated log and event analysis using machine learning algorithms. By employing a Naive Bayes classification approach on a synthetic cybersecurity dataset comprising 40,000 entries with 25 unique features, the research aimed to enhance anomaly detection precision. The methodology involved meticulous data preprocessing, feature selection, and strategic model validation techniques, including cross-validation and external benchmarking. Comparative analysis with K-Nearest Neighbors and Support Vector Machine algorithms demonstrated the Naive Bayes method's superior performance, achieving a classification accuracy of 94.8%, an Area Under the Curve (AUC) of 0.949, and a Matthews Correlation Coefficient of 0.896. The study identified critical parameters influencing anomaly detection, such as source port characteristics and attack signatures. These findings contribute significant insights into machine learning-based network security strategies, offering a robust framework for early threat identification and mitigation.
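
The Matthews Correlation Coefficient reported in this abstract is computed from the four cells of a binary confusion matrix. The sketch below uses the standard formula; the confusion-matrix counts are hypothetical and are not the paper's results.

```python
import math

def matthews_corrcoef(tp, fp, fn, tn):
    """Binary MCC: (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when any marginal total is zero (a common convention)."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# Hypothetical counts for an anomaly detector on 200 log entries.
print(matthews_corrcoef(tp=90, fp=10, fn=10, tn=90))  # 0.8
```

Unlike accuracy, MCC uses all four cells, so it stays informative when anomalies are much rarer than normal events.
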
Impact of Adaptive Synthetic on Naïve Bayes Accuracy in Imbalanced Anemia Detection Datasets
Zuhanda, Muhammad Khahfi; Lisya Permata; Hartono; Erianto Ongko; Desniarti
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 9 No 1 (2025): February 2025
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v9i1.6031

Abstract

This research aims to analyze the impact of the Adaptive Synthetic (ADASYN) oversampling technique on the performance of the Naïve Bayes classification algorithm on datasets with class imbalance. Class imbalance is a common problem in machine learning that can bias prediction results, especially for minority classes. ADASYN is an oversampling method that focuses on adaptively synthesizing new data for minority classes. In this study, the performance of the Naïve Bayes algorithm was tested on an Anemia Diagnosis dataset before and after applying ADASYN. This dataset contains 104 instances, 5 attributes, and 2 classes, and has an imbalance ratio of 3. The evaluation compared accuracy, confusion matrix, precision, recall, and F1-score to obtain a more comprehensive picture of how effectively ADASYN improves Naïve Bayes. The results show that the performance of an oversampling method depends on the imbalance ratio, so it is important to ensure that oversampling does not cause overfitting; ADASYN addresses this by synthesizing samples only from selected neighbors. ADASYN significantly increased accuracy from 0.57 to 0.78, precision from 0.17 to 0.74, recall from 0.20 to 0.88, and F1-Score from 0.18 to 0.80. In this study, we also compared the application of ADASYN and SMOTE to the Naïve Bayes algorithm. The results show that ADASYN outperforms SMOTE across all key metrics (accuracy, precision, recall, and F1-Score), and the accuracy improvements were statistically significant (p-value = 0.00903).
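
Two figures in this abstract can be checked with simple arithmetic. The sketch assumes the imbalance ratio is the majority count divided by the minority count (the abstract does not define it or list class counts) and uses the standard harmonic-mean definition of F1. The function names are illustrative.

```python
def split_by_imbalance_ratio(n_total, ratio):
    """Class sizes for a two-class dataset, assuming
    ratio = majority_count / minority_count."""
    minority = n_total / (ratio + 1)
    return n_total - minority, minority

def f1_score(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

majority, minority = split_by_imbalance_ratio(104, 3)
print(majority, minority)  # 78.0 26.0

before = f1_score(0.17, 0.20)  # precision and recall before ADASYN
after = f1_score(0.74, 0.88)   # precision and recall after ADASYN
print(round(before, 2), round(after, 2))  # 0.18 0.8
```

Under that assumed definition, 104 instances at a ratio of 3 correspond to 78 majority and 26 minority samples, and the reported F1-Scores of 0.18 and 0.80 agree with the reported precision and recall values.
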
A Hybrid GDHS and GBDT Approach for Handling Multi-Class Imbalanced Data Classification
Hartono, Hartono; Zuhanda, Muhammad Khahfi; Syah, Rahmad; Rahman, Sayuti; Ongko, Erianto
International Journal of Engineering, Science and Information Technology Vol 5, No 3 (2025)
Publisher : Malikussaleh University, Aceh, Indonesia

DOI: 10.52088/ijesty.v5i3.894

Abstract

Multiclass imbalanced classification remains a significant challenge in machine learning, particularly when datasets exhibit high Imbalance Ratios (IR) and overlapping feature distributions. Traditional classifiers often fail to accurately represent minority classes, leading to biased models and suboptimal performance. This study proposes a hybrid approach combining Generalization potential and learning Difficulty-based Hybrid Sampling (GDHS) as a preprocessing technique with Gradient Boosting Decision Tree (GBDT) as the classifier. GDHS enhances minority class representation through intelligent oversampling while cleaning majority classes to reduce noise and class overlap. GBDT is then applied to the resampled dataset, leveraging its adaptive learning capabilities. The performance of the proposed GDHS+GBDT model was evaluated across six benchmark datasets with varying IR levels, using metrics such as Matthews Correlation Coefficient (MCC), Precision, Recall, and F-Value. Results show that GDHS+GBDT consistently outperforms other methods, including SMOTE+XGBoost, CatBoost, and Select-SMOTE+LightGBM, particularly on high-IR datasets like Red Wine Quality (IR = 68.10) and Page-Blocks (IR = 188.72). The method improves classification performance, especially in detecting minority classes, while maintaining high accuracy.
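
For a multi-class dataset, the Imbalance Ratio is commonly taken as the size of the largest class divided by the size of the smallest. The sketch below applies that definition, which is an assumption here because the abstract does not spell it out. The per-class counts are those commonly reported for the UCI Red Wine Quality data, not figures from the abstract, and should be treated as an assumption as well.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Largest class count divided by smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Assumed per-class sample counts, keyed by quality score (1599 in total).
red_wine_counts = {3: 10, 4: 53, 5: 681, 6: 638, 7: 199, 8: 18}
labels = [q for q, n in red_wine_counts.items() for _ in range(n)]

print(round(imbalance_ratio(labels), 2))  # 68.1
```

With these counts the ratio comes out at 68.1, in line with the IR = 68.10 quoted for Red Wine Quality.
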