Articles
Combining Hybrid Approach Redefinition-Multiclass Imbalance (HAR-MI) and Hybrid Sampling in Handling Multi-Class Imbalance and Overlapping
Hartono Hartono;
Erianto Ongko
JOIV : International Journal on Informatics Visualization Vol 5, No 1 (2021)
Publisher : Society of Visual Informatics
DOI: 10.30630/joiv.5.1.420
The class imbalance problem in multi-class datasets is more challenging to manage than in two-class datasets, and it becomes more complicated when accompanied by overlapping. One method that has proven reliable in dealing with this problem is the Hybrid Approach Redefinition-Multiclass Imbalance (HAR-MI) method, a hybrid approach that combines sampling and classifier ensembles. In terms of diversity among classifiers, a hybrid approach that combines sampling and classifier ensembles gives better results, and HAR-MI provides excellent results in handling multi-class imbalance. The HAR-MI method uses SMOTE to increase the number of samples in the minority class. However, SMOTE has a weakness: on an extremely imbalanced dataset with a large number of attributes, it tends to over-fit. To overcome this over-fitting, the Hybrid Sampling method was proposed. Combining HAR-MI with Hybrid Sampling increases the number of samples in the minority class and at the same time reduces the number of noise samples in the majority class. The preprocessing stage of HAR-MI uses the Minimizing Overlapping Selection under Hybrid Sampling (MOSHS) method, and the processing stage uses Different Contribution Sampling. The results are compared with those obtained using neighbourhood-based under-sampling. Overlapping and classifier performance are measured using the Augmented R-Value, the Matthews Correlation Coefficient (MCC), Precision, Recall, and F-Value. The results show that HAR-MI with Hybrid Sampling gives better results in terms of Augmented R-Value, Precision, Recall, and F-Value.
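The abstract above reports Precision, Recall, F-Value, and MCC. As a rough illustration of how such scores can be derived from a multi-class confusion matrix, here is a minimal sketch. The function names and the example matrix are hypothetical and not taken from the paper, macro-averaging is an assumption (the abstract does not say how the per-class scores are aggregated), and the MCC uses the standard multi-class generalisation.

```python
from math import sqrt


def macro_metrics(confusion):
    """Macro-averaged (precision, recall, F-value).

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    precisions, recalls, f_values = [], [], []
    for c in range(k):
        tp = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(k))  # column sum
        actual = sum(confusion[c])                           # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f_values.append(f)
    return sum(precisions) / k, sum(recalls) / k, sum(f_values) / k


def multiclass_mcc(confusion):
    """Multi-class Matthews Correlation Coefficient from a confusion matrix."""
    k = len(confusion)
    s = sum(sum(row) for row in confusion)              # total samples
    c = sum(confusion[i][i] for i in range(k))          # correctly classified
    t = [sum(confusion[i]) for i in range(k)]           # true count per class
    p = [sum(confusion[r][j] for r in range(k)) for j in range(k)]  # predicted count per class
    numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
    denominator = sqrt((s * s - sum(pk * pk for pk in p)) *
                       (s * s - sum(tk * tk for tk in t)))
    return numerator / denominator if denominator else 0.0


# Hypothetical 3-class confusion matrix with one large and two small classes.
example = [[50, 3, 2],
           [4, 10, 1],
           [3, 1, 6]]
print(macro_metrics(example))
print(multiclass_mcc(example))
```

Macro-averaging gives every class the same weight, so weak results on a small minority class are not hidden by a large majority class.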
HAR-MI method for multi-class imbalanced datasets
H. Hartono;
Yeni Risyani;
Erianto Ongko;
Dahlan Abdullah
TELKOMNIKA (Telecommunication Computing Electronics and Control) Vol 18, No 2: April 2020
Publisher : Universitas Ahmad Dahlan
DOI: 10.12928/telkomnika.v18i2.14818
Research on multi-class imbalance faces obstacles in the form of poor data diversity and a large number of classifiers. The Hybrid Approach Redefinition-Multiclass Imbalance (HAR-MI) method is a hybrid ensemble method developed from the Hybrid Approach Redefinition (HAR) method. This study compares its results with those of the Dynamic Ensemble Selection-Multiclass Imbalance (DES-MI) method in handling multi-class imbalance. In the HAR-MI method, the preprocessing stage is carried out using the random balance ensembles method and dynamic ensemble selection to produce a candidate ensemble, and the processing stage is carried out using different contribution sampling and dynamic ensemble selection to produce a candidate ensemble. The research was conducted using multi-class imbalanced datasets sourced from the KEEL Repository. The results show that the HAR-MI method can overcome multi-class imbalance with better data diversity, a smaller number of classifiers, and better classifier performance than the DES-MI method. These results were tested with a Wilcoxon signed-rank test, which showed the superiority of the HAR-MI method over the DES-MI method.
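The Wilcoxon signed-rank test mentioned above compares two methods on paired results, for example one score per dataset. The sketch below computes only the rank sums W+ and W-. The function name and the scores are hypothetical, zero differences are dropped and tied absolute differences share an average rank (a common convention, assumed here), and the p-value lookup is left out.

```python
def wilcoxon_signed_rank(scores_a, scores_b):
    """Return (W_plus, W_minus) for paired scores of method A and method B."""
    # Paired differences; pairs with no difference carry no information.
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    ordered = sorted(diffs, key=abs)

    # Assign 1-based ranks by absolute difference, averaging over ties.
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[pos] = average_rank
        i = j + 1

    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical per-dataset scores (in percent) for two methods.
method_a = [91, 84, 77, 88, 93]
method_b = [89, 85, 70, 88, 90]
print(wilcoxon_signed_rank(method_a, method_b))
```

A W+ that is much larger than W- suggests that method A tends to score higher. Whether the difference is significant depends on the sample size and the chosen significance level.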
Hybrid approach redefinition-multi class with resampling and feature selection for multi-class imbalance with overlapping and noise
Erianto Ongko;
Hartono Hartono
Bulletin of Electrical Engineering and Informatics Vol 10, No 3: June 2021
Publisher : Institute of Advanced Engineering and Science
DOI: 10.11591/eei.v10i3.3057
Class imbalance and overlapping in multi-class data can reduce the performance and accuracy of classification. Noise must also be considered because it too can reduce classification performance. This paper proposes a method for improving the performance of hybrid approach redefinition-multi class (HAR-MI) using a resampling algorithm and feature selection. A resampling algorithm can overcome the problem of noise but cannot handle overlapping well. Feature selection is good at dealing with overlapping but can degrade in quality when noise is present. The HAR-MI approach deals with multi-class imbalance issues, but it has some drawbacks when dealing with overlapping. The contribution of this paper is a new approach for dealing with class imbalance, overlapping, and noise in multi-class data. This is accomplished by employing minimizing overlapping selection (MOSS) as an ensemble learning algorithm and a preprocessing technique in HAR-MI, and by employing multi-class combination cleaning and resampling (MC-CCR) as the resampling algorithm at the processing stage. When evaluated on overlapping and classifier performance, the proposed method produces good results, as evidenced by higher augmented R-value, class average accuracy, class balance accuracy, multi-class G-mean, and confusion entropy.
Hybrid approach redefinition with cluster-based instance selection in handling class imbalance problem
Hartono Hartono;
Erianto Ongko;
Dahlan Abdullah
International Journal of Advances in Intelligent Informatics Vol 7, No 3 (2021): November 2021
Publisher : Universitas Ahmad Dahlan
DOI: 10.26555/ijain.v7i3.515
Class imbalance problems often occur in the classification process. They are characterized by one class having far more instances than the other classes. This problem tends to cause low accuracy on minority classes, which have fewer instances, and causes important information about minority classes to be lost. Various methods have been applied to overcome the class imbalance problem. One of them is the Hybrid Approach Redefinition method, which is a hybrid ensemble method. Attention to classifier performance has led to an understanding of the importance of selecting the instances that will be used by a classifier. In the classic Hybrid Approach Redefinition method, selection is done randomly using the Random Under Sampling approach, and it is interesting to study what performance is obtained if the sampling process is instead cluster-based, selecting from the existing instances. The purpose of this study is to apply the Hybrid Approach Redefinition method with the Cluster-Based Instance Selection (CBIS) approach in order to obtain better classifier performance. The results show that Hybrid Approach Redefinition with cluster-based instance selection gives better results in the number of classifiers, data diversity, and classifier performance than the classic Hybrid Approach Redefinition.
Biased support vector machine and weighted-smote in handling class imbalance problem
Hartono Hartono;
Opim Salim Sitompul;
Tulus Tulus;
Erna Budhiarti Nababan
International Journal of Advances in Intelligent Informatics Vol 4, No 1 (2018): March 2018
Publisher : Universitas Ahmad Dahlan
DOI: 10.26555/ijain.v4i1.146
Class imbalance occurs when one class has far more instances than the other classes. This major machine learning problem can affect prediction accuracy. The Support Vector Machine (SVM) is a robust and precise method for handling the class imbalance problem but is weak when the data distribution is biased, so the Biased Support Vector Machine (BSVM) became a popular choice for solving the problem. BSVM provides better control of sensitivity yet lacks accuracy compared to the general SVM. This study proposes the integration of BSVM and SMOTEBoost to handle the class imbalance problem. Non Support Vector (NSV) sets from negative samples and Support Vector (SV) sets from positive samples undergo a Weighted-SMOTE process. The results indicate that the implementation of Biased Support Vector Machine and Weighted-SMOTE achieves better accuracy and sensitivity.
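For readers unfamiliar with the oversampling step, the sketch below shows the interpolation idea behind plain SMOTE: a synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. This is not the Weighted-SMOTE variant or the BSVM integration described in the abstract. The function name, parameters, and data points are hypothetical.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority-class points.

    minority: list of equal-length numeric tuples (at least two points).
    k: number of nearest minority neighbours to choose from.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # All other minority points, nearest first (squared Euclidean distance).
        others = [p for i, p in enumerate(minority) if i != base_index]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature minority class.
minority_points = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
for point in smote(minority_points, n_new=3):
    print(point)
```

Because every synthetic point lies between two existing minority points, the method adds no information from outside the minority region. This is one reason variants such as the weighted scheme above adjust how samples are generated.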
Hybrid approach redefinition with progressive boosting for class imbalance problem
Hartono Hartono;
Erianto Ongko
Science in Information Technology Letters Vol 1, No 1: May 2020
Publisher : Association for Scientific Computing Electronics and Engineering (ASCEE)
DOI: 10.31763/sitech.v1i1.34
Class imbalance problems in data classification have received attention from many researchers because class imbalance affects the accuracy of the classification results. Under class imbalance, the minority class, which is the class with fewer instances, tends to be ignored even though it is an interesting class to observe. In overcoming the class imbalance problem, it is necessary to pay attention to data diversity, the number of classifiers, and classification performance. Several methods have been proposed to overcome the class imbalance problem, one of which is the Hybrid Approach Redefinition method. This is a good hybrid ensemble method for dealing with class imbalance problems, providing useful data diversity and a smaller number of classifiers. This research modifies the Hybrid Approach Redefinition by replacing SMOTEBoost with Progressive Boosting to obtain better data diversity, a smaller number of classifiers, and better performance. The study tests the handling of class imbalance problems using datasets sourced from the KEEL-Dataset Repository. The results indicate that the Hybrid Approach Redefinition with Progressive Boosting provides better results in the number of classifiers, data diversity, and classification performance.
Penerapan Algoritma C4.5 Dalam Memprediksi Ketersediaan Uang Pada Mesin ATM [Application of the C4.5 Algorithm in Predicting Cash Availability in ATM Machines]
Firman Syahputra;
Hartono Hartono;
Rika Rosnelly
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 5, No 2 (2021): April 2021
Publisher : STMIK Budi Darma
DOI: 10.30865/mib.v5i2.2933
This study aims to evaluate the availability of money in ATM machines using data mining. Data mining with the C4.5 algorithm is used to predict cash demand, or total cash withdrawals, at ATMs in order to determine the need for ATM cash based on cash transaction data. It is hoped that this forecasting can help the monitoring department make decisions about the amount of money that must be allocated to each ATM machine. The results of this study are expected to assist the ATM management unit in optimizing and monitoring the availability of money at an ATM machine for cash needs, so that it can provide optimal service to customers. The C4.5 algorithm forms a decision tree, and the decision tree then generates new knowledge. The results of the test matched the data on the availability of money at the ATM machine. The results of implementing the C4.5 method on the availability of money at the ATM machine are seen from the travel time to the ATM location and the remaining balance in the machine. The resulting decision tree uses the balance variable as the root, travel time as the Level 1 branch with the values fast, medium, and long, and the bank as the branch at the last level (Level 2). The C4.5 algorithm was then tested using K-Fold Cross Validation with fold = 10, giving an accuracy of 85%, a Precision of 80%, and a Recall of 66.67%. The AUC (Area Under Curve) value is 0.833; the closer the AUC value is to 1, the better the accuracy level.
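C4.5 chooses which attribute to split on using the gain ratio, that is, information gain divided by split information. The sketch below shows that calculation on a toy dataset. The attribute values and labels are hypothetical and only loosely modelled on the balance and travel-time variables named in the abstract. They are not the study's data.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute itself
    return info_gain / split_info if split_info else 0.0


# Hypothetical toy records: remaining balance, travel time, and a refill label.
balance = ["low", "low", "low", "high", "high", "high"]
travel_time = ["fast", "medium", "long", "fast", "medium", "long"]
label = ["refill", "refill", "refill", "ok", "ok", "refill"]

print("balance:", gain_ratio(balance, label))
print("travel_time:", gain_ratio(travel_time, label))
```

The attribute with the highest gain ratio becomes the split at the current node, and the procedure repeats on each branch.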
Analisa Association Rule Pada Algoritma Apriori Untuk Minat Pembelian Alat Kesehatan [Association Rule Analysis with the Apriori Algorithm for Purchase Interest in Medical Devices]
Andi Rahmadsyah;
Hartono Hartono;
Rika Rosnelly
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 5, No 1 (2021): Januari 2021
Publisher : STMIK Budi Darma
DOI: 10.30865/mib.v5i1.2658
Competition in the business world, especially in the medical device industry, requires developers to find an accurate strategy that can increase sales of goods. One way to address this is to keep various types of medical devices available in the warehouse. To find out which medical devices consumers purchase, market basket analysis, namely the analysis of consumer buying habits, is carried out. To make it easier for companies to determine buyers' interest in medical devices, a data mining method is needed. This study uses the Apriori algorithm, which works from the purchases made by consumers and the relationships between the products purchased. The study uses sample sales data for medical devices from CV. Andira Karya Jaya, amounting to 25 transactions, with a minimum support of 12% and a minimum confidence of 70%. In the final stage, the results obtained are the medical devices that are in demand by buyers at CV. Andira Karya Jaya, namely the 1 M3 oxygen cylinder and the 1 M3 oxygen trolley. Based on this data, CV. Andira Karya Jaya can stock the medical devices that are of interest to buyers.
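The minimum support and minimum confidence thresholds above filter candidate association rules. The sketch below shows how the two measures are computed for a single rule. The transactions are hypothetical and are not the 25 transactions from the study. Only the threshold values (12% and 70%) come from the abstract.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0


MIN_SUPPORT = 0.12      # threshold stated in the abstract
MIN_CONFIDENCE = 0.70   # threshold stated in the abstract

# Hypothetical purchase transactions.
transactions = [
    {"oxygen cylinder", "oxygen trolley"},
    {"oxygen cylinder", "oxygen trolley", "mask"},
    {"oxygen cylinder"},
    {"mask", "gloves"},
    {"oxygen cylinder", "oxygen trolley"},
]

rule_support = support(transactions, {"oxygen cylinder", "oxygen trolley"})
rule_confidence = confidence(transactions, {"oxygen cylinder"}, {"oxygen trolley"})
keep_rule = rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
print(rule_support, rule_confidence, keep_rule)
```

The full Apriori algorithm additionally builds candidate itemsets level by level and prunes any candidate that has an infrequent subset. This sketch covers only the scoring of one rule.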
Implementation of Artifical Neural Networks with Multilayer Perceptron for Analysis of Acceptance of Permanent Lecturers
Hartono Hartono;
Muhammad Sadikin;
Dian Maya Sari;
Nur Anzelina;
Silvia Lestari;
Wulan Dari
Jurnal Mantik Vol. 4 No. 2 (2020): Augustus: Manajemen, Teknologi Informatika dan Komunikasi (Mantik)
Publisher : Institute of Computer Science (IOCS)
DOI: 10.35335/mantik.Vol4.2020.954.pp1389-1396
Lecturer selection is the first step in building an educational institution. The Multilayer Perceptron method can be applied to the admission of permanent lecturers. The problem faced in the admission of permanent lecturers is that the process is still subjective. This research examines the ability of the Multilayer Perceptron algorithm to classify whether an applicant is eligible to be a lecturer. The inputs for this study were data on prospective applicants, namely age, grade point average (GPA), written test score, interview score, and home base status. The sample consisted of 100 records, of which 75% were used as training data and 25% as test data. The test results show that the multilayer perceptron neural network method has an accuracy rate of 98.7% and a ROC Area value of 0.989. This indicates that the model falls into the very good classification category, because its ROC value lies between 0.90 and 1.00.
Combining feature selection and hybrid approach redefinition in handling class imbalance and overlapping for multi-class imbalanced
Hartono Hartono;
Erianto Ongko;
Yeni Risyani
Indonesian Journal of Electrical Engineering and Computer Science Vol 21, No 3: March 2021
Publisher : Institute of Advanced Engineering and Science
DOI: 10.11591/ijeecs.v21.i3.pp1513-1522
In classification processes that contain class imbalance problems, overlapping also degrades performance, in addition to the poor performance caused by the uneven distribution of instances. This paper proposes a method that combines feature selection and the hybrid approach redefinition (HAR) method to handle class imbalance and overlapping in multi-class imbalanced data. HAR is a hybrid ensemble method for handling the class imbalance problem. The main contribution of this work is a new method that can overcome the problem of class imbalance and overlapping in the multi-class imbalance problem. This method must be able to give better results in terms of classifier performance and overlap degree in multi-class problems. This is achieved by improving the ensemble learning algorithm and the preprocessing technique in HAR using minimizing overlapping selection under SMOTE (MOSS). MOSS is known as a very popular feature selection method for handling overlapping. To validate the accuracy of the proposed method, this research uses augmented R-Value, Mean AUC, Mean F-Measure, Mean G-Mean, and Mean Precision. The performance of the model is evaluated against the hybrid method (MBP+CGE), a popular method for handling class imbalance and overlapping in multi-class imbalanced data. The proposed method is found to be superior in classifier performance, as indicated by better Mean AUC, F-Measure, G-Mean, and Precision.