Improving the Accuracy of C4.5 Algorithm with Chi-Square Method on Pure Tea Classification Using Electronic Nose
Mula Agung Barata; Edi Noersasongko; Purwanto; Moch Arief Soeleman
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 7 No 2 (2023): April 2023
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v7i2.4687

Abstract

Tea is one of the plantation products overseen by the Ministry of Agriculture of the Republic of Indonesia and a mainstay commodity that supports the Indonesian economy. Each type of tea has different properties, and its aroma can be used to measure its quality. The human sense of smell is very limited in distinguishing pure types of tea, so a device such as an electronic nose is needed to measure tea aroma. The device, fitted with several gas sensors, records data from the smell of pure tea and produces values for each type of tea, which form a dataset that can be tested with data mining algorithms. This study uses the C4.5 algorithm as the classification method because it is robust to noisy data and missing values and handles both discrete and continuous variables. Chi-square is used for attribute selection during data preprocessing to increase classification accuracy. Testing the pure tea dataset with all four attributes (CO2, CO, H2, and CH4) using the C4.5 algorithm gave an accuracy of 93.65%; using Chi-square feature selection to keep the two highest-scoring attributes raised the accuracy of the C4.5 algorithm to 94.27%.
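The Chi-square feature selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the sensor readings have already been discretized into bins (Chi-square works on counts), and the toy readings, bin labels, and function names are hypothetical.

```python
from collections import Counter

def chi_square(feature_values, labels):
    """Pearson chi-square statistic between a discrete feature and the class."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n  # count expected under independence
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat

def top_k_features(columns, labels, k):
    """Score every attribute and keep the k with the highest statistic."""
    scores = {name: chi_square(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical, already-binned readings for four samples (not the paper's data).
labels = ["green", "green", "black", "black"]
columns = {
    "CO2": ["low", "low", "high", "high"],
    "CO":  ["low", "high", "low", "high"],
    "H2":  ["low", "low", "low", "high"],
    "CH4": ["low", "high", "low", "high"],
}
selected, scores = top_k_features(columns, labels, k=2)
```

The attributes in `selected` would then be the only inputs passed to the decision tree, mirroring the abstract's comparison between four attributes and the two highest-scoring ones.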
Customer Segmentation with RFM Model using Fuzzy C-Means and Genetic Programming
Anas Syaifudin; Purwanto Purwanto; Heribertus Himawan; M. Arief Soeleman
MATRIK : Jurnal Manajemen, Teknik Informatika dan Rekayasa Komputer Vol 22 No 2 (2023)
Publisher : LPPM Universitas Bumigora

DOI: 10.30812/matrik.v22i2.2408

Abstract

One of the strategies a company uses to retain its customers is Customer Relationship Management (CRM). CRM manages interactions and supports business strategies to build mutually beneficial relationships between companies and customers. Information technology, such as data mining, is critical for discovering the patterns customers produce when making transactions. Clustering techniques in data mining can uncover these patterns from customer transaction data. Fuzzy C-Means (FCM) is one of the best-known and most widely used fuzzy clustering methods. It iterates to assign data to the right clusters based on an objective function. However, it can become trapped in a local minimum, a condition where the resulting value is not the lowest value in the solution set. This research aims to solve the local minimum problem in the FCM algorithm using Genetic Programming (GP), an evolution-based algorithm, to produce better data clusters. The research compares Fuzzy C-Means (FCM) and Genetic Programming Fuzzy C-Means (GP-FCM) for customer segmentation on the Cahaya Estetika clinic dataset. GP-FCM yielded an objective function value of 20.3091, while the FCM algorithm yielded 32.44741. Furthermore, cluster validity evaluation using the Partition Coefficient (PC), Classification Entropy (CE), and Silhouette Index shows that the cluster quality of GP-FCM is better than that of FCM. These results indicate that applying genetic programming to the fuzzy c-means algorithm produces better cluster quality than the fuzzy c-means algorithm alone.
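The objective function that the abstract compares can be illustrated with a plain Fuzzy C-Means loop. This is a minimal sketch of standard FCM only, with the usual objective J = sum over points i and clusters j of u_ij^m * ||x_i - c_j||^2; the genetic programming step of GP-FCM is not shown, and the toy points, starting centers, and function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def update_memberships(points, centers, m=2.0):
    """Membership of every point in every cluster; each row sums to 1."""
    rows = []
    for p in points:
        d = [sq_dist(p, c) for c in centers]
        if min(d) == 0.0:
            # The point sits exactly on a center: give that center full weight.
            hits = [1.0 if v == 0.0 else 0.0 for v in d]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [v ** (-1.0 / (m - 1.0)) for v in d]
            rows.append([v / sum(inv) for v in inv])
    return rows

def update_centers(points, memberships, m=2.0):
    """Each center is the mean of the points weighted by u_ij ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers

def objective(points, centers, memberships, m=2.0):
    """FCM objective J: membership-weighted sum of squared distances."""
    return sum(memberships[i][j] ** m * sq_dist(p, c)
               for i, p in enumerate(points)
               for j, c in enumerate(centers))

def fcm(points, centers, m=2.0, iterations=10):
    """Alternate membership and center updates, recording J at each step."""
    history = []
    for _ in range(iterations):
        memberships = update_memberships(points, centers, m)
        history.append(objective(points, centers, memberships, m))
        centers = update_centers(points, memberships, m)
    return centers, memberships, history

# Hypothetical 2-D data and starting centers (not the clinic dataset).
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
centers, memberships, history = fcm(points, [(0.0, 0.0), (6.0, 6.0)])
```

Because each update can only lower J from its current value, the loop settles wherever its starting centers lead it, which may be a local rather than global minimum. That sensitivity to initialization is the problem the abstract says GP is used to address.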
Data Pre-Processing And Feature Selection Techniques Backward Elimination For Naïve Bayes Classification On Heart Disease Detection
Julius Warih Angkasa; Edi Noersasongko; Purwanto
Jurnal Ekonomi Teknologi dan Bisnis (JETBIS) Vol. 2 No. 4 (2023): JETBIS : Journal Of Economics, Technology and Business
Publisher : Al-Makki Publisher

DOI: 10.57185/jetbis.v2i4.48

Abstract

According to a study published in the International Journal of Cardiology titled "Heart failure across Asia: Same healthcare burden but differences in organization of care," the mortality rate due to heart failure in Indonesia is relatively high. The findings indicate that approximately 5% of the total population in Indonesia suffers from heart failure. Heart disease is a condition that occurs when the heart's function is disrupted, whether by infection or congenital abnormality. Attention to heart disease is important for reducing the mortality rate. However, heart disease is often identified inaccurately, so a predictive approach based on data mining techniques is needed. One such method is the Naïve Bayes (NB) algorithm, a classification technique. Before classification, the data often has problems such as missing values, which can interfere with the classification process; a pre-processing step is therefore needed to remove missing values and support accurate predictions. In addition, this study applies feature selection using the Backward Elimination (BE) method to improve accuracy. With data pre-processing and feature selection, the accuracy rate was improved to 98.31%.
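The pipeline in this abstract (remove missing values, then wrap Naïve Bayes in Backward Elimination) can be sketched as below. This is an illustration under stated assumptions, not the authors' setup: it assumes categorical features, Laplace smoothing, and a single train/validation split, and the toy records, feature names, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def drop_missing(rows, labels):
    """Pre-processing: discard any record that has a missing (None) value."""
    kept = [(r, y) for r, y in zip(rows, labels) if None not in r.values()]
    return [r for r, _ in kept], [y for _, y in kept]

def train_nb(rows, labels, features):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature, class) -> value counts
    values_seen = defaultdict(set)        # feature -> distinct values
    for row, y in zip(rows, labels):
        for f in features:
            value_counts[(f, y)][row[f]] += 1
            values_seen[f].add(row[f])
    return class_counts, value_counts, values_seen

def predict_nb(model, row, features):
    """Pick the class with the highest log-posterior (Laplace-smoothed)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for y, count in class_counts.items():
        score = math.log(count / total)
        for f in features:
            seen = value_counts[(f, y)][row[f]]
            score += math.log((seen + 1) / (count + len(values_seen[f])))
        scores[y] = score
    return max(scores, key=scores.get)

def accuracy(train, train_y, valid, valid_y, features):
    model = train_nb(train, train_y, features)
    hits = sum(predict_nb(model, r, features) == y for r, y in zip(valid, valid_y))
    return hits / len(valid)

def backward_elimination(train, train_y, valid, valid_y, features):
    """Start from all features; drop one at a time while accuracy does not fall."""
    selected = list(features)
    best = accuracy(train, train_y, valid, valid_y, selected)
    while len(selected) > 1:
        trials = {f: accuracy(train, train_y, valid, valid_y,
                              [g for g in selected if g != f]) for f in selected}
        drop = max(trials, key=trials.get)
        if trials[drop] < best:
            break
        best = trials[drop]
        selected.remove(drop)
    return selected, best

# Hypothetical toy records (not the study's dataset); one has a missing value.
features = ["chest_pain", "high_bp", "noise"]
raw_rows = [
    {"chest_pain": "yes", "high_bp": "yes", "noise": "a"},
    {"chest_pain": "no",  "high_bp": "no",  "noise": "a"},
    {"chest_pain": "yes", "high_bp": None,  "noise": "b"},
    {"chest_pain": "yes", "high_bp": "no",  "noise": "b"},
    {"chest_pain": "no",  "high_bp": "yes", "noise": "b"},
    {"chest_pain": "yes", "high_bp": "yes", "noise": "b"},
    {"chest_pain": "no",  "high_bp": "no",  "noise": "a"},
]
raw_labels = ["sick", "healthy", "sick", "sick", "healthy", "sick", "healthy"]
rows, labels = drop_missing(raw_rows, raw_labels)
train, train_y, valid, valid_y = rows[:4], labels[:4], rows[4:], labels[4:]
baseline = accuracy(train, train_y, valid, valid_y, features)
selected, best = backward_elimination(train, train_y, valid, valid_y, features)
```

Removing a feature on a tie keeps the model smaller at no cost in validation accuracy; a stricter rule that removes only on improvement is an equally reasonable choice.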
Gaussian Based-SMOTE Method for Handling Imbalanced Small Datasets
Muhammad Misdram; Edi Noersasongko; Purwanto Purwanto; Muljono Muljono; Fandi Yulian Pamuji
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol 9, No 4 (2023): December
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v9i4.26881

Abstract

Dataset imbalance needs special handling because it often hinders the classification process, and a key problem in classification is overcoming the resulting decrease in performance. Many studies on overcoming dataset imbalance have been published, but the results are still unsatisfactory: the average increase in accuracy remains insignificant. Several common methods are used to deal with dataset imbalance, for example oversampling, undersampling, Synthetic Minority Oversampling Technique (SMOTE), Borderline-SMOTE, ADASYN, and Cluster-SMOTE. In testing, the average classification accuracy of these methods is still relatively low. In this research, the selected dataset is a medical dataset classified as small, with fewer than 200 records. The proposed method, Gaussian Based-SMOTE, is expected to work under a normal distribution and to determine the additional samples needed for minority classes. The Gaussian Based-SMOTE method is the contribution of this research and produces better accuracy than previous research. The method starts by determining the random location of synthesis candidates and then determining the Gaussian distribution. The results of these two steps are combined to produce the synthetic values. The generated synthetic values are combined with SMOTE sampling of the majority data from the training data to produce balanced data. In classification trials on the balanced data, Gaussian Based-SMOTE gave a significant increase in accuracy of 3% on average.
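The abstract does not give the exact formulas, so the following is only one plausible reading of its two steps, not the authors' method: a uniform random location is picked on the segment between a minority sample and one of its nearest minority neighbors (as in standard SMOTE), and that location is then redrawn from a Gaussian centered on it. The parameters `k` and `sigma`, the clamping to the segment, the toy data, and all function names are assumptions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def gaussian_smote(minority, n_new, k=3, sigma=0.05, seed=0):
    """Generate n_new synthetic minority samples (assumed variant, see lead-in)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        neighbors = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Step 1: uniform candidate location on the segment, as in plain SMOTE.
        gap = rng.random()
        # Step 2: redraw the location from a Gaussian around the candidate,
        # clamped so the synthetic point stays between base and neighbor.
        gap = min(1.0, max(0.0, rng.gauss(gap, sigma)))
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Hypothetical imbalanced toy data: 6 majority records, 3 minority records.
majority = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (0.4, 0.3), (0.5, 0.1), (0.3, 0.5)]
minority = [(2.0, 2.0), (2.4, 2.2), (2.1, 2.6)]
new_points = gaussian_smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

Generating `len(majority) - len(minority)` samples brings the two classes to equal size; only the training split would be oversampled this way, so that the test data stays untouched.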