Articles

Found 7 Documents

Optimasi Cluster Pada K-Means Clustering Dengan Teknik Reduksi Dimensi Dataset Menggunakan Gini Index
Zarkasyi, Muhammad Imam; Mawengkang, Herman; Sitompul, Opim Salim
Building of Informatics, Technology and Science (BITS) Vol 4 No 3 (2022): December 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i3.2458

Abstract

In K-Means clustering, the number of attributes in a dataset can affect the number of iterations needed to group the data. One way to address this is to reduce the dimensionality of the dataset. In this study, the authors apply the Gini Index to remove attributes that have no effect on the dataset before clustering with K-Means. The test dataset is Absenteeism at work from the UCI Machine Learning Repository, with 20 attributes, 740 records, and 4 attribute classes. Conventional K-Means (without attribute reduction) required 9 iterations, while K-Means with Gini Index attribute reduction required 6 iterations. Clustering quality was evaluated using the Sum of Squared Errors (SSE): the SSE is 1391.613 for conventional K-Means and 440.912 for K-Means with Gini Index attribute reduction. The proposed method therefore reduces the clustering error and the number of iterations in K-Means by reducing the dimensions of the dataset with the Gini Index.
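As an illustration of the evaluation metric named in this abstract, the following is a minimal sketch of how SSE is commonly computed for a K-Means result, together with the relative reduction implied by the reported figures. The toy points, centroids, and function names are hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def sum_of_squared_error(points, centroids, labels):
    """SSE: total squared Euclidean distance from each point to its assigned centroid."""
    return sum(dist(p, centroids[k]) ** 2 for p, k in zip(points, labels))


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Hypothetical toy data: two clusters of two points each.
points = [(1.0, 1.0), (1.0, 3.0), (5.0, 5.0), (7.0, 5.0)]
centroids = [(1.0, 2.0), (6.0, 5.0)]
labels = [0, 0, 1, 1]
toy_sse = sum_of_squared_error(points, centroids, labels)

# Figures reported in the abstract above.
sse_drop = percent_reduction(1391.613, 440.912)  # SSE, conventional vs. reduced
iteration_drop = percent_reduction(9, 6)         # iterations, conventional vs. reduced
```

With the reported values, this arithmetic gives an SSE reduction of roughly 68.3% and about 33.3% fewer iterations.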
K-Means Performance Optimization Using Rank Order Centroid (ROC) And Braycurtis Distance
Irwandi, Hafiz; Sitompul, Opim Salim; Sutarman, Sutarman
Sinkron : jurnal dan penelitian teknik informatika Vol. 6 No. 2 (2022): Articles Research Volume 6 Issue 2, April 2022
Publisher : Politeknik Ganesha Medan

DOI: 10.33395/sinkron.v7i2.11371

Abstract

K-Means is a clustering algorithm that groups data based on similarity. One problem with this algorithm is that the initial cluster centers are chosen randomly, which affects the final clustering result. To avoid poor accuracy, a procedure is needed to determine the initial centroids during initialization. A second problem is the use of Euclidean distance between data points, which gives every attribute the same influence. To address these problems, this study proposes the Rank Order Centroid (ROC) method for initializing the cluster centers and the Bray-Curtis distance for measuring the distance between data points. In experiments with K=2 to K=10, the proposed method reduced the number of iterations by 6.6% on the Student Performance Exams dataset and by 19.3% on the Body Fat Prediction dataset, but increased it by 24.2% on the Heart Failure dataset. Evaluated with the Silhouette Coefficient, the method improved the score by 5.9% on the Student Performance Exams dataset, while the score decreased by 8.3% on the Body Fat Prediction dataset and by 3.3% on the Heart Failure dataset.
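The two building blocks named in this abstract can be sketched as follows, using their standard textbook definitions. The abstract does not say how the ROC weights feed into centroid initialization, so that step is not shown. The function names are hypothetical.

```python
def roc_weights(n):
    """Rank Order Centroid weights for n ranked attributes.

    w_i = (1/n) * sum_{k=i}^{n} 1/k, where rank 1 is the most important attribute.
    The weights are decreasing and sum to 1.
    """
    return [sum(1.0 / k for k in range(i, n + 1)) / n for i in range(1, n + 1)]


def braycurtis(u, v):
    """Bray-Curtis distance: sum|u_i - v_i| / sum|u_i + v_i|.

    Undefined when the denominator is zero (for example, two all-zero vectors).
    """
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    return numerator / denominator


weights = roc_weights(3)                    # weights for three ranked attributes
d = braycurtis([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
```

For non-negative data the Bray-Curtis distance lies between 0 and 1, which is one reason it is sometimes preferred to Euclidean distance when attributes differ in scale.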
Perfomance analysis of Naive Bayes method with data weighting
Afdhaluzzikri, Afdhaluzzikri; Mawengkang, Herman; Sitompul, Opim Salim
Sinkron : jurnal dan penelitian teknik informatika Vol. 6 No. 3 (2022): Article Research Volume 6 Number 3, July 2022
Publisher : Politeknik Ganesha Medan

DOI: 10.33395/sinkron.v7i3.11516

Abstract

Classification of the air quality dataset with the Naive Bayes algorithm, using all available attributes, has an accuracy of 39.97%, which is considered poor. With pre-processing in the form of feature selection using the gain ratio algorithm, the accuracy of Naive Bayes increases to 61.76%. For the Water Quality dataset, Naive Bayes with all available attributes has an accuracy of 93.18%, which is considered good; with gain ratio feature selection, the accuracy increases to 95.73%. This shows that the gain ratio algorithm can improve the performance of Naive Bayes on both the air quality and Water Quality datasets. Across all tests, the weighted Naive Bayes classification model provides better accuracy because the weighting of the attribute values in the dataset changes. The gain ratio of each attribute, which measures the relationship between that attribute and the data, is used as the attribute's weight when calculating the probabilities in Naive Bayes. The higher the gain ratio of an attribute, the stronger its relationship to the data class. As a result, the accuracy is higher than that of the unweighted Naive Bayes classification model; the increase is attributed to the weights obtained from attribute selection with the gain ratio.
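For reference, the gain ratio of a categorical attribute can be sketched as below, using the standard definition (information gain divided by split information). How the paper turns these values into Naive Bayes weights is not specified in the abstract, so only the gain ratio itself is shown. The function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected class entropy after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute's own value distribution
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one attribute that determines the class, one that does not.
labels = ["x", "x", "y", "y"]
informative = gain_ratio(["a", "a", "b", "b"], labels)
uninformative = gain_ratio(["a", "b", "a", "b"], labels)
```

An attribute that perfectly separates the classes scores highest, while an attribute independent of the class scores zero, which matches the abstract's statement that a higher gain ratio indicates a stronger relationship to the data class.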
Analysis Clustering Using Normalized Cross Correlation In Fuzzy C-Means Clustering Algorithm
Kembaren, Ricky Crist Geoversam Imantara; Sitompul, Opim Salim; Sawaluddin, Sawaluddin
Sinkron : jurnal dan penelitian teknik informatika Vol. 6 No. 4 (2022): Article Research: Volume 6 Number 4, October 2022
Publisher : Politeknik Ganesha Medan

DOI: 10.33395/sinkron.v7i4.11666

Abstract

Fuzzy C-Means Clustering (FCM) is a widely known technique for data clustering tasks such as image segmentation. This study tests the Normalized Cross Correlation (NCC) method in the FCM algorithm to determine the initial fuzzy pseudo-partition matrix, which was previously generated by a random process. Clustering is a process of grouping data and belongs to unsupervised learning. Data mining generally uses two clustering techniques: hierarchical clustering and partitional clustering. The FCM algorithm groups data by adding up the level of similarity between pairs of data groups. The similarity of the data is measured from the correlation value using NCC. The methodology of this research consists of the steps taken to measure clustering performance when NCC is added to determine the initial fuzzy pseudo-partition matrix in FCM. Data clustering using NCC in the FCM algorithm gave better results than the ordinary FCM algorithm: the proposed method improved accuracy by 4.27%, the Rand index by 4.73%, and the F-measure by 8.26%.
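The similarity measure named in this abstract can be sketched as follows. This uses the zero-mean form of normalized cross correlation between two equal-length vectors, which is an assumption: the abstract does not state which NCC variant the authors use, nor how the values are mapped into the initial partition matrix. The function name is hypothetical.

```python
from math import sqrt


def normalized_cross_correlation(x, y):
    """Zero-mean normalized cross correlation of two equal-length vectors.

    Returns a value in [-1, 1]; undefined if either vector is constant.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    return numerator / denominator


# Hypothetical toy vectors: one perfectly correlated pair, one perfectly anti-correlated pair.
same_direction = normalized_cross_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
opposite_direction = normalized_cross_correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Because the result is bounded and scale-invariant, it offers a deterministic similarity score that could replace random values when seeding a membership matrix.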
Strategic plant maintenance planning in agriculture by integrating lean principles and optimization
Simarmata, Gayus; Suwilo, Saib; Sitompul, Opim Salim; Sutarman, Sutarman
International Journal of Electrical and Computer Engineering (IJECE) Vol 14, No 6: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v14i6.pp6279-6286

Abstract

Operational planning within agricultural production systems plays a pivotal role in facilitating farmers' decision-making processes. This study introduces a novel mathematical model aimed at optimizing plant maintenance planning through the efficient allocation of labor, optimal utilization of machinery, and strategic scheduling. Utilizing mixed integer non-linear programming (MINLP), the model integrates lean principles to minimize waste and improve operational efficiency. The primary contributions of this study include the development of a comprehensive maintenance planning model, the application of advanced mathematical techniques in agriculture, and the enhancement of resource allocation strategies. The results demonstrate significant improvements in maintenance task scheduling, reduced downtime, and enhanced productivity, ultimately contributing to sustainable farming practices and food security. This model serves as a strategic decision-support tool for farmers, enabling data-driven planning and resource utilization to achieve both short-term efficiency and long-term agricultural viability.
Development of the fuzzy grid partition methods in generating fuzzy rules for the classification of data set
Marbun, Murni; Sitompul, Opim Salim; Nababan, Erna Budhiarti; Sihombing, Poltak
Bulletin of Electrical Engineering and Informatics Vol 13, No 3: June 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i3.5378

Abstract

The main weakness of large, complex fuzzy rule systems is that the resulting classification is difficult to interpret. Classification interpretation can also be affected when rules are reduced and important rules are removed. Experiments with the fuzzy grid partition (FGP) approach on high-dimensional data show that the difficulty of generating fuzzy rules still increases exponentially as the number of characteristics increases. The solution to this problem is a hybrid method that combines the advantages of the rough set method and the FGP method, called the fuzzy grid partition rough set (FGPRS) method. On the Iris data, the rough set approach reduces the number of characteristics and objects so that data with excessive values is minimized and the fuzzy rules produced using the FGP method are more concise. The number of fuzzy rules produced using the FGPRS method is reduced by 50% at K=2, by 66.7% at K=K+1, and by 75% at K=2K. In the classification test on the data set, the FGPRS method has a classification accuracy of 83.33%, and all data can be classified.
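The exponential growth mentioned in this abstract can be made concrete with a short sketch. It assumes a full grid partition in which each of n attributes is split into K fuzzy sets, giving K^n candidate rules. The four-attribute setting and the function names are hypothetical.

```python
def grid_rule_count(k, n_attributes):
    """Candidate rules in a full fuzzy grid: k fuzzy sets per attribute, one rule per cell."""
    return k ** n_attributes


def percent_fewer_rules(k, n_before, n_after):
    """Percentage of rules eliminated when the attribute count drops from n_before to n_after."""
    before = grid_rule_count(k, n_before)
    after = grid_rule_count(k, n_after)
    return 100.0 * (before - after) / before


# Hypothetical setting: four attributes reduced to three, for K = 2, 3, 4.
reductions = {k: percent_fewer_rules(k, 4, 3) for k in (2, 3, 4)}
```

Under this assumption, removing a single attribute eliminates a fraction 1 - 1/K of the rules, which is 50%, about 66.7%, and 75% for K = 2, 3, and 4. This is consistent with the percentages reported above, although the abstract does not state how many attributes the rough set step removed.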
RHEUMATIC HEART DISEASE SCREENING USING HANDHELD ECHOCARDIOGRAPHY: A STUDY AMONG JUNIOR HIGH SCHOOL STUDENTS IN INDONESIA
Ardini, Tengku Winda; Muttaqien, Chairiza; Sitompul, Opim Salim; Hasan, Refli; Effendy, Elmeid; Rusda, Muhammad
Jurnal Al Ulum LPPM Universitas Al Washliyah Medan Vol. 13 No. 2 (2025): Jurnal Al Ulum: LPPM Universitas Al Washliyah Medan
Publisher : UNIVERSITAS AL WASHLIYAH (UNIVA) MEDAN

DOI: 10.47662/alulum.v13i2.1007

Abstract

Considering the high burden of morbidity and mortality associated with Rheumatic Heart Disease (RHD) in Indonesia, there is increasing interest in assessing cost-effective screening modalities, particularly handheld echocardiography. Our study aimed to determine the prevalence of RHD in Batu Bara, North Sumatra, Indonesia. This descriptive observational study was carried out in the Batu Bara region in May 2025. The study population comprised junior high school students aged 10 to 15 years from a selected school in Batu Bara. Data collection included sociodemographic variables, parental characteristics, environmental and household factors, anthropometric measurements, physical examination findings, auscultation results, and echocardiographic evaluations. All data were analyzed using descriptive statistical methods. A total of 190 children were assessed, with a median age of 13 years; females comprised 54.7% of the participants. Echocardiographic screening detected RHD in three participants, corresponding to a prevalence of 1.6% in the study population. Expansion of echocardiographic screening programs is warranted to comprehensively establish RHD prevalence, accurately evaluate disease burden, and facilitate earlier detection to mitigate adverse clinical outcomes.