Obtaining the desired information from existing data requires an appropriate processing methodology. One such methodology is data grouping, or clustering, which partitions the data into clusters. Cluster quality is determined by the distance of each data point in a cluster from the cluster center (centroid): the shorter the distance, the better the quality of the cluster, and vice versa. This study combines Principal Component Analysis (PCA) with K-Means. PCA is used to reduce the dimensions of the dataset, that is, to reduce the number of variables (feature extraction), before the data are clustered using K-Means. The dataset used in this study is the Statlog coronary heart disease dataset from the University of California, Irvine (UCI) Machine Learning Repository. PCA reduced the 13 variables in the dataset to three variables, which represent the original 13. The clustering process that combines K-Means with PCA produces shorter Chebyshev distances than K-Means clustering alone and than Fuzzy C-Means (FCM), a fuzzy clustering algorithm.
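The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: it uses NumPy only, runs on synthetic random data rather than the Statlog dataset, and assumes standardized inputs, two clusters, a simple random initialization, and the mean Chebyshev distance to the assigned centroid as the quality measure. The function names `pca_reduce`, `kmeans`, and `mean_chebyshev` are hypothetical.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Standardize X and project it onto its top principal components."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T


def assign(X, centroids):
    """Assign each point to its nearest centroid (Euclidean distance)."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-Means with random initial centroids (an assumption)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(X, centroids), centroids


def mean_chebyshev(X, labels, centroids):
    """Mean Chebyshev (max-coordinate) distance from points to their centroid."""
    return float(np.abs(X - centroids[labels]).max(axis=1).mean())


# Synthetic stand-in data with 13 variables (NOT the Statlog dataset).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 13)),
    rng.normal(2.0, 1.0, size=(100, 13)),
])

Z = pca_reduce(X, n_components=3)        # 13 variables -> 3 components
labels, centroids = kmeans(Z, k=2)       # k=2 is an assumption
score = mean_chebyshev(Z, labels, centroids)
print(f"mean Chebyshev distance to centroid: {score:.3f}")
```

A smaller value of `score` indicates that points sit closer to their centroids, which is the sense in which the abstract calls a cluster better. Distances measured in the reduced space and in the original 13-variable space are on different scales, so any comparison between methods should state which space the distance is computed in.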
Copyrights © 2017