Journal : Jurnal Teknik Informatika C.I.T. Medicom

Outlier detection in the clustered data Bu'ulolo, Efori; Syahputra, Rian; Simorangkir, Elsya Sabrina Asmita
Jurnal Teknik Informatika C.I.T Medicom Vol 16 No 6 (2025): January : Intelligent Decision Support System (IDSS)
Publisher : Institute of Computer Science (IOCS)

DOI: 10.35335/cit.Vol16.2025.1005.pp394-404

Abstract

The purpose of this study is to detect outliers in clustered data. Outliers often arise during the clustering process, especially with the K-Means algorithm. An outlier in clustered data is a cluster member that lies far from the centroid and does not belong to the dominant cluster. Such outliers are caused by various factors, including an inaccurate value of K, inaccurate centroid values, and poor data quality. This study detects outliers in clustered data using the box plot method, the Z-Score, and the relative size factor (RSF). The input value is the sum of squared errors (SSE), calculated by summing the squared distances of each data point from its cluster centroid. The dataset covers three levels of data variance: high, medium, and low. The methods used can detect outliers at all three variance levels, but not every method is optimal for every level. The box plot method is optimal for high and medium data variance, the RSF method is optimal for medium data variance, and the Z-Score method is not optimal for high data variance.
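The three checks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the 1.5 x IQR fence and the Z-Score threshold are conventional defaults rather than values taken from the paper, and RSF is assumed here to be the ratio of the largest to the second-largest value. The sketch also assumes the checks are applied to each point's squared distance from its centroid, the per-point terms whose sum is the SSE.

```python
import statistics


def squared_distances(points, centroid):
    """Squared Euclidean distance of each point from the cluster centroid."""
    return [sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in points]


def box_plot_outliers(values, k=1.5):
    """Indices of values outside the box plot fences (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


def z_score_outliers(values, threshold=3.0):
    """Indices of values whose absolute Z-Score exceeds the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []  # all values identical, so nothing can stand out
    return [i for i, v in enumerate(values) if abs(v - mean) / sd > threshold]


def relative_size_factor(values):
    """Ratio of the largest value to the second largest (assumed RSF form)."""
    top, second = sorted(values, reverse=True)[:2]
    return top / second if second else float("inf")


# Illustrative cluster (made-up data): five nearby points and one distant point.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5), (10, 10)]
centroid = tuple(statistics.fmean(axis) for axis in zip(*points))

distances = squared_distances(points, centroid)
sse = sum(distances)  # sum of squared errors for this cluster

print("SSE:", round(sse, 4))
print("Box plot outliers:", box_plot_outliers(distances))
# With only six points a Z-Score cannot exceed about 2.24, so a lower
# threshold than the usual 3.0 is used for this small example.
print("Z-Score outliers:", z_score_outliers(distances, threshold=2.0))
print("RSF:", round(relative_size_factor(distances), 2))
```

The Z-Score check depends on the mean and standard deviation, both of which are pulled toward extreme values, while the box plot check relies on quartiles; this is one plausible reason the methods behave differently across variance levels, though the abstract does not state the cause.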