Jurnal Teknik Informatika C.I.T. Medicom
Vol 16 No 6 (2025): January : Intelligent Decision Support System (IDSS)

Outlier detection in the clustered data

Bu'ulolo, Efori
Syahputra, Rian
Simorangkir, Elsya Sabrina Asmita



Article Info

Publish Date
31 Jan 2025

Abstract

The purpose of this study is to detect outliers in clustered data. Outliers often arise during the data clustering process, especially with the K-Means algorithm. An outlier in clustered data is a cluster member that lies far from the centroid value and is not found in the dominant cluster. Such outliers are caused by various factors, including inaccurate K values, inaccurate centroid point values, and poor data quality. In this study, outliers are detected using the box plot method, the Z-Score, and the relative size factor (RSF). The input value is the sum of squared error (SSE), calculated by summing the squared distances of each data point from its cluster centroid. The datasets used cover three levels of variance: high, medium, and low. The methods used in this study can detect outliers at all three levels of variance, but not every method is optimal for every level. The box plot method is optimal for high and medium data variance, the RSF method is optimal for medium data variance, and the Z-Score method is not optimal for high data variance.
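The three detection rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, the 1.5 × IQR fence, the Z-Score threshold of 3, and the definition of RSF as the largest value divided by the second largest are all assumptions made for illustration, since the abstract does not give these details. Each rule is applied to the per-point squared distances from the centroid, whose sum is the cluster's SSE.

```python
import statistics

def squared_distances(points, centroid):
    """Squared Euclidean distance of each point from the cluster centroid."""
    return [sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in points]

def boxplot_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute Z-Score exceeds the (assumed) threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return [False] * len(values)
    return [abs(v - mean) / std > threshold for v in values]

def relative_size_factor(values):
    """Assumed RSF definition: largest value divided by the second largest.

    Needs at least two values and a non-zero second-largest value.
    """
    ordered = sorted(values)
    return ordered[-1] / ordered[-2]

# Hypothetical toy cluster: four nearby points and one distant point.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10)]
centroid = tuple(statistics.fmean(axis) for axis in zip(*points))

d2 = squared_distances(points, centroid)
sse = sum(d2)  # sum of squared error for this cluster

print("SSE:", sse)
print("box plot flags:", boxplot_outliers(d2))
print("Z-Score flags:", zscore_outliers(d2))
print("RSF:", relative_size_factor(d2))
```

On this toy cluster the box plot rule flags only the distant point, while the Z-Score rule with a threshold of 3 flags nothing, because a single extreme value in a five-point sample inflates the standard deviation it is measured against. This illustrates how the rules can disagree on the same input; it does not reproduce the study's results.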

Copyrights © 2025






Journal Info

Abbrev

JTI

Publisher

Subject

Computer Science & IT

Description

The Jurnal Teknik Informatika C.I.T is a scientific journal of decision support systems, expert systems and artificial intelligence, which includes scholarly writings on pure research and applied research in the field of information systems and information technology as well as a review-general review of ...