Deteksi Outlier Hasil Clustering Algoritma K-Medoids Menggunakan Metode Boxplot Pada Data KIP Kuliah [Outlier Detection in K-Medoids Clustering Results Using the Boxplot Method on KIP Kuliah Data]
Simorangkir, Elsya Sabrina Asmita; Siahaan, Andysah Putera Utama; Marlina, Leni; Nasution, Darmeli; Sitorus, Zulham
Journal of Computer System and Informatics (JoSYC) Vol 5 No 4 (2024): August 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josyc.v5i4.5479

Abstract

When the K-Medoids algorithm forms clusters, the results often contain anomalies such as outliers. An outlier appears as a deviation from the existing data pattern and can arise from measurement errors, rare events, or other unexpected factors. The dataset used in this research covers prospective KIP Kuliah recipients at Budi Darma University, where interest in KIP Kuliah is high but the quota is limited, so administrators sometimes have difficulty determining which students are eligible. For this reason, K-Medoids clustering was used to group 54 prospective students into those eligible to receive KIP Kuliah Merdeka and those not eligible. Outlier detection was then carried out on the cluster results using the box plot method, to find out whether each cluster member actually belongs in its assigned cluster. The data were divided into 2 clusters (K = 2). With max-min centroid selection, cluster I has 52 members and cluster II has 2 members, with 3 outliers; with random centroid selection (Python), cluster I has 36 members and cluster II has 18 members, with 4 outliers. Comparing max-min with random centroid selection, the clustering results have an accuracy of 64.81% and the outlier detection an accuracy of 75%.
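
The box plot check described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which quantity the box plot is applied to or which distance measure is used, so the sketch assumes each member's Manhattan distance to its cluster medoid, Tukey fences at 1.5 times the interquartile range, and hypothetical function names (`boxplot_outliers`, `cluster_outliers`).

```python
from statistics import quantiles


def boxplot_outliers(values, k=1.5):
    """Return indices of values outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def manhattan(a, b):
    """Manhattan distance (an assumed choice; the abstract names no metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster_outliers(points, labels, medoids, k=1.5):
    """Flag members whose distance to their own medoid is a box plot outlier."""
    flagged = []
    for c, medoid in enumerate(medoids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if len(members) < 2:
            continue  # quartiles need at least two data points
        dists = [manhattan(points[i], medoid) for i in members]
        flagged += [members[j] for j in boxplot_outliers(dists, k)]
    return sorted(flagged)


# Toy data (invented for illustration): two clusters with given medoids.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (-1, 0), (5, 5),
          (20, 20), (21, 20), (20, 21)]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]
medoids = [(0, 0), (20, 20)]
print(cluster_outliers(points, labels, medoids))
```

On this toy data only the point `(5, 5)` lies beyond the upper fence of its cluster. A very small cluster, such as the two-member cluster II reported above, gives the quartiles little to work with, so the fences there should be read with caution.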
Outlier detection in the clustired data
Bu'ulolo, Efori; Syahputra, Rian; Simorangkir, Elsya Sabrina Asmita
Jurnal Teknik Informatika C.I.T Medicom Vol 16 No 6 (2025): January : Intelligent Decision Support System (IDSS)
Publisher : Institute of Computer Science (IOCS)

DOI: 10.35335/cit.Vol16.2025.1005.pp394-404

Abstract

The purpose of this study is to detect outliers in data clusters. Outliers often arise during the data clustering process, especially with the K-Means algorithm. Outliers in cluster data are cluster members that lie far from the centroid value and do not fall in the dominant cluster. They are caused by various factors, such as an inaccurate K value, inaccurate centroid point values, and poor data quality. Outliers are detected using the box plot method, the Z-Score, and the relative size factor (RSF). The input value is the sum of squared error (SSE), calculated by summing the squared distances of each data point from its cluster centroid. The dataset covers three variance levels: high, medium, and low data variance. The methods used in this study can detect outliers at all three variance levels, but not every method is optimal for every level. The box plot method is optimal for high and medium data variance, the RSF method is optimal for medium data variance, and the Z-Score method is not optimal for high data variance.
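
The quantities named in this abstract can be illustrated with a short sketch. It rests on several assumptions not confirmed by the abstract: squared Euclidean distance for the per-point error, the population standard deviation for the Z-Score, and the common definition of RSF as the largest value divided by the second largest. The function names are hypothetical, and no outlier threshold is fixed here because the abstract does not give one.

```python
from statistics import mean, pstdev


def squared_errors(points, centroid):
    """Squared Euclidean distance of each point from the cluster centroid."""
    return [sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in points]


def sse(points, centroid):
    """Sum of squared error: the total of the per-point squared distances."""
    return sum(squared_errors(points, centroid))


def z_scores(values):
    """Standardize values; returns zeros when all values are identical."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def relative_size_factor(values):
    """Assumed RSF definition: largest value divided by the second largest."""
    top, second = sorted(values, reverse=True)[:2]
    return top / second if second else float("inf")


# Toy cluster (invented for illustration) with its centroid taken as given.
points = [(0, 0), (1, 0), (0, 1), (3, 4)]
centroid = (0, 0)
errors = squared_errors(points, centroid)
print(sse(points, centroid))          # total squared error of the cluster
print(z_scores(errors))               # standardized per-point errors
print(relative_size_factor(errors))   # how far the largest error stands out
```

In this toy cluster the point `(3, 4)` has the largest Z-Score and drives the RSF well above 1. Where to draw the cutoff for either measure is a modelling choice that depends on the data variance, which is consistent with the abstract's finding that no single method is optimal at every variance level.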