Desimal: Jurnal Matematika
Vol. 8 No. 3 (2025): Desimal: Jurnal Matematika

Performance evaluation of clustering algorithms for protein sequence data

Ardaneswari, Gianinna (Unknown)
Aminah, Siti (Unknown)
Awang, Mohd Khalid (Unknown)
Laksmitara, Anindya (Unknown)
Azkiya, Azkal (Unknown)
Razi, Fakhrur (Unknown)
Joshua Situmeang, Jason Nimrod (Unknown)



Article Info

Publish Date
05 Nov 2025

Abstract

Protein sequence data analysis is a fundamental task in bioinformatics, supporting the exploration of biological variation and the identification of functional relationships among proteins. This study compares the performance of four clustering algorithms applied to protein sequence datasets: Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Agglomerative Hierarchical Clustering, and Spectral Clustering. Feature extraction was conducted using the Discere package in Python, generating 27 numerical attributes from the protein sequences. The optimal number of clusters for BIRCH, Agglomerative, and Spectral Clustering was determined using the Elbow method, while the DBSCAN parameters (MinPts, Eps) were tuned using the sorted k-distance plot. Clustering performance was assessed using the Silhouette Score. DBSCAN produced the highest silhouette score, 0.8105, whereas BIRCH offered a strong balance between clustering quality, with a score of 0.7405, and computational efficiency. Agglomerative clustering gave moderate results with a score of 0.6779, while Spectral clustering yielded the lowest score, 0.6310, but demonstrated flexibility in capturing complex structures. These findings provide a benchmark comparison of clustering methods for protein sequence data, offering practical insights into algorithm selection based on data characteristics and performance trade-offs.
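
The comparison described in the abstract can be illustrated with a minimal sketch in scikit-learn. This is not the authors' code: the data below are synthetic blobs standing in for the 27 extracted features, the cluster count of 3 is assumed rather than chosen with the Elbow method, and Eps is taken as a percentile of the sorted k-distance curve as a crude substitute for reading the knee of the plot by eye. The helper names `k_distance_curve` and `evaluate` are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, Birch, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a 27-feature table (assumption: NOT the paper's data).
X, _ = make_blobs(n_samples=300, n_features=27, centers=3, random_state=0)
X = StandardScaler().fit_transform(X)


def k_distance_curve(X, k):
    """Return the sorted distance from each point to its k-th nearest neighbour."""
    # Ask for k + 1 neighbours because each point is its own neighbour at distance 0.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    distances, _ = nn.kneighbors(X)
    return np.sort(distances[:, k])


def evaluate(X, n_clusters=3, min_pts=5, eps=None):
    """Fit the four algorithms and return a dict of silhouette scores."""
    if eps is None:
        # Assumption: the 90th percentile replaces a visual reading of the knee.
        eps = float(np.percentile(k_distance_curve(X, min_pts), 90))
    models = {
        "BIRCH": Birch(n_clusters=n_clusters),
        "DBSCAN": DBSCAN(eps=eps, min_samples=min_pts),
        "Agglomerative": AgglomerativeClustering(n_clusters=n_clusters),
        "Spectral": SpectralClustering(n_clusters=n_clusters, random_state=0),
    }
    scores = {}
    for name, model in models.items():
        labels = model.fit_predict(X)
        # DBSCAN marks noise as -1; this sketch drops noise before scoring.
        mask = labels != -1
        if len(set(labels[mask])) < 2:
            scores[name] = float("nan")  # silhouette needs at least two clusters
        else:
            scores[name] = float(silhouette_score(X[mask], labels[mask]))
    return scores


scores = evaluate(X)
for name, score in scores.items():
    print(f"{name}: {score:.4f}")
```

Whether noise points are excluded before scoring changes the DBSCAN silhouette value, so any comparison with the scores reported above would need to follow the paper's own convention, which the abstract does not state.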

Copyrights © 2025






Journal Info

Abbrev

desimal

Publisher

Subject

Education, Mathematics, Social Sciences

Description

Desimal: Jurnal Matematika focuses on the main issues in the development of the sciences of mathematics education and applied mathematics. Desimal: Jurnal Matematika is published three times a year, covering the periods January to April, May to August, and September ...