Fitriyanto, Rachmad
Unknown Affiliation

Published : 8 Documents

Found 2 Documents
Journal : Techno Nusa Mandiri : Journal of Computing and Information Technology

MULTILEVEL MODAL VALUE ANALYSIS FOR INTERPRETING CATEGORICAL K-MEDOIDS CLUSTERS DATA
Authors : Fitriyanto, Rachmad; Syafiqoh, Ummi
Jurnal Techno Nusa Mandiri Vol. 21 No. 2 (2024): Techno Nusa Mandiri : Journal of Computing and Information Technology Period o
Publisher : Lembaga Penelitian dan Pengabdian Pada Masyarakat

DOI: 10.33480/techno.v21i2.5796

Abstract

Consumer segmentation plays a crucial role for business owners in developing their enterprises. K-Medoid is commonly used for segmentation due to its low computational complexity. However, K-Medoid has limitations, such as the variability in cluster sizes across different iterations and the challenge of determining the optimal number of clusters. The Davies-Bouldin Index (DBI) is a metric used to evaluate the number of clusters by calculating the ratio between the within-cluster distance and the between-cluster distance. Most segmentation studies stop at the formation of clusters without further interpretation, particularly when dealing with categorical data. This study aims to modify the use of K-Medoid and propose a method for interpreting clusters with categorical data. The research began with questionnaire design and data collection from 100 respondents; the collected data were normalized in the second stage. Clustering used K-Medoid with K values varied from K=2 to K=10, with each K value tested 10 times. The clustering results were evaluated using the DBI to select the optimal clusters. Data interpretation was conducted using modal values, calculated as the ratio of the number of times a specific attribute variable was selected by respondents to the total number of data points in the cluster. The use and hierarchical visualization of modal values proposed in this study offer insights into the dominant variables within an attribute and also depict the relationships between attributes based on the ranking of modal values. These advantages help business analysts label clusters when developing consumer-driven business strategies.
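
The modal value described in the abstract (the share of respondents in a cluster who selected a given attribute value) can be illustrated with a minimal sketch. The function name `modal_values`, the attribute names, and the four-respondent cluster are hypothetical and are not taken from the paper; the paper's own implementation and its hierarchical visualization may differ.

```python
from collections import Counter


def modal_values(cluster_rows, attribute):
    """Return (value, ratio) pairs for one attribute, highest ratio first.

    ratio = times the value was selected / number of data points in the cluster
    """
    counts = Counter(row[attribute] for row in cluster_rows)
    total = len(cluster_rows)
    return [(value, count / total) for value, count in counts.most_common()]


# Hypothetical cluster of four respondents (illustrative data only).
cluster = [
    {"age_group": "18-25", "payment": "cash"},
    {"age_group": "18-25", "payment": "e-wallet"},
    {"age_group": "26-35", "payment": "cash"},
    {"age_group": "18-25", "payment": "cash"},
]

age_ranking = modal_values(cluster, "age_group")
payment_ranking = modal_values(cluster, "payment")
print(age_ranking)
print(payment_ranking)
```

Ranking the ratios within each attribute gives the dominant value per attribute, which is the kind of information the abstract says is used to label clusters.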

FEATURE SELECTION COMPARATIVE PERFORMANCE FOR UNSUPERVISED LEARNING ON CATEGORICAL DATASET
Authors : Fitriyanto, Rachmad; Mohamad Ardi
Jurnal Techno Nusa Mandiri Vol. 22 No. 1 (2025): Techno Nusa Mandiri : Journal of Computing and Information Technology Period o
Publisher : Lembaga Penelitian dan Pengabdian Pada Masyarakat

DOI: 10.33480/techno.v22i1.6512

Abstract

In the era of big data, Knowledge Discovery in Databases (KDD) is vital for extracting insights from extensive datasets. This study investigates feature selection for clustering categorical data in an unsupervised learning context. Given that an insufficient number of features can impede the extraction of meaningful patterns, we evaluate two techniques—Chi-Square and Mutual Information—to refine a dataset derived from questionnaires on college library visitor characteristics. The original dataset, containing 24 items, was preprocessed and partitioned into five subsets: one via Chi-Square and four via Mutual Information using different dependency thresholds (a low-mid-high scheme and dynamic quartile thresholds: Q1toMax, Q2toMax, and Q3toMax). K-Means clustering was applied across nine variations of K (ranging from 2 to 10), with clustering performance assessed using the silhouette score and Davies-Bouldin Index (DBI). Results reveal that while the Mutual Information approach with a Q3toMax threshold achieves an optimal silhouette score at K=7, it retains only 4 features—insufficient for comprehensive analysis based on domain requirements. Conversely, the Chi-Square method retains 18 features and yields the best DBI at K=9, better capturing the intrinsic characteristics of the data. These findings underscore the importance of aligning feature selection techniques with both clustering quality and domain knowledge, and highlight the need for further research on optimal dependency threshold determination in Mutual Information.
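
The dynamic quartile thresholds named in the abstract (Q1toMax, Q2toMax, Q3toMax) can be sketched as follows, assuming each scheme keeps the features whose dependency score lies between the given quartile and the maximum. That reading, the function name `select_by_quartile`, the feature names, and the scores are all assumptions for illustration; the abstract does not state how the scores or quartiles were computed.

```python
from statistics import quantiles


def select_by_quartile(scores, quartile):
    """Keep features whose score is >= the chosen quartile (1, 2, or 3)."""
    # Three cut points (Q1, Q2, Q3) over all feature scores.
    cut_points = quantiles(scores.values(), n=4, method="inclusive")
    threshold = cut_points[quartile - 1]
    return [name for name, score in scores.items() if score >= threshold]


# Hypothetical mutual-information scores for five questionnaire items.
scores = {
    "item_01": 0.125,
    "item_02": 0.25,
    "item_03": 0.375,
    "item_04": 0.5,
    "item_05": 0.625,
}

q1_to_max = select_by_quartile(scores, 1)
q2_to_max = select_by_quartile(scores, 2)
q3_to_max = select_by_quartile(scores, 3)
print(len(q1_to_max), len(q2_to_max), len(q3_to_max))
```

A higher quartile keeps fewer features, which is consistent with the abstract's observation that the Q3toMax subset retained only 4 of the 24 items.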