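PERBANDINGAN K-MEDOIDS DAN CLARA (Clustering Large Application) PADA DATA POPULASI TERNAK DI INDONESIA [Comparison of K-Medoids and CLARA (Clustering Large Application) on Livestock Population Data in Indonesia]
Ardhani, Rizky; Marshelle, Sean; Fitrianto, Anwar; Erfiani, Erfiani; Jumansyah, L. M. Risman Dwi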
Jurnal Lebesgue : Jurnal Ilmiah Pendidikan Matematika, Matematika dan Statistika Vol. 5 No. 3 (2024)
Publisher : LPPM Universitas Bina Bangsa

DOI: 10.46306/lb.v5i3.764

Abstract

This study compares the K-Medoids and CLARA (Clustering Large Applications) methods on livestock population data for Indonesian districts and cities. K-Medoids is a clustering method that uses actual data points (medoids) as cluster centers and groups objects according to their distance from those medoids. CLARA is an extension of the K-Medoids approach that draws several samples from a larger dataset, clusters each sample, and compares the results. The CLARA analysis indicates that three clusters is the optimal number, whereas the K-Medoids analysis indicates two. The evaluation metrics are the Silhouette Score (SS), Davies-Bouldin Index (DBI), and Calinski-Harabasz Index (CHI). CLARA has an SS of 0.66, while K-Medoids has an SS of 0.62. The DBI values for CLARA and K-Medoids are 1.38 and 1.92, respectively, and the CHI values are 197.54 and 132.73. CLARA's SS is therefore closer to 1 and its CHI is higher. Its DBI is also lower, and a lower DBI denotes better performance. According to these findings, CLARA is the more effective of the two methods for cluster analysis of livestock population data in Indonesian districts and cities.
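The abstract gives no implementation details, so the following is only a minimal sketch of how the two methods relate, assuming Euclidean distance and a PAM-style swap search. The function names (`total_cost`, `k_medoids`, `clara`) and the parameters `n_samples` and `sample_size` are hypothetical and are not taken from the paper.

```python
import math
import random


def total_cost(points, medoids):
    """Sum of each point's Euclidean distance to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)


def k_medoids(points, k, rng):
    """PAM-style search: swap a medoid for a non-medoid while the cost drops."""
    medoids = rng.sample(points, k)  # medoids are always actual data points
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in points:
                if candidate in medoids:
                    continue
                # Try replacing medoid i with this candidate point.
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    return medoids, best


def clara(points, k, rng, n_samples=5, sample_size=20):
    """Run k_medoids on random subsamples and keep the medoids that are
    cheapest when scored against the full dataset."""
    best_medoids, best_cost = None, math.inf
    for _ in range(n_samples):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids, _sample_cost = k_medoids(sample, k, rng)
        cost = total_cost(points, medoids)  # evaluate on all points
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost
```

Both functions expect `points` as a list of equal-length numeric tuples and a `random.Random` instance for reproducibility, for example `clara(points, 3, random.Random(0))`. The swap search rescans every point on each pass, which is why CLARA restricts it to small samples when the dataset is large.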
Manifold Learning and Undersampling Approaches for Imbalanced Class Sentiment Classification
Jumansyah, L. M. Risman Dwi; Soleh, Agus Mohamad; Syafitri, Utami Dyah
Knowledge Engineering and Data Science Vol 7, No 2 (2024)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v7i22024p139-151

Abstract

Movie reviews influence audience decisions and therefore play a crucial role in a film's success. Automating sentiment classification is essential for efficient analysis of public opinion, but it faces challenges such as high-dimensional data and imbalanced class distributions. This study addresses these issues by applying manifold learning techniques, namely Principal Component Analysis (PCA) and Laplacian Eigenmaps (LE), to reduce data complexity, and undersampling strategies, namely Random Undersampling (RUS) and EasyEnsemble, to balance the data and improve predictions for both sentiment classes. On reviews of The Raid 2: Berandal, EasyEnsemble achieved the highest average G-Mean, 0.694, using Term Frequency-Inverse Document Frequency (TF-IDF) features with a linear kernel and no dimensionality reduction. RUS provided balanced but inconsistent results, while Random Oversampling (ROS) combined with PCA (85% cumulative variance) improved predictions for negative reviews. Laplacian Eigenmaps with 500 dimensions were effective for negative reviews but less accurate for positive ones. This study highlights EasyEnsemble's superior performance in addressing class imbalance, though optimization with manifold learning remains challenging.
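To make the evaluation metric and the simplest balancing strategy concrete, here is a minimal sketch, assuming a binary problem where G-Mean is the geometric mean of sensitivity and specificity and where RUS trims every class to the size of the smallest one. The function names `g_mean` and `random_undersample` are illustrative assumptions, not the authors' code, and `g_mean` assumes both classes occur in `y_true`.

```python
import math
import random


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return math.sqrt(sensitivity * specificity)


def random_undersample(samples, labels, rng):
    """Randomly drop samples until every class matches the smallest class."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(xs) for xs in by_class.values())
    balanced = [(x, y) for y, xs in by_class.items()
                for x in rng.sample(xs, n_min)]
    rng.shuffle(balanced)
    xs, ys = zip(*balanced)
    return list(xs), list(ys)
```

Because G-Mean multiplies the two per-class recalls, a classifier that ignores one class entirely scores 0, which is why it suits imbalanced data better than plain accuracy.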
Clustering Time Series Forecasting Model for Grouping Provinces in Indonesia Based on Granulated Sugar Prices
Amatullah, Fida Fariha; Ilmani, Erdanisa Aghnia; Fitrianto, Anwar; Erfiani, Erfiani; Jumansyah, L. M. Risman Dwi
Journal of Applied Informatics and Computing Vol. 9 No. 1 (2025): February 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i1.8840

Abstract

Time series clustering is the process of organizing series into groups based on similarities in their patterns. This research uses the prices of granulated sugar in each province of Indonesia. According to USDA reports, sugar consumption in Indonesia in 2023 reached 7.9 million tons. On April 26, 2024, the price of granulated sugar peaked in the Papua Mountains at Rp29,320 per kg, while the lowest price was recorded in the Riau Islands at Rp16,460 per kg. The research aims to cluster provinces based on the characteristics of granulated sugar prices and to build a forecasting model for each cluster. Two clusters were formed based on the price patterns of granulated sugar over time. The provinces of Papua and West Papua are in cluster 2, while the other 30 provinces are in cluster 1. The best model found by the auto ARIMA method is ARIMA(2, 1, 0) for cluster 1, with a MAPE of 2.36%, and ARIMA(1, 1, 1) for cluster 2, with a MAPE of 2.59%. Both values are below 10%, indicating that the models built using the auto ARIMA method for clusters 1 and 2 are suitable for forecasting.
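The models above are judged by MAPE against a 10% threshold. As a minimal sketch of that metric, assuming the standard definition (the mean of absolute errors relative to the actual values, expressed as a percentage), the hypothetical helper `mape` below is illustrative only and is not the authors' code.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes equal-length sequences and no zero values in `actual`.
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equal length")
    errors = (abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * sum(errors) / len(actual)
```

For instance, with made-up values, `mape([100.0, 200.0], [110.0, 180.0])` averages two 10% errors, so the result would fall right on the suitability threshold cited in the abstract rather than below it.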