Model Klastering Hybrid Menggunakan Inisialisasi K-means++ dan Algoritma Optimasi Grey Wolf
Mukti, Bayu Priya
JUSTIN (Jurnal Sistem dan Teknologi Informasi) Vol 13, No 2 (2025)
Publisher : Jurusan Informatika Universitas Tanjungpura

DOI: 10.26418/justin.v13i2.88211

Abstract

This study develops GWO-KMeans++, a hybrid clustering model that integrates the Grey Wolf Optimizer (GWO) with K-Means++ to address the local optima problem in centroid initialization. The model was tested on five UCI datasets (Seeds, Wine, Sonar, Bank, Forest) with diverse characteristics, ranging from low-dimensional agricultural data (6 features) to noisy sonar signals (60 features). Performance was measured with the Silhouette Score (SC) and the Davies-Bouldin Index (DB) for cluster counts k = 2 to 10, then compared against K-Means++ using the Wilcoxon signed-rank test. The results show that GWO-KMeans++ improves SC by 19.71% to 24.59% (Seeds, k = 5 to 7), 56.81% (Wine, k = 5), and 210.85% (Sonar, k = 2), and reduces DB by up to 22.19% (Seeds, k = 7) and 28.02% (Wine, k = 5). Statistical testing confirms that the SC improvement is significant on all datasets (p < 0.05), with p = 0.0039 (Seeds, Wine, Sonar), p = 0.0117 (Bank), and p = 0.0273 (Forest). The DB improvement, however, is significant only on Seeds (p = 0.0117) and Wine (p = 0.0078). Cluster visualizations show more clearly separated data distributions and more accurate centroids, particularly on multidimensional (Wine) and noisy (Sonar) data. The model is stable for k = 3 to 6 and suited to nonlinear data, with applications ranging from bioinformatics to financial fraud detection. Recommended follow-up work includes tuning the GWO parameters, integrating dimensionality reduction (PCA), and testing on big data datasets.
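
The reported p-values can be sanity-checked with a short worked example. This is a minimal sketch, assuming (the abstract does not say so) that each test paired the two models' scores across the nine cluster counts k = 2 to 10 and used the exact two-sided Wilcoxon signed-rank distribution with no ties or zero differences. The function names and the paired scores below are hypothetical and are not taken from the paper.

```python
def signed_rank_statistic(baseline, improved):
    """Return W = min(W+, W-) for paired scores (assumes no ties, no zero differences)."""
    diffs = [b - a for a, b in zip(baseline, improved)]
    if any(d == 0 for d in diffs) or len({abs(d) for d in diffs}) != len(diffs):
        raise ValueError("this sketch does not handle zero or tied differences")
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    w_plus = w_minus = 0
    for rank, idx in enumerate(order, start=1):
        if diffs[idx] > 0:
            w_plus += rank
        else:
            w_minus += rank
    return min(w_plus, w_minus)


def exact_two_sided_p(w, n):
    """Exact two-sided p-value: 2 * P(W <= w) when all 2^n sign patterns are equally likely."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of the ranks {1..n} whose sum is s.
    counts = [1] + [0] * max_sum
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    tail = sum(counts[: w + 1])
    return min(1.0, 2 * tail / 2 ** n)


# Hypothetical silhouette scores for k = 2..10 (illustrative only, not the paper's data).
baseline = [0.50, 0.45, 0.40, 0.38, 0.35, 0.33, 0.30, 0.28, 0.25]
improved = [0.51, 0.47, 0.43, 0.42, 0.40, 0.39, 0.37, 0.36, 0.34]

w = signed_rank_statistic(baseline, improved)
print(w, round(exact_two_sided_p(w, 9), 4))   # 0 0.0039

for w_value in (1, 2, 4):
    print(w_value, round(exact_two_sided_p(w_value, 9), 4))
```

Under these assumptions the smallest attainable p-value with nine pairs is 2/512 ≈ 0.0039, reached when one model wins at every k. The other values in the abstract (0.0078, 0.0117, 0.0273) would then correspond to W = 1, 2, and 4.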

Leveraging TF-IDF and Random Forest to Uncover Genre Patterns in Google Books Metadata
Putri, Nadya Awalia; Mukti, Bayu Priya
International Journal for Applied Information Management Vol. 5 No. 4 (2025): Regular Issue: December 2025
Publisher : Bright Institute

DOI: 10.47738/ijaim.v5i4.112

Abstract

This paper presents a machine learning-based approach for classifying books into genres using their descriptions. We employed a Random Forest classifier combined with Term Frequency-Inverse Document Frequency (TF-IDF) to convert text descriptions into numerical features, enabling the classification of books into six genres: Fiction, Literary Criticism, Education, Social Science, Biography & Autobiography, and Unknown Genre. The model was trained and evaluated on a dataset sourced from Google Books, which was preprocessed to remove missing data and clean the text descriptions by eliminating punctuation, numbers, and stopwords. We performed 5-fold cross-validation to assess the model's performance, which resulted in an average cross-validation accuracy of 64.22%. The final model achieved an accuracy of 62.71% on the test set, with the highest recall observed in the "Fiction" genre. The results indicated that the Random Forest classifier was particularly effective in classifying well-represented genres like "Fiction" and "Unknown Genre." However, genres with fewer samples, such as "Social Science" and "Biography & Autobiography," showed poor performance, highlighting the challenges posed by class imbalance and data sparsity. A confusion matrix and classification report revealed these discrepancies, with certain genres being misclassified more often than others. This research demonstrates the feasibility of using machine learning for automated book genre classification, offering significant potential for enhancing book recommendation systems and improving user experience. Despite its promising results, the study's limitations, including data sparsity and genre imbalance, suggest that further work is needed to refine the model. Future research could explore the use of deep learning techniques and the expansion of the dataset to address these issues and improve genre classification accuracy. 
The potential for automated genre classification in real-world applications, such as book categorization and personalized recommendations, presents an exciting direction for the book industry.
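
The preprocessing and feature-extraction step described in this abstract can be illustrated with a small example. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the textbook weighting tf × ln(N / df), a tiny made-up stopword list, and three invented book descriptions. The abstract does not specify the TF-IDF variant, the stopword list, or the tooling, and the Random Forest classifier and cross-validation are not shown here.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the paper's list is not given).
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to"}


def clean(text):
    """Lowercase, replace punctuation and digits with spaces, then drop stopwords."""
    letters_only = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in letters_only.split() if tok not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * ln(N / df)."""
    tokenized = [clean(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return vectors


# Hypothetical book descriptions (illustrative only).
descriptions = [
    "A novel of love and war in 1920 Paris.",
    "The life of a president: a biography.",
    "A novel about the war.",
]

vectors = tfidf(descriptions)
print(clean(descriptions[0]))
print(vectors[0])
```

In this example, "paris" appears in only one description, so it receives a higher weight than "novel" or "war", which appear in two. Vectors of this kind are what a classifier such as a Random Forest would take as input.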