Articles

Imbalanced Text Classification on Tourism Reviews using Ada-boost Naïve Bayes
Suzanti, Ika Oktavia; Kamil, Fajrul Ihsan; Rochman, Eka Mala Sari; Azis, Huzain; Suni, Alfa Faridh; Rachman, Fika Hastarita; Solihin, Firdaus
Jurnal ELTIKOM : Jurnal Teknik Elektro, Teknologi Informasi dan Komputer Vol. 9 No. 1 (2025)
Publisher : P3M Politeknik Negeri Banjarmasin

DOI: 10.31961/eltikom.v9i1.1496

Abstract

Hidden paradise is a term that aptly describes the island of Madura, which offers diverse tourism potential. Through the Google Maps application, tourists can access sentiment-based information about various attractions in Madura, serving both as a reference before visiting and as evaluation material for the local government. The Multinomial Naïve Bayes method is used for text classification due to its simplicity and effectiveness in handling text mining tasks. The sentiment classification is divided into three categories: positive, negative, and mixed. Initial analysis revealed an imbalance in sentiment data, with most reviews being positive. To address this, sampling techniques—both oversampling and undersampling—were applied to achieve a more balanced data distribution. Additionally, the Adaptive Boosting ensemble method was used to enhance the accuracy of the Multinomial Naïve Bayes model. The dataset was split into training and testing sets using ratios of 60:40, 70:30, and 80:20 to evaluate the model’s stability and reliability. The results showed that the highest F1-score, 84.1%, was achieved using the Multinomial Naïve Bayes method with Adaptive Boosting, which outperformed the model without boosting, which had an accuracy of 76%.
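
The oversampling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain random oversampling (duplicating minority-class reviews until every class matches the largest one); the abstract does not say which sampling variant the authors used, and the function name `random_oversample` and the sample reviews are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples until all classes are the same size.

    Assumption: simple random oversampling with replacement; the paper's
    actual sampling technique is not specified in the abstract.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical, mostly positive review set (illustration only).
reviews = [
    "great beach", "lovely view", "clean and calm", "nice sunset",
    "dirty toilets", "good but crowded",
]
sentiments = ["positive"] * 4 + ["negative", "mixed"]

balanced_texts, balanced_labels = random_oversample(reviews, sentiments)
print(Counter(balanced_labels))
```

In practice resampling is usually applied to the training split only, so that duplicated reviews do not leak into the test set.
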
PCA-counseled k-means and k-medoids with dimension reduction for improved in determining optimal aid clustering
Jauhari, Achmad; Suzanti, Ika Oktavia; Anamisa, Devie Rosa; Admojo, Fadhila Tangguh
Jurnal Ilmiah Kursor Vol. 13 No. 1 (2025)
Publisher : Universitas Trunojoyo Madura

DOI: 10.21107/kursor.v13i1.460

Abstract

Effective allocation requires targeted distribution of aid, which makes aid clustering a crucial component. Clustering accuracy is essential when data-driven segmentation based on important characteristics is used to target aid effectively. The study explores the combination of Principal Component Analysis (PCA), k-means, and k-medoids to enhance aid clusters, with the goal of increasing aid distribution accuracy and efficiency. The data set consists of 1600 records with 13 attributes. Preprocessing, which comprises two steps, is applied first to standardize the data. PCA is then applied, and choosing five components preserves 80% of the variance. The number of clusters is determined using PCA together with k-means and k-medoids. Comparative analysis shows higher silhouette coefficients for PCA-k-means, which indicate better clustering ability. This analysis shows that PCA-k-means is an effective technique for creating accurate and distinct clusters within a data set's structure. Clustering with the PCA-k-means algorithm produced the best results, with a silhouette score of 0.49 and a DBI score of 0.84.
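
For readers unfamiliar with the silhouette coefficient used to compare the clusterings, the sketch below computes the mean silhouette score for a labelled point set. It is a plain-Python illustration under the assumption of Euclidean distance; the toy points and the names `euclidean` and `silhouette_score` are hypothetical and unrelated to the paper's data or code.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Two well-separated toy groups (illustration only).
toy_points = [(0, 0), (0, 1), (5, 5), (5, 6)]
good = silhouette_score(toy_points, [0, 0, 1, 1])
bad = silhouette_score(toy_points, [0, 1, 0, 1])
print(good, bad)
```

A labelling that matches the visible groups scores close to 1, while one that mixes the groups scores below 0, which is the sense in which a higher coefficient indicates better clustering.
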
Integration of Concatenated Deep Learning Models with ResNet Backbone for Automated Corn Leaf Disease Identification
Imam Sudianto, Achmad; Sigit Susanto Putro; Eka Mala Sari; Ika Oktavia Suzanti; Aeri Rachmad; Wildan Surya Wijaya
BEST Vol 7 No 2 (2025): BEST
Publisher : Universitas PGRI Adi Buana Surabaya

DOI: 10.36456/3kct9e57

Abstract

Corn is one of Indonesia's food commodities and an alternative staple that supports food diversification in the country. However, leaf infections in corn plants often cause significant yield losses and threaten food security. Early detection of these diseases is very important, especially for small farmers, because conventional diagnostic methods that rely on agronomists are expensive and time-consuming. Recent advances in Agricultural Artificial Intelligence (AI) and image processing have facilitated automatic plant disease recognition through Convolutional Neural Networks (CNN). This study uses ResNet as the main backbone, combined through concatenation with MobileNetV3, DenseNet161, and GoogleNet. The dataset consists of 4,000 images distributed across four categories (gray leaf spot, common rust, northern leaf blight, and healthy leaf) and is divided into 2,560 training images, 640 validation images, and 800 test images, with image sizes adjusted to 224×224 pixels. Testing was conducted with three different optimizers, namely Adam, RMSprop, and SGD, at a learning rate of 0.01. The experimental results showed that the SGD optimizer provided the best performance, with a loss value of 0.2275, accuracy of 0.9513, precision of 0.9536, recall of 0.9513, and F1-score of 0.9512. These findings confirm that combining the ResNet, MobileNetV3, DenseNet161, and GoogleNet architectures with the SGD optimizer can significantly improve the accuracy of corn leaf disease detection, making the approach a candidate for automatic detection systems that support smart farming practices.
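
The reported split sizes are consistent with holding out 20% of the images for testing and then 20% of the remainder for validation. Those fractions are an inference from the counts, not something the abstract states, and the helper name `split_counts` is hypothetical. A short check of the arithmetic:

```python
def split_counts(total, test_fraction, val_fraction_of_rest):
    """Return (train, validation, test) sizes for a two-stage hold-out split.

    Assumption: the test set is carved out first, then the validation set
    is taken from what remains.
    """
    test = round(total * test_fraction)
    rest = total - test
    val = round(rest * val_fraction_of_rest)
    train = rest - val
    return train, val, test


counts = split_counts(4000, 0.20, 0.20)
print(counts)  # (2560, 640, 800)
```
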
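Deteksi Kemiripan Dokumen Abstrak Skripsi menggunakan Metode Jaro-Winkler Distance dan Synonym Recognition [Similarity Detection of Undergraduate Thesis Abstracts using the Jaro-Winkler Distance Method and Synonym Recognition]
Syahrullah, Muhammad; Rachman, Fika Hastarita; Suzanti, Ika Oktavia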
Sains Data Jurnal Studi Matematika dan Teknologi Vol 2, No 2: July-December 2024
Publisher : Sekolah Tinggi Agama Islam Nurul Islam Mojokerto

DOI: 10.52620/sainsdata.v2i2.136

Abstract

Natural Language Processing (NLP) continues to develop. Over the last 10 years, NLP has grown rapidly along with the increasing availability of electronic text. One application of the NLP approach is similarity detection, which measures how similar one text document is to another. A text document is a printed piece of writing intended to explain or convey certain information. In this study, the Jaro-Winkler Distance method is combined with Synonym Recognition to detect the similarity percentage of undergraduate thesis abstracts. The abstracts used are 110 thesis abstracts from the Informatics Study Program, Faculty of Engineering, Universitas Trunojoyo Madura. The experiments show that combining Jaro-Winkler Distance with Synonym Recognition is less effective, because the resulting scores are lower. The experiments used synthetic excerpt data and synthetic combined data. The synthetic data serve as ground truth for the true similarity value of each query, so that an Error Rate can be computed for the Jaro-Winkler Distance and Synonym Recognition methods. The Error Rate obtained without Synonym Recognition was 0.005511, whereas with Synonym Recognition it was 0.0397.
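
As background for the method named in the abstract, here is a minimal sketch of the standard Jaro-Winkler similarity between two strings. It assumes the usual defaults (prefix scale 0.1, prefix length capped at 4) and does not reproduce the authors' document-level pipeline or their Synonym Recognition step, neither of which is described in the abstract.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if they are equal and within this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1, max_prefix=4):
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


score = jaro_winkler("MARTHA", "MARHTA")
print(round(score, 4))  # 0.9611
```
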
Improving Computational Efficiency and Accuracy of Damerau-Levenshtein Distance for Indonesian Spelling Correction using Cosine Similarity
Husni Husni; Yoga Dwitya Pramudita; Mohammad Syarief; Army Justitia; Ika Oktavia Suzanti
Journal of Innovation Information Technology and Application (JINITA) Vol 7 No 2 (2025): JINITA, December 2025
Publisher : Politeknik Negeri Cilacap

DOI: 10.35970/jinita.v7i2.2893

Abstract

Spelling correction is an automatic feature that detects spelling errors and suggests replacement words where necessary. It is one of the crucial preprocessing phases in text mining. The Damerau-Levenshtein Distance method is a spelling correction method with high accuracy. It has four types of operations: insertion, deletion, substitution, and transposition. The basic approach to detecting spelling errors in the Indonesian language is a dictionary search. Despite its accuracy, the Damerau-Levenshtein Distance method is slow to compute. Furthermore, when the dictionary contains several suggested words at the same distance from the target word, it is difficult to prioritize the most appropriate suggestion. To overcome these problems, we introduce a caching mechanism that stores previously calculated corrections, thereby speeding up the computation. In addition, we use the cosine similarity method to rank the words in the Damerau-Levenshtein Distance results. Integrating caching and cosine similarity for ranking increases accuracy from 72.13% to 83.60%, which shows a significant improvement in both efficiency and effectiveness.
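
The idea in this abstract, caching distance computations and breaking distance ties with cosine similarity, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses the optimal string alignment (restricted) variant of Damerau-Levenshtein, Python's `functools.lru_cache` as the cache, and character-bigram count vectors for cosine similarity, since the abstract does not say how the vectors are built. The dictionary and the names `osa_distance`, `bigram_counts` and `suggest` are hypothetical.

```python
import math
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=None)
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Results are memoised, so repeated (word, candidate) pairs are not
    recomputed. Assumption: this stands in for the paper's caching.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def bigram_counts(word):
    """Character-bigram counts with boundary markers (an assumed vectorisation)."""
    padded = f"#{word}#"
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def suggest(word, dictionary, max_distance=2):
    """Candidates within max_distance, nearest first, ties broken by cosine."""
    scored = []
    for candidate in dictionary:
        dist = osa_distance(word, candidate)
        if dist <= max_distance:
            sim = cosine_similarity(bigram_counts(word), bigram_counts(candidate))
            scored.append((dist, -sim, candidate))
    return [candidate for _, _, candidate in sorted(scored)]


# Hypothetical mini-dictionary of Indonesian words (illustration only).
words = ("makan", "ikan", "akan", "malam")
suggestions = suggest("mkan", words)
print(suggestions[0])  # makan
```

Here "makan", "ikan" and "akan" are all one edit away from "mkan", so distance alone cannot rank them; the bigram cosine score puts "makan" first because it shares the most character pairs with the misspelling.
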