Optimizing Clustering of Indonesian Text Data Using Particle Swarm Optimization Algorithm: A Case Study of the Quran Translation
R Wahyudi, M Didik; Fatwanto, Agung
Telematika Vol 17, No 1: February (2024)
Publisher : Universitas Amikom Purwokerto

DOI: 10.35671/telematika.v17i1.2724

Abstract

The Quran, considered the holy book for Muslims, contains scientific and historical facts affirming Islam's truth, beauty, and influence on human life. Consequently, the Quran text and its translations are valuable sources for text mining research, particularly for studying the interrelationship of its verses. Clustering is one approach to grouping objects algorithmically, with K-Means Clustering being a prominent example. However, K-Means results are often suboptimal because the centroids are selected at random. To address this, the study proposes using the Particle Swarm Optimization (PSO) algorithm to select the centroids. The hybrid PSO algorithm initiates a single iteration of the K-Means algorithm. It concludes either upon reaching the maximum iteration limit or when the average shift of the centroid (center-of-mass) vectors falls below 0.0001. Evaluation of the clustering results from the three models indicates that the K-Means algorithm produced the lowest Sum of Squared Error (SSE) value of 1032.19. Additionally, the hybrid PSO algorithm generated the highest Silhouette value of 0.258 and the lowest quantization value of 0.00947. Further evaluation using a confusion matrix showed that K-Means clustering had an accuracy rate of 81.7%, K-Means with PSO had 82.5%, and the combination of K-Means with hybrid PSO yielded the highest accuracy rate of 91.1% among the three grouping models.
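
The stopping rule and the SSE measure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy points, and the starting centroids (which stand in for centroids produced by a PSO run) are all assumptions made for illustration. Only the 0.0001 threshold and the maximum-iteration limit come from the abstract.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, centroids, max_iter=100, tol=1e-4):
    """Run K-Means from given starting centroids (hypothetically a PSO result).

    Stops at max_iter or when the average centroid shift falls below tol.
    """
    centroids = [list(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)
        # Average shift of the centroid vectors between iterations.
        shift = sum(dist(o, n) for o, n in zip(centroids, new_centroids)) / len(centroids)
        centroids = new_centroids
        if shift < tol:
            break
    return centroids

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(dist(p, c) for c in centroids) ** 2 for p in points)

# Toy data (made up), not the Quran translation vectors used in the study.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
start = [(1, 1), (9, 9)]  # stand-in for PSO-selected centroids
final = kmeans(points, start)
print(final, sse(points, final))
```

In this sketch the quality of the result depends entirely on `start`, which is the point the abstract makes about random centroid selection.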

Evaluation of TF-IDF Algorithm Weighting Scheme in The Qur'an Translation Clustering with K-Means Algorithm
R Wahyudi, M Didik
Journal of Information Technology and Computer Science Vol. 6 No. 2: August 2021
Publisher : Faculty of Computer Science (FILKOM) Brawijaya University

DOI: 10.25126/jitecs.202162295

Abstract

The Al-Quran translation index issued by the Ministry of Religion can be used in text mining to search for similar patterns in Al-Quran translations. This study groups sentences using the K-Means Clustering algorithm and three weighting schemes of the TF-IDF algorithm in order to find the best-performing scheme. Of the three TF-IDF weighting schemes, the traditional TF-IDF scheme obtained the highest percentage, namely 62.16%, with an average percentage of 36.12% and a standard deviation of 12.77%. The TF-IDF normalization 1 scheme showed the smallest results: a highest percentage of 48.65%, an average percentage of 25.65%, and a standard deviation of 10.16%. The TF-IDF normalization 2 scheme had the smallest standard deviation, 8.27%, with an average percentage of 28.15% and a highest percentage of 48.65%, the same as the TF-IDF normalization 1 scheme.
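
For orientation, the traditional TF-IDF weighting can be sketched as follows. The abstract does not give the formulas for any of its three schemes, so this sketch assumes the common textbook form, raw term frequency multiplied by log(N / df), and does not attempt the two normalized variants. The function name and the sample sentences are made up for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Weight each term as tf * log(N / df) (assumed textbook form)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Made-up sentences, not taken from the translation index.
docs = ["mercy and guidance", "guidance for believers", "mercy mercy"]
weights = tfidf(docs)
print(weights[2])
```

A term that occurs in every document receives a weight of zero under this form, which is one reason normalized variants are sometimes evaluated alongside it.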

Veil and Hijab: Twitter Sentiment Analysis Perspective
Lestari, Lusiana; R Wahyudi, M Didik; Kiftiyani, Usfita
IJID (International Journal on Informatics for Development) Vol. 9 No. 1 (2020): IJID June
Publisher : Faculty of Science and Technology, UIN Sunan Kalijaga Yogyakarta

DOI: 10.14421/ijid.2020.09108

Abstract

Controversies about the veil and hijab often occur in society. Especially in today's digital era, public opinion expressed through social media can greatly influence the opinions of others, regardless of whether it is positive or negative. Therefore, this research applies sentiment analysis to public opinion about the veil and hijab, using Twitter data as the research object, to determine how accurately sentiment analysis predicts positive, negative, or other sentiments. The algorithm used in this study is the Support Vector Machine (SVM), because its classification model is fairly good even when trained on a small data set. The SVM in this research was combined with the Radial Basis Function (RBF) kernel because it poses fewer numerical difficulties than the linear and polynomial kernels and because this research does not have a large number of features. The data consist of 3556 tweets. Of these, 1056 tweets were classified manually for the learning process. The remaining 2500 tweets are classified automatically with the resulting classifier model. The 1056 manually classified tweets are separated into training and testing data with a ratio of 8:2. Sentiment analysis using the Support Vector Machine algorithm with an RBF kernel, C=1, and γ=1 achieves an accuracy score of 73.6%, with a precision of 62% for negative opinions, 83% for positive opinions, 53% for neutral opinions, and 98% for irrelevant opinions that mention hijab and veil. This shows that sentiment analysis can be used to predict the negative, positive, or other sentiment of a sentence on a given topic, in this case the veil and hijab.
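
Two quantities named in this abstract, the RBF kernel with γ=1 and per-class precision, can be sketched in a few lines. This is an illustration under stated assumptions, not the study's code: the kernel uses the standard form exp(-γ‖x − y‖²), the function names are hypothetical, and the labels below are made-up toy values rather than the tweet data.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def per_class_precision(y_true, y_pred):
    """For each label: correct predictions of it / all predictions of it."""
    result = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # True labels of every item that was predicted as this label.
        truths = [t for t, p in zip(y_true, y_pred) if p == label]
        # None marks a label that was never predicted (precision undefined).
        result[label] = sum(t == label for t in truths) / len(truths) if truths else None
    return result

def accuracy(y_true, y_pred):
    """Share of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels (made up), only to show how the metrics are computed.
y_true = ["pos", "pos", "neg", "neu", "neg"]
y_pred = ["pos", "neg", "neg", "neu", "pos"]
precision = per_class_precision(y_true, y_pred)
print(precision, accuracy(y_true, y_pred))
print(rbf_kernel((0.0, 0.0), (1.0, 0.0)))
```

Reporting precision per class, as the abstract does, shows differences between classes that a single accuracy figure hides.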