Journal : Bulletin of Applied Mathematics and Mathematics Education

Optimization of feature selection on semi-supervised data
Authors : Wijayanti, Dian Eka; Afriyani, Sintia; Surono, Sugiyarto; Dewi, Deshinta Arrova
Bulletin of Applied Mathematics and Mathematics Education Vol. 4 No. 2 (2024)
Publisher : Universitas Ahmad Dahlan

DOI: 10.12928/bamme.v4i1.11104

Abstract

This research explores feature selection optimization for semi-supervised text data by combining train/test splitting with pseudo-labeling. Three split proportions were tested (70:30, 80:20, and 90:10), using TF-IDF term weighting and particle swarm optimization (PSO) for feature selection. Pseudo-labeling assigned positive, negative, and neutral labels to the training data to give the classification model more information for the testing phase. The results report that the linear SVM model achieved the highest accuracy, 0.9051, at the 90:10 split, followed by Random Forest with an accuracy of 0.9254. RBF SVM and Poly SVM also yielded good results, while KNN performed worse. These findings emphasize the importance of feature selection strategies and of pseudo-labeling for improving classification models on semi-supervised text data, with potential applications across domains that rely on semi-supervised text analysis.
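
The TF-IDF weighting and PSO feature selection named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, its labels, the PSO settings (`w`, `c1`, `c2`, particle and iteration counts), the feature-count penalty, and the nearest-centroid classifier standing in for the SVM, Random Forest, and KNN models are all assumptions made for illustration. It uses one common TF-IDF variant and a binary PSO in which each particle is a keep/drop mask over the vocabulary.

```python
import math
import random
from collections import Counter

def tfidf(docs):
    """Return (vocab, rows) using tf * log(N / df), one common TF-IDF variant."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(docs)
    df = Counter(t for doc in tokenized for t in set(doc))
    rows = []
    for doc in tokenized:
        tf = Counter(doc)
        rows.append([tf[t] / len(doc) * math.log(n / df[t]) for t in vocab])
    return vocab, rows

def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier restricted to masked features."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    centroids = {}
    for label in sorted(set(y_train)):  # sorted for a deterministic tie-break
        members = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(x[j] for x in members) / len(members) for j in idx]
    correct = 0
    for x, y in zip(X_test, y_test):
        pred = min(centroids, key=lambda c: sum(
            (x[j] - cj) ** 2 for j, cj in zip(idx, centroids[c])))
        correct += pred == y
    return correct / len(y_test)

def binary_pso(fitness, n_features, n_particles=8, n_iters=20, seed=0):
    """Binary PSO: velocities are real-valued, positions are resampled bits."""
    rng = random.Random(seed)
    pos = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp to keep sigmoid stable
                pos[i][j] = rng.random() < 1.0 / (1.0 + math.exp(-vel[i][j]))
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Invented toy corpus; IDF is computed on the whole corpus for brevity.
docs = ["great product love it", "love the great service",
        "terrible product hate it", "hate the terrible service",
        "great love", "terrible hate"]
labels = ["positive", "positive", "negative", "negative", "positive", "negative"]
vocab, X = tfidf(docs)
split = 4  # toy split only; the abstract reports 70:30, 80:20, and 90:10
X_train, y_train, X_test, y_test = X[:split], labels[:split], X[split:], labels[split:]

def fitness(mask):
    """Accuracy minus a small (assumed) penalty for keeping many features."""
    acc = centroid_accuracy(X_train, y_train, X_test, y_test, mask)
    return acc - 0.01 * sum(mask) / len(mask)

best_mask, best_fit = binary_pso(fitness, len(vocab))
selected_terms = [t for t, keep in zip(vocab, best_mask) if keep]
print(selected_terms, round(best_fit, 4))
```

In a fuller pipeline the fitness function would wrap whichever classifier is being evaluated, and the pseudo-labeled training data would take the place of the hand-labeled toy examples used here.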