This study optimizes the Random Forest algorithm for sentiment analysis of Indonesian-language posts on the social media platform X. TextBlob serves as an automatic labeling tool, SMOTE is used to balance the classes, and GridSearch is used for hyperparameter optimization. The dataset consists of 611 tweets collected with the keyword UKT (single tuition fee). Labeling with TextBlob yielded 438 negative and 173 positive tweets. The data were first split into 75% training data and 25% test data, and SMOTE was then used to balance the classes. The text was vectorized with TF-IDF. The initial Random Forest model reached an accuracy of 73% on the split data and 75% under 10-fold cross-validation. After GridSearch hyperparameter optimization, accuracy rose to 74% on the split data and 89% under 10-fold cross-validation. The results indicate that SMOTE was effective in balancing the imbalanced data and that GridSearch hyperparameter optimization improved the accuracy of the Random Forest algorithm in classifying the sentiment of Indonesian-language X posts labeled automatically with TextBlob.
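The balancing step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the study's code; the abstract does not say which implementation was used. The function name `smote`, its parameters, and the toy 2-D vectors (standing in for TF-IDF rows) are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration, not a library API.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 2-D feature vectors with made-up values (not data from the study).
majority = [[0.1 * i, 0.2 * i] for i in range(8)]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Generate just enough synthetic points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice a maintained implementation would normally be preferred over a hand-written one. A common design choice, assumed here rather than stated in the abstract, is to oversample only the training split so that the test data contain no synthetic samples.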
Copyrights © 2025