Jurnal Sistem Informasi dan Informatika (SIMIKA)
Vol. 9 No. 1 (2026): Jurnal Sistem Informasi dan Informatika (Simika)

COMPARISON OF SPLIT DATA RATIO PERFORMANCE IN SENTIMENT ANALYSIS OF PON XXI ACEH-SUMUT 2024 USING SUPPORT VECTOR MACHINE WITH SMOTE APPLICATION

Karina Shafa Amalia (Telkom University)
Anisa Dzulkarnain (Telkom University)
Berlian Rahmy Lidyawati (Telkom University)



Article Info

Publish Date
08 Feb 2026

Abstract

The 21st National Sports Week (PON XXI) Aceh-North Sumatra 2024 is the largest multi-sport competition in Indonesia, and it sparked diverse public responses on social media, particularly on X (formerly Twitter). The high volume and variety of comments about PON XXI make it difficult to understand public sentiment and communication patterns. This study compares the performance of several training/testing split ratios for a Support Vector Machine (SVM) with an RBF kernel in classifying the sentiment of X posts about PON XXI Aceh-North Sumatra 2024. Data were collected with the Tweet Harvest library, yielding 2,503 Indonesian-language posts published between 9 August and 20 October 2024. Text preprocessing comprised cleaning, case folding, normalisation, tokenisation, stop word removal, and stemming. Each post was labelled with one of three sentiment categories: positive, negative, or neutral. Four split ratios were evaluated: 90:10, 80:20, 70:30, and 60:40. The Synthetic Minority Over-sampling Technique (SMOTE) was applied to address class imbalance. The 80:20 split performed best, with an accuracy of 86.23%, precision of 86.10%, recall of 86.23%, and F1 score of 86.15%. These findings indicate that the choice of split ratio materially influences model performance, and they offer guidance for developing more accurate and representative public opinion analysis models for Indonesian social media content.
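The split-then-oversample procedure described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code. The toy two-dimensional points stand in for text features, and the class sizes, the helper names `stratified_split` and `smote`, and the choice of `k=5` neighbours are all assumptions made for illustration. The sketch also assumes that SMOTE is applied to the training portion only, so the test portion keeps its original class distribution; the abstract does not state the order of these steps.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_split(X, y, train_fraction, seed=0):
    """Split per class so both parts keep roughly the original class shares."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idx = by_class[label][:]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx += idx[:cut]
        test_idx += idx[cut:]

    def pick(ids):
        return [X[i] for i in ids], [y[i] for i in ids]

    return pick(train_idx), pick(test_idx)


def smote(X, y, k=5, seed=0):
    """SMOTE-style oversampling: grow each minority class to the majority size
    by interpolating between a sample and one of its k nearest same-class
    neighbours. A hand-rolled illustration, not a library implementation."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label in sorted(counts):
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # interpolation needs at least two points
        for _ in range(target - counts[label]):
            base = rng.choice(members)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical imbalanced toy data: 60 positive, 30 neutral, 10 negative.
data_rng = random.Random(42)
sizes = {"positive": 60, "neutral": 30, "negative": 10}
centres = {"positive": (2.0, 2.0), "neutral": (0.0, 0.0), "negative": (-2.0, -2.0)}
X, y = [], []
for label, n in sizes.items():
    cx, cy = centres[label]
    for _ in range(n):
        X.append([data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0)])
        y.append(label)

# The four ratios from the abstract: 90:10, 80:20, 70:30, 60:40.
results = {}
for ratio in (0.9, 0.8, 0.7, 0.6):
    (X_train, y_train), (X_test, y_test) = stratified_split(X, y, ratio)
    X_bal, y_bal = smote(X_train, y_train)
    results[ratio] = (Counter(y_train), Counter(y_bal), len(y_test))
    print(ratio, dict(results[ratio][0]), dict(results[ratio][1]), results[ratio][2])
```

In a full pipeline the balanced training set would then be passed to an RBF-kernel SVM and scored on the untouched test portion. Oversampling before the split would let synthetic points derived from test posts leak into training, which is why this sketch oversamples afterwards.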

Copyright © 2026

Journal Info

Abbrev

jsii

Subject

Computer Science & IT; Control & Systems Engineering

Description

Jurnal Sistem Informasi dan Informatika aims to provide scientific literature specifically on studies of applied research in information systems (IS), information technology (IT) and public review of the development of theory, method, and applied sciences related to the ...