Social media is one of the most widely used sources of information in Indonesia. This high level of use increases the risk that negative content will spread. In 2018, the Ministry of Communication and Information received 547,506 complaints about negative content on social media, and Twitter was the most frequently reported platform. Checking this volume of complaints manually is impractical. Therefore, the authors propose building a negative content detector for Twitter documents. This research uses the Support Vector Machine method for classification and Pipeline for hashtag segmentation. The process begins with data preprocessing, followed by hashtag segmentation with Pipeline, term weighting using Term Frequency-Inverse Document Frequency, and classification using Support Vector Machine. Testing was carried out with K-Fold Cross Validation on 300 data items divided into 10 folds. The highest accuracy obtained was 0.8325, with learning rate = 0.0001, complexity = 0.001, lambda = 0.1, epsilon = 0.0001, and maximum iteration = 50.
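The weighting and evaluation steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: the TF-IDF variant (term frequency normalized by document length, idf = log(N/df)), the contiguous fold split, the function names `tf_idf` and `k_fold_indices`, and the toy tweets are all assumptions made for illustration. Hashtag segmentation and the Support Vector Machine itself are omitted.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds take one extra sample each.
        stop = start + fold_size + (1 if fold < remainder else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# Toy tokenized tweets, invented for illustration (not the study's data).
tweets = [
    ["berita", "palsu", "menyebar"],
    ["berita", "olahraga", "hari", "ini"],
    ["konten", "negatif", "menyebar"],
]
weights = tf_idf(tweets)

# 300 items in 10 folds, matching the sizes reported in the abstract.
folds = list(k_fold_indices(300, 10))
```

With 300 items and 10 folds, each round in this sketch trains on 270 items and tests on the remaining 30. In practice the data would typically be shuffled before splitting.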
Copyrights © 2021