A beauty vlogger is a person who produces video blogs (vlogs) on YouTube discussing beauty topics and makeup tutorials. Beauty vloggers often receive body-shaming comments. In Indonesia, body-shaming comments are an offense regulated under the Electronic Information and Transactions Act (UU ITE). A body-shaming comment classification system can help classify such comments faster and more efficiently. The system in this research uses the BM25 and K-Nearest Neighbor methods. The process is as follows: each comment is pre-processed to find its characteristic words; the term frequency is calculated from the word counts in each comment; the inverse document frequency is calculated; the BM25 score is calculated and the data are sorted by that score; finally, K-Nearest Neighbor classification is performed. This study uses 600 comments, 300 in the body-shaming class and 300 in the non-body-shaming class. Averaged over all k-fold cross-validation tests, the highest values obtained were precision = 0.87153019, recall = 0.86666667, f-measure = 0.86606885, and accuracy = 0.86666667, at k = 3. Testing on balanced data gave much better results than testing on unbalanced data, for which the highest average values were precision = 0.84306693, recall = 0.775, f-measure = 0.7582337, and accuracy = 0.775.
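The pipeline described above (pre-processing, term frequency, inverse document frequency, BM25 scoring, sorting, and a K-Nearest Neighbor vote) can be illustrated with a minimal Python sketch. It is not the implementation used in this research. The function names, the whitespace tokenizer standing in for pre-processing, the parameters `k1 = 1.2` and `b = 0.75`, the non-negative IDF variant, the use of the BM25 score as the similarity for the neighbor vote, and the toy English comments are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed stand-in for pre-processing: lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Score every training document against the query with BM25.

    k1 and b are common default values, not values reported in the study.
    """
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))

    scores = []
    for d in docs_tokens:
        tf = Counter(d)  # term frequency within this document
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            # Non-negative IDF variant (assumed): log(1 + (N - df + 0.5) / (df + 0.5)).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def knn_classify(comment, train_texts, train_labels, k=3):
    """Label a comment by majority vote among the k highest-scoring training comments."""
    docs = [tokenize(t) for t in train_texts]
    scores = bm25_scores(tokenize(comment), docs)
    # Sort training comments by BM25 score, highest (most similar) first.
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training data, invented for illustration only.
train_texts = [
    "you look so fat in this video",
    "your face is too fat and ugly",
    "so skinny you look sick",
    "love this tutorial thanks",
    "great lipstick color tutorial",
    "thanks for the great review",
]
train_labels = [
    "body_shaming", "body_shaming", "body_shaming",
    "not_body_shaming", "not_body_shaming", "not_body_shaming",
]

print(knn_classify("she looks fat and ugly", train_texts, train_labels, k=3))
print(knn_classify("great tutorial thanks", train_texts, train_labels, k=3))
```

In this sketch a higher BM25 score means a closer neighbor, so the k top-ranked training comments play the role of the k nearest neighbors. An odd k such as 3 avoids tied votes when there are two classes.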
Copyright © 2019