Twitter is one of the social networks with the most active users today. With information flowing openly, users are quick to post texts, or tweets, about other users, and the large number of Twitter users produces many tweets related to issues of ethnicity, religion, race, and intergroup relations (SARA). Because Twitter cannot screen the content of tweets that contain SARA issues, research is needed to classify tweets into the categories SARA Issue or Not SARA Issue. The classification proceeds in several steps. Preprocessing comes first and consists of cleaning, case folding, tokenisation, filtering, and stemming. It is followed by term weighting and then by classification using the Improved K-Nearest Neighbor method. Based on the implementation and testing carried out in this research on SARA issue classification on Twitter using Improved K-NN, the best results were an average Precision of 0.976422, Recall of 1, F-Measure of 0.987944444, and Accuracy of 96%, obtained with 320 documents as training data and 80 documents as test data. The number of documents, the proportion or balance of the training data, and the value of k used determine how well the documents are classified.
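The pipeline described above (preprocessing, term weighting, nearest-neighbour classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the research: the abstract does not specify the weighting scheme or the details of Improved K-NN, so TF-IDF weights and a plain similarity-weighted k-NN vote over cosine similarity stand in for them. The stop-word list, the identity `stem` function, all function names, and the toy English training tweets are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a full list
# for the language of the tweets.
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "rt"}


def stem(token):
    # Placeholder (assumption): returns the token unchanged. A real
    # pipeline would call a language-specific stemmer here.
    return token


def preprocess(text):
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)    # cleaning: drop URLs, mentions, hashtags
    text = text.lower()                                   # case folding
    tokens = re.findall(r"[a-z]+", text)                  # tokenisation
    tokens = [t for t in tokens if t not in STOP_WORDS]   # filtering
    return [stem(t) for t in tokens]                      # stemming


def fit_idf(docs):
    # Inverse document frequency over the tokenised training documents.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def weigh(tokens, idf):
    # Term weighting (assumed TF-IDF); terms unseen in training are ignored.
    tf = Counter(tokens)
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(text, train_vectors, train_labels, idf, k=3):
    # Plain k-NN: sum the similarities of the k nearest training
    # documents per label and return the label with the largest sum.
    query = weigh(preprocess(text), idf)
    ranked = sorted(
        ((cosine(query, vec), label)
         for vec, label in zip(train_vectors, train_labels)),
        reverse=True,
    )
    scores = Counter()
    for similarity, label in ranked[:k]:
        scores[label] += similarity
    return scores.most_common(1)[0][0]


# Toy training data, invented purely for illustration.
TRAIN = [
    ("Debate about religion and ethnic groups", "SARA"),
    ("Race and religion argument online", "SARA"),
    ("Football match tonight at the stadium", "NOT_SARA"),
    ("New phone launch tonight", "NOT_SARA"),
]
train_tokens = [preprocess(text) for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_tokens)
train_vectors = [weigh(tokens, idf) for tokens in train_tokens]

print(classify("religion and race debate", train_vectors, train_labels, idf))
```

In a sketch like this, the choice of `k` and the balance between the two labels in the training set directly change which neighbours enter the vote, which is consistent with the abstract's closing remark that these factors determine classification quality.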
Copyrights © 2020