The use of short digital text continues to grow and now extends across many social media platforms. On Twitter, news tweets carry information that falls into different types, and categorizing tweets by type makes the content easier for users to navigate. The purpose of this classification is to evaluate and improve how well social media groups the content it provides into categories. Traditional classification methods are still in use, but their results are not always optimal, so word expansion is needed to add related words to the text and improve accuracy. Word expansion here uses a distributional semantic technique based on Euclidean distance to find the closest words in an external source, which are then added as expansion terms to the test text. With 400 training items and 105 test items, K-Nearest Neighbor classification reaches 90% accuracy at K=5, which corresponds to the results of tests run without word expansion. When word expansion is added with a threshold of 0.5 and the same K=5, accuracy rises to 92%.
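The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy two-dimensional word vectors, the training tweets, and the function names `expand` and `knn_predict` are all hypothetical, and the threshold of 0.5 is interpreted here as a maximum Euclidean distance, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical toy vectors standing in for the external word source.
EMBEDDINGS = {
    "goal": (0.9, 0.1), "match": (0.8, 0.2), "striker": (1.0, 0.3),
    "vote": (0.1, 0.9), "election": (0.2, 0.8), "senate": (0.0, 1.0),
}

# Hypothetical labeled training tweets, already tokenized.
TRAIN = [
    (["goal", "match"], "sports"),
    (["match", "striker"], "sports"),
    (["striker", "goal"], "sports"),
    (["vote", "election"], "politics"),
    (["election", "senate"], "politics"),
    (["senate", "vote"], "politics"),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def expand(tokens, embeddings=EMBEDDINGS, threshold=0.5):
    """Append, for each known token, its closest external word.

    Assumption: a word is added only if its distance to the token
    is at most `threshold`.
    """
    expanded = list(tokens)
    for tok in tokens:
        if tok not in embeddings:
            continue  # no vector available, nothing to expand
        best_word, best_dist = None, float("inf")
        for cand, vec in embeddings.items():
            if cand in expanded:
                continue  # skip words already in the text
            dist = euclidean(embeddings[tok], vec)
            if dist < best_dist:
                best_word, best_dist = cand, dist
        if best_word is not None and best_dist <= threshold:
            expanded.append(best_word)
    return expanded


def bow_distance(a, b):
    """Euclidean distance between bag-of-words count vectors."""
    ca, cb = Counter(a), Counter(b)
    return math.sqrt(sum((ca[w] - cb[w]) ** 2 for w in set(ca) | set(cb)))


def knn_predict(train, tokens, k=5):
    """Majority vote among the k training texts nearest to `tokens`."""
    neighbors = sorted(train, key=lambda item: bow_distance(item[0], tokens))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


query = expand(["goal"])            # expansion adds the nearest word
label = knn_predict(TRAIN, query)   # classify the expanded text with K=5
```

Expanding only the test text, as here, mirrors the abstract's description; whether the training text is also expanded is left open by the source.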
Copyright © 2018