The large volume of posted tweets means that the Twitter homepage shows highly diverse content that is not grouped into categories such as health, sports, technology, economics, or tourism. Without categorization, users have difficulty reading or retrieving information in the categories that interest them. One solution is text classification, which automatically assigns unstructured natural-language text to a set of categories. This research classifies tweets using the Naive Bayes method with query expansion, which adds terms to the original document. The added terms are intended to improve classification because a tweet is a short text, which can make its class ambiguous. The added terms are hyponyms and hypernyms of terms in the original documents, extracted from WordNet. Accuracy is evaluated with k-fold cross-validation to test the robustness of the system. The accuracy obtained was 72% for classification without query expansion, 65.75% with both hyponyms and hypernyms added, 66.3% with hyponyms only, and 67.5% with hypernyms only. Because every expansion variant scored below the unexpanded baseline, it can be concluded that query expansion was not effective in improving the accuracy of the classification process.
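The pipeline described above (term expansion, Naive Bayes classification, k-fold evaluation) can be illustrated with a minimal sketch. It is not the implementation used in this research: the `TOY_HYPERNYMS` dictionary is a hypothetical stand-in for WordNet lookups, the multinomial model with Laplace smoothing is an assumed variant of Naive Bayes, and the function names `expand`, `train`, `predict`, and `k_fold_accuracy` are chosen for illustration only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for WordNet: term -> hypernyms (broader terms).
TOY_HYPERNYMS = {
    "football": ["sport"],
    "tennis": ["sport"],
    "flu": ["disease"],
    "fever": ["symptom"],
}


def expand(tokens, relations):
    """Append related terms (e.g. hypernyms) to the original tokens."""
    extra = [rel for tok in tokens for rel in relations.get(tok, [])]
    return tokens + extra


def train(docs, labels):
    """Collect class counts and per-class term counts (multinomial Naive Bayes)."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, term_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log-posterior, using Laplace smoothing."""
    class_counts, term_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):  # sorted for a deterministic tie-break
        score = math.log(class_counts[label] / total_docs)
        denom = sum(term_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # ignore terms never seen in training
                score += math.log((term_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_accuracy(docs, labels, k):
    """Average accuracy over k folds; fold i holds out documents i, i+k, i+2k, ..."""
    accuracies = []
    for i in range(k):
        test_idx = set(range(i, len(docs), k))
        train_docs = [d for j, d in enumerate(docs) if j not in test_idx]
        train_labels = [l for j, l in enumerate(labels) if j not in test_idx]
        model = train(train_docs, train_labels)
        correct = sum(predict(model, docs[j]) == labels[j] for j in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Usage: the unseen tweet shares no original term with the training data,
# but its added hypernym "sport" links it to the sports class.
model = train(
    [expand(["football", "match"], TOY_HYPERNYMS),
     expand(["flu", "vaccine"], TOY_HYPERNYMS)],
    ["sports", "health"],
)
print(predict(model, expand(["tennis", "serve"], TOY_HYPERNYMS)))  # sports
```

In this toy case expansion helps because the added term overlaps with the training vocabulary. The same mechanism can also add terms shared across categories, which is one possible reason expansion may lower accuracy on real data.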
Copyrights © 2018