A stopword is an insignificant word in a sentence. Stopword lists support text preprocessing, specifically the stopword removal stage. A digital library is often used at this stage to obtain a stopword list. However, not every word in a digital library's stopword list is unimportant in the data at hand. The main focus of this research was to examine how forming a stopword list with the Zipf's Law method, together with word weighting, affects the classification of product review documents. The method used for word weighting was Augmented Term Frequency - Probability Inverse Document Frequency. The document classification process was used to measure the effect of stopword list formation and word weighting. Documents were classified using the Support Vector Machine algorithm with a polynomial kernel. The output of the research was the classification accuracy. Based on the classification accuracy, both the formation of the stopword list and the weighting of words affected the classification result. The best classification accuracy was obtained when 15% of the terms with a low constant value were taken to form the stopword list. The corresponding results were a precision of 0.73, a recall of 0.7, and an F-measure of 0.63.
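As a rough illustration of the two preprocessing steps summarised above, the following Python sketch builds a stopword list from terms with a low Zipf constant and then weights the remaining terms. It rests on assumptions that this abstract does not confirm: the Zipf constant is taken as rank times corpus frequency, augmented term frequency as 0.5 + 0.5 * tf / max_tf, and probabilistic inverse document frequency as max(0, log((N - df) / df)), which are the common textbook forms. All function names are hypothetical, and the tie-breaking rule exists only to make the output deterministic.

```python
import math
from collections import Counter


def zipf_constants(docs):
    """Return {term: rank * frequency}, ranking terms by corpus frequency."""
    freq = Counter(term for doc in docs for term in doc)
    # Descending frequency; ties broken alphabetically (an assumption).
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term: rank * count
            for rank, (term, count) in enumerate(ranked, start=1)}


def low_constant_stopwords(docs, fraction):
    """Take the given fraction of the vocabulary with the lowest constants."""
    constants = zipf_constants(docs)
    k = int(len(constants) * fraction)
    ordered = sorted(constants, key=lambda t: (constants[t], t))
    return set(ordered[:k])


def atf_pidf(docs, stopwords=frozenset()):
    """Weight each term per document with augmented TF times probabilistic IDF."""
    filtered = [[t for t in doc if t not in stopwords] for doc in docs]
    n = len(filtered)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in filtered for t in set(doc))
    weights = []
    for doc in filtered:
        tf = Counter(doc)
        max_tf = max(tf.values(), default=0)
        doc_weights = {}
        for term, count in tf.items():
            atf = 0.5 + 0.5 * count / max_tf
            if df[term] < n:
                pidf = max(0.0, math.log((n - df[term]) / df[term]))
            else:
                pidf = 0.0  # term occurs in every document
            doc_weights[term] = atf * pidf
        weights.append(doc_weights)
    return weights


# Toy tokenised reviews, invented purely for illustration.
docs = [["good", "phone", "good"], ["bad", "phone"], ["good", "battery"]]
stopwords = low_constant_stopwords(docs, 0.15)
weights = atf_pidf(docs, stopwords)
```

The resulting per-document weight dictionaries could then be turned into feature vectors for a Support Vector Machine with a polynomial kernel. With only four distinct terms in the toy corpus, a 15% fraction selects no stopwords, so a realistic vocabulary is needed to see the removal step take effect.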
Copyrights © 2020