Categorizing text documents is one of the processes that can be implemented in text mining. Problems related to categorizing text documents are found in universities, notably in the reading room of the Faculty of Computer Science, Universitas Brawijaya (FILKOM UB). One such problem is that there is no process for categorizing thesis documents automatically, and the thesis documents in the FILKOM UB reading room are still not organized according to research focus. In this research, categorization is carried out using the BM25 and K-Nearest Neighbor methods. The process consists of pre-processing the text documents, calculating the BM25 score of each document, and then classifying the documents with the K-Nearest Neighbor algorithm. Testing uses 10-fold cross-validation, with each test using 31 testing documents and 300 training documents. Averaged over the tests, the best results were obtained at k=11, with an f-measure of 0.9092, a recall of 0.9087, and a precision of 0.9265. Larger values of k make the classification process less effective, because they produce a smaller f-measure.
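The three steps named above (pre-processing, BM25 scoring, K-Nearest Neighbor classification) can be illustrated with a minimal Python sketch. It is not the implementation used in this research: the whitespace tokenizer, the common Okapi BM25 variant with parameters `k1=1.5` and `b=0.75`, the majority-vote rule, the function names, and the toy documents and labels are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed minimal pre-processing: lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    # Score every training document against the query with Okapi BM25.
    # k1 and b are assumed defaults, not values taken from the paper.
    n_docs = len(corpus_tokens)
    avgdl = sum(len(doc) for doc in corpus_tokens) / n_docs
    df = Counter()  # document frequency of each term
    for doc in corpus_tokens:
        df.update(set(doc))
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def knn_classify(query, train_docs, train_labels, k=3):
    # Treat the BM25 score as similarity: take the k highest-scoring
    # training documents and return the majority label among them.
    corpus = [tokenize(doc) for doc in train_docs]
    scores = bm25_scores(tokenize(query), corpus)
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data standing in for thesis abstracts and research focuses.
train_docs = [
    "neural network image classification",
    "deep learning neural network training",
    "database query index optimization",
    "sql database transaction index",
]
train_labels = ["AI", "AI", "DB", "DB"]

predicted = knn_classify("neural network for image recognition",
                         train_docs, train_labels, k=3)
print(predicted)
```

In this sketch the number of neighbors `k` plays the same role as the k tuned in the experiments: with a larger `k`, lower-scoring and possibly unrelated documents get a vote, which is one plausible reason a large `k` can lower the f-measure.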
Copyright © 2019