The high utilization rate of medicinal plants has led to a growing number of studies on them. Such studies require documentation containing information about medicinal plants, but this documentation is large and scattered, which makes searching for information difficult. To overcome this problem, a system that classifies documents automatically is needed to make information retrieval more effective and efficient. K-Nearest Neighbor (k-NN) is an algorithm often used for text classification, but its accuracy suffers because it applies the same fixed k value to every category. The k value is the number of training documents closest to the test document. This study uses the Improved k-Nearest Neighbor algorithm to overcome this weakness by applying a different k value to each category, based on the amount of training data in that category. The average accuracy in the k-value testing is 70.99%. Testing with varying amounts of training data shows that the more training data is used, the higher the average accuracy. Testing with unbalanced data shows that training data balanced across categories yields 1.9% higher accuracy than unbalanced training data.
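
The idea of a per-category k can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the code assumes one common formulation: each category's k is the base k scaled by that category's share of the largest category, rounded up. The cosine similarity over raw term counts, the scoring rule, the function names (`category_k`, `classify`), and the toy documents in the usage note are all hypothetical choices made for illustration, not the study's implementation or data.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def category_k(k, train_labels):
    """Assumed rule: n_c = ceil(k * N_c / max_j N_j), with a minimum of 1."""
    counts = Counter(train_labels)
    largest = max(counts.values())
    return {c: max(1, math.ceil(k * n / largest)) for c, n in counts.items()}


def classify(query, train_docs, train_labels, k):
    """Score each category over its own top-n neighbours and return the best."""
    q = Counter(query.lower().split())
    # Rank all training documents by similarity to the query, most similar first.
    ranked = sorted(
        ((cosine(q, Counter(doc.lower().split())), label)
         for doc, label in zip(train_docs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    scores = {}
    for cat, n in category_k(k, train_labels).items():
        top = ranked[:n]  # this category's own neighbourhood size
        total = sum(sim for sim, _ in top)
        hit = sum(sim for sim, label in top if label == cat)
        # Share of the neighbourhood's similarity mass that belongs to cat.
        scores[cat] = hit / total if total else 0.0
    return max(scores, key=scores.get)
```

With a toy training set of three "digestive" documents and one "skin" document and a base k of 3, `category_k` assigns the larger category a neighbourhood of 3 and the smaller one a neighbourhood of 1, so the small category is judged only on its closest neighbour instead of being outvoted by the larger category.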
                                Copyrights © 2018