The number of documents available in digital form is increasing. Meanwhile, one document and another document may be related to each other, but they must not be plagiarized without including the reference source. For this reason, a mechanism for detecting similarities is needed. This research only discusses similarity in documents. In this research, the technique used to solve the above problem is to use text mining techniques to categorize the documents searched according to keywords. Meanwhile, to search for documents according to keywords, the indexing process is used to display documents that are searched for according to keywords. Semantics is a technique used by search engines to match key words on one page with another page. This method has been used very often before, because it is very precise and easy.
Copyrights © 2023