The number of documents available in digital form continues to increase. Documents may legitimately be related to one another, but the content of one must not be reused in another without citing the source. A mechanism for detecting similarity between documents is therefore needed. This research is limited to similarity between documents. The approach applies text mining techniques to categorize the retrieved documents according to keywords, and uses an indexing process to find and display the documents that match those keywords. Semantic matching is a technique used by search engines to match keywords on one page with those on another; it has been widely used because it is precise and easy to apply. In this study, the weight values (W) of documents D1 and D2 are the same. When documents cannot be ranked directly by weight because their W values are equal, an additional calculation using the vector space model algorithm is needed. The idea of this method is to calculate the cosine of the angle between two vectors, namely the W vector of each document and the W vector of the keywords. The results show that document 3 (D3) has the highest level of similarity to the keywords, followed by D2 and D1.
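The cosine calculation described above can be illustrated with a minimal sketch. It assumes that each document and the keyword set are already represented as term-weight vectors of equal length; the function names (`cosine_similarity`, `rank_documents`), the document labels, and all weight values are hypothetical and chosen only for illustration, not taken from this research. The three example documents deliberately have the same total weight, so a plain sum cannot rank them, while the cosine value can.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length weight vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A vector with no weight shares no direction with anything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(keyword_weights, doc_weights):
    """Sort documents by cosine similarity to the keywords, highest first."""
    scores = {
        name: cosine_similarity(keyword_weights, weights)
        for name, weights in doc_weights.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical term weights over three index terms (illustrative only).
keyword_weights = [1.0, 1.0, 0.0]
doc_weights = {
    "doc_a": [3.0, 0.0, 1.0],  # total weight 4
    "doc_b": [2.0, 2.0, 0.0],  # total weight 4
    "doc_c": [1.0, 0.0, 3.0],  # total weight 4
}

ranking = rank_documents(keyword_weights, doc_weights)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

Because the cosine depends on the direction of the weight vectors rather than on their total size, documents with identical summed weights can still be ordered by how closely their term distribution follows that of the keywords.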