Pascal: Journal of Computer Science and Informatics
Vol. 1 No. 02 (2024): Pascal: Journal of Computer Science and Informatics

Analysis and Implementation of Similarity Measurement in Documents Using Semantic Methods

Prayogi, Satria Yudha
Sinaga, Sony Bahagia



Article Info

Publish Date
05 Jul 2024

Abstract

The number of documents available in digital form keeps increasing. Documents may be related to one another, but one must not be copied from another without citing the source, so a mechanism for detecting similarity is needed. This research is limited to similarity between documents. It uses text mining to categorize the documents retrieved for a set of keywords, and an indexing process to display the documents that match those keywords. The semantic method is a technique used by search engines to match keywords on one page against another page; it has been widely used because it is precise and easy to apply. In this study, the weight values (W) of documents D1 and D2 are the same. When documents cannot be ranked by weight because their W values are equal, a further calculation using the vector-space model is needed. The idea of this method is to calculate the cosine of the angle between two vectors, namely the W of each document and the W of the keywords. The results show that document 3 (D3) has the highest similarity to the keywords, followed by D2 and D1.
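The cosine step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `cosine_similarity` and `rank_documents`, the labels `doc_a`, `doc_b`, `doc_c`, and all weight values are assumptions invented for the example and are not taken from the paper. The three hypothetical documents are chosen so that their summed weights tie (each sums to 4), which is the situation where ranking by W alone fails and the angle between vectors is used instead.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention used in this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query_weights, doc_weights):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = {
        name: cosine_similarity(query_weights, weights)
        for name, weights in doc_weights.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical term weights over a three-term vocabulary (illustrative only).
query = [1.0, 1.0, 0.0]
docs = {
    "doc_a": [2.0, 0.0, 2.0],  # summed weight 4
    "doc_b": [1.0, 1.0, 2.0],  # summed weight 4
    "doc_c": [2.0, 2.0, 0.0],  # summed weight 4
}

ranking = rank_documents(query, docs)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

Although the summed weights are identical, the cosine scores differ because they depend on how each document's weight is distributed across the keyword terms, not on the total.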

Copyrights © 2024






Journal Info

Abbrev

komputer

Publisher

Subject

Decision Sciences, Operations Research & Management; Education; Electrical & Electronics Engineering; Library & Information Science; Social Sciences

Description

Pascal: Journal of Computer Science and Informatics is a national scientific journal that publishes research articles in the field of Computer Science and Informatics which include: Computer Engineering, Information Engineering, Computer Science, Information Systems, Information Technology, Software ...