PIKSEL : Penelitian Ilmu Komputer Sistem Embedded and Logic
Vol. 12 No. 2 (2024): September 2024

Comparison of Latent Semantic Analysis (LSA) and Doc2Vec Algorithms of Thesis Similarity Detection

Arifin, Rita Wahyuni
Putra, Mardi Yudhi
Putri, Dwi Ismiyana



Article Info

Publish Date
30 Sep 2024

Abstract

This study develops a system for detecting similarity in thesis titles and content, with the aim of preventing plagiarism and supporting student originality. High similarity among final projects is a significant concern in academic environments. Two text vectorization methods, Latent Semantic Analysis (LSA) and Doc2Vec, were compared as the basis for measuring document similarity. LSA yielded a very high cosine similarity (99.94%), which the authors attribute to dimensionality reduction that preserved semantic correlations. In contrast, Doc2Vec produced lower similarity scores, 7.17% for PV-DM and 39.07% for PV-DBOW, which the authors interpret as indicating richer text representations. The study followed the CRISP-DM model, covering the Business Understanding, Data Understanding, Data Preparation, Modelling, and Evaluation phases. The resulting model is expected to strengthen academic integrity and encourage valuable scientific contributions.
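The percentages reported in the abstract are cosine similarities between document vectors. The sketch below illustrates only that metric, not the study's pipeline: it uses raw term counts as a stand-in for the LSA or Doc2Vec vectors, and the helper names (`term_counts`, `cosine_similarity`) and the three sample titles are hypothetical, invented for illustration.

```python
import math
from collections import Counter


def term_counts(text):
    """Lowercase, split on whitespace, and count terms.

    Assumption: a plain bag-of-words stand-in, not the LSA or Doc2Vec
    vectorization compared in the study.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as term -> weight."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical thesis titles, invented for illustration only.
title_a = "thesis similarity detection using latent semantic analysis"
title_b = "detection of thesis similarity using doc2vec"
title_c = "water quality monitoring with embedded sensors"

sim_ab = cosine_similarity(term_counts(title_a), term_counts(title_b))
sim_ac = cosine_similarity(term_counts(title_a), term_counts(title_c))
print(f"a vs b: {sim_ab:.2%}")
print(f"a vs c: {sim_ac:.2%}")
```

Swapping `term_counts` for a learned embedding would leave the comparison step unchanged, so under this reading the difference between the reported LSA and Doc2Vec scores would come from the vectors rather than from the metric.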

Copyrights © 2024






Journal Info

Abbrev

piksel

Publisher

Universitas Islam 45 Bekasi

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management

Description

Jurnal PIKSEL is published by Universitas Islam 45 Bekasi as a venue for research results in the field of computing and informatics. The journal was first published in 2013 and appears twice a year, in January and September. Starting in 2014, Jurnal PIKSEL underwent ...