Garuda - Garba Rujukan Digital

JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Vol. 9 No. 6 (2025): December 2025

Akbar, Muhammad Ilham (Unknown)
Ningrum, Novita Kurnia (Unknown)

Publish Date
05 Dec 2025

Source code plagiarism identification requires a system capable of detecting semantic similarity rather than mere textual resemblance. This study used a dataset of 1,000 source code files, which after cleaning yielded 996 individual code samples collected from GitHub repositories. The dataset covered several programming languages (Python, Java, JavaScript, TypeScript, C++) and was divided into 697 training, 149 validation, and 149 testing samples. The model employed was CodeBERT, configured with a hidden size of 768, 12 layers, and 12 attention heads. CodeBERT generated a vector embedding for each code sample; a Siamese Network then projected these embeddings, and cosine similarity was computed between code pairs. Testing used a threshold of 0.80 to classify plagiarism. The identification results achieved an accuracy of 96.4%, precision of 95.2%, recall of 97.8%, F1-score of 96.4%, and an error rate of 4.6%. The system produced similarity scores and status labels of “plagiarism detected” or “not detected,” demonstrating the effectiveness of the CodeBERT-based approach for adaptive and intelligent code similarity identification.
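
The decision step described in the abstract (cosine similarity between two embeddings, compared against a threshold of 0.80) and the reported evaluation metrics can be illustrated with a minimal sketch. This is not the authors' implementation: the short toy vectors stand in for the 768-dimensional CodeBERT embeddings after the Siamese projection, the function names `cosine_similarity`, `classify_pair`, and `binary_metrics` are hypothetical, and treating a score equal to the threshold as plagiarism is an assumption, since the abstract does not say how ties are handled.

```python
import math
from typing import Dict, Sequence, Tuple

THRESHOLD = 0.80  # decision threshold reported in the abstract


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero embedding counts as dissimilar
    return dot / (norm_a * norm_b)


def classify_pair(
    a: Sequence[float], b: Sequence[float], threshold: float = THRESHOLD
) -> Tuple[float, str]:
    """Return the similarity score and the status label for one code pair."""
    score = cosine_similarity(a, b)
    # Assumption: scores at or above the threshold are flagged.
    status = "plagiarism detected" if score >= threshold else "not detected"
    return score, status


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for 0/1 labels (1 = plagiarism)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for projected embeddings.
    original = [0.9, 0.1, 0.3]
    renamed_copy = [0.85, 0.15, 0.35]
    unrelated = [-0.2, 0.9, -0.4]
    print(classify_pair(original, renamed_copy))
    print(classify_pair(original, unrelated))
```

In a full pipeline, the input vectors would come from the encoder and projection head rather than being written by hand, and `binary_metrics` would be applied to the predictions over the held-out test pairs.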

Journal Info

JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Abbrev

JAIC

Publisher

Politeknik Negeri Batam

Subject

Computer Science & IT

Description

Journal of Applied Informatics and Computing (JAIC) Volume 2, Number 1, July 2018. Contains articles drawn from research in the fields of Informatics Technology and Applied Computing, with e-ISSN: 2548-9828. There are 3 articles that have been substantively reviewed by the editorial team and ...

Identification of Source Code Plagiarism Using a Natural Language Processing (NLP) Approach Based on Code Writing Style Analysis
