Journal : JOIV : International Journal on Informatics Visualization

Performance Improvement of Cosine Similarity Algorithm with Bidirectional Encoder Representations from Transformers on Abstract Document Similarity Detection
Pradana, Musthofa Galih; Irzavika, Nindy; Maulana, Nurhuda; Mu, Jesselyn; Wari, Valtrizt Khalifah
JOIV : International Journal on Informatics Visualization Vol 9, No 2 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.2.2853

Abstract

In thesis courses and final projects, students are required to conduct research in their field of study, find innovations, solve problems, and foster a critical culture and mindset. A frequent problem, however, is plagiarism: taking someone else's work, including their opinions, and presenting it as one's own. One way to apply technology to this problem is early detection of similarity between documents written by students. Here, the documents examined are the abstracts that students must submit when proposing a thesis title. The algorithm used is cosine similarity, chosen for its computational efficiency, ease of interpretation, and compatibility with large-scale data. The research compared two schemes: cosine similarity with Bidirectional Encoder Representations from Transformers (BERT) and cosine similarity without BERT. The corpus consisted of 1,450 student thesis abstracts, and 10 documents were used as test data to evaluate how well cosine similarity detects similarity between abstracts. The results showed that the BERT-based scheme performed better, with an average performance improvement of 23.48%.
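
As a rough illustration of the cosine similarity measure named in the abstract above, the sketch below compares two texts using plain word-count vectors. This is an assumption made for illustration only: the abstract does not describe the authors' preprocessing or vectorization, and in the BERT scheme the count vectors would be replaced by embeddings from a BERT model. The function names `cosine_similarity`, `term_frequency_vectors`, and `abstract_similarity` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # An empty text has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def term_frequency_vectors(text_a, text_b):
    """Build word-count vectors for both texts over their shared vocabulary.

    Assumption: lowercasing and whitespace splitting stand in for whatever
    preprocessing a real system would use.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    vocabulary = sorted(set(counts_a) | set(counts_b))
    vector_a = [counts_a[word] for word in vocabulary]
    vector_b = [counts_b[word] for word in vocabulary]
    return vector_a, vector_b


def abstract_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    vector_a, vector_b = term_frequency_vectors(text_a, text_b)
    return cosine_similarity(vector_a, vector_b)
```

Identical texts score 1.0 and texts with no words in common score 0.0. Word-count vectors only reward exact word overlap, which is one plausible reason an embedding-based representation could detect paraphrased similarity that this baseline misses.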

Levenshtein Distance Algorithm in Javanese Character Translation Machine Based on Optical Character Recognition
Pradana, Musthofa Galih; Seta, Henki Bayu; Irzavika, Nindy; Saputro, Pujo Hari; Rusiyono, Ruwet
JOIV : International Journal on Informatics Visualization Vol 9, No 4 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.4.3151

Abstract

Indonesia has diverse arts, cultures, and languages. Its many local languages contribute to that diversity, and Javanese is the regional language with the highest number of entries in the Kamus Besar Bahasa Indonesia. The Javanese script, one of the cultural symbols of Java, differs significantly from the Latin script commonly used in daily communication. For cultural preservation, which is also one of the ministry's strategic steps, a process is needed to convert Javanese script to Latin script and then to the Indonesian language as a form of active participation in culture, with technology helping to promote and introduce Indonesian culture. This study develops an algorithm-based approach to capture image data and improve translation accuracy. Transliteration is further supported by optical character recognition, which converts character images. The study applies a convolutional neural network (CNN) for character image recognition and the Levenshtein distance algorithm to translate the Latin-script text into Indonesian. The CNN achieved its best image detection accuracy, 95%, at the 21st epoch. The translation process yielded 90% word-level accuracy and 70% sentence-level accuracy. These results indicate that sentence translation remains suboptimal because of insufficient training data and similarities between scripts, highlighting the need for further improvement through transformer models or data augmentation.
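
For readers unfamiliar with the Levenshtein distance mentioned in the abstract above, the sketch below shows the standard dynamic-programming computation and one way it could be used to map a transliterated word to its nearest dictionary entry. The matching step is an assumption made for illustration: the abstract does not describe how the authors apply the distance, and the `closest_word` helper and the three-word vocabulary are hypothetical.

```python
def levenshtein(source, target):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn source into target."""
    # previous[j] holds the distance between the processed prefix of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current.append(
                min(
                    previous[j] + 1,  # delete source_char
                    current[j - 1] + 1,  # insert target_char
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def closest_word(word, vocabulary):
    """Return the vocabulary entry with the smallest edit distance to word.

    Hypothetical helper: ties are resolved in favor of the earliest entry.
    """
    return min(vocabulary, key=lambda candidate: levenshtein(word, candidate))


# Illustrative vocabulary only; a real system would use a full dictionary.
example_vocabulary = ["buku", "batu", "kuda"]
best_match = closest_word("bukuu", example_vocabulary)
```

Matching word by word in this way does not take sentence context into account, so a nearest-word lookup of this kind would be expected to work better on isolated words than on whole sentences.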