Garuda - Garba Rujukan Digital

p-Index (2021 - 2026): 0.92

This author has published in these journals:

Seminar Nasional Aplikasi Teknologi Informasi (SNATI)
Jurnal Edukasi dan Penelitian Informatika (JEPIN)
Jurnal Linguistik Komputasional
Jurnal Nasional Teknik Elektro dan Teknologi Informasi
Journal of Computing and Informatics Research

Herry Sujaini

Sekolah Teknik Elektro dan Informatika, Institut Teknologi Bandung

Author-ID : 280870

Computer Science & IT; Control & Systems Engineering; Education; Electrical & Electronics Engineering; Energy; Engineering

Published : 11 Documents

Articles


Analisis Perbandingan Kemiripan Teks Bahasa Daerah di Indonesia Menggunakan Algoritma Naive Bayes dan K-Nearest Neighbor
Authors : Alfarizi; Herry Sujaini; Niken Candraningrum
Journal of Computing and Informatics Research Vol 5 No 1 (2025): November 2025
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/comforch.v5i1.2345

Indonesia, an archipelagic country, has 718 regional languages, many of which face declining usage and even extinction. This study analyzes the patterns and characteristics of regional languages through n-gram analysis using the Naive Bayes and k-nearest neighbor (k-NN) algorithms. The aim is to measure the similarity of regional languages, specifically Central Javanese, Sundanese, and Pontianak Malay, as part of an effort to support the preservation of regional languages in Indonesia. Similarity between languages was calculated from the misclassification errors in the confusion matrix, and algorithm performance was evaluated using accuracy and F1-score. Naive Bayes with combined unigram and bigram features performed best, with an accuracy and F1-score of 0.921. The highest similarity was found for the Javanese-Malay pair, at only 3.82%, and the lowest for the Malay-Sundanese pair, at 1.66%. These similarity values reflect the dominant characters in each language, such as ‘e’ in Malay and ‘a’ and ‘u’ in Sundanese. The results indicate that there is little similarity between Javanese, Sundanese, and Malay.
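
The abstract says that similarity was derived from confusion-matrix errors over n-gram features, but it does not give the formula. The Python sketch below is one possible reading, not the authors' method. It assumes character-level n-grams, and it assumes that the similarity of a language pair is the share of samples from either language that a classifier mistook for the other. The function names (`char_ngrams`, `confusion_matrix`, `pairwise_similarity`) are hypothetical and use only the standard library.

```python
from collections import Counter
from itertools import combinations


def char_ngrams(text, n_values=(1, 2)):
    """Count character n-grams; unigrams plus bigrams by default.

    Assumption: the study's n-grams are character-level. The abstract
    does not state this explicitly.
    """
    text = text.lower()
    counts = Counter()
    for n in n_values:
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def confusion_matrix(true_labels, predicted_labels, classes):
    """Build a matrix where rows are true classes and columns are predictions."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def pairwise_similarity(matrix, classes):
    """Assumed definition: mutual misclassifications between two languages,
    divided by the number of samples whose true label is either language."""
    result = {}
    for i, j in combinations(range(len(classes)), 2):
        errors = matrix[i][j] + matrix[j][i]
        total = sum(matrix[i]) + sum(matrix[j])
        result[(classes[i], classes[j])] = errors / total if total else 0.0
    return result
```

Given true and predicted labels from any classifier, such as Naive Bayes or k-NN trained on the n-gram counts, `pairwise_similarity(confusion_matrix(y_true, y_pred, classes), classes)` returns one score per language pair. Under this reading, a low score means the classifier rarely confuses the two languages.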

Co-Authors Alfarizi Arif Bijaksana PN. Arif Bijaksana Putra Bomo Wibowo Sanjaya Dedy Suryadi Eva Faja Ripanti Fitri Imansyah Hafidz Ardhi Hafiz Muhardi Heri Priyanto Jefri Hasiholan Simanjuntak M. Iqbal Arsyad Muhammad Gerdy Asparilla Muhammad Rezy Anshari Niken Candraningrum Redi Ratiandi Yacoub Redy Ratiandi Yacoub Rivan Achmad Nugroho Rudy Dwi Nyoto Syahfira Mulya Tursina Tursina
