Contact Name
Mesran
Contact Email
mesran.skom.mkom@gmail.com
Phone
+6282161108110
Journal Mail Official
mib.stmikbd@gmail.com
Editorial Address
Jalan Sisingamangaraja No. 338, Medan, Indonesia
Location
Kota Medan,
Sumatera Utara
INDONESIA
JURNAL MEDIA INFORMATIKA BUDIDARMA
ISSN: 2614-5278     EISSN: 2548-8368     DOI: http://dx.doi.org/10.30865/mib.v3i1.1060
Decision Support System, Expert System, Informatics Technique, Information System, Cryptography, Networking, Security, Computer Science, Image Processing, Artificial Intelligence, Steganography, etc. (related to informatics and computer science)
Articles 1,182 Documents
Analisis Perbandingan Metode Similarity untuk Kemiripan Dokumen Bahasa Indonesia pada Deteksi Kemiripan Teks Bahasa Indonesia
Pawestri, Sheraton; Suyanto, Yohanes
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 8, No 3 (2024): Juli 2024
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v8i3.7648

Abstract

Easy access to information brings diverse benefits, including the ability to develop models that detect similarities between documents, plagiarism-checking systems, automatic summarization, classification, and so on. These benefits make research on similarity detection between documents an important area to develop. However, studies of similarity detection specifically for Indonesian-language documents remain relatively few, and their performance can still be improved. This research therefore conducts a comparative analysis of the performance of Doc2Vec against the Jaccard Coefficient, Cosine Similarity, and Euclidean Distance in detecting the similarity of Indonesian-language documents. Three datasets are used: the first consists of 200 news articles from Google News, the second comes from IndoNLU, and the third from TaPaCo. The findings show that, on average, Cosine Similarity performs better than the Jaccard Coefficient and Euclidean Distance. The best performance, obtained with the Cosine Similarity algorithm on the Google News dataset, was an accuracy of 0.98, precision of 0.84, recall of 0.95, and F1 score of 0.89, with the model built in 10.56 seconds. This is because Doc2Vec is better suited to higher-dimensional datasets than to datasets that contain only a few words.
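As a rough illustration of the three classical measures named in this abstract, the sketch below computes the Jaccard Coefficient, Cosine Similarity, and Euclidean Distance for two short Indonesian sentences. It is not the authors' implementation: the whitespace tokenizer, the raw term-count vectors, the function names, and the two example sentences are all assumptions made for the example, and Doc2Vec is not covered.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def jaccard_coefficient(a, b):
    """Size of the token-set intersection divided by the size of the union."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def _count_vectors(a, b):
    """Term-count vectors for both texts over their combined vocabulary."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[t] for t in vocab], [counts_b[t] for t in vocab]


def cosine_similarity(a, b):
    """Dot product of the count vectors divided by the product of their norms."""
    vec_a, vec_b = _count_vectors(a, b)
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm = math.sqrt(sum(x * x for x in vec_a)) * math.sqrt(sum(y * y for y in vec_b))
    return dot / norm if norm else 0.0


def euclidean_distance(a, b):
    """Straight-line distance between the count vectors (0.0 means identical)."""
    vec_a, vec_b = _count_vectors(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(vec_a, vec_b)))


# Hypothetical example sentences, not taken from any of the study's datasets.
doc_1 = "saya suka makan nasi goreng"
doc_2 = "saya suka makan mie goreng"

print("Jaccard:  ", jaccard_coefficient(doc_1, doc_2))
print("Cosine:   ", cosine_similarity(doc_1, doc_2))
print("Euclidean:", euclidean_distance(doc_1, doc_2))
```

Jaccard and cosine are similarities (higher means more alike), whereas Euclidean is a distance (lower means more alike), so a comparison such as the one described in the abstract would need a threshold chosen separately for each measure.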
Optimasi Random Forest dengan Genetic Algorithm dan Recursive Feature Elimination pada High Dimensional Data Stunting Samarinda
Satria, Bima; Siswa, Taghfirul Azhima Yoga; Pranoto, Wawan Joko
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 8, No 3 (2024): Juli 2024
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v8i3.7883

Abstract

Stunting is a chronic malnutrition problem that disrupts children's growth, with long-term impacts on physical growth, cognitive development, and productivity in adulthood. In Indonesia, the prevalence of stunting is still above the WHO threshold, reaching 24.4% according to the 2021 Indonesian Nutritional Status Study (SSGI); in Samarinda City, the prevalence reached 24.7% in 2021, with 1,402 toddlers identified as stunted. Addressing this problem requires a more structured, data-driven approach to provide targeted interventions. This study uses data from the Samarinda City Health Office, encompassing 150,474 stunting data points, and involves data collection, data cleaning, feature selection, and application of a classification model. It aims to improve the accuracy of stunting data classification in Samarinda City in 2023 using the Random Forest algorithm, enhanced with Recursive Feature Elimination (RFE) for feature selection and Genetic Algorithm (GA) optimization. The feature selection results using RFE show that the most influential features are Weight, ZS TB/U, ZS BB/U, and BB/U. Applying RFE increased the model's average accuracy from 91.91% to 93.64%, while GA optimization further increased the average accuracy to 98.39%. The definite accuracy increased from 94.23% (baseline model) to 97.10% (with RFE) and reached 99.70% (with RFE and GA). The combination of RFE and GA proved effective in tackling data complexity and improving the reliability of stunting predictions. This study contributes to the development of machine learning techniques for high-dimensional data analysis in health and is expected to serve as a foundation for more effective intervention programs addressing stunting in Indonesia.
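The elimination loop behind RFE can be sketched in a few lines. This is only an illustration of the general idea, not the study's pipeline: the ranking function below uses absolute Pearson correlation as a stand-in for the Random Forest feature importances that would normally be re-fit in each round, the GA step is omitted, and the column names and values are invented toy data.

```python
def pearson_abs(xs, ys):
    """Absolute Pearson correlation; 0.0 when either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def correlation_ranker(columns, labels):
    """Stand-in ranker (assumption): score each feature by |correlation| with the label."""
    return {name: pearson_abs(values, labels) for name, values in columns.items()}


def recursive_feature_elimination(columns, labels, n_keep, rank_fn=correlation_ranker):
    """Re-rank the surviving features each round and drop the weakest one."""
    remaining = list(columns)
    eliminated = []
    while len(remaining) > n_keep:
        subset = {name: columns[name] for name in remaining}
        scores = rank_fn(subset, labels)  # a real model would be re-fit here
        weakest = min(remaining, key=scores.__getitem__)
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated


# Toy data with hypothetical feature names; not the Samarinda dataset.
labels = [0, 0, 1, 1, 0, 1]
columns = {
    "strong": [1, 2, 8, 9, 1, 7],
    "medium": [3, 1, 4, 6, 5, 2],
    "constant": [5, 5, 5, 5, 5, 5],
}

kept, eliminated = recursive_feature_elimination(columns, labels, n_keep=1)
print("kept:", kept)
print("eliminated (in order):", eliminated)
```

Passing a different `rank_fn`, for example one that trains a classifier on the surviving columns and returns its importances, turns the same loop into model-based RFE.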