Journal : REKAYASA

Perbandingan Metode Term Weighting terhadap Hasil Klasifikasi Teks pada Dataset Terjemahan Kitab Hadis
Ni'mah, Ana Tsalitsatun; Arifin, Agus Zainal
Rekayasa Vol 13, No 2: August 2020
Publisher : Universitas Trunojoyo Madura

DOI: 10.21107/rekayasa.v13i2.6412

Abstract

English title: Comparison of a term weighting method for the text classification in Indonesian hadith

Hadith is the second source of reference in Islam after the Qur'an. Hadith texts are currently studied in the field of technology so that the values they contain can be captured computationally. Retrieving information from the Books of Hadith requires representing the text as vectors in order to optimize automatic classification. Hadith classification is needed to group the contents of the hadith into several categories. Some categories in one Book of Hadith are the same as those in other Books of Hadith, which shows that certain documents in one book share a topic with documents in another. Therefore, a term weighting method is needed that can decide which words should receive high or low weights within the Hadith Book space, in order to optimize classification results across the Books of Hadith. This study compares several term weighting methods, namely: Term Frequency Inverse Document Frequency (TF-IDF), Term Frequency Inverse Document Frequency Inverse Class Frequency (TF-IDF-ICF), Term Frequency Inverse Document Frequency Inverse Class Space Density Frequency (TF-IDF-ICSδF), and Term Frequency Inverse Document Frequency Inverse Class Space Density Frequency Inverse Hadith Space Density Frequency (TF-IDF-ICSδF-IHSδF). The comparison is carried out on a dataset of translations of 9 Books of Hadith, with the weighted terms applied to Naive Bayes and SVM classifiers. The 9 Books of Hadith used are: Sahih Bukhari, Sahih Muslim, Abu Dawud, at-Turmudzi, an-Nasa'i, Ibnu Majah, Ahmad, Malik, and Darimi.
The experimental results show that classification using the TF-IDF-ICSδF-IHSδF term weighting method outperforms the other term weighting methods, achieving a Precision of 90%, Recall of 93%, F1-Score of 92%, and Accuracy of 83%.
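As a rough illustration of how the two simplest schemes named in the abstract differ, the sketch below computes TF-IDF-ICF weights for tokenized documents. The smoothed factors `1 + log(N / df)` and `1 + log(C / cf)` are one commonly used formulation and are an assumption here; the abstract does not give the exact formulas. The function name `tf_idf_icf` and the toy data are hypothetical, and the ICSδF and IHSδF factors are not reproduced.

```python
import math
from collections import Counter, defaultdict


def tf_idf_icf(docs, labels):
    """Weight each term by tf * idf * icf (illustrative formulation, see lead-in).

    docs   -- list of token lists, one per document
    labels -- class label of each document, same order as docs
    """
    n_docs = len(docs)
    n_classes = len(set(labels))

    doc_freq = Counter()             # number of documents containing the term
    term_classes = defaultdict(set)  # classes in which the term occurs
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            doc_freq[term] += 1
            term_classes[term].add(label)

    weights = []
    for tokens in docs:
        doc_weights = {}
        for term, tf in Counter(tokens).items():
            idf = 1.0 + math.log(n_docs / doc_freq[term])
            icf = 1.0 + math.log(n_classes / len(term_classes[term]))
            doc_weights[term] = tf * idf * icf
        weights.append(doc_weights)
    return weights


# Toy example: "prayer" occurs in one class only, "water" occurs in both.
docs = [["prayer", "water"], ["prayer", "fasting"], ["trade", "water"]]
labels = ["worship", "worship", "commerce"]
for doc_weights in tf_idf_icf(docs, labels):
    print(doc_weights)
```

In this toy data the ICF factor leaves "water" at its plain TF-IDF weight, because the term appears in every class, while "prayer" is boosted because it is confined to a single class. That is the intuition behind adding class-level factors on top of TF-IDF.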
The Ngoko Javanese Stemmer uses the Enhanced Confix Stripping Stemmer Method
Shevia Ilfa Melia; Jamiatus Sholihah; Dianatin Nisak; Intan Sukma Juniaristha; Ana Tsalitsatun Ni'mah
Rekayasa Vol 16, No 1: April 2023
Publisher : Universitas Trunojoyo Madura

DOI: 10.21107/rekayasa.v16i1.19308

Abstract

Stemming is vital in text processing. Most existing stemmers target Indonesian and English, because most of the articles handled in text processing are written in those two languages. Indonesia also has several regional languages, which are often used in learning, especially in local-content school subjects. Research on processing Javanese texts is therefore needed to make things easier for education practitioners, especially for Ngoko Javanese. Previous research on Ngoko Javanese stemming used the affix-removal stemmer method (a rule-based approach), which often fails to return the correct root words of Ngoko Javanese, so candidates need to be checked against a Ngoko Javanese dictionary to improve the root words obtained. This study develops a stemmer for Ngoko Javanese using the Enhanced Confix Stripping (ECS) method. The stemmer splits words according to the ECS algorithm and checks the results against a dictionary adapted to Ngoko Javanese. The resulting stemmer reaches 97 percent accuracy in returning root words.
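The idea of combining affix stripping with a dictionary check can be sketched as follows. This is a deliberately simplified illustration, not the ECS rule set used in the study: the affix lists are a small, incomplete subset chosen for the example, and the dictionary `ROOTS` and the function name `stem` are hypothetical.

```python
# Toy root-word dictionary (hypothetical; a real one would be far larger).
ROOTS = {"tuku", "omah", "pangan"}

# Illustrative, incomplete affix lists (assumptions, not the paper's rules).
PREFIXES = ("di", "ka", "sa")
SUFFIXES = ("ake", "ne", "an", "e", "i")


def stem(word, roots=ROOTS):
    """Return a dictionary root reachable by stripping one suffix and/or one prefix.

    If no candidate is found in the dictionary, the word is returned unchanged.
    """
    if word in roots:
        return word

    candidates = [word]
    # Try removing a single suffix, keeping at least two characters.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            candidates.append(word[: -len(suffix)])

    # Try removing a single prefix from the word and from each suffix-stripped form.
    for candidate in list(candidates):
        for prefix in PREFIXES:
            if candidate.startswith(prefix) and len(candidate) - len(prefix) >= 2:
                candidates.append(candidate[len(prefix):])

    # The dictionary check decides which candidate, if any, is accepted.
    for candidate in candidates:
        if candidate in roots:
            return candidate
    return word


for w in ("dituku", "omahe", "tuku"):
    print(w, "->", stem(w))
```

The dictionary lookup is what keeps the stripper from over-stemming: a stripped form is accepted only if it is a known root, which is the role the abstract assigns to the Ngoko Javanese dictionary.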
Convolutional Autoencoder for Reconstruction of Historical Document Images: Ancient Manuscript Babad Lombok
Syuhada, Fahmi; Firdaus, Asno Azzawagama; Ni'mah, Ana Tsalitsatun; Sa’adatai, Yuan; Tajuddin, Muhammad
Rekayasa Vol 17, No 1: April 2024
Publisher : Universitas Trunojoyo Madura

DOI: 10.21107/rekayasa.v17i1.26101

Abstract

The Babad Lombok is an ancient literary manuscript that generally contains stories about the origins of the people of Lombok. It is written on lontar leaf, which in the past was used for writing manuscripts, letters, and documents. At present, the Babad Lombok can be viewed as photos or scans, without having to visit the museum or cultural heritage site where the document is usually exhibited. However, because the document is an ancient artifact that is hundreds of years old, both the original and its scanned versions have naturally faded, which makes the text less clear. This paper proposes to automatically reconstruct (repair) the Babad Lombok document using a neural network, specifically a Convolutional Autoencoder (CAE). The CAE model is built sequentially and trained using original images of the Babad Lombok as input and manually corrected images of the Babad Lombok as the target, or ground truth. Both types of data are cropped iteratively into 64x64 patches across the full extent of each original image. This process yields the input and target data for CAE training, each consisting of 48,288 images. Testing the trained autoencoder shows that the Babad images are successfully repaired, with the text clearer than before reconstruction. The proposed CAE achieved training and validation accuracies of 89.09% and 94.57%, with corresponding loss values of 0.0418 and 0.0226.
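The patch-extraction step described in the abstract can be sketched as a simple sliding window. The abstract states only the 64x64 patch size; the stride, the handling of image borders, and the function name `patch_boxes` are assumptions made for this illustration.

```python
def patch_boxes(height, width, size=64, stride=64):
    """Return (top, left, bottom, right) boxes for size x size patches.

    Patches that would extend past the image border are skipped
    (an assumption; padding the image is another option).
    """
    boxes = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            boxes.append((top, left, top + size, left + size))
    return boxes


# Applying the same boxes to an original image and to its manually corrected
# counterpart yields aligned (input, target) patch pairs for training.
boxes = patch_boxes(height=128, width=192)
print(len(boxes), boxes[0], boxes[-1])
```

Using identical coordinates for both images matters because the autoencoder learns a pixel-wise mapping; a misaligned pair would teach it to shift content rather than restore it. A stride smaller than the patch size produces overlapping patches and therefore more training pairs from the same scans.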