Articles

Penerapan Particle Swarm Optimization Pada Feedforward Neural Network Untuk Klasifikasi Teks Hadis Bukhari Terjemahan Bahasa Indonesia Muhammad Ghufran; Adiwijaya Adiwijaya; Said Al-Faraby
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 2, No 4 (2018): Oktober 2018
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v2i4.951

Abstract

Hadith is the second source of Islamic law after the Qur'an and serves as a guide for Muslim life. Many hadith have been narrated, among them those narrated by Bukhari. This research aims to build a model that can classify Bukhari hadith in Indonesian translation. The topic was chosen to help the public understand the kind of information a hadith contains: a suggestion, a prohibition, or plain information. Backpropagation (BP) is the technique generally used to train a Feedforward Neural Network (FNN) for classification because it gives good accuracy on text classification. However, BP is relatively slow to converge and can get stuck in a local minimum. To overcome this, the Particle Swarm Optimization (PSO) algorithm is used to speed up convergence and search for the global minimum. The purpose of the test is to assess PSO's ability to train the weights and biases of the FNN. Results on 1000 hadith show that the PSO-FNN model reaches 88.5% accuracy with stemming and 88.57% without stemming. In the comparative test, PSO-FNN reaches 88.57% accuracy, 0.93 percentage points lower than BP-FNN at 89.5%.
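As a rough illustration of training FNN weights and biases with PSO instead of backpropagation, the sketch below treats every weight vector as a particle and minimizes mean squared error. The network size, the inertia and acceleration constants (`w`, `c1`, `c2`), the loss function, and the toy data are all assumptions chosen for illustration; none of them are taken from the paper.

```python
import math
import random


def forward(weights, x, n_hidden):
    """One hidden layer (tanh) and one sigmoid output, all packed in a flat list."""
    n_in = len(x)
    idx = 0
    hidden = []
    for _ in range(n_hidden):
        s = weights[idx + n_in]  # bias of this hidden unit
        for i in range(n_in):
            s += weights[idx + i] * x[i]
        hidden.append(math.tanh(s))
        idx += n_in + 1
    out = weights[idx + n_hidden]  # output bias
    for j in range(n_hidden):
        out += weights[idx + j] * hidden[j]
    out = max(-60.0, min(60.0, out))  # clamp to keep exp() from overflowing
    return 1.0 / (1.0 + math.exp(-out))


def mse(weights, data, n_hidden):
    """Mean squared error of the network over (input, target) pairs."""
    return sum((forward(weights, x, n_hidden) - y) ** 2 for x, y in data) / len(data)


def pso_train(data, n_in, n_hidden, n_particles=20, iters=200,
              w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over the flat weight vector (hyperparameters are assumed)."""
    rng = random.Random(seed)
    dim = n_hidden * (n_in + 1) + n_hidden + 1
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [mse(p, data, n_hidden) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_loss[k])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    for _ in range(iters):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward swarm best.
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            loss = mse(pos[k], data, n_hidden)
            if loss < pbest_loss[k]:
                pbest[k], pbest_loss[k] = pos[k][:], loss
                if loss < gbest_loss:
                    gbest, gbest_loss = pos[k][:], loss
    return gbest, gbest_loss


# Toy usage on XOR-style data (hypothetical, not the hadith dataset).
toy = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
best_weights, best_loss = pso_train(toy, n_in=2, n_hidden=2)
```

Because PSO only evaluates the loss and never differentiates it, the same loop would work with a non-differentiable objective such as classification error, at the cost of many more forward passes than gradient descent.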
Pengaruh Text Preprocessing terhadap Analisis Sentimen Komentar Masyarakat pada Media Sosial Twitter (Studi Kasus Pandemi COVID-19) Syifa Khairunnisa; Adiwijaya Adiwijaya; Said Al Faraby
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 5, No 2 (2021): April 2021
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v5i2.2835

Abstract

COVID-19 is a pandemic that is troubling many people, and it has prompted a large volume of public comments on Twitter. These comments are used for sentiment analysis to determine the polarity of the sentiment expressed: positive, negative, or neutral. A problem with Twitter data is that tweets still contain many non-standard words, such as abbreviations caused by the character limit of a single tweet. Preprocessing is the most important initial stage in sentiment analysis of Twitter data because it affects classification performance. This study focuses on preprocessing, running several test scenarios with different combinations of preprocessing techniques to determine which combination yields the best accuracy and how it affects sentiment analysis. Features are extracted using N-grams and weighted using TF-IDF, with Mutual Information as the feature selection method. The classifier is SVM, chosen because it can handle high-dimensional data such as the text data used in this study. The results indicate that the best performance is obtained with two combinations, cleaning plus stemming, and word normalization plus cleaning plus stemming, both with an accuracy of 77.77%. Unigrams give higher accuracy than bigrams. Mutual Information reduces overfitting by removing irrelevant features, so that training and test accuracy remain fairly stable.
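The N-gram extraction and TF-IDF weighting step mentioned above can be sketched as follows. This is a minimal illustration that assumes already-tokenized tweets, relative term frequency, and an unsmoothed `log(N / df)` inverse document frequency; the abstract does not state which TF-IDF variant the authors used, and the function names here are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    grams = [ngrams(d, n) for d in docs]
    df = Counter()
    for g in grams:
        df.update(set(g))  # document frequency: count each term once per document
    n_docs = len(docs)
    vectors = []
    for g in grams:
        tf = Counter(g)
        total = len(g) or 1  # avoid dividing by zero for empty documents
        vectors.append({t: (c / total) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return vectors


# Toy usage with two hypothetical, already-cleaned tweets.
tweets = [["stay", "at", "home"], ["stay", "safe"]]
unigram_weights = tfidf(tweets, n=1)
bigram_weights = tfidf(tweets, n=2)
```

With this unsmoothed variant, a term that occurs in every document gets weight zero, which is one reason library implementations often add smoothing.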
Analisis Sentimen Berbasis Aspek pada Review Female Daily Menggunakan TF-IDF dan Naïve Bayes Clarisa Hasya Yutika; Adiwijaya Adiwijaya; Said Al Faraby
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 5, No 2 (2021): April 2021
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v5i2.2845

Abstract

Product reviews offer considerable benefits to both producers and consumers. Female Daily is a forum that discusses beauty products and receives many reviews every day, so a technique is needed to turn those reviews into valuable information. One such technique is aspect-based sentiment analysis, which analyzes each text to identify its aspects (attributes or components) and then determines the sentiment (positive, negative, or neutral) for each aspect. Some of the collected reviews mix several languages, so they are first translated into a single language, Indonesian. The reviews are then preprocessed to make them easier to process. Words are weighted using TF-IDF, and sentiment is classified with Complement Naïve Bayes to handle the imbalanced data. The best result in testing was an F1-score of 62.81%, obtained for data translated into English and then into Indonesian, without stopword removal.
Pengklasifikasian Topik Hadits Terjemahan Bahasa Indonesia Menggunakan Latent Semantic Indexing dan Support Vector Machine Hafizh Fauzan; Adiwijaya Adiwijaya; Said Al-Faraby
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 2, No 4 (2018): Oktober 2018
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v2i4.948

Abstract

Hadith is a source of Islamic law alongside the Qur'an, Ijma, Ijtihad and Qiyas, and ranks second after the Qur'an. This study attempts to build a system that can classify the sahih hadith of Bukhari in Indonesian translation. The topic was chosen to help Muslims who want to understand whether a hadith conveys information, a prohibition, or a suggestion. Support Vector Machine was chosen because it performs well on datasets with a large number of features. Latent Semantic Indexing was chosen as the feature selection method because it reduces the feature set by eliminating unimportant features (noise terms). The study also uses Bootstrap Aggregating (Bagging) to improve the accuracy of the classification system. With Latent Semantic Indexing and Bootstrap Aggregating, the single-label Support Vector Machine classifier reaches an accuracy of 84% with a polynomial kernel and 84.67% with an RBF kernel.
Klasifikasi Argument Pada Teks dengan Menggunakan Metode Multinomial Logistic Regression Terhadap Kasus Pemindahan Ibu Kota Indonesia di Twitter Mochammad Naufal Rizaldi; Adiwijaya Adiwijaya; Said Al Faraby
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 4, No 4 (2020): Oktober 2020
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v4i4.2348

Abstract

News of moving the Indonesian capital from Jakarta to East Kalimantan has drawn comments for and against from the Indonesian public on Twitter. These comments vary: some are accompanied by arguments, some are not, and some are entirely unrelated to the topic. Because users have limited ability to filter this information, it is difficult for the public, and even the government, to analyze what the tweets contain. A system was therefore built to classify tweets automatically into three classes, namely non-arbitration, argument and unknown. The method used in this research is Multinomial Logistic Regression (MLR), a generalization of Logistic Regression used to classify three or more classes. Before classification, each tweet is preprocessed to remove noise. The features extracted in this study are unigrams, bigrams and trigrams. The study covers 12 test scenarios and one comparison method, an Artificial Neural Network (ANN). Across all test scenarios, the best result for MLR is the SRU scenario with an accuracy of 41.30%, while the best result for ANN is the RU scenario with an accuracy of 45.10%.
KLASIFIKASI AYAT AL-QURAN TERJEMAHAN BAHASA INGGRIS MENGGUNAKAN K-NEAREST NEIGHBOR (KNN) DAN INFORMATION GAIN Timami Hertza Putrisanni; Adiwijaya Adiwijaya; Said Al Faraby
KOMIK (Konferensi Nasional Teknologi Informasi dan Komputer) Vol 3, No 1 (2019): Smart Device, Mobile Computing, and Big Data Analysis
Publisher : STMIK Budi Darma

DOI: 10.30865/komik.v3i1.1614

Abstract

The Al-Quran is a holy book that contains guidance and instructions for the life of Muslims. Some verses are interpreted with reference to the preceding verse and carry an implied meaning, so to retrieve verses both textually and contextually it is necessary to classify them, which helps Muslims find topics in the Al-Quran. This study proposes classifying the topics of Al-Quran verses in English translation into three classes: commands, prohibitions and others. The system is designed by collecting datasets, preprocessing to obtain clean data, selecting features using information gain, classifying with the K-Nearest Neighbor (KNN) method, and testing the system. The tests yielded 64.10% accuracy, 63% precision, and 62.68% recall using k = 17 and a test-to-training data ratio of 1:9.
Keywords: classification, topics of Al-Quran, K-Nearest Neighbor, Information gain.
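Feature selection by information gain, as named in the pipeline above, can be illustrated with the short sketch below. It assumes each verse is reduced to a set of tokens and scores a term by how much knowing its presence or absence reduces class entropy; the helper names and toy data are hypothetical, and the paper's exact formulation may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """docs: list of token sets; labels: parallel list of class labels."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Expected entropy after splitting the documents on presence of the term.
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def top_terms(docs, labels, k):
    """Rank every term in the vocabulary by information gain and keep the top k."""
    vocab = set().union(*docs)
    return sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                  reverse=True)[:k]


# Toy usage with hypothetical token sets and labels.
verses = [{"do", "pray"}, {"do", "give"}, {"not", "lie"}, {"not", "steal"}]
classes = ["command", "command", "prohibition", "prohibition"]
selected = top_terms(verses, classes, k=2)
```

Only the selected terms would then be kept as dimensions of the vectors passed to the KNN classifier.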
Klasifikasi Topik Multi Label pada Hadis Shahih Bukhari Menggunakan K-Nearest Neighbor dan Latent Semantic Analysis Dian Chusnul Hidayati; Said Al Faraby; Adiwijaya Adiwijaya
JURIKOM (Jurnal Riset Komputer) Vol 7, No 1 (2020): Februari 2020
Publisher : STMIK Budi Darma

DOI: 10.30865/jurikom.v7i1.2013

Abstract

Hadith is the second source of Islamic law after the Al-Quran, which makes it important to study. However, learning hadith involves some difficulties, such as determining whether a hadith belongs to the topic of suggestions, prohibitions, or information. This hinders the learning process, especially for Muslims. It is therefore necessary to classify hadiths into the topics of suggestions, prohibitions, information, and combinations of the three, which is known as multi-label classification. The classification can be done with K-Nearest Neighbor (KNN), one of the best methods in the Vector Space Model and a simple yet quite effective one. However, KNN handles high-dimensional vectors poorly, which leads to long computation times. For that reason, this study classifies Sahih Bukhari hadiths into the topics of suggestions, prohibitions, and information using the Latent Semantic Analysis - K-Nearest Neighbor (LSA-KNN) method. The Binary Relevance method is also employed to process the multi-label data. The results show that LSA-KNN achieves a performance of 90.28% with a computation time of 19 minutes 21 seconds, while KNN achieves 90.23% with a computation time of 37 minutes 06 seconds, which means that LSA-KNN performs better than KNN.
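Binary Relevance turns a multi-label problem into one independent yes/no problem per label. The sketch below pairs it with a plain majority-vote KNN on Euclidean distance; the distance measure, the value of `k`, the function names, and the toy vectors are assumptions for illustration, and the LSA dimensionality reduction step is omitted.

```python
import math


def knn_binary(train, query, k):
    """train: list of (vector, bool). True if most of the k nearest are positive."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    positives = sum(1 for _, flag in nearest if flag)
    return positives * 2 > len(nearest)


def binary_relevance_predict(train, query, labels, k=3):
    """train: list of (vector, set_of_labels). Returns the predicted label set."""
    predicted = set()
    for label in labels:
        # Re-label the training data as "has this label" vs. "does not".
        binary_train = [(vec, label in tags) for vec, tags in train]
        if knn_binary(binary_train, query, k):
            predicted.add(label)
    return predicted


# Toy usage with hypothetical 2-D document vectors.
topics = ["suggestion", "prohibition", "information"]
examples = [
    ([0.0, 0.0], {"suggestion"}),
    ([0.0, 1.0], {"suggestion", "information"}),
    ([1.0, 0.0], {"suggestion"}),
    ([5.0, 5.0], {"prohibition"}),
    ([5.0, 6.0], {"prohibition", "information"}),
    ([6.0, 5.0], {"prohibition", "information"}),
]
result = binary_relevance_predict(examples, [5.5, 5.5], topics)
```

Since each label is decided separately, a hadith can receive one, several, or no topics, which is what allows the combined-topic cases described above.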
Pengaruh Distribusi Panjang Data Teks pada Klasifikasi: Sebuah Studi Awal Said Al Faraby; Ade Romadhony
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 6, No 3 (2022): Juli 2022
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v6i3.4259

Abstract

In text classification, the data used to train a model and the data the model is applied to can come from different text domains (cross-domain). They can also differ in language (cross-lingual). Many previous studies have looked for ways to apply classification models effectively and efficiently in these cross-domain and cross-lingual situations. However, one difference has received little attention because it is considered not very influential, namely the difference in text length (cross-length). In this study, we further investigated the cross-length condition by creating a special dataset and testing it with various commonly used classification models. The results showed that a difference in the distribution of text length between the training data and the test data can affect performance. Cross-length transfer from long to short texts shows an average decrease in F1-score across all models of 14%, while transfer from short to long texts gives an average decrease of 9%.
Klasifikasi Sentiment Analysis Pada Review Film Berbahasa Inggris Dengan Menggunakan Metode Doc2vec Dan Support Vector Machine (svm) Winda Christina Widyaningtyas; Adiwijaya Adiwijaya; Said Al Faraby
eProceedings of Engineering Vol 5, No 1 (2018): April 2018
Publisher : eProceedings of Engineering


Abstract

Sentiment is a person's judgment, expressed as an opinion or comment, on a particular topic or product. Sentiment analysis determines whether opinions and comments on a given issue or topic lean positive or negative. This final-project research describes sentiment classification of movie review documents to make it easier for others to gauge the quality of a film. With advances in technology, a great deal of information is available on the internet, including movie reviews. A movie review contains other people's opinions about a film. If this information is processed well, it yields information about the film's quality. The method used in this study is Doc2Vec, which converts the data into vectors. Doc2Vec was chosen because it helps the computer identify the word combinations to be classified. The classification method used is Support Vector Machine (SVM), because SVM can classify high-dimensional data. Classification is carried out by training on the designated data to produce a model, which is then tested on the test data. In the test scenarios, Doc2Vec and SVM applied to movie reviews obtained an F1-Measure of 54.1872%.
Klasifikasi Dokumen Menggunakan Metode Knn Dengan Information Gain Pratama Dwi Nugraha; Said Al Faraby; Adiwijaya Adiwijaya
eProceedings of Engineering Vol 5, No 1 (2018): April 2018
Publisher : eProceedings of Engineering


Abstract

Today, information is very important to everyone, and the need for it keeps growing as technology becomes more sophisticated. Demand is rising for both general and specialized information. However, the information obtained sometimes does not match what was wanted, which creates a problem when searching for the data needed. A way to obtain valid data is therefore required. Document classification can help in finding valid data or documents that match what we need; its purpose is to make data retrieval fast, accurate, and valid. Document classification groups documents according to the categories of their content. To address this problem, the methods used in this study are K-Nearest Neighbor (KNN) and Information Gain.