The Effect of Stemming and Removal of Stopwords on the Accuracy of Sentiment Analysis on Indonesian-language Texts
Aditya Wiha Pradana; Mardhiya Hayaty
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control Vol 4, No 4, November 2019
Publisher : Universitas Muhammadiyah Malang

DOI: 10.22219/kinetik.v4i4.912

Abstract

Preprocessing is an essential task for sentiment analysis since textual information carries a lot of noisy and unstructured data. Stemming and stopword removal are both popular preprocessing techniques for text classification. However, prior research reports conflicting results on how the two methods influence sentiment classification accuracy. This paper therefore investigates the effect of stemming and stopword removal on Indonesian-language sentiment analysis. We compare four preprocessing conditions: using both stemming and stopword removal, without stemming, without stopword removal, and without either. Support Vector Machine was used as the classification algorithm and TF-IDF as the weighting scheme. The results were evaluated using a confusion matrix and k-fold cross-validation. The experimental results show that accuracy did not improve, and tended to decrease, when stemming or stopword removal was applied. This work concludes that applying stemming and stopword removal does not significantly affect the accuracy of sentiment analysis on Indonesian text documents.
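To make the weighting step concrete, here is a minimal sketch of TF-IDF computed with and without stopword removal. It is not the authors' pipeline: the tiny stopword list, the sample sentences, and the helper names `tokenize` and `tfidf` are assumptions for illustration, and it uses the plain tf × log(N/df) variant, which may differ from the variant used in the paper.

```python
import math
from collections import Counter

# Hypothetical mini stopword list; a real experiment would use a full Indonesian list.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}

def tokenize(text, remove_stopwords):
    """Lowercase, split on whitespace, and optionally drop stopwords."""
    tokens = text.lower().split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens

def tfidf(docs, remove_stopwords=False):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(d, remove_stopwords) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors

# Illustrative sentences, not taken from the study's dataset.
docs = ["film ini bagus", "film ini buruk dan membosankan"]
with_stopwords = tfidf(docs)
without_stopwords = tfidf(docs, remove_stopwords=True)
```

In this toy case the stopword "ini" already receives a weight of zero before removal because it occurs in every document, which hints at why removing stopwords may change a TF-IDF-weighted classifier less than expected.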
POLA PEMBELIAN KONSUMEN DAN MENYUSUN STRATEGI PENJUALAN MENGGUNAKAN ALGORITMA APRIORI BERBASIS WEBSITE (STUDI KASUS : PT. XYZ)
Mardhiya Hayaty; Wisnu Dwi Harianto
Jurnal Mantik Penusa Vol. 3 No. 1.1 (19): Manajemen dan Ilmu Komputer
Publisher : Lembaga Penelitian dan Pengabdian (LPPM) STMIK Pelita Nusantara Medan

Abstract

PT. XYZ is a company that sells agrocomplex products, including agricultural, plantation, fishery, health, and household products. As transaction data continues to grow, the company has difficulty identifying consumer purchasing patterns accurately. The large volume of accumulated data can be used to determine strategies that support the company's business processes. In this study, association rule mining was implemented to help determine consumer purchasing patterns. The rule mining technique used is the Apriori algorithm, applied in a web-based application to analyze nasaofficial.com transaction data. To obtain more accurate results, the lift ratio was also calculated. The minimum support was set to 10% and the minimum confidence to 60%. The result is a set of 3 items, namely viterna, hormonal, and nasa. The final results indicate that association rule mining with the Apriori algorithm was successfully applied in the application. The strongest association rule found in the past year's transaction data is that consumers who buy viterna (natural animal vitamins) and hormonic also buy NASA POC, with a support of 23% and a confidence of 96%.
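The three rule metrics named in the abstract (support, confidence, and lift ratio) can be sketched as follows. This is only an illustration of the definitions, not the study's application: the baskets and item names `a`, `b`, `c` are made up, the helper `passes` is hypothetical, and the candidate-generation part of Apriori is not shown. The default thresholds mirror the 10% and 60% stated above.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """confidence / support(consequent); above 1 suggests a positive association."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )

def passes(transactions, antecedent, consequent,
           min_support=0.10, min_confidence=0.60):
    """Check a rule against the minimum support and confidence thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(transactions, both) >= min_support
            and confidence(transactions, antecedent, consequent) >= min_confidence)

# Toy baskets with hypothetical item names, not the study's data.
baskets = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"c"},
    {"a"},
]
rule = ({"a", "b"}, {"c"})  # "if a and b are bought, then c is bought"
rule_is_kept = passes(baskets, *rule)
```

For this toy data the rule has a support of 0.4 and a confidence of about 0.67, so it clears both thresholds; its lift is slightly above 1.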
Pelatihan Pembuatan Video Ajar Pada Guru MIN 1 Klaten
Sri Ngudi Wahyuni; Mei Maemunah; Sri Mulyatun; Rosyidah Jayanti Vijaya; Istiningsih Istiningsih; Mohammad Khalis Purwanto; Rahma Widyawati; Mardhiya Hayati
Journal of Community Development Vol. 3 No. 1 (2022): August
Publisher : Indonesian Journal Publisher

DOI: 10.47134/comdev.v3i1.54

Abstract

Since the COVID-19 pandemic spread throughout the world, all activities have been disrupted, including education. Educational activities that were initially carried out offline had to move online to control the spread of COVID-19. Teaching, learning, and the delivery of material need a method that is easy and engaging for both students and teachers. Teachers are required to adapt to technology immediately so that the educational process in schools is not disrupted. One way to upgrade their skills is training in developing engaging learning videos that can be accessed at any time and from anywhere. Open Broadcast Software (OBS) is an open-source tool that can be used to make learning videos easily and without an internet connection, so it is more efficient and incurs no additional cost. All teachers at MIN 1 Klaten took part in this training, and the participants were very enthusiastic. In general, the training aims to improve the quality of, and teachers' ability in, information technology for teaching and learning activities at MIN 1 Klaten in the 5.0 era. The community service activity was carried out through direct presentation of the material followed by hands-on practice. At the end of the activity, an evaluation was held in the form of a questionnaire to measure how well the material had been absorbed. The evaluation results showed that all participants were satisfied and found the material easy to understand and the tool easy to use.
Pelatihan Pembuatan Konten Pembelajaran Menggunakan Open Broadcast Software
Mardhiya Hayaty; Sri Ngudi Wahyuni; Istiningsih; Andriyan Dwi Putra; Mei Maemunah; Barka Satya; Dwi Nurani
Abdiformatika: Jurnal Pengabdian Masyarakat Informatika Vol. 1 No. 2 (2021): November 2021 - Abdiformatika: Jurnal Pengabdian Masyarakat Informatika
Publisher : Indonesian Scientific Journal

DOI: 10.25008/abdiformatika.v1i2.142

Abstract

The COVID-19 pandemic has struck the entire world, disrupting all activities, including education. All educational activities that were originally carried out offline must now be carried out online. One way to facilitate teaching and learning is to upload videos containing learning materials to social media such as YouTube or similar platforms. This has brought very significant changes to teaching and learning activities, including those at SD Muhammadiyah Rabbani in Klaten Regency, Central Java. Open Broadcast Software (OBS) is an open-source tool that can be used to create learning videos easily and without an internet connection. The aim of this training is to teach the procedure for creating learning content with OBS. The evaluation used an online questionnaire, and the responses were processed with SPSS version 25. The evaluation results show that 60% of participants strongly agreed that the training was easy to understand, the presenters were highly capable, and OBS was easy to implement, while another 30% agreed.
Feature Extraction using Lexicon on the Emotion Recognition Dataset of Indonesian Text
Aprilia Nurkasanah; Mardhiya Hayaty
ULTIMATICS Vol 14 No 1 (2022): Ultimatics : Jurnal Teknik Informatika
Publisher : Faculty of Engineering and Informatics, Universitas Multimedia Nusantara

DOI: 10.31937/ti.v14i1.2540

Abstract

Text mining, also known as text analytics, is a part of Natural Language Processing (NLP). Text mining includes sentiment analysis and emotion analysis, which are often applied to social media, news, and other written media. Emotion analysis is a finer-grained level of sentiment analysis, which categorises text into negative, neutral, and positive sentiments. Emotion can be divided into several classes; this study uses five: anger, fear, happiness, love, and sadness. This study proposes feature extraction using a lexicon and TF-IDF on an emotion recognition dataset of Indonesian texts. The InSet Lexicon dictionary is used as the corpus for the lexicon-based feature extraction. The results show that the InSet Lexicon performs poorly for feature extraction, with an accuracy of 30%, while TF-IDF reaches 62%.
IMPLEMENTASI DAN PELATIHAN PENGGUNAAN APLIKASI DETEKSI PLAGIAT BERITA ONLINE PADA KLIKKALTENG.ID
Mardhiya Hayaty; Teguh Efriyanto; Sri Ngudi Wahyuni
JAMAIKA: JURNAL ABDI MASYARAKAT Vol 3, No 1 (2022): FEBRUARI
Publisher : Universitas Pamulang

Abstract

The development of the digital era has accelerated the flow of information: news that was previously accessed on paper (print media) has moved to digital media and can be accessed easily through websites, social media, and so on. A journalist is a professional whose work is to seek, collect, select, and process news and present it as quickly as possible to the wider public. As news consumers, the public needs fast and accurate information to support their daily lives. This puts pressure on online news journalists to find news quickly and accurately, which creates the possibility that journalists commit plagiarism. Therefore, web-based software was developed that can measure the percentage of similarity between news content, so that it can be determined whether or not a journalist has committed severe plagiarism. The aim of implementing this application is for journalists to work more professionally, with a positive effect on the quality of the news produced. Applying the software reduced plagiarism among klikkalteng.id journalists: 74.49% in the first week, 60.14% in the second week, and a continued decline in the third week to only 49.55%.
JARO WINKLER ALGORITHM FOR MEASURING SIMILARITY ONLINE NEWS
Teguh Efriyanto; Mardhiya Hayaty
Jurnal Teknik Informatika (Jutif) Vol. 3 No. 4 (2022): JUTIF Volume 3, Number 4, August 2022
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.20884/1.jutif.2022.3.4.152

Abstract

Online news is a source of information for people; this puts pressure on journalists, as news writers, to find news quickly and accurately every day. Journalists may plagiarise other journalists or take news material from other news media sites and publish it without citing the source. An algorithm is needed to measure the similarity of online news. This work proposes the Jaro Winkler algorithm, with the calculated value normalised so that 0 means there is no resemblance and 1 means the texts are identical. The data come from 20 online news media sites in the Central Kalimantan area. The scraping process used the Custom Search JSON API with keywords to collect news on the same topic. The Jaro Winkler calculation yielded an average online news similarity of 74.49%, with 43 news items at the severe plagiarism level and 12 news items at the moderate plagiarism level. The Jaro Winkler algorithm has weaknesses in calculating similarity on the data obtained: some items that should have been classified as severe plagiarism were not, and vice versa.
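For readers unfamiliar with the similarity measure, the following is a minimal sketch of the standard Jaro-Winkler definition, which already yields values between 0 and 1. It is not the authors' code: the abstract does not say how the news text was tokenised or which parameters were used, so the prefix scale `p=0.1` and the prefix cap of 4 below are simply the commonly used defaults.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3

def jaro_winkler(s1, s2, p=0.1, max_prefix=4):
    """Jaro similarity boosted by the length of the common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)

score = jaro_winkler("MARTHA", "MARHTA")
```

The textbook pair MARTHA and MARHTA scores about 0.961. Because the measure rewards shared prefixes and works at the character level, it was designed for short strings, which may be one reason it misjudges some long news texts.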
Facial Images Improvement in the LBPH Algorithm Using the Histogram Equalization Method
Aditya Salman; Mardhiya Hayaty; Ika Nur Fajri
JUITA : Jurnal Informatika JUITA Vol. 10 No. 2, November 2022
Publisher : Department of Informatics Engineering, Universitas Muhammadiyah Purwokerto

DOI: 10.30595/juita.v10i2.13223

Abstract

In face recognition research, detecting several parts of the face is a necessary part of the study. The main factor in this work is lighting: obstacles arise during face detection when light intensity is low because of conditions such as weather, season, and sunlight. This study focuses on detecting faces in dim lighting using the Local Binary Pattern Histogram (LBPH) algorithm, assisted by the Haar Cascade Classifier, a method often used in face detection. Furthermore, it employs an image enhancement method, Histogram Equalization (HE), to improve the source image from the webcam. In the evaluation, different light intensities and various head poses affected the accuracy of the method. The research reaches 88% accuracy for successful face detection. Factors such as head accessories, hair covering the face, and occlusion of facial parts like the eyes, mouth, and nose should not be extreme.
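The enhancement step can be illustrated with a minimal sketch of global histogram equalization on a flat list of 8-bit grey values. This is the textbook formulation, not the authors' implementation; the function name `equalize` and the sample pixel values are assumptions, and a real pipeline would typically operate on 2-D image arrays from the webcam.

```python
def equalize(pixels, levels=256):
    """Global histogram equalization for integer grey values in [0, levels)."""
    if not pixels:
        return []
    # Histogram of grey levels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution function over grey levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        # Constant image: there is no contrast to stretch.
        return list(pixels)
    # Map each value so the output spans the full range of grey levels.
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1)) for p in pixels]

# A dim, low-contrast strip of pixels (illustrative values only).
dim = [50, 50, 60, 70, 70]
enhanced = equalize(dim)
```

The narrow input range of 50 to 70 is stretched to the full 0 to 255 range, which is the effect that helps a detector in dim lighting.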
Performance of Lexical Resource and Manual Labeling on Long Short-Term Memory Model for Text Classification
Mardhiya Hayaty; Aqsal Harris Pratama
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol 9, No 1 (2023): March
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v9i1.25375

Abstract

Data labeling is a critical aspect of sentiment analysis that requires assigning labels to text data to reflect the sentiment expressed. Traditional data labeling involves manual annotation by human annotators, which can be both time-consuming and costly for large volumes of text. The labeling process can be automated with lexicon resources, which are pre-labeled dictionaries or databases of words and phrases with sentiment information. The contribution of this study is an evaluation of the performance of lexicon resources in document labeling. The evaluation aims to provide insight into the accuracy of using lexicon resources and to inform future research. In this study, a publicly available dataset was used and labeled as negative, neutral, or positive. To generate new labels, the lexicon resources VADER, AFINN, SentiWordNet, and Liu & Hu were employed. An LSTM model was then trained on the newly generated labels. The performance of the trained model was evaluated by testing it on data that had been manually labeled. The study found that manual labeling led to the highest accuracy: 0.79, 0.80, and 0.80 for training, validation, and testing respectively. This is likely due to the manual creation of the test data labels, enabling the model to learn and capture balanced patterns. Models using the VADER and AFINN lexicon resources had lower accuracies of 0.54 and 0.56. SentiWordNet had the lowest accuracy among all models at 0.49, and the Liu & Hu model had the lowest testing score at 0.26. Our research indicates that lexicon resources alone are not sufficient for sentiment data labeling: they depend on pre-defined dictionaries and may not fully capture the context of words within a sentence, so manual labeling is needed to complement lexicon-based methods and achieve better results.
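A toy sketch can show what lexicon-based labeling does and where it loses context. The four-word lexicon, the sample texts, and the helper names `lexicon_label` and `agreement` are hypothetical; real resources such as VADER or AFINN are far larger and, in some cases, apply additional rules that this sketch leaves out.

```python
# Hypothetical toy lexicon: word -> sentiment score.
LEXICON = {"good": 2, "great": 3, "bad": -2, "terrible": -3}

def lexicon_label(text, lexicon=LEXICON):
    """Sum the scores of known words and map the total to a label."""
    score = sum(lexicon.get(tok, 0) for tok in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def agreement(auto_labels, manual_labels):
    """Fraction of automatic labels that match the manual ones."""
    matches = sum(a == m for a, m in zip(auto_labels, manual_labels))
    return matches / len(manual_labels)

# Illustrative texts with made-up manual labels.
texts = ["great product", "terrible service", "not good", "the package arrived"]
manual = ["positive", "negative", "negative", "neutral"]
auto = [lexicon_label(t) for t in texts]
rate = agreement(auto, manual)
```

The word-by-word sum labels "not good" as positive because the negation is ignored, a small-scale instance of the context problem the abstract describes.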
Perbandingan Metode Word Embedding Untuk Analisis Sentimen Pada Data Ulasan Marketplace
Nur’aini; Arfian Yogi Ferianto; Dhani Ariatmanto; Mardhiya Hayaty; Norhikmah
Jurnal ICT: Information Communication & Technology Vol. 22 No. 2 (2022): JICT-IKMI, December 2022
Publisher : LPPM STMIK IKMI Cirebon

Abstract

A marketplace is a platform for buying and selling goods online, one of which is Shopee. The platform provides a large amount of short text data in the form of reviews of the products being sold. Sentiment analysis is therefore carried out to classify the reviews, taking into account the factors in the sentiment object. A more advanced approach to sentiment analysis uses word embedding, the representation of words as vectors, which many researchers have adopted. This study uses review data obtained from the Shopee marketplace for sentiment analysis. The data are classified using Long Short Term Memory (LSTM), and each classified review receives one of 2 labels, positive or negative. This study aims to determine the final accuracy and the vocabulary generated by word embedding, classified using LSTM, when analyzing sentiment in Indonesian Shopee reviews. The word embedding methods used are Word2Vec and Global Vector (GloVe). This study uses a dataset of 10,000 reviews, producing a vocabulary of 18,004 words. The dataset was split into 80% training data and 20% test data. The Word2Vec word embedding method achieves 83% accuracy, and the GloVe word embedding method achieves 86% accuracy.