Articles

Hoax Detection of Indonesian News Media on Twitter Using IndoBERT with Word Embedding Word2Vec Pernanda Arya Bhagaskara S M; Sri Suryani Prasetiyowati; Yuliant Sibaroni
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 7, No 3 (2023): Juli 2023
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v7i3.6367

Abstract

A hoax is information that has been added to or removed from news about an actual event. In the digital age, hoaxes spread widely and people are quickly affected by them, especially hoaxes about Indonesian news media circulating on social media. Disseminating information that has not been confirmed as accurate can cause public concern and anxiety. Social media has become a key communication channel for thinking about, discussing, and acting on social issues. This research therefore combines the IndoBERT model with Word2Vec feature expansion to classify hoax news in Indonesian news media. The model was built using K-Fold cross-validation to improve performance across a large dataset. The data consist of tweets shared on Twitter by the Indonesian public. The experiments show that combining Word2Vec with IndoBERT is effective at detecting hoaxes, with an overall accuracy of 88% on the entire dataset, and the best precision value in each iteration is almost 99%. The study aims to identify hoax news in Indonesian news media disseminated via social media, which should encourage people to be more cautious when reading and sharing news, since false information can significantly affect individuals.
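
The K-Fold cross-validation mentioned in the abstract above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `k_fold_indices`, the contiguous (unshuffled) split, and the sample counts are assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Hypothetical helper for illustration; real pipelines usually
    shuffle or stratify the data before splitting.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Each sample lands in the test set exactly once across the k folds.
folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print(test_idx)
```

The model is trained once per fold on `train_idx` and scored on `test_idx`; per-fold scores and their average can then be reported separately, which is one plausible reading of the per-iteration and overall figures quoted in the abstract.
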
Sentiment Classification of Fuel Price Increase With Gated Recurrent Unit (GRU) and FastText Aditya Andar Rahim; Yuliant Sibaroni; Sri Suryani Prasetiyowati
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 7, No 3 (2023): Juli 2023
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v7i3.6391

Abstract

The government usually implements a policy of increasing fuel prices and reducing fuel subsidies every year. Rising fuel prices have had a mixed impact on society. The rapid development of information technology has made internet access easier and increased the number of internet users. Social media platforms such as Twitter are widely used by people to express themselves in everyday life. Through social media, the public can comment on the fuel price policies implemented by the government. These comments range from positive through neutral to negative, and sentiment analysis can classify them accordingly. This research uses a Gated Recurrent Unit (GRU) with FastText feature expansion to classify sentiment about rising fuel prices on Twitter. The system was developed in several stages: data crawling, data labeling, data preprocessing, feature expansion, classification, and evaluation. The study aims to determine the classification performance of the Gated Recurrent Unit with FastText. The dataset contained 8,635 records, and the highest accuracy reached 90.15% with an F1 score of 90.06%. The results may help the government determine how people feel about fuel price increases. By understanding public sentiment, the government can reevaluate its policies or establish new ones that serve the public interest.
Detection of Fraudulent Financial Statement based on Ratio Analysis in Indonesia Banking using Support Vector Machine Yuliant Sibaroni; Muhammad Novario Ekaputra; Sri Suryani Prasetiyowati
JOIN (Jurnal Online Informatika) Vol. 5 No 2 (2020)
Publisher : Department of Informatics, UIN Sunan Gunung Djati Bandung

DOI: 10.15575/join.v5i2.646

Abstract

This study proposes the use of ratio analysis-based features combined with an SVM classifier to identify fraudulent financial statements. The detection method applies a data mining classification approach and is expected to replace the forensic accounting expert in identifying fraudulent financial statements, a task that is usually done manually. The experimental results show that the proposed classifier model and ratio analysis-based features provide more than 90% accuracy, where the optimal number of ratio analysis-based features is 5, namely Capital Adequacy Ratio (CAR), (ANPB) to total earning assets and non-earning assets (ANP), Impairment provision on earning assets (CKPN) to earning assets, Return on Asset (ROA), and Return on Equity (ROE). The study complements existing research on fraudulent financial statement detection by using a classifier method that differs from other research. The choice of banking cases in Indonesia also distinguishes this research from others, because financial reporting standards can differ from country to country.
Detection of Indonesian Hate Speech in the Comments Column of Indonesian Artists' Instagram Using the RoBERTa Method Adhe Akram Azhari; Yuliant Sibaroni; Sri Suryani Prasetiyowati
JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika) Vol 8, No 3 (2023)
Publisher : STKIP PGRI Tulungagung

DOI: 10.29100/jipi.v8i3.3898

Abstract

This study detects hate speech in Instagram post comments using the RoBERTa method. The RoBERTa model was chosen because it classifies English text with a high level of accuracy compared to other models, and it may have good potential for detecting Indonesian as used in this research. There are two test scenarios, full-preprocessing and non-full-preprocessing. Full-preprocessing includes several stages: cleansing, case folding, normalization, tokenization, and stemming. Non-full-preprocessing includes all of these stages except stemming. The experimental results show that non-full-preprocessing has a higher average accuracy than full-preprocessing, at 85.09%. This shows that RoBERTa predicts comments well when full-preprocessing is not used.
Word2Vec Optimization on Bi-LSTM in Electric Car Sentiment Classification Siti Uswah Hasanah; Yuliant Sibaroni; Sri Suryani Prasetyowati
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 8, No 1 (2024): Januari 2024
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v8i1.7200

Abstract

The Indonesian government is actively promoting electric vehicles. This policy has generated many sentiments from the public, both positive and negative. Public sentiment can have a significant impact on the success of government policies. Therefore, it is important to understand public sentiment towards these policies. This research develops a sentiment classification model to understand public sentiment towards electric vehicles in Indonesia. Sentiment classification is the process of identifying and measuring the positive or negative sentiment in a text. This research uses a Bi-LSTM model to perform classification on a dataset of tweets related to electric vehicles. To evaluate the performance, testing was conducted through two main scenarios. In Scenario I, the focus was on finding the optimal embedding size for two Word2Vec architectures, namely CBOW and Skip-gram. Model evaluation was performed using cross-validation to gain a deeper understanding of model performance. Scenario II focused on searching for the best dropout parameters for the Bi-LSTM model. This step aimed to find the optimal configuration for the model to generate more accurate and consistent predictions in classifying tweets related to electric vehicles. The results showed that in the context of sentiment classification on tweets about electric vehicles, the combination of CBOW with an embedding size of 200 and the Bi-LSTM model with a Dropout value of 0.2 is the best choice and achieves an accuracy of 96.31%, precision of 92.57%, Recall of 98.61%, and F1-Score of 95.49%.
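
As a quick consistency check on the figures in the abstract above, the F1-Score can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The function name `f1_score` is an illustrative choice; only the two input percentages come from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 92.57% and recall 98.61% as reported in the abstract.
f1 = f1_score(0.9257, 0.9861)
print(f"{f1:.2%}")  # prints 95.49%
```

The recomputed value agrees with the 95.49% F1-Score stated in the abstract.
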
Performance of Time-Based Feature Expansion in Developing ANN Classification Prediction Models on Time Series Data Sri Suryani Prasetiyowati; Arnasli Yahya; Aniq Atiqi Rohmawati
International Journal on Information and Communication Technology (IJoICT) Vol. 9 No. 2 (2023): Vol.9 No. 2 Dec 2023
Publisher : School of Computing, Telkom University

Abstract

In most research, prediction is the main goal: estimating future events in the field under study. Classification research that involves prediction, uses spatio-temporal data, and is influenced by many features (for example disease spread, climate change, regional planning, the environment, or economic growth) requires methods that can predict while handling both the features and the time dimension. To obtain a time-based classification prediction model that uses many features, this research uses a machine learning method, the Artificial Neural Network (ANN). The scenario is to develop a t+r classification prediction model by expanding the features with those from the previous periods t-r. The performance of feature expansion in developing ANN classification prediction models is determined from the optimal accuracy over the combinations of t-r classification prediction models for the previous time periods. Implementing the model on the data shows that the performance of time-based feature expansion in ANN classification ranges from 3.5% to 11%, and the optimal accuracy is obtained when the features are expanded with 3 to 5 earlier time periods.
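
One way to read the time-based feature expansion described above is as building lagged feature rows: the features from period t and from the r previous periods are concatenated and paired with the class label of a later period. The sketch below follows that reading. The function name `expand_with_lags`, its parameters, and the toy data are assumptions for illustration and are not taken from the paper.

```python
def expand_with_lags(features, labels, n_lags, horizon=1):
    """Build lag-expanded rows from per-period data.

    features: list of feature lists, one per time period
    labels:   list of class labels, one per time period
    n_lags:   how many earlier periods (t-1 ... t-n_lags) to append
    horizon:  how many periods ahead the target label lies
    (Hypothetical helper; names and defaults are illustrative.)
    """
    rows, targets = [], []
    for t in range(n_lags, len(features) - horizon):
        row = []
        for r in range(n_lags + 1):  # r = 0 is the current period t
            row.extend(features[t - r])
        rows.append(row)
        targets.append(labels[t + horizon])
    return rows, targets


# Toy series: one feature per period, six periods.
features = [[1], [2], [3], [4], [5], [6]]
labels = ["a", "b", "c", "d", "e", "f"]
rows, targets = expand_with_lags(features, labels, n_lags=2, horizon=1)
print(rows)     # [[3, 2, 1], [4, 3, 2], [5, 4, 3]]
print(targets)  # ['d', 'e', 'f']
```

The resulting rows could be fed to any classifier, including an ANN. Varying `n_lags` corresponds to the abstract's comparison of how many earlier periods to include.
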
Revealing the Impact of the Combination of Parameters on SVM Performance in COVID-19 Classification Sri Suryani Prasetiyowati; Sri Harini; Juniardi Nur Fadila; Hilda Fahlena
International Journal on Information and Communication Technology (IJoICT) Vol. 10 No. 1 (2024): Vol. 10 No.1 June 2024
Publisher : School of Computing, Telkom University

DOI: 10.21108/ijoict.v10i1.965

Abstract

A non-linear SVM works by changing the kernel used in the SVM. Each kernel function in linear and non-linear SVMs has several parameters that are used in the classification process. SVM is a method with advantages in classification, but selecting optimal parameters remains an obstacle. This research investigates the effect of parameter variations on SVM classification performance on a COVID-19 dataset, using linear, RBF, sigmoid, and polynomial kernels. The analysis shows that the polynomial kernel achieves the highest performance of the kernels tested. The highest accuracy of 77.57% was achieved with a C value of 0.75 and a Gamma value of 0.75, with an F1-Score of 76.67% indicating a good balance between precision and recall. The polynomial kernel also gives more stable performance on the COVID-19 dataset, with more controlled fluctuations than the other kernels. The interaction between the C and Gamma parameters shows that a Gamma value of 0.75 consistently provides good results, while adjusting the C parameter produces more controlled performance variations. This confirms that an appropriate Gamma setting is key to improving the accuracy and consistency of SVM model predictions in this case.
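
To make the role of Gamma concrete, the four kernels named above can be written out directly. The sketch below uses the common parameterization in which Gamma scales the inner product or the squared distance. The `degree` and `coef0` defaults and the sample vectors are assumptions for illustration, since the abstract does not state them. Note that C is not part of any kernel: it is the penalty applied to misclassified points during training.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def rbf_kernel(x, y, gamma):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, y, gamma, degree=3, coef0=0.0):
    # degree and coef0 are assumed defaults, not values from the abstract.
    return (gamma * dot(x, y) + coef0) ** degree


def sigmoid_kernel(x, y, gamma, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)


# Illustrative vectors; gamma = 0.75 is the value highlighted in the abstract.
x, y = [1.0, 2.0], [2.0, 0.0]
gamma = 0.75
print(linear_kernel(x, y))             # 2.0
print(polynomial_kernel(x, y, gamma))  # 3.375
print(rbf_kernel(x, y, gamma))
print(sigmoid_kernel(x, y, gamma))
```

Because Gamma enters every non-linear kernel but not the linear one, a parameter sweep over C and Gamma, as described in the abstract, changes the similarity measure itself for the RBF, sigmoid, and polynomial cases.
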
Optimal Number Data Trains in Hoax News Detection of Indonesian using SVM and Word2Vec Asramanggala, Muhammad Sulthon; Prasetyowati, Sri Suryani; Sibaroni, Yuliant
Building of Informatics, Technology and Science (BITS) Vol 5 No 1 (2023): June 2023
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v5i1.3516

Abstract

Technology continues to advance, and information now spreads very quickly on social media, especially Twitter. Not all news circulating on Twitter is accurate; much of it is hoax news spread by irresponsible individuals. In this research, the authors build a system to determine the optimal amount of training data for hoax news classification, using the Support Vector Machine and Word2Vec algorithms to classify hoax and non-hoax news. Five experiments were carried out with 5,000, 10,000, 15,000, 20,000, and 25,000 training samples. With 5,000 training samples the accuracy is 77.28%, with 10,000 it is 79.68%, with 15,000 it is 79.892%, with 20,000 it is 80.416%, and with 25,000 it is 81.184%, using a combination of unigrams with full token selection. The research aims to build a hoax detection system that can determine the optimal amount of training data to use, and to evaluate the performance of the Support Vector Machine algorithm with Word2Vec in detecting hoax news.
The Effect of Feature Weighting on Sentiment Analysis TikTok Application Using The RNN Classification Aufa, Rizki Nabil; Prasetiyowati, Sri Suryani; Sibaroni, Yuliant
Building of Informatics, Technology and Science (BITS) Vol 5 No 1 (2023): June 2023
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v5i1.3597

Abstract

Social media is a medium people use to express their opinions, and it has become a necessity in social life. One of the most popular social media applications since 2020 is TikTok. Short videos with an average duration of 60 seconds can entertain the community so that they don't feel isolated. There are 17 million TikTok application reviews in the Google Play store in Indonesia from users of various ages. The rapid development of information and technology has led to both support for and criticism of this application. Freedom of expression without specific restrictions on content publication negatively impacts users' mental health. Sentiment analysis is therefore important for revealing trends in opinion about the application, helping the community judge whether the application is good before using it. Proper feature weighting is required to improve the accuracy of sentiment analysis, and better results can be obtained by choosing an appropriate weighting method. This study compares the TF-IDF, TF-RF, and Word2Vec feature weighting methods with an RNN classifier on TikTok app reviews. The experiment shows that TF-RF is superior to TF-IDF, with accuracies of 87.6% for TF-RF, 86% for TF-IDF, and 80% for Word2Vec. The contribution of this research lies in its exploration of different feature weighting methods to enhance sentiment analysis accuracy and provide valuable insights for decision-making processes.
Hate Speech Classification in Tiktok Reviews using TF-IDF Feature Extraction, Differential Evolution Optimization, and Word2Vec Feature Expansion in a Classification System using Recurrent Neural Network (RNN) Fatha, Rizkialdy; Sibaroni, Yuliant; Prasetyowati, Sri Suryani
Building of Informatics, Technology and Science (BITS) Vol 6 No 2 (2024): September 2024
Publisher : Forum Kerjasama Pendidikan Tinggi

Abstract

In the ever-evolving digital era, social media, especially platforms like TikTok, has become a primary channel for users to share opinions, experiences, and expressions. However, the increasing prevalence of hate speech in reviews of the TikTok app on the Google Play Store indicates the need for a sophisticated approach to identify and classify harmful content. This research aims to optimize the classification of hate speech in Google Play reviews of the TikTok app by integrating Term Frequency-Inverse Document Frequency (TF-IDF), Differential Evolution, and Word2Vec within a Recurrent Neural Network (RNN) model. TF-IDF is used to extract relevant features from each review, while Differential Evolution is applied to optimize the model parameters efficiently. Word2Vec enhances the representation of words in the context of app reviews, and the RNN model enables the recognition of sequential patterns in hate speech. The results of this research are expected to contribute significantly to the improvement of hate speech classification on digital platforms focused on app reviews.
Co-Authors Abduh Salam Adhe Akram Azhari Adhitya Aldira Hardy Aditya Andar Rahim Aditya Firman Ihsan Aditya Gumilar Aniq A. Rohmawati Aniq Atiqi Rohmawati Aqilla, Livia Naura arief rahman Arnasli Yahya Asramanggala, Muhammad Sulthon Aufa, Rizki Nabil Azmi Aulia Rahman Chamadani Faisal Amri Christina Natalia Claudia Mei Serin Sitio Damar, Muhammad Dede Tarwidi Derwin Prabangkara Ekaputra, Muhammad Novario Elqi Ashok Erna Sri Sugesti Fairuz, Mitha Putrianty Fatha, Rizkialdy Fathin, Muhammad Ammar Fatri Nurul Inayah Gede Astawa Pradika Gilang Brilians Firmanesha Gusti Aji, Raden Aria Gutama, Soni Andika Hawa, Iqlima Putri Haziq, Muhammad Raffif Hilda Fahlena I Putu Ananda Miarta Utama Ibnu Muzakky M. Noor Indra Kusuma Yoga Indri Octavellia Wulanissa Irfani Adri Maulana Jauzy, Muhammad Abdurrahman Al Juniardi Nur Fadila Lesmana, Aditya Mahadzir, Shuhaimi Maharani, Anak Agung Istri Arinta Mardha Al Nazhfi Ali Mitha Putrianty Fairuz Muh. Kiki Adi Panggayuh Muhammad Damar Muhammad Ghifari Adrian Muhammad Hadyan Baqi Muhammad Ikram Kaer Sinapoy Muhammad Novario Ekaputra Muldani, Muhamad Dika Nanda Ihwani Saputri Naufal Alvin Chandrasa Nenny Lisbeth Minarno Ni Made Dwipadini Puspitarini Nur Fadila, Juniardi Nuraena Ramdani Nurul Fajar Riani Pernanda Arya Bhagaskara S M Pilar Gautama, Hadid Purwanto, Brian Dimas Putra, Ihsanudin Pradana Putri, Pramaishella Ardiani Regita Rachmadania Irmanita Rafika Salis Rahmanda, Rayhan Fadhil Ridha Novia Ridho Isral Essa Rifaldy, Fadil Rizky Fauzi Ramadhani Rizky Yudha Pratama Rizky, Muhammad Zacky Faqia Salis, Rafika Salsabila, Syifa Sinaga, Astria M P Siti Uswah Hasanah Sri Harini Sri Harini Suhendar, Annisya Hayati Winico Fazry Wira Abner Sigalingging Yahya, Arnasli Yuliant Sibaroni Zaidan, Muhammad Naufal