Predictive Analysis of Potential Fraud in the Distribution of The Program Indonesia Pintar (PIP) Funds Using the Naïve Bayes and SVM Methods Gumay, Rizki Izandi; Anggai, Sajarwo; Tukiyat, Tukiyat
International Journal of Engineering, Science and Information Technology Vol 5, No 4 (2025)
Publisher : Malikussaleh University, Aceh, Indonesia

DOI: 10.52088/ijesty.v5i4.982

Abstract

The distribution of funds under the Indonesia Smart Program (Program Indonesia Pintar, or PIP), a national education assistance program, faces serious challenges from potential fraud, which can harm the state and hinder the goal of equitable access to education. This study aims to develop a machine learning-based predictive model to detect potential fraud in the distribution of PIP funds by comparing two algorithms, Naive Bayes and Support Vector Machine (SVM). The dataset integrates PIP and DAPODIK data from 2023, together with additional engineered features based on patterns in audit findings. All data undergo preprocessing and normalization, and SMOTE is used to balance the classes. The models were evaluated using accuracy, precision, recall, and F1-score on both internal test data and external test data from Banten Province. The results showed that an SVM with an RBF kernel and tuned parameters performed best, with an accuracy of up to 98.5% on test data, whereas Naive Bayes tended to be more sensitive to shifts in data distribution on new data. Features such as recipient differences, budget checks, and stakeholder proposals proved to be the leading indicators of fraud. This study emphasizes the importance of external validation and regular model updates so that fraud detection systems remain adaptive to data dynamics in the field. The resulting model can be used as a tool for supervision and decision-making to prevent fraud in the distribution of education funds.
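As a rough illustration of the class-balancing step mentioned in the abstract above, the sketch below shows the core idea behind SMOTE: a synthetic minority-class sample is placed on the line segment between an existing minority sample and one of its nearest minority neighbours. This is a simplified, standard-library-only sketch, not the study's implementation; the function name, `k`, `seed`, and the toy points are all assumptions made for the example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative only: the function name, k, and seed are assumptions,
    not details taken from the study.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class (e.g. records flagged as fraud)
minority_class = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like_oversample(minority_class, n_new=6)
print(len(new_points))
```

Because every synthetic point is an interpolation between two real minority points, it stays inside the region the minority class already occupies. In practice, oversampling of this kind is normally applied to the training split only, so that test data remain untouched.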
Optimizing the Answer Accuracy of a Customer Service Chatbot Application Using the Retrieval-Augmented Generation (RAG) Method Dhaman, Dhaman; Anggai, Sajarwo; Waskita, Arya Adhyaksa
Journal of Information System Research (JOSH) Vol 6 No 4 (2025): Juli 2025
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josh.v6i4.8048

Abstract

This research addresses the issue of low answer accuracy in chatbot systems based on Large Language Models (LLMs) when responding to questions derived from customer service documents. To overcome this problem, the Retrieval-Augmented Generation (RAG) method is applied to improve the quality of responses by adding relevant context from external documents. The three LLMs used in this study are LLaMA3.1 8B, LLaMA3.2 1B, and LLaMA3.2 3B from Meta AI. Evaluation is conducted using automatic ROUGE metrics (ROUGE-1, ROUGE-2, and ROUGE-L) and manual human evaluation assessing accuracy, relevance, and hallucination. This research contributes to the development of more reliable LLM-based question-answering systems enhanced with external contextual documents containing customer service information. The results show a significant improvement across all models after applying the RAG method. ROUGE F1-scores increased consistently, with LLaMA3.1 8B showing the highest gain (from 0.12 to 0.58 on ROUGE-1). Human evaluation also confirmed improvements in accuracy (up to +2.73 points) and reductions in hallucination (up to −2.63 points). These improvements were evident not only in larger models but also in smaller ones, indicating that the benefits of RAG do not depend on model size. In conclusion, RAG is highly effective in enhancing the accuracy and reliability of chatbot responses, especially in document-based question-answering scenarios: by leveraging contextual information from external documents, the system produces more factual and relevant responses with fewer hallucinations, including for LLMs with smaller parameter counts.
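To make the ROUGE-1 F1-score reported above more concrete, here is a minimal sketch of how a unigram-overlap F1 can be computed between a generated answer and a reference answer. It uses plain whitespace tokenization and no stemming, so it will not exactly match the ROUGE tooling used in the study; the example sentences are invented for illustration.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate answer and a reference answer."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical answers: one generated without retrieved context, one with it
reference = "refunds are processed within seven working days"
without_context = "please contact our support team"
with_context = "refunds are processed within seven days"

print(rouge1_f1(without_context, reference))
print(rouge1_f1(with_context, reference))
```

An answer grounded in the retrieved document shares far more unigrams with the reference, which is the kind of effect a rise in ROUGE-1 F1 reflects.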
Rainfall Prediction Analysis in Tangerang City Using the LSTM and GRU Methods Supriatna, Dahlan; Anggai, Sajarwo; Tukiyat
Jurnal Pustaka AI (Pusat Akses Kajian Teknologi Artificial Intelligence) Vol 5 No 2 (2025): Pustaka AI (Pusat Akses Kajian Teknologi Artificial Intelligence)
Publisher : Pustaka Galeri Mandiri

DOI: 10.55382/jurnalpustakaai.v5i2.1068

Abstract

Erratic rainfall can affect many sectors, such as agriculture, energy, and infrastructure. Accurate rainfall prediction is essential for mitigating the risk of floods and droughts. This study aims to compare the accuracy of rainfall prediction using two deep learning algorithms, LSTM and GRU, and to contribute to more effective water resource management. The models are applied to historical rainfall data and related meteorological variables; the research data are secondary data sourced from BMKG for Tangerang City covering January 2014 to January 2025, totaling 4,062 records. Model performance is evaluated using MAE, MSE, RMSE, and R². The results show that the LSTM model with the optimal hyperparameter configuration (timesteps of 36 months, 64 memory units, 100 training epochs, batch size 16, dropout 0.3, and learning rate 0.0001) produced the best evaluation metrics: MAE of 0.08473, MSE of 0.00973, RMSE of 0.09863, and R² of 0.65601. This relatively high R² indicates that the LSTM model explains about 65.6% of the variability in the actual rainfall data. By comparison, the best-performing GRU model (using batch size 32) showed evaluation metrics slightly below those of LSTM, namely MAE 0.08883, MSE 0.01078, RMSE 0.10383, and R² 0.61878. Overall, LSTM proved superior in predictive capability.
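The four regression metrics named in the abstract above follow standard definitions, sketched below with the standard library only. The function name and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, and R^2 for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical normalized rainfall values
actual = [0.10, 0.40, 0.35, 0.80]
predicted = [0.20, 0.30, 0.45, 0.70]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE is the square root of MSE, and R² compares the model's squared error with that of always predicting the mean, which is why it reads as the share of variability explained.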
Enhancing BERTopic with Neural Network Clustering for Thematic Analysis of U.S. Presidential Speeches Anggai, Sajarwo; Zain, Rafi Mahmud; Tukiyat, Tukiyat; Waskita, Arya Adhyaksa
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 4 (2025): JUTIF Volume 6, Number 4, Agustus 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.4.5090

Abstract

Understanding the underlying themes in presidential speeches is critical for analyzing political discourse and determining public policy direction. However, topic modeling in this context presents difficulties, particularly when clustering semantically rich topics from high-dimensional embeddings. This study seeks to improve topic modeling performance by incorporating a Neural Network Clustering (NNC) approach into the BERTopic pipeline. We analyze 2,747 speeches delivered by U.S. President Joe Biden (2021-2025) and compare three clustering techniques: HDBSCAN, KMeans, and the proposed autoencoder-based NNC. The evaluation metrics (UMass, NPMI, Topic Diversity) show that NNC produces the most coherent and diverse topic clusters (UMass = -0.4548, NPMI = 0.0234, Diversity = 0.3950). These findings show that NNC can overcome the limitations of density-based and centroid-based clustering in high-dimensional semantic spaces. The study contributes to the field of Natural Language Processing by demonstrating how neural-based clustering can improve topic modeling, particularly for complex, real-world political corpora.
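The abstract above does not spell out how Topic Diversity is computed. A common definition, assumed here, is the proportion of unique words among the top-n words of all topics; the sketch below implements that definition with invented topic word lists.

```python
def topic_diversity(topics, top_n=10):
    """Fraction of unique words among the top-n words of all topics.

    Assumes the common 'unique top words / total top words' definition,
    which may differ from the exact formula used in the study.
    """
    top_words = [word for topic in topics for word in topic[:top_n]]
    return len(set(top_words)) / len(top_words)


# Hypothetical topics, each listed as its highest-ranked words
topics = [
    ["economy", "jobs", "inflation", "workers"],
    ["health", "vaccine", "covid", "workers"],
    ["ukraine", "nato", "allies", "economy"],
]
print(topic_diversity(topics, top_n=4))
```

Under this definition a score near 1 means topics rarely share top words, while a low score means the topics overlap heavily in vocabulary.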
Sentiment Analysis of User Reviews of the Info BMKG Application on the Google Play Store Using the BERT and RoBERTa Transformer Models Brando, Charlo; Anggai, Sajarwo; Tukiyat, Tukiyat
Jurnal SISKOM-KB (Sistem Komputer dan Kecerdasan Buatan) Vol. 9 No. 1 (2025): Volume IX - Nomor 1 - September 2025
Publisher : Teknik Informatika, Sistem Informasi dan Teknik Elektro

DOI: 10.47970/siskom-kb.v9i1.872

Abstract

The Info BMKG application plays an important role in delivering weather, climate, earthquake, and early disaster warning information to the public. With the growing use of mobile devices in Indonesia, sentiment analysis has become relevant for evaluating user satisfaction and identifying aspects that need improvement. This study aims to analyze the sentiment of user reviews of the Info BMKG application on the Google Play Store using the BERT and RoBERTa transformer models. The dataset consists of 10,791 user reviews classified into three sentiment categories: positive, neutral, and negative. The research stages include exploratory data analysis, data preprocessing, and model evaluation. The evaluation results show that BERT performed best with an accuracy of 93.14%, followed by RoBERTa with 91.06%, under an 80:10:10 data split. BERT also led on the other metrics, with precision of 93.45%, recall of 92.90%, and F1-score of 93.17%, compared with RoBERTa's precision of 91.12%, recall of 90.72%, and F1-score of 90.91%. Further analysis shows that although the application is appreciated, users also highlight delayed earthquake notifications and inaccurate information. These findings are expected to serve as a basis for further development to improve service quality and the effectiveness of information delivery by BMKG.
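For a three-class task like the one described above, precision, recall, and F1-score are typically computed per class and then averaged. The sketch below uses macro averaging, which is an assumption: the abstract does not state which averaging scheme was used. The labels and predictions are invented.

```python
def macro_scores(y_true, y_pred, labels):
    """Macro-averaged precision, recall, and F1 over the given labels."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical gold labels and model predictions for six reviews
y_true = ["pos", "pos", "neu", "neg", "neg", "neg"]
y_pred = ["pos", "neu", "neu", "neg", "neg", "pos"]
precision, recall, f1 = macro_scores(y_true, y_pred, ["pos", "neu", "neg"])
print(precision, recall, f1)
```

Macro averaging weights every class equally, so a weak minority class (often the neutral class in review data) pulls the scores down even when overall accuracy is high.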
Sentiment Analysis of User Reviews of the Social Media Application X on the Play Store Using the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) Algorithms: A Case Study of User Reviews on the Google Play Store Wily, Wily Arisandi; Anggai, Sajarwo; Tukiyat, Tukiyat
Jurnal SISKOM-KB (Sistem Komputer dan Kecerdasan Buatan) Vol. 9 No. 1 (2025): Volume IX - Nomor 1 - September 2025
Publisher : Teknik Informatika, Sistem Informasi dan Teknik Elektro

DOI: 10.47970/siskom-kb.v9i1.875

Abstract

This study aims to compare the performance of two deep learning algorithms, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), in classifying the sentiment of user reviews of the social media application X on the Google Play Store. The dataset comprises 5,100 reviews manually labeled into three sentiment classes: positive, neutral, and negative. Evaluation was carried out across 12 hyperparameter scenarios that vary the learning rate, regularization, epochs, and batch size. The data were split into three parts: 70% for training, 15% for validation, and 15% for testing. The results show that the LSTM model under scenario 0.002-LSTM-100-512 performed best, with accuracy of 0.842, precision of 0.730, recall of 0.719, and F1-score of 0.724. The best GRU model, under scenario 0.001-GRU-100-256, achieved accuracy of 0.837, precision of 0.713, recall of 0.690, and F1-score of 0.696. Although GRU's precision is competitive, the LSTM model leads on all other metrics, particularly F1-score, which is the main indicator in this study because it reflects the balance between precision and recall. Based on these results, LSTM was selected as the most suitable model for the sentiment analysis task in this study.
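The 70/15/15 partition described above can be sketched as a shuffled split using integer arithmetic. The function name and the fixed seed are assumptions for reproducibility of the example, and the abstract does not say whether the study stratified the split by sentiment class.

```python
import random


def split_70_15_15(records, seed=42):
    """Shuffle records and split them into 70% train, 15% validation, 15% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Stand-in for 5,100 labeled reviews (integers used as placeholder records)
reviews = list(range(5100))
train, val, test = split_70_15_15(reviews)
print(len(train), len(val), len(test))
```

With 5,100 reviews this yields 3,570 training, 765 validation, and 765 test records, with no record appearing in more than one part.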
Analysis of Broiler Chicken Production Success Classification Using K-Nearest Neighbors And Naive Bayes Methods at PT. Jandela Jaga Kaloka (Jajaka) Tukiyat; Anggai, Sajarwo; Agnia Bilqisti
Digitus : Journal of Computer Science Applications Vol. 2 No. 4 (2024): October 2024
Publisher : Indonesian Scientific Publication

DOI: 10.61978/digitus.v2i4.396

Abstract

The livestock subsector, particularly broiler chickens, provides a source of animal protein in Indonesia. However, low production efficiency, managerial challenges, and productivity fluctuations remain the primary obstacles to achieving sustainability in this sector. This study aims to analyze the success rate of broiler chicken production at PT. Jandela Jaga Kaloka (JAJAKA) using a data mining classification approach with the K-Nearest Neighbors (K-NN) and Naive Bayes algorithms. The research population comprises broiler production data from various branches of PT. JAJAKA, with a sample of 200 records selected based on representative criteria. The study employs the hold-out method with data splits of 60:40 and 70:30 for training and testing the models. The success rate of production is classified into three categories: excellent, good, and less good. The findings reveal that K-NN outperforms Naive Bayes, with an accuracy of 92.59% versus 76.67%. K-NN also records a recall of 96.67%, higher than Naive Bayes at 71.67%. However, Naive Bayes shows slightly better precision (94.29%) than K-NN (93.55%). These results affirm that the K-NN algorithm is more effective for classifying the success rate of broiler chicken production, supporting PT. JAJAKA in making more precise and strategic managerial decisions. Furthermore, this study contributes to the development of data mining methods in the poultry farming sector to improve efficiency and productivity sustainably, and gives PT. Jandela Jaga Kaloka a basis for evaluating the success rate of its broiler chicken production.
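As a small illustration of the K-NN classification and hold-out evaluation described above, the sketch below labels each held-out record by majority vote among its nearest training records. The two features (mortality rate and feed conversion ratio), the data points, and `k=3` are all hypothetical; the abstract does not list the study's actual features or parameters.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: (mortality rate %, feed conversion ratio)
train_x = [(2.0, 1.5), (2.5, 1.6), (5.0, 1.8), (5.5, 1.9), (9.0, 2.2), (9.5, 2.3)]
train_y = ["excellent", "excellent", "good", "good", "less good", "less good"]

# Held-out records, kept separate from training as in a hold-out split
test_x = [(2.2, 1.55), (5.2, 1.85), (9.2, 2.25)]
test_y = ["excellent", "good", "less good"]

predictions = [knn_predict(train_x, train_y, x, k=3) for x in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

Because K-NN relies on raw distances, features on very different scales are usually normalized first; that step is omitted here to keep the sketch short.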