Articles

Chatbot Model Development Using BERT for West Sumatera Halal Tourism Information
Hafidz, Irmasari; Mukti, Bayu Siddhi; Naseela, Qudsiyah Zahra Ilham; Yudistira, Ahmadhian Daffa; Purnama, I Putu Adhitya Pratama Mangku; Ariyani, Nurul Fajrin; Astuti, Hanim Maria; Tjahyanto, Aris
Halal Research Vol 4 No 2 (2024): July
Publisher : Halal Center ITS

DOI: 10.12962/j22759970.v4i2.1819

Abstract

Halal tourism in Indonesia is growing rapidly, highlighting the need for halal tourism information that is unique and relevant for Muslim travellers. However, providing timely and reliable information, specifically about halal tourism, remains a challenge. This research aims to address this by developing a chatbot model using BERT for West Sumatra’s halal tourism. A total of 1,125 questions were prepared, divided into nine categories (labels) with 125 questions each. Eighty percent (900 questions) were used to fine-tune the BERT-base-multilingual-uncased model, while 20% (225 questions) were used for evaluation. The model was fine-tuned using BertForSequenceClassification for three epochs with a batch size of 32. The chatbot demonstrated high performance, with an overall accuracy of 0.96. However, the lowest precision was 0.89, for the “budaya” (culture) and “kuliner” (culinary) labels, and the lowest recall was 0.64, for the “belanja” (shopping) label, which yielded an F1-score of 0.78. This study describes chatbot model development, from data collection and pre-processing to experimental setup and model training using a fine-tuned BERT-base-multilingual-uncased model. The chatbot model can group user queries into specific purposes and respond from a predefined list. However, one label (“belanja”, or shopping) has the lowest recall, possibly due to a poor training dataset and query variation.
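
The split sizes and the reported F1-score can be cross-checked with standard definitions. The sketch below is only an arithmetic illustration, not the authors' evaluation code; the helper names are hypothetical, and the inputs are the rounded values quoted in the abstract.

```python
# Arithmetic check of figures quoted in the abstract (standard definitions only).
# Function names are illustrative and do not come from the paper.

def split_counts(total, train_fraction):
    """Return (train, test) sizes for a simple fractional split."""
    train = round(total * train_fraction)
    return train, total - train

def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def precision_from_f1(f1, recall):
    """Solve F1 = 2PR / (P + R) for P."""
    return f1 * recall / (2 * recall - f1)

total_questions = 9 * 125                      # nine labels, 125 questions each
train, test = split_counts(total_questions, 0.8)
print(total_questions, train, test)

# Precision implied by the reported recall (0.64) and F1-score (0.78)
print(round(precision_from_f1(0.78, 0.64), 2))
```

Because the reported values are rounded, the implied precision for the “belanja” label is only approximate; it comes out close to 1.0, which suggests the label's errors are missed queries rather than false matches.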

Improved performance of fake account classifiers with percentage overlap features selection
Tjahyanto, Aris; Pratama, Rivanda Putra; Shiddiqi, Ary Mazharuddin
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 2: June 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i2.pp1585-1595

Abstract

Feature selection plays a crucial role in the development of high-performance classification models. We propose an innovative method for detecting fake accounts. This method leverages the percentage overlap technique to refine feature selection. We build our technique upon earlier work that showed the improved efficacy of the Naïve Bayesian classifier through dataset normalization. Our study employs a dataset of account profiles sourced from Twitter, which we normalize using the Min-Max method. We analyze the results through a series of comprehensive experiments involving diverse classification algorithms, such as Naïve Bayes, decision tree, k-nearest neighbors (KNN), deep learning, and support vector machines (SVM). Our experimental results demonstrate 100% accuracy achieved by the SVM and deep learning classifiers. The results are attributed to the percentage overlap technique, which facilitates the identification of four highly informative features. These results outperform models with more extensive feature sets, underscoring the efficacy of our approach.
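
The abstract does not define the percentage overlap measure, so the following is a minimal sketch under an assumption: overlap is taken to be the share of a feature's normalized value range in which both classes occur, and features with the least overlap are kept. The function names, feature names, and sample values are all hypothetical.

```python
# Sketch of Min-Max normalization followed by an overlap-based feature ranking.
# ASSUMPTION: "percentage overlap" is approximated here as the fraction of the
# combined value range shared by the two classes; the paper may define it differently.

def min_max_normalize(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def range_overlap(class_a, class_b):
    """Fraction of the combined range where both classes have values."""
    overlap = min(max(class_a), max(class_b)) - max(min(class_a), min(class_b))
    span = max(max(class_a), max(class_b)) - min(min(class_a), min(class_b))
    if span == 0:
        return 1.0
    return max(0.0, overlap) / span

def select_features(columns, labels, k):
    """Return the k feature names whose class ranges overlap the least."""
    scores = {}
    for name, values in columns.items():
        norm = min_max_normalize(values)
        real = [v for v, y in zip(norm, labels) if y == 0]
        fake = [v for v, y in zip(norm, labels) if y == 1]
        scores[name] = range_overlap(real, fake)
    return sorted(scores, key=scores.get)[:k]

# Hypothetical toy profiles: label 0 = genuine account, 1 = fake account
columns = {
    "followers": [900, 1200, 1500, 5, 10, 20],
    "name_length": [8, 12, 10, 9, 11, 12],
}
labels = [0, 0, 0, 1, 1, 1]
print(select_features(columns, labels, 1))
```

In this toy data the follower counts of the two classes do not overlap at all, so that feature ranks first, while the name lengths overlap heavily and carry little class information.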
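
Integrasi Analytic Hierarchy Process (AHP)–Machine Learning yang Dinamis dalam Prediksi Risiko Kredit (Dynamic Integration of the Analytic Hierarchy Process (AHP) and Machine Learning in Credit Risk Prediction)
Riskandy, Yudi Hendra; Tjahyanto, Aris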
Jurnal Locus Penelitian dan Pengabdian Vol. 4 No. 10 (2025): JURNAL LOCUS: Penelitian dan Pengabdian
Publisher : Riviera Publishing

DOI: 10.58344/locus.v4i10.4459

Abstract

In recent years, credit risk in the MSME (micro, small, and medium enterprise) sector has risen sharply because traditional assessment models tend to ignore data complexity and expert preferences. To address this challenge, this study proposes integrating the Analytic Hierarchy Process (AHP) with a Machine Learning (ML) algorithm, specifically Random Forest, as a hybrid approach to credit risk prediction. The study aims to develop a predictive system that is accurate and explainable by combining the strengths of AHP as an expert weighting tool and ML as a data-driven classification engine. The methodology involves normalizing the MSME dataset, training a Random Forest model using WEKA and Python, and integrating the AHP weights into the calibration of the classification threshold. Interviews with credit experts were used to construct the AHP comparison matrix and to ensure the consistency of the weights. The results show that the initial Random Forest model had an accuracy of 89.5% and an AUC of 96.6%. After AHP integration, precision rose to 100% while recall fell to 82.8%, indicating a shift toward a conservative strategy. The empirically optimal threshold was reached at 0.583, with an F1-score of 91.89%. In conclusion, AHP–ML integration not only improves the model's statistical performance but also strengthens the transparency and flexibility of decision-making, making it an ideal solution for risk management and adaptive credit policy in the financial sector.
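
The AHP weighting and consistency check mentioned above can be sketched as follows. This is a generic illustration, not the authors' implementation: it uses the row geometric-mean approximation of the priority vector and Saaty's random index, and the three criteria and their pairwise judgments are hypothetical. How the weights enter the threshold calibration is not specified in the abstract and is not shown.

```python
import math

# Saaty's random consistency index for small matrix sizes
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}

def ahp_weights(matrix):
    """Priority vector via the row geometric-mean approximation."""
    n = len(matrix)
    geo = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo)
    return [g / total for g in geo]

def consistency_ratio(matrix, weights):
    """CR = CI / RI, with lambda_max estimated as the mean of (A w)_i / w_i."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

# Hypothetical pairwise comparisons for three illustrative criteria:
# repayment history, cash flow, collateral
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
print([round(w, 3) for w in weights], round(cr, 3))
```

A consistency ratio below 0.1 is the conventional cutoff for accepting expert judgments; the toy matrix here is perfectly consistent, so its ratio is effectively zero.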

Automatic Categorization of Multi Marketplace FMCGs Products using TF-IDF and PCA Features
Indasari, Sri Suci; Tjahyanto, Aris
Jurnal Sisfokom (Sistem Informasi dan Komputer) Vol. 12 No. 2 (2023): JULI
Publisher : ISB Atma Luhur

DOI: 10.32736/sisfokom.v12i2.1621

Abstract

The use of technology, together with the increasing number of internet users, has shifted the product sales ecosystem toward electronic commerce (e-commerce). A total of 73.23 customers made purchase transactions using e-commerce, and the most purchased products were those classified as Fast Moving Consumer Goods (FMCGs). The increasingly varied FMCGs data, coupled with the growing number of marketplaces, needs to be broken down into specific groups. This is done by analyzing e-commerce product information, especially product names and descriptions. In this study, we propose automatic categorization of FMCGs products using data from multiple marketplaces. Text data is converted into structured data through a series of preprocessing steps, and comprehensive experiments are carried out to compare the performance of feature extraction methods including TF-IDF, BOW, and N-Gram. All three methods are evaluated on the text data sets using K-Means clustering, with PCA used to reduce the data dimensions. The results show that TF-IDF with a dimension reduction value of 70, implemented in Python, provides the best grouping percentage.
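
As an illustration of the TF-IDF feature extraction step, here is a minimal pure-Python sketch. It assumes whitespace tokenization and the plain `tf * log(N / df)` weighting, which may differ from the variant the authors used; the product names are hypothetical, and the PCA and K-Means stages are not shown.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical product names from different marketplaces
products = [
    "instant noodle chicken",
    "instant noodle beef",
    "liquid dish soap",
]
vectors = tf_idf(products)
print(vectors[0])
```

Terms shared by several products, such as "instant", receive lower weights than distinguishing terms such as "chicken", which is what makes the resulting vectors useful for grouping.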

An Enhanced Dynamic Signature Verification using the X and Y Histogram Features
Tjahyanto, Aris; Rangga Rahardika, Ano; Mazharuddin Shiddiqi, Ary
Infotekmesin Vol 12 No 2 (2021): Infotekmesin: Juli 2021
Publisher : P3M Politeknik Negeri Cilacap

DOI: 10.35970/infotekmesin.v12i2.668

Abstract

Dynamic signature verification using histogram features is a well-known signature forgery detection technique due to its high performance. However, this technique is often limited to angular histograms derived from vectors connecting two adjacent points. We propose additional new features from the X and Y histograms to overcome this limitation. Our experiments indicate that our technique produced Area Under the Curve (AUC) values of 0.80 for detecting skilled forgery and 0.91 for random forgery. Our method performed best when the verification system used the 12 most dominant features. This setup produced AUC values of 0.80 for detecting skilled forgery and 0.93 for random forgery. These results outperformed the original technique without the X and Y histogram features, which produced AUC values of 0.78 for detecting skilled forgery and 0.90 for random forgery.
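
One way to picture X and Y histogram features is shown below. This is a minimal sketch under assumptions, not the authors' feature definition: each coordinate is binned into equal-width bins over its own range and the relative frequencies are concatenated. The bin count, function names, and sample points are hypothetical.

```python
# Sketch of X and Y coordinate histograms as a fixed-length feature vector.
# ASSUMPTION: equal-width bins over each coordinate's own range, with
# relative frequencies as features; the paper's exact binning is not stated.

def histogram_features(values, bins):
    """Relative frequency of values in each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    for v in values:
        if hi == lo:
            idx = 0
        else:
            # The maximum value falls into the last bin
            idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

def xy_histogram_features(points, bins=4):
    """Concatenate the X histogram and the Y histogram of a pen trajectory."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return histogram_features(xs, bins) + histogram_features(ys, bins)

# Hypothetical sampled pen positions (x, y) from one signature
points = [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)]
features = xy_histogram_features(points, bins=4)
print(features)
```

Normalizing by the number of points makes the vector independent of signature length, so signatures sampled at different durations remain comparable.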

Comparative Analysis: Machine Learning Algorithms for TOC Prediction in Pharmaceutical Water Treatment Systems
Mustapa, Dieki Rian; Tjahyanto, Aris
Jurnal Sisfokom (Sistem Informasi dan Komputer) Vol. 13 No. 2 (2024): JULY
Publisher : ISB Atma Luhur

DOI: 10.32736/sisfokom.v13i2.2148

Abstract

Water quality is crucial in pharmaceutical production, where water serves as a solvent and raw material. Contamination with organic compounds poses a risk to product integrity and safety. Total Organic Carbon (TOC) serves as a key indicator for assessing organic pollution levels in water. An increase in TOC signals potential issues with water treatment systems. Machine learning prediction of TOC values is essential for preemptive monitoring and maintenance. This study aimed to compare three machine learning algorithms, Linear Regression (RL), Random Forest (RF), and multilayer perceptron (MLP), for predicting TOC in pharmaceutical water treatment systems. The research conducted a comprehensive analysis using a dataset covering various operational conditions of pharmaceutical water treatment systems. Each algorithm was evaluated using performance metrics such as the coefficient of determination (R-squared) and prediction accuracy to assess its effectiveness in predicting TOC concentrations. A correlation coefficient approaching 1 (100%) signifies a strong relationship between model predictions and actual target values (prediction accuracy), while a smaller Mean Absolute Error (MAE) indicates higher accuracy in predicting target values. The study found that the correlation coefficients, in order from highest to lowest, belong to the RF, MLP, and RL models, with values of 95.04%, 93.11%, and 80.27%, respectively. Likewise, the additional evaluation metrics, including Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Relative Absolute Error (RAE), and Root Relative Squared Error (RRSE), rank from lowest to highest across the RF, MLP, and RL models. RF has higher TOC prediction accuracy than the other models (95%) and the lowest MAE (3.9). This research offers valuable insights into utilizing machine learning algorithms for TOC prediction within pharmaceutical water treatment to make informed decisions, improving water treatment systems and overall product quality.
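
The evaluation metrics named in the abstract have standard definitions, sketched below in plain Python. This is an illustration only: the function name and the sample TOC readings are hypothetical, and RAE and RRSE follow the usual convention of comparing the model's error with that of always predicting the mean of the actual values.

```python
import math

def regression_metrics(actual, predicted):
    """Correlation coefficient, MAE, RMSE, RAE and RRSE for paired values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Errors of the baseline that always predicts the mean of the actual values
    abs_dev = sum(abs(a - mean_a) for a in actual)
    sq_dev = sum((a - mean_a) ** 2 for a in actual)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        "correlation": cov / math.sqrt(sq_dev * var_p),
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae": abs_err / abs_dev,
        "rrse": math.sqrt(sq_err / sq_dev),
    }

# Hypothetical TOC readings and model predictions
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RAE and RRSE values below 1 (100%) mean the model beats the mean-only baseline, which is why lower values across all four error metrics support the same ranking as the correlation coefficient.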