Experimental of information gain and AdaBoost feature for machine learning classifier in media social data Jasmir, Jasmir; Abidin, Dodo Zaenal; Fachruddin, Fachruddin; Riyadi, Willy
Indonesian Journal of Electrical Engineering and Computer Science Vol 36, No 2: November 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v36.i2.pp1172-1181

Abstract

This research applies several machine learning methods together with feature selection to social media data, namely restaurant reviews. The feature selection combines information gain (IG) and adaptive boosting (AdaBoost), and the aim is to measure its effect on the classification performance of Naïve Bayes (NB), K-nearest neighbor (KNN), and random forest (RF). NB is simple and efficient but very sensitive to feature selection. KNN has known weaknesses such as biased k values, overly complex computation, memory limitations, and ignoring irrelevant attributes. RF also has weaknesses, including evaluation values that can change significantly with only small changes in the data. In text classification, feature selection can improve scalability, efficiency, and accuracy. In tests of these machine learning methods with the combined feature selection, the best classifier was RF, whose scores rose significantly after applying IG and AdaBoost: accuracy increased by 10%, precision by 12.43%, recall by 8.14%, and F1-score by 10.37%. RF also produced evenly matched scores after applying IG and AdaBoost, with accuracy of 84.5%, precision of 85.58%, recall of 86.36%, and F1-score of 85.97%.
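As a rough illustration of the information gain score mentioned in the abstract above, the sketch below ranks two binary word-presence features by how much they reduce label entropy. The toy reviews, the feature names `has_delicious` and `has_the`, and both helper functions are hypothetical and are not taken from the paper; the AdaBoost stage is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: four reviews and two word-presence features.
labels = ["pos", "pos", "neg", "neg"]
has_delicious = [1, 1, 0, 0]  # separates the classes perfectly
has_the = [1, 0, 1, 0]        # carries no information about the class

ig_delicious = information_gain(has_delicious, labels)
ig_the = information_gain(has_the, labels)
```

A feature selection step built on this score would keep the highest-scoring features, here `has_delicious`, and drop uninformative ones such as `has_the`.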
Perancangan Sistem Informasi Administrasi Pembayaran SPP Pada SMK Batanghari Kota Jambi [Design of a Web-Based SPP Administration Payment Information System at SMK Batanghari Jambi City] Sujatmiko, Tri Aji; Maulana Hidayat; Pajri; Abidin, Dodo Zaenal
Jurnal Informatika Dan Rekayasa Komputer(JAKAKOM) Vol 4 No 2 (2024): JAKAKOM Vol 4 No 2 September 2024
Publisher : LPPM Universitas Dinamika Bangsa


Abstract

In today's digital era, efficiency and effectiveness are important in school administration. One important aspect of school administration is the payment of SPP (Education Development Contribution). SMK Batanghari Jambi City needs a system that can simplify the SPP payment process, reduce data entry errors, and increase transparency. This research designs and develops a web-based tuition payment information system that can be accessed by students, parents, and the school, using a software development method that involves requirements analysis, system design, implementation, and testing. The result of the design is a website that lets users pay tuition fees online, stores payment history, and produces real-time financial reports. The system is expected to improve the efficiency of managing tuition payments at SMK Batanghari Jambi City and to give all interested parties easy access to information.

Keywords: Design, System, Information, Payment, Tuition Payment, Website.
Improving Term Deposit Customer Prediction Using Support Vector Machine with SMOTE and Hyperparameter Tuning in Bank Marketing Campaigns Abidin, Dodo Zaenal; Rosario, Maria; Sadikin, Ali; Nurhadi, Nurhadi; Jasmir, Jasmir
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 3 (2025): JUTIF Volume 6, Number 3, Juni 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.3.4585

Abstract

Identifying potential customers for term deposit products remains a challenge in the banking industry due to class imbalance in marketing datasets. This study proposes an integrated approach that combines Support Vector Machine (SVM) with the Synthetic Minority Oversampling Technique (SMOTE) and hyperparameter tuning via GridSearchCV to enhance prediction performance. The dataset comprises 45,211 records containing demographic and campaign-related features. Preprocessing steps include categorical encoding, feature scaling, and SMOTE-based resampling. The optimized SVM model achieves an accuracy of 91% and an AUC of 0.96, outperforming the baseline model and demonstrating strong discriminatory ability, particularly for the minority class. This method improves the balance between precision and recall while reducing bias toward the majority class. The findings confirm the effectiveness of combining SMOTE and SVM for imbalanced classification tasks in the financial domain. These results contribute to the advancement of applied machine learning in informatics, particularly in developing robust decision support systems for data-driven banking strategies. Future work may extend this approach to diverse datasets and explore advanced resampling or ensemble techniques to improve model generalization.
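The SMOTE resampling step named in the abstract above can be pictured with a simplified sketch: each synthetic minority point is placed on the line segment between a minority sample and one of its nearest minority neighbours. The function `smote_sketch`, its parameters, and the toy points are assumptions made for illustration. This is not the authors' pipeline, and it leaves out the SVM training and grid search.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority sample
    and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class samples with two scaled features each.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8)]
new_points = smote_sketch(minority, n_new=4)
```

Because every synthetic point is an interpolation, it stays inside the region already covered by the minority class. In practice the resampling is applied to the training split only, so that no synthetic points leak into evaluation data.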
Comparison of robust machine learning algorithms on outliers and imbalanced spam data Abidin, Dodo Zaenal; Jasmir, Jasmir; Rasywir, Errisya; Siswanto, Agus
Indonesian Journal of Electrical Engineering and Computer Science Vol 39, No 2: August 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v39.i2.pp1130-1144

Abstract

Effective spam detection is essential for data security, user experience, and organizational trust. However, outliers and class imbalance can impact machine learning models for spam classification. Previous studies focused on feature selection and ensemble learning but have not explicitly examined their combined effects. This study evaluates the performance of random forest (RF), gradient boosting (GB), and extreme gradient boosting (XGBoost) under four experimental scenarios: (i) without synthetic minority over-sampling technique (SMOTE) and outliers, (ii) without SMOTE but with outliers, (iii) with SMOTE and without outliers, and (iv) with SMOTE and with outliers. Results show that XGBoost achieves the highest accuracy (96%), an area under the curve-receiver operating characteristic (AUCROC) of 0.9928, and the fastest computation time (0.6184 seconds) under the SMOTE and outlier-free scenario. Additionally, RF attained an AUCROC of 0.9920, while GB achieved 0.9876 but required more processing time. These findings emphasize the need to address class imbalance and outliers in spam detection models. This study contributes to developing more robust spam filtering techniques and provides a benchmark for future improvements. By systematically evaluating these factors, it lays a foundation for designing more effective spam detection frameworks adaptable to real-world imbalanced and noisy data conditions.
A Comprehensive Benchmarking Pipeline for Transformer-Based Sentiment Analysis using Cross-Validated Metrics Abidin, Dodo Zaenal; Afuan, Lasmedi; Toscany, Afrizal Nehemia; Nurhadi, Nurhadi
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 4 (2025): JUTIF Volume 6, Number 4, Agustus 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.4.4894

Abstract

Transformer-based models have significantly advanced sentiment analysis in natural language processing. However, many existing studies still lack robust, cross-validated evaluations and comprehensive performance reporting. This study proposes an integrated benchmarking pipeline for sentiment classification on the IMDb dataset using BERT, RoBERTa, and DistilBERT. The methodology includes systematic preprocessing, stratified 5-fold cross-validation, and aggregate evaluation through confusion matrices, ROC and precision-recall (PR) curves, and multi-metric classification reports. Experimental results demonstrate that all models achieve high accuracy, precision, recall, and F1-score, with RoBERTa leading overall (94.1% mean accuracy and F1), followed by BERT (92.8%) and DistilBERT (92.1%). All models exceed 0.97 in ROC-AUC and PR-AUC, confirming strong discriminative capability. Compared to prior approaches, this pipeline enhances result robustness, interpretability, and reproducibility. The provided results and open-source code offer a reliable reference for future research and practical deployment. This study is limited to the IMDb dataset in English, suggesting future work on multilingual, cross-domain, and explainable AI integration.
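To make the stratified 5-fold cross-validation mentioned above more concrete, the sketch below assigns sample indices to folds so that each fold keeps the class proportions of the full dataset. The function `stratified_folds` and the toy labels are hypothetical. The sketch does not shuffle and is not the study's code; a library implementation would normally be used instead.

```python
from collections import defaultdict


def stratified_folds(labels, n_splits=5):
    """Assign each sample index to a fold so every fold keeps roughly the
    same class proportions as the full dataset."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        # Deal the indices of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_splits].append(index)
    return folds


# Hypothetical balanced sentiment labels: 10 positive and 10 negative reviews.
labels = ["pos"] * 10 + ["neg"] * 10
folds = stratified_folds(labels, n_splits=5)
```

Each fold serves once as the held-out test set while the remaining four are used for training, and the reported metric is the mean over the five runs.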
Enhancing Fake News Detection on Imbalanced Data Using Resampling Techniques and Classical Machine Learning Models Abidin, Dodo Zaenal; Siswanto, Agus; Saputra, Chindra; Betantiyo, Betantiyo; Nehemia Toscany, Afrizal
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 5 (2025): JUTIF Volume 6, Number 5, Oktober 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.5.5177

Abstract

Class imbalance remains a critical challenge in fake news detection, particularly in domains such as entertainment media where class distributions are highly skewed. This study evaluates seven resampling techniques—Random Oversampling, SMOTE, ADASYN, Random Undersampling, Tomek Links, NearMiss, and No Resampling—applied to three classical machine learning models: Logistic Regression, Support Vector Machine (SVM), and Random Forest. Using the imbalanced GossipCop dataset comprising 24,102 news headlines, the proposed pipeline integrates TF-IDF vectorization, stratified 3-fold cross-validation, and five evaluation metrics: F1-score, precision, recall, ROC AUC, and PR AUC. Experimental results show that oversampling methods, particularly SMOTE and Random Oversampling, substantially improve minority class (fake news) detection. Among all model–resampling combinations, SVM with SMOTE achieved the highest performance (F1-score = 0.67, PR AUC = 0.74), demonstrating its robustness in handling imbalanced short-text classification. Conversely, undersampling methods frequently reduced recall, especially with ensemble models like Random Forest. This approach enhances model robustness in fake news detection on skewed datasets and contributes a reproducible, domain-specific framework for developing more reliable misinformation classifiers.
Analysis of the Application of Transaction Data with Association Techniques using the Apriori Algorithm in Pharmacy Nasutioni, Wahyudi; Abidin, Dodo Zaenal; Rasywir, Errissya
Journal of Applied Business and Technology Vol. 4 No. 3 (2023): Journal of Applied Business and Technology
Publisher : Institut Bisnis dan Teknologi Pelita Indonesia

DOI: 10.35145/jabt.v4i3.141

Abstract

The development of information technology drives rapid growth in the amount of data collected and stored. Dimas Pharmacy, located at Jl. Segara, Kec. Nipah Panjang, is a public health service that sells various medicines, medical devices, and related goods. This study is expected to benefit the owners of Dimas Nipah Panjang Pharmacy by providing information about consumers' medicine purchasing patterns and by helping them track the medicine supplies available in the warehouse so that items do not run out when needed. The research stages are problem formulation, literature study, data collection, calculation and analysis of associations with the Apriori algorithm, evaluation and analysis of results, and report writing. Through interviews and observations, the authors obtained sales transaction data from Dimas Pharmacy. The pharmacy holds approximately 1,000 sales transactions for May and June, of which the authors used only 216 from May and 141 from June. The authors then selected six item categories: Anti Serotonin / Allergy, Antacid / Ulcer, Antibiotics, Antipyretics, Inflammation, and Hypertension, each of which includes different brands. The results show that the May and June sales transactions yield relationships between purchased items. Using the Apriori association algorithm, a market basket relationship was found between the medicines Pronicy and Dexa, expressed as the rule "IF Buy Pronicy, THEN Buy Dexa". This rule has the highest support and confidence values among all item combinations: the highest support value is 0.15 and the highest confidence value is 0.5.
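The support and confidence values quoted for the rule "IF Buy Pronicy, THEN Buy Dexa" can be read through the small sketch below. The 20 transactions are invented toy data, chosen only so that the arithmetic reproduces the reported 0.15 and 0.5. They are not the pharmacy's records, and the item name "Other" is a placeholder.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical toy data: 3 baskets with both items, 3 with Pronicy only, 14 with neither.
transactions = (
    [{"Pronicy", "Dexa"}] * 3
    + [{"Pronicy"}] * 3
    + [{"Other"}] * 14
)

rule_support = support(transactions, {"Pronicy", "Dexa"})           # 3 / 20
rule_confidence = confidence(transactions, {"Pronicy"}, {"Dexa"})   # 3 / 6
```

Read this way, a support of 0.15 means the two items appear together in 15% of all transactions, and a confidence of 0.5 means half of the baskets containing Pronicy also contain Dexa.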