Articles

XGBoost and Random Forest Optimization using SMOTE to Classify Air Quality
Arifianti, Fidela Putri; Salam, Abu
Advance Sustainable Science, Engineering and Technology Vol 6, No 1 (2024): November-January
Publisher : Universitas PGRI Semarang

DOI: 10.26877/asset.v6i1.18136

Abstract

Air pollution from the growth of industry and motorized vehicles seriously threatens human health. Clean air is essential, but pollutant contamination can cause acute respiratory and other illnesses. Several studies have sought to anticipate air pollution using various algorithms, methods, and data balancing techniques, but further work is needed to obtain better accuracy. This study therefore aims to classify air quality using the XGBoost and Random Forest algorithms and applies the SMOTE technique to overcome data imbalance. The best model is Random Forest with SMOTE and an 80:20 split, which achieves an accuracy of 92.4%, an average AUC of 0.98, and a log loss of 0.2366, showing that SMOTE improved the model's performance in classifying minority classes. Based on these results, the XGBoost and Random Forest models with SMOTE are superior to the models without SMOTE, with accuracy above 90%.
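
As an illustration of the SMOTE balancing step named in this abstract, the following is a minimal sketch of the core idea only: each synthetic minority sample is an interpolation between a minority point and one of its nearest minority neighbours. The toy data, the function name `smote`, and the parameters `k` and `seed` are assumptions for illustration; this is not the authors' code and not a library implementation.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical toy data: 6 majority points and 3 minority points, two features each.
majority = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.4, 0.1)]
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]

# Generate enough synthetic points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay inside the region the minority class already occupies.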

REDUCING UNDER-FETCHING AND OVER-FETCHING IN REST API WITH GRAPHQL FOR WEB-BASED SOFTWARE DEVELOPMENT
Muzaki, Rizki Nuzul; Salam, Abu
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 2 (2024): JUTIF Volume 5, Number 2, April 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.2.1725

Abstract

REST API is the most popular architectural style in web-based software development. However, REST APIs suffer from under-fetching and over-fetching. Under-fetching occurs when the client has to make requests to several endpoints to get the data it needs, while over-fetching occurs when the client receives more data than it needs. GraphQL, an alternative to REST API, has the potential to solve both problems. This research analyzes how quickly GraphQL responds when addressing under-fetching and over-fetching, and analyzes the conditions under which GraphQL is the better choice. Five test scenarios were applied to each of the under-fetching and over-fetching problems. The results show that GraphQL's response speed is 36.84% to 93.04% better than REST API's. For under-fetching, GraphQL is the better choice when more than four endpoints must be called. For over-fetching, REST API provides adequate response speed, but GraphQL is an alternative when a faster response is needed.
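
The two problems defined in this abstract can be made concrete with a small in-memory sketch. It uses no real REST or GraphQL library: the data, the function names (`rest_get_user`, `rest_get_posts`, `graph_query`), and the field names are hypothetical, and `graph_query` only imitates GraphQL-style field selection.

```python
# Hypothetical in-memory data; names and fields are illustrative only.
USERS = {1: {"id": 1, "name": "Ayu", "email": "ayu@example.com", "address": "Semarang"}}
POSTS = {1: [{"id": 10, "title": "Hello", "body": "First post"},
             {"id": 11, "title": "Again", "body": "Second post"}]}

def rest_get_user(user_id):
    """REST-style endpoint: always returns the full user resource."""
    return dict(USERS[user_id])

def rest_get_posts(user_id):
    """REST-style endpoint: always returns the full post resources."""
    return [dict(p) for p in POSTS[user_id]]

def graph_query(user_id, user_fields, post_fields):
    """GraphQL-style resolver: one call that returns only the requested fields."""
    user = {f: USERS[user_id][f] for f in user_fields}
    user["posts"] = [{f: p[f] for f in post_fields} for p in POSTS[user_id]]
    return user

# The client needs only the user's name and the titles of the user's posts.
# REST style: two round trips (under-fetching) that carry unused fields (over-fetching).
rest_calls = [rest_get_user(1), rest_get_posts(1)]
rest_field_count = len(rest_calls[0]) + sum(len(p) for p in rest_calls[1])

# GraphQL style: one round trip that carries exactly the requested fields.
result = graph_query(1, ["name"], ["title"])
graph_field_count = (len(result) - 1) + sum(len(p) for p in result["posts"])

print("REST:", len(rest_calls), "calls,", rest_field_count, "fields")
print("GraphQL-style:", 1, "call,", graph_field_count, "fields")
```

The sketch counts calls and fields only; it says nothing about response time, which is what the study measured.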

PERANCANGAN SISTEM PREDIKSI KELULUSAN MAHASISWA UNIVERSITAS DIAN NUSWANTORO MENGGUNAKAN UNIFIED MODELING LANGUAGE (UML) [Design of a Student Graduation Prediction System for Universitas Dian Nuswantoro Using Unified Modeling Language (UML)]
Zeniarja, Junta; Salam, Abu; Alan Ma’ruf, Farda
Prosiding Seminar Nasional Teknologi Informasi, Mekatronika, dan Ilmu Komputer Vol 1 (2022): Sentimeter 2022
Publisher : Prosiding Seminar Nasional Teknologi Informasi, Mekatronika, dan Ilmu Komputer

Abstract

Students are one of the most important pillars in the life cycle of a university. The number of graduates of a university is often small compared with the number of students admitted in the same academic year. This low graduation rate can be caused by several factors, such as a heavy load of student activities combined with economic factors, among others. A university therefore needs a design or method that can estimate whether a student will graduate on time. Students graduating on time is one of the factors that supports a university's success. The number of students who graduate on time (for undergraduates, within 8 semesters or fewer) should be equal to or higher than the number of students entering the university. If the number of students who do not graduate on time is higher, the volume of academic data for all students still enrolled can surge, which affects the university's image and reputation and may in turn threaten its accreditation score. To address this, a system that can predict student graduation is needed. The subjects of this research are students of Universitas Dian Nuswantoro. The prediction system is designed using Unified Modeling Language (UML) diagrams. It is hoped that this graduation prediction system will run optimally so that it can predict and anticipate early the profiles of Universitas Dian Nuswantoro students who are unlikely to graduate on time, even in the midst of the Covid-19 pandemic.
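
Optimalisasi Model SciBERT dengan Attention-BiLSTM-CRF untuk Pengenalan Entitas Penyakit dalam Teks Biomedis [Optimizing the SciBERT Model with Attention-BiLSTM-CRF for Disease Entity Recognition in Biomedical Text]
Pamungkas, Tahta Arya; Salam, Abu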
Building of Informatics, Technology and Science (BITS) Vol 7 No 1 (2025): June (2025)
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i1.7263

Abstract

This research aims to improve the performance of medical entity recognition in biomedical text by modifying the SciBERT model with Attention-BiLSTM-CRF. Although SciBERT, based on the BERT architecture and trained on biomedical text data, has proven effective in entity recognition, it still has limitations in handling complex medical entities, especially nested entities. As a solution, this research integrates Attention, BiLSTM, and CRF components into the SciBERT model to enhance entity recognition accuracy. Experimental results show that the SciBERT + Attention-BiLSTM-CRF model outperforms the SciBERT model across all key evaluation metrics. Precision improved by 1.7% (from 0.8221 to 0.8364), Recall increased by 2.9% (from 0.8537 to 0.8768), and F1-Score increased by 2.1% (from 0.8372 to 0.8554). These improvements demonstrate that this modification significantly enhances the model's ability to recognize more complex medical entities in biomedical text. The addition of Attention and BiLSTM enriches contextual understanding, while CRF ensures consistency across entity labels. These results indicate that this approach could significantly contribute to automated systems in processing medical data.
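
The remark that a CRF layer keeps entity labels consistent can be illustrated with the decoding step alone. The sketch below runs Viterbi decoding over hand-set scores with a BIO constraint (an `I-DIS` tag may only follow `B-DIS` or `I-DIS`). The label set, the emission scores, and the hard constraint are assumptions for illustration; a trained CRF learns its transition scores, and nothing here reproduces the paper's model.

```python
LABELS = ["O", "B-DIS", "I-DIS"]
NEG_INF = float("-inf")

def transition(prev, cur):
    """Hypothetical transition score: forbid I-DIS unless it follows B-DIS or I-DIS."""
    if cur == "I-DIS" and prev not in ("B-DIS", "I-DIS"):
        return NEG_INF
    return 0.0

def viterbi(emissions):
    """Return the highest-scoring label sequence under emission + transition scores."""
    # best[i][label] = (score of the best path ending in label at token i, previous label)
    best = [{lab: (emissions[0][lab] + transition("START", lab), None) for lab in LABELS}]
    for scores in emissions[1:]:
        column = {}
        for cur in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p][0] + transition(p, cur))
            column[cur] = (best[-1][prev][0] + transition(prev, cur) + scores[cur], prev)
        best.append(column)
    # Backtrack from the best final label to the first token.
    label = max(LABELS, key=lambda lab: best[-1][lab][0])
    path = [label]
    for column in reversed(best[1:]):
        label = column[label][1]
        path.append(label)
    return path[::-1]

# Hypothetical per-token emission scores for "patient has lung cancer".
emissions = [
    {"O": 2.0, "B-DIS": 0.1, "I-DIS": 0.0},
    {"O": 2.0, "B-DIS": 0.2, "I-DIS": 0.1},
    {"O": 1.0, "B-DIS": 0.9, "I-DIS": 0.8},
    {"O": 0.2, "B-DIS": 0.5, "I-DIS": 2.0},
]

# Per-token argmax ignores transitions and can produce an invalid tag sequence.
greedy = [max(LABELS, key=lambda lab: scores[lab]) for scores in emissions]
print("greedy :", greedy)
print("viterbi:", viterbi(emissions))
```

With these scores, the per-token choice places `I-DIS` directly after `O`, while the constrained decoding opens the entity with `B-DIS` first.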

Pengembangan Chatbot Kesehatan Mental Berbasis Web Menggunakan Model Long Short-Term Memory (LSTM) [Development of a Web-Based Mental Health Chatbot Using a Long Short-Term Memory (LSTM) Model]
Ardin, Akbar Ilham; Salam, Abu
Building of Informatics, Technology and Science (BITS) Vol 7 No 1 (2025): June (2025)
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i1.7282

Abstract

Mental health issues such as stress, anxiety, and academic burnout are increasingly prevalent among university students. However, many students remain reluctant or unable to access counseling services due to time limitations, social stigma, and a lack of available professionals. This study aims to develop CuraBot, a web-based chatbot designed to provide preliminary emotional support and mental health education in an instant, anonymous, and easily accessible manner for students. The system was developed using the Long Short-Term Memory (LSTM) algorithm, which is proven to be effective in understanding contextual text-based conversations. The dataset used consists of 1,624 conversational entries across 77 intent classes, adapted and localized from an open-source corpus to reflect the linguistic style and needs of Indonesian students. The development process involved several stages, including data preprocessing (lemmatization, tokenization, stopword removal, and padding), model training using TensorFlow, and deployment into a Flask-based web application. The model was evaluated using a separate test set of 244 entries, resulting in an accuracy of 89.9%, precision of 90.4%, recall of 89.1%, and an F1-score of 89.8%. These results indicate that the model can classify user intent with high accuracy. This research contributes to the development of a contextual, practical, and AI-based digital solution that supports early access to psychological services within university environments.
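
The preprocessing stages listed in this abstract can be sketched in a few lines. The example below covers tokenization, stopword removal, and padding; lemmatization is left out because it needs a language resource. The stopword list, the sample sentences, the `<pad>`/`<unk>` conventions, and the choice to pad at the end are assumptions for illustration, not details taken from the paper.

```python
import re

STOPWORDS = {"i", "am", "the", "a", "so", "about", "my"}  # hypothetical tiny list

def preprocess(text):
    """Lowercase, tokenize on letters and apostrophes, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def build_vocab(token_lists):
    """Map each token to an integer id; 0 is reserved for padding, 1 for unknown words."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncate to max_len, and pad at the end with 0."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

# Hypothetical user messages.
corpus = ["I am so stressed about my exams", "I feel anxious"]
token_lists = [preprocess(s) for s in corpus]
vocab = build_vocab(token_lists)
encoded = [encode(t, vocab, max_len=4) for t in token_lists]
print(token_lists)
print(encoded)
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer followed by an LSTM.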

Enhancing Interpretable Multiclass Lung Cancer Severity Classification using TabNet
Norman, Maria Bernadette Chayeenee; Dewi, Ika Novita; Salam, Abu; Utomo, Danang Wahyu; Rakasiwi, Sindhu
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11417

Abstract

Lung cancer poses a significant global mortality challenge. Early clinical detection is hindered by non-specific symptoms, which makes accurate diagnosis dependent on extracting subtle patterns from often complex medical tabular data. Traditional machine learning approaches often fall short in capturing intricate patterns within such heterogeneous datasets, hindering effective clinical decision support. This research applies TabNet, an interpretable deep learning architecture, to multiclass lung cancer severity prediction (low, medium, high). Utilizing the Kaggle Lung Cancer dataset, our methodology leverages TabNet's attention-based feature selection for end-to-end processing of tabular data, enabling adaptive identification of key predictors and model interpretability. To assess its predictive capabilities and ensure robust performance, the model was trained with default configurations and validated through stratified 5-fold cross-validation, achieving the following performance on the test set: 98.50% accuracy, a 0.98 F1-score, and a 0.9996 macro-AUC-ROC. Beyond its robustness, confirmed by stable learning curves, interpretability analysis highlighted 'Genetic Risk' and 'Shortness of Breath' as dominant factors. Our results underscore TabNet's efficacy as a reliable, robust, and inherently interpretable solution, offering significant potential to improve the precision and transparency of lung cancer severity assessment in clinical practice.
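
The stratified 5-fold cross-validation mentioned in this abstract keeps the class proportions roughly equal in every fold. The sketch below shows one simple way to build such folds in plain Python by dealing each class round-robin across the folds. The labels, the function name `stratified_folds`, and the seed are assumptions for illustration; this is not the authors' pipeline.

```python
import random
from collections import Counter, defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that preserve the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class round-robin across the folds
    return folds

# Hypothetical severity labels: 10 samples per class.
labels = ["low"] * 10 + ["medium"] * 10 + ["high"] * 10
folds = stratified_folds(labels, k=5)

# Each fold serves once as the held-out set; the other four form the training set.
for i, test_idx in enumerate(folds):
    train_idx = [j for f in folds if f is not test_idx for j in f]
    print(i, len(train_idx), dict(Counter(labels[j] for j in test_idx)))
```

Each held-out fold here contains two samples of every class, so a per-fold score is never computed on a fold that is missing a severity level.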