Articles

Comparison of Light Gradient Boosting Machine, eXtreme Gradient Boosting, and CatBoost with Balancing and Hyperparameter Tuning for Hypertension Risk Prediction on Clinical Dataset Murtiningsih, Dewi Ayu; Sari, Bety Wulan; Fajri, Ika Nur
Journal of Applied Informatics and Computing Vol. 9 No. 5 (2025): October 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i5.10400

Abstract

Hypertension is a chronic, highly prevalent condition that contributes significantly to cardiovascular disease, making early identification a crucial preventive measure. This research evaluates three boosting algorithms, eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and CatBoost, for predicting hypertension risk. A publicly accessible dataset of 4,363 samples was used. The pipeline comprised data preprocessing; feature selection through a voting method that integrates Boruta, Recursive Feature Elimination (RFE), and SelectKBest; and class-imbalance handling with the Synthetic Minority Over-sampling Technique (SMOTE) and the Adaptive Synthetic Sampling Approach (ADASYN). The models were then tuned through hyperparameter optimization using GridSearchCV and Repeated Stratified K-Fold Cross Validation. The evaluation results show that all three algorithms had strong predictive performance, with CatBoost leading at an accuracy of 0.992, precision of 0.992, recall of 0.992, F1-score of 0.992, and ROC-AUC of 0.9987. The confusion matrices confirmed that CatBoost had the fewest misclassifications compared with XGBoost and LGBM. In addition, SHapley Additive exPlanations (SHAP) showed that the key factors influencing the prediction of hypertension risk are blood pressure, body mass index (BMI), overall physical activity, waist circumference, triglyceride levels, age, and LDL cholesterol levels, which aligns with established medical knowledge. To facilitate real-world use, the top-performing model was deployed in a user-friendly website interface that lets users predict their hypertension risk interactively. These findings illustrate that boosting algorithms, especially CatBoost, offer an accurate, dependable, and interpretable machine learning method for building hypertension risk prediction systems.
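The voting step in the feature selection described above can be sketched in a few lines. This is a minimal illustration, assuming a simple majority rule (a feature is kept when at least two of the three selectors choose it); the abstract does not state the actual voting rule, and the feature names and selector outputs below are hypothetical.

```python
from collections import Counter


def vote_features(selections, min_votes=2):
    """Keep features chosen by at least `min_votes` of the selectors."""
    # Count each feature once per selector, even if a selector lists it twice.
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical outputs of the three selectors (illustrative names only).
selections = {
    "boruta": ["systolic_bp", "bmi", "age", "waist"],
    "rfe": ["systolic_bp", "bmi", "triglycerides"],
    "select_k_best": ["systolic_bp", "age", "ldl"],
}

selected = vote_features(selections, min_votes=2)
print(selected)
```

Raising `min_votes` to 3 would keep only features on which all selectors agree, which trades coverage for stricter consensus.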
Sentiment Analysis of the Film "JUMBO" on Twitter Using the Naive Bayes Method and Support Vector Machine (SVM) with a Text Mining Approach Widodo, Tegar Robi; Fajri, Ika Nur; Sari, Bety Wulan
Journal of Applied Informatics and Computing Vol. 9 No. 5 (2025): October 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i5.10557

Abstract

This study aims to perform sentiment analysis on reviews of the film “JUMBO” collected from the Twitter platform, using the Naive Bayes and Support Vector Machine (SVM) methods. The data were gathered through a crawling process on Twitter, yielding 2,011 tweets, which were then processed through several pre-processing steps, including case folding, cleaning, normalization, tokenization, stopword removal, and stemming. Subsequently, the data were transformed into numerical representations using TF-IDF, followed by sentiment labeling into positive, negative, and neutral categories. For the Naive Bayes method, training and evaluation were conducted using 5-fold Cross Validation. The results showed that the Naive Bayes model achieved an accuracy of 80.60%, precision of 73.83%, recall of 73.50%, and an F1-score of 69.98%. Meanwhile, the SVM method obtained an accuracy of 75.87%, precision of 76.36%, recall of 62.45%, and an F1-score of 65.64%. Compared to the baseline random classifier, which only achieved an accuracy of 32.47%, both primary methods significantly outperformed it in classifying film review sentiments. The analysis also indicates that the F1-score is lower than the accuracy due to the imbalanced data distribution, with a considerably higher number of positive reviews. This study also presents visualizations of sentiment distribution and word clouds to provide a clearer understanding of audience opinions. The results demonstrate that the Naive Bayes method performs well and has potential for use in sentiment analysis of films on social media platforms. These findings are expected to provide valuable insights for the creative industry, particularly in evaluating audience responses and improving the quality of future film productions.
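The gap between accuracy and F1-score under class imbalance can be reproduced on toy data. The sketch below assumes macro-averaged F1 (the abstract does not say which averaging was used) and uses made-up labels in which positive reviews dominate; it is not the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


# Made-up imbalanced labels: 8 positive, 1 negative, 1 neutral.
y_true = ["positive"] * 8 + ["negative", "neutral"]
# A classifier that always predicts the majority class.
y_pred = ["positive"] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Because the minority classes each contribute an F1 of zero to the average, the macro F1 falls well below the accuracy even though most predictions are correct.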
PREDICTION OF STROKE USING LOGISTIC REGRESSION WITH A MACHINE LEARNING APPROACH Rana Aphrodita, Ishiqa; Nur Fajri, Ika; Nugroho, Agung
JURTEKSI (jurnal Teknologi dan Sistem Informasi) Vol. 11 No. 4 (2025): September 2025
Publisher : Lembaga Penelitian dan Pengabdian Kepada Masyarakat (LPPM) STMIK Royal Kisaran

DOI: 10.33330/jurteksi.v11i4.4161

Abstract

Stroke is one of the leading causes of death and disability worldwide, including in Indonesia. As digital technology develops, the use of machine learning in the health sector is growing, including in efforts to predict the occurrence of stroke. This study implements the Logistic Regression algorithm to predict the likelihood of a person having a stroke based on data from the Brain Stroke dataset. The research process includes data preprocessing (missing value handling, normalization, and label encoding), splitting the data into 80% training data and 20% test data, and model training. The model was then evaluated using accuracy, precision, recall, F1-score, and ROC-AUC, as well as a confusion matrix. The results showed that Logistic Regression classified stroke with an accuracy of 82.4%, precision of 80.1%, recall of 78.6%, F1-score of 79.3%, and a ROC-AUC value of 0.87. The model was then integrated into a Streamlit application so that it can be used interactively to predict stroke risk for new data. The results show that the combination of machine learning and web-based applications has the potential to support early detection of stroke risk.
Keywords: logistic regression; machine learning; prediction; streamlit; stroke.
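For readers unfamiliar with the model, logistic regression can be sketched as a sigmoid over a weighted sum, fitted by gradient descent on the log loss. The example below is a toy single-feature version under stated assumptions: the data, learning rate, and epoch count are invented for illustration, and the study itself presumably used a library implementation on the Brain Stroke dataset.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=500):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(x, w, b, threshold=0.5):
    """Return 1 when the predicted probability reaches the threshold."""
    return int(sigmoid(w * x + b) >= threshold)


# Toy, already-centered feature values with binary labels (not study data).
xs = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(xs, ys)
print(predict(-1.0, w, b), predict(1.0, w, b))
```

Centering the feature, as normalization would do in the preprocessing step, keeps the bias near zero here and lets the weight carry the decision boundary.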
FROZEN FOOD SALES SYSTEM AT DAKON STORE USING FRAMEWORK FOR THE APPLICATION SYSTEM THINKING METHOD Mangli, Luh Ajeng Roro; Fajri, Ika Nur
ZONAsi: Jurnal Sistem Informasi Vol. 6 No. 3 (2024): Publikasi artikel ZONAsi: Jurnal Sistem Informasi Periode September 2024
Publisher : Universitas Lancang Kuning

DOI: 10.31849/zn.v6i3.21796

Abstract

Increasingly advanced information and communication technology has had wide-ranging effects, including a significant dependence on the Internet. This development is disrupting the business sector, especially trade, pushing conventional stores to move online in order to accelerate and increase sales through e-commerce and to reach a wider market. The Dakon frozen food shop, established in 2018, struggles with conventional sales, which require buyers to come to the shop, and with time-consuming manual recording of stock and sales data. To overcome these problems, the author proposes developing a website-based information system using the FAST (Framework for the Application of System Thinking) method, which supports system design, requirements analysis, and building an appropriate system. Implementing this system is expected to expand the reach of buyers, increase sales, and improve governance. With the FAST method, various operational challenges can be overcome more effectively. Payments have become more efficient through automation of the sales process, although the website's appearance still needs improvement to enhance the user experience.
Sistem Rekomendasi Wisata Magelang Menggunakan Metode Collaborative Filtering [Magelang Tourism Recommendation System Using the Collaborative Filtering Method] Siska, Siska; Fajri, Ika Nur; Rayhan, Radhita; Pratama, Akbar; Rohman, Arif Nur
Eksplora Informatika Vol 14 No 1 (2024): Jurnal Eksplora Informatika
Publisher : Institut Teknologi dan Bisnis STIKOM Bali

DOI: 10.30864/eksplora.v14i1.1084

Abstract

Tourism has become a popular activity enjoyed by many people, including in Indonesia, which has many well-known destinations. Magelang, a region in Indonesia, has great tourism potential with a range of attractions, from historical to natural sites. This study discusses the development of a recommendation system for tourist attractions in Magelang using the collaborative filtering method. The data come from kaggle.com and include rating information and user profiles. Age analysis shows high participation from the 21-30 age group, an active segment in tourism. Most users come from the island of Java, which adds a cultural dimension to the study. The research method uses collaborative filtering to generate recommendations of tourist attractions based on user preferences. Testing was carried out on User_Id 1 and produced varied recommendations, with a predicted score of about 3.81 for the top three places. These results show that the recommendation system can help users find destinations that match their preferences. The study's conclusion underlines the potential of recommendation systems to improve the tourism experience and support the development of the tourism sector in Magelang.
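One common form of collaborative filtering predicts a user's rating for an unseen place as a similarity-weighted average of other users' ratings. The sketch below assumes a user-based variant with cosine similarity over co-rated places; the abstract does not specify which variant the study used, and the user IDs, place IDs, and ratings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users over the places both have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    numerator = 0.0
    denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        numerator += sim * other_ratings[item]
        denominator += abs(sim)
    return numerator / denominator if denominator else None


# Hypothetical ratings on a 1-5 scale: user ID -> {place ID: rating}.
ratings = {
    1: {"place_a": 5, "place_b": 4},
    2: {"place_a": 5, "place_b": 4, "place_c": 4},
    3: {"place_a": 2, "place_b": 5, "place_c": 3},
}

predicted = predict_rating(ratings, user=1, item="place_c")
print(round(predicted, 2))
```

User 2 rates the shared places exactly as user 1 does, so user 2's rating of `place_c` carries the most weight in the prediction.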
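Pemanfaatan Sistem Informasi Berbasis Website untuk Mendukung Pengelolaan Administrasi Data Karyawan Yayasan Taruna Alquran Sleman Yogyakarta [Use of a Website-Based Information System to Support the Management of Employee Administrative Data at Yayasan Taruna Alquran Sleman Yogyakarta] Nurmasani, Atik; Dyah Anggita, Sharazita; Dwi Hartanto, Anggit; Pujastuti, Eli; Asti Astuti, Ika; Pristyanto, Yoga; Nur Fajri, Ika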
Jurnal Pengabdian Masyarakat Inovasi Indonesia Vol 3 No 4 (2025): JPMII - Agustus 2025
Publisher : CV Firmos

DOI: 10.54082/jpmii.829

Abstract

Implementing an information system in an institution is important for supporting business processes. Yayasan Taruna Al-Quran wants to make the most of technology in managing the administrative data of its work units. The problems in managing administrative data were limitations in archive management and a suboptimal data search process. A website-based information system was built to address these administrative management problems and to provide easy access for all work units. The method applied in the activity consisted of planning, implementation, and evaluation. The planning stage produced a plan that matches the needs and serves as the basis for implementation. The implementation stage produced an information system ready to be handed over to the partner. The evaluation produced user feedback from the partner on the information system, in which users found the system easy to use, with a score of 5.9 or 86%. The implemented information system helps the partner manage employee administrative data easily. All users can access the data online as needed.
IMPLEMENTATION OF RANDOM FOREST CLASSIFIER FOR STUDENT GRADUATION CLASSIFICATION Zaidan Putra, Bazil; Nur Fajri, Ika; Nugroho, Agung
JURTEKSI (jurnal Teknologi dan Sistem Informasi) Vol. 12 No. 1 (2025): Desember 2025
Publisher : Lembaga Penelitian dan Pengabdian Kepada Masyarakat (LPPM) STMIK Royal Kisaran

DOI: 10.33330/jurteksi.v12i1.4160

Abstract

Higher education plays an essential role in improving the quality of human resources, in part through an institution's ability to monitor and predict student graduation outcomes. This study does not focus on a specific university; it uses the publicly available Students Performance in Exams dataset from Kaggle, which consists of 1,000 student records containing mathematics, reading, and writing scores, along with demographic attributes such as gender, parental education level, lunch type, and test preparation participation. The data were processed through a feature engineering stage that added an average score variable as an early indicator of graduation status. A predictive model was developed using the Random Forest Classifier, achieving an accuracy of 94.5%. The final model was integrated into a Streamlit-based web application to provide an accessible tool for academic stakeholders. The results indicate that the proposed model can serve as an effective decision-support tool for early evaluation of students' likelihood of graduation.
Keywords: prediction; random forest classifier; streamlit; student graduation.
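The feature engineering step, adding an average score as an early indicator of graduation status, might look like the sketch below. The field names and the pass threshold of 60 are assumptions made for illustration; the abstract gives neither the column names nor the cut-off used to derive the graduation label.

```python
# Hypothetical cut-off; the abstract does not state the actual threshold.
PASS_THRESHOLD = 60


def add_average_score(record):
    """Return a copy of the record with an average score and a derived label."""
    scores = (record["math"], record["reading"], record["writing"])
    average = sum(scores) / len(scores)
    return {
        **record,
        "average_score": average,
        "graduated": average >= PASS_THRESHOLD,
    }


# Illustrative student records (not taken from the dataset).
students = [
    {"math": 72, "reading": 72, "writing": 74},
    {"math": 40, "reading": 50, "writing": 45},
]

enriched = [add_average_score(s) for s in students]
for row in enriched:
    print(round(row["average_score"], 2), row["graduated"])
```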
Perancangan dan Implementasi Sistem Informasi Berbasis Website pada Toko Sembako Sayur Amanah [Design and Implementation of a Website-Based Information System at Toko Sembako Sayur Amanah] Radhita Rayhan; Ika Nur Fajri
Jurnal Teknologi Informasi dan Multimedia Vol. 7 No. 1 (2025): February
Publisher : Sekawan Institut

DOI: 10.35746/jtim.v7i1.656

Abstract

In the midst of the rapid development of digital technology, various business sectors, including the trade sector, have begun to adopt digital-based information systems to improve operational efficiency and effectiveness. Toko Sembako Sayur Amanah currently still relies on a manual system for recording transactions, managing stock items, and financial reporting using a cash book. This manual system causes the sales process to be inefficient, time-consuming, and prone to errors such as misrecording or data loss. In addition, the manual system is unable to meet the needs of customers who have limited time and makes it difficult to manage transactions and stock items effectively. To overcome these problems, this research aims to design and implement a website-based information system using the Waterfall method, which includes requirements analysis, system design, implementation, and system testing. Testing is carried out with a Black-box Testing approach to ensure the suitability of system functionality with predetermined needs. The test results show that the developed system has succeeded in increasing the efficiency of managing categories and goods by the admin and making it easier for customers to place orders and make payments online. This research is expected to be a reference for the development of similar systems in other grocery stores with the potential to increase competitiveness in an increasingly competitive market. As a follow-up, this research opens up opportunities for further development, such as integration with mobile applications or more sophisticated inventory management systems.
Penerapan Metode Design Thinking dalam Perancangan UI/UX Website Pintu Rumah [Application of the Design Thinking Method in the UI/UX Design of the Pintu Rumah Website] Roy Wenang Robbani; Ika Nur Fajri
Jurnal Teknologi Informasi dan Multimedia Vol. 7 No. 2 (2025): May
Publisher : Sekawan Institut

DOI: 10.35746/jtim.v7i2.714

Abstract

Property rental is the activity of utilizing property by tenants for a certain period of time at an agreed-upon cost. The types of property rented include residential, commercial, and industrial, which are the choices of many people because of their flexibility. As technology evolves, property searches and transactions are now more accessible through digital platforms such as websites and mobile applications. This platform allows tenants or buyers to get faster and more precise information according to their needs. However, many property rental platforms still face challenges in providing an optimal user experience, such as a complicated interface and a lack of direct interaction between tenants and property owners. This study aims to improve the user experience on property rental platforms by adding an appointment feature that allows direct communication between tenants and property owners. The method used in this study is Design Thinking, which consists of five stages: Empathy, Define, Ideate, Prototype, and Test. The developed prototype was tested using Maze, a real-time user testing platform. The test results show that this platform has a Maze Usability Score (MAUS) of 69.63%, which is classified as “Good”. Although in general the platform can be used well, there are areas that need improvement, such as the high level of click errors in the process of adding properties by the owner. The conclusion of this study is that although the platform functions effectively, there is still room for improvement in terms of clarity and ease of navigation.
Hybrid LexRank-LDA-MMR for Indonesian Text Summarization Muis, Nasrul Amin; Pristyanto, Yoga; Fajri, Ika Nur
Jurnal Nasional Teknologi dan Sistem Informasi Vol 12 No 1 (2026): April 2026
Publisher : Departemen Sistem Informasi, Fakultas Teknologi Informasi, Universitas Andalas

DOI: 10.25077/TEKNOSI.v12i1.2026.97-104

Abstract

The rapid growth of digital text makes clear the need for automated summarization tools that support fast information retrieval. This study combines three extractive methods, LexRank, Latent Dirichlet Allocation (LDA), and Maximal Marginal Relevance (MMR), with the aim of producing better Indonesian text summaries than standard LexRank alone. LexRank selects salient sentences based on their centrality in the sentence graph, while LDA ensures that the selected sentences are topically relevant. MMR balances relevance and diversity, which reduces redundancy in the summaries. Summaries from two publicly available Indonesian-language datasets, IndoSum and Liputan6, were generated at 30% and 50% compression levels and evaluated with ROUGE (ROUGE-1, ROUGE-2, and ROUGE-L F1 scores). Analysis of 5,000 articles per dataset showed that combining LexRank and LDA with MMR yielded a higher average ROUGE score than standard LexRank at both compression levels and on both datasets, demonstrating the effectiveness of the approach. The improvements are largest in ROUGE-1 and ROUGE-2, which indicates that the combined approach can produce more informative and relevant summaries while preserving sentence-level diversity, making the summarized information easier to understand.
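The MMR step can be sketched as a greedy loop that repeatedly picks the sentence with the best trade-off between relevance and redundancy. In the sketch below, the relevance scores stand in for whatever LexRank and LDA would produce (the values are hypothetical), Jaccard word overlap is an assumed similarity measure, and the trade-off weight `lam=0.7` is an illustrative choice, none of which is stated in the abstract.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def mmr_select(sentences, relevance, k, lam=0.7):
    """Greedily pick k sentence indices, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def mmr_score(i):
            # Redundancy is the highest similarity to any already-picked sentence.
            redundancy = max(
                (jaccard(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy sentences with hypothetical relevance scores.
sentences = [
    "the flood hit the city",
    "the flood hit the city center",
    "officials opened shelters",
]
relevance = [0.9, 0.85, 0.6]

summary_ids = mmr_select(sentences, relevance, k=2)
print([sentences[i] for i in summary_ids])
```

The second sentence is nearly as relevant as the first but overlaps heavily with it, so the redundancy penalty lets the less relevant but novel third sentence take the second slot.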