Contact Name
Mesran
Contact Email
mesran.skom.mkom@gmail.com
Phone
-
Journal Mail Official
jurnal.bits@gmail.com
Editorial Address
-
Location
Kota medan,
Sumatera utara
INDONESIA
Building of Informatics, Technology and Science
ISSN : 26848910     EISSN : 26853310     DOI : -
Core Subject : Science,
Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research results in information technology and computing. Papers submitted to this journal are checked for plagiarism and peer-reviewed to maintain quality. The journal is managed by Forum Kerjasama Pendidikan Tinggi (FKPT) and is published 2 times a year, in June and December. It is intended to foster research and make a real contribution to improving research resources in the field of information technology and computing.
Arjuna Subject : -
Articles 889 Documents
Model Klasifikasi Cerdas Gangguan Tidur Berbasis Machine Learning Random Forest pada Data Kesehatan dan Perilaku Harian Ni'mah, Laila Maulin; Kurniawan, Defri
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8631

Abstract

Sleep disorders, such as insomnia and sleep apnea, have become a significant health issue in the modern era, driven by the demands of lifestyle changes. This condition highlights the urgent need for early detection tools that are not only accurate but also easily accessible to the general public. This research aims to design and implement an intelligent classification system to automatically identify the risk of sleep disorders based on health and daily behavior data. To achieve this goal, this study applies a machine learning method using the Random Forest algorithm, which was chosen for its reliable ability to handle complex and non-linear data relationships. The data used is the "Sleep Health and Lifestyle Dataset" sourced from the Kaggle platform, covering 374 respondents with 13 relevant features. The research process included data pre-processing steps to ensure input quality, model training, and rigorous performance evaluation. The evaluation results on the test data show that the developed Random Forest model exhibited very solid performance, successfully achieving an accuracy rate of 91% and a weighted average F1-Score of 0.90. This F1-Score metric, which balances precision and recall, confirms that the model is not only accurate but also has a balanced performance in detecting each class, which is crucial for health classification. Furthermore, the feature importance analysis confirmed that Stress Level, BMI Category, and Heart Rate are the three most dominant predictor factors. The culmination of this research is the successful implementation of this predictive model into an interactive web application developed using the Streamlit framework. This application allows users to independently input their health data and receive feedback in the form of a real-time risk prediction. With an intuitive interface and easy-to-understand results, this application serves as a practical and informative initial screening tool for personal sleep health analysis.
Perbandingan Kinerja Model IndoBERT, IndoBERTweet, dan Algoritma Klasik pada Analisis Sentimen Isu Indonesia Gelap Alvin, Fris; Winarsih, Nurul Anisa Sri
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8636

Abstract

This study compares the performance of two Transformer-based models, IndoBERT and IndoBERTweet, with three classical machine learning algorithms, Support Vector Machine (SVM), Logistic Regression, and Random Forest, in analyzing public sentiment regarding the “Indonesia Gelap” issue that has been widely discussed on social media. The dataset was collected by crawling TikTok user comments containing keywords related to the issue, yielding 5,000 comments. After preprocessing, 4,667 comments were deemed suitable for analysis and were labeled as positive, negative, or neutral using a lexicon-based approach. To address the imbalance in class distribution, three scenarios were compared: no oversampling, oversampling before data splitting, and oversampling after data splitting applied only to the training data. Each model was evaluated using four performance metrics: accuracy, precision, recall, and F1-score. Oversampling before data splitting yielded the highest scores across all models, with IndoBERT achieving the highest F1-score of 0.93, followed by IndoBERTweet with 0.91, while the classical algorithms achieved average F1-scores ranging from 0.89 to 0.90. Both the no-oversampling scenario and oversampling after data splitting resulted in lower performance, with average F1-scores ranging from 0.70 to 0.78. These findings indicate that Transformer-based models are more effective in capturing informal language characteristics commonly found in social media comments. Furthermore, balancing the dataset before model training significantly improves the stability and performance of sentiment classification on imbalanced data.
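The third scenario in this abstract, oversampling after splitting and only on the training data, can be sketched as follows. This is a minimal illustration and not the authors' code: the abstract does not name the oversampling method, so plain random duplication is assumed here, and the function name, toy comments, and class counts are all hypothetical.

```python
import random
from collections import Counter


def split_then_oversample(samples, labels, test_ratio=0.2, seed=42):
    """Split first, then balance classes in the training part only."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]

    # Group the training rows by class label.
    by_class = {}
    for text, label in train:
        by_class.setdefault(label, []).append((text, label))
    target = max(len(rows) for rows in by_class.values())

    # Duplicate randomly chosen rows until every class reaches the
    # size of the largest class (random oversampling, an assumption).
    balanced = []
    for rows in by_class.values():
        balanced.extend(rows)
        balanced.extend(rng.choices(rows, k=target - len(rows)))
    rng.shuffle(balanced)
    return balanced, test


# Hypothetical toy data with an imbalanced label distribution.
labels = ["negative"] * 60 + ["neutral"] * 30 + ["positive"] * 10
samples = [f"comment {i}" for i in range(len(labels))]

train, test = split_then_oversample(samples, labels)
train_counts = Counter(label for _, label in train)
test_texts = {text for text, _ in test}
train_texts = {text for text, _ in train}
print(train_counts)
print(len(test), "untouched test comments")
```

Because duplicates are drawn only from the training rows, no test comment appears in the training set, which is what separates this scenario from oversampling before the split.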
Optimasi Algoritma Decision Tree Menggunakan GridSearchCV untuk Klasifikasi Tipe Obesitas Laurent, Feby; Winarno, Sri; Dewi, Ika Novita
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8638

Abstract

The rise in obesity cases in various countries, including Indonesia, has become a serious public health problem because it increases the risk of chronic diseases and affects individuals' psychological well-being. One of the main challenges in obesity management is that obesity types differ between individuals and are influenced by various factors. Therefore, accurate classification methods are needed to ensure more targeted treatment. In this context, machine learning is a potential solution for classifying obesity types. However, variations in individual characteristics make the classification process complex, as models often struggle to accurately distinguish obesity types. To address this problem, the Decision Tree algorithm was chosen because its results are easy to interpret. However, using Decision Tree with default parameters on datasets with many attributes and high variation tends to cause overfitting and decrease accuracy. Furthermore, Decision Tree performance depends heavily on hyperparameter settings, so optimization is required to achieve good results. This study therefore optimizes the Decision Tree algorithm using GridSearchCV to find the best-performing parameters for obesity type classification. The dataset is from the UCI Machine Learning Repository and consists of 2,111 rows and 17 attributes. In the initial test, the default model achieved 92.58% accuracy, 92.58% recall, 92.66% precision, and a 92.56% F1-score. After optimization, the model achieved 95.69% accuracy, 95.69% recall, 95.72% precision, and a 95.67% F1-score. The 3.11 percentage-point increase in accuracy demonstrates the effectiveness of GridSearchCV in improving Decision Tree performance, resulting in a more accurate and stable prediction model. This research is expected to serve as a basis for decision-making in the early detection, prevention, and treatment of obesity.
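The tuning workflow described in this abstract can be sketched with scikit-learn as below. This is an illustration under stated assumptions, not the study's configuration: the data are synthetic (`make_classification`, not the UCI obesity dataset), and the parameter grid, the 80:20 split, and the 5-fold setting are hypothetical choices, since the abstract does not list them.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 16 features, 4 classes.
X, y = make_classification(
    n_samples=600, n_features=16, n_informative=8,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Baseline: Decision Tree with default hyperparameters.
baseline = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Hypothetical search grid; the abstract does not state the real one.
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [None, 5, 10, 20],
    "min_samples_leaf": [1, 2, 5],
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,                 # 5-fold cross-validation on the training set
    scoring="accuracy",
)
search.fit(X_train, y_train)

baseline_acc = baseline.score(X_test, y_test)
tuned_acc = search.score(X_test, y_test)  # uses the refit best estimator
print("best parameters:", search.best_params_)
print(f"default accuracy: {baseline_acc:.4f}, tuned accuracy: {tuned_acc:.4f}")
```

The grid search selects parameters by cross-validated accuracy on the training data only, so the held-out test accuracy remains an independent check. On other data the tuned model is not guaranteed to beat the default one.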
Analisis Sentiment Pengguna X Terhadap Hilirisasi Kemenyan Menggunakan Algoritma Naïve Bayes Utama, Farrel Rizki; Suaidah, Suaidah
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8639

Abstract

Frankincense downstreaming in Indonesia is a significant issue because it has the potential to increase the economic value of a local commodity and the welfare of the community, especially farmers. However, public perception of this downstreaming policy remains diverse and has rarely been analyzed scientifically, especially on social media. This study therefore analyzes the sentiment of users of the social media platform X toward frankincense downstreaming using the Naïve Bayes algorithm. The research data were obtained by crawling with the Twitter API using the keywords "Frankincense Downstreaming" and "Downstreaming", resulting in 1,844 tweets. The data then went through a preprocessing stage comprising cleaning, case folding, normalization, tokenizing, stopword removal, and stemming, leaving 1,790 tweets ready for analysis. Sentiment labeling was carried out using a lexicon-based approach with three categories: positive, negative, and neutral. Features were represented using the TF-IDF method, and the data were then classified using the Naïve Bayes algorithm. The test results show that the Naïve Bayes algorithm classifies sentiment well, with the highest precision in the negative class at 0.90 and the highest recall in the neutral class at 0.92. The majority of X users showed neutral sentiment toward frankincense downstreaming at 55.20%, followed by positive at 26.03% and negative at 18.77%.
Comparison of XGBoost and LSTM in Knowledge Discovery for GrokAI Mobile Application Sentiment Analysis Risyahputri, Aliyananda; Kurniawan, Dedy; Tania, Ken Ditha
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8651

Abstract

Generative AI has delivered tangible benefits in key areas of the public sector. However, the rapid expansion of AI assistant services also raises concerns about whether newly released products can consistently meet user expectations, especially as negative experiences are increasingly expressed through public reviews. The technology's positive impact has fueled competition among AI assistant developers, including xAI, which has released the Grok AI application. As a relatively new product with over 50 million downloads, GrokAI needs to be evaluated to maintain its competitiveness. This study therefore analyzes user sentiment toward the GrokAI application through reviews on the Google Play Store and compares the performance of machine learning and deep learning classification models within the Knowledge Discovery in Databases (KDD) framework. The study uses 11,108 reviews labeled with the VADER lexicon method, resulting in 7,633 positive reviews and 3,475 negative reviews. The data are then used to evaluate XGBoost (Extreme Gradient Boosting) and LSTM (Long Short-Term Memory) models. XGBoost performs slightly better, with an accuracy of 87.22% compared with 86.58% for LSTM. However, both models perform markedly worse on the negative class because of the large difference in class sizes. The knowledge discovery process reveals that most positive sentiment appreciates the free access and general functions of the application, while negative sentiment focuses on complaints about response time, output quality, and specific features such as image and voice. The main recommendation is to maintain the advantage of free access while improving features and processing logic to sustain loyalty and service quality. Future research should test models on more balanced data and refine dataset cleaning to improve accuracy on the minority class.
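To put the reported accuracies in context, the class counts given in this abstract allow a quick majority-class baseline calculation. The sketch below is illustrative only: it uses the full-dataset counts, whereas the reported accuracies come from a test split whose class mix the abstract does not state, and the function name is hypothetical.

```python
def majority_baseline_accuracy(class_counts):
    """Accuracy of always predicting the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())


# Class counts as reported in the abstract.
counts = {"positive": 7633, "negative": 3475}

total_reviews = sum(counts.values())
baseline = majority_baseline_accuracy(counts)
imbalance_ratio = counts["positive"] / counts["negative"]

print(f"total reviews: {total_reviews}")
print(f"majority-class baseline accuracy: {baseline:.4f}")
print(f"positive-to-negative ratio: {imbalance_ratio:.2f}")
```

A classifier that always answers "positive" would already be right on roughly two thirds of these reviews, which is why per-class precision and recall for the negative class are more informative than overall accuracy here.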
Prediksi dan Optimalisasi Konsumsi Energi Smart Atmospheric Water Generator (SAWG) Menggunakan XGBoost Regression Wiradinata, Halim Jayakusuma; Santoso, Heru Agus
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8655

Abstract

The decreasing availability of clean water has motivated the use of Smart Atmospheric Water Generator (SAWG) systems as an alternative water source, but their electrical energy consumption fluctuates with ambient conditions and operating patterns. This study develops a predictive model of SAWG energy consumption (kWh) using Extreme Gradient Boosting (XGBoost) and demonstrates a prediction-based operational optimization scheme for energy-efficient scheduling. The SAWG logging dataset (1,601 rows, 9 variables) is preprocessed through missing-value handling, numeric conversion, and noise/outlier detection, resulting in 1,313 usable records. The feature set includes environmental parameters, electrical signals, and time features: hour of day, day of week, and month. Modeling employs chronological time-based splits (80:20 as the main configuration and 60:40 as a robustness check), Time Series Cross-Validation on the training block, and hyperparameter tuning via GridSearchCV. Evaluation on the hold-out test sets shows that the model’s performance in a strict time-series setting remains limited: for the 80:20 split, the test results are approximately MAE = 23.16 kWh, MSE = 648.93 kWh², and R² = −0.22, while for the 60:40 split they are MAE = 27.21 kWh, MSE = 932.17 kWh², and R² = −1.75. A negative R² means the model's squared error on the test block is larger than that of a constant predictor equal to the test-set mean. Although the model cannot yet explain the overall variance of energy consumption satisfactorily, it can still be used to rank hours by predicted energy. In the prediction-based operational optimization stage, hourly model outputs are fed into a Greedy Scheduler that selects H = 8 operating hours with the lowest predicted energy. Compared with a naive schedule, which yields a total predicted energy of 47.493 kWh over the simulation horizon, the greedy schedule achieves 43.134 kWh, corresponding to an estimated saving of about 9.18%. These results indicate that prediction-based scheduling can reduce SAWG energy consumption without modifying the device hardware and can be further developed as a decision-support component for SAWG operation.
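The greedy scheduling step and the saving calculation in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the hourly predictions are made-up values, the definition of the naive schedule as a fixed block of hours is an assumption, and the function names are hypothetical. Only the two totals passed to `saving_percent` at the end come from the abstract.

```python
def greedy_schedule(predicted_kwh, hours_needed=8):
    """Pick the hours with the lowest predicted energy."""
    ranked = sorted(range(len(predicted_kwh)), key=lambda h: predicted_kwh[h])
    chosen = sorted(ranked[:hours_needed])
    total = sum(predicted_kwh[h] for h in chosen)
    return chosen, total


def saving_percent(naive_total, greedy_total):
    """Relative saving of the greedy schedule over the naive one."""
    return (naive_total - greedy_total) / naive_total * 100


# Hypothetical predicted energy (kWh) for each hour of one day.
predicted = [
    6.1, 5.8, 5.2, 5.0, 5.1, 5.6, 6.4, 7.0,
    7.5, 8.1, 8.6, 9.0, 9.2, 9.1, 8.8, 8.3,
    7.9, 7.4, 7.0, 6.8, 6.6, 6.4, 6.3, 6.2,
]

hours, greedy_total = greedy_schedule(predicted, hours_needed=8)

# Assumed naive schedule: a fixed block of 8 consecutive hours.
naive_total = sum(predicted[8:16])

# Saving recomputed from the totals reported in the abstract.
reported_saving = saving_percent(47.493, 43.134)

print("selected hours:", hours)
print(f"toy saving: {saving_percent(naive_total, greedy_total):.2f}%")
print(f"saving from reported totals: {reported_saving:.2f}%")
```

The scheduler only needs the predictions to order hours correctly, not to be accurate in absolute terms, which is how the abstract motivates using the model despite its negative R².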
Implementasi dan Evaluasi Model Machine Learning untuk Optimalisasi Prediksi Penjualan Produk Kue Kering Hilmi, Muhammad Abror Auliya; Susanto, Ajib
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8657

Abstract

Modern retailers such as Transmart face difficulties in maintaining stable sales performance due to changes in consumer behavior, variations in product types, and differing store characteristics. To address this issue, this study proposes the use of the Extreme Gradient Boosting (XGBoost) machine learning algorithm to predict retail product sales volumes based on historical data from 2024–2025. The research follows the CRISP-DM framework, which consists of the following stages: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. Data preparation involves data cleaning, label encoding, feature selection, and data splitting with an 80:20 ratio. The model is evaluated using the Mean Absolute Error (MAE) and the coefficient of determination (R²) to assess prediction accuracy. The findings indicate that XGBoost can capture sales patterns and generate accurate predictions to support decision-making in the retail sector, particularly in stock planning and sales optimization. The implementation of this data-driven predictive approach is therefore expected to help companies enhance operational management and improve their competitiveness in the market.
Analisis Komparatif Kinerja Algoritma Machine Learning untuk Deteksi Status Gizi Balita Sabrina, Della; Kurniawan, Defri
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8668

Abstract

Nutritional status in children under five years of age is a key indicator of their overall health, growth, and development. Conventionally, nutritional status is determined through manual measurements and interpretation of anthropometric tables, which is time-consuming and prone to human error. With advances in technology, machine learning-based approaches can be used to help classify nutritional status more quickly, objectively, and accurately, thereby supporting decision-making in public health. This study analyzes and compares the performance of three machine learning algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Decision Tree (DT), in classifying the nutritional status of toddlers using anthropometric data that includes variables such as age, gender, weight, and height. The nutritional status categories for the toddler weight dataset are Severely Underweight, Underweight, Normal, and Overweight. The categories for the height dataset are Severely Stunted, Stunted, Normal, and Tall. The research stages included data preprocessing, data splitting into training and testing sets, and model performance assessment using the accuracy, precision, recall, and F1-score metrics. On the toddler height dataset, the K-Nearest Neighbors (KNN) algorithm performed best, with an accuracy of 99.91%. This exceeded the Decision Tree, which achieved an accuracy of 99.89%, and the SVM (RBF) algorithm, which achieved 98.48%.
Perbandingan Kinerja Naive Bayes, Support Vector Machine dan Random Forest Untuk Analisis Sentimen Aplikasi Brimo Darwin, Amelia; Lestarini, Dinda; Seprina, Iin
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8697

Abstract

The development of financial technology has driven the increasing use of mobile banking, including BRImo, owned by Bank Rakyat Indonesia (BRI). However, user reviews on the Google Play Store show various complaints such as login difficulties, system errors, and failed transactions. This study analyzes BRImo user sentiment using three machine learning algorithms: Naive Bayes, Support Vector Machine (SVM), and Random Forest. A total of 4,996 reviews were obtained through web scraping and labeled by rating, with ratings 1-3 treated as negative and 4-5 as positive. The labeling process produced 4,123 positive reviews and 873 negative reviews, which were then balanced using the Synthetic Minority Oversampling Technique (SMOTE). Feature extraction was performed using TF-IDF. Test results showed that Random Forest provided the best performance, with an accuracy of 0.87, a recall of 0.70 and an F1-score of 0.65 for the negative class, and an F1-score of 0.92 for the positive class. Its macro F1-score reached 0.79, higher than SVM (0.69) and Naive Bayes (0.70). This finding indicates that Random Forest is more effective in classifying BRImo user sentiment, especially after data balancing, and can serve as a reference for developers in improving the quality of application services.
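The rating-based labeling rule and the macro F1-score mentioned in this abstract can be illustrated with a short sketch. The function names are hypothetical and the example ratings are made up; only the rule (1-3 negative, 4-5 positive) and the two per-class F1 values come from the abstract.

```python
def label_from_rating(rating):
    """Map a 1-5 star rating to a sentiment label (1-3 negative, 4-5 positive)."""
    if not 1 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    return "negative" if rating <= 3 else "positive"


def macro_f1(per_class_f1):
    """Unweighted mean of the per-class F1-scores."""
    return sum(per_class_f1.values()) / len(per_class_f1)


# Hypothetical ratings, one for each star level.
labels = [label_from_rating(r) for r in [1, 2, 3, 4, 5]]

# Per-class F1-scores reported for Random Forest in the abstract.
rf_macro = macro_f1({"negative": 0.65, "positive": 0.92})

print(labels)
print(f"macro F1 from reported class scores: {rf_macro:.3f}")
```

Averaging the two rounded class scores gives about 0.785, which is consistent with the reported macro F1-score of 0.79 once rounding of the per-class values is taken into account. Because the macro average weights both classes equally, the weaker negative-class score pulls it well below the 0.87 accuracy.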