Contact Name
Mesran
Contact Email
mesran.skom.mkom@gmail.com
Phone
-
Journal Mail Official
jurnal.bits@gmail.com
Editorial Address
-
Location
Kota Medan,
Sumatera Utara
INDONESIA
Building of Informatics, Technology and Science
ISSN : 2684-8910     EISSN : 2685-3310     DOI : -
Core Subject : Science
Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research results in information technology and computing. Submitted papers are checked for plagiarism and peer-reviewed to maintain quality. The journal is managed by Forum Kerjasama Pendidikan Tinggi (FKPT) and published twice a year, in June and December. It is intended to advance research and make a real contribution to improving research resources in information technology and computing.
Arjuna Subject : -
Articles 777 Documents
Perbandingan Multi Algoritma Klasifikasi dan Tuning Parameter untuk Prediksi Ketergantungan Skincare Berbasis Streamlit Kuncoro, Aneira Vicentiya; Ni’mah, Laila Maulin; Faisal, Edi
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7897

Abstract

The use of skincare products in Indonesia has increased significantly along with growing public awareness of the importance of skincare, but this also raises signs of dependence behaviour that need to be anticipated, especially among young people. This research aims to build a skincare dependency prediction system based on demographic, psychological and behavioural attributes collected through an online survey. In addition, five classification algorithms (Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors, and Logistic Regression) were compared to determine the most accurate and efficient model for predicting dependency tendency. The data were normalised and categorical features were transformed with One-Hot Encoding; the models were then evaluated using accuracy, precision, recall, and F1-score metrics. The results showed that the Decision Tree algorithm performed best, with accuracy reaching 87% and the advantage of model interpretability. The model was then implemented as an interactive Streamlit-based web application that allows users to make predictions independently and in real time. The contribution of this research is a prediction system that supports education and wiser decision-making in the use of skincare, and that opens up opportunities to apply machine learning technology to other issues.
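The preprocessing named in this abstract (normalisation plus One-Hot Encoding) can be illustrated with a minimal pure-Python sketch. The sample records and the `skin_type` attribute are hypothetical, and min-max scaling is assumed here because the abstract does not say which normalisation was used.

```python
def min_max_normalise(values):
    """Scale numeric values to the range [0, 1] (assumed normalisation)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(values):
    """Return (categories, rows); each row holds a single 1 marking its category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical survey records, not data from the study.
ages = [18, 22, 30]
skin_types = ["oily", "dry", "oily"]

scaled_ages = min_max_normalise(ages)
categories, encoded = one_hot_encode(skin_types)

# One numeric feature followed by the one-hot columns for each record.
features = [[a] + row for a, row in zip(scaled_ages, encoded)]
```

Each resulting row can then be passed to any of the five classifiers being compared.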
Klasifikasi Website Phishing Menggunakan Metode X-Gboost dengan Teknik Penyeimbang Data Radial Based Undersampling Yoga, Yoga; Umbara, Fajri Rakhmat; Sabrina, Puspita Nurul
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7920

Abstract

Phishing websites are one of the most prevalent forms of cyberattacks and have the potential to cause significant losses, both financially and non-financially. Automatic phishing detection using machine learning algorithms has become an effective solution to address this threat. This study aims to classify phishing websites using the Extreme Gradient Boosting (XGBoost) algorithm and to address the issue of class imbalance by applying the Radial Based Undersampling (RBU) method. In addition, hyperparameter tuning was performed using the Random Search method to optimize the model's performance. The dataset used was obtained from the Kaggle platform and exhibits an imbalanced class distribution, where the number of non-phishing instances far exceeds phishing instances. This imbalance can lead to a biased model and reduce its ability to detect minority class patterns. Based on the evaluation results, the application of RBU significantly improved the model’s capability in detecting phishing instances, while hyperparameter tuning further enhanced its accuracy. The best model was achieved through a combination of RBU and Random Search, reaching an accuracy of 90.39% on the test data. These findings indicate that the combined approach of data balancing and model optimization provides an effective solution for phishing website classification and can be applied to similar cases in the field of cybersecurity.
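The Random Search tuning mentioned in this abstract can be sketched as follows. This is an illustration only: the search space and the `toy_objective` function are hypothetical stand-ins, whereas in the study the score would come from training and validating an XGBoost model.

```python
import random


def random_search(objective, space, n_iter=20, seed=0):
    """Sample n_iter random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently and uniformly.
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space; the abstract does not list the tuned parameters.
space = {
    "max_depth": [3, 5, 7, 9],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "n_estimators": [100, 200, 400],
}


def toy_objective(params):
    """Stand-in for a validation score; peaks at max_depth=5, learning_rate=0.1."""
    return 1.0 - abs(params["max_depth"] - 5) / 10 - abs(params["learning_rate"] - 0.1)


best_params, best_score = random_search(toy_objective, space, n_iter=30, seed=42)
```

Unlike an exhaustive grid, the number of evaluations is fixed by `n_iter` regardless of how many values each hyperparameter can take.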
Development of an Expert System to Detect Mental Disorders in Pregnant Women using Forward and Backward Chaining Methods Dela, Monisa; Darnila, Eva; Rosnita, Lidya
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7930

Abstract

Mental health during pregnancy plays a critical role in fetal development and maternal well-being. However, psychological conditions such as depression, stress, and anxiety in pregnant women often go undetected, especially in primary healthcare settings. This research aims to design and develop a web-based expert system capable of diagnosing the mental health conditions of pregnant women using Forward Chaining and Backward Chaining inference techniques. Forward Chaining is applied to infer possible conditions based on reported symptoms, while Backward Chaining is used to validate hypotheses by tracing required supporting symptoms. The system was developed using patient data collected from three health centers in Lhokseumawe City, totaling 500 records with parameters including name, age, gestational age, number of children, and reported complaints. It incorporates 30 symptoms and 9 diagnostic rules to classify the mental condition and its severity. The results indicate that 179 women were diagnosed with depression (mild 107, moderate 33, severe 39), 150 with anxiety (mild 24, moderate 91, severe 35), and 171 with stress (mild 82, moderate 50, severe 39). The system also demonstrates diagnostic probability (e.g., 66.67% in a specific case). Validation using 20 test cases yielded an accuracy of 85%, showing the system performs reliably in aligning symptoms with diagnostic outcomes. This study makes two significant contributions. Practically, it offers a decision-support tool for midwives and general practitioners to perform early mental health screening of pregnant women, especially in regions lacking access to psychiatric specialists. Scientifically, it demonstrates the effectiveness of a hybrid reasoning approach in handling overlapping psychological symptoms and in assessing severity levels, thereby enriching the development of domain-specific expert systems in maternal mental health.
In conclusion, this system provides a practical and accessible solution to support early detection and intervention in maternal mental health, ultimately contributing to improved health outcomes for both mothers and their babies.
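The two inference directions described in this abstract can be sketched in a few lines. The rules and symptom names below are hypothetical and are not the study's 30 symptoms or 9 diagnostic rules; the sketch also assumes the rule base contains no cycles.

```python
# Hypothetical rule base: (set of premises, conclusion).
RULES = [
    ({"persistent_sadness", "loss_of_interest"}, "low_mood"),
    ({"low_mood", "sleep_disturbance"}, "depression"),
    ({"excessive_worry", "restlessness"}, "anxiety"),
]


def forward_chain(facts, rules=RULES):
    """Data-driven: fire rules repeatedly until no new conclusion can be added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules=RULES):
    """Goal-driven: a goal holds if it is a reported fact, or if some rule
    concluding it has premises that can all be proven in turn."""
    if goal in facts:
        return True
    return any(
        all(backward_chain(p, facts, rules) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )


reported = {"persistent_sadness", "loss_of_interest", "sleep_disturbance"}
derived = forward_chain(reported)                       # symptoms to conditions
confirmed = backward_chain("depression", reported)      # hypothesis to symptoms
```

Forward chaining proposes candidate conditions from the symptoms, and backward chaining then checks whether a given hypothesis is supported by them, which mirrors the division of roles the abstract describes.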
Klasifikasi Churn Dengan Algoritma Xgboost Menggunakan Feature Selection Boruta-Shap Hadi Sakaro, Dwi Wahyu Kuncoro; Shabrina, Puspita Nurul; Ramadhan, Edvin
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7965

Abstract

Customer churn is a critical issue for telecommunications companies, as it directly impacts revenue and business sustainability. This study proposes the development of a churn prediction model using the Extreme Gradient Boosting (XGBoost) algorithm combined with the Boruta feature selection method and SHAP (SHapley Additive exPlanations)-based feature interpretation. The dataset used is the Telco Customer Churn dataset from Kaggle, consisting of 7,043 customer records and 21 features. The research stages include data preprocessing, data transformation, an 80:20 train-test split, data balancing using SMOTE, feature selection with Boruta, feature interpretation with SHAP, and classification using XGBoost. The model’s performance was evaluated using accuracy, precision, recall, and F1-score metrics. Results show that the XGBoost model with Boruta-SHAP (Model B) achieved an accuracy of 0.7576, slightly higher than the model without feature selection (Model A), which achieved 0.7512. Model B also demonstrated improved performance for the majority class (non-churn), with recall increasing from 0.76 to 0.79 and F1-score from 0.82 to 0.83. However, for the minority class (churn), recall decreased from 0.72 to 0.66, although precision increased from 0.52 to 0.54. These findings indicate that integrating Boruta-SHAP can enhance model efficiency and interpretability, but additional strategies are required to maintain performance for the minority class.
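The per-class precision, recall, and F1-score figures reported here follow from standard definitions, which the sketch below computes by hand. The label vectors are toy values chosen for illustration and are unrelated to the Telco dataset.

```python
def class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels: 1 = churn (minority class), 0 = non-churn (majority class).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]

churn_scores = class_metrics(y_true, y_pred, positive=1)
non_churn_scores = class_metrics(y_true, y_pred, positive=0)
overall_accuracy = accuracy(y_true, y_pred)
```

Reporting the scores per class, as the abstract does, makes it visible when a gain on the majority class is paid for with lower recall on the minority class.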
Analisis Sentimen Terhadap Cyberbullying di Twitter (X) Menggunakan Improved Word Vectors dan Bert Nusantara, Madya Dharma; Umbara, Fajri Rakhmat; Sabrina, Puspita Nurul
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7968

Abstract

Text mining is an important approach in analyzing text data, particularly for detecting negative sentiments such as cyberbullying on social media. Twitter (X), as an open platform, often serves as a space for the proliferation of hate speech and abusive behavior recorded in text form. This study aims to improve the performance of sentiment classification models on Twitter (X) data by combining the Improved Word Vector (IWV) and Bidirectional Encoder Representations from Transformers (BERT) methods, evaluated using precision, recall, and F1-score metrics. The dataset consists of 9,874 Indonesian-language tweets labeled into three categories: Hate Speech (HS), Abusive, and Neutral. The data is sourced from previous research and is the result of re-annotating an original dataset of 13,169 tweets. IWV is formed from a combination of Word2Vec, GloVe, POS tagging, and emotion lexicon features designed to enrich word representation semantically. Preprocessing consisted of tokenization, filtering, stemming/lemmatization, and normalization. The IWV extraction results were then combined with BERT embeddings through concatenation to produce high-dimensional vector representations. The test results showed that the combined IWV+BERT model performed better than BERT alone. The use of data balanced through balancing techniques also contributed to the improvement in accuracy, with the highest accuracy reaching 91%. This finding indicates that integrating word representation features from IWV with sentence context from BERT can improve the effectiveness of text mining in sentiment analysis related to cyberbullying on social media.
Perbandingan Kinerja LSTM, Bi-LSTM, dan Prophet untuk Prediksi Kekeringan berdasarkan SPEI (Standardized Precipitation-Evapotranspiration Index) Amalina, Hana; Zuliarso, Eri
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7971

Abstract

Drought is a natural disaster with widespread impacts on agriculture and water availability, particularly in the Gajah Mungkur Reservoir area of Wonogiri Regency, Indonesia. Rainfall instability driven by global climate change and local climate variability is the primary cause of this disaster. Accurate drought prediction is essential for formulating sustainable mitigation strategies. This study aims to analyze drought characteristics in the Gajah Mungkur Reservoir, Wonogiri Regency, using the Standardized Precipitation Evapotranspiration Index (SPEI) and to compare the performance of three prediction models: Long Short-Term Memory (LSTM), Bidirectional LSTM (Bi-LSTM), and Prophet in predicting SPEI. The dataset includes monthly rainfall and air temperature data from 1995 to 2024. The analysis reveals that longer SPEI time scales tend to show more temporally concentrated drought patterns. At the 6-month SPEI scale, which represents long-term drought, a total of 55 drought months were detected between 1995 and 2024, with major drought episodes occurring in 1996–1997, 2000–2007, 2019, and 2023–2024. Model performance evaluation shows a numerical trend in which Bi-LSTM outperforms the others for 1-month SPEI prediction, while LSTM performs better at the 3- and 6-month scales. However, statistical significance testing indicates that the performance differences among the three models are not significant (p > 0.05), suggesting that other factors such as computational efficiency may be important considerations in practical applications.
Prediksi Diabetes Mellitus dengan Ensemble Gradient Boosting dan Advanced Feature Engineering Ramadhan, Daniswara Tegar; Agustina, Feri
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.8011

Abstract

Diabetes mellitus is a metabolic disease and a global health challenge with continuously increasing prevalence. Early detection through automated prediction systems can help reduce complications and treatment costs. This study develops a diabetes mellitus prediction system using an ensemble gradient boosting approach optimized with advanced feature engineering. The research dataset combines 768 Pima Indians samples with 5,000 samples from a diabetes prediction dataset, giving 5,768 data points in total, which were subsequently balanced using the ADASYN technique. The feature engineering process transforms 8 original features into 25 predictive features encompassing diabetes risk scores, BMI categories, age groups, and glucose categories. Three gradient boosting algorithms (XGBoost, LightGBM, CatBoost), along with an ensemble voting classifier, were optimized using the Optuna framework with the Tree-structured Parzen Estimator. Evaluation employed accuracy, precision, recall, F1-score, and ROC-AUC metrics through 5-fold cross-validation. The results show LightGBM achieving the best performance with 97.14% accuracy and 0.9976 ROC-AUC, followed by CatBoost (97.14%, 0.9973) and XGBoost (96.45%, 0.9971). Feature importance analysis identified DiabetesPedigreeFunction, Pregnancies, and SmokingHistory as key predictors. The developed model can be implemented as a diabetes screening system in primary healthcare facilities.
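Deriving categorical features from raw measurements, as this abstract describes for BMI categories and age groups, might look like the sketch below. The BMI cut-offs follow the commonly used WHO adult ranges; the age bins, field names, and sample record are assumptions for illustration, since the abstract does not give the study's definitions.

```python
def bmi_category(bmi):
    """Map a BMI value to the commonly used WHO adult ranges."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


def age_group(age):
    """Hypothetical age bins; the study's actual bins are not stated."""
    if age < 30:
        return "young"
    if age < 50:
        return "middle"
    return "older"


def engineer(record):
    """Return a copy of the record with the derived categorical features added."""
    enriched = dict(record)
    enriched["bmi_category"] = bmi_category(record["BMI"])
    enriched["age_group"] = age_group(record["Age"])
    return enriched


# Hypothetical patient record, not taken from either dataset.
sample = engineer({"BMI": 27.3, "Age": 45, "Glucose": 130})
```

The derived categories would still need encoding, for example as one-hot columns, before being fed to a gradient boosting model.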
Classification of English Language Anxiety Using Support Vector Machine on Twitter User Marozi, Ericho; Lhaksmana, Kemas Muslim
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.8015

Abstract

This study aims to classify expressions of language anxiety in English as a foreign language, as reflected in user-generated texts on Twitter. The research applies two machine learning approaches, Support Vector Machine (SVM) and Convolutional Neural Network (CNN), to classify anxiety levels automatically. The dataset was collected through Twitter crawling, filtered for relevance, and annotated manually using a three-point scale (low, medium, high) based on psychological indicators such as fear of speaking, avoidance, and self-perceived inability. Preprocessing included text normalization, tokenization, stopword removal, and feature extraction using TF-IDF with unigram to trigram representations. Model training was conducted on a balanced dataset, and performance was evaluated through cross-validation and tuning of key hyperparameters. SVM achieved the highest accuracy of 98.40%, showing strong stability across various test conditions. CNN initially performed competitively but experienced a slight performance drop after tuning, suggesting its sensitivity to parameter settings and data volume. The findings demonstrate that SVM is more robust and suitable for limited-data environments, making it a practical tool for classifying psychological traits like language anxiety in digital communication. This research offers insight into the potential of machine learning in psychological and linguistic analysis, especially through social media platforms.
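TF-IDF over unigram to trigram features, as named in this abstract, can be sketched without external libraries. This uses the plain tf × log(N/df) weighting; library implementations typically apply smoothed variants, and the example sentences are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=3):
    """All contiguous n-grams from n_min to n_max words, joined by spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    grams = [Counter(ngrams(doc.lower().split())) for doc in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for g in grams for term in g)
    n_docs = len(docs)
    weighted = []
    for g in grams:
        total = sum(g.values())
        weighted.append(
            {term: (count / total) * math.log(n_docs / df[term]) for term, count in g.items()}
        )
    return weighted


# Invented example tweets, not drawn from the study's dataset.
docs = ["i am afraid to speak english", "i love to speak english"]
weights = tfidf(docs)
```

Terms shared by every document, such as "speak english" here, receive a weight of zero, while terms unique to one document, such as "afraid", are weighted up.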
Public Sentiment Classification on Megathrust Issues in Social Media Using BERT Algorithm Wicaksono, Candra Kus Khoiri; Gunawan, Putu Harry
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.8016

Abstract

In recent years, the threat of megathrust earthquakes has intensified concern among scientists and the public, especially in seismically active countries like Indonesia. As people increasingly turn to social media to express fears and opinions about such disasters, these platforms offer a rich, real-time resource for gauging public sentiment. This study introduces a sentiment-classification system built on IndoBERT, an Indonesian-language adaptation of the renowned BERT architecture. Our model was trained on a custom-labeled dataset of social-media posts categorized as positive, negative, or neutral. Preprocessing involved tokenizing the text, truncating or padding inputs to 64 tokens, and converting sentiment labels into PyTorch tensor format to facilitate efficient training. We fine-tuned the IndoBERT model using the AdamW optimizer with a learning rate of 1e-5, a dropout rate of 0.1, and early stopping criteria to guard against overfitting, training for a maximum of seven epochs. Notably, the IndoBERT classifier achieved a validation accuracy of 93.33% on a hold-out test set representing 20% of the data, with this peak occurring in the very first epoch. This rapid convergence likely reflects both the strong pretrained language representations inherent in IndoBERT and the specific characteristics of the dataset. While early stopping effectively prevented overfitting, the immediate peak suggests that the model required minimal additional fine-tuning to adapt to this sentiment classification task. These findings demonstrate that advanced natural-language-processing tools like IndoBERT can reliably interpret sentiment in the context of sensitive topics and have the potential to be integrated into disaster-response frameworks, equipping officials with timely, data-driven insights into public opinion and concerns during emergencies.
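The step of truncating or padding inputs to 64 tokens can be illustrated in isolation. This is a simplified sketch: the padding id of 0 and the token ids are assumptions, and a real tokenizer would also insert special tokens before this step.

```python
MAX_LEN = 64   # sequence length stated in the abstract
PAD_ID = 0     # assumed padding token id


def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Return (input_ids, attention_mask), both exactly max_len long."""
    ids = list(token_ids[:max_len])          # drop anything past max_len
    mask = [1] * len(ids)                    # 1 marks a real token
    padding = max_len - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding


# Hypothetical token ids for a short and a long post.
short_ids, short_mask = pad_or_truncate([5, 6, 7])
long_ids, long_mask = pad_or_truncate(list(range(1, 101)))
```

Because every sequence ends up the same length, a batch of posts can be stacked into a single tensor, and the attention mask tells the model which positions are padding.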
Opinion Mining on TikTok Using Bidirectional Long Short-Term Memory for Enhanced Sentiment Analysis and Trend Prediction Muharnisa Haspin, Wafiq; Junadhi, Junadhi; Susanti, Susanti; Yenni, Helda
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.8019

Abstract

The widespread use of TikTok has generated a vast number of user reviews, offering a rich dataset for sentiment analysis. This study aims to classify TikTok reviews from the Google Play Store into positive, negative, and neutral categories, a complex task due to the informal and unstructured text. The research seeks to develop a reliable sentiment analysis model using deep learning to understand user perceptions, aiding platform improvements and marketing strategies. We collected 10,000 reviews via web scraping and preprocessed them through text cleaning, normalization, tokenization, filtering, and stemming. Sentiment labels were assigned automatically using a lexicon-based approach, which showed that the reviews were predominantly positive. Word2Vec transformed the text into numerical vectors for feature extraction. The Bidirectional Long Short-Term Memory (Bi-LSTM) model, with Embedding, Bidirectional LSTM, Dropout, and Dense layers, achieved 80% accuracy and an F1-score of 0.78 using a 90:10 train-test split. While the model was effective for positive and negative sentiments, neutral expressions were detected less accurately, as reflected in lower recall. Compared to traditional methods like Naive Bayes, Support Vector Machine, and K-Nearest Neighbors, Bi-LSTM offered superior accuracy and better handling of linguistic variability, making it valuable for analyzing social media feedback.
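Lexicon-based labelling of the kind mentioned in this abstract can be sketched as a simple word count. The two word lists are tiny hypothetical lexicons in English; the study's actual lexicon and scoring rule are not given in the abstract.

```python
# Hypothetical miniature lexicons, for illustration only.
POSITIVE = {"good", "great", "love", "fun"}
NEGATIVE = {"bad", "slow", "hate", "crash"}


def label_review(review):
    """Label a review by the balance of positive and negative lexicon words."""
    tokens = review.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented reviews, not taken from the scraped dataset.
labels = [
    label_review("I love this app great fun"),
    label_review("app is slow and bad"),
    label_review("good but slow"),
]
```

Reviews with no lexicon words, or with balanced positive and negative words, both fall into the neutral class under this rule, which is one plausible reason neutral text is hard for a downstream model to learn.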