Contact Name
Mesran
Contact Email
mesran.skom.mkom@gmail.com
Phone
-
Journal Mail Official
jurnal.bits@gmail.com
Editorial Address
-
Location
Kota Medan,
Sumatera Utara
INDONESIA
Building of Informatics, Technology and Science
ISSN : 26848910     EISSN : 26853310     DOI : -
Core Subject : Science,
Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research results in information technology and computing. Submitted papers are checked for plagiarism and peer-reviewed to maintain quality. The journal is managed by Forum Kerjasama Pendidikan Tinggi (FKPT) and published 2 times a year, in June and December. It is expected to advance research and make a real contribution to improving research resources in the field of information technology and computing.
Arjuna Subject : -
Articles 926 Documents
Analisis Sentimen Ulasan DANA Dari Play Store dengan Metode SVM, Logistic Regression, Naive Bayes dan KNN
Fitriyanto, Anwar Dwiky; Purwanto, Purwanto
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8769

Abstract

The growth of digital transaction services in Indonesia has driven increased use of digital wallets such as DANA, and with it a steadily growing number of user reviews. At this volume, manually reading, sorting, and interpreting sentiment trends is inefficient and prone to bias. The challenge is compounded because Indonesian-language reviews often contain non-standard language, abbreviations, and slang, which makes it difficult for a system to recognize context accurately. Data volume also affects modeling, since more data generally helps a model learn sentiment patterns more stably. To address these issues, this study developed a machine learning-based sentiment classification system that automatically processes large numbers of reviews using a TF-IDF feature representation. Review data was collected from the Google Play Store and passed through cleaning and preprocessing stages before being converted into TF-IDF feature vectors. Four algorithms were tested, namely Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Naive Bayes, and were evaluated using accuracy, precision, recall, and F1-score. The test results showed that TF-IDF was able to describe the relationship between words quite well, while the Naive Bayes algorithm provided the most stable performance of the four methods, with an accuracy of 79.80%. The resulting model can help companies understand user perceptions more quickly and objectively and can support data-driven decision making to improve service quality.
Optimasi Support Vector Machine Menggunakan RandomizedSearchCV dan SMOTE untuk Klasifikasi Kebugaran Berdasarkan Parameter Fisiologis
Nathansyach, Gema Amran; Purwanto, Purwanto
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8770

Abstract

This study aims to improve the accuracy of the Support Vector Machine (SVM) model in classifying fitness status (fit/unfit) from physiological and lifestyle parameters using the Fitness Classification Dataset, a synthetic dataset designed to represent fitness indicators such as BMI, height, weight, heart rate, blood pressure, nutritional quality, sleep duration, and activity index. The dataset has an imbalanced class distribution and contains a combination of numerical and categorical features, and therefore requires comprehensive preprocessing. This study applies two optimization techniques: RandomizedSearchCV for efficient hyperparameter tuning and SMOTE for handling class imbalance. The experimental results show that the baseline SVM model produces an accuracy of 75.75%, while the combination of SVM + RandomizedSearchCV + SMOTE increases the accuracy to 80%, an increase of 4.25 percentage points. In addition, the AUC value increased from 0.835 for the baseline to 0.850 for the optimized model. These findings indicate that the integration of RandomizedSearchCV and SMOTE significantly improves the model's ability to capture non-linear patterns while increasing sensitivity to minority classes. Overall, this study shows that the optimized SVM pipeline provides more stable and accurate performance in fitness status classification and can serve as a reference for developing predictive models in other health domains.
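The oversampling idea behind SMOTE can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the feature values, the function name `smote`, and the parameters `k` and `seed` are assumptions made for illustration, and the RandomizedSearchCV tuning step is not shown. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-class rows: (BMI, resting heart rate)
minority_samples = [(31.0, 88.0), (29.5, 92.0), (33.2, 85.0), (30.1, 90.0)]
new_points = smote(minority_samples, n_new=5)
print(new_points)
```

Because every synthetic row is an interpolation of two real minority rows, it always stays inside the range spanned by the observed minority data. In a real experiment the oversampling should be applied to the training folds only, so that synthetic rows never leak into the evaluation data.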
Implementasi K-Means sebagai Mekanisme Self-Labeling dalam Arsitektur Ensemble Voting Classifier untuk Prediksi Penjualan Usaha Mikro Kecil dan Menengah (UMKM) pada Data Tanpa Label
Fahmi, Muhammad Aqil; Kurniawan, Defri
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8779

Abstract

Sales forecasting in the Micro, Small, and Medium Enterprises (MSME) sector faces challenges due to the fluctuating (noisy) nature of the data and the absence of class labels (unlabeled) required for training supervised learning models. This study proposes a sequential hybrid architecture in which the K-Means algorithm is employed as a Self-Labeling mechanism to automatically transform raw transaction data into class labels (“Low” and “High”). The resulting synthetic labels are then used to train an Ensemble Voting Classifier model that aggregates predictions from XGBoost, LightGBM, and CatBoost. The experimental evaluation results show that although the single XGBoost model achieves a slightly higher accuracy (96.24%) compared to the Ensemble model (96.07%), the hybrid Ensemble Voting model proves superior in terms of probability calibration, achieving the lowest Loss value of 0.1532. This value outperforms XGBoost (0.1646) and LightGBM (0.1772), indicating more reliable and stable prediction confidence. The model also demonstrates excellent balance with an F1-Score of 0.95 and a Recall of 0.96 for the majority class. This study confirms that the hybrid approach is effective in reducing uncertainty in MSME stock management.
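The self-labeling step described in this abstract can be sketched with a one-dimensional, two-cluster K-Means written in plain Python. This is an illustrative assumption rather than the authors' code: the sales figures, the function names, and the min/max initialisation are hypothetical, and the downstream voting ensemble (XGBoost, LightGBM, CatBoost) is not shown.

```python
from statistics import mean

def kmeans_two_clusters(values, max_iter=100):
    """Lloyd's algorithm with k=2 on scalar data; returns (low, high) centroids."""
    low, high = min(values), max(values)  # deterministic start (an assumption)
    for _ in range(max_iter):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        new_low = mean(low_group)
        new_high = mean(high_group) if high_group else high
        if (new_low, new_high) == (low, high):
            break  # assignments no longer change
        low, high = new_low, new_high
    return low, high

def self_label(values):
    """Turn unlabeled sales amounts into 'Low' / 'High' pseudo-labels."""
    low, high = kmeans_two_clusters(values)
    return ["Low" if abs(v - low) <= abs(v - high) else "High" for v in values]

# Hypothetical daily sales counts with no class labels
daily_sales = [12, 15, 11, 14, 95, 102, 13, 99]
labels = self_label(daily_sales)
print(list(zip(daily_sales, labels)))
```

The resulting pseudo-labels can then serve as the target column for any supervised classifier, which is the role the ensemble plays in the study.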
Analisis Komparasi Algoritma ARIMA dan LSTM pada Prediksi Harga Cabai Merah Keriting Harian
Utari, Cut Try; Sembiring, M Thariq Arya Putra; Siregar, M Habibi Rizq Zhafar
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8784

Abstract

Curly red chili is a strategic national commodity characterized by extreme price fluctuations, which significantly impact regional inflation and farmer welfare. Although conventional statistical methods are frequently used for forecasting, these approaches have inherent limitations in capturing non-linear volatility and dynamic price patterns. This research aims to address this gap by comprehensively comparing the performance of the AutoRegressive Integrated Moving Average (ARIMA) statistical model and the Long Short-Term Memory (LSTM) Deep Learning model. This study utilizes a univariate prediction approach based on daily historical price data from January 2024 to October 2025. The dataset is partitioned into 80% for training and 20% for testing. Model performance is evaluated using Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and the Coefficient of Determination (R²). The experimental results demonstrate that the LSTM model significantly outperforms ARIMA in tracking daily price trends. LSTM achieved an average MAPE of 13.76% (classified as "Good") with an R² value of 0.92, whereas the ARIMA model recorded a significantly higher MAPE of 41.21% and a negative R² value. This study concludes that Deep Learning-based algorithms are superior and more effective in handling food commodity price volatility compared to classical linear statistical methods.
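The three evaluation metrics named in this abstract follow standard definitions, sketched below in plain Python. The price series are hypothetical and are not taken from the study. The second forecast shows how the coefficient of determination becomes negative: this happens whenever a model's squared error exceeds that of simply predicting the mean of the actual values.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical prices, for illustration only
actual = [100, 200, 300, 400]
close_forecast = [110, 190, 310, 390]   # tracks the series
flat_forecast = [400, 400, 400, 400]    # ignores the trend

print(rmse(actual, close_forecast), mape(actual, close_forecast), r_squared(actual, close_forecast))
print(r_squared(actual, flat_forecast))  # negative: worse than predicting the mean
```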
Analisis Hyperparameter Tuning MobileNetV2 dengan Metode Sequential Search dalam Sistem Klasifikasi Penyakit Daun Kentang
Khoirur Rizky, Muhammad Ivan; Rozada, Akfi; Baroroh, Nurul; Pramunendar, Ricardus Anggi
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8786

Abstract

Indonesia’s national potato production faces significant threats from leaf diseases, while manual classification remains slow, subjective, and prone to error due to the high visual similarity across disease categories. This highlights the need for a precise and reliable automated classification system. However, many previous studies have not applied systematic hyperparameter optimization, leaving the capacity of deep learning architectures underutilized. Addressing this research gap, this study aims to enhance the performance of MobileNetV2 for potato leaf disease classification through a structured hyperparameter optimization process. A Sequential Search strategy validated through 3 fold Stratified Cross Validation is employed to obtain stable performance estimates. Four key hyperparameters are examined: learning rate from 0.001 to 0.009, dropout from 0.1 to 0.9, batch size from 8 to 192, and epochs from 10 to 100. The optimal configuration consists of a learning rate of 0.007, dropout of 0.2, batch size of 32, and 60 epochs, which enables MobileNetV2 to achieve an accuracy of 99 percent. Despite this strong performance, evaluation results reveal a minor limitation in the Young Blight class, where precision is slightly lower due to overlapping visual characteristics. These findings establish a new benchmark for potato leaf disease classification and provide a reproducible optimization framework for future studies. The study offers both methodological and practical contributions to the development of precise and efficient plant disease classification systems within the context of smart agriculture.
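A sequential search of the kind described here tunes one hyperparameter at a time while holding the others at their current best values. The sketch below assumes that interpretation. The function `toy_score` is a hypothetical stand-in for "mean accuracy over 3 stratified folds": it is deliberately constructed to peak at the configuration reported in the abstract so that the output is recognisable, and it does not train MobileNetV2 or reproduce any result. The candidate grids are also assumptions chosen within the ranges the abstract gives.

```python
def sequential_search(search_space, start, score_fn):
    """Optimise hyperparameters one at a time, keeping earlier choices fixed."""
    best = dict(start)
    for name, candidates in search_space.items():
        # Evaluate every candidate for this hyperparameter with the rest unchanged
        best[name] = max(candidates, key=lambda value: score_fn({**best, name: value}))
    return best

# Hypothetical surrogate objective; a real run would train and cross-validate a model
TARGET = {"learning_rate": 0.007, "dropout": 0.2, "batch_size": 32, "epochs": 60}

def toy_score(config):
    return -sum(abs(config[key] - TARGET[key]) / TARGET[key] for key in TARGET)

search_space = {
    "learning_rate": [0.001, 0.003, 0.005, 0.007, 0.009],
    "dropout": [0.1, 0.2, 0.5, 0.9],
    "batch_size": [8, 32, 64, 192],
    "epochs": [10, 60, 100],
}
start = {"learning_rate": 0.001, "dropout": 0.1, "batch_size": 8, "epochs": 10}

best_config = sequential_search(search_space, start, toy_score)
print(best_config)
```

The appeal of this strategy is cost: with the grids above it needs 5 + 4 + 4 + 3 = 16 evaluations instead of the 240 a full grid would require. The trade-off is that it can miss interactions between hyperparameters, since each one is chosen with the others fixed.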
Perbandingan Metode Naive Bayes Classifier dan Support Vector Machine Pada Analisis Sentimen Wisata Biru Berdasarkan Ulasan Twitter, Instagram, dan Google Maps Review
Rahmadila, Selvi; Alita, Debby
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8791

Abstract

Blue tourism in Lampung Province has been recognized as a leading regional asset encompassing coastal areas, islands, and marine zones with strong appeal to visitors. Public responses toward these destinations can be captured through online reviews distributed across multiple digital platforms. In this study, the performance of sentiment classification algorithms, namely Naive Bayes Classifier and Support Vector Machine, was examined and compared using reviews related to blue tourism in Lampung. A total of 3,950 review records were collected from Twitter or X, Instagram, and Google Maps Review. The collected data were subjected to a series of preprocessing stages, including text cleaning to remove irrelevant elements, followed by theme and sentiment labeling using a semi supervised learning approach. Feature representation was generated through the Term Frequency Inverse Document Frequency method to transform textual data into numerical form. The labeling results revealed an imbalanced sentiment distribution with a strong dominance of positive sentiment. Model evaluation was conducted using an 80 to 20 split between training and testing datasets. The evaluation results indicated that the Support Vector Machine achieved an accuracy of 91.90 percent, while the Naive Bayes Classifier reached an accuracy of 90.38 percent. These findings suggest that the Support Vector Machine demonstrates superior capability in handling high dimensional textual data and imbalanced sentiment distributions. The outcomes of this study are expected to provide empirical guidance in selecting appropriate sentiment analysis algorithms to support data driven management and development of blue tourism destinations.
Pendekatan Ensemble Multi-Arsitektur Convolutional Neural Network melalui Soft Voting untuk Klasifikasi Citra Histopatologi Kanker Payudara
Fitriyani, Shelomita; Rakasiwi, Sindhu
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8797

Abstract

Breast cancer is one of the leading causes of mortality among women, creating a strong need for diagnostic methods that are accurate, consistent, and capable of handling the morphological variations present in histopathological images. This study aims to improve the stability and accuracy of breast cancer histopathology image classification through an ensemble multi-architecture Convolutional Neural Network approach. The BreakHis dataset, which covers four magnification levels (40×, 100×, 200×, and 400×), was used in this research. Three architectures, VGG19, ResNet50, and EfficientNetB0, served as the base models. All images underwent preprocessing, including resizing to 224×224 pixels, pixel-intensity normalization, and data augmentation. Each model was trained independently, and their probability outputs were combined using a soft voting mechanism to generate the final predictions. The experimental results show that the ensemble method provides the most stable and superior performance across all magnification levels. At 40× magnification, the ensemble achieved an accuracy of 92.00%, a recall of 99.03%, and an F1-score of 94.44%. At 100× magnification, the accuracy increased to 94.56%, with a recall of 99.07% and an F1-score of 96.18%. The 200× level produced an accuracy of 94.03%, a recall of 97.61%, and an F1-score of 95.77%. Meanwhile, at 400× magnification, the model reached an accuracy of 90.11%, a recall of 95.14%, and an F1-score of 92.88%. These consistently high recall and F1-score values highlight the model’s strong ability to detect malignant cases while maintaining balanced predictive performance. Overall, the findings demonstrate that combining multiple CNN architectures enhances feature representation and shows strong potential as a decision-support system for breast cancer diagnosis using histopathological images.
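Soft voting, as used to combine the three base models, averages the class probabilities produced by each model and predicts the class with the highest mean. The sketch below uses hypothetical probability vectors for a single image (they are not outputs of VGG19, ResNet50, or EfficientNetB0) and assumes equal weights, which the abstract does not specify. The example is chosen so that a majority (hard) vote and a soft vote disagree.

```python
def soft_vote(probabilities):
    """Average per-model class probabilities and return (predicted index, averages)."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    averaged = [sum(row[c] for row in probabilities) / n_models for c in range(n_classes)]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged

classes = ["benign", "malignant"]

# Hypothetical outputs of three models for one image: [P(benign), P(malignant)]
model_outputs = [
    [0.55, 0.45],  # model A leans benign, weakly
    [0.52, 0.48],  # model B leans benign, weakly
    [0.10, 0.90],  # model C is confident the image is malignant
]

index, averaged = soft_vote(model_outputs)
print(classes[index], averaged)
```

Two of the three models favour "benign", so a majority vote would return that class, but the averaged probabilities favour "malignant" because soft voting takes each model's confidence into account.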
Meningkatkan Klasifikasi Obesitas Multi-Kelas Menggunakan Hybrid Stacking dan Meta-Learner CatBoost yang Interpretable melalui Analisis SHAP Level-2
Lomi, Septiani Wulandari; Sudaryanto, Slamet
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8802

Abstract

Obesity is a global health problem that requires accurate, stable, and transparent multi-class prediction methods to support early clinical intervention. Previous studies used a Hybrid Stacking architecture with a linear Meta-Learner, which achieved 96.88% accuracy but had limitations in capturing complex non-linear interactions between base model predictions. The main problem lies in the linear Meta-Learner (Logistic Regression), which is not optimal for integrating non-linear signals from the tree-based models at Level-1. The purpose of this study is to improve the performance, stability, and transparency of multi-class obesity predictions by developing a Hybrid Stacking architecture with a non-linear Meta-Learner and applying model interpretability techniques. To address this gap, this study proposes a new Hybrid Stacking Ensemble model that replaces the linear Meta-Learner with a powerful boosting model, namely CatBoost. The proposed model was evaluated on a multi-class obesity dataset and surpassed state-of-the-art (SOTA) performance. The main performance improvement is an increase in Accuracy to 97.83% (an absolute increase of 0.95 percentage points), together with a significant improvement in multi-class stability metrics: MCC (reaching 97.30%) and Cohen's Kappa (reaching 97.39%). This result supports the hypothesis that non-linear Meta-Learners are more effective. Furthermore, the study introduces Manual Padding on Level-1 outputs to ensure feature consistency, enabling a valid SHAP Level-2 analysis. The SHAP analysis revealed a strategic synergy, where CatBoost relied on Logistic Regression (a linear model) to predict high-risk class probabilities (Obesity Type II & III), while utilizing tree-based models for other classes. This model provides a superior, stable, and transparent methodology for obesity level prediction.
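The two multi-class stability metrics reported here, MCC and Cohen's Kappa, can both be computed from a confusion matrix. The sketch below uses the standard multi-class generalisation of MCC and the usual Kappa formula. The 3×3 confusion matrix is hypothetical and unrelated to the study's results, and the function assumes at least two classes appear in the data.

```python
import math

def mcc_and_kappa(confusion):
    """Return (MCC, Cohen's kappa) for a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    true_counts = [sum(confusion[k]) for k in range(n)]
    pred_counts = [sum(confusion[r][k] for r in range(n)) for k in range(n)]
    cross = sum(t * p for t, p in zip(true_counts, pred_counts))

    # Multi-class Matthews correlation coefficient
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    mcc = (correct * total - cross) / denominator if denominator else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement
    observed = correct / total
    expected = cross / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return mcc, kappa

# Hypothetical confusion matrix for three classes
confusion = [
    [9, 1, 0],
    [1, 8, 1],
    [0, 1, 9],
]
mcc, kappa = mcc_and_kappa(confusion)
print(f"MCC={mcc:.2f}  kappa={kappa:.2f}")
```

Unlike plain accuracy, both metrics discount agreement that would be expected by chance given the class frequencies, which is why they are often reported alongside accuracy for multi-class problems.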
Sentiment Analysis of the Matahari Application to Provide User Experience Insights using Support Vector Machine
Rizal, Moch Arif Samsul; Vitianingsih, Anik Vega; Zangana, Hewa Majeed; Maukar, Anastasia Lidya; Marisa, Fitri
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8811

Abstract

The expansion of Indonesia's digital commerce ecosystem has pushed retail companies to strengthen the quality of their online services to remain competitive. Matahari, one of the country's leading retail brands, launched its mobile app as a platform for shopping, promotions, and customer interaction. However, user feedback on the Google Play Store indicates persistent problems with system responsiveness, ease of use, and the consistency of promotional information. This study examines sentiment patterns in 2,500 user reviews and classifies them using a Support Vector Machine (SVM) model tested with three kernel types: Linear, RBF, and Polynomial. Before modelling, the text corpus underwent several pre-processing steps, such as tokenization, stopword filtering, and stemming, and was then represented numerically using TF-IDF weighting. Among all tested configurations, the Linear kernel produced the strongest results, achieving an accuracy of 88%. Despite a moderately uneven distribution across categories (1,030 negative, 886 neutral, and 584 positive), the model achieved consistent performance across all classes. Evaluation using Precision, Recall, and F1-Score confirmed the validity of the 88% accuracy without the need for additional sampling techniques. From a scholarly standpoint, this research adds insight into sentiment analysis for retail applications within the Indonesian context by applying a machine-learning approach. In practice, the outcomes highlight areas for improvement, particularly technical stability, the intuitiveness of user flows, and promotional clarity, to support a better overall user experience.
Implemetasi TF-IDF N-Gram dan Algoritma Nearest Centroid untuk Klasifikasi Topik Tugas Akhir
Hana, Rohima Choirul; Kurniawan, Defri
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8859

Abstract

This study presents a lightweight and explainable workflow for curating undergraduate thesis titles in the Informatics Engineering Study Program by combining TF-IDF n-gram (1–2) features with a cosine based Nearest Centroid classifier. Titles are grouped into three internal research area classes, RPLD, SC, and SKKKD, to support topic grouping and supervisor assignment. The approach is implemented as a Streamlit web application that supports Excel upload with preview and persistent saving, column standardization, text normalization, duplicate rejection using normalized titles, rapid training on labeled data, topic prediction for new titles, and retrieval of the most similar titles to assist curation. A key operational contribution is the direct linkage from predicted classes to the program maintained lecturer list for each area, enabling students to identify suitable supervisors and helping coordinators run a consistent and auditable workflow. On a multi semester corpus of 1,057 titles, stratified 5-fold cross-validation achieved 92.43 percent average accuracy, Macro F1 of 0.875, Micro F1 of 0.924, and Weighted F1 of 0.925, indicating a balance between accuracy, efficiency, and interpretability for short text. Decision inspection is supported by class specific top terms and nearest neighbor title lists. Limitations mainly stem from the minority class, therefore future work will expand labeled corpora, add character level n grams, and explore lightweight hybrid representations.
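The combination of TF-IDF n-gram (1–2) features with a cosine-based Nearest Centroid classifier can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' Streamlit application: the titles and the labels `area_1` and `area_2` are hypothetical placeholders, tokenisation is a simple lowercase split, and the smoothed IDF formula is one common choice among several.

```python
import math
from collections import Counter, defaultdict

def ngrams(text):
    """Unigrams plus bigrams of a lowercased, whitespace-tokenised title."""
    tokens = text.lower().split()
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]

def fit_idf(documents):
    """Smoothed inverse document frequency for every term seen in training."""
    doc_freq = Counter(term for doc in documents for term in set(ngrams(doc)))
    n_docs = len(documents)
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}

def tfidf(text, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are ignored."""
    counts = Counter(term for term in ngrams(text) if term in idf)
    vector = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in vector.values())) or 1.0
    return {term: w / norm for term, w in vector.items()}

def fit_centroids(titles, labels, idf):
    """Mean TF-IDF vector per class."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = Counter(labels)
    for title, label in zip(titles, labels):
        for term, weight in tfidf(title, idf).items():
            sums[label][term] += weight
    return {lab: {t: w / sizes[lab] for t, w in vec.items()} for lab, vec in sums.items()}

def cosine(a, b):
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(title, idf, centroids):
    """Assign the class whose centroid is most cosine-similar to the title."""
    vector = tfidf(title, idf)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))

# Hypothetical labeled titles
titles = [
    "web based inventory information system",
    "mobile information system for library",
    "image classification using convolutional neural network",
    "sentiment classification using neural network",
]
labels = ["area_1", "area_1", "area_2", "area_2"]

idf = fit_idf(titles)
centroids = fit_centroids(titles, labels, idf)
print(predict("neural network for image segmentation", idf, centroids))
```

Because each class is summarised by a single centroid vector, the terms with the largest centroid weights can be listed directly, which is one way to obtain the class-specific top terms the abstract mentions for decision inspection.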