Contact Name
Mesran
Contact Email
mesran.skom.mkom@gmail.com
Phone
-
Journal Mail Official
jurnal.bits@gmail.com
Editorial Address
-
Location
Kota Medan,
Sumatera Utara
INDONESIA
Building of Informatics, Technology and Science
ISSN : 26848910     EISSN : 26853310     DOI : -
Core Subject : Science,
Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research results in information technology and computing. Papers submitted to this journal are checked for plagiarism and peer-reviewed to maintain quality. The journal is managed by Forum Kerjasama Pendidikan Tinggi (FKPT) and is published twice a year, in June and December. It is expected to foster research and make a real contribution to improving research resources in the field of information technology and computers.
Arjuna Subject : -
Articles 926 Documents
Analysis of Stunting Prediction in Toddlers in Bekasi District Using Random Forest and Naïve Bayes
Solin, Chintya Annisah; Gunawan, Putu Harry
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6670

Abstract

This study compares the performance of the Random Forest and Naïve Bayes algorithms in predicting stunting in toddlers using data from the Bekasi District Health Office. The analysis begins with data cleaning, normalization, and sampling with the Adaptive Synthetic Sampling (ADASYN) method to handle data imbalance, followed by validation with Stratified K-Fold Cross Validation. Random Forest achieves the highest accuracy, 89.62%, with an F1-Score of 89.09%. Gaussian Naïve Bayes produces an accuracy of 88.72% and an F1-Score of 88.81%, while Bernoulli Naïve Bayes performs worse, with an accuracy of 67.83% and an F1-Score of 69.72%. Random Forest shows advantages in handling noise and imbalanced data, making it an optimal choice for stunting prediction. The performance of Naïve Bayes, by contrast, depends on the characteristics of the data: the Gaussian variant is better suited to continuous data. The results indicate that choosing the right algorithm, especially for imbalanced data, is very important for improving prediction accuracy. The study also recommends paying more attention to data preprocessing to ensure good prediction quality, especially for minority classes.
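The stratified validation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `stratified_k_fold`, the toy labels, and the round-robin assignment are assumptions chosen only to show how each fold keeps roughly the same class proportions as the full data set.

```python
from collections import defaultdict

def stratified_k_fold(labels, k):
    """Return k lists of test indices that preserve class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return [sorted(fold) for fold in folds]

# Hypothetical labels: 1 = stunted (minority class), 0 = not stunted.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]
folds = stratified_k_fold(labels, k=3)
for number, test_indices in enumerate(folds):
    print(number, test_indices)
```

In this toy case every fold receives exactly one minority sample, which a plain sequential split would not guarantee. A real pipeline would also shuffle within each class before dealing the indices.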
Comparison of Random Forest and Decision Tree Methods for Emotion Classification based on Social Media Posts
Tsaqif, Muhammad Abiyyu; Maharani, Warih
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6677

Abstract

Social media platforms like X (formerly Twitter) have become essential for expressing emotions and opinions, making emotion classification a critical task with applications in mental health, public sentiment monitoring, and customer feedback analysis. This study compares the Random Forest and Decision Tree algorithms for classifying emotions such as joy, sadness, anger, and fear expressed in social media posts. Data collection involved crawling tweets and labeling them manually. Preprocessing included tokenization, stemming, and stopword removal, with feature extraction using TF-IDF and Bag of Words. Experimental scenarios tested data split ratios, resampling for class balance, and parameter tuning. The Decision Tree parameters tuned were criterion (gini, entropy), max depth (none, fixed values), min samples split (2, 5), and min samples leaf (1, 2). The Random Forest parameters tuned were n_estimators (100–400), max depth (none, fixed values), min samples split (2, 5, 10), and min samples leaf (1, 2). Random Forest achieved a maximum accuracy of 76.17%, outperforming Decision Tree's 72.62%. The combination of TF-IDF and Bag of Words delivered the highest accuracy for both models. This study underscores the importance of preprocessing, balanced datasets, and parameter optimization for effective emotion classification. The findings offer insights into advancing sentiment analysis and natural language processing, enabling practical applications in public sentiment tracking, customer experience enhancement, and crisis management.
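The two feature-extraction schemes named in this abstract can be sketched in a few lines. This is an illustrative example only: it uses the unsmoothed textbook weighting tf × log(N / df), whereas libraries usually apply smoothing and normalization, and the helper names and toy documents are assumptions.

```python
import math
from collections import Counter

def bag_of_words(tokens):
    """Raw term counts for one tokenized document."""
    return Counter(tokens)

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = bag_of_words(doc)
        weighted.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted

# Hypothetical, already preprocessed posts.
docs = [["happy", "day"], ["sad", "day"], ["angry", "sad", "day"]]
weights = tf_idf(docs)
print(weights[0])
```

The term "day" occurs in every document, so its weight is zero, while the rarer "happy" keeps a positive weight. That contrast is what distinguishes TF-IDF from plain counts.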
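Perbandingan Metode Naïve Bayes Dengan SVM Pada Analisis Sentimen Aplikasi Pemesanan Tiket Kapal Ferizy [Comparison of the Naïve Bayes and SVM Methods for Sentiment Analysis of the Ferizy Ferry Ticket Booking Application]
Sulhan, Muhammad; Erizal, Erizal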
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6715

Abstract

In the digital era, user reviews on application platforms play a crucial role in evaluating service quality and customer satisfaction. This study compares two sentiment analysis methods, Naive Bayes and Support Vector Machine (SVM), in classifying the sentiment of Ferizy app reviews on PlayStore into positive, negative, and neutral categories. Naive Bayes, known for its simplicity, efficiency on small datasets, and fast training, is compared to SVM, which is recognized for its high performance on complex data with non-linear distributions and its flexibility in kernel usage. The study evaluates both methods on accuracy, precision, recall, and F1-score, particularly in handling class imbalance and noise in the data. The dataset consists of user reviews of the Ferizy application, which are analyzed to identify sentiment patterns and trends. The implementation results show that Naive Bayes achieves an accuracy of 79.27%, while SVM reaches an accuracy of 82.62%. This difference indicates that SVM is superior in handling more complex patterns in review data, although the margin is relatively small. The findings also reveal significant differences between the two methods, particularly in sentiment classification accuracy. Factors such as language complexity, class imbalance, and algorithm parameter selection are found to influence the performance of each method. This study provides valuable insights for application developers seeking to improve service quality based on user sentiment analysis. Additionally, the results are expected to contribute to the development of more advanced and targeted sentiment analysis strategies, particularly in the digital transportation domain.
Keywords: Sentiment Analysis; Naïve Bayes; Support Vector Machine; Ferizy; Reviews
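The four evaluation metrics listed in this abstract all derive from a confusion matrix. The sketch below shows one way to compute them for a three-class problem. The matrix values are invented for illustration and are not results from the study, and `per_class_metrics` is a hypothetical helper name.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j] = number of items with true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    report = {"accuracy": correct / total}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        predicted = sum(row[i] for row in confusion)  # column sum
        actual = sum(confusion[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report

labels = ["positive", "negative", "neutral"]
# Invented counts, for illustration only.
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [1, 1, 3],
]
report = per_class_metrics(confusion, labels)
print(report["accuracy"], report["neutral"])
```

With imbalanced classes, the per-class rows are more informative than overall accuracy, because a small class such as "neutral" can score poorly while accuracy stays high.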
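ObeCheck Sebagai Platform Penerapan Metode K-Nearest Neighbors untuk Klasifikasi Obesitas Berbasis Website [ObeCheck as a Web-Based Platform Applying the K-Nearest Neighbors Method for Obesity Classification]
Farihah, Lailatul; Subekti, Puji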
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6726

Abstract

The increase in obesity has become one of the major challenges in the healthcare sector, requiring quick and effective solutions for early classification and diagnosis. This study aims to develop a web-based system using the K-Nearest Neighbors (KNN) method to classify obesity based on user data, thereby assisting the public in early detection of obesity. The dataset used in this research comprises 2,111 records and 17 attributes, covering various factors related to obesity, such as weight, height, age, gender, genetic factors, and lifestyle, including dietary habits and physical activity. This dataset was obtained from the UCI Repository website. The data is processed using the K-Nearest Neighbors (KNN) method to generate an accurate and relevant obesity classification model. To evaluate the performance of the K-Nearest Neighbors (KNN) model, the dataset was split into training and testing data with a ratio of 80:20 and evaluated using a Confusion Matrix, resulting in an accuracy of 89%. Since the model demonstrates good performance in classifying test data, it can be implemented as a web-based system to test new data. This system will produce weight classification results, including categories such as "Underweight," "Normal Weight," "Overweight," and "Obesity." Thus, the public can easily and accurately classify obesity using this system.
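As a rough illustration of how K-Nearest Neighbors assigns a class, the sketch below votes among the k closest training points. It is an assumption-laden toy: it uses only two features (height in metres, weight in kilograms), invented records rather than the UCI data, and no feature scaling, which a real system would normally apply.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    distances = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Invented records: (height in m, weight in kg).
train_x = [(1.70, 50), (1.75, 52), (1.70, 65), (1.80, 72), (1.65, 85), (1.70, 95)]
train_y = ["Underweight", "Underweight", "Normal Weight",
           "Normal Weight", "Obesity", "Obesity"]

prediction = knn_predict(train_x, train_y, query=(1.72, 90), k=3)
print(prediction)
```

Because weight spans a far larger numeric range than height here, it dominates the Euclidean distance. This is why normalization usually precedes KNN.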
Impact of SMOTE for Imbalance Class in DDoS Attack Detection Using Deep Learning MLP
Ilma, Zidni; Ghozi, Wildanil; Rafrastara, Fauzi Adi
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6727

Abstract

DDoS attacks, which are becoming increasingly complex and frequent, pose significant challenges to network security, particularly with the rise of cyber exploitation of infrastructure. A major issue in detecting these attacks is the imbalance between normal traffic and attack data, which causes machine learning models to be biased toward the majority class. To address this, this study proposes the use of the Synthetic Minority Over-sampling Technique (SMOTE) to balance the CIC-DDoS2019 dataset, successfully enhancing the performance of a Multi-Layer Perceptron (MLP) in detecting various types of attacks. Analysis results indicate that, on the original dataset without SMOTE, the model achieved high accuracy but low F1-Score for minority classes, highlighting difficulties in recognizing underrepresented attack patterns. After applying SMOTE, the F1-Score significantly improved for minority classes, demonstrating the model's enhanced ability to identify attack patterns. All dataset subsets showed improved performance across key evaluation metrics, indicating that SMOTE effectively expanded the model's decision boundary for minority classes, enabling MLP to detect DDoS attacks more accurately in previously challenging data patterns. This approach illustrates increased model sensitivity to minority feature distributions without significantly compromising performance on majority classes.
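The core idea of SMOTE is to create a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours. The sketch below is a simplified illustration of that idea, not the implementation used in the study. The function name, the two-dimensional toy points, and the fixed seed are all assumptions.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points on segments between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Invented minority-class feature vectors.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new_points = smote(minority, n_new=4)
print(new_points)
```

Every synthetic point lies on a line segment between two real minority points, so the minority region is filled in rather than merely duplicated.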
Public Political Sentiment Post 2024 Presidential Election: Comparison of Naïve Bayes and Support Vector Machine
Patria, Widya Yudha; Gunawan, Putu Harry; Aquarini, Narita
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6734

Abstract

Indonesia is a nation with a democratic political system in which the public can express itself freely. Public use of social media is expanding quickly, particularly on the platform X, where tweets about the 2024 presidential election are currently trending. Reactions to the election results range from positive to negative to neutral, and the large volume of tweets can serve as a data source for sentiment analysis. Classifying these sentiments makes it possible to gauge whether people are, in general, satisfied or dissatisfied with the outcome of the presidential election. This study analyzes public sentiment regarding the election result using machine learning methods, providing insights into public opinion that can be useful for political strategy and for assessing public discourse. It compares the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM) algorithms on 2,193 data points from X, which are classified into neutral, positive, and negative categories. SMOTE is used to address data imbalance, and TF-IDF is applied for feature extraction. The results show that NBC achieves an accuracy of 62.41%, whereas SVM achieves 62.19%. These results illustrate both the capabilities and the limitations of such models when classifying varied public sentiment around political events. This paper contributes to the further development of sentiment analysis by assessing how effective these algorithms are and by stressing the need to treat imbalanced data in research that uses social media.
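To make the Naïve Bayes side of this comparison concrete, here is a minimal multinomial Naïve Bayes sketch with add-one (Laplace) smoothing. It works on raw token counts rather than the TF-IDF features used in the study, and the tiny labeled corpus and function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Collect class frequencies, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {token for tokens in docs for token in tokens}
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in tokens:
            if token in vocab:  # ignore words never seen in training
                # Add-one smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)

# Invented training posts.
docs = [["great", "result"], ["fair", "election"], ["bad", "result"],
        ["unfair", "election"], ["result", "announced"]]
labels = ["positive", "positive", "negative", "negative", "neutral"]

model = train_nb(docs, labels)
print(predict_nb(model, ["great", "election"]))
```

The class prior term is where imbalance shows up directly: a class with few training posts starts with a lower score, which is one reason resampling methods such as SMOTE are applied before training.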
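Penerapan Metode GA-TL Pada Algoritma Naive Bayes Untuk Mengatasi Class Imbalance Data Beasiswa KIP-Kuliah [Applying the GA-TL Method to the Naive Bayes Algorithm to Address Class Imbalance in KIP-Kuliah Scholarship Data]
Widyastuti, Dessy; Siswa, Taghfirul Azhima Yoga; Rudiman, Rudiman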
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6737

Abstract

The Indonesia Smart Card (KIP) Scholarship Program aims to support students from underprivileged families in pursuing higher education, yet the distribution of recipient data often experiences class imbalance, leading to inaccuracies in scholarship allocation. This imbalance, characterized by disproportionate data between recipient and non-recipient groups, affects classification model performance, causing models to favor the majority class and overlook the minority class, potentially excluding eligible recipients. To address this issue, this study combines the Genetic Algorithm for feature selection and optimization with Tomek Links-Random Undersampling for data balancing. The research process includes data preprocessing, 10-fold cross-validation, and performance evaluation using a confusion matrix. Results indicate that without Tomek Links-Random Undersampling, Naïve Bayes accuracy increased from 65.2% to 66.0% after feature selection and optimization using the Genetic Algorithm, while applying Tomek Links-Random Undersampling improved accuracy from 56% to 63%. This method also enhanced fairness in recipient classification, promoting a more equitable distribution of benefits. The improved model accuracy significantly aids future scholarship selection processes, demonstrating that integrating efficient machine learning approaches optimizes the KIP Scholarship Program by ensuring beneficiaries are appropriately targeted based on predetermined criteria.
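A Tomek link is a pair of points from different classes that are each other's nearest neighbour. Undersampling with Tomek links typically removes the majority-class member of each pair. The sketch below illustrates only that detection step, with invented two-dimensional points. It does not reproduce the study's combined Tomek Links-Random Undersampling procedure.

```python
import math

def tomek_links(points, labels):
    """Return index pairs (i, j) that are mutual nearest neighbours
    and carry different class labels."""
    def nearest(i):
        return min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
    links = []
    for i in range(len(points)):
        j = nearest(i)
        if i < j and nearest(j) == i and labels[i] != labels[j]:
            links.append((i, j))
    return links

# Invented data: label 0 = majority class, label 1 = minority class.
points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (1.0, 1.0), (1.1, 1.0)]
labels = [0, 0, 0, 1, 0]

links = tomek_links(points, labels)
majority_label = 0
to_drop = [idx for pair in links for idx in pair if labels[idx] == majority_label]
print(links, to_drop)
```

Here the majority point at index 4 sits right next to the only minority point, so it is flagged for removal, which cleans the class boundary without touching the bulk of the majority cluster.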
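Penerapan Metode GA-CBU Pada Algoritma Logistic Regression Untuk Mengatasi Class Imbalance Data Beasiswa KIP-Kuliah [Applying the GA-CBU Method to the Logistic Regression Algorithm to Address Class Imbalance in KIP-Kuliah Scholarship Data]
Poernamawan, Ahmad Nugraha; Siswa, Taghfirul Yoga Azhima; Rudiman, Rudiman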
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6747

Abstract

Class imbalance often poses a challenge in data analysis, where the number of instances in the majority class is significantly higher than that in the minority class. This can bias classification models towards predicting the majority class, resulting in low accuracy in identifying the minority class. This research implements the Logistic Regression (LR) algorithm combined with Clustering Based Undersampling (CBU) as an undersampling technique, and with a Genetic Algorithm (GA) for feature selection and optimization, to classify KIP-College scholarship data at Muhammadiyah University of East Kalimantan. The research also evaluates model performance with 10-Fold Cross Validation and a Confusion Matrix, and aims to overcome the class imbalance problem in the data of scholarship recipients (KIP) at Muhammadiyah University of East Kalimantan. The data consists of 1075 records with 37 features related to the socio-economic factors of scholarship recipients. Applying the CBU method increased the accuracy of the Logistic Regression model from 62.51% to 67.68%. Furthermore, the combination of GA and CBU provided more stable results in classifying minority classes. It is hoped that this research can make a significant contribution to the development of a more accurate and efficient scholarship recipient selection system, as well as serve as a reference for future studies in the fields of data mining and machine learning.
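Feature selection with a Genetic Algorithm is commonly framed as a search over bit masks, where each bit says whether a feature is kept. The sketch below shows that general pattern with single-point crossover, one-bit mutation, and elitist selection. The fitness function is a hypothetical stand-in, since a real study would score each mask with cross-validated model accuracy, and none of the parameter values come from the paper.

```python
import random

def genetic_feature_selection(n_features, fitness, generations=20,
                              pop_size=10, seed=0):
    """Evolve bit masks (1 = keep feature) towards higher fitness."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]      # fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)     # single-point crossover
            child = mother[:cut] + father[cut:]
            flip = rng.randrange(n_features)       # mutate one random bit
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Hypothetical fitness: reward three "useful" features, penalize the rest.
USEFUL = {0, 2, 5}

def toy_fitness(mask):
    return sum(1 if i in USEFUL else -1 for i, bit in enumerate(mask) if bit)

best_mask = genetic_feature_selection(n_features=8, fitness=toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Because the fitter half of each generation is carried over unchanged, the best fitness found never decreases from one generation to the next.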
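Penerapan Metode GA-NM Pada Algoritma SVM Untuk Mengatasi Class Imbalance Data Beasiswa KIP-Kuliah [Applying the GA-NM Method to the SVM Algorithm to Address Class Imbalance in KIP-Kuliah Scholarship Data]
Abror, Irfan Fiqry; Siswa, Taghfirul Yoga Azhima; Rudiman, Rudiman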
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6756

Abstract

Class imbalance is a common challenge in data analysis, especially when the number of instances in the majority class significantly exceeds that in the minority class. This imbalance can cause classification models to favor the majority class, resulting in low accuracy in identifying the minority class. In this study, the Support Vector Machine (SVM) method combined with Near Miss and Genetic Algorithm (GA) is used to address the class imbalance problem in the scholarship recipient data of the Kartu Indonesia Pintar (KIP) program at Universitas Muhammadiyah Kalimantan Timur. The dataset consists of 1,075 records with 27 features representing the socio-economic factors of the scholarship recipients. Near Miss was applied to undersample the majority class, producing a more balanced data distribution. Subsequently, the SVM algorithm was utilized as the primary classification model, with feature selection and parameter optimization conducted using GA. The results indicate that the combination of SVM, Near Miss, and GA improved classification performance in identifying the minority class. The initial accuracy obtained without the method was 60.55% and after implementation it increased to 76.88%. This approach not only enhances the overall accuracy of the model but also ensures more stable performance, particularly for the minority class. Therefore, this study is expected to provide a significant contribution to the development of a more accurate and efficient scholarship selection system, as well as serve as a reference for future research in data mining and machine learning.
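Near Miss undersampling keeps the majority samples that lie closest to the minority class. The abstract does not say which Near Miss variant was used, so the sketch below assumes the NearMiss-1 rule: keep the majority points with the smallest mean distance to their k nearest minority points. The data and function name are invented.

```python
import math

def near_miss_1(majority, minority, n_keep, k=2):
    """Keep the n_keep majority points with the smallest mean distance
    to their k nearest minority points (NearMiss-1 rule, assumed here)."""
    def mean_dist_to_nearest_minority(point):
        nearest = sorted(math.dist(point, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)
    return sorted(majority, key=mean_dist_to_nearest_minority)[:n_keep]

# Invented two-dimensional feature vectors.
majority = [(0.0, 0.0), (5.0, 5.0), (1.0, 0.0), (9.0, 9.0)]
minority = [(1.0, 1.0), (2.0, 1.0)]

kept = near_miss_1(majority, minority, n_keep=2)
print(kept)
```

Setting `n_keep` to the minority-class size yields a balanced training set made of the majority samples that are hardest to separate from the minority.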
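Penerapan Metode GA-RU Pada Algoritma Random Forest Untuk Mengatasi Class Imbalance Data Beasiswa KIP-Kuliah [Applying the GA-RU Method to the Random Forest Algorithm to Address Class Imbalance in KIP-Kuliah Scholarship Data]
Rahman, Febrian Nor; Siswa, Taghfirul Azhima Yoga; Rudiman, Rudiman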
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6757

Abstract

Class imbalance is a common challenge in data analysis, where the majority class significantly outnumbers the minority class. This condition causes classification models to lean toward predicting the majority class, resulting in low accuracy in identifying the minority class. This study proposes the application of Genetic Algorithm (GA) combined with Random Undersampling (RU) on the Random Forest algorithm to address class imbalance issues in the dataset of Indonesia Smart Card (KIP) scholarship recipients at Universitas Muhammadiyah Kalimantan Timur. The dataset comprises 1,080 records with 37 features related to the socio-economic factors of the scholarship recipients. After data cleaning, 1,075 records were retained. The results indicate that the Random Undersampling method improved the accuracy of the Random Forest model from 84.27% to 85.06%. Although this improvement appears modest, it is significant as it demonstrates increased model stability in classifying the minority class, which previously had low accuracy. The combination of GA and RU proved effective in enhancing model performance, resulting in more stable classification for the minority class. This study is expected to contribute to the development of more accurate and efficient scholarship selection systems and serve as a reference for research in data mining and machine learning.
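Random Undersampling is the simplest of the balancing methods in this group: it randomly discards majority-class records until every class is as small as the smallest one. The following is a minimal sketch under that assumption, with invented record identifiers and a fixed seed for repeatability.

```python
import random
from collections import defaultdict

def random_undersample(rows, labels, seed=0):
    """Randomly shrink every class to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        # sample() draws without replacement, so no record is duplicated.
        for row in rng.sample(group, target):
            balanced.append((row, label))
    return balanced

# Invented record ids: seven majority records (0) and three minority records (1).
rows = list(range(10))
labels = [0] * 7 + [1] * 3

balanced = random_undersample(rows, labels)
print(balanced)
```

The trade-off is that the discarded majority records carry information the model never sees, which is one reason the gain reported for this method can be modest.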