Journal : Building of Informatics, Technology and Science

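Classification of Cardiovascular-Type Heart Disease Using Adaptive Synthetic Sampling and the Extreme Gradient Boosting Algorithm (original title: Klasifikasi Penyakit Jantung Tipe Kardiovaskular Menggunakan Adaptive Synthetic Sampling dan Algoritma Extreme Gradient Boosting)
Authors : Permana, Acep Handika; Umbara, Fajri Rakhmat; Kasyidi, Fatan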
Building of Informatics, Technology and Science (BITS) Vol 6 No 1 (2024): June 2024
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i1.5421

Abstract

Cardiovascular diseases, such as heart disease and stroke, are conditions that affect the heart and blood vessels. According to the World Health Organization (WHO), 17.9 million deaths worldwide in 2019 were attributable to cardiovascular disease. Early detection is crucial, but diagnosing heart disease is difficult in developing countries because diagnostic tools and medical personnel are in short supply. This study uses the Heart Disease Dataset from Kaggle, consisting of 15 attributes and 4238 records, to develop a heart disease classification model using XGBoost. The research stages are data imputation, data transformation using LabelEncoder, data balancing using ADASYN, data splitting (80% training data, 20% testing data), and hyperparameter tuning with Bayesian Optimization. The XGBoost model with ADASYN performs better, with a ROC-AUC of 0.971 and an accuracy of 0.916, than the model without ADASYN, which has a ROC-AUC of 0.698 and an accuracy of 0.841. These results indicate that ADASYN is effective in improving model performance on this imbalanced dataset. Bayesian Optimization also plays an important role in finding the optimal parameter combination, which can further enhance model performance. The study contributes to the development of early detection methods for cardiovascular heart disease, particularly through the application of the XGBoost classification algorithm.
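The data-balancing step named in this abstract can be illustrated with a minimal, dependency-free sketch of the ADASYN idea: minority samples surrounded by more majority-class neighbours receive more synthetic points, each created by interpolating toward a nearby minority sample. The function name `adasyn_sketch`, its parameters, and the toy data are assumptions made for illustration; this is not the authors' code, and a real pipeline would more likely rely on an established library implementation.

```python
import math
import random


def adasyn_sketch(X, y, minority=1, k=3, seed=0):
    """Return synthetic minority points generated in the style of ADASYN.

    X is a list of numeric feature lists and y the matching labels.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    total_new = n_maj - n_min  # points needed to balance the two classes
    if total_new <= 0 or n_min < 2:
        return []

    def nearest(i, candidates, count):
        # Indices of the `count` candidates closest to sample i, excluding i.
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:count]

    # Difficulty of each minority sample: the share of majority-class
    # points among its k nearest neighbours in the whole dataset.
    ratios = []
    for i in minority_idx:
        neigh = nearest(i, range(len(X)), k)
        ratios.append(sum(y[j] != minority for j in neigh) / len(neigh))

    total = sum(ratios)
    if total == 0:
        weights = [1 / n_min] * n_min  # no hard samples: spread evenly
    else:
        weights = [r / total for r in ratios]

    synthetic = []
    for i, w in zip(minority_idx, weights):
        minority_neigh = nearest(i, minority_idx, k)
        # Rounding per sample means the total is only approximately total_new.
        for _ in range(round(w * total_new)):
            j = rng.choice(minority_neigh)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy usage: six majority points (label 0) and three minority points (label 1).
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [2, 2],
         [4, 4], [4, 5], [1.5, 1.5]]
y_toy = [0, 0, 0, 0, 0, 0, 1, 1, 1]
new_points = adasyn_sketch(X_toy, y_toy, minority=1, k=2)
```

A common practice is to oversample only the training partition so that the test set keeps the original class distribution; the abstract lists balancing before splitting and does not say which order was applied in the experiments.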
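Sentiment Classification to Determine the Political Leanings of X Users Toward the 2024 Indonesian Presidential Candidates Using the IndoBert Method (original title: Klasifikasi Sentimen Untuk Mengetahui Kecenderungan Politik Pengguna X Pada Calon Presiden Indonesia 2024 Menggunakan Metode IndoBert)
Authors : Oktariansyah, Indro Abri; Umbara, Fajri Rakhmat; Kasyidi, Fatan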
Building of Informatics, Technology and Science (BITS) Vol 6 No 2 (2024): September 2024
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i2.5435

Abstract

X has evolved into one of the most popular social media platforms in the world. In Indonesia, X is widely used, especially for discussing the presidential election, which is currently a hot topic. Users hold different views on the candidates, both positive and negative. The large volume of tweets they produce can serve as a data source for processing and analysis. Various methods can be used to analyze and classify sentiment in this data, one of which is BERT. This research conducts sentiment classification using BERT with the IndoBert model. It aims to classify the sentiment of tweets related to the 2024 Indonesian presidential election in order to understand the political inclinations of X users, to evaluate the performance of the IndoBert model in sentiment classification, and to assess the extent to which back translation and synonym augmentation can enhance the model's performance. Data was collected by crawling for the seven days leading up to the election and was manually labeled by annotators. Synonym augmentation and back translation were used to balance the minority classes. The data was divided into 80% training data, 10% test data, and 10% validation data. Classification was performed with a fine-tuned IndoBert model. The results show that IndoBert with synonym augmentation achieved the highest accuracy: 82% in the first experiment and 81% in the second. Back translation reached an accuracy of only 78% in the first experiment and 74% in the second. This indicates that synonym augmentation was more effective in increasing data variation and model performance on the dataset used in this research.
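As an illustration of how synonym augmentation can balance minority classes, the sketch below grows every class to the size of the largest one by adding copies of existing texts with some words swapped for synonyms. The synonym table `TOY_SYNONYMS`, the function names, and the replacement probability are hypothetical; the abstract does not describe the thesaurus or the replacement policy the authors used.

```python
import random

# Hypothetical toy synonym table; a real system would draw on an
# Indonesian thesaurus or lexical database.
TOY_SYNONYMS = {
    "bagus": ["baik", "hebat"],
    "buruk": ["jelek"],
}


def synonym_augment(text, synonyms, rng, p=0.5):
    """Replace each word that has a listed synonym with probability p."""
    out = []
    for word in text.split():
        options = synonyms.get(word.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)


def balance_with_synonyms(samples, synonyms, seed=0):
    """Grow every class to the size of the largest class.

    `samples` is a list of (text, label) pairs. The original samples are
    kept first and in order; augmented copies are appended after them.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(text)
    target = max(len(texts) for texts in by_label.values())

    balanced = list(samples)
    for label, texts in by_label.items():
        for _ in range(target - len(texts)):
            source = rng.choice(texts)
            balanced.append((synonym_augment(source, synonyms, rng), label))
    return balanced


# Toy usage: three positive tweets and one negative tweet.
toy = [("kinerja bagus", "pos"), ("program bagus sekali", "pos"),
       ("hasil baik", "pos"), ("kebijakan buruk", "neg")]
balanced = balance_with_synonyms(toy, TOY_SYNONYMS)
```

Because replacement is random, an augmented copy can come out identical to its source; a stricter variant could retry until at least one word changes.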
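Phishing Website Classification Using the XGBoost Method with the Radial Based Undersampling Data Balancing Technique (original title: Klasifikasi Website Phishing Menggunakan Metode X-Gboost dengan Teknik Penyeimbang Data Radial Based Undersampling)
Authors : Yoga, Yoga; Umbara, Fajri Rakhmat; Sabrina, Puspita Nurul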
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7920

Abstract

Phishing websites are one of the most prevalent forms of cyberattacks and have the potential to cause significant losses, both financially and non-financially. Automatic phishing detection using machine learning algorithms has become an effective solution to address this threat. This study aims to classify phishing websites using the Extreme Gradient Boosting (XGBoost) algorithm and to address the issue of class imbalance by applying the Radial Based Undersampling (RBU) method. In addition, hyperparameter tuning was performed using the Random Search method to optimize the model's performance. The dataset used was obtained from the Kaggle platform and exhibits an imbalanced class distribution, where the number of non-phishing instances far exceeds phishing instances. This imbalance can lead to a biased model and reduce its ability to detect minority class patterns. Based on the evaluation results, the application of RBU significantly improved the model’s capability in detecting phishing instances, while hyperparameter tuning further enhanced its accuracy. The best model was achieved through a combination of RBU and Random Search, reaching an accuracy of 90.39% on the test data. These findings indicate that the combined approach of data balancing and model optimization provides an effective solution for phishing website classification and can be applied to similar cases in the field of cybersecurity.
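The Random Search tuning mentioned in this abstract can be sketched generically: draw parameter combinations at random from a search space, score each one, and keep the best. The search space below uses common gradient-boosting parameter names as an assumption, since the abstract does not list which hyperparameters or ranges were tuned, and `score_fn` stands in for whatever validation procedure a real experiment would use.

```python
import random


def random_search(score_fn, space, n_iter=20, seed=0):
    """Sample n_iter parameter sets from `space` and return the best one.

    `space` maps each parameter name to a function that draws one value
    from a random.Random instance; `score_fn` maps a parameter dict to a
    score where higher is better.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: draw(rng) for name, draw in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space; names and ranges are illustrative only.
SEARCH_SPACE = {
    "max_depth": lambda rng: rng.randint(2, 10),
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, 0),  # log-uniform
    "n_estimators": lambda rng: rng.randrange(100, 1001, 100),
}


def toy_score(params):
    # Stand-in for a cross-validated metric: prefers max_depth near 6.
    return -abs(params["max_depth"] - 6)


best_params, best_score = random_search(toy_score, SEARCH_SPACE, n_iter=50)
```

In practice `score_fn` would train the classifier with the sampled parameters and return a validation metric; learning rates are sampled on a log scale here because useful values often span several orders of magnitude.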
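Sentiment Analysis of Cyberbullying on Twitter (X) Using Improved Word Vectors and BERT (original title: Analisis Sentimen Terhadap Cyberbullying di Twitter (X) Menggunakan Improved Word Vectors dan Bert)
Authors : Nusantara, Madya Dharma; Umbara, Fajri Rakhmat; Sabrina, Puspita Nurul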
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7968

Abstract

Text mining is an important approach to analyzing text data, particularly for detecting negative sentiment such as cyberbullying on social media. Twitter (X), as an open platform, often serves as a space for the proliferation of hate speech and abusive behavior recorded in text form. This study aims to improve the performance of sentiment classification models on Twitter (X) data by combining the Improved Word Vector (IWV) and Bidirectional Encoder Representations from Transformers (BERT) methods, evaluated using precision, recall, and F1-score. The dataset consists of 9,874 Indonesian-language tweets labeled in three categories: Hate Speech (HS), Abusive, and Neutral. The data is sourced from previous research and is the result of re-annotating an original dataset of 13,169 tweets. IWV combines Word2Vec, GloVe, POS tagging, and emotion lexicon features to enrich word representations semantically. Preprocessing consists of tokenization, filtering, stemming/lemmatization, and normalization. The extracted IWV features were then concatenated with BERT embeddings to produce high-dimensional vector representations. The test results showed that the combined IWV+BERT model performed better than BERT alone. Balancing the data also contributed to the improvement, with the highest accuracy reaching 91%. These findings indicate that integrating word representation features from IWV with sentence context from BERT can improve the effectiveness of text mining in sentiment analysis related to cyberbullying on social media.
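The concatenation step can be pictured with a small sketch. It assumes, purely for illustration, that each token's IWV is the concatenation of its Word2Vec vector, GloVe vector, one-hot POS tag, and lexicon scores, that token vectors are mean-pooled into one sentence vector, and that the result is joined to a sentence-level BERT embedding. The abstract does not say whether the authors combined features at token or sentence level, and the function names here are hypothetical.

```python
def build_iwv(word2vec_vec, glove_vec, pos_onehot, lexicon_scores):
    """Concatenate the four feature groups into one token-level vector."""
    return (list(word2vec_vec) + list(glove_vec)
            + list(pos_onehot) + list(lexicon_scores))


def mean_pool(vectors):
    """Average equally sized vectors component by component."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def combine_iwv_bert(iwv_tokens, bert_sentence_vec):
    """Join the pooled IWV sentence vector with a BERT sentence embedding."""
    return mean_pool(iwv_tokens) + list(bert_sentence_vec)


# Toy usage with tiny made-up dimensions; real embeddings have hundreds.
tokens = [
    build_iwv([1.0, 2.0], [3.0], [0, 1], [0.5]),
    build_iwv([3.0, 4.0], [5.0], [1, 0], [1.5]),
]
features = combine_iwv_bert(tokens, [0.1, 0.2, 0.3])
# `features` has 6 IWV dimensions followed by 3 BERT dimensions.
```

The combined vector would then feed a classifier over the three labels; its length is simply the sum of the two input dimensions.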