Journal : JOIV : International Journal on Informatics Visualization

Enhancing Heart Disease Classification: A Comparative Analysis of SMOTE and Naïve Bayes on Imbalanced Data
Wibowo, Jonathan Juliano; Kristiyanti, Dinar Ajeng; Wiratama, Jansen
JOIV : International Journal on Informatics Visualization Vol 9, No 5 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.5.3248

Abstract

Heart disease remains a significant health concern, and early prediction plays a crucial role in improving patient outcomes. This study examines data mining techniques for heart disease classification, with a focus on the Naïve Bayes algorithm. A common challenge in such classification tasks is class imbalance, which can degrade both the performance of the algorithm and the reliability of its evaluation metrics. To address this, we employed the Synthetic Minority Over-sampling Technique (SMOTE). Following the Knowledge Discovery in Databases (KDD) framework, the research proceeded through data selection, pre-processing, transformation, mining, and evaluation stages. We applied SMOTE with the Naïve Bayes algorithm across three data split ratios (70:30, 60:40, and 50:50) and compared performance metrics before and after applying SMOTE. For the first dataset, the 50:50 split ratio showed the largest improvement, with precision increasing from 30.74% to 78.15%, recall from 42.88% to 63.89%, and the Area Under Curve (AUC) from 0.819 to 0.831, although accuracy decreased from 86.82% to 73.01%. For the second dataset, the 70:30 split ratio yielded the largest improvements, with accuracy rising from 95.22% to 97.72%, precision from 96.33% to 99.88%, recall from 51.11% to 95.57%, and AUC from 0.969 to 0.996. These results indicate that SMOTE can substantially improve classification performance in heart disease prediction, particularly in precision, recall, and AUC, with varying effects on accuracy depending on the dataset.
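
The oversampling step described in this abstract can be illustrated with a minimal sketch. It is a simplified, standard-library version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbours), not the authors' implementation or any library's API. The function name `smote`, its parameters, and the toy data points are assumptions made for illustration only.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base inside the minority class (base excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical toy data: 8 majority points and 3 minority points.
majority = [(float(i), 0.0) for i in range(8)]
minority = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5)]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies. A classifier such as Naïve Bayes would then be fitted on the balanced set.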

Comparison of Salp Swarm Algorithm and Particle Swarm Optimization as Feature Selection Techniques for Recession Sentiment Analysis in Indonesia
Kristiyanti, Dinar Ajeng; Sanjaya, Samuel Ady; Irmawati, Irmawati; Ekachandra, Kristian; Suhali, Jason; Hairul Umam, Akhmad
JOIV : International Journal on Informatics Visualization Vol 9, No 5 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.5.3102

Abstract

Amidst global economic uncertainty, this study examines Twitter sentiment about the global recession, with a focus on Indonesia. Conventional machine learning algorithms used for sentiment analysis, such as Naïve Bayes (NB), Support Vector Machine (SVM), and K-Nearest Neighbor (KNN), still perform less than optimally on high-dimensional Twitter data. The purpose of this study is to improve the accuracy of these algorithms by using two basic metaheuristic algorithms, the Salp Swarm Algorithm (SSA) and Particle Swarm Optimization (PSO), for feature selection. Covering January to May 2023, the data captures evolving sentiment in response to economic conditions. Preprocessing includes labeling with the TextBlob and VADER libraries. Performance is compared across labeling techniques, feature selection methods, and classification algorithms. On VADER-labeled data without feature selection, the SVM model achieves an accuracy of 83% and an F1 score of 67%; applying SSA or PSO reduces accuracy by 1%. On TextBlob-labeled data, SVM achieves 80% accuracy and a 77% F1 score, while applying PSO with SVM significantly decreases the model's performance. These findings contribute to understanding sentiment dynamics on social media during economic uncertainty, with SVM emerging as a strong choice for practical sentiment analysis.
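
To make the idea of PSO as a feature selector concrete, the following is a minimal sketch of a binary PSO with a sigmoid transfer function, one common variant. It is not the configuration used in the study. The function `binary_pso`, the coefficient values, and the toy fitness function are all assumptions; in a real pipeline the fitness would typically be the validation score of a classifier trained on the selected features.

```python
import math
import random

def binary_pso(fitness, n_features, n_particles=10, n_iter=30, seed=0):
    """Search for a 0/1 feature mask that maximises fitness(mask)."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (illustrative)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Sigmoid transfer: velocity sets the probability of keeping feature d.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Hypothetical stand-in for classifier accuracy: features 0 and 2 are
# "informative", and every selected feature carries a small cost.
INFORMATIVE = (0, 2)

def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)

best_mask, best_fit = binary_pso(toy_fitness, n_features=6)
print(best_mask, best_fit)
```

Feature selection of this kind reduces dimensionality, but it does not guarantee a better score: a mask that discards useful terms can lower accuracy, which is consistent with the small decreases reported in the abstract.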

A Comparative Analysis of Building Hidden Layer, Activation Function, and Optimizer on Neural Network Sentiment Analysis
Sanjaya, Samuel Ady; Kristiyanti, Dinar Ajeng; Irmawati, Irmawati; Hadinata, Faustine Ilone; Karaeng, Cristin Natalia
JOIV : International Journal on Informatics Visualization Vol 9, No 3 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.3.2906

Abstract

The increasing diversity of opinions on social media offers a rich source for sentiment analysis, especially on controversial issues like the potential recession in Indonesia. This study examines social media sentiment using three Deep Learning methods: Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN). The main objective is to configure key hyperparameters, including the number of hidden layers, activation functions, and optimizers, to optimize performance. A dataset of 38,000 cleaned Twitter posts was used. Preprocessing included case folding to standardize text, punctuation removal to eliminate noise, stemming to reduce words to their root forms, and sentiment labeling with VADER and BERT. Each deep learning model was trained under a range of configurations, with activation functions such as Sigmoid and Swish and optimizers such as Adam. Among the models, the CNN configured with 15 hidden layers, a Sigmoid activation function, and the Adam optimizer outperformed the others, achieving the highest accuracy of 0.870 and a low loss of 0.316. The results highlight that while the number of hidden layers influences model performance, the choice of activation function and optimizer has a more significant impact on accuracy. The findings suggest that future work aiming to improve sentiment analysis performance should prioritize activation functions and optimizers over the number of hidden layers.
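
As a small illustration of the hyperparameters compared in this abstract, the sketch below defines the Sigmoid and Swish activation functions and enumerates a configuration grid. Only Sigmoid, Swish, Adam, and the 15-layer setting come from the abstract. The other grid values (5 and 10 hidden layers, the "sgd" optimizer) are assumptions added to make the example complete, and no model is trained here.

```python
import itertools
import math

def sigmoid(x):
    """Logistic function: maps any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def swish(x):
    """Swish activation: the input scaled by its own sigmoid."""
    return x * sigmoid(x)

# Hypothetical search grid; each combination would be one training run.
hidden_layers = [5, 10, 15]
activations = {"sigmoid": sigmoid, "swish": swish}
optimizers = ["adam", "sgd"]

configs = [
    {"hidden_layers": n, "activation": act, "optimizer": opt}
    for n, act, opt in itertools.product(hidden_layers, activations, optimizers)
]

print(len(configs), "configurations")
for x in (-2.0, 0.0, 2.0):
    print(x, round(sigmoid(x), 4), round(swish(x), 4))
```

Sigmoid saturates for large positive or negative inputs, while Swish is unbounded above and slightly negative for small negative inputs, so the two can behave quite differently when stacked across many layers.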