Articles

Journal : Journal of Electronics, Electromedical Engineering, and Medical Informatics

LSTM and Bi-LSTM Models For Identifying Natural Disasters Reports From Social Media
Yunida, Rahmi; Faisal, Mohammad Reza; Muliadi; Indriani, Fatma; Abadi, Friska; Budiman, Irwan; Prastya, Septyan Eka
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 5 No 4 (2023): October
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v5i4.319

Abstract

Natural disasters cause significant losses, primarily environmental and property damage and, in the worst cases, loss of life. During some disasters, social media, especially platforms such as Twitter, has served as the fastest channel for informing many people. Text mining can be used to categorize this information accurately. This study implements two combinations, word2vec with LSTM and word2vec with Bi-LSTM, to determine which is more accurate in a case study of news related to disaster events. Word2vec serves as the feature extraction method, transforming text into vectors for the classification stage, while LSTM and Bi-LSTM classify the resulting vectors. The experiments show an accuracy of 70.67% for word2vec + LSTM and 72.17% for word2vec + Bi-LSTM, an improvement of 1.5 percentage points for the Bi-LSTM combination. The research is significant in comparing the performance of the two combinations to determine which performs best when classifying data related to earthquake disasters. The study also offers insights into the parameters of word2vec, LSTM, and Bi-LSTM that researchers can set.

Implementation of Random Forest and Extreme Gradient Boosting in the Classification of Heart Disease using Particle Swarm Optimization Feature Selection
Ansyari, Muhammad Ridho; Mazdadi, Muhammad Itqan; Indriani, Fatma; Kartini, Dwi; Saragih, Triando Hamonangan
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 5 No 4 (2023): October
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v5i4.322

Abstract

Heart disease is the leading cause of death worldwide. Based on available data, over 36 million people have died from non-communicable diseases, a category that includes heart disease. This research uses a heart disease dataset from the UCI Repository, consisting of 303 instances and 14 categorical features. The data were analyzed with the classification methods XGBoost (Extreme Gradient Boosting) and Random Forest, with PSO (Particle Swarm Optimization) applied as a feature selection technique to address irrelevant features, which can degrade prediction performance on the heart disease dataset. Without PSO, the values obtained were 0.877 for XGBoost and 0.874 for Random Forest. With PSO, the AUC values obtained were 0.913 for XGBoost and 0.918 for Random Forest. These results demonstrate that PSO can improve the AUC of heart disease prediction. This research therefore contributes to more precise and efficient processing of heart disease patient data, which benefits diagnosis in terms of speed and accuracy.

Sentiment Analysis of TikTok Shop Closure in Indonesia on Twitter Using Supervised Machine Learning
Al Habesyah, Noor Zalekha; Herteno, Rudy; Indriani, Fatma; Budiman, Irwan; Kartini, Dwi
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 2 (2024): April
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i2.381

Abstract

TikTok Shop is a feature of the TikTok application that lets users buy and sell products. Its integration with social media has provided new opportunities to reach customers and increase sales. However, the closure of TikTok Shop has caused controversy among the public. This study aims to analyze the views and responses of TikTok users in Indonesia to the closure of TikTok Shop. The dataset was obtained from Twitter. The research methodology consists of labeling, oversampling, splitting, and machine learning, which includes SVM, Random Forest, Decision Tree, and Deep Learning (H2O). The contribution of this research is to enrich our understanding of the implementation of machine learning, especially in sentiment analysis of the TikTok Shop closure. In the tests, Deep Learning (H2O) obtained an AUC of 0.900 with SMOTE and 0.867 without it; SVM obtained 0.885 with SMOTE and 0.881 without; Random Forest obtained 0.822 with SMOTE and 0.830 without; and Decision Tree obtained 0.59 with SMOTE and 0.646 without. Deep Learning (H2O) with SMOTE performs better than SVM, Random Forest, and Decision Tree. With an AUC of 0.900, Deep Learning (H2O) can be said to perform excellently for sentiment analysis of the TikTok Shop closure. This research has significant implications for social electronic commerce due to its potential use by social media analysts.
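
Several abstracts on this page compare models by AUC. As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of positive/negative pairs in which the positive example receives the higher score. The function name `auc_from_scores` and the toy labels and scores are hypothetical and are not taken from any of the listed studies.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and classifier scores, for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of examples; it is kept here because it mirrors the definition directly, not because it is how a library would compute the value.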

Implementation of C5.0 Algorithm using Chi-Square Feature Selection for Early Detection of Hepatitis C Disease
Mahmud, Mahmud; Budiman, Irwan; Indriani, Fatma; Kartini, Dwi; Faisal, Mohammad Reza; Rozaq, Hasri Akbar Awal; Yildiz, Oktay; Caesarendra, Wahyu
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 2 (2024): April
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i2.384

Abstract

Hepatitis C, a significant global health challenge, affects 71 million people worldwide and can lead to severe complications such as cirrhosis and hepatocellular carcinoma. Despite its prevalence and the availability of rapid diagnostic tests (RDTs), the need for accurate early detection methods remains critical. This research aims to enhance hepatitis C virus classification accuracy by integrating the C5.0 algorithm with Chi-Square feature selection, addressing the limitations of current diagnostic approaches and potentially reducing diagnostic errors. It develops a machine learning model for hepatitis C prediction using a publicly available dataset from Kaggle. The workflow covers label encoding, handling of missing values, normalization, feature selection, model development, and evaluation to ensure the model's efficacy and accuracy in diagnosing hepatitis C. The findings show that Chi-Square feature selection significantly enhances the effectiveness of machine learning algorithms. Specifically, the combination of the C5.0 algorithm and Chi-Square feature selection yielded an accuracy of 96.75%, surpassing previous research benchmarks. This highlights the synergy between feature selection techniques and machine learning algorithms in improving diagnostic precision. The study demonstrates that machine learning is an effective tool for detecting hepatitis C, with the potential to enhance diagnostic accuracy significantly. As a future recommendation, adopting AutoML is suggested to periodically automate the selection of the optimal algorithm, promising further improvements in detection capabilities.
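
The abstract above does not say how the Chi-Square scores were computed, so the following is only a minimal sketch under one common assumption: each categorical feature is scored with the Pearson chi-square statistic of its contingency table against the class label, and features are ranked by that score. The names `chi_square_score` and `rank_features` and the toy columns are hypothetical.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Pearson chi-square statistic between one categorical feature and the label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    score = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n  # count expected under independence
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


def rank_features(columns, labels):
    """Return feature names ordered from most to least associated with the label."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: feature_a tracks the label exactly, feature_b is independent of it.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_columns = {
    "feature_a": [1, 1, 1, 1, 0, 0, 0, 0],
    "feature_b": [1, 0, 1, 0, 1, 0, 1, 0],
}
toy_ranking = rank_features(toy_columns, toy_labels)
```

In a selection step, one would typically keep only the top-ranked features before training the classifier; how many to keep is a tuning choice the abstract does not specify.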

Gender Classification on Social Media Messages Using fastText Feature Extraction and Long Short-Term Memory
Sa’diah, Halimatus; Faisal, Mohammad Reza; Farmadi, Andi; Abadi, Friska; Indriani, Fatma; Alkaff, Muhammad; Abdullayev, Vugar
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 3 (2024): July
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i3.407

Abstract

Social media is currently used as a platform for interacting with many people and has also become a source of information for social media researchers and analysts. Twitter is one of the platforms commonly used for research, especially for data from tweets written by individuals. However, user information such as gender is not explicitly displayed in Twitter account profiles, even though a large amount of unstructured information containing such data exists and often goes unnoticed. This research aims to classify gender based on tweet data and account description data and to determine the accuracy of gender classification using machine learning methods. The method uses fastText for feature extraction and LSTM for classification of the extracted data. To find the most accurate configuration, classification is performed on tweet data, account description data, and a combination of both. The results show that LSTM classification on account description data and on combined data obtained an accuracy of 70%, while classification on tweet data achieved 69%. This research concludes that fastText feature extraction with LSTM classification can be implemented for gender classification, although there is no significant difference in accuracy among the datasets. The research nonetheless demonstrates that the two methods work well together.

Application Of SMOTE To Address Class Imbalance In Diabetes Disease Classification Utilizing C5.0, Random Forest, And SVM
M. Khairul Rezki; Mazdadi, Muhammad Itqan; Indriani, Fatma; Muliadi, Muliadi; Saragih, Triando Hamonangan; Athavale, Vijay Annant
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 4 (2024): October
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i4.434

Abstract

Applying SMOTE to tackle class imbalance in classification frequently yields suboptimal outcomes, owing to the complexity of the dataset and the large number of attributes involved. Consequently, several classification models were explored experimentally to gauge their precision. This research compares the precision of the C5.0, Random Forest, and SVM classification models with and without SMOTE. The methodology covers dataset selection, an overview of the classification algorithms (C5.0, Random Forest, SVM), the SMOTE technique, split validation, preprocessing with min-max normalization, and performance evaluation using confusion matrices and AUC analysis. The diabetes dataset was sourced from Kaggle and consists of 768 instances, with 268 diabetic samples and 500 non-diabetic samples; SMOTE is applied to rectify this class imbalance. Before SMOTE, the classification precision for C5.0, Random Forest, and SVM was 0.714, 0.733, and 0.746 respectively, with corresponding AUC values of 0.745, 0.824, and 0.799. After SMOTE, the precision values for the same techniques were 0.603, 0.727, and 0.727, with AUC values of 0.734, 0.831, and 0.794 respectively. It can be inferred that SMOTE has minimal impact across the three classification models, possibly because overfitting and excessive reliance on synthesized minority-class data lower model performance, precision, and AUC scores.
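
To make the preprocessing steps named in this abstract concrete, here is a minimal pure-Python sketch of min-max normalization and of the core SMOTE idea: each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. It is an illustration under stated assumptions, not the authors' pipeline. The function names, the choice of `k`, the seed, and the toy points are hypothetical; only the class counts (268 and 500) come from the abstract.

```python
import math
import random


def min_max_normalize(rows):
    """Rescale every column of a list of numeric rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by interpolating towards near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen point, excluding the point itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Class counts reported in the abstract: 268 diabetic and 500 non-diabetic samples.
samples_needed = 500 - 268  # synthetic minority samples needed for a 1:1 balance

# Hypothetical, already normalized minority points, for illustration only.
toy_minority = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.3, 0.25]]
toy_synthetic = smote(toy_minority, n_new=6, k=2, seed=42)
```

Because every synthetic point is a convex combination of two existing minority points, oversampling adds no information beyond the minority region already observed, which is consistent with the limited effect the abstract reports.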

Comparison of the Adaboost Method and the Extreme Learning Machine Method in Predicting Heart Failure
Muhammad Nadim Mubaarok; Triando Hamonangan Saragih; Muliadi; Fatma Indriani; Andi Farmadi; Rizal, Achmad
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 3 (2024): July
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i3.440

Abstract

Heart disease, which is classified as a non-communicable disease, is the main cause of death every year. Expert involvement is considered essential in diagnosing heart disease, given its complex nature and potential severity. Machine learning algorithms have emerged as powerful tools for predicting and detecting heart disease, reducing the challenges associated with diagnosis. Notable examples include the Extreme Learning Machine and Adaptive Boosting, both of which are machine learning techniques adapted for classification. This research introduces an approach that relies on tuning a single parameter. Careful optimization of algorithm parameters markedly improves prediction accuracy, which underscores the importance of parameter tuning in this domain. The Heart Failure dataset serves as the focal point, with the aim of demonstrating the level of accuracy achievable with machine learning algorithms. The results show an average accuracy, with standard deviation, of 0.83 ± 0.02 for the Extreme Learning Machine and 0.87 ± 0.03 for Adaptive Boosting, highlighting the efficacy of these algorithms for heart disease prediction. In particular, adding the Learning Rate parameter to Adaboost gives better results than the previous algorithm. The findings underline the strength of the Extreme Learning Machine and Adaptive Boosting: adding a single parameter increases accuracy compared with previous research that used the standard methods alone.

A Comparative Study: Application of Principal Component Analysis and Recursive Feature Elimination in Machine Learning for Stroke Prediction
Hermiati, Arya Syifa; Herteno, Rudy; Indriani, Fatma; Saragih, Triando Hamonangan; Muliadi; Triwiyanto, Triwiyanto
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 3 (2024): July
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i3.446

Abstract

Stroke is a disease that occurs in the brain and can cause both focal and global brain dysfunction. Stroke research mainly aims to predict risk and mortality. Machine learning can be used to diagnose and predict diseases in healthcare, including stroke. However, medical record data collected for disease prediction usually contain considerable noise because not all variables are important and relevant to the prediction process. Dimensionality reduction is therefore essential to remove noisy (i.e., irrelevant) and redundant features. This study aims to predict stroke using Recursive Feature Elimination (RFE) as feature selection, Principal Component Analysis (PCA) as feature extraction, and a combination of the two. The dataset used is the stroke prediction dataset from Kaggle. The research methodology consists of pre-processing, SMOTE, 10-fold Cross-Validation, feature selection, feature extraction, and machine learning, which includes SVM, Random Forest, Naive Bayes, and Linear Discriminant Analysis (LDA). SVM and Random Forest achieve their highest accuracy, 0.8775 and 0.9511 respectively, without PCA and RFE. Naive Bayes achieves its highest accuracy, 0.7685, when PCA with a selection of 20 features is followed by RFE with a selection of 5 features. LDA achieves its highest accuracy, 0.7963, with 20 features from feature selection followed by feature extraction. The study concludes that SVM and Random Forest obtain the highest accuracy without PCA and RFE, while Naive Bayes and LDA perform better with a combination of PCA and RFE. The implication of this research is a clearer understanding of how RFE and PCA affect machine learning models for stroke prediction.
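
The 10-fold Cross-Validation step mentioned in this abstract can be sketched as follows. This is a generic illustration of how sample indices are partitioned, assuming a simple shuffled split without stratification. The function name `k_fold_indices`, the seed, and the sample count are hypothetical, and the abstract does not state which variant the authors used.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical example: 25 samples split into 10 folds.
toy_splits = list(k_fold_indices(25, k=10, seed=1))
```

Each sample appears in exactly one test fold, so averaging a metric over the ten folds uses every sample for evaluation once.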

Analysis of Important Features in Software Defect Prediction Using Synthetic Minority Oversampling Techniques (SMOTE), Recursive Feature Elimination (RFE) and Random Forest
Ghinaya, Helma; Herteno, Rudy; Faisal, Mohammad Reza; Farmadi, Andi; Indriani, Fatma
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 3 (2024): July
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i3.453

Abstract

Software Defect Prediction (SDP) is essential for improving software quality during testing. As software systems grow more complex, accurately predicting defects becomes increasingly challenging. One challenge is imbalanced class distributions, where the number of defective instances is significantly lower than the number of non-defective ones. This study uses the SMOTE technique to tackle the class imbalance. Random Forest is chosen as the classification algorithm for its ability to handle non-linear data, its resistance to overfitting, and its ability to provide information about feature importance in classification. This research aims to evaluate important features and measure accuracy in SDP using the SMOTE+RFE+Random Forest technique. The dataset used in this study is NASA MDP D", which includes 12 datasets. The study is conducted in two stages. The first stage uses RFE+Random Forest; the second stage adds SMOTE before RFE and Random Forest to measure accuracy on the NASA MDP data. The results show that SMOTE enhances accuracy across most datasets, with the best performance achieved on the MC1 dataset at an accuracy of 0.9998. Feature importance analysis identifies "maintenance severity" and "cyclomatic density" as the most crucial features in data modeling for SDP. Therefore, the SMOTE+RFE+RF technique effectively improves prediction accuracy across various datasets and successfully addresses class imbalance issues.

A Classification of Appendicitis Disease in Children Using SVM with KNN Imputation and SMOTE Approach
Difa Fitria; Triando Hamonangan Saragih; Muliadi; Dwi Kartini; Fatma Indriani
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 3 (2024): July
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i3.470

Abstract

This study evaluates the effect of the Synthetic Minority Over-sampling Technique (SMOTE) and K-Nearest Neighbors (KNN) imputation on the performance of Support Vector Machine (SVM) classification models on a nearly balanced dataset. The results show that using SMOTE increases model precision but decreases recall. Balancing the dataset with SMOTE decreased classification accuracy from 87.26% to 85.99%, while AUC-ROC improved slightly, from 85.96% to 88.04%. This shows the importance of careful consideration when choosing data processing strategies to achieve optimal classification model performance.
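
The abstract above reports precision, recall, and accuracy moving in different directions. The sketch below shows how these three metrics are derived from the same confusion-matrix counts, which is why one can rise while another falls. The function name `classification_metrics` and the toy labels are hypothetical and do not reproduce the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision: share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Recall: share of true positives that the model found.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels and predictions, for illustration only.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)
```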