Articles

Word Embedding Feature for Improvement Machine Learning Performance in Sentiment Analysis Disney Plus Hotstar Comments
Jasmir, Jasmir; Nurhadi, Nurhadi; Rohaini, Eni; Pahlevi B, M Riza; Pardamean Simanjuntak, Daniel Sintong
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol. 10 No. 2 (2024): June
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v10i2.28799

Abstract

In this research, we apply several machine learning methods and word embedding features to social media data, specifically user comments on the Disney Plus Hotstar application. The word embedding features are Word2Vec, GloVe, and FastText. Our aim is to evaluate the impact of these features on the classification performance of Naive Bayes (NB), K-Nearest Neighbor (KNN), and Random Forest (RF). NB is simple and efficient but very sensitive to feature selection. KNN has known weaknesses: bias from the choice of k, high computational cost, memory limitations, and a failure to discount irrelevant attributes. RF's weakness is that its evaluation scores can change significantly with only a slight change in the data. Feature selection in text classification is crucial for enhancing scalability, efficiency, and accuracy. Our test results indicate that KNN achieved the highest accuracy both before and after feature selection. The FastText feature led to the highest performance for KNN, yielding balanced accuracy, precision, recall, and F1-score values.
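As a rough illustration of how word embeddings can feed a KNN classifier, the sketch below averages per-word vectors into one comment vector and classifies it by majority vote among its nearest neighbours. The two-dimensional `toy_embeddings`, the averaging step, the Euclidean distance, and `k=3` are all assumptions made for the example; the abstract does not state how the authors pooled Word2Vec, GloVe, or FastText vectors or which settings they used.

```python
import math
from collections import Counter

def average_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors closest to the query."""
    ranked = sorted(
        (math.dist(query, vec), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-dimensional embeddings, for illustration only.
toy_embeddings = {
    "great": [0.9, 0.1],
    "love": [0.8, 0.2],
    "bad": [0.1, 0.9],
    "slow": [0.2, 0.8],
}
toy_comments = [["great", "love"], ["love"], ["bad", "slow"], ["slow"]]
toy_labels = ["positive", "positive", "negative", "negative"]
toy_vectors = [average_vector(c, toy_embeddings, 2) for c in toy_comments]

new_comment = average_vector(["great"], toy_embeddings, 2)
prediction = knn_predict(new_comment, toy_vectors, toy_labels, k=3)
```

With real pretrained embeddings the vectors would have hundreds of dimensions, but the pooling and voting steps would look the same.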

Comparative Analysis of Optimizer Effectiveness in GRU and CNN-GRU Models for Airport Traffic Prediction
Riyadi, Willy; Jasmir, Jasmir
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol. 10 No. 3 (2024): September
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v10i3.29659

Abstract

The COVID-19 pandemic has posed significant challenges to airport traffic management, necessitating accurate predictive models. This research evaluates the effectiveness of various optimizers in enhancing airport traffic prediction using Deep Learning models, specifically Gated Recurrent Units (GRU) and Convolutional Neural Network-Gated Recurrent Units (CNN-GRU). We compare the performance of optimizers including RMSprop, Adam, Nadam, AdamW, Adamax, and Lion, and analyze the impact of their parameter tuning on model accuracy. Time series data from airports in the United States, Canada, Chile, and Australia were used, with preprocessing steps like filtering, cleaning, and applying a MinMax Scaler. The data was split into 80% for training and 20% for testing. Our findings reveal that the Adam optimizer paired with the GRU model achieved the lowest Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) in the USA. The study underscores the importance of selecting and tuning optimizers, with ReduceLROnPlateau used to adjust the learning rate dynamically, preventing overfitting and improving model convergence. However, limitations include dataset imbalance and region-specific results, which may affect the generalizability of the findings. Future research should address these limitations by developing balanced datasets and exploring optimizer performance across a broader range of regions and conditions. This study lays the groundwork for further investigating sustainable and accurate airport traffic prediction models.
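The preprocessing and evaluation steps named in this abstract (MinMax scaling, an 80/20 split, MAE, and MAPE) can be sketched in plain Python as follows. Two details are assumptions not stated in the abstract: the split is taken chronologically, and the scaler bounds are fitted on the training portion only. The function names and the sample series are hypothetical.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series into leading train and trailing test parts."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

def min_max_scale(values, lo, hi):
    """Map values linearly so that lo -> 0.0 and hi -> 1.0."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical daily traffic values, for illustration only.
traffic = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
train, test = chronological_split(traffic)

# Assumption: fit the scaling bounds on the training part to avoid leakage.
lo, hi = min(train), max(train)
train_scaled = min_max_scale(train, lo, hi)
test_scaled = min_max_scale(test, lo, hi)
```

Because the bounds come from the training data, scaled test values can fall outside the range 0 to 1, which is expected for a series with a trend.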

PATTERN CLASSIFICATION SIGN LANGUAGE USING FEATURES DESCRIPTORS AND MACHINE LEARNING
Nurhadi, Nurhadi; Winanto, Eko Arip; Said, Rahaini Mohd; Jasmir, Jasmir; Afuan, Lasmedi
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 2 (2024): JUTIF Volume 5, Number 2, April 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.2.1228

Abstract

Sign language is a way of communication for the deaf and speech impaired. In Indonesia, the standardized sign language incorporates American Sign Language (ASL). ASL is employed for various communication needs, ranging from basic alphanumeric fingerspelling (A-Z and numbers) to the more complex SIBI form (comprising gesture vocabulary) in everyday interactions as well as formal contexts. The surge in the digitization of sign language underscores the ongoing advancements in research and development. The challenge in this research lies in recognizing ASL with diverse intensities and invariant backgrounds. The study therefore focuses on comparing segmentation methods suitable for multi-intensity ASL cases. Subsequently, global feature descriptor methods, including Color Histogram, Hu Moments, and Haralick Texture techniques, are applied for feature extraction. Logistic Regression is then compared with Random Forest for accuracy and suitability in identifying ASL fingerspelling. The predictive value of Logistic Regression is 48%, with class Y having the highest precision (0.86), class V having the lowest accuracy (0.16), and class L having the highest recall (0.73). The predictive value of Random Forest is 86%: the maximum precision is 1.00 in classes B, F, H, I, K, Y, and Z and the lowest is 0.58 in class U, while the highest recall is 1.00 in class G and the lowest is in class V. Class H has the greatest F1 score (0.99), while class U has the lowest (0.64). Of the two methods compared in the paper, Random Forest performs better.
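To make the idea of a global feature descriptor concrete, here is a minimal sketch of a colour histogram computed over a list of RGB pixels. It is a simplification chosen for the example: each channel is binned independently and the three histograms are concatenated, which may differ from the histogram variant the authors used. The function name, the `bins` parameter, and the pixel values are hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Return a normalized per-channel histogram of length 3 * bins.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    """
    pixels = list(pixels)
    histogram = [0] * (3 * bins)
    bin_width = 256 / bins
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp so that the value 255 falls in the last bin.
            index = min(int(value / bin_width), bins - 1)
            histogram[channel * bins + index] += 1
    count = len(pixels)
    if count == 0:
        return [0.0] * len(histogram)
    return [c / count for c in histogram]

# Two hypothetical pixels: pure black and pure white.
features = color_histogram([(0, 0, 0), (255, 255, 255)], bins=2)
```

The descriptor has a fixed length regardless of image size, which is what lets it serve as input to classifiers such as Logistic Regression or Random Forest.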

Comparison and Data Visualization in Thyroid Cancer Disease Prediction Using Machine Learning Algorithms
Yudha, M. Zahran; Jasmir, Jasmir; Fachruddin, Fachruddin
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 6 No. 1 (2026): MALCOM January 2026
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v6i1.2249

Abstract

Thyroid cancer is a common endocrine malignancy requiring accurate early prediction for improved patient outcomes. Comprehensive comparative studies of machine learning algorithms, accompanied by systematic visualization, remain limited. This study compares tree-based algorithms (Decision Trees, Random Forest) and boosting algorithms (Gradient Boosting, XGBoost) for thyroid cancer prediction and develops visualization strategies for clinical interpretation. Four algorithms were evaluated using accuracy (correct prediction proportion), precision (positive predictive value), recall (true positive rate), F1-score (harmonic mean of precision and recall), and AUC-ROC (area under the ROC curve). Visualization techniques, including confusion matrices, ROC curves, and feature importance plots, facilitated the interpretation of the model. XGBoost achieved superior performance with accuracy 95.2%, precision 94.8%, recall 95.6%, F1-score 95.2%, and AUC-ROC 0.978, followed by Random Forest (93.5%, 92.7%, 94.1%, 93.4%, 0.965), Gradient Boosting (91.8%, 90.9%, 92.4%, 91.6%, 0.952), and Decision Trees (87.3%, 86.5%, 88.2%, 87.3%, 0.913). Feature importance analysis identified key predictors. Boosting algorithms, particularly XGBoost, demonstrate superior thyroid cancer prediction across all metrics. Integrated visualization enhances clinical interpretability, providing empirical guidance for implementing machine learning-based diagnostic support systems.
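The evaluation metrics defined in this abstract can be computed directly from binary confusion counts and prediction scores, as in the sketch below. The function names and sample numbers are illustrative only and are unrelated to the study's data. AUC-ROC is computed here by pairwise comparison of positive and negative scores, which equals the area under the ROC curve but is only practical for small samples.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def roc_auc(labels, scores):
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Illustrative counts and scores, not taken from the study.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```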