Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Implementation of the K-Nearest Neighbors (KNN) Regressor Method to Predict Toyota Used Car Prices
Authors : Ghaisani, Mauhiba Salmaa; Baita, Anna
Journal of Applied Informatics and Computing Vol. 9 No. 1 (2025): February 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i1.8860

Abstract

Indonesia's automotive industry has grown significantly in recent decades, especially in the used car segment. Toyota is among the used car brands in highest demand because of its reputation for reliability and quality. Sellers and buyers of used cars, however, often struggle to set prices accurately. Mispricing hurts one party or the other: prices that are too high slow the turnover of goods in the market, while prices that are too low cause sellers to incur losses. This study aims to find a well-performing approach to pricing used Toyota cars using a machine learning method, the K-Nearest Neighbors (KNN) Regressor. KNN can be used for both classification and regression; it is a simple algorithm that predicts from the proximity of a new sample to existing data. The study uses the selected relevant features model, year, kilometer, tax, mpg, and cc. With a 90:10 data split and k = 1, the model obtained MAE = 3.31686, MSE = 26.43640, RMSE = 5.14163, and an R2 score of 0.99501. These results indicate that the KNN Regressor is an effective method for predicting the price of used Toyota cars, giving a fairly accurate estimate with a low error rate.
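As an illustration of the technique named in this abstract, the following is a minimal pure-Python sketch of KNN regression together with the four reported metrics. The toy rows, the two features, and the function names `knn_predict` and `regression_metrics` are assumptions made for the example; this is not the authors' code or data, and features are left unscaled for brevity.

```python
import math

def knn_predict(train_X, train_y, query, k=1):
    """Predict as the mean target of the k training rows nearest to query."""
    # Euclidean distance from the query to every training row
    distances = [
        (math.dist(row, query), target)
        for row, target in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(target for _, target in nearest) / len(nearest)

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - sum(e * e for e in errors) / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Hypothetical toy data: (year, kilometres in thousands) -> price
train_X = [(2015, 80), (2017, 50), (2019, 30), (2021, 10)]
train_y = [100.0, 130.0, 160.0, 200.0]
test_X = [(2016, 70), (2020, 15)]
test_y = [110.0, 190.0]

predictions = [knn_predict(train_X, train_y, q, k=1) for q in test_X]
metrics = regression_metrics(test_y, predictions)
print(predictions, metrics)
```

With k = 1 each prediction simply copies the price of the single closest training row, which is why the choice of features and their scales matters for this method.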
Optimization of Decision Tree Algorithm for Chronic Kidney Disease Classification Based on Particle Swarm Optimization (PSO)
Authors : Aulia Fitri, Laili; Baita, Anna
Journal of Applied Informatics and Computing Vol. 9 No. 1 (2025): February 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i1.8940

Abstract

The kidneys are among the body's most vital organs. They are responsible for maintaining acid and alkaline balance, regulating blood pressure, and filtering blood to prevent the accumulation of metabolic waste in the body. Chronic kidney disease does not always show symptoms and signs, yet it can progress to kidney failure. Algorithm-based predictive methods in data processing show great potential in the health field for predicting various diseases, kidney disease among them. Classification is one of the techniques in data mining, and Decision Tree is a classification algorithm often used to detect diseases. This study combines Decision Tree with Particle Swarm Optimization (PSO), which is used to select relevant features, with the aim of addressing weaknesses in the model and producing more accurate predictions. The results show that feature selection with PSO improves the accuracy and performance of the Decision Tree algorithm in chronic kidney disease classification: Decision Tree with PSO feature selection reaches an accuracy of 0.967 (96.7%), compared with 0.95 (95%) for Decision Tree without it. This shows that PSO is effective in selecting relevant features and thereby improves model performance.
Comparison of Support Vector Machine and Decision Tree Algorithm Performance with Undersampling Approach in Predicting Heart Disease Based on Lifestyle
Authors : Febriyanti, Gusti Ayu Putu; Baita, Anna
Journal of Applied Informatics and Computing Vol. 9 No. 2 (2025): April 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i2.8941

Abstract

Heart disease is one of the leading causes of death in the world, with risk factors such as atherosclerosis, high blood pressure, and smoking. Early diagnosis is essential to reduce mortality and improve patients' quality of life. This study evaluates the performance of two machine learning algorithms, Support Vector Machine (SVM) and Decision Tree (DT), in predicting heart disease risk, applying undersampling to handle class imbalance. K-fold cross-validation with K = 10 and hyperparameter tuning were applied to obtain the best performance from both models. SVM achieved 92% accuracy without undersampling and 76% with it; DT achieved 91% accuracy without undersampling and 75% with it. Undersampling improved the balance in recognizing minority classes, although it reduced overall accuracy. The authors conclude that SVM is more reliable for predicting heart disease in datasets with an unbalanced class distribution.
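The undersampling step mentioned in this abstract can be sketched as follows. This is a minimal pure-Python illustration of random undersampling, assuming the goal is to shrink every class to the size of the smallest one; the function name `random_undersample`, the seed, and the toy labels are hypothetical and do not come from the study.

```python
import random
from collections import Counter, defaultdict

def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    # Every class is reduced to the size of the smallest class
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# Hypothetical imbalanced data: eight rows of class 0, two rows of class 1
rows = list(range(10))
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_undersample(rows, labels)
class_counts = Counter(balanced_labels)
print(class_counts)
```

Because majority-class rows are discarded, the model sees fewer examples overall, which is consistent with the trade-off the abstract reports between minority-class recognition and overall accuracy.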
Comparison of Support Vector Machine (SVM) and Random Forest (RF) Algorithm Performance with Random Undersampling Technique to Predict Gestational Diabetes Mellitus Risk
Authors : Damayanti, Annisa; Baita, Anna
Journal of Applied Informatics and Computing Vol. 9 No. 2 (2025): April 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i2.9009

Abstract

Gestational Diabetes Mellitus (GDM) is a condition of glucose intolerance that develops during pregnancy and lasts until birth, characterized by an abnormal increase in blood sugar levels. Accurate early diagnosis is very important because it provides information that can accelerate treatment and reduce complications in the mother and baby. Machine learning methods that can be used to predict GDM include the Support Vector Machine (SVM) and Random Forest (RF) algorithms. This study compares and evaluates GDM prediction models built with SVM and RF, balancing the target data with random undersampling. Random undersampling increased accuracy by 18% relative to the accuracy obtained without it. The SVM model is tuned over the kernel, C (cost), and gamma parameters, while the RF model is tuned with a scoring metric and four other parameters: n_estimators, max_depth, min_samples_split, and min_samples_leaf. The best parameters for both models are searched with GridSearchCV. The SVM model with random undersampling, hyperparameter tuning, and K-Fold validation achieved an average accuracy of 100%, with precision, recall, and F1-score also reaching 100%; the best parameters were a linear kernel, C = 0.1, and gamma = 0.001, and the ROC-AUC value was 99%, indicating very good prediction performance. The RF model achieved an accuracy of 99%, which remained 99% after tuning, with a ROC-AUC value of 99% as well. Both algorithms therefore predict GDM very well, but SVM predicts GDM better than RF because it makes fewer prediction errors.
Comparative Study of Support Vector Regression and Long Short-Term Memory for Stock Price Prediction
Authors : Aviva Pradasyah; Baita, Anna
Journal of Applied Informatics and Computing Vol. 9 No. 4 (2025): August 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i4.9425

Abstract

This study compares the performance of two machine learning algorithms, Long Short-Term Memory (LSTM) and Support Vector Regression (SVR), in predicting the stock price of PT Bank Rakyat Indonesia (BBRI) using daily historical data from January 1, 2020, to January 10, 2025. The data were processed using a 60-day sliding window technique and normalized with MinMaxScaler. Model performance was evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), and the coefficient of determination (R²) across five independent trials (5-fold trials). The evaluation results show that SVR performs better in short-term prediction, with an average MAE of 0.0281, MSE of 0.0014, and R² of 0.9072. LSTM records an average MAE of 0.0312, MSE of 0.0015, and R² of 0.8962, but performs better in medium-term prediction, with a smaller average error of Rp228.02 compared with Rp242.52 for SVR. Both models generalize well to the test data without signs of overfitting. Based on these findings, SVR is recommended for stable short-term forecasts, while LSTM is better suited to medium-term predictions involving complex trend patterns.
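The preprocessing described in this abstract, min-max normalization followed by a sliding window, can be sketched in pure Python as below. The price list is made up, the window is shortened from 60 days to 3 so the output stays readable, and the helper names `min_max_scale` and `sliding_windows` are hypothetical; the formula matches the default (0, 1) range of scikit-learn's MinMaxScaler.

```python
def min_max_scale(values):
    """Rescale values linearly to [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span for v in values]

def sliding_windows(series, window):
    """Pair each run of `window` consecutive values with the value that follows it."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets

# Hypothetical daily closing prices (not BBRI data)
prices = [4000.0, 4100.0, 4050.0, 4200.0, 4300.0, 4250.0]

scaled = min_max_scale(prices)
X, y = sliding_windows(scaled, window=3)
print(len(X), X[0], y[0])
```

Each row of `X` holds the previous `window` scaled prices and the matching entry of `y` is the next day's scaled price, which is the supervised form both SVR and LSTM models can be trained on.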
Machine Learning-Based Sentiment Analysis on Twitter (X): A Case Study of the “Kabur Aja Dulu” Issue Using SVM
Authors : Rohmatun, Lina; Baita, Anna
Journal of Applied Informatics and Computing Vol. 9 No. 4 (2025): August 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i4.9991

Abstract

This study analyzes public sentiment toward the “Kabur Aja Dulu” phenomenon on Twitter (X) using the Support Vector Machine (SVM) method. The data consist of 4,768 Indonesian-language tweets collected through web scraping. Pre-processing includes data cleaning, tokenization, stemming, and translation into English for automatic sentiment labeling with TextBlob. The data are then classified into three sentiment categories: positive, negative, and neutral. TF-IDF is used for feature extraction, and SMOTE is applied to the training data to address class imbalance. The model was evaluated using K-Fold Cross Validation, with Grid Search for hyperparameter tuning. The results show that the SVM model with a linear kernel and C = 10 performs best, with an accuracy of 85.56%, precision of 85.19%, recall of 85.56%, and F1-score of 85.30%. The main finding is that linear SVM classifies sentiment well, particularly neutral sentiment, and is an effective approach to sentiment analysis of Indonesian-language social media.
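To make the TF-IDF feature extraction step concrete, here is a minimal pure-Python sketch of the textbook weighting, term frequency multiplied by log(N / document frequency). It is an assumption-laden illustration: the three short texts are invented, the function name `tf_idf` is hypothetical, and library implementations such as scikit-learn's TfidfVectorizer apply smoothing and normalization, so their numbers differ from these.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Invented example texts, not tweets from the study's dataset
tweets = ["kabur aja dulu", "kabur dulu nanti", "aja dulu"]

weights = tf_idf(tweets)
print(weights[1])
```

A term that occurs in every text, here "dulu", gets weight zero, while a term unique to one text, here "nanti", gets the largest weight; the resulting vectors are what a classifier such as a linear SVM would take as input.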