Journal : Jurnal Teknik Informatika (JUTIF)

OPTIMAL STUDY OF REAL-ESTATE PRICE PREDICTION MODELS USING MACHINE LEARNING
Maulana, Ikhsan; Siregar, Amril Mutoi; Lestari, Santi Arum Puspita; Faisal, Sutan
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 4 (2024): JUTIF Volume 5, Number 4, August 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.4.2565

Abstract

Everyone wants a place to live, ideally one close to work and shopping centers, with convenient transportation, a low crime rate, and similar advantages. Pricing must therefore account for external factors, not just the house itself, and determining a price can be difficult. The aim of this research is to predict real-estate prices by taking these factors into account. Prediction results are useful for sellers who have difficulty setting prices and for prospective buyers who need to make financial plans to buy a house in a desired neighborhood. The dataset used in this research was obtained from Kaggle and consists of 506 samples with 14 attributes. Several machine learning algorithms were used to predict real-estate prices: Extra Trees (ET), Support Vector Regression (SVR), Random Forest (RF), eXtreme Gradient Boosting (XGB), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LGBM), and CatBoost. Principal Component Analysis (PCA) is applied as a feature selection technique after the preprocessing phase and before model building. The most accurate model obtained is CatBoost with GridSearchCV; because this model was cross-validated, the risk of overfitting on new data is low. The SVR model with a polynomial kernel, 10 principal components (PCs), and GridSearchCV achieves an R2 score of 0.87, close to the score of CatBoost with GridSearchCV.
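
For readers unfamiliar with the metric, the R2 score reported in this abstract is the coefficient of determination, 1 - SS_res / SS_tot. The following is a minimal sketch in plain Python; the function name `r2_score` and the sample prices are illustrative assumptions, not the authors' code or data.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical house prices (made up for illustration only).
actual = [24.0, 21.6, 34.7, 33.4, 36.2]
predicted = [25.0, 22.0, 33.0, 32.0, 35.0]
print(round(r2_score(actual, predicted), 3))
```

A score of 1.0 means perfect predictions, and 0.0 means the model does no better than always predicting the mean price.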

IMPLEMENTATION OF DIABETES PREDICTION MODEL USING RANDOM FOREST ALGORITHM, K-NEAREST NEIGHBOR, AND LOGISTIC REGRESSION
Pratama, Rio; Siregar, Amril Mutoi; Lestari, Santi Arum Puspita; Faisal, Sutan
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 4 (2024): JUTIF Volume 5, Number 4, August 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.4.2593

Abstract

Diabetes is a serious metabolic disease that can cause various health complications. With more than 537 million people worldwide living with diabetes in 2021, early detection is crucial to preventing further complications. This research aims to predict the risk of diabetes using machine learning algorithms, namely Random Forest (RF), K-Nearest Neighbor (KNN), and Logistic Regression (LR), with the diabetes dataset from UCI. Previous research has explored a variety of algorithms and techniques, with results varying in accuracy. This research uses a dataset from Kaggle consisting of 768 records with 8 parameters, which are prepared through pre-processing and data normalization. The models were evaluated using accuracy, the confusion matrix, and ROC-AUC. The results showed that Logistic Regression performed best, with 77% accuracy and an AUC of 0.83, compared to KNN (75% accuracy, AUC 0.81) and Random Forest (74% accuracy, AUC 0.81). These findings emphasize the importance of appropriate algorithm selection and good data pre-processing in diabetes risk prediction. This study concludes that Logistic Regression is the most effective method for predicting diabetes risk in the dataset used.
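
The three evaluation metrics named in this abstract can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical. ROC-AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half.

```python
def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return [[tn, fp], [fn, tp]]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    (tn, _fp), (_fn, tp) = confusion_matrix(y_true, y_pred)
    return (tn + tp) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = diabetic, 0 = non-diabetic.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]          # hypothetical model outputs
preds = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

print(confusion_matrix(labels, preds))
print(accuracy(labels, preds))
print(roc_auc(labels, scores))
```

Accuracy depends on the chosen threshold, whereas ROC-AUC is computed from the raw scores, which is why a study can report both for the same model.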

IMPROVING HEART DISEASE PREDICTION ACCURACY USING PRINCIPAL COMPONENT ANALYSIS (PCA) IN MACHINE LEARNING ALGORITHMS
Jayidan, Zirji; Siregar, Amril Mutoi; Faisal, Sutan; Hikmayanti, Hanny
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 3 (2024): JUTIF Volume 5, Number 3, June 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.3.2047

Abstract

This study aims to improve the accuracy of heart disease prediction using Principal Component Analysis (PCA) for feature extraction together with various machine learning algorithms. The dataset consists of 334 rows with 49 attributes, 5 classes and 31 target diagnoses. The five algorithms used were K-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and Decision Tree (DT). The results show that the algorithms using PCA achieve high accuracy, especially RF, LR, and DT, with accuracy of up to 1.00. This research highlights the potential of PCA-based machine learning models in the early diagnosis of heart disease.