Journal : Jurnal Teknik Informatika (JUTIF)

OPTIMAL STUDY OF REAL-ESTATE PRICE PREDICTION MODELS USING MACHINE LEARNING
Maulana, Ikhsan; Siregar, Amril Mutoi; Lestari, Santi Arum Puspita; Faisal, Sutan
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 4 (2024): JUTIF Volume 5, Number 4, August 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.4.2565

Abstract

Everyone wants a place to live, ideally one close to work and shopping centers, with easy access to transportation and a low crime rate. Pricing must therefore account for external factors, not just the house itself, which makes setting a price difficult for many people. The aim of this research is to predict real-estate prices by taking these factors into account. The predictions are useful for sellers who have difficulty determining a price and for prospective buyers who are planning their finances to buy a house in a desired neighborhood. The dataset used in this research was obtained from Kaggle and consists of 506 samples with 14 attributes. Several machine learning algorithms are used to predict real-estate prices: Extra Trees (ET), Support Vector Regression (SVR), Random Forest (RF), eXtreme Gradient Boosting (XGB), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LGBM), and CatBoost. Principal Component Analysis (PCA) is applied as a feature selection technique after the preprocessing phase and before model building. The most accurate model is CatBoost with GridSearchCV; because this model has been cross-validated, there is little chance of overfitting on new data. The SVR model with a polynomial kernel, 10 principal components (PCs), and GridSearchCV achieves an R2 score of 0.87, close to the score of CatBoost with GridSearchCV.
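
The SVR configuration described in the abstract (scaling, PCA with 10 components, a polynomial kernel, and a grid search with cross-validation) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data are synthetic, the split into 13 features plus one target is an assumption about the 14 attributes, and the parameter grid, fold count, and train/test split are hypothetical choices. CatBoost is left out because it needs a separate package.

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the housing data: 506 samples, 13 features (assumed).
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
y = (y - y.mean()) / y.std()  # standardize the target so SVR fits quickly

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale, project onto 10 principal components, then fit a polynomial-kernel SVR.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10)),
    ("svr", SVR(kernel="poly")),
])

# Hypothetical grid; the abstract does not list the values searched.
param_grid = {"svr__C": [1.0, 10.0], "svr__degree": [2, 3]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

# Score the refitted best model on data not seen during the search.
test_r2 = r2_score(y_test, search.predict(X_test))
print("best params:", search.best_params_)
print("held-out R2:", round(test_r2, 3))
```

Placing the scaler and PCA inside the pipeline means they are refitted on each training fold, so the cross-validated score is not influenced by the validation fold.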

IMPLEMENTATION OF DIABETES PREDICTION MODEL USING RANDOM FOREST ALGORITHM, K-NEAREST NEIGHBOR, AND LOGISTIC REGRESSION
Pratama, Rio; Siregar, Amril Mutoi; Lestari, Santi Arum Puspita; Faisal, Sutan
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 4 (2024): JUTIF Volume 5, Number 4, August 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.4.2593

Abstract

Diabetes is a serious metabolic disease that can cause various health complications. With more than 537 million people worldwide living with diabetes in 2021, early detection is crucial to preventing further complications. This research aims to predict the risk of diabetes using machine learning algorithms, namely Random Forest (RF), K-Nearest Neighbor (KNN), and Logistic Regression (LR), with the diabetes dataset from UCI. Previous research has explored a variety of algorithms and techniques, with results varying in accuracy. This research uses a dataset from Kaggle consisting of 768 records with 8 parameters, which are processed through pre-processing and data normalization techniques. The models were evaluated using accuracy, the confusion matrix, and ROC-AUC. The results show that Logistic Regression performed best, with 77% accuracy and an AUC of 0.83, compared to KNN (75% accuracy, AUC 0.81) and Random Forest (74% accuracy, AUC 0.81). These findings emphasize the importance of appropriate algorithm selection and good data pre-processing in diabetes risk prediction. This study concludes that Logistic Regression is the most effective method for predicting diabetes risk on the dataset used.
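
The evaluation described above (three classifiers compared on accuracy, confusion matrix, and ROC-AUC) might look something like the sketch below. It is an illustration only: the data are synthetic with the same shape as the dataset in the abstract (768 rows, 8 features), and the min-max normalization, hyperparameters, and 80/20 split are assumptions rather than details from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic binary-classification data shaped like the dataset in the abstract.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Normalization is applied to LR and KNN; tree ensembles do not need it.
models = {
    "LR": make_pipeline(MinMaxScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=5)),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "auc": roc_auc_score(y_test, proba),
        "confusion": confusion_matrix(y_test, pred),
    }
    print(name, round(results[name]["accuracy"], 3), round(results[name]["auc"], 3))
```

ROC-AUC is computed from predicted probabilities rather than hard labels, which is why two models with similar accuracy can still differ in AUC.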

ANALYSIS AND IMPLEMENTATION OF AES-128 ALGORITHM IN SUKAHARJA KARAWANG VILLAGE SERVICE SYSTEM
Fariz Duta Nugraha; Kiki Ahmad Baihaqi; Hilda Yulia Novita; Siregar, Amril Mutoi
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 3 (2024): JUTIF Volume 5, Number 3, June 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.3.2038

Abstract

Data security in databases is needed in the Industry 4.0 era to prevent attacks and other unwanted incidents; one of the most widely reported kinds of incident is data leakage. This study aims to implement and analyze the Advanced Encryption Standard (AES) algorithm, a block cipher built from four transformations (SubBytes, ShiftRows, MixColumns, AddRoundKey). Cryptography is often used to secure important data in databases; in this article, AES is used to secure citizen data and family card data in the Sukaharja Karawang Village service system. The research uses the observation method: the data were obtained from each neighborhood head in Sukaharja Karawang Village with the permission of the village head. Citizen data and family card data were encrypted, then analyzed for the storage required to hold the encrypted results and the time needed to decrypt and display the original data. The analysis shows that family card data requires 1.5MB of storage after encryption, compared with 352KB before encryption, and citizen data requires 6.5MB, compared with 1.5MB before encryption. The resilience of AES was tested with a brute-force attack using Hashcat version 6.2.5: one encrypted address record was attacked in 4 trials, and none of the 4 attempts cracked the data.
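
The storage figures in the abstract can be turned into expansion factors with a short calculation. The sketch below assumes binary units (1 MB = 1024 KB), which the abstract does not specify; the function name is illustrative.

```python
KB_PER_MB = 1024  # assumption: binary units; the abstract does not say


def expansion_factor(before_kb: float, after_kb: float) -> float:
    """Return how many times larger the data is after encryption."""
    return after_kb / before_kb


# Family card data: 352 KB before encryption, 1.5 MB after.
family_card = expansion_factor(352, 1.5 * KB_PER_MB)

# Citizen data: 1.5 MB before encryption, 6.5 MB after.
citizen = expansion_factor(1.5 * KB_PER_MB, 6.5 * KB_PER_MB)

print(f"family card data: {family_card:.2f}x")  # 4.36x
print(f"citizen data: {citizen:.2f}x")          # 4.33x
```

Under that assumption both tables grow by a similar factor of roughly 4.3, which suggests the overhead scales with the amount of data stored. The abstract does not state what accounts for the growth.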

IMPROVING HEART DISEASE PREDICTION ACCURACY USING PRINCIPAL COMPONENT ANALYSIS (PCA) IN MACHINE LEARNING ALGORITHMS
Jayidan, Zirji; Siregar, Amril Mutoi; Faisal, Sutan; Hikmayanti, Hanny
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 3 (2024): JUTIF Volume 5, Number 3, June 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.3.2047

Abstract

This study aims to improve the accuracy of heart disease prediction using Principal Component Analysis (PCA) for feature extraction and various machine learning algorithms. The dataset consists of 334 rows with 49 attributes, 5 classes and 31 target diagnoses. The five algorithms used were K-nearest neighbors (KNN), Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and Decision Tree (DT). Results show that algorithms using PCA achieve high accuracy, especially RF, LR, and DT with accuracy up to 1.00. This research highlights the potential of PCA-based machine learning models in early diagnosis of heart disease.

OPTIMIZATION OF MACHINE LEARNING MODEL ACCURACY FOR BRAIN TUMOR CLASSIFICATION WITH PRINCIPAL COMPONENT ANALYSIS
Maulana, Indra; Siregar, Amril Mutoi; Rahmat, Rahmat; Fauzi, Ahmad
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 3 (2024): JUTIF Volume 5, Number 3, June 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.3.2058

Abstract

The main issue in brain tumor classification is the accuracy and speed of diagnosis through medical imaging. This study aims to improve the accuracy of machine learning models for brain tumor classification by using Principal Component Analysis (PCA) for dimensionality reduction. The research methods include image preprocessing, feature scaling, PCA application, and the implementation of machine learning algorithms such as Logistic Regression, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Naive Bayes. The dataset consists of 3,264 images divided into training and testing sets. The results show that the use of PCA has varying impacts on different algorithms. PCA increases the accuracy of the SVM algorithm from 81% to 83% and KNN from 68% to 71%, but decreases the accuracy of Logistic Regression from 77% to 69% and Naive Bayes from 49% to 42%. Evaluation is performed using the Confusion Matrix and AUC-ROC to measure model performance. In conclusion, selecting the appropriate algorithm and preprocessing method is crucial in medical image classification, and the use of PCA should be considered based on the characteristics of the data and the algorithms used. This study also encourages the exploration of alternative dimensionality reduction methods for medical image analysis.
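
A with-and-without-PCA comparison of the kind reported above could be set up along these lines. The sketch uses scikit-learn's bundled 8x8 digits images as a small stand-in for the brain tumor images, so its numbers say nothing about the study's results; the choice of 20 components, the classifier settings, and the split are all assumptions.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in image data: 8x8 digit images flattened to 64 pixel features.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Factories so each run gets a fresh, unfitted classifier.
classifiers = {
    "SVM": lambda: SVC(),
    "KNN": lambda: KNeighborsClassifier(),
    "LR": lambda: LogisticRegression(max_iter=2000),
    "NB": lambda: GaussianNB(),
}


def evaluate(use_pca: bool) -> dict:
    """Fit each classifier after scaling, optionally with PCA in between."""
    scores = {}
    for name, make_clf in classifiers.items():
        steps = [StandardScaler()]
        if use_pca:
            steps.append(PCA(n_components=20, random_state=0))
        steps.append(make_clf())
        model = make_pipeline(*steps)
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


without_pca = evaluate(use_pca=False)
with_pca = evaluate(use_pca=True)
for name in classifiers:
    print(f"{name}: {without_pca[name]:.3f} without PCA, {with_pca[name]:.3f} with PCA")
```

Running the same split through both pipelines makes the effect of PCA on each algorithm directly comparable, which is the kind of per-algorithm difference the abstract reports.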