Mohamad Razi, Nor Faezah
Unknown Affiliation

Published: 2 Documents
Articles

Journal : Bulletin of Electrical Engineering and Informatics

Genetic programming in machine learning based on the evaluation of house affordability classification
Masrom, Suraya; Baharun, Norhayati; Mohamad Razi, Nor Faezah; Abd Rahman, Abdullah Sani; Mohammad, Nor Hazlina; Sarkam, Nor Aslily
Bulletin of Electrical Engineering and Informatics Vol 13, No 5: October 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i5.7594

Abstract

One of the major challenges in machine learning is the difficulty of achieving high accuracy within a short completion time. Further difficulties arise when the algorithm is applied to real datasets obtained from survey-based data collection. Imbalanced data, weak correlations, and outliers are common problems in such datasets. To accelerate the modelling process, automated machine learning based on meta-heuristic optimization, such as genetic programming (GP), has started to emerge and is gaining popularity. However, identifying the best hyper-parameters of the meta-heuristic algorithm is a critical issue. This paper presents an evaluation of GP hyper-parameters in machine learning modelling on a house affordability dataset. An important GP hyper-parameter is population size (PS), which was examined under different settings in this research. Machine learning with GP was used to predict house affordability among employers, with transport expenditure and job mobility among the attributes. The results from testing on hold-out samples show that GP machine learning can reach 70% accuracy with a split ratio of 0.2 and a GP PS of 30. This research contributes to the advancement of automated machine learning techniques, offering potential for faster and more accurate modelling of real survey-based datasets.