Naswir, Ahmad Fadhil
Unknown Affiliation


Feature Importance and Binary Classification using PyCaret
Naswir, Ahmad Fadhil; Williem; Hasanul Fahmi
Journal of Data Science, Technology, and Artificial Intelligence Vol. 1 No. 1 (2024): July 2024
Publisher : CV. ADMITECH SOLUTIONS

DOI: 10.63703/ditech.v1i1.16

Abstract

The rapid advancement of machine learning (ML) techniques has facilitated the development of robust models for a wide range of classification tasks. This study applies PyCaret, an open-source, low-code machine learning library, to feature importance analysis and binary classification on the Titanic dataset from Kaggle. The dataset was preprocessed by converting categorical features into numerical values and removing irrelevant columns. Multiple classification models were compared, and the Gradient Boosting Classifier performed best, with an average accuracy of 81.52%. Additional evaluation metrics, including precision, recall, F1 score, and AUC, further supported the model's effectiveness. Feature importance analysis identified gender (sex), fare, and age as the most significant predictors of survival, consistent with historical accounts. The results demonstrate PyCaret's capability to streamline the ML workflow, providing valuable insights and enabling rapid experimentation. This study highlights the potential of binary classification and feature importance analysis for handling large-scale datasets, where the identified important features can serve as a baseline for implementing more advanced algorithms such as deep learning.
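
The preprocessing step mentioned in the abstract (encoding categorical features as numbers and dropping irrelevant columns) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' code: the rows are invented values in the shape of the Kaggle Titanic file, the choice of `PassengerId` and `Name` as irrelevant columns is a guess (the abstract does not list them), and the helpers `build_encoders` and `preprocess` are hypothetical names. In the study itself this work would presumably be handled through PyCaret rather than by hand.

```python
# Invented example rows shaped like the Kaggle Titanic training file.
# The values are NOT taken from the real dataset.
rows = [
    {"PassengerId": 1, "Name": "Passenger A", "Sex": "male",
     "Age": 30.0, "Fare": 8.05, "Embarked": "S", "Survived": 0},
    {"PassengerId": 2, "Name": "Passenger B", "Sex": "female",
     "Age": 25.0, "Fare": 71.30, "Embarked": "C", "Survived": 1},
    {"PassengerId": 3, "Name": "Passenger C", "Sex": "female",
     "Age": 41.0, "Fare": 12.50, "Embarked": "Q", "Survived": 1},
]

# Assumption: which columns count as "irrelevant" is not stated in the abstract.
DROP_COLUMNS = {"PassengerId", "Name"}
CATEGORICAL_COLUMNS = ("Sex", "Embarked")


def build_encoders(rows, columns):
    """Assign each category an integer code, in sorted order for reproducibility."""
    encoders = {}
    for col in columns:
        categories = sorted({row[col] for row in rows})
        encoders[col] = {cat: code for code, cat in enumerate(categories)}
    return encoders


def preprocess(rows, encoders, drop_columns):
    """Drop unwanted columns and replace categorical values with integer codes."""
    cleaned = []
    for row in rows:
        out = {}
        for key, value in row.items():
            if key in drop_columns:
                continue
            if key in encoders:
                value = encoders[key][value]
            out[key] = value
        cleaned.append(out)
    return cleaned


encoders = build_encoders(rows, CATEGORICAL_COLUMNS)
cleaned = preprocess(rows, encoders, DROP_COLUMNS)

print(encoders["Sex"])
print(cleaned[0])
```

Integer codes are the simplest encoding and are usually adequate for tree-based models such as gradient boosting; one-hot encoding is a common alternative when the codes should not imply an ordering.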