Jurnal Natural
Volume 23 Number 3, October 2023

Application of SHAP on CatBoost classification for identification of variables characterizing food insecurity occurrences in Aceh Province households

MUHAMMAD SUBIANTO (Department of Statistics, Faculty of Mathematics and Natural Science, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia)
INA YATUL ULYA (Department of Statistics, Faculty of Mathematics and Natural Science, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia)
EVI RAMADHANI (Department of Statistics, Faculty of Mathematics and Natural Science, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia)
BAGUS SARTONO (Department of Statistics, IPB University, Bogor, West Java, Indonesia)
ALFIAN FUTUHUL HADI (Department of Mathematics, University of Jember, East Java, Indonesia)



Article Info

Publish Date
31 Oct 2023

Abstract

Classification is the process of building a model that can distinguish between different classes of data. The model aims to predict the class of testing data based on patterns or relationships learned from training data. One algorithm used to build classification models is Categorical Boosting (CatBoost). However, the resulting models are generally difficult to interpret. Methods such as SHAP (SHapley Additive exPlanations) facilitate the interpretation of complex classification models. SHAP explains individual predictions and is based on the game-theoretically optimal Shapley values. In this study, a SHAP variable-importance analysis was conducted on the CatBoost classification model to identify variables characterizing occurrences of food insecurity in households. The data were obtained from the March 2021 Survei Sosial Ekonomi Nasional (Susenas) for Aceh Province, sourced from the Badan Pusat Statistik (BPS), and comprise 13,126 observations. Of the four classification models evaluated on the testing data, the best had accuracy, sensitivity, specificity, and AUC values of 0.703, 0.349, 0.798, and 0.637, respectively. Furthermore, the SHAP variable-importance analysis showed that the number of household members who smoke, education of the household head, wall type, drinking water source, and decent sanitation contributed significantly to the occurrence of food insecurity in households in Aceh Province in 2021.
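The idea behind Shapley values, on which SHAP is based, can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy scoring function, its coefficients, the feature names, and the all-zero baseline are hypothetical, and the code enumerates every feature subset exactly rather than using the tree-based approximations applied to CatBoost models in practice. It also replaces absent features with a single baseline value, whereas SHAP proper averages over a background dataset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline instance."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take their value from x; the rest stay at baseline.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical risk score with one interaction term (illustration only).
    smokers, low_education, poor_walls = z
    return 0.10 * smokers + 0.20 * low_education + 0.15 * low_education * poor_walls


x = [2, 1, 1]          # hypothetical household
baseline = [0, 0, 0]   # hypothetical reference household
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["smokers", "low_education", "poor_walls"], phi):
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the prediction for the household and the prediction for the baseline, which is the additivity property that lets SHAP split an individual prediction among variables. The interaction term is shared equally between the two variables involved.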

Copyrights © 2023

Journal Info

Abbrev

natural

Subject

Agriculture, Biological Sciences & Forestry; Astronomy; Biochemistry, Genetics & Molecular Biology; Chemistry; Earth & Planetary Sciences; Energy; Immunology & Microbiology; Neuroscience; Physics

Description

Jurnal Natural (JN) aims to publish original research results and reviews on sciences and mathematics. Jurnal Natural (JN) encompasses a broad range of research topics in chemistry, pharmacy, biology, physics, mathematics, statistics, informatic and ...