Articles

EASY ENSEMBLE WITH RANDOM FOREST TO HANDLE IMBALANCED DATA IN CLASSIFICATION Sarini Abdullah; GV Prasetyo
Journal of Fundamental Mathematics and Applications (JFMA) Vol 3, No 1 (2020)
Publisher : Diponegoro University

DOI: 10.14710/jfma.v3i1.7415

Abstract

Imbalanced data can cause problems at the problem-definition level, the algorithm level, and the data level. Several methods have been developed to overcome this issue; one state-of-the-art method is Easy Ensemble. Easy Ensemble is claimed to improve a model's ability to classify the minority class and to overcome the deficiencies of random under-sampling. This paper discusses the implementation of Easy Ensemble with Random Forest classifiers to handle class imbalance in a credit scoring case. The combined method is applied to two datasets taken from data science competition websites, finhacks.id and kaggle.com, with majority-to-minority class proportions of 70:30 and 94:6. The results show that resampling with Easy Ensemble improves Random Forest classifier performance on the minority class. Recall on the minority class increased significantly after resampling. For the first dataset (finhacks.id), minority-class recall was 0.49 before resampling and rose to 0.82 after it. Similar results were obtained for the second dataset (kaggle.com), where minority-class recall increased from just 0.14 to 0.73.
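As a rough illustration of the resampling idea the abstract describes, the sketch below draws several balanced subsets (all minority rows plus an equally sized random draw of majority rows) and defines minority-class recall. It is not the authors' code: the function names `balanced_subsets` and `recall`, the toy 94:6 label vector, and the choice of five subsets are assumptions made for illustration, and the step of training a Random Forest on each subset and combining the votes is omitted.

```python
import random


def balanced_subsets(labels, n_subsets, minority=1, seed=0):
    """Return index lists, one per subset.

    Each subset keeps every minority row and adds a random draw of
    majority rows of the same size (the undersampling step only).
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    subsets = []
    for _ in range(n_subsets):
        sampled = rng.sample(majority_idx, len(minority_idx))
        subsets.append(sorted(minority_idx + sampled))
    return subsets


def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels with a 94:6 majority-to-minority split (hypothetical data).
labels = [0] * 94 + [1] * 6
subsets = balanced_subsets(labels, n_subsets=5)
```

In a full pipeline, one classifier would be fitted per subset and the predictions aggregated; recall would then be computed on held-out data, with the minority class as the positive class.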
Bayesian Accelerated Failure Time Model for Risk Pregnancy Detection Dennis Alexander; Sarini Abdullah; Adam Fahsyah Nurzaman
Engineering, MAthematics and Computer Science Journal (EMACS) Vol. 5 No. 3 (2023): EMACS
Publisher : Bina Nusantara University

DOI: 10.21512/emacsjournal.v5i3.10540

Abstract

Preeclampsia (PE) is known as hypertension during the third trimester of pregnancy. PE is one of the most feared complications of pregnancy because it can lead to serious complications, including the death of the mother and fetus. The goal of this study is to gain a better understanding of risk factors in pregnancy by modelling the relationship between several factors and the time until delivery under the PE condition. Data on 924 patients at an obstetrics and gynecology department in Jakarta were used in the analysis. An Accelerated Failure Time (AFT) model was proposed to identify risk factors that influence the condition. Model parameters were estimated using a Bayesian method. Because the data are imbalanced, undersampling was used as a pre-processing stage, with a PE to non-PE ratio of 60:40. A flat prior was used, and posterior samples were drawn by MCMC simulation with 12,000 iterations (including 2,000 burn-in iterations) to obtain a convergent result. The procedure was repeated 100 times so that the results were not biased by the particular data chosen in undersampling. Factors whose credible interval for the mean was consistent across repetitions were considered to affect the PE condition consistently. Two factors had consistent credible interval results: Body Mass Index (BMI) and Mean Arterial Pressure (MAP).
ANALISIS TINGKAT KESEHATAN DAN EFISIENSI PERBANKAN TERHADAP PROFITABILITAS BANK MENGGUNAKAN REGRESI BERGANDA DAN ANOVA: Studi kasus pada tahun 2014 – 2017 Dita Anggun Lestari; Sarini Abdullah
Indonesian Journal of Statistics and Applications Vol 4 No 3 (2020)
Publisher : Departemen Statistika, IPB University dengan Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v4i3.538

Abstract

In this digital era, the competitiveness of small banks has decreased, and many bank consolidations have occurred. This study examines the effect of bank soundness and efficiency on profitability in the face of competition and the current wave of bank consolidations and mergers. The variables follow Bank Indonesia standards for measuring bank performance using the RGEC approach, which consists of the ratios LDR, NIM, BOPO, NPL, and CAR and the prime lending rate (SBDK), while bank profitability is represented by ROA. The research objects are banks in categories BUKU 1-4 that are supervised by OJK and listed as issuers on the Indonesia Stock Exchange during 2014-2017. Purposive sampling was used, yielding 34 of 102 banks as research objects. The data were analyzed using multiple regression and an ANOVA comparison test. The results show that LDR, NIM, BOPO, NPL, CAR, and SBDK affect ROA both simultaneously and partially. A comparison of the averages of BOPO, prime lending rate, and ROA shows significant differences across bank categories BUKU 1-4.