Najie, Muhammad
Unknown Affiliation


Optimizing Startup Success Prediction Through SMOTE Oversampling and Classification
Najie, Muhammad; Sofian, Ahmad Alif; Sidabutar, Ribka Julyasih; Untoro, Meida Cahyo
Journal of Intelligent Systems and Information Technology Vol. 1 No. 2 (2024): July
Publisher : Apik Cahaya Ilmu

DOI: 10.61971/jisit.v1i2.33

Abstract

Rapid technological advancement has led to a surge in the number of startups competing with innovative ideas. Predicting a startup's chances of future success is therefore crucial for entrepreneurs making informed decisions and planning their growth. This study investigates the effectiveness of the Gradient Boosting classification algorithm in predicting startup success. To address potential class imbalance in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) was applied as a pre-processing step. SMOTE creates synthetic data points for the minority class, yielding a more balanced dataset for the classification model. The dataset encompassed a wide range of variables related to startup attributes and performance metrics. The F1-score was used to evaluate the model, with attention to minimizing false positive predictions that could mislead investors. Gradient Boosting achieved an F1-score of 86% for predicting successful startups and 85% for predicting unsuccessful ones. The low false positive prediction rate of 7.9% on the test data further supports the model's reliability. The findings demonstrate that Gradient Boosting can predict startup success with high accuracy and few false positives.
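The abstract does not include the authors' implementation. The following is only a minimal, dependency-free sketch of the two ideas it names: SMOTE-style interpolation between minority-class neighbours, and the F1-score reported alongside a false positive rate. The function names (`smote`, `f1_and_fpr`), the toy data, and the choice of FP / (FP + TN) as the false positive rate are all assumptions made for illustration, since the abstract does not define that rate. The classifier step is omitted; in practice one would more likely use imbalanced-learn's `SMOTE` together with scikit-learn's `GradientBoostingClassifier`.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    Illustrative sketch only; requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def f1_and_fpr(y_true, y_pred, positive=1):
    """Return (F1-score, false positive rate) for the `positive` class.
    The false positive rate is taken here as FP / (FP + TN) (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return f1, fpr


# Hypothetical toy data: four minority-class points with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)

# Hypothetical labels and predictions (1 = successful startup).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
f1, fpr = f1_and_fpr(y_true, y_pred)
print(len(new_points), f1, fpr)
```

Because each synthetic point lies on a segment between two real minority samples, oversampling of this kind stays inside the region the minority class already occupies. It should be applied to the training split only, so that the test metrics reflect real data.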