This study examines the classification of exoplanets using data from the Kepler Space Telescope, comparing a suite of machine learning (ML) models on their ability to distinguish confirmed planets, candidates, and false positives. After preprocessing the dataset for quality, completeness, and relevance, we trained models including Decision Tree, Random Forest, Hist Gradient Boosting, CatBoost, AdaBoost, LightGBM, XGBoost, Extra Trees, Logistic Regression, XGBoost RF, and Gaussian NB. Each model was evaluated on Accuracy, Precision, Recall, and F1 Score. The majority of models scored 1.0 on most metrics, while Gaussian NB scored slightly lower at 0.99, a minor deviation that we attribute to its probabilistic nature. Although these results suggest that the models classify the Kepler data with very high accuracy, near-perfect scores across so many different models also call for a critical examination of potential overfitting, the dataset's complexity, and the models' generalizability to unseen data.
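To make the four evaluation metrics concrete, the following is a minimal sketch of how Accuracy, Precision, Recall, and F1 Score can be computed for a three-class problem. It is an illustration under stated assumptions, not the pipeline used in the study: the abstract does not say how per-class scores were averaged, so macro averaging is assumed here, and the function name `macro_scores`, the label strings, and the toy predictions are all hypothetical.

```python
from collections import Counter

# Illustrative class names (assumption); the study's actual label encoding is not given.
LABELS = ("CONFIRMED", "CANDIDATE", "FALSE POSITIVE")


def macro_scores(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging (an assumption) weights every class equally,
    regardless of how many samples it has.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy data (made up): one confirmed planet is misclassified as a candidate.
y_true = ["CONFIRMED", "CONFIRMED", "CANDIDATE", "CANDIDATE",
          "FALSE POSITIVE", "FALSE POSITIVE"]
y_pred = ["CONFIRMED", "CANDIDATE", "CANDIDATE", "CANDIDATE",
          "FALSE POSITIVE", "FALSE POSITIVE"]
scores = macro_scores(y_true, y_pred)
```

A score of exactly 1.0 on all four metrics requires every held-out sample to be classified correctly, which is why uniformly perfect results are usually cross-checked for issues such as overlap between training and test data.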
Copyrights © 2024