The sinking of the Titanic in 1912 remains a widely studied event, particularly with respect to the factors that influenced passenger survival. This study applies machine learning classification techniques to predict whether a Titanic passenger survived, based on demographic and passenger-specific attributes: age, gender, ticket class, number of siblings or spouses traveling with the passenger, number of parents or children traveling with the passenger, ticket fare, and port of embarkation. An exploratory data analysis is first conducted to identify patterns in the dataset, followed by preprocessing steps that include handling missing values and encoding categorical variables. Six classifiers are then implemented and compared: Logistic Regression, Random Forest, Extra Trees, Decision Tree, LGBM Classifier, and XGBoost Classifier. The Random Forest model achieves the highest accuracy (0.815), while the LGBM Classifier attains the highest cross-validation score (0.821). Feature importance analysis identifies gender and ticket class as the most significant factors affecting survival probability. These results demonstrate the effectiveness of machine learning classification techniques for analyzing historical data and predicting binary outcomes. The approach offers a data-driven perspective on historical events and can be applied to other classification tasks, such as risk assessment, medical prognosis, and social science research.
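The preprocessing and evaluation steps summarized above can be illustrated with a minimal, standard-library sketch. It is not the study's implementation. The column names follow the common Kaggle-style Titanic layout (an assumption), the six rows are hypothetical toy records constructed for illustration, and `fit_sex_rule` is a deliberately simple stand-in model, not one of the six classifiers compared in the study. Median and most-frequent-value imputation are likewise assumed choices, since the abstract does not say how missing values were handled.

```python
from statistics import median

# Hypothetical toy records (NOT from the study's dataset); column names assumed.
ROWS = [
    {"Sex": "female", "Pclass": 1, "Age": 38.0, "Embarked": "C", "Survived": 1},
    {"Sex": "male", "Pclass": 3, "Age": None, "Embarked": "S", "Survived": 0},
    {"Sex": "female", "Pclass": 3, "Age": 26.0, "Embarked": "S", "Survived": 1},
    {"Sex": "male", "Pclass": 1, "Age": 54.0, "Embarked": None, "Survived": 0},
    {"Sex": "male", "Pclass": 2, "Age": 30.0, "Embarked": "Q", "Survived": 0},
    {"Sex": "female", "Pclass": 2, "Age": None, "Embarked": "S", "Survived": 1},
]


def preprocess(rows):
    """Impute missing values, then encode categorical columns as numbers."""
    age_fill = median(r["Age"] for r in rows if r["Age"] is not None)
    ports = [r["Embarked"] for r in rows if r["Embarked"] is not None]
    categories = sorted(set(ports))
    port_fill = max(categories, key=ports.count)  # most frequent port
    features = []
    for r in rows:
        age = r["Age"] if r["Age"] is not None else age_fill
        port = r["Embarked"] or port_fill
        one_hot = [1 if port == c else 0 for c in categories]
        sex_flag = 1 if r["Sex"] == "female" else 0
        features.append([sex_flag, r["Pclass"], age] + one_hot)
    return features


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_sex_rule(X_train, y_train):
    """Stand-in model: predict the majority outcome for each sex flag (column 0)."""
    def majority(flag):
        labels = [t for x, t in zip(X_train, y_train) if x[0] == flag]
        return int(sum(labels) * 2 >= len(labels)) if labels else 0

    table = {0: majority(0), 1: majority(1)}
    return lambda x: table[x[0]]


def k_fold_accuracy(fit, X, y, k=3):
    """Mean held-out accuracy over k contiguous folds (no shuffling)."""
    n = len(X)
    scores = []
    for fold in range(k):
        test_idx = range(fold * n // k, (fold + 1) * n // k)
        held_out = set(test_idx)
        X_train = [x for i, x in enumerate(X) if i not in held_out]
        y_train = [t for i, t in enumerate(y) if i not in held_out]
        predict = fit(X_train, y_train)
        y_test = [y[i] for i in test_idx]
        y_pred = [predict(X[i]) for i in test_idx]
        scores.append(accuracy(y_test, y_pred))
    return sum(scores) / k


X = preprocess(ROWS)
y = [r["Survived"] for r in ROWS]
cv_score = k_fold_accuracy(fit_sex_rule, X, y, k=3)
print(cv_score)
```

Because the toy rows are constructed by hand, the printed score says nothing about real model performance. In practice, a library classifier and a stratified, shuffled split would replace `fit_sex_rule` and the contiguous folds used here.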
Copyrights © 2025