Salsabila, Mutiara Rizky
Unknown Affiliation

Journal : Journal Collabits

A Data Science Approach to Cancer Patient Classification Using Support Vector Machine and Random Forest
Anggraini, Devi Dwi; Salsabila, Mutiara Rizky; Kamila, Keisya Rizkia
Journal Collabits Vol 3, No 1 (2026)
Publisher : Journal Collabits

DOI: 10.22441/collabits.v3i1.37642

Abstract

The increasing availability of healthcare data has encouraged the application of data science and machine learning techniques in medical research. Cancer patient datasets contain numerical demographic and clinical attributes that can be used for classification tasks; however, complex relationships among features and limited feature relevance remain key challenges. This study analyzes cancer patient data and compares the performance of the Support Vector Machine and Random Forest algorithms for gender classification. The dataset consists of six numerical features: patient age, tumor size, number of examined lymph nodes, number of positive lymph nodes, body mass index, and survival duration in months. The research methodology comprises data preprocessing, exploratory data analysis, model development, and performance evaluation. Feature normalization and data splitting are applied to ensure a fair comparison between models, while exploratory analysis is used to examine the data distribution and the relationships among variables. Both classification models are trained under identical experimental settings and evaluated using accuracy as the primary performance metric. The results indicate that both algorithms can classify cancer patient gender with satisfactory accuracy. Support Vector Machine performs slightly better than Random Forest, suggesting that it is effective at handling numerical data with complex decision boundaries. The findings highlight the importance of appropriate algorithm selection and feature utilization in healthcare data analysis.
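The workflow described in the abstract (normalization, a shared train/test split, two classifiers, accuracy as the metric) could be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' code: the feature names, the randomly generated placeholder data, the standardization method, the 80/20 split, the RBF kernel, and the forest size are all assumptions, since the abstract does not state them. Because the placeholder labels are random, the printed accuracies say nothing about the paper's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical column names for the six numerical features listed in the abstract.
FEATURES = ["age", "tumor_size", "nodes_examined",
            "nodes_positive", "bmi", "survival_months"]


def make_placeholder_data(n_samples=400, seed=0):
    """Generate random stand-in data; the real dataset is not reproduced here."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, len(FEATURES)))
    y = rng.integers(0, 2, size=n_samples)  # 0/1 stand in for the gender label
    return X, y


def compare_models(X, y, test_size=0.2, seed=42):
    """Train both models on the same split and return test accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    # Scaling lives inside each pipeline so it is fitted on training data only.
    models = {
        "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "random_forest": make_pipeline(
            StandardScaler(),
            RandomForestClassifier(n_estimators=100, random_state=seed),
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


X, y = make_placeholder_data()
scores = compare_models(X, y)
print(scores)
```

Placing the scaler inside each pipeline keeps test-set statistics out of the fitting step, and reusing a single split keeps the comparison between the two models like for like. Random Forest does not strictly need scaled inputs; it is scaled here only so that both models see identical preprocessing.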