Diabetes mellitus (DM) is increasing in prevalence worldwide and has become a serious health problem; early detection reduces long-term complications. This study evaluates and compares the effectiveness of Random Forest (RF) and CatBoost models, each combined with the Synthetic Minority Over-sampling Technique (SMOTE), in predicting DM risk. The two models are compared on test data in terms of precision, recall, F1-score, and accuracy. The research is quantitative, and the method comprises exploratory data analysis (EDA), data transformation, splitting the data into training and test sets, training the RF and CatBoost models with SMOTE, and evaluating model performance. The dataset, obtained from Kaggle, contains health records for 768 individuals, with eight independent variables (pregnancies, glucose, blood pressure, skin thickness, insulin, body mass index (BMI), DM history, and age) and one target variable (outcome) indicating DM status. SMOTE was applied to balance the class distribution and improve the representation of the minority class, to make the prediction models more accurate and stable. The SMOTE-RF model achieved 82% accuracy and the SMOTE-CatBoost model 81%. According to the feature importance analysis, the main variables driving DM risk prediction in both models are glucose, BMI, and age. Glucose is the main DM risk indicator, and using it makes prediction more efficient. In practice, improved early detection based on machine learning has the potential to help doctors make more accurate decisions and prevent more serious complications of diabetes mellitus.
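To make the two central steps of the abstract concrete, the following is a minimal from-scratch sketch of SMOTE-style oversampling and of the four reported metrics. It is an illustration under stated assumptions, not the study's implementation: the function names `smote` and `classification_metrics`, the parameter `k`, and the toy rows are all hypothetical, and the abstract does not say which library was used (in practice a package such as imbalanced-learn is a common choice).

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    minority row and one of its k nearest minority neighbours.
    Assumes at least two minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base row (excluding itself).
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Toy, made-up rows of (glucose, BMI, age); NOT taken from the study's dataset.
majority = [[85, 22.0, 25], [90, 24.5, 31], [99, 26.1, 28],
            [105, 23.3, 40], [92, 27.0, 35], [88, 21.4, 22]]
minority = [[160, 33.5, 50], [148, 35.2, 47], [172, 31.0, 55]]

# Oversample the minority class until both classes have the same size.
new_rows = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows

# Made-up labels and predictions, only to exercise the metric function.
metrics = classification_metrics(y_true=[1, 0, 1, 1, 0, 0],
                                 y_pred=[1, 0, 0, 1, 1, 0])
```

Because each synthetic row lies on the line segment between two real minority rows, the oversampled class stays inside the range of the observed minority data. Oversampling is commonly applied to the training split only, so that the test metrics reflect the original class distribution; whether the study did so is not stated in the abstract.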