Cardiovascular disease (CVD) remains one of the leading causes of mortality in Indonesia, underscoring the need for effective preventive strategies, including risk prediction systems built on population health data. A major challenge in developing CVD prediction models is the limited availability of local medical data that adequately represent the Indonesian population. This study develops a CVD risk prediction model using the Random Forest algorithm by integrating two data sources: private clinical data from cardiology outpatients at RSUD M. Yunus Bengkulu and a publicly available dataset. The two sources were integrated to compensate for the small size of the private dataset and to improve model performance. Experiments were run under three settings. SHapley Additive exPlanations (SHAP) was used to analyze the contribution of each feature, and Recursive Feature Elimination with Cross-Validation (RFECV) was applied for feature selection. Scenario 3 of the data integration experiment achieved the best performance, with an accuracy of 73.57%, a recall of 81.44%, and an F1-score of 77.06%. SHAP analysis identified blood pressure and age as the most influential predictors of CVD risk. These findings indicate that integrating limited private data with public datasets can improve model performance while providing clinically interpretable insights, particularly in settings where local data are scarce.
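
As a rough illustration of the kind of pipeline described above, the following sketch pools a small "private" sample with a larger "public" one, selects features with RFECV around a Random Forest, and reports accuracy, recall, and F1-score. It is a minimal sketch under stated assumptions, not the study's implementation: the data are synthetic (generated with scikit-learn's `make_classification`), and the variable names, split sizes, hyperparameters, and the choice to evaluate on held-out private records are all hypothetical. SHAP analysis is omitted because it requires the separate `shap` package.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import StratifiedKFold, train_test_split

# Synthetic stand-in for both data sources (hypothetical; not the study's data).
X, y = make_classification(
    n_samples=1200, n_features=8, n_informative=4, random_state=0
)
X_private, y_private = X[:200], y[:200]   # small "private" sample
X_public, y_public = X[200:], y[200:]     # larger "public" sample

# Assumption: part of the private data is held out for evaluation.
X_priv_train, X_priv_test, y_priv_train, y_priv_test = train_test_split(
    X_private, y_private, test_size=0.3, stratify=y_private, random_state=0
)

# Data integration: pool the private training rows with the public rows.
X_train = np.vstack([X_priv_train, X_public])
y_train = np.concatenate([y_priv_train, y_public])

# RFECV drops one feature per step and keeps the subset with the best
# cross-validated F1-score, using a Random Forest as the base estimator.
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    step=1,
    cv=StratifiedKFold(n_splits=3),
    scoring="f1",
    min_features_to_select=2,
)
selector.fit(X_train, y_train)

# predict() applies the selected feature mask, then the refitted forest.
y_pred = selector.predict(X_priv_test)
metrics = {
    "accuracy": accuracy_score(y_priv_test, y_pred),
    "recall": recall_score(y_priv_test, y_pred),
    "f1": f1_score(y_priv_test, y_pred),
}
print("selected features:", selector.n_features_)
print(metrics)
```

In this sketch, `selector.support_` gives the boolean mask of retained features, which could then be passed to a feature-attribution method such as SHAP.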