Support Vector Machine (SVM) is a machine learning algorithm used for classification. Its advantages include the ability to handle high-dimensional data, effectiveness on nonlinear data through kernel functions, and resistance to overfitting through soft margins. However, SVM has weaknesses, particularly in handling missing values in data, so its use must take into account the missing-value strategy chosen. Missing values are a serious problem in data mining because they cause loss of efficiency, complications in data handling and analysis, and bias arising from differences between missing and complete data. To address these problems, this research focuses on understanding the characteristics of missing values and handling them with the Multiple Imputation by Chained Equations (MICE) technique. We conducted experiments on secondary data containing missing values from the Meteorological, Climatological, and Geophysical Agency (BMKG), related to predicting potential rain, particularly in DKI Jakarta. The study identifies the types or patterns of missing values, explores the relationship between missing values and other variables, applies MICE to handle the missing values, and uses the SVM algorithm for classification, with the aim of producing a more reliable and accurate prediction model for rain potential. The results show that imputation with MICE gives better results than the other techniques considered (Complete Case Analysis; mean, median, and mode imputation; and K-Nearest Neighbor), reaching an accuracy of 89% on the testing data when SVM is used for classification.
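The workflow described above, chained-equation imputation followed by SVM classification, can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes scikit-learn, whose experimental IterativeImputer is a MICE-inspired imputer, and it runs on synthetic data rather than the BMKG records. The feature construction, the missingness rate, the hypothetical "rain" label, and all hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required to expose IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for weather features (assumption: NOT the BMKG data).
n_samples = 400
X_full = rng.normal(size=(n_samples, 4))
X_full[:, 1] += 0.8 * X_full[:, 0]  # correlate two features so the chained regressions have signal
y = (X_full[:, 0] + X_full[:, 2] > 0).astype(int)  # hypothetical rain / no-rain label

# Remove roughly 15% of the entries completely at random (assumed missingness pattern).
X = X_full.copy()
X[rng.random(X.shape) < 0.15] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Impute each feature from the others in round-robin fashion, scale, then classify.
# The imputer is fitted on the training split only, so no test information leaks into it.
model = make_pipeline(
    IterativeImputer(max_iter=10, random_state=0),
    StandardScaler(),
    SVC(kernel="rbf", C=1.0),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy on synthetic data: {accuracy:.2f}")
```

With default settings the imputer returns a single completed dataset. A closer analogue of multiple imputation would repeat the imputation with sample_posterior=True under several random seeds and combine the resulting predictions.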
Copyrights © 2023