Arief Rachman Hakim
Department Of Statistics, Faculty Of Sciences And Mathematics, Diponegoro University

Published: 19 documents

APLIKASI NAÏVE BAYES CLASSIFIER (NBC) PADA KLASIFIKASI STATUS GIZI BALITA STUNTING DENGAN PENGUJIAN K-FOLD CROSS VALIDATION Riza Rizqi Robbi Arisandi; Budi Warsito; Arief Rachman Hakim
Jurnal Gaussian Vol 11, No 1 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v11i1.33991

Abstract

Stunting in Indonesia is a long-standing problem. One of many efforts to address it is an accelerated stunting reduction program that aims to improve the nutritional status of the community and to reduce the prevalence of stunted toddlers. The index generally used to determine the nutritional status of stunted toddlers is height-for-age. This study aims to identify the classification results, evaluate the model, and predict the nutritional status of stunted toddlers using the Naïve Bayes Classifier algorithm with K-Fold Cross Validation testing. To facilitate the analysis, the data are processed through a GUI-R (Graphical User Interface) built with the Shiny package in RStudio. With 10-Fold Cross Validation, the Naïve Bayes Classifier reached its highest accuracy, 94.39%, on the 6th iteration and its lowest accuracy, 82.08%, on the 8th iteration. The average accuracy across iterations is 88.46%, so the Naïve Bayes Classifier model is considered good enough to classify data on the nutritional status of stunted toddlers.
Keywords: Stunting, Data Mining, Naïve Bayes Classifier, K-Fold Cross Validation, Shiny Package
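As a rough illustration of the evaluation scheme described in this abstract, the sketch below runs k-fold cross-validation around a small Naïve Bayes classifier and collects one accuracy per fold. It is written in Python rather than the R/Shiny setup the study reports, and it assumes a Gaussian likelihood for numeric features, which the abstract does not specify. The function names and the toy height-for-age values are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_gaussian_nb(X, y):
    """Estimate the prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(X, y, k):
    """Train on k-1 folds, test on the held-out fold, return per-fold accuracy."""
    accuracies = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = fit_gaussian_nb(train_X, train_y)
        correct = sum(predict_gaussian_nb(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Hypothetical toy data: one height-for-age z-score per toddler.
toy_y = ["stunted" if i % 2 == 0 else "normal" for i in range(20)]
toy_X = [[(-3.0 if label == "stunted" else 0.0) + 0.1 * (i % 5)]
         for i, label in enumerate(toy_y)]
fold_accuracies = cross_validate(toy_X, toy_y, k=5)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
```

The highest, lowest, and average accuracy quoted in the abstract correspond to the maximum, minimum, and mean of a list like `fold_accuracies`. Contiguous folds are used here for simplicity; real data would normally be shuffled or stratified first.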
PEMODELAN AUTOREGRESSIVE FRACTIONALLY INTEGRATED MOVING AVERAGE DENGAN EFEK EXPONENTIAL GARCH (ARFIMA-EGARCH) UNTUK PREDIKSI HARGA BERAS DI KOTA SEMARANG Rezky Dwi Hanifa; Mustafid Mustafid; Arief Rachman Hakim
Jurnal Gaussian Vol 10, No 2 (2021): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v10i2.29933

Abstract

Time series data are often used to estimate future values. The long memory phenomenon often occurs in time series data. Long memory is a condition in which observations remain strongly correlated even when they are far apart in time. This phenomenon can be handled by modeling the time series with the Autoregressive Fractionally Integrated Moving Average (ARFIMA) model, which is characterized by a fractional differencing value. The ARFIMA model assumes that the residuals are normally distributed, mutually independent, and homogeneous in variance. In financial data, however, the residual variance is usually not constant. This can be handled by modeling the variance. A standard tool for modeling the variance is the ARCH/GARCH (Autoregressive Conditional Heteroscedasticity / Generalized Autoregressive Conditional Heteroscedasticity) model. Another phenomenon that often occurs in GARCH models is a leverage effect in the residuals of the model. EGARCH (Exponential Generalized Autoregressive Conditional Heteroscedasticity) is a development of the GARCH model that is appropriate for data with a leverage effect. The model is applied to financial data: this study uses 136 monthly observations of rice prices in Semarang City from January 2009 to April 2020. The purpose of this study is to build a forecasting model for long memory data using the EGARCH method. The best model obtained is ARFIMA(1, d, 1)-EGARCH(1,1), which forecasts with a MAPE of 3.37%.
Keywords: rice price, forecasting, long memory, leverage effect, GARCH, EGARCH
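The fractional differencing at the heart of ARFIMA, and the MAPE used to score the forecasts, can be sketched as follows. This is an illustrative Python sketch, not the study's code: it uses the standard binomial expansion of (1 - B)^d, with weights w_0 = 1 and w_k = w_{k-1} (k - 1 - d) / k, and the function names and example values of d are assumptions.

```python
def frac_diff_weights(d, n_weights):
    """Weights of the expansion (1 - B)^d = sum_k w_k B^k."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply the fractional difference filter, truncated at the series start."""
    weights = frac_diff_weights(d, len(series))
    return [sum(weights[k] * series[t - k] for k in range(t + 1))
            for t in range(len(series))]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# With d = 1 the filter reduces to ordinary first differencing.
first_difference = frac_diff([1.0, 3.0, 6.0, 10.0], d=1.0)
# With 0 < d < 0.5 the weights decay slowly, which is what captures long memory.
long_memory_weights = frac_diff_weights(0.4, 6)
```

For integer d the weights cut off after d + 1 terms, while for fractional d they decay hyperbolically, so distant observations keep a small but non-zero influence.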
LIFE EXPECTANCY MODELING USING MODIFIED SPATIAL AUTOREGRESSIVE MODEL Hasbi Yasin; Budi Warsito; Arief Rachman Hakim; Rahmasari Nur Azizah
MEDIA STATISTIKA Vol 15, No 1 (2022): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.15.1.72-82

Abstract

The presence of outliers affects parameter estimates and model accuracy. This also holds for spatial regression models, especially the Spatial Autoregressive (SAR) model, a regression model in which spatial effects are attached to the dependent variable. Removing outliers from the analysis would discard necessary information. The solution offered is therefore to modify the SAR model by giving special treatment to observations that are potential outliers. This study models life expectancy data in Central Java Province using a modified spatial autoregressive model with the Mean-Shift Outlier Model (MSOM) approach. Outliers are detected using the MSOM method, and the result is then used as the basis for modifying the SAR model. In principle, this modification reduces or increases the mean of the observations flagged as outliers. The results show that the modified model is more accurate than the original SAR model, as shown by its higher coefficient of determination and lower Akaike Information Criterion (AIC) value. In addition, the modified model brings the skewness and kurtosis of the residuals closer to those of the normal distribution.
MODELING LIFE EXPECTANCY IN CENTRAL JAVA USING SPATIAL DURBIN MODEL Arief Rachman Hakim; Hasbi Yasin; Agus Rusgiyono
MEDIA STATISTIKA Vol 12, No 2 (2019): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.12.2.152-163

Abstract

In 2017, Central Java was one of the provinces with high life expectancy, ranking second. Life expectancy in Central Java Province in 2017 was 74.08 years. Education, health, and socio-economic conditions are among the factors thought to influence life expectancy in an area. Regression analysis can be used to find out which factors influence life expectancy. However, because some observations carry a spatial (location) effect, known as spatial data, spatial regression methods were developed by adding spatial effects to linear regression analysis. One form of spatial regression is the Spatial Durbin Model (SDM), which has a form similar to the Spatial Autoregressive (SAR) model. The difference is that the SAR model accounts for the spatial lag of the response variable (Y) only, whereas the SDM accounts for the spatial lag of both the predictor variables (X) and the response (Y). The best model is selected using the Mean Square Error (MSE). The MSE obtained is 1.156411, a value relatively close to 0, which indicates that the model is quite good.
PREDIKSI CURAH HUJAN EKSTREM DI KOTA SEMARANG MENGGUNAKAN SPATIAL EXTREME VALUE DENGAN PENDEKATAN MAX STABLE PROCESS (MSP) Hasbi Yasin; Budi Warsito; Arief Rachman Hakim
MEDIA STATISTIKA Vol 12, No 1 (2019): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.12.1.39-49

Abstract

This research applies the Spatial Extreme Value method with the Max-Stable Process (MSP) approach to analyze extreme rainfall in Semarang City. Extreme value samples are selected by the Block Maxima method and then modeled in Spatial Extreme Value form by including location factors. The data are then transformed to the Fréchet distribution because they show a heavy-tailed pattern. The Max-Stable Process (MSP) is an extension of the multivariate extreme value distribution to infinite dimensions within Extreme Value Theory. After the best model for extreme rainfall data in Semarang is obtained, extreme rainfall is predicted for a given time period. Predictions are calculated using return levels. For a return period of two years, the maximum extreme rainfall is predicted to be 100.7539 mm at the Semarang City Climatology Station, 100.1052 mm at the Tanjung Mas Rain Monitoring Station, and 109.9379 mm at the Ahmad Yani Rain Monitoring Station. Maximum extreme rainfall can also be predicted for other future time periods t.
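The return-level calculation mentioned in this abstract can be illustrated at a single station with the standard Generalized Extreme Value (GEV) quantile formula, z_T = mu - (sigma / xi) * (1 - y^(-xi)) with y = -log(1 - 1/T). The Python sketch below is only a marginal, one-station illustration and not the max-stable fitting procedure of the study. The parameter values are hypothetical and are not estimates from the paper.

```python
import math


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded on average once every `return_period` blocks."""
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV family (shape parameter equal to zero).
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function, used here to check the return level."""
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Hypothetical annual-maximum rainfall parameters (mm), for illustration only.
mu, sigma, xi = 90.0, 20.0, 0.1
two_year_level = gev_return_level(mu, sigma, xi, return_period=2)
ten_year_level = gev_return_level(mu, sigma, xi, return_period=10)
```

By construction, the probability of staying below the T-period return level in one block is 1 - 1/T, so the two-year level is the median of the block-maximum distribution.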
Kernel K-Means Clustering untuk Pengelompokan Sungai di Kota Semarang Berdasarkan Faktor Pencemaran Air Anestasya Nur Azizah; Tatik Widiharih; Arief Rachman Hakim
Jurnal Gaussian Vol 11, No 2 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v11i2.35470

Abstract

K-Means Clustering is a frequently used type of non-hierarchical cluster analysis, but it has a weakness in processing data that are non-linearly separable (lacking clear boundaries) and data with overlapping clusters, that is, cases where a cluster visually lies among other clusters. The Gaussian kernel function in Kernel K-Means Clustering can be used to handle data with non-linearly separable characteristics and overlapping clusters. The difference between Kernel K-Means Clustering and K-Means lies in the input data, which must first be mapped into a new dimension using a kernel function. The real data used are 47 rivers and 18 indicators of river water pollution from Dinas Lingkungan Hidup (DLH) of Semarang City for the first semester of 2019. The cluster results are evaluated using the Calinski-Harabasz, Silhouette, and Xie-Beni indexes. The goals of this study are to understand the steps of Kernel K-Means Clustering and to obtain its results for grouping rivers in Semarang City based on water pollution factors. Based on the cluster evaluation, the best number of clusters is K=4.
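The kernel mapping described in this abstract never has to be computed explicitly: the squared distance from a point to a cluster centroid in the kernel-induced space can be written using kernel values alone, as K(x, x) - (2 / |C|) * sum_j K(x, j) + (1 / |C|^2) * sum_{i,j} K(i, j). The Python sketch below applies that identity with a Gaussian kernel. It is a minimal illustration under assumed settings (the bandwidth `sigma`, the initial labels, and the toy points are hypothetical), not the procedure or data of the study.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two points."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_kmeans(points, initial_labels, n_clusters, sigma, max_iter=50):
    """Reassign points to the nearest centroid in kernel space until stable."""
    n = len(points)
    K = [[gaussian_kernel(points[i], points[j], sigma) for j in range(n)]
         for i in range(n)]
    labels = list(initial_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c]
                   for c in range(n_clusters)]
        # Squared norm of each centroid in kernel space (0.0 for empty clusters).
        within = [sum(K[i][j] for i in m for j in m) / len(m) ** 2 if m else 0.0
                  for m in members]
        new_labels = []
        for i in range(n):
            best_cluster, best_dist = labels[i], math.inf
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                cross = sum(K[i][j] for j in m) / len(m)
                dist = K[i][i] - 2.0 * cross + within[c]
                if dist < best_dist:
                    best_cluster, best_dist = c, dist
            new_labels.append(best_cluster)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical toy data: two well-separated groups, deliberately mislabeled at start.
toy_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_labels = kernel_kmeans(toy_points, [0, 1, 0, 1, 0, 1], n_clusters=2, sigma=2.0)
```

In practice the result depends on the initial labels and on the kernel bandwidth, which is why cluster-validity indexes such as those named in the abstract are used to compare candidate values of K.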
PENGGUNAAN SELEKSI FITUR CHI-SQUARE DAN ALGORITMA MULTINOMIAL NAÏVE BAYES UNTUK ANALISIS SENTIMEN PELANGGAN TOKOPEDIA Tri Ernayanti; Mustafid Mustafid; Agus Rusgiyono; Arief Rachman Hakim
Jurnal Gaussian Vol 11, No 4 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.11.4.562-571

Abstract

E-commerce is a medium for online shopping that is popular among the public. Ease of access for all internet users and the completeness of the products offered make e-commerce a new alternative for meeting the needs of the community. This causes stiff competition among e-commerce companies, so they need the right marketing strategy to compete in obtaining, retaining, and partnering with customers, one element of which is reviewing aspects of customer satisfaction. Tokopedia is an online buying-and-selling e-commerce platform that connects sellers and buyers throughout Indonesia for free. In this study, Tokopedia customer sentiment is analyzed with the Multinomial Naïve Bayes classification. The Multinomial Naïve Bayes algorithm is a development of the Naïve Bayes model. The difference lies in the type of data: Naïve Bayes uses a Gaussian distribution, which is suitable for continuous data, while Multinomial Naïve Bayes is suitable for discrete data such as the number of words in a document. Multinomial Naïve Bayes is the simplest probabilistic classification method but is sensitive to feature selection, so the number of features retained is determined by the Chi-Square results. Multinomial Naïve Bayes is used to classify customer opinions as positive or negative so that Tokopedia customer satisfaction factors can be formed, while Chi-Square is used to measure how strongly each feature depends on the class (positive or negative) so that disturbing features can be removed from the classification process. Multinomial Naïve Bayes without Chi-Square obtained an accuracy of 88% and a kappa statistic of 75.95%, while with Chi-Square it obtained an accuracy of 95% and a kappa statistic of 89.99%. This means that Multinomial Naïve Bayes performs quite effectively in analyzing Tokopedia customer satisfaction sentiment, and that using Chi-Square for feature selection can improve the accuracy of the classification process.
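The Chi-Square feature scoring described in this abstract can be sketched for a two-class problem with the standard 2x2 contingency statistic, N (ad - bc)^2 / ((a + b)(c + d)(a + c)(b + d)), where the four cells count documents by term presence and class. The Python sketch below is illustrative only. The function names, the presence-based counting, and the toy reviews are assumptions, not the preprocessing or data of the study.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a term present in all or no documents carries no signal
    return n * (a * d - b * c) ** 2 / denom


def score_terms(documents, labels, positive_label):
    """Score every term by its dependence on the class label."""
    vocabulary = set().union(*documents)
    scores = {}
    for term in vocabulary:
        a = sum(1 for doc, y in zip(documents, labels) if term in doc and y == positive_label)
        b = sum(1 for doc, y in zip(documents, labels) if term in doc and y != positive_label)
        c = sum(1 for doc, y in zip(documents, labels) if term not in doc and y == positive_label)
        d = sum(1 for doc, y in zip(documents, labels) if term not in doc and y != positive_label)
        scores[term] = chi_square_2x2(a, b, c, d)
    return scores


def select_top_terms(scores, top_k):
    """Keep the top_k terms, breaking ties alphabetically for determinism."""
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:top_k]


# Hypothetical tokenized reviews (sets of words) with sentiment labels.
toy_docs = [{"bagus", "cepat"}, {"bagus", "barang"},
            {"buruk", "barang"}, {"buruk", "lambat"}]
toy_labels = ["positive", "positive", "negative", "negative"]
toy_scores = score_terms(toy_docs, toy_labels, positive_label="positive")
kept_terms = select_top_terms(toy_scores, top_k=2)
```

Terms that appear equally often in both classes score zero and are the "disturbing features" that selection removes before the classifier is trained.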
ANALISIS INDEKS HARGA SAHAM GABUNGAN DAN FAKTOR PENGARUHNYA MENGGUNAKAN PEMODELAN REGRESI SEMIPARAMETRIK KERNEL DILENGKAPI GUI-R Arnisa Melani Kahar; Suparti Suparti; Arief Rachman Hakim
Jurnal Gaussian Vol 12, No 1 (2023): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.12.1.30-41

Abstract

The Composite Stock Price Index (IDX) shows the movement of stock prices and is used by investors to determine their investment strategy. IDX movement is influenced by macroeconomic factors such as money supply and inflation, so regression analysis is used to determine the relationship between the variables. Based on the scatterplots, money supply is treated as a parametric predictor variable because its scatterplot follows a linear pattern, and inflation is treated as a nonparametric predictor variable because its scatterplot is randomly patterned, so semiparametric regression modeling is used for the analysis. Kernel regression was chosen for the nonparametric component because of the random pattern in the inflation scatterplot. This study aims to obtain the results of semiparametric kernel regression modeling and to create a GUI for the analysis, as a development of previous similar studies that were still carried out through a CLI. This study uses monthly data from January 2013 to December 2020, with an in-sample to out-of-sample split of 87.5%:12.5%. Using the smallest MSE as the criterion for the best model, the best model obtained is the semiparametric regression model with the triangle kernel function and optimal bandwidth = 3.24; the model is reported to be strong and its forecasting results very accurate. The GUI has been created according to the needs of the modeling analysis.
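To make the kernel-smoothing step concrete, the sketch below implements a Nadaraya-Watson estimator with a triangle kernel, K(u) = 1 - |u| for |u| <= 1 and 0 otherwise. It covers only a nonparametric smoother for one predictor. The abstract does not state which kernel estimator was used or how the parametric part was combined with it, so the choice of Nadaraya-Watson, the function names, and the toy values here are assumptions.

```python
def triangle_kernel(u):
    """Triangle kernel: linear decay to zero at |u| = 1."""
    return max(0.0, 1.0 - abs(u))


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys around the evaluation point x0."""
    weights = [triangle_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations within one bandwidth of x0")
    return sum(w * y for w, y in zip(weights, ys)) / total


def mean_squared_error(actual, fitted):
    """MSE, the model-selection criterion named in the abstract."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)


# Hypothetical toy values standing in for inflation (x) and the index (y).
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_y = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [nadaraya_watson(x, toy_x, toy_y, bandwidth=2.0) for x in toy_x]
toy_mse = mean_squared_error(toy_y, fitted)
```

A bandwidth search would repeat the fit over candidate bandwidths and keep the one with the smallest MSE, which is how a value such as the reported optimum would be chosen.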
PEMODELAN PERTUMBUHAN EKONOMI DI PROVINSI BANTEN MENGGUNAKAN MIXED GEOGRAPHICALLY WEIGHTED REGRESSION Hasbi Yasin; Budi Warsito; Arief Rachman Hakim
MEDIA STATISTIKA Vol 11, No 1 (2018): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.11.1.53-64

Abstract

Economic growth can be measured by Gross Regional Domestic Product (GRDP). According to official statistical news from BPS, economic growth in the Banten region has increased to 5.59%. This growth is supported by several sectors, including agriculture, business, industry, and various other fields. The Mixed Geographically Weighted Regression (MGWR) method was developed from linear regression by adding a spatial or location effect (longitude and latitude), so the resulting model of economic growth in Banten is local, differing from one location to another. MGWR combines linear regression and GWR: the linear regression parameters are global, while the GWR parameters are local. The results are more specific because economic growth in the Banten region is assessed by location.
Keywords: Banten, Economic growth, MGWR.