Contact Name
Etis Sunandi
Contact Email
esunandi@unib.ac.id
Phone
6281295949261
Journal Mail Official
jsds_statistika@unib.ac.id
Editorial Address
Jl. WR. Supratman Kelurahan Kandang Limun Kota Bengkulu
Location
Kota Bengkulu,
Bengkulu
INDONESIA
Journal of Statistics and Data Science
Published by Universitas Bengkulu
ISSN : -     EISSN : 2828-9986     DOI : https://doi.org/10.33369/jsds
Established in 2022, the Journal of Statistics and Data Science (JSDS) publishes scientific papers in the fields of statistics, data science, and their applications. Published papers should be research-based papers on the following topics: experimental design and analysis, survey methods and analysis, operations research, data mining, machine learning, statistical modeling, computational statistics, time series, econometrics, statistical education, and other related topics. All papers are reviewed by peer reviewers consisting of experts and academics across universities and agencies. The journal is published twice a year, in March and October.
Articles 36 Documents
Modeling of Tuberculosis Cases in Sumatera Region using Poisson Inverse Gaussian Regression -, She Asa Handarzeni
Journal of Statistics and Data Science Vol. 1 No. 2 (2022)
Publisher : UNIB Press

DOI: 10.33369/jsds.v1i2.24453

Abstract

In the Sumatra Region, tuberculosis (TB) is a disease that needs special attention because the number of cases tends to increase every year. According to health theory, many factors cause TB, but it is not easy to determine which of them have a significant effect. This study therefore carries out an analysis to model and predict TB cases and to determine the factors behind them in the Sumatra Region. The data are TB cases in the Sumatra Region in 2018, taken from a publication of the Central Statistics Agency. Poisson regression is suitable for modeling count data such as TB case data. It assumes that the mean and variance of the response variable are equal (equidispersion). However, the 2018 TB case data for the Sumatra Region have a mean that is smaller than the variance (overdispersion), so Poisson regression is not appropriate. To address this, a method that accommodates overdispersion is needed, namely Poisson Inverse Gaussian (PIG) regression. The PIG regression results indicate that the factors with a significant effect on TB cases in the Sumatra Region are the percentage of the male population (X1), the percentage of the productive-age population (X2), the percentage of households with a floor area of ≤ 19 m² (X3), and the percentage of households with access to proper sanitation (X4). Based on the fitted model, predicted TB cases in the Sumatra Region averaged 596.04178, with the lowest predicted value in Pringsewu (154.8943) and the highest in Bukittinggi (2719.59400).
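The equidispersion assumption described in this abstract can be screened informally by comparing the sample variance of the counts with their mean. The sketch below is only an illustration: the function name `dispersion_index` and the counts are hypothetical and are not the TB data from the study, and a raw variance-to-mean ratio ignores covariates, so it is a first check rather than a formal test.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance divided by sample mean for a list of counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the index is undefined")
    return variance(counts) / m


# Hypothetical counts for illustration only (not data from the study).
example_counts = [2, 3, 1, 14, 0, 25, 4, 2]

index = dispersion_index(example_counts)
# An index well above 1 suggests overdispersion relative to a Poisson model.
overdispersed = index > 1
print(f"variance/mean = {index:.2f}, overdispersed = {overdispersed}")
```

A ratio close to 1 is what a Poisson model expects; a ratio well above 1 is the situation in which a mixed Poisson model such as PIG is usually considered.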
Applied Different Pixel Selection in METRIC Model for Estimating Spatial Daily Evapotranspiration of Oil Palm in East Kalimantan Province, Indonesia Dhohir, Nur Muhammad Abdul; June, Tania
Journal of Statistics and Data Science Vol. 2 No. 1 (2023)
Publisher : UNIB Press

DOI: 10.33369/jsds.v2i1.24805

Abstract

Determination of evapotranspiration (ET) plays a key role in managing water in oil palm plantations. Several energy balance models have been developed for mapping evapotranspiration regionally. This study aims to estimate daily evapotranspiration in an oil palm plantation using the METRIC model, in which pixel selection is applied and corrected using hot and cold pixels. Climate data were collected from the ERA-5 reanalysis, and Landsat 8 imagery was used for the spatial analysis. The means ± standard deviations of ET without pixel selection (with pixel selection in parentheses) for oil palms aged 4, 6, 7, 8, 9, 11, 12 and 13 years were 3.19 ± 1.62 mm d-1, 3.31 ± 1.14 mm d-1, 4.01 ± 0.96 mm d-1, 4.84 ± 0.87 mm d-1, 6.29 ± 0.43 mm d-1, 5.72 ± 0.44 mm d-1, 6.43 ± 0.23 mm d-1 and 6.21 ± 0.33 mm d-1 (4.22 ± 0.49 mm d-1, 3.99 ± 0.22 mm d-1, 2.96 ± 0.34 mm d-1, 3.14 ± 0.33 mm d-1, 4.22 ± 0.49 mm d-1, 3.99 ± 0.22 mm d-1, 4.26 ± 0.24 mm d-1 and 4.18 ± 0.30 mm d-1), respectively. ET determination was more accurate with pixel selection (higher coefficient of determination).
Goodness Test of Adaptability to Model of Technical Changes and Test of Forecasting Accuracy susiawati, susiawati; Kurniawan, Budi
Journal of Statistics and Data Science Vol. 2 No. 1 (2023)
Publisher : UNIB Press

DOI: 10.33369/jsds.v2i1.27257

Abstract

The input-output technical coefficient, as an element of the technical coefficient matrix (A), is expected to give good forecasts for the next several periods. By substituting the final demand (F) for the period into the Input-Output (IO) model in the equation, the total output for the period is obtained as a forecast. The forecast total output is then compared with the actual total output to see the magnitude of the deviation. In a regression equation, the coefficient of determination is a measure of “goodness of fit” that states how well the regression line explains the dependent variable in terms of the independent variable. The test is carried out by regressing the input-output technical coefficient in the year against the technical coefficient in the nth year in a simple linear regression equation. This test was conducted to assess the validity of the technical coefficients in forecasting with the IO model. This research is an empirical study that uses data from the Jambi Province Input-Output Tables for 1998, 2007 and 2016, each of which has been collected in a common set to allow comparison between observation periods. The results show that the technical change model performs reasonably well for forecasting under the assumption that the technical coefficients are constant during the planning period. Meanwhile, the estimated output tends to deviate upward from the actual data.
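The forecasting step described in this abstract can be illustrated with the standard Leontief relation x = (I − A)⁻¹ F. The abstract's own equation is missing from this listing, so the use of this particular form is an assumption here, and the two-sector matrix, the demand vector, and the function names `leontief_output` and `percent_deviation` are hypothetical values and helpers chosen only for illustration.

```python
def leontief_output(A, F):
    """Solve (I - A) x = F for a two-sector economy via the 2x2 inverse."""
    a, b = 1 - A[0][0], -A[0][1]
    c, d = -A[1][0], 1 - A[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("(I - A) is singular")
    x1 = (d * F[0] - b * F[1]) / det
    x2 = (-c * F[0] + a * F[1]) / det
    return [x1, x2]


def percent_deviation(forecast, actual):
    """Deviation of a forecast from the actual value, in percent of actual."""
    return 100.0 * (forecast - actual) / actual


# Hypothetical technical coefficients and final demand (not from the study).
A = [[0.2, 0.3],
     [0.4, 0.1]]
F = [100.0, 200.0]

output = leontief_output(A, F)
print(output)
```

With coefficients held constant, a forecast for a later period only requires substituting that period's final demand for `F`; `percent_deviation` then compares each forecast output with the observed one.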
A Panel Data Regression Analysis for Economic Growth Rate In Bengkulu Province Supianti, Filo
Journal of Statistics and Data Science Vol. 2 No. 1 (2023)
Publisher : UNIB Press

DOI: 10.33369/jsds.v2i1.27258

Abstract

Panel data is a combination of time series data and cross section data. The analytical method used for panel data is panel data regression. One of the advantages of analysis using panel data regress [...] Economic growth is one of the indicators used to measure the development of the production of goods and services in an economic area in a given year relative to the previous year, calculated from GDP/GRDP at constant prices. The dependent variable in this study is the growth rate of GRDP. The independent variables are IPM (Human Development Index), TPAK (Labor Force Participation Rate), and TPT (Open Unemployment Rate). This study uses panel data regression analysis with the Common Effect Model (CEM), Fixed Effect Model (FEM) and Random Effect Model (REM). The data were processed in RStudio.
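The difference between the Common Effect Model and the Fixed Effect Model named in this abstract can be sketched with a single regressor. The example below is a simplified illustration under stated assumptions: the data are made up so that each unit has its own intercept and a true slope of 2, the function names `ols_slope` and `within_slope` are hypothetical, and the Random Effect Model is not covered.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def within_slope(units, x, y):
    """Fixed-effects slope: demean x and y within each unit, then run OLS."""
    by_unit = {}
    for u, xi, yi in zip(units, x, y):
        by_unit.setdefault(u, []).append((xi, yi))
    x_dm, y_dm = [], []
    for rows in by_unit.values():
        mx = mean([r[0] for r in rows])
        my = mean([r[1] for r in rows])
        x_dm.extend(r[0] - mx for r in rows)
        y_dm.extend(r[1] - my for r in rows)
    return ols_slope(x_dm, y_dm)


# Made-up panel: unit "A" has intercept 10, unit "B" has intercept 0, slope 2.
units = ["A", "A", "A", "B", "B", "B"]
x = [1, 2, 3, 4, 5, 6]
y = [12, 14, 16, 8, 10, 12]

pooled = ols_slope(x, y)            # common effect: ignores unit intercepts
fixed = within_slope(units, x, y)   # fixed effect: removes unit intercepts
print(pooled, fixed)
```

In this constructed example the pooled slope is negative because the unit intercepts are confounded with x, while the within estimator recovers the slope of 2; in practice the choice among CEM, FEM and REM is made with specification tests rather than by inspection.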
Modeling Social Media Use and Anxiety Levels With Students’ Sleep Quality: Ordinal Logistic Regression ., Annisa Agustina
Journal of Statistics and Data Science Vol. 2 No. 1 (2023)
Publisher : UNIB Press

DOI: 10.33369/jsds.v2i1.27259

Abstract

This study models sleep quality using ordinal logistic regression, since the response variable is categorical. Its purpose was to identify factors related to students' sleep quality based on social media usage and anxiety levels. One hundred and fifty students of SMAN 1 Tualang, Riau, were selected using the snowball technique and participated online. The results showed that social media usage and anxiety are both associated with sleep quality. The degree of dependence of sleep quality on social media usage was 59.3%, and on anxiety level it was 65.3%. The ordinal logistic regression analysis showed that students who were inactive on social media had good sleep quality at a rate of 0.462 times that of students who were active on social media. Meanwhile, students with mild anxiety levels had good sleep quality at a rate of 0.369 times that of students with moderate anxiety levels.
Earthquake Clustering Using the CLARA Method and Modeling Using the Inhomogeneous Spatial Cox Processes Method in the Ambon Region Meiwidian, Muhamad Iqbal; Crisdianto, Riki; Rini, Dyah Setyo
Journal of Statistics and Data Science Vol. 2 No. 2 (2023)
Publisher : UNIB Press

DOI: 10.33369/jsds.v2i2.30249

Abstract

Earthquakes are natural events whose time and place cannot be predicted. Ambon is the largest city in the Maluku Islands region and is the center of development and the capital of Maluku Province. This research groups earthquake events, analyzes their characteristics, creates and maps earthquake zones using CLARA cluster analysis, and builds a model of the risk of earthquake events at a location based on its distance to faults and subduction zones using the Inhomogeneous Neyman-Scott Cox Process. The data are earthquake events in the Ambon region obtained from the United States Geological Survey (USGS) catalog from January 1926 to December 2022, with a depth of ≤ 360.1 km and a magnitude of ≥ 4 Mw. Grouping earthquake events in the Ambon area using CLARA cluster analysis produced 2 clusters with an optimal silhouette score of 0.7430. The resulting model does not fit well, because the K-function plot of the original data is far from the K-function plot of the fitted model.
Application of Small Area Estimation for Estimation of Sub-District Level Poverty in Bengkulu Province: Comparison of Empirical Best Linear Unbiased Prediction (EBLUP) and Hierarchical Bayesian (HB) Methods Pratama, Auliya Yudha Pratama
Journal of Statistics and Data Science Vol. 3 No. 1 (2024)
Publisher : UNIB Press

DOI: 10.33369/jsds.v3i1.30367

Abstract

Poverty is an important problem facing the world, and various efforts are made to eradicate it. In planning poverty alleviation, policy makers need detailed information down to the smallest area level that can be produced. Demand for estimates at the small area level is increasing, while the success of indirect estimation in reducing the Relative Standard Error (RSE) depends heavily on data conditions and on selecting the right method. This study compares direct estimates of the percentage of poor people with indirect estimates obtained using Small Area Estimation (SAE) techniques, namely the Empirical Best Linear Unbiased Predictor (EBLUP) and Hierarchical Bayesian (HB) methods, in a case study of poverty data at the sub-district level in Bengkulu Province. The data are from the Social and Economic Survey (Susenas) of March 2022 and the 2021 Village Potential Data Collection (Podes). One sub-district was not sampled in the March 2022 Susenas. The average RSE of the direct estimator is 47.014, the average RSE of the EBLUP estimator is 39.40, and that of the HB estimator is 15.318. In addition, the SAE EBLUP and HB methods reduce the mean and median RSE of the estimates compared with direct estimation. The RSE of the direct estimator is greater than the RSE of the indirect estimators.
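The Relative Standard Error used to compare the estimators in this abstract is conventionally the standard error expressed as a percentage of the estimate. The sketch below assumes that conventional definition, since the abstract does not state its formula; the function name `relative_standard_error` and the input values are hypothetical and are not results from the study.

```python
from statistics import mean


def relative_standard_error(estimate, standard_error):
    """RSE in percent: 100 * standard error / |estimate|."""
    if estimate == 0:
        raise ValueError("estimate is zero; RSE is undefined")
    return 100.0 * standard_error / abs(estimate)


# Hypothetical (estimate, standard error) pairs for three small areas.
areas = [(20.0, 5.0), (10.0, 4.0), (8.0, 1.0)]

rse_values = [relative_standard_error(est, se) for est, se in areas]
average_rse = mean(rse_values)
print(rse_values, average_rse)
```

Averaging the per-area RSE values, as in the last step, is the kind of summary the abstract reports for the direct, EBLUP and HB estimators.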
Modeling the Open Unemployment Rate of Regency/City in West Java Province in 2021 using Spatial Autoregressive Moving Average and Spatial Durbin Model hermalia, Lia
Journal of Statistics and Data Science Vol. 2 No. 2 (2023)
Publisher : UNIB Press

DOI: 10.33369/jsds.v2i2.30380

Abstract

The Open Unemployment Rate is an important indicator of labor that is not absorbed by the labor market. According to Statistics Indonesia, in August 2021 the Open Unemployment Rate in Indonesia was 6.49%, or around 9.1 million people. With a population of 50 million, West Java Province has a high unemployment rate, reaching 9.82%. On examination, the open unemployment rate in West Java tends to cluster, being higher in the west and lower in the east, which indicates that there are spatial factors in the data. Therefore, an analysis was conducted with Labor Force Participation Rate, Expected Years of Schooling, and Expenditure per capita as independent variables to measure their influence on the Open Unemployment Rate, using the Spatial Autoregressive Moving Average and Spatial Durbin Model methods. The results show that both methods are significant in all tests conducted. The best method was then chosen by comparing AIC values: the best method for modeling the Open Unemployment Rate in West Java Province is the Spatial Durbin Model, with an R-squared of 81.32%. This indicates that the independent variables account for 81.32% of the variation, while 18.68% is influenced by other variables not examined.
Sentiment Analysis of Twitter User’s Perceptions of the Campus Merdeka Using Naïve Bayes Classifier and Support Vector Machine Methods Salsabilla, Intan; Alwansyah, Muhammad Arib; Nugroho, Sigit; Agwil, Winalia
Journal of Statistics and Data Science Vol. 2 No. 2 (2023)
Publisher : UNIB Press

DOI: 10.33369/jsds.v2i2.30577

Abstract

The Campus Merdeka program is being implemented by the government to realize autonomous and flexible learning in tertiary institutions and to create a learning culture that is innovative, not restrictive, and suited to the needs of students. The Campus Merdeka program provides added value, is attractive, and has drawn various responses from the public, both directly and on different social media platforms. One of these platforms is Twitter. Therefore, research was conducted on the community's response to the Campus Merdeka program on Twitter. Twitter documents in the form of community response tweets about the Campus Merdeka program are classified into two categories, namely positive responses and negative responses. The methods used in this study are the Naïve Bayes Classifier (NBC) and the Support Vector Machine (SVM) with a polynomial kernel of degree 2. The highest accuracy obtained in this research is 73.5%, with a parameter value of 0.5 and a constant value of 0.5, using 309 documents as training data and 132 documents as test data. The accuracy obtained is 65.9% for the Naïve Bayes Classifier method and 73.5% for the Support Vector Machine method.
Classification of Hypertension Patients in Palembang by K-Nearest Neighbor and Local Mean K-Nearest Neighbor Rosdiana, Rosdiana
Journal of Statistics and Data Science Vol. 3 No. 1 (2024)
Publisher : UNIB Press

DOI: 10.33369/jsds.v3i1.32381

Abstract

Classification is a multivariate technique for separating distinct sets of objects and allocating new objects to predefined groups. Methods that can be used for classification include the k-Nearest Neighbor (KNN) and Local Mean k-Nearest Neighbor (LMKNN) methods. The KNN method classifies objects based on the majority voting principle, while LMKNN classifies objects based on the local mean vector of the k nearest neighbors in each class. This study compares the results of classifying hypertensive patient data at the Merdeka Health Center in Palembang City using the KNN and LMKNN methods, looking at the accuracy and the smallest APER value produced. The results showed that, using the same proportion of training and testing data and different k values, the KNN and LMKNN methods yielded the same APER (error rate) and accuracy for the hypertension patient data at the Merdeka Health Center in Palembang City, namely 0.0303 and 96.97%, respectively.
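The two decision rules contrasted in this abstract can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and made-up two-dimensional points; the function names `knn_predict`, `lmknn_predict` and `aper` are hypothetical, and nothing here reproduces the study's data or settings.

```python
from collections import Counter
from math import dist


def knn_predict(train, labels, query, k):
    """KNN: majority vote among the k training points nearest to the query."""
    order = sorted(range(len(train)), key=lambda i: dist(train[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def lmknn_predict(train, labels, query, k):
    """LMKNN: for each class, average its k nearest points, pick closest mean."""
    best_label, best_distance = None, float("inf")
    for label in sorted(set(labels)):
        members = [p for p, l in zip(train, labels) if l == label]
        nearest = sorted(members, key=lambda p: dist(p, query))[:k]
        local_mean = [sum(col) / len(nearest) for col in zip(*nearest)]
        d = dist(local_mean, query)
        if d < best_distance:
            best_label, best_distance = label, d
    return best_label


def aper(actual, predicted):
    """Apparent error rate: share of misclassified objects (1 - accuracy)."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


# Made-up training points for two classes (not the patient data).
train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

queries = [(1, 1), (5, 4)]
knn_out = [knn_predict(train, labels, q, k=3) for q in queries]
lmknn_out = [lmknn_predict(train, labels, q, k=2) for q in queries]
print(knn_out, lmknn_out)
```

Because APER is the misclassification share, it equals one minus the accuracy, which is consistent with the reported pair of 0.0303 and 96.97%.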
