Contact Name
Akbar Rizki
Contact Email
akbar.ritzki@apps.ipb.ac.id
Phone
+628111144470
Journal Mail Official
akbar.ritzki@apps.ipb.ac.id
Editorial Address
Departemen Statistika, IPB Jl. Meranti Kampus IPB Darmaga Wing 22, Level 4 Bogor 16680
Location
Kota Bogor,
Jawa Barat
INDONESIA
Xplore: Journal of Statistics
ISSN : 23025751     EISSN : 26552744     DOI : https://doi.org/10.29244/xplore
Xplore: Journal of Statistics is published periodically, three times a year, and contains scientific writing related to the field of statistics. Published articles are research results or literature reviews in statistics and/or its applications. ISSN: 2302-5751. Since December 2018, Xplore: Journal of Statistics has had a new ISSN for its online edition (eISSN: 2655-2744), in accordance with decree (SK) no. 0005.26552744/JI.3.1/SK.ISSN/2018.12 dated 13 December 2018. In accordance with that decree, editions of Xplore: Journal of Statistics from December 2018 onward start at Volume 7, No. 3.
Articles 106 Documents
Latent Dirichlet Allocation for Identifying the Indonesian Public's Response to Covid-19 in 2020-2021
Karel Fauzan Hakim; Pika Silvianti; Agus Mohamad Soleh
Xplore: Journal of Statistics Vol. 10 No. 3 (2021)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v10i3.836

Abstract

Covid-19 is a very troubling disease in Indonesia. Understanding public opinion is therefore needed to find solutions and to evaluate the government's performance in handling the pandemic. Twitter can help identify public opinion on significant events. Tweets are high-dimensional, text-based big data, so text sampling and text mining are needed to process them efficiently and effectively. Stratified random sampling with 20 repetitions, treating days as strata, was applied, followed by topic modeling with latent Dirichlet allocation (LDA). This research aims to find out public opinion regarding Covid-19 and its development over time. It also aims to find out the effects of stratified random sampling on tweet data. To that end, the extracted topics are transformed into time-series data, taking into account the variety of patterns produced, and the transformed results are then explored and interpreted. The research suggests that discussions related to Covid-19 fall into four topics under the first model, namely “Vaccine”, “Positive or affected people”, “Health protocol”, and “Indonesia”, and into nine topics under the second model, namely “Vaccine”, “Prayer”, “Health protocol”, “Social aid and corruption”, “Affected people”, “Indonesian economy”, “Work”, “Persuading to wear mask”, and “Willing to watch”. Furthermore, some topics peak whenever a significant event occurs in Indonesia. The research also suggests that 20 repetitions of stratified random sampling could provide good results.
Identifying Factors That Influence High School (SMA) Accreditation Results in Indonesia Based on ARKAS Data
Muh Nur Fiqri Adham; Budi Susetyo; Kusman Sadik; Satriyo Wibowo
Xplore: Journal of Statistics Vol. 10 No. 3 (2021)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v10i3.837

Abstract

Accreditation is an indicator of the quality of education at the education unit level. One factor that affects the quality of education units is the school budget. School budgets are prepared in order to fulfill the 8 national education standards. School budget management uses the School Activity Plan and Budget Application (ARKAS) developed by the Ministry of Education, Culture, Research and Technology (Kemendikbudristek). ARKAS is an information system for managing school budget and expenditure planning. This research identifies the factors that influence high school (SMA) accreditation, with accreditation as the response variable and 17 explanatory variables sourced from ARKAS and Dapodik data, using ordinal logistic regression analysis. The best model is the one with the smallest AIC value and high model accuracy. It is the third-stage model, which is composed of 7 explanatory variables that affect the high school accreditation rating, with an AIC value of 1886.20 and a model accuracy of 65.79%. The variables that affect accreditation results are school status, the percentage of students eligible for PIP, the ratio of students to teachers, the percentage of teachers who are certified educators, the ratio of students to study groups, the ratio of students to computers, and the ratio of students to toilets.
Clustering SMA/MA Quality by Province Based on Accreditation Results Using the Fuzzy C-Means Method
Rifannisa Bahar; Pika Silvianti; Budi Susetyo
Xplore: Journal of Statistics Vol. 10 No. 3 (2021)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v10i3.842

Abstract

The quality of education in Indonesia needs to be mapped so that provincial governments, as the institutions responsible for secondary education management policies, can more easily set priorities and decide what actions to take to improve it. One analytical method that can be used to map the quality of education is fuzzy c-means. This research aims to group provinces in Indonesia by education quality based on the results of SHS/MA accreditation using the fuzzy c-means method. The fuzzy c-means method indicates how likely an object is to belong to a cluster through a degree of membership. The optimum cluster sizes obtained were 2 and 3. The final solution with cluster size 2 placed 12 provinces in cluster 1 and 22 provinces in cluster 2. Clustering with cluster size 3 resulted in cluster 1 consisting of 11 provinces, cluster 2 consisting of 16 provinces, and cluster 3 consisting of 7 provinces. The main characteristic of cluster 1 is a high national education standard score, while the main characteristic of cluster 2 is a low national education standard score. The main characteristic of cluster 3 is a national education standard score around the national average.
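The degree of membership mentioned in this abstract can be illustrated with the standard fuzzy c-means membership formula, u_j = 1 / sum_k (d_j / d_k)^(2/(m-1)). The sketch below is not the authors' code: the function name `memberships`, the fuzzifier value m = 2, and the example distances are assumptions made only for illustration.

```python
def memberships(distances, m=2.0):
    """Degrees of membership of one object in each cluster.

    distances: distance from the object to each cluster center.
    m: fuzzifier (> 1); m = 2 is an assumed, commonly used value.
    """
    # An object sitting exactly on a center belongs fully to that center.
    zeros = [d == 0 for d in distances]
    if any(zeros):
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** power for d_k in distances)
        for d_j in distances
    ]


# Hypothetical object: distance 1 from center 1 and 3 from center 2.
u = memberships([1.0, 3.0])
print(u)  # memberships sum to 1; the closer center gets the larger share
```

The memberships always sum to 1 across clusters, which is why an object can be described as belonging partly to several clusters at once.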
Comparison of ARIMA and Artificial Neural Networks in Forecasting the Number of Positive Covid-19 Cases in DKI Jakarta
Tri Wahyuni; Indahwati Indahwati; Kusman Sadik
Xplore: Journal of Statistics Vol. 10 No. 3 (2021)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v10i3.846

Abstract

DKI Jakarta is the center of the spread of Covid-19, as indicated by its higher cumulative number of positive Covid-19 cases compared to other provinces. The high number of cases in DKI Jakarta is a concern for all groups, so forecasting is needed to predict the number of positive Covid-19 cases in the next period. Accurate forecasting is needed to obtain better results. This study compares the Autoregressive Integrated Moving Average (ARIMA) and Artificial Neural Networks (ANN) methods in predicting the number of positive Covid-19 cases in DKI Jakarta. Forecasting accuracy is measured using the Root Mean Square Error (RMSE), the Mean Absolute Percentage Error (MAPE), and correlation. The results show that the best model for forecasting the number of positive Covid-19 cases in DKI Jakarta is ARIMA(0,1,1) with drift, with a MAPE of 15.748, an RMSE of 268.808, and a correlation between the forecast and actual values of 0.845. The ARIMA(0,1,1) with drift and BP(3,10,1) models produce the best forecasts for the long forecasting period of the next six weeks.
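The three accuracy measures named in this abstract follow standard definitions, sketched below. This is an illustration only, not the authors' code: the function names and the small example series are assumptions, and MAPE is expressed here in percent.

```python
import math


def rmse(actual, forecast):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


def pearson(x, y):
    """Pearson correlation between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical actual and forecast case counts.
actual = [100, 200, 400]
forecast = [110, 180, 400]
print(rmse(actual, forecast), mape(actual, forecast), pearson(actual, forecast))
```

Lower RMSE and MAPE and a correlation closer to 1 indicate forecasts that track the actual series more closely.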
Elastic Net Regression with Area Summarization to Measure the Accuracy of the 2017 and 2019 Non-Invasive Devices
Fariz Mufti Rusdana; Itasia Dina Sulvianti; Erfiani
Xplore: Journal of Statistics Vol. 11 No. 1 (2022)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v11i1.848

Abstract

Diabetes mellitus is a dangerous disease because it is hard to cure. It is therefore important for everyone to monitor and check their blood glucose levels regularly to keep diabetes mellitus from getting worse. The non-invasive biomarking team from IPB is currently developing a non-invasive blood glucose measurement device. The team has so far created 2 products, the 2017 and 2019 designs, whose output is a residual intensity spectrum in the time domain. Calibration modeling is therefore needed to predict blood glucose levels. The best calibration modeling method for the 2017 device was found by Herianti (2020): elastic net regression with the DDC algorithm to handle outliers. In 2019, blood glucose levels were measured using a different tool. This research aims to determine which of the 2 available tools is more stable for measuring blood glucose levels non-invasively, and to determine a more accurate method for summarizing the residual intensity spectrum. The more stable tool is the 2017 device. The summarization in this research uses a trapezoidal area and a 3-digit summarization approach. The results showed that the 2 summarization methods did not differ significantly in accuracy.
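Summarizing a spectrum by its trapezoidal area can be sketched with the ordinary trapezoidal rule. The abstract does not say how the area was computed, so the function name `trapezoid_area`, the sampling times, and the intensity values below are assumptions made for illustration only.

```python
def trapezoid_area(times, values):
    """Area under a sampled curve using the trapezoidal rule.

    times: increasing sample times; values: intensity at each time.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    return sum(
        (times[i + 1] - times[i]) * (values[i] + values[i + 1]) / 2.0
        for i in range(len(times) - 1)
    )


# Hypothetical residual intensity sampled at three time points.
print(trapezoid_area([0.0, 1.0, 2.0], [0.0, 1.0, 0.0]))
```

Each spectrum is reduced to a single number this way, which can then serve as a predictor in a calibration model.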
Analysis of Social Media Posts on Rachel Vennya's Instagram Using the Importance Performance Method
Else Virdiani; Aam Alamudi; Yenni Angraini
Xplore: Journal of Statistics Vol. 11 No. 1 (2022)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v11i1.850

Abstract

Instagram is a social media application that lets its users publish photos or videos. Rachel Vennya is a well-known Instagram user who has more than five million followers. This research was conducted to identify the posts that Rachel Vennya's followers expect on Instagram. Importance-performance analysis (IPA) reveals the types of posts that are interesting and should be published more often. This study uses two IPA approaches, namely expected performance analysis (EPA) and importance-performance matrix analysis (IPMA). The results of each analysis are then mapped onto a Cartesian diagram, which shows which posts increase follower loyalty and which posts need to be increased or decreased. Comparing the two Cartesian diagrams shows no difference in the placement of variables between the two analyses. Posts that deserve to be maintained include Motivation, Cooking, and Family, while posts considered excessive are Business and Endorsements. Furthermore, customer satisfaction index (CSI) analysis was carried out to gauge follower satisfaction. The CSI value obtained is 72.69, which places the follower satisfaction index in the satisfied category.
Application of Support Vector Machine with SMOTE for Sentiment Classification of Omnibus Law News Coverage on the CNNIndonesia.com Site
Widiananda Putri Hutami; Hari Wijayanto; Itasia Dina Sulvianti
Xplore: Journal of Statistics Vol. 11 No. 1 (2022)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v11i1.852

Abstract

The declaration of the omnibus law drew both support and opposition in the community. In a situation like this, the media should be neutral. One of the media outlets that still maintains neutrality is Detik (Rumata 2017). Detik owns several channels such as detikNews, detikFinance, and CNN Indonesia. This study examines the neutrality of CNN Indonesia, as part of Detik, based on the sentiment tendency of its omnibus law-related news. Sentiment analysis is used to examine the trend of opinion in news headlines, and it requires a classification method. The classification method used in this research is the Support Vector Machine (SVM). The data are imbalanced across the three sentiment categories, so the Synthetic Minority Oversampling Technique (SMOTE) is used to overcome this imbalance. The omnibus law tends to be reported neutrally by the CNNIndonesia.com site. The one-vs-all method gives better classification results than the one-vs-one method. Applying SMOTE gives only slightly better results than classification without SMOTE because the imbalance in the data is not too extreme. Modeling using the one-vs-all method with SMOTE and a 90% training and 10% test data split gives the best classification results, with a macro-average F1-score of 60.33%.
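The macro-average F1-score reported in this abstract is, by its standard definition, the unweighted mean of the per-class F1-scores. The sketch below illustrates that definition; it is not the authors' code, and the function name, the three sentiment labels, and the example predictions are assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1-scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical headline sentiments: true labels versus predicted labels.
y_true = ["pos", "pos", "neu", "neu", "neg", "neg"]
y_pred = ["pos", "neu", "neu", "neu", "neg", "pos"]
print(macro_f1(y_true, y_pred))
```

Because every class counts equally in the average, the macro F1-score is sensitive to poor performance on minority classes, which makes it a natural choice for imbalanced sentiment data.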
Clustering Regencies/Municipalities in Indonesia Based on Human Development Index Indicators Using the K-Means and Fuzzy C-Means Methods
Hanniva; Anang Kurnia; Septian Rahardiantoro; Ahmad Ansori Mattjik
Xplore: Journal of Statistics Vol. 11 No. 1 (2022)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v11i1.855

Abstract

Human development index achievement in Indonesia differs between regions, with striking gaps between the western and eastern parts of the country. This difference can be seen more clearly by grouping regencies/municipalities in Indonesia based on the four indicators of the human development index. To this end, this study uses the k-means and fuzzy c-means methods to determine the optimal cluster size with two distance approaches, namely the Euclidean and Manhattan distances, on the 2020 human development index indicator data. In addition, this study seeks to identify the distribution of regencies/municipalities based on the characteristics of the human development index indicators in the clustering result. The results show that the best distance measure is the Euclidean distance, with an optimal cluster size of four for k-means and six for fuzzy c-means. The clustering results obtained by the k-means method are also better than those of fuzzy c-means because the evaluation value is better. In general, the four clusters formed agree with the grouping carried out by BPS, with the percentage of conformity reaching 66.54%. In summary, most regencies/municipalities on the islands of Sumatera, Java, Borneo and Sulawesi have higher life expectancy and per capita expenditure than many regencies/municipalities in the Nusa Tenggara Islands (besides Bali), the Moluccas and Papua. Very high achievement on each HDI indicator is dominated by the capital city of each province, while unfavorable conditions occur in most regencies/municipalities in Papua Province.
CHAID and Logistic Regression Approaches to Analyzing Factors Influencing the Incidence of Stunting in West Java Province
Fitri Dewi Shyntia; Anang Kurnia; Gerry Alfa Dito
Xplore: Journal of Statistics Vol. 11 No. 1 (2022)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v11i1.857

Abstract

Stunting is a chronic nutritional disorder characterized by short or very short height compared to the average child of the same age. Data on the prevalence of stunting in children under five collected by the World Health Organization (WHO) in 2018 stated that Indonesia was the third-highest contributor to stunting in the South-East Asia Region (SEAR) after Timor Leste and India. Indonesia's national stunting prevalence is 29.6%. West Java Province, which has the 12th highest prevalence in Indonesia, is one of the priority areas in stunting management; its stunting prevalence of 29.2% is the closest to the national figure. This study aims to examine the variables that are indicated to affect the incidence of stunting in children aged 0-59 months based on data obtained from the 2018 Basic Health Research (Riskesdas). Eighteen variables are categorized into child characteristics, nutritional fulfillment, socio-demographic, socioeconomic, and environmental characteristics. The analysis was performed using the logistic regression method and the Chi-Square Automatic Interaction Detection (CHAID) method. The results show that the probability of stunting increases significantly in children under five who meet several criteria. These criteria are: mothers with low education, male toddlers, toddlers who are not immunized, toddlers who are not given additional food (PMT), and toddlers in households that have a safe place to eat and where the disposal of wastewater from the kitchen is not suitable.
Performance Comparison of the Logistic Model Tree and Random Forest Methods in Data Classification
Purnama Sari; Kusman Sadik; Mulianto Raharjo
Xplore: Journal of Statistics Vol. 12 No. 1 (2023)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v12i1.858

Abstract

Multicollinearity and missing data are two common problems in big data. Missing data can decrease prediction accuracy. The logistic model tree (LMT) is used to handle multicollinearity because multicollinearity does not affect decision trees. Random forest can be used to reduce variance in prediction. This study compares the two methods, LMT and random forest, under multicollinearity and missing data in various scenarios, using a simulation study and real data. Model evaluation is based on classification accuracy and AUC. The results show that random forest performed better when the multicollinearity level was moderate. LMT with missing data omitted performed better for big data with a high percentage of missing data and severe multicollinearity. The next step was to analyze real data with different sample sizes, where random forest performed better. Omitting missing data gave better performance in classifying the “breast cancer” data, which contain 0.3% missing data.
