Articles

Latent Dirichlet Allocation dalam Identifikasi Respon Masyarakat Indonesia Terhadap Covid-19 Tahun 2020-2021 Karel Fauzan Hakim; Pika Silvianti; Agus Mohamad Soleh
Xplore: Journal of Statistics Vol. 10 No. 3 (2021)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v10i3.836

Abstract

Covid-19 is a very troubling disease in Indonesia. Understanding public opinion is therefore required to find solutions and to evaluate the government's performance in handling the pandemic. Twitter can help identify public opinion on significant events. Tweets are high-dimensional, text-based big data, so text sampling and text mining are needed to process them efficiently and effectively. Stratified random sampling with 20 repetitions was applied, treating days as strata, followed by topic modeling with latent Dirichlet allocation (LDA). This research aims to find out public opinion regarding Covid-19 and its growth over time. It also aims to find out the effect of stratified random sampling on tweet data. The extracted topics are therefore transformed into time-series data, taking into account the variety of patterns produced, and the transformed results are then explored and interpreted. The research suggests that discussions related to Covid-19 are divided into four topics by the first model, namely “Vaccine”, “Positive or affected people”, “Health protocol”, and “Indonesia”, and into nine topics by the second model, namely “Vaccine”, “Prayer”, “Health protocol”, “Social aid and corruption”, “Affected people”, “Indonesian economy”, “Work”, “Persuading to wear mask”, and “Willing to watch”. Furthermore, some topics peak whenever a significant event occurs in Indonesia. The research also suggests that 20 repetitions of stratified random sampling could provide good results.
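The sampling step described in this abstract can be illustrated with a minimal sketch. It assumes each tweet is a `(day, text)` pair and that the same fraction is drawn from every day; the fraction, the toy data and the function name `stratified_sample` are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

def stratified_sample(tweets, fraction, seed):
    """Draw the same fraction of tweets from every day (each day is one stratum)."""
    rng = random.Random(seed)
    by_day = defaultdict(list)
    for day, text in tweets:
        by_day[day].append(text)
    sample = []
    for day in sorted(by_day):
        texts = by_day[day]
        k = max(1, round(fraction * len(texts)))  # keep at least one tweet per day
        sample.extend((day, t) for t in rng.sample(texts, k))
    return sample

# Hypothetical toy data: 3 days with 10 tweets each.
tweets = [(f"2021-01-0{d}", f"tweet {d}-{i}") for d in range(1, 4) for i in range(10)]

# 20 independent repetitions, one seed per repetition.
repetitions = [stratified_sample(tweets, fraction=0.3, seed=r) for r in range(20)]
print(len(repetitions), len(repetitions[0]))
```

Each repetition could then be passed to a topic model such as LDA, and the spread of results across repetitions gives a rough view of the sampling effect.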
Analisis Tingkat Kepuasan Pelanggan dan Loyalitas Pelanggan terhadap Cafe Infinity Coffee Muhammad Nuruddin Prathama; Muhammad Nur Aidi; Agus Mohamad Soleh
Xplore: Journal of Statistics Vol. 11 No. 2 (2022)
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v11i2.898

Abstract

Cafe and restaurant businesses are among the most competitive businesses and have a sizeable market in Jakarta. Restaurant owners must therefore know the wishes and preferences of their buyers. This study was conducted at Infinity Coffee, a cafe in Jakarta, to identify consumer characteristics, customer satisfaction, and consumer loyalty. Applying customer satisfaction analysis to the Infinity Coffee business can increase understanding of what its consumers want and help improve service quality based on the results. The analytical methods used are descriptive analysis, Importance Performance Analysis (IPA), the Consumer Satisfaction Index (CSI), and correspondence analysis. The results indicate that the satisfaction index for Infinity Coffee service is above 80% for all aspects, which places it in the satisfied category. However, the IPA scatter diagram shows that some attributes with a high level of importance still need better service quality. One of the most important attributes prioritized for improvement is the completeness of supporting facilities and adequate cutlery. The methods used proved successful in examining the level of consumer satisfaction and in learning more about the characteristics of the consumers.
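As a rough illustration of the two indices named in this abstract, the sketch below computes a satisfaction index from mean importance and performance scores and assigns each attribute to an IPA quadrant. It assumes a 5-point scale, the common importance-weighted form of the index, and quadrant boundaries at the grand means; the paper may use different conventions, and all scores are hypothetical.

```python
def csi(importance, performance, scale_max=5):
    """Satisfaction index in percent from mean attribute scores (assumed formula)."""
    total_importance = sum(importance)
    weighted = sum((imp / total_importance) * perf
                   for imp, perf in zip(importance, performance))
    return 100 * weighted / scale_max

def ipa_quadrants(importance, performance):
    """Label each attribute by comparing it with the grand means of both axes."""
    imp_mean = sum(importance) / len(importance)
    perf_mean = sum(performance) / len(performance)
    labels = []
    for imp, perf in zip(importance, performance):
        if imp >= imp_mean and perf < perf_mean:
            labels.append("concentrate here")
        elif imp >= imp_mean:
            labels.append("keep up the good work")
        elif perf < perf_mean:
            labels.append("low priority")
        else:
            labels.append("possible overkill")
    return labels

# Hypothetical mean scores for four service attributes.
importance = [4.8, 4.5, 3.9, 3.6]
performance = [3.9, 4.4, 4.3, 4.0]
print(round(csi(importance, performance), 1))
print(ipa_quadrants(importance, performance))
```

Attributes labelled "concentrate here" are the ones with high importance and below-average performance, which is the kind of attribute the abstract singles out for improvement.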
SUPPORT VECTOR REGRESSION (SVR) METHOD FOR PADDY GROWTH PHASE MODELING USING SENTINEL-1 IMAGE DATA Hengki Muradi; Asep Saefuddin; I Made Sumertajaya; Agus Mohamad Soleh; Dede Dirgahayu Domiri
MEDIA STATISTIKA Vol 16, No 1 (2023): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.16.1.25-36

Abstract

Support Vector Machines (SVMs) have received extensive attention over the last decade because they are claimed to produce accurate models that predict well in various situations. This study aims to test the SVR (Support Vector Regression) method for modeling the growth phase of paddy using Sentinel-1 image data. Its accuracy was compared with that of the LR (Linear Model) method using the RMSE and R² statistics, and model stability was assessed using 10 repetitions. Among models with two predictors, accuracy is best when the NDPI and API polarization indices are the predictors. The paddy age model from the SVR method is better than the one from the LR method: the SVR method produces a model with an average RMSE of 11.13 and an average coefficient of determination of 88.10%. The accuracy of the SVR model with NDPI and API predictors can be improved by adding VH polarization to the model, in which case the average RMSE decreases to 11.0 and the average coefficient of determination becomes 88.42%. In this scenario, the best model gives a minimum RMSE of 10.35 and a coefficient of determination of 90.05%.
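The two accuracy statistics reported in this abstract can be computed as in the following sketch. The observed and predicted values are hypothetical and only show the arithmetic; they are not the paper's data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observed and predicted paddy ages.
observed = [10, 30, 50, 70, 90]
predicted = [12, 28, 50, 74, 88]
print(round(rmse(observed, predicted), 3), round(100 * r_squared(observed, predicted), 2))
```

Averaging these statistics over repeated train/test splits, as the abstract does with 10 repetitions, gives a view of model stability as well as accuracy.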
BETA-BINOMIAL MODEL IN SMALL AREA ESTIMATION USING HIERARCHICAL LIKELIHOOD APPROACH Etis Sunandi; Khairil Anwar Notodiputro; Indahwati Indahwati; Agus Mohamad Soleh
MEDIA STATISTIKA Vol 16, No 1 (2023): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.16.1.88-99

Abstract

Small Area Estimation is a statistical method used to estimate parameters in sub-populations with small or even no sample sizes. This research aims to evaluate the performance of the Beta-Binomial model for small area estimation at the area level. The estimation method used is Hierarchical Likelihood (HL). Both simulated and empirical data are used. Simulation studies were used to investigate the proposed model. The estimated Mean Squared Error of Prediction (MSEP) and the Absolute Bias (AB) of the estimators serve as the criteria for the best estimation. The empirical study uses data on the illiteracy rate at the sub-district level in Bengkulu Province. The simulation results show that, in general, the parameter estimators are nearly unbiased. The proportion predictions show the same tendency as the parameters. Finally, the HL estimator has a small estimated MSEP. The empirical results show that the average illiteracy rate in Bengkulu Province is quite diverse. Kepahiang District had the highest average illiteracy rate in Bengkulu Province in 2021.
Land Use Change Modelling Using Logistic Regression, Random Forest and Additive Logistic Regression in Kubu Raya Regency, West Kalimantan Alfa Nugraha Pradana; Anik Djuraidah; Agus Mohamad Soleh
Forum Geografi Vol 37, No 2 (2023): December 2023
Publisher : Universitas Muhammadiyah Surakarta

DOI: 10.23917/forgeo.v37i2.23270

Abstract

Kubu Raya Regency is a regency in the province of West Kalimantan with a wetland ecosystem that includes a high-density swamp or peatland ecosystem along with an extensive area of mangroves. Wetland ecosystems are essential for fauna, as a source of livelihood for the surrounding community, and as a storage reservoir for carbon stocks. Most of the land in Kubu Raya Regency is peatland. As a consequence, peat has long been used for agriculture and as a source of livelihood for the community. Because of its vast area of peat, the regency also faces a potentially high risk of peat fires. This study aims to predict land use changes in Kubu Raya Regency using three statistical machine learning models, specifically Logistic Regression (LR), Random Forest (RF) and Additive Logistic Regression (ALR). Land cover map data were acquired from the Ministry of Environment and Forestry and subsequently reclassified into six types of land cover at a resolution of 100 m. The land cover data were used to classify the land use or land cover class for Kubu Raya Regency for the years 2009, 2015 and 2020. Based on model performance, RF provides greater accuracy and a higher F1 score than LR and ALR. The outcome of this study is expected to provide knowledge and recommendations that may aid future sustainable development planning and management for Kubu Raya Regency.
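The two performance measures used to compare the models can be sketched as follows. The abstract does not say how F1 was averaged over the six land cover classes, so the macro average here is an assumption, and the class labels and predictions are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Share of cells whose predicted class matches the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive):
    """F1 for one class, treating it as positive and all other classes as negative."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    classes = sorted(set(y_true))
    return sum(f1_score(y_true, y_pred, c) for c in classes) / len(classes)

# Hypothetical reference and predicted land cover for six cells.
reference = ["forest", "forest", "peat", "peat", "crop", "crop"]
predicted = ["forest", "peat", "peat", "peat", "crop", "forest"]
print(round(accuracy(reference, predicted), 3), round(macro_f1(reference, predicted), 3))
```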
Pengaruh Penggunaan Random Undersampling, Oversampling, dan SMOTE terhadap Kinerja Model Prediksi Penyakit Cardiovascular (CVD) Uswatun Hasanah; Agus Mohamad Soleh; Kusman Sadik
Jurnal Matematika, Statistika dan Komputasi Vol. 21 No. 1 (2024): SEPTEMBER 2024
Publisher : Department of Mathematics, Hasanuddin University

DOI: 10.20956/j.v21i1.35552

Abstract

Cardiovascular Disease (CVD), commonly known as heart disease, is a leading cause of mortality globally, prompting extensive research into predictive models to assess individual risk and plan preventive measures. Machine learning approaches such as Random Forest, Support Vector Machine (SVM), and LASSO Logistic Regression have shown promise. Recent studies have indicated that traditional resampling methods like Random Oversampling, Random Undersampling, and SMOTE may not significantly improve model discrimination. This study aims to evaluate the impact of these techniques on the performance of CVD prediction models, using data from the UCI Machine Learning Heart Disease database. By combining LASSO Logistic Regression, Random Forest, and SVM with Random Oversampling, Random Undersampling, and SMOTE, the research seeks to improve understanding of model performance under class imbalance and to contribute to refining CVD prediction strategies. The study demonstrates that the SMOTE technique significantly enhances the performance of CVD prediction models. Specifically, when combined with the Random Forest algorithm, SMOTE achieves the best performance in terms of accuracy, sensitivity, and specificity. This highlights the importance of selecting appropriate resampling techniques to handle class imbalance in datasets and provides new insights into improving prediction accuracy in imbalanced medical data.
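The three resampling techniques compared in this abstract can be sketched in plain Python. This is a simplified illustration, not the implementation used in the study: the SMOTE function interpolates between a minority row and one of its `k` nearest minority neighbours, and the two-feature toy data are hypothetical.

```python
import math
import random

def random_oversample(minority, n_target, rng):
    """Duplicate randomly chosen minority rows until n_target rows exist."""
    return minority + [rng.choice(minority) for _ in range(n_target - len(minority))]

def random_undersample(majority, n_target, rng):
    """Keep a random subset of n_target majority rows."""
    return rng.sample(majority, n_target)

def smote(minority, n_new, k, rng):
    """Create n_new synthetic rows on segments joining nearby minority rows."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

rng = random.Random(0)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]   # hypothetical rows
majority = [(float(i), float(i % 3)) for i in range(3, 15)]   # hypothetical rows

balanced_up = random_oversample(minority, len(majority), rng)
balanced_down = random_undersample(majority, len(minority), rng)
synthetic = smote(minority, n_new=len(majority) - len(minority), k=2, rng=rng)
print(len(balanced_up), len(balanced_down), len(synthetic))
```

Resampling is normally applied to the training split only, so that the test data keep the original class proportions.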
Metode Machine Learning-Based Univariate Time Series Imputation Method untuk Estimasi Nilai Hilang pada Data Non-Stasioner Dini Ramadhani; Agus Mohamad Soleh; Erfiani Erfiani
Jurnal Matematika, Statistika dan Komputasi Vol. 21 No. 1 (2024): SEPTEMBER 2024
Publisher : Department of Mathematics, Hasanuddin University

DOI: 10.20956/j.v21i1.36468

Abstract

Handling missing values in time series data is crucial because they can disrupt data analysis and interpretation. Sequentially missing values in time series often pose a more complex challenge compared to randomly missing values. One of the promising recent methods is Machine Learning-Based Univariate Time Series Imputation (MLBUI), although it is still not widely used and its accessibility is limited. MLBUI employs Random Forest Regression (RFR) and Support Vector Regression (SVR) algorithms. This study evaluates the performance of MLBUI in addressing missing data scenarios in non-stationary univariate time series data. The data used in this research is the average temperature data from Bogor Regency. The missing data scenarios considered include rates of 6%, 10%, and 14%. Besides MLBUI, five other comparison methods are used: Kalman StructTS, Kalman Auto-ARIMA, Spline Interpolation, Stine Interpolation, and Moving Average. The results show that MLBUI performs poorly for non-stationary data, although the obtained Mean Absolute Percentage Error (MAPE) is below 10%.
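The evaluation idea in this abstract, hiding a run of consecutive values, imputing them, and scoring the result with MAPE, can be sketched as below. Linear interpolation is used here only as a simple stand-in imputer; it is not MLBUI and is not one of the paper's comparison methods. The temperature values and gap position are hypothetical.

```python
def mape(actual, imputed):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, imputed)) / len(actual)

def linear_fill(series, start, length):
    """Fill series[start:start + length] by interpolating between its two neighbours."""
    left, right = series[start - 1], series[start + length]
    step = (right - left) / (length + 1)
    return [left + step * (i + 1) for i in range(length)]

# Hypothetical daily average temperatures with a sequential gap of three values.
temps = [25.1, 25.4, 25.9, 26.3, 26.0, 25.6, 25.8, 26.4]
start, length = 2, 3
truth = temps[start:start + length]
filled = linear_fill(temps, start, length)
print(round(mape(truth, filled), 2))
```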
Evaluasi Perbandingan Kinerja Algoritma Cheng and Church Biclustering Terhadap Algoritma Clustering Klasik K-Means untuk Mengidentifikasi Pola Distribusi Barang Ekspor Indonesia Baehera, Seta; Utami Dyah Syafitri; Agus Mohamad Soleh
Jurnal Statistika dan Aplikasinya Vol. 7 No. 2 (2023): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.07204

Abstract

Clustering is the process of grouping data into several groups (clusters) so that data within a cluster are similar to one another and data in different clusters are dissimilar. A common example of a clustering algorithm is K-Means Clustering. In contrast to classical clustering algorithms, biclustering groups data in two dimensions at once. A biclustering algorithm finds submatrices of the data, that is, subgroups of rows and subgroups of columns that are highly correlated. One example is Cheng and Church Biclustering (CC Biclustering). The aim of this research is to evaluate the performance of the biclustering algorithm against classical clustering algorithms. The analysis was applied to CC Biclustering and K-Means Clustering to identify distribution patterns of Indonesian export goods in the period 2013 to 2022. Based on the results, the optimal scenario for the K-Means algorithm is scenario 2, which applies K-Means with 7 clusters and data scaling as pre-processing. The optimal scenario for the CC Biclustering algorithm is scenario 1, which applies CC Biclustering with a tolerance value of 0.10 and data scaling as pre-processing. Comparing these two scenarios on the MSR/Volume value, the best is scenario 1 of the CC Biclustering algorithm, which has an MSR/Volume value of 0.077.
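The mean squared residue (MSR) that Cheng and Church biclustering minimises can be computed as in this sketch. The matrices are hypothetical, and the function only scores a given bicluster; it does not perform the search itself.

```python
def mean_squared_residue(matrix, rows, cols):
    """MSR of the submatrix picked out by the given row and column indices."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    # Residue of a cell: value minus row mean minus column mean plus overall mean.
    return sum((sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
               for i in range(n_rows) for j in range(n_cols)) / (n_rows * n_cols)

# A perfectly additive pattern has MSR 0; a checkerboard pattern does not.
additive = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
checker = [[1, 2], [2, 1]]
print(mean_squared_residue(additive, [0, 1, 2], [0, 1, 2]))
print(mean_squared_residue(checker, [0, 1], [0, 1]))
```

If volume is taken to be the number of cells in the bicluster (an assumption about the abstract's MSR/Volume criterion), dividing the MSR by `len(rows) * len(cols)` favours biclusters that are both coherent and large.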
Perbandingan Algoritma Pohon dengan Beberapa Skenario Pelabelan untuk Analisis Sentimen pada Aplikasi Milik Pemerintah/BUMN Fitrianto, Anwar; Rizki Manaf, Silmi Anisa; Soleh, Agus Mohamad
JEPIN (Jurnal Edukasi dan Penelitian Informatika) Vol 10, No 1 (2024): Volume 10 No 1
Publisher : Program Studi Informatika

DOI: 10.26418/jp.v10i1.73512

Abstract

The growth of the digital era has produced many innovations intended to make people's activities easier in various fields, one of which is the availability of applications that make those activities more efficient and accessible from anywhere. Applications owned by the government and by state-owned enterprises (BUMN), which are national-scale companies, tend not to be widely known, and many have low ratings accompanied by a wide variety of user reviews. Sentiment analysis is a suitable method for analyzing reviews of the selected applications. The data used are reviews of the InfoBMKG, BPOM Mobile, MyIndihome, and MyPertamina applications. The study aims to compare the performance of the double random forest algorithm with other tree-based algorithms, namely decision tree, extra trees, and random forest, based on model accuracy. The data were labeled by application rating, by a lexicon-based approach, and by sentiment scoring, with predictor variables produced by unigram tokenization weighted with TF-IDF. Each observation was categorized into a positive, neutral, or negative class. The results show that the extra trees algorithm combined with the sentiment scoring labeling method produced the best performance, with average accuracy reaching 80 to 84% for each of the selected applications.
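The feature construction mentioned in this abstract, unigram tokens weighted by TF-IDF, can be sketched as follows. Several TF-IDF variants exist; this one uses relative term frequency and an unsmoothed logarithmic inverse document frequency, which is an assumption rather than the paper's exact weighting. The reviews are hypothetical and must be non-empty.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return one {term: weight} dictionary per document, using unigram tokens."""
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    weights = []
    for tokens in tokenised:
        counts = Counter(tokens)
        weights.append({term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
                        for term, count in counts.items()})
    return weights

# Hypothetical app reviews.
reviews = ["aplikasi bagus", "aplikasi lambat", "aplikasi sering error"]
weights = tfidf(reviews)
print(weights[0])
```

A term that appears in every review, such as "aplikasi" here, receives weight zero, so the tree-based classifiers would see only the terms that distinguish one review from another.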
BHF and copula models in small area estimation for household per capita expenditure in Bogor District BELINDA, NADIRA SRI; NOTODIPUTRO, KHAIRIL ANWAR; SOLEH, AGUS MOHAMAD
Jurnal Natural Volume 24 Number 2, June 2024
Publisher : Universitas Syiah Kuala

DOI: 10.24815/jn.v24i2.37278

Abstract

Small area statistics are required when the sample size is too small to produce estimates with adequate precision. The assumptions underlying Battese-Harter-Fuller (BHF) unit-level models may be unrealistic in some applications. Copula is an alternative approach when the assumptions are violated. This research discusses the performance of BHF and Copula in small area estimation (SAE) for estimating household per capita expenditure at the sub-district level. Household per capita expenditure has a skewed distribution. Because the data contain outliers, an appropriate method to handle outliers is also considered. In this research, the Gaussian and Clayton Copulas are used. The results showed that BHF performed better than the Gaussian and Clayton Copulas, as indicated by a small root mean square error (RMSE) with an average of 1.14, while the average RMSE was 2.71 for the Gaussian copula and 2.63 for the Clayton copula. Furthermore, the coefficient of variation (CV) of BHF was also smaller than that of the Gaussian and Clayton Copulas, and the resulting estimates can be categorized as reliable based on a CV of less than 25%.
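The reliability check mentioned at the end of this abstract can be illustrated with a short sketch. It assumes the usual definition of the coefficient of variation in small area estimation, the root of the estimated mean squared error divided by the point estimate, and applies the 25% threshold from the abstract. The estimates and MSE values are hypothetical.

```python
import math

def coefficient_of_variation(estimate, mse):
    """CV in percent: root MSE relative to the point estimate (assumed definition)."""
    return 100 * math.sqrt(mse) / estimate

def is_reliable(estimate, mse, threshold=25.0):
    """An estimate counts as reliable when its CV is below the threshold."""
    return coefficient_of_variation(estimate, mse) < threshold

# Hypothetical sub-district estimates with their estimated MSEs.
areas = {"A": (1.2, 0.04), "B": (0.9, 0.09)}
for name, (estimate, mse) in areas.items():
    print(name, round(coefficient_of_variation(estimate, mse), 1), is_reliable(estimate, mse))
```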