Journal : Jurnal Gaussian

IMPLEMENTASI PAKET SHINY PADA PEMODELAN MULTISCALE AUTOREGRESSIVE UNTUK DATA HARGA SAHAM BBRI Bahtiar Ilham Triyunanto; Suparti Suparti; Rukun Santoso
Jurnal Gaussian Vol 10, No 3 (2021): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v10i3.32781

Abstract

Stocks attract investors because they can earn large profits through claims on a company's income and assets, so investors need to anticipate future stock price movements to achieve their investment goals. One statistical method for modeling time series data is ARIMA. However, its modeling assumptions must be fulfilled, so an alternative is proposed, namely a nonparametric regression model, which has no such assumption requirements. In this study, nonparametric multiscale autoregressive (MAR) models with two different filters and decomposition levels J are compared to choose the best model and forecast with it. The data are the closing, high, and low stock prices of BBRI, divided into two parts: in-sample data from March 19, 2020 to February 4, 2021, used to build the model, and out-of-sample data from February 5, 2021 to March 23, 2021, used to evaluate model performance based on MAPE values. The best model for each price series is the MAR model with the Haar wavelet filter and decomposition level 5, which produces a MAPE of 1.194% for the closing price, 1.283% for the high price, and 1.141% for the low price, indicating that the models have excellent forecasting capability. A Graphical User Interface (GUI) is also built in R with the shiny package, making data analysis easier and the output display more interactive.
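The MAPE criterion used above to evaluate out-of-sample forecasts can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `mape` and the price values are hypothetical and are not taken from the study's data.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical out-of-sample closing prices and forecasts (illustration only).
actual = [4500.0, 4550.0, 4600.0, 4580.0]
forecast = [4450.0, 4600.0, 4590.0, 4600.0]
score = mape(actual, forecast)
```

Because every term is divided by the actual value, the measure is undefined when an actual value is zero, which is not a concern for stock prices.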
GRAFIK PENGENDALI MIXED EXPONENTIALLY WEIGHTED MOVING AVERAGE – CUMULATIVE SUM (MEC) DALAM ANALISIS PENGAWASAN PROSES PRODUKSI (Studi Kasus : Wingko Babat Cap “Moel”) Aulia Resti; Tatik Widiharih; Rukun Santoso
Jurnal Gaussian Vol 10, No 1 (2021): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v10i1.30938

Abstract

Quality control plays an important role in industry for maintaining quality stability. Statistical process control can quickly detect the occurrence of unforeseen causes or process shifts using control charts. The Mixed Exponentially Weighted Moving Average - Cumulative Sum (MEC) control chart is a tool used to monitor and evaluate whether a production process is in control. The MEC control chart combines the Exponentially Weighted Moving Average (EWMA) and Cumulative Sum (CUSUM) charts. Combining the two charts aims to increase the sensitivity of the control chart in detecting out-of-control conditions. The Average Run Length (ARL) was used to compare the sensitivity of the EWMA, CUSUM, and MEC methods. Based on the comparison of ARL values, the MEC chart is the most sensitive of the three in detecting out-of-control conditions for small shifts. Keywords: Control Chart, Exponentially Weighted Moving Average, Cumulative Sum, Mixed EWMA-CUSUM, Average Run Length, EWMA, CUSUM, MEC, ARL
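The way the two charts combine can be sketched as follows, assuming one common formulation in which the EWMA statistic is fed into an upper one-sided CUSUM. The smoothing constant `lam`, the target mean `mu0`, and the reference value `k` are hypothetical parameters, and control limits are omitted; the paper's exact design may differ.

```python
def ewma(xs, lam, mu0):
    """EWMA statistic z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = mu0."""
    z, out = mu0, []
    for x in xs:
        z = lam * x + (1.0 - lam) * z
        out.append(z)
    return out


def cusum_upper(xs, mu0, k):
    """Upper CUSUM statistic C_t = max(0, C_{t-1} + (x_t - mu0) - k)."""
    c, out = 0.0, []
    for x in xs:
        c = max(0.0, c + (x - mu0) - k)
        out.append(c)
    return out


def mixed_ewma_cusum_upper(xs, lam, mu0, k):
    """Assumed mixed scheme: accumulate the EWMA values in an upper CUSUM."""
    return cusum_upper(ewma(xs, lam, mu0), mu0, k)
```

A chart signals when the plotted statistic exceeds its control limit; the ARL comparison in the abstract measures how many samples that takes on average.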
PERAMALAN HARGA EMAS DUNIA DENGAN MODEL GLOSTEN-JAGANNATHAN-RUNCLE GENERALIZED AUTOREGRESSIVE CONDITIONAL HETEROSCEDASTICITY Uswatun Hasanah; Agus Rusgiyono; Rukun Santoso
Jurnal Gaussian Vol 11, No 2 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v11i2.35477

Abstract

Gold investment is considered safer and less risky than other types of investment. An important part of investing in gold is predicting its future price by modeling its past price. The purpose of this study is to model past gold prices so that the model can be used to predict future gold prices. World gold price data form a time series with heteroscedasticity, so the GARCH model is used to handle the heteroscedasticity problem. The data also exhibit an asymmetric effect, so an asymmetric GARCH model, namely the Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model, is used to model the world gold price data. The data are divided into in-sample data from January 3, 2012 to December 31, 2018, used to build the world gold price model, and out-of-sample data from January 1, 2019 to December 31, 2020, used to evaluate model performance based on MAPE values. The best model is the ARIMA(1,1,0) GJR-GARCH(1,1) model, with an out-of-sample MAPE of 18.93%, which shows that the model has good forecasting ability.
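The asymmetric effect captured by GJR-GARCH(1,1) can be illustrated with its standard conditional variance recursion, in which a negative shock adds an extra term `gamma` to the ARCH coefficient. The sketch below uses hypothetical parameter values, not estimates from the study, and takes the residuals of the mean model as given.

```python
def gjr_garch_variances(residuals, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances of a GJR-GARCH(1,1) model.

    sigma2[t] = omega + (alpha + gamma * I[eps[t-1] < 0]) * eps[t-1]**2
                + beta * sigma2[t-1]
    The returned list has len(residuals) + 1 entries; the last entry is the
    one-step-ahead variance forecast.
    """
    variances = [initial_variance]
    for eps in residuals:
        leverage = gamma if eps < 0 else 0.0
        variances.append(omega + (alpha + leverage) * eps ** 2 + beta * variances[-1])
    return variances


# Hypothetical parameters: a negative shock raises variance more than a
# positive shock of the same size.
after_positive = gjr_garch_variances([1.0], 0.1, 0.1, 0.2, 0.5, 1.0)[-1]
after_negative = gjr_garch_variances([-1.0], 0.1, 0.1, 0.2, 0.5, 1.0)[-1]
```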
PENGELOMPOKAN PROVINSI DI INDONESIA BERDASARKAN INDIKATOR KESEHATAN LINGKUNGAN MENGGUNAKAN METODE PARTITIONING AROUND MEDOIDS DENGAN VALIDASI INDEKS INTERNAL Diah Aliyatus Saidah; Rukun Santoso; Tatik Widiharih
Jurnal Gaussian Vol 11, No 2 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v11i2.35478

Abstract

Environmental health is an important aspect of efforts to achieve public health. The condition of environmental health in Indonesia varies by province, so the priorities for improving environmental health also differ. This study aims to group the provinces of Indonesia based on environmental health indicators in order to identify high and low environmental quality in each province and to assist the government in optimizing environmental health efforts. The provinces are grouped using the partitioning around medoids method, which is robust to data containing outliers. Similarity between objects is measured using the Euclidean and Manhattan distances, and the best number of clusters is selected by internal index validation, namely the Calinski-Harabasz index, Baker-Hubert index, silhouette index, C-index, and Davies-Bouldin index. The results show that the best number of clusters is two, using the Manhattan distance, with the largest Calinski-Harabasz index (24.10072), the largest Baker-Hubert index (0.8466251), the largest silhouette index (0.4246581), the smallest C-index (0.07290109), and the smallest Davies-Bouldin index (1.094805).
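The building blocks of medoid-based clustering mentioned above, the two distance measures, the choice of a medoid, and the assignment of objects to the nearest medoid, can be sketched as follows. This is only an illustration of those pieces with hypothetical points; it does not implement the full swap search of partitioning around medoids or any of the validation indices.

```python
def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def medoid(points, dist):
    """Return the actual point with the smallest total distance to all others."""
    return min(points, key=lambda c: sum(dist(c, p) for p in points))


def assign(points, medoids, dist):
    """Index of the nearest medoid for each point."""
    return [min(range(len(medoids)), key=lambda j: dist(p, medoids[j]))
            for p in points]


# Hypothetical data with one outlier: the medoid stays inside the main group,
# whereas the arithmetic mean would be pulled toward (50, 50).
points = [(0, 0), (1, 1), (2, 2), (3, 3), (50, 50)]
center = medoid(points, manhattan)
labels = assign(points, [(2, 2), (50, 50)], manhattan)
```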
IMPLEMENTASI K-MEDOIDS DAN MODEL WEIGHTED-LENGTH RECENCY FREQUENCY MONETARY (W-LRFM) UNTUK SEGMENTASI PELANGGAN DILENGKAPI GUI R Ta’fif Lukman Afandi; Budi Warsito; Rukun Santoso
Jurnal Gaussian Vol 11, No 3 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.11.3.429-438

Abstract

The k-medoids algorithm is a partition-based clustering algorithm that groups n objects into k clusters. The algorithm uses medoids as the center points of the clusters. Medoids are actual objects, randomly selected, that act as the most central object in each cluster, so the k-medoids algorithm is robust against outliers. Cluster analysis groups objects based on the similarities between them. Similarity between objects can be measured using the Euclidean and Manhattan distances, and the choice of distance can affect the clustering results. The cluster results are validated internally using the silhouette index. The Weighted-Length Recency Frequency Monetary (W-LRFM) model applies a relative importance (weight) to each variable in the LRFM model. The LRFM model is used for customer segmentation based on customer behavior and consists of the variables length, recency, frequency, and monetary. The weights of the W-LRFM model are determined with the Analytic Hierarchy Process (AHP) method. The W-LRFM model is used to calculate the Customer Lifetime Value (CLV) of each cluster. In this study, k-medoids and the W-LRFM model are used for customer segmentation based on the length, recency, frequency, and monetary variables. These variables are obtained by transforming 41,073 rows of customer behavior data, such as transaction ID, date of purchase, and total amount, into 5,108 rows of length, recency, frequency, and monetary values. The best clustering is k = 2 using the Manhattan distance, with an average silhouette coefficient of 0.62. The weights of the W-LRFM model produced by the AHP method are 0.16, 0.29, 0.47, and 0.08 for the variables length, recency, frequency, and monetary, respectively. The CLV values of the two clusters are 0.158 and 0.499. The CLV of the second cluster is larger, so the second cluster becomes the main priority in the marketing strategy. The characteristics of the second cluster indicate a loyal customer group, while the characteristics of the first cluster indicate a potential customer group. A Graphical User Interface (GUI) built in R is used to facilitate the analysis.
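How the AHP weights turn cluster characteristics into a CLV score can be sketched as a weighted sum. The weights below are the ones reported in the abstract; everything else is an assumption made for illustration: the cluster centers are hypothetical, and the variables are assumed to be normalized to the range 0 to 1 so that a higher value is better for every variable.

```python
# AHP weights reported in the abstract.
WEIGHTS = {"length": 0.16, "recency": 0.29, "frequency": 0.47, "monetary": 0.08}


def clv(center, weights=WEIGHTS):
    """Weighted sum of a cluster's normalized LRFM values (assumed form)."""
    return sum(weights[name] * center[name] for name in weights)


# Hypothetical normalized cluster centers, not the study's values.
centers = {
    "cluster_1": {"length": 0.2, "recency": 0.1, "frequency": 0.2, "monetary": 0.1},
    "cluster_2": {"length": 0.5, "recency": 0.5, "frequency": 0.5, "monetary": 0.5},
}
ranked = sorted(centers, key=lambda name: clv(centers[name]), reverse=True)
```

Ranking clusters by this score mirrors the abstract's conclusion that the cluster with the larger CLV gets marketing priority.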
PENERAPAN TUNING HYPERPARAMETER RANDOMSEARCHCV PADA ADAPTIVE BOOSTING UNTUK PREDIKSI KELANGSUNGAN HIDUP PASIEN GAGAL JANTUNG Tita Aulia Edi Putri; Tatik Widiharih; Rukun Santoso
Jurnal Gaussian Vol 11, No 3 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.11.3.397-406

Abstract

Heart failure is the number one cause of death every year. It is a pathological condition characterized by abnormalities in heart function that cause the heart to fail to pump enough blood to meet the metabolic needs of tissues. Applying data mining and computational techniques to medical records can be an effective way to predict the survival of patients with heart failure symptoms. Data mining is the process of extracting important information from big data through several processes, including statistical methods, mathematics, and artificial intelligence technology. The AdaBoost method is a supervised data mining algorithm that is widely applied to build classification models. Hyperparameter optimization is the selection of the optimal set of hyperparameters for a learning algorithm. AdaBoost has hyperparameters that must be set for the classification process, namely learning rate and n_estimators. RandomSearchCV trains the model on random combinations of the selected hyperparameters. This research uses heart failure patient data collected at the Faisalabad Institute of Cardiology and at the Allied Hospital in Faisalabad (Punjab, Pakistan) from April to December 2015. The search uses learning rate: [-2.2] (log scale), n_estimators from 10 to 776, and Kfold=5, and yields the best hyperparameters learning rate=0.01 and n_estimators=443, with an accuracy of 0.85 and an AUC of 0.897.
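The random search idea can be sketched in plain Python as drawing random hyperparameter combinations and keeping the one with the best score. Several points are assumptions: the learning-rate range is read here as base-10 exponents from -2 to 2, which the abstract does not state unambiguously; `score` is a hypothetical callable standing in for the 5-fold cross-validated accuracy of an AdaBoost model; and the function names are illustrative, not a library API.

```python
import random


def sample_candidates(n_iter, seed=0):
    """Draw random hyperparameter combinations (ranges are assumptions)."""
    rng = random.Random(seed)
    return [
        {
            # Log-uniform draw: exponent sampled uniformly, then 10 ** exponent.
            "learning_rate": 10 ** rng.uniform(-2.0, 2.0),
            "n_estimators": rng.randint(10, 776),
        }
        for _ in range(n_iter)
    ]


def random_search(score, n_iter, seed=0):
    """Return the sampled combination with the highest score."""
    return max(sample_candidates(n_iter, seed), key=score)
```

Sampling the exponent rather than the rate itself gives small and large learning rates an equal chance of being tried, which is the usual reason for searching on a log scale.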
PERBANDINGAN SMOTE DAN ADASYN PADA DATA IMBALANCE UNTUK KLASIFIKASI RUMAH TANGGA MISKIN DI KABUPATEN TEMANGGUNG DENGAN ALGORITMA K-NEAREST NEIGHBOR Dinda Virrliana Ramadhanti; Rukun Santoso; Tatik Widiharih
Jurnal Gaussian Vol 11, No 4 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.11.4.499-505

Abstract

Poverty is a global problem that occurs in many countries with various impacts. Poverty is characterized by the inability of a person or household to meet the basic needs of life. Socio-economic problems such as poverty can be addressed with machine learning, for example through classification. Classifying households based on poverty criteria is expected to assist the government in preparing well-targeted programs. K-Nearest Neighbor is an easy-to-use classification algorithm that classifies an object based on its nearest neighbors. A problem that can arise in classification is imbalanced data, which causes the classification process to focus more on the majority class. SMOTE and ADASYN are used to solve the problem of imbalanced data. This study finds that applying SMOTE and ADASYN to imbalanced data can improve classification performance, especially the G-mean value. G-mean is a performance measure that is widely used for imbalanced data. SMOTE increases the G-mean value to 58.5%, while ADASYN increases it to 57.3%. Therefore, it can be concluded that SMOTE-KNN is the best model for household poverty classification.
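Two ideas from this abstract lend themselves to a short sketch: the G-mean, computed as the geometric mean of sensitivity and specificity, and the interpolation step SMOTE uses to create a synthetic minority point between a sample and one of its minority-class neighbors. The confusion-matrix counts and coordinates are hypothetical, and neighbor selection is left out.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def smote_point(x, neighbor, u):
    """Synthetic point on the segment from x to neighbor, with 0 <= u <= 1."""
    return tuple(a + u * (b - a) for a, b in zip(x, neighbor))


# A classifier that predicts only the majority class on a 10:90 split is
# 90% accurate, yet its G-mean is zero.
majority_only = g_mean(tp=0, fn=10, tn=90, fp=0)
```

This is why the abstract reports G-mean rather than accuracy: the measure collapses whenever one class is ignored.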
PENERAPAN ALGORITMA BACKPROPAGATION DAN OPTIMASI CONJUGATE GRADIENT UNTUK KLASIFIKASI HASIL TES LABORATORIUM Wahyu Tiara Rosaamalia; Rukun Santoso; Suparti Suparti
Jurnal Gaussian Vol 11, No 4 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.11.4.506-511

Abstract

A blood test is generally used to evaluate the condition of the blood and its components, conduct screening, and aid diagnosis. Laboratory blood tests are commonly used to decide whether a patient needs to be hospitalized or treated as an outpatient. The backpropagation algorithm was selected for its ability to solve complex problems, and conjugate gradient optimization is used because it speeds up the search for a solution. An electronic medical record containing the results of patient laboratory examinations was obtained from Mendeley. The data were divided into training and testing sets with a 95:5 ratio, which was found to be the best ratio in the experiments. The best architecture combines 10 neurons in the input layer, 16 neurons in the first hidden layer, 2 neurons in the second hidden layer, and one neuron in the output layer. Purelin is used as the activation function for both the first hidden layer and the output layer, whereas the binary sigmoid is used for the second hidden layer. The analysis shows that, over 100 bootstraps of the training data, the network achieved an average accuracy of 60.17% and a recall of 99.77%, while the accuracy on the testing data was 69.23%.
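The forward pass through the architecture described above (10 inputs, 16 purelin units, 2 binary sigmoid units, 1 purelin output) can be sketched as follows. The weights are random placeholders and the input vector is hypothetical; training by backpropagation with conjugate gradient optimization is not shown.

```python
import math
import random


def purelin(v):
    """Linear (identity) activation."""
    return v


def logsig(v):
    """Binary sigmoid activation with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Placeholder initialization: small random weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


rng = random.Random(0)
sizes = [(10, 16, purelin), (16, 2, logsig), (2, 1, purelin)]
layers = [(*init_layer(n_in, n_out, rng), act) for n_in, n_out, act in sizes]
output = forward([0.1] * 10, layers)  # hypothetical input of 10 features
```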
PEMODELAN KURS RUPIAH TERHADAP DOLAR AMERIKA SERIKAT MENGGUNAKAN REGRESI NONPARAMETRIK CAMPURAN KERNEL DAN SPLINE Khansa Amalia Fitroh; Rukun Santoso; Suparti Suparti
Jurnal Gaussian Vol 11, No 4 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.11.4.522-531

Abstract

Currency exchange is one way for a country to transact with the outside world. The fluctuating rupiah exchange rate is influenced by many factors, such as exports, imports, the money supply (JUB), inflation, and the Jakarta Composite Index (JCI). To examine this relationship, nonparametric regression modeling was carried out with a mixed estimator combining a kernel and a multivariable truncated linear spline. The import variable was modeled with kernel regression because its data pattern was random and scattered, while the export, JUB, inflation, and JCI variables were modeled with spline regression because their data patterns changed at certain sub-intervals. The purpose of this study is to model the exchange rate of the rupiah against the US dollar with a mixed kernel and truncated spline estimator. The parameters are estimated using Ordinary Least Squares (OLS). The mixed multivariable truncated linear spline and kernel estimator depends on the knot points and the bandwidth. The best model is determined by the optimal knot points and bandwidth, obtained by selecting the minimum Generalized Cross Validation (GCV). The best model is applied to data on the exchange rate of the rupiah against the US dollar with two optimal knot points, resulting in a value of 0.7627. Model performance was evaluated using MAPE, and the resulting MAPE value was 0.598%.
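The two components of the mixed estimator can be sketched separately: a truncated linear spline basis, whose slope can change at each knot, and a kernel smoother controlled by a bandwidth. The sketch assumes a Nadaraya-Watson estimator with a Gaussian kernel, which the abstract does not specify, and it omits the OLS fit and the GCV search; all data values are hypothetical.

```python
import math


def truncated_linear(x, knot):
    """Truncated linear term (x - knot)_+ ."""
    return max(0.0, x - knot)


def spline_row(x, knots):
    """Design-matrix row: intercept, linear term, one truncated term per knot."""
    return [1.0, x] + [truncated_linear(x, k) for k in knots]


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys around x0 (Gaussian kernel assumed)."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Hypothetical example with two knots, matching the number of knots reported.
row = spline_row(3.0, knots=[2.0, 5.0])
```

A smaller bandwidth makes the kernel part follow the data more closely, while more knots make the spline part more flexible, which is why both are tuned together by minimizing GCV.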
PEMODELAN TOPIK ULASAN APLIKASI NETFLIX PADA GOOGLE PLAY STORE MENGGUNAKAN LATENT DIRICHLET ALLOCATION Gina Rosalinda; Rukun Santoso; Puspita Kartikasari
Jurnal Gaussian Vol 11, No 4 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.11.4.554-561

Abstract

The vast amount of review data available on the Google Play Store can be used to extract essential hidden information. These reviews have an unstructured format that requires particular methods to collect and analyze them automatically. Topic modeling is an extension of text analysis that can find the main themes or trends hidden in large sets of unstructured documents. This study applies topic modeling with the Latent Dirichlet Allocation (LDA) method to Netflix application review data sourced from the Google Play Store website. LDA is a generative probabilistic model of textual data that can explain the hidden semantic themes in the review documents. This research aims to analyze the hidden topics that application users discuss. These topics contain valuable information for both Netflix users and the company. Users can use this information to decide whether to use Netflix services, while Netflix can use it to improve the quality of its services. This research uses data scraped from Netflix reviews on the Google Play Store from January 2021 to August 2021. The topic modeling results show that, of the twelve topics generated, the topic most discussed by users is payment methods.
Co-Authors Abdiel Pandapotan Manullang Abdiyasti Nurul Arifa Abdul Hoyyi Achmad Soleh Ade Irma Pramudita Ade Irma Prianti Agum Prafindhani Putri, Agum Prafindhani Agus Rusgiyono Agustian, Kresnawidiansyah Aini Nurul Al Qarani, Muhammad Aqajahs Alan Prahutama Alan Prahutama Alika Ramadhani Alvita Rachma Devi Arief Rachman Hakim Aris Sugiharto Aukhal Maula Fina Aulia Resti Avida Anugraheni AYU LESTARI Bahtiar Ilham Triyunanto Brahim Abdullah Brahim Abdullah Budi Warsito Chrisentia Widya Ardianti Dhimas Bayususetyo Di Asih I Maruddani Di Asih I Maruddani Diah Aliyatus Saidah Diah Safitri Dinda Virrliana Ramadhanti Dwi Nooriqfina Emyria Natalia br Sembiring Endang Saefuddin Mubarok Erwin Permana Fauziyyah, Fida Fuadah, Alfi Gina Rosalinda Hadi, Bawa Mulyono Hana Hayati Hanum, Cholida Hasbi Yasin Hasbi Yasin Infan Nur Kharismawan Iryanto, Rivaldo Kurniawan Iyan Antono Jenesia Kusuma Wardhani Johanes Roisa Prabowo Khansa Amalia Fitroh Krismayadi Krismayadi Kurniawati, Galuh Nurvinda Laili Rahma Khairunnisa Lia Safitri Maharani, Chintya Ayu Mamuki, Emiliyan Margo Purnomo Mifta Fara Sany Mubarok, Endang Saefuddin Mubarok, Endang Saifuddin Muchammad Aziz Chusen Muhamad Syukron Muhammad Akhir Siregar Mustafid Mustafid Noer Rachma, Gustyas Zella Nor Hamidah Permana, Erwin Puspita Kartikasari Rahmat Hidayat Rahmatul Akbar Ratih Ayu Sekarini Ratna Kurniasari Ria Epelina Situmorang Ria Sulistyo Yuliani Rima Nurlita Sari Rismia, Erysta Risky Rita Rahmawati Rita Rahmawati Rosinar Siregar Saepudin, Yunus Sahara Sahara Sekarini, Ratih Ayu Setiani, Eri Shinta Karunia Permata Sari Siti Munawaroh Subagja, Asep Zamzam Subari Sudarno Sudarno Sudarno Sudarno Sudarno Sudarno Sugito - Sugito Sugito Suparti Suparti Suparti Suparti Syazwina Aufa Syiva Multi Fani Tamura Rolasnirohatta Siahaan Tarno Tarno Tasrif, Mohammad Jon Tatik Widiharih Tatik Widiharih Ta’fif Lukman Afandi Thea Zulfa Adiningrumh Tina Diningrum Tita Aulia Edi Putri Tomi Ardi Uswatun Hasanah Utami, Krisdiana Nur Via 
Risqiyanti Wahyu Tiara Rosaamalia wardhana, galih wisnu Wijayanto, Ahmad Windianingsih, Agustin Wiwin Wiwin Wiwin, Wiwin Yuciana Wilandari Zen, Agustian