Articles

Found 15 Documents
Journal : Jurnal Gaussian

PERBANDINGAN MODEL REGRESI BINOMIAL NEGATIF BIVARIAT DENGAN MODEL GEOGRAPHICALLY WEIGHTED NEGATIVE BINOMIAL BIVARIAT REGRESSION (GWNBBR) PADA KASUS ANGKA KEMATIAN BAYI DAN KEMATIAN IBU DI JAWA TENGAH Yashmine Noor Islami; Dwi Ispriyanti; Puspita Kartikasari
Jurnal Gaussian Vol 10, No 4 (2021): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v10i4.33096

Abstract

Infant mortality (0-11 months) and maternal mortality (during pregnancy, childbirth, and postpartum) are significant indicators in determining the level of public health. Central Java Province, which has 35 regencies/cities, is among the top five regions with the highest numbers of infant and maternal deaths in Indonesia. The numbers of infant and maternal deaths are count data, so Poisson regression can be used to analyze the factors that influence them. Poisson regression requires an assumption called equidispersion. Frequently, however, the variance of count data is greater than the mean, a condition known as overdispersion. In this research, bivariate negative binomial regression is used as a solution to the overdispersion problem in Poisson regression. This method produces a global model. In reality, the geographical, socio-cultural, and economic conditions of each region differ, which reflects the effect of spatial heterogeneity, so the model needs to be developed into Geographically Weighted Negative Binomial Bivariate Regression (GWNBBR). The GWNBBR model applies weights based on the position of, or distance between, observation areas. Significant variables for modeling infant mortality cases include the percentage of obstetric complications treated (X1), the percentage of infants exclusively breastfed (X3), and the percentage of poor people (X5). The significant variable for modeling maternal mortality cases is the percentage of poor people (X5). Based on the AIC, the GWNBBR model is better than the bivariate negative binomial regression model because it has a smaller AIC value.
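The geographic weighting that distinguishes GWNBBR from the global model can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a Gaussian kernel and uses hypothetical region centroids.

```python
import numpy as np

def spatial_weights(coords, i, bandwidth):
    """Weights of all regions relative to region i: w_ij = exp(-0.5 * (d_ij / h)^2).
    Nearby regions get weights near 1; distant regions get weights near 0."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

# Hypothetical centroids (longitude, latitude) of three regencies
coords = np.array([[110.4, -7.0], [110.8, -7.1], [109.1, -6.9]])
w = spatial_weights(coords, 0, bandwidth=0.5)
# w[0] is 1.0: a region's distance to itself is zero
```

A local negative binomial model would then be fit at each region i with these weights, producing the location-specific coefficients the abstract describes.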
PENGELOMPOKAN TWEETS PADA AKUN TWITTER TOKOPEDIA MENGGUNAKAN ALGORITMA DENSITY BASED SPATIAL CLUSTERING OF APPLICATIONS WITH NOISE Deanira Qinanty Alamsyah; Sudarno Sudarno; Puspita Kartikasari
Jurnal Gaussian Vol 11, No 1 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v11i1.33992

Abstract

Social media has become a standard way for Indonesian people to express opinions, socialize, and exchange ideas. Internet users in Indonesia reached 202.6 million in 2021, 84% of whom use the internet to access social media. Twitter is one of the most popular social media platforms in Indonesia. This phenomenon is an opportunity for companies to use Twitter as a marketing tool; one such company is the Indonesian marketplace Tokopedia. This research clusters tweets uploaded by the @tokopedia Twitter account to find out which types of content get many likes and retweets from followers of the account. Clusters are formed with the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. DBSCAN is a density-based clustering algorithm that requires two parameters: the radius (Eps) and the minimum number of objects needed to form a cluster (MinObj). This research ran several experiments with different Eps and MinObj values on 1,344 tweets that had gone through duplicate removal, text preprocessing, and feature selection. The quality of the resulting clusters is measured with the Silhouette Coefficient. Based on the highest average Silhouette Coefficient, Eps=5 and MinObj=3 (Silhouette Coefficient = 0.575) are chosen as the best parameters, producing 2 clusters and 7 noise points. The content type with the highest average number of likes and retweets is the WIB (Indonesian Shopping Time) campaign, so Tokopedia can focus on this content type as a marketing tool on Twitter because it is preferred by followers of the @tokopedia account.
Keywords: Twitter, Tokopedia, Clustering, DBSCAN, Silhouette Coefficient
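A minimal, pure-Python sketch of the DBSCAN idea, not the implementation used in the study, with Eps and MinObj as the two parameters the abstract describes (MinObj here counts the point itself):

```python
def dbscan(points, eps, min_obj):
    """Minimal DBSCAN: label each point with a cluster id, or -1 for noise.
    A point is a core point if it has >= min_obj neighbors within eps
    (counting itself); clusters grow outward from core points."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def neighbors(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_obj:
            labels[i] = -1              # noise (may later become a border point)
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise reclassified as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbors(j)
            if len(jn) >= min_obj:      # j is itself a core point: keep expanding
                seeds.extend(jn)
    return labels

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(pts, eps=2.0, min_obj=2)
# two dense groups become clusters; the isolated point becomes noise (-1)
```

In the study the "points" would instead be tweet feature vectors produced by the preprocessing and feature-selection steps.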
PERAMALAN INDEKS HARGA SAHAM MENGGUNAKAN ENSEMBLE EMPIRICAL MODE DECOMPOSITION (EEMD) Rosinar Siregar; Rukun Santoso; Puspita Kartikasari
Jurnal Gaussian Vol 10, No 2 (2021): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v10i2.29919

Abstract

Stock price fluctuations make investors hesitant to invest in stock markets because the future is uncertain. One way to address this is to forecast future stock prices. Such data are generally large, nonlinear, and nonstationary, and therefore difficult to interpret directly. This problem can be addressed by a decomposition process. One decomposition method for time series data is Ensemble Empirical Mode Decomposition (EEMD). EEMD decomposes the data into several Intrinsic Mode Functions (IMFs) and an IMF residue. In this research, the method is applied to the Property, Real Estate, and Construction Stock Price Index from July 1, 2019 to July 30, 2020, comprising 272 observations. The decomposition produced 6 IMFs plus the residue, each of which was forecast into the future. The forecasts were made by choosing the best model for each IMF component and for the residue, using ARIMA and polynomial trend models.
Keywords: Time Series Data, Stock Price Index, EEMD, ARIMA, Polynomial Trend.
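For the slowly varying residue component, a polynomial trend (one of the two model classes named above) can be fit and extrapolated. A minimal sketch on synthetic data, not the study's index series:

```python
import numpy as np

t = np.arange(50, dtype=float)
residue = 0.02 * t**2 + 1.5 * t + 3.0            # synthetic smooth residue
coefs = np.polyfit(t, residue, deg=2)            # fit a quadratic trend
forecast = np.polyval(coefs, np.arange(50, 55))  # extrapolate 5 steps ahead
```

In the EEMD workflow, the oscillatory IMFs would instead be forecast with ARIMA models, and the component forecasts summed to reconstruct the index forecast.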
PEMODELAN JUMLAH KASUS DEMAM BERDARAH DENGUE (DBD) DI JAWA TENGAH DENGAN GEOGRAPHICALLY WEIGHTED NEGATIVE BINOMIAL REGRESSION (GWNBR) Indah Suryani; Hasbi Yasin; Puspita Kartikasari
Jurnal Gaussian Vol 10, No 1 (2021): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v10i1.29400

Abstract

Dengue Hemorrhagic Fever (DHF) is one of the diseases with unusual occurrence in Central Java and is spread across its regencies/cities. The number of sufferers remains high because the mortality rate is still above the national target. To improve the handling of DHF's spread, it is necessary to identify the factors presumed to affect the number of cases. DHF case counts are count data, so this research uses Poisson regression. If the Poisson regression shows overdispersion, it can be addressed with negative binomial regression, while spatial effects can be captured with the Geographically Weighted Negative Binomial Regression (GWNBR) method. The GWNBR model uses a fixed exponential kernel as the weighting function. GWNBR models the number of DHF cases better than Poisson regression and negative binomial regression because it has the smallest AIC value. Poisson regression yielded three variables with a significant effect on DHF cases; negative binomial regression yielded two. The GWNBR method produced two groups of regencies/cities based on the significant variables. The variables affecting the number of DHF cases in all regencies/cities in Central Java are the percentage of healthy houses, the percentage of clean water quality, and the ratio of medical personnel.
Keywords: DHF, GWNBR, Poisson Regression, Negative Binomial Regression, Fixed Exponential Kernel
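The fixed exponential kernel named above has the form w_ij = exp(-d_ij / b), with a single bandwidth b shared by all locations (that is what "fixed" means, as opposed to adaptive kernels). A minimal sketch with hypothetical coordinates:

```python
import numpy as np

def fixed_exponential_weights(coords, i, bandwidth):
    """w_ij = exp(-d_ij / b): the same (fixed) bandwidth b at every location."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    return np.exp(-d / bandwidth)

# Hypothetical centroids (longitude, latitude) of three regencies
coords = np.array([[110.4, -7.0], [110.6, -7.2], [108.9, -6.5]])
w = fixed_exponential_weights(coords, 0, bandwidth=1.0)
```

Each region's negative binomial model is then estimated with these weights, so nearby regions influence the local coefficients more than distant ones.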
PREDIKSI HARGA JUAL KAKAO DENGAN METODE LONG SHORT-TERM MEMORY MENGGUNAKAN METODE OPTIMASI ROOT MEAN SQUARE PROPAGATION DAN ADAPTIVE MOMENT ESTIMATION DILENGKAPI GUI RSHINY Yayan Setiawan; Tarno Tarno; Puspita Kartikasari
Jurnal Gaussian Vol 11, No 1 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v11i1.33994

Abstract

Cocoa is a leading commodity of Indonesia, and its price fluctuates over time. Accurate cocoa price predictions are important for anticipating future prices and supporting decision making. Cocoa price data are non-stationary and nonlinear, so an Artificial Neural Network (ANN) model is applied to make accurate predictions. One type of ANN is Long Short-Term Memory (LSTM), which performs well for time-series prediction. The optimization methods used are Root Mean Square Propagation and Adaptive Moment Estimation. The best model was selected based on Mean Square Error (MSE) and Mean Absolute Percentage Error (MAPE). This study provides an R-Shiny GUI to make LSTM accessible to users less proficient in programming languages. Based on the results, LSTM with Adaptive Moment Estimation is more optimal than LSTM with Root Mean Square Propagation, as seen from its smaller MSE and MAPE values. The study used 27 hyperparameter combinations. Predictions with LSTM through the R-Shiny GUI have a different level of accuracy in each experiment; the best experiment achieved an MSE of 491505.1 and a MAPE of 1.739155%. The cocoa price forecast for November to December 2021 tends to decline.
Keywords: Cocoa Prices, Forecasting, Long Short-Term Memory, Root Mean Square Propagation, Adaptive Moment Estimation, GUI R-Shiny
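The Adaptive Moment Estimation (Adam) update that the study found more optimal can be written in a few lines. This is a minimal NumPy illustration of the optimizer itself, not of the LSTM or the R-Shiny GUI:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and the squared gradient (v), with bias correction for step t (t >= 1)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)            # bias-corrected first moment
    v_hat = v / (1 - b2**t)            # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# First step from theta=0 with gradient 1: the bias correction makes the
# effective step size approximately the learning rate.
theta, m, v = np.zeros(1), np.zeros(1), np.zeros(1)
theta, m, v = adam_step(theta, np.ones(1), m, v, t=1)
```

RMSProp is the same idea without the first-moment average m and without bias correction, which is one reason Adam often converges more smoothly.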
PERBANDINGAN MODEL REGRESI KEGAGALAN PROPORSIONAL DARI COX MENGGUNAKAN METODE EFRON DAN EXACT Asri Lutfia Silmi; Sudarno Sudarno; Puspita Kartikasari
Jurnal Gaussian Vol 9, No 4 (2020): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v9i4.29008

Abstract

Cox proportional hazard regression is a statistical method often used in survival analysis to determine the effect of independent variables on a dependent variable in the form of survival time. Survival time runs from the start of the study until the event occurs or the study ends. The Cox proportional hazard regression model does not require information about the distribution underlying the survival time, but the proportional hazards assumption must be met. The purpose of this study is to determine the factors that influence the survival time of coronary heart disease (CHD) patients. Ties are often found in survival data, including the data used in this study: a tie occurs when two or more individuals fail at the same time or have the same survival time. The Efron and Exact methods are used to handle ties, which otherwise cause problems in parameter estimation related to determining the members of the risk set. The results show that for both methods, the variables diabetes mellitus, family history, and platelets significantly affect the survival time of CHD patients. The best model is obtained with the Exact method, whose AIC value of 383,153 is smaller than the Efron method's 393,207.
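Efron's approximation handles d tied events at one event time by progressively discounting the tied subjects' risk scores in the denominator of the partial likelihood. A minimal sketch of one event time's contribution (illustrative, not the study's code):

```python
import math

def efron_log_lik_term(tied_scores, risk_scores):
    """Efron contribution to the Cox partial log-likelihood at one event time.
    tied_scores: exp(x'beta) for the d subjects failing at this time.
    risk_scores: exp(x'beta) for everyone still at risk (includes the tied)."""
    d = len(tied_scores)
    s_risk = sum(risk_scores)
    s_tied = sum(tied_scores)
    term = sum(math.log(s) for s in tied_scores)
    for l in range(d):
        # each successive tied failure removes an average (l/d) share
        # of the tied scores from the risk-set denominator
        term -= math.log(s_risk - (l / d) * s_tied)
    return term

# One event time: a single failure (score 2.0) with risk-set scores {2.0, 1.0}.
# With d = 1 this reduces to the ordinary Cox term log(2.0 / 3.0).
term = efron_log_lik_term([2.0], [2.0, 1.0])
```

The Exact method instead sums over every possible ordering of the tied failures, which is more accurate but far more expensive when d is large.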
PENGUKURAN VALUE AT-RISK PADA PORTOFOLIO OBLIGASI DENGAN METODE VARIAN-KOVARIAN Khoirul Anam; Di Asih I Maruddani; Puspita Kartikasari
Jurnal Gaussian Vol 9, No 4 (2020): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v9i4.29012

Abstract

A bond is an investment instrument that is basically a debt investment. The profit gained from investing is commensurate with the risk, so an investor must pay attention to the size of the risk when choosing bonds. Value at Risk (VaR) is a risk measure for the maximum loss of an asset or portfolio over a specific time interval at a given confidence level under normal market conditions. The purpose of this paper is to explain VaR measurement on a bond portfolio using the variance-covariance method and to verify the validity of the VaR model with a likelihood ratio test. The variance-covariance method was chosen because it gives a lower estimate of the potential volatility of an asset or portfolio than historical simulation and Monte Carlo simulation. This article uses the government bonds FR0053, FR0061, FR0073, and FR0074 and their portfolio combinations. Normality tests on asset and portfolio returns are required before calculating VaR. At the 95% confidence level, the bond portfolio of FR0053 and FR0061 has the smallest VaR, 2.28% of total market value. It is concluded that portfolio VaR is smaller than single-asset VaR. The verification test shows that the variance-covariance VaR estimate is valid at the 95% confidence level.
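Under the variance-covariance method, one-period portfolio VaR is z * sqrt(w' Σ w) * V, where w holds portfolio weights, Σ is the return covariance matrix, V is market value, and z is the standard-normal quantile (about 1.645 at the one-sided 95% level). A minimal sketch with hypothetical numbers, not the bond data above:

```python
import numpy as np

def var_covar(weights, cov, value, z=1.645):
    """Parametric VaR: z * portfolio return std-dev * market value."""
    sigma_p = np.sqrt(weights @ cov @ weights)
    return z * sigma_p * value

w = np.array([0.5, 0.5])                       # hypothetical two-bond portfolio
cov = np.array([[4e-4, 1e-4], [1e-4, 4e-4]])   # hypothetical return covariance
var_95 = var_covar(w, cov, value=1_000_000)
```

Because the off-diagonal covariance is below the variances, the portfolio VaR comes out smaller than either single bond's VaR, which is the diversification effect the paper reports.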
ANALISIS SENTIMEN PEMINDAHAN IBU KOTA NEGARA DENGAN KLASIFIKASI NAÏVE BAYES UNTUK MODEL BERNOULLI DAN MULTINOMIAL Nabila Surya Wardani; Alan Prahutama; Puspita Kartikasari
Jurnal Gaussian Vol 9, No 3 (2020): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v9i3.27963

Abstract

Text mining is a variation on data mining that tries to find interesting patterns in large databases. The Indonesian President affirmed on August 26, 2019 that the capital would be moved to East Kalimantan, a plan that drew both support and opposition from the public. Sentiment analysis is the part of text mining that works with opinions, comments, or responses, and it is used here to gauge public opinion on this topic. As one of the most used social media platforms in Indonesia, YouTube can serve as a data source by crawling the comments on a video uploaded by the Kompas TV channel. The comments were crawled on October 15, 2019, selecting the 1,500 latest comments (August 26 - October 12, 2019). The selected comments were transformed with data pre-processing steps: case folding, removing mentions, unescaping HTML, removing numbers, removing punctuation, text normalization, stripping whitespace, stopword removal, tokenizing, and stemming. Sentiment classes were labeled with a sentiment scoring technique, giving 849 negative and 651 positive comments. The ratio of training to testing data is 80%:20%. The classification method used is the Naive Bayes Classifier with the Bernoulli and Multinomial models: the Bernoulli model only uses occurrence information, whereas the Multinomial model keeps track of multiple occurrences. The results show a sensitivity (recall) of 93.45% for Bernoulli Naive Bayes and 90.19% for Multinomial Naive Bayes, meaning both models perform well for this research.
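The difference between the two models, Bernoulli keeping only occurrence information while Multinomial keeps track of multiple occurrences, shows up directly in the feature vectors each model consumes. A minimal sketch with a hypothetical vocabulary and comment:

```python
def bernoulli_features(tokens, vocab):
    """Binary presence/absence of each vocabulary word (Bernoulli model input)."""
    present = set(tokens)
    return [1 if w in present else 0 for w in vocab]

def multinomial_features(tokens, vocab):
    """Raw count of each vocabulary word (Multinomial model input)."""
    return [tokens.count(w) for w in vocab]

doc = ["pindah", "ibu", "kota", "pindah"]    # hypothetical tokenized comment
vocab = ["pindah", "ibu", "kota", "setuju"]
b = bernoulli_features(doc, vocab)           # [1, 1, 1, 0]
m = multinomial_features(doc, vocab)         # [2, 1, 1, 0]
```

The Bernoulli likelihood also explicitly penalizes absent vocabulary words, which is why the two models can rank the same comment differently.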
PEMODELAN TOPIK PADA KELUHAN PELANGGAN MENGGUNAKAN ALGORITMA LATENT DIRICHLET ALLOCATION DALAM MEDIA SOSIAL TWITTER Diandra Zakeshia Tiara Kannitha; Mustafid Mustafid; Puspita Kartikasari
Jurnal Gaussian Vol 11, No 2 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v11i2.35474

Abstract

Large-scale social restrictions (PSBB) are a policy issued by the Government of Indonesia as one of the efforts to reduce the spread of the Covid-19 virus. The policy requires people to conduct activities online, which pushed internet penetration in Indonesia up to 73.7% in 2020. Each provider must find strategies to maintain service quality and customer loyalty; a good reputation also matters if customers are to use the company's internet services. One way is to listen to customers' complaints about the company. In this research, topic modeling of customer complaints is carried out with the Latent Dirichlet Allocation (LDA) algorithm, chosen for its good performance; the topics are estimated with Gibbs sampling. The topic most frequently complained about to First Media is the internet cutting out while working, while for IndiHome it is the internet frequently cutting out and disconnecting. Based on the interpretation of the results, 70% of the topics for First Media and 81.81% for IndiHome matched what customers complained about in their tweets. The identified topics can serve as an evaluation for the companies in maintaining service quality and customer loyalty.
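In collapsed Gibbs sampling for LDA, each token is reassigned to topic k with probability proportional to (n_dk + α)(n_kw + β)/(n_k + Vβ), combining how much the document already uses topic k with how much topic k already uses this word. A minimal sketch of that sampling distribution, with made-up counts:

```python
def topic_posterior(n_dk, n_kw, n_k, alpha, beta, vocab_size):
    """Normalized collapsed-Gibbs weights for one token over K topics.
    n_dk[k]: topic-k tokens in this document; n_kw[k]: how often topic k has
    generated this word corpus-wide; n_k[k]: total tokens assigned to topic k.
    (All counts exclude the token currently being resampled.)"""
    weights = [(n_dk[k] + alpha) * (n_kw[k] + beta) / (n_k[k] + vocab_size * beta)
               for k in range(len(n_dk))]
    total = sum(weights)
    return [wt / total for wt in weights]

# Hypothetical counts for one token in a 2-topic model over 100 words:
p = topic_posterior(n_dk=[1, 0], n_kw=[2, 0], n_k=[5, 3],
                    alpha=0.1, beta=0.01, vocab_size=100)
```

Repeating this draw for every token until the assignments stabilize yields the per-tweet topic mixtures and per-topic word lists that are then interpreted as complaint themes.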
KLASIFIKASI MENGGUNAKAN METODE SUPPORT VECTOR MACHINE DAN RANDOM FOREST UNTUK DETEKSI AWAL RISIKO DIABETES MELITUS Chea Zahrah Vaganza Junus; Tarno Tarno; Puspita Kartikasari
Jurnal Gaussian Vol 11, No 3 (2022): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.11.3.386-396

Abstract

Diabetes Mellitus is one of the four leading causes of death, so possible treatments are of crucial importance to world leaders. Prevention and control of Diabetes Mellitus are often done by implementing a healthy lifestyle; thus, both people with risk factors and people diagnosed with Diabetes Mellitus can control the disease to prevent complications or premature death. Proper education and standardized disease management require early detection of Diabetes Mellitus, which motivated this study on classifying early-detection Diabetes Mellitus risk using machine learning. The classification algorithms used are Support Vector Machine and Random Forest, whose performance in classifying the Diabetes Mellitus data is compared. The study uses secondary data from the official website of the UCI Machine Learning Repository, consisting of 520 diabetes patient records taken from Sylhet Diabetic Hospital in Bangladesh, with 16 independent variables and 1 dependent variable; the dependent variable categorizes the test result into positive and negative Diabetes Mellitus classes. The results indicate that the Random Forest classification algorithm produces the better classification performance: Accuracy (98.08%), Recall (97.87%), Precision (98.92%), and F1-Score (88.40%).
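The four reported scores derive from the binary confusion matrix in the standard way. A minimal sketch with hypothetical counts, not the study's 520-record data:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, recall, precision, and F1 from a binary confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)          # sensitivity: positives correctly found
    precision = tp / (tp + fp)       # predicted positives that are correct
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f1

# Hypothetical test-set counts for a positive/negative diabetes classifier
acc, rec, prec, f1 = classification_metrics(tp=46, fp=1, fn=1, tn=56)
```

F1 is the harmonic mean of precision and recall, so it always lies between the two; reporting all four metrics guards against a classifier that trades one off against the other.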