Prismahardi Aji Riyantoko
Pembangunan Nasional “Veteran” University Of East Java

Published : 6 Documents

Found 4 Documents
Journal : International Journal of Data Science, Engineering, and Analytics (IJDASEA)

Negative Binomial Time Series Regression – Random Forest Ensemble in Intermittent Data
Amri Muhaimin; Prismahardi Aji Riyantoko; Hendri Prabowo; Trimono Trimono
International Journal of Data Science, Engineering, and Analytics, Vol. 1 No. 2 (2021)
Publisher : International Journal of Data Science, Engineering, and Analytics

DOI: 10.33005/ijdasea.v1i2.10

Abstract

Intermittent data are challenging to forecast because they contain many zeros. Sales data and rainfall data are typical examples, since in some periods no value is recorded. In this research, a model is created to address that problem using an ensemble method. Intermittent data are often modeled with the Negative Binomial distribution because their variance exceeds their mean. We use two datasets, rainfall data and sales data. Our approach builds a base model from a Negative Binomial time series regression and then augments it with a tree-based model, random forest. We compare the results with two benchmark methods, the Croston method and Single Exponential Smoothing (SES). Our approach outperforms the benchmarks on the metric value by 1.79 and 7.18.
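The two benchmark methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the smoothing constant `alpha=0.1`, and the initialization from the first non-zero observation are assumptions chosen for illustration, and the demand series is invented.

```python
def croston(demand, alpha=0.1):
    """One-step-ahead Croston forecast of mean demand per period.

    Assumption: size and interval are initialized from the first
    non-zero observation (one common choice among several).
    """
    size = None        # smoothed non-zero demand size
    interval = None    # smoothed number of periods between non-zero demands
    periods_since = 1  # periods elapsed since the last non-zero demand
    for y in demand:
        if y > 0:
            if size is None:
                size, interval = float(y), float(periods_since)
            else:
                size += alpha * (y - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if size is None:   # series contains no demand at all
        return 0.0
    return size / interval


def ses(series, alpha=0.1):
    """One-step-ahead Single Exponential Smoothing forecast."""
    level = float(series[0])
    for y in series[1:]:
        level += alpha * (y - level)
    return level


# Hypothetical intermittent series: a demand of 5 every third period.
sales = [0, 0, 5, 0, 0, 5, 0, 0, 5]
print(croston(sales), ses(sales))
```

Croston updates its estimates only in periods with non-zero demand, whereas SES is pulled toward zero by every empty period, which is why the two are commonly compared on intermittent series.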
Water Availability Forecasting Using Univariate and Multivariate Prophet Time Series Model for ACEA (European Automobile Manufacturers Association)
Prismahardi Aji Riyantoko; Tresna Maulana Fahrudin; Kartika Maulida Hindrayani; Amri Muhaimin; Trimono
International Journal of Data Science, Engineering, and Analytics, Vol. 1 No. 2 (2021)
Publisher : International Journal of Data Science, Engineering, and Analytics

DOI: 10.33005/ijdasea.v1i2.12

Abstract

Time series modeling is one method for forecasting data. The ACEA company held a competition in which it opened its water availability data for forecasting. The dataset used here, Aquifers-Petrignano in Italy, comes from the water resources field and has five parameters: rainfall, temperature, depth to groundwater, drainage volume, and river hydrometry. In this research we forecast the depth to groundwater using univariate and multivariate time series approaches with the Prophet method. Prophet is a library developed by the Facebook team. We also apply several steps to clean the data and make it ready for forecasting: handling missing data, transformation, differencing, time series decomposition, lag determination, stationarity checks, and the Augmented Dickey-Fuller (ADF) test. These steps are used to make sure the data do not cause problems during forecasting. We obtained results with both the univariate and the multivariate Prophet method. The multivariate approach yields an MAE of 0.82 and an RMSE of 0.99, which is better than the forecast from univariate Prophet.
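For reference, the two error metrics quoted in the abstract and the differencing step it mentions can be sketched in a few lines. This is a minimal illustration, not the paper's pipeline: the function names and the groundwater-depth values are invented for the example.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average size of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def difference(series, lag=1):
    """Differencing at the given lag, often used to remove a trend."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Hypothetical depth-to-groundwater readings and forecasts (not from the paper).
observed = [-25.0, -25.5, -26.0, -27.0]
forecast = [-24.5, -25.5, -27.0, -27.0]
print(mae(observed, forecast), rmse(observed, forecast))
print(difference(observed))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same forecasts, which is consistent with the 0.82 and 0.99 reported above.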
Metric Comparison For Text Classification
Amri Muhaimin; Tresna Maulana Fahrudin; Trimono; Prismahardi Aji Riyantoko; Kartika Maulida Hindrayani
International Journal of Data Science, Engineering, and Analytics, Vol. 2 No. 1 (2022)
Publisher : International Journal of Data Science, Engineering, and Analytics

DOI: 10.33005/ijdasea.v2i1.34

Abstract

Text classification has been popular in recent years. To classify text, the first step is to convert the text into numerical values. Values that can be used include Term Frequencies, Inverse Document Frequencies, Term Frequencies – Inverse Document Frequencies, and the frequency of the word itself. This study aims to determine which metric value is best for text classification. The methods used are Naïve Bayes, Logistic Regression, and Random Forest. The evaluation scores used are accuracy and the Area Under Curve value. The results show that some metric values produce similar evaluation scores. Another finding is that Random Forest is the best of the methods, and the best metric for text classification is Term Frequencies – Inverse Document Frequencies.
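The weighting the abstract singles out can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses whitespace tokenization, relative term frequency, and the unsmoothed IDF log(N / df), whereas libraries often apply smoothed variants. The function name and sample documents are invented.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumptions: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical two-document corpus.
for row in tf_idf(["free prize now", "meeting now"]):
    print(row)
```

A term that appears in every document, such as "now" here, receives a weight of zero, so the classifier sees only the terms that distinguish one document from another.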
Simple Sentiment Analysis Using LSTM and BERT Algorithms for Classifying Spam and Non-Spam Data
Prismahardi Aji Riyantoko; Dwi Arman Prasetya; Tahta Dari Timur
International Journal of Data Science, Engineering, and Analytics, Vol. 2 No. 2 (2022)
Publisher : International Journal of Data Science, Engineering, and Analytics

DOI: 10.33005/ijdasea.v2i2.40

Abstract

Sentiment analysis has become a useful tool for data analysis and classification based on words, phrases, or documents. Researchers have previously studied sentiment analysis extensively using a variety of algorithms and models. Based on previous research, the results of sentiment analysis have a negative impact on model performance and data type. In this study, the LSTM and BERT models are used to classify SMS data into spam and non-spam. TF-IDF and GloVe are used to determine the weights of the vector values that represent each word, in order to optimize accuracy. BERT and LSTM reach accuracy sensitivity values of 99.35% and 98.22%, respectively. The results show that classification of the spam and non-spam dataset is very effective and efficient. Tests were also carried out on disaster Twitter data, but accuracy decreased. It can therefore be supposed that the type of dataset considerably affects the performance of the model.