Journal : JOIV : International Journal on Informatics Visualization

Comparative Analysis of Imputation Methods for Enhancing Predictive Accuracy in Data Models
Zamri, Nurul Aqilah; Jaya, M. Izham; Irawati, Indrarini Dyah; Rassem, Taha H.; Rasyidah, -; Kasim, Shahreen
JOIV : International Journal on Informatics Visualization Vol 8, No 3 (2024)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.8.3.1666

Abstract

Missing values in a dataset can introduce bias and significantly impede a predictive algorithm's ability to discern patterns and make accurate predictions. This paper examines prevalent data imputation methods, including list-wise deletion (IGN), mean imputation (AVG), K-Nearest Neighbors (KNN), MissForest (MF), and Predictive Mean Matching (PMM). The dataset consists of financial data on S&P 500 companies from the Compustat North America database. The training and validation dataset comprises 1973 instances, with data from the fourth quarter of 2009, the first quarter of 2010, and the third quarter of 2014. Within this set, 457 missing values were identified and imputed. The test dataset comprises 197 randomly selected instances from the fourth quarter of 2014, equivalent to ten percent of the total instances in the training dataset. In the evaluation, the dataset produced by MF imputation performed best among all the imputed datasets. These findings are intended to help practitioners choose a suitable data imputation method, particularly for predictive modeling tasks.
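To make the two simplest methods named in the abstract concrete, the following is a minimal sketch of list-wise deletion and mean imputation in plain Python. The function names and the toy data are assumptions for illustration only; they are not taken from the paper, which works with Compustat data and also evaluates KNN, MissForest, and PMM.

```python
import math


def listwise_delete(rows):
    """Drop every row that contains at least one missing value (NaN)."""
    return [r for r in rows if not any(math.isnan(v) for v in r)]


def mean_impute(rows):
    """Replace each NaN with the mean of the observed values in its column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        means.append(sum(observed) / len(observed))
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(r)]
            for r in rows]


# Hypothetical toy data: two features, two missing entries.
nan = float("nan")
data = [[1.0, 10.0], [2.0, nan], [nan, 30.0], [4.0, 40.0]]

kept = listwise_delete(data)   # only the fully observed rows survive
filled = mean_impute(data)     # all rows survive, gaps filled by column means
```

List-wise deletion shrinks the sample, while mean imputation keeps every row at the cost of reducing the variance of the imputed column.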
Hybrid Logistic Regression Random Forest on Predicting Student Performance
Rohman, Muhammad Ghofar; Abdullah, Zubaile; Kasim, Shahreen; Rasyidah, -
JOIV : International Journal on Informatics Visualization Vol 9, No 2 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.2.3972

Abstract

This research investigates the effects of imbalanced data on machine learning, addresses the imbalance using SMOTE oversampling, and improves model performance through hyperparameter tuning. The study proposes a hybrid model that combines logistic regression and random forest with random search SV, using SMOTE oversampling and hyperparameter tuning. The results show that the proposed hybrid model performs better than logistic regression or random forest alone, with accuracy, precision, recall, and F1-score of 0.9574, 0.9665, 0.9576. The work contributes a practical model for imbalanced data classification, based on data-level solutions, for student performance prediction.
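The core idea of SMOTE oversampling mentioned in the abstract can be sketched as follows: a synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. This is a simplified illustration in plain Python, not the authors' pipeline; the function name, the value of `k`, and the toy points are assumptions.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point by interpolating between a randomly chosen
    minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Hypothetical minority-class points with two features.
minority = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [5.0, 5.0]]
synthetic = [smote_sample(minority, rng=random.Random(seed)) for seed in range(20)]
```

Because each synthetic point is a convex combination of two real minority points, it always stays inside the range already covered by the minority class.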
Rainfall-Runoff Modeling Using Artificial Neural Network for Batu Pahat River Basin
Zulkiflee, Nurul Najihah; Mohd Safar, Noor Zuraidin; Kamaludin, Hazalila; Jofri, Muhamad Hanif; Kamarudin, Noraziahtulhidayu; Rasyidah, -
JOIV : International Journal on Informatics Visualization Vol 8, No 2 (2024)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.8.2.2704

Abstract

This research examines the effectiveness of Artificial Neural Network with Multilayer Perceptron (ANN-MLP) and Nonlinear AutoRegressive with eXogenous inputs (NARX) models in predicting short-term rainfall-runoff patterns in the Batu Pahat River Basin. The study aims to predict river water levels 1, 3, and 6 hours ahead using historical rainfall and river level data. Data preprocessing, including handling of missing values, outlier identification, and noise reduction, was applied to improve the accuracy and dependability of the models. The ANN-MLP and NARX models were compared across the forecast horizons and in different scenarios. The findings show that the ANN-MLP model performed robustly in short-term prediction. The NARX model, by contrast, exhibited higher accuracy, particularly in capturing intricate temporal relationships and external impacts on river behavior. The ANN-MLP produces 99% accuracy for 1-hour prediction, and NARX yields 98% accuracy with a Root Mean Squared Error of 0.3245 and a Mean Absolute Error of 0.1967. This study contributes to hydrological forecasting by presenting a rigorous and precise modeling methodology.
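For readers unfamiliar with the two error metrics reported above, here is a short sketch of how Root Mean Squared Error and Mean Absolute Error are computed from observed and predicted values. The water-level numbers are made up for illustration and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root Mean Squared Error: square root of the mean squared residual."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean Absolute Error: mean of the absolute residuals."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# Hypothetical river levels (metres) and model forecasts.
observed = [2.0, 3.0, 5.0]
predicted = [2.5, 3.0, 4.0]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
```

RMSE penalises large individual errors more heavily than MAE, so RMSE is never smaller than MAE on the same data, which is consistent with the 0.3245 and 0.1967 figures in the abstract.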
Determinants Generating General Purpose Technologies in Economic Systems: A New Method of Analysis and Economic Implications
Kargı, Bilal; Coccia, Mario; Uçkaç, Bekir Cihan; Rasyidah, -
JOIV : International Journal on Informatics Visualization Vol 8, No 3-2 (2024): IT for Global Goals: Building a Sustainable Tomorrow
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.8.3-2.2657

Abstract

This research proposes using the fishbone diagram, a visualization tool for constructing a comprehensive theoretical framework to analyze the sources of innovation. Traditionally employed to identify causes of specific events, the fishbone diagram is applied innovatively to explore the root causes driving the emergence and evolution of General Purpose Technologies (GPTs). The study identifies critical driving forces such as increased democratization, population growth, demographic shifts, significant investments in research and development (R&D), global leadership aspirations among major powers, competitive socioeconomic environments, and potential threats from adversarial actors. By visually representing these drivers, the fishbone diagram offers insights crucial for technological analysis and foresight, illuminating groundbreaking innovations that drive technological and economic progress. Illustrated through examples from historical GPTs like the steam engine and contemporary technologies such as Information and Communication Technologies (ICTs), this study establishes a foundational framework for developing precise hypotheses about the specific causes and socio-economic impacts of GPTs. The fishbone diagram emerges as a versatile tool adept at systematically analyzing the complex root causes associated with GPTs, facilitating foresight and strategic management of these transformative innovations within society.
Laying Chicken Algorithm (LCA) Based For Clustering
Yanto, Iwan Tri Riyadi; Setiyowati, Ririn; Irsalinda, Nursyiva; Rasyidah, -; Lestari, Tri
JOIV : International Journal on Informatics Visualization Vol 4, No 4 (2020)
Publisher : Society of Visual Informatics

DOI: 10.30630/joiv.4.4.467

Abstract

Research on fuzzy clustering and its applications remains active and important. In this paper, Fuzzy C-Means (FCM) and the Laying Chicken Algorithm (LCA) are modified to address the local-optimum problem of fuzzy clustering, and the approach is demonstrated on a UCI dataset. The performance of the proposed FCMLCA is also compared with a baseline technique based on CSO methods. The simulation results indicate that the FCMLCA method performs better than the compared methods.
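As background for the FCM component, the sketch below shows the standard Fuzzy C-Means membership update for a single one-dimensional point. It illustrates plain FCM only, not the FCMLCA modification proposed in the paper; the function name, the fuzzifier `m = 2`, and the example centers are assumptions.

```python
def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one 1-D point in each cluster:
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i = |point - center_i|.
    """
    dists = [abs(point - c) for c in centers]
    if 0.0 in dists:
        # The point coincides with one or more centers: share full
        # membership among those centers and give none to the others.
        hits = dists.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Hypothetical example: a point at 1.0 with cluster centers at 0.0 and 4.0.
u = fcm_memberships(1.0, [0.0, 4.0])
```

The memberships always sum to one, and the closer center receives the larger share. Because the alternating updates of memberships and centers can stall in a local optimum, FCM is a natural candidate for pairing with a metaheuristic such as LCA.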
A Framework of Mutual Information Kullback-Leibler Divergence based for Clustering Categorical Data
Yanto, Iwan Tri Riyadi; Setiyowati, Ririn; Azizah, Nur; Rasyidah, -
JOIV : International Journal on Informatics Visualization Vol 5, No 1 (2021)
Publisher : Society of Visual Informatics

DOI: 10.30630/joiv.5.1.462

Abstract

Clustering is the process of grouping a set of objects into multiple clusters so that similar objects fall into the same cluster and dissimilar objects fall into different clusters. The fuzzy k-means algorithm is a clustering algorithm that partitions data into k clusters, using Euclidean distance as the distance function. This research discusses clustering categorical data using Fuzzy k-Means Kullback-Leibler Divergence. The distance between a data point and a cluster center is determined using mutual information, that is, the Kullback-Leibler divergence between the joint distribution and the product of the two marginal distributions. Extensive theoretical analysis was performed to show the effectiveness of the proposed method. Moreover, the proposed method was compared with the Fuzzy Centroid and Fuzzy k-Partition approaches in terms of response time and clustering accuracy on several datasets from the UCI Machine Learning Repository. The experimental results show that the proposed algorithm provides good results in both clustering quality and accuracy for categorical data compared with Fuzzy Centroid and Fuzzy k-Partition.
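The identity the abstract relies on, mutual information as the Kullback-Leibler divergence between a joint distribution and the product of its marginals, can be sketched for two categorical variables as below. This is only an illustration of the quantity itself, estimated from empirical frequencies; how the paper embeds it in the fuzzy k-means distance is not reproduced here, and the function name and sample values are assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """I(X; Y) = KL( p(x, y) || p(x) p(y) ), from empirical frequencies,
    measured in nats (natural logarithm)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_prod = (px[x] / n) * (py[y] / n)
        mi += p_xy * math.log(p_xy / p_prod)
    return mi


# Hypothetical categorical attributes.
colour = ["red", "red", "blue", "blue"]
copy_of_colour = ["red", "red", "blue", "blue"]   # fully dependent on colour
shape = ["disc", "cube", "disc", "cube"]          # independent of colour

mi_dependent = mutual_information(colour, copy_of_colour)
mi_independent = mutual_information(colour, shape)
```

For the fully dependent pair the value equals the entropy of the variable (log 2 here), and for the independent pair it is zero, since the joint distribution then equals the product of the marginals.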
The Comprehensive Mamdani Inference to Support Scholarship Grantee Decision
Humaira, -; Rasyidah, -; Junaldi, -; Rahmayuni, Indri
JOIV : International Journal on Informatics Visualization Vol 5, No 2 (2021)
Publisher : Society of Visual Informatics

DOI: 10.30630/joiv.5.2.449

Abstract

Mamdani fuzzy inference has been widely used across scientific disciplines. Its ability to represent the input-output mapping as a surface is particularly appealing. This research applies it to a decision support system (DSS) for selecting scholarship grantees. The many decision criteria need to be simplified so that the result remains intuitive. The model is built in two phases. The first phase consists of four FIS blocks. The second phase consists of one FIS block. The FIS blocks in the first phase are designed so that their outputs span a wide score interval. The first-phase outputs become the inputs of the second-phase FIS, and this wide value range makes them good inputs. Each FIS block has a different number of inputs. The resulting surface must therefore be inspected from several dimensions to ensure that it increases or decreases smoothly. This is done by observing how the output points move while the surface remains smooth. The output points are influenced by the membership function, the rules used, the number of fuzzy sets, and the parameter values of the membership function. This research used the Gaussian membership function, which is highly suitable for this DSS case. This article also explains the fuzzy sets used for each input, the membership function parameters, and the input value ranges. After the surface had been inspected intuitively, the model was evaluated using a confusion matrix. The model achieved an accuracy of 85%.
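The Gaussian membership function used in this work has a simple closed form, sketched below. The center and width values and the fuzzy set name are hypothetical and chosen only for illustration; the paper's actual parameters are not given in the abstract.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set with the given
    center and width: exp(-(x - center)^2 / (2 * sigma^2))."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Hypothetical fuzzy set "high GPA" centered at 3.5 with width 0.4.
center, sigma = 3.5, 0.4
at_center = gaussian_mf(3.5, center, sigma)      # full membership at the center
one_sigma = gaussian_mf(3.9, center, sigma)      # one width away from the center
far_away = gaussian_mf(2.0, center, sigma)       # small but non-zero membership
```

Because the Gaussian is smooth and never drops exactly to zero, small changes in an input produce small changes in membership, which fits the abstract's emphasis on a surface that rises or falls smoothly.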