Journal : Variance : Journal of Statistics and Its Applications

PERFORMANCE EVALUATION OF NEURAL NETWORKS AND TRADITIONAL STATISTICAL METHODS IN ANALYZING IMBALANCED DATA: A COMPARATIVE STUDY
Chairunissa, Abela; Nisa, Hilwin
VARIANCE: Journal of Statistics and Its Applications, Vol 7 No 1 (2025)
Publisher : Statistics Study Programme, Department of Mathematics, Faculty of Mathematics and Natural Sciences, University of Pattimura

DOI: 10.30598/variancevol7iss1page21-30

Abstract

Class imbalance is a common issue in predictive modeling, particularly when minority classes carry critical significance, as seen in applications like fraud detection, rare disease prediction, and customer churn analysis. This study uses linear and non-linear simulated data scenarios to examine the performance of logistic regression, discriminant analysis, and neural networks on imbalanced data. For linear data, logistic regression and discriminant analysis displayed high sensitivity but extremely low specificity, indicating a strong bias toward the majority class. Neural networks showed marginal improvement but remained ineffective in detecting minority classes. In contrast, neural networks demonstrated superior sensitivity for non-linear data and were notably better at identifying minority classes, underscoring their suitability for complex data relationships. Our results highlight that accuracy alone is insufficient for evaluating models on imbalanced data; instead, sensitivity and specificity offer more relevant insights. Overall, this study suggests that neural networks are preferable for imbalanced data with non-linear patterns, and data characteristics and appropriate evaluation metrics should inform model selection.
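The point that accuracy alone can mislead on imbalanced data can be illustrated with a minimal sketch. It is not the study's code: the 95/5 class split, the always-predict-majority classifier, and the function names `confusion_counts` and `evaluate` are all assumptions made for illustration. The majority class is treated as the positive class here, an assumption chosen so that the pattern matches the one described above (high sensitivity, very low specificity).

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for a chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred, positive):
    """Return accuracy, sensitivity (recall of the positive class) and specificity."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical data: 95 majority-class cases (label 0), 5 minority-class cases (label 1).
y_true = [0] * 95 + [1] * 5
# Hypothetical classifier that always predicts the majority class.
y_pred = [0] * 100

# Assumption: the majority class (0) is the positive class.
metrics = evaluate(y_true, y_pred, positive=0)
print(metrics)
```

In this sketch the accuracy is high even though no minority-class case is ever identified, which only the specificity reveals. Swapping the positive label to the minority class swaps the roles of sensitivity and specificity.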

CLUSTERING AND VISUALIZATION OF OLYMPIC ATHLETE DATA BASED ON PHYSICAL AND DISCIPLINARY ATTRIBUTES
Nisa, Hilwin; Chairunissa, Abela
VARIANCE: Journal of Statistics and Its Applications, Vol 7 No 1 (2025)
Publisher : Statistics Study Programme, Department of Mathematics, Faculty of Mathematics and Natural Sciences, University of Pattimura

DOI: 10.30598/variancevol7iss1page113-122

Abstract

This study aims to identify hidden patterns in international athlete data through clustering and data visualization. The goal is to group athletes by physical characteristics and sports discipline to uncover meaningful trends. Using a dataset of over 200,000 entries from 1896 to 2016, the study applies K-Means, Agglomerative, and DBSCAN clustering. Preprocessing includes handling missing data, selecting relevant variables (Height, Weight, Age, Sex, Sport, and Medal), and normalizing the data. The Silhouette scores for K-Means (0.2736), Agglomerative (0.2613), and DBSCAN (-0.2392) indicate suboptimal clustering with overlapping clusters; K-Means performs slightly better than the other two methods. The findings are visualized through cluster plots and an interactive map showing medal distribution. The study highlights the limitations of traditional clustering methods on large datasets and suggests exploring more advanced techniques in future work.
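To make the Silhouette scores easier to interpret, the following is a minimal pure-Python sketch of the mean silhouette coefficient with Euclidean distance. It is not the study's pipeline: the six toy points, the two labelings, and the function name `silhouette_score` are assumptions for illustration. Values near 1 indicate compact, well-separated clusters, while values near 0 or below indicate overlapping or misassigned points.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette requires at least two clusters")

    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two tight groups that lie far apart.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
separated = [0, 0, 0, 1, 1, 1]  # labels that follow the two groups
mixed = [0, 1, 0, 1, 0, 1]      # labels that cut across the groups

score_separated = silhouette_score(points, separated)
score_mixed = silhouette_score(points, mixed)
print(score_separated, score_mixed)
```

The labeling that follows the two groups scores close to 1, while the labeling that cuts across them scores much lower. Scores in the range reported above sit well below the well-separated case, which is consistent with the overlap the abstract describes.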