Journal : Sciencestatistics: Journal of Statistics, Probability, and Its Application

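Perbandingan Estimator Robust Huber dan Tukey’s Biweight terhadap Berbagai Skema Pencilan dalam Regresi Linier [Comparison of the Huber and Tukey’s Biweight Robust Estimators under Various Outlier Schemes in Linear Regression]
Linda Rassiyanti; Indah Suciati; Vina Nurmadani; Yoga Aji Sukma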
Sciencestatistics: Journal of Statistics, Probability, and Its Application Vol. 3 No. 2 (2025): JULY
Publisher : Universitas Muhammadiyah Metro

DOI: 10.24127/sciencestatistics.v3i2.9630

Abstract

Linear regression is commonly estimated by Ordinary Least Squares (OLS), which is sensitive to outliers: contaminated data can lead to biased and inefficient parameter estimates. Robust regression was developed to overcome this weakness by reducing sensitivity to outliers. Two loss functions commonly used in robust regression are Huber Loss and Tukey’s Biweight Loss. This study compares the performance of these two robust regression methods under various outlier schemes. Simulated data were generated with intercept and slope parameters set to 3 and 2, respectively, and outliers were systematically introduced into the X variable, the Y variable, or both, in proportions of 10%, 20%, and 30%. The results indicate that Tukey’s Biweight provides more stable parameter estimates under extreme outlier conditions, especially when outliers occur in the Y variable or in both X and Y. Huber Loss, by contrast, tends to yield a lower Mean Squared Error (MSE) in some conditions, reflecting a trade-off between bias and variance. Tukey’s Biweight is therefore more suitable for extreme outliers, whereas Huber Loss is more efficient under mild to moderate outlier conditions.
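
The simulation design described in this abstract can be illustrated with a minimal sketch. It is not the authors' code. It covers only one of the schemes (10% outliers in Y), and the sample size, noise level, outlier shift of 30, tuning constants (1.345 and 4.685), MAD scale estimate, and all function names are assumptions chosen for illustration. Both estimators are fitted by iteratively reweighted least squares using only the Python standard library.

```python
import random
import statistics

def huber_weight(u, k=1.345):
    """Huber weight: 1 for |u| <= k, k/|u| beyond (k is an assumed tuning constant)."""
    a = abs(u)
    return 1.0 if a <= k else k / a

def tukey_weight(u, c=4.685):
    """Tukey biweight: (1 - (u/c)^2)^2 for |u| < c, 0 beyond (c is an assumed constant)."""
    return (1.0 - (u / c) ** 2) ** 2 if abs(u) < c else 0.0

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def m_estimate(x, y, weight_fn, start, iters=50):
    """M-estimate of the line by iteratively reweighted least squares."""
    b0, b1 = start
    for _ in range(iters):
        res = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        med = statistics.median(res)
        # Robust scale: normalized median absolute deviation of the residuals.
        scale = statistics.median([abs(r - med) for r in res]) / 0.6745
        w = [weight_fn(r / scale) for r in res]
        b0, b1 = weighted_line(x, y, w)
    return b0, b1

# Assumed setup: true intercept 3 and slope 2 (from the abstract), n = 200,
# unit-variance Gaussian noise, x on an even grid over [0, 10].
rng = random.Random(0)
n = 200
x = [10.0 * i / (n - 1) for i in range(n)]
y = [3.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Contaminate 10% of the responses (every tenth point) with a large upward shift.
for i in range(0, n, 10):
    y[i] += 30.0

ols = weighted_line(x, y, [1.0] * n)                 # unweighted fit = OLS
huber = m_estimate(x, y, huber_weight, start=ols)
tukey = m_estimate(x, y, tukey_weight, start=huber)  # redescending loss needs a good start

for name, (b0, b1) in [("OLS", ols), ("Huber", huber), ("Tukey", tukey)]:
    print(f"{name:6s} intercept={b0:.3f} slope={b1:.3f}")
```

Starting the Tukey fit from the Huber solution is a design choice of this sketch: the biweight assigns zero weight to large residuals, so a poor starting line can lock the iteration onto the wrong points.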
Optimizing Breast Cancer Prediction by Applying Machine Learning
Vina Nurmadani; Indah Suciati; Yoga Aji Sukma; Linda Rassiyanti
Sciencestatistics: Journal of Statistics, Probability, and Its Application Vol. 3 No. 2 (2025): JULY
Publisher : Universitas Muhammadiyah Metro

DOI: 10.24127/sciencestatistics.v3i2.9667

Abstract

In 2015, breast cancer ranked among the most prevalent and fatal cancers affecting women globally. Artificial intelligence is urgently needed to help medical professionals make more accurate decisions, reduce overdiagnosis, and streamline the diagnostic process. This study implements and compares selected machine learning algorithms, namely SVM, XGBoost, and ANN, with various parameter combinations on a breast cancer dataset. Accuracy, precision, recall, and F1-score were used to evaluate and compare the algorithms. The results show that the best model for predicting breast cancer, and thus for helping medical professionals treat the disease quickly and accurately, is the SVM using 8 parameters and excluding the Mitosis parameter: Clump Thickness, Cell Size Uniformity, Cell Shape Uniformity, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, and Normal Nuclei. This model achieved an accuracy of 0.96 and a sensitivity of 0.98.
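
For readers unfamiliar with the four metrics named in this abstract, the sketch below shows how they follow from the counts of a binary confusion matrix. Recall is the same quantity as the sensitivity reported above. The labels are a made-up toy example, not data or results from the study, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall (sensitivity), and F1 from the counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy labels (1 = malignant, 0 = benign); purely illustrative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In a screening setting, recall is often weighted more heavily than precision, because a missed malignant case (a false negative) is usually costlier than a false alarm.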