Garuda - Garba Rujukan Digital


p-Index (2021 - 2026): 0.23

This author has published in these journals:

BIOTROPIA - The Southeast Asian Journal of Tropical Biology
Journal of Computing Theories and Applications

Ahmad, Aziz

Unknown Affiliation

Author-ID : 2108244

Agriculture, Biological Sciences & Forestry; Biochemistry, Genetics & Molecular Biology; Computer Science & IT; Decision Sciences, Operations Research & Management; Immunology & Microbiology; Veterinary

Published : 2 Documents

Articles

Evaluating Open-Source Machine Learning Project Quality Using SMOTE-Enhanced and Explainable ML/DL Models
Hamza, Ali; Hussain, Wahid; Iftikhar, Hassan; Ahmad, Aziz; Shamim, Alamgir Md
Journal of Computing Theories and Applications Vol. 3 No. 2 (2025): JCTA 3(2) 2025
Publisher : Universitas Dian Nuswantoro

DOI: 10.62411/jcta.14793

The rapid growth of open-source software (OSS) in machine learning (ML) has intensified the need for reliable, automated methods to assess project quality, particularly as OSS increasingly underpins critical applications in science, industry, and public infrastructure. This study evaluates the effectiveness of a diverse set of machine learning and deep learning (ML/DL) algorithms for classifying GitHub OSS ML projects as engineered or non-engineered using a SMOTE-enhanced and explainable modeling pipeline. The dataset used in this research includes both numerical and categorical attributes representing documentation, testing, architecture, community engagement, popularity, and repository activity. After handling missing values, standardizing numerical features, encoding categorical variables, and addressing the inherent class imbalance using the Synthetic Minority Oversampling Technique (SMOTE), seven different classifiers—K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), XGBoost (XGB), Logistic Regression (LR), Support Vector Machine (SVM), and a Deep Neural Network (DNN)—were trained and evaluated. Results show that LR (84%) and DNN (85%) outperform all other models, indicating that both linear and moderately deep non-linear architectures can effectively capture key quality indicators in OSS ML projects. Additional explainability analysis using SHAP reveals consistent feature importance across models, with documentation quality, unit testing practices, architectural clarity, and repository dynamics emerging as the strongest predictors. These findings demonstrate that automated, explainable ML/DL-based quality assessment is both feasible and effective, offering a practical pathway for improving OSS sustainability, guiding contributor decisions, and enhancing trust in ML-based systems that depend on open-source components.
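The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which library or parameters were used, and the function name `smote`, the value of `k`, and the toy feature values below are all assumptions made for illustration. The sketch shows the core idea of SMOTE only, which is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration; numeric features only.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment from base to the chosen neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumed): standardized numeric features for the minority class.
minority = [[0.1, 1.2], [0.3, 1.0], [0.2, 1.5], [0.4, 1.1]]
majority_count = 10  # assumed size of the majority class

new_points = smote(minority, n_synthetic=majority_count - len(minority), k=3)
balanced_minority = minority + new_points
print(len(balanced_minority))
```

Oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic points; the abstract does not say how the study handled this.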

Co-Authors: Cha, Thye San; Chuah, Tse Seng; Hamza, Ali; Hussain, Wahid; Iftikhar, Hassan; Loh, Saw Hong; Osman, Siti-Mariam; Shamim, Alamgir Md
