In the era of big data and artificial intelligence, data preprocessing has emerged as a critical step in the data science pipeline, influencing the quality, performance, and reliability of machine learning models. Despite its importance, the diversity of techniques, challenges, and evolving practices necessitates a structured understanding of this domain. This study conducts a systematic literature review (SLR) to explore current data preprocessing techniques, their domain-specific applications, associated challenges, and emerging trends. A total of 21 peer-reviewed articles published from 2016 to 2024 were analyzed using well-defined inclusion and exclusion criteria, with a focus on machine learning and big data contexts. The results reveal that normalization, data cleaning, feature selection, and dimensionality reduction are the most commonly applied techniques. Key challenges identified include handling missing values, high dimensionality, and imbalanced data. Moreover, recent trends such as automated preprocessing (AutoML), privacy-preserving methods, and scalable preprocessing for distributed systems are gaining momentum. The review concludes that while traditional methods remain foundational, there is a shift toward adaptive and intelligent preprocessing strategies to meet the growing complexity of data environments. This study offers valuable insights for researchers and practitioners aiming to optimize data preparation processes in modern data science workflows.