Methods, Challenges, and Ethical Considerations in Data Collection of Corpus Compilation
Dalieva, Madina
Innovative Technologica: Methodical Research Journal Vol. 3 No. 3 (2024): September
Publisher: Indonesian Journal Publisher

DOI: 10.47134/innovative.v3i3.122

Abstract

Corpus compilation is a critical process in linguistics that involves gathering and organizing large datasets for language analysis and model training. This article examines key aspects of corpus compilation, with a particular focus on data collection. It explores the sources of data, strategies for ensuring representativeness, and challenges such as copyright constraints and data quality issues. Ethical considerations, such as anonymization and consent, are also discussed. By understanding these factors, researchers can build effective and ethically sound corpora for linguistic research and computational applications.