The Big Data era, characterized by the massive Volume, Velocity, and Variety of information, has revolutionized decision-making processes across various sectors. However, this paradigm shift has created significant methodological gaps, particularly concerning population bias and the absence of standardized frameworks for validating non-probabilistic data representations. This study aims to bridge these gaps through a Systematic Literature Review, employing academic documentation and theoretical triangulation to synthesize both the challenges and solutions in the data acquisition phase. The findings identify three dominant data collection methods, namely Web Scraping, Application Programming Interfaces (APIs), and the Internet of Things (IoT), as direct responses to the 3V characteristics of Big Data. Crucial insights reveal a persistent tension between massive data volume and data validity, further complicated by technical risks (such as API rate limiting) and legal or ethical concerns (including compliance with Terms of Service and data privacy regulations). Research in this era must therefore adopt a strategic framework that emphasizes essential practices such as de-identification of personally identifiable information (PII) to protect privacy rights and the application of Exponential Backoff techniques to cope with API quota limitations. This review presents a comprehensive synthesis of the pre-analysis phase of Big Data research, underscoring that the integrity and reliability of scientific findings in this era depend heavily on the adoption of rigorous methodological and ethical frameworks.
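To illustrate the Exponential Backoff practice referenced above, the following is a minimal sketch, assuming a REST API that signals rate limiting with HTTP status 429; the function name `fetch_with_backoff`, the retry count, and the base delay are illustrative placeholders rather than values prescribed by the review.

```python
import random
import time

import requests


def fetch_with_backoff(url, max_retries=5, base_delay=1.0):
    """Illustrative sketch: retry an API request with exponential backoff
    when the server signals rate limiting (HTTP 429). Parameter values
    are hypothetical defaults, not recommendations from the review."""
    for attempt in range(max_retries):
        response = requests.get(url)
        if response.status_code != 429:
            # Not rate limited: return the response to the caller as-is.
            return response
        # Wait 1s, 2s, 4s, ... plus random jitter before retrying,
        # so repeated requests do not hammer the API at a fixed interval.
        delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
        time.sleep(delay)
    raise RuntimeError("API quota still exceeded after retries")
```

Doubling the delay on each failed attempt, with added jitter, spreads retries out over time and is one common way researchers stay within API quota limits during large-scale data acquisition.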