Large-scale digital transformation across sectors has made data a vital instrument for accelerating technological innovation, particularly in the development of the Artificial Intelligence (AI) ecosystem. One of the most effective yet controversial data collection methods is web scraping. This research analyzes the ethical dilemmas and legal boundaries of web scraping practices in Indonesia, specifically in the context of supplying large-scale datasets for AI model training. The research uses a descriptive qualitative method with a literature review of positive law instruments, platform privacy policies, and international professional codes of ethics. The analysis indicates that although web scraping offers significant technical efficiency for various industrial sectors, the practice often overlooks the consent of data owners and potentially violates Law Number 28 of 2014 concerning Copyright and Law Number 27 of 2022 concerning Personal Data Protection (PDP Law). In the absence of specific national regulation on scraping, the integrity of Information Technology (IT) practitioners, guided by the Association for Computing Machinery (ACM) code of ethics, becomes crucial as a moral compass. This research concludes that the true professionalism of an IT practitioner is measured not only by mastery of data extraction technology but also by compliance with legal boundaries, transparency, and respect for private property rights, in order to create a fair, secure, and transparent digital ecosystem.
Copyrights © 2026