Although website privacy policies are legally binding agreements between users and website owners, users often overlook them because of their length and complexity. Transparency in these policies is crucial, particularly in Malaysia, where intricate language and complex legal clauses make it difficult for regulatory agencies to ensure compliance with the Personal Data Protection Act (PDPA) of 2010. Machine learning has been used to analyse privacy policies under various legal frameworks, but no dataset currently exists for the Malaysian PDPA. To bridge this gap, we introduce a pilot corpus of 50 privacy policies specifically tailored to the Malaysian PDPA. We analyse this dataset and make it available for academic research, offering insights into privacy regulations and identifying trends in privacy policy transparency. Our findings pave the way for tools that enhance compliance with PDPA standards and improve policy readability for users. The corpus also serves as a foundation for further research in privacy and data protection, encouraging the exploration of automated approaches to policy analysis and regulatory oversight.
Copyrights © 2025