Med, Aron
Unknown Affiliation

Published: 1 Document

Articles
Automatic Detection and Explanation of Dark Patterns from Interface Microcopy: Empirical Comparison of BERT-Style Encoders, RoBERTa-Style Encoders, and LLM-Style Decoders on the ec-darkpattern Dataset
Xu, Haosen; Chen, Yushan; Med, Aron
JTIE: Journal of Technology Informatics and Engineering, Vol. 4 No. 3 (2025): December
Publisher : University of Science and Computer Technology

DOI: 10.51903/jtie.v4i3.491

Abstract

Dark patterns (also called deceptive design patterns) are interface choices that steer or pressure users into unintended actions such as rushed purchases, unnecessary disclosures, or hard-to-cancel subscriptions. In e-commerce, many dark patterns are expressed directly in microcopy (e.g., button labels, banners, and inline messages), which makes text-only detection attractive for scalable auditing. This paper presents a fully reproducible experimental study on ec-darkpattern, a text-based dataset of e-commerce interface strings with balanced binary labels (1,178 dark pattern vs. 1,178 non-dark pattern) and seven dark pattern categories. We compare (i) a rule-based lexicon baseline, (ii) hashed n-gram linear models, (iii) a lightweight BERT-style bidirectional transformer encoder with word tokenization, (iv) a lightweight RoBERTa-style bidirectional transformer encoder with character tokenization, and (v) an LLM-style causal decoder trained as a classifier on the same inputs. On a fixed 80/10/10 split with seed 42, the best-performing model is a hashing + linear SVM baseline (F1=0.9437, ROC-AUC=0.9810), while the BERT-style encoder achieves F1=0.9038 (ROC-AUC=0.9695), the RoBERTa-style encoder achieves F1=0.8907 (ROC-AUC=0.9573), and the LLM-style decoder achieves F1=0.7884 (ROC-AUC=0.8808). These results should be interpreted as a controlled comparison under low-resource, no-pretraining conditions on a single fixed split, rather than as a general ranking of encoder-style versus decoder-style transformers. To support explainability, we generate token-level attributions using gradient-based saliency, summarize them as key phrases, and estimate explanation consistency via top-k token overlap on an exploratory 20-instance sample (mean Jaccard up to 0.7482 between the two character-based transformers). Finally, we curate an error-case library that links misclassifications to their most influential phrases. 
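The hashed n-gram linear baseline described above can be sketched compactly. The following is a minimal standard-library illustration of the hashing trick paired with a perceptron-style linear classifier; note that the paper's actual baseline uses a linear SVM, and the microcopy examples, feature dimension, and training settings below are invented for illustration, not taken from the ec-darkpattern dataset:

```python
import zlib
from collections import Counter

DIM = 4096  # hashed feature dimension (illustrative choice)

def hashed_ngrams(text: str, dim: int = DIM) -> Counter:
    """Map word unigrams and bigrams to hashed feature indices (the 'hashing trick')."""
    tokens = text.lower().split()
    grams = tokens + [" ".join(p) for p in zip(tokens, tokens[1:])]
    feats = Counter()
    for g in grams:
        # crc32 gives a deterministic hash (Python's built-in hash() is salted per run)
        feats[zlib.crc32(g.encode("utf-8")) % dim] += 1
    return feats

def train_perceptron(data, epochs: int = 10, dim: int = DIM):
    """Fit a simple perceptron on hashed n-gram counts; labels are +1 (dark) / -1 (benign)."""
    w = [0.0] * dim
    for _ in range(epochs):
        for text, label in data:
            feats = hashed_ngrams(text, dim)
            score = sum(w[i] * v for i, v in feats.items())
            if label * score <= 0:  # misclassified (or on the boundary) -> update
                for i, v in feats.items():
                    w[i] += label * v
    return w

def predict(w, text: str) -> int:
    feats = hashed_ngrams(text)
    return 1 if sum(w[i] * v for i, v in feats.items()) > 0 else -1

# Toy, invented microcopy for illustration only.
train = [
    ("hurry only 2 left in stock", 1),
    ("offer ends soon buy now", 1),
    ("add to cart", -1),
    ("free shipping on all orders", -1),
]
model = train_perceptron(train)
```

In practice the same idea is typically implemented with scikit-learn's `HashingVectorizer` feeding a `LinearSVC`, which adds proper regularization and a max-margin objective on top of this bare-bones sketch.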
Within this short-microcopy setting, the findings show that lexical baselines are especially strong and that transformer directionality and tokenization affect both accuracy and the stability of highlighted cues.
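The explanation-consistency measure mentioned above (top-k token overlap between two models' attributions) reduces to a Jaccard index over each model's k highest-scoring tokens. A minimal sketch, with hypothetical attribution scores rather than values from the paper:

```python
def jaccard_topk(attr_a: dict, attr_b: dict, k: int = 3) -> float:
    """Jaccard overlap between the top-k tokens of two token-attribution maps."""
    def top(attr):
        # tokens with the k largest attribution scores
        return {t for t, _ in sorted(attr.items(), key=lambda kv: -kv[1])[:k]}
    a, b = top(attr_a), top(attr_b)
    return len(a & b) / len(a | b)

# Hypothetical saliency scores from two models on the same microcopy string.
attr_model1 = {"hurry": 0.9, "only": 0.8, "left": 0.1, "cart": 0.05}
attr_model2 = {"hurry": 0.7, "stock": 0.6, "only": 0.2, "cart": 0.01}
overlap = jaccard_topk(attr_model1, attr_model2, k=3)
```

Averaging this score over a sample of instances yields the mean Jaccard consistency the paper reports (up to 0.7482 on its exploratory 20-instance sample).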