The adoption of Large Language Models (LLMs) in digital business systems in Indonesia is rapidly increasing; however, systematic security evaluation against Indonesian-language prompt injection remains limited. This study introduces the Indonesian Prompt Injection Dataset, consisting of 50 attack scenarios constructed using the STAR framework, which combines structured instruction variations with sociotechnical context to expose potential model vulnerabilities. The dataset was used to evaluate three commercial LLM platforms: ChatGPT using a GPT-4-class lightweight variant (OpenAI), Gemini 2.5 Flash (Google), and Claude Sonnet 4.5 (Anthropic), through controlled experiments targeting instruction manipulation in Indonesian. The results reveal distinct robustness profiles across the models. Gemini 2.5 Flash exhibits moderate observed resilience, with 76% of scenarios classified as medium risk and 12% as high risk. ChatGPT demonstrates higher observed robustness under the tested scenarios, with 88% of cases classified as low risk and no high-risk outcomes. Claude Sonnet 4.5 shows intermediate observed resilience, with 72% low-risk and 28% medium-risk scenarios. High-risk cases primarily involve direct role overrides, urgency- or emotion-based prompts, and anti-censorship instructions; structural ambiguities and multi-intent manipulations tend to result in medium risk, while mildly persuasive prompts fall under low risk. These findings suggest that although contemporary LLM defense mechanisms are effective against explicit attacks, contextual and emotionally framed manipulations continue to pose residual security challenges. This study contributes the first Indonesian-language prompt injection dataset and demonstrates the STAR framework as a practical, standardized approach for evaluating LLM security in digital business applications.