DVAIA Workshop

Direct Prompt Injection

Indirect Prompt Injection

Excessive Agency

XSS in AI-Powered chat UI

Sensitive Information Disclosure

100

This is the LLM attack where the user’s input directly tries to override or redirect the model’s intended instructions.

What is Direct Prompt Injection?

100

This is prompt injection delivered through external content the model reads (documents, websites, emails) rather than the user typing it directly.

What is Indirect Prompt Injection?

100

The DVAIA lab theme where an AI agent has too many permissions and can be manipulated into doing things a user shouldn’t be able to trigger.

What is excessive agency?

100

This happens when model output is rendered as HTML in the browser instead of being escaped as text.

What is cross-site scripting (XSS)?

100

OWASP warns LLMs can reveal confidential data such as PII and API keys in responses; this is called:

What is sensitive data disclosure?

200

This technique attempts to bypass guardrails by rewriting the request using another language (or switching languages mid-stream).

What is a multi-language prompt injection?

200

Instead of attacking via the chat box, the attacker hides instructions inside data the model reads, like uploaded documents used for retrieval.

What is RAG document poisoning?

200

The most fundamental mitigation principle for reducing agent harm: grant only the minimum permissions necessary to do the job.

What is least privilege?

200

XSS where the malicious content appears immediately in the response because it came directly from the current request.

What is reflected XSS?

200

This is a “non-obvious” type of sensitive information that can leak: internal decision rules, workflows, or “how the app works” (often valuable to attackers).

What is business logic?

300

A technique where the attacker tells the model to “pretend” it’s a different persona (e.g., an admin, a dev, or a security tool) to bypass restrictions.

What is role playing?

300

In the DVAIA workshop scenario, this assistant agent is compromised when hidden instructions are embedded in message content it tries to summarize or respond to.

What is email agent injection?

300

This security control, called _____-in-the-loop approval, helps prevent “do it because the model said so” by requiring explicit approvals before sensitive actions.

What is human-in-the-loop approval?

300

XSS where the malicious content is saved (for example, into a profile field) and later re-displayed to trigger again.

What is stored XSS?

300

The best general defense pattern is minimizing what the model can ever see: don’t embed secrets in prompts, and limit access to only what’s needed—commonly summarized as:

What is data minimization + least privilege?

400

This direct prompt-injection goal tries to reveal the model’s hidden instructions or configuration text.

What is system prompt extraction?

400

Clue: The most important defense concept here is to treat retrieved/email/document text as ___, not commands.

What is data?

400

A dangerous sign: the agent can access “hidden” tools that normal users shouldn’t have, especially without access controls. What are _____ _____ tools?

What are exposed admin tools (unauthorized tool access)?

400

In the lab, an XSS payload that runs immediately in the current response is categorized as this type.

What is reflected XSS?

400

This vulnerability class happens when a system can be tricked into retrieving internal resources (often cloud or internal endpoints) instead of only intended external content.

What is SSRF (Server-Side Request Forgery)?

500

In the workshop, this umbrella concept covers attempts to break safety boundaries outright (often framed as “ignore prior rules and comply”).

What is a jailbreak?

500

Name the testing guidance number from OWASP LLM Top 10 2025 that describes a flow where a plugin/tool retrieves external content and the system merges it into the model prompt, causing unintended behavior — this is the “indirect” threat scenario for:

What is LLM01 Indirect Prompt Injection?

500

The risk pattern where the model is allowed to both decide and execute high-impact actions (like sending, deleting, transferring, or disclosing) with no guardrails. What is o___-p________ autonomous action (unsafe tool orchestration)?

What is over-permissioned autonomous action?

500

According to the OWASP AI Testing Guide, this failure allows model‑generated JavaScript to execute in the browser when output handling does not properly encode content. What is i_____ o_____ h____ leading to XSS?

What is improper output handling leading to XSS?

500

The “Vulnerable vs Fixed” toggle in Sensitive Data Disclosure is used to compare attack vs defense, including that “traversal [is] blocked” in fixed mode — that suggests defenses like validation/sanitization of:

What are file paths / resource identifiers (inputs that could enable path traversal)?