Gold Coast Surfers Paradise Twilight
← AI News
AI Security

The OWASP Agentic Top 10: A Defence Playbook for the Agents You Are Deploying

OWASP has published a Top 10 built specifically for AI agents. It reframes the agent as a privileged user that reads untrusted text and acts with your access. Here is the practical defence playbook.

·TheAICommand

An AI agent is a privileged user that reads its instructions from strangers.

That single sentence explains why the security tooling you already own does not fully cover the agents your teams are now deploying. A chatbot answers. An agent acts. It books the meeting, files the ticket, queries the database, sends the email. To do that it holds real permissions, and it decides what to do next by reading text it did not write, including text an attacker can plant.

In December 2025 the OWASP GenAI Security Project published the Top 10 for Agentic Applications 2026, built through collaboration with more than 100 contributors. It is the first widely adopted list that treats the agent, not the model, as the unit of risk. If your organisation is moving from "staff use a chat tool" to "a system takes actions on our behalf", this is the framework to govern that shift against. Here is what it says, and the controls to put in place before you grant an agent any real autonomy.

Why agents break the old security model

Traditional application security assumes a clear line between code, which is trusted, and data, which is not. Large language models erase that line. The model treats everything in its context window as language to act on, so a paragraph buried in a web page, a calendar invite or an email can read as an instruction. This is prompt injection, the top entry in the OWASP Top 10 for LLM Applications, and it does not disappear when you wrap the model in an agent. It gets worse, because now the model can take actions.

The clearest proof is EchoLeak. Disclosed in 2025 and tracked as CVE-2025-32711, it was a zero-click vulnerability in Microsoft 365 Copilot rated 9.3 on the CVSS scale. An attacker sent a single ordinary-looking email. The user did not have to click anything. When Copilot later processed that email as part of its context, hidden instructions inside it caused the assistant to gather data from the user's own OneDrive, SharePoint and Teams and route it out through an allowed channel. Microsoft patched it and found no evidence of exploitation in the wild, but the lesson stands: the attack arrived as normal business content, and the assistant did the exfiltration using the user's own access. The researchers named the underlying pattern an LLM Scope Violation, where untrusted input pulls the model beyond the boundary it was supposed to respect.

EchoLeak: a zero-click prompt injection in Microsoft 365 Copilot rated 9.3 on the CVSS scale, where a single email caused the assistant to leak a user's own files
EchoLeak (CVE-2025-32711): the worked example of an agent acting on an attacker's instructions with the user's access

Hold that picture in mind, because every risk in the OWASP list is a variation on it: the agent's intent gets hijacked, or the agent's reach gets abused, or the agent's network turns against itself.

Your agent is a privileged user

The most useful framing in the OWASP guidance is to stop thinking of an agent as a clever chatbot and start treating it as a non-human privileged account, in the same category as a service account or an automation bot. It authenticates. It holds tokens. It can read and write to systems. The difference is that, unlike a script, it improvises, and it improvises based on text it reads at runtime.

A single figure holding a glowing key at the gateway to a navy city, representing an AI agent that holds real permissions and acts on instructions it reads at runtime
Treat the agent as a non-human privileged identity, not a smarter chatbot

Once you accept that framing, the ten risks sort into three plain questions a practitioner can hold in their head.

Can someone redirect the agent's intent? This is ASI01 Agent Goal Hijack, where external text rewrites the agent's objective. It is reinforced by ASI06 Memory and Context Poisoning, where bad data written to the agent's memory quietly steers future decisions across sessions, and ASI09 Human-Agent Trust Exploitation, where the attacker leans on a person's confidence in the agent's recommendation to get a harmful action approved.

Can someone abuse the agent's reach? This is ASI02 Tool Misuse and Exploitation, where the agent is talked into using a legitimate tool in an illegitimate way, ASI03 Identity and Privilege Abuse, where cached credentials and delegation chains let the agent do more than intended, and ASI05 Unexpected Code Execution, where agent-generated or externally influenced code runs when it should not.

Can the agent's own supply chain and network turn against it? This covers ASI04 Agentic Supply Chain Vulnerabilities, where a plugin, tool descriptor or dependency loaded at runtime is malicious, ASI07 Insecure Inter-Agent Communication, where weak authentication between agents lets messages be intercepted or forged, ASI08 Cascading Failures, where one fault propagates across networked agents into a system-wide outage, and ASI10 Rogue Agents, where an agent drifts from its purpose through compromise or misalignment.

You do not need to memorise the codes. You need the three questions, because they map directly onto controls.

The defence playbook

The good news is that almost every control here is a discipline your security team already practises for human and service accounts. The work is applying it to a new kind of identity. Set these up before, not after, you grant autonomy.

A left to right defence flow: untrusted input, then isolate instructions from content, then least-privilege tool scope, then an approval gate for high-impact actions, then a logged and reversible action
The agent defence flow: isolate input, scope tightly, gate the dangerous actions, log everything

Scope the agent like a service account. Give it the narrowest set of permissions that lets it do its job, with short-lived and tightly scoped tokens, not a standing key to everything a person can reach. This is the single highest-leverage control against ASI02 and ASI03. If the agent only needs to read three folders, it should not be able to read the fourth.

Separate instructions from content. Treat anything the agent retrieves, an email, a document, a web page, as untrusted data, never as a command. Use system prompts and structured tool schemas to define what the agent is allowed to do, and do not let retrieved text expand that envelope. This is your primary defence against ASI01 and the EchoLeak pattern.

Put an approval gate on the actions that hurt. Map each tool the agent can call to a blast radius. Sending an internal draft is low. Moving money, deleting records, emailing an external party or changing a permission is high. High-blast-radius actions should pause for a human, with the agent showing its working. Start an agent on constrained autonomy and let it earn more, rather than granting broad rights to save time and clawing them back after an incident.

Constrain and allowlist tools. Define tool calls with strict schemas so the agent cannot smuggle arbitrary parameters, and do not let it load plugins, tools or dependencies at runtime without review. That closes off much of ASI04 and ASI05.

Practise memory hygiene. Decide deliberately what the agent is allowed to write to long-term memory, validate it on the way in, and scope it to the session or task where you can. Unvalidated memory is how ASI06 turns one bad input into a persistent problem.

Red-team it before production, against this list. The agent evaluation market has matured around exactly this. Tools now ship red-teaming presets aligned to the OWASP LLM and agentic Top 10, and a sound practice is to replay an agent's recorded actions in an isolated clone of production, a digital twin, to see whether the same sequence triggers a cascading failure before it does so for real. Pre-production testing is where ASI08 and ASI10 are cheapest to catch.

Log every action as evidence. Record what the agent did, which tool it called, with what inputs, and on whose authority. You need this to detect a rogue agent, and you need it to prove to an auditor that the controls work. The same log is what lets you reverse an action and run a post-incident review, so treat it as a control in its own right, not as debugging output you can switch off when the logs get noisy.

None of these controls is exotic. The reason agents catch teams out is not that the defences are hard, it is that an agent is introduced as a productivity feature and reviewed by no one who would normally sign off a new privileged account. Put one owner on each agent, give that owner this list, and most of the exposure closes.

The Australian angle

For APRA-regulated entities, none of this sits outside existing obligations. CPS 234 Information Security already requires you to manage information security capability in line with the threats, and an agent with broad access is an information asset and an access path that the standard reaches. CPS 230 Operational Risk Management, live since 1 July 2025, brings the agent's third-party stack into scope as a material service arrangement where the agent feeds critical operations. The practical implication is simple: the agent belongs in your access reviews, your third-party risk register and your incident response plan, governed as the privileged identity it is, not parked in an innovation pilot outside the controls.

The hype check

Two failure modes are worth naming. The first is granting broad autonomy because it saves time. Every control above costs a little friction, and the temptation is to skip it for a quick win, which is precisely how the access ends up wider than anyone intended. The second is treating the model's own refusal as a security control. A model that has been trained to decline harmful requests is a useful layer, but it is not a boundary, because the whole point of prompt injection is to talk the model out of its own guardrails. Controls live in the architecture around the model, in permissions, isolation, approval gates and logging, not in the model's good intentions.

What to do this week

You do not need an agent platform to start. You need to treat the agents you already have like the privileged users they are.

  1. List every AI agent or assistant in your organisation that can take an action, not just answer, and write down what each one can reach.
  2. For the highest-reach agent on that list, check whether its permissions are scoped to its job or inherited from a person, and tighten them.
  3. Identify its highest-blast-radius action and put a human approval gate in front of it.
  4. Add the agent to your access reviews and your third-party risk register.
  5. Book a pre-production red-team of one agent against the OWASP agentic Top 10 before its next capability expansion.

Agents are worth deploying. The teams that get the most from them will be the ones that governed them as privileged identities from the first day, not the ones that discovered the framing after an incident.

References

  • OWASP GenAI Security Project, Top 10 for Agentic Applications 2026, published 9 December 2025. https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications/
  • OWASP GenAI Security Project, Top 10 for LLM Applications. https://genai.owasp.org/llm-top-10/
  • National Vulnerability Database, CVE-2025-32711 (EchoLeak, Microsoft 365 Copilot), CVSS 9.3. https://nvd.nist.gov/vuln/detail/CVE-2025-32711
  • APRA, Prudential Standard CPS 234 Information Security. https://www.apra.gov.au/information-security
  • APRA, Prudential Standard CPS 230 Operational Risk Management. https://www.apra.gov.au/operational-risk-management

General information and education only. Not legal, compliance, or security advice. Verify controls against your own environment and the primary OWASP and APRA sources before acting.*

TheAICommand. Intelligence, At Your Command.

Tags

AI AgentsAI SecurityPrompt InjectionOWASPAI Governance
← Back to AI News