AUSTA | Adversarial Intelligence

Security Engineering · 2026 Reference

OWASP Top 10 for AI Agents 2026: The New Risks Your LLM Pentest Isn't Catching

OWASP shipped the Agentic Security Initiative (ASI) Top 10 in 2026. It's a different list than the LLM Top 10, because agents have something LLMs don't: agency. Here's what changed, the ten new risks, and a starter test case for each.

By Austa · Published · ~12 min read

The one-sentence difference

An LLM is a stateless calculator: input tokens, output tokens. An agent is a stateful loop: takes actions, calls tools, writes memory, coordinates with other agents. The ASI Top 10 catalogs the risks that only exist because the agent has agency the LLM does not. If you secured your LLM application against the LLM Top 10 last year and you think you're done, you're not done.

Walk both lists. The Agent Top 10 assumes you've addressed prompt injection, output handling, model DoS, and the other LLM-tier risks. ASI is layered on top, not a replacement. A pentest that covers only one list misses half the surface.

The ten risks, with a test case for each

ASI01 — Rogue Orchestration

The orchestration layer (the code that decides which agent does what, in what order) is itself an attack surface. Most teams treat the orchestrator as plumbing and forget that its routing decisions are an authorization boundary.

How it fails: the orchestrator routes a task to a higher-privileged agent because the task prompt was crafted to look like the kind of task that agent handles. Privilege escalation by impersonation, mediated by your own routing code.

Starter test: submit a low-privilege task with a body that includes phrasing your higher-privilege agent typically handles ("I need to run a database migration"). Watch which agent picks it up. If the answer surprises you, ASI01 is open.

ASI02 — Context Injection

Indirect prompt injection at agent scale. An attacker plants malicious content in a data source the agent will read later (a webpage, a document, an email, a memory entry). When the agent retrieves and processes that content, the injected instructions execute with the agent's full privileges.

How it fails: agent ingests a "user review" with an embedded instruction to call a tool. Agent calls the tool, exfiltrates data, attacker never sent the prompt directly.

Starter test: plant a prompt-injection payload in every data surface the agent reads (user-submitted content, fetched URLs, memory retrievals, file uploads). Each surface is its own ASI02 surface.

ASI03 — Unbounded Tool Execution

The gap between "agent has access to this tool" and "agent should be allowed to use this tool right now, with these arguments, in this context." Agents with shell access, SQL access, HTTP-fetch, or file-write capability and no per-call review are a budget drain and a data-exfiltration channel.

How it fails: agent decides the right answer to a user's question involves running 200 SQL queries. It does. The database is on fire by lunch.

Starter test: instrument every tool call with a per-tool, per-time-window quota. Burn through it deliberately by asking the agent to do a routine task in an inefficient way. If the quota didn't trigger, you have no enforcement.

ASI04 — Improper Credential Management

Agents need credentials to use tools. Credentials with too-broad scope, credentials shared across agents, credentials baked into prompts, credentials surfacing in agent logs are all ASI04.

How it fails: agent is given a long-lived API key with full account scope because that was easiest. Three months later, an ASI02 incident exfiltrates conversation logs that contain the key.

Starter test: for every credential the agent uses, ask three questions. Is it scoped to the minimum operation set the agent needs? Does it rotate automatically? Is it ever serialized into a prompt, response, or log line the agent can see?

ASI05 — Agentic Data Exfiltration

The agent equivalent of "data leaving the system in unexpected ways." Because agents have tools, they have more channels: a misused HTTP-fetch tool, a code-execution sandbox with network, a "summarize this for the user" output that includes data the user shouldn't see.

How it fails: agent has both "read internal docs" and "send slack message" tools. Attacker tricks it into reading sensitive docs and DMing the summary to an attacker-controlled slack workspace via a misconfigured webhook.

Starter test: map every tool the agent has by its data-flow direction (read, write, send, receive). Look for pairs of tools that together enable read-then-send. Each pair is a candidate exfil channel.

ASI06 — Overreliance on LLM-Logic

Trusting the LLM to make authorization decisions, validate financial calculations, or enforce business invariants. The LLM is great at producing plausible-sounding answers and bad at being right about edge cases. Use it for content, not for security boundaries.

How it fails: code prompts an LLM with "is this user authorized to delete the account?" The LLM says yes (or no) based on conversational context. The actual authorization rule is never checked.

Starter test: grep your prompt templates and your agent-tool definitions for words like "authorize," "validate," "check if allowed," "confirm permission." Every match is a candidate ASI06. Replace LLM-mediated checks with deterministic code.

ASI07 — Insecure Agent Discovery

How does one agent learn about another? In multi-agent systems, the discovery mechanism (a service registry, a tool catalog, a shared memory) is itself attackable. An attacker who can register a malicious agent can intercept tasks the orchestrator would have sent to the legitimate one.

How it fails: agent registry accepts new agent registrations with minimal vetting. Attacker registers an agent with a name designed to win routing decisions ("data-export-agent-v2"). Routes intended for the legitimate exporter go to the attacker.

Starter test: review your agent registry's onboarding flow. Can a new agent be added without code review? Without identity verification? If yes, ASI07.

ASI08 — Lack of Process Isolation

Multiple agents sharing the same execution environment, same memory namespace, same credential pool. One compromised agent reaches into the others. The natural shape is "let them all share state because it's easier," and that's the vulnerability.

How it fails: agent A is the simple one with limited tools. Agent B has the elevated tool surface. They share an in-memory cache. Attacker compromises agent A and writes attacker-controlled data into the shared cache. Agent B reads it and executes accordingly.

Starter test: for each agent in your system, list everything it shares with another agent (memory, cache, file system, credentials, environment variables). Every shared surface is a candidate cross-contamination path.

ASI09 — Model-Agnostic Drift

The agent's behavior drifts because the underlying model changed (the provider updated it, you switched providers, the temperature or prompt subtly changed). A test suite that passed yesterday fails today. A guardrail that fired yesterday silently doesn't fire today.

How it fails: you upgraded Claude Opus 4.6 to 4.7. The "refuse to send wire transfers above $10,000" instruction now interprets "wire transfer" more narrowly. The exfil channel just opened.

Starter test: maintain a golden-path eval suite that runs against every model change. Include adversarial cases for every other ASI risk in this list. Re-run on every model swap and every prompt change.

ASI10 — Recursive Resource Exhaustion

An agent loops. An agent that spawns sub-agents that spawn sub-agents loops harder. Without bounded recursion depth and bounded total work-per-task, a single user request can produce a runaway cost spike or an unbounded queue.

How it fails: user submits a task. Agent decomposes it into five sub-tasks. Each sub-task is itself decomposed into five. Five levels deep, that's 3,125 leaf operations from one input.

Starter test: instrument every agent invocation with a "remaining budget" counter (tokens, sub-tasks, wall-clock). Trip it deliberately by submitting a recursively-decomposable task. If your agent doesn't terminate cleanly when the budget is exhausted, ASI10 is open.

How this maps to your existing pentest

If you ran an LLM pentest against the OWASP LLM Top 10 in the last six months, the work isn't wasted. The findings from that pentest are still findings. But the Agent Top 10 surfaces a different class of issue: the failures that come from the agent doing something, not just generating something.

Practical sequence:

  1. Walk the LLM Top 10 against every prompt-touching surface (auth flow, output handling, RAG retrieval). This catches the input/output failures.
  2. Walk the ASI Top 10 against every agent-touching surface (orchestration, tool execution, memory writes, multi-agent handoffs). This catches the agency failures.
  3. Build adversarial regression tests for every finding from either pass. Run them on every model swap and every prompt change.

The two walks together usually take a week for a system of moderate complexity. The findings backlog from a first walk is typically 20-40 items, biased toward partial implementation rather than total absence. Most teams have some protection for most categories. The work is closing the gaps.

What we ship at Austa for the ASI Top 10

Austa's adversarial intelligence platform ships ASI-tagged test cases for every category. You point it at your agent, it runs the test cases, you get a per-ASI scorecard. The platform is designed to be re-runnable on every model swap so ASI09 (drift) is caught the day it happens, not the week the bill arrives.

You don't need a platform to walk the list, though. The starter tests above are a real first pass. The 2026 LLM Security Checklist covers the LLM-tier layer. Together they're enough to ship.

Related reading