
Zero Trust for autonomous agents.

Identity, authorisation, and segmentation patterns when the principal isn't a human or a service — it's an agent. A working extension of NIST SP 800-207 for the agentic era.

Reading time: 12 minutes | Audience: Security Architect · CISO · Platform | Series: Architecture Intelligence / 02 | Updated: January 2026

An autonomous agent is neither a user nor a service. It is a process that holds delegated authority from a user, makes non-deterministic decisions, and exercises that authority against tools and resources on the user's behalf. Every assumption baked into Zero Trust as defined for human and service principals breaks at this seam.

NIST SP 800-207 is the reference for Zero Trust architecture in the enterprise [1]. It assumes that the subject of an access decision (the principal) is either a person being authenticated or a workload acting under its own identity. Continuous verification, least privilege, and microsegmentation are then arranged around that subject.

Agents do not fit. An LLM agent invoking a tool is not the user, but it is acting under the user's delegated authority. It is not the service it calls, but it is the calling subject. Its behaviour is non-deterministic across runs. Its principal can change mid-session. Its scope is governed less by configuration than by the contents of a prompt, which occasionally include hostile content retrieved from untrusted sources [2].

This is a Zero Trust problem disguised as an AI problem. It is solvable, but it requires extending the 800-207 control plane in three specific places: subject identity, delegation scope, and the tool boundary.

S/00 · The principal problem

In a classical Zero Trust deployment, the policy engine evaluates three things on every request: who is the subject, what context do they bring, and what are they asking to do? Continuous verification revisits those signals at every access decision. The model is clean because the subject is stable: a person with credentials, or a workload with a SPIFFE identity or equivalent.

An agent breaks this in three ways:

The architectural answer is not to retrofit the agent as either a user or a service. It is to treat the agent as a first-class subject category — with its own identity model, its own delegation rules, and its own audit obligations.

S/01 · Identity for agents

An agent identity should be:

OAuth 2.1 with token exchange (RFC 8693) is a workable starting point for the delegation primitive [4]. The token exchange flow can mint an agent-scoped access token from the user's authenticated session, with the agent's process identity (workload identity) bound into the request. The resulting token carries both subjects, user and agent, and a scope narrower than either could request alone.
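As a rough illustration of the exchange described above, the sketch below builds the form-encoded body of an RFC 8693 token exchange request. The parameter names and token-type URNs are those defined in RFC 8693. The function name `build_exchange_request`, the scope strings, and the audience URL are hypothetical, and presenting the user's credential as `subject_token` and the agent's workload credential as `actor_token` is one reasonable mapping rather than the only one.

```python
from urllib.parse import parse_qs, urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def build_exchange_request(user_token: str, agent_token: str,
                           scopes: list, audience: str) -> str:
    """Return the form-encoded body of a token exchange request.

    Assumption: the user's access token is the subject token and the
    agent's workload credential (a JWT here) is the actor token.
    """
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": user_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": agent_token,
        "actor_token_type": JWT_TOKEN_TYPE,
        # Request only the narrowed capability set for this task.
        "scope": " ".join(scopes),
        "audience": audience,
    }
    return urlencode(params)


# Hypothetical values, for illustration only.
body = build_exchange_request(
    "user-access-token", "agent-workload-jwt",
    ["crm:read", "mail:draft"], "https://crm.example",
)
```

The body would be sent as `application/x-www-form-urlencoded` to the authorisation server's token endpoint. Whether the server actually narrows the scope is a matter of its policy, not of the request.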

Pattern · Composite subject token

A practical token shape carries: act (the agent's workload identity), sub (the delegating user), scope (the narrowed capability set), task_id (the originating task), and build (a hash of model + prompt + tool manifest). The policy decision point evaluates all five on every access. The audit log records all five on every decision.
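One way to picture the five-field token and its evaluation is the sketch below. It is illustrative only: the class `CompositeSubject`, the functions `build_hash` and `decide`, and the allow-lists passed to `decide` are hypothetical names, and canonical JSON hashed with SHA-256 is an assumed way of deriving `build`. RFC 8693 defines the `act` claim as a JSON object; it is flattened to a string here for brevity.

```python
import hashlib
import json
from dataclasses import dataclass


def build_hash(model: str, prompt: str, tool_manifest: dict) -> str:
    """Hash model + prompt + tool manifest into one stable identifier."""
    canonical = json.dumps(
        {"model": model, "prompt": prompt, "tools": tool_manifest},
        sort_keys=True, separators=(",", ":"),  # key order must not matter
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class CompositeSubject:
    act: str          # the agent's workload identity
    sub: str          # the delegating user
    scope: frozenset  # the narrowed capability set
    task_id: str      # the originating task
    build: str        # hash of model + prompt + tool manifest


def decide(token: CompositeSubject, action: str, task_id: str,
           trusted_agents: set, approved_builds: set) -> bool:
    """Evaluate all five fields; any single failure denies the call."""
    return (
        token.act in trusted_agents
        and bool(token.sub)
        and action in token.scope
        and token.task_id == task_id
        and token.build in approved_builds
    )
```

Binding `build` into the decision means a changed prompt or tool manifest produces a different hash, so an unreviewed agent build is denied until it is explicitly approved.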

S/02 · Delegation and scope

Scope is where most agent deployments accumulate risk silently. The temptation is to grant the agent the union of permissions it might ever need across the user's task variants. The result is an agent with more authority than the user typically exercises, available to anyone who can steer its prompt.

The right model is capability tokens rather than ambient permissions [5]. A capability token authorises a specific action against a specific resource for a specific duration. It cannot be ambient: the agent must present it for each call, and the call inherits exactly its expressed authority.
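A minimal sketch of such a token, assuming an HMAC-signed JSON payload and a single signing key held by the issuer and the enforcement point, might look like the following. The functions `mint_capability` and `check_capability` and the payload layout are hypothetical; a production design would more likely use a standard signed-token format and managed keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def mint_capability(key: bytes, action: str, resource: str,
                    ttl_seconds: int, now: Optional[float] = None) -> str:
    """Mint a token for one action on one resource for a limited time."""
    issued = time.time() if now is None else now
    payload = json.dumps(
        {"action": action, "resource": resource, "exp": issued + ttl_seconds},
        sort_keys=True,
    ).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode("ascii") + "."
            + base64.urlsafe_b64encode(tag).decode("ascii"))


def check_capability(key: bytes, token: str, action: str, resource: str,
                     now: Optional[float] = None) -> bool:
    """Return True only for an authentic, unexpired, exactly matching token."""
    current = time.time() if now is None else now
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return False  # malformed token: deny
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False  # forged or tampered: deny
    claims = json.loads(payload)  # parsed only after the signature checks out
    return (claims["action"] == action
            and claims["resource"] == resource
            and current < claims["exp"])
```

Because the check compares the exact action and resource, a token minted for a knowledge-base read cannot be replayed against a different tool or a different resource, whatever the model asks for.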

Three layers of scope reduction

An agent operating against enterprise resources should have its effective authority narrowed at three points before any tool call lands:

Static, ambient permissions are to agents what flat networks were to lateral movement. The fix is the same in spirit: deny by default, authorise per call, expire on completion.
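The "deny by default, authorise per call, expire on completion" rule can be sketched as a small grant table. This is an in-memory illustration under assumed names (`TaskGrants`, `authorise`, `is_allowed`, `complete`); a real deployment would hold this state in the policy decision point, not in the agent process.

```python
class TaskGrants:
    """Hypothetical per-task grant table with deny-by-default semantics."""

    def __init__(self) -> None:
        # task_id -> set of (action, resource) pairs granted to that task
        self._grants: dict = {}

    def authorise(self, task_id: str, action: str, resource: str) -> None:
        """Grant one action on one resource, for one task only."""
        self._grants.setdefault(task_id, set()).add((action, resource))

    def is_allowed(self, task_id: str, action: str, resource: str) -> bool:
        """Anything not explicitly granted to this task is denied."""
        return (action, resource) in self._grants.get(task_id, set())

    def complete(self, task_id: str) -> None:
        """Expire every grant when the task finishes."""
        self._grants.pop(task_id, None)


grants = TaskGrants()
grants.authorise("task-42", "read", "crm/customers")
allowed_during = grants.is_allowed("task-42", "read", "crm/customers")
grants.complete("task-42")
allowed_after = grants.is_allowed("task-42", "read", "crm/customers")
```

Nothing granted to one task carries over to the next, so authority cannot accumulate across tasks.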

S/03 · The tool layer

Tools are the boundary where agent authority becomes enterprise effect. They are the right place to enforce the policy that matters. Several patterns deserve to be treated as architecture rather than implementation detail.

Tools are mediated, not exposed

A tool exposed directly to the model — as a raw API or a thin wrapper — is a tool whose enforcement boundary is the model's compliance. This is the wrong primitive. Tools should be mediated through a policy enforcement point that:

The Model Context Protocol (MCP) and equivalent tool-mediation layers are the right place to terminate trust for tool calls [6]. Treat them as policy enforcement points in the 800-207 sense, not as transport.
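To make the mediation idea concrete, here is a minimal sketch of a mediator that sits between the model and the tools. It does not use the MCP SDK or any real MCP API; `ToolMediator`, `register`, and `call` are hypothetical names, and the scope check stands in for whatever capability check the policy engine actually performs.

```python
from typing import Callable


class ToolMediator:
    """Hypothetical enforcement point: the model never holds a tool directly."""

    def __init__(self) -> None:
        self._tools: dict = {}

    def register(self, name: str, fn: Callable[..., object]) -> None:
        """Make a tool reachable only through this mediator."""
        self._tools[name] = fn

    def call(self, granted_scope: frozenset, name: str, **kwargs) -> object:
        """Run a tool only if it is registered and inside the granted scope."""
        if name not in self._tools:
            raise PermissionError(f"unknown tool: {name}")
        if name not in granted_scope:
            # The decision depends on the token's scope, never on model output.
            raise PermissionError(f"tool not in granted scope: {name}")
        return self._tools[name](**kwargs)


mediator = ToolMediator()
mediator.register("kb.search", lambda query: f"results for {query}")
result = mediator.call(frozenset({"kb.search"}), "kb.search", query="zero trust")
```

The model can request any tool name it likes; only the scope presented with the call determines whether the request runs.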

High-impact tools require explicit human-in-the-loop

Not every tool is equivalent. A read against a knowledge base is different from a write to a payment system. The classification should be explicit, codified per tool, and enforced at the mediation layer. High-impact tools — those that move money, alter customer state, send outbound communication, or modify access control — should require synchronous human approval, regardless of the agent's general authority.

The reflex to make agents "fully autonomous" for these tools is almost always wrong. The cost of synchronous approval for a payment write is far lower than the cost of a single misrouted one.
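The classification and approval gate could be sketched as follows. The tool names, the `Impact` enum, the `TOOL_IMPACT` table, and the `gate` function are all hypothetical, and treating unclassified tools as high impact is an assumption of this sketch, not something the text above prescribes.

```python
from enum import Enum
from typing import Callable


class Impact(Enum):
    LOW = "low"
    HIGH = "high"


# Explicit, per-tool classification (illustrative entries only).
TOOL_IMPACT = {
    "kb.search": Impact.LOW,
    "payments.write": Impact.HIGH,
    "mail.send": Impact.HIGH,
}


def gate(tool: str, approve: Callable[[str], bool]) -> bool:
    """Return True if the call may proceed.

    High-impact tools block on a synchronous human decision supplied by
    `approve`. Assumption: tools missing from the table count as HIGH.
    """
    impact = TOOL_IMPACT.get(tool, Impact.HIGH)
    if impact is Impact.HIGH:
        return approve(tool)
    return True


def deny_all(tool: str) -> bool:
    """Stand-in for a human approver who rejects every request."""
    return False
```

With this shape, `gate("kb.search", deny_all)` proceeds without asking anyone, while `gate("payments.write", deny_all)` is blocked until a human approves, regardless of the agent's general authority.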

Retrieved content is data, not instructions

Prompt injection via retrieved content is the most reliably exploitable agent vulnerability in current production systems. Defence is layered:

S/04 · Reference architecture

The following extends the 800-207 core diagram (subject → policy decision point → enforcement → resource) with the three additional surfaces agents require: the delegation broker, the tool mediation layer, and the per-decision audit store.

D/01 · NXR-AGENT-ZT · Zero Trust reference for agent principals
[Diagram. Components shown: Subject · User (authenticated; OIDC · MFA); Delegation broker (composite token; RFC 8693; scope reduction; act + sub + build + task); Subject · Agent (ephemeral identity; build-bound; scoped); Policy decision point (per-call evaluation of subject, scope, and context; continuous verification); Resource (microsegment; data, API, workload); Tool mediation layer · PEP (capability check, sanitise, approve; MCP or equivalent; high-impact tools → HITL); Audit & evidence plane (immutable per-decision log, session replay, regulatory disclosure). Diagram notes: every tool call passes the mediation layer; there is no direct model → resource path. Standards cited in the diagram: EU AI Act Art. 12 · NIST 800-207 §3.2 · ISO/IEC 42001 §8.5.]
The three additions to 800-207 are the delegation broker (issuing composite tokens), the tool mediation layer (terminating trust for tool calls), and the audit plane (per-decision evidence).

S/05 · Failure modes worth designing against

The reference is incomplete without the failure modes it is designed to absorb. The list below is not exhaustive — it is the set of failures observed often enough to be predictable.

F-01 · Capability accumulation

The agent's effective scope grows over time as edge cases prompt scope widening that is never reversed. Mitigation: scope is per-task and expires; persistent scope changes require a change-control gate.

F-02 · Prompt-injection-driven authority extension

Retrieved content steers the agent to attempt actions outside its delegated scope. Mitigation: capability tokens are unforgeable from content; the mediation layer denies out-of-scope calls regardless of model output.

F-03 · Tool composition leading to disallowed effect

Each tool call is individually authorised but the composition produces an effect the user did not intend (e.g. read customer list → draft mail → send mail). Mitigation: outbound communication and other irreversible categories require HITL regardless of upstream authorisation chain.

F-04 · Cross-session bleed

An agent reuses context — including credentials or retrieved content — from a prior session. Mitigation: session-scoped identities; no shared mutable state across sessions; cache keys bound to subject + task.
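A cache key bound to subject and task can be sketched as below. The function name `cache_key` and the choice of SHA-256 over length-prefixed fields are assumptions for illustration; the point is only that the same query under a different user, agent, or task must never hit the same cache entry.

```python
import hashlib


def cache_key(sub: str, act: str, task_id: str, query: str) -> str:
    """Derive a cache key bound to the delegating user, the agent, and the task."""
    digest = hashlib.sha256()
    for part in (sub, act, task_id, query):
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


key_a = cache_key("user-1", "agent-1", "task-1", "open invoices")
key_b = cache_key("user-2", "agent-1", "task-1", "open invoices")
```

Here `key_a` and `key_b` differ even though the query text is identical, so content retrieved for one user's session is not served into another's.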

F-05 · Unattributable action

An action lands in a downstream system without a traceable agent + user pair. Mitigation: every action receives a session-bound correlation identifier propagated to all downstream calls; the audit plane is the regulator-facing record.
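The attribution requirement can be illustrated with a small audit log in which every record carries the correlation identifier and the agent + user pair. The names `AuditLog`, `record`, `verify`, and `new_correlation_id` are hypothetical, and the hash chain is an assumed way to make tampering detectable; it is not a substitute for write-once storage.

```python
import hashlib
import json
import uuid

GENESIS = "0" * 64  # placeholder hash preceding the first record


def new_correlation_id() -> str:
    """One identifier per session, propagated to every downstream call."""
    return uuid.uuid4().hex


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """Hypothetical append-only log; each record commits to the previous one."""

    def __init__(self) -> None:
        self.records: list = []
        self._prev = GENESIS

    def record(self, correlation_id: str, sub: str, act: str,
               action: str, decision: str) -> dict:
        body = {"correlation_id": correlation_id, "sub": sub, "act": act,
                "action": action, "decision": decision, "prev": self._prev}
        entry = dict(body, hash=_digest(body))
        self.records.append(entry)
        self._prev = entry["hash"]
        return entry

    def verify(self) -> bool:
        """Recompute the chain; an edited or reordered record breaks it."""
        prev = GENESIS
        for entry in self.records:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

Denied decisions are recorded alongside allowed ones, so the log shows what the agent attempted as well as what it did.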

S/06 · What to build next

Most enterprises have one or more agents in production today, sitting behind an API key or a service-account credential that was the most convenient thing available. Re-architecting toward the reference above is not a single project; it is a sequence:

NXR · Architecture Note

The honest test of agent Zero Trust is not whether the model can refuse a bad instruction. It is whether the architecture would still deny the action if the model said yes. Build for the second case.

Nexora's Zero Trust Architecture Blueprint covers the foundational 800-207 patterns; the agent extension above is the active research direction that builds on it. The AI Governance Framework wraps the operational governance (intake, approval, oversight, evidence) around the architecture.

References & further reading

  1. NIST SP 800-207, "Zero Trust Architecture," August 2020. nvlpubs.nist.gov/.../NIST.SP.800-207.pdf
  2. OWASP, "Top 10 for LLM Applications," LLM01: Prompt Injection. genai.owasp.org/llm-top-10
  3. Greshake et al., "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection," arXiv:2302.12173. arxiv.org/abs/2302.12173
  4. IETF, RFC 8693, "OAuth 2.0 Token Exchange," January 2020. datatracker.ietf.org/doc/html/rfc8693
  5. Miller, "Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control," 2006. erights.org/talks/thesis/markm-thesis.pdf
  6. Anthropic, "Introducing the Model Context Protocol," 2024. anthropic.com/news/model-context-protocol
  7. NIST AI 100-1, "Artificial Intelligence Risk Management Framework," January 2023. nist.gov/itl/ai-risk-management-framework
