AGF

Platform Profile

Deployment modes, environment architecture, MCP integration, and cost of governance.

For platform engineers, infrastructure architects, and DevOps/MLOps teams responsible for building, deploying, and scaling the infrastructure that autonomous AI agents run on.

The key question this profile answers: How do I build and deploy governed agent infrastructure?

Scope boundary: This profile covers build-time and deployment-time infrastructure. Runtime operations (monitoring, incident response, observability) belong to the Observability Profile. The split mirrors Platform Engineering vs. SRE.

The Platform Challenge

Building infrastructure for governed agentic systems is fundamentally different from building infrastructure for traditional applications. Traditional infrastructure serves deterministic software — the application does what it's told, every time. Agentic infrastructure serves non-deterministic, autonomous systems that select tools, modify their own behavior, and take actions their developers never explicitly programmed.

The platform challenge has three dimensions:

  1. Governance must be structural, not bolted on. You can't add governance to an agentic system after the fact — it must be built into the infrastructure from the start. The rings, the verification layers, the gates, the containment mechanisms — these are infrastructure, not application features.

  2. The topology must match the system type. A document processing pipeline and a conversational agent have completely different latency requirements, governance patterns, and failure modes. The infrastructure must adapt — the same logical governance deployed in different physical topologies.

  3. The agent's operating environment is infrastructure. Context composition, instruction management, tool provisioning, workspace scoping, session state — these are infrastructure concerns that determine agent performance as much as compute and networking.

Ring Deployment Modes

The Rings Model is a logical architecture. How the rings manifest physically depends on the system type, latency budget, and governance requirements. Three modes, each with different tradeoffs:

Ring Deployment Modes — Wrapper, Middleware/Interrupt, Graph-Embedded

Wrapper Mode

The rings literally wrap execution. Sequential, concentric — Ring 0 produces, Ring 1 verifies, Ring 2 governs, output releases.

Ring 0: Produce output
──── checkpoint ────
Ring 1: Verify (loop until converge)
──── checkpoint ────
Ring 2: Evaluate policy, gate if required
──── checkpoint ────
Output released
Ring 3: Learn (async)
Best for: Batch pipelines, document processing, assessment workflows, regulatory filings
Real-world examples: AI risk assessment pipelines, automated report generation, code review pipelines
Latency: Seconds to hours — the full sequential pass adds wall-clock time proportional to verification complexity
Audit clarity: Highest — each stage boundary is a clean cut in the provenance chain
Human oversight: Easiest — gates pause cleanly, reviewers see complete context
Reproducibility: Highest — same inputs + configuration = same trace
Tradeoff: Latency. For user-facing agents, the sequential pass may be unacceptable.
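The sequential pass can be sketched in a few lines. The ring callables, the `Verdict` type, and the revision budget here are illustrative assumptions, not a prescribed AGF API:

```python
from dataclasses import dataclass, field

@dataclass
class Verdict:
    kind: str                          # "PASS" | "REVISE" | "HALT" | "GATE"
    findings: list = field(default_factory=list)

def run_wrapper(task, produce, verify, govern, gate, max_revisions=3):
    """Wrapper mode: Ring 0 produces, Ring 1 verifies in a loop,
    Ring 2 evaluates policy and may pause for a human gate."""
    output = produce(task, findings=[])               # Ring 0: produce output
    for _ in range(max_revisions):                    # Ring 1: loop until converge
        v = verify(output)
        if v.kind == "PASS":
            break
        if v.kind == "HALT":
            raise RuntimeError("HALT: output cannot be fixed")
        output = produce(task, findings=v.findings)   # REVISE: retry with findings
    else:
        raise RuntimeError("ERROR: revision budget exhausted")
    v = govern(output)                                # Ring 2: evaluate policy
    if v.kind == "GATE" and not gate(output):         # human authorization
        raise RuntimeError("GATE: reviewer declined release")
    return output                                     # released; Ring 3 learns async
```

A batch report pipeline, for instance, would wire its generator, verifier, and policy engine into the `produce`, `verify`, and `govern` slots.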

Middleware / Interrupt Mode

Ring logic fires at specific decision points within an execution graph — tool calls, data access, state mutations. The agent executes continuously; the rings intercept at defined boundaries.

step 1 → step 2 → [R1: verify tool] → step 3 ─┐
                  [R2: gate — destructive]  ←─┘
                  (human approves)
step 4 → step 5 → [R1: verify output]
step 6 → done

Ring 3: learns from full trace (async)
Security fabric: active at every interrupt boundary
Best for: Coding agents, ops automation, multi-step task agents, infrastructure management
Real-world examples: Claude Code, Cursor, Devin, GitHub Copilot Workspace, CI/CD agents
Latency: Sub-second to seconds per action
Audit clarity: Good — provenance shows which control points triggered and what was decided
Human oversight: Good with constraints — richer context, more domain expertise required
Checkpointing: Checkpoint at each interrupt boundary. The agent must be resumable — pausing mid-execution for a gate requires frozen, persisted, resumable state.
Tradeoff: Interrupt policy design is hard. Too many interrupt points mean constant pausing; too few mean missed consequential actions.

MCP as canonical implementation: The Model Context Protocol materializes middleware/interrupt mode directly — the protocol defines the boundary between agent reasoning and tool execution, making each tool call a natural interrupt point.
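Pausing mid-execution for a gate presupposes resumable state. A minimal checkpoint/resume sketch at an interrupt boundary, assuming JSON-serializable agent state (the state field names are hypothetical):

```python
import json
import pathlib
import tempfile

def checkpoint(state: dict, path: pathlib.Path) -> None:
    """Freeze agent state at an interrupt boundary so a gate can pause
    execution and the agent can be resumed after human review."""
    path.write_text(json.dumps(state))

def resume(path: pathlib.Path) -> dict:
    """Thaw persisted state and continue from the interrupted step."""
    return json.loads(path.read_text())

# A gate fires before a destructive action: persist, pause, resume later.
ckpt = pathlib.Path(tempfile.mkdtemp()) / "agent.ckpt"
checkpoint({"step": 3, "pending": "db.drop_table", "history": ["s1", "s2"]}, ckpt)
state = resume(ckpt)
```

Real agents would also persist tool-call results and working memory, and version the checkpoint format alongside the instruction architecture.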

Graph-Embedded Mode

Verification, governance, and security run concurrently with execution as peer nodes in the orchestration graph.

Best for: Conversational agents, voice assistants, real-time systems, agent swarms
Real-world examples: ChatGPT-style agents, voice assistants, real-time recommendation engines, trading agents
Latency: Milliseconds (user-perceived)
Audit clarity: Lowest — concurrent execution produces a partial order, not a total order
Human oversight: Hardest — speculative execution means the agent has "moved on" by the time a gate fires
Reproducibility: Lowest — concurrency introduces timing-dependent behavior
Tradeoff: Governance clarity is traded for latency. Systems subject to regulatory audit should strongly consider wrapper or middleware mode.

Speculative execution bounds (Informed proposal):

  • Depth limit: 3–4 levels of speculative chaining. Governance overhead grows super-linearly beyond depth 4.
  • Entropy constraint: If historical rejection rate for an action class exceeds ~20%, exclude from speculation and process sequentially.
  • Side-effect fence: Speculative steps that produce irreversible side effects are held in a commit buffer until the governance release gate clears.
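The three bounds can be encoded in a small scheduling check. The thresholds mirror the proposal above; the function shape and action-class strings are illustrative:

```python
MAX_SPEC_DEPTH = 4        # depth limit: overhead grows super-linearly beyond this
REJECT_THRESHOLD = 0.20   # entropy constraint: historical rejection-rate cutoff

def schedule(action_class: str, depth: int,
             rejection_rates: dict, irreversible: set) -> str:
    """Decide how a candidate step runs: 'speculate', 'sequential', or
    'buffer' (held in the commit buffer until the release gate clears)."""
    if depth >= MAX_SPEC_DEPTH:
        return "sequential"                          # depth limit
    if rejection_rates.get(action_class, 0.0) > REJECT_THRESHOLD:
        return "sequential"                          # entropy constraint
    if action_class in irreversible:
        return "buffer"                              # side-effect fence
    return "speculate"
```

For example, an action class with a 35% historical rejection rate drops out of speculation entirely, while an irreversible action may still be speculated but is fenced in the commit buffer.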

Hybrid Deployment

Systems are not required to use a single mode. The common pattern: middleware mode overall with graph-embedded subsections. The coding agent operates in middleware mode (interrupt-driven), but within a single user-facing response, the generation pipeline uses graph-embedded mode (parallel verification of streamed output).

Mode Selection Matrix

Choose the deployment mode based on system characteristics. When multiple modes could work, prefer the one with stronger governance properties unless latency requirements force otherwise.

| System characteristic | Wrapper | Middleware | Graph-Embedded |
| --- | --- | --- | --- |
| Output type | Discrete artifact (document, report, assessment) | Sequence of actions (tool calls, mutations, operations) | Continuous stream (conversation, real-time feed) |
| Latency tolerance | Seconds to hours | Sub-second to seconds per action | Milliseconds (user-perceived) |
| Governance intensity | High — every output fully reviewed | Selective — consequential actions trigger rings | Minimal blocking — most output auto-passes |
| Human gate frequency | High — frequent pause-and-review acceptable | Moderate — gates at high-stakes actions only | Low — rare, and disruptive when they fire |
| Regulatory/audit | Strong — clear evidence trail required | Moderate — action-level audit sufficient | Light — behavioral monitoring sufficient |
| Side-effect profile | Contained — output is an artifact | Mixed — many actions, some irreversible | Continuous — streaming output, real-time effects |
| Regulatory jurisdiction | EU AI Act high-risk (Art. 9–15) | Most jurisdictions | Permissive or low-risk classification |
| Rollback/compensation | Simple — discard the artifact | Per-action compensation via Transaction Control (#16) | Complex — speculative execution may have committed partial state |

Decision heuristic: If you're unsure, start with middleware mode. It handles the widest range of use cases and has the strongest protocol ecosystem (MCP).
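The heuristic can be expressed as a first-pass selector. The characteristic encoding and the numeric thresholds below are assumptions for illustration, not part of the matrix:

```python
def select_mode(output_type: str, latency_budget_ms: float,
                audit_trail_required: bool) -> str:
    """First-pass deployment mode selection. Prefer stronger governance
    unless latency forces otherwise; default to middleware when unsure."""
    if output_type == "artifact" and latency_budget_ms >= 1_000:
        return "wrapper"                  # discrete output, latency tolerant
    if (output_type == "stream" and latency_budget_ms < 100
            and not audit_trail_required):
        return "graph-embedded"           # real-time, light audit needs
    return "middleware"                   # widest range, strongest ecosystem
```

A regulatory-filing pipeline with an hours-long budget selects wrapper; a voice assistant with a sub-100 ms budget and no audit mandate selects graph-embedded; everything else lands on middleware.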

Ecosystem reality (March 2026): No single deployment mode dominates. LangGraph uses graph-embedded governance with state machines; CrewAI uses workflow checkpoint governance; OpenAI Agents SDK uses interceptor/middleware guardrails; Amazon Bedrock AgentCore uses external policy enforcement at the gateway layer. MCP is a connectivity protocol, not a governance mechanism — governance layers are built on top of or alongside MCP.

The Agent Environment Stack

Every agent operates within a 5-layer environment. Each layer has its own composition policy, governance intensity, and lifecycle:

┌──────────────────────────────────────────────────┐
│ L5: Session State                     20-30%     │
│ conversation history, tool results, working      │
│ memory, handoff context                          │
│ Ephemeral, session-scoped                        │
├──────────────────────────────────────────────────┤
│ L4: Retrieved Context                 30-40%     │
│ task-specific knowledge, documents, search       │
│ Dynamic, loaded JIT per task                     │
├──────────────────────────────────────────────────┤
│ L3: Capability Set                    10-15%     │
│ active tools, skills, MCP servers, API access    │
│ Provisioned per role, subject to trust level     │
├─ ─ ─ ─ ─ ─ TRUST BOUNDARY ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤
│ L2: Instruction Architecture          10-20%     │
│ system prompts, rules, personas, constraints     │
│ Versioned, tested, slow-changing                 │
├──────────────────────────────────────────────────┤
│ L1: Identity & Policy Substrate       5-10%      │
│ agent identity, ring assignment, governance      │
│ policy, trust level, workspace boundaries        │
│ Foundational                                     │
└──────────────────────────────────────────────────┘
         ▲ Composition flow: bottom-up

Trust boundary: Below L3 (L1–L2) is human-authored, version-controlled, and trusted. Above the boundary (L3–L5) is dynamic, runtime-composed, and treated as untrusted input by the Security Fabric.

Percentages are context budget allocation starting points. The Environment Optimization Loop adjusts based on measured effectiveness.
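As a starting point, the midpoints of the layer shares above can seed a concrete token budget. The numbers are illustrative, and in practice the Environment Optimization Loop would adjust them against measured effectiveness:

```python
# Midpoints of the starting-point shares from the stack above.
LAYER_SHARES = {
    "L1_identity_policy": 0.075,   # 5–10%
    "L2_instructions":    0.15,    # 10–20%
    "L3_capabilities":    0.125,   # 10–15%
    "L4_retrieved":       0.35,    # 30–40%
    "L5_session":         0.25,    # 20–30%
}                                  # ~5% left unallocated as headroom

def compose_budget(context_window_tokens: int) -> dict:
    """Allocate the context window across the five layers, bottom-up."""
    return {layer: int(context_window_tokens * share)
            for layer, share in LAYER_SHARES.items()}
```

For a 128k-token window this gives retrieved context the largest slice (~44.8k tokens) and the identity/policy substrate the smallest.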

The Composability Interface

AGF defines a standard contract for how agents expose themselves to the ring stack. Every governed agent implements this interface:

Ring Interface and Composability — standard contract between rings

Signal set — what rings return to each other:

| Signal | Meaning | Who receives |
| --- | --- | --- |
| PASS | Output meets criteria | Ring 2 / Release |
| REVISE | Output needs change — structured finding attached | Ring 0 (retry) |
| HALT | Output cannot be fixed — stop execution | Governance |
| GATE | Output meets quality criteria but requires human authorization | Human reviewer |
| ERROR | Verification process itself failed | Orchestrator |

Execution budgets: Every ring boundary carries a budget — maximum iterations, time, compute, API calls. Budget exhaustion triggers graceful halt. Without budgets, validation loops can spin indefinitely.

Delegation signals: The DELEGATE signal passes authority to a sub-agent with a bounded scope — the delegating agent's authority cannot be exceeded, only narrowed. Delegation chains are cryptographically bound and auditable.
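A minimal rendering of the contract — the signal set, a per-boundary budget, and narrowing-only delegation. The types are a sketch, not the normative AGF interface, and the cryptographic binding of delegation chains is elided:

```python
from dataclasses import dataclass
from enum import Enum

class Signal(Enum):
    PASS = "pass"      # output meets criteria
    REVISE = "revise"  # structured finding attached; Ring 0 retries
    HALT = "halt"      # unfixable; stop execution
    GATE = "gate"      # quality OK, human authorization required
    ERROR = "error"    # verification process itself failed

@dataclass(frozen=True)
class Budget:
    max_iterations: int   # verification loop cap
    max_seconds: float    # wall-clock cap
    max_api_calls: int    # external call cap; exhaustion => graceful halt

def delegate(parent_scope: frozenset, requested: frozenset) -> frozenset:
    """Pass authority to a sub-agent: scope may only narrow, never widen."""
    if not requested <= parent_scope:
        raise PermissionError("delegation would exceed delegator's authority")
    return requested
```

The subset check is the whole point: a sub-agent's scope is always a subset of its delegator's, so authority can only shrink along a delegation chain.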

MCP Integration Patterns

The Model Context Protocol (MCP) is the current dominant protocol for tool integration. AGF's governance layer sits on top of MCP, not inside it.

MCP as the interrupt boundary: Each MCP tool call is a natural interrupt point. The governance layer:

  1. Intercepts the tool call intent before execution
  2. Evaluates against policy (is this tool permitted? with these parameters? at this trust level?)
  3. Either passes, modifies, or blocks
  4. Records the decision in the event stream
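The four steps map onto a thin wrapper around the client's tool invocation. The policy callback and event-record shape here are assumptions for illustration, not MCP specifics:

```python
from datetime import datetime, timezone

def governed_call(tool_call: dict, policy, execute, event_stream: list):
    """Intercept a tool call, evaluate policy, pass/modify/block,
    and record the decision before anything executes."""
    decision, modified = policy(tool_call)             # 2. evaluate against policy
    event_stream.append({                              # 4. record the decision
        "at": datetime.now(timezone.utc).isoformat(),
        "tool": tool_call.get("name"),
        "decision": decision,
    })
    if decision == "block":                            # 3. block
        return None
    return execute(modified or tool_call)              # 1./3. pass (or modified)
```

Note that the decision is recorded even when the call is blocked — the event stream is the audit trail, so it must capture denials as well as executions.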

MCP server trust tiers:

| Tier | Source | Trust level | Governance |
| --- | --- | --- | --- |
| Tier 1 — Organizational | Internal, first-party | High | Streamlined approval |
| Tier 2 — Verified | Known vendor, audited schema | Medium | Policy evaluation per call |
| Tier 3 — Community | Public registry, unverified | Low | Sandboxed, mandatory gate on first use |
| Tier 4 — Dynamic | Runtime discovery | Untrusted | Blocked by default, explicit allowlisting required |

Supply chain posture: 53% of community MCP servers use insecure static API keys (Astrix Security, 2025). Default stance: Tier 3 and Tier 4 servers require explicit organizational approval before use.

Cost of Governance

Governance is not free. Every ring boundary adds latency, compute, and complexity. Understanding the cost model helps make informed tradeoffs.

Governance Latency Tradeoff — overhead by ring and deployment mode

Latency overhead by deployment mode:

| Mode | Ring 1 overhead | Ring 2 overhead | Gate overhead |
| --- | --- | --- | --- |
| Wrapper | +20–50% per output | +10–30% per output | Minutes (human review) |
| Middleware | +50–500 ms per tool call | +50–200 ms per gate evaluation | Minutes (human review) |
| Graph-Embedded | Near-zero (concurrent) | Near-zero (concurrent) | Disruptive (blocks stream) |
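In middleware mode the overhead composes per action, so a back-of-envelope estimator is straightforward. The defaults below are midpoints of the ranges above and purely illustrative:

```python
def middleware_overhead_ms(tool_calls: int, gate_evals: int,
                           r1_ms: float = 275.0, r2_ms: float = 125.0) -> float:
    """Estimated added latency per request in middleware mode.
    Defaults: midpoints of +50–500 ms (Ring 1) and +50–200 ms (Ring 2)."""
    return tool_calls * r1_ms + gate_evals * r2_ms
```

A request with five tool calls and one gate evaluation adds roughly 1.5 seconds — a useful sanity check against the latency budget before selecting a mode.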

Cost reduction strategies:

  • Trust Ladders (#11): Higher-trust agents pass verification faster (fewer iterations, lighter checks).
  • Risk-based ring activation: Not every execution triggers all rings. Risk classification determines which rings activate at what intensity.
  • Async Ring 3: Learning and analysis run asynchronously — no latency impact on the critical path.
  • Adaptive gates: Mandatory only for classified irreversible actions. Routine operations auto-pass.
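Risk-based activation often reduces to a lookup from risk class to ring configuration. The classes, intensities, and field names below are placeholders for illustration:

```python
# Hypothetical risk classes -> which rings activate, at what intensity.
RING_ACTIVATION = {
    "low":      {"ring1": "light", "ring2": "skip",   "gate": False},
    "standard": {"ring1": "full",  "ring2": "policy", "gate": False},
    "high":     {"ring1": "full",  "ring2": "policy", "gate": True},  # irreversible actions
}

def rings_for(risk_class: str) -> dict:
    """Look up the ring configuration for a classified execution."""
    return RING_ACTIVATION[risk_class]
```

Ring 3 is absent from the table deliberately: it runs asynchronously on every execution and never sits on the critical path.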

The governance dividend: Governance costs are offset by reduced failure costs — avoided regulatory penalties, reduced remediation, lower breach costs, and trust that enables higher-stakes automation.

Infrastructure Checklist

Deployment mode selection:

  • System type mapped to deployment mode (wrapper / middleware / graph-embedded)
  • Latency budget documented and mode selection justified
  • Regulatory jurisdiction requirements checked against mode properties

Agent environment:

  • 5-layer stack composed (L1–L5) with trust boundary respected
  • Versioned system prompts and instructions in source control
  • Tool provisioning scoped per role and trust level
  • Workspace boundaries enforced (tenant isolation)

MCP integration:

  • MCP server inventory documented with trust tier assignments
  • Static API keys eliminated from Tier 2–4 servers
  • Dynamic discovery blocked by default

Composability interface:

  • Signal set implemented (PASS/REVISE/HALT/GATE/ERROR)
  • Execution budgets defined per ring boundary
  • Delegation chain bounds enforced

Cost of governance:

  • Latency overhead measured per ring per deployment mode
  • Trust Ladders configured for high-volume, proven agents
  • Async Ring 3 confirmed off critical path

Related: Security Profile — security architecture and threat defense. Observability Profile — runtime monitoring and incident response. AI Engineering Profile — which primitives to implement first.
