Jonathan Haas

Building the HTTP for Agents: A Complete Guide to Agent Infrastructure

May 3, 2025 · 4 min read

Autonomous agents need the same infrastructure primitives that web services got a decade ago: identity, policy, and secrets as first-class citizens.

#ai #ai-agents #infrastructure #engineering #technical #framework #developer-tools

Autonomous agents are parsing docs, calling APIs, and triaging support tickets in production. The AI layer is getting smarter. The infrastructure around it is not keeping up.

Every team hand-rolls identity flows, secrets management, and policy enforcement. Every new agent is a snowflake. This is the same fragmentation web services faced a decade ago, before HTTP, OAuth, JWTs, and service meshes standardized the plumbing.

Agents need the same treatment.

The Problem

Agent infrastructure today has four structural failures.

Identity is an afterthought. Most agents run with static API keys. No fine-grained access control, no credential rotation, no way to distinguish one agent's actions from another's.

Isolation is poor. Agents share environments and credentials. Attributing an action to a specific agent -- or containing blast radius -- is nearly impossible.

Permissions are binary. Full access or nothing. No contextual authorization based on task, team, or environment.

Auditability is limited. Tracing what happened, which agent did it, and under what authority requires stitching together logs that were never designed to correlate.

The Architecture

The solution borrows from the service mesh pattern. An agent interacts with the outside world only through a local Envoy sidecar. The sidecar enforces policy, manages identity, and proxies requests to upstream services. Three control plane components make it work.

Identity: Auth0 or SPIFFE

Every agent authenticates via OAuth2 Client Credentials flow, receiving a short-lived JWT with identity, permissions, team, and environment encoded as claims. For Kubernetes-native environments, SPIFFE provides cryptographically verifiable workload identities via mTLS.

The key design choice: agent identity should be as rich as human identity. Not just "who are you" but "what team, what environment, what type of work."
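To make those claims concrete, here is a minimal sketch of reading identity claims out of the JWT an agent receives. It decodes the payload segment only, without signature verification (verification belongs to the sidecar); the claim names (`team`, `env`, `scope`) are illustrative assumptions, not a fixed schema.

```python
import base64
import json

def decode_claims(jwt_token: str) -> dict:
    """Decode the payload segment of a JWT. No signature check --
    that is the Envoy sidecar's job, not the agent's."""
    payload_b64 = jwt_token.split(".")[1]
    # Restore the base64url padding that JWT encoding strips
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Hypothetical payload: identity plus team, environment, and scope claims
claims = {"sub": "support-triage-agent", "team": "support",
          "env": "production", "scope": "tickets:read"}
payload = (base64.urlsafe_b64encode(json.dumps(claims).encode())
           .decode().rstrip("="))
token = f"header.{payload}.signature"

print(decode_claims(token)["team"])  # -> support
```

The richer the claims, the more a downstream policy engine can decide without extra lookups.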

Policy: OPA + Rego

Open Policy Agent evaluates authorization as a sidecar query. The Envoy sidecar extracts JWT claims, builds an input document with request context and resource metadata, and queries OPA before forwarding any request.

This enables rules like: support agents can access customer data only during business hours, only for their own team's customers, only with read permissions. Declarative. Testable. Version-controlled. No hardcoded conditionals in application code.
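The Rego rule itself lives in the version-controlled policy bundle; as a plain-Python sketch of the same flow, the sidecar assembles an input document from JWT claims and request context, then evaluates the rule against it. Field names and the business-hours window here are assumptions for illustration.

```python
from datetime import datetime, timezone

def build_opa_input(claims: dict, method: str, path: str,
                    resource: dict) -> dict:
    """Assemble the input document the sidecar would POST to OPA's
    decision endpoint. The document shape is illustrative."""
    return {
        "subject": {"id": claims["sub"], "team": claims.get("team"),
                    "env": claims.get("env")},
        "request": {"method": method, "path": path,
                    "time": datetime.now(timezone.utc).isoformat()},
        "resource": resource,
    }

def allow(inp: dict) -> bool:
    """Plain-Python rendering of the example rule: support agents may
    read customer data owned by their own team, during business
    hours (09:00-17:00 UTC, an assumed window)."""
    hour = datetime.fromisoformat(inp["request"]["time"]).hour
    return (inp["subject"]["team"] == "support"
            and inp["request"]["method"] == "GET"
            and inp["resource"].get("owner_team") == inp["subject"]["team"]
            and 9 <= hour < 17)
```

In production this logic would be expressed once in Rego, unit-tested with `opa test`, and queried over OPA's REST API rather than reimplemented per service.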

Secrets: Vault

Vault authenticates agents using their JWT, maps them to identity entities with metadata, and issues short-lived credentials. A Vault Agent sidecar handles token renewal and secret rotation automatically -- the application reads from a file.

Dynamic database credentials are the strongest feature. Instead of long-lived connection strings, each agent gets a unique Postgres user with a 5-minute TTL and read-only grants scoped to its function. When the lease expires, Vault creates fresh credentials and drops the old user.
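Because the Vault Agent sidecar renders credentials to a file, the application side stays trivial. A sketch, assuming a JSON template output at a path of your choosing (both the path and the field names depend on your Vault Agent template configuration):

```python
import json
from pathlib import Path

def read_db_creds(path: str = "/vault/secrets/db-creds.json") -> dict:
    """Read the short-lived Postgres credentials the Vault Agent
    sidecar rendered. Re-read on each connection attempt: every
    lease yields a fresh unique user, so never cache past the TTL."""
    creds = json.loads(Path(path).read_text())
    return {"user": creds["username"], "password": creds["password"]}
```

The agent never sees a Vault token or a long-lived connection string; rotation is invisible to application code.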

The SDK

A Python SDK abstracts the complexity:

agent = MCPAgent(
    client_id="support-triage-agent",
    proxy_url="http://127.0.0.1:15000"
)

# All requests route through the local Envoy sidecar
# Identity, policy, and secrets handled transparently
response = agent.invoke_skill("sentiment-analysis", {
    "text": ticket.description
})

The agent uses an mcp:// URL scheme as a local routing convention -- not a new protocol. The sidecar matches the prefix, rewrites the path, selects the upstream cluster by service name, and routes the request. The agent only needs to know the logical name of the service.
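The rewrite the sidecar performs can be mirrored in a few lines. This is a sketch of the convention, not Envoy configuration; the cluster addresses are placeholders:

```python
from urllib.parse import urlparse

# Logical service name -> upstream cluster (addresses are assumptions)
CLUSTERS = {"sentiment-analysis": "http://sentiment.internal:8080"}

def rewrite_mcp_url(url: str) -> str:
    """mcp://<service>/<path> maps to the upstream cluster registered
    under the logical service name -- a routing convention, not a
    new protocol."""
    parsed = urlparse(url)
    if parsed.scheme != "mcp":
        raise ValueError("not an mcp:// URL")
    upstream = CLUSTERS[parsed.netloc]
    return f"{upstream}{parsed.path or '/'}"

print(rewrite_mcp_url("mcp://sentiment-analysis/v1/score"))
# -> http://sentiment.internal:8080/v1/score
```

Keeping the mapping in the sidecar means upstream addresses can change without touching agent code.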

What This Gets You

Every agent action is attributable to a specific, authenticated identity. Authorization is fine-grained based on identity, team, environment, and time-of-day. No credentials are hardcoded; secrets are short-lived and dynamically injected. Audit trails centralize across Envoy access logs, OPA decision logs, and Vault audit logs. New agents spin up in minutes instead of weeks.

Open Questions

Several hard problems remain. LLM input/output policy -- preventing agents from sending PII to external models or generating harmful content -- likely requires enforcement within the agent's LLM interaction loop, not just at the network boundary. Agent interactions form complex delegation graphs that may exceed standard RBAC/ABAC models. And the developer experience is still heavy: bootstrapping a new agent requires understanding Terraform, Rego, and Vault configuration.

The infrastructure patterns are well-understood from the microservices era. The challenge is adapting them to autonomous agents, where actions are less predictable, permissions need to be more dynamic, and the blast radius of a misconfigured policy is larger.
