Every time AI agents come up in a security conversation, someone reaches for a policy template. The result is usually 40 pages of requirements that nobody reads, mapped to controls that nobody enforces, attached to a risk register that nobody updates.
Real governance doesn't work that way. It works because specific mechanisms catch specific failures before they become incidents. Everything else is theater.
So what's the minimum viable governance stack for AI agents? Here's what actually matters.
Layer 1: Command Visibility
You cannot govern what you cannot see. The first requirement is a complete record of every command your agents attempt to run — not just what they ran successfully, but what they tried, what was blocked, and what was approved.
Most teams believe they have this. They don't. What they have:
- Application logs from the agent framework
- System logs from the destination server
- Maybe a transcript of the LLM conversation
What they're missing: the pre-execution layer. The moment a command is formulated and before it runs. That gap is where the interesting failures live — commands that looked fine to the LLM but should have triggered review, commands that executed successfully but were anomalous, commands that were never logged because the agent bypassed the expected path.
Minimal requirement: every command that touches production infrastructure must pass through a single interception point that logs it immutably before execution. No exceptions, no bypass paths.
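As a minimal sketch of that interception point — the function names, file path, and hash-chaining scheme here are illustrative assumptions, not a prescribed design — the key property is that the log record is written before anything executes, and each record chains to the previous one so tampering or truncation is detectable:

```python
import hashlib
import json
import time

AUDIT_LOG = "audit.jsonl"  # in practice: append-only / WORM storage


def log_attempt(command: str, agent_id: str, prev_hash: str) -> str:
    """Append a hash-chained record BEFORE execution, so even blocked
    or crashed commands leave a trace."""
    record = {
        "ts": time.time(),
        "agent": agent_id,
        "command": command,
        "prev": prev_hash,
    }
    body = json.dumps(record, sort_keys=True)
    record_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps({**record, "hash": record_hash}) + "\n")
    return record_hash


def run_command(command: str, agent_id: str, prev_hash: str) -> str:
    # Single interception point: nothing executes without a log entry.
    new_hash = log_attempt(command, agent_id, prev_hash)
    # ... dispatch to the actual executor only after the record is durable ...
    return new_hash
```

The design choice that matters is the ordering: the write to the audit log happens before dispatch, so a command that crashes the agent mid-flight still appears in the record.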
Layer 2: Tiered Approval Policy
Not every command needs a human. That's the trap that kills most governance initiatives — routing everything through review creates fatigue, and fatigued reviewers approve everything.
The working model is three tiers:
| Tier | Example Commands | Disposition |
|---|---|---|
| Auto-allow | git status, kubectl get pods, read-only ops | Execute immediately, log it |
| Human review | File writes, deployments, config changes, API calls with side effects | Queue for approval, timeout to deny |
| Auto-deny | rm -rf, credential exports, firewall changes, bulk deletes | Block, alert, log with full context |
The key insight is that the whitelist for tier 1 should be small and explicit, not large and permissive. Start strict and expand based on evidence, not assumption.
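A classifier for these three tiers can be sketched in a few lines. The specific allowlist entries and deny patterns below are placeholder assumptions; the structural points are that the allowlist is small and exact, deny patterns win over everything else, and anything unrecognized defaults to human review rather than execution:

```python
import re
from enum import Enum


class Tier(Enum):
    AUTO_ALLOW = "auto_allow"
    HUMAN_REVIEW = "human_review"
    AUTO_DENY = "auto_deny"


# Small, explicit allowlist: exact commands plus safe read-only prefixes.
ALLOW = {"git status", "git log"}
ALLOW_PREFIXES = ("kubectl get ", "kubectl describe ")

# Patterns that are never acceptable, regardless of context.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\b"),                      # recursive force delete
    re.compile(r"\.aws/credentials|vault\s+token"),   # credential exports
    re.compile(r"\biptables\b|\bufw\b"),              # firewall changes
]


def classify(command: str) -> Tier:
    cmd = command.strip()
    if any(p.search(cmd) for p in DENY_PATTERNS):
        return Tier.AUTO_DENY      # block, alert, log with full context
    if cmd in ALLOW or cmd.startswith(ALLOW_PREFIXES):
        return Tier.AUTO_ALLOW     # execute immediately, log it
    return Tier.HUMAN_REVIEW       # default: queue for approval
```

Note that the fallthrough is review, not allow — an unknown command is treated as risky until evidence says otherwise, which is the "start strict and expand" posture in code form.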
Layer 3: Anomaly Awareness
Whitelists handle the known-good and deny rules handle the known-bad. Anomaly detection handles everything in between — commands that aren't explicitly banned but don't fit the pattern of what this agent normally does.
Signals worth tracking:
- Time-of-day deviation: An agent that normally runs during business hours executing a destructive command at 3am deserves a second look
- Command volume spikes: 10x the normal rate of write operations in a single session
- Target diversity: Agent suddenly touching systems it has never touched before
- Risk score elevation: A session whose average risk score has shifted upward over the past hour
Anomaly detection isn't a substitute for command-level controls. It's an early warning system that something has changed — the model's behavior, the prompt it received, or the environment it's operating in.
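The four signals above reduce to comparisons against a per-agent baseline. This is a minimal sketch under assumed thresholds (9-to-18 business hours, a 10x volume multiplier, a 1.5x risk-drift ratio) — real thresholds should come from your own traffic, not from this example:

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Baseline:
    write_counts: list   # per-session write-op counts from recent history
    known_targets: set   # systems this agent has touched before
    risk_scores: list    # recent per-command risk scores, oldest first


def anomaly_flags(b: Baseline, hour: int, writes: int, target: str) -> list:
    flags = []
    if not 9 <= hour < 18:                              # assumed business hours
        flags.append("off_hours")
    if b.write_counts and writes > 10 * mean(b.write_counts):
        flags.append("volume_spike")                    # 10x normal write rate
    if target not in b.known_targets:
        flags.append("new_target")
    half = len(b.risk_scores) // 2
    if half and mean(b.risk_scores[half:]) > 1.5 * mean(b.risk_scores[:half]):
        flags.append("risk_drift")                      # scores shifting upward
    return flags
```

Each flag is a reason to look, not a reason to block — consistent with the point that this layer is early warning, not enforcement.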
Layer 4: Identity and Attribution
"The agent did it" is not accountability. Every agent action needs to trace back to:
- Which agent instance (not just which type)
- Which session or job triggered it
- Which human or system initiated that session
- What authorization that human or system had at the time
This matters for incident response — when something goes wrong, you need to know if it was one rogue prompt, a systematic misconfiguration, or a compromised credential. Without attribution, you're guessing.
Practical implementation:
- Each agent session gets a unique ID that carries through all logs
- The initiating user or service account is recorded at session creation
- RBAC governs which agents can request which command categories
- Elevated actions (anything in tier 2 or 3) log the reviewer identity alongside the agent identity
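The attribution chain above can be carried in a small immutable context object created once per session. The field and function names here are illustrative assumptions; the point is that every log line inherits the session ID and initiator, and elevated actions add the reviewer:

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionContext:
    session_id: str        # unique ID carried through all logs
    agent_instance: str    # which instance, not just which agent type
    initiator: str         # human or service account that started the session
    roles: frozenset       # authorization captured at session creation


def new_session(agent_instance: str, initiator: str, roles) -> SessionContext:
    return SessionContext(str(uuid.uuid4()), agent_instance,
                          initiator, frozenset(roles))


def audit_line(ctx: SessionContext, command: str,
               reviewer: Optional[str] = None) -> dict:
    """One log line; elevated actions also record the reviewer identity."""
    line = {"session_id": ctx.session_id, "agent": ctx.agent_instance,
            "initiator": ctx.initiator, "command": command}
    if reviewer is not None:
        line["reviewer"] = reviewer
    return line
```

Freezing the dataclass is deliberate: attribution fields should be set once at session creation and never mutated mid-session.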
Layer 5: Time-Bounded Access
Persistent, indefinite agent credentials are a liability. Best practice is short-lived access tokens that expire at session end (or on a timer), with explicit re-authorization for sensitive operations.
The pattern:
- Session credentials valid for N minutes or until session terminates
- No long-lived keys embedded in agent config or environment variables
- Credentials fetched from a secrets manager at session start, not at deploy time
- Audit log captures credential issuance and expiry alongside commands
This also limits blast radius when something goes wrong. A compromised 15-minute token is a different incident category than a compromised permanent key.
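A sketch of the issuance-and-expiry pattern, with a 15-minute TTL assumed for illustration — in a real deployment the token value would come from a secrets manager, and issuance would be written to the same audit log as commands:

```python
import secrets
import time

TTL_SECONDS = 15 * 60  # assumed 15-minute credential lifetime


def issue_token(session_id: str, now: float = None) -> dict:
    """Issue a session-scoped credential with a hard expiry."""
    now = time.time() if now is None else now
    return {"value": secrets.token_urlsafe(32),
            "session_id": session_id,
            "issued_at": now,
            "expires_at": now + TTL_SECONDS}


def is_valid(token: dict, now: float = None) -> bool:
    # Expiry is checked at use time; a leaked token dies on its own.
    now = time.time() if now is None else now
    return now < token["expires_at"]
```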
Layer 6: Rollback Capability
Governance isn't just about preventing bad things — it's about recovering from them. The minimum rollback capability for AI agents:
- Every destructive operation preceded by a snapshot or backup (automated, not optional)
- The audit log is the ground truth for "what exactly happened and in what order"
- A defined procedure for "undo the last N commands from session X"
- Someone is responsible for executing that procedure — it's not assumed to be automatic
The audit log here is doing double duty: it's the compliance record and the recovery runbook. That only works if it's append-only, complete, and includes enough context to reconstruct state.
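The "undo the last N commands from session X" procedure can be derived directly from the audit log. This sketch assumes hypothetical command kinds and inverse-operation names; the structural points are that the plan runs in reverse order, leans on the pre-operation snapshots, and surfaces anything it can't invert for manual handling:

```python
# Hypothetical inverse operations per command kind; anything without a
# known inverse is flagged for manual recovery from the pre-op snapshot.
INVERSES = {
    "file_write": "restore_file_from_snapshot",
    "deploy": "rollback_deploy",
    "config_change": "restore_config_snapshot",
}


def undo_plan(audit_log: list, session_id: str, n: int) -> list:
    """Build the undo plan for the last n commands of one session, in
    reverse order. A named human executes it; nothing here is automatic."""
    ops = [e for e in audit_log if e["session_id"] == session_id]
    plan = []
    for entry in reversed(ops[-n:]):
        plan.append({
            "undo": INVERSES.get(entry["kind"], "MANUAL_REVIEW"),
            "target": entry["target"],
            "snapshot": entry.get("snapshot"),  # taken before execution
        })
    return plan
```

This only works if the log captured enough context per entry (kind, target, snapshot reference) — which is exactly why the audit log doubles as the recovery runbook.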
What to Leave Out
The governance stack above is deliberately lean. Here's what most policy documents include that doesn't actually help at the operational level:
- Lengthy approval workflows for low-risk commands: If it takes 24 hours to approve a read-only diagnostic, the agent is useless and people will find workarounds
- Generic "AI ethics" policies: Valuable in context, useless as operational control
- Manual log review without automation: Human review of thousands of command logs is security theater; anomaly detection and alerting are what catch real issues
- Separate governance for each agent: One interception layer, one audit log, one policy framework — multiple agents, same stack
The Test: Can You Answer These Questions?
Here's how to know if your governance stack is actually working. After any given agent session, you should be able to answer:
- What commands did the agent run, and in what order?
- Which commands were auto-allowed, which were reviewed, which were blocked?
- Who initiated the session, and what were their permissions?
- Were there any anomalies (time, volume, target, risk score) that triggered alerts?
- If something went wrong, what's the recovery path and who's responsible?
If you can answer all five in under five minutes, your governance stack is functional. If any of them requires manual log archaeology, that's the gap to fix first.
Putting It Together
The minimal viable governance stack isn't complicated. Six layers:
- Command visibility — immutable log of every attempted action
- Tiered approval — auto-allow the safe, review the risky, block the dangerous
- Anomaly awareness — catch behavioral shifts before they become incidents
- Identity and attribution — trace every action to a human chain of responsibility
- Time-bounded access — short-lived credentials, no persistent exposure
- Rollback capability — defined recovery path, not assumed automatic
What makes this stack minimal isn't that it's easy to build — it's that each layer is doing real work that the others can't substitute for. Remove any one of them and you have a genuine gap. Add more layers without filling these gaps and you have paperwork.
Start with visibility. Everything else depends on it.
Build the stack, not the policy doc
Expacti gives you command-level visibility, tiered approval, anomaly detection, full attribution, and audit-ready logs — without building it yourself.
Try the interactive demo · Start free