Every time AI agents come up in a security conversation, someone reaches for a policy template. The result is usually 40 pages of requirements that nobody reads, mapped to controls that nobody enforces, attached to a risk register that nobody updates.
Real governance doesn't work that way. It works because specific mechanisms catch specific failures before they become incidents. Everything else is theater.
So what's the minimum viable governance stack for AI agents? Here's what actually matters.
Layer 1: Command Visibility
You cannot govern what you cannot see. The first requirement is a complete record of every command your agents attempt to run — not just what they ran successfully, but what they tried, what was blocked, and what was approved.
Most teams believe they have this. They don't. What they have:
- Application logs from the agent framework
- System logs from the destination server
- Maybe a transcript of the LLM conversation
What they're missing: the pre-execution layer. The moment a command is formulated and before it runs. That gap is where the interesting failures live — commands that looked fine to the LLM but should have triggered review, commands that executed successfully but were anomalous, commands that were never logged because the agent bypassed the expected path.
Minimal requirement: every command that touches production infrastructure must pass through a single interception point that logs it immutably before execution. No exceptions, no bypass paths.
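As a minimal sketch of that interception point — the function names, file path, and hash-chaining scheme here are illustrative assumptions, not a prescribed design — the key property is that the log record is written before anything executes, and each record chains to the previous one so tampering or truncation is detectable:

```python
import hashlib
import json
import time

AUDIT_LOG = "audit.jsonl"  # in practice: append-only / WORM storage


def log_attempt(command: str, agent_id: str, prev_hash: str) -> str:
    """Append a hash-chained record BEFORE execution, so even blocked
    or crashed commands leave a trace."""
    record = {
        "ts": time.time(),
        "agent": agent_id,
        "command": command,
        "prev": prev_hash,
    }
    body = json.dumps(record, sort_keys=True)
    record_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps({**record, "hash": record_hash}) + "\n")
    return record_hash


def run_command(command: str, agent_id: str, prev_hash: str) -> str:
    # Single interception point: nothing executes without a log entry.
    new_hash = log_attempt(command, agent_id, prev_hash)
    # ... dispatch to the actual executor only after the record is durable ...
    return new_hash
```

The design choice that matters is the ordering: the write to the audit log happens before dispatch, so a command that crashes the agent mid-flight still appears in the record.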
Layer 2: Tiered Approval Policy
Not every command needs a human. That's the trap that kills most governance initiatives — routing everything through review creates fatigue, and fatigued reviewers approve everything.
The working model is three tiers:
| Tier | Example Commands | Disposition |
|---|---|---|
| Auto-allow | git status, kubectl get pods, read-only ops | Execute immediately, log it |
| Human review | File writes, deployments, config changes, API calls with side effects | Queue for approval, timeout to deny |
| Auto-deny | rm -rf, credential exports, firewall changes, bulk deletes | Block, alert, log with full context |
The key insight is that the whitelist for tier 1 should be small and explicit, not large and permissive. Start strict and expand based on evidence, not assumption.
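A classifier for these three tiers can be sketched in a few lines. The specific allowlist entries and deny patterns below are placeholder assumptions; the structural points are that the allowlist is small and exact, deny patterns win over everything else, and anything unrecognized defaults to human review rather than execution:

```python
import re
from enum import Enum


class Tier(Enum):
    AUTO_ALLOW = "auto_allow"
    HUMAN_REVIEW = "human_review"
    AUTO_DENY = "auto_deny"


# Small, explicit allowlist: exact commands plus safe read-only prefixes.
ALLOW = {"git status", "git log"}
ALLOW_PREFIXES = ("kubectl get ", "kubectl describe ")

# Patterns that are never acceptable, regardless of context.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\b"),                      # recursive force delete
    re.compile(r"\.aws/credentials|vault\s+token"),   # credential exports
    re.compile(r"\biptables\b|\bufw\b"),              # firewall changes
]


def classify(command: str) -> Tier:
    cmd = command.strip()
    if any(p.search(cmd) for p in DENY_PATTERNS):
        return Tier.AUTO_DENY      # block, alert, log with full context
    if cmd in ALLOW or cmd.startswith(ALLOW_PREFIXES):
        return Tier.AUTO_ALLOW     # execute immediately, log it
    return Tier.HUMAN_REVIEW       # default: queue for approval
```

Note that the fallthrough is review, not allow — an unknown command is treated as risky until evidence says otherwise, which is the "start strict and expand" posture in code form.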
Layer 3: Anomaly Awareness
Whitelists handle the known-good and deny rules handle the known-bad. Anomaly detection handles everything in between — commands that aren't explicitly banned but don't fit the pattern of what this agent normally does.
Signals worth tracking:
- Time-of-day deviation: An agent that normally runs during business hours executing a destructive command at 3am deserves a second look
- Command volume spikes: 10x the normal rate of write operations in a single session
- Target diversity: Agent suddenly touching systems it has never touched before
- Risk score elevation: A session whose average risk score has shifted upward over the past hour
Anomaly detection isn't a substitute for command-level controls. It's an early warning system that something has changed — the model's behavior, the prompt it received, or the environment it's operating in.
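The four signals above reduce to comparisons against a per-agent baseline. This is a minimal sketch under assumed thresholds (9-to-18 business hours, a 10x volume multiplier, a 1.5x risk-drift ratio) — real thresholds should come from your own traffic, not from this example:

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Baseline:
    write_counts: list   # per-session write-op counts from recent history
    known_targets: set   # systems this agent has touched before
    risk_scores: list    # recent per-command risk scores, oldest first


def anomaly_flags(b: Baseline, hour: int, writes: int, target: str) -> list:
    flags = []
    if not 9 <= hour < 18:                              # assumed business hours
        flags.append("off_hours")
    if b.write_counts and writes > 10 * mean(b.write_counts):
        flags.append("volume_spike")                    # 10x normal write rate
    if target not in b.known_targets:
        flags.append("new_target")
    half = len(b.risk_scores) // 2
    if half and mean(b.risk_scores[half:]) > 1.5 * mean(b.risk_scores[:half]):
        flags.append("risk_drift")                      # scores shifting upward
    return flags
```

Each flag is a reason to look, not a reason to block — consistent with the point that this layer is early warning, not enforcement.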
Layer 4: Identity and Attribution
"The agent did it" is not accountability. Every agent action needs to trace back to:
- Which agent instance (not just which type)
- Which session or job triggered it
- Which human or system initiated that session
- What authorization that human or system had at the time
This matters for incident response — when something goes wrong, you need to know if it was one rogue prompt, a systematic misconfiguration, or a compromised credential. Without attribution, you're guessing.
Practical implementation:
- Each agent session gets a unique ID that carries through all logs
- The initiating user or service account is recorded at session creation
- RBAC governs which agents can request which command categories
- Elevated actions (anything in tier 2 or 3) log the reviewer identity alongside the agent identity
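The attribution chain above can be carried in a small immutable context object created once per session. The field and function names here are illustrative assumptions; the point is that every log line inherits the session ID and initiator, and elevated actions add the reviewer:

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionContext:
    session_id: str        # unique ID carried through all logs
    agent_instance: str    # which instance, not just which agent type
    initiator: str         # human or service account that started the session
    roles: frozenset       # authorization captured at session creation


def new_session(agent_instance: str, initiator: str, roles) -> SessionContext:
    return SessionContext(str(uuid.uuid4()), agent_instance,
                          initiator, frozenset(roles))


def audit_line(ctx: SessionContext, command: str,
               reviewer: Optional[str] = None) -> dict:
    """One log line; elevated actions also record the reviewer identity."""
    line = {"session_id": ctx.session_id, "agent": ctx.agent_instance,
            "initiator": ctx.initiator, "command": command}
    if reviewer is not None:
        line["reviewer"] = reviewer
    return line
```

Freezing the dataclass is deliberate: attribution fields should be set once at session creation and never mutated mid-session.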
Layer 5: Time-Bounded Access
Persistent, indefinite agent credentials are a liability. Best practice is short-lived access tokens that expire at session end (or on a timer), with explicit re-authorization for sensitive operations.
The pattern:
- Session credentials valid for N minutes or until session terminates
- No long-lived keys embedded in agent config or environment variables
- Credentials fetched from a secrets manager at session start, not at deploy time
- Audit log captures credential issuance and expiry alongside commands
This also limits blast radius when something goes wrong. A compromised 15-minute token is a different incident category than a compromised permanent key.
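A sketch of the issuance-and-expiry pattern, with a 15-minute TTL assumed for illustration — in a real deployment the token value would come from a secrets manager, and issuance would be written to the same audit log as commands:

```python
import secrets
import time

TTL_SECONDS = 15 * 60  # assumed 15-minute credential lifetime


def issue_token(session_id: str, now: float = None) -> dict:
    """Issue a session-scoped credential with a hard expiry."""
    now = time.time() if now is None else now
    return {"value": secrets.token_urlsafe(32),
            "session_id": session_id,
            "issued_at": now,
            "expires_at": now + TTL_SECONDS}


def is_valid(token: dict, now: float = None) -> bool:
    # Expiry is checked at use time; a leaked token dies on its own.
    now = time.time() if now is None else now
    return now < token["expires_at"]
```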
Layer 6: Rollback Capability
Governance isn't just about preventing bad things — it's about recovering from them. The minimum rollback capability for AI agents:
- Every destructive operation preceded by a snapshot or backup (automated, not optional)
- The audit log is the ground truth for "what exactly happened and in what order"
- A defined procedure for "undo the last N commands from session X"
- Someone is responsible for executing that procedure — it's not assumed to be automatic
The audit log here is doing double duty: it's the compliance record and the recovery runbook. That only works if it's append-only, complete, and includes enough context to reconstruct state.
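The "undo the last N commands from session X" procedure can be derived directly from the audit log. This sketch assumes hypothetical command kinds and inverse-operation names; the structural points are that the plan runs in reverse order, leans on the pre-operation snapshots, and surfaces anything it can't invert for manual handling:

```python
# Hypothetical inverse operations per command kind; anything without a
# known inverse is flagged for manual recovery from the pre-op snapshot.
INVERSES = {
    "file_write": "restore_file_from_snapshot",
    "deploy": "rollback_deploy",
    "config_change": "restore_config_snapshot",
}


def undo_plan(audit_log: list, session_id: str, n: int) -> list:
    """Build the undo plan for the last n commands of one session, in
    reverse order. A named human executes it; nothing here is automatic."""
    ops = [e for e in audit_log if e["session_id"] == session_id]
    plan = []
    for entry in reversed(ops[-n:]):
        plan.append({
            "undo": INVERSES.get(entry["kind"], "MANUAL_REVIEW"),
            "target": entry["target"],
            "snapshot": entry.get("snapshot"),  # taken before execution
        })
    return plan
```

This only works if the log captured enough context per entry (kind, target, snapshot reference) — which is exactly why the audit log doubles as the recovery runbook.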
What to Leave Out
The governance stack above is deliberately lean. Here's what most policy documents include that doesn't actually help at the operational level:
- Lengthy approval workflows for low-risk commands: If it takes 24 hours to approve a read-only diagnostic, the agent is useless and people will find workarounds
- Generic "AI ethics" policies: Valuable in context, useless as operational control
- Manual log review without automation: Human review of thousands of command logs is security theater; anomaly detection and alerting are what catch real issues
- Separate governance for each agent: One interception layer, one audit log, one policy framework — multiple agents, same stack
The Test: Can You Answer These Questions?
Here's how to know if your governance stack is actually working. After any given agent session, you should be able to answer:
- What commands did the agent run, and in what order?
- Which commands were auto-allowed, which were reviewed, which were blocked?
- Who initiated the session, and what were their permissions?
- Were there any anomalies (time, volume, target, risk score) that triggered alerts?
- If something went wrong, what's the recovery path and who's responsible?
If you can answer all five in under five minutes, your governance stack is functional. If any of them requires manual log archaeology, that's the gap to fix first.
Putting It Together
The minimal viable governance stack isn't complicated. Six layers:
- Command visibility — immutable log of every attempted action
- Tiered approval — auto-allow the safe, review the risky, block the dangerous
- Anomaly awareness — catch behavioral shifts before they become incidents
- Identity and attribution — trace every action to a human chain of responsibility
- Time-bounded access — short-lived credentials, no persistent exposure
- Rollback capability — defined recovery path, not assumed automatic
What makes this stack minimal isn't that it's easy to build — it's that each layer is doing real work that the others can't substitute for. Remove any one of them and you have a genuine gap. Add more layers without filling these gaps and you have paperwork.
Start with visibility. Everything else depends on it.
Build the stack, not the policy doc
Expacti gives you command-level visibility, tiered approval, anomaly detection, full attribution, and audit-ready logs — without building it yourself.
Try the interactive demo · Start free