Observability for AI Agents: What to Log, What to Alert On
Your AI agent ran a command at 3 AM. It completed successfully. No alerts fired. Was that good?
Maybe. Or maybe the agent was quietly exfiltrating data, cleaning up evidence of a misconfiguration, or drifting scope in a way that will only matter in three weeks. Traditional observability — CPU, error rates, latency — won't tell you which.
AI agents need a different observability model. Not instead of infrastructure metrics, but layered on top of them. This post breaks down what that model looks like.
Why Standard Observability Misses the Point
Conventional o11y answers: Did it run? Did it fail? Was it slow? For AI agents, the more important questions are:
- Did it do what it was supposed to do?
- Did it do anything it wasn't supposed to do?
- Could a human reconstruct what happened and why?
An agent that calls rm -rf /var/log/old/ and returns exit code 0 looks
perfect to a metrics dashboard. But if nobody approved that deletion, if the logs
contained an active audit trail, or if the path was slightly wrong — you want to know.
The Four Layers of Agent Observability
Layer 1: Command Telemetry
Every command an agent issues should be logged with enough context to reconstruct the decision. Minimum fields:
- command — full command string (scrubbed for secrets)
- session_id — links to the task/conversation that spawned it
- agent_id — which agent, which model version
- risk_score — computed score at time of submission
- whitelist_matched — was it on the approved list?
- decision — auto-approved / queued / denied
- reviewer_id — if human-reviewed, who approved or denied it
- latency_ms — time from submission to execution start
- exit_code — outcome
- output_bytes — rough sense of data volume (not the data itself)
Note output_bytes — not the full output. Full output storage is expensive and creates its own data security problem. But the volume is a useful signal: an agent reading 50MB from a database table when it normally reads a few KB is worth flagging.
Layer 2: Approval Flow Metrics
The approval pipeline is itself a system worth instrumenting:
| Metric | Why It Matters | Alert Threshold (example) |
|---|---|---|
| Queue depth | Approval backlog building up | > 10 pending |
| Review latency p50/p95 | Slowdown causing agent stalls | p95 > 5 min |
| Auto-approval rate | Sudden spike = whitelist too permissive | spike > +20% |
| Deny rate | Sudden spike = agent drift or prompt injection | spike > +15% |
| Timeout rate | Reviewer unavailable; commands auto-denied | > 5% of queue |
| Same reviewer approving own agent | Conflict of interest / policy violation | any |
Auto-approval rate is particularly subtle. A drop in human review might look like efficiency — the whitelist is doing its job. But it can also mean the whitelist has drifted to being too permissive. Tracking this over time, not just in absolute terms, surfaces the drift.
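One way to track that drift, as a minimal sketch. The window contents and the 20-point absolute delta are assumptions to tune per environment:

```python
from collections import Counter

def auto_approval_rate(decisions: list[str]) -> float:
    """Fraction of decisions in a window that were auto-approved."""
    if not decisions:
        return 0.0
    return Counter(decisions)["auto-approved"] / len(decisions)

def whitelist_drift_alert(baseline: list[str], current: list[str],
                          max_delta: float = 0.20) -> bool:
    """Alert when the current window's auto-approval rate exceeds the
    trailing baseline window's rate by more than max_delta (absolute)."""
    return auto_approval_rate(current) - auto_approval_rate(baseline) > max_delta
```

Comparing a current window to a trailing baseline, rather than to a fixed rate, is what surfaces gradual whitelist drift.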
Layer 3: Anomaly Signals
Raw command logs don't tell you when behavior is unusual. You need baselines and deviation detection:
- Command entropy: How diverse are the commands this agent runs in a session? A coding agent that suddenly issues 10 network commands has deviated from its typical pattern.
- Time-of-day distribution: Most agents run during business hours or scheduled windows. A burst at 3 AM could be a cron job, or it could be something else.
- Target directory pattern: An agent that normally writes to /app/build/ and suddenly starts reading from /etc/ or ~/.ssh/ is doing something it hasn't done before.
- New command vocabulary: Commands the agent has never issued before in a given role are worth extra scrutiny, regardless of risk score.
- Deny-then-retry patterns: An agent that gets denied and immediately issues a reformulated version of the same command is either confused or probing.
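The deny-then-retry signal in particular is cheap to detect. A minimal sketch using string similarity; the 0.8 threshold is an assumption:

```python
from difflib import SequenceMatcher

def is_retry_after_deny(denied_cmd: str, next_cmd: str,
                        threshold: float = 0.8) -> bool:
    """Flag a reformulated retry: the command issued right after a denial
    closely resembles the command that was just denied."""
    return SequenceMatcher(None, denied_cmd, next_cmd).ratio() >= threshold
```

Character-level similarity catches trivial reformulations (added slashes, reordered flags); semantically equivalent but textually different commands would need a richer comparison.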
Layer 4: Session-Level Context
Individual commands exist in the context of sessions. Session-level metrics give you a different view:
- Session duration — unusually long sessions can mean runaway tasks
- Commands per session — high count can mean task scope creep
- Session risk score — aggregate of all command scores; a session trending high-risk deserves a look even if no single command crossed a threshold
- Session outcome — completed / abandoned / killed; abandoned sessions may leave partial state
Session playback (terminal recording) ties all of this together. When you're investigating an incident, you want to replay exactly what happened, in order, with approval decisions timestamped alongside the commands.
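A sketch of the session risk aggregate. The thresholds are assumptions; the point is that the session-level threshold sits below the per-command one, so a session of uniformly borderline commands still gets flagged:

```python
PER_COMMAND_ALERT = 0.9   # assumed per-command alert threshold
SESSION_ALERT = 0.7       # assumed (lower) session-level threshold

def session_risk(command_scores: list[float]) -> tuple[float, bool]:
    """Mean risk across a session. Every command can sit below
    PER_COMMAND_ALERT while the session mean still trips SESSION_ALERT."""
    if not command_scores:
        return 0.0, False
    mean = sum(command_scores) / len(command_scores)
    return mean, mean >= SESSION_ALERT
```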
What to Alert On (vs. What to Just Log)
Not everything worth logging is worth alerting on. Alert fatigue kills the signal. Here's a practical split:
| Event | Action | Rationale |
|---|---|---|
| Command denied (single) | Log only | Normal; reviewer judgment |
| 3+ denials in one session | Alert | Pattern, not noise |
| CRITICAL risk score command | Alert + require 2-reviewer | High blast radius |
| New agent identity seen | Alert | Unregistered agents are unknown risk |
| Agent accesses secrets path | Alert | High sensitivity target |
| Approval latency spike | Alert (ops) | Reviewers may be unavailable |
| Auto-approval rate delta > 20% | Alert (security) | Whitelist drift |
| Failed auth on agent token | Alert immediately | Credential leak / probe |
| Deny-then-retry (same command) | Alert | Probing behavior |
| Session > N hours | Alert + notify owner | Runaway task risk |
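A sketch of the routing logic for a few rows of the table above; the event shape and the three-denial cutoff are assumptions:

```python
def route_event(event: dict) -> str:
    """Map an event to an action per the table: log-only for single
    denials, alert on patterns, escalate CRITICAL-risk commands."""
    if event.get("risk") == "CRITICAL":
        return "alert+require-2-reviewers"
    if event.get("type") == "denial":
        # A single denial is normal reviewer judgment; three is a pattern.
        return "alert" if event.get("denials_in_session", 0) >= 3 else "log"
    if event.get("type") == "deny_then_retry":
        return "alert"
    return "log"
```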
The Audit Trail Your Compliance Team Actually Wants
Security and observability converge in the audit trail. For compliance purposes (SOC 2, ISO 27001, HIPAA), you need to be able to answer:
- Who authorized this action? (reviewer name + timestamp)
- What was the stated purpose? (task context)
- What was the risk assessment at the time? (risk score, anomaly flags)
- What was the outcome? (exit code, output summary)
- Was this within policy? (whitelist match, org policy version)
This is structurally different from a system log. It's a decision record, not just an event record. The whitelist isn't just a filter — it's documented policy. The approval isn't just a gate — it's an authorization record.
Design your log schema with this in mind from the start. Adding it later means retrofitting or correlating across disparate systems.
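A sketch of a decision record answering the five questions above; all field names and values are illustrative, not a mandated format:

```python
import json

decision_record = {
    # Who authorized it, and when (hypothetical reviewer ID).
    "authorized_by": {"reviewer_id": "r-042", "timestamp": "2025-01-07T03:12:09Z"},
    # Stated purpose, linked to task context.
    "purpose": "task-1187: rotate stale build artifacts",
    # Risk assessment at the time of the decision.
    "risk_assessment": {"risk_score": 0.42, "anomaly_flags": ["off-hours"]},
    # Outcome: exit code and output volume, not the output itself.
    "outcome": {"exit_code": 0, "output_bytes": 2048},
    # Was it within policy, and which policy version applied.
    "policy": {"whitelist_matched": True, "policy_version": "2025-01"},
}

audit_line = json.dumps(decision_record, sort_keys=True)
```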
Practical Implementation Notes
Where to Store Agent Logs
Agent audit logs should be separate from application logs and write-protected from the agent itself. An agent that can delete its own logs is an agent that can cover its tracks — intentionally or through a prompt injection attack.
At minimum: append-only storage with a separate access key. Better: ship to an external SIEM in real time so the logs survive even if the machine is compromised.
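A minimal sketch of the append-only write. The filesystem-level protection (a separate Unix user, or chattr +a on Linux) is assumed to be configured outside this code; the agent's own credentials should never hold write access to the path:

```python
import os

def append_audit_line(path: str, line: str) -> None:
    """O_APPEND positions every write at the current end of file, so
    a writer cannot overwrite or truncate earlier records this way."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o640)
    try:
        os.write(fd, (line + "\n").encode())
    finally:
        os.close(fd)
```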
Secrets in Logs
Before logging any command string, run it through a scrubber that redacts:
- Patterns matching --password=, -p, AWS_SECRET, etc.
- Base64 blobs above a certain length (likely encoded secrets)
- Known secret formats (AWS keys, GitHub PATs, etc.)
Log the scrubbed version, flag that scrubbing occurred, and preserve the original (encrypted) only if your threat model requires it and you have the key management infrastructure for it.
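A sketch of such a scrubber. The regexes are illustrative starting points, not a complete secret taxonomy:

```python
import re

# Illustrative patterns; extend and tune for your environment.
SECRET_PATTERNS = [
    re.compile(r"(--password=|-p\s+)\S+"),       # CLI password flags
    re.compile(r"AWS_SECRET[A-Z_]*\s*=\s*\S+"),  # AWS secret env vars
    re.compile(r"AKIA[0-9A-Z]{16}"),             # AWS access key IDs
    re.compile(r"ghp_[A-Za-z0-9]{36}"),          # GitHub personal access tokens
    re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),     # long base64 blobs
]

def scrub(command: str) -> tuple[str, bool]:
    """Return (scrubbed_command, was_scrubbed)."""
    scrubbed = command
    for pattern in SECRET_PATTERNS:
        scrubbed = pattern.sub("[REDACTED]", scrubbed)
    return scrubbed, scrubbed != command
```

The boolean flag is what lets you log "scrubbing occurred" alongside the sanitized string.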
Correlation IDs
Every command should carry IDs that let you join across systems:
session_id → task_id → command_id → approval_id.
When you're debugging an incident at 2 AM, you'll be grateful you can pull the
full chain with a single query.
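With a JSON-lines log carrying those IDs, the chain pull is a filter and a sort, sketched here with assumed field names:

```python
def pull_chain(log_lines: list[dict], session_id: str) -> list[dict]:
    """All records for one session, ordered by timestamp: commands,
    approvals, and outcomes interleaved as they occurred."""
    return sorted((l for l in log_lines if l.get("session_id") == session_id),
                  key=lambda l: l["ts"])
```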
The Honest Limitation
Even perfect observability is retrospective. You can see what happened and alert on it, but by the time you're reading the alert, the command has already run.
That's why observability is necessary but not sufficient. It works alongside approval gates, not instead of them. The approval gate is the prospective control. The audit trail is the retrospective control. You need both.
Think of it this way: a smoke detector is not a fire suppression system. You still need sprinklers.
Summary
Effective AI agent observability has four layers:
- Command telemetry — full context on every command issued
- Approval flow metrics — health of the human-in-the-loop pipeline
- Anomaly signals — behavioral baselines and deviation detection
- Session context — aggregate view across a full task lifecycle
Layer your alerts carefully — log everything, alert on patterns, not individual events. Design your audit schema as a decision record, not just an event log. And keep the logs out of the agent's reach.
The goal isn't to watch every move. It's to be able to answer "what happened and why?" in 10 minutes — whether for a security incident, a compliance audit, or just a confused teammate asking why the deploy agent deleted that file.
expacti gives you this out of the box
Structured audit logs, approval flow metrics, 8-rule anomaly detection, session playback, and JSON/CSV export for compliance. Every command is logged with its risk score, reviewer decision, and session context.
Get started free →