The Agent Identity Problem: Knowing Which AI Ran Which Command

April 8, 2026 · 9 min read · Security

Your audit log shows that DROP TABLE sessions ran at 3:47 AM. Your incident response team wants to know which agent did it. You open the logs and see: user: service_account_ai. That's it. One token, shared across five different AI agents, zero attribution.

This is the agent identity problem. And it's one of the most quietly corrosive issues in production AI deployments.

How Identity Collapses in Practice

It usually starts reasonably. You create a service account for your AI agent, grant it the permissions it needs, wire it up. It works. Then you add another agent. It needs similar permissions, so you reuse the token — easier than managing another secret. Then another agent. Then a dev builds a test harness that also uses the same token "temporarily."

Six months later, five agents share one identity. Your audit trail is technically complete — every command is logged — but attribution is gone. You know what ran. You have no idea who ran it.

This matters more than it might seem:

Incident response stalls: you can't determine which agent caused an incident without replaying every session
Compliance investigations fail: SOC 2, ISO 27001, and HIPAA auditors ask "who accessed what," and "the AI account" isn't an answer
Anomaly detection loses signal: behavioral baselines require stable identity, and one token for five agents means no meaningful baseline for any of them
Kill switches don't work cleanly: revoking the shared token stops all agents, so you can't isolate the one that's misbehaving

The Three Attribution Gaps

1. Shared credentials

Multiple agents, one API key or session token. All commands log under the same identity. This is the most common case, because it's operationally easy and nobody thinks about it until something goes wrong.

2. Agent-spawned subagents

An orchestrator agent delegates tasks to worker agents. The worker inherits the orchestrator's session context, credentials, and sometimes its identity. By the time a command reaches the shell, the chain of delegation is invisible in the log: you see the command and the last credential in the chain, not the principals that authorized it.

3. Model-hopping

The same "agent" switches between models mid-session — GPT-4 for reasoning, a smaller model for code generation, another for tool use. Each model invocation may produce different behavior, but all of it is attributed to the same session identity. When something goes wrong, you can't tell which model made the call.

What Proper Agent Identity Looks Like

Attribute	What to capture	Why it matters
Agent ID	Stable identifier per agent (not per session)	Cross-session behavioral baselines, kill switch targeting
Session ID	Per-session identifier	Correlate all commands within one run
Principal chain	Who authorized this agent to act (human → orchestrator → worker)	Multi-agent attribution, delegation audit
Model/version	Which model produced this action	Behavioral analysis after model updates
Task context	What the agent was asked to do	Distinguish intended vs. out-of-scope actions
Credential ID	Which token/key was used (not the value)	Blast radius analysis if credential is compromised

Not all of this needs to be on every log line. But it all needs to be recoverable from your audit trail within a reasonable investigation window.
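As a rough illustration, the attributes in the table could be carried in a single record type. This is a minimal sketch, not a prescribed schema: the class and field names (CommandRecord, Principal, and so on) are assumptions made up for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class Principal:
    type: str  # "human" or "agent"
    id: str


@dataclass(frozen=True)
class CommandRecord:
    """One audit entry. All field names here are illustrative assumptions."""
    command: str
    agent_id: str        # stable per agent, survives across sessions
    session_id: str      # one per run
    credential_id: str   # identifier of the key used, never the secret value
    model_version: str
    task: str
    principal_chain: Tuple[Principal, ...] = ()
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        # asdict() recurses into the nested Principal entries
        return json.dumps(asdict(self), sort_keys=True)
```

In practice the session-level fields (task, model version, principal chain) could be logged once per session and joined on session_id later, which is consistent with the point that not every attribute has to sit on every log line.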

Practical Steps to Fix Attribution

One credential per agent

This is the foundation. Each distinct agent identity — "data-pipeline-agent," "code-review-agent," "deployment-agent" — gets its own credential. Yes, it's more secrets to manage. That cost is real. But it's far smaller than the cost of not being able to answer "which agent caused the incident."

In expacti, each agent authenticates with its own shell token. The reviewer sees not just the command, but which agent submitted it. If the data-pipeline agent starts issuing deployment commands at 2 AM, that's flagged — because the baseline is per-agent, not shared across all agents.
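Independent of any particular product, the one-credential-per-agent rule can be sketched as a small registry that maps each token to exactly one agent. Everything here (the CredentialRegistry class and its method names) is hypothetical and only illustrates the idea; a real deployment would sit on top of a secrets manager.

```python
import hashlib
import secrets
from typing import Dict, Optional


class CredentialRegistry:
    """Hypothetical registry: one live token per agent, only hashes stored."""

    def __init__(self) -> None:
        self._agent_by_hash: Dict[str, str] = {}

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, agent_id: str) -> str:
        # Revoke any earlier token so an agent never holds two live credentials.
        self.revoke(agent_id)
        token = secrets.token_urlsafe(32)
        self._agent_by_hash[self._digest(token)] = agent_id
        return token

    def resolve(self, token: str) -> Optional[str]:
        """Return the agent that owns this token, or None if unknown or revoked."""
        return self._agent_by_hash.get(self._digest(token))

    def revoke(self, agent_id: str) -> None:
        # Removes only this agent's token; every other agent is untouched.
        self._agent_by_hash = {
            h: a for h, a in self._agent_by_hash.items() if a != agent_id
        }
```

Because every token resolves to a single agent ID, revoking the data-pipeline agent's token leaves the code-review agent's token valid, which is what a shared token cannot offer.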

Embed identity in the session, not just the credential

A credential tells you the account. A session context tells you the intent. When an agent starts a session, it should declare:

Its stable agent ID
The task it was given (or a hash of the prompt)
Its parent orchestrator, if any
The model version it's running

This metadata travels with every command in that session. If the agent is later found to have done something wrong, you have the full context — not just "service_account_ai ran DROP TABLE at 3:47 AM."
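One way to picture the session declaration: the agent's runtime builds a context once at session start and stamps it onto every command. The sketch below assumes a SHA-256 hash of the prompt stands in for the task; the names start_session and stamp are invented for this example.

```python
import hashlib
import uuid
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionContext:
    session_id: str
    agent_id: str
    task_hash: str                  # hash of the prompt, not the prompt itself
    parent_agent_id: Optional[str]  # None when a human started the agent directly
    model_version: str


def start_session(agent_id: str, prompt: str, model_version: str,
                  parent_agent_id: Optional[str] = None) -> SessionContext:
    """Declare identity once, at the start of the session."""
    return SessionContext(
        session_id=str(uuid.uuid4()),
        agent_id=agent_id,
        task_hash=hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        parent_agent_id=parent_agent_id,
        model_version=model_version,
    )


def stamp(ctx: SessionContext, command: str) -> dict:
    """Attach the session context to a single command before it is logged."""
    return {"command": command, **asdict(ctx)}
```

Hashing the prompt keeps sensitive task text out of the log while still letting an investigator confirm which task a session was given, provided the original prompt is stored somewhere it can be matched against.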

Propagate principal chains in multi-agent systems

When an orchestrator delegates to a worker, the worker's commands should carry the full delegation chain. Think of it like a call stack for authorization:

principal_chain: [
  { type: "human", id: "[email protected]", authorized_at: "2026-04-08T03:40:00Z" },
  { type: "agent", id: "orchestrator-v2", task: "deploy staging release" },
  { type: "agent", id: "db-migration-agent", task: "run pending migrations" }
]

Every command the db-migration-agent issues carries this chain. The reviewer sees not just "db-migration-agent wants to run ALTER TABLE" — they see the full authorization lineage. If that lineage doesn't make sense (why is a staging deployment agent running production schema changes?), it's visible before the command executes.
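The call-stack analogy can be made concrete with a helper that extends the chain at delegation time. This is a sketch under stated assumptions: the delegate and describe functions are hypothetical, the chain is a plain list of dicts shaped like the example above, and "ops-engineer-1" is a placeholder human ID.

```python
from typing import Dict, List

Chain = List[Dict[str, str]]


def delegate(chain: Chain, agent_id: str, task: str) -> Chain:
    """Return a new chain with the delegate appended.

    The parent's chain is copied, not mutated, so each worker carries its
    own lineage and cannot rewrite its parent's.
    """
    if not chain:
        raise ValueError("cannot delegate without an authorizing principal")
    return [*chain, {"type": "agent", "id": agent_id, "task": task}]


def describe(chain: Chain) -> str:
    """Render the lineage for a reviewer, root principal first."""
    return " -> ".join(p["id"] for p in chain)


root = [{"type": "human", "id": "ops-engineer-1",
         "authorized_at": "2026-04-08T03:40:00Z"}]
orchestrator = delegate(root, "orchestrator-v2", "deploy staging release")
worker = delegate(orchestrator, "db-migration-agent", "run pending migrations")
print(describe(worker))
```

Building the chain when delegation happens, rather than reconstructing it from logs afterwards, means the lineage is available to a reviewer before the command runs.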

Treat model versions as identity-relevant

When you upgrade an agent from GPT-4o to a newer model, your behavioral baselines break — the new model may have different command patterns, different verbosity, different risk tolerance. If you don't track model version per command, post-upgrade anomalies look like the agent changed behavior for no reason.

Log the model version with every command. When something looks weird after a model upgrade, you can filter by model version and see whether the anomaly started with the upgrade or predates it.
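Filtering by model version can be as simple as grouping flagged commands by the version that produced them. The sketch below assumes each log record is a dict with a model_version string and a boolean flagged field; both field names are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable


def flag_rate_by_model(records: Iterable[dict]) -> Dict[str, float]:
    """Fraction of commands flagged as anomalous, per model version."""
    totals: Dict[str, int] = defaultdict(int)
    flagged: Dict[str, int] = defaultdict(int)
    for record in records:
        version = record["model_version"]
        totals[version] += 1
        if record["flagged"]:
            flagged[version] += 1
    return {version: flagged[version] / totals[version] for version in totals}
```

If the rate is near zero for the old version and jumps for the new one, the anomaly most likely arrived with the upgrade; if both versions show it, the cause predates the upgrade.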

What This Enables

Proper agent identity isn't just about attribution after incidents. It unlocks several capabilities that are impossible without it:

Per-agent behavioral baselines. Anomaly detection only works when it can compare current behavior against a meaningful baseline. A shared credential has no stable baseline — it's five different agents' behaviors mixed together. Per-agent identity means you can flag when the code-review agent starts issuing network commands it's never issued before.
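A deliberately naive sketch of a per-agent baseline: remember which programs each agent has invoked and flag the first use of a new one. Real anomaly detection would be far richer; the AgentBaseline class below is hypothetical and only shows why the baseline has to be keyed by agent ID.

```python
import shlex
from collections import defaultdict
from typing import Dict, Set


class AgentBaseline:
    """Tracks, per agent, the set of programs it has run before."""

    def __init__(self) -> None:
        self._seen: Dict[str, Set[str]] = defaultdict(set)

    @staticmethod
    def _program(command: str) -> str:
        try:
            parts = shlex.split(command)
        except ValueError:  # e.g. unbalanced quotes; fall back to whitespace split
            parts = command.split()
        return parts[0] if parts else ""

    def learn(self, agent_id: str, command: str) -> None:
        self._seen[agent_id].add(self._program(command))

    def is_novel(self, agent_id: str, command: str) -> bool:
        """True if this agent has never run this program before."""
        return self._program(command) not in self._seen.get(agent_id, set())
```

With a shared credential, every agent's commands land in the same set, so a curl issued by the code-review agent looks normal as long as any other agent has ever used curl.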

Targeted kill switches. If the data-pipeline agent is compromised or misbehaving, you revoke its credential. The other agents keep running. Without per-agent identity, you're choosing between "do nothing" and "take down everything."

Meaningful compliance reporting. Auditors ask which principal accessed which resource. "The AI account" is not a principal. "data-pipeline-agent (version 2.3, authorized by [email protected] on 2026-04-07)" is.

Honest post-mortems. When an incident happens, you want to know: was this the agent following instructions, or was it out-of-scope behavior? With task context in the session, you can compare what the agent was asked to do against what it actually did. Without it, post-mortems are speculation.

The Honest Difficulty

None of this is free. Separate credentials per agent means more secrets management overhead — rotation, storage, distribution. Principal chain propagation requires all agents in your system to actually propagate it, which means you need to control (or trust) all agents in the chain. Model version tracking requires discipline in your deployment process.

And there's a subtler problem: agents often don't know their own identity in any meaningful sense. The model itself doesn't have an agent ID — that identity has to be injected by the infrastructure that runs it. If your orchestration layer doesn't do this, you have to add it.
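To illustrate what "injected by the infrastructure" might mean, here is a minimal sketch in which the runner, not the model, supplies identity. AgentRunner and AgentIdentity are made-up names, and the model is assumed to be any callable that returns a dict containing a command.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    model_version: str


class AgentRunner:
    """Hypothetical wrapper: identity comes from deployment config, never from model output."""

    def __init__(self, identity: AgentIdentity,
                 model: Callable[[str], Dict[str, str]]) -> None:
        self._identity = identity
        self._model = model

    def propose(self, prompt: str) -> Dict[str, str]:
        proposal = self._model(prompt)
        # Keep only the command; anything the model says about who it is gets dropped.
        return {
            "command": proposal["command"],
            "agent_id": self._identity.agent_id,
            "model_version": self._identity.model_version,
        }
```

The design choice is that identity is a property of the deployment, so a confused or manipulated model cannot relabel its own commands.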

The practical starting point is simpler: before you worry about principal chains and model versioning, adopt the one-credential-per-agent rule. That alone eliminates the most common attribution failure. Build from there.

A Minimum Viable Checklist

Each distinct agent has its own credential — no shared tokens across agents
Every command log includes a stable agent ID (not just a session ID)
Orchestrator-to-worker delegation is logged at delegation time, not reconstructed after the fact
Session context includes task description (or hash) and model version
Credential rotation is per-agent — revoking one agent's token doesn't affect others

If you can answer "which agent ran that command, and what was it supposed to be doing?" within five minutes of an incident, your attribution is workable. If that question takes hours or days, you have a structural problem — not a logging problem.
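The first checklist item can be checked against existing logs, assuming each record already carries a credential_id and an agent_id (both field names are assumptions). This hypothetical helper reports any credential that more than one agent has used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def shared_credentials(records: Iterable[dict]) -> Dict[str, List[str]]:
    """Map each credential used by more than one agent to those agents."""
    agents_by_credential: Dict[str, Set[str]] = defaultdict(set)
    for record in records:
        agents_by_credential[record["credential_id"]].add(record["agent_id"])
    return {
        credential: sorted(agents)
        for credential, agents in agents_by_credential.items()
        if len(agents) > 1
    }
```

An empty result means no credential is shared across the agents in that log window. It says nothing about logs that have no agent ID at all, which is the harder case described at the top of this post.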

expacti gives every agent its own identity layer

Each agent authenticates with a scoped shell token. Every command is attributed to the specific agent, session, and task that produced it — not a shared service account. Reviewers see the full context before approving.
