The Agent Identity Problem: Knowing Which AI Ran Which Command
Your audit log shows a DROP TABLE sessions ran at 3:47 AM. Your incident response team wants to know which agent did it. You open the logs and see: user: service_account_ai. That's it. One token, shared across four different AI agents, zero attribution.
This is the agent identity problem. And it's one of the most quietly corrosive issues in production AI deployments.
How Identity Collapses in Practice
It usually starts reasonably. You create a service account for your AI agent, grant it the permissions it needs, wire it up. It works. Then you add another agent. It needs similar permissions, so you reuse the token — easier than managing another secret. Then another agent. Then a dev builds a test harness that also uses the same token "temporarily."
Six months later, five agents share one identity. Your audit trail is technically complete — every command is logged — but attribution is gone. You know what ran. You have no idea who ran it.
This matters more than it might seem:
- Incident response stalls — you can't determine which agent caused an incident without replaying every session
- Compliance investigations fail: SOC 2, ISO 27001, and HIPAA auditors ask "who accessed what," and "the AI account" isn't an answer
- Anomaly detection loses signal — behavioral baselines require stable identity; one token for five agents means no meaningful baseline for any of them
- Kill switches don't work cleanly — revoking the shared token stops all agents; you can't isolate the one that's misbehaving
The Three Attribution Gaps
1. Shared credentials
Multiple agents, one API key or session token. All commands log under the same identity. This is the most common case, because it's operationally easy and nobody thinks about it until something goes wrong.
2. Agent-spawned subagents
An orchestrator agent delegates tasks to worker agents. The worker inherits the orchestrator's session context, credentials, and sometimes its identity. By the time a command reaches the shell, the chain of delegation is invisible in the log: you see the command and the last credential in the chain, not the principals that authorized it.
3. Model-hopping
The same "agent" switches between models mid-session — GPT-4 for reasoning, a smaller model for code generation, another for tool use. Each model invocation may produce different behavior, but all of it is attributed to the same session identity. When something goes wrong, you can't tell which model made the call.
What Proper Agent Identity Looks Like
| Attribute | What to capture | Why it matters |
|---|---|---|
| Agent ID | Stable identifier per agent (not per session) | Cross-session behavioral baselines, kill switch targeting |
| Session ID | Per-session identifier | Correlate all commands within one run |
| Principal chain | Who authorized this agent to act (human → orchestrator → worker) | Multi-agent attribution, delegation audit |
| Model/version | Which model produced this action | Behavioral analysis after model updates |
| Task context | What the agent was asked to do | Distinguish intended vs. out-of-scope actions |
| Credential ID | Which token/key was used (not the value) | Blast radius analysis if credential is compromised |
Not all of this needs to be on every log line. But it all needs to be recoverable from your audit trail within a reasonable investigation window.
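As a rough illustration, the attributes in the table could be carried in a single structured record per command. This is a minimal sketch, not a standard schema: the class name, field names, and example values below are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CommandAuditRecord:
    """One audited command with the attribution fields from the table (hypothetical schema)."""
    agent_id: str                     # stable per agent, not per session
    session_id: str                   # one run of the agent
    principal_chain: tuple[str, ...]  # who authorized this agent, outermost first
    model_version: str                # which model produced the action
    task_context: str                 # what the agent was asked to do
    credential_id: str                # identifier of the token, never its value
    command: str

    def to_log_line(self) -> str:
        # Sorted keys keep log lines stable and easy to diff.
        return json.dumps(asdict(self), sort_keys=True)


# Illustrative values only.
record = CommandAuditRecord(
    agent_id="db-migration-agent",
    session_id="sess-0001",
    principal_chain=("human:alice", "agent:orchestrator-v2"),
    model_version="example-model-1",
    task_context="run pending migrations",
    credential_id="cred-7f3a",
    command="ALTER TABLE sessions ADD COLUMN expires_at timestamp",
)
print(record.to_log_line())
```

In practice the rarely changing fields (task context, principal chain) might be logged once per session and joined on `session_id`, which matches the point above that not everything has to sit on every log line.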
Practical Steps to Fix Attribution
One credential per agent
This is the foundation. Each distinct agent identity — "data-pipeline-agent," "code-review-agent," "deployment-agent" — gets its own credential. Yes, it's more secrets to manage. That cost is real. But it's far smaller than the cost of not being able to answer "which agent caused the incident."
In expacti, each agent authenticates with its own shell token. The reviewer sees not just the command, but which agent submitted it. If the data-pipeline agent starts issuing deployment commands at 2 AM, that's flagged — because the baseline is per-agent, not shared across all agents.
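To make the one-credential-per-agent rule concrete, here is a small in-memory sketch of a registry that issues a separate token per agent, resolves a presented token back to an agent ID, and revokes a single agent without touching the others. It is not expacti's API; the class and method names are hypothetical, and a real deployment would back this with a secrets manager rather than a dict.

```python
import hashlib
import secrets
from typing import Optional


class AgentCredentialRegistry:
    """Hypothetical registry: one token per agent, stored only as a digest."""

    def __init__(self) -> None:
        # token digest -> (agent_id, credential_id)
        self._by_digest: dict[str, tuple[str, str]] = {}

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, agent_id: str) -> str:
        """Create a fresh token for one agent and return it once."""
        token = secrets.token_urlsafe(32)
        digest = self._digest(token)
        # The credential ID is safe to log; the token value is not.
        self._by_digest[digest] = (agent_id, f"cred-{digest[:12]}")
        return token

    def resolve(self, token: str) -> Optional[tuple[str, str]]:
        """Map a presented token to (agent_id, credential_id), or None if unknown."""
        return self._by_digest.get(self._digest(token))

    def revoke_agent(self, agent_id: str) -> int:
        """Remove every credential belonging to one agent; return how many were removed."""
        doomed = [d for d, (a, _) in self._by_digest.items() if a == agent_id]
        for d in doomed:
            del self._by_digest[d]
        return len(doomed)


registry = AgentCredentialRegistry()
pipeline_token = registry.issue("data-pipeline-agent")
review_token = registry.issue("code-review-agent")

# Targeted kill switch: only the misbehaving agent loses access.
registry.revoke_agent("data-pipeline-agent")
print(registry.resolve(pipeline_token))    # None
print(registry.resolve(review_token)[0])   # code-review-agent
```

Because the token resolves to an agent ID at the point of authentication, every downstream log line can carry that ID without trusting the agent to report it honestly.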
Embed identity in the session, not just the credential
A credential tells you the account. A session context tells you the intent. When an agent starts a session, it should declare:
- Its stable agent ID
- The task it was given (or a hash of the prompt)
- Its parent orchestrator, if any
- The model version it's running
This metadata travels with every command in that session. If the agent is later found to have done something wrong, you have the full context — not just "service_account_ai ran DROP TABLE at 3:47 AM."
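One way to sketch the session declaration: the infrastructure that launches the agent builds a context object once, then stamps it onto every command event. The names `SessionContext`, `start_session`, and `tag_command` are illustrative assumptions, and hashing the prompt with SHA-256 is just one reasonable choice for the "hash of the prompt" option.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SessionContext:
    """Identity and intent declared once at session start (hypothetical shape)."""
    agent_id: str
    task_hash: str
    model_version: str
    parent_agent_id: Optional[str] = None
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def start_session(agent_id: str, prompt: str, model_version: str,
                  parent_agent_id: Optional[str] = None) -> SessionContext:
    # Store a hash of the prompt so the log does not hold the full task text.
    task_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return SessionContext(agent_id, task_hash, model_version, parent_agent_id)


def tag_command(ctx: SessionContext, command: str) -> dict:
    """Attach the session metadata to a single command event."""
    return {
        "command": command,
        "agent_id": ctx.agent_id,
        "session_id": ctx.session_id,
        "task_hash": ctx.task_hash,
        "model_version": ctx.model_version,
        "parent_agent_id": ctx.parent_agent_id,
    }


ctx = start_session(
    agent_id="db-migration-agent",
    prompt="run pending migrations",
    model_version="example-model-1",
    parent_agent_id="orchestrator-v2",
)
event = tag_command(ctx, "DROP TABLE sessions")
print(event["agent_id"], event["parent_agent_id"])
```

The context is built by the launcher rather than by the model, so the declared identity does not depend on what the model says about itself.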
Propagate principal chains in multi-agent systems
When an orchestrator delegates to a worker, the worker's commands should carry the full delegation chain. Think of it like a call stack for authorization:
principal_chain: [
  { type: "human", id: "[email protected]", authorized_at: "2026-04-08T03:40:00Z" },
  { type: "agent", id: "orchestrator-v2", task: "deploy staging release" },
  { type: "agent", id: "db-migration-agent", task: "run pending migrations" }
]
Every command the db-migration-agent issues carries this chain. The reviewer sees not just "db-migration-agent wants to run ALTER TABLE" — they see the full authorization lineage. If that lineage doesn't make sense (why is a staging deployment agent running production schema changes?), it's visible before the command executes.
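The call-stack analogy can be sketched as an immutable chain that grows by one entry at each delegation. This is a minimal illustration under assumptions: `Principal`, `delegate`, and `describe` are hypothetical names, and the human ID is a placeholder. It also does nothing to stop a malicious agent from forging a chain, which in a real system would need signing or enforcement by the orchestration layer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    type: str                   # "human" or "agent"
    id: str
    task: Optional[str] = None  # what this principal was asked to do, if anything


def delegate(chain: tuple[Principal, ...], worker_id: str, task: str) -> tuple[Principal, ...]:
    """Return a new chain with the worker appended; the caller's chain is untouched."""
    if not chain:
        raise ValueError("a worker cannot be delegated to from an empty chain")
    return chain + (Principal("agent", worker_id, task),)


def describe(chain: tuple[Principal, ...]) -> str:
    """Render the authorization lineage for a reviewer."""
    return " -> ".join(f"{p.type}:{p.id}" for p in chain)


root = (Principal("human", "alice"),)  # placeholder human principal
orchestrator_chain = delegate(root, "orchestrator-v2", "deploy staging release")
worker_chain = delegate(orchestrator_chain, "db-migration-agent", "run pending migrations")

print(describe(worker_chain))
# human:alice -> agent:orchestrator-v2 -> agent:db-migration-agent
```

Using tuples means each agent holds its own snapshot of the lineage, so a worker appending to its chain cannot rewrite what the orchestrator recorded.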
Treat model versions as identity-relevant
When you upgrade an agent from GPT-4o to a newer model, your behavioral baselines break — the new model may have different command patterns, different verbosity, different risk tolerance. If you don't track model version per command, post-upgrade anomalies look like the agent changed behavior for no reason.
Log the model version with every command. When something looks weird after a model upgrade, you can filter by model version and see whether the anomaly started with the upgrade or predates it.
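Filtering by model version might look like the sketch below, which groups command events by model and compares how often they were flagged. The event fields (`model_version`, `flagged`) and the sample data are assumptions for illustration, not output from any real system.

```python
from collections import defaultdict


def flagged_rate_by_model(events: list[dict]) -> dict[str, float]:
    """Fraction of commands flagged as anomalous, grouped by model version."""
    totals: dict[str, int] = defaultdict(int)
    flagged: dict[str, int] = defaultdict(int)
    for event in events:
        version = event["model_version"]
        totals[version] += 1
        if event["flagged"]:
            flagged[version] += 1
    return {version: flagged[version] / totals[version] for version in totals}


# Made-up events for one agent before and after a model upgrade.
events = [
    {"model_version": "model-a", "flagged": False},
    {"model_version": "model-a", "flagged": False},
    {"model_version": "model-a", "flagged": True},
    {"model_version": "model-a", "flagged": False},
    {"model_version": "model-b", "flagged": True},
    {"model_version": "model-b", "flagged": False},
    {"model_version": "model-b", "flagged": True},
    {"model_version": "model-b", "flagged": False},
]

rates = flagged_rate_by_model(events)
print(rates)  # {'model-a': 0.25, 'model-b': 0.5}
```

A nonzero rate under the old version suggests the anomaly predates the upgrade; a jump that appears only under the new version points at the upgrade itself.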
What This Enables
Proper agent identity isn't just about attribution after incidents. It unlocks several capabilities that are impossible without it:
Per-agent behavioral baselines. Anomaly detection only works when it can compare current behavior against a meaningful baseline. A shared credential has no stable baseline — it's five different agents' behaviors mixed together. Per-agent identity means you can flag when the code-review agent starts issuing network commands it's never issued before.
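A deliberately crude sketch of a per-agent baseline: record which executables each agent has run, and flag the first time an agent runs one it has never used. Treating the first word of the command as its category is a simplifying assumption, and `PerAgentBaseline` is a hypothetical name; real anomaly detection would use richer features.

```python
from collections import defaultdict


class PerAgentBaseline:
    """Tracks the set of executables each agent has been seen running."""

    def __init__(self) -> None:
        self._seen: dict[str, set[str]] = defaultdict(set)

    @staticmethod
    def _category(command: str) -> str:
        # Simplification: the first token stands in for the command's category.
        parts = command.split()
        return parts[0] if parts else ""

    def observe(self, agent_id: str, command: str) -> None:
        self._seen[agent_id].add(self._category(command))

    def is_novel(self, agent_id: str, command: str) -> bool:
        """True if this agent has never run this kind of command before."""
        return self._category(command) not in self._seen.get(agent_id, set())


baseline = PerAgentBaseline()
baseline.observe("code-review-agent", "git diff main")
baseline.observe("code-review-agent", "grep -rn TODO src/")
baseline.observe("deployment-agent", "curl https://example.com/health")

# The deployment agent has used curl; the code-review agent never has.
print(baseline.is_novel("code-review-agent", "curl https://example.com/upload"))  # True
print(baseline.is_novel("deployment-agent", "curl https://example.com/health"))   # False
```

With a shared credential, both agents would feed one baseline, `curl` would already count as normal, and the code-review agent's first network call would pass unnoticed.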
Targeted kill switches. If the data-pipeline agent is compromised or misbehaving, you revoke its credential. The other agents keep running. Without per-agent identity, you're choosing between "do nothing" and "take down everything."
Meaningful compliance reporting. Auditors ask which principal accessed which resource. "The AI account" is not a principal. "data-pipeline-agent (version 2.3, authorized by [email protected] on 2026-04-07)" is.
Honest post-mortems. When an incident happens, you want to know: was this the agent following instructions, or was it out-of-scope behavior? With task context in the session, you can compare what the agent was asked to do against what it actually did. Without it, post-mortems are speculation.
The Honest Difficulty
None of this is free. Separate credentials per agent means more secrets management overhead — rotation, storage, distribution. Principal chain propagation requires all agents in your system to actually propagate it, which means you need to control (or trust) all agents in the chain. Model version tracking requires discipline in your deployment process.
And there's a subtler problem: agents often don't know their own identity in any meaningful sense. The model itself doesn't have an agent ID — that identity has to be injected by the infrastructure that runs it. If your orchestration layer doesn't do this, you have to add it.
The practical starting point is simpler: before you worry about principal chains and model versioning, just apply the one-credential-per-agent rule. That alone eliminates the most common attribution failure. Build from there.
A Minimum Viable Checklist
- Each distinct agent has its own credential — no shared tokens across agents
- Every command log includes a stable agent ID (not just a session ID)
- Orchestrator-to-worker delegation is logged at delegation time, not reconstructed after the fact
- Session context includes task description (or hash) and model version
- Credential rotation and revocation are per-agent: revoking one agent's token doesn't affect others
If you can answer "which agent ran that command, and what was it supposed to be doing?" within five minutes of an incident, your attribution is workable. If that question takes hours or days, you have a structural problem — not a logging problem.
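Parts of the checklist can be checked mechanically against a sample of your command logs. The sketch below assumes each log record is a dict with the hypothetical field names shown; it reports records that lack attribution fields and credentials that appear under more than one agent ID.

```python
from collections import defaultdict

# Hypothetical field names; adapt to whatever your log schema actually uses.
REQUIRED_FIELDS = ("agent_id", "session_id", "task_hash", "model_version", "credential_id")


def attribution_gaps(records: list[dict]) -> list[str]:
    """Return human-readable problems found in a batch of command log records."""
    problems: list[str] = []

    # Every record should carry the full set of attribution fields.
    for index, record in enumerate(records):
        missing = [name for name in REQUIRED_FIELDS if not record.get(name)]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")

    # A credential used by more than one agent ID is a shared token.
    agents_by_credential: dict[str, set[str]] = defaultdict(set)
    for record in records:
        if record.get("credential_id") and record.get("agent_id"):
            agents_by_credential[record["credential_id"]].add(record["agent_id"])
    for credential, agents in sorted(agents_by_credential.items()):
        if len(agents) > 1:
            problems.append(f"credential {credential} shared by {', '.join(sorted(agents))}")

    return problems


sample = [
    {"agent_id": "data-pipeline-agent", "session_id": "s1", "task_hash": "ab12",
     "model_version": "model-a", "credential_id": "cred-1"},
    {"agent_id": "code-review-agent", "session_id": "s2", "task_hash": "cd34",
     "model_version": "model-a", "credential_id": "cred-1"},
    {"agent_id": "deployment-agent", "session_id": "s3", "credential_id": "cred-2"},
]

for problem in attribution_gaps(sample):
    print(problem)
# record 2: missing task_hash, model_version
# credential cred-1 shared by code-review-agent, data-pipeline-agent
```

A check like this only covers what is visible in the logs; whether delegation is recorded at delegation time still has to be verified in the orchestration code itself.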
expacti gives every agent its own identity layer
Each agent authenticates with a scoped shell token. Every command is attributed to the specific agent, session, and task that produced it — not a shared service account. Reviewers see the full context before approving.