The Supply Chain Problem Is Different for Agents
Traditional supply chain attacks target the build pipeline: a malicious package gets into your package.json, your CI pulls it, and suddenly your production binary contains hostile code. You defend against this with lock files, dependency audits, and SBOM generation.
AI agents introduce a second supply chain surface — the runtime execution surface. Your agent doesn't just run code you've already vetted. It actively fetches and executes new code in response to tasks.
Consider what a typical AI coding agent does during a single task:
- Runs `pip install` or `npm install` to pull new dependencies
- Fetches bootstrap scripts with `curl | bash`
- Clones repositories from URLs it found in documentation
- Calls external APIs with credentials from the environment
- Executes generated code that wasn't part of your codebase 30 seconds ago
Each of these is a supply chain event. And unlike your CI pipeline, there's no SBOM, no lock file audit, and no reviewer standing between the model's decision and the execution.
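These boundary-crossing commands can be recognized mechanically. A minimal sketch in Python — the patterns are illustrative, not an exhaustive rule set:

```python
import re

# Illustrative patterns for commands that constitute runtime supply
# chain events; a production gate would need a far more thorough list.
SUPPLY_CHAIN_PATTERNS = [
    r"\b(pip|pip3)\s+install\b",          # Python package installs
    r"\bnpm\s+install\b",                 # Node package installs
    r"\b(curl|wget)\b.*\|\s*(bash|sh)\b", # fetched script piped to a shell
    r"\bgit\s+clone\b",                   # pulling an external repo
]

def is_supply_chain_event(command: str) -> bool:
    """Return True if the shell command crosses a supply chain boundary."""
    return any(re.search(p, command) for p in SUPPLY_CHAIN_PATTERNS)
```

A gate built on this would route matching commands to an approval step and let everything else pass.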
The Three Risk Surfaces
1. Package Installation at Runtime
When your agent runs `pip install requests-plus` because it looked useful for the task, it's pulling code from a public registry with no prior vetting. Typosquatting, dependency confusion, and malicious packages with benign names are all real attacks that have hit production systems.
The difference from human developers: a human typically runs `pip install` on a laptop first, and the dependency lands in `requirements.txt` only after review. Agents skip that step. They install, use, and potentially commit the dependency in one unreviewed sequence.
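One cheap pre-install check an approval gate can run is a typosquat heuristic: flag any requested package whose name is a near-miss of a package you already depend on. A sketch — the `KNOWN_PACKAGES` set is illustrative:

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# Illustrative: in practice, seed this from requirements.txt at session start.
KNOWN_PACKAGES = {"requests", "numpy", "flask"}

def looks_like_typosquat(name: str) -> bool:
    """Flag names within edit distance 2 of a known package, but not exact."""
    return any(0 < edit_distance(name, known) <= 2 for known in KNOWN_PACKAGES)
```

This catches single-character squats like `requets`; it does not catch suffix-style names such as `requests-plus`, which is why it is a heuristic alongside approval, not a replacement for it.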
2. Script Fetching and Execution
`curl https://install.example.com/setup.sh | bash` is a common pattern agents learn from documentation. It's also one of the oldest and most dangerous anti-patterns in system administration.
When an agent runs an install script it found in a README, you don't know:
- Whether the script was modified since it was last reviewed (if it ever was)
- Whether the remote host is under attacker control
- What the script does beyond its documented purpose
- Whether TLS verification is actually happening
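A safer alternative to piping a fetched script straight into a shell is pin-and-verify: download the script, compare its digest against a hash recorded when the script was last reviewed, and execute only on a match. A sketch — the URL and pinned digest in the usage comment are placeholders:

```python
import hashlib

def verify_script(script_bytes: bytes, pinned_sha256: str) -> bytes:
    """Return the script bytes only if they match the reviewed version's digest."""
    digest = hashlib.sha256(script_bytes).hexdigest()
    if digest != pinned_sha256:
        raise ValueError(
            f"script digest {digest} does not match pinned {pinned_sha256}"
        )
    return script_bytes

# Usage sketch (hypothetical URL and pinned hash, not executed here):
#   body = urllib.request.urlopen("https://install.example.com/setup.sh").read()
#   subprocess.run(["bash"], input=verify_script(body, PINNED_SHA256), check=True)
```

The pin forces a human back into the loop: if the remote script changes for any reason, the digest check fails and the agent cannot run it until someone re-reviews and re-pins.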
3. External API Calls with Ambient Credentials
AI agents often run with environment variables populated from secrets managers. When the agent decides to call an external API — a third-party service, a webhook, a data enrichment provider — it may be implicitly using credentials from your environment.
If that external service is compromised, or if the URL the agent constructed was influenced by prompt injection, you've exfiltrated credentials to a hostile endpoint. This is supply chain risk at the data layer, not the code layer.
Why Standard Defenses Don't Cover This
| Defense | Covers CI Supply Chain? | Covers Agent Runtime Supply Chain? |
|---|---|---|
| Lock files (package-lock.json, Pipfile.lock) | Yes | No — agents install outside the lock file |
| Dependency vulnerability scanners (Snyk, Dependabot) | Yes (at scan time) | No — runtime installs bypass static analysis |
| SBOM generation | Yes | No — SBOM is a snapshot; agents change the composition |
| Container image pinning | Yes (base image) | No — packages inside the container change at runtime |
| Network egress filtering | Partial | Partial — blocks known-bad registries, not typosquats |
| Command authorization (expacti) | N/A | Yes — every install, curl, and exec goes through approval |
The Fundamental Mismatch
Supply chain defenses were designed for a world where developers make installation decisions, not autonomous agents. The assumption is that a human evaluated the package before it went into `requirements.txt`. That assumption breaks when the agent is the one running `pip install`.
You need a different control: one that intercepts execution decisions at runtime, not during static analysis of committed code.
A Practical Defense Architecture
1. Command Authorization at the Shell Layer
The most direct defense is intercepting installation and execution commands before they run. This means putting an approval gate between the agent's decision to run `pip install foo` and the actual execution.
This isn't about slowing the agent down for every command — that's approval fatigue. It's about requiring explicit approval for commands that cross supply chain boundaries:
- `pip install`, `npm install`, `gem install`, `go get`
- `curl` and `wget` that pipe to shells or write to executable paths
- `git clone` followed by directory changes into the cloned repo
- Any `chmod +x` on a file that wasn't in the original codebase
Packages already on the approved whitelist can install automatically. New packages require a review step.
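An allowlist-aware gate for the simple `pip install <pkg>` form might look like this sketch; a real implementation would also need to handle flags, requirement files, and VCS URLs:

```python
import shlex

def gate_install(command: str, allowlisted: set[str]) -> str:
    """Return 'allow' for installs of allowlisted packages, else 'needs_approval'.

    Only handles the simple `pip install <pkgs>` form; a production gate
    must also parse flags, requirement files, and VCS URLs.
    """
    tokens = shlex.split(command)
    if len(tokens) >= 3 and tokens[0] in ("pip", "pip3") and tokens[1] == "install":
        # Treat anything that isn't a flag as a package name.
        packages = [t for t in tokens[2:] if not t.startswith("-")]
        if packages and all(p in allowlisted for p in packages):
            return "allow"
        return "needs_approval"
    return "not_an_install"
```

The allowlist would be seeded from the dependencies already committed at session start, so the gate only fires on genuinely new external code.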
2. Private Registry Enforcement
For production environments, consider routing all package installs through your own registry mirror (Artifactory, Nexus, or a managed alternative). Configure the agent's environment so that pip and npm point to your internal registry, which only serves vetted packages.
This doesn't eliminate the risk entirely (your mirror has a lag behind public registries, and the mirror itself can be targeted), but it adds a meaningful layer: new packages require a human to pull them into the mirror first.
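Pointing `pip` and `npm` at the internal mirror is an environment-level configuration change. A sketch, with `registry.internal.example.com` standing in as a placeholder for your mirror:

```ini
# ~/.pip/pip.conf (or /etc/pip.conf) — route pip through the internal mirror
[global]
index-url = https://registry.internal.example.com/pypi/simple

# ~/.npmrc — the equivalent for npm (shown here as a comment):
# registry=https://registry.internal.example.com/npm/
```

Bake this into the agent's container image so the agent cannot trivially reconfigure itself back to the public registries.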
3. Credential Scoping to Prevent Exfiltration
Don't give agents access to long-lived, broad credentials in their environment. Use session-scoped tokens that expire, and scope them to the minimum necessary for the task.
If a prompt injection attack causes the agent to make an unexpected external API call, session-scoped credentials limit how much damage a credential exfiltration can cause. A token that expires in 30 minutes and only has read access to one S3 bucket is a much smaller loss than an AWS root access key.
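With AWS STS, for example, you can attach an inline session policy that caps a temporary token at read-only access to a single bucket. A sketch — the role ARN and bucket name in the usage comment are placeholders:

```python
import json

def scoped_session_policy(bucket: str) -> str:
    """Build an inline session policy restricting a temporary token to
    read-only access on one S3 bucket (bucket name is a placeholder)."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*",
            ],
        }],
    })

# Usage sketch with boto3 (not executed here; ARN is hypothetical):
#   sts = boto3.client("sts")
#   creds = sts.assume_role(
#       RoleArn="arn:aws:iam::123456789012:role/agent-session-role",
#       RoleSessionName="agent-session",
#       DurationSeconds=1800,  # 30-minute expiry
#       Policy=scoped_session_policy("agent-workspace"),
#   )["Credentials"]
```

The session policy can only narrow the role's permissions, never widen them, which makes it a safe per-task knob.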
4. Execution Sandboxing (with Caveats)
Running agents in containers or VMs limits the blast radius of a compromised package. A malicious package that tries to read `/etc/passwd` or establish persistence gets less traction if the agent is running in an ephemeral container.
The caveat: containers don't protect the data the agent has access to. If the agent's job is to read your codebase, a malicious package installed at runtime still has access to that codebase. Container isolation helps with system-level persistence; it doesn't help with data exfiltration.
The Audit Trail Requirement
Even if you don't catch a supply chain incident in real time, you need to be able to reconstruct what happened. This requires:
- Command audit log: every command the agent ran, with timestamps and session context
- Install audit: what packages were installed, from what source, at what time
- Network egress log: what external hosts were contacted during the session
- File modification log: what files were created or modified, especially in executable paths
This is the forensic layer. If you discover a compromise three days later, the audit log is how you determine the blast radius and whether credentials need rotation.
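A minimal shape for the command audit log is JSON lines: one structured entry per executed command, keyed by session ID. A sketch:

```python
import json
import time

def log_command(path: str, session_id: str, command: str, cwd: str) -> dict:
    """Append one structured audit entry per executed command (JSON lines)."""
    entry = {
        "ts": time.time(),        # timestamp for reconstruction
        "session_id": session_id, # ties the command to an agent session
        "command": command,       # the exact command string executed
        "cwd": cwd,               # working directory at execution time
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Append-only JSON lines are deliberately boring: they survive crashes mid-session, and standard tooling can filter them by session ID when you need to reconstruct a blast radius later.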
What This Looks Like in Practice
A team running AI coding agents on their backend codebase implements the following baseline:
- All `pip install` and `npm install` commands require approval unless the package is in the whitelist
- Whitelist is scoped to packages already in `requirements.txt` or `package.json` at session start
- Any `curl | bash` or `wget | sh` pattern is auto-flagged as HIGH risk and blocked pending approval
- Agent runs with a session-scoped IAM role with minimal permissions
- All commands logged to the audit trail with the agent session ID
The result: the agent can work efficiently within the existing dependency set, but adding new external code is a human decision, not a model decision. That's the right boundary.
The Honest Limitation
Command authorization catches the obvious supply chain vectors (explicit installs, script fetching). It doesn't catch everything. If a package the agent uses legitimately has a malicious dependency buried three levels deep, that won't trigger an approval gate because the install command looks normal.
This is why supply chain defense is defense in depth, not a single control. Command authorization at the shell layer is one essential layer. Private registries, credential scoping, and behavioral anomaly detection are the others. None of them is sufficient alone.
The goal isn't to eliminate supply chain risk — that's impossible when the agent's job is to work with external code. The goal is to make the execution surface visible, auditable, and human-reviewed at the boundaries that matter most.