AI Agents and Secrets Management: The Credentials in Your Context Window

AI agents accumulate secrets through normal operation — env vars, config files, tool responses, fetched URLs. The context window is not a vault. Here's how credentials leak, why "just use env vars" isn't enough, and what the right architecture actually looks like.

When you ask an AI agent to deploy a service, it needs credentials. When you ask it to query a database, it needs a connection string. When it fetches a config file to understand the environment, that file probably contains an API key.

This is not a bug. It's how agents work. The problem is what happens to those credentials once they enter the agent's context window — because the context window is not a vault. It's a scratchpad, and a surprisingly leaky one.

How Secrets Get Into Context

There are four primary paths through which credentials end up in an agent's context, and most teams are only aware of one or two of them.

1. Environment variables read at startup

The most common pattern: you pass secrets via environment variables, the agent reads them with os.environ or equivalent, and now the values are strings in the agent's working memory. If the agent ever includes them in a prompt — to explain what credentials it has, to construct a command string, to confirm its configuration — they've entered the context window as plaintext.

Most agents do include this information. They describe their capabilities. They construct shell commands like psql postgresql://user:PASSWORD@host/db that they then either run or ask you to approve. The password is now in the context.
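The leak is easy to reproduce. A minimal sketch (the environment variable value and command are hypothetical stand-ins for a real deployment):

```python
import os

# Hypothetical: a secret arrives via the environment at startup.
os.environ["DB_PASSWORD"] = "s3cr3t"  # stand-in for a real deployment secret
password = os.environ["DB_PASSWORD"]

# The agent later interpolates it into a command it asks a human to approve.
command = f"psql postgresql://admin:{password}@prod-db/app -c 'SELECT 1'"
prompt = f"I am about to run the following command, please approve:\n{command}"

# The moment the string is built, the plaintext secret is part of the
# prompt that will be transmitted to the model provider.
print("s3cr3t" in prompt)  # → True
```

Nothing malicious happened here; ordinary string interpolation was enough to move the secret from process memory into the context window.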

2. Inline configuration and system prompts

Many agent deployments include credentials directly in system prompts or inline config: "You have access to the production database at the following connection string: ...". This is common because it's simple. It's also the most direct way to put credentials into a context window that will be transmitted to a model provider's API endpoint with every single request.

3. Tool call responses

When an agent calls a tool — fetches a URL, reads a file, calls an internal API — the response comes back into context. If that response contains credentials (and many do: cloud metadata endpoints, config management systems, secret managers queried via tool), they're now in context as plaintext.

AWS's instance metadata endpoint returns IAM credentials in JSON. A Vault read returns the secret value in plaintext. A Kubernetes Secret mounted as a file contains the decoded token value (base64 encoding applies only to the API object, not the mounted file). All of these, when fetched by an agent tool, become part of the next prompt.

4. Files fetched during task execution

Agents that explore codebases and config directories routinely read files that contain credentials: .env files, config.yaml with embedded passwords, CI/CD pipeline configs, Terraform state files. The agent doesn't know these are sensitive — it's just reading what's there. The content enters context.

The Fundamental Problem: Context Windows Are Not Vaults

A vault (HashiCorp Vault, AWS Secrets Manager, 1Password) has specific security properties: access is audited, values are encrypted at rest and in transit, access is scoped to specific identities, and the plaintext value is only exposed to the process that explicitly requests it via authenticated API call.

A context window has none of these properties.

The context window is a sequence of tokens passed to an LLM for inference. It's readable by the model. It's readable by the application code that constructs prompts. It's transmitted to the model provider's infrastructure as part of every API request. It may be logged — by your application, by the model provider, by any middleware in between. It may be summarized and stored. It may appear in error traces when something fails.

Three Specific Risks

Risk 1: Context leakage via logging and summarization

Most agent frameworks log prompts and completions for debugging. Your observability stack ingests those logs. Your log aggregation system stores them. Your error reporting service captures them when things go wrong.

If a database password appears in a prompt, it will appear in every system that processes that prompt. Log retention periods of 30-90 days are standard. That's 30-90 days of exposure in your logging infrastructure for a secret that should have a much smaller blast radius.

Summarization makes this worse. Long-running agents that compress old context into summaries will include credential values in those summaries. The summary persists after the original conversation is gone. You've now stored the credential in a form that's harder to identify and rotate.

Risk 2: Prompt injection extracting secrets

If your agent has credentials in its context and processes external input — web pages, emails, documents, tool responses from third-party APIs — a prompt injection attack can extract those credentials.

The attack is straightforward: embed instructions in external content that tell the agent to repeat its credentials, send them to an external endpoint, or include them in a file it writes. If the credentials are in context, the agent has everything it needs to comply.

Prompt injection defenses help, but they're not complete. The most reliable protection is ensuring the credentials were never in context in the first place — you can't leak what isn't there.

Risk 3: Model providers seeing your secrets via API

This is the risk most teams don't think about explicitly, but it's real. Every API call to a model provider sends the current context. If your context contains a production database password, that password is transmitted to and processed by the provider's infrastructure.

Major providers have data processing agreements and claim not to train on API data by default. But "claims not to train on" and "never has access to" are different things. Your credentials are passing through infrastructure you don't control, processed by systems you can't audit, under terms that can change.

For credentials that matter — production database passwords, cloud provider keys, signing tokens — this is an unacceptable exposure surface even if the provider is trustworthy.

Why "Just Use Env Vars" Isn't Enough

The standard advice for keeping secrets out of source code is to use environment variables. It's good advice. It's also not sufficient for AI agents, and here's why: the problem isn't where the secret is stored, it's whether the secret passes through the agent's context.

An env var that is read by the application and used directly in a system call — say, passed as a parameter to a database library — never enters the LLM's context. That's safe.

An env var that is read by the agent and then used by the agent to construct a command string, describe its capabilities, or include in a prompt — that secret is now in context. The env var protected you from source code exposure, but it didn't protect you from context window exposure.

The distinction is: where does the value go after it's read? If it goes directly into a function call that executes without passing through the LLM, you're fine. If it gets interpolated into a string that becomes part of a prompt, you have a problem.
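The safe path can be sketched in a few lines. The `connect` function below is a stand-in for a real database driver (e.g. psycopg2); the point is the data flow, not the API:

```python
import os

os.environ["DB_PASSWORD"] = "s3cr3t"  # stand-in for a deployment secret

def connect(host: str, user: str, password: str) -> str:
    """Stand-in for a database driver call; the driver holds the
    password internally and never returns it."""
    return f"connection<{host}>"

# Safe: the value flows straight from the environment into the driver call.
# No prompt string is ever built from it, so it never enters LLM context.
conn = connect("prod-db", "admin", os.environ["DB_PASSWORD"])

# Unsafe anti-pattern (don't do this): interpolating the same value into
# text destined for a prompt.
# leaky = f"Connect using password {os.environ['DB_PASSWORD']}"
```

The two lines read the same environment variable; only their destinations differ, and that difference is the entire security boundary.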

The Right Pattern: Secrets Flow to Tools, Not Through Context

The architecture that actually works keeps credentials out of the agent's context entirely. Instead of the agent knowing the credentials and using them, the agent describes what it wants to do and the execution layer injects credentials at execution time.

Concretely: instead of the agent constructing psql postgresql://admin:PASSWORD@prod-db/app -c "SELECT..." and either running it or asking for approval, the agent says "run a read-only query against the production database: SELECT..." and a separate component — the command executor, the tool implementation, the shell proxy — resolves the connection string and injects it without the agent ever seeing the value.

The agent's context contains: what it wants to do, what permissions it has, what the results were. It does not contain: the actual credential values used to accomplish the task.
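A minimal sketch of this tool-side resolution, assuming a hypothetical `query_db` tool and an in-memory stand-in for a real secrets store (in practice this lookup would hit Vault, AWS Secrets Manager, or similar):

```python
# Stand-in for a secrets store; in practice, a call to Vault, AWS
# Secrets Manager, etc., made by the tool process, not the agent.
_SECRETS = {"prod-db": "postgresql://admin:s3cr3t@prod-db/app"}

def query_db(database: str, sql: str) -> list:
    """Tool implementation: resolves the connection string at call time.

    The agent supplies only `database` and `sql`. The credential is
    resolved here, inside the tool, and never appears in any value
    returned to the LLM."""
    dsn = _SECRETS[database]   # resolved at execution time
    # ... open a connection with `dsn` and run `sql` (elided) ...
    return [("row", 1)]        # only results flow back into context

rows = query_db("prod-db", "SELECT count(*) FROM users")
```

The agent's view of this interaction is the tool name, the arguments it chose, and the rows that came back — never the DSN.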

How Command Authorization Helps

This is where the command authorization layer becomes a security primitive, not just a governance tool.

When an agent's commands pass through a proxy before execution, that proxy can do more than approve or reject them. It can inject credentials at execution time. The agent submits a command template; the proxy resolves the secrets and executes with actual values.

Here's what this looks like in practice:

  • LLM context: sees deploy to prod-db using {{DB_CREDENTIALS}}; reasons about the task and generates command intent.
  • Command authorization proxy: sees the command intent and the resolved template; approves or rejects, and injects actual credential values.
  • Execution layer: sees the full command with real credentials; executes and logs the result (without credential values).

The agent never sees the real credential value. The human approver sees the command intent — "deploy migration to production database" — without the credentials embedded. The credential is injected at the moment of execution by a component that has appropriate access to the secrets store.

This is credential injection at the execution layer. It's the same principle CI/CD systems apply when they mask secret variables in job logs and expose them only to the steps that need them — applied to AI agent command execution.
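The proxy's template resolution can be sketched with a simple placeholder substitution. The `CAPABILITIES` mapping and `resolve_template` function are illustrative names, not a real API:

```python
import re

# Hypothetical capability-to-secret mapping, held by the proxy and
# populated from a secrets store -- never shared with the agent.
CAPABILITIES = {"DB_CREDENTIALS": "postgresql://admin:s3cr3t@prod-db/app"}

def resolve_template(command_template: str) -> str:
    """Replace {{NAME}} placeholders with real secret values at
    execution time. The agent only ever sees the template form."""
    def _sub(match: re.Match) -> str:
        name = match.group(1)
        if name not in CAPABILITIES:
            raise PermissionError(f"unknown capability: {name}")
        return CAPABILITIES[name]
    return re.sub(r"\{\{(\w+)\}\}", _sub, command_template)

# What the LLM generated (safe to log, show to approvers, keep in context):
template = "psql {{DB_CREDENTIALS}} -c 'SELECT count(*) FROM users'"
# What actually executes (never flows back into the context window):
real_command = resolve_template(template)
```

Raising on unknown capability names doubles as an authorization check: the agent can only reference secrets it has been explicitly granted.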

Practical Controls

Secret-less agent context

Audit what's in your agent's system prompt and initial context. Remove credential values. Replace them with capability descriptions: instead of "your database password is X", say "you have read access to the production database; use the query_db tool". The tool implementation holds the credentials, not the context.

For env vars: don't read them into agent memory unless the agent actually needs the value as a value (rare). Instead, structure your tooling so that env vars are consumed by tool implementations that execute operations directly.
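One way to structure this, sketched with a hypothetical tool factory: the env var is consumed inside the tool implementation, and only a capability description reaches the prompt.

```python
import os

os.environ["DB_URL"] = "postgresql://admin:s3cr3t@prod-db/app"  # stand-in

def make_query_tool():
    """Build the query_db tool. The env var is captured by the tool
    closure and never copied into any prompt text."""
    dsn = os.environ["DB_URL"]  # consumed here, outside the LLM's view
    def query_db(sql: str) -> str:
        # ... connect with `dsn` and execute `sql` (elided) ...
        return "ok"
    description = "query_db(sql): read-only access to the production database"
    return description, query_db

description, query_db = make_query_tool()
system_prompt = f"You have these tools:\n{description}"
print("s3cr3t" in system_prompt)  # → False
```

The system prompt describes the capability; the closure holds the credential.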

Credential injection at the execution layer

Implement tools and command executors that resolve secrets from a secrets manager at call time. The agent passes intent; the tool resolves credentials. This keeps the secrets out of context while still giving the agent the operational capabilities it needs.

If you're running an SSH proxy or command authorization layer, extend it to handle credential injection: maintain a mapping from capability names to secrets, and resolve at execution time rather than passing credentials through the agent.

Scope limiting

Credentials that are scoped narrowly are less dangerous when they do leak. A read-only database credential that can only read specific tables has much lower blast radius than a full admin connection string.

Create agent-specific credentials with minimal necessary permissions. Don't share production admin credentials with AI agents — ever. The agent's credentials should be scoped to what it actually needs, not to what would be convenient.

Rotation

If credentials do appear in context — and some will, despite good architecture — rotation limits the exposure window. A credential that rotates every 24 hours and appears in three days of logs is compromised for at most one day, not indefinitely.

Build rotation into your agent infrastructure from the start. Short-lived credentials generated per-session are better than long-lived credentials generated per-agent. Credentials that expire when the agent session ends limit the window even further.
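A per-session issuer can be sketched in a few lines. The `issue_session_credential` function is illustrative; in practice you would mint a short-lived token from something like AWS STS or a Vault dynamic secret rather than generating one locally:

```python
import secrets
import time

def issue_session_credential(ttl_seconds: int = 3600) -> dict:
    """Hypothetical per-session credential issuer with a TTL."""
    return {
        "token": secrets.token_urlsafe(32),   # fresh value per session
        "expires_at": time.time() + ttl_seconds,
    }

def is_valid(cred: dict) -> bool:
    return time.time() < cred["expires_at"]

cred = issue_session_credential(ttl_seconds=3600)
# Even if this token leaks into logs, it is useless after the TTL expires.
```

The TTL, not the log retention period, now bounds the exposure window.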

Log scrubbing

Add redaction to your logging pipeline. Known secret patterns — AWS key formats, database URL patterns, bearer tokens — can be identified and masked before they're written to persistent storage. This doesn't prevent the LLM from seeing the values, but it limits the persistence of the exposure.

Tools like detect-secrets can run in your log pipeline. They're not a complete solution — they'll miss novel secret formats — but they catch the common cases.
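A redaction filter for Python's standard logging module might look like the sketch below. The three patterns are illustrative; a real pipeline would carry a much larger set:

```python
import logging
import re

# A few common secret shapes; real pipelines use a much larger pattern set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                 # AWS access key IDs
    re.compile(r"(?<=://)[^:@/\s]+:[^@\s]+(?=@)"),   # user:password in URLs
    re.compile(r"(?i)(?<=bearer )[A-Za-z0-9._\-]+"), # bearer tokens
]

class RedactSecrets(logging.Filter):
    """Mask known secret patterns before records reach any handler."""
    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        for pattern in SECRET_PATTERNS:
            msg = pattern.sub("[REDACTED]", msg)
        record.msg, record.args = msg, ()
        return True

logger = logging.getLogger("agent")
logger.addFilter(RedactSecrets())
```

Attaching the filter at the logger (rather than a single handler) masks the value before it fans out to console, file, and aggregation handlers alike.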

The Honest Limitations

Some agent architectures make context-free secrets management genuinely difficult. If your agent needs to construct a full connection string to validate it before using it, the credential has to be in context during validation. If your agent is doing introspection — reading its own configuration to explain what it can do — it will read whatever's there.

Prompt injection remains a hard problem independent of secrets management. Even if your agent has no credentials in context, a sufficiently sophisticated injection might convince it to call a tool that returns credentials. Defense in depth matters: credential injection at the execution layer, scoped credentials, and prompt injection defenses are all complementary, not alternatives.

Model providers are becoming more trustworthy and their data handling more auditable, but "more trustworthy" is not "zero risk". For the most sensitive credentials in your infrastructure, the answer might be: those credentials don't get used by AI agents at all, regardless of architecture. Some access patterns aren't appropriate for autonomous systems.

Finally: this is an area where the tooling is immature. There's no standardized "agent secrets manager" the way there's a standardized Vault interface. The patterns here require deliberate implementation rather than dropping in a library. That's the current state of the ecosystem, and it means this work falls on you.

The Checklist

  • Does your agent's system prompt contain any credential values? If yes, remove them and replace with capability descriptions.
  • Does your agent read env vars and use those values in prompt construction? If yes, restructure so env vars are consumed by tool implementations, not the agent.
  • Do your tools return raw credential values in their responses? If yes, restructure to execute operations directly rather than returning credentials to the agent.
  • Are agent-specific credentials scoped to minimum necessary permissions? If no, create scoped credentials for each agent and capability.
  • Do your agent logs contain plaintext credentials? If yes, add log scrubbing upstream and rotate any credentials already in logs.
  • Do your agent credentials rotate? If no, implement rotation — at minimum per-deployment, ideally per-session.
  • Does your command authorization layer support credential injection? If no, this is the highest-value architecture change you can make.

Credential injection at the execution layer

Expacti's command authorization proxy intercepts agent commands before execution, enabling credential injection that keeps secrets out of the LLM context entirely. The agent describes what it wants to do; the proxy resolves and injects credentials at execution time.
