What Happens Inside an SSH Session Under Human Oversight

You've probably read the pitch: connect your AI agent through Expacti, and a human reviewer sees every command before it executes. But how does that actually work? What happens at the protocol level? What exactly flows over the wire?

This post walks through a real SSH session under human oversight — every step, in order.

The Setup

In a typical Expacti deployment, an AI agent (Claude Code, Codex, a custom Python script — whatever) connects to a production server via SSH. Instead of connecting directly to the server's sshd, it connects to expacti-sshd: an SSH proxy that sits in front of the real host.

AI agent → expacti-sshd → target server
              ↕
         expacti-backend
              ↕
         Reviewer UI / CLI

The reviewer — a human engineer — has a browser tab or terminal open to the Expacti reviewer interface. They'll see every command before it runs.

Step by Step

SSH handshake

The AI agent runs ssh [email protected]. expacti-sshd handles the SSH handshake normally: key exchange, host key verification, client authentication. The agent authenticates with an SSH key or password — same as any SSH client.

Session opened, backend notified

Once the SSH channel is open, expacti-sshd creates a session record in the backend database and opens a WebSocket connection to expacti-backend. The reviewer UI immediately shows a new active session.

PTY allocated, recording starts

If the agent requests a PTY (interactive terminal), expacti-sshd allocates one. All terminal I/O — including escape sequences, cursor movements, colors — is captured in asciicast v2 format for later replay. The reviewer can watch the session live if they want.
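To make the recording format concrete, here is a minimal sketch of an asciicast v2 writer: one JSON header line, then one JSON array per output event. The class name `AsciicastRecorder` and its parameters are illustrative assumptions, not Expacti's actual code.

```python
import io
import json
import time


class AsciicastRecorder:
    """Writes terminal output as asciicast v2 (hypothetical helper, for illustration)."""

    def __init__(self, stream, width=80, height=24, clock=time.monotonic):
        self._stream = stream
        self._clock = clock
        self._start = clock()
        # The first line of an asciicast v2 file is a JSON object describing the terminal.
        header = {"version": 2, "width": width, "height": height}
        stream.write(json.dumps(header) + "\n")

    def record_output(self, data: str) -> None:
        # Each event is [seconds since start, "o" for output, raw data].
        # Escape sequences are kept verbatim so colors and cursor moves replay correctly.
        elapsed = round(self._clock() - self._start, 6)
        self._stream.write(json.dumps([elapsed, "o", data]) + "\n")


buf = io.StringIO()
rec = AsciicastRecorder(buf)
rec.record_output("\x1b[32mdeploy@host\x1b[0m:~$ ")
lines = buf.getvalue().splitlines()
```

Because every event is a self-contained line, a file like this can be streamed to a live viewer while it is still being written.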

Command intercepted

The agent types a command: kubectl delete pod --all -n production. Before it reaches the shell, expacti-sshd intercepts it. This happens at the PTY layer — the proxy parses the input stream, extracts the command, and holds it.
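The hold-until-complete idea can be sketched as a small line buffer: keystrokes accumulate, and nothing is released until a line terminator arrives. `CommandGate` is a hypothetical name, and a real PTY parser would also have to deal with escape sequences, line continuations, and heredocs, which this sketch ignores.

```python
class CommandGate:
    """Buffers keystrokes and yields a command only once a full line is typed."""

    def __init__(self):
        self._buf = []

    def feed(self, data: str):
        """Return complete commands found in `data`; partial input stays buffered."""
        commands = []
        for ch in data:
            if ch in ("\r", "\n"):
                line = "".join(self._buf).strip()
                self._buf.clear()
                if line:
                    commands.append(line)
            elif ch in ("\x7f", "\b"):
                # Backspace/delete: drop the last buffered character, if any.
                if self._buf:
                    self._buf.pop()
            else:
                self._buf.append(ch)
        return commands


gate = CommandGate()
first = gate.feed("kubectl delete pod")
second = gate.feed(" --alx\x7fl -n production\r")
```

Here `first` is empty because no newline has arrived yet, and `second` contains the single corrected command that would be held for review.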

Risk scoring

The backend runs the command through the risk engine. In ~1ms, it assigns a risk score (0–100) based on 14 command categories and 25 contextual modifiers.

kubectl delete pod --all hits: Kubernetes management category (+30), --all flag modifier (+20), production namespace in CWD context (+15), among other contributions. Final score: 82 (Critical).
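An additive scorer of this kind might look like the sketch below. The weight tables contain only the three values named in the example; the key names are made up for illustration and the real tables are not shown in this post. The three listed contributions add up to 65, so this sketch does not reproduce the final score of 82.

```python
# Hypothetical weight tables: only the three values from the example are known.
CATEGORY_WEIGHTS = {"kubernetes_management": 30}
MODIFIER_WEIGHTS = {"all_flag": 20, "production_context": 15}


def risk_score(categories, modifiers):
    """Sum category and modifier weights, clamped to the 0-100 range."""
    total = sum(CATEGORY_WEIGHTS[c] for c in categories)
    total += sum(MODIFIER_WEIGHTS[m] for m in modifiers)
    return max(0, min(100, total))


partial = risk_score(["kubernetes_management"], ["all_flag", "production_context"])
```

Pure table lookups and additions like this are consistent with a scoring step that takes around a millisecond.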

Whitelist check

Before bothering a human, the backend checks the whitelist. If this exact command (or a matching glob/regex pattern) is whitelisted for this org, it auto-approves and the command runs immediately. This is the low-friction path for routine operations.

A command like kubectl get pods is almost certainly whitelisted. kubectl delete pod --all almost certainly isn't.
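As a rough sketch of the three matching modes (exact, glob, regex), using only the Python standard library. The function name and the example patterns are assumptions; the post does not show how Expacti stores or evaluates its whitelist.

```python
import fnmatch
import re


def is_whitelisted(command, exact=(), globs=(), regexes=()):
    """Return True if the command matches any exact entry, glob, or regex."""
    if command in exact:
        return True
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.fnmatch.
    if any(fnmatch.fnmatchcase(command, pattern) for pattern in globs):
        return True
    # fullmatch requires the whole command to match, not just a prefix.
    return any(re.fullmatch(pattern, command) for pattern in regexes)


read_only = is_whitelisted("kubectl get pods", globs=["kubectl get *"])
destructive = is_whitelisted(
    "kubectl delete pod --all -n production", globs=["kubectl get *"]
)
```

One caution with globs: a pattern like `kubectl get *` also matches `kubectl get pods; rm -rf /`, so broad wildcards deserve the same scrutiny as the commands they approve.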

Anomaly detection

If the command isn't whitelisted, the anomaly detector runs 8 rules against it in context. Is this command happening at an unusual hour? Is the risk score at least 3× this session's baseline? Does it follow a pattern of repeated denials with variations? If any rule fires, the command gets an anomaly flag alongside the risk score.
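Three of those rules can be sketched as simple predicates. Only the reason string `risk_spike_vs_baseline` appears in this post; the other reason names, the working-hours window, and the denial threshold are assumptions chosen for illustration.

```python
from datetime import datetime


def anomaly_reasons(risk_score, session_baseline, when, recent_denials,
                    work_hours=(8, 18), denial_threshold=3):
    """Return the names of the rules that fired; an empty list means no anomaly."""
    reasons = []
    # Rule: command issued outside the assumed working-hours window.
    if not (work_hours[0] <= when.hour < work_hours[1]):
        reasons.append("unusual_hour")
    # Rule: score is at least three times the session's baseline.
    if session_baseline > 0 and risk_score >= 3 * session_baseline:
        reasons.append("risk_spike_vs_baseline")
    # Rule: several recent denials in this session (assumed threshold).
    if recent_denials >= denial_threshold:
        reasons.append("repeated_denials")
    return reasons


afternoon = datetime(2026, 3, 20, 14, 0)
spike = anomaly_reasons(82, 20, afternoon, recent_denials=0)
```

Returning a list of reasons rather than a boolean lets the reviewer see why a command was flagged, not just that it was.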

Reviewer notified

The backend pushes a WebSocket message to all connected reviewers:

{
  "type": "pending_command",
  "id": "cmd_01h3...",
  "session_id": "sess_09f2...",
  "command": "kubectl delete pod --all -n production",
  "cwd": "/home/deploy",
  "timestamp": 1774072257,
  "risk_score": 82,
  "anomaly": true,
  "anomaly_reasons": ["risk_spike_vs_baseline"]
}

The reviewer UI shows the command in the pending queue with a red CRITICAL badge and an anomaly indicator. A sound notification fires. The reviewer's terminal buzzes if they're using the CLI.
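A reviewer-side client could decode that message along these lines. The field names come from the JSON above; the `PendingCommand` class and `parse_pending` function are hypothetical, not part of any published Expacti client.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class PendingCommand:
    id: str
    session_id: str
    command: str
    cwd: str
    timestamp: int
    risk_score: int
    anomaly: bool = False
    anomaly_reasons: list = field(default_factory=list)


def parse_pending(raw: str) -> PendingCommand:
    """Decode a pending_command message; reject any other message type."""
    msg = json.loads(raw)
    if msg.get("type") != "pending_command":
        raise ValueError(f"unexpected message type: {msg.get('type')!r}")
    # Ignore unknown keys so newer message versions do not break older clients.
    known = {f.name for f in fields(PendingCommand)}
    return PendingCommand(**{k: v for k, v in msg.items() if k in known})


raw = json.dumps({
    "type": "pending_command", "id": "cmd_01h3...", "session_id": "sess_09f2...",
    "command": "kubectl delete pod --all -n production", "cwd": "/home/deploy",
    "timestamp": 1774072257, "risk_score": 82, "anomaly": True,
    "anomaly_reasons": ["risk_spike_vs_baseline"],
})
pending = parse_pending(raw)
```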

Agent waits

The AI agent is blocked. It typed a command and nothing happened — because the PTY proxy is holding the input. The agent sees its cursor sit there. From its perspective, the command is "running". In reality, it's waiting for a human.

Reviewer decides

The reviewer sees the command, the context, the risk score, the anomaly flag. They can click Deny (keyboard shortcut: d), optionally with a comment ("not during business hours"), or Allow (shortcut: a). In the CLI: expacti deny cmd_01h3... "production namespace, needs change ticket".

Decision delivered

The backend sends the decision back to expacti-sshd via WebSocket:

{"type": "decision", "id": "cmd_01h3...", "allow": false,
 "comment": "production namespace, needs change ticket"}

expacti-sshd receives the decision and unblocks the PTY.
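On the proxy side, matching a decision to the command it unblocks can be sketched with a table of futures keyed by command id. `PendingDecisions` is an assumed name, and this is one plausible shape for the mechanism rather than the actual expacti-sshd implementation.

```python
import json
from concurrent.futures import Future


class PendingDecisions:
    """Tracks held commands and resolves them when a decision message arrives."""

    def __init__(self):
        self._waiting = {}

    def hold(self, command_id):
        """Register a held command and return the future its session waits on."""
        fut = Future()
        self._waiting[command_id] = fut
        return fut

    def deliver(self, raw):
        """Apply a decision message; return True if it resolved a held command."""
        msg = json.loads(raw)
        if msg.get("type") != "decision":
            return False
        fut = self._waiting.pop(msg.get("id"), None)
        if fut is None:
            return False  # unknown id, or the command was already decided
        # Anything other than a literal JSON true counts as a denial.
        fut.set_result((msg.get("allow") is True, msg.get("comment", "")))
        return True


pending = PendingDecisions()
held = pending.hold("cmd_01h3...")
message = ('{"type": "decision", "id": "cmd_01h3...", "allow": false, '
           '"comment": "production namespace, needs change ticket"}')
resolved = pending.deliver(message)
```

Popping the entry on delivery means a replayed or duplicate decision message has no effect.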

Agent informed

If denied: the agent sees an error message in its terminal: expacti: command denied — production namespace, needs change ticket. The command never reaches the server's shell. The agent can decide what to do next — try a different approach, ask for help, or stop.

If approved: the command is forwarded to the target shell via the SSH backend channel. Output streams back to the agent normally.

Audit log written

Either way, the decision is written to the immutable audit log: command, decision, who decided, comment, timestamp, risk score, anomaly flags, and a SHA-256 hash of the previous entry (chain integrity). This record cannot be altered without breaking the chain.
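The chaining idea can be shown in a few lines: each entry stores the SHA-256 hash of the entry before it, so editing any earlier record invalidates every later link. The field name `prev_hash`, the all-zero genesis value, and the JSON serialization are assumptions made for this sketch.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(entry):
    """SHA-256 over a canonical (sorted-key) JSON encoding of the entry."""
    encoded = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev = entry_hash(log[-1]) if log else GENESIS
    entry = dict(record, prev_hash=prev)
    log.append(entry)
    return entry


def verify_chain(log):
    """Recompute every link; return False at the first mismatch."""
    prev = GENESIS
    for entry in log:
        if entry.get("prev_hash") != prev:
            return False
        prev = entry_hash(entry)
    return True


log = []
append_entry(log, {"command": "kubectl get pods", "decision": "allow"})
append_entry(log, {"command": "kubectl delete pod --all", "decision": "deny"})
intact = verify_chain(log)
```

A chain like this only detects changes to entries that have a successor, so the hash of the newest entry would also need to be stored or published somewhere the log's writer cannot change.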

What the Timeout Looks Like

Reviewers aren't always watching. If no decision arrives within the configured timeout (default: 60 seconds), the backend auto-denies the command and writes a system:timeout entry to the audit log. The agent gets an error: expacti: command timed out — no reviewer responded.

This is deliberate. The system fails closed, not open. An AI agent that can't get approval waits, then gets denied. It does not execute unilaterally.

Fail-closed is non-negotiable. If the backend is unreachable, expacti-sshd denies all commands. If no reviewer responds, commands are denied. There is no mode where commands slip through silently.
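Fail-closed behavior comes down to one rule: every path that is not an explicit approval returns a denial. Below is a minimal asyncio sketch of that rule. The `system:timeout` label is taken from the text above, while the function name, the `system:error` label, and the message strings are assumptions.

```python
import asyncio


async def await_decision(decision_future, timeout_s=60.0):
    """Wait for a reviewer decision; deny on timeout or on any backend failure.

    Returns (allowed, comment, decided_by).
    """
    try:
        allow, comment = await asyncio.wait_for(decision_future, timeout=timeout_s)
    except asyncio.TimeoutError:
        return False, "no reviewer responded", "system:timeout"
    except Exception:
        # Connection loss, malformed decision, and so on: deny rather than guess.
        return False, "backend unreachable", "system:error"
    return allow is True, comment, "reviewer"


async def _demo():
    loop = asyncio.get_running_loop()
    unanswered = loop.create_future()  # no reviewer ever resolves this one
    timed_out = await await_decision(unanswered, timeout_s=0.01)

    broken = loop.create_future()
    broken.set_exception(ConnectionError("backend down"))
    errored = await await_decision(broken, timeout_s=0.01)

    answered = loop.create_future()
    answered.set_result((True, ""))
    approved = await await_decision(answered, timeout_s=0.01)
    return timed_out, errored, approved


timed_out, errored, approved = asyncio.run(_demo())
```

The only way to get `True` out of this function is a reviewer decision whose `allow` value is literally true.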

Multi-Party Approval

For high-stakes operations, you can require more than one reviewer. Expacti supports four policy types:

Any — first reviewer to respond wins (default)
AllOf — all named reviewers must approve
AnyOf — at least one from a named set must approve
MinRole — approver must hold at least this role (admin, reviewer)

The most-specific matching policy wins. A policy scoped to kubectl delete * takes precedence over a catch-all. You can require an admin to approve anything touching production namespaces, while routine read commands only need any reviewer.
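The four policy types reduce to four small checks over the set of approvals collected so far. This sketch assumes that admin outranks reviewer and that an unknown policy type should fail closed. It models approvals only, and the `Policy` class is illustrative rather than Expacti's schema.

```python
from dataclasses import dataclass

ROLE_RANK = {"reviewer": 1, "admin": 2}  # assumed ordering of the two roles


@dataclass
class Policy:
    kind: str                       # "Any", "AllOf", "AnyOf", or "MinRole"
    reviewers: frozenset = frozenset()
    min_role: str = "reviewer"


def is_satisfied(policy, approvals):
    """approvals maps reviewer name -> role for everyone who has approved."""
    approved_by = set(approvals)
    if policy.kind == "Any":
        return len(approved_by) > 0
    if policy.kind == "AllOf":
        # An empty reviewer set never satisfies AllOf, to avoid vacuous approval.
        return bool(policy.reviewers) and policy.reviewers <= approved_by
    if policy.kind == "AnyOf":
        return bool(policy.reviewers & approved_by)
    if policy.kind == "MinRole":
        need = ROLE_RANK[policy.min_role]
        return any(ROLE_RANK.get(role, 0) >= need for role in approvals.values())
    return False  # unknown policy kinds fail closed


two_person = Policy("AllOf", frozenset({"alice", "bob"}))
half_done = is_satisfied(two_person, {"alice": "admin"})
complete = is_satisfied(two_person, {"alice": "admin", "bob": "reviewer"})
```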

Latency Profile

Humans are slow. That's the point. But the system overhead itself is minimal:

Risk scoring: ~1ms
Whitelist lookup (SQLite, indexed): ~2ms
WebSocket round-trip (local network): ~5ms
Human decision time: 2–30 seconds for routine commands; longer for ambiguous ones

For whitelisted commands — the majority in a well-tuned setup — the overhead is ~3ms total. Only non-whitelisted commands wait for human judgment. The goal is to whitelist routine operations quickly, so reviewer attention is reserved for the commands that actually matter.

What Gets Recorded

After the session ends (or during it, live), the reviewer can replay the full terminal session. Every keystroke, every output, every escape sequence — rendered in the browser using asciinema-player. No information about what the agent did is lost.

This is not just for audit. It's for debugging. When an AI agent does something unexpected, you can watch exactly what it saw and what it typed, in order, at the original speed or faster.

The Design Choice

Every piece of this architecture reflects one principle: the human is in the loop, not in the cc.

Most security tools notify humans after something happens. Alert fatigue is the outcome. Expacti's model is different: the command doesn't run until a human says it does. The human isn't reviewing a log — they're the gate.

That makes the latency real. It makes the UX matter. It means your whitelist needs to be good. But it also means that when something goes wrong in production, you can answer the hardest question: who approved that?

See it in action

Try the interactive demo — four scenarios, no signup required.