Architecture
Two storage roots, one event protocol, four adapters, one daemon, one opt-in proxy. The whole thing is small on purpose.
Two storage roots, two purposes
| Root | Scope | What lives there |
|---|---|---|
| <repo>/.agent-relay/ | Per repository | Manual sessions, checkpoints, journals: everything relay c / relay handoff / relay watch write today. |
| ~/.relay/ | Per user, machine-wide | Always-on session events, snapshots, config, sockets. |
They never overlap. The always-on daemon never writes inside
.agent-relay/. The existing CLI never writes inside ~/.relay/. They
only meet at handoff time: when you run relay resume, the translator
materialises a small breadcrumb in <repo>/.agent-relay/handoffs/<id>/
pointing back at the global snapshot. The relay handoff output is the
same as always, just with richer context to draw from.
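The separation between the two roots can be pictured as a simple path-ownership check. The following is a minimal sketch, not the project's code: the helper names (`repo_root`, `user_root`, `owns`, `breadcrumb_dir`) are hypothetical, and only the directory names come from the table above.

```python
from pathlib import Path


def repo_root(repo: Path) -> Path:
    """Per-repository root used by the manual CLI."""
    return repo / ".agent-relay"


def user_root(home: Path) -> Path:
    """Per-user, machine-wide root used by the always-on daemon."""
    return home / ".relay"


def owns(root: Path, target: Path) -> bool:
    """True if target lies inside root (compared by path components)."""
    try:
        target.resolve().relative_to(root.resolve())
    except ValueError:
        return False
    return True


def breadcrumb_dir(repo: Path, handoff_id: str) -> Path:
    """Where a handoff breadcrumb would live inside the repository root."""
    return repo_root(repo) / "handoffs" / handoff_id
```

Under these assumptions, a daemon-side writer would check `owns(user_root(home), path)` before writing, and the breadcrumb written at handoff time is the only artifact that refers across the boundary.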
Adapters
              ~/.relay/events/<session-id>.jsonl
                               ▲
                               │ append-only writes
     ┌───────────┬─────────────┴─────────────┬─────────────┐
     │           │                           │             │
┌────┴────┐ ┌────┴────┐                ┌─────┴─────┐ ┌─────┴─────┐
│ Claude  │ │ PTY     │                │ VS Code   │ │ Warp      │
│ Code    │ │ wrap    │                │ extension │ │ MCP       │
│ hook    │ │ (relay) │                │ (Cursor / │ │ server    │
│         │ │         │                │ …)        │ │           │
└─────────┘ └─────────┘                └───────────┘ └───────────┘
The whole protocol is documented in
docs/spec/event-protocol.md.
TL;DR: line-delimited JSON. Each adapter writes one file per session
under ~/.relay/events/. Adapters never need the daemon to be running
to capture; lines pile up on disk until something reads them.
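To make the capture path concrete, here is a minimal sketch of how an adapter might append one event per line. The function name `append_event` and the event fields in the usage note are assumptions for illustration; the real schema is the one in docs/spec/event-protocol.md.

```python
import json
from pathlib import Path


def append_event(events_dir: Path, session_id: str, event: dict) -> Path:
    """Append one event as a single JSON line to <events_dir>/<session_id>.jsonl."""
    events_dir.mkdir(parents=True, exist_ok=True)
    path = events_dir / f"{session_id}.jsonl"
    # json.dumps escapes newlines inside strings, so each event stays on one line.
    line = json.dumps(event, separators=(",", ":"))
    # Mode "a" only adds to the end of the file; earlier lines are never rewritten.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return path
```

A call such as `append_event(Path.home() / ".relay" / "events", "demo", {"type": "rate_limited"})` needs no running daemon: the line simply waits on disk until a reader picks it up.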
Daemon
┌──────────────────────────┐
│       relay daemon       │
│  ─────────────────────   │
│  tailer → registry       │
│     │    (fold events)   │
│     ▼                    │
│  handoff engine          │
│     │                    │
│     ▼                    │
│  snapshot + notify       │
└──────────────────────────┘
The daemon is one small Python process. It:
- Tails every JSONL file under ~/.relay/events/ (poll-based, cheap).
- Folds events into an in-memory session registry.
- Watches for rate_limited events. On match → snapshot → desktop notification → wait for relay resume.
- Exposes a Unix-socket protocol for live tails and queries.
- On Windows it uses a named pipe at \\.\pipe\agent-relay.
Auto-started by launchd (macOS), systemd user units (Linux), or the Startup folder (Windows). See Installation.
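The tail-and-fold loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in tailer.py or session.py: the `Tailer` class, the `fold` function, the registry shape, and the `type` field are hypothetical; only the per-file offsets, the polling approach, and the rate_limited event name come from this page.

```python
from __future__ import annotations

import json
from pathlib import Path


class Tailer:
    """Poll-based tailer that remembers a byte offset per file."""

    def __init__(self, events_dir: Path) -> None:
        self.events_dir = events_dir
        self.offsets: dict[Path, int] = {}

    def poll(self) -> list[tuple[str, dict]]:
        """Return (session_id, event) pairs appended since the last poll."""
        out: list[tuple[str, dict]] = []
        for path in sorted(self.events_dir.glob("*.jsonl")):
            offset = self.offsets.get(path, 0)
            with path.open("rb") as fh:
                fh.seek(offset)
                chunk = fh.read()
            # Consume only up to the last newline; a half-written line
            # is left in place and picked up by a later poll.
            end = chunk.rfind(b"\n") + 1
            self.offsets[path] = offset + end
            for raw in chunk[:end].splitlines():
                try:
                    out.append((path.stem, json.loads(raw)))
                except json.JSONDecodeError:
                    continue  # skip blank or malformed lines
        return out


def fold(registry: dict[str, dict], session_id: str, event: dict) -> None:
    """Fold one event into the in-memory registry (hypothetical shape)."""
    session = registry.setdefault(session_id, {"count": 0, "rate_limited": False})
    session["count"] += 1
    if event.get("type") == "rate_limited":  # "type" is an assumed field name
        session["rate_limited"] = True
```

Tracking byte offsets rather than line counts keeps each poll proportional to the new data only, which is one plausible reading of "poll-based, cheap".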
Proxy mode (opt-in)
                          ┌────────────────────────┐
Your editor / agent ─────▶│ relay proxy (mitmdump) │─────▶ Anthropic
                          │ reads response         │       OpenAI
                          │ headers                │       Google
                          └─────────┬──────────────┘
                                    │
                                    ▼
                     ~/.relay/events/<session>.jsonl
Off by default. When you turn it on you get:
- Lossless rate-limit detection from anthropic-ratelimit-*, x-ratelimit-*, and Google's RESOURCE_EXHAUSTED body: exact retry-after, exact tokens-remaining, before the agent hits a hard 429.
- Early warning when remaining quota drops below 5%.
See Privacy for what the proxy does and doesn't store.
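The early-warning check might look something like the sketch below. It works on a plain dict of response headers and does not use the mitmproxy API. The specific header names are assumptions drawn from the anthropic-ratelimit-* and x-ratelimit-* families named above, the function names are hypothetical, and Google's RESOURCE_EXHAUSTED body is not handled here.

```python
from __future__ import annotations

# (remaining, limit) header pairs, checked in order. Which headers the
# proxy actually reads is an assumption made for this sketch.
HEADER_PAIRS = [
    ("anthropic-ratelimit-tokens-remaining", "anthropic-ratelimit-tokens-limit"),
    ("x-ratelimit-remaining-tokens", "x-ratelimit-limit-tokens"),
]
WARN_FRACTION = 0.05  # the 5% early-warning threshold


def remaining_fraction(headers: dict[str, str]) -> float | None:
    """Return remaining/limit for the first usable header pair, else None."""
    lowered = {key.lower(): value for key, value in headers.items()}
    for remaining_key, limit_key in HEADER_PAIRS:
        try:
            remaining = float(lowered[remaining_key])
            limit = float(lowered[limit_key])
        except (KeyError, ValueError):
            continue  # pair missing or not numeric; try the next one
        if limit > 0:
            return remaining / limit
    return None


def should_warn(headers: dict[str, str]) -> bool:
    """True when the remaining quota is known and below the threshold."""
    fraction = remaining_fraction(headers)
    return fraction is not None and fraction < WARN_FRACTION
```

Returning `None` when no header pair is present keeps "unknown quota" distinct from "zero quota", so a provider that sends no rate-limit headers never triggers a warning.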
The handoff snapshot
When the engine fires, it writes ~/.relay/snapshots/snap-<ts>-<id>.md.
That's a Markdown document with:
- The previous agent's slug, cwd, and stop reason.
- Files touched (deduplicated).
- Last few decisions with reasoning.
- A tail of recent events.
- A short "how to continue" closing block.
Agent-agnostic on purpose — you can paste it into anything that takes a prompt.
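As a rough illustration of the snapshot's shape, the sketch below renders the five parts listed above into Markdown. It is not the code in snapshot.py: the function name, parameter names, section headings, and closing sentence are all assumptions.

```python
from __future__ import annotations


def render_snapshot(
    agent: str,
    cwd: str,
    stop_reason: str,
    files: list[str],
    decisions: list[tuple[str, str]],
    events: list[str],
    keep: int = 3,
) -> str:
    """Render a handoff snapshot as Markdown text (hypothetical layout)."""
    unique_files = list(dict.fromkeys(files))  # de-duplicate, keep first-seen order
    out = [
        "# Handoff snapshot",
        "",
        f"- Agent: {agent}",
        f"- Working directory: {cwd}",
        f"- Stop reason: {stop_reason}",
        "",
        "## Files touched",
        *[f"- {name}" for name in unique_files],
        "",
        "## Recent decisions",
        *[f"- {what} (why: {why})" for what, why in decisions[-keep:]],
        "",
        "## Recent events",
        *[f"- {event}" for event in events[-keep:]],
        "",
        "## How to continue",
        f"Resume the work in {cwd}, starting from the most recent decision above.",
    ]
    return "\n".join(out) + "\n"
```

Because the result is plain Markdown with no agent-specific markup, the same text can be pasted into any tool that accepts a prompt.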
Where to look in the code
| Concern | Module |
|---|---|
| Storage paths + layout version | src/agent_relay/daemon/paths.py |
| Event schema + JSONL append | src/agent_relay/daemon/events.py |
| Tailer (per-file offsets) | src/agent_relay/daemon/tailer.py |
| Session aggregate (fold) | src/agent_relay/daemon/session.py |
| Socket server | src/agent_relay/daemon/server.py |
| Handoff engine | src/agent_relay/daemon/handoff.py |
| Snapshot rendering | src/agent_relay/daemon/snapshot.py |
| Adapters | src/agent_relay/adapters/ |
| Platform service registration | src/agent_relay/daemon/platform/ |