The ralphy CLI binary does the work. The workspace on disk is the state. The agent never reaches past the CLI: no raw curl against a provider, no ad-hoc ffmpeg, no yt-dlp outside a ralphy verb. That rule (AGENTS.md hard invariant #2) is what makes the gen-log, the cost rollup, the asset manifest, and cross-session memory work.
The three parts
The agent. Whatever editor you use (Claude Code, Cursor, Codex, or GitHub Copilot), the agent has AGENTS.md auto-loaded into its system prompt. That file is the routing table from “user intent” to “playbook.” When you ask “make a video about my coffee shop’s pastry,” the agent matches the intent to a row, reads the matched playbook in docs/playbooks/<role>.md, then acts. It does not improvise on topics the playbook covers: the playbook has the model picks, the prompt scaffolds, and the failure modes.
The CLI. ralphy is a single binary built from TypeScript with bun. The verbs live under cli/commands/. The libraries live under cli/lib/. Every verb has the same shape: parse args, read state from the workspace, call a provider through cli/lib/providers/, write artifacts to the workspace, append to the logs. JSON output by default; -p for pretty tables. The CLI is also the only thing on the machine with the API keys — the agent never sees them.
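The shared verb shape can be sketched in a few lines. This is a minimal, hypothetical illustration, not code from cli/commands/: the names parseArgs, runVerb, Workspace, and Provider are assumptions, the workspace is an in-memory stand-in for the directory on disk, and the provider is a synchronous stub (a real provider call would be asynchronous).

```typescript
// Hypothetical sketch of the verb shape described above.
// None of these names are taken from the ralphy codebase.

// In-memory stand-in for the on-disk workspace.
interface Workspace {
  files: Map<string, string>; // path -> artifact contents
  log: string[];              // one JSON string per log entry
}

// Stub provider; a real provider call would be async and hit the network.
type Provider = (prompt: string) => string;

interface VerbArgs {
  projectId: string;
  slot: string;
  prompt: string;
  pretty: boolean;
}

// 1. Parse args: <project> <slot> <prompt words...> with an optional -p flag.
function parseArgs(argv: string[]): VerbArgs {
  const pretty = argv.includes("-p");
  const positional = argv.filter((arg) => arg !== "-p");
  if (positional.length < 3) {
    throw new Error("usage: <project> <slot> <prompt...> [-p]");
  }
  return {
    projectId: String(positional[0]),
    slot: String(positional[1]),
    prompt: positional.slice(2).join(" "),
    pretty,
  };
}

function runVerb(argv: string[], ws: Workspace, provider: Provider): string {
  const args = parseArgs(argv);
  const path = `projects/${args.projectId}/assets/${args.slot}.txt`;

  // 2. Read state from the workspace.
  const replaced = ws.files.has(path);

  // 3. Call the provider.
  const output = provider(args.prompt);

  // 4. Write the artifact (archiving a previous version is omitted here).
  ws.files.set(path, output);

  // 5. Append to the log.
  const entry = { slot: args.slot, status: "ok", replaced };
  ws.log.push(JSON.stringify(entry));

  // JSON by default; -p switches to a tab-separated row.
  return args.pretty
    ? `${entry.slot}\t${entry.status}`
    : JSON.stringify(entry);
}
```

Passing the provider in as a parameter mirrors the idea that verbs reach providers only through one layer, which is also what lets a sketch like this run without API keys.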
The workspace. workspace/ is a gitignored directory in the repo root. Projects, brand definitions, persona definitions, refs, the asset cache, and the generations log all live there. The workspace is treated as canonical-but-wipeable: canonical because every file under workspace/projects/<id>/ is append-only, and wipeable because everything can be regenerated from a brief plus the registry at ~/.ralphy/. See /concepts/workspace for the directory layout.
How they talk
1. Agent reads the playbook
The agent matches your intent against the routing table in AGENTS.md, opens the matched playbook in docs/playbooks/<role>.md via the Read tool, and reads any sub-docs the playbook points at (e.g. docs/playbooks/researcher/yt-dlp.md). The playbook tells the agent which ralphy verb to run with which flags.
2. Agent invokes a ralphy verb
The agent runs ralphy <verb> <args> (or bun run ralph -- <verb> in development). The verb is a TypeScript file in cli/commands/. The CLI parses args, picks the default model (the defaults documented in MODELS.md) or honors --model, and calls the provider through cli/lib/providers/media.ts or cli/lib/providers/llm.ts.
3. ralphy mutates the workspace and writes logs
The verb writes the produced file under workspace/projects/<id>/assets/, appends an entry to workspace/projects/<id>/logs/generations.jsonl with {provider, endpoint, kind, slot, input, output, status, latency_ms, cost_usd}, and updates workspace/projects/<id>/asset-manifest.json to point at the new slot version. If a previous version of the slot exists, the old file becomes scene-03.v1.png and the new one is scene-03.png (auto-archived since commit 753d2f7).
The one strict rule
ralphy is the only entry-point for model calls, ffmpeg recipes, yt-dlp pulls, and project mutations. Reaching for bunx tsx against a TS file, curl against any provider API, or ffmpeg ad-hoc: STOP. Either there is a ralphy verb for it (check the playbook’s ## CLI cookbook section), or the operation is not yet covered, in which case propose adding the verb to cli/commands/ and stop. Never paste raw API code into a project. (AGENTS.md hard invariant #2)
The reason this rule exists is consequence, not aesthetic. Four downstream systems all depend on every model call flowing through ralphy:
- The gen-log at workspace/projects/<id>/logs/generations.jsonl. If the agent runs curl against fal.ai directly, the call is invisible. The next session’s agent will not see it, the cost rollup will under-count, and the postmortem will be wrong.
- The asset manifest at workspace/projects/<id>/asset-manifest.json. If the agent writes a file under assets/ without going through the verb, the manifest gets out of sync with disk. Compositions then reference a slot that does not exist, or a slot that points at the wrong file.
- The quality gates. ralphy generate image runs scoreImage on the output. ralphy generate video runs scoreVideo. A direct provider call skips the gate and ships a known-bad asset into the cut.
- The append-only contract. The verbs know to archive <slot>.<ext> → <slot>.v1.<ext> before writing the new file. A direct write overwrites the previous version, which is exactly the failure mode invariant #13 exists to prevent.
If you catch yourself thinking “there is no ralphy verb for this, I will just shell out,” that is the bug. The right move is to read the playbook’s ## CLI cookbook section, look for an existing verb, and if there genuinely is no verb, propose adding one to cli/commands/. Most of the verbs in the CLI started as exactly that: a postmortem flagged a missing one.
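The append-only contract from the list above can be illustrated as a small planning function. This is a sketch under stated assumptions, not the ralphy implementation: the names splitSlot, archiveName, and planSlotWrite are hypothetical, and the behavior beyond the first archive (v2, v3, and so on) is a guess, since this page only documents the <slot>.v1.<ext> case.

```typescript
// Hypothetical sketch of append-only slot archiving; not the ralphy implementation.

// Split "scene-03.png" into slot "scene-03" and extension ".png".
function splitSlot(file: string): { slot: string; ext: string } {
  const dot = file.lastIndexOf(".");
  return dot <= 0
    ? { slot: file, ext: "" }
    : { slot: file.slice(0, dot), ext: file.slice(dot) };
}

// Next free archive name: <slot>.v1.<ext>, then (assumed) v2, v3, ...
function archiveName(file: string, existing: string[]): string {
  const { slot, ext } = splitSlot(file);
  let version = 1;
  while (existing.includes(`${slot}.v${version}${ext}`)) {
    version += 1;
  }
  return `${slot}.v${version}${ext}`;
}

type FileOp =
  | { op: "rename"; from: string; to: string }
  | { op: "write"; path: string };

// Plan the operations for writing a slot without losing the previous version.
function planSlotWrite(file: string, existing: string[]): FileOp[] {
  const ops: FileOp[] = [];
  if (existing.includes(file)) {
    // Archive first, so the write below never overwrites anything.
    ops.push({ op: "rename", from: file, to: archiveName(file, existing) });
  }
  ops.push({ op: "write", path: file });
  return ops;
}
```

For a directory that already holds scene-03.png, planSlotWrite("scene-03.png", ["scene-03.png"]) plans a rename to scene-03.v1.png followed by the write. A direct write skips the rename step, which is the overwrite the contract is meant to rule out.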
What the agent sees vs. what the CLI sees
The agent has no API keys and no provider knowledge. It sees:
- The repo (AGENTS.md, CLAUDE.md, MODELS.md, CLI.md, docs/playbooks/).
- The workspace state (workspace/projects/<id>/...).
- The output of ralphy verbs (JSON to stdout).
The CLI has the API keys (such as OPENROUTER_API_KEY, ELEVENLABS_API_KEY) and all provider knowledge. It reads:
- ~/.ralphy/config.json for keys and registry.
- The workspace for project state.
MODELS.md is a documentation source: the CLI does not parse it. Default model picks live in cli/lib/providers/media.ts and are kept in sync with MODELS.md by hand.
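Because the defaults live in code rather than in MODELS.md, model resolution reduces to a lookup with an optional override. The following is a minimal sketch, assuming a per-kind defaults table and the --model flag mentioned earlier; DEFAULT_MODELS, readModelFlag, resolveModel, and the model ids are placeholders, not the contents of cli/lib/providers/media.ts.

```typescript
// Hypothetical sketch: default model picks live in code, --model overrides them.
// The model ids below are placeholders, not real entries from MODELS.md.

type Kind = "image" | "video";

const DEFAULT_MODELS: Record<Kind, string> = {
  image: "example/image-model",
  video: "example/video-model",
};

// Return the value following --model, or undefined if the flag is absent.
function readModelFlag(argv: string[]): string | undefined {
  const i = argv.indexOf("--model");
  if (i === -1) return undefined;
  const value = argv[i + 1];
  if (value === undefined || value.startsWith("-")) {
    throw new Error("--model requires a value");
  }
  return value;
}

// An explicit --model wins; otherwise fall back to the in-code default.
function resolveModel(kind: Kind, argv: string[]): string {
  return readModelFlag(argv) ?? DEFAULT_MODELS[kind];
}
```

With a table like this, a change to a default is a code change, and MODELS.md has to be updated alongside it for the documentation to stay accurate.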
The five mandatory reads at session start
docs/playbooks/meta.md names them:
1. AGENTS.md: auto-loaded by CLAUDE.md.
2. MODELS.md: checked before every model call.
3. CLI.md: verb / flag reference cheatsheet.
4. The closest sibling postmortem under workspace/projects/<id>/postmortem/02-lessons.md. The postmortems are where the high-density “what went wrong, what to do instead” content lives.
5. The matched playbook from AGENTS.md routing: read fully, then act.
What the workspace looks like during a run
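A project mid-flight has roughly this shape: under workspace/projects/<id>/ sit assets/, logs/generations.jsonl, asset-manifest.json, and postmortem/ (the paths named on this page). Every file here has a writer in cli/commands/ and a schema. See /concepts/projects for the lifecycle, /concepts/generation-log for the JSONL schemas, and /concepts/workspace for the full directory layout including .ralph/ (brands, personas, refs, templates, asset cache).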
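To make the role of logs/generations.jsonl concrete, a cost rollup can be written as a fold over its lines. This is a hedged sketch, not the ralphy rollup: the field names follow the entry shape listed on this page, but the field types and the names GenLogEntry, parseGenLog, and rollupCost are assumptions.

```typescript
// Hypothetical sketch of a cost rollup over generations.jsonl content.
// Field names follow the entry shape listed on this page; the types are assumptions.

interface GenLogEntry {
  provider: string;
  endpoint: string;
  kind: string;
  slot: string;
  input: unknown;
  output: unknown;
  status: string;
  latency_ms: number;
  cost_usd: number;
}

// Parse JSONL text: one JSON object per non-empty line.
function parseGenLog(jsonl: string): GenLogEntry[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as GenLogEntry);
}

// Sum cost_usd per provider.
function rollupCost(entries: GenLogEntry[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const entry of entries) {
    const soFar = totals.get(entry.provider) ?? 0;
    totals.set(entry.provider, soFar + entry.cost_usd);
  }
  return totals;
}
```

A rollup like this can only count what was logged, so any provider call made outside a ralphy verb is missing from the total.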
Related
- Workspace — the directory layout in detail
- Projects — the per-project layout and lifecycle
- Generation log — the append-only contract and the JSONL schemas
- Playbooks and skills — how the agent decides what to do
- AGENTS.md — the routing contract and the 13 hard invariants