MODELS.md at the repo root is the single source of truth for which models Ralphy uses, what each one costs, and what’s known to break. The agent reads it before every model call (AGENTS invariant #6) because Claude’s training is stale and OpenRouter versions drift silently. This page documents the file’s structure so contributors can add entries that read cleanly to both humans and the agent.
File shape
MODELS.md is plain Markdown. There is no separate machine schema — the file is the schema. Sections are stable and ordered:
- A one-sentence statement of the endpoint contract (which file in `cli/lib/providers/` handles the call).
- A matrix table of supported models, one row per model.
- A “When to pick which” decision table.
- A “Lessons” or “Discovered breakage” numbered list, postmortem-cited.
- An `Avoid:` bullet list.
The opening contract
The file’s opener pins the two-key rule and points at the freshness gate. Its `Last reviewed` line is the freshness signal. The session-start meta playbook checks this on the first call (docs/playbooks/meta.md rule 2). Bump the date in the same commit as any factual edit. Don’t bump without an edit; that defeats the gate.
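As a rough illustration of the freshness gate, the sketch below reads the date and decides whether the file is stale. It assumes the opener carries a line of the form `Last reviewed: YYYY-MM-DD` and a 30-day threshold; both are assumptions, and `lastReviewed` and `isStale` are hypothetical names, not functions from the Ralphy codebase.

```typescript
// Hypothetical helpers; the real check lives in the session-start playbook.
// Assumption: the opener contains a line like "Last reviewed: 2026-05-19".
function lastReviewed(markdown: string): Date | null {
  const match = /Last reviewed:\s*(\d{4}-\d{2}-\d{2})/.exec(markdown);
  if (!match) return null;
  const date = new Date(`${match[1]}T00:00:00Z`);
  return Number.isNaN(date.getTime()) ? null : date;
}

// Assumption: "stale" means older than maxAgeDays (30 by default).
function isStale(markdown: string, now: Date, maxAgeDays = 30): boolean {
  const reviewed = lastReviewed(markdown);
  if (reviewed === null) return true; // a missing or unparseable date counts as stale
  const ageDays = (now.getTime() - reviewed.getTime()) / 86400000;
  return ageDays > maxAgeDays;
}
```

Treating a missing date as stale keeps the gate conservative: a malformed opener forces a refresh rather than silently passing.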
Per-modality matrix tables
Every modality section has a table with the same column order. For image, the canonical columns are:
- Use case: one short phrase. The first row’s use case is always `**Default — <bucket>**` (bold). Subsequent rows narrow.
- Model: full OpenRouter path in backticks (`provider/model-id`). Never shorten to “Kling” inside the cell.
- Price: leading `~` for ballpark, no `~` once empirically verified. Format `~$0.20 / image`, `$0.14 / sec`, `subscription` for ElevenLabs.
- Why: one sentence. Lead with the verb. No marketing language.
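The column rules above can be sketched as a small row renderer. Everything here is illustrative: `formatPrice`, `renderRow`, and the `MatrixRow` type are hypothetical names, and `provider/model-id` is a placeholder rather than a real catalog entry.

```typescript
type PriceUnit = "image" | "sec";

// Hypothetical formatter for the Price column: "~" marks an unverified ballpark.
function formatPrice(usd: number, unit: PriceUnit, verified: boolean): string {
  const amount = `$${usd.toFixed(2)} / ${unit}`;
  return verified ? amount : `~${amount}`;
}

interface MatrixRow {
  useCase: string; // one short phrase
  model: string;   // full OpenRouter path, e.g. "provider/model-id"
  price: string;   // output of formatPrice, or "subscription"
  why: string;     // one sentence, leading with the verb
}

// Renders one Markdown table row, wrapping the model path in backticks.
function renderRow(row: MatrixRow): string {
  return `| ${row.useCase} | \`${row.model}\` | ${row.price} | ${row.why} |`;
}
```

For example, `formatPrice(0.2, "image", false)` yields `~$0.20 / image`, and the same call with `verified` set to `true` drops the tilde.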
Video has an extra matrix
Video gets a second table, the live catalog snapshot, because the OR catalog drives runtime validation in `cli/commands/generate.ts` → `validateVideoParams()`. Its columns:
- Durations (s): list discrete values (`6, 10`) or a closed range (`3-15`). Match the model’s `supported_durations` array from `ralphy models list`.
- Resolutions: comma-list, lowest first. `720p`, `1080p`, `4K` are the canonical labels.
- Aspects: comma-list of `W:H` strings. `7 aspects` is shorthand for “all seven supported aspects”; only use it for seedance.
- Frame anchors: `first only` or `first + last`. Drives whether `--last-frame` is legal on this model.
- $/sec billed: rate with `✓` when verified against actual OR billing. Without `✓` it’s a ballpark from the catalog; verify on first use and add the tick.
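To show how the Durations and Frame anchors cells might be read, here is a minimal sketch. It is not the actual `validateVideoParams()` implementation; `parseDurations` and `lastFrameAllowed` are hypothetical helpers, and the sketch assumes a closed range means every whole second between its endpoints.

```typescript
// Hypothetical parser for a "Durations (s)" cell such as "6, 10" or "3-15".
// Assumption: a closed range expands to every integer second it covers.
function parseDurations(cell: string): number[] {
  const range = /^\s*(\d+)\s*-\s*(\d+)\s*$/.exec(cell);
  if (range) {
    const lo = Number(range[1]);
    const hi = Number(range[2]);
    if (lo > hi) throw new Error(`bad range: ${cell}`);
    const out: number[] = [];
    for (let s = lo; s <= hi; s++) out.push(s);
    return out;
  }
  return cell.split(",").map((part) => {
    const n = Number(part.trim());
    if (!Number.isInteger(n) || n <= 0) throw new Error(`bad duration: ${part}`);
    return n;
  });
}

type FrameAnchors = "first only" | "first + last";

// --last-frame is only legal when the model accepts a last-frame anchor.
function lastFrameAllowed(anchors: FrameAnchors): boolean {
  return anchors === "first + last";
}
```

A caller could then reject a requested duration with `parseDurations(cell).includes(requested)` before sending the job.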
Decision table
Every modality has a “When to pick which” table directly below the matrix. Keep it 4-8 rows. Each row is one bolded user need plus one model name. No edge cases; those go in the lessons section.
Lessons / discovered breakage
The numbered list under “Lessons from this session” or “Discovered breakage” is the most valuable section in the file. Each entry has a fixed shape:
- Lead sentence in bold. Names the model and the surprising behaviour.
- Body. What you observed, with concrete prompt fragments.
- `Fix:` followed by the workaround. Always present; if there is no fix yet, write `Fix: TBD — open issue #N`.
- Postmortem cite at end of paragraph (`Postmortem: glitter-cream.`). The agent uses this to pull the original session if it needs more context.
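The fixed shape lends itself to a quick lint. The sketch below checks a single lesson entry for the three mechanical parts (bold lead, `Fix:`, trailing postmortem cite); `checkLesson` is a hypothetical helper and no such lint is known to exist in the repo.

```typescript
interface LessonCheck {
  boldLead: boolean;         // entry starts with a **bold** sentence
  hasFix: boolean;           // a "Fix:" marker is present
  postmortem: string | null; // slug from the trailing "Postmortem: <slug>." cite
}

// Hypothetical lint for one numbered lesson, passed in as a single paragraph.
function checkLesson(entry: string): LessonCheck {
  const text = entry.trim().replace(/^\d+\.\s*/, ""); // drop the list number
  const cite = /Postmortem:\s*([\w-]+)\.?\s*$/.exec(text);
  return {
    boldLead: /^\*\*[^*]+\*\*/.test(text),
    hasFix: /\bFix:/.test(text),
    postmortem: cite?.[1] ?? null,
  };
}
```

The body text itself is not checked; whether it contains concrete prompt fragments is left to code review.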
Tried-and-dropped table
The cross-reference table at the bottom of the video section is the bridge between “this model” and “this postmortem”. Entries are annotated with a dated note (e.g. `— mitigated 2026-05-19 by …`) so future agents see the history.
Adding a new entry
- Confirm the model is live on OpenRouter. Run `ralphy models list` (24h cached, refresh with `--refresh`) and grep for the model id.
- Pick the section. Image / video / voice / music / transcription / LLM. If the modality is new, propose a new section in a separate PR.
- Decide the use case. What problem does it solve that the current default does not? If the answer is “none”, the entry does not go in.
- Insert a row in the matrix table. Use the column shape above. Price is `~$X.XX / unit` until you have billed it once; then drop the `~`.
- Add a “When to pick which” line if the model owns a distinct niche.
- If the model behaves surprisingly, add a numbered lesson with the postmortem cite.
- Bump the `Last reviewed` date at the top of the file in the same commit.
MODELS.md is human-curated prose. The discipline is enforced by code review and the cross-reference checks in the providers test suite (any model id mentioned in cli/lib/providers/media.ts should also appear in MODELS.md).
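A cross-reference check of this kind could look roughly like the sketch below. It is not the actual providers test: `modelIdsIn` and `missingFromDocs` are hypothetical names, and the sketch assumes model ids appear in the provider source as quoted `provider/model-id` string literals, which may produce false positives for other slash-separated strings.

```typescript
// Assumption: model ids are quoted literals shaped like "provider/model-id".
function modelIdsIn(source: string): string[] {
  const pattern = /["'`]([a-z0-9][a-z0-9-]*\/[a-z0-9][a-z0-9._:-]*)["'`]/g;
  const ids: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const id = match[1];
    if (id && ids.indexOf(id) === -1) ids.push(id); // keep each id once
  }
  return ids.sort();
}

// Returns every model id used in the provider source but absent from MODELS.md.
function missingFromDocs(providerSource: string, modelsMd: string): string[] {
  return modelIdsIn(providerSource).filter((id) => modelsMd.indexOf(id) === -1);
}
```

A test would read both files, call `missingFromDocs`, and fail with the returned list when it is non-empty.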
Lifecycle
- On the first session in a new chat: check `Last reviewed`. Refresh if stale.
- After every failure mode on a new model: add it to “Avoid” / “Lessons” with the reason and a postmortem cite.
- When you change a default in a skill or script: sync it here in the same commit.
- When you add a verb or flag to `ralphy generate`: sync the price / param notes.
- At least once a month: re-check OR catalog drift, bump the date.
Why one file, not a JSON catalog
MODELS.md is prose because the agent reads it as prose. A JSON catalog would force the agent to render it back into prose at decision time, costing tokens and losing the “why” context. The OR catalog is the JSON source-of-truth for parameter validation (supported_durations, supported_resolutions); MODELS.md is the prose layer on top — the rationale, the lessons, the postmortem cross-references.
If you need machine-readable data, parse the matrix tables — the columns are stable.
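Parsing the matrix tables can be sketched as follows. This is a minimal pipe-table reader under stated assumptions, not a full Markdown parser: `parseTables` and `splitCells` are hypothetical helpers, each table is assumed to have a header row followed by a separator row, and escaped pipes inside cells are not handled.

```typescript
type Row = Record<string, string>;

// Splits one pipe-table line into trimmed cells, dropping the outer pipes.
function splitCells(line: string): string[] {
  return line.trim().replace(/^\|/, "").replace(/\|$/, "").split("|").map((c) => c.trim());
}

// Returns one array of rows per table, keyed by the header cell text.
function parseTables(markdown: string): Row[][] {
  const tables: Row[][] = [];
  let block: string[] = [];
  const flush = (): void => {
    // A table needs a header row, a separator row, and at least one data row.
    if (block.length >= 3 && /^[\s|:-]+$/.test(block[1] ?? "")) {
      const header = splitCells(block[0] ?? "");
      tables.push(block.slice(2).map((line) => {
        const cells = splitCells(line);
        const row: Row = {};
        header.forEach((name, i) => { row[name] = cells[i] ?? ""; });
        return row;
      }));
    }
    block = [];
  };
  for (const line of markdown.split("\n")) {
    if (line.trim().startsWith("|")) block.push(line);
    else flush();
  }
  flush(); // handle a table that ends the file
  return tables;
}
```

Because rows are keyed by header text, a consumer can read `row["Model"]` or `row["Price"]` without depending on column position.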
Related
- CLI: `ralphy models` for live OR catalog access.
- `MODELS.md`: the file itself.
- `AGENTS.md`: invariant #6 (read MODELS.md before every model call).
- `cli/lib/providers/media.ts`: the single gateway for every media call.