Batches - Ralphy

The rule in the AGENTS invariant and the producer playbook is simple: N ≥ 3 is always a batch. You don’t run a for loop over ralphy template use by hand; you use ralphy batch. The verb exists for four reasons that don’t apply when you generate one video at a time: cost deduplication (the master shots get generated once, not N times), parallel render with bounded concurrency, a structured cost rollup at the end so you know what 10 variants actually cost, and failure recovery that keeps one failed project from killing the whole run.

What a batch is

A batch is a set of projects spawned from a single template plus a list of per-variant tweaks. Each project is a real project — own ID, own folder under .ralphy/workspaces/default/projects/, own logs, own manifest, own final render. The batch is the wrapper that creates them, runs them, and reports on them. The .ralphy/workspaces/default/batches/<batch-id>/state.json file tracks the in-flight state across the projects in the batch. The batch ID format is the standard {context}-{NNN}, same as project IDs.

The two entry points

Interactive, from a template

The simplest path. Pick a template, ask for N variants, let Ralphy spawn the projects:

ralphy batch create \
  --name "syrup-launch-batch" \
  --template ugc-selfie-product-review \
  --variations ./variations.json \
  --concurrency 3

variations.json is a JSON array, one object per variant. The object’s keys are the slots or template parameters you want to vary:

[
  { "brief": "Anna shows the product, kitchen counter, autumn light" },
  { "brief": "Anna shows the product, outdoor cafe, golden hour" },
  { "brief": "Anna shows the product, home office, soft window light" },
  { "brief": "Anna shows the product, gym after workout, energetic" },
  { "brief": "Anna shows the product, bookstore corner, cozy reading vibe" }
]

Each entry becomes its own project. Common assets (the persona master, the product master) are generated once and shared via --ref on every project’s scene gens — the cost deduplication that makes batch worth using.
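If the variants differ only in setting, the array can be generated rather than typed by hand. The following is a minimal sketch, assuming the only key you vary is brief as in the example above; the helper name build_variations and the SETTINGS list are illustrative and not part of Ralphy.

```python
import json

# Settings copied from the example above; SETTINGS is a hypothetical name.
SETTINGS = [
    "kitchen counter, autumn light",
    "outdoor cafe, golden hour",
    "home office, soft window light",
    "gym after workout, energetic",
    "bookstore corner, cozy reading vibe",
]


def build_variations(settings, subject="Anna shows the product"):
    """Return one {"brief": ...} object per setting (hypothetical helper)."""
    return [{"brief": f"{subject}, {setting}"} for setting in settings]


variations = build_variations(SETTINGS)
text = json.dumps(variations, indent=2, ensure_ascii=False)

# Save this output as variations.json and pass it to --variations.
print(text)
```

Redirecting the printed output to variations.json gives a file with the same shape as the hand-written one.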

Vary axis, from a base project

When you already have a successful single video and want to ship variants on a specific axis (the hook, the music, the persona, the aspect ratio):

ralphy batch vary \
  --base syrup-001 \
  --axis hook \
  --variants 5 \
  --variants-file ./hook-variants.json

--axis is one of the named axes the template supports (hook, body, cta, music, etc.) per the typed Scene[] schema from the scenarist playbook. The --variants-file carries N objects with the swap values. --dry-run previews what would be created without writing.
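Since --variants and the variants file both express N, a quick pre-flight check can catch a mismatch before anything is created. This is a sketch only: it assumes the file is a JSON array of objects (as variations.json is), and the "hook" key in the sample is a made-up placeholder, not a documented schema.

```python
import json


def check_variants(raw_json: str, expected: int) -> list:
    """Parse a variants file body and confirm it holds `expected` objects."""
    data = json.loads(raw_json)
    if not isinstance(data, list):
        raise ValueError("variants file must be a JSON array")
    if len(data) != expected:
        raise ValueError(f"expected {expected} variants, found {len(data)}")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"variant {index} is not a JSON object")
    return data


# Hypothetical file body; the "hook" key is an assumption for illustration.
sample = '[{"hook": "a"}, {"hook": "b"}, {"hook": "c"}, {"hook": "d"}, {"hook": "e"}]'
variants = check_variants(sample, expected=5)
print(f"{len(variants)} variants look well-formed")
```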

Concurrency and parallel render

--concurrency controls how many projects run in parallel. The default is 3; bump it if you have headroom and a stable network. Per-model concurrency caps still apply — openai/gpt-5.4-image-2 is capped at 1 concurrent call per OpenRouter key, so batch fan-out for that model serializes regardless of your --concurrency setting. google/gemini-3-pro-image-preview tolerates 4+ in parallel. The batch state machine tracks which projects are pending, in-flight, complete, or failed:

ralphy batch status syrup-launch-batch-001

{
  "id": "syrup-launch-batch-001",
  "template": "ugc-selfie-product-review",
  "concurrency": 3,
  "projects": [
    { "id": "syrup-launch-001", "status": "complete", "costUsd": 2.84, "render": "final.mp4" },
    { "id": "syrup-launch-002", "status": "complete", "costUsd": 2.71, "render": "final.mp4" },
    { "id": "syrup-launch-003", "status": "in-flight", "stage": "scene-03-vid" },
    { "id": "syrup-launch-004", "status": "pending" },
    { "id": "syrup-launch-005", "status": "pending" }
  ],
  "totals": { "complete": 2, "in-flight": 1, "pending": 2, "failed": 0, "costUsdSoFar": 5.55 }
}
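The totals block can be re-derived from the projects array, which is handy when you post-process the status output yourself. A minimal sketch, assuming the JSON shape shown above and treating a missing costUsd as zero; the summarize helper is illustrative, not a Ralphy API.

```python
import json
from collections import Counter

# Status sample copied from the output above.
STATUS_JSON = """
{
  "id": "syrup-launch-batch-001",
  "projects": [
    { "id": "syrup-launch-001", "status": "complete", "costUsd": 2.84 },
    { "id": "syrup-launch-002", "status": "complete", "costUsd": 2.71 },
    { "id": "syrup-launch-003", "status": "in-flight", "stage": "scene-03-vid" },
    { "id": "syrup-launch-004", "status": "pending" },
    { "id": "syrup-launch-005", "status": "pending" }
  ]
}
"""


def summarize(status: dict) -> dict:
    """Count projects per status and sum the cost reported so far."""
    projects = status["projects"]
    counts = Counter(project["status"] for project in projects)
    cost = round(sum(project.get("costUsd", 0.0) for project in projects), 2)
    return {"counts": dict(counts), "costUsdSoFar": cost}


summary = summarize(json.loads(STATUS_JSON))
print(summary)
```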

The cost rollup

When the batch finishes, the rollup lands as part of the status output and as a markdown file at .ralphy/workspaces/default/batches/<batch-id>/COST_ROLLUP.md. It breaks down by project and by modality (image, video, voiceover, music, render) so you can see where the spend went:

syrup-launch-batch-001 — cost rollup
====================================
Total: $20.87 across 5 projects (avg $4.17/project)

Per modality:
  video:     $9.60 (5 × kling-v3.0-pro × 5s)
  image:     $9.00 (60 × gemini-3-pro-image-preview @ ~$0.15)
  voiceover: $1.50 (5 × elevenlabs/eleven_multilingual_v2)
  music:     $0.75 (5 × elevenlabs/music × 30s)
  render:    $0.02 (5 × hyperframes)

Per project:
  syrup-launch-001  $4.16  complete  → render/final.mp4
  syrup-launch-002  $4.03  complete  → render/final.mp4
  syrup-launch-003  $4.31  complete  → render/final.mp4
  syrup-launch-004  $4.20  complete  → render/final.mp4
  syrup-launch-005  $4.17  complete  → render/final.mp4

The same numbers are available structured via ralphy batch show <id> --json | jq.
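The two breakdowns should always sum to the same total. The sketch below re-checks the example rollup above using Decimal to avoid float rounding; the figures are copied from that example, and the variable names are illustrative.

```python
from decimal import Decimal

# Figures copied from the example rollup above.
per_modality = {
    "video": Decimal("9.60"),
    "image": Decimal("9.00"),
    "voiceover": Decimal("1.50"),
    "music": Decimal("0.75"),
    "render": Decimal("0.02"),
}
per_project = {
    "syrup-launch-001": Decimal("4.16"),
    "syrup-launch-002": Decimal("4.03"),
    "syrup-launch-003": Decimal("4.31"),
    "syrup-launch-004": Decimal("4.20"),
    "syrup-launch-005": Decimal("4.17"),
}

total_by_modality = sum(per_modality.values())
total_by_project = sum(per_project.values())
average = (total_by_project / len(per_project)).quantize(Decimal("0.01"))

# Both views of the spend must agree, otherwise the rollup is inconsistent.
assert total_by_modality == total_by_project
print(f"Total: ${total_by_project} (avg ${average}/project)")
```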

The post-batch postmortem

When a batch completes, the producer playbook auto-generates a POSTMORTEM.md at .ralphy/workspaces/default/batches/<batch-id>/POSTMORTEM.md (or invokes the /postmortem skill on a long batch). The postmortem captures lessons learned, model picks that worked or didn’t, the cost rollup, and any CLI gaps the agent had to work around. That is the design: the next batch starts at a higher skill level.

You can trigger a postmortem manually any time after a batch with /postmortem in chat. The skill writes six structured files under .ralphy/workspaces/default/batches/<batch-id>/postmortem/ covering chronological history, lessons, bugs, cost rollup, and workflow fixes. Details are in the /postmortem skill.

Perf target

The producer playbook’s batch target is ≤ 25 minutes wall-clock for 10 videos (docs/perf-targets.md). If your batch is on track to run more than 50% over that (38+ minutes for 10), the producer playbook says to report before continuing. The usual cause is a stuck job, a saturated per-model concurrency cap, or a missing asset that’s blocking the pipeline. Calculate ETA before kickoff with --dry-run:

ralphy batch create --name "test" --template ugc-selfie-product-review \
  --variations ./variations.json --concurrency 3 --dry-run

The output prints the per-project plan, the cost estimate per modality, and the wall-clock estimate based on past gen-log latency for the chosen models.

Sharing a batch

A batch’s projects are normal projects: they can be exported via ralphy profile export, shared via PR to the ralphy-assets companion repo as a worked example, or extracted into a new template via ralphy template create --from-project <id> after picking the winning variant. If you want a teammate to see exactly what the batch produced (without sharing 50MB of renders), ralphy profile export <nick> defaults to skipping renders to keep PRs small. Pass --include-renders if you want to share the mp4s too.

Why the verb exists

You could, in principle, write a shell for loop that calls ralphy template use and ralphy render N times. You shouldn’t, and the producer playbook is explicit about it. The reasons:

Cost dedup. The persona master shot gets generated once, not N times. Same for the location plate. That’s 30%+ savings on most multi-scene batches.
Bounded concurrency. The batch runner respects per-model concurrency caps; a hand-rolled loop hammers OpenRouter and gets rate-limited.
Structured output. The cost rollup, the postmortem, the per-modality breakdown — all generated for free by the batch wrapper. A shell loop gives you N detached projects with no aggregated view.
Failure recovery. A failed project in the middle of a batch doesn’t kill the whole run; the runner marks it failed, moves on, and the rollup shows it. A shell loop just exits.
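To make the bounded-concurrency and failure-recovery points concrete, here is a generic asyncio sketch of the pattern. It is not Ralphy’s implementation: run_project is a stand-in that simulates work, and the failure of project 003 is forced for illustration.

```python
import asyncio


async def run_project(project_id: str) -> str:
    """Stand-in for a real project run; the id ending in 003 fails on purpose."""
    await asyncio.sleep(0.01)
    if project_id.endswith("003"):
        raise RuntimeError("simulated render failure")
    return "complete"


async def run_batch(project_ids, concurrency=3):
    """Run all projects with at most `concurrency` in flight at once."""
    semaphore = asyncio.Semaphore(concurrency)
    active = 0
    peak = 0

    async def guarded(project_id):
        nonlocal active, peak
        async with semaphore:
            active += 1
            peak = max(peak, active)
            try:
                return project_id, await run_project(project_id)
            except Exception as exc:
                # Record the failure and let the other projects continue.
                return project_id, f"failed: {exc}"
            finally:
                active -= 1

    results = await asyncio.gather(*(guarded(pid) for pid in project_ids))
    return dict(results), peak


ids = [f"syrup-launch-{n:03d}" for n in range(1, 6)]
statuses, peak = asyncio.run(run_batch(ids, concurrency=3))
print(statuses)
print("peak in-flight:", peak)
```

A plain shell loop has neither property: it runs projects one at a time or all at once, and an unhandled error stops it.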

AGENTS invariant: N ≥ 3 always goes through ralphy batch. The four reasons above are why.

Related

Profiles: sharing a batch’s output via PR
Reviewing and iterating: iterating on a single project before extracting to a batch
producer.md: the playbook this page paraphrases
docs/perf-targets.md: the perf budget
