Skip to main content
A Ralphy skill is one folder under .agents/skills/<name>/ containing a single SKILL.md. Skills follow the agentskills.io convention so they install cleanly into Claude Code, Cursor, Codex, and Copilot through ralphy skill install. This page is the wire-format spec: frontmatter fields, body sections, the lint that enforces both, and the description discipline that keeps the slash-command menu scannable.

Full example

A valid .agents/skills/evaluator/SKILL.md:
---
name: evaluator
namespace: user
description: >-
  Quality evaluation of rendered UGC mp4s — scene segmentation, audio loudness,
  caption density, and per-scene visual analysis. Produces an actionable report
  (eval.json + eval-report.md) sized for a downstream fixer agent.
  USE WHEN the user asks to "evaluate / score / grade / QA" a rendered video,
  drops an mp4 path with no other instruction, or asks "is this ready to ship".
  See body for ALSO FIRE / DO NOT FIRE / HARD INVARIANTS.
---

# evaluator

## Trigger refinements

**ALSO FIRE** when the user just dropped a path ending in `.mp4` from
`workspace/projects/<id>/render/` with no other instruction.

**DO NOT FIRE** for unrendered projects (handback to editor for `ralphy render`).

## Hard invariants

- Every vision pass routes through `cli/lib/providers/llm.ts → callLLM()`.
- Findings are deterministic outputs of `cli/lib/eval/*` — pass through verbatim.

## Workflow

1. Run `ralphy eval video <path-to-mp4>``eval.json` + `eval-report.md`.
2. Surface the verdict to the user. Hand off the report path to the fixer.

## Outputs

- `workspace/projects/<id>/eval/eval.json`
- `workspace/projects/<id>/eval/eval-report.md`

## Cookbook

"evaluate this video", "is this ready to ship", "score the render", "what's wrong with this video".
That file passes bun run lint:skills and renders correctly in every supported agent.

Where skills live

.agents/skills/
  evaluator/SKILL.md
  researcher/SKILL.md
  install/SKILL.md
  release/SKILL.md
  hyperframes/SKILL.md
One folder per skill. The folder name must match the name: field — the lint rejects a mismatch. The folder is yours; drop scripts, templates, or worked-example files alongside SKILL.md and they ship with the skill on install.

Frontmatter contract

The frontmatter is a YAML block between --- fences. scripts/lint-skills.ts enforces:
FieldRequiredValidationNotes
nameyes/^[a-z][a-z0-9-]*$/, matches folder namekebab-case
descriptionyeslength ≤ 1536 charsagentskills.io hard cap
namespacenouser or maintainerdrives install wizard visibility
when_to_usenofree-form stringdownstream filter tag (e.g. post-render)
allowed-toolsnostring arraytool allowlist
disable-model-invocationnobooleantrue makes the skill docs-only
pathsnostring arrayworkspace paths the skill touches
contextnofree-form stringlabel for “suggest this skill” surfaces
argument-hintnofree-form stringshown next to the slash command in Claude Code
argumentsnoYAML literalpositional-arg schema (future MCP use)
The lint parser handles two YAML shapes per field: inline key: value and the folded key: >- block used for multi-line description.

The name rule, in source

const NAME_RE = /^[a-z][a-z0-9-]*$/;
// scripts/lint-skills.ts enforces:
//   - name matches NAME_RE
//   - fm.name === folderName
Lower-case letters, digits, and hyphens. No underscores. No leading hyphen. No uppercase.

The description cap, in source

const DESC_MAX = 1536;
// scripts/lint-skills.ts:
//   if (fm.description.length > DESC_MAX) errors.push(...)
1536 is the agentskills.io hard cap. The lint hard-fails above it. Soft ceiling is ~1500: Claude Code budgets roughly 1% of context for skill discovery across every installed skill, so two bloated descriptions push siblings out of the menu without warning.

The namespace whitelist

const NAMESPACE_VALUES = new Set(["user", "maintainer"]);
Any other value is rejected. The split:
NamespaceAudienceExamples
userend usersevaluator, researcher, templater, install
maintainermaintainers / contributorsdev-release, dev-tasks
ralphy skill install installs only user skills by default. --dev opts into the maintainer set. The agents-md lint (scripts/lint-agents-md.ts) reads namespace: and skips the “is this skill referenced from the routing table” check for maintainer skills — maintainer skills are explicit-invocation-only.

Body structure

The lint warns (does not error) when section headings are missing. Every non-trivial skill should hit this order:
# <Skill display name>

## Trigger
When this skill fires. Mirror the `USE WHEN` line from the description.

## Hard invariants
Must-do / must-not lines, one bullet each. Concrete and refusable.

## Workflow
Numbered, step-by-step. Each step names the exact `ralphy <verb>` call.

## Outputs
What the user sees on success, what gets written to disk.

## Cookbook
EN + RU phrasing of triggers, sample invocations, common failure modes.
The lint walks for any ## heading starting with Trigger, Hard invariants, Workflow, Outputs, or Cookbook (case-insensitive prefix match). It only warns when none of them are present, so an intentionally minimal skill-creator-style body does not generate noise.

Writing a great description

The description is the user-facing summary of the skill, not an auto-route trigger phrase list. Claude Code renders it in the /<skill> slash-command menu, the “suggest this skill” surface, and the ralphy skill list output. A scannable paragraph beats a wall of synonyms.

Good: ~300 chars

description: >-
  Quality evaluation of rendered UGC mp4s — scene segmentation, audio loudness,
  caption density, per-scene visual analysis. Produces eval.json + eval-report.md
  sized for a downstream fixer agent. Use after `ralphy render` when you want
  verification before publishing.
What works: leads with the verb, names the output, says when the user reaches for it. No synonym lists.

Bad: ~1800 chars (rejected by the cap)

description: >-
  Quality evaluation, scoring, grading, reviewing, auditing, checking, inspecting,
  analyzing, verifying, validating, QA-ing, assessing of rendered UGC mp4s —
  scene segmentation, audio loudness, dead-air detection, caption density, frame
  analysis, per-scene visual analysis, retention check, scroll-stop check, hook
  strength check, virality rubric scoring, quality gate analysis, post-render
  quality gate verification, ready-to-ship readiness analysis, publish-readiness
  check... USE WHEN the user asks to "evaluate this video", "score the render",
  "grade the mp4", "review the final cut", "QA this video", "is this ready to
  ship", "what's wrong with this video", ...
What fails: every trigger phrase, every synonym, every example utterance crammed into one field. The lint rejects this for length, but even at 1535 chars it would be a defect — the menu becomes unreadable and sibling skills get squeezed out of the discovery budget.

Structure that works

One paragraph, optionally broken into “what / when” sentences:
  1. First sentence. What the skill does and what it produces.
  2. Second sentence. When the user reaches for it.
  3. (Optional) Third sentence. What it does not do — the closest adjacent skill the user might confuse it with.
If you need to spell out trigger phrases for routing fidelity, put them in the body’s ## Trigger section. The body is where the agent looks once the skill fires; the description is where the user looks before invoking it. The agentskills.io cap (1536) and the menu-budget reality (~1500 soft) are not the same number on purpose. Hard cap stops the worst case; soft ceiling is the discipline.

ALSO FIRE / DO NOT FIRE / HARD INVARIANTS

For longer skills (the existing evaluator and researcher), the description ends with See body for ALSO FIRE / DO NOT FIRE / HARD INVARIANTS. and those clauses move into the body under ## Trigger refinements and ## Hard invariants. This pattern keeps the description scannable while preserving the routing-fidelity detail the agent needs once the skill fires. The rule of thumb: if a clause is longer than half a sentence, move it from the description into the body. The description’s job is “should this skill fire?”; the body’s job is “now that it fired, what do I do?”

Lint

bun run lint:skills
The script walks .agents/skills/*/SKILL.md and checks:
  • Frontmatter parses cleanly (a ------ block with valid YAML).
  • name matches /^[a-z][a-z0-9-]*$/ and equals the folder name.
  • description is present and ≤ 1536 chars.
  • namespace, if set, is user or maintainer.
  • (Warning) The body has at least one ## section heading.
CI runs lint:skills on every PR. A failing lint blocks merge.

Installing skills

ralphy skill install is the cross-agent installer in cli/lib/skill/installer.ts. It supports four targets:
TargetWhat lands
claudeCopies (or symlinks) the bundle into ~/.claude/skills/ralphy/ (user) or ./.claude/skills/ralphy/ (project), then sentinel-merges a routing pointer into CLAUDE.md.
cursorWrites .cursor/rules/ralphy.mdc with the playbook routing block.
codexSentinel-merges a Ralphy section into AGENTS.md at the project root.
copilotSame sentinel-merge pattern.
The sentinel block (<!-- ralphy:start v=1 --><!-- ralphy:end -->) makes re-runs idempotent. A second ralphy skill install swaps the inner content without duplicating the block.