Ralphy
An iterative AI task execution framework. Runs Claude or Codex in a checklist-driven loop with state on disk, cost safeguards, and a long-lived agent that polls Linear, opens PRs, and iterates with reviewers.
$ npm install -g @neriros/ralphy
Requires Bun ≥ 1.0. The Claude engine also needs the Claude CLI. Alternatively, run it with bunx @neriros/ralphy.
The loop
One unchecked task per iteration. Read the steering, do the work, validate, check it off — and repeat until the change is done. State lives on disk, so any run can be stopped and resumed.
[Diagram: one iteration, from steering to task to work to check-off.]
Each iteration reads ## Steering from proposal.md, picks the first unchecked item in tasks.md, does the work, validates, and checks it off. The OpenSpec layout (proposal.md + design.md + tasks.md + specs/) keeps every change reviewable.
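The pick-and-check-off step can be sketched in a few lines. This is a minimal illustration, not Ralphy's actual implementation: it assumes tasks.md uses Markdown checkboxes ("- [ ]" for open, "- [x]" for done), and the helper names extract_steering, first_unchecked, and check_off are hypothetical.

```python
import re

# Assumption: tasks.md uses Markdown checkboxes ("- [ ]" open, "- [x]" done).
CHECKBOX = re.compile(r"^(\s*[-*] \[)([ xX])(\] )(.*)$")


def extract_steering(proposal: str) -> str:
    """Return the text under '## Steering', up to the next '## ' heading."""
    body, inside = [], False
    for line in proposal.splitlines():
        if line.startswith("## "):
            inside = line.strip() == "## Steering"
            continue
        if inside:
            body.append(line)
    return "\n".join(body).strip()


def first_unchecked(tasks: str):
    """Return (line_index, task_text) for the first open item, or None."""
    for i, line in enumerate(tasks.splitlines()):
        m = CHECKBOX.match(line)
        if m and m.group(2) == " ":
            return i, m.group(4)
    return None


def check_off(tasks: str, index: int) -> str:
    """Mark the checkbox on the given line as done and return the new text."""
    lines = tasks.splitlines()
    lines[index] = CHECKBOX.sub(r"\g<1>x\g<3>\g<4>", lines[index])
    return "\n".join(lines)
```

When first_unchecked returns None, every item is checked and the change is done. Because the checklist is a plain file, an interrupted run can pick up from the same place.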
What it does
Six pillars, from a one-shot task loop to a self-driving agent that lives in your tmux and ships PRs while you sleep.
The loop
- One unchecked task per iteration; state persists on disk so any run resumes.
- Engine choice: Claude haiku/sonnet/opus or Codex, swappable per task.
- Safeguards: --max-iterations, --max-cost, --max-runtime, --max-failures.
- OpenSpec layout per change: proposal, design, tasks, specs.
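To illustrate how the four safeguards could gate the loop, here is a small sketch. The field names mirror the CLI flags, but the units (currency for cost, seconds for runtime), the "stop at or above the cap" rule, and the Limits, RunState, and stop_reason names are assumptions made for this example, not documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Limits:
    # One field per safeguard flag; None means "no cap" (an assumption).
    max_iterations: Optional[int] = None
    max_cost: Optional[float] = None      # assumed unit: currency spent
    max_runtime: Optional[float] = None   # assumed unit: seconds
    max_failures: Optional[int] = None


@dataclass
class RunState:
    iterations: int = 0
    cost: float = 0.0
    runtime: float = 0.0
    failures: int = 0


def stop_reason(limits: Limits, state: RunState) -> Optional[str]:
    """Return the name of the first cap that has been reached, or None."""
    checks = [
        ("max-iterations", limits.max_iterations, state.iterations),
        ("max-cost", limits.max_cost, state.cost),
        ("max-runtime", limits.max_runtime, state.runtime),
        ("max-failures", limits.max_failures, state.failures),
    ]
    for name, cap, used in checks:
        if cap is not None and used >= cap:
            return name
    return None
```

A loop would call stop_reason before each iteration and exit as soon as it returns a name.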
Agent mode
- Polls Linear — picks up Todo, resumes In Progress, re-runs flagged Done.
- Per-task git worktrees so concurrent workers never stomp each other.
- Optional confirmation gate; revise via @ralphy revise: <why>.
- Re-execs into a managed tmux session, so you can detach without killing the loop.
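The pickup rule in the first bullet amounts to a small decision per issue. The sketch below is only an illustration: the status strings come from the list above, while the action labels ("start", "resume", "rerun"), the flagged parameter, and the action_for name are hypothetical.

```python
from typing import Optional


def action_for(status: str, flagged: bool = False) -> Optional[str]:
    """Map a Linear issue status to a hypothetical agent action.

    Returns None when the issue should be left alone.
    """
    if status == "Todo":
        return "start"    # pick up new work
    if status == "In Progress":
        return "resume"   # continue an existing run
    if status == "Done" and flagged:
        return "rerun"    # only re-run Done issues that were flagged
    return None
```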
PR + CI
- Auto-opens a PR on clean exit; idempotent — surfaces an existing one.
- Opt-in auto-merge (squash/merge/rebase) right after PR creation.
- Stacked PRs against a blocker's head branch via Linear relations.
- CI fix loop — pulls failed logs, re-spawns until green or cap hit.
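The CI fix loop in the last bullet is a bounded retry. Here is a minimal sketch under stated assumptions: check_ci and attempt_fix are hypothetical callables standing in for "read the CI result and failed logs" and "re-spawn a worker with those logs", and max_attempts plays the role of the cap.

```python
from typing import Callable, Tuple


def fix_until_green(
    check_ci: Callable[[], Tuple[bool, str]],
    attempt_fix: Callable[[str], None],
    max_attempts: int,
) -> bool:
    """Retry fixes until CI passes or the attempt cap is reached.

    check_ci returns (passed, failed_logs); attempt_fix receives the logs.
    """
    for _ in range(max_attempts):
        passed, logs = check_ci()
        if passed:
            return True
        attempt_fix(logs)
    # One last check after the final fix attempt.
    passed, _ = check_ci()
    return passed
```

The cap keeps a persistently red pipeline from consuming unbounded iterations.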
Reviewer interaction
- @ralphy mentions in Linear and GitHub PR comments trigger a review run.
- Unresolved threads queue a digest; Ralph fixes and resolves, or disagrees and replies.
- Sticky tasks.md mirror: one Linear comment that updates in place.
- Self-review phase appends more work for another round before exit.
Observability
- Ink dashboard — per-worker cards with live phase, command-in-flight, stdout tail.
- Structured JSON event stream: --json-output for CI, mirrored to disk.
- Per-worker logs: global, per-task, and a per-change LOG.jsonl.
- Pre-existing error check: pauses pickups when the trunk is already red.
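A JSON Lines log such as LOG.jsonl can be summarised with the standard library alone. The event schema is not described here, so the "type" field and the sample event names in this sketch are hypothetical. Only the one-JSON-object-per-line shape is assumed from the .jsonl extension.

```python
import json
from collections import Counter


def count_events(jsonl_text: str, key: str = "type") -> Counter:
    """Count events per value of `key` (a hypothetical field name)."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a partially written trailing line
        if isinstance(event, dict):
            counts[str(event.get(key, "unknown"))] += 1
    return counts
```

Skipping undecodable lines makes the reader safe to run against a log that a worker is still appending to.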
Extensibility
- MCP server exposes list / get / create / append-steering / stop to Claude-side agents.
- WORKFLOW.md: Jinja-style template rendered per iteration with project rules.
- Declarative indicators map labels & statuses to lifecycle events.
- Built on @rosneri/xstate-mcp for explicit, inspectable state.
Two ways to run it
A one-shot loop for a single task, or a long-lived agent that drives itself off your Linear board.
Task mode (one-shot)
Point it at a single task. State is on disk, so you can resume or inspect any run later.
# run a one-shot loop
$ ralphy loop task \
    --name fix-auth \
    --prompt "Fix the JWT validation bug" \
    --claude opus --max-iterations 10

# resume later; state is on disk
$ ralphy loop task --name fix-auth

# inspect
$ ralphy loop status --name fix-auth
Agent mode (Linear-driven)
Polls Linear, runs up to N concurrent loops, opens PRs, watches CI, and iterates with reviewers.
# self-driving off your Linear board
$ export LINEAR_API_KEY=lin_api_xxx
$ ralphy agent \
    --linear-team ENG \
    --linear-assignee me \
    --concurrency 3 \
    --create-pr --fix-ci
- Todo → scaffold the change + spawn a worker.
- In Progress → reattach to the existing run.
- CONFLICTING → enqueue a conflict-resolution task.
- @ralphy mention → run with the mention as prompt.
Hand it the checklist.
Open source, MIT-licensed, built on Bun. Star it, read the guide, or just install it and point it at a ticket.
$ npm install -g @neriros/ralphy