rosneri / ralphy
npm · @neriros/ralphy v3.10.10

Ralphy

An iterative AI task execution framework. Runs Claude or Codex in a checklist-driven loop with state on disk, cost safeguards, and a long-lived agent that polls Linear, opens PRs, and iterates with reviewers.

$ npm install -g @neriros/ralphy
Requires Bun ≥ 1.0; the Claude engine also needs the Claude CLI. Alternatively, run it without a global install via bunx @neriros/ralphy.
ralphy agent — tmux
$ ralphy agent --linear-team ENG --concurrency 3 --create-pr --fix-ci
polling Linear · team ENG · assignee me
┌─ worker 1 ──────────────────────────────┐
ENG-412 · fix JWT validation · implement
$ bun test ✓ 142 passed
task 3/5 checked off
└─────────────────────────────────────────┘
PR #284 opened · auto-merge --squash armed
ENG-415 resumed · @ralphy mention → review run

The loop

One unchecked task per iteration. Read the steering, do the work, validate, check it off — and repeat until the change is done. State lives on disk, so any run can be stopped and resumed.

Start iteration → Read steering → First unchecked task → Do the work → Validate → Check off
Loops back to start; when all tasks are checked, the change is archived.

Each iteration reads ## Steering from proposal.md, picks the first unchecked item in tasks.md, does the work, validates, and checks it off. The OpenSpec layout — proposal.md + design.md + tasks.md + specs/ — keeps every change reviewable.
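
The pick-and-check-off step can be sketched roughly as follows. This is a minimal illustration, assuming tasks.md uses GitHub-style `- [ ]` / `- [x]` checkboxes (the file format is not shown here); the helper names `firstUnchecked` and `checkOff` are hypothetical and not part of Ralphy's API.

```typescript
// Sketch only: the checkbox syntax and both helper names are assumptions,
// not Ralphy's actual implementation.
const UNCHECKED = /^(\s*[-*]\s+)\[ \](\s+)(.*)$/;

interface Task {
  line: number; // 0-based index into the file's lines
  text: string;
}

// Return the first unchecked item, or null when every task is checked off.
function firstUnchecked(tasksMd: string): Task | null {
  const lines = tasksMd.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const m = UNCHECKED.exec(lines[i]);
    if (m) return { line: i, text: m[3].trim() };
  }
  return null;
}

// Mark the given task as done and return the updated file contents.
function checkOff(tasksMd: string, task: Task): string {
  const lines = tasksMd.split("\n");
  lines[task.line] = lines[task.line].replace(UNCHECKED, "$1[x]$2$3");
  return lines.join("\n");
}
```

Because the file itself is the state, a stopped run can pick up where it left off simply by calling `firstUnchecked` again on the file as it exists on disk.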

  • 2 engines: Claude · Codex
  • 4 caps: iterations · cost · runtime · failures
  • N workers: isolated git worktrees
  • 5 MCP tools exposed to agents
  • MIT license

What it does

Six pillars, from a one-shot task loop to a self-driving agent that lives in your tmux and ships PRs while you sleep.

The loop

checklist-driven execution
  • One unchecked task per iteration; state persists on disk so any run resumes.
  • Engine choice — Claude haiku/sonnet/opus or Codex, swappable per task.
  • Safeguards: --max-iterations, --max-cost, --max-runtime, --max-failures.
  • OpenSpec layout per change — proposal, design, tasks, specs.
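
To make the four safeguards concrete, here is a small sketch of a cap check that could run between iterations. The field names, the units (USD, milliseconds), and the function `exceededCap` are assumptions for illustration; only the four flag names come from the list above.

```typescript
// Sketch only: field names and units are hypothetical. The four limits mirror
// the --max-iterations, --max-cost, --max-runtime, and --max-failures flags.
interface Caps {
  maxIterations?: number;
  maxCostUsd?: number;
  maxRuntimeMs?: number;
  maxFailures?: number;
}

interface RunState {
  iterations: number;
  costUsd: number;
  startedAtMs: number;
  failures: number;
}

type CapName = "iterations" | "cost" | "runtime" | "failures";

// Return the first cap that has been reached, or null if the loop may continue.
function exceededCap(state: RunState, caps: Caps, nowMs: number): CapName | null {
  if (caps.maxIterations !== undefined && state.iterations >= caps.maxIterations) return "iterations";
  if (caps.maxCostUsd !== undefined && state.costUsd >= caps.maxCostUsd) return "cost";
  if (caps.maxRuntimeMs !== undefined && nowMs - state.startedAtMs >= caps.maxRuntimeMs) return "runtime";
  if (caps.maxFailures !== undefined && state.failures >= caps.maxFailures) return "failures";
  return null;
}
```

Leaving a cap undefined disables it in this sketch, so a run with only a cost limit is still bounded by that limit alone.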

Agent mode

Linear-driven, long-lived
  • Polls Linear — picks up Todo, resumes In Progress, re-runs flagged Done.
  • Per-task git worktrees so concurrent workers never stomp each other.
  • Optional confirmation gate; revise via @ralphy revise: <why>.
  • Re-execs into a managed tmux session — detach without killing the loop.
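
Per-task isolation of this kind maps naturally onto `git worktree add`. The sketch below only builds the command line; the branch prefix and directory layout are assumptions made for illustration, and `worktreeAddArgs` is a hypothetical helper.

```typescript
// Sketch only: the branch and directory naming scheme is an assumption,
// not necessarily how Ralphy lays out its worktrees.
function worktreeAddArgs(issueId: string, baseRef: string, root: string): string[] {
  const slug = issueId.toLowerCase();
  const branch = `ralphy/${slug}`; // hypothetical branch naming
  const dir = `${root}/${slug}`;   // one directory per task
  // Equivalent to: git worktree add -b <new-branch> <path> <commit-ish>
  return ["git", "worktree", "add", "-b", branch, dir, baseRef];
}
```

Each worker then runs inside its own directory and branch, so concurrent workers do not share a working tree or an index.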

PR + CI

ship and stay green
  • Auto-opens a PR on clean exit; idempotent — surfaces an existing one.
  • Opt-in auto-merge (squash/merge/rebase) right after PR creation.
  • Stacked PRs against a blocker's head branch via Linear relations.
  • CI fix loop — pulls failed logs, re-spawns until green or cap hit.
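
The "until green or cap hit" behaviour of the CI fix loop can be sketched as a bounded retry. The callback names here are hypothetical, and they are synchronous only to keep the example short; real CI polling would be asynchronous.

```typescript
// Sketch only: callback names are hypothetical, and the callbacks are
// synchronous to keep the example short.
type CiStatus = "green" | "red";

interface CiFixDeps {
  checkCi: () => CiStatus;          // assumed to report the finished CI result
  fetchFailedLogs: () => string;    // assumed to return logs of the failed jobs
  spawnFix: (logs: string) => void; // assumed to run one fix attempt from those logs
}

// Re-run fix attempts until CI is green or maxAttempts fixes have been spent.
function ciFixLoop(deps: CiFixDeps, maxAttempts: number): "green" | "cap-hit" {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (deps.checkCi() === "green") return "green";
    deps.spawnFix(deps.fetchFailedLogs());
  }
  // One last check after the final attempt.
  return deps.checkCi() === "green" ? "green" : "cap-hit";
}
```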

Reviewer interaction

agrees-and-fixes, or replies
  • @ralphy mentions in Linear and GitHub PR comments trigger a review run.
  • Unresolved threads queue a digest; Ralph fixes and resolves, or disagrees and replies.
  • Sticky tasks.md mirror — one Linear comment that updates in place.
  • Self-review phase appends more work for another round before exit.
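
As an illustration of how a comment might be turned into a trigger, the sketch below recognises a plain @ralphy mention and the `@ralphy revise: <why>` form mentioned under Agent mode. The parsing rules and the `parseMention` helper are assumptions; the actual matching logic is not documented here.

```typescript
// Sketch only: the matching rules are assumptions for illustration.
type Mention =
  | { kind: "revise"; reason: string }
  | { kind: "prompt"; prompt: string };

// Extract the text after the first "@ralphy" mention, if there is one.
function parseMention(comment: string): Mention | null {
  const m = /@ralphy\b\s*([\s\S]*)/i.exec(comment);
  if (!m) return null;
  const rest = m[1].trim();
  const revise = /^revise:\s*([\s\S]*)$/i.exec(rest);
  if (revise) return { kind: "revise", reason: revise[1].trim() };
  return { kind: "prompt", prompt: rest };
}
```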

Observability

see every worker
  • Ink dashboard — per-worker cards with live phase, command-in-flight, stdout tail.
  • Structured JSON event stream — --json-output for CI, mirrored to disk.
  • Per-worker logs: global, per-task, and a per-change LOG.jsonl.
  • Pre-existing error check — pauses pickups when the trunk is already red.
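
A structured event stream of this kind is typically one JSON object per line. The sketch below shows that encoding with a made-up event shape; every field name is an assumption, since the real schema of `--json-output` and LOG.jsonl is not given here.

```typescript
// Sketch only: this event shape is hypothetical, not Ralphy's schema.
interface LoopEvent {
  ts: string;     // ISO-8601 timestamp
  worker: number; // worker index
  change: string; // change or task name
  type: string;   // event kind, e.g. "task-checked"
  data?: Record<string, unknown>;
}

// Serialise one event as a single JSON Lines record.
function encodeEvent(event: LoopEvent): string {
  return JSON.stringify(event) + "\n";
}

// Parse a JSON Lines log back into events, skipping blank lines.
function decodeEvents(jsonl: string): LoopEvent[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as LoopEvent);
}
```

One record per line means a log can be appended to safely and tailed or filtered with ordinary line-oriented tools.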

Extensibility

MCP + templates
  • MCP server exposes list / get / create / append-steering / stop to Claude-side agents.
  • WORKFLOW.md — Jinja-style template rendered per iteration with project rules.
  • Declarative indicators map labels & statuses to lifecycle events.
  • Built on @rosneri/xstate-mcp — explicit, inspectable state.
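
Per-iteration rendering of a Jinja-style template can be pictured with a tiny `{{ name }}` substitution. This sketch handles only simple placeholders; the variable names are invented, and the real WORKFLOW.md renderer may support far more (conditionals, loops, filters).

```typescript
// Sketch only: handles plain {{ name }} placeholders; variable names are hypothetical.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (whole: string, key: string) =>
    // Leave unknown placeholders untouched rather than guessing a value.
    Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : whole
  );
}
```

For example, `renderTemplate("Work on {{ issue }}", { issue: "ENG-412" })` yields `Work on ENG-412`.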

Two ways to run it

A one-shot loop for a single task, or a long-lived agent that drives itself off your Linear board.

Task mode (one-shot)

Point it at a single task. State is on disk, so you can resume or inspect any run later.

task-mode.sh
# run a one-shot loop
$ ralphy loop task \
    --name fix-auth \
    --prompt "Fix the JWT validation bug" \
    --claude opus --max-iterations 10

# resume later; state is on disk
$ ralphy loop task --name fix-auth

# inspect
$ ralphy loop status --name fix-auth

Agent mode (Linear-driven)

Polls Linear, runs up to N concurrent loops, opens PRs, watches CI, and iterates with reviewers.

agent-mode.sh
# self-driving off your Linear board
$ export LINEAR_API_KEY=lin_api_xxx
$ ralphy agent \
    --linear-team ENG \
    --linear-assignee me \
    --concurrency 3 \
    --create-pr --fix-ci
Each poll routes every issue into one of:
  • fresh: Todo → scaffold the change + spawn a worker.
  • resume: In Progress → reattach to the existing run.
  • conflict-fix: CONFLICTING → enqueue a conflict-resolution task.
  • ci-fix: PR red on GitHub → prepend a fix task from the logs.
  • review: Reviewer comments → fresh review run.
  • code-review: @ralphy mention → run with the mention as prompt.
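
The six routes above amount to a classification of each polled issue. A rough sketch follows; the snapshot fields and, in particular, the precedence between routes are assumptions, since only the route names and their triggers are given here.

```typescript
// Sketch only: field names and the order of checks are assumptions.
type Route = "fresh" | "resume" | "conflict-fix" | "ci-fix" | "review" | "code-review";

interface IssueSnapshot {
  status: "Todo" | "In Progress" | "Done";
  prMergeable?: "MERGEABLE" | "CONFLICTING"; // PR merge state, if a PR exists
  ciFailed?: boolean;                        // PR is red on GitHub
  hasReviewerComments?: boolean;             // unresolved reviewer comments
  hasRalphyMention?: boolean;                // a new @ralphy mention
}

// Pick one route per issue; return null when nothing needs doing.
function routeIssue(issue: IssueSnapshot): Route | null {
  if (issue.hasRalphyMention) return "code-review";
  if (issue.prMergeable === "CONFLICTING") return "conflict-fix";
  if (issue.ciFailed) return "ci-fix";
  if (issue.hasReviewerComments) return "review";
  if (issue.status === "In Progress") return "resume";
  if (issue.status === "Todo") return "fresh";
  return null;
}
```

Checking PR-level signals before the Linear status is one plausible ordering: an issue that is In Progress with a red PR would then get a ci-fix run rather than a plain resume.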

Hand it the checklist.

Open source, MIT-licensed, built on Bun. Star it, read the guide, or just install it and point it at a ticket.

$ npm install -g @neriros/ralphy