npm · @neriros/ralphy v3.10.10

Ralphy

An iterative AI task execution framework. Runs Claude or Codex in a checklist-driven loop with state on disk, cost safeguards, and a long-lived agent that polls Linear, opens PRs, and iterates with reviewers.


$ npm install -g @neriros/ralphy

Requires Bun ≥ 1.0. The Claude engine also needs the Claude CLI. To skip the global install, run it with bunx @neriros/ralphy.

ralphy agent — tmux

$ ralphy agent --linear-team ENG --concurrency 3 --create-pr --fix-ci

polling Linear · team ENG · assignee me

┌─ worker 1 ──────────────────────────────┐
│ ENG-412 fix JWT validation    implement │
│ $ bun test                 ✓ 142 passed │
│ ✓ task 3/5 checked off                  │
└─────────────────────────────────────────┘

→ PR #284 opened · auto-merge --squash armed

● ENG-415 resumed · @ralphy mention → review run

$ cat ~/.ralph/loop.md

The loop

One unchecked task per iteration. Read the steering, do the work, validate, check it off — and repeat until the change is done. State lives on disk, so any run can be stopped and resumed.

Start iteration → Read steering → First unchecked task → Do the work → Validate → Check off

Check off loops back to Start iteration. When all tasks are checked, the change is archived.

Each iteration reads ## Steering from proposal.md, picks the first unchecked item in tasks.md, does the work, validates, and checks it off. The OpenSpec layout — proposal.md + design.md + tasks.md + specs/ — keeps every change reviewable.
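
As a rough illustration of the "first unchecked item" step, the sketch below scans a tasks.md string for the first open checkbox and marks it done. It assumes tasks.md uses Markdown checkboxes ("- [ ]" and "- [x]"), which the page does not confirm; the names `firstUnchecked` and `checkOff` are hypothetical and not part of Ralphy's API.

```typescript
// Hypothetical helpers; assumes tasks.md uses Markdown checkboxes ("- [ ]" / "- [x]").
interface Task {
  line: number; // zero-based line index inside tasks.md
  text: string; // task description without the checkbox
}

const UNCHECKED = /^\s*[-*]\s+\[ \]\s+(.*)$/;

// Return the first unchecked task, or null when every task is checked off.
function firstUnchecked(tasksMd: string): Task | null {
  const lines = tasksMd.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const match = UNCHECKED.exec(lines[i] ?? "");
    if (match) {
      return { line: i, text: (match[1] ?? "").trim() };
    }
  }
  return null;
}

// Mark the task on the given line as done and return the updated file contents.
function checkOff(tasksMd: string, line: number): string {
  const lines = tasksMd.split("\n");
  lines[line] = (lines[line] ?? "").replace("[ ]", "[x]");
  return lines.join("\n");
}
```

A `null` result corresponds to the "all tasks are checked" exit from the loop.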

2 engines: Claude · Codex

4 caps: iterations · cost · runtime · failures

N workers: isolated git worktrees

5 MCP tools exposed to agents

MIT license

$ ralphy --help | less

What it does

Six pillars, from a one-shot task loop to a self-driving agent that lives in your tmux and ships PRs while you sleep.

The loop

checklist-driven execution

One unchecked task per iteration; state persists on disk so any run resumes.
Engine choice — Claude haiku/sonnet/opus or Codex, swappable per task.
Safeguards: --max-iterations, --max-cost, --max-runtime, --max-failures.
OpenSpec layout per change — proposal, design, tasks, specs.
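
One way to picture the four safeguards is a check that runs between iterations and reports which cap, if any, has been reached. This is a minimal sketch, not Ralphy's implementation: the `Caps` and `RunState` shapes, the units (US dollars, milliseconds), and the order in which caps are checked are all assumptions.

```typescript
// Hypothetical shapes; units and field names are assumptions, not Ralphy's internals.
interface Caps {
  maxIterations?: number;
  maxCostUsd?: number; // assumption: cost tracked in US dollars
  maxRuntimeMs?: number; // assumption: runtime tracked in milliseconds
  maxFailures?: number;
}

interface RunState {
  iterations: number;
  costUsd: number;
  runtimeMs: number;
  failures: number;
}

type CapName = "iterations" | "cost" | "runtime" | "failures";

// Return the first cap that has been reached, or null if the loop may continue.
function cappedBy(state: RunState, caps: Caps): CapName | null {
  if (caps.maxIterations !== undefined && state.iterations >= caps.maxIterations) return "iterations";
  if (caps.maxCostUsd !== undefined && state.costUsd >= caps.maxCostUsd) return "cost";
  if (caps.maxRuntimeMs !== undefined && state.runtimeMs >= caps.maxRuntimeMs) return "runtime";
  if (caps.maxFailures !== undefined && state.failures >= caps.maxFailures) return "failures";
  return null;
}
```

Leaving a cap undefined disables it in this sketch, so a run with no caps set never stops on a safeguard.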

Agent mode

Linear-driven, long-lived

Polls Linear — picks up Todo, resumes In Progress, re-runs flagged Done.
Per-task git worktrees so concurrent workers never stomp each other.
Optional confirmation gate; revise via @ralphy revise: <why>.
Re-execs into a managed tmux session — detach without killing the loop.

PR + CI

ship and stay green

Auto-opens a PR on clean exit; idempotent — surfaces an existing one.
Opt-in auto-merge (squash/merge/rebase) right after PR creation.
Stacked PRs against a blocker's head branch via Linear relations.
CI fix loop — pulls failed logs, re-spawns until green or cap hit.
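
The "until green or cap hit" behavior of the CI fix loop can be sketched as a bounded retry. The callbacks `checkCi` and `fix` are hypothetical stand-ins for fetching the CI status and spawning a worker with the failed log; a real version would presumably be asynchronous, and the synchronous form here only keeps the control flow easy to follow.

```typescript
// Hypothetical CI status: either green, or red with the failing log attached.
type CiResult = { green: true } | { green: false; log: string };

// Re-run the fixer until CI is green or maxAttempts fix attempts have been made.
function ciFixLoop(
  checkCi: () => CiResult,
  fix: (log: string) => void,
  maxAttempts: number,
): { green: boolean; attempts: number } {
  let attempts = 0;
  for (;;) {
    const result = checkCi();
    if (result.green) return { green: true, attempts };
    if (attempts >= maxAttempts) return { green: false, attempts };
    fix(result.log); // hand the failed log to a fresh fix run
    attempts++;
  }
}
```

The loop checks CI once more after the last allowed fix, so a fix that lands on the final attempt is still reported as green.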

Reviewer interaction

agrees-and-fixes, or replies

@ralphy mentions in Linear and GitHub PR comments trigger a review run.
Unresolved threads queue a digest; Ralphy fixes and resolves, or disagrees and replies.
Sticky tasks.md mirror — one Linear comment that updates in place.
Self-review phase appends more work for another round before exit.

Observability

see every worker

Ink dashboard — per-worker cards with live phase, command-in-flight, stdout tail.
Structured JSON event stream — --json-output for CI, mirrored to disk.
Per-worker logs: global, per-task, and a per-change LOG.jsonl.
Pre-existing error check — pauses pickups when the trunk is already red.
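
Because LOG.jsonl is a JSON Lines file, it can be read with one `JSON.parse` per non-empty line. The sketch below does that and tallies events by a `type` field. That field name is purely hypothetical, since the page does not document the event schema; events without it are counted under "unknown".

```typescript
// Parse JSON Lines text: one JSON value per non-empty line.
function parseJsonl(text: string): unknown[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as unknown);
}

// Count events by a hypothetical "type" field; anything else lands under "unknown".
function countByType(events: unknown[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const event of events) {
    let type = "unknown";
    if (typeof event === "object" && event !== null) {
      const candidate = (event as { type?: unknown }).type;
      if (typeof candidate === "string") type = candidate;
    }
    counts[type] = (counts[type] ?? 0) + 1;
  }
  return counts;
}
```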

Extensibility

MCP + templates

MCP server exposes list / get / create / append-steering / stop to Claude-side agents.
WORKFLOW.md — Jinja-style template rendered per iteration with project rules.
Declarative indicators map labels & statuses to lifecycle events.
Built on @rosneri/xstate-mcp — explicit, inspectable state.

$ history | head

Two ways to run it

A one-shot loop for a single task, or a long-lived agent that drives itself off your Linear board.

Task mode · one-shot

Point it at a single task. State is on disk, so you can resume or inspect any run later.

task-mode.sh

# run a one-shot loop
$ ralphy loop task \
    --name fix-auth \
    --prompt "Fix the JWT validation bug" \
    --claude opus --max-iterations 10

# resume later; state is on disk
$ ralphy loop task --name fix-auth

# inspect
$ ralphy loop status --name fix-auth

Agent mode · Linear-driven

Polls Linear, runs up to N concurrent loops, opens PRs, watches CI, and iterates with reviewers.

agent-mode.sh

# self-driving off your Linear board
$ export LINEAR_API_KEY=lin_api_xxx
$ ralphy agent \
    --linear-team ENG \
    --linear-assignee me \
    --concurrency 3 \
    --create-pr --fix-ci

Each poll routes every issue into one of:

fresh: Todo → scaffold the change + spawn a worker.
resume: In Progress → reattach to the existing run.
conflict-fix: CONFLICTING → enqueue a conflict-resolution task.
ci-fix: PR red on GitHub → prepend a fix task from the logs.
review: Reviewer comments → fresh review run.
code-review: @ralphy mention → run with the mention as prompt.
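
The six routes can be modeled as a pure function over a snapshot of one issue. This is only a sketch: the `IssueSnapshot` fields are hypothetical, the `null` result for "nothing to do" is an addition, and the precedence between routes (mention first, then conflicts, CI, review comments, and finally the Linear status) is an assumption, since the page does not say how overlapping conditions are resolved.

```typescript
type Route = "fresh" | "resume" | "conflict-fix" | "ci-fix" | "review" | "code-review";

// Hypothetical snapshot of one issue at poll time; field names are illustrative.
interface IssueSnapshot {
  status: "Todo" | "In Progress" | "Done";
  prConflicting: boolean; // PR reported as CONFLICTING
  ciRed: boolean; // PR checks failing on GitHub
  hasReviewerComments: boolean;
  hasRalphyMention: boolean;
}

// Assumed precedence: explicit mentions win, then PR health, then Linear status.
function routeIssue(issue: IssueSnapshot): Route | null {
  if (issue.hasRalphyMention) return "code-review";
  if (issue.prConflicting) return "conflict-fix";
  if (issue.ciRed) return "ci-fix";
  if (issue.hasReviewerComments) return "review";
  if (issue.status === "In Progress") return "resume";
  if (issue.status === "Todo") return "fresh";
  return null; // nothing to do for this issue on this poll
}
```

Keeping the routing free of side effects makes each poll decision easy to log and replay.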

$ npm i -g @neriros/ralphy

Hand it the checklist.

Open source, MIT-licensed, built on Bun. Star it, read the guide, or just install it and point it at a ticket.

$ npm install -g @neriros/ralphy
