CR Homie

An evidence-based code review skill for AI agents. Six specialized reviewers run in parallel, then an independent verifier agent proves or refutes each finding against the codebase — replacing numeric confidence scoring with fact-checking.

Installation

npx skills add YuArtian/cr-homie

What it checks

Security — XSS, injection, SSRF, path traversal, auth/authz gaps, secret leakage, race conditions, supply chain, package manager consistency. Every exploitable claim is backed by a traced attack chain (entry → framework/middleware → vulnerable code) and a confidence label: Confirmed exploitable / Defense-in-depth gap / Needs verification.
Quality (three sub-domains in one agent)
- Runtime quality — error handling, performance (N+1, hot-path CPU, missing cache), boundary conditions, concurrency, observability
- API contracts — breaking changes in REST/GraphQL/exports/schemas, backward compatibility, type safety regressions, versioning
- Dead code removal — unused exports, deprecated paths, stale feature flags, with safe-delete vs defer-with-plan recommendations
SOLID — SRP/OCP/LSP/ISP/DIP violations, code smells, with incremental refactor plans sized to the current diff
Testing — coverage gaps, behavioral vs implementation-detail tests, anti-patterns (flakiness, over-mocking, testing private internals), language-specific checks for JS/TS/Go/Python/Java
Frontend — 7 code principles (KISS, Component SRP, FP, DRY, Clean Architecture, YAGNI, Production Readiness), a11y, rendering perf, bundle, CSS, state, i18n. Fake data / mock URLs in prod code paths are always P0/P1.
Page verification (optional) — runtime validation via Chrome DevTools MCP: console errors, fake network requests, Lighthouse audits, a11y tree, visual layout, dark mode, performance traces.

How it works

Design principles

Verification > Scoring — findings are validated by an independent verifier agent that uses grep, Read, and Bash to check attack chains, callers, tests, and consumers. Replaces the previous Haiku-scoring filter.
HIGH SIGNAL only — 9 explicit categories are silently dropped (linter-catchable, pre-existing, looks-like-a-bug-but-correct, pedantic nitpicks, speculative refactors, already-tested, generic advice, cross-agent duplicates, hypothetical future needs).
Iron Law — every finding MUST cite exact file:line, describe a realistic risk scenario, and propose a specific fix. No vague advice like "consider improving error handling".
Review-first — the skill reports findings and waits for your confirmation before implementing any fix.
Shared meta-rules — Iron Law, HIGH SIGNAL filter, Severity calibration, Verify Before Reporting, and Anti-Patterns live in a single file (agents/_base-reviewer.md) inherited by every reviewer, so rules can't drift between agents.
Multi-agent parallel — domain-specific reviewers run concurrently on the same Preflight Context Block, reducing token pressure and improving coverage.
Conditional activation — frontend, page verification, and the API / removal sub-domains of the quality reviewer only activate when Preflight detects relevant signals.

Architecture

Phase 1: Preflight (orchestrator)
    ├─→ Smart scope detection (branch-aware — branch diff on feature branch, unstaged/staged on default branch)
    ├─→ Two-pass mode if diff > 2000 lines or > 15 files
    ├─→ Detect languages, frontend, API surface, dead code, linter configs, package manager
    └─→ Build Preflight Context Block
    ↓
Phase 2: Parallel Review Agents
    ├─→ security-reviewer   [opus]    ← always required
    ├─→ testing-reviewer    [inherit] ← always required
    ├─→ quality-reviewer    [inherit] ← runtime + API (if detected) + dead code (if detected)
    ├─→ solid-reviewer      [opus]    ← unless --quick or --focus excludes
    ├─→ frontend-reviewer   [inherit] ← if frontend detected or --focus frontend
    └─→ page-verifier       [inherit] ← if --verify + --url + MCP available
    ↓ findings from all agents
Phase 3: Aggregation (orchestrator)
    ├─→ Collect + Deduplicate
    ├─→ finding-verifier    [inherit] ← CONFIRMED / DOWNGRADED / REFUTED / NEEDS-MANUAL-REVIEW
    ├─→ Safety overrides (SEC-P0 strong-evidence bar, production readiness, consensus lock)
    ├─→ Self-check + HIGH SIGNAL re-filter
    ├─→ Format output (markdown-link file:line, merge verification evidence)
    └─→ Next-steps confirmation ⚠️

Workflow

Phase 1: Preflight ⛔ — First probes the repo (auto-detects the default branch via git symbolic-ref refs/remotes/origin/HEAD with fallback probes for main / master / develop / trunk; gracefully handles detached HEAD and repos without a remote). Then gathers the applicable scope signals (unstaged, staged, branch-vs-default) in parallel and picks based on current branch: on the default branch it falls back to uncommitted work; on a feature branch the branch diff is the default (so a tiny unstaged edit cannot mask a 20-commit branch). When multiple signals have content, shows a one-screen summary and lets you override before committing to a scope. Detects languages, frontend files, API surface, dead code signals, linter configs (ESLint/tsc/Prettier/Ruff/golangci-lint/etc. — controls HIGH SIGNAL linter-catchable filter), package manager consistency. If diff > 2000 lines OR > 15 files → two-pass mode (Pass 1 hotspot detection is orchestrator-inline, no separate agent; Pass 2 delivers hotspot-filtered context to agents while the verifier always gets the full diff). For project scope, enforces hard limits (300 files / 50k lines soft warn; 1k files / 150k lines hard gate that forces you to narrow or switch to hotspot-only mode).
Phase 2: Parallel agents — 2 always run (security, testing) + quality (unless focus excludes) + solid (unless --quick) + frontend and page-verifier conditionally. All inherit shared rules from agents/_base-reviewer.md. Each agent must attach a Verified: line showing which callers / tests / framework behavior it checked before reporting.
Phase 3: Aggregation ⛔ — Deduplicates overlapping findings. finding-verifier then reads the full original diff (never the two-pass filtered slice) plus each finding's file:line context and returns one of four outcomes per finding. Safety overrides protect Security P0s (REFUTE requires strong, unambiguous evidence), fake-data-in-prod findings, and co-flagged locations (at least one must survive). The orchestrator then merges agent Verified: with verifier Evidence: into a single field, converts file:line to IDE-clickable markdown links, applies --min-severity, runs a final HIGH SIGNAL sweep, and asks you how to proceed.

⛔ = BLOCKING (must complete before proceeding) · ⚠️ = REQUIRED (must not skip)

Usage

/cr-homie                        # Branch-aware smart scope detection
/cr-homie unstaged               # Review unstaged changes explicitly
/cr-homie staged                 # Review staged changes
/cr-homie commit:abc123          # Review a specific commit
/cr-homie pr:42                  # Review a PR (requires gh CLI)
/cr-homie branch:feat/login      # Review a named branch vs the detected default branch
/cr-homie project                # Full project scan (subject to hard limits)
/cr-homie project:src/           # Scan only src/ directory
/cr-homie --focus security       # Focus on security only
/cr-homie --focus quality        # Focus on quality (includes API + dead code sub-domains)
/cr-homie --focus frontend       # Focus on frontend quality
/cr-homie --quick                # P0/P1 only; skip SOLID and frontend
/cr-homie --verify --url http://localhost:3000   # Code review + runtime page verification

Parameters

Parameter	Description	Default
`<scope>`	`unstaged`, `staged`, `commit:<hash>`, `pr:<number>` (requires `gh` CLI), `branch:<name>` (compared against the detected default branch), `project[:<path>]`, or file path	branch-aware smart detection
`--focus <area>`	`security`, `quality` (includes `performance` and `api` aliases), `solid`, `testing`, `frontend`, `all`	`all`
`--min-severity <level>`	Minimum severity to report: `P0`, `P1`, `P2`, `P3`	`P3`
`--quick`	P0/P1 only; skip SOLID and frontend	off
`--verify`	Enable runtime page verification (requires `--url` and Chrome DevTools MCP)	off
`--url <url>`	Dev server URL for page verification	—

Repository requirements

Git repo with any default branch — cr-homie auto-detects main / master / develop / trunk (or whatever origin/HEAD points to). No hardcoded branch name.
Detached HEAD and no-remote repos — handled gracefully; branch-vs-default comparisons are skipped, and cr-homie falls back to unstaged → staged or asks for an explicit scope.
gh CLI — only required when you use pr:<number>. Install from cli.github.com if needed.
Chrome DevTools MCP server — only required when you use --verify. Install with claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest; if missing, page-verifier fails fast with a skip notice and the rest of the review still runs.

Output example

## Code Review Summary

**Scope**: branch diff vs main (feat/login)
**Files reviewed**: 3 files, 87 lines changed
**Primary language(s)**: TypeScript
**Agents launched**: security-reviewer, testing-reviewer, quality-reviewer, solid-reviewer
**Overall assessment**: REQUEST_CHANGES

---

## Findings

### P0 - Critical
(none)

### P1 - High
1. **[src/auth/login.ts:42](src/auth/login.ts#L42)** SQL injection via username `[SEC-001]`
   - **Risk**: Attacker can extract/modify the users table via crafted username in POST /api/login
   - **Attack path**: POST /api/login → Express JSON body → login(req.body.username) → raw SQL concat at line 42
   - **Confidence**: Confirmed exploitable
   - **Fix**: Use parameterized query: `db.query('SELECT * FROM users WHERE name = ?', [username])`
   - **Verified**: Grepped callers of `login()` — only called from POST /api/login with no upstream validation; verifier: Confirmed — input flows unsanitized from JSON body to the raw concat.

### P2 - Medium
2. **[src/services/file.ts:73](src/services/file.ts#L73)** Path concat without validation `[SEC-002]` `[defense-in-depth gap]`
   - **Risk**: Code lacks its own path validation, relies on Express routing behavior
   - **Attack path**: GET /files/{name} → Express `req.params.name` matches a single segment → `../` blocked at the route level
   - **Confidence**: Defense-in-depth gap
   - **Fix**: Add `path.resolve()` + prefix check to match the `deleteFile()` pattern
   - **Verified**: Grepped route definitions; verifier: Downgraded — framework blocks the input, but code-level validation absent.

3. **[src/services/order.ts:118](src/services/order.ts#L118)** Read-modify-write without transaction `[QUAL-001]`
   - **Risk**: Concurrent orders can oversell inventory under load
   - **Fix**: Wrap in a transaction with `SELECT ... FOR UPDATE`
   - **Verified**: Grepped callers; no outer transaction; verifier: Confirmed — race window is real under concurrent checkout.

### P3 - Low
4. **[src/utils/format.ts:7](src/utils/format.ts#L7)** Magic number 86400 `[SOLID-001]`
   - **Risk**: Readability — unclear what 86400 represents
   - **Fix**: Extract to `const SECONDS_PER_DAY = 86400`

---

## Not Covered
- Database migration files not included in diff
- No integration test environment available to verify query changes

Severity levels

Level	Must do
P0	Block merge — exploitable security, data loss, correctness bug, fake data in prod path
P1	Fix before merge — significant vulnerability, logic error in critical path, missing critical tests, breaking API change without migration
P2	Fix in this PR or create follow-up — defense-in-depth gap, code smell, missing edge tests
P3	Optional — style, naming, minor optimization

Full calibration rules and examples: agents/_base-reviewer.md.

Review agents

Agent	Domain	Model	Condition
`security-reviewer`	Security, reliability, race conditions, supply chain, package manager consistency	opus	Always (required)
`testing-reviewer`	Test coverage, quality, anti-patterns	inherit	Always (required)
`quality-reviewer`	Runtime quality + API contracts + dead code removal (three sub-domains, single agent)	inherit	Unless `--focus` excludes
`solid-reviewer`	SOLID principles, architecture smells	opus	Unless `--quick` or `--focus` excludes
`frontend-reviewer`	7 frontend principles, a11y, perf, bundle, CSS, state, i18n, production readiness	inherit	If frontend detected or `--focus frontend`
`page-verifier`	Runtime page verification via Chrome DevTools MCP	inherit	If `--verify` + `--url` + MCP available
`finding-verifier`	Evidence-based verification (replaces confidence scoring)	inherit	Always in Phase 3

Shared meta-rules inherited by every reviewer live in agents/_base-reviewer.md.

Structure

cr-homie/
├── SKILL.md                             # 3-phase orchestration workflow
├── agents/
│   ├── agent.yaml                       # Skill interface metadata
│   ├── _base-reviewer.md                # Shared meta-rules (Iron Law, HIGH SIGNAL, Severity, Verify, Anti-Patterns, output format)
│   ├── security-reviewer.md             # Security + reliability + package manager consistency
│   ├── quality-reviewer.md              # Runtime quality + API contracts + dead code removal
│   ├── solid-reviewer.md                # SOLID + architecture smells
│   ├── testing-reviewer.md              # Testing quality + coverage
│   ├── frontend-reviewer.md             # Frontend quality (conditional)
│   ├── page-verifier.md                 # Runtime page verification (conditional)
│   └── finding-verifier.md              # Evidence-based verification agent (Phase 3)
└── references/                          # Knowledge base (loaded by each agent)
    ├── security-checklist.md            # OWASP risks, race conditions, crypto, supply chain, attack-chain verification procedure
    ├── code-quality-checklist.md        # Error handling, N+1, caching, boundaries, observability
    ├── api-contract-checklist.md        # Breaking changes, backward compatibility, versioning
    ├── removal-plan.md                  # Safe-delete vs defer-with-plan templates
    ├── solid-checklist.md               # SOLID smell prompts + language-specific flags
    ├── testing-checklist.md             # Coverage, test quality, anti-patterns
    ├── frontend-checklist.md            # 7 code principles + a11y, perf, bundle, CSS, state, i18n
    └── page-verification-guide.md       # Chrome DevTools MCP verification procedures

Each review agent loads only its own checklist(s) from references/, keeping per-agent context usage efficient.

Page verification (Chrome DevTools MCP)

When --verify --url <url> is provided, CR Homie goes beyond static code review and validates the running page:

/cr-homie --verify --url http://localhost:5173

Prerequisite: Install the Chrome DevTools MCP server:

claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest

If the MCP server is not connected, the page-verifier fails fast with a skip notice and the rest of the review still runs.

What it checks:

Check	How	Severity
HTTP status	navigate + status	5xx = P0; public 4xx = P1; auth-required 401/403 = P3
Console errors	`list_console_messages` filter error/warn	Uncaught error = P0, framework warning = P1
Fake data	`list_network_requests` + `evaluate_script`	Mock URLs in prod build = P0, placeholder text = P1
Accessibility	`lighthouse_audit` + `take_snapshot`	Score < 50 = P1, missing labels = P2
Visual layout	`take_screenshot` desktop + mobile (375px)	Overflow / broken layout = P2
Dark mode	`emulate` colorScheme: dark	Invisible text / hardcoded colors = P2
Performance	`performance_start_trace` (optional)	LCP > 2.5s = P1, CLS > 0.25 = P1

Frontend code principles

When reviewing frontend code, CR Homie applies these 7 principles (detailed checks in references/frontend-checklist.md):

KISS — Delete what can be deleted. Most direct implementation wins.
Component SRP — One component = one reason to change. Componentize, but don't over-split.
Functional Programming — Business logic stays pure. Side effects live at the edges (hooks, event handlers).
DRY — Extract shared patterns at the third occurrence. Two is coincidence, three is a pattern.
Clean Architecture — Naming as documentation. Reading the code reveals intent without comments.
YAGNI — Build for today's requirements. No speculative abstractions.
Production Readiness ⚠️ — No fake data, no mock URLs, no stub features in production code (always P0/P1).

Why verification instead of scoring

Scoring asks "Does this finding sound correct?" — prone to keeping plausible-sounding false positives.

Verification asks "Can I prove this finding reaches the described failure mode?" — requires evidence, catches hallucinations.

Code-review false positives almost always share the same shape: the pattern looks dangerous, but context already mitigates it (framework auto-escaping, upstream middleware, existing transaction, existing test). A scorer reading only the finding's text can't see that context. A verifier with grep/read can. This is the same design choice made by Anthropic's official code-review plugin, Cursor BugBot V11, and CodeRabbit's sandbox validation — reached after each project's scoring-only pipeline plateaued around ~50% precision.

See ANALYSIS.md for the full industry benchmark and the reasoning behind the v3 architecture.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
agents		agents
references		references
ANALYSIS.md		ANALYSIS.md
CHANGELOG.md		CHANGELOG.md
README.md		README.md
README.zh-CN.md		README.zh-CN.md
SKILL.md		SKILL.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CR Homie

Installation

What it checks

How it works

Design principles

Architecture

Workflow

Usage

Parameters

Repository requirements

Output example

Severity levels

Review agents

Structure

Page verification (Chrome DevTools MCP)

Frontend code principles

Why verification instead of scoring

License

About

Uh oh!

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

CR Homie

Installation

What it checks

How it works

Design principles

Architecture

Workflow

Usage

Parameters

Repository requirements

Output example

Severity levels

Review agents

Structure

Page verification (Chrome DevTools MCP)

Frontend code principles

Why verification instead of scoring

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Packages 0

Uh oh!

Contributors

Uh oh!

Packages