
09 — The daily loop
When to read this: Once your backlog is triaged. The chapter you'll come back to most often. It's the rhythm of shipping work day-to-day.
The four-step loop
Once setup is done, every shipped feature goes through the same loop:
┌──────────────────────────────────────────────────┐
│                                                  │
│   pick                                           │
│    │                                             │
│    ▼                                             │
│   kickoff (dispatches by spec kind)              │
│    │                                             │
│    │   ┌─ kind: ui            → /craft-ui        │
│    ├──◄                                          │
│    │   └─ kind: backend|infra → /kickoff-spec    │
│    │                                             │
│    ▼                                             │
│   verification gate                              │
│   (visual rubric / tests / reviewer)             │
│    │                                             │
│    ▼                                             │
│   ship                                           │
│    │                                             │
│    ▼                                             │
│   (loop)                                         │
│                                                  │
└──────────────────────────────────────────────────┘
Plus three escape hatches you reach for mid-loop when something architectural surfaces:
- /architecture-review for decisions that need framing before code is written.
- design-reviewer / architecture-reviewer / migration-reviewer / api-reviewer subagents for review during execution.
- /forbid when you spot a generic regression worth pinning into DESIGN.md.
This chapter walks the steps in order, then covers the escape hatches.
Step 1: Pick the next task
/pick-next-task
The skill is read-only. It reads the phased backlog (docs/backlog/phase-*.md), parses spec dependencies, identifies which specs are unblocked (every dependency is [x]), and surfaces a recommendation.
For solo work, it recommends the single next item. For parallelizable work, it surfaces a parallel-safe set ("you could also work on US-12 in parallel since it has no dependency on the recommended item").
The skill never mutates files. It picks. You decide.
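As a rough illustration of the selection rule (not the skill's actual implementation), the sketch below treats a spec as unblocked when it is unstarted and every dependency is done. The `BacklogSpec` shape, the status names, and `findUnblocked` are hypothetical, and parsing docs/backlog/phase-*.md is left out.

```typescript
// Hypothetical in-memory model of a parsed backlog. The real skill reads
// docs/backlog/phase-*.md; its internal data model is not documented here.
type SpecStatus = "todo" | "in_progress" | "done"; // [ ], [~], [x]

interface BacklogSpec {
  id: string;
  status: SpecStatus;
  dependsOn: string[];
}

// A spec is unblocked when it has not been started and every dependency is done.
// Read-only: the input array is never mutated.
function findUnblocked(specs: readonly BacklogSpec[]): string[] {
  const doneIds = new Set(
    specs.filter((s) => s.status === "done").map((s) => s.id),
  );
  return specs
    .filter((s) => s.status === "todo")
    .filter((s) => s.dependsOn.every((dep) => doneIds.has(dep)))
    .map((s) => s.id);
}

const sampleBacklog: BacklogSpec[] = [
  { id: "US-02", status: "done", dependsOn: [] },
  { id: "US-04", status: "todo", dependsOn: ["US-02"] },
  { id: "TASK-07", status: "todo", dependsOn: [] },
  { id: "US-05", status: "todo", dependsOn: ["US-04"] },
];

// US-04 and TASK-07 are unblocked; US-05 still waits on US-04.
console.log(findUnblocked(sampleBacklog));
```

A dependency id that does not appear in the backlog is treated as not done here, so the dependent spec stays blocked. That is a design choice of this sketch, not documented behavior.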
A typical output:
## Recommended next: US-04 (P1)
US-04: App computes consecutive-day streak from localStorage
- Phase: P1
- Kind: backend
- Dependencies: US-02 (done)
- Acceptance criteria: 3 items, all unmet
Parallel-safe alternatives (no shared state with US-04):
- TASK-07: Add Vitest config (P0, kind: infra)
The skill stops there. Kickoff is the next step.
When the skill says "nothing unblocked"
If every unstarted spec has at least one blocking dependency that's not done, the cause is usually one of:
- A [~] spec is sitting open. Finish it.
- An actual dependency cycle. Resolve by re-triaging or splitting one of the specs.
- You've shipped everything in V1. Run /prd-revise and consider what's next.
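To tell an open [~] spec apart from a genuine cycle, a depth-first search over the dependency graph is usually enough. The sketch below is one way to do it. The `DependencyMap` shape and `findDependencyCycle` are hypothetical names for illustration, not part of the kit.

```typescript
// Hypothetical dependency map: spec id -> ids it depends on.
type DependencyMap = Record<string, string[]>;

// Depth-first search that tracks unvisited / visiting / visited nodes.
// Returns one cycle as a list of ids, or null when the graph is acyclic.
function findDependencyCycle(deps: DependencyMap): string[] | null {
  const visitState = new Map<string, "visiting" | "visited">();
  const currentPath: string[] = [];

  const visit = (id: string): string[] | null => {
    const state = visitState.get(id);
    if (state === "visited") return null;
    if (state === "visiting") {
      // Back edge: the cycle is the tail of the current path starting at id.
      return [...currentPath.slice(currentPath.indexOf(id)), id];
    }
    visitState.set(id, "visiting");
    currentPath.push(id);
    for (const dep of deps[id] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    currentPath.pop();
    visitState.set(id, "visited");
    return null;
  };

  for (const id of Object.keys(deps)) {
    const cycle = visit(id);
    if (cycle) return cycle;
  }
  return null;
}

const cyclicDeps: DependencyMap = {
  "US-10": ["US-11"],
  "US-11": ["US-12"],
  "US-12": ["US-10"],
};
console.log(findDependencyCycle(cyclicDeps));
```

The returned path names every spec on the cycle, which is the list to look at when deciding which spec to split.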
Step 2: Kick off the spec
This is where the kind tag matters. The kit dispatches to two different executors based on kind:
spec.kind = ui            spec.kind = backend|infra
      │                         │
      ▼                         ▼
/craft-ui <id>            /kickoff-spec <id>
      │                         │
      ▼                         ▼
Visual review             Tests-pass gate
(4 viewports, rubric)     (+ smoke for infra)
/kickoff-spec <id>: for backend and infra
/kickoff-spec TASK-02
The skill validates the spec (does it exist? does it have a kind tag? are its dependencies met?), flips its status from [ ] to [~] (in progress), then dispatches to the actual executor, usually the main agent in your session.
The executor reads the spec's TASK and CONSTRAINTS, looks at the existing code, and implements the change. When done, it runs the verification gate:
- For kind: backend, tests pass.
- For kind: infra, tests pass plus a smoke check (typically "the app starts and core paths work").
If the gate fails, the spec stays at [~] and the executor reports what failed. You decide whether to fix forward or revert.
If the gate passes, the spec is ready to ship.
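The validation and gate rules above fit in a small sketch. It is illustrative only: `KickoffCandidate`, `startSpec`, and `gatePasses` are hypothetical names, and the refusal to restart a spec that is already [~] or [x] is an assumption rather than documented behavior.

```typescript
// Checkbox markers from the backlog: [ ] not started, [~] in progress, [x] done.
type StatusMarker = "[ ]" | "[~]" | "[x]";
type ExecutorKind = "backend" | "infra";

interface KickoffCandidate {
  id: string;
  kind?: ExecutorKind;
  marker: StatusMarker;
  dependenciesDone: boolean;
}

// Validates the spec, then returns a copy flipped to [~]. Throws on any failed check.
function startSpec(spec: KickoffCandidate | undefined): KickoffCandidate {
  if (!spec) throw new Error("spec does not exist");
  if (!spec.kind) throw new Error(`${spec.id} has no kind tag`);
  if (!spec.dependenciesDone) throw new Error(`${spec.id} has unmet dependencies`);
  // Assumption: a spec that is already [~] or [x] is not restarted.
  if (spec.marker !== "[ ]") throw new Error(`${spec.id} is already ${spec.marker}`);
  return { ...spec, marker: "[~]" };
}

// backend: tests must pass. infra: tests and the smoke check must both pass.
function gatePasses(kind: ExecutorKind, testsPass: boolean, smokePass: boolean): boolean {
  return kind === "infra" ? testsPass && smokePass : testsPass;
}

const startedSpec = startSpec({
  id: "TASK-02",
  kind: "infra",
  marker: "[ ]",
  dependenciesDone: true,
});
// The spec is now [~]; with a failing smoke check the infra gate does not pass,
// so the spec stays at [~].
console.log(startedSpec.marker, gatePasses("infra", true, false));
```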
/craft-ui <description>: for UI
/craft-ui is more than an executor. It is a multi-phase, taste-first workflow for UI work that forces the agent to commit to specifics before writing code. The phases, numbered 0 through 9:
| Phase | What it does |
|---|---|
| 0 | Read DESIGN.md, CLAUDE.md, the design-md-builder skill (if DESIGN.md is missing), token files, existing components. Stop if DESIGN.md is missing. |
| 1 | Classify the task (SaaS / app UI vs marketing / landing vs design system / component). Different rules apply. |
| 2 | Brief — up to 5 clarifying questions, only the ones the spec doesn't already answer. |
| 3 | Aesthetic commitment. Name the direction (e.g. "editorial brutalism"), 3 IS / 3 IS NOT adjectives, dominant color move, type contrast, motion temperament. Do not write code until this phase is done. |
| 4 | Forbidden defaults — banned for this task unless DESIGN.md explicitly overrides. |
| 5 | Information architecture — outline before JSX. |
| 6 | Token alignment — every visual decision maps to a token. |
| 7 | Build. State coverage is mandatory: hover, focus-visible, active, disabled, loading, error, empty. |
| 8 | Visual review loop. Screenshot at 4 viewports, apply rubric, iterate. |
| 9 | Hand-off — summary, new tokens, follow-ups. |
You invoke it like:
/craft-ui hero section for the marketing page
The agent walks through the phases in order. At Phase 8 (visual review), it invokes the design-reviewer subagent (or instructs you to start a dev server if Playwright MCP isn't available) and iterates until the rubric clears.
Phase 3 (aesthetic commitment) is the single most important phase. If the agent skips it and starts coding, the work drifts toward generic output. Phase 3 is mandatory: /craft-ui won't proceed past it without a written commitment.
Why two executors?
UI and backend have different verification gates that aren't substitutable. A passing test suite proves backend logic is correct. It proves nothing about whether the UI is good. A clean visual rubric proves the UI matches DESIGN.md. It proves nothing about whether the API behind it is sound.
/kickoff-spec runs the tests-pass gate. /craft-ui runs the visual-rubric gate. They share status mechanics (flipping [~] and [x]) but the actual verification differs.
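The dispatch rule can be written as an exhaustive switch, so a new kind cannot be added without also choosing its lane and gate. This is a sketch of the idea only. `SpecKind`, `ExecutionLane`, and `laneFor` are hypothetical names, and the gate descriptions are paraphrased from this chapter.

```typescript
type SpecKind = "ui" | "backend" | "infra";

interface ExecutionLane {
  command: "/craft-ui" | "/kickoff-spec";
  gate: string;
}

// Maps a spec kind to its executor and its primary verification gate.
function laneFor(kind: SpecKind): ExecutionLane {
  switch (kind) {
    case "ui":
      return { command: "/craft-ui", gate: "visual rubric at 4 viewports" };
    case "backend":
      return { command: "/kickoff-spec", gate: "tests pass" };
    case "infra":
      return { command: "/kickoff-spec", gate: "tests pass + smoke check" };
    default: {
      // Compile-time exhaustiveness check: adding a kind without a case fails here.
      const unreachable: never = kind;
      throw new Error(`unknown kind: ${String(unreachable)}`);
    }
  }
}

console.log(laneFor("ui").command, laneFor("infra").gate);
```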
Chapter 10 covers verification gates in depth.
Step 3: Verification
The gate is non-substitutable. You can't ship a UI spec on a passing test suite alone. The kit doesn't only run the kind's primary gate. It also surfaces secondary reviewers when relevant.
After UI work: design-reviewer
The design-reviewer subagent is the kit's read-only design critic. It reads DESIGN.md, identifies what changed, takes screenshots at 4 viewports (375 / 768 / 1280 / 1920), and applies the rubric:
- Aesthetic match: does the work reflect the named direction in DESIGN.md?
- Forbidden defaults: Inter, Roboto, purple-on-white, Material easing, decorative hover scales.
- Token discipline: no hex literals in className, no raw Tailwind color utilities.
- Type contrast: display vs body distinguishable at a glance.
- Spatial rhythm: spacing values from the scale.
- State coverage: hover, focus-visible, active, disabled, loading, error, empty.
- Motion sanity: durations under 400ms, only transform and opacity animated, prefers-reduced-motion respected.
The output is a structured report with verdict (PASS / NEEDS CHANGES / FAIL) and specific diff suggestions:
🔴 BLOCKING
- app/page.tsx:23 — replace `bg-zinc-900` with `bg-[--color-bg-elevated]`
- app/page.tsx:88 — body leading is 1.4; DESIGN.md specifies 1.6 for body
- Hero animation duration is 600ms; DESIGN.md ceiling is 400ms
The reviewer never edits code. You apply the diffs and re-invoke the reviewer. Repeat until PASS.
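The token-discipline item in the rubric is the easiest one to approximate mechanically. The sketch below is a rough heuristic, not the reviewer's actual logic: `scanTokenDiscipline` is a hypothetical helper, the color names follow Tailwind's default palette, and the hex pattern will also flag look-alikes such as `#feed` in a URL fragment.

```typescript
interface TokenFinding {
  line: number;
  match: string;
  reason: string;
}

// Raw palette utilities such as bg-zinc-900 (Tailwind default color names).
const RAW_TAILWIND_COLOR =
  /\b(?:bg|text|border|ring|fill|stroke)-(?:slate|gray|zinc|neutral|stone|red|orange|amber|yellow|lime|green|emerald|teal|cyan|sky|blue|indigo|violet|purple|fuchsia|pink|rose)-\d{2,3}\b/g;
// Hex color literals such as #fff or #1a1a1a.
const HEX_LITERAL = /#[0-9a-fA-F]{3,8}\b/g;

// Scans source text line by line and reports each offending match.
function scanTokenDiscipline(source: string): TokenFinding[] {
  const findings: TokenFinding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const match of text.match(RAW_TAILWIND_COLOR) ?? []) {
      findings.push({ line: index + 1, match, reason: "raw Tailwind color utility" });
    }
    for (const match of text.match(HEX_LITERAL) ?? []) {
      findings.push({ line: index + 1, match, reason: "hex literal" });
    }
  });
  return findings;
}

const sampleSource = [
  '<main className="bg-zinc-900 text-[--color-fg]">',
  '  <h1 className="text-[#fafafa]">Streak</h1>',
  "</main>",
].join("\n");

// Reports bg-zinc-900 on line 1 and #fafafa on line 2; token references pass.
console.log(scanTokenDiscipline(sampleSource));
```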
After architectural changes: architecture-reviewer
When a spec touches schema, service boundaries, auth, caching, queues, migrations, or public APIs, invoke the architecture-reviewer subagent:
have the architecture-reviewer check this
Or invoke it via the Agent tool directly. The reviewer reads ARCHITECTURE.md, identifies what changed in the diff, and applies a rubric:
- Stack alignment — do new packages align with §1?
- Data model — do new entities/FKs/columns/migrations match §2?
- Service shape — does the change respect §3 module boundaries?
- Cross-cutting concerns — auth at the documented enforcement layer, errors from defined classes, logging in the documented format, caching obeying §4 rules.
- Migration & rollout — pattern matches §4 strategy, backfill stated, rollback story present.
- Evolution / bets — change doesn't conflict with §5 bets.
- Trade-off log freshness — material changes are logged in §6.
Severity: 🔴 BLOCKING / 🟡 NEEDS DECISION / 🟢 ADVISORY.
After schema migrations: migration-reviewer
For any DDL — added column, new index, FK addition, NOT NULL toggle, rename — invoke the migration-reviewer subagent. It does per-statement review:
- Lock acquisition: ACCESS EXCLUSIVE lock risks, blocking writes vs reads.
- Backfill cost: row-count estimate, batch strategy, idempotency.
- NOT NULL adds: must use the safe pattern (add nullable → backfill → CHECK NOT VALID → VALIDATE → SET NOT NULL).
- FK adds: NOT VALID then VALIDATE CONSTRAINT, index on referencing column, ON DELETE behavior.
- Index hygiene: every FK has an index, CONCURRENTLY on hot tables.
- Rename safety: single-step renames are deploy-time race conditions; use the expand-contract pattern instead.
- Transaction wrapping: CREATE INDEX CONCURRENTLY can't be inside a transaction.
- Rollback story: reversible / forward-only / irreversible.
- Multi-tenant exposure: every new table needs tenant_id if the project is multi-tenant.
The reviewer cross-references ARCHITECTURE.md §1 (database + version) and §4 (migration strategy). 🔴 findings block the kickoff verification gate.
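For reference, the safe NOT NULL pattern from the list above expands to roughly the following statements. This sketch assumes PostgreSQL 12 or later, where SET NOT NULL can rely on an already-validated CHECK constraint instead of rescanning the table. `safeNotNullSteps`, the table and column names, and the final DROP CONSTRAINT cleanup are illustrative assumptions, not something the kit generates.

```typescript
// Builds the statement sequence for adding a NOT NULL column without holding
// a long lock. Identifiers are interpolated unquoted: illustration only, do
// not feed untrusted input into this.
function safeNotNullSteps(
  table: string,
  column: string,
  sqlType: string,
  backfillExpr: string,
): string[] {
  const constraint = `${table}_${column}_not_null`;
  return [
    // 1. Add the column as nullable.
    `ALTER TABLE ${table} ADD COLUMN ${column} ${sqlType};`,
    // 2. Backfill. On a large table this single UPDATE would be run in batches.
    `UPDATE ${table} SET ${column} = ${backfillExpr} WHERE ${column} IS NULL;`,
    // 3. Add the check without validating existing rows.
    `ALTER TABLE ${table} ADD CONSTRAINT ${constraint} CHECK (${column} IS NOT NULL) NOT VALID;`,
    // 4. Validate existing rows in a separate step.
    `ALTER TABLE ${table} VALIDATE CONSTRAINT ${constraint};`,
    // 5. Promote to a real NOT NULL.
    `ALTER TABLE ${table} ALTER COLUMN ${column} SET NOT NULL;`,
    // 6. Optional cleanup (an addition in this sketch): the CHECK is now redundant.
    `ALTER TABLE ${table} DROP CONSTRAINT ${constraint};`,
  ];
}

const notNullSteps = safeNotNullSteps("entries", "tenant_id", "uuid", "'00000000-0000-0000-0000-000000000000'");
console.log(notNullSteps.join("\n"));
```

Each statement is meant to run as its own migration step, so the backfill and the validation do not share a transaction with the schema changes around them.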
After API changes: api-reviewer
For any new or modified HTTP endpoint, server action, RPC handler, or webhook receiver, invoke the api-reviewer subagent. It does per-endpoint review:
- Authorization granularity: auth check exists and verifies the user belongs to the resource (catches BOLA / IDOR).
- Multi-tenant filtering: every query filters by tenant_id if the project is multi-tenant.
- Input validation: schema-validated, mass-assignment safe, no as any shortcuts.
- Idempotency: POST endpoints that side-effect externally (charges, emails, third-party calls) need idempotency keys.
- Rate limiting: per-route, per-user, scoped where documented.
- Status codes: 201 on create, 422 (or 400) on validation, 401 vs 403, 429 with Retry-After.
- Webhook handlers: signature verification before parsing, replay protection, event-ID idempotency, async-safe.
- URL safety: open redirect, SSRF.
🔴 findings block the kickoff verification gate. The most common 🔴 finding in AI-generated APIs is broken access control at the resource level: an auth check is present, but it doesn't verify that the user owns the resource being queried.
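The difference between "auth is present" and "auth verifies ownership" is easier to see in code. The sketch below is a framework-free illustration. `AuthenticatedCaller`, `JournalEntry`, and `authorizeEntryRead` are hypothetical, and answering 404 for another tenant's row (so ids cannot be probed) is one common convention, not a rule stated by the kit.

```typescript
interface AuthenticatedCaller {
  userId: string;
  tenantId: string;
}

interface JournalEntry {
  id: string;
  ownerId: string;
  tenantId: string;
}

type AccessResult =
  | { httpStatus: 200; entry: JournalEntry }
  | { httpStatus: 401 | 403 | 404 };

// Decides whether the caller may read the entry that the lookup returned.
function authorizeEntryRead(
  caller: AuthenticatedCaller | null,
  entry: JournalEntry | undefined,
): AccessResult {
  // 401: nobody is logged in.
  if (!caller) return { httpStatus: 401 };
  // 404: missing row, or a row in another tenant (assumed convention).
  if (!entry || entry.tenantId !== caller.tenantId) return { httpStatus: 404 };
  // 403: logged in and same tenant, but not the owner. This is the check
  // that is missing when an endpoint only verifies that a session exists.
  if (entry.ownerId !== caller.userId) return { httpStatus: 403 };
  return { httpStatus: 200, entry };
}

const sampleEntry: JournalEntry = { id: "e1", ownerId: "u1", tenantId: "t1" };
console.log(authorizeEntryRead({ userId: "u2", tenantId: "t1" }, sampleEntry).httpStatus);
```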
Chapter 10 covers all four reviewers in depth.
Step 4: Ship
/ship-spec US-04
The skill:
- Runs a code review: typically /pre-commit-review or equivalent, focused on code quality, naming, formatting, and tests. (This is different from the architecture / migration / api / design reviewers. Those check shape; pre-commit-review checks style.)
- Pauses for human merge confirmation. Ship-spec deliberately doesn't auto-merge; you confirm before the merge mechanics fire.
- Executes merge mechanics: for worktree-based workflows, that's rebase + push from the worktree; for direct-on-main workflows, it's a push from your branch.
- Cleans up: closes the worktree if applicable, flips the spec status from [~] to [x], increments the phase's done count.
The spec is now done. Loop back to step 1.
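The ordering of the ship steps matters more than their details: nothing is merged or marked done until a human confirms. A minimal sketch of that ordering follows. `ShipSteps` and `shipSpec` are hypothetical, and stopping on a failed review is an assumption, since the chapter does not say what happens in that case.

```typescript
interface ShipSteps {
  review: () => boolean; // e.g. a pre-commit review; true when clean
  confirmMerge: () => boolean; // the human says yes or no
  merge: () => void; // rebase + push, or a plain push
  cleanup: () => void; // close worktree, flip [~] to [x], bump done count
}

type ShipOutcome = "review_failed" | "merge_declined" | "shipped";

// Runs the steps in order and stops at the first one that does not pass.
function shipSpec(steps: ShipSteps): ShipOutcome {
  if (!steps.review()) return "review_failed"; // assumption: a failed review stops the run
  if (!steps.confirmMerge()) return "merge_declined";
  steps.merge();
  steps.cleanup();
  return "shipped";
}

const shipLog: string[] = [];
const shipOutcome = shipSpec({
  review: () => true,
  confirmMerge: () => false, // the human declines
  merge: () => {
    shipLog.push("merge");
  },
  cleanup: () => {
    shipLog.push("cleanup");
  },
});
// The run stops at confirmation, so neither merge nor cleanup is recorded.
console.log(shipOutcome, shipLog);
```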
/ship-followup (optional)
If /ship-spec surfaced deferred items (🟡 NEEDS DECISION findings, FILES TOUCHED deviations, operational chores, workflow gaps), /ship-followup processes them with per-item confirmation: fix in place, file as new inbox item, or flag for human review.
Mid-loop escape hatches
Sometimes you're partway through a spec and something architectural or visual surfaces. Stop, frame the decision, then continue.
/architecture-review for in-flight architectural decisions
If you're executing a backend spec and realize the spec didn't actually decide whether to use Redis or in-process caching, stop and run:
/architecture-review
Or by skill name:
architecture-review
The skill frames the decision (often by reframing it, since your phrasing usually hides the real question), anchors against existing ARCHITECTURE.md commitments, surfaces trade-offs, recommends a path, and appends to the Trade-off log on your approval.
If existing commitments resolve the question, the skill stops in 30 seconds: "ARCHITECTURE.md §1 already commits to in-process for V1."
Chapter 06 covers /architecture-review in depth.
Subagent reviewers as you go
You don't have to wait until kickoff verification to invoke the reviewers. If you've just finished writing an API endpoint, invoke api-reviewer immediately:
have the api-reviewer check the new endpoints
Doing so catches drift earlier and shortens the fix loop. The kit's CLAUDE.md template encourages it:
After any HTTP endpoint, server action, RPC handler, or webhook receiver is added or modified, invoke the api-reviewer subagent before considering the work done.
Invoke them via the Agent tool directly. There is no behavioral difference between user-triggered and agent-triggered invocations.
/forbid when you spot a regression
If during a /craft-ui run you notice the agent reaching for a pattern that doesn't fit, and DESIGN.md doesn't already forbid it, capture it:
/forbid hover scales over 1.02 — decorative, not communicating state
The slash command appends the rule to DESIGN.md's project-specific forbidden section. Future /craft-ui and design-reviewer runs will catch it.
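Mechanically, the append amounts to finding one section in a markdown file and adding a bullet at its end. The sketch below shows that idea under stated assumptions: the heading text in `FORBIDDEN_HEADING` is hypothetical (the real section title in DESIGN.md may differ), and `appendForbiddenRule` is not the command's actual implementation.

```typescript
// Hypothetical heading; the real section title in DESIGN.md may differ.
const FORBIDDEN_HEADING = "## Project-specific forbidden patterns";

// Returns a copy of the document with the rule added as the last bullet of
// the forbidden section, creating the section when it is missing.
function appendForbiddenRule(designMd: string, rule: string): string {
  const lines = designMd.split("\n");
  const start = lines.indexOf(FORBIDDEN_HEADING);
  if (start === -1) {
    const trimmed = designMd.replace(/\s+$/, "");
    return `${trimmed}\n\n${FORBIDDEN_HEADING}\n\n- ${rule}\n`;
  }
  // The section ends at the next level-1 or level-2 heading, or at end of file.
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    if (/^#{1,2} /.test(lines[i] ?? "")) {
      end = i;
      break;
    }
  }
  // Step back over trailing blank lines so the bullet joins the existing list.
  let insertAt = end;
  while (insertAt > start + 1 && (lines[insertAt - 1] ?? "").trim() === "") {
    insertAt--;
  }
  lines.splice(insertAt, 0, `- ${rule}`);
  return lines.join("\n");
}

const designDoc = [
  "# DESIGN",
  "",
  "## Project-specific forbidden patterns",
  "",
  "- No purple-on-white gradients",
  "",
  "## Tokens",
  "",
].join("\n");

console.log(appendForbiddenRule(designDoc, "hover scales over 1.02"));
```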
Chapter 07 covers /forbid in depth.
Anti-patterns in the daily loop
| Anti-pattern | Cost |
|---|---|
| Kicking off a spec without kind: triage | Wrong execution lane runs; UI specs miss the visual gate, backend specs miss the tests gate |
| Shipping without invoking the secondary reviewers (architecture, migration, api) | Drift compounds. By month 3, every architectural decision needs reverse-engineering |
| Batching multiple specs into one kickoff | Verification becomes ambiguous. One failed gate could be from any of the batched changes |
| Skipping /craft-ui Phase 3 (aesthetic commitment) for "small" UI changes | The agent fills the silence with generic defaults. "Small" UI changes are where regression hides |
| Manually flipping spec status [~] or [x] instead of using kickoff/ship | Phase counts silently drift. /pick-next-task and prd-revise start lying |
| Running prd-revise mid-spec | Produces context noise during implementation. Wait for a quiet moment |
| Pushing back on a 🔴 finding from a reviewer | Either the change is wrong or the rule is wrong. Both require deliberate action. Never silent acceptance |
A representative day
A real day in the loop, condensed:
Morning.
- /pick-next-task recommends US-04 (compute streak, kind: backend, P1).
- /kickoff-spec US-04 flips the status, the executor implements it.
- Tests pass. No schema change, no new API endpoint. No reviewer needed beyond the tests-pass gate.
- /ship-spec US-04 runs pre-commit-review, asks for merge confirmation, pushes, cleans up.
Late morning.
- /pick-next-task recommends US-05 (streak shown above today's entry, kind: ui, P2).
- /craft-ui US-05 walks Phase 0–9. Phase 3 commits to "editorial restraint, large numerals, monospace for the streak count, no animation." Phase 7 builds. Phase 8 runs design-reviewer at 4 viewports and surfaces a 🟡 finding: "streak number uses Newsreader, but DESIGN.md commits monospace for numerals." Fix, re-review, PASS.
- /ship-spec US-05 ships.
Afternoon.
- New idea surfaces in conversation: CSV export. Not in PRD. Capture immediately:
/backlog-intake The user mentioned CSV export, eventually
- The idea is now in the Inbox. Won't disrupt current work.
- /pick-next-task shows two more parallel-safe items in P2. Pick one.
- Kickoff, execute, ship.
End of phase.
- All P2 specs are [x]. P2 just shipped.
- /prd-revise surfaces drift: shipped streak numerals in monospace, but PRD didn't specify. Suggests adding to PRD §3. Approve.
- Run /backlog-triage on the CSV export inbox item. Gets tagged kind: backend, sized, scheduled to a deferred phase since it's not urgent.
That's the rhythm. The setup pays off in not having to think about which docs to update or which gate to run. The kit makes those decisions mechanical.
Continue
Next: Chapter 10 goes deep on the verification gates that make the loop trustworthy.