Chapter 15

Technical engraving of a diagnostic bench with a jammed paper path and inspection tools — Fig. 15The jam exposed for repair.

15 — Troubleshooting

When to read this: When something is broken. Organized by symptom, not by chapter.

Install and bootstrap#

`agent-workflow: command not found`#

~/.local/bin isn't on your PATH. Add to ~/.zshrc or ~/.bashrc:

export PATH="$HOME/.local/bin:$PATH"

Open a new terminal. Run which agent-workflow. Should print a path.

If the symlink itself is missing:

ls -la ~/.local/bin/agent-workflow

If it's broken or absent, re-run ./scripts/install-global.sh from the kit's source directory.

`agent-workflow` runs but says "kit not found"#

The wrapper finds the kit via $AGENT_WORKFLOW_KIT env var or the parent directory of the symlink target. If you moved the kit folder after installing, the lookup breaks.

Fix by re-running ./scripts/install-global.sh from the kit's new location, or by setting the env var:

export AGENT_WORKFLOW_KIT="/path/to/your/kit/folder"

Bootstrap skipped my `.claude/settings.json`#

Expected behavior. The kit refuses to overwrite .claude/settings.json, even with --force, because settings are usually project-customized.

If you want the kit's example settings (which wires up check-tokens.sh and continual-learning Stop hooks), copy by hand:

cp ~/.local/share/agent-workflow-kit/templates/claude-settings.example.json .claude/settings.json

If you have an existing settings.json, merge the hooks.Stop block from the example into your existing file rather than replacing the whole file. The example block:

{
  "hooks": {
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          { "type": "command", "command": "bash scripts/check-tokens.sh" },
          { "type": "command", "command": "bun run .claude/hooks/continual-learning-stop.ts" }
        ]
      }
    ]
  }
}

Re-running bootstrap doesn't pick up template changes#

By design. Bootstrap skips files that exist. Use --force to overwrite, or copy specific files by hand. .claude/settings.json is never overwritten regardless.

Bun isn't installed and the Stop hook errors#

Install Bun (one-liner): curl -fsSL https://bun.sh/install | bash. Restart your terminal.

Or remove the continual-learning hook from .claude/settings.json (delete that one entry from the hooks.Stop[].hooks array). The rest of the kit works fine without Bun. You just lose the auto-memory loop.

DESIGN.md and UI#

`/craft-ui` halts saying DESIGN.md is missing#

DESIGN.md is required before any UI work. Run:

design-md-builder

The skill walks through brand identity, type, color, motion, voice, forbidden defaults, and writes DESIGN.md at the repo root.

DESIGN.md exists but `/craft-ui` says required sections are missing#

The doc is too vague. /craft-ui Phase 0 looks for specific section headers (brand, type system, color, motion, forbidden defaults). Run design-md-builder again. It diagnoses missing or vague sections and only re-interrogates on those.

Output looks generic despite DESIGN.md#

DESIGN.md is too soft. Specifically check:

Phase 3 of /craft-ui (aesthetic commitment) was skipped or stayed vague. Re-run /craft-ui. The phase is mandatory.
DESIGN.md Forbidden defaults section is mostly empty. Add project-specific rejections via /forbid.
The "IS / IS NOT" adjective lists are missing or generic. Re-run design-md-builder Phase 2.

The unifying test: read DESIGN.md aloud and ask, "could this describe a different brand I've seen?" If yes, it's too vague.

`design-reviewer` says PASS but the work looks bad#

DESIGN.md is too sparse. Nothing for the reviewer to anchor against. Add forbids via /forbid for the patterns you find generic. Re-run the review.

If the issue is taste-level rather than rule-level (e.g., the spacing rhythm "feels off" but no specific rule is broken), DESIGN.md needs a rhythm rule added. Run design-md-builder on the space-and-shape section.

`design-reviewer` says it can't take screenshots#

Playwright MCP isn't configured. Install:

claude mcp add playwright -- npx @playwright/mcp@latest

Restart Claude Code. The next design-reviewer run will use real screenshots.

Without Playwright MCP, the reviewer falls back to a code-only review and announces that at the top of the report.

Token check fails on a color you "need"#

Add the token to DESIGN.md and your CSS. Don't bypass. Adding a token costs 30 seconds and saves consistency for the rest of the project.

If the rule itself is wrong (e.g., DESIGN.md should permit a specific font but the kit's universal forbid blocks it), edit scripts/check-tokens.sh to relax that check, and document the override in DESIGN.md so future-you knows why.

Stop hook isn't running `check-tokens.sh`#

.claude/settings.json doesn't have the hook wired. Either:

Copy the kit's example settings: cp templates/claude-settings.example.json .claude/settings.json (only if you don't already have settings).
Or merge the hooks.Stop block from the example into your existing settings.

Verify the Stop hook is firing by adding a temporary echo "hook fired" to the script. You should see it in Claude Code's logs after each turn.

PRD and architecture#

PRD reads "could be any productivity tool"#

Too generic. Run prd-grill again with the unifying test in mind. Push hardest on Phase 1 (primary user) and Phase 4 (non-goals). Those drift hardest toward generic.

Specs keep getting rejected by triage as "architecturally loaded"#

Take the hint. Run /architecture-review on the architectural decision the spec implies. The skill frames the decision, surfaces trade-offs, recommends a path, and appends to ARCHITECTURE.md Trade-off log. Once the decision is logged, resume triage.

If your project has multiple architectural decisions surfacing in the same triage batch, run architecture-md-builder instead of multiple /architecture-review invocations. The full builder is more efficient when you have several to make at once.

PRD and shipped reality have drifted#

Run /prd-revise. The skill compares shipped specs to PRD capabilities, surfaces drift in three dimensions (shipped but not in PRD, in PRD but not shipped, invalidated assumptions), and walks you through deliberate updates with an append-only Revision log entry.

Don't run prd-revise mid-spec. Wait for a phase boundary or a quiet moment.

The Trade-off log is empty#

Every architectural decision should land there. If the log is empty, the architecture isn't documented. It's implied. Pick the three biggest decisions you've made (stack, auth, data model) and add Trade-off log entries retroactively. Going forward, run /architecture-review on every architecturally-loaded decision.

`architecture-reviewer` says PASS when work feels off#

ARCHITECTURE.md is too vague. Specifically check:

§2 Data model. Does it commit to multi-tenancy, identity strategy, soft-delete policy?
§3 Service shape. Does it commit to API style, error envelope, pagination?
§4 Cross-cutting. Does it commit to auth enforcement layer, error taxonomy, caching policy?

Sparse sections mean the reviewer has nothing to anchor against. Run architecture-md-builder on the missing sections.

Backlog and execution#

`/kickoff-spec` runs the wrong execution lane#

Spec is missing kind: tag. Re-run /backlog-triage on it. The skill checks each spec's tag and dispatches to /craft-ui (for kind: ui) or the kickoff-spec executor (for kind: backend | infra).

Backlog has 30+ items per phase#

Sizing too granular. Merge anything <30 minutes into a parent item. The seed-backlog prompt aims for 5–15 items per phase. If it produced more, the agent over-decomposed.

Backlog has 1 item per phase#

Sizing too coarse. Split anything >1 day. A phase with one giant item isn't really phased. It's a single-task project.

`/pick-next-task` says "nothing unblocked"#

Three causes:

A [~] spec is open. Finish it before starting another.
A real dependency cycle. Re-triage the involved specs or split one.
Everything in V1 is [x]. Run /prd-revise and consider the next phase.

Phase counts in the backlog file are wrong#

Did you manually flip [~] or [x] instead of using /kickoff-spec and /ship-spec? Manual flips don't update the per-phase count tracking. Either:

Fix the count by hand (recount and update).
Or use the kit's commands going forward; they manage state correctly.

Continual learning#

Continual learning never fires#

Three things to check, in order:

Bun is installed. which bun should return a path.
The hook is wired in .claude/settings.json. The hooks.Stop[].hooks array should include bun run .claude/hooks/continual-learning-stop.ts.
You're past the cadence threshold. Default is ≥10 turns + ≥120 minutes. To verify the rest of the loop in a shorter window, set CONTINUAL_LEARNING_TRIAL_MODE=1. Trial mode uses ≥3 turns + ≥15 minutes for 24 hours.

"No high-signal memory updates" runs every time#

The subagent is mining transcripts but not finding durable patterns. Likely causes:

Sessions are short or vary widely. Recurring patterns need a few sessions to be detectable. Be patient.
The patterns are already in CLAUDE.md or DESIGN.md. The subagent skips anything covered by authoritative docs. If you keep correcting the agent on something already in CLAUDE.md, the issue is upstream. Either CLAUDE.md isn't being read, or it's wrong.

State or index file is corrupted#

Delete and let the system recreate:

rm .claude/hooks/state/continual-learning.json
echo '{}' > .claude/hooks/state/continual-learning-index.json

The hook will recreate the state file on the next run. You'll lose the cadence count but not any memory in AGENTS.md.

AGENTS.md is growing too long#

Each section caps at 12 bullets, but the cap only applies during memory updates. If you've been editing AGENTS.md by hand and added more than 12 bullets, force a trim by invoking the subagent:

have agents-memory-updater clean up AGENTS.md

The subagent dedupes and caps.

AGENTS.md has facts that don't apply anymore#

Edit AGENTS.md by hand. Delete the wrong bullets. The subagent respects manual deletions on subsequent runs. It updates matching bullets and adds new ones, but doesn't fight your deletions.

If a deleted bullet keeps coming back because the underlying signal is still in transcripts, the pattern is genuinely durable. Either accept the bullet or change the upstream behavior.

Cursor and Claude Code indexes diverge#

Both clients have their own index. The kit creates both. If you primarily use one, ignore the other's state directory. AGENTS.md is shared. Only transcript-mining is duplicated.

To disable one side, delete its state directory contents. The unused client won't recreate them if its hook isn't firing.

Reviewers#

Reviewer subagent edits files#

It shouldn't. The kit's reviewers are configured with read-only tools (Read, Grep, Glob, Bash). If a custom reviewer is editing, check its frontmatter tools line — Edit and Write shouldn't be there for read-only review work.

Reviewer produces vague feedback#

Review the rubric in the subagent's markdown file. Each finding should cite a specific file:line and a specific rule it breaks. Vague feedback ("looks risky," "consider refactoring") is exactly what the kit's reviewers are designed to avoid. If a custom reviewer is producing it, tighten the rubric to require file:line + specific rule citation.

A reviewer keeps saying NEEDS CHANGES on something that's actually fine#

Two possibilities:

The rule is genuinely wrong. The reviewer is anchoring against a doc that no longer reflects intent. Update the doc (DESIGN.md, ARCHITECTURE.md). That's the deliberate fix.
The work is genuinely wrong but feels right. This is more common than people admit. Re-read the finding carefully. If the reviewer cites a specific rule and a specific violation, the path of least resistance is fixing the code, not rewriting the rule.

The kit's reviewers will not approve work that violates a 🔴 BLOCKING rule, even if you push back. That's by design.

A 🔴 finding seems wrong#

Surface the disagreement in chat. If the reviewer is actually wrong (rule misapplied, false positive), update the rubric in the subagent's markdown file. If the rule is genuinely right and the finding is genuinely correct, the work needs to change.

The reviewer's stance is consistent: if a 🔴 finding is fired, either the code is wrong or the rule is wrong. Both require a deliberate update.

Multi-IDE#

I'm using Cursor + Claude Code on the same project#

Both clients are supported. The kit creates state for both. Documents (CLAUDE.md, AGENTS.md, DESIGN.md) are shared. Tooling is parallel.

The Cursor side reads .cursor/rules/project-memory.mdc (always-apply rule pointing at CLAUDE.md and AGENTS.md). The Claude Code side reads CLAUDE.md and AGENTS.md natively.

Continual-learning runs on both. Cursor's continual-learning plugin (if installed) updates AGENTS.md from Cursor transcripts. The kit's Stop hook + skill + subagent updates AGENTS.md from Claude Code transcripts. Some redundancy is unavoidable. The subagent dedupes.

Slash commands work in Claude Code but not Cursor#

The kit's slash commands live at .claude/commands/<name>.md, Claude Code's location. Cursor uses a different commands system. The kit doesn't currently mirror commands to Cursor's location.

Workaround: invoke the same workflows by name in Cursor (e.g., "run craft-ui for [thing]") instead of using a leading /. The agent will recognize the intent and follow the markdown instructions. Slightly less ergonomic but functionally equivalent.

CI/CD#

Token check failing on third-party generated code#

Either:

Fix the third-party code to use tokens (the kit's /scaffold-component does this from the start).
Scope the check to your custom dirs only: bash scripts/check-tokens.sh src/components src/lib.
Add the third-party path to an exclusion list in the script.

Increase the wait-on timeout, or use webServer config in playwright.config.ts instead of starting the server in a separate step.

// playwright.config.ts
export default defineConfig({
  webServer: {
    command: 'npm run start',
    url: 'http://localhost:3000',
    timeout: 120_000,
    reuseExistingServer: !process.env.CI,
  },
});

Use axe's disableRules option for the specific test:

const results = await new AxeBuilder({ page })
  .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa'])
  .disableRules(['color-contrast'])  // for this test only
  .analyze();

Don't disable globally. That defeats the purpose.

When the kit is genuinely the wrong tool#

A few cases where the kit's overhead exceeds its value:

Throwaway prototypes. If the project will live a week, you don't need PRD / ARCHITECTURE / DESIGN. The kit becomes ceremony.
One-page static sites. No state, no integrations, one page of UI. Just write the page. DESIGN.md still helps for non-genericness. The rest is overkill.
Library / SDK projects. No UI to gate, no architecture beyond "it's a library." The kit's reviewers don't apply. Skip.

For these cases, bootstrap and use the parts that help (DESIGN.md and the token check, maybe). Ignore the rest. The kit doesn't punish partial adoption.

Where to ask for help#

If something is broken and not in this chapter:

Open an issue on the kit's GitHub repo. Include: what you ran, what you expected, what you saw.
The kit is small enough that reading the relevant template or script directly is often faster than tracing through troubleshooting docs. The shipped subagents (.claude/agents/*.md) and skills (.claude/skills/*/SKILL.md) are plain markdown. They describe their own behavior.
If you've found a bug in one of the kit's templates and you're confident in the fix, send a PR.

Continue#

The reference is done. Next: the worked examples ground these abstractions in a concrete end-to-end project.