Agent Workflow Kit
Chapter 06
Fig. 06. Boundaries, towers, and bridges.

06 — The architecture doc

When to read this: Before you write feature code, if your project has state, integrations, or cross-cutting concerns. If it's a static site, simple CLI, or library, you can probably skip this chapter (and skip writing ARCHITECTURE.md).

When you need ARCHITECTURE.md

The PRD captures what you're building and why. ARCHITECTURE.md captures how it's built, at a level specific enough that someone reading it could make compatible decisions on a feature you haven't shipped yet.

You need ARCHITECTURE.md if your project has any of:

  • A database (any flavor)
  • Auth (any kind, including "magic links" or OAuth)
  • Payments
  • Multi-tenancy or organization-scoped data
  • Background jobs or queues
  • Caching
  • Real-time / websockets / live multiplayer
  • Multiple cross-cutting concerns that have to behave consistently across features

You can probably skip it if your project is:

  • A static site or content site (no backend state).
  • A simple CLI tool with no persistence.
  • A library or SDK.
  • A throwaway prototype.

When in doubt, write a short one. A 5-page ARCHITECTURE.md beats no ARCHITECTURE.md once your project has any state.

What ARCHITECTURE.md is for

ARCHITECTURE.md exists for two reasons:

  1. To prevent re-litigation of decisions. Without a written-down stack and schema, every feature spec re-asks "what database again? what auth library? what session model?" The agent hallucinates whatever feels right and you get five inconsistent implementations.
  2. To anchor the reviewers. architecture-reviewer, migration-reviewer, and api-reviewer all require ARCHITECTURE.md to do their jobs. Without it, they fall back to code-only sanity checks. With it, they catch specific drift: "this migration violates §4's expand-contract policy" instead of "this migration looks risky."

The doc has six sections:

| § | Section | What it pins down |
|---|---------|-------------------|
| 1 | Stack | Specific named choices for language, framework, hosting, database, ORM, auth, payments, storage, email, queues, search, observability |
| 2 | Data model | Entities, relationships, identity strategy, multi-tenancy, soft-delete policy, audit, time fields |
| 3 | Service shape | Topology, module boundaries, API style, conventions |
| 4 | Cross-cutting concerns | Auth enforcement layer, error taxonomy, logging, caching, queues, secrets, testing strategy, migration & deploy strategy |
| 5 | Evolution / bets | Reversibility map, falsifiable bets ("we're betting users won't exceed 10k rows per tenant"), deferred decisions, known wrong choices |
| 6 | Trade-off log | Append-only log of every architectural decision: date, decision, alternatives, reason chosen |

The single highest-leverage section is §6, the Trade-off log. Every architectural decision the project makes lands there. Future-you (or future agents) reads it to understand why the architecture is shaped the way it is.

Specificity, again

Like the PRD, ARCHITECTURE.md only constrains if it's specific. Examples of vague vs specific:

Vague: "Postgres."

Open questions: Which version? Which host (managed Neon, Supabase, RDS, self-hosted)? Connection pooler? Branching strategy?

Specific: "Postgres 16 on Neon. PgBouncer connection pooling via Neon's transaction-mode pooler. Branch-per-PR for preview deploys."

Vague: "Stripe for payments."

Open questions: Which integration? Checkout (hosted), Elements (embedded), or Connect (multi-party)? Where do webhooks land? What's the customer model?

Specific: "Stripe Checkout (hosted) for the customer-facing payment flow. Webhooks at /api/webhooks/stripe, signature-verified before parsing. Customer model: one Stripe customer per workspace, attached to workspaces.stripe_customer_id."

Vague: "REST API."

Open questions: Resource naming convention? Auth scheme? Error envelope? Pagination style? Versioning?

Specific: "REST under /api/v1/. Auth via session cookie validated in middleware. Error envelope: { error: { code, message, details? } }. Cursor-based pagination: { items, nextCursor }. No versioning beyond /v1 until we have an external API."

The pattern: name the choice, name the host or library or version, name the convention. "REST" alone doesn't constrain anything. "REST under /api/v1/, cursor pagination, this error envelope" does.
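
As a minimal sketch of how the REST example above could be pinned down in code, the TypeScript types below encode the error envelope and the cursor page shape. The helper names (toErrorEnvelope, paginate), the use of the last item's id as the cursor, and null as the end-of-list marker are assumptions made for illustration, not conventions the kit prescribes.

```typescript
// Error envelope from the example above: { error: { code, message, details? } }
type ErrorEnvelope = {
  error: { code: string; message: string; details?: unknown };
};

// Cursor-paginated response from the example above: { items, nextCursor }
// Assumption: nextCursor is null when there are no more pages.
type Page<T> = { items: T[]; nextCursor: string | null };

// Hypothetical helper: builds the envelope so every route returns the same shape.
function toErrorEnvelope(code: string, message: string, details?: unknown): ErrorEnvelope {
  return details === undefined
    ? { error: { code, message } }
    : { error: { code, message, details } };
}

// Hypothetical helper: pages through rows that are already sorted by a unique string id.
// Assumption: the cursor is the id of the last item on the previous page.
function paginate<T extends { id: string }>(
  sortedRows: T[],
  limit: number,
  cursor: string | null = null,
): Page<T> {
  const after = cursor; // const binding so the null check narrows inside the callback
  const start = after === null ? 0 : sortedRows.findIndex((r) => r.id > after);
  if (start === -1) return { items: [], nextCursor: null };
  const items = sortedRows.slice(start, start + limit);
  const last = items[items.length - 1];
  const hasMore = start + limit < sortedRows.length;
  return { items, nextCursor: hasMore && last !== undefined ? last.id : null };
}
```

Writing the shapes down once, in one module, is what lets a reviewer say "this route breaks the envelope" rather than "this response looks unusual."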

How architecture-md-builder works

architecture-md-builder is a skill, not a slash command:

architecture-md-builder

It follows the same shape as prd-grill: it asks one question at a time, offers a recommended answer with each, and pushes past vague responses.

Phase 0: diagnose

The skill first checks:

  • Does docs/PRD.md exist and is it filled in? If not, the skill stops. Architecture without product context produces wrong choices.
  • Does docs/ARCHITECTURE.md already exist? If yes, is it specific or vague? The skill only re-asks the vague sections.
  • Is there existing code? package.json, lockfiles, and top-level folders are constraints, not blank-slate decisions. They get captured verbatim.

Phases 1 through 6: stack, data model, service shape, cross-cutting, evolution, trade-off log

Each phase corresponds to a section of the doc. The skill asks for named, specific answers. "TBD" is an acceptable answer, but it gets logged in the Trade-off log with a deadline by which it must be decided.
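
One way to keep "TBD with a deadline" honest is to store the deadline in a form a script can check. The sketch below is illustrative only: the DeferredDecision type, its field names, and the use of calendar dates as deadlines are assumptions, not part of the skill.

```typescript
// Hypothetical record for a decision answered "TBD".
type DeferredDecision = {
  topic: string;    // e.g. "Search"
  decideBy: string; // ISO date, YYYY-MM-DD (assumed format)
};

// Returns the deferred decisions whose deadline has already passed.
function overdueDecisions(deferred: DeferredDecision[], today: string): DeferredDecision[] {
  // Dates in YYYY-MM-DD form sort correctly as plain strings.
  return deferred.filter((d) => d.decideBy < today);
}
```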

Phase 1: stack

The skill walks through this matrix, one row at a time:

| Decision | Push past vague answers like... |
|----------|---------------------------------|
| Language / runtime | "TypeScript" → which Node version, ESM/CJS |
| Framework | "Next.js" → which router (App / Pages), version, rendering mode default |
| Hosting / deploy target | "Vercel" → fluid compute / edge / sandbox; preview-deploy strategy |
| Database | "Postgres" → host, version, connection pooler, branching strategy |
| ORM / query layer | "Drizzle" → migrations tool, schema location, transaction patterns |
| Auth | "Clerk" → session model, organizations, JWT vs cookie |
| Payments / billing | "Stripe" → integration shape, webhook handler location, customer model |
| Storage / files | "Vercel Blob" → public/private split, signed URL strategy |
| Email / notifications | "Resend" → transactional vs marketing split, templates location |
| Background jobs / queues | "Vercel Queues" → at-least-once acceptance, idempotency keys |
| Search | "Postgres FTS" → if scaled, what's the migration path |
| Analytics / observability | "PostHog + Sentry" → what's tracked vs sampled |

Stack decisions get logged to §6 (Trade-off log) as you make them: name the decision, name the alternative considered, name the reason.

Phase 2: data model

The most expensive section to get wrong. The skill pushes hard:

  • Entities. Every persisted entity, with a one-sentence purpose and 3–7 fields that matter most.
  • Relationships. For each FK: cardinality, ON DELETE behavior, soft-delete applicability.
  • Identity strategy. UUIDs (v4? v7?), nanoids, sequential ints, or composite keys. Pick once.
  • Multi-tenancy. None, row-level (tenant_id column), schema-per-tenant, or DB-per-tenant.
  • Soft-delete vs hard-delete. Per-entity if mixed.
  • Audit / history. Which entities need change history, where it lives.
  • Time fields. created_at / updated_at / deleted_at? Timezone storage? timestamptz or timestamp?
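
To show how several of these answers combine, here is a minimal sketch that assumes row-level multi-tenancy, soft delete on every entity, and the three time fields. The type and function names are hypothetical, and the in-memory filter stands in for what would normally be a WHERE clause in the query layer.

```typescript
// Columns every tenant-scoped table carries in this sketch.
// All names are illustrative assumptions, not a schema from the kit.
type TenantScopedRow = {
  id: string;             // e.g. a UUID; which version is one of the Phase 2 decisions
  tenantId: string;       // row-level multi-tenancy: maps to a tenant_id column
  createdAt: Date;        // assumed to be stored as timestamptz
  updatedAt: Date;
  deletedAt: Date | null; // soft delete: null means the row is live
};

// Returns only live rows that belong to the given tenant.
function visibleRows<T extends TenantScopedRow>(rows: T[], tenantId: string): T[] {
  return rows.filter((r) => r.tenantId === tenantId && r.deletedAt === null);
}
```

Deciding these once means every feature spec inherits the same scoping rule instead of inventing its own.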

Phase 2 ends with a Mermaid entity diagram inside ARCHITECTURE.md. Even rough, the visual catches relationship mistakes prose hides.
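
The diagram can be generated from the same relationship answers rather than drawn by hand. The sketch below assumes a small hypothetical Relationship record and emits Mermaid erDiagram text; the entity names in the usage note are made up, and the skill itself may produce its diagram differently.

```typescript
// Cardinality of a foreign key, seen from the parent entity.
type Cardinality = "one-to-one" | "one-to-many";

// Hypothetical record of one FK decision from Phase 2.
type Relationship = {
  parent: string;
  child: string;
  cardinality: Cardinality;
  onDelete: "cascade" | "restrict" | "set null";
};

// Mermaid erDiagram connectors: ||--|| is exactly one to exactly one,
// ||--o{ is exactly one to zero or more.
const CONNECTOR: Record<Cardinality, string> = {
  "one-to-one": "||--||",
  "one-to-many": "||--o{",
};

// Emits one erDiagram line per relationship, labeled with its ON DELETE behavior.
function toMermaidErDiagram(relationships: Relationship[]): string {
  const lines = relationships.map(
    (r) => `  ${r.parent} ${CONNECTOR[r.cardinality]} ${r.child} : "on delete ${r.onDelete}"`,
  );
  return ["erDiagram", ...lines].join("\n");
}
```

For a single WORKSPACE to PROJECT relationship with cascade delete, this produces the header line erDiagram followed by one relationship line, which can be pasted into a mermaid code fence in ARCHITECTURE.md.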

Phase 3: service shape and boundaries

For most projects this is a one-page section. Skipping it gives you tangled code in month 3.

  • Topology (monolith / modular monolith / services). Default: modular monolith.
  • Module boundaries. 3–6 top-level modules, and what each owns.
  • API style (REST / RPC / GraphQL / server actions / mix).
  • API conventions. Resource naming, error envelope, pagination, idempotency.
  • Internal vs external API split. Auth boundary at each.

Phase 4: cross-cutting concerns

These don't belong to any one feature spec, so without ARCHITECTURE.md they get reinvented per-spec inconsistently.

  1. Auth & authorization. Role model. Where checks live (middleware / route handler / query layer).
  2. Error handling. Error class taxonomy. How errors surface to clients.
  3. Logging. What's logged at each level. PII handling. Where logs go.
  4. Observability. Tracing, metrics, alerts.
  5. Caching. What, where, TTL/tag/invalidation.
  6. Rate limiting. Per-route, per-user, global. Which backend stores the counters.
  7. Secrets. Where they live. Rotation. Injection.
  8. Background work. Where it runs. Failure handling. Idempotency.
  9. Testing strategy. Unit / integration / e2e split. What's mocked vs real (especially the database).
  10. Migration & deploy strategy. Forward-only / expand-contract / dual-write. Rollback story.
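
Item 8 is a good example of a concern that only works if every feature handles it the same way. Under at-least-once delivery a job can arrive twice, so the handler needs an idempotency check. The sketch below is a simplified illustration: the Job shape and IdempotentWorker class are hypothetical, and the in-memory Set stands in for a durable store such as a table with a unique constraint on the key.

```typescript
// Hypothetical job shape: the producer attaches a stable idempotency key.
type Job = { idempotencyKey: string; payload: string };

class IdempotentWorker {
  // In-memory stand-in for a durable store; a real system must persist
  // seen keys across restarts, or duplicates slip through.
  private readonly seen = new Set<string>();
  public readonly processed: string[] = [];

  // Returns true if the job ran, false if it was a duplicate delivery.
  handle(job: Job): boolean {
    if (this.seen.has(job.idempotencyKey)) return false;
    this.seen.add(job.idempotencyKey);
    this.processed.push(job.payload); // the side effect runs once per key
    return true;
  }
}
```

In a real handler the key would be recorded in the same transaction as the side effect, so a crash between the two cannot lose or repeat the work. That detail is exactly the kind of rule that belongs in §4.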

Phase 5: evolution and bets

A good ARCHITECTURE.md ages well because it admits what's provisional.

  • Reversibility map. For each major decision, mark easy / medium / hard to reverse. The hard ones are where you should have spent the most thought.
  • Bets. Falsifiable claims, e.g. "users won't exceed 10k rows per tenant." Include the trigger that would force a redesign.
  • Deferred decisions. Things you intentionally pushed off, with the deadline by which they must be revisited (usually a phase boundary in docs/ROADMAP.md).
  • Known wrong choices shipping anyway. Documenting them prevents the "why did we do this" archaeology in 6 months.
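
A bet is only falsifiable if something can actually check it. As a rough sketch, each bet can carry a metric name, a limit, and the trigger text, and a small function can report which bets the observed numbers have broken. The Bet type, the metric name, and the trigger wording below are all hypothetical.

```typescript
// Hypothetical record for one bet from §5.
type Bet = {
  claim: string;   // the falsifiable statement
  metric: string;  // assumed metric name; a real project would wire this to monitoring
  limit: number;   // the bet holds while the observed value stays at or below this
  trigger: string; // what the team does once the bet is falsified
};

// Returns the bets whose observed metric has exceeded its limit.
// Metrics with no observation yet are treated as "still holding".
function falsifiedBets(bets: Bet[], observed: Record<string, number | undefined>): Bet[] {
  return bets.filter((b) => {
    const value = observed[b.metric];
    return value !== undefined && value > b.limit;
  });
}
```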

Phase 6: trade-off log

Append-only section at the bottom. Every entry: date, decision, alternatives considered, reason chosen, links to relevant specs/PRs if available.

Example entry:

### 2025-03-12 — ORM choice
 
- **Chose:** Drizzle
- **Considered:** Prisma, Kysely, raw SQL with Postgres.js
- **Reason:** Schema-first authoring, low runtime overhead (no codegen process running),
  Postgres-typed query builder. Prisma's runtime engine adds ~30ms per query in our
  early benchmarks; Drizzle's prepared queries are faster. Kysely is also fast but its
  ecosystem is thinner and we want batteries-included migrations.
- **Reversibility:** medium. Migration would require rewriting query layer but schema
  is portable.
- **Related:** spec P0-#3 (database setup)

Short, specific, structured the same way every time so future-you can scan the log quickly.
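
Because every entry has the same structure, a script can check it. The sketch below is a hypothetical lint, not something the kit ships: it assumes entries start with a "### YYYY-MM-DD" heading and use the bold field labels from the example above, and it reports whichever required parts are missing.

```typescript
// Fields the chapter says every entry needs, using the labels from the example entry.
const REQUIRED_FIELDS = ["Chose", "Considered", "Reason"];

// Returns a list of problems found in one Markdown log entry (empty if none).
function lintEntry(entryMarkdown: string): string[] {
  const problems: string[] = [];
  // Assumed heading format: "### YYYY-MM-DD" followed by a space and a title.
  if (!/^### \d{4}-\d{2}-\d{2} /m.test(entryMarkdown)) {
    problems.push("missing dated heading");
  }
  for (const field of REQUIRED_FIELDS) {
    if (!entryMarkdown.includes(`- **${field}:**`)) {
      problems.push(`missing ${field}`);
    }
  }
  return problems;
}
```

Reversibility and Related are left optional here because the chapter lists related links as "if available"; a stricter project could add them to the required list.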

Phase 7: write the file

After all six phases, the skill writes docs/ARCHITECTURE.md from its internal template. Sections you decided are filled in. Sections you deferred read **Deferred** — see Trade-off log entry [date], so triage and review can flag specs that touch unresolved decisions.

After writing, the skill suggests:

  1. Re-running /backlog-triage on any pending Inbox items now that ARCHITECTURE.md exists. Triage is sharper with architecture context.
  2. Updating the PRD's Revision log if any ARCHITECTURE decision invalidated a PRD assumption.

Per-decision review with /architecture-review

architecture-md-builder is the full interrogation, run once at project setup (or after a major pivot).

For individual architectural decisions later — a new service, a schema migration strategy, a caching choice for a specific query — use the architecture-review skill:

architecture-review

In some configurations, you can also invoke it as a slash command:

/architecture-review

This is a targeted skill. You describe one decision; it reads the PRD and ARCHITECTURE.md, frames the decision honestly, surfaces trade-offs, and recommends a path. Once you approve the entry, it appends the resolved decision to the ARCHITECTURE.md §6 Trade-off log.

When to use /architecture-review

Run it when the decision touches one of:

  • Schema or data model changes (new entity, FK, index, partition strategy)
  • Service or module boundaries (extract into its own service, new module)
  • Auth, authorization, or session model changes
  • Caching strategy (what to cache, where, invalidation model)
  • Background work / queueing (new job, retry semantics, idempotency)
  • Tech stack picks (new library for a cross-cutting concern)
  • Migration / rollout strategy (forward-only vs expand-contract, dual-write, backfill)
  • Public API shape (a new external surface, breaking change)

When NOT to use it

Skip it for:

  • Choosing between two ways to write a function — that's the executor's call.
  • UI patterns — that's /craft-ui and DESIGN.md.
  • Bug fixes that don't change the shape of anything.
  • Decisions already resolved in ARCHITECTURE.md — just follow the doc.
  • Renames.

The test: if this decision turns out wrong, how expensive is reversing it? If "an afternoon," skip the review. If "weeks of migration work or a breaking change for users," run it.

What /architecture-review produces

A typical run produces:

  1. A reframed decision statement (the skill restates the question to confirm understanding — often the user's phrasing hides the real question).
  2. An anchor check against ARCHITECTURE.md (does the existing doc already commit to one of the options? does it rule out one?).
  3. Trade-off table for surviving options (concrete pros and cons, reversibility, operational cost, coupling).
  4. A recommendation with one paragraph of reasoning.
  5. A Trade-off log entry the user approves before it's written.

If existing commitments resolve the question, the skill says so and stops. The right outcome of a 30-second review is "ARCHITECTURE.md §1 already commits to X — go with X." That's a win, not a punt.

The Trade-off log earns its keep

The Trade-off log is the highest-leverage piece of ARCHITECTURE.md. Six months in, when someone asks "wait, why did we choose X over Y?", the log has the answer: date, alternatives, reason. Without it, every architectural choice gets re-litigated, often by an agent with no memory of the original constraints.

If you do nothing else from this chapter, commit to the Trade-off log. A thin ARCHITECTURE.md with a thick Trade-off log beats a thick ARCHITECTURE.md with no log.

Common stumbles

| Symptom | Fix |
|---------|-----|
| Specs keep needing rework because schema/auth/etc. choices were wrong | Run architecture-md-builder once before backlog-triage; run /architecture-review per architecturally loaded decision |
| Triage refuses to write a spec citing "architectural load" | Take the hint: run /architecture-review and append to §6, then resume triage |
| ARCHITECTURE.md says "Postgres" and not much else | Run architecture-md-builder again; the skill will only re-interrogate vague sections |
| The Trade-off log is empty | Every architectural decision should land there. If the log is empty, the architecture isn't documented. It's implied. |
| You're tempted to make a big architectural decision in chat without logging it | Stop. Run /architecture-review. Five minutes of friction prevents a year of "why" archaeology. |

Continue

If your project has a UI, continue to Chapter 07: The design doc. Otherwise, continue to Chapter 08: Roadmap and backlog.