docs: sprint 4 wrap-up — CHANGELOG + README + 7 lessons + plan final

- CHANGELOG: sprint 4 entry under [Unreleased] (covers all 9 US: dark mode, MITRE matrix overhaul, tactic_ids, done read-only + Reopen, engagement auto-status, UI polish, design-reviewer agent, PR helper, screenshots mandatory). Sprint 3 moved to its own [Sprint 3] section.
- README: status bump, test counts refreshed (193/92/158).
- tasks/lessons.md: 7 sprint-4 lessons captured (git status before sprint close, endpoint round-trip mismatch caught only by e2e, ink vs slab token split, structural row layout > class tweaks, hardcoded paths in migration tests, screenshots with auth, builder cross-context summaries as accidental re-dispatch).
- tasks/todo.md: status flipped to 🟢 SPRINT COMPLET, execution sequence ticks updated with commit hashes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Knacky
2026-05-27 21:41:47 +02:00
parent 43ab7073f1
commit 6d2bb091e2
4 changed files with 97 additions and 7 deletions

View File

@@ -4,6 +4,44 @@ Recurring mistakes and the rule we adopted so the same issue doesn't bite twice.
---
## Sprint 4 (closed 2026-05-27)
### Process — git status before declaring sprint complete (team-lead)
**Context** : Sprint 3 §0 of the plan updated `SPEC.md`'s § Simulation line to plural multi-techniques. That edit sat in the sprint 3 worktree but was never committed; PR #6 merged the sprint 3 code WITHOUT the corresponding spec text. I rediscovered it at sprint 4 start with a stray `M SPEC.md` in the worktree, then carried over via `ba313a3`. Sprint 3 had shipped with SPEC and code briefly out of sync.
**Lesson** : at sprint wrap-up, the team-lead's final pre-PR step MUST be `git status` followed by a 5-second scan for unstaged hunks. If anything is M/?? in the listing, decide explicitly (commit, stash, or abandon). Don't trust commit lists alone.
### Process — Endpoint round-trip mismatch caught by e2e, missed by 3 spec-reviewer passes (team-lead + spec-reviewer)
**Context** : Sprint 3's `GET /api/mitre/matrix` returned `tactic_id` as a slug (`"discovery"`). Sprint 4 added `tactic_ids` PATCH validation against TA-format (`"TA0007"`). Both endpoints worked in isolation but the round-trip (matrix → frontend → PATCH) failed silently — frontend sent the slug from the matrix as a `tactic_ids` value, and the backend rejected with 400. Spec-reviewer ran 3 passes against the plan WITHOUT catching this contract mismatch because both formats were documented in different sections. The test-verifier's e2e caught it (AC-21.6 defect).
**Lesson** : when a sprint introduces a NEW data path that re-uses an EXISTING endpoint's output, the spec-reviewer should explicitly trace the round-trip in code (matrix output → frontend store → PATCH body → backend validator). Pure spec reading misses contract mismatches that only manifest when a feature wires two existing endpoints together. Add to the spec-reviewer checklist: "for any new feature reusing a prior endpoint, trace the actual data flow A → B → C in code, not just specs in isolation."
### Engineering — Token system: split themed text vs fixed surface (frontend-builder + design-reviewer)
**Context** : First dark mode pass made `--color-ink` themed (light = `#1a1a1a`, dark = `#f9fafb`). That correctly inverts for body text, but inverts WRONG for "dark slabs" (utility strip, footer, modal backdrop) that used `bg-ink` as a fixed-dark surface — they became near-white bars in dark mode. Three separate symptoms, one root cause. Design-reviewer (its first run) caught it.
**Lesson** : if a single token (`ink`) plays multiple roles (text + dark surface), split into a themed token (`ink` = text) and a fixed token (`slab` = dark surface that does NOT invert). Same for overlays — `bg-ink/60` for a backdrop inverts; `bg-overlay` or a dedicated `.modal-backdrop` class with `rgba(0,0,0,0.6)` stays correct. Audit token roles BEFORE wiring dark mode, not after.
### Engineering — Form alignment: structural row layout > class tweaks (frontend-builder × 3 attempts)
**Context** : `UsersAdminPage` "Create account" form had labels misaligned by 24px when one cell carried a hint (`≥ 8 characters`) and others didn't, because `items-end` aligned cells to row-bottom. Sprint 2 post-QA "fixed" it with class tweaks. Sprint 4 post-QA reported the bug back. Sprint 4 first attempt added another class tweak (still broken per design-reviewer). Third attempt finally refactored to an explicit 3-row grid (labels / inputs+button / hints) where the browser CAN'T misalign rows of different cells.
**Lesson** : after one failed alignment fix via class tweaks, the next attempt MUST be structural (explicit grid rows, or `display: subgrid`, or DOM restructure). Don't try a third class-level tweak on an alignment issue that survived a prior class-level fix — the problem is in the structure, not the styling.
### Engineering — Hardcoded absolute paths in migration tests (backend-builder × 2 sprints)
**Context** : Sprint 3 backend-builder hardcoded `/home/user/.../.claude/worktrees/sprint-3-mitre-matrix/` in the migration round-trip test to load the migration module via `importlib.util.spec_from_file_location`. Sprint 4 the worktree path was renamed and the test broke — backend-builder "fixed" it by hardcoding the NEW worktree path. Code-reviewer caught it with the same finding twice.
**Lesson** : any test that loads a sibling file by absolute path is wrong by default. Always derive paths from `__file__` (`Path(__file__).resolve().parent.parent / "migrations" / "versions" / "0004_*.py"`). Backend-builder mental checklist before completing a migration test: grep `/home/user/` and grep `worktrees/` in the diff — anything matching is a smell.
### Process — Screenshots-mandatory rule has a fragile transport (frontend-builder)
**Context** : Sprint 4 first round of post-design-review screenshots, frontend-builder couldn't authenticate Playwright cleanly and delivered 5 screenshots of the login page only — the 4 critical fixes (UsersAdmin alignment, slab in matrix, badge contrast on form, done state dark) were NOT visible on the login page and could not be validated. The DoD says "if dev server can't start, escalate explicitly", but here the dev server DID start — auth just didn't resolve before the screenshot. Builder DID flag it; team-lead bounced back with the exact round-1 auth flow path. Second round worked.
**Lesson** : the DoD's screenshots clause needs to cover not just "can the server start" but "can the screenshot capture authenticated UI". For sprint 5+, the brief should remind the builder to use the exact auth pattern that worked in round 1 (`page.goto('/login') → fill → submit → wait for navigation`) OR to seed localStorage with a valid token before `goto`. Don't accept login-page-only screenshots as "screenshots" when the feature being validated lives behind auth.
### Process — Builders received cross-context inline summaries as re-dispatches (sprint 3 + 4 — pattern)
**Context** : When dispatching to one builder I include the other builder's API summary inline (e.g., backend summary embedded in the frontend dispatch as context). Twice now the OTHER builder has interpreted that embedded summary as a re-dispatch to themselves, re-verified their work, and confirmed completion (sprint 3 503-on-unloaded + sprint 4 AC-21.6 round-trip both occurred this way). Net positive — they caught real gaps — but coordination cost.
**Lesson** : when embedding another builder's API contract as context for builder B, prefix it with `# REFERENCE — BUILDER A's SUMMARY, INLINE CONTEXT ONLY, NO ACTION NEEDED FROM YOU` (or equivalent). Builders inherit the whole message and don't always parse which sections target them.
### Engineering — Backend-builder's self-correction via re-dispatch surfaced two real defects (sprint 3 + 4)
**Context** : Twice now the backend-builder, after interpreting a frontend dispatch as a "re-dispatch", noticed an unspecified or wrong behavior I'd left in the brief and shipped a fix:
- Sprint 3: I left "Bundle non chargé → comportement à décider, je propose 503" ambiguous. Backend-builder picked 503 and added a regression test (`673b25e`).
- Sprint 4: backend-builder re-read the AC-21.6 inline brief and reasoned through but their code was correct; the actual fix came after the test-verifier's defect report.
**Lesson** : when something in a builder brief reads "à décider" / "TBD" / "we'll see", that's a SPEC HOLE — close it before dispatch, not after. Use AskUserQuestion or take a defensible default and document explicitly which it is.
---
## Sprint 3 (closed 2026-05-27)
### Process — Spec-review 2-pass after team-lead edits (team-lead)