CHANGELOG.md

# Changelog

All notable changes to this project will be documented here. Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) · [Conventional Commits](https://www.conventionalcommits.org/).

## [Unreleased]

### Fixed (post-M7 review pass — spec-reviewer + code-reviewer)
- **Idempotent transition leaked false success to a wrong-side user** (`backend/app/services/mission_tests.py:570`): a blue-only viewer POSTing `target_state="executed"` while the test was already executed got a 200 idempotent response, falsely advertising that they held `mission.write_red_fields`. Reordered the gate so the side-perm check runs *before* the idempotency short-circuit, with a new `_IDEMPOTENT_SIDE` table that asks "which side originally produced this state?" — re-asserting that perm even on no-op replays. Test `test_idempotent_transition_still_checks_side_perm`.
- **Cross-mission evidence access not pinned by a test** (`backend/tests/test_mission_tests.py:test_evidence_member_of_other_mission_gets_404`): added explicit coverage that a user who is a blue member of mission B sees 404 on an evidence row attached to mission A. The chain walk in `_resolve_evidence_chain` already enforced this, but the regression test was missing.
- **`shutil.move` swapped for `os.replace`** (`backend/app/services/evidence.py:240`): `os.replace` is the documented atomic-rename primitive on POSIX and Windows when src/dst share a volume — and our tmpfile is always staged inside the destination directory, so the guarantee holds. Removes the implicit copy+remove fallback from `shutil.move` that would silently break atomicity on a cross-fs `EVIDENCE_DIR`.
- **SHA256 path component now hex-validated** (`backend/app/services/evidence.py:227`): the hash always comes from hashlib so it's already hex, but if a future caller ever passes pre-computed bytes we want to fail loudly rather than write to a path like `..something.evtx`. Cheap `re.fullmatch(r"[0-9a-f]{64}", sha256)` guard.
- **`EVIDENCE_DIR` filesystem-root guard** (`backend/app/services/evidence.py:_test_dir`): refuse to create per-mission directories when `EVIDENCE_DIR` resolves to `/` (or the equivalent on Windows). Stops a mis-configured operator from laying down content-addressed evidence files at the filesystem root.
- **`/diag/reset` evidence cleanup now skips symlinks** (`backend/app/api/diag.py:127`): switched from `is_dir()` to `is_symlink() or not is_dir()` so a hostile or accidental symlink inside `EVIDENCE_DIR` is unlinked rather than `rmtree`'d through.
- **N+1 in `_to_detail_view`** (`backend/app/services/mission_tests.py:_to_detail_view`): the last-actor and detection-level lookups each issued their own `s.get()`. Replaced with `select(columns)` queries that return just the needed scalar fields — same SQL count but fewer ORM round-trips, and every PUT/transition exercises this path so it adds up.
- **Mission detail row `onClick` removed in favour of the wrapped `Link`** (`frontend/src/pages/MissionDetailPage.tsx:684`): the `tr onClick` + nested `Link` with `stopPropagation` worked but was fragile to accessibility tooling. The link on the test name + the explicit hover class is enough.

### Added — M7 (Red & blue execution on a mission test)
- **Per-mission-test write API** (`app/api/missions.py` + `app/services/mission_tests.py`):
  - `GET /missions/{id}/tests/{test_id}` — full detail view with snapshot, state, red/blue fields, MITRE tags, evidence list, last-actor metadata.
  - `PUT /missions/{id}/tests/{test_id}` — patch any subset of `red_command` / `red_output` / `red_comment_md` / `blue_comment_md` / `detection_level_id` / `executed_at` / `executed_at_overridden`. The service classifies each touched field as red-side or blue-side and rejects with 403 if the caller lacks the matching perm. `executed_at*` only writable when the test sits in `executed` or `reviewed_by_blue`.
  - `POST /missions/{id}/tests/{test_id}/transition` — drives the state machine `pending↔skipped/blocked` + `pending→executed→reviewed_by_blue` (allows undo back to `pending`). Side-aware perm gating: `pending→executed` and `executed→pending` require `write_red_fields`; `executed↔reviewed_by_blue` requires `write_blue_fields`; `pending↔skipped/blocked` accepts either side. Transitioning into `executed` stamps `executed_at=now()` and clears the override; transitioning out (to `pending`) wipes the timestamp.
  - `GET /missions/{id}/activity?since=<ISO>` — returns mission_tests whose `updated_at > since`, freshest first. Drives the SPA's 15-second polling badge. Response includes `server_time` so the client can chain calls without clock drift.
- **Evidence storage pipeline** (`app/services/evidence.py` + `app/api/evidence.py`):
  - `POST /missions/{id}/tests/{test_id}/evidence` (multipart, gated on `mission.write_blue_fields`): streams the upload into a tmpfile next to the final location, hashing chunk-by-chunk and aborting at the 25 MB cap. Validates extension (whitelist: png/jpg/jpeg/pdf/txt/log/json/csv/evtx/zip) and MIME (permissive allowlist + `application/octet-stream` fallback for `.evtx`). Content-addressed storage: `${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>` — re-uploading byte-identical content reuses the file on disk and inserts a fresh row.
  - `GET /evidence/{id}` — JSON metadata view; `?download=true` switches to `send_file` with the original filename in `Content-Disposition` and the SHA256 as the ETag.
  - `DELETE /evidence/{id}` — soft delete (only flips `deleted_at`; physical purge lands in M12).
  - All three routes are membership-aware via the same chain walk (`evidence → test → scenario → mission`), collapsing "not found" / "not visible" into 404 to prevent existence leaks.
- **Activity tracking column** (`backend/alembic/versions/20260514_1000_91a4e7c6d2f3_m7_mission_test_last_actor.py`): added `mission_tests.last_actor_id` (FK `users.id` `ON DELETE SET NULL`) + `ix_mission_tests_updated_at` to support the polling endpoint. Every red/blue write or transition stamps the actor so the "modified by X Ns ago" indicator can resolve a human label.
- **Detection-level seed + read** (`app/services/detection_levels.py` + `app/api/detection_levels.py`):
  - 4 default rows seeded at boot — `detected_blocked` / `detected_alert` / `logged_only` / `not_detected` — colored on the design-system accent palette. The seed is idempotent and never mutates existing rows; new keys added to `DEFAULT_LEVELS` in future releases surface on next boot.
  - `GET /detection-levels` (gated on `detection_level.read`) returns the catalogue ordered by position. CRUD is M8's territory.
- **Per-test page** (`frontend/src/pages/MissionTestPage.tsx`): two-zone layout with the red border on the red half (command, output, markdown comment, mark-executed button, override toggle) and the cyan border on the blue half (detection-level select, comment, drag-and-drop evidence dropzone). Per-field disable based on `mission.write_red_fields` / `mission.write_blue_fields`; server is the ultimate arbiter so the UI is purely advisory. The "Last touched Xs ago by Y" badge polls `/activity` every 15 s while the document is visible.
- **Mission detail page wires through to the per-test page** (`frontend/src/pages/MissionDetailPage.tsx`): every row in the Tests tab is now clickable (cursor + hover state) and links to `/missions/<id>/tests/<test_id>`. The route is registered in `App.tsx` behind `RequireAuth`.
- **TanStack query keys** (`frontend/src/lib/missions.ts`): added `missionTestKeys.detail()` / `.activity()` / `.detectionLevels()` so the per-test page invalidations stay surgical (don't blow away the whole missions list).
- **`/diag/reset` extended** (`app/api/diag.py`): test mode now wipes `${EVIDENCE_DIR}/*` so e2e uploads don't accumulate across runs. Detection levels are preserved (reference data, not catalogue) and the seed is re-run as a safety net.
- **Tests**:
  - **`backend/tests/test_mission_tests.py`** — 25 pytest tests covering: detection-level seed + perm gating; red/blue field-level perms (red user blocked on blue fields and vice-versa); mark-executed stamps `executed_at`; override gating (forbidden while pending, blue-side blocked); state-machine matrix + side perm refinement; membership 404 vs admin bypass; evidence 24 MB ok / 26 MB rejected; SHA256 verification; MIME/extension whitelist; soft-delete hides bytes from detail view; activity polling with `since=` URL-encoded; future `since` returns empty.
  - **`e2e/tests/m7-execution.spec.ts`** — 5 Playwright tests against the live stack: red-only/blue-only API gating, mark-executed + reviewed_by_blue side enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save + transition, non-member sees the 404 alert instead of mission content. `afterAll` restores the stable admin and re-syncs MITRE.
- **HomePage**: hero + roadmap card bumped to `M7 — Red & blue execution on a mission test (done). Next: M8`.

### Fixed (post-M6 SPA — mission detail page was read-only)
- **Mission detail page couldn't edit metadata, append scenarios, or change members** (`frontend/src/pages/MissionDetailPage.tsx`): the M6 SPA shipped the 3-step *creation* wizard but no edit affordance on the detail page — even though the backend already exposed `PUT /missions/{id}`, `POST /missions/{id}/scenarios`, and `PUT /missions/{id}/members`. Added three modals gated by `is_admin || mission.update`:
  - **Edit metadata** (header button, opens a 3xl modal): name / client_target / dates / description_md, full inline validation (empty name, inverted dates) mirroring the wizard's step 1.
  - **Add scenarios** (in the Tests tab): scenario picker reusing the wizard step-2 visual, calls `POST /missions/{id}/scenarios` which appends snapshots at `current_max_position + 1`. The footer line tells the user how many tests will be appended.
  - **Edit members** (in the Members tab): roster + red/blue toggles, calls `PUT /missions/{id}/members` (full-set replace) — same UX as the wizard step 3, pre-populated with the current member set.
- Detail page now imports `useAuth` to compute `canEdit` once and reuses it across all three buttons.
- E2E spec extended: new test `SPA — detail page edits metadata, appends scenarios, edits members` exercises the three modals end-to-end against a pre-seeded mission. Suite is now 44 Playwright tests (6 in M6).

### Fixed (post-M6 review pass — spec-reviewer + code-reviewer)
- **SPA cache invalidation only refreshed the empty-filter list** (`frontend/src/lib/missions.ts:136`): `missionKeys.list()` returns `['missions','list',{}]`. TanStack v5's `invalidateQueries({queryKey})` is prefix-based, but `{}` is treated as an atomic final element — so create / transition / delete called with that key only invalidated the *exact* empty-filter list, leaving any filtered variant stale until manual refetch. Added `missionKeys.listPrefix()` returning `['missions','list']` and switched all three mutation `onSuccess` paths to it.
- **Snapshot lacked the per-scenario advisory lock** (`backend/app/services/missions.py:467`): a concurrent `PUT /scenario-templates/{id}/tests` (M5 reorder, which deletes-then-reinserts join rows) running while `_snapshot_scenarios` walked `sc.tests` could freeze a torn snapshot — `selectinload` re-queries under READ COMMITTED so a partial view was possible. Added `_lock_scenario_ids_for_snapshot` that acquires the same `pg_advisory_xact_lock` key used by `set_scenario_tests` (blake2b digest of the scenario UUID, sorted to avoid deadlocks). Snapshot and reorder now serialise per scenario.
- **Transition endpoint leaked its body shape via 400 before the perm gate** (`backend/app/api/missions.py:441`): a user without `mission.update` or `mission.archive` POSTing `{"status":"x"}` got a Pydantic 400 instead of 403. Added `@require_perm("mission.update", "mission.archive")` so the gate fires before the parse; the inner refinement still enforces the per-target perm. Test `test_transition_perm_gate_runs_before_payload_parse`.
- **LIKE wildcards in user-typed search were honoured as SQL wildcards** (`backend/app/services/missions.py:632,637`): `?q=%` matched every mission. Added `_escape_like` that pre-escapes `%`, `_`, `\` and a matching `escape='\\'` argument on every `.like(...)` call. Test `test_search_treats_wildcards_as_literals`.
- **Counts ignored soft-deleted mission children** (`backend/app/services/missions.py:587,597`): `tests_count` and the detail view summed `len(sc.tests)` without filtering `MissionTest.deleted_at`. Harmless today (M6 doesn't soft-delete mission tests), but would drift silently once M7+ surfaces `state=skipped/blocked`. Added the filter in both `_to_list_item` and `_scenario_views`.
- **`/users/roster` was unordered** (`backend/app/api/users.py:73`): the wizard's member list shuffled rows on every refetch. Sorted by `email` for predictable rendering + stable e2e selectors.
- **Frontend transition button accent collapsed `in_progress` and `completed` into one colour** (`frontend/src/pages/MissionDetailPage.tsx:97`): both rendered cyan, so the status legend in the list didn't match the transition button. Added a `TRANSITION_BUTTON_ACCENT` map mirroring `MISSION_STATUS_ACCENT` (cyan/orange/green/teal).
- **Soft-deleted source scenario was a silent foot-gun**: `_load_scenario_templates_for_snapshot` already rejected it, but no test pinned the behaviour. Added `test_create_mission_rejects_soft_deleted_scenario` so future refactors can't regress to "freeze a tombstoned scenario into a fresh mission".
- **E2E wizard assertion used `getByRole('button', { name: /In Progress/i })`** (`e2e/tests/m6-missions.spec.ts:287`): the accessible name is `→ In Progress` and the arrow Unicode is brittle. Switched to `getByTestId('mission-transition-in_progress')`.

### Added — M6 (Missions & snapshot)
- **CRUD `missions`** (`app/services/missions.py` + `app/api/missions.py`):
  - Fields: name, client_target, date_start, date_end, status (`draft/in_progress/completed/archived`), description (markdown), visibility_mode (frozen to `whitebox` in v1).
  - On creation/append, the service **snapshots** the selected `scenario_templates` and all their `test_templates` into `mission_scenarios` / `mission_tests` (every template field — including OPSEC level, tags, expected IOCs, MITRE tags). The denormalised `mission_test_mitre_tags` table copies `external_id`, `name`, `url` so a later MITRE re-sync that drops the entry can't alter a mission's tags (spec §11).
  - `source_*_template_id` FKs survive template soft-deletes (`ON DELETE SET NULL`); the mission's frozen content is unaffected.
  - **Membership visibility**: non-admin viewers see only missions where they are a `mission_members` row. The service maps "not visible" → 404 (no existence leak via 403). Admins bypass via the `admin` group.
  - **Status state machine**: `draft → in_progress → completed → archived`; `archived → ∅`. The transition endpoint accepts the target status, validates the move, and rejects invalid jumps with 409. Idempotent (target=current) is a no-op 200.
  - Auto-creator-membership: a non-admin caller of `POST /missions` is auto-added as `role_hint='red'` if not already in the `members[]` payload — so they retain visibility on the mission they just created.
  - REST: `GET/POST /missions`, `GET/PUT/DELETE /missions/{id}`, `POST /missions/{id}/scenarios` (append snapshots at the end), `PUT /missions/{id}/members` (replace set), `POST /missions/{id}/transition`.
  - Filters on list: `q` (LIKE on name/description), `status`, `client` (LIKE on client_target). `include_deleted=true` is admin-only (403 otherwise).
- **`GET /users/roster`** (`app/api/users.py`): a deliberately minimal listing — `id`, `email`, `display_name` of active users only — accessible to any holder of `user.read`, `mission.create`, or `mission.update`. Lets a non-admin red teamer populate the wizard's member picker without exposing the admin-grade `/users` endpoint (which leaks `is_admin`, `is_active`, group memberships).
- **Frontend**:
  - `lib/missions.ts` — typed client + queryKey factory + status accent map + filter query-string builder.
  - `pages/MissionsListPage.tsx` — list cards (one per mission) with status accent, scenario/test/member counts, date range, plus filters (q, client, status).
  - `pages/MissionsCreatePage.tsx` — **3-step wizard**: metadata → scenario picker → member roster (red/blue toggles + auto-include the non-admin creator). Submits via `POST /missions` and redirects to the detail page.
  - `pages/MissionDetailPage.tsx` — header with transition buttons (only the legal next states are rendered), soft-delete with confirm prompt, and 4 tabs: **Tests** (table of snapshotted tests with MITRE tags, OPSEC, state), **Members** (role-coloured pills), **Synthesis** (placeholder for M10), **Export** (placeholder for M11).
  - Nav adds **Missions** link visible to anyone with `mission.read` or admin.
- **/diag/reset** truncates the mission tables before the template tables — `mission_scenarios.source_scenario_template_id` and `mission_tests.source_test_template_id` are `ON DELETE SET NULL`, so wiping missions first avoids the round-trip through the null-update path.
- **Testing**:
  - `backend/tests/test_missions.py` — **22 pytest** covering snapshot fidelity (rename source template after snapshot → mission unchanged), MITRE tag propagation, membership-based 404, perm gating (create vs read vs archive), status transition chain + invalid jumps (409), member set replace + role-hint validation, scenario append at correct position, soft-delete, partial metadata update, inverted-date rejection, admin-only `include_deleted`.
  - `e2e/tests/m6-missions.spec.ts` — **5 Playwright** (snapshot freezing, membership visibility for non-admin red, status transition + 409, SPA wizard end-to-end, SPA list + status filter).
  - `tasks/testing-m6.md`.

### Added — M5 (Test & scenario templates)
- **CRUD `test_templates`** (`app/services/test_templates.py` + `app/api/test_templates.py`):
  - Fields: name, description, objective, procedure (markdown), prerequisites (markdown), expected result red, expected detection blue, OPSEC level (`low/medium/high`), free tags (TEXT[]), expected IOCs (TEXT[]).
  - Polymorphic MITRE tag set (`(kind, external_id)` ↔ exactly one of `tactic_id`/`technique_id`/`subtechnique_id`). The wire payload uses ATT&CK external IDs — server resolves to UUIDs.
  - Filters: `q` (LIKE on name/description), `tactic`/`technique`/`subtechnique` (joined via subquery on the polymorphic tag table), `opsec`, `tag` (array contains).
  - REST: `GET /test-templates`, `GET /test-templates/{id}`, `POST /test-templates`, `PUT /test-templates/{id}` (partial, with explicit `_UNSET` sentinel so omitted fields stay untouched), `DELETE /test-templates/{id}` (soft).
- **CRUD `scenario_templates`** (`app/services/scenario_templates.py` + `app/api/scenario_templates.py`):
  - Ordered list of test_templates with `position` (UNIQUE `scenario_template_id, position`).
  - Reorder via full replace: `PUT /scenario-templates/{id}/tests` deletes the join rows and re-inserts at positions `0..N-1` — clean atomic op that respects the UNIQUE constraint without a 2-phase position shuffle.
  - The same test can appear multiple times (chained operations).
  - REST: `GET`/`POST`/`PATCH` (metadata) / `DELETE` (soft) on `/scenario-templates`.
- **Frontend**:
  - `lib/templates.ts` — typed client + queryKey factory.
  - `pages/AdminTestsPage.tsx` — list + filters (q, tactic, opsec, tag) + modal with full field set + embedded `<MitreTagPicker>` for tags.
  - `pages/AdminScenariosPage.tsx` — list + modal with **@dnd-kit/sortable** vertical drag-and-drop on the ordered test list. New deps: `@dnd-kit/core`, `@dnd-kit/sortable`, `@dnd-kit/utilities`.
  - `components/MarkdownField.tsx` — lean textarea with markdown hint (no heavy editor dep; rendering happens at display time in M7).
  - Nav adds **Tests** and **Scenarios** links (admin-gated).
- **/diag/reset** truncates the 4 new tables before the MITRE block — the `scenario_template_tests.test_template_id` FK is `ON DELETE RESTRICT`, so the order matters.
- **Testing**:
  - `backend/tests/test_templates.py` — **19 pytest** (create/list/filter by tactic+opsec+tag, MITRE tag resolution + replacement on update, soft-delete, perm gating, scenario create+reorder+delete, soft-deleted test linking semantics).
  - `e2e/tests/m5-templates.spec.ts` — **4 Playwright** (API CRUD round-trip, scenario reorder, SPA list + opsec filter, SPA scenario list rendering with ordered tests).
  - `tasks/testing-m5.md`.

### Fixed (M5 implementation)
- **`LogRecord` key collision**: `log.info(..., extra={"name": ...})` raises `KeyError("Attempt to overwrite 'name' in LogRecord")` because `name` is reserved by Python's stdlib logging. Renamed to `template_name`.
- **React `currentTarget` null in deferred state updaters**: `onChange={(e) => setX((prev) => ({ ...prev, q: e.currentTarget.value }))}` blanked the page on the first user input because `currentTarget` is cleared after the listener bubble ends, before React invokes the updater. Switched all M5 handlers to `e.target.value`, which persists on the synthetic event.

### Fixed (post-M5 — scenario reorder 500 + cross-worker lock correctness)
- **`PUT /scenario-templates/{id}/tests` returned 500** (`backend/app/services/scenario_templates.py:218`): the two-argument form `pg_advisory_xact_lock(:n, :m)` failed with `function pg_advisory_xact_lock(smallint, bigint) does not exist`. Postgres only provides `(int4, int4)` and `(bigint)` overloads — psycopg promoted `m = hash(uuid) & 0xFFFFFFFF` (up to 2^32-1) to bigint and there's no matching overload. Switched to the single-argument bigint form with `CAST(:key AS bigint)`.
- **Cross-worker lock was a no-op** (same site): Python's built-in `hash()` is randomised per process via `PYTHONHASHSEED`, so each gunicorn worker computed a different key for the same `scenario_id`, and concurrent reorders on different workers acquired independent locks — defeating the serialisation. Replaced with `blake2b(scenario_id.bytes, digest_size=8)` interpreted as a signed int64. Stable, deterministic, fits in `bigint`.

### Fixed (post-M5 UI — modal layout for the test-template editor)
- **Modal box capped its width at `max-w-2xl` and had no vertical scroll** (`frontend/src/components/ui/Modal.tsx`): opening **+ New test** rendered the 15-column MITRE matrix inside a 672 px frame with no height cap, so the matrix spilled to the right and the form bottom dropped below the viewport — buttons unreachable, no scroll. Added a `size` prop (default `2xl` for back-compat), `max-h-[calc(100vh-2rem)]` + `flex flex-col` on the dialog, and an inner `min-w-0 flex-1 overflow-y-auto` body so the header stays pinned while the form scrolls inside the modal.
- **MITRE matrix overflow-x failed to scroll inside the modal body** (`frontend/src/components/MitreTagPicker.tsx`): `overflow-x-auto` sat directly on the grid element, but the grid's intrinsic min-width (`15 × minmax(7rem, …)` = 1680 px) prevented it from shrinking below its content, so the grid spilled outside its parent instead of scrolling. Wrapped the grid in a dedicated `overflow-x-auto rounded min-w-0 w-full` scroller and added `min-w-0` to the picker root so the constraint propagates from the modal body. The grid now scrolls horizontally inside the modal.
- **`grid gap-3` form layout in the test-template modal propagated `min-width: auto`** (`frontend/src/pages/AdminTestsPage.tsx`): each grid item refused to shrink below its widest child, so the picker dragged the form (and the body) past the modal width. Switched the form to `flex flex-col gap-3 min-w-0`, which breaks the propagation while preserving vertical spacing.
- **Test-template modal now uses `size="7xl"`** and the scenario-template modal `size="3xl"` to match their content density.

### Fixed (post-M5 review pass — spec-reviewer + code-reviewer)
- **Filter combinator was OR, not AND** (`backend/app/services/test_templates.py:235`): `?tactic=TA0002&technique=T1059` returned templates matching *either* facet instead of *both*. Pre-fix also pooled all three UUIDs into a shared `IN` list across three columns, theoretically allowing a UUID collision to match across kinds. Refactored to one IN-subquery per facet, ANDed together via repeated `WHERE id IN (...)`.
- **Concurrent reorder race on `set_scenario_tests`** (`backend/app/services/scenario_templates.py:207`): two parallel reorders on the same scenario could deadlock on the `UNIQUE(scenario_id, position)` constraint under READ COMMITTED. Added a per-scenario `pg_advisory_xact_lock(0x5C3, hash(scenario_id))` mirroring the M4 `/mitre/sync` pattern; different scenarios don't contend.
- **N+1 on `_to_view` MITRE resolution** (`backend/app/services/test_templates.py:160`): rendering K templates with ~T tags each fired up to K×T `s.get(...)` calls. Added `_to_views_batch` that pre-builds `{uuid → MitreRow}` maps in 3 queries and feeds them to per-template view assembly; `list_test_templates` now issues 4 queries total regardless of list size.
- **Wire-level item length cap on `tags` / `expected_iocs`** (`backend/app/api/test_templates.py:18-21`): the DB columns are `ARRAY(String(64))` / `ARRAY(String(255))` but the API layer only capped the LIST length, not item strings — long inputs hit the driver with `StringDataRightTruncation`. Added `Annotated[str, StringConstraints(...)]` types so the API returns 400 with a clean validation error.
- **Front-end mutation cache hygiene** (`frontend/src/pages/AdminScenariosPage.tsx:148-156`): `updateMeta` and `setTests` mutations are run sequentially in `submit()`; on partial failure (metadata saved but reorder failed) the cache stayed stale. Both mutations now `onSettled: invalidate` so whatever step landed is reflected without manual refresh.
- **Backend vs front-end consistency on duplicate tests in a scenario** (`frontend/src/pages/AdminScenariosPage.tsx:227-231`): the backend allows the same `test_template` to appear multiple times (chained ops; the UNIQUE constraint is `(scenario_id, position)` not `(scenario_id, test_template_id)`), but the catalogue picker was filtering out already-picked items. Removed the filter — only soft-deleted tests are excluded now.
- **Test coverage closure** (`backend/tests/test_templates.py`): +4 pytest (tactic+technique AND-semantics, `extra="forbid"` rejection, empty `mitre_tags` explicit clear, 65-char tag length cap → 400). Total backend now 23 M5 tests + 39 elsewhere = 81 pass.

### Added — M4 (MITRE ATT&CK Enterprise)
- **STIX 2.1 parser + upsert** (`app/services/mitre_seed.py`): stdlib-only (`urllib.request` + `hashlib`), pinned to Enterprise v19.0 (`enterprise-attack-19.0.json`, sha256 `df520ea0…`). Parses 25k+ STIX objects → 15 tactics, 222 techniques, 475 sub-techniques in ~1.1 s. Skips revoked + deprecated, resolves sub-technique parents via `relationship[subtechnique-of]` with a `T1003.001 → T1003` dotted-id fallback, copies kill-chain phases into the `mitre_technique_tactics` M2M.
- **CLI**: `flask metamorph seed-mitre [--source <path|url>] [--checksum-sha256 <hex>] [--skip-checksum]` (`app/cli.py`). `make seed-mitre` wraps it.
- **REST endpoints** (`app/api/mitre.py`):
  - `GET /api/v1/mitre/tactics`, `/mitre/techniques?tactic=…&q=…`, `/mitre/subtechniques?technique=…&q=…` (paginated, search on name/external_id).
  - `GET /api/v1/mitre/status` (last_sync, version, source_url, defaults).
  - `POST /api/v1/mitre/sync` (perm `mitre.sync`) — re-pull on demand.
- **Persisted metadata** in `settings`: `mitre_last_sync`, `mitre_version`, `mitre_source_url`.
- **Compose volume `metamorph_mitre`** mounted at `/data/mitre/` in the api container — caches the downloaded bundle across restarts. Owned by `metamorph:metamorph`.
- **Frontend**:
  - `<MitreTagPicker>` component: flat ATT&CK matrix matching `attack.mitre.org/#` — full-bleed beyond `max-w-page`, 15 equal-width columns via `grid-template-columns: repeat(N, minmax(7rem, 1fr))`, sans-serif 12px, **name-only cells** (external_id surfaces on hover via `title` and in selection chips), `▸/▾` chevron expands sub-techniques inline within the column, multi-select with chip-removal at the top. Returns `MitreTag[]` (`kind`, `id`, `external_id`, `name`), ready for M5 templates.
  - `/mitre` showcase page with status card, admin-gated **Trigger sync** button, the picker, and a JSON `<pre>` preview of the current selection.
  - Nav adds **MITRE** link for any logged-in user.
- **Testing**:
  - `backend/tests/test_mitre.py` — **12 pytest** (parser, idempotence, checksum mismatch, persisted settings, endpoint variants, perm enforcement) using a hand-crafted minimal STIX bundle (no network in tests).
  - `e2e/tests/m4-mitre.spec.ts` — **6 Playwright** against the live stack (calls `/mitre/sync` once in `beforeAll`).
  - `tasks/testing-m4.md`.

### Fixed (post-M4 spec-review pass)
- **Sync integrity guarantee**: `seed_mitre()` now refuses a custom URL without either `expected_sha256` or an explicit `allow_unverified=true`. Closes a "typo in `mitre_source_url` setting routes the seed to attacker JSON" footgun. CLI surfaces this via `--checksum-sha256` / `--skip-checksum`; API via `{"source", "expected_sha256", "allow_unverified"}` body.
- **`/diag/reset` consistency**: now truncates the `mitre_*` tables alongside `settings` so `GET /mitre/status` and `GET /mitre/tactics` agree after a reset (previously: catalogue rows persisted, but `mitre_last_sync` got wiped → status lied).
- **Spec drift §10 #4**: amended "14 tactics" → "≥ 14 tactics (v19 ships 15)" to reflect MITRE v8+ reality.

### Fixed (post-M4 code-review pass)
- **SSRF allowlist on `/mitre/sync`**: host must be in `MITRE_ALLOWED_HOSTS` (defaults to `raw.githubusercontent.com`, comma-separated env override). Closes the "admin holding `mitre.sync` can pivot the api container at cloud metadata (`169.254.169.254`) or internal mirrors" vector. New `MitreSourceForbidden` exception → 400 with `source_forbidden` error code.
- **Concurrent sync race**: `seed_mitre()` now acquires `pg_advisory_xact_lock(hashtext('mitre.seed'))` at the top of the transaction so two `/mitre/sync` calls serialise cleanly across the `DELETE` + re-`INSERT` of `mitre_technique_tactics`.
- **Typed sync contract end-to-end**: Pydantic `SyncResultOut` on the backend (`app/api/mitre.py`) mirrored by a `MitreSyncResult` TS interface (`frontend/src/lib/mitre.ts`). The MitrePage mutation no longer uses an `as Record<string, unknown>` escape hatch.
- **N+1 in dotted sub-technique fallback**: pre-built `{external_id → id}` dict at function entry; was firing one extra SELECT per orphan (currently 0 with MITRE, but a latent footgun for partial bundles).
- **`SETTING_VERSION` cleared explicitly when source != default**: previously kept the stale pinned version after a custom-URL re-sync; now `_upsert_setting(..., None)` so `/mitre/status` doesn't lie.
- **Internal error scrub on `/mitre/sync`**: 500 responses no longer leak URLError / DB driver text via `str(e)` — stack lands in JSON logs only.
- **E2E pinned to exact MITRE v19 counts** (15/222/475/0 orphans) for parser-regression detection; previously `>=` thresholds could mask "revoked tactics silently included".
- **E2E uses `crypto.randomUUID()`** instead of `Math.random()` for unique test emails.
- **Test coverage for security guards**: `file://` rejection, disallowed HTTPS host, custom-URL-without-sha refusal, dotted-id fallback, version-clearing semantics — 5 new pytest covering paths the spec-review demanded but no test enforced.

### Decisions (intentional)
- **Bundle "embarqué" interpreted as seed-time download + named-volume cache**, not "binary baked into the Docker image". Keeps the image ~150 MB, makes version bumps a constant edit, plays nicely with `make seed-mitre` re-runs. Air-gapped operators copy the file into the volume + pass `--source /data/mitre/<file>`.
- **Read endpoints unauthenticated-perm-wise but auth-required**: MITRE data is public reference material — no `mitre.read` perm. Status endpoint is similarly open (under `@require_auth`) to keep `/mitre/status` simple for the UI badge.
- **No `requests` / `httpx` dep added**: stdlib `urllib.request` is enough and avoids inflating the image.

### Validated end-to-end (M4 DoD)
- `make clean && make up && make migrate && make seed-mitre` → 15 tactics / 222 techniques / 475 sub-techniques / 254 links / 0 orphans / ~1.1 s.
- `make test-api` → **58 pytest pass** (1 health + 8 schema + 15 auth + 15 RBAC + 19 MITRE) in ~5 s.
- `make e2e` → **34 Playwright pass** (8 M0 + 4 M1 + 8 M2 + 8 M3 + 6 M4) in ~18 s.
- Spec-reviewer PASS after fixes applied.

### Added — M3 (RBAC: groups, permissions, users)
- **Permission catalogue** (`app/services/permissions_seed.py`): 31 atomic codes across 10 families (`user`, `group`, `invitation`, `test_template`, `scenario_template`, `mission`, `detection_level`, `setting`, `mitre.sync`). Seeded at boot **and** after `/setup` to handle a freshly truncated DB. Idempotent + additive on system groups (never removes a perm).
- **Default group bindings**: `admin` = all 31 codes; `redteam` = 8 (catalogue read + mission.{read,create,update,archive,write_red_fields} + detection_level.read); `blueteam` = 5 (catalogue read + mission.{read,write_blue_fields} + detection_level.read).
- **Users admin service + API** (`app/services/users.py`, `app/api/users.py`): list (q + is_active filter + pagination), get, patch (display_name/locale/is_active), soft-delete, set groups. Last-admin protection on update/delete/group-strip.
- **Groups admin service + API** (`app/services/groups.py`, `app/api/groups.py`): full CRUD with system-group protection (no rename, no delete), `PUT /groups/{id}/permissions` for the bindings. Admin system group's perm set is locked to "every perm" (preserves the bypass invariant).
- **Permissions read-only API** (`app/api/permissions.py`): `GET /permissions` returns the catalogue (admin or `group.read` holders).
- **Frontend admin pages** (`frontend/src/pages/Admin{Users,Groups,Invitations}Page.tsx`): list + edit modals using TanStack Query mutations, multi-select for perms grouped by family, copy-once invitation URL display.
- **Frontend chrome** (`Layout.tsx` + `RequireAdmin.tsx`): admin nav links shown only when `is_admin === true`; direct navigation to `/admin/*` by non-admins redirects to `/`. Server remains the arbiter.
- **`/diag/reset` now clears the rate-limit counters** so the Playwright suite can iterate without hitting `10/min/IP` budgets across spec files. Gated to non-prod environments only.
- **Testing**:
  - `tests/test_rbac.py` — **15 pytest integration tests** (39 backend total).
  - `e2e/tests/m3-rbac.spec.ts` — **8 Playwright tests** covering DoD §10 #2/#3 (28 e2e total).
  - `tasks/testing-m3.md` — manual + automated procedure.
- **Frontend api helpers**: `apiPatch`, `apiPut`, `apiDelete` added to `frontend/src/lib/api.ts`.

### Fixed (post-M3 spec-review pass)
- **Rate-limit scope clarified**: `app/core/rate_limit.py` now enables the limiter for `APP_ENV in ("prod", "staging")` instead of `prod` only — a public staging deployment without auth limits would be surprising. Dev/test stay unthrottled for Playwright ergonomics. Spec §6 NF-security applies to operator-facing deployments.
- **Admin perm invariant**: `set_group_permissions` refuses to alter the admin system group's perm set to anything other than the full catalogue (`SystemGroupProtected` → 409). The decorator bypass relies on `is_admin = "admin" in group_names`, but a future refactor could move to a perm-based check, so we keep the invariant.
- **LogRecord field collision**: `log.info("...", extra={"name": g.name})` raised `KeyError: "Attempt to overwrite 'name' in LogRecord"` because Python's logger reserves `name`. Renamed to `group_name`. Audited all other `extra=` payloads in `app/api/`+`app/services/` for the same trap.

### Validated end-to-end (M3 DoD)
- `make clean && make up && make migrate` → boot logs show `metamorph.permissions.seeded {perms_created: 31, perms_total: 31, bindings: {admin: 31, redteam: 8, blueteam: 5}}`.
- `make test-api` → **39 pytest pass** (1 health + 8 schema + 15 auth + 15 RBAC) in ~4 s.
- `make e2e` → **28 Playwright pass** (8 M0 + 4 M1 + 8 M2 + 8 M3) in ~16 s.
- Spec-reviewer pass: PASS verdict, 2 minor fixes applied (above), 2 anticipations noted for M12/M14 (no current action).

### Added — M2 (Auth, bootstrap, invitations)
- **Crypto plumbing**: `app.core.security` (Argon2id `time_cost=2 memory_cost=64MiB parallelism=2`, opaque-token SHA-256 helpers), `app.core.jwt_tokens` (HS256, claims `iss/sub/type/jti/iat/exp`, access 1h / refresh 30d).
- **Auth services** (`app.services.auth`): login, refresh **with token rotation + reuse-detection chain revoke**, logout (idempotent), change_password (forces logout-all).
- **Invitation services** (`app.services.invitations`): create, preview, accept, revoke. Token persisted only as SHA-256, default 7-day TTL.
- **Bootstrap** (`app.services.bootstrap` + `app.core.install_token`): seeds 3 system groups (`admin`/`redteam`/`blueteam`), mints a one-shot install token at first boot when `users` is empty, logs a banner with the raw token. CLI `flask --app app.cli metamorph print-install-token [--force]`.
- **Auth middleware** (`app.core.auth_decorators`): `@require_auth` populates `g.current_user`; `@require_perm("...")` checks atomic permissions; admin group bypasses the check (atomic perms land in M3).
- **API endpoints**:
  - `POST /api/v1/setup` (consume install token, create 1st admin) + `GET /api/v1/setup` (status).
  - `POST /api/v1/auth/login` + `POST /auth/refresh` + `POST /auth/logout` + `GET /auth/me` + `POST /auth/change-password`.
  - `POST /api/v1/invitations` (admin) + `GET /invitations` + `GET /invitations/preview/<token>` + `POST /invitations/accept/<token>` + `POST /invitations/<id>/revoke`.
  - `POST /api/v1/diag/reset` (test-only kill switch — wipes auth tables + mints fresh install token; only available in `dev`/`test`).
- **Rate limiting** (`flask-limiter`): 10/min/IP on `/auth/login`, `/auth/refresh`; 5/min on `/auth/change-password` and `/setup`; 10–20/min on invitation endpoints. Globally disabled when `APP_ENV=test`.
- **Refresh cookie** `metamorph_refresh`: HttpOnly + Secure + SameSite=Strict + Path=`/api/v1/auth/`.
- **Frontend auth state** (`frontend/src/lib/{api,auth}.ts`): access token in module memory, refresh in cookie, automatic 401-retry via `/auth/refresh` with reentrancy guard. `useAuth()` hook + `<RequireAuth>` route guard.
- **Frontend pages**: `/login`, `/setup`, `/register?token=…`, `/profile` (with change-password form), all in RTOps design. Protected layout: nav shows email + Logout when authenticated, Login + Setup links when not.
- **Frontend deps**: `@tanstack/react-query`, `react-router-dom`. Tanstack provider in `App.tsx` (will carry actual queries from M3+).
- **Email validation** (`app.api._validation.Email`): permissive RFC-shape regex that accepts internal TLDs (`.local`, `.corp`) — `pydantic.EmailStr` was too strict for red-team labs.
- **Testing**:
  - `tests/test_auth_flow.py` — **15 pytest integration tests** (24 backend total with M0/M1).
  - `e2e/tests/m2-auth.spec.ts` — **8 Playwright tests** covering setup → login → me → invitation → register → 2nd login → RBAC 403 → refresh rotation → logout (20 e2e total).
  - `tasks/testing-m2.md` — manual + automated procedure.

### Fixed (post-M2 spec-review pass)
- **Refresh cookie `Secure=True`** unconditionally (`backend/app/api/auth.py`). Modern browsers treat `localhost` as a secure context, so dev/test still works. Closes the silent-degradation found by the reviewer.
- **`/auth/refresh` rate-limit lowered to 10/min/IP** (`backend/app/api/auth.py`) to match spec §M2 ("10 req/min/IP on `/auth/*`").
- `/diag/reset` kept allowed in `dev` *and* `test` (a `make e2e` against a `make up` dev stack must be able to reset). Added a WARNING log when triggered in `dev` and a clear docstring; production envs (`prod`/`staging`) remain locked out.

### Known scope-creep (intentional, not retracted)
- Rate-limits on `/setup` (5/min), `/invitations/preview` (20/min), `/invitations/accept` (10/min) and `/auth/change-password` (5/min) were added in M2 even though §M2 only mandated `/auth/*`. Defensible (these are abuse-attractor endpoints), and noted here so M14 doesn't double-spec them.

### Added — M1 (DB schema & migrations)
- **23 tables** + `alembic_version` covering auth/RBAC (8), MITRE (4), templates (4), missions (6), evidence (1), settings/detection-levels (2), notifications (1).
- SQLAlchemy 2.x declarative models with Mapped[]/mapped_column(), grouped under `backend/app/models/{auth,mitre,template,mission,evidence,setting,notification}.py`.
- Alembic init: `alembic.ini`, `alembic/env.py` reading `app.core.config.settings.database_url`, `alembic/script.py.mako`, naming convention `pk_/fk_/ck_/uq_/ix_` enforced via `MetaData(naming_convention=...)` on `app.db.base.Base`.
- Reusable mixins in `app.db.mixins`: `UuidPkMixin` (uuid4 server-side), `TimestampMixin` (created_at/updated_at, server-default + onupdate), `SoftDeleteMixin` (deleted_at, no auto-injected index — declared explicitly per table to avoid mixin-vs-class `__table_args__` clobbering).
- Postgres-specific features used: `JSONB` for `settings.value` and `notifications.payload`; native `Uuid` columns; partial indexes (`WHERE deleted_at IS NULL` on 9 tables; `WHERE read_at IS NULL` on `notifications`); CHECK constraints for status/state/opsec_level/mitre_kind enums; `exactly_one_mitre_fk` CHECK on `test_template_mitre_tags`.
- **`mission_test_mitre_tags` deliberately denormalised** (no FK to `mitre_*` tables): copies `mitre_external_id`, `mitre_name`, `mitre_url` at tag time so a later MITRE re-sync that drops an entry cannot purge a mission's tags. Companion `test_template_mitre_tags` keeps FKs since templates are editable. (Spec §11 risk addressed.)
- Backend `pyproject.toml` deps: SQLAlchemy ≥2, Alembic ≥1.13, psycopg[binary] ≥3.1.
- New Makefile targets: `migrate`, `migrate-down`, `migrate-revision MSG=…`, `migrate-status`. The Dockerfile now ships `alembic.ini` + `alembic/` so the api container can run migrations directly.
- **Test stage in `backend/Dockerfile`** (`--target test`): runtime image + dev extras + `tests/` dir. New `make test-api` target spins an ephemeral container against the live DB on the compose network. Backend tests no longer require any local Python toolchain.
- `tests/test_schema.py` (8 integration tests + the existing M0 health test = 9 total): expected tables, expected timestamp/soft-delete columns, partial-index presence, expected FK pairs, expected CHECK constraints, alembic-at-head, and a negative INSERT proving the `exactly_one_mitre_fk` CHECK fires.
- `tasks/testing-m1.md` — manual + automated verification procedure.

### Fixed (post-M1 spec-review pass)
- Soft delete now consistent across snapshot-bearing tables: `mission_scenarios`, `mission_tests`, `mission_categories` gained `SoftDeleteMixin` + their `ix_<table>_active` partial index (M12 trash bin depends on this).
- `evidence_files` gained `TimestampMixin` (`created_at`/`updated_at`) on top of the domain `uploaded_at` (audit minimal everywhere, per M1 brief).
- `mission_members` gained `TimestampMixin`, replacing the bespoke `added_at` column.
- `scenario_template_tests` PK refactored to a UUID + `UNIQUE(scenario_template_id, position)` so the same test can appear at multiple positions in a scenario (chained operations).
- `SoftDeleteMixin.__table_args__` removed (silently clobbered by class `__table_args__`); each soft-delete table now declares `ix_<table>_active` explicitly. Documented in the mixin's docstring.
- `mission_test_mitre_tags` schema redesigned to denormalise MITRE labels (see "Added" entry above).
- Migration 0001 regenerated end-to-end after these fixes — `24765a5014b6` is the new HEAD.

### Validated end-to-end (M1 DoD)
- `make clean && make up && make migrate` from a vide DB → 27 tables, 32 FK, 9 CHECK, 14 UQ, 12 partial indexes.
- `make test-api` → **9 pytest pass** (1 health + 8 schema integration) in <1 s.
- `make e2e` → **12 Playwright pass** (8 M0 smoke + 4 M1 db visibility) in 3 s.

### Added (M1 visibility)
- New API endpoint `GET /api/v1/diag/db` exposes `alembic_revision` (short-hashable) and the public-schema `table_count`. Returns 503 with `{"reachable": false}` when Postgres is down.
- New `Database` card on the SPA home page consumes that endpoint, renders the revision short-hash and the count next to the existing `API` and `Roadmap` cards.
- Footer updated to `M0 bootstrap · M1 db schema`. Roadmap card now points to `M2 — Auth + JWT`.
- New e2e suite `e2e/tests/m1-db.spec.ts` (4 tests) covers the diag endpoint contract, the Database card rendering, and the footer/roadmap labels.

### Added — M0 (bootstrap)
- Repo scaffolding: `.gitignore`, `.env.example`, `Makefile`, `docker-compose.yml`, `README.md`, `CHANGELOG.md`.
- `docker-compose.yml` with three services: `db` (postgres:16-alpine, no host port), `api` (Flask 3, port 8000), `front` (nginx serving the Vite bundle, port 80).
- Named volumes `metamorph_db` and `metamorph_evidence` for data persistence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout, `GET /api/v1/health` endpoint, multi-stage Dockerfile, `pyproject.toml` driven by `uv`.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps design tokens (`tasks/design.md`) translated into `tailwind.config.ts`, base UI primitives (`Card`, `Tag`, `SectionHeader`, `FlowNode`, `Button`), home page wired to `/api/v1/health`.
- Multi-stage frontend Dockerfile that builds the bundle and serves it via nginx, proxying `/api/*` to the api container.
- Pre-commit hook config: `ruff` for backend, `eslint` + `tsc --noEmit` for frontend.

### Validated
- `docker compose config` parses (validated via `pyyaml` since Docker is not installed in the dev shell).
- Every env var referenced by the compose file is documented in `.env.example`.
- All Python source files parse cleanly (`ast.parse`).
- All TS/JSON config files parse cleanly.

### Notes
- TLS termination is delegated to an external reverse proxy (per spec §6 NF-network). The compose stack exposes plain HTTP on `HOST_FRONT_PORT` (8080) and `HOST_API_PORT` (8000).
- The first-admin bootstrap token (M2) will be printed to the api container's stdout on first boot when the `users` table is empty.
- `tasks/spec.md` and `tasks/todo.md` remain authoritative; update them before changing scope.

### Fixed (M0 DoD validation pass on real podman)
- **FQDN image references** in `docker-compose.yml`, `backend/Dockerfile`, `frontend/Dockerfile`. Podman on Fedora enforces `short-name-mode=enforcing` for pulls (no TTY ⇒ no prompt ⇒ failure). Replaced `postgres:16-alpine` / `python:3.12-slim` / `node:20-alpine` / `nginx:1.27-alpine` with their `docker.io/library/…` qualified equivalents. Docker accepts the same prefix transparently.
- **`*.md` removed from `backend/.dockerignore` and `frontend/.dockerignore`**: `pyproject.toml` declared `readme = "README.md"`, but the file was being filtered out of the build context, so `hatchling.build.build_wheel` raised `OSError: Readme file does not exist: README.md`. Also removed the `readme` field itself from `pyproject.toml` to decouple the build from the doc.
- **`Card.tsx` type clash**: `CardProps extends HTMLAttributes<HTMLDivElement>` redefined `title` as `ReactNode`, but the native `title` is `string`. `tsc -b` failed with TS2430 during `vite build`. Switched to `Omit<HTMLAttributes<HTMLDivElement>, 'title'>`.
- **Explicit healthchecks added to compose `api` and `front`**: podman-compose 1.x doesn't surface healthchecks declared only in the `Dockerfile` via `inspect`. Mirroring them in `docker-compose.yml` makes `make inspect-health` actually see `healthy/unhealthy/starting` on every engine.
- **Suppressed `podman compose` external-provider banner** via `PODMAN_COMPOSE_WARNING_LOGS=false` exported from the Makefile.

### Validated end-to-end on podman 5.x (Fedora 43)
- `make up` → 3 containers, all 3 healthy after start_period.
- `make health` → `{"status":"ok","version":"0.1.0"}` via the front nginx proxy (port 8080) and direct API (port 8000).
- `make logs-api` → JSON-structured lines on stdout (`ts`, `level`, `logger`, `message`, custom fields).
- `make e2e` → **8/8 Playwright tests pass** in 2.5 s. Reports: `e2e/playwright-report/index.html` (529 KB, autoportant) + `junit.xml` (`tests=8 failures=0 skipped=0 errors=0`).

### Added (engine portability)
- Makefile auto-detects **docker** or **podman** at runtime and selects the matching compose driver (`docker compose`, `podman compose`, or legacy `podman-compose`). Override via `ENGINE=…` and/or `COMPOSE="…"`.
- New targets: `engine` (print detected runtime), `volumes` (list project-named volumes), `inspect-health` (health status of all 3 containers), `logs-api` (tail just the api), `health` (single curl probe). All engine-agnostic.
- `make help` now prints the active engine + compose driver in its footer.
- `tasks/testing-m0.md` and `README.md` rewritten to be engine-agnostic — raw `docker logs` / `docker volume ls` / `docker inspect` calls replaced with the new make targets.

### Added (M0 testing)
- `e2e/` Playwright project with chromium, HTML + JUnit XML reporters, traces / screenshots / videos kept on retry. Reports land in `e2e/playwright-report/`.
- `e2e/tests/m0-smoke.spec.ts` — 8 smoke tests covering the front rendering, the API proxy, the design tokens, the absence of any runtime CDN traffic (spec §7), and the CORS contract.
- Makefile targets `e2e-install`, `e2e`, `e2e-report`, `e2e-up`, `wait-healthy`.
- `tasks/testing-m0.md` — step-by-step manual + automated verification procedure for M0.
- Convention added to `tasks/todo.md`: every milestone N delivers `tasks/testing-m<N>.md` + at least one `e2e/tests/m<N>-*.spec.ts`, and the spec-reviewer subagent runs before marking the milestone done.

### Fixed (post-M0 spec-review pass)
- `.pre-commit-config.yaml` added at repo root: ruff + ruff-format on backend, eslint + tsc --noEmit + prettier --check on frontend, plus baseline whitespace/JSON/private-key checks. Documented `pre-commit install` in `README.md`.
- Self-hosted webfonts via `@fontsource/jetbrains-mono` and `@fontsource/ibm-plex-sans` (imported in `frontend/src/index.css`); dropped the Google Fonts `<link>` from `frontend/index.html` to honor spec §7 ("no runtime CDN").
- Refuse-to-boot guard in `backend/app/core/config.py`: when `APP_ENV != "dev"`, defaults / placeholders for `JWT_SECRET` and `POSTGRES_PASSWORD` raise at startup. New `APP_ENV` env var documented in `.env.example`, `README.md`, and `docker-compose.yml`.
- `make dev` now runs `dev-api` and `dev-front` in parallel via `make -j2` instead of just printing a hint.
- Removed dead `database_url` property from `Settings` (will be reintroduced in M1 with the SQLAlchemy/Alembic stack).
- Pinned Node engines to `>=20` in `frontend/package.json`.
- Reconciled M0 DoD wording in `tasks/todo.md` (HTTP via `HOST_FRONT_PORT`, with explicit note that prod TLS is external).
- Documented the `2xs/3xs/4xs` font-size aliases in `frontend/tailwind.config.ts` against the design.md §3 scale.
-												feat(m0): bootstrap repo, design system, compose stack

- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-11 06:16:00 +02:00
+								# Changelog
 								All notable changes to this project will be documented here. Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) · [Conventional Commits](https://www.conventionalcommits.org/).
 								## [Unreleased]
-												feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll

DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:

Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
  ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
  read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
  - `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
  - `PUT  /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
    perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
  - `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
    and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
    that fires *before* idempotency, `executed_at` auto-stamped on the way in
  - `GET  /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
  - Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
  - Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
  - Atomic `os.replace`, hex-validated SHA path component, root-dir guard
  - Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
  re-seeds detection levels as a safety net.

Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
  comment, mark-executed + override toggle) and cyan border (detection-level
  select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
  /activity every 15 s, gated on document.visibilityState. Per-field disable
  based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.

Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
  gating, state-machine matrix incl. idempotent-side enforcement, executed_at
  override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
  activity polling with URL-encoded `since`, membership 404 vs admin bypass,
  cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
  (red-only/blue-only API gating, mark-executed + reviewed_by_blue side
  enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
  + transition, non-member 404 message). afterAll restores stable admin and
  re-syncs MITRE.

Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
  and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
  query timestamps, perm-before-flush, atomic move, polling visibility gate).

Test count: 133 pytest / 49 Playwright, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-14 08:16:48 +02:00
+								### Fixed (post-M7 review pass — spec-reviewer + code-reviewer)
 								- **Idempotent transition leaked false success to a wrong-side user** (`backend/app/services/mission_tests.py:570`): a blue-only viewer POSTing `target_state="executed"` while the test was already executed got a 200 idempotent response, falsely advertising that they held `mission.write_red_fields`. Reordered the gate so the side-perm check runs *before* the idempotency short-circuit, with a new `_IDEMPOTENT_SIDE` table that asks "which side originally produced this state?" — re-asserting that perm even on no-op replays. Test `test_idempotent_transition_still_checks_side_perm`.
 								- **Cross-mission evidence access not pinned by a test** (`backend/tests/test_mission_tests.py:test_evidence_member_of_other_mission_gets_404`): added explicit coverage that a user who is a blue member of mission B sees 404 on an evidence row attached to mission A. The chain walk in `_resolve_evidence_chain` already enforced this, but the regression test was missing.
 								- **`shutil.move` swapped for `os.replace`** (`backend/app/services/evidence.py:240`): `os.replace` is the documented atomic-rename primitive on POSIX and Windows when src/dst share a volume — and our tmpfile is always staged inside the destination directory, so the guarantee holds. Removes the implicit copy+remove fallback from `shutil.move` that would silently break atomicity on a cross-fs `EVIDENCE_DIR`.
 								- **SHA256 path component now hex-validated** (`backend/app/services/evidence.py:227`): the hash always comes from hashlib so it's already hex, but if a future caller ever passes pre-computed bytes we want to fail loudly rather than write to a path like `..something.evtx`. Cheap `re.fullmatch(r"[0-9a-f]{64}", sha256)` guard.
 								- **`EVIDENCE_DIR` filesystem-root guard** (`backend/app/services/evidence.py:_test_dir`): refuse to create per-mission directories when `EVIDENCE_DIR` resolves to `/` (or the equivalent on Windows). Stops a mis-configured operator from laying down content-addressed evidence files at the filesystem root.
 								- **`/diag/reset` evidence cleanup now skips symlinks** (`backend/app/api/diag.py:127`): switched from `is_dir()` to `is_symlink() or not is_dir()` so a hostile or accidental symlink inside `EVIDENCE_DIR` is unlinked rather than `rmtree`'d through.
 								- **N+1 in `_to_detail_view`** (`backend/app/services/mission_tests.py:_to_detail_view`): the last-actor and detection-level lookups each issued their own `s.get()`. Replaced with `select(columns)` queries that return just the needed scalar fields — same SQL count but fewer ORM round-trips, and every PUT/transition exercises this path so it adds up.
 								- **Mission detail row `onClick` removed in favour of the wrapped `Link`** (`frontend/src/pages/MissionDetailPage.tsx:684`): the `tr onClick` + nested `Link` with `stopPropagation` worked but was fragile to accessibility tooling. The link on the test name + the explicit hover class is enough.
 								### Added — M7 (Red & blue execution on a mission test)
 								- **Per-mission-test write API** (`app/api/missions.py` + `app/services/mission_tests.py`):
 								  - `GET /missions/{id}/tests/{test_id}` — full detail view with snapshot, state, red/blue fields, MITRE tags, evidence list, last-actor metadata.
 								  - `PUT /missions/{id}/tests/{test_id}` — patch any subset of `red_command` / `red_output` / `red_comment_md` / `blue_comment_md` / `detection_level_id` / `executed_at` / `executed_at_overridden`. The service classifies each touched field as red-side or blue-side and rejects with 403 if the caller lacks the matching perm. `executed_at*` only writable when the test sits in `executed` or `reviewed_by_blue`.
 								  - `POST /missions/{id}/tests/{test_id}/transition` — drives the state machine `pending↔skipped/blocked` + `pending→executed→reviewed_by_blue` (allows undo back to `pending`). Side-aware perm gating: `pending→executed` and `executed→pending` require `write_red_fields`; `executed↔reviewed_by_blue` requires `write_blue_fields`; `pending↔skipped/blocked` accepts either side. Transitioning into `executed` stamps `executed_at=now()` and clears the override; transitioning out (to `pending`) wipes the timestamp.
 								  - `GET /missions/{id}/activity?since=<ISO>` — returns mission_tests whose `updated_at > since`, freshest first. Drives the SPA's 15-second polling badge. Response includes `server_time` so the client can chain calls without clock drift.
 								- **Evidence storage pipeline** (`app/services/evidence.py` + `app/api/evidence.py`):
 								  - `POST /missions/{id}/tests/{test_id}/evidence` (multipart, gated on `mission.write_blue_fields`): streams the upload into a tmpfile next to the final location, hashing chunk-by-chunk and aborting at the 25 MB cap. Validates extension (whitelist: png/jpg/jpeg/pdf/txt/log/json/csv/evtx/zip) and MIME (permissive allowlist + `application/octet-stream` fallback for `.evtx`). Content-addressed storage: `${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>` — re-uploading byte-identical content reuses the file on disk and inserts a fresh row.
 								  - `GET /evidence/{id}` — JSON metadata view; `?download=true` switches to `send_file` with the original filename in `Content-Disposition` and the SHA256 as the ETag.
 								  - `DELETE /evidence/{id}` — soft delete (only flips `deleted_at`; physical purge lands in M12).
 								  - All three routes are membership-aware via the same chain walk (`evidence → test → scenario → mission`), collapsing "not found" / "not visible" into 404 to prevent existence leaks.
 								- **Activity tracking column** (`backend/alembic/versions/20260514_1000_91a4e7c6d2f3_m7_mission_test_last_actor.py`): added `mission_tests.last_actor_id` (FK `users.id` `ON DELETE SET NULL`) + `ix_mission_tests_updated_at` to support the polling endpoint. Every red/blue write or transition stamps the actor so the "modified by X Ns ago" indicator can resolve a human label.
 								- **Detection-level seed + read** (`app/services/detection_levels.py` + `app/api/detection_levels.py`):
 								  - 4 default rows seeded at boot — `detected_blocked` / `detected_alert` / `logged_only` / `not_detected` — colored on the design-system accent palette. The seed is idempotent and never mutates existing rows; new keys added to `DEFAULT_LEVELS` in future releases surface on next boot.
 								  - `GET /detection-levels` (gated on `detection_level.read`) returns the catalogue ordered by position. CRUD is M8's territory.
 								- **Per-test page** (`frontend/src/pages/MissionTestPage.tsx`): two-zone layout with the red border on the red half (command, output, markdown comment, mark-executed button, override toggle) and the cyan border on the blue half (detection-level select, comment, drag-and-drop evidence dropzone). Per-field disable based on `mission.write_red_fields` / `mission.write_blue_fields`; server is the ultimate arbiter so the UI is purely advisory. The "Last touched Xs ago by Y" badge polls `/activity` every 15 s while the document is visible.
 								- **Mission detail page wires through to the per-test page** (`frontend/src/pages/MissionDetailPage.tsx`): every row in the Tests tab is now clickable (cursor + hover state) and links to `/missions/<id>/tests/<test_id>`. The route is registered in `App.tsx` behind `RequireAuth`.
 								- **TanStack query keys** (`frontend/src/lib/missions.ts`): added `missionTestKeys.detail()` / `.activity()` / `.detectionLevels()` so the per-test page invalidations stay surgical (don't blow away the whole missions list).
 								- **`/diag/reset` extended** (`app/api/diag.py`): test mode now wipes `${EVIDENCE_DIR}/*` so e2e uploads don't accumulate across runs. Detection levels are preserved (reference data, not catalogue) and the seed is re-run as a safety net.
 								- **Tests**:
 								  - **`backend/tests/test_mission_tests.py`** — 25 pytest tests covering: detection-level seed + perm gating; red/blue field-level perms (red user blocked on blue fields and vice-versa); mark-executed stamps `executed_at`; override gating (forbidden while pending, blue-side blocked); state-machine matrix + side perm refinement; membership 404 vs admin bypass; evidence 24 MB ok / 26 MB rejected; SHA256 verification; MIME/extension whitelist; soft-delete hides bytes from detail view; activity polling with `since=` URL-encoded; future `since` returns empty.
 								  - **`e2e/tests/m7-execution.spec.ts`** — 5 Playwright tests against the live stack: red-only/blue-only API gating, mark-executed + reviewed_by_blue side enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save + transition, non-member sees the 404 alert instead of mission content. `afterAll` restores the stable admin and re-syncs MITRE.
 								- **HomePage**: hero + roadmap card bumped to `M7 — Red & blue execution on a mission test (done). Next: M8`.
-												fix(m6): mission detail page can now edit metadata, append scenarios, edit members

The M6 SPA shipped the create wizard but the detail page was read-only —
even though the backend already exposed PUT /missions/{id}, POST
/missions/{id}/scenarios, and PUT /missions/{id}/members. So once a
mission was created you couldn't fix a typo in the client name, add a
scenario you forgot, or change member assignments without curl.

Added three modals on the detail page, gated by `is_admin ||
mission.update`:

- Edit metadata (header button, 3xl modal): name + client + dates +
  markdown description, same validation as the wizard step 1.
- Add scenarios (Tests tab): scenario picker matching wizard step 2,
  calls POST /missions/{id}/scenarios which appends snapshots at
  current_max_position + 1.
- Edit members (Members tab): roster + red/blue toggles, calls
  PUT /missions/{id}/members (full-set replace), pre-populated with
  the current member set.

The detail page now imports useAuth so `canEdit` is computed once and
shared between the three buttons.

E2E: new "detail page edits metadata, appends scenarios, edits members"
spec exercises the three modals end-to-end. M6 e2e count is now 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-14 07:37:06 +02:00
+								### Fixed (post-M6 SPA — mission detail page was read-only)
 								- **Mission detail page couldn't edit metadata, append scenarios, or change members** (`frontend/src/pages/MissionDetailPage.tsx`): the M6 SPA shipped the 3-step *creation* wizard but no edit affordance on the detail page — even though the backend already exposed `PUT /missions/{id}`, `POST /missions/{id}/scenarios`, and `PUT /missions/{id}/members`. Added three modals gated by `is_admin || mission.update`:
 								  - **Edit metadata** (header button, opens a 3xl modal): name / client_target / dates / description_md, full inline validation (empty name, inverted dates) mirroring the wizard's step 1.
 								  - **Add scenarios** (in the Tests tab): scenario picker reusing the wizard step-2 visual, calls `POST /missions/{id}/scenarios` which appends snapshots at `current_max_position + 1`. The footer line tells the user how many tests will be appended.
 								  - **Edit members** (in the Members tab): roster + red/blue toggles, calls `PUT /missions/{id}/members` (full-set replace) — same UX as the wizard step 3, pre-populated with the current member set.
 								- Detail page now imports `useAuth` to compute `canEdit` once and reuses it across all three buttons.
 								- E2E spec extended: new test `SPA — detail page edits metadata, appends scenarios, edits members` exercises the three modals end-to-end against a pre-seeded mission. Suite is now 44 Playwright tests (6 in M6).
-												fix(m6): post-review pass — cache prefix, snapshot lock, perm-before-parse, LIKE escape

Addresses spec-reviewer + code-reviewer feedback on the M6 bundle:

Critical:
- frontend/src/lib/missions.ts: add `listPrefix()` so TanStack invalidation
  catches every filtered list variant; the previous `list()` returned
  `['missions','list',{}]` and only matched the exact empty-filter cache,
  leaving filtered tables stale after create/transition/delete.
- backend/app/services/missions.py: acquire the same per-scenario
  `pg_advisory_xact_lock` key used by `set_scenario_tests` before
  snapshotting; without it a concurrent M5 reorder could freeze a torn
  snapshot under READ COMMITTED. Sorted by key to avoid deadlocks with
  another snapshotter.

Important:
- backend/app/api/missions.py: `@require_perm("mission.update",
  "mission.archive")` on the transition endpoint so users without either
  perm get 403 before the body is parsed (no shape leak via 400).
- backend/app/services/missions.py: escape `%` / `_` / `\` in user-typed
  `q` / `client` LIKE search; users can no longer trigger wildcard
  semantics by typing literal `%`. Added `escape='\\'` arg on every .like().
- backend/app/services/missions.py: filter `MissionTest.deleted_at` and
  `MissionScenario.deleted_at` in the list-item and detail counts so M7+
  soft-deletes don't drift the totals silently.

Nits:
- backend/app/api/users.py: order `/users/roster` by email for stable
  rendering + deterministic e2e selectors.
- frontend/src/pages/MissionDetailPage.tsx: distinct accent per
  transition target (cyan/orange/green/teal) matching the status legend.
- e2e/tests/m6-missions.spec.ts: switch fragile `getByRole(name=/In
  Progress/i)` to the stable `mission-transition-in_progress` data-testid.

New tests:
- test_create_mission_rejects_soft_deleted_scenario
- test_transition_perm_gate_runs_before_payload_parse
- test_search_treats_wildcards_as_literals

Suite: 106 pytest passing (was 103), 43 Playwright passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-13 15:14:57 +02:00
+								### Fixed (post-M6 review pass — spec-reviewer + code-reviewer)
 								- **SPA cache invalidation only refreshed the empty-filter list** (`frontend/src/lib/missions.ts:136`): `missionKeys.list()` returns `['missions','list',{}]`. TanStack v5's `invalidateQueries({queryKey})` is prefix-based, but `{}` is treated as an atomic final element — so create / transition / delete called with that key only invalidated the *exact* empty-filter list, leaving any filtered variant stale until manual refetch. Added `missionKeys.listPrefix()` returning `['missions','list']` and switched all three mutation `onSuccess` paths to it.
 								- **Snapshot lacked the per-scenario advisory lock** (`backend/app/services/missions.py:467`): a concurrent `PUT /scenario-templates/{id}/tests` (M5 reorder, which deletes-then-reinserts join rows) running while `_snapshot_scenarios` walked `sc.tests` could freeze a torn snapshot — `selectinload` re-queries under READ COMMITTED so a partial view was possible. Added `_lock_scenario_ids_for_snapshot` that acquires the same `pg_advisory_xact_lock` key used by `set_scenario_tests` (blake2b digest of the scenario UUID, sorted to avoid deadlocks). Snapshot and reorder now serialise per scenario.
 								- **Transition endpoint leaked its body shape via 400 before the perm gate** (`backend/app/api/missions.py:441`): a user without `mission.update` or `mission.archive` POSTing `{"status":"x"}` got a Pydantic 400 instead of 403. Added `@require_perm("mission.update", "mission.archive")` so the gate fires before the parse; the inner refinement still enforces the per-target perm. Test `test_transition_perm_gate_runs_before_payload_parse`.
 								- **LIKE wildcards in user-typed search were honoured as SQL wildcards** (`backend/app/services/missions.py:632,637`): `?q=%` matched every mission. Added `_escape_like` that pre-escapes `%`, `_`, `\` and a matching `escape='\\'` argument on every `.like(...)` call. Test `test_search_treats_wildcards_as_literals`.
 								- **Counts ignored soft-deleted mission children** (`backend/app/services/missions.py:587,597`): `tests_count` and the detail view summed `len(sc.tests)` without filtering `MissionTest.deleted_at`. Harmless today (M6 doesn't soft-delete mission tests), but would drift silently once M7+ surfaces `state=skipped/blocked`. Added the filter in both `_to_list_item` and `_scenario_views`.
 								- **`/users/roster` was unordered** (`backend/app/api/users.py:73`): the wizard's member list shuffled rows on every refetch. Sorted by `email` for predictable rendering + stable e2e selectors.
 								- **Frontend transition button accent collapsed `in_progress` and `completed` into one colour** (`frontend/src/pages/MissionDetailPage.tsx:97`): both rendered cyan, so the status legend in the list didn't match the transition button. Added a `TRANSITION_BUTTON_ACCENT` map mirroring `MISSION_STATUS_ACCENT` (cyan/orange/green/teal).
 								- **Soft-deleted source scenario was a silent foot-gun**: `_load_scenario_templates_for_snapshot` already rejected it, but no test pinned the behaviour. Added `test_create_mission_rejects_soft_deleted_scenario` so future refactors can't regress to "freeze a tombstoned scenario into a fresh mission".
 								- **E2E wizard assertion used `getByRole('button', { name: /In Progress/i })`** (`e2e/tests/m6-missions.spec.ts:287`): the accessible name is `→ In Progress` and the arrow Unicode is brittle. Switched to `getByTestId('mission-transition-in_progress')`.
-												feat(m6): missions + snapshot CRUD, membership visibility, status state machine

Adds the mission layer that materialises template snapshots, plus the SPA
list / 3-step wizard / detail page.

Backend:
- app/services/missions.py — create_mission snapshots scenarios, tests, MITRE
  tags in a 4-query write; list/get apply a non-admin membership filter that
  collapses to 404 (no existence leak); status state machine enforces
  draft → in_progress → completed → archived with archived as a sink; the
  non-admin creator is auto-added as role_hint='red' to retain visibility.
- app/api/missions.py — 8 endpoints (list, get, create, update, add
  scenarios, set members, transition, soft-delete) with strict pydantic
  schemas. The transition endpoint splits the perm gate manually so
  archive requires mission.archive while other targets use mission.update.
- app/api/users.py — new GET /users/roster returning (id, email,
  display_name) only, gated by user.read OR mission.create OR
  mission.update — lets non-admin wizard users see assignable peers
  without exposing the admin /users payload.
- app/api/diag.py — /diag/reset truncates the mission_* tables before the
  template tables because the source_*_template_id FKs are ON DELETE SET
  NULL, which is cheaper to short-circuit by removing the children first.

Frontend:
- lib/missions.ts — typed client, queryKey factory, status accent map.
- pages/MissionsListPage.tsx — list cards with status accent + filters
  (q, client, status).
- pages/MissionsCreatePage.tsx — 3-step wizard (meta → scenarios → members)
  with member roster fed by /users/roster.
- pages/MissionDetailPage.tsx — header + transition buttons (legal next
  states only) + Tests/Members/Synthesis/Export tabs.
- Routes + nav entry (visible to anyone with mission.read or admin).

Tests:
- backend/tests/test_missions.py — 22 pytest covering snapshot fidelity,
  MITRE propagation, membership visibility, transition state machine,
  perm gating, member set replace, append scenarios, soft-delete, partial
  update, inverted-date rejection.
- e2e/tests/m6-missions.spec.ts — 5 Playwright (snapshot freezing, non-admin
  visibility, status transitions + 409, SPA wizard end-to-end, list filter).

Docs:
- CHANGELOG, tasks/testing-m6.md, tasks/lessons.md (snapshot tradeoffs,
  membership=404 pattern, /diag/reset order, auto-creator add).
- README + tasks/todo.md updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-13 15:07:32 +02:00
+								### Added — M6 (Missions & snapshot)
 								- **CRUD `missions`** (`app/services/missions.py` + `app/api/missions.py`):
 								  - Fields: name, client_target, date_start, date_end, status (`draft/in_progress/completed/archived`), description (markdown), visibility_mode (frozen to `whitebox` in v1).
 								  - On creation/append, the service **snapshots** the selected `scenario_templates` and all their `test_templates` into `mission_scenarios` / `mission_tests` (every template field — including OPSEC level, tags, expected IOCs, MITRE tags). The denormalised `mission_test_mitre_tags` table copies `external_id`, `name`, `url` so a later MITRE re-sync that drops the entry can't alter a mission's tags (spec §11).
 								  - `source_*_template_id` FKs survive template soft-deletes (`ON DELETE SET NULL`); the mission's frozen content is unaffected.
 								  - **Membership visibility**: non-admin viewers see only missions where they are a `mission_members` row. The service maps "not visible" → 404 (no existence leak via 403). Admins bypass via the `admin` group.
 								  - **Status state machine**: `draft → in_progress → completed → archived`; `archived → ∅`. The transition endpoint accepts the target status, validates the move, and rejects invalid jumps with 409. Idempotent (target=current) is a no-op 200.
 								  - Auto-creator-membership: a non-admin caller of `POST /missions` is auto-added as `role_hint='red'` if not already in the `members[]` payload — so they retain visibility on the mission they just created.
 								  - REST: `GET/POST /missions`, `GET/PUT/DELETE /missions/{id}`, `POST /missions/{id}/scenarios` (append snapshots at the end), `PUT /missions/{id}/members` (replace set), `POST /missions/{id}/transition`.
 								  - Filters on list: `q` (LIKE on name/description), `status`, `client` (LIKE on client_target). `include_deleted=true` is admin-only (403 otherwise).
 								- **`GET /users/roster`** (`app/api/users.py`): a deliberately minimal listing — `id`, `email`, `display_name` of active users only — accessible to any holder of `user.read`, `mission.create`, or `mission.update`. Lets a non-admin red teamer populate the wizard's member picker without exposing the admin-grade `/users` endpoint (which leaks `is_admin`, `is_active`, group memberships).
 								- **Frontend**:
 								  - `lib/missions.ts` — typed client + queryKey factory + status accent map + filter query-string builder.
 								  - `pages/MissionsListPage.tsx` — list cards (one per mission) with status accent, scenario/test/member counts, date range, plus filters (q, client, status).
 								  - `pages/MissionsCreatePage.tsx` — **3-step wizard**: metadata → scenario picker → member roster (red/blue toggles + auto-include the non-admin creator). Submits via `POST /missions` and redirects to the detail page.
 								  - `pages/MissionDetailPage.tsx` — header with transition buttons (only the legal next states are rendered), soft-delete with confirm prompt, and 4 tabs: **Tests** (table of snapshotted tests with MITRE tags, OPSEC, state), **Members** (role-coloured pills), **Synthesis** (placeholder for M10), **Export** (placeholder for M11).
 								  - Nav adds **Missions** link visible to anyone with `mission.read` or admin.
 								- **/diag/reset** truncates the mission tables before the template tables — `mission_scenarios.source_scenario_template_id` and `mission_tests.source_test_template_id` are `ON DELETE SET NULL`, so wiping missions first avoids the round-trip through the null-update path.
 								- **Testing**:
 								  - `backend/tests/test_missions.py` — **22 pytest** covering snapshot fidelity (rename source template after snapshot → mission unchanged), MITRE tag propagation, membership-based 404, perm gating (create vs read vs archive), status transition chain + invalid jumps (409), member set replace + role-hint validation, scenario append at correct position, soft-delete, partial metadata update, inverted-date rejection, admin-only `include_deleted`.
 								  - `e2e/tests/m6-missions.spec.ts` — **5 Playwright** (snapshot freezing, membership visibility for non-admin red, status transition + 409, SPA wizard end-to-end, SPA list + status filter).
 								  - `tasks/testing-m6.md`.
-												test(m5): playwright spec + docs (CHANGELOG, README, lessons, testing-m5)

- 4 Playwright tests: API CRUD round-trip, scenario reorder via PUT, SPA
  list + opsec filter, SPA scenario list rendering with ordered tests.
- afterAll restores the stable admin (admin@metamorph.local) per the
  test_admin memory rule.
- CHANGELOG M5 section + Fixed subsections for the LogRecord 'name'
  collision and the React `currentTarget` vs `target` quirk.
- README status bumps to M0-M5.
- tasks/lessons.md captures the new patterns (sentinel pattern for
  partial-update, FK ordering in /diag/reset, dnd-kit stable IDs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 19:57:51 +02:00
+								### Added — M5 (Test & scenario templates)
 								- **CRUD `test_templates`** (`app/services/test_templates.py` + `app/api/test_templates.py`):
 								  - Fields: name, description, objective, procedure (markdown), prerequisites (markdown), expected result red, expected detection blue, OPSEC level (`low/medium/high`), free tags (TEXT[]), expected IOCs (TEXT[]).
 								  - Polymorphic MITRE tag set (`(kind, external_id)` ↔ exactly one of `tactic_id`/`technique_id`/`subtechnique_id`). The wire payload uses ATT&CK external IDs — server resolves to UUIDs.
 								  - Filters: `q` (LIKE on name/description), `tactic`/`technique`/`subtechnique` (joined via subquery on the polymorphic tag table), `opsec`, `tag` (array contains).
 								  - REST: `GET /test-templates`, `GET /test-templates/{id}`, `POST /test-templates`, `PUT /test-templates/{id}` (partial, with explicit `_UNSET` sentinel so omitted fields stay untouched), `DELETE /test-templates/{id}` (soft).
 								- **CRUD `scenario_templates`** (`app/services/scenario_templates.py` + `app/api/scenario_templates.py`):
 								  - Ordered list of test_templates with `position` (UNIQUE `scenario_template_id, position`).
 								  - Reorder via full replace: `PUT /scenario-templates/{id}/tests` deletes the join rows and re-inserts at positions `0..N-1` — clean atomic op that respects the UNIQUE constraint without a 2-phase position shuffle.
 								  - The same test can appear multiple times (chained operations).
 								  - REST: `GET`/`POST`/`PATCH` (metadata) / `DELETE` (soft) on `/scenario-templates`.
 								- **Frontend**:
 								  - `lib/templates.ts` — typed client + queryKey factory.
 								  - `pages/AdminTestsPage.tsx` — list + filters (q, tactic, opsec, tag) + modal with full field set + embedded `<MitreTagPicker>` for tags.
 								  - `pages/AdminScenariosPage.tsx` — list + modal with **@dnd-kit/sortable** vertical drag-and-drop on the ordered test list. New deps: `@dnd-kit/core`, `@dnd-kit/sortable`, `@dnd-kit/utilities`.
 								  - `components/MarkdownField.tsx` — lean textarea with markdown hint (no heavy editor dep; rendering happens at display time in M7).
 								  - Nav adds **Tests** and **Scenarios** links (admin-gated).
 								- **/diag/reset** truncates the 4 new tables before the MITRE block — the `scenario_template_tests.test_template_id` FK is `ON DELETE RESTRICT`, so the order matters.
 								- **Testing**:
 								  - `backend/tests/test_templates.py` — **19 pytest** (create/list/filter by tactic+opsec+tag, MITRE tag resolution + replacement on update, soft-delete, perm gating, scenario create+reorder+delete, soft-deleted test linking semantics).
 								  - `e2e/tests/m5-templates.spec.ts` — **4 Playwright** (API CRUD round-trip, scenario reorder, SPA list + opsec filter, SPA scenario list rendering with ordered tests).
 								  - `tasks/testing-m5.md`.
 								### Fixed (M5 implementation)
 								- **`LogRecord` key collision**: `log.info(..., extra={"name": ...})` raises `KeyError("Attempt to overwrite 'name' in LogRecord")` because `name` is reserved by Python's stdlib logging. Renamed to `template_name`.
 								- **React `currentTarget` null in deferred state updaters**: `onChange={(e) => setX((prev) => ({ ...prev, q: e.currentTarget.value }))}` blanked the page on the first user input because `currentTarget` is cleared after the listener bubble ends, before React invokes the updater. Switched all M5 handlers to `e.target.value`, which persists on the synthetic event.
-												fix(m5): scenario reorder 500 — wrong pg_advisory_xact_lock overload

Editing a scenario and saving (with or without changes) returned 500:
  function pg_advisory_xact_lock(smallint, bigint) does not exist

Postgres only ships (int4, int4) and (bigint) variants. The two-arg call
passed `m = hash(uuid) & 0xFFFFFFFF` which can reach 2^32-1, so psycopg
promoted it to bigint and no overload matched.

Switched to the single-arg bigint form. While there, replaced Python's
built-in hash() with hashlib.blake2b(...) — the built-in is randomised
per process via PYTHONHASHSEED, so gunicorn workers were computing
different lock keys for the same scenario and the lock wasn't actually
serialising across workers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-13 09:29:27 +02:00
+								### Fixed (post-M5 — scenario reorder 500 + cross-worker lock correctness)
 								- **`PUT /scenario-templates/{id}/tests` returned 500** (`backend/app/services/scenario_templates.py:218`): the two-argument form `pg_advisory_xact_lock(:n, :m)` failed with `function pg_advisory_xact_lock(smallint, bigint) does not exist`. Postgres only provides `(int4, int4)` and `(bigint)` overloads — psycopg promoted `m = hash(uuid) & 0xFFFFFFFF` (up to 2^32-1) to bigint and there's no matching overload. Switched to the single-argument bigint form with `CAST(:key AS bigint)`.
 								- **Cross-worker lock was a no-op** (same site): Python's built-in `hash()` is randomised per process via `PYTHONHASHSEED`, so each gunicorn worker computed a different key for the same `scenario_id`, and concurrent reorders on different workers acquired independent locks — defeating the serialisation. Replaced with `blake2b(scenario_id.bytes, digest_size=8)` interpreted as a signed int64. Stable, deterministic, fits in `bigint`.
-												fix(m5): modal layout for the test-template editor

The +New test modal capped at max-w-2xl rendered the 15-column MITRE matrix
in a 672px frame with no height cap, so the matrix spilled to the right of
the dialog, the form bottom dropped below the viewport, and neither scroll
direction worked — buttons were unreachable.

- Modal: add a `size` prop (default 2xl, back-compat) with a `7xl` preset.
  Cap height at calc(100vh-2rem), make the header sticky, and wrap children
  in a min-w-0 flex-1 overflow-y-auto body so tall content scrolls inside.
- MitreTagPicker: move overflow-x-auto from the grid itself to a dedicated
  scroller wrapper, and add `min-w-0` so the constraint propagates from the
  modal body. The grid's 1680px intrinsic min-width previously prevented
  the parent's overflow-x-auto from kicking in.
- AdminTestsPage: switch the form layout from `grid gap-3` to `flex flex-col
  gap-3 min-w-0` and set the modal size to 7xl. The CSS Grid form was
  propagating min-width: auto to all its items, which let the picker drag
  the body past the modal width.
- AdminScenariosPage: bump the modal to size 3xl for breathing room around
  the catalogue picker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-13 08:31:13 +02:00
+								### Fixed (post-M5 UI — modal layout for the test-template editor)
 								- **Modal box capped its width at `max-w-2xl` and had no vertical scroll** (`frontend/src/components/ui/Modal.tsx`): opening **+ New test** rendered the 15-column MITRE matrix inside a 672 px frame with no height cap, so the matrix spilled to the right and the form bottom dropped below the viewport — buttons unreachable, no scroll. Added a `size` prop (default `2xl` for back-compat), `max-h-[calc(100vh-2rem)]` + `flex flex-col` on the dialog, and an inner `min-w-0 flex-1 overflow-y-auto` body so the header stays pinned while the form scrolls inside the modal.
 								- **MITRE matrix overflow-x failed to scroll inside the modal body** (`frontend/src/components/MitreTagPicker.tsx`): `overflow-x-auto` sat directly on the grid element, but the grid's intrinsic min-width (`15 × minmax(7rem, …)` = 1680 px) prevented it from shrinking below its content, so the grid spilled outside its parent instead of scrolling. Wrapped the grid in a dedicated `overflow-x-auto rounded min-w-0 w-full` scroller and added `min-w-0` to the picker root so the constraint propagates from the modal body. The grid now scrolls horizontally inside the modal.
 								- **`grid gap-3` form layout in the test-template modal propagated `min-width: auto`** (`frontend/src/pages/AdminTestsPage.tsx`): each grid item refused to shrink below its widest child, so the picker dragged the form (and the body) past the modal width. Switched the form to `flex flex-col gap-3 min-w-0`, which breaks the propagation while preserving vertical spacing.
 								- **Test-template modal now uses `size="7xl"`** and the scenario-template modal `size="3xl"` to match their content density.
-												fix(m5): post-review pass — AND filter, advisory lock, N+1, item caps, mutation cache

Spec-reviewer + code-reviewer findings applied:

Must-fix
- Filter combinator AND-semantics: tactic+technique+subtechnique now intersect
  (one IN subquery per facet) instead of being pooled into one OR. Reviewers
  flagged both the wrong default semantics and the theoretical UUID-collision
  risk of pooling tactic/technique/sub UUIDs into a shared list across
  three columns.
- Front-end mutation cache hygiene: updateMeta + setTests both
  `onSettled: invalidate` so a partial failure leaves the cache consistent.

Should-fix
- Per-scenario pg_advisory_xact_lock on set_scenario_tests — serialises
  concurrent reorders, mirrors M4 /mitre/sync pattern.
- Backend/front consistency on duplicate tests in a scenario: the
  UNIQUE(scenario_id, position) constraint already allows the same
  test_template multiple times (chained ops), so the catalogue picker no
  longer excludes already-picked items.

Nice-to-have
- N+1 eradicated in test_template view rendering: _to_views_batch
  builds {uuid → MitreRow} maps in 3 queries up-front; list endpoint
  now issues 4 queries total regardless of list size.
- Wire-level item length caps on tags (64) and expected_iocs (255)
  via Annotated[str, StringConstraints(...)] — returns 400 instead of
  bubbling up StringDataRightTruncation.
- 4 new pytest covering the AND-filter, extra="forbid" rejection,
  empty mitre_tags clearing, and the 65-char tag cap. Total now
  81 pytest + 38 e2e pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 20:05:00 +02:00
+								### Fixed (post-M5 review pass — spec-reviewer + code-reviewer)
 								- **Filter combinator was OR, not AND** (`backend/app/services/test_templates.py:235`): `?tactic=TA0002&technique=T1059` returned templates matching *either* facet instead of *both*. Pre-fix also pooled all three UUIDs into a shared `IN` list across three columns, theoretically allowing a UUID collision to match across kinds. Refactored to one IN-subquery per facet, ANDed together via repeated `WHERE id IN (...)`.
 								- **Concurrent reorder race on `set_scenario_tests`** (`backend/app/services/scenario_templates.py:207`): two parallel reorders on the same scenario could deadlock on the `UNIQUE(scenario_id, position)` constraint under READ COMMITTED. Added a per-scenario `pg_advisory_xact_lock(0x5C3, hash(scenario_id))` mirroring the M4 `/mitre/sync` pattern; different scenarios don't contend.
 								- **N+1 on `_to_view` MITRE resolution** (`backend/app/services/test_templates.py:160`): rendering K templates with ~T tags each fired up to K×T `s.get(...)` calls. Added `_to_views_batch` that pre-builds `{uuid → MitreRow}` maps in 3 queries and feeds them to per-template view assembly; `list_test_templates` now issues 4 queries total regardless of list size.
 								- **Wire-level item length cap on `tags` / `expected_iocs`** (`backend/app/api/test_templates.py:18-21`): the DB columns are `ARRAY(String(64))` / `ARRAY(String(255))` but the API layer only capped the LIST length, not item strings — long inputs hit the driver with `StringDataRightTruncation`. Added `Annotated[str, StringConstraints(...)]` types so the API returns 400 with a clean validation error.
 								- **Front-end mutation cache hygiene** (`frontend/src/pages/AdminScenariosPage.tsx:148-156`): `updateMeta` and `setTests` mutations are run sequentially in `submit()`; on partial failure (metadata saved but reorder failed) the cache stayed stale. Both mutations now `onSettled: invalidate` so whatever step landed is reflected without manual refresh.
 								- **Backend vs front-end consistency on duplicate tests in a scenario** (`frontend/src/pages/AdminScenariosPage.tsx:227-231`): the backend allows the same `test_template` to appear multiple times (chained ops; the UNIQUE constraint is `(scenario_id, position)` not `(scenario_id, test_template_id)`), but the catalogue picker was filtering out already-picked items. Removed the filter — only soft-deleted tests are excluded now.
 								- **Test coverage closure** (`backend/tests/test_templates.py`): +4 pytest (tactic+technique AND-semantics, `extra="forbid"` rejection, empty `mitre_tags` explicit clear, 65-char tag length cap → 400). Total backend now 23 M5 tests + 39 elsewhere = 81 pass.
-												docs(m4): CHANGELOG, README, lessons, spec drift fix, todo tick

- CHANGELOG: added M4 section listing endpoints, CLI, volume, persisted
  settings, picker, and the post-spec-review fixes (custom-URL integrity
  requirement + /diag/reset consistency + spec drift). Includes the
  intentional decisions paragraph (seed-time download not image-baked, read
  endpoints unauthenticated-perm-wise, stdlib over httpx).
- README: status bumped to M0–M4, added MITRE quickstart (make seed-mitre +
  air-gapped path with --source /data/mitre/<file> + --skip-checksum),
  testing-m<N>.md pointer updated to testing-m4.md, roadmap line.
- tasks/spec.md §10 #4: amended "14 tactics Enterprise" → "≥14 tactics
  Enterprise (la v19 du pin actuel en ship 15)".
- tasks/lessons.md: 7 M4 lessons captured (stdlib STIX parsing, decoupling
  DoD asserts from upstream versions, subtechnique parent resolution, single-
  transaction safety, custom-URL footgun mitigation, /diag/reset consistency,
  named-volume permission caveat, podman build cache surprise).
- tasks/todo.md: M4 marked ☑.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 13:54:46 +02:00
+								### Added — M4 (MITRE ATT&CK Enterprise)
 								- **STIX 2.1 parser + upsert** (`app/services/mitre_seed.py`): stdlib-only (`urllib.request` + `hashlib`), pinned to Enterprise v19.0 (`enterprise-attack-19.0.json`, sha256 `df520ea0…`). Parses 25k+ STIX objects → 15 tactics, 222 techniques, 475 sub-techniques in ~1.1 s. Skips revoked + deprecated, resolves sub-technique parents via `relationship[subtechnique-of]` with a `T1003.001 → T1003` dotted-id fallback, copies kill-chain phases into the `mitre_technique_tactics` M2M.
 								- **CLI**: `flask metamorph seed-mitre [--source <path|url>] [--checksum-sha256 <hex>] [--skip-checksum]` (`app/cli.py`). `make seed-mitre` wraps it.
 								- **REST endpoints** (`app/api/mitre.py`):
 								  - `GET /api/v1/mitre/tactics`, `/mitre/techniques?tactic=…&q=…`, `/mitre/subtechniques?technique=…&q=…` (paginated, search on name/external_id).
 								  - `GET /api/v1/mitre/status` (last_sync, version, source_url, defaults).
 								  - `POST /api/v1/mitre/sync` (perm `mitre.sync`) — re-pull on demand.
 								- **Persisted metadata** in `settings`: `mitre_last_sync`, `mitre_version`, `mitre_source_url`.
 								- **Compose volume `metamorph_mitre`** mounted at `/data/mitre/` in the api container — caches the downloaded bundle across restarts. Owned by `metamorph:metamorph`.
 								- **Frontend**:
-												docs(m4): reconcile CHANGELOG + testing-m4 with the flat matrix + CR fixes

- CHANGELOG M4 Added: rewrote the frontend bullet to describe the actual
  flat ATT&CK matrix that ships (full-bleed, 15-col grid with minmax(7rem,
  1fr), name-only cells, ▸/▾ chevron). The original entry still described
  the abandoned 3-column drill-down picker.
- New "Fixed (post-M4 code-review pass)" subsection enumerating the six
  CR-driven fixes that landed in this branch (SSRF allowlist, advisory
  lock, typed contract, N+1 elimination, version clearing, error scrub +
  the test additions and e2e count pinning).
- DoD counts: 53 → 58 pytest, 34 e2e unchanged. testing-m4.md follows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 19:19:44 +02:00
+								  - `<MitreTagPicker>` component: flat ATT&CK matrix matching `attack.mitre.org/#` — full-bleed beyond `max-w-page`, 15 equal-width columns via `grid-template-columns: repeat(N, minmax(7rem, 1fr))`, sans-serif 12px, **name-only cells** (external_id surfaces on hover via `title` and in selection chips), `▸/▾` chevron expands sub-techniques inline within the column, multi-select with chip-removal at the top. Returns `MitreTag[]` (`kind`, `id`, `external_id`, `name`), ready for M5 templates.
 								  - `/mitre` showcase page with status card, admin-gated **Trigger sync** button, the picker, and a JSON `<pre>` preview of the current selection.
-												docs(m4): CHANGELOG, README, lessons, spec drift fix, todo tick

- CHANGELOG: added M4 section listing endpoints, CLI, volume, persisted
  settings, picker, and the post-spec-review fixes (custom-URL integrity
  requirement + /diag/reset consistency + spec drift). Includes the
  intentional decisions paragraph (seed-time download not image-baked, read
  endpoints unauthenticated-perm-wise, stdlib over httpx).
- README: status bumped to M0–M4, added MITRE quickstart (make seed-mitre +
  air-gapped path with --source /data/mitre/<file> + --skip-checksum),
  testing-m<N>.md pointer updated to testing-m4.md, roadmap line.
- tasks/spec.md §10 #4: amended "14 tactics Enterprise" → "≥14 tactics
  Enterprise (la v19 du pin actuel en ship 15)".
- tasks/lessons.md: 7 M4 lessons captured (stdlib STIX parsing, decoupling
  DoD asserts from upstream versions, subtechnique parent resolution, single-
  transaction safety, custom-URL footgun mitigation, /diag/reset consistency,
  named-volume permission caveat, podman build cache surprise).
- tasks/todo.md: M4 marked ☑.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 13:54:46 +02:00
+								  - Nav adds **MITRE** link for any logged-in user.
 								- **Testing**:
 								  - `backend/tests/test_mitre.py` — **12 pytest** (parser, idempotence, checksum mismatch, persisted settings, endpoint variants, perm enforcement) using a hand-crafted minimal STIX bundle (no network in tests).
 								  - `e2e/tests/m4-mitre.spec.ts` — **6 Playwright** against the live stack (calls `/mitre/sync` once in `beforeAll`).
 								  - `tasks/testing-m4.md`.
 								### Fixed (post-M4 spec-review pass)
 								- **Sync integrity guarantee**: `seed_mitre()` now refuses a custom URL without either `expected_sha256` or an explicit `allow_unverified=true`. Closes a "typo in `mitre_source_url` setting routes the seed to attacker JSON" footgun. CLI surfaces this via `--checksum-sha256` / `--skip-checksum`; API via `{"source", "expected_sha256", "allow_unverified"}` body.
 								- **`/diag/reset` consistency**: now truncates the `mitre_*` tables alongside `settings` so `GET /mitre/status` and `GET /mitre/tactics` agree after a reset (previously: catalogue rows persisted, but `mitre_last_sync` got wiped → status lied).
 								- **Spec drift §10 #4**: amended "14 tactics" → "≥ 14 tactics (v19 ships 15)" to reflect MITRE v8+ reality.
-												docs(m4): reconcile CHANGELOG + testing-m4 with the flat matrix + CR fixes

- CHANGELOG M4 Added: rewrote the frontend bullet to describe the actual
  flat ATT&CK matrix that ships (full-bleed, 15-col grid with minmax(7rem,
  1fr), name-only cells, ▸/▾ chevron). The original entry still described
  the abandoned 3-column drill-down picker.
- New "Fixed (post-M4 code-review pass)" subsection enumerating the six
  CR-driven fixes that landed in this branch (SSRF allowlist, advisory
  lock, typed contract, N+1 elimination, version clearing, error scrub +
  the test additions and e2e count pinning).
- DoD counts: 53 → 58 pytest, 34 e2e unchanged. testing-m4.md follows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 19:19:44 +02:00
+								### Fixed (post-M4 code-review pass)
 								- **SSRF allowlist on `/mitre/sync`**: host must be in `MITRE_ALLOWED_HOSTS` (defaults to `raw.githubusercontent.com`, comma-separated env override). Closes the "admin holding `mitre.sync` can pivot the api container at cloud metadata (`169.254.169.254`) or internal mirrors" vector. New `MitreSourceForbidden` exception → 400 with `source_forbidden` error code.
 								- **Concurrent sync race**: `seed_mitre()` now acquires `pg_advisory_xact_lock(hashtext('mitre.seed'))` at the top of the transaction so two `/mitre/sync` calls serialise cleanly across the `DELETE` + re-`INSERT` of `mitre_technique_tactics`.
 								- **Typed sync contract end-to-end**: Pydantic `SyncResultOut` on the backend (`app/api/mitre.py`) mirrored by a `MitreSyncResult` TS interface (`frontend/src/lib/mitre.ts`). The MitrePage mutation no longer uses an `as Record<string, unknown>` escape hatch.
 								- **N+1 in dotted sub-technique fallback**: pre-built `{external_id → id}` dict at function entry; was firing one extra SELECT per orphan (currently 0 with MITRE, but a latent footgun for partial bundles).
 								- **`SETTING_VERSION` cleared explicitly when source != default**: previously kept the stale pinned version after a custom-URL re-sync; now `_upsert_setting(..., None)` so `/mitre/status` doesn't lie.
 								- **Internal error scrub on `/mitre/sync`**: 500 responses no longer leak URLError / DB driver text via `str(e)` — stack lands in JSON logs only.
 								- **E2E pinned to exact MITRE v19 counts** (15/222/475/0 orphans) for parser-regression detection; previously `>=` thresholds could mask "revoked tactics silently included".
 								- **E2E uses `crypto.randomUUID()`** instead of `Math.random()` for unique test emails.
 								- **Test coverage for security guards**: `file://` rejection, disallowed HTTPS host, custom-URL-without-sha refusal, dotted-id fallback, version-clearing semantics — 5 new pytest covering paths the spec-review demanded but no test enforced.
-												docs(m4): CHANGELOG, README, lessons, spec drift fix, todo tick

- CHANGELOG: added M4 section listing endpoints, CLI, volume, persisted
  settings, picker, and the post-spec-review fixes (custom-URL integrity
  requirement + /diag/reset consistency + spec drift). Includes the
  intentional decisions paragraph (seed-time download not image-baked, read
  endpoints unauthenticated-perm-wise, stdlib over httpx).
- README: status bumped to M0–M4, added MITRE quickstart (make seed-mitre +
  air-gapped path with --source /data/mitre/<file> + --skip-checksum),
  testing-m<N>.md pointer updated to testing-m4.md, roadmap line.
- tasks/spec.md §10 #4: amended "14 tactics Enterprise" → "≥14 tactics
  Enterprise (la v19 du pin actuel en ship 15)".
- tasks/lessons.md: 7 M4 lessons captured (stdlib STIX parsing, decoupling
  DoD asserts from upstream versions, subtechnique parent resolution, single-
  transaction safety, custom-URL footgun mitigation, /diag/reset consistency,
  named-volume permission caveat, podman build cache surprise).
- tasks/todo.md: M4 marked ☑.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 13:54:46 +02:00
+								### Decisions (intentional)
 								- **Bundle "embarqué" interpreted as seed-time download + named-volume cache**, not "binary baked into the Docker image". Keeps the image ~150 MB, makes version bumps a constant edit, plays nicely with `make seed-mitre` re-runs. Air-gapped operators copy the file into the volume + pass `--source /data/mitre/<file>`.
 								- **Read endpoints unauthenticated-perm-wise but auth-required**: MITRE data is public reference material — no `mitre.read` perm. Status endpoint is similarly open (under `@require_auth`) to keep `/mitre/status` simple for the UI badge.
 								- **No `requests` / `httpx` dep added**: stdlib `urllib.request` is enough and avoids inflating the image.
 								### Validated end-to-end (M4 DoD)
 								- `make clean && make up && make migrate && make seed-mitre` → 15 tactics / 222 techniques / 475 sub-techniques / 254 links / 0 orphans / ~1.1 s.
-												docs(m4): reconcile CHANGELOG + testing-m4 with the flat matrix + CR fixes

- CHANGELOG M4 Added: rewrote the frontend bullet to describe the actual
  flat ATT&CK matrix that ships (full-bleed, 15-col grid with minmax(7rem,
  1fr), name-only cells, ▸/▾ chevron). The original entry still described
  the abandoned 3-column drill-down picker.
- New "Fixed (post-M4 code-review pass)" subsection enumerating the six
  CR-driven fixes that landed in this branch (SSRF allowlist, advisory
  lock, typed contract, N+1 elimination, version clearing, error scrub +
  the test additions and e2e count pinning).
- DoD counts: 53 → 58 pytest, 34 e2e unchanged. testing-m4.md follows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 19:19:44 +02:00
+								- `make test-api` → **58 pytest pass** (1 health + 8 schema + 15 auth + 15 RBAC + 19 MITRE) in ~5 s.
-												docs(m4): CHANGELOG, README, lessons, spec drift fix, todo tick

- CHANGELOG: added M4 section listing endpoints, CLI, volume, persisted
  settings, picker, and the post-spec-review fixes (custom-URL integrity
  requirement + /diag/reset consistency + spec drift). Includes the
  intentional decisions paragraph (seed-time download not image-baked, read
  endpoints unauthenticated-perm-wise, stdlib over httpx).
- README: status bumped to M0–M4, added MITRE quickstart (make seed-mitre +
  air-gapped path with --source /data/mitre/<file> + --skip-checksum),
  testing-m<N>.md pointer updated to testing-m4.md, roadmap line.
- tasks/spec.md §10 #4: amended "14 tactics Enterprise" → "≥14 tactics
  Enterprise (la v19 du pin actuel en ship 15)".
- tasks/lessons.md: 7 M4 lessons captured (stdlib STIX parsing, decoupling
  DoD asserts from upstream versions, subtechnique parent resolution, single-
  transaction safety, custom-URL footgun mitigation, /diag/reset consistency,
  named-volume permission caveat, podman build cache surprise).
- tasks/todo.md: M4 marked ☑.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 13:54:46 +02:00
+								- `make e2e` → **34 Playwright pass** (8 M0 + 4 M1 + 8 M2 + 8 M3 + 6 M4) in ~18 s.
 								- Spec-reviewer PASS after fixes applied.
-												feat(m0): bootstrap repo, design system, compose stack

- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-11 06:16:00 +02:00
+								### Added — M3 (RBAC: groups, permissions, users)
 								- **Permission catalogue** (`app/services/permissions_seed.py`): 31 atomic codes across 10 families (`user`, `group`, `invitation`, `test_template`, `scenario_template`, `mission`, `detection_level`, `setting`, `mitre.sync`). Seeded at boot **and** after `/setup` to handle a freshly truncated DB. Idempotent + additive on system groups (never removes a perm).
 								- **Default group bindings**: `admin` = all 31 codes; `redteam` = 8 (catalogue read + mission.{read,create,update,archive,write_red_fields} + detection_level.read); `blueteam` = 5 (catalogue read + mission.{read,write_blue_fields} + detection_level.read).
 								- **Users admin service + API** (`app/services/users.py`, `app/api/users.py`): list (q + is_active filter + pagination), get, patch (display_name/locale/is_active), soft-delete, set groups. Last-admin protection on update/delete/group-strip.
 								- **Groups admin service + API** (`app/services/groups.py`, `app/api/groups.py`): full CRUD with system-group protection (no rename, no delete), `PUT /groups/{id}/permissions` for the bindings. Admin system group's perm set is locked to "every perm" (preserves the bypass invariant).
 								- **Permissions read-only API** (`app/api/permissions.py`): `GET /permissions` returns the catalogue (admin or `group.read` holders).
 								- **Frontend admin pages** (`frontend/src/pages/Admin{Users,Groups,Invitations}Page.tsx`): list + edit modals using TanStack Query mutations, multi-select for perms grouped by family, copy-once invitation URL display.
 								- **Frontend chrome** (`Layout.tsx` + `RequireAdmin.tsx`): admin nav links shown only when `is_admin === true`; direct navigation to `/admin/*` by non-admins redirects to `/`. Server remains the arbiter.
 								- **`/diag/reset` now clears the rate-limit counters** so the Playwright suite can iterate without hitting `10/min/IP` budgets across spec files. Gated to non-prod environments only.
 								- **Testing**:
 								  - `tests/test_rbac.py` — **15 pytest integration tests** (39 backend total).
 								  - `e2e/tests/m3-rbac.spec.ts` — **8 Playwright tests** covering DoD §10 #2/#3 (28 e2e total).
 								  - `tasks/testing-m3.md` — manual + automated procedure.
 								- **Frontend api helpers**: `apiPatch`, `apiPut`, `apiDelete` added to `frontend/src/lib/api.ts`.
 								### Fixed (post-M3 spec-review pass)
 								- **Rate-limit scope clarified**: `app/core/rate_limit.py` now enables the limiter for `APP_ENV in ("prod", "staging")` instead of `prod` only — a public staging deployment without auth limits would be surprising. Dev/test stay unthrottled for Playwright ergonomics. Spec §6 NF-security applies to operator-facing deployments.
 								- **Admin perm invariant**: `set_group_permissions` refuses to alter the admin system group's perm set to anything other than the full catalogue (`SystemGroupProtected` → 409). The decorator bypass relies on `is_admin = "admin" in group_names`, but a future refactor could move to a perm-based check, so we keep the invariant.
 								- **LogRecord field collision**: `log.info("...", extra={"name": g.name})` raised `KeyError: "Attempt to overwrite 'name' in LogRecord"` because Python's logger reserves `name`. Renamed to `group_name`. Audited all other `extra=` payloads in `app/api/`+`app/services/` for the same trap.
 								### Validated end-to-end (M3 DoD)
 								- `make clean && make up && make migrate` → boot logs show `metamorph.permissions.seeded {perms_created: 31, perms_total: 31, bindings: {admin: 31, redteam: 8, blueteam: 5}}`.
 								- `make test-api` → **39 pytest pass** (1 health + 8 schema + 15 auth + 15 RBAC) in ~4 s.
 								- `make e2e` → **28 Playwright pass** (8 M0 + 4 M1 + 8 M2 + 8 M3) in ~16 s.
 								- Spec-reviewer pass: PASS verdict, 2 minor fixes applied (above), 2 anticipations noted for M12/M14 (no current action).
 								### Added — M2 (Auth, bootstrap, invitations)
 								- **Crypto plumbing**: `app.core.security` (Argon2id `time_cost=2 memory_cost=64MiB parallelism=2`, opaque-token SHA-256 helpers), `app.core.jwt_tokens` (HS256, claims `iss/sub/type/jti/iat/exp`, access 1h / refresh 30d).
 								- **Auth services** (`app.services.auth`): login, refresh **with token rotation + reuse-detection chain revoke**, logout (idempotent), change_password (forces logout-all).
 								- **Invitation services** (`app.services.invitations`): create, preview, accept, revoke. Token persisted only as SHA-256, default 7-day TTL.
 								- **Bootstrap** (`app.services.bootstrap` + `app.core.install_token`): seeds 3 system groups (`admin`/`redteam`/`blueteam`), mints a one-shot install token at first boot when `users` is empty, logs a banner with the raw token. CLI `flask --app app.cli metamorph print-install-token [--force]`.
 								- **Auth middleware** (`app.core.auth_decorators`): `@require_auth` populates `g.current_user`; `@require_perm("...")` checks atomic permissions; admin group bypasses the check (atomic perms land in M3).
 								- **API endpoints**:
 								  - `POST /api/v1/setup` (consume install token, create 1st admin) + `GET /api/v1/setup` (status).
 								  - `POST /api/v1/auth/login` + `POST /auth/refresh` + `POST /auth/logout` + `GET /auth/me` + `POST /auth/change-password`.
 								  - `POST /api/v1/invitations` (admin) + `GET /invitations` + `GET /invitations/preview/<token>` + `POST /invitations/accept/<token>` + `POST /invitations/<id>/revoke`.
 								  - `POST /api/v1/diag/reset` (test-only kill switch — wipes auth tables + mints fresh install token; only available in `dev`/`test`).
 								- **Rate limiting** (`flask-limiter`): 10/min/IP on `/auth/login`, `/auth/refresh`; 5/min on `/auth/change-password` and `/setup`; 10–20/min on invitation endpoints. Globally disabled when `APP_ENV=test`.
 								- **Refresh cookie** `metamorph_refresh`: HttpOnly + Secure + SameSite=Strict + Path=`/api/v1/auth/`.
 								- **Frontend auth state** (`frontend/src/lib/{api,auth}.ts`): access token in module memory, refresh in cookie, automatic 401-retry via `/auth/refresh` with reentrancy guard. `useAuth()` hook + `<RequireAuth>` route guard.
 								- **Frontend pages**: `/login`, `/setup`, `/register?token=…`, `/profile` (with change-password form), all in RTOps design. Protected layout: nav shows email + Logout when authenticated, Login + Setup links when not.
 								- **Frontend deps**: `@tanstack/react-query`, `react-router-dom`. Tanstack provider in `App.tsx` (will carry actual queries from M3+).
 								- **Email validation** (`app.api._validation.Email`): permissive RFC-shape regex that accepts internal TLDs (`.local`, `.corp`) — `pydantic.EmailStr` was too strict for red-team labs.
 								- **Testing**:
 								  - `tests/test_auth_flow.py` — **15 pytest integration tests** (24 backend total with M0/M1).
 								  - `e2e/tests/m2-auth.spec.ts` — **8 Playwright tests** covering setup → login → me → invitation → register → 2nd login → RBAC 403 → refresh rotation → logout (20 e2e total).
 								  - `tasks/testing-m2.md` — manual + automated procedure.
 								### Fixed (post-M2 spec-review pass)
 								- **Refresh cookie `Secure=True`** unconditionally (`backend/app/api/auth.py`). Modern browsers treat `localhost` as a secure context, so dev/test still works. Closes the silent-degradation found by the reviewer.
 								- **`/auth/refresh` rate-limit lowered to 10/min/IP** (`backend/app/api/auth.py`) to match spec §M2 ("10 req/min/IP on `/auth/*`").
 								- `/diag/reset` kept allowed in `dev` *and* `test` (a `make e2e` against a `make up` dev stack must be able to reset). Added a WARNING log when triggered in `dev` and a clear docstring; production envs (`prod`/`staging`) remain locked out.
 								### Known scope-creep (intentional, not retracted)
 								- Rate-limits on `/setup` (5/min), `/invitations/preview` (20/min), `/invitations/accept` (10/min) and `/auth/change-password` (5/min) were added in M2 even though §M2 only mandated `/auth/*`. Defensible (these are abuse-attractor endpoints), and noted here so M14 doesn't double-spec them.
 								### Added — M1 (DB schema & migrations)
 								- **23 tables** + `alembic_version` covering auth/RBAC (8), MITRE (4), templates (4), missions (6), evidence (1), settings/detection-levels (2), notifications (1).
 								- SQLAlchemy 2.x declarative models with Mapped[]/mapped_column(), grouped under `backend/app/models/{auth,mitre,template,mission,evidence,setting,notification}.py`.
 								- Alembic init: `alembic.ini`, `alembic/env.py` reading `app.core.config.settings.database_url`, `alembic/script.py.mako`, naming convention `pk_/fk_/ck_/uq_/ix_` enforced via `MetaData(naming_convention=...)` on `app.db.base.Base`.
 								- Reusable mixins in `app.db.mixins`: `UuidPkMixin` (uuid4 server-side), `TimestampMixin` (created_at/updated_at, server-default + onupdate), `SoftDeleteMixin` (deleted_at, no auto-injected index — declared explicitly per table to avoid mixin-vs-class `__table_args__` clobbering).
 								- Postgres-specific features used: `JSONB` for `settings.value` and `notifications.payload`; native `Uuid` columns; partial indexes (`WHERE deleted_at IS NULL` on 9 tables; `WHERE read_at IS NULL` on `notifications`); CHECK constraints for status/state/opsec_level/mitre_kind enums; `exactly_one_mitre_fk` CHECK on `test_template_mitre_tags`.
 								- **`mission_test_mitre_tags` deliberately denormalised** (no FK to `mitre_*` tables): copies `mitre_external_id`, `mitre_name`, `mitre_url` at tag time so a later MITRE re-sync that drops an entry cannot purge a mission's tags. Companion `test_template_mitre_tags` keeps FKs since templates are editable. (Spec §11 risk addressed.)
 								- Backend `pyproject.toml` deps: SQLAlchemy ≥2, Alembic ≥1.13, psycopg[binary] ≥3.1.
 								- New Makefile targets: `migrate`, `migrate-down`, `migrate-revision MSG=…`, `migrate-status`. The Dockerfile now ships `alembic.ini` + `alembic/` so the api container can run migrations directly.
 								- **Test stage in `backend/Dockerfile`** (`--target test`): runtime image + dev extras + `tests/` dir. New `make test-api` target spins an ephemeral container against the live DB on the compose network. Backend tests no longer require any local Python toolchain.
 								- `tests/test_schema.py` (8 integration tests + the existing M0 health test = 9 total): expected tables, expected timestamp/soft-delete columns, partial-index presence, expected FK pairs, expected CHECK constraints, alembic-at-head, and a negative INSERT proving the `exactly_one_mitre_fk` CHECK fires.
 								- `tasks/testing-m1.md` — manual + automated verification procedure.
 								### Fixed (post-M1 spec-review pass)
 								- Soft delete now consistent across snapshot-bearing tables: `mission_scenarios`, `mission_tests`, `mission_categories` gained `SoftDeleteMixin` + their `ix_<table>_active` partial index (M12 trash bin depends on this).
 								- `evidence_files` gained `TimestampMixin` (`created_at`/`updated_at`) on top of the domain `uploaded_at` (audit minimal everywhere, per M1 brief).
 								- `mission_members` gained `TimestampMixin`, replacing the bespoke `added_at` column.
 								- `scenario_template_tests` PK refactored to a UUID + `UNIQUE(scenario_template_id, position)` so the same test can appear at multiple positions in a scenario (chained operations).
 								- `SoftDeleteMixin.__table_args__` removed (silently clobbered by class `__table_args__`); each soft-delete table now declares `ix_<table>_active` explicitly. Documented in the mixin's docstring.
 								- `mission_test_mitre_tags` schema redesigned to denormalise MITRE labels (see "Added" entry above).
 								- Migration 0001 regenerated end-to-end after these fixes — `24765a5014b6` is the new HEAD.
 								### Validated end-to-end (M1 DoD)
 								- `make clean && make up && make migrate` from a vide DB → 27 tables, 32 FK, 9 CHECK, 14 UQ, 12 partial indexes.
 								- `make test-api` → **9 pytest pass** (1 health + 8 schema integration) in <1 s.
 								- `make e2e` → **12 Playwright pass** (8 M0 smoke + 4 M1 db visibility) in 3 s.
 								### Added (M1 visibility)
 								- New API endpoint `GET /api/v1/diag/db` exposes `alembic_revision` (short-hashable) and the public-schema `table_count`. Returns 503 with `{"reachable": false}` when Postgres is down.
 								- New `Database` card on the SPA home page consumes that endpoint, renders the revision short-hash and the count next to the existing `API` and `Roadmap` cards.
 								- Footer updated to `M0 bootstrap · M1 db schema`. Roadmap card now points to `M2 — Auth + JWT`.
 								- New e2e suite `e2e/tests/m1-db.spec.ts` (4 tests) covers the diag endpoint contract, the Database card rendering, and the footer/roadmap labels.
 								### Added — M0 (bootstrap)
 								- Repo scaffolding: `.gitignore`, `.env.example`, `Makefile`, `docker-compose.yml`, `README.md`, `CHANGELOG.md`.
 								- `docker-compose.yml` with three services: `db` (postgres:16-alpine, no host port), `api` (Flask 3, port 8000), `front` (nginx serving the Vite bundle, port 80).
 								- Named volumes `metamorph_db` and `metamorph_evidence` for data persistence.
 								- Backend skeleton: Flask app factory, JSON structured logging on stdout, `GET /api/v1/health` endpoint, multi-stage Dockerfile, `pyproject.toml` driven by `uv`.
 								- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps design tokens (`tasks/design.md`) translated into `tailwind.config.ts`, base UI primitives (`Card`, `Tag`, `SectionHeader`, `FlowNode`, `Button`), home page wired to `/api/v1/health`.
 								- Multi-stage frontend Dockerfile that builds the bundle and serves it via nginx, proxying `/api/*` to the api container.
 								- Pre-commit hook config: `ruff` for backend, `eslint` + `tsc --noEmit` for frontend.
 								### Validated
 								- `docker compose config` parses (validated via `pyyaml` since Docker is not installed in the dev shell).
 								- Every env var referenced by the compose file is documented in `.env.example`.
 								- All Python source files parse cleanly (`ast.parse`).
 								- All TS/JSON config files parse cleanly.
 								### Notes
 								- TLS termination is delegated to an external reverse proxy (per spec §6 NF-network). The compose stack exposes plain HTTP on `HOST_FRONT_PORT` (8080) and `HOST_API_PORT` (8000).
 								- The first-admin bootstrap token (M2) will be printed to the api container's stdout on first boot when the `users` table is empty.
 								- `tasks/spec.md` and `tasks/todo.md` remain authoritative; update them before changing scope.
 								### Fixed (M0 DoD validation pass on real podman)
 								- **FQDN image references** in `docker-compose.yml`, `backend/Dockerfile`, `frontend/Dockerfile`. Podman on Fedora enforces `short-name-mode=enforcing` for pulls (no TTY ⇒ no prompt ⇒ failure). Replaced `postgres:16-alpine` / `python:3.12-slim` / `node:20-alpine` / `nginx:1.27-alpine` with their `docker.io/library/…` qualified equivalents. Docker accepts the same prefix transparently.
 								- **`*.md` removed from `backend/.dockerignore` and `frontend/.dockerignore`**: `pyproject.toml` declared `readme = "README.md"`, but the file was being filtered out of the build context, so `hatchling.build.build_wheel` raised `OSError: Readme file does not exist: README.md`. Also removed the `readme` field itself from `pyproject.toml` to decouple the build from the doc.
 								- **`Card.tsx` type clash**: `CardProps extends HTMLAttributes<HTMLDivElement>` redefined `title` as `ReactNode`, but the native `title` is `string`. `tsc -b` failed with TS2430 during `vite build`. Switched to `Omit<HTMLAttributes<HTMLDivElement>, 'title'>`.
 								- **Explicit healthchecks added to compose `api` and `front`**: podman-compose 1.x doesn't surface healthchecks declared only in the `Dockerfile` via `inspect`. Mirroring them in `docker-compose.yml` makes `make inspect-health` actually see `healthy/unhealthy/starting` on every engine.
 								- **Suppressed `podman compose` external-provider banner** via `PODMAN_COMPOSE_WARNING_LOGS=false` exported from the Makefile.
 								### Validated end-to-end on podman 5.x (Fedora 43)
 								- `make up` → 3 containers, all 3 healthy after start_period.
 								- `make health` → `{"status":"ok","version":"0.1.0"}` via the front nginx proxy (port 8080) and direct API (port 8000).
 								- `make logs-api` → JSON-structured lines on stdout (`ts`, `level`, `logger`, `message`, custom fields).
 								- `make e2e` → **8/8 Playwright tests pass** in 2.5 s. Reports: `e2e/playwright-report/index.html` (529 KB, autoportant) + `junit.xml` (`tests=8 failures=0 skipped=0 errors=0`).
 								### Added (engine portability)
 								- Makefile auto-detects **docker** or **podman** at runtime and selects the matching compose driver (`docker compose`, `podman compose`, or legacy `podman-compose`). Override via `ENGINE=…` and/or `COMPOSE="…"`.
 								- New targets: `engine` (print detected runtime), `volumes` (list project-named volumes), `inspect-health` (health status of all 3 containers), `logs-api` (tail just the api), `health` (single curl probe). All engine-agnostic.
 								- `make help` now prints the active engine + compose driver in its footer.
 								- `tasks/testing-m0.md` and `README.md` rewritten to be engine-agnostic — raw `docker logs` / `docker volume ls` / `docker inspect` calls replaced with the new make targets.
 								### Added (M0 testing)
 								- `e2e/` Playwright project with chromium, HTML + JUnit XML reporters, traces / screenshots / videos kept on retry. Reports land in `e2e/playwright-report/`.
 								- `e2e/tests/m0-smoke.spec.ts` — 8 smoke tests covering the front rendering, the API proxy, the design tokens, the absence of any runtime CDN traffic (spec §7), and the CORS contract.
 								- Makefile targets `e2e-install`, `e2e`, `e2e-report`, `e2e-up`, `wait-healthy`.
 								- `tasks/testing-m0.md` — step-by-step manual + automated verification procedure for M0.
 								- Convention added to `tasks/todo.md`: every milestone N delivers `tasks/testing-m<N>.md` + at least one `e2e/tests/m<N>-*.spec.ts`, and the spec-reviewer subagent runs before marking the milestone done.
 								### Fixed (post-M0 spec-review pass)
 								- `.pre-commit-config.yaml` added at repo root: ruff + ruff-format on backend, eslint + tsc --noEmit + prettier --check on frontend, plus baseline whitespace/JSON/private-key checks. Documented `pre-commit install` in `README.md`.
 								- Self-hosted webfonts via `@fontsource/jetbrains-mono` and `@fontsource/ibm-plex-sans` (imported in `frontend/src/index.css`); dropped the Google Fonts `<link>` from `frontend/index.html` to honor spec §7 ("no runtime CDN").
 								- Refuse-to-boot guard in `backend/app/core/config.py`: when `APP_ENV != "dev"`, defaults / placeholders for `JWT_SECRET` and `POSTGRES_PASSWORD` raise at startup. New `APP_ENV` env var documented in `.env.example`, `README.md`, and `docker-compose.yml`.
 								- `make dev` now runs `dev-api` and `dev-front` in parallel via `make -j2` instead of just printing a hint.
 								- Removed dead `database_url` property from `Settings` (will be reintroduced in M1 with the SQLAlchemy/Alembic stack).
 								- Pinned Node engines to `>=20` in `frontend/package.json`.
 								- Reconciled M0 DoD wording in `tasks/todo.md` (HTTP via `HOST_FRONT_PORT`, with explicit note that prod TLS is external).
 								- Documented the `2xs/3xs/4xs` font-size aliases in `frontend/tailwind.config.ts` against the design.md §3 scale.