Knacky 00b7557e30 feat(m6): missions + snapshot CRUD, membership visibility, status state machine

Adds the mission layer that materialises template snapshots, plus the SPA
list / 3-step wizard / detail page.

Backend:
- app/services/missions.py — create_mission snapshots scenarios, tests, MITRE
  tags in a 4-query write; list/get apply a non-admin membership filter that
  collapses to 404 (no existence leak); status state machine enforces
  draft → in_progress → completed → archived with archived as a sink; the
  non-admin creator is auto-added as role_hint='red' to retain visibility.
- app/api/missions.py — 8 endpoints (list, get, create, update, add
  scenarios, set members, transition, soft-delete) with strict pydantic
  schemas. The transition endpoint splits the perm gate manually so
  archive requires mission.archive while other targets use mission.update.
- app/api/users.py — new GET /users/roster returning (id, email,
  display_name) only, gated by user.read OR mission.create OR
  mission.update — lets non-admin wizard users see assignable peers
  without exposing the admin /users payload.
- app/api/diag.py — /diag/reset truncates the mission_* tables before the
  template tables because the source_*_template_id FKs are ON DELETE SET
  NULL, which is cheaper to short-circuit by removing the children first.

Frontend:
- lib/missions.ts — typed client, queryKey factory, status accent map.
- pages/MissionsListPage.tsx — list cards with status accent + filters
  (q, client, status).
- pages/MissionsCreatePage.tsx — 3-step wizard (meta → scenarios → members)
  with member roster fed by /users/roster.
- pages/MissionDetailPage.tsx — header + transition buttons (legal next
  states only) + Tests/Members/Synthesis/Export tabs.
- Routes + nav entry (visible to anyone with mission.read or admin).

Tests:
- backend/tests/test_missions.py — 22 pytest covering snapshot fidelity,
  MITRE propagation, membership visibility, transition state machine,
  perm gating, member set replace, append scenarios, soft-delete, partial
  update, inverted-date rejection.
- e2e/tests/m6-missions.spec.ts — 5 Playwright (snapshot freezing, non-admin
  visibility, status transitions + 409, SPA wizard end-to-end, list filter).

Docs:
- CHANGELOG, tasks/testing-m6.md, tasks/lessons.md (snapshot tradeoffs,
  membership=404 pattern, /diag/reset order, auto-creator add).
- README + tasks/todo.md updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-13 15:07:32 +02:00

Changelog

All notable changes to this project will be documented here. Format: Keep a Changelog · Conventional Commits.

[Unreleased]

Added — M6 (Missions & snapshot)

CRUD missions (app/services/missions.py + app/api/missions.py):
- Fields: name, client_target, date_start, date_end, status (draft/in_progress/completed/archived), description (markdown), visibility_mode (frozen to whitebox in v1).
- On creation/append, the service snapshots the selected scenario_templates and all their test_templates into mission_scenarios / mission_tests (every template field — including OPSEC level, tags, expected IOCs, MITRE tags). The denormalised mission_test_mitre_tags table copies external_id, name, url so a later MITRE re-sync that drops the entry can't alter a mission's tags (spec §11).
- source_*_template_id FKs survive template soft-deletes (ON DELETE SET NULL); the mission's frozen content is unaffected.
- Membership visibility: non-admin viewers see only missions where they are a mission_members row. The service maps "not visible" → 404 (no existence leak via 403). Admins bypass via the admin group.
- Status state machine: draft → in_progress → completed → archived; archived → ∅. The transition endpoint accepts the target status, validates the move, and rejects invalid jumps with 409. A transition to the current status (target = current) is an idempotent no-op that returns 200.
- Auto-creator-membership: a non-admin caller of POST /missions is auto-added as role_hint='red' if not already in the members[] payload — so they retain visibility on the mission they just created.
- REST: GET/POST /missions, GET/PUT/DELETE /missions/{id}, POST /missions/{id}/scenarios (append snapshots at the end), PUT /missions/{id}/members (replace set), POST /missions/{id}/transition.
- Filters on list: q (LIKE on name/description), status, client (LIKE on client_target). include_deleted=true is admin-only (403 otherwise).
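
The status state machine described above can be sketched as a small transition table. This is a minimal illustration, not the code in app/services/missions.py: it assumes the chain is strictly linear, and the names `Status`, `LEGAL_NEXT`, `InvalidTransition` and `transition` are hypothetical.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    ARCHIVED = "archived"


# Each status maps to its legal next states; archived is a sink.
LEGAL_NEXT = {
    Status.DRAFT: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.COMPLETED},
    Status.COMPLETED: {Status.ARCHIVED},
    Status.ARCHIVED: set(),
}


class InvalidTransition(Exception):
    """Hypothetical error that the API layer would map to HTTP 409."""


def transition(current: Status, target: Status) -> Status:
    if target == current:
        return current  # idempotent no-op, reported as 200
    if target not in LEGAL_NEXT[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

The same table could also drive the detail page's "legal next states only" buttons, since `LEGAL_NEXT[current]` is exactly the set to render.
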
GET /users/roster (app/api/users.py): a deliberately minimal listing — id, email, display_name of active users only — accessible to any holder of user.read, mission.create, or mission.update. Lets a non-admin red teamer populate the wizard's member picker without exposing the admin-grade /users endpoint (which leaks is_admin, is_active, group memberships).
Frontend:
- lib/missions.ts — typed client + queryKey factory + status accent map + filter query-string builder.
- pages/MissionsListPage.tsx — list cards (one per mission) with status accent, scenario/test/member counts, date range, plus filters (q, client, status).
- pages/MissionsCreatePage.tsx — 3-step wizard: metadata → scenario picker → member roster (red/blue toggles + auto-include the non-admin creator). Submits via POST /missions and redirects to the detail page.
- pages/MissionDetailPage.tsx — header with transition buttons (only the legal next states are rendered), soft-delete with confirm prompt, and 4 tabs: Tests (table of snapshotted tests with MITRE tags, OPSEC, state), Members (role-coloured pills), Synthesis (placeholder for M10), Export (placeholder for M11).
- Nav adds Missions link visible to anyone with mission.read or admin.
/diag/reset truncates the mission tables before the template tables — mission_scenarios.source_scenario_template_id and mission_tests.source_test_template_id are ON DELETE SET NULL, so wiping missions first avoids the round-trip through the null-update path.
Testing:
- backend/tests/test_missions.py — 22 pytest covering snapshot fidelity (rename source template after snapshot → mission unchanged), MITRE tag propagation, membership-based 404, perm gating (create vs read vs archive), status transition chain + invalid jumps (409), member set replace + role-hint validation, scenario append at correct position, soft-delete, partial metadata update, inverted-date rejection, admin-only include_deleted.
- e2e/tests/m6-missions.spec.ts — 5 Playwright (snapshot freezing, membership visibility for non-admin red, status transition + 409, SPA wizard end-to-end, SPA list + status filter).
- tasks/testing-m6.md.
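
Two rules from this section, membership visibility collapsing to 404 and auto-creator membership, can be sketched together with an in-memory store. This is an illustration under assumptions: the real service works against Postgres, and `Mission`, `NotFound`, `create_mission` and `get_mission` are hypothetical names.

```python
from dataclasses import dataclass, field


class NotFound(Exception):
    """Hypothetical error that the API layer would map to HTTP 404."""


@dataclass
class Mission:
    id: str
    members: dict = field(default_factory=dict)  # user_id -> role_hint


def create_mission(missions, mission_id, *, creator_id, is_admin, members=None):
    members = dict(members or {})
    if not is_admin:
        # Keep the creator visible on their own mission unless already listed.
        members.setdefault(creator_id, "red")
    missions[mission_id] = Mission(mission_id, members)
    return missions[mission_id]


def get_mission(missions, mission_id, *, viewer_id, is_admin):
    mission = missions.get(mission_id)
    if mission is None:
        raise NotFound(mission_id)
    if not is_admin and viewer_id not in mission.members:
        # Same error as a missing mission, so existence is not leaked.
        raise NotFound(mission_id)
    return mission
```

Raising one exception type for both cases means a caller cannot distinguish "does not exist" from "exists but not yours".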

Added — M5 (Test & scenario templates)

CRUD test_templates (app/services/test_templates.py + app/api/test_templates.py):
- Fields: name, description, objective, procedure (markdown), prerequisites (markdown), expected result red, expected detection blue, OPSEC level (low/medium/high), free tags (TEXT[]), expected IOCs (TEXT[]).
- Polymorphic MITRE tag set ((kind, external_id) ↔ exactly one of tactic_id/technique_id/subtechnique_id). The wire payload uses ATT&CK external IDs — server resolves to UUIDs.
- Filters: q (LIKE on name/description), tactic/technique/subtechnique (joined via subquery on the polymorphic tag table), opsec, tag (array contains).
- REST: GET /test-templates, GET /test-templates/{id}, POST /test-templates, PUT /test-templates/{id} (partial, with explicit _UNSET sentinel so omitted fields stay untouched), DELETE /test-templates/{id} (soft).
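
The `_UNSET` sentinel mentioned for the partial PUT distinguishes "field omitted" from "field explicitly set to null". A minimal sketch of the pattern, assuming a simplified record; `TemplateRecord` and `update_template` are hypothetical and the real field set is larger:

```python
from dataclasses import dataclass, replace
from typing import Any, Optional

_UNSET: Any = object()  # sentinel: "not supplied", distinct from None


@dataclass(frozen=True)
class TemplateRecord:
    name: str
    description: Optional[str] = None
    opsec_level: str = "low"


def update_template(tpl, *, name=_UNSET, description=_UNSET, opsec_level=_UNSET):
    candidates = {
        "name": name,
        "description": description,
        "opsec_level": opsec_level,
    }
    # Only fields the caller actually passed are applied; None is a real value.
    supplied = {k: v for k, v in candidates.items() if v is not _UNSET}
    return replace(tpl, **supplied)
```

With this shape, `update_template(tpl, description=None)` clears the description, while `update_template(tpl, name="x")` leaves it untouched.
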
CRUD scenario_templates (app/services/scenario_templates.py + app/api/scenario_templates.py):
- Ordered list of test_templates with position (UNIQUE scenario_template_id, position).
- Reorder via full replace: PUT /scenario-templates/{id}/tests deletes the join rows and re-inserts at positions 0..N-1 — clean atomic op that respects the UNIQUE constraint without a 2-phase position shuffle.
- The same test can appear multiple times (chained operations).
- REST: GET/POST/PATCH (metadata) / DELETE (soft) on /scenario-templates.
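
The full-replace reorder can be illustrated with SQLite from the standard library standing in for Postgres. This is a sketch under assumptions: the table is reduced to three columns, and `make_db`, `set_scenario_tests` and `list_tests` are hypothetical helpers, not the service's functions.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE scenario_template_tests (
            scenario_template_id TEXT NOT NULL,
            test_template_id TEXT NOT NULL,
            position INTEGER NOT NULL,
            UNIQUE (scenario_template_id, position)
        )
        """
    )
    return conn


def set_scenario_tests(conn, scenario_id, test_ids):
    # One transaction: delete every join row, then re-insert at 0..N-1.
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "DELETE FROM scenario_template_tests WHERE scenario_template_id = ?",
            (scenario_id,),
        )
        conn.executemany(
            "INSERT INTO scenario_template_tests VALUES (?, ?, ?)",
            [(scenario_id, tid, pos) for pos, tid in enumerate(test_ids)],
        )


def list_tests(conn, scenario_id):
    rows = conn.execute(
        "SELECT position, test_template_id FROM scenario_template_tests "
        "WHERE scenario_template_id = ? ORDER BY position",
        (scenario_id,),
    )
    return rows.fetchall()
```

Because positions are never updated in place, no intermediate state can collide with the UNIQUE constraint, and the same test id may occupy several positions.
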
Frontend:
- lib/templates.ts — typed client + queryKey factory.
- pages/AdminTestsPage.tsx — list + filters (q, tactic, opsec, tag) + modal with full field set + embedded <MitreTagPicker> for tags.
- pages/AdminScenariosPage.tsx — list + modal with @dnd-kit/sortable vertical drag-and-drop on the ordered test list. New deps: @dnd-kit/core, @dnd-kit/sortable, @dnd-kit/utilities.
- components/MarkdownField.tsx — lean textarea with markdown hint (no heavy editor dep; rendering happens at display time in M7).
- Nav adds Tests and Scenarios links (admin-gated).
/diag/reset truncates the 4 new tables before the MITRE block — the scenario_template_tests.test_template_id FK is ON DELETE RESTRICT, so the order matters.
Testing:
- backend/tests/test_templates.py — 19 pytest (create/list/filter by tactic+opsec+tag, MITRE tag resolution + replacement on update, soft-delete, perm gating, scenario create+reorder+delete, soft-deleted test linking semantics).
- e2e/tests/m5-templates.spec.ts — 4 Playwright (API CRUD round-trip, scenario reorder, SPA list + opsec filter, SPA scenario list rendering with ordered tests).
- tasks/testing-m5.md.

Fixed (M5 implementation)

LogRecord key collision: log.info(..., extra={"name": ...}) raises KeyError("Attempt to overwrite 'name' in LogRecord") because name is reserved by Python's stdlib logging. Renamed to template_name.
React currentTarget null in deferred state updaters: onChange={(e) => setX((prev) => ({ ...prev, q: e.currentTarget.value }))} blanked the page on the first user input because currentTarget is cleared after the listener bubble ends, before React invokes the updater. Switched all M5 handlers to e.target.value, which persists on the synthetic event.
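
The LogRecord collision from this section is easy to reproduce with the standard library alone. A minimal sketch; the logger name and the `log_created` helper are hypothetical:

```python
import logging

log = logging.getLogger("metamorph.demo")
log.addHandler(logging.NullHandler())
log.setLevel(logging.INFO)


def log_created(template_name: str, *, safe: bool) -> None:
    # "name" is already an attribute of every LogRecord (the logger's name),
    # so passing it through extra= raises KeyError when the record is built.
    key = "template_name" if safe else "name"
    log.info("template created", extra={key: template_name})
```

Calling `log_created("x", safe=False)` raises `KeyError`, while `safe=True` logs normally. Other reserved attributes such as `message`, `asctime`, `levelname` or `module` fail the same way.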

Fixed (post-M5 — scenario reorder 500 + cross-worker lock correctness)

PUT /scenario-templates/{id}/tests returned 500 (backend/app/services/scenario_templates.py:218): the two-argument form pg_advisory_xact_lock(:n, :m) failed with function pg_advisory_xact_lock(smallint, bigint) does not exist. Postgres only provides (int4, int4) and (bigint) overloads — psycopg promoted m = hash(uuid) & 0xFFFFFFFF (up to 2^32-1) to bigint and there's no matching overload. Switched to the single-argument bigint form with CAST(:key AS bigint).
Cross-worker lock was a no-op (same site): Python's built-in hash() is randomised per process via PYTHONHASHSEED, so each gunicorn worker computed a different key for the same scenario_id, and concurrent reorders on different workers acquired independent locks — defeating the serialisation. Replaced with blake2b(scenario_id.bytes, digest_size=8) interpreted as a signed int64. Stable, deterministic, fits in bigint.
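
A sketch of the stable lock-key derivation described above. The function name `advisory_lock_key` and the big-endian byte order are assumptions; the changelog only states blake2b with an 8-byte digest read as a signed int64.

```python
import hashlib
import uuid


def advisory_lock_key(scenario_id: uuid.UUID) -> int:
    # blake2b does not depend on PYTHONHASHSEED, so every worker process
    # derives the same key for the same scenario id.
    digest = hashlib.blake2b(scenario_id.bytes, digest_size=8).digest()
    # Signed 64-bit interpretation fits Postgres bigint.
    return int.from_bytes(digest, "big", signed=True)
```

The resulting integer is what would be bound to `:key` in `pg_advisory_xact_lock(CAST(:key AS bigint))`.
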

Modal box capped its width at max-w-2xl and had no vertical scroll (frontend/src/components/ui/Modal.tsx): opening + New test rendered the 15-column MITRE matrix inside a 672 px frame with no height cap, so the matrix spilled to the right and the form bottom dropped below the viewport — buttons unreachable, no scroll. Added a size prop (default 2xl for back-compat), max-h-[calc(100vh-2rem)] + flex flex-col on the dialog, and an inner min-w-0 flex-1 overflow-y-auto body so the header stays pinned while the form scrolls inside the modal.
MITRE matrix overflow-x failed to scroll inside the modal body (frontend/src/components/MitreTagPicker.tsx): overflow-x-auto sat directly on the grid element, but the grid's intrinsic min-width (15 × minmax(7rem, …) = 1680 px) prevented it from shrinking below its content, so the grid spilled outside its parent instead of scrolling. Wrapped the grid in a dedicated overflow-x-auto rounded min-w-0 w-full scroller and added min-w-0 to the picker root so the constraint propagates from the modal body. The grid now scrolls horizontally inside the modal.
grid gap-3 form layout in the test-template modal propagated min-width: auto (frontend/src/pages/AdminTestsPage.tsx): each grid item refused to shrink below its widest child, so the picker dragged the form (and the body) past the modal width. Switched the form to flex flex-col gap-3 min-w-0, which breaks the propagation while preserving vertical spacing.
Test-template modal now uses size="7xl" and the scenario-template modal size="3xl" to match their content density.

Fixed (post-M5 review pass — spec-reviewer + code-reviewer)

Filter combinator was OR, not AND (backend/app/services/test_templates.py:235): ?tactic=TA0002&technique=T1059 returned templates matching either facet instead of both. Pre-fix also pooled all three UUIDs into a shared IN list across three columns, theoretically allowing a UUID collision to match across kinds. Refactored to one IN-subquery per facet, ANDed together via repeated WHERE id IN (...).
Concurrent reorder race on set_scenario_tests (backend/app/services/scenario_templates.py:207): two parallel reorders on the same scenario could deadlock on the UNIQUE(scenario_id, position) constraint under READ COMMITTED. Added a per-scenario pg_advisory_xact_lock(0x5C3, hash(scenario_id)) mirroring the M4 /mitre/sync pattern; different scenarios don't contend.
N+1 on _to_view MITRE resolution (backend/app/services/test_templates.py:160): rendering K templates with ~T tags each fired up to K×T s.get(...) calls. Added _to_views_batch that pre-builds {uuid → MitreRow} maps in 3 queries and feeds them to per-template view assembly; list_test_templates now issues 4 queries total regardless of list size.
Wire-level item length cap on tags / expected_iocs (backend/app/api/test_templates.py:18-21): the DB columns are ARRAY(String(64)) / ARRAY(String(255)) but the API layer only capped the LIST length, not item strings — long inputs hit the driver with StringDataRightTruncation. Added Annotated[str, StringConstraints(...)] types so the API returns 400 with a clean validation error.
Front-end mutation cache hygiene (frontend/src/pages/AdminScenariosPage.tsx:148-156): updateMeta and setTests mutations are run sequentially in submit(); on partial failure (metadata saved but reorder failed) the cache stayed stale. Both mutations now invalidate the cache in onSettled, so whatever step landed is reflected without a manual refresh.
Backend vs front-end consistency on duplicate tests in a scenario (frontend/src/pages/AdminScenariosPage.tsx:227-231): the backend allows the same test_template to appear multiple times (chained ops; the UNIQUE constraint is (scenario_id, position) not (scenario_id, test_template_id)), but the catalogue picker was filtering out already-picked items. Removed the filter — only soft-deleted tests are excluded now.
Test coverage closure (backend/tests/test_templates.py): +4 pytest (tactic+technique AND-semantics, extra="forbid" rejection, empty mitre_tags explicit clear, 65-char tag length cap → 400). Total backend now 23 M5 tests + 58 elsewhere = 81 pass.
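
The AND-combined facet filter from the first entry of this section can be sketched with SQLite standing in for Postgres. The two-table schema is deliberately simplified (the real tag table uses tactic_id/technique_id/subtechnique_id UUID columns), and `make_catalogue` and `find_templates` are hypothetical names.

```python
import sqlite3


def make_catalogue():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE templates (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE tags (test_template_id TEXT, kind TEXT, external_id TEXT)"
    )
    conn.executemany("INSERT INTO templates VALUES (?)", [("t1",), ("t2",), ("t3",)])
    conn.executemany(
        "INSERT INTO tags VALUES (?, ?, ?)",
        [
            ("t1", "tactic", "TA0002"),
            ("t1", "technique", "T1059"),
            ("t2", "tactic", "TA0002"),
            ("t3", "technique", "T1059"),
        ],
    )
    return conn


def find_templates(conn, *, tactic=None, technique=None, subtechnique=None):
    clauses, params = [], []
    facets = (("tactic", tactic), ("technique", technique), ("subtechnique", subtechnique))
    for kind, external_id in facets:
        if external_id is None:
            continue
        # One subquery per facet, scoped to its own kind, so ids never pool.
        clauses.append(
            "id IN (SELECT test_template_id FROM tags "
            "WHERE kind = ? AND external_id = ?)"
        )
        params += [kind, external_id]
    where = " AND ".join(clauses) or "1 = 1"
    rows = conn.execute(f"SELECT id FROM templates WHERE {where} ORDER BY id", params)
    return [row[0] for row in rows]
```

Only fixed SQL fragments are interpolated into the statement; every user-supplied value goes through a bound parameter.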

Added — M4 (MITRE ATT&CK Enterprise)

STIX 2.1 parser + upsert (app/services/mitre_seed.py): stdlib-only (urllib.request + hashlib), pinned to Enterprise v19.0 (enterprise-attack-19.0.json, sha256 df520ea0…). Parses 25k+ STIX objects → 15 tactics, 222 techniques, 475 sub-techniques in ~1.1 s. Skips revoked + deprecated, resolves sub-technique parents via relationship[subtechnique-of] with a T1003.001 → T1003 dotted-id fallback, copies kill-chain phases into the mitre_technique_tactics M2M.
CLI: flask metamorph seed-mitre [--source <path|url>] [--checksum-sha256 <hex>] [--skip-checksum] (app/cli.py). make seed-mitre wraps it.
REST endpoints (app/api/mitre.py):
- GET /api/v1/mitre/tactics, /mitre/techniques?tactic=…&q=…, /mitre/subtechniques?technique=…&q=… (paginated, search on name/external_id).
- GET /api/v1/mitre/status (last_sync, version, source_url, defaults).
- POST /api/v1/mitre/sync (perm mitre.sync) — re-pull on demand.
Persisted metadata in settings: mitre_last_sync, mitre_version, mitre_source_url.
Compose volume metamorph_mitre mounted at /data/mitre/ in the api container — caches the downloaded bundle across restarts. Owned by metamorph:metamorph.
Frontend:
- <MitreTagPicker> component: flat ATT&CK matrix matching attack.mitre.org/# — full-bleed beyond max-w-page, 15 equal-width columns via grid-template-columns: repeat(N, minmax(7rem, 1fr)), sans-serif 12px, name-only cells (external_id surfaces on hover via title and in selection chips), ▸/▾ chevron expands sub-techniques inline within the column, multi-select with chip-removal at the top. Returns MitreTag[] (kind, id, external_id, name), ready for M5 templates.
- /mitre showcase page with status card, admin-gated Trigger sync button, the picker, and a JSON <pre> preview of the current selection.
- Nav adds MITRE link for any logged-in user.
Testing:
- backend/tests/test_mitre.py — 12 pytest (parser, idempotence, checksum mismatch, persisted settings, endpoint variants, perm enforcement) using a hand-crafted minimal STIX bundle (no network in tests).
- e2e/tests/m4-mitre.spec.ts — 6 Playwright against the live stack (calls /mitre/sync once in beforeAll).
- tasks/testing-m4.md.
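
The sub-technique parent resolution described in the parser entry (explicit `subtechnique-of` relationship first, dotted-id fallback second) can be sketched as below. `resolve_parent` and its arguments are hypothetical; the real parser in app/services/mitre_seed.py works on STIX objects rather than plain dicts.

```python
def resolve_parent(sub_external_id, explicit_parents, technique_ids):
    """Return the parent technique's external id, or None for an orphan.

    explicit_parents: {sub external id -> parent external id}, built from
                      subtechnique-of relationships.
    technique_ids:    set of known technique external ids.
    """
    candidates = [explicit_parents.get(sub_external_id)]
    if "." in sub_external_id:
        # Fallback: T1003.001 -> T1003
        candidates.append(sub_external_id.split(".", 1)[0])
    for candidate in candidates:
        if candidate in technique_ids:
            return candidate
    return None
```

Passing `technique_ids` as a prebuilt set keeps the lookup free of per-orphan queries, in line with the N+1 fix noted in the post-M4 code-review pass.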

Fixed (post-M4 spec-review pass)

Sync integrity guarantee: seed_mitre() now refuses a custom URL without either expected_sha256 or an explicit allow_unverified=true. Closes a "typo in mitre_source_url setting routes the seed to attacker JSON" footgun. CLI surfaces this via --checksum-sha256 / --skip-checksum; API via {"source", "expected_sha256", "allow_unverified"} body.
/diag/reset consistency: now truncates the mitre_* tables alongside settings so GET /mitre/status and GET /mitre/tactics agree after a reset (previously: catalogue rows persisted, but mitre_last_sync got wiped → status lied).
Spec drift §10 #4: amended "14 tactics" → "≥ 14 tactics (v19 ships 15)" to reflect MITRE v8+ reality.

Fixed (post-M4 code-review pass)

SSRF allowlist on /mitre/sync: host must be in MITRE_ALLOWED_HOSTS (defaults to raw.githubusercontent.com, comma-separated env override). Closes the "admin holding mitre.sync can pivot the api container at cloud metadata (169.254.169.254) or internal mirrors" vector. New MitreSourceForbidden exception → 400 with source_forbidden error code.
Concurrent sync race: seed_mitre() now acquires pg_advisory_xact_lock(hashtext('mitre.seed')) at the top of the transaction so two /mitre/sync calls serialise cleanly across the DELETE + re-INSERT of mitre_technique_tactics.
Typed sync contract end-to-end: Pydantic SyncResultOut on the backend (app/api/mitre.py) mirrored by a MitreSyncResult TS interface (frontend/src/lib/mitre.ts). The MitrePage mutation no longer uses an as Record<string, unknown> escape hatch.
N+1 in dotted sub-technique fallback: pre-built {external_id → id} dict at function entry; was firing one extra SELECT per orphan (currently 0 with MITRE, but a latent footgun for partial bundles).
SETTING_VERSION cleared explicitly when source != default: previously kept the stale pinned version after a custom-URL re-sync; now _upsert_setting(..., None) so /mitre/status doesn't lie.
Internal error scrub on /mitre/sync: 500 responses no longer leak URLError / DB driver text via str(e) — stack lands in JSON logs only.
E2E pinned to exact MITRE v19 counts (15/222/475/0 orphans) for parser-regression detection; previously >= thresholds could mask "revoked tactics silently included".
E2E uses crypto.randomUUID() instead of Math.random() for unique test emails.
Test coverage for security guards: file:// rejection, disallowed HTTPS host, custom-URL-without-sha refusal, dotted-id fallback, version-clearing semantics — 5 new pytest covering paths the spec-review demanded but no test enforced.
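
A minimal sketch of the host allowlist guard from the SSRF entry in this section. It covers URL sources only (local file paths passed to the CLI are out of scope here) and assumes HTTPS is the only accepted scheme; `allowed_hosts` and `check_source_url` are hypothetical names, while `MITRE_ALLOWED_HOSTS`, its default and `MitreSourceForbidden` come from the entry itself.

```python
import os
from urllib.parse import urlsplit


class MitreSourceForbidden(Exception):
    """Would be mapped to HTTP 400 with the source_forbidden error code."""


def allowed_hosts(env=None):
    env = os.environ if env is None else env
    raw = env.get("MITRE_ALLOWED_HOSTS", "raw.githubusercontent.com")
    return {host.strip().lower() for host in raw.split(",") if host.strip()}


def check_source_url(url, hosts):
    parts = urlsplit(url)
    if parts.scheme != "https":  # assumption: rejects file://, http://, etc.
        raise MitreSourceForbidden(f"scheme not allowed: {parts.scheme!r}")
    if (parts.hostname or "").lower() not in hosts:
        raise MitreSourceForbidden(f"host not allowed: {parts.hostname!r}")
    return url
```

Comparing `parts.hostname` rather than the raw netloc avoids being fooled by userinfo or port suffixes such as `https://raw.githubusercontent.com@evil.example/`.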

Decisions (intentional)

Bundle "embarqué" (embedded) interpreted as seed-time download + named-volume cache, not "binary baked into the Docker image". Keeps the image ~150 MB, makes version bumps a constant edit, plays nicely with make seed-mitre re-runs. Air-gapped operators copy the file into the volume + pass --source /data/mitre/<file>.
Read endpoints require authentication but no dedicated permission: MITRE data is public reference material, so there is no mitre.read perm. The status endpoint is similarly open (under @require_auth) to keep /mitre/status simple for the UI badge.
No requests / httpx dep added: stdlib urllib.request is enough and avoids inflating the image.

Validated end-to-end (M4 DoD)

make clean && make up && make migrate && make seed-mitre → 15 tactics / 222 techniques / 475 sub-techniques / 254 links / 0 orphans / ~1.1 s.
make test-api → 58 pytest pass (1 health + 8 schema + 15 auth + 15 RBAC + 19 MITRE) in ~5 s.
make e2e → 34 Playwright pass (8 M0 + 4 M1 + 8 M2 + 8 M3 + 6 M4) in ~18 s.
Spec-reviewer PASS after fixes applied.

Added — M3 (RBAC: groups, permissions, users)

Permission catalogue (app/services/permissions_seed.py): 31 atomic codes across 10 families (user, group, invitation, test_template, scenario_template, mission, detection_level, setting, mitre.sync). Seeded at boot and after /setup to handle a freshly truncated DB. Idempotent + additive on system groups (never removes a perm).
Default group bindings: admin = all 31 codes; redteam = 8 (catalogue read + mission.{read,create,update,archive,write_red_fields} + detection_level.read); blueteam = 5 (catalogue read + mission.{read,write_blue_fields} + detection_level.read).
Users admin service + API (app/services/users.py, app/api/users.py): list (q + is_active filter + pagination), get, patch (display_name/locale/is_active), soft-delete, set groups. Last-admin protection on update/delete/group-strip.
Groups admin service + API (app/services/groups.py, app/api/groups.py): full CRUD with system-group protection (no rename, no delete), PUT /groups/{id}/permissions for the bindings. Admin system group's perm set is locked to "every perm" (preserves the bypass invariant).
Permissions read-only API (app/api/permissions.py): GET /permissions returns the catalogue (admin or group.read holders).
Frontend admin pages (frontend/src/pages/Admin{Users,Groups,Invitations}Page.tsx): list + edit modals using TanStack Query mutations, multi-select for perms grouped by family, copy-once invitation URL display.
Frontend chrome (Layout.tsx + RequireAdmin.tsx): admin nav links shown only when is_admin === true; direct navigation to /admin/* by non-admins redirects to /. Server remains the arbiter.
/diag/reset now clears the rate-limit counters so the Playwright suite can iterate without hitting 10/min/IP budgets across spec files. Gated to non-prod environments only.
Testing:
- tests/test_rbac.py — 15 pytest integration tests (39 backend total).
- e2e/tests/m3-rbac.spec.ts — 8 Playwright tests covering DoD §10 #2/#3 (28 e2e total).
- tasks/testing-m3.md — manual + automated procedure.
Frontend api helpers: apiPatch, apiPut, apiDelete added to frontend/src/lib/api.ts.

Fixed (post-M3 spec-review pass)

Rate-limit scope clarified: app/core/rate_limit.py now enables the limiter for APP_ENV in ("prod", "staging") instead of prod only — a public staging deployment without auth limits would be surprising. Dev/test stay unthrottled for Playwright ergonomics. Spec §6 NF-security applies to operator-facing deployments.
Admin perm invariant: set_group_permissions refuses to alter the admin system group's perm set to anything other than the full catalogue (SystemGroupProtected → 409). The decorator bypass relies on is_admin = "admin" in group_names, but a future refactor could move to a perm-based check, so we keep the invariant.
LogRecord field collision: log.info("...", extra={"name": g.name}) raised KeyError: "Attempt to overwrite 'name' in LogRecord" because Python's logger reserves name. Renamed to group_name. Audited all other extra= payloads in app/api/+app/services/ for the same trap.

Validated end-to-end (M3 DoD)

make clean && make up && make migrate → boot logs show metamorph.permissions.seeded {perms_created: 31, perms_total: 31, bindings: {admin: 31, redteam: 8, blueteam: 5}}.
make test-api → 39 pytest pass (1 health + 8 schema + 15 auth + 15 RBAC) in ~4 s.
make e2e → 28 Playwright pass (8 M0 + 4 M1 + 8 M2 + 8 M3) in ~16 s.
Spec-reviewer pass: PASS verdict, 2 minor fixes applied (above), 2 anticipations noted for M12/M14 (no current action).

Added — M2 (Auth, bootstrap, invitations)

Crypto plumbing: app.core.security (Argon2id time_cost=2 memory_cost=64MiB parallelism=2, opaque-token SHA-256 helpers), app.core.jwt_tokens (HS256, claims iss/sub/type/jti/iat/exp, access 1h / refresh 30d).
Auth services (app.services.auth): login, refresh with token rotation + reuse-detection chain revoke, logout (idempotent), change_password (forces logout-all).
Invitation services (app.services.invitations): create, preview, accept, revoke. Token persisted only as SHA-256, default 7-day TTL.
Bootstrap (app.services.bootstrap + app.core.install_token): seeds 3 system groups (admin/redteam/blueteam), mints a one-shot install token at first boot when users is empty, logs a banner with the raw token. CLI flask --app app.cli metamorph print-install-token [--force].
Auth middleware (app.core.auth_decorators): @require_auth populates g.current_user; @require_perm("...") checks atomic permissions; admin group bypasses the check (atomic perms land in M3).
API endpoints:
- POST /api/v1/setup (consume install token, create 1st admin) + GET /api/v1/setup (status).
- POST /api/v1/auth/login + POST /auth/refresh + POST /auth/logout + GET /auth/me + POST /auth/change-password.
- POST /api/v1/invitations (admin) + GET /invitations + GET /invitations/preview/<token> + POST /invitations/accept/<token> + POST /invitations/<id>/revoke.
- POST /api/v1/diag/reset (test-only kill switch — wipes auth tables + mints fresh install token; only available in dev/test).
Rate limiting (flask-limiter): 10/min/IP on /auth/login, /auth/refresh; 5/min on /auth/change-password and /setup; 10–20/min on invitation endpoints. Globally disabled when APP_ENV=test.
Refresh cookie metamorph_refresh: HttpOnly + Secure + SameSite=Strict + Path=/api/v1/auth/.
Frontend auth state (frontend/src/lib/{api,auth}.ts): access token in module memory, refresh in cookie, automatic 401-retry via /auth/refresh with reentrancy guard. useAuth() hook + <RequireAuth> route guard.
Frontend pages: /login, /setup, /register?token=…, /profile (with change-password form), all in RTOps design. Protected layout: nav shows email + Logout when authenticated, Login + Setup links when not.
Frontend deps: @tanstack/react-query, react-router-dom. Tanstack provider in App.tsx (will carry actual queries from M3+).
Email validation (app.api._validation.Email): permissive RFC-shape regex that accepts internal TLDs (.local, .corp) — pydantic.EmailStr was too strict for red-team labs.
Testing:
- tests/test_auth_flow.py — 15 pytest integration tests (24 backend total with M0/M1).
- e2e/tests/m2-auth.spec.ts — 8 Playwright tests covering setup → login → me → invitation → register → 2nd login → RBAC 403 → refresh rotation → logout (20 e2e total).
- tasks/testing-m2.md — manual + automated procedure.
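
The refresh rotation with reuse-detection chain revoke listed under the auth services can be sketched with an in-memory store. This is an illustration only: the real service issues HS256 JWTs and persists state in Postgres, and `RefreshStore`, `RefreshRejected` and the "family" notion used to model the chain are hypothetical.

```python
import secrets
from dataclasses import dataclass, field


class RefreshRejected(Exception):
    """Would be mapped to HTTP 401 by the API layer (assumption)."""


@dataclass
class RefreshStore:
    tokens: dict = field(default_factory=dict)  # jti -> {"family", "used"}
    revoked_families: set = field(default_factory=set)

    def issue(self, family=None) -> str:
        jti = secrets.token_urlsafe(16)
        self.tokens[jti] = {
            "family": family or secrets.token_urlsafe(8),
            "used": False,
        }
        return jti

    def rotate(self, jti: str) -> str:
        entry = self.tokens.get(jti)
        if entry is None or entry["family"] in self.revoked_families:
            raise RefreshRejected("unknown or revoked token")
        if entry["used"]:
            # Replay of an already-rotated token: revoke the whole chain.
            self.revoked_families.add(entry["family"])
            raise RefreshRejected("reuse detected, chain revoked")
        entry["used"] = True
        return self.issue(entry["family"])
```

After a replay is detected, even the most recent token of that chain stops working, which forces a fresh login for both the legitimate user and whoever replayed the old token.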

Fixed (post-M2 spec-review pass)

Refresh cookie Secure=True unconditionally (backend/app/api/auth.py). Modern browsers treat localhost as a secure context, so dev/test still works. Closes the silent-degradation found by the reviewer.
/auth/refresh rate-limit lowered to 10/min/IP (backend/app/api/auth.py) to match spec §M2 ("10 req/min/IP on /auth/*").
/diag/reset kept allowed in dev and test (a make e2e against a make up dev stack must be able to reset). Added a WARNING log when triggered in dev and a clear docstring; production envs (prod/staging) remain locked out.

Known scope-creep (intentional, not retracted)

Rate-limits on /setup (5/min), /invitations/preview (20/min), /invitations/accept (10/min) and /auth/change-password (5/min) were added in M2 even though §M2 only mandated /auth/*. Defensible (these are abuse-attractor endpoints), and noted here so M14 doesn't double-spec them.

Added — M1 (DB schema & migrations)

26 tables + alembic_version covering auth/RBAC (8), MITRE (4), templates (4), missions (6), evidence (1), settings/detection-levels (2), notifications (1).
SQLAlchemy 2.x declarative models with Mapped[]/mapped_column(), grouped under backend/app/models/{auth,mitre,template,mission,evidence,setting,notification}.py.
Alembic init: alembic.ini, alembic/env.py reading app.core.config.settings.database_url, alembic/script.py.mako, naming convention pk_/fk_/ck_/uq_/ix_ enforced via MetaData(naming_convention=...) on app.db.base.Base.
Reusable mixins in app.db.mixins: UuidPkMixin (uuid4 server-side), TimestampMixin (created_at/updated_at, server-default + onupdate), SoftDeleteMixin (deleted_at, no auto-injected index — declared explicitly per table to avoid mixin-vs-class __table_args__ clobbering).
Postgres-specific features used: JSONB for settings.value and notifications.payload; native Uuid columns; partial indexes (WHERE deleted_at IS NULL on 9 tables; WHERE read_at IS NULL on notifications); CHECK constraints for status/state/opsec_level/mitre_kind enums; exactly_one_mitre_fk CHECK on test_template_mitre_tags.
mission_test_mitre_tags deliberately denormalised (no FK to mitre_* tables): copies mitre_external_id, mitre_name, mitre_url at tag time so a later MITRE re-sync that drops an entry cannot purge a mission's tags. Companion test_template_mitre_tags keeps FKs since templates are editable. (Spec §11 risk addressed.)
Backend pyproject.toml deps: SQLAlchemy ≥2, Alembic ≥1.13, psycopg[binary] ≥3.1.
New Makefile targets: migrate, migrate-down, migrate-revision MSG=…, migrate-status. The Dockerfile now ships alembic.ini + alembic/ so the api container can run migrations directly.
Test stage in backend/Dockerfile (--target test): runtime image + dev extras + tests/ dir. New make test-api target spins an ephemeral container against the live DB on the compose network. Backend tests no longer require any local Python toolchain.
tests/test_schema.py (8 integration tests + the existing M0 health test = 9 total): expected tables, expected timestamp/soft-delete columns, partial-index presence, expected FK pairs, expected CHECK constraints, alembic-at-head, and a negative INSERT proving the exactly_one_mitre_fk CHECK fires.
tasks/testing-m1.md — manual + automated verification procedure.
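
The exactly_one_mitre_fk rule and the negative INSERT test mentioned in this section can be illustrated with SQLite from the standard library. The CHECK expression below is an assumption written for SQLite, where `IS NOT NULL` yields 0 or 1; the project's actual Postgres expression is not shown in this changelog (in Postgres the same rule could be written with `num_nonnulls(...) = 1`). `DDL` and `try_insert` are hypothetical names.

```python
import sqlite3

DDL = """
CREATE TABLE test_template_mitre_tags (
    id INTEGER PRIMARY KEY,
    tactic_id TEXT,
    technique_id TEXT,
    subtechnique_id TEXT,
    CONSTRAINT exactly_one_mitre_fk CHECK (
        (tactic_id IS NOT NULL)
        + (technique_id IS NOT NULL)
        + (subtechnique_id IS NOT NULL) = 1
    )
)
"""


def try_insert(conn, tactic=None, technique=None, subtechnique=None) -> bool:
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO test_template_mitre_tags "
                "(tactic_id, technique_id, subtechnique_id) VALUES (?, ?, ?)",
                (tactic, technique, subtechnique),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A row with exactly one of the three columns set is accepted; rows with none or with two or more are rejected by the constraint.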

Fixed (post-M1 spec-review pass)

Soft delete now consistent across snapshot-bearing tables: mission_scenarios, mission_tests, mission_categories gained SoftDeleteMixin + their ix_<table>_active partial index (M12 trash bin depends on this).
evidence_files gained TimestampMixin (created_at/updated_at) on top of the domain uploaded_at (audit minimal everywhere, per M1 brief).
mission_members gained TimestampMixin, replacing the bespoke added_at column.
scenario_template_tests PK refactored to a UUID + UNIQUE(scenario_template_id, position) so the same test can appear at multiple positions in a scenario (chained operations).
SoftDeleteMixin.__table_args__ removed (silently clobbered by class __table_args__); each soft-delete table now declares ix_<table>_active explicitly. Documented in the mixin's docstring.
mission_test_mitre_tags schema redesigned to denormalise MITRE labels (see "Added" entry above).
Migration 0001 regenerated end-to-end after these fixes — 24765a5014b6 is the new HEAD.

Validated end-to-end (M1 DoD)

make clean && make up && make migrate from an empty DB → 27 tables, 32 FK, 9 CHECK, 14 UQ, 12 partial indexes.
make test-api → 9 pytest pass (1 health + 8 schema integration) in <1 s.
make e2e → 12 Playwright pass (8 M0 smoke + 4 M1 db visibility) in 3 s.

Added (M1 visibility)

New API endpoint GET /api/v1/diag/db exposes alembic_revision (short-hashable) and the public-schema table_count. Returns 503 with {"reachable": false} when Postgres is down.
New Database card on the SPA home page consumes that endpoint, renders the revision short-hash and the count next to the existing API and Roadmap cards.
Footer updated to M0 bootstrap · M1 db schema. Roadmap card now points to M2 — Auth + JWT.
New e2e suite e2e/tests/m1-db.spec.ts (4 tests) covers the diag endpoint contract, the Database card rendering, and the footer/roadmap labels.

Added — M0 (bootstrap)

Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml, README.md, CHANGELOG.md.
docker-compose.yml with three services: db (postgres:16-alpine, no host port), api (Flask 3, port 8000), front (nginx serving the Vite bundle, port 80).
Named volumes metamorph_db and metamorph_evidence for data persistence.
Backend skeleton: Flask app factory, JSON structured logging on stdout, GET /api/v1/health endpoint, multi-stage Dockerfile, pyproject.toml driven by uv.
Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps design tokens (tasks/design.md) translated into tailwind.config.ts, base UI primitives (Card, Tag, SectionHeader, FlowNode, Button), home page wired to /api/v1/health.
Multi-stage frontend Dockerfile that builds the bundle and serves it via nginx, proxying /api/* to the api container.
Pre-commit hook config: ruff for backend, eslint + tsc --noEmit for frontend.

Validated

docker compose config parses (validated via pyyaml since Docker is not installed in the dev shell).
Every env var referenced by the compose file is documented in .env.example.
All Python source files parse cleanly (ast.parse).
All TS/JSON config files parse cleanly.

Notes

TLS termination is delegated to an external reverse proxy (per spec §6 NF-network). The compose stack exposes plain HTTP on HOST_FRONT_PORT (8080) and HOST_API_PORT (8000).
The first-admin bootstrap token (M2) will be printed to the api container's stdout on first boot when the users table is empty.
tasks/spec.md and tasks/todo.md remain authoritative; update them before changing scope.

Fixed (M0 DoD validation pass on real podman)

FQDN image references in docker-compose.yml, backend/Dockerfile, frontend/Dockerfile. Podman on Fedora enforces short-name-mode=enforcing for pulls (no TTY ⇒ no prompt ⇒ failure). Replaced postgres:16-alpine / python:3.12-slim / node:20-alpine / nginx:1.27-alpine with their docker.io/library/… qualified equivalents. Docker accepts the same prefix transparently.
*.md removed from backend/.dockerignore and frontend/.dockerignore: pyproject.toml declared readme = "README.md", but the file was being filtered out of the build context, so hatchling.build.build_wheel raised OSError: Readme file does not exist: README.md. Also removed the readme field itself from pyproject.toml to decouple the build from the doc.
Card.tsx type clash: CardProps extends HTMLAttributes<HTMLDivElement> redefined title as ReactNode, but the native title is string. tsc -b failed with TS2430 during vite build. Switched to Omit<HTMLAttributes<HTMLDivElement>, 'title'>.
Explicit healthchecks added to compose api and front: podman-compose 1.x doesn't surface healthchecks declared only in the Dockerfile via inspect. Mirroring them in docker-compose.yml makes make inspect-health actually see healthy/unhealthy/starting on every engine.
Suppressed podman compose external-provider banner via PODMAN_COMPOSE_WARNING_LOGS=false exported from the Makefile.
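
The short-name fix above amounts to prefixing unqualified image references with a registry. A minimal sketch of that rewrite follows; the function name and the heuristic for spotting an existing registry host are assumptions, not code from the repository.

```python
def qualify_image(ref: str, registry: str = "docker.io", namespace: str = "library") -> str:
    """Expand a short image reference into a fully qualified one."""
    first, sep, _rest = ref.partition("/")
    # A first path component containing a dot or a port, or equal to
    # "localhost", is treated as a registry host already being present.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return ref
    if sep:
        # "user/repo" style reference: only the registry is missing.
        return f"{registry}/{ref}"
    # Bare official image such as "postgres:16-alpine".
    return f"{registry}/{namespace}/{ref}"
```

For example, qualify_image("postgres:16-alpine") yields docker.io/library/postgres:16-alpine, the form that the entry above reports both Podman and Docker accept.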

Validated end-to-end on podman 5.x (Fedora 43)

make up → 3 containers, all 3 healthy after start_period.
make health → {"status":"ok","version":"0.1.0"} via the front nginx proxy (port 8080) and direct API (port 8000).
make logs-api → JSON-structured lines on stdout (ts, level, logger, message, custom fields).
make e2e → 8/8 Playwright tests pass in 2.5 s. Reports: e2e/playwright-report/index.html (529 KB, self-contained) + junit.xml (tests=8 failures=0 skipped=0 errors=0).
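
The JSON log lines checked by make logs-api could be produced by a formatter along these lines. This is a sketch with the standard logging module only; the class name and the way custom fields are detected are assumptions, while the key names ts, level, logger, and message follow the entry above.

```python
import json
import logging
from datetime import datetime, timezone

# Attribute names every LogRecord carries; anything else came in via `extra=`.
_STANDARD_ATTRS = set(
    logging.LogRecord("", 0, "", 0, "", (), None).__dict__
) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy custom fields passed through `extra={...}`.
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        return json.dumps(payload, default=str)
```

Attaching it to a StreamHandler on sys.stdout would give one JSON object per line, which is easy for a container log collector to parse.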

Added (engine portability)

Makefile auto-detects docker or podman at runtime and selects the matching compose driver (docker compose, podman compose, or legacy podman-compose). Override via ENGINE=… and/or COMPOSE="…".
New targets: engine (print detected runtime), volumes (list project-named volumes), inspect-health (health status of all 3 containers), logs-api (tail just the api), health (single curl probe). All engine-agnostic.
make help now prints the active engine + compose driver in its footer.
tasks/testing-m0.md and README.md rewritten to be engine-agnostic — raw docker logs / docker volume ls / docker inspect calls replaced with the new make targets.
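
The engine auto-detection above lives in the Makefile; the Python sketch below only mirrors the idea. The preference order (docker before podman), the function name, and the omission of the legacy podman-compose fallback are assumptions made to keep the example short.

```python
import shutil
from typing import Callable, Optional, Tuple


def detect_engine(
    which: Callable[[str], Optional[str]] = shutil.which,
    override: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (engine, compose_command) for the available container runtime."""
    if override:
        # Plays the role of ENGINE=... on the make command line.
        engine = override
    elif which("docker"):
        engine = "docker"
    elif which("podman"):
        engine = "podman"
    else:
        raise RuntimeError("neither docker nor podman found on PATH")
    # Both engines expose compose as a subcommand in this simplified sketch.
    return engine, f"{engine} compose"
```

Passing a fake which callable makes the selection logic testable on a machine that has neither runtime installed.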

Added (M0 testing)

e2e/ Playwright project with chromium, HTML + JUnit XML reporters, traces / screenshots / videos kept on retry. Reports land in e2e/playwright-report/.
e2e/tests/m0-smoke.spec.ts — 8 smoke tests covering the front rendering, the API proxy, the design tokens, the absence of any runtime CDN traffic (spec §7), and the CORS contract.
Makefile targets e2e-install, e2e, e2e-report, e2e-up, wait-healthy.
tasks/testing-m0.md — step-by-step manual + automated verification procedure for M0.
Convention added to tasks/todo.md: every milestone N delivers tasks/testing-m<N>.md + at least one e2e/tests/m<N>-*.spec.ts, and the spec-reviewer subagent runs before marking the milestone done.

Fixed (post-M0 spec-review pass)

.pre-commit-config.yaml added at repo root: ruff + ruff-format on backend, eslint + tsc --noEmit + prettier --check on frontend, plus baseline whitespace/JSON/private-key checks. Documented pre-commit install in README.md.
Self-hosted webfonts via @fontsource/jetbrains-mono and @fontsource/ibm-plex-sans (imported in frontend/src/index.css); dropped the Google Fonts <link> from frontend/index.html to honor spec §7 ("no runtime CDN").
Refuse-to-boot guard in backend/app/core/config.py: when APP_ENV != "dev", defaults / placeholders for JWT_SECRET and POSTGRES_PASSWORD raise at startup. New APP_ENV env var documented in .env.example, README.md, and docker-compose.yml.
make dev now runs dev-api and dev-front in parallel via make -j2 instead of just printing a hint.
Removed dead database_url property from Settings (will be reintroduced in M1 with the SQLAlchemy/Alembic stack).
Pinned Node engines to >=20 in frontend/package.json.
Reconciled M0 DoD wording in tasks/todo.md (HTTP via HOST_FRONT_PORT, with explicit note that prod TLS is external).
Documented the 2xs/3xs/4xs font-size aliases in frontend/tailwind.config.ts against the design.md §3 scale.
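
The refuse-to-boot guard listed above can be sketched as a plain function over the environment. The placeholder values, the exception name, and the default of "dev" when APP_ENV is unset are assumptions; the actual check lives in backend/app/core/config.py.

```python
from typing import List, Mapping

# Hypothetical placeholder values; the real defaults are defined in the repo.
PLACEHOLDERS = {"", "change-me", "changeme"}
GUARDED_KEYS = ("JWT_SECRET", "POSTGRES_PASSWORD")


class InsecureConfigError(RuntimeError):
    """Raised at startup when secrets are left at placeholder values."""


def check_secrets(env: Mapping[str, str]) -> None:
    """Raise if a non-dev environment still uses placeholder secrets."""
    if env.get("APP_ENV", "dev") == "dev":
        return  # dev mode tolerates defaults
    bad: List[str] = [k for k in GUARDED_KEYS if env.get(k, "") in PLACEHOLDERS]
    if bad:
        raise InsecureConfigError(
            "placeholder value for: " + ", ".join(bad)
        )
```

Calling check_secrets(os.environ) at the top of the app factory would fail fast before any request is served.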
