User: "Also remove the per-test workflow: when information is entered on the
red-team side, it means the test has been executed and is therefore awaiting a
blue-team review."
Backend (update_mission_test_fields)
- At the end of every PUT, inspect the touched-field set:
  - any red write on state in {pending, skipped, blocked} → state=executed,
    plus auto-stamp executed_at=now() if absent
  - any blue write on state=executed → state=reviewed_by_blue
- /transition endpoint kept for back-fill/admin use, not called from UI.
Frontend MissionTestPage
- Removed the transition-buttons header block and the `transition`
mutation. State pill stays as a passive indicator.
- New labels: "Not started" / "Awaiting review" / "Reviewed" describe
the implicit lifecycle, no longer exposing the state-machine concept.
E2E
- The SPA test that clicked `transition-executed` now verifies the
implicit promotion: typing red fields and saving flips the pill from
"Not started" → "Awaiting review", no button click required.
Spec
- §4 reword: "Cycle de vie implicite, piloté par les écritures" replaces
the old "Workflow par test instance" bullet.
Tests
- 3 new pytest: red_command-alone implicit execute + auto-stamp,
blue write promotes executed→reviewed, blue write on pending no-op.
- 142 pytest + 49 Playwright green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Changelog
All notable changes to this project will be documented here. Format: Keep a Changelog · Conventional Commits.
[Unreleased]
Changed (amendement 2026-05-15 bis) — explicit test workflow removed, lifecycle now driven by writes
User feedback: "Also remove the per-test workflow: when information is entered on the red-team side, it means the test has been executed and is therefore awaiting a blue-team review."
- Backend `update_mission_test_fields`: at the end of every PUT, the service inspects the touched field set. Any red-side write on a non-executed test (`pending` / `skipped` / `blocked`) promotes the state to `executed`; if no `executed_at` was supplied, it auto-stamps `now()`. Any blue-side write on an `executed` test promotes to `reviewed_by_blue`. The `/transition` endpoint stays operational for back-fill / admin use but is no longer the primary path.
- `MISSION_TEST_STATE_LABEL` rephrased to describe the implicit lifecycle instead of the workflow: `Pending → Not started`, `Executed → Awaiting review`, `Reviewed_by_blue → Reviewed`, `Skipped`, `Blocked`.
- `MissionTestPage.tsx`: transition buttons in the header are gone. The state pill remains as a passive indicator. The `transition` mutation and its imports are dropped; `useMutation` is still used for the red / blue field saves.
- E2E: the SPA test that previously clicked `transition-executed` now exercises the implicit promotion: it just types in the red fields and asserts that the state pill flips from `Not started` to `Awaiting review` on save.
- Tests: 3 new pytest cases: `test_red_writing_any_red_field_implicitly_executes_and_stamps` (`red_command` alone bumps state + auto-stamps `executed_at`), `test_blue_writing_any_blue_field_promotes_executed_to_reviewed`, `test_blue_write_on_pending_does_not_auto_execute` (blue-on-pending is a no-op; only red drives execution per the user's mental model).
- Total: 142 pytest + 49 Playwright green.
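The write-driven lifecycle in this entry can be sketched as a pure function. This is a minimal illustration, not the actual `update_mission_test_fields` code: the `promote_state` helper, the exact membership of the two field sets, and the choice to apply at most one promotion per write are assumptions.

```python
from datetime import datetime, timezone

# Assumed field classification; the real service keeps its own sets.
RED_FIELDS = {"red_command", "red_output", "red_comment_md", "executed_at"}
BLUE_FIELDS = {"blue_comment_md", "detection_level_id"}
NOT_EXECUTED = {"pending", "skipped", "blocked"}


def promote_state(state, executed_at, touched, now=None):
    """Return (state, executed_at) after applying the implicit promotion."""
    if touched & RED_FIELDS and state in NOT_EXECUTED:
        # Red write on a non-executed test: promote, and stamp only if absent.
        stamp = executed_at or now or datetime.now(timezone.utc)
        return "executed", stamp
    if touched & BLUE_FIELDS and state == "executed":
        # Blue write on an executed test: promote to reviewed.
        return "reviewed_by_blue", executed_at
    # Anything else (for example blue-on-pending) leaves the state alone.
    return state, executed_at
```

Keeping the rule free of I/O makes the three pytest cases listed above cheap to express: each one is a single call with a touched-field set and an expected tuple.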
Fixed (post-amendement 2026-05-15) — stamping executed_at no longer needs a prior state transition
User feedback: when a red user typed `executed_at` inline on a pending test in the new scenario table, the backend rejected the write with HTTP 400 (`executed_at can only be set when state is executed/reviewed_by_blue`). The state gate was a holdover from the original "Mark executed button + override toggle" workflow; it made no sense once the UX let the operator type the time directly.
- `update_mission_test_fields` (`backend/app/services/mission_tests.py`) no longer rejects writes based on the source state. Stamping a non-null `executed_at` while state ∈ {`pending`, `skipped`, `blocked`} now auto-promotes the state to `executed` in the same write. The promotion rides on the same `mission.write_red_fields` perm that the `executed_at` field already required, so there is no privilege escalation.
- `MissionTestPage.tsx` drops the state-based gate on `canEditExecutedAt`: the field is editable any time the viewer holds `mission.write_red_fields`.
- Tests: `test_executed_at_override_requires_red_perm_and_state` was the old guard; it's split into two new cases: `test_red_setting_executed_at_on_pending_auto_transitions_to_executed` (pending → executed via inline stamp, blue still 403'd) and `test_red_setting_executed_at_from_skipped_state_auto_transitions` (skipped → executed via the same path).
- Total: 139 pytest green.
Added — M7 amendment (2026-05-15) — blue review fields + full-width scenario table
User feedback after the M7 ship: the blue team used to maintain 5 extra fields in Excel that we didn't capture, and the per-test page didn't fit their workflow — they wanted a tabular view (one table per scenario, one row per test) with double-click inline edit.
Reviewer follow-ups (applied)
- `blue_incident_at` rejects naïve datetimes (`backend/app/api/missions.py:_ensure_aware_datetime`): a request with `"2026-05-15T11:00:00"` (no offset) now returns 400 instead of silently letting Postgres interpret it in the session timezone. The same rule is applied to `executed_at` for consistency. Clients must send `Z` or an explicit `+HH:MM`.
- `blue_incident_recipient_email` is shape-validated (`backend/app/api/missions.py:_validate_email_shape`): permissive RFC regex (`/^[^@\s]+@[^@\s]+\.[^@\s]+$/`) that allows `.local` / `.corp` / `.test` internal domains. We deliberately don't use Pydantic `EmailStr`: `email-validator`'s `globally_deliverable=True` rejects those (lessons.md M2 captured the same trap for the user signup).
- `MissionTestView` payload expansion documented as a deliberate F6 enabler: surfacing every annotation in the nested GET means the scenario table renders in a single round trip. Without this, the table would have to call `GET /missions/{id}/tests/{test_id}` once per row.
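The two validators named above could look roughly like the following. This is a sketch under assumptions: the function bodies and error messages are illustrative, and only the regex and the "offset required" rule come from the entry itself.

```python
import re
from datetime import datetime

# Same permissive shape check as quoted in the entry: something@something.tld
_EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def ensure_aware_datetime(raw):
    """Parse an ISO 8601 string and refuse values without a UTC offset."""
    # datetime.fromisoformat only accepts a trailing "Z" on Python 3.11+,
    # so normalise it to an explicit offset first.
    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"
    parsed = datetime.fromisoformat(raw)
    if parsed.tzinfo is None or parsed.utcoffset() is None:
        raise ValueError("datetime must carry Z or an explicit +HH:MM offset")
    return parsed


def validate_email_shape(raw):
    """Accept internal domains such as .local or .corp; reject obvious junk."""
    if not _EMAIL_SHAPE.fullmatch(raw):
        raise ValueError("email must look like local@domain.tld")
    return raw
```

In an API layer the `ValueError` would be mapped to the HTTP 400 described above; that mapping is left out here.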
Backend (shipped)
- Migration `c2a8f4b1d6e9` adds five nullable columns to `mission_tests`:
  - `blue_log_source` (varchar 120): short text like `Firewall`, `NDR`, `Proxy`, `AV`, `EDR`.
  - `blue_siem_logs` (text): long-form SIEM excerpt (raw log lines).
  - `blue_incident_at` (timestamptz): cyber-incident notification timestamp.
  - `blue_incident_number` (varchar 120): incident reference (`INC-2026-1234`).
  - `blue_incident_recipient_email` (varchar 255): SOC recipient of the alert.
- All five fields are blue-side: they are added to `_BLUE_FIELDS` in `app/services/mission_tests.py` so the existing per-field perm classifier rejects red-only writers with 403, with no field-by-field special case.
- `update_mission_test_fields` accepts each new field via the same `_UNSET` sentinel pattern; `blue_siem_logs` uses the command-style normaliser (`_opt_cmd`) to preserve leading whitespace in log table excerpts; the other text fields use `_opt_md`.
- `MissionTestView` (the nested view returned by `GET /missions/{id}`) now exposes every annotation field plus `last_actor_*` + `updated_at` + `detection_level_key`. The two FK lookups (detection-level keys, last-actor user labels) are batch-loaded once per request, so the number of lookup queries stays constant regardless of how many tests the mission contains. This lets the front-end scenario table render in a single GET, with no per-row round-trip.
- API: `UpdateMissionTestPayload` and `_serialize_test` / `_serialize_test_detail` updated. Length caps per spec (120 / 200_000 / 120 / 255).
- Tests: 3 new pytest cases: `test_blue_user_writes_new_blue_review_fields`, `test_red_user_cannot_write_new_blue_review_fields` (loops each of the 5 fields), `test_blue_review_fields_survive_round_trip_via_get`. Total: 136 pytest green.
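The `_UNSET` sentinel pattern mentioned above separates "field omitted from the payload" from "field explicitly set to null". A minimal sketch, assuming a plain dict as the record and a hypothetical `apply_patch` helper (the real service signature is not shown in this changelog):

```python
_UNSET = object()  # unique marker: "caller did not send this field"


def apply_patch(record, *, blue_log_source=_UNSET, blue_siem_logs=_UNSET):
    """Mutate `record` in place and return the set of touched field names."""
    touched = set()
    candidates = (
        ("blue_log_source", blue_log_source),
        ("blue_siem_logs", blue_siem_logs),
    )
    for name, value in candidates:
        if value is _UNSET:
            continue  # omitted: keep whatever is stored
        record[name] = value  # an explicit None clears the field
        touched.add(name)
    return touched
```

The returned set is what a per-field permission classifier would inspect: an omitted field never counts as a write, so it never triggers a 403 for the wrong side.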
Spec & docs
- `tasks/spec.md` amended: the §4 in-scope bullet on blue data entry now lists the 5 fields, §F6 describes the tabular UX (full-bleed, one table per scenario, double-click inline edit), and the §8 model bullet enumerates the new columns. The header carries a `revised: 2026-05-15` note pointing readers at the amendment.
- `tasks/todo.md` M7 section carries a dedicated "Amendement 2026-05-15" sub-block tracking the backend (☑) and frontend (☐) items.
Frontend (shipped)
- `MissionScenarioTable` component (`frontend/src/pages/MissionScenarioTable.tsx`): per-scenario `<table>` with 7 columns (Test | Procédure | Exécution | Source de log | Commentaires | Logs SIEM | Cyber Incident) plus an `Actions` cell that links to the full per-test page. Read mode shows truncated values; double-click toggles a row into edit mode where each cell becomes the right input (text, textarea, datetime-local, select). The `detection_level` lives inside the Commentaires cell as a pill + select, so there is no 8th column.
- Single-row-edit invariant: `editingTestId` state lives in `MissionDetailPage`'s tests tab so only one row across the whole mission is editable at a time. Double-clicking another row while dirty surfaces a `Discard unsaved changes?` prompt; Esc reverts; Save commits the diff.
- Diff-only PUT: `draftDiff(test, draft)` walks every field and only includes the ones that changed; submitting an unchanged form is a no-op `onEditRequest(null)`. This keeps the per-field perm gate on the server cleanly applicable.
- Full-bleed layout: the tests tab escapes the layout's `max-w-page` via the canonical `calc(50% - 50vw)` recipe (same as the M4 MITRE picker) so the 7-column table breathes on wide screens without horizontal scroll.
- Per-test page kept at `/missions/<id>/tests/<test_id>` for evidence upload and the full procedure view; every row's "open ↗" link routes there.
- Datetime semantics consistent: the table's two datetime-local inputs (`executed_at` + `blue_incident_at`) reuse the M7 verbatim recipe (`iso.slice(0, 16)` + `${local}:00Z`), with no TZ shift on read or write.
Tests
- E2E: existing m6 + m7 specs unaffected (all 49 still green). The new table reuses the `mission-add-scenarios` testid for the modal trigger so the wizard test still works. The old `mission-test-${id}` rows are gone but were never wired into any e2e selector.
Fixed (post-M7 UX feedback — evidence whitelist visibility)
- Evidence dropzone didn't tell the operator which extensions are accepted, and the OS file picker showed "All files" (`frontend/src/pages/MissionTestPage.tsx`): an operator could spend the time picking a `.exe` only to receive a 400 back. Surfaced the whitelist in the UI:
  - Dropzone now prints `Accepted: .png · .jpg · .jpeg · .pdf · .txt · .log · .json · .csv · .evtx · .zip · max 25 MB / file` (testid `evidence-allowed-formats`).
  - `<input type="file" accept=".png,.jpg,…">` pre-filters the OS picker to those extensions.
  - `handleFiles` rejects drag-and-drops of unsupported extensions client-side (still re-checked server-side; this is defence in depth, not a security boundary).
- Constants `EVIDENCE_ALLOWED_EXTENSIONS` + `EVIDENCE_MAX_BYTES` in `frontend/src/lib/missions.ts` keep a single source of truth client-side. They are a manual mirror of `app/services/evidence.py:ALLOWED_EXTS` + `MAX_BYTES`, cross-referenced via comments so the next bump touches both files.
Fixed (post-M7 UX feedback — executed_at override editable in any timezone)
- Time portion of the `executed_at` override was un-editable in non-UTC timezones (`frontend/src/pages/MissionTestPage.tsx:RedZone`): the naive `new Date(executedAt).toISOString().slice(0, 16)` round-trip on every keystroke silently shifted the hour by the local TZ offset, snapping the time field back to UTC each render. The date could be changed (offset shifts both source and target by the same amount), but the hour couldn't stick.
- Fix: keep the local state in `YYYY-MM-DDTHH:MM` form (`executedAtLocal`) and only convert to/from UTC ISO at the boundaries: initial sync from server (`isoToLocalInputValue`) and submit (`localInputValueToIso`).
- Also tightened the `useEffect` reset on both Red and Blue zones to depend on `test.id` instead of the whole `test` object so a polling refetch (every 15 s) no longer wipes an in-progress edit. The 15 s activity poll returns a fresh object reference even when the row's content is unchanged.
Fixed (post-M7 review pass — spec-reviewer + code-reviewer)
- Idempotent transition leaked false success to a wrong-side user (`backend/app/services/mission_tests.py:570`): a blue-only viewer POSTing `target_state="executed"` while the test was already executed got a 200 idempotent response, falsely advertising that they held `mission.write_red_fields`. Reordered the gate so the side-perm check runs before the idempotency short-circuit, with a new `_IDEMPOTENT_SIDE` table that asks "which side originally produced this state?" and re-asserts that perm even on no-op replays. Test `test_idempotent_transition_still_checks_side_perm`.
- Cross-mission evidence access not pinned by a test (`backend/tests/test_mission_tests.py:test_evidence_member_of_other_mission_gets_404`): added explicit coverage that a user who is a blue member of mission B sees 404 on an evidence row attached to mission A. The chain walk in `_resolve_evidence_chain` already enforced this, but the regression test was missing.
- `shutil.move` swapped for `os.replace` (`backend/app/services/evidence.py:240`): `os.replace` is the documented atomic-rename primitive on POSIX and Windows when src/dst share a volume, and our tmpfile is always staged inside the destination directory, so the guarantee holds. Removes the implicit copy+remove fallback from `shutil.move` that would silently break atomicity on a cross-fs `EVIDENCE_DIR`.
- SHA256 path component now hex-validated (`backend/app/services/evidence.py:227`): the hash always comes from hashlib so it's already hex, but if a future caller ever passes pre-computed bytes we want to fail loudly rather than write to a path like `..something.evtx`. Cheap `re.fullmatch(r"[0-9a-f]{64}", sha256)` guard.
- `EVIDENCE_DIR` filesystem-root guard (`backend/app/services/evidence.py:_test_dir`): refuse to create per-mission directories when `EVIDENCE_DIR` resolves to `/` (or the equivalent on Windows). Stops a mis-configured operator from laying down content-addressed evidence files at the filesystem root.
- `/diag/reset` evidence cleanup now skips symlinks (`backend/app/api/diag.py:127`): switched from `is_dir()` to `is_symlink() or not is_dir()` so a hostile or accidental symlink inside `EVIDENCE_DIR` is unlinked rather than `rmtree`'d through.
- N+1 in `_to_detail_view` (`backend/app/services/mission_tests.py:_to_detail_view`): the last-actor and detection-level lookups each issued their own `s.get()`. Replaced with `select(columns)` queries that return just the needed scalar fields: same SQL count but fewer ORM round-trips, and every PUT/transition exercises this path so it adds up.
- Mission detail row `onClick` removed in favour of the wrapped `Link` (`frontend/src/pages/MissionDetailPage.tsx:684`): the `tr onClick` + nested `Link` with `stopPropagation` worked but was fragile to accessibility tooling. The link on the test name + the explicit hover class is enough.
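The `os.replace` and hex-guard items above combine into a small write path. The following is a sketch only: `evidence_path` and `store_evidence` are hypothetical names, and the real service streams uploads rather than taking a `bytes` payload.

```python
import hashlib
import os
import re
import tempfile
from pathlib import Path


def evidence_path(dest_dir, sha256, ext):
    """Build the content-addressed path, refusing anything that is not hex."""
    if not re.fullmatch(r"[0-9a-f]{64}", sha256):
        raise ValueError("sha256 must be 64 lowercase hex characters")
    return Path(dest_dir) / f"{sha256}{ext}"


def store_evidence(dest_dir, payload, ext):
    """Write payload to <dest_dir>/<sha256><ext> via an atomic rename."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    final = evidence_path(dest_dir, hashlib.sha256(payload).hexdigest(), ext)
    # Stage the tmpfile inside the destination directory so that source and
    # target share a filesystem, which is what makes os.replace atomic.
    fd, tmp_name = tempfile.mkstemp(dir=dest_dir, prefix=".upload-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.replace(tmp_name, final)
    except Exception:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return final
```

Because the file name is derived from the content, storing the same bytes twice lands on the same path and simply overwrites it with identical data.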
Added — M7 (Red & blue execution on a mission test)
- Per-mission-test write API (`app/api/missions.py` + `app/services/mission_tests.py`):
  - `GET /missions/{id}/tests/{test_id}`: full detail view with snapshot, state, red/blue fields, MITRE tags, evidence list, last-actor metadata.
  - `PUT /missions/{id}/tests/{test_id}`: patch any subset of `red_command` / `red_output` / `red_comment_md` / `blue_comment_md` / `detection_level_id` / `executed_at` / `executed_at_overridden`. The service classifies each touched field as red-side or blue-side and rejects with 403 if the caller lacks the matching perm. `executed_at*` only writable when the test sits in `executed` or `reviewed_by_blue`.
  - `POST /missions/{id}/tests/{test_id}/transition`: drives the state machine `pending ↔ skipped/blocked` + `pending → executed → reviewed_by_blue` (allows undo back to `pending`). Side-aware perm gating: `pending→executed` and `executed→pending` require `write_red_fields`; `executed↔reviewed_by_blue` requires `write_blue_fields`; `pending↔skipped/blocked` accepts either side. Transitioning into `executed` stamps `executed_at=now()` and clears the override; transitioning out (to `pending`) wipes the timestamp.
  - `GET /missions/{id}/activity?since=<ISO>`: returns mission_tests whose `updated_at > since`, freshest first. Drives the SPA's 15-second polling badge. Response includes `server_time` so the client can chain calls without clock drift.
- Evidence storage pipeline (`app/services/evidence.py` + `app/api/evidence.py`):
  - `POST /missions/{id}/tests/{test_id}/evidence` (multipart, gated on `mission.write_blue_fields`): streams the upload into a tmpfile next to the final location, hashing chunk-by-chunk and aborting at the 25 MB cap. Validates extension (whitelist: png/jpg/jpeg/pdf/txt/log/json/csv/evtx/zip) and MIME (permissive allowlist + `application/octet-stream` fallback for `.evtx`). Content-addressed storage: `${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>`; re-uploading byte-identical content reuses the file on disk and inserts a fresh row.
  - `GET /evidence/{id}`: JSON metadata view; `?download=true` switches to `send_file` with the original filename in `Content-Disposition` and the SHA256 as the ETag.
  - `DELETE /evidence/{id}`: soft delete (only flips `deleted_at`; physical purge lands in M12).
  - All three routes are membership-aware via the same chain walk (`evidence → test → scenario → mission`), collapsing "not found" / "not visible" into 404 to prevent existence leaks.
- Activity tracking column (`backend/alembic/versions/20260514_1000_91a4e7c6d2f3_m7_mission_test_last_actor.py`): added `mission_tests.last_actor_id` (FK `users.id` ON DELETE SET NULL) + `ix_mission_tests_updated_at` to support the polling endpoint. Every red/blue write or transition stamps the actor so the "modified by X Ns ago" indicator can resolve a human label.
- Detection-level seed + read (`app/services/detection_levels.py` + `app/api/detection_levels.py`):
  - 4 default rows seeded at boot (`detected_blocked` / `detected_alert` / `logged_only` / `not_detected`), colored on the design-system accent palette. The seed is idempotent and never mutates existing rows; new keys added to `DEFAULT_LEVELS` in future releases surface on next boot.
  - `GET /detection-levels` (gated on `detection_level.read`) returns the catalogue ordered by position. CRUD is M8's territory.
- Per-test page (`frontend/src/pages/MissionTestPage.tsx`): two-zone layout with the red border on the red half (command, output, markdown comment, mark-executed button, override toggle) and the cyan border on the blue half (detection-level select, comment, drag-and-drop evidence dropzone). Per-field disable based on `mission.write_red_fields` / `mission.write_blue_fields`; server is the ultimate arbiter so the UI is purely advisory. The "Last touched Xs ago by Y" badge polls `/activity` every 15 s while the document is visible.
- Mission detail page wires through to the per-test page (`frontend/src/pages/MissionDetailPage.tsx`): every row in the Tests tab is now clickable (cursor + hover state) and links to `/missions/<id>/tests/<test_id>`. The route is registered in `App.tsx` behind `RequireAuth`.
- TanStack query keys (`frontend/src/lib/missions.ts`): added `missionTestKeys.detail()` / `.activity()` / `.detectionLevels()` so the per-test page invalidations stay surgical (don't blow away the whole missions list).
- `/diag/reset` extended (`app/api/diag.py`): test mode now wipes `${EVIDENCE_DIR}/*` so e2e uploads don't accumulate across runs. Detection levels are preserved (reference data, not catalogue) and the seed is re-run as a safety net.
- Tests:
  - `backend/tests/test_mission_tests.py`: 25 pytest tests covering: detection-level seed + perm gating; red/blue field-level perms (red user blocked on blue fields and vice-versa); mark-executed stamps `executed_at`; override gating (forbidden while pending, blue-side blocked); state-machine matrix + side perm refinement; membership 404 vs admin bypass; evidence 24 MB ok / 26 MB rejected; SHA256 verification; MIME/extension whitelist; soft-delete hides bytes from detail view; activity polling with `since=` URL-encoded; future `since` returns empty.
  - `e2e/tests/m7-execution.spec.ts`: 5 Playwright tests against the live stack: red-only/blue-only API gating, mark-executed + reviewed_by_blue side enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save + transition, non-member sees the 404 alert instead of mission content. `afterAll` restores the stable admin and re-syncs MITRE.
- HomePage: hero + roadmap card bumped to
M7 — Red & blue execution on a mission test (done). Next: M8.
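The "hash chunk-by-chunk and abort at the 25 MB cap" step of the evidence pipeline in this M7 section can be illustrated as below. It is a hedged sketch: `hash_stream`, `UploadTooLarge`, and the 64 KiB chunk size are assumptions, and the real code reads from a multipart upload rather than a generic file object.

```python
import hashlib

MAX_BYTES = 25 * 1024 * 1024  # 25 MB cap, as stated in the entry
CHUNK_SIZE = 64 * 1024        # assumed chunk size


class UploadTooLarge(Exception):
    """Raised as soon as the running total exceeds the cap."""


def hash_stream(src, dst, max_bytes=MAX_BYTES):
    """Copy src to dst chunk by chunk; return (sha256_hex, total_bytes)."""
    digest = hashlib.sha256()
    total = 0
    while chunk := src.read(CHUNK_SIZE):
        total += len(chunk)
        if total > max_bytes:
            # Abort early: the oversized chunk is never written out.
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        digest.update(chunk)
        dst.write(chunk)
    return digest.hexdigest(), total
```

Hashing while copying means the digest is ready the moment the upload ends, so the content-addressed file name is known without a second pass over the data.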
Fixed (post-M6 SPA — mission detail page was read-only)
- Mission detail page couldn't edit metadata, append scenarios, or change members (`frontend/src/pages/MissionDetailPage.tsx`): the M6 SPA shipped the 3-step creation wizard but no edit affordance on the detail page, even though the backend already exposed `PUT /missions/{id}`, `POST /missions/{id}/scenarios`, and `PUT /missions/{id}/members`. Added three modals gated by `is_admin || mission.update`:
  - Edit metadata (header button, opens a 3xl modal): name / client_target / dates / description_md, full inline validation (empty name, inverted dates) mirroring the wizard's step 1.
  - Add scenarios (in the Tests tab): scenario picker reusing the wizard step-2 visual, calls `POST /missions/{id}/scenarios` which appends snapshots at `current_max_position + 1`. The footer line tells the user how many tests will be appended.
  - Edit members (in the Members tab): roster + red/blue toggles, calls `PUT /missions/{id}/members` (full-set replace); same UX as the wizard step 3, pre-populated with the current member set.
- Detail page now imports `useAuth` to compute `canEdit` once and reuses it across all three buttons.
- E2E spec extended: new test `SPA — detail page edits metadata, appends scenarios, edits members` exercises the three modals end-to-end against a pre-seeded mission. Suite is now 44 Playwright tests (6 in M6).
Fixed (post-M6 review pass — spec-reviewer + code-reviewer)
- SPA cache invalidation only refreshed the empty-filter list (`frontend/src/lib/missions.ts:136`): `missionKeys.list()` returns `['missions','list',{}]`. TanStack v5's `invalidateQueries({queryKey})` is prefix-based, but `{}` is treated as an atomic final element, so create / transition / delete called with that key only invalidated the exact empty-filter list, leaving any filtered variant stale until manual refetch. Added `missionKeys.listPrefix()` returning `['missions','list']` and switched all three mutation `onSuccess` paths to it.
- Snapshot lacked the per-scenario advisory lock (`backend/app/services/missions.py:467`): a concurrent `PUT /scenario-templates/{id}/tests` (M5 reorder, which deletes-then-reinserts join rows) running while `_snapshot_scenarios` walked `sc.tests` could freeze a torn snapshot: `selectinload` re-queries under READ COMMITTED, so a partial view was possible. Added `_lock_scenario_ids_for_snapshot` that acquires the same `pg_advisory_xact_lock` key used by `set_scenario_tests` (blake2b digest of the scenario UUID, sorted to avoid deadlocks). Snapshot and reorder now serialise per scenario.
- Transition endpoint leaked its body shape via 400 before the perm gate (`backend/app/api/missions.py:441`): a user without `mission.update` or `mission.archive` POSTing `{"status":"x"}` got a Pydantic 400 instead of 403. Added `@require_perm("mission.update", "mission.archive")` so the gate fires before the parse; the inner refinement still enforces the per-target perm. Test `test_transition_perm_gate_runs_before_payload_parse`.
- LIKE wildcards in user-typed search were honoured as SQL wildcards (`backend/app/services/missions.py:632,637`): `?q=%` matched every mission. Added `_escape_like` that pre-escapes `%`, `_`, `\` and a matching `escape='\\'` argument on every `.like(...)` call. Test `test_search_treats_wildcards_as_literals`.
- Counts ignored soft-deleted mission children (`backend/app/services/missions.py:587,597`): `tests_count` and the detail view summed `len(sc.tests)` without filtering `MissionTest.deleted_at`. Harmless today (M6 doesn't soft-delete mission tests), but would drift silently once M7+ surfaces `state=skipped/blocked`. Added the filter in both `_to_list_item` and `_scenario_views`.
- `/users/roster` was unordered (`backend/app/api/users.py:73`): the wizard's member list shuffled rows on every refetch. Sorted by `email` for predictable rendering + stable e2e selectors.
- Frontend transition button accent collapsed `in_progress` and `completed` into one colour (`frontend/src/pages/MissionDetailPage.tsx:97`): both rendered cyan, so the status legend in the list didn't match the transition button. Added a `TRANSITION_BUTTON_ACCENT` map mirroring `MISSION_STATUS_ACCENT` (cyan/orange/green/teal).
- Soft-deleted source scenario was a silent foot-gun: `_load_scenario_templates_for_snapshot` already rejected it, but no test pinned the behaviour. Added `test_create_mission_rejects_soft_deleted_scenario` so future refactors can't regress to "freeze a tombstoned scenario into a fresh mission".
- E2E wizard assertion used `getByRole('button', { name: /In Progress/i })` (`e2e/tests/m6-missions.spec.ts:287`): the accessible name is `→ In Progress` and the arrow Unicode is brittle. Switched to `getByTestId('mission-transition-in_progress')`.
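The LIKE-escaping fix in the list above is easy to demonstrate end to end. The sketch below uses the standard-library `sqlite3` module purely for illustration (the project runs on Postgres through an ORM); `escape_like`, `search`, and the sample rows are hypothetical.

```python
import sqlite3


def escape_like(term, escape="\\"):
    """Neutralise LIKE metacharacters so user input matches literally."""
    # Escape the escape character first, otherwise it would be doubled twice.
    return (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def search(conn, q):
    """Substring search on mission names with wildcards treated as text."""
    pattern = f"%{escape_like(q)}%"
    rows = conn.execute(
        "SELECT name FROM missions WHERE name LIKE ? ESCAPE '\\' ORDER BY name",
        (pattern,),
    )
    return [row[0] for row in rows]


def demo_db():
    """In-memory table with one name per metacharacter case."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE missions (name TEXT)")
    conn.executemany(
        "INSERT INTO missions VALUES (?)",
        [("100% coverage",), ("alpha",), ("a_b",)],
    )
    return conn
```

With the escape in place, a query of `%` only finds names that contain a literal percent sign instead of every row.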
Added — M6 (Missions & snapshot)
- CRUD `missions` (`app/services/missions.py` + `app/api/missions.py`):
  - Fields: name, client_target, date_start, date_end, status (`draft` / `in_progress` / `completed` / `archived`), description (markdown), visibility_mode (frozen to `whitebox` in v1).
  - On creation/append, the service snapshots the selected `scenario_templates` and all their `test_templates` into `mission_scenarios` / `mission_tests` (every template field, including OPSEC level, tags, expected IOCs, MITRE tags). The denormalised `mission_test_mitre_tags` table copies `external_id`, `name`, `url` so a later MITRE re-sync that drops the entry can't alter a mission's tags (spec §11).
  - `source_*_template_id` FKs survive template soft-deletes (`ON DELETE SET NULL`); the mission's frozen content is unaffected.
  - Membership visibility: non-admin viewers see only missions where they are a `mission_members` row. The service maps "not visible" → 404 (no existence leak via 403). Admins bypass via the `admin` group.
  - Status state machine: `draft → in_progress → completed → archived`; `archived → ∅`. The transition endpoint accepts the target status, validates the move, and rejects invalid jumps with 409. Idempotent (target=current) is a no-op 200.
  - Auto-creator-membership: a non-admin caller of `POST /missions` is auto-added as `role_hint='red'` if not already in the `members[]` payload, so they retain visibility on the mission they just created.
  - REST: `GET/POST /missions`, `GET/PUT/DELETE /missions/{id}`, `POST /missions/{id}/scenarios` (append snapshots at the end), `PUT /missions/{id}/members` (replace set), `POST /missions/{id}/transition`.
  - Filters on list: `q` (LIKE on name/description), `status`, `client` (LIKE on client_target). `include_deleted=true` is admin-only (403 otherwise).
- `GET /users/roster` (`app/api/users.py`): a deliberately minimal listing (`id`, `email`, `display_name` of active users only) accessible to any holder of `user.read`, `mission.create`, or `mission.update`. Lets a non-admin red teamer populate the wizard's member picker without exposing the admin-grade `/users` endpoint (which leaks `is_admin`, `is_active`, group memberships).
- Frontend:
  - `lib/missions.ts`: typed client + queryKey factory + status accent map + filter query-string builder.
  - `pages/MissionsListPage.tsx`: list cards (one per mission) with status accent, scenario/test/member counts, date range, plus filters (q, client, status).
  - `pages/MissionsCreatePage.tsx`: 3-step wizard: metadata → scenario picker → member roster (red/blue toggles + auto-include the non-admin creator). Submits via `POST /missions` and redirects to the detail page.
  - `pages/MissionDetailPage.tsx`: header with transition buttons (only the legal next states are rendered), soft-delete with confirm prompt, and 4 tabs: Tests (table of snapshotted tests with MITRE tags, OPSEC, state), Members (role-coloured pills), Synthesis (placeholder for M10), Export (placeholder for M11).
  - Nav adds Missions link visible to anyone with `mission.read` or admin.
- `/diag/reset` truncates the mission tables before the template tables: `mission_scenarios.source_scenario_template_id` and `mission_tests.source_test_template_id` are `ON DELETE SET NULL`, so wiping missions first avoids the round-trip through the null-update path.
- Testing:
  - `backend/tests/test_missions.py`: 22 pytest covering snapshot fidelity (rename source template after snapshot → mission unchanged), MITRE tag propagation, membership-based 404, perm gating (create vs read vs archive), status transition chain + invalid jumps (409), member set replace + role-hint validation, scenario append at correct position, soft-delete, partial metadata update, inverted-date rejection, admin-only `include_deleted`.
  - `e2e/tests/m6-missions.spec.ts`: 5 Playwright (snapshot freezing, membership visibility for non-admin red, status transition + 409, SPA wizard end-to-end, SPA list + status filter).
  - `tasks/testing-m6.md`.
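The mission status state machine from this M6 section reduces to a lookup table. A minimal sketch, assuming the chain is strictly linear (no skipping from `draft` straight to `archived`) and using hypothetical names `LEGAL_NEXT`, `InvalidTransition`, and `transition`:

```python
# Each status maps to the set of statuses it may move to next.
LEGAL_NEXT = {
    "draft": {"in_progress"},
    "in_progress": {"completed"},
    "completed": {"archived"},
    "archived": set(),  # terminal: archived leads nowhere
}


class InvalidTransition(Exception):
    """An API layer would translate this into HTTP 409."""


def transition(current, target):
    """Return the new status, or raise if the move is not allowed."""
    if target not in LEGAL_NEXT:
        raise ValueError(f"unknown status: {target!r}")
    if target == current:
        return current  # idempotent replay: no-op
    if target not in LEGAL_NEXT[current]:
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target
```

The same table can drive the detail page header, since `LEGAL_NEXT[status]` is exactly the set of transition buttons worth rendering.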
Added — M5 (Test & scenario templates)
- CRUD `test_templates` (`app/services/test_templates.py` + `app/api/test_templates.py`):
  - Fields: name, description, objective, procedure (markdown), prerequisites (markdown), expected result red, expected detection blue, OPSEC level (`low` / `medium` / `high`), free tags (`TEXT[]`), expected IOCs (`TEXT[]`).
  - Polymorphic MITRE tag set (`(kind, external_id)` ↔ exactly one of `tactic_id` / `technique_id` / `subtechnique_id`). The wire payload uses ATT&CK external IDs; the server resolves them to UUIDs.
  - Filters: `q` (LIKE on name/description), `tactic` / `technique` / `subtechnique` (joined via subquery on the polymorphic tag table), `opsec`, `tag` (array contains).
  - REST: `GET /test-templates`, `GET /test-templates/{id}`, `POST /test-templates`, `PUT /test-templates/{id}` (partial, with explicit `_UNSET` sentinel so omitted fields stay untouched), `DELETE /test-templates/{id}` (soft).
- CRUD `scenario_templates` (`app/services/scenario_templates.py` + `app/api/scenario_templates.py`):
  - Ordered list of test_templates with `position` (UNIQUE `scenario_template_id, position`).
  - Reorder via full replace: `PUT /scenario-templates/{id}/tests` deletes the join rows and re-inserts at positions `0..N-1`, a clean atomic op that respects the UNIQUE constraint without a 2-phase position shuffle.
  - The same test can appear multiple times (chained operations).
  - REST: `GET/POST/PATCH` (metadata) / `DELETE` (soft) on `/scenario-templates`.
- Frontend:
  - `lib/templates.ts`: typed client + queryKey factory.
  - `pages/AdminTestsPage.tsx`: list + filters (q, tactic, opsec, tag) + modal with full field set + embedded `<MitreTagPicker>` for tags.
  - `pages/AdminScenariosPage.tsx`: list + modal with @dnd-kit/sortable vertical drag-and-drop on the ordered test list. New deps: `@dnd-kit/core`, `@dnd-kit/sortable`, `@dnd-kit/utilities`.
  - `components/MarkdownField.tsx`: lean textarea with markdown hint (no heavy editor dep; rendering happens at display time in M7).
  - Nav adds Tests and Scenarios links (admin-gated).
- `/diag/reset` truncates the 4 new tables before the MITRE block: the `scenario_template_tests.test_template_id` FK is `ON DELETE RESTRICT`, so the order matters.
- Testing:
  - `backend/tests/test_templates.py`: 19 pytest (create/list/filter by tactic+opsec+tag, MITRE tag resolution + replacement on update, soft-delete, perm gating, scenario create+reorder+delete, soft-deleted test linking semantics).
  - `e2e/tests/m5-templates.spec.ts`: 4 Playwright (API CRUD round-trip, scenario reorder, SPA list + opsec filter, SPA scenario list rendering with ordered tests).
  - `tasks/testing-m5.md`.
Fixed (M5 implementation)
- `LogRecord` key collision: `log.info(..., extra={"name": ...})` raises `KeyError("Attempt to overwrite 'name' in LogRecord")` because `name` is reserved by Python's stdlib logging. Renamed to `template_name`.
- React `currentTarget` null in deferred state updaters: `onChange={(e) => setX((prev) => ({ ...prev, q: e.currentTarget.value }))}` blanked the page on the first user input because `currentTarget` is cleared after the listener bubble ends, before React invokes the updater. Switched all M5 handlers to `e.target.value`, which persists on the synthetic event.
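The `LogRecord` collision from the first item is reproducible with the standard library alone. The logger name and the two helper functions below are made up for the demonstration; the `extra` behaviour is stock `logging`.

```python
import logging

log = logging.getLogger("templates-demo")  # hypothetical logger name
log.setLevel(logging.INFO)                 # the record is only built if enabled
log.addHandler(logging.NullHandler())
log.propagate = False                      # keep the demo silent


def log_created_bad(template_name):
    # "name" is already an attribute of every LogRecord (the logger's name),
    # so the stdlib refuses to overwrite it and raises KeyError.
    log.info("template created", extra={"name": template_name})


def log_created_ok(template_name):
    # A non-reserved key is attached to the record without complaint.
    log.info("template created", extra={"template_name": template_name})
```

Note that the error only fires when the level is enabled, which is why this kind of bug can hide behind a quieter log level in another environment.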
Fixed (post-M5 — scenario reorder 500 + cross-worker lock correctness)
- `PUT /scenario-templates/{id}/tests` returned 500 (`backend/app/services/scenario_templates.py:218`): the two-argument form `pg_advisory_xact_lock(:n, :m)` failed with `function pg_advisory_xact_lock(smallint, bigint) does not exist`. Postgres only provides `(int4, int4)` and `(bigint)` overloads; psycopg promoted `m = hash(uuid) & 0xFFFFFFFF` (up to 2^32-1) to bigint, and no overload matches. Switched to the single-argument bigint form with `CAST(:key AS bigint)`.
- Cross-worker lock was a no-op (same site): Python's built-in `hash()` is randomised per process via `PYTHONHASHSEED`, so each gunicorn worker computed a different key for the same `scenario_id`, and concurrent reorders on different workers acquired independent locks, defeating the serialisation. Replaced with `blake2b(scenario_id.bytes, digest_size=8)` interpreted as a signed int64. Stable, deterministic, fits in `bigint`.
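A minimal sketch of the stable lock-key derivation described above. The function name, the big-endian byte order, and the SQL constant are assumptions made for illustration; the entry only specifies `blake2b(scenario_id.bytes, digest_size=8)` read as a signed int64 and passed through `CAST(:key AS bigint)`.

```python
import hashlib
import uuid

# Single-argument form, as described in the entry above.
LOCK_SQL = "SELECT pg_advisory_xact_lock(CAST(:key AS bigint))"


def advisory_lock_key(scenario_id: uuid.UUID) -> int:
    """Derive a process-independent signed 64-bit key from a UUID.

    Hypothetical helper: unlike hash(), blake2b gives the same digest in
    every worker process, so all workers contend on the same lock.
    """
    digest = hashlib.blake2b(scenario_id.bytes, digest_size=8).digest()
    # Byte order is an assumption; any fixed choice works as long as
    # every worker uses the same one.
    return int.from_bytes(digest, byteorder="big", signed=True)
```

The resulting integer always lies in the Postgres `bigint` range, so it can be bound as `:key` without hitting the overload-resolution error mentioned in the first bullet.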
Fixed (post-M5 UI — modal layout for the test-template editor)
- Modal box capped its width at `max-w-2xl` and had no vertical scroll (`frontend/src/components/ui/Modal.tsx`): opening "+ New test" rendered the 15-column MITRE matrix inside a 672 px frame with no height cap, so the matrix spilled to the right and the form bottom dropped below the viewport (buttons unreachable, no scroll). Added a `size` prop (default `2xl` for back-compat), `max-h-[calc(100vh-2rem)]` + `flex flex-col` on the dialog, and an inner `min-w-0 flex-1 overflow-y-auto` body so the header stays pinned while the form scrolls inside the modal.
- MITRE matrix overflow-x failed to scroll inside the modal body (`frontend/src/components/MitreTagPicker.tsx`): `overflow-x-auto` sat directly on the grid element, but the grid's intrinsic min-width (`15 × minmax(7rem, …)` = 1680 px) prevented it from shrinking below its content, so the grid spilled outside its parent instead of scrolling. Wrapped the grid in a dedicated `overflow-x-auto rounded min-w-0 w-full` scroller and added `min-w-0` to the picker root so the constraint propagates from the modal body. The grid now scrolls horizontally inside the modal.
- `grid gap-3` form layout in the test-template modal propagated `min-width: auto` (`frontend/src/pages/AdminTestsPage.tsx`): each grid item refused to shrink below its widest child, so the picker dragged the form (and the body) past the modal width. Switched the form to `flex flex-col gap-3 min-w-0`, which breaks the propagation while preserving vertical spacing.
- Test-template modal now uses `size="7xl"` and the scenario-template modal `size="3xl"` to match their content density.
Fixed (post-M5 review pass — spec-reviewer + code-reviewer)
- Filter combinator was OR, not AND (`backend/app/services/test_templates.py:235`): `?tactic=TA0002&technique=T1059` returned templates matching either facet instead of both. Pre-fix also pooled all three UUIDs into a shared `IN` list across three columns, theoretically allowing a UUID collision to match across kinds. Refactored to one IN-subquery per facet, ANDed together via repeated `WHERE id IN (...)`.
- Concurrent reorder race on `set_scenario_tests` (`backend/app/services/scenario_templates.py:207`): two parallel reorders on the same scenario could deadlock on the `UNIQUE(scenario_id, position)` constraint under READ COMMITTED. Added a per-scenario `pg_advisory_xact_lock(0x5C3, hash(scenario_id))` mirroring the M4 `/mitre/sync` pattern; different scenarios don't contend.
- N+1 on `_to_view` MITRE resolution (`backend/app/services/test_templates.py:160`): rendering K templates with ~T tags each fired up to K×T `s.get(...)` calls. Added `_to_views_batch` that pre-builds `{uuid → MitreRow}` maps in 3 queries and feeds them to per-template view assembly; `list_test_templates` now issues 4 queries total regardless of list size.
- Wire-level item length cap on `tags` / `expected_iocs` (`backend/app/api/test_templates.py:18-21`): the DB columns are `ARRAY(String(64))` / `ARRAY(String(255))` but the API layer only capped the list length, not the item strings, so long inputs hit the driver with `StringDataRightTruncation`. Added `Annotated[str, StringConstraints(...)]` types so the API returns 400 with a clean validation error.
- Front-end mutation cache hygiene (`frontend/src/pages/AdminScenariosPage.tsx:148-156`): the `updateMeta` and `setTests` mutations run sequentially in `submit()`; on partial failure (metadata saved but reorder failed) the cache stayed stale. Both mutations now invalidate in `onSettled`, so whatever step landed is reflected without a manual refresh.
- Backend vs front-end consistency on duplicate tests in a scenario (`frontend/src/pages/AdminScenariosPage.tsx:227-231`): the backend allows the same `test_template` to appear multiple times (chained ops; the UNIQUE constraint is `(scenario_id, position)`, not `(scenario_id, test_template_id)`), but the catalogue picker was filtering out already-picked items. Removed the filter; only soft-deleted tests are excluded now.
- Test coverage closure (`backend/tests/test_templates.py`): +4 pytest (tactic+technique AND-semantics, `extra="forbid"` rejection, empty `mitre_tags` explicit clear, 65-char tag length cap → 400). Total backend now 23 M5 tests + 58 elsewhere = 81 pass.
Added — M4 (MITRE ATT&CK Enterprise)
- STIX 2.1 parser + upsert (`app/services/mitre_seed.py`): stdlib-only (`urllib.request` + `hashlib`), pinned to Enterprise v19.0 (`enterprise-attack-19.0.json`, sha256 `df520ea0…`). Parses 25k+ STIX objects → 15 tactics, 222 techniques, 475 sub-techniques in ~1.1 s. Skips revoked + deprecated, resolves sub-technique parents via `relationship[subtechnique-of]` with a `T1003.001 → T1003` dotted-id fallback, copies kill-chain phases into the `mitre_technique_tactics` M2M.
- CLI: `flask metamorph seed-mitre [--source <path|url>] [--checksum-sha256 <hex>] [--skip-checksum]` (`app/cli.py`). `make seed-mitre` wraps it.
- REST endpoints (`app/api/mitre.py`):
  - `GET /api/v1/mitre/tactics`, `/mitre/techniques?tactic=…&q=…`, `/mitre/subtechniques?technique=…&q=…` (paginated, search on name/external_id).
  - `GET /api/v1/mitre/status` (last_sync, version, source_url, defaults).
  - `POST /api/v1/mitre/sync` (perm `mitre.sync`): re-pull on demand.
- Persisted metadata in `settings`: `mitre_last_sync`, `mitre_version`, `mitre_source_url`.
- Compose volume `metamorph_mitre` mounted at `/data/mitre/` in the api container: caches the downloaded bundle across restarts. Owned by `metamorph:metamorph`.
- Frontend:
  - `<MitreTagPicker>` component: flat ATT&CK matrix matching `attack.mitre.org/#`, full-bleed beyond `max-w-page`, 15 equal-width columns via `grid-template-columns: repeat(N, minmax(7rem, 1fr))`, sans-serif 12px, name-only cells (external_id surfaces on hover via `title` and in selection chips), `▸` / `▾` chevron expands sub-techniques inline within the column, multi-select with chip-removal at the top. Returns `MitreTag[]` (`kind`, `id`, `external_id`, `name`), ready for M5 templates.
  - `/mitre` showcase page with status card, admin-gated Trigger sync button, the picker, and a JSON `<pre>` preview of the current selection.
  - Nav adds MITRE link for any logged-in user.
- Testing:
  - `backend/tests/test_mitre.py`: 12 pytest (parser, idempotence, checksum mismatch, persisted settings, endpoint variants, perm enforcement) using a hand-crafted minimal STIX bundle (no network in tests).
  - `e2e/tests/m4-mitre.spec.ts`: 6 Playwright against the live stack (calls `/mitre/sync` once in `beforeAll`).
  - `tasks/testing-m4.md`.
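The sub-technique parent resolution mentioned in the parser entry (explicit `subtechnique-of` relationship first, dotted-id fallback second) could look roughly like the sketch below. The function name and the shape of the `relationships` mapping are hypothetical; the actual parser in `app/services/mitre_seed.py` may be organised differently.

```python
from typing import Dict, Optional


def resolve_parent_external_id(
    sub_external_id: str,
    relationships: Dict[str, str],
) -> Optional[str]:
    """Return the parent technique id for a sub-technique, if any.

    `relationships` is assumed to map a sub-technique external id to the
    parent id taken from a `subtechnique-of` relationship object.
    """
    explicit = relationships.get(sub_external_id)
    if explicit is not None:
        return explicit
    # Fallback: "T1003.001" -> "T1003". Ids without a dot have no parent.
    head, sep, _tail = sub_external_id.partition(".")
    return head if sep else None
```

The fallback only matters for bundles where a relationship object is missing; with the pinned bundle the changelog reports 0 orphans.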
Fixed (post-M4 spec-review pass)
- Sync integrity guarantee: `seed_mitre()` now refuses a custom URL without either `expected_sha256` or an explicit `allow_unverified=true`. Closes a "typo in `mitre_source_url` setting routes the seed to attacker JSON" footgun. CLI surfaces this via `--checksum-sha256` / `--skip-checksum`; API via `{"source", "expected_sha256", "allow_unverified"}` body.
- `/diag/reset` consistency: now truncates the `mitre_*` tables alongside `settings` so `GET /mitre/status` and `GET /mitre/tactics` agree after a reset (previously: catalogue rows persisted, but `mitre_last_sync` got wiped → status lied).
- Spec drift §10 #4: amended "14 tactics" → "≥ 14 tactics (v19 ships 15)" to reflect MITRE v8+ reality.
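The integrity rule above (a custom source needs either a checksum or an explicit opt-out) can be sketched as follows. Everything here is an assumption for illustration: the placeholder default URL, the exception names, and the two helper functions are not the project's actual `seed_mitre()` internals.

```python
import hashlib
from typing import Optional

# Placeholder only; the real default source URL is not shown in this changelog.
DEFAULT_SOURCE = "https://example.invalid/enterprise-attack-19.0.json"


class UnverifiedSourceError(ValueError):
    """Custom source supplied without a checksum or an explicit opt-out."""


class ChecksumMismatch(ValueError):
    """Downloaded payload does not match the expected SHA-256."""


def check_source_policy(
    source: str,
    expected_sha256: Optional[str],
    allow_unverified: bool,
    default_source: str = DEFAULT_SOURCE,
) -> None:
    # The pinned default is trusted through its built-in checksum.
    if source == default_source:
        return
    if expected_sha256 is None and not allow_unverified:
        raise UnverifiedSourceError(
            "custom source requires expected_sha256 or allow_unverified"
        )


def verify_checksum(payload: bytes, expected_sha256: str) -> None:
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256.lower():
        raise ChecksumMismatch(f"sha256 mismatch: got {actual}")
```

Splitting the policy check from the digest check keeps the refusal cheap: a misconfigured source is rejected before any download happens.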
Fixed (post-M4 code-review pass)
- SSRF allowlist on `/mitre/sync`: host must be in `MITRE_ALLOWED_HOSTS` (defaults to `raw.githubusercontent.com`, comma-separated env override). Closes the "admin holding `mitre.sync` can pivot the api container at cloud metadata (`169.254.169.254`) or internal mirrors" vector. New `MitreSourceForbidden` exception → 400 with `source_forbidden` error code.
- Concurrent sync race: `seed_mitre()` now acquires `pg_advisory_xact_lock(hashtext('mitre.seed'))` at the top of the transaction so two `/mitre/sync` calls serialise cleanly across the `DELETE` + re-`INSERT` of `mitre_technique_tactics`.
- Typed sync contract end-to-end: Pydantic `SyncResultOut` on the backend (`app/api/mitre.py`) mirrored by a `MitreSyncResult` TS interface (`frontend/src/lib/mitre.ts`). The MitrePage mutation no longer uses an `as Record<string, unknown>` escape hatch.
- N+1 in dotted sub-technique fallback: pre-built `{external_id → id}` dict at function entry; was firing one extra SELECT per orphan (currently 0 with MITRE, but a latent footgun for partial bundles).
- `SETTING_VERSION` cleared explicitly when source != default: previously kept the stale pinned version after a custom-URL re-sync; now `_upsert_setting(..., None)` so `/mitre/status` doesn't lie.
- Internal error scrub on `/mitre/sync`: 500 responses no longer leak URLError / DB driver text via `str(e)`; the stack lands in JSON logs only.
- E2E pinned to exact MITRE v19 counts (15/222/475/0 orphans) for parser-regression detection; previously `>=` thresholds could mask "revoked tactics silently included".
- E2E uses `crypto.randomUUID()` instead of `Math.random()` for unique test emails.
- Test coverage for security guards: `file://` rejection, disallowed HTTPS host, custom-URL-without-sha refusal, dotted-id fallback, version-clearing semantics. 5 new pytest covering paths the spec-review demanded but no test enforced.
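A rough sketch of the host allowlist check described in the SSRF entry, using only the standard library. The helper names and the HTTPS-only rule are assumptions (the changelog mentions `file://` rejection and disallowed HTTPS hosts, but not how other schemes are handled); only `MITRE_ALLOWED_HOSTS`, its default, and `MitreSourceForbidden` come from the entries above.

```python
import os
from typing import FrozenSet, Mapping, Optional
from urllib.parse import urlsplit

DEFAULT_ALLOWED = "raw.githubusercontent.com"


class MitreSourceForbidden(Exception):
    """Raised when a sync source is outside the allowlist."""


def allowed_hosts(env: Optional[Mapping[str, str]] = None) -> FrozenSet[str]:
    # Comma-separated override, falling back to the documented default.
    source = os.environ if env is None else env
    raw = source.get("MITRE_ALLOWED_HOSTS", DEFAULT_ALLOWED)
    return frozenset(h.strip().lower() for h in raw.split(",") if h.strip())


def ensure_source_allowed(url: str, hosts: FrozenSet[str]) -> None:
    parts = urlsplit(url)
    # Assumption: remote sources must be HTTPS; file:// and http:// are refused.
    if parts.scheme != "https":
        raise MitreSourceForbidden(f"scheme not allowed: {parts.scheme!r}")
    host = (parts.hostname or "").lower()
    if host not in hosts:
        raise MitreSourceForbidden(f"host not allowed: {host!r}")
```

Matching on the parsed hostname rather than on a string prefix avoids bypasses such as userinfo tricks (`https://allowed.host@evil.host/`), since `urlsplit` reports the real host.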
Decisions (intentional)
- Bundle "embarqué" ("embedded") interpreted as seed-time download + named-volume cache, not "binary baked into the Docker image". Keeps the image ~150 MB, makes version bumps a constant edit, plays nicely with `make seed-mitre` re-runs. Air-gapped operators copy the file into the volume + pass `--source /data/mitre/<file>`.
- Read endpoints require authentication but no specific permission: MITRE data is public reference material, so there is no `mitre.read` perm. The status endpoint is similarly open (under `@require_auth`) to keep `/mitre/status` simple for the UI badge.
- No `requests` / `httpx` dep added: stdlib `urllib.request` is enough and avoids inflating the image.
Validated end-to-end (M4 DoD)
- `make clean && make up && make migrate && make seed-mitre` → 15 tactics / 222 techniques / 475 sub-techniques / 254 links / 0 orphans / ~1.1 s.
- `make test-api` → 58 pytest pass (1 health + 8 schema + 15 auth + 15 RBAC + 19 MITRE) in ~5 s.
- `make e2e` → 34 Playwright pass (8 M0 + 4 M1 + 8 M2 + 8 M3 + 6 M4) in ~18 s.
- Spec-reviewer PASS after fixes applied.
Added — M3 (RBAC: groups, permissions, users)
- Permission catalogue (`app/services/permissions_seed.py`): 31 atomic codes across 10 families (`user`, `group`, `invitation`, `test_template`, `scenario_template`, `mission`, `detection_level`, `setting`, `mitre.sync`). Seeded at boot and after `/setup` to handle a freshly truncated DB. Idempotent + additive on system groups (never removes a perm).
- Default group bindings: `admin` = all 31 codes; `redteam` = 8 (catalogue read + mission.{read,create,update,archive,write_red_fields} + detection_level.read); `blueteam` = 5 (catalogue read + mission.{read,write_blue_fields} + detection_level.read).
- Users admin service + API (`app/services/users.py`, `app/api/users.py`): list (q + is_active filter + pagination), get, patch (display_name/locale/is_active), soft-delete, set groups. Last-admin protection on update/delete/group-strip.
- Groups admin service + API (`app/services/groups.py`, `app/api/groups.py`): full CRUD with system-group protection (no rename, no delete), `PUT /groups/{id}/permissions` for the bindings. Admin system group's perm set is locked to "every perm" (preserves the bypass invariant).
- Permissions read-only API (`app/api/permissions.py`): `GET /permissions` returns the catalogue (admin or `group.read` holders).
- Frontend admin pages (`frontend/src/pages/Admin{Users,Groups,Invitations}Page.tsx`): list + edit modals using TanStack Query mutations, multi-select for perms grouped by family, copy-once invitation URL display.
- Frontend chrome (`Layout.tsx` + `RequireAdmin.tsx`): admin nav links shown only when `is_admin === true`; direct navigation to `/admin/*` by non-admins redirects to `/`. Server remains the arbiter.
- `/diag/reset` now clears the rate-limit counters so the Playwright suite can iterate without hitting `10/min/IP` budgets across spec files. Gated to non-prod environments only.
- Testing:
  - `tests/test_rbac.py`: 15 pytest integration tests (39 backend total).
  - `e2e/tests/m3-rbac.spec.ts`: 8 Playwright tests covering DoD §10 #2/#3 (28 e2e total).
  - `tasks/testing-m3.md`: manual + automated procedure.
- Frontend api helpers: `apiPatch`, `apiPut`, `apiDelete` added to `frontend/src/lib/api.ts`.
Fixed (post-M3 spec-review pass)
- Rate-limit scope clarified: `app/core/rate_limit.py` now enables the limiter for `APP_ENV in ("prod", "staging")` instead of `prod` only, since a public staging deployment without auth limits would be surprising. Dev/test stay unthrottled for Playwright ergonomics. Spec §6 NF-security applies to operator-facing deployments.
- Admin perm invariant: `set_group_permissions` refuses to alter the admin system group's perm set to anything other than the full catalogue (`SystemGroupProtected` → 409). The decorator bypass relies on `is_admin = "admin" in group_names`, but a future refactor could move to a perm-based check, so we keep the invariant.
- LogRecord field collision: `log.info("...", extra={"name": g.name})` raised `KeyError: "Attempt to overwrite 'name' in LogRecord"` because Python's logger reserves `name`. Renamed to `group_name`. Audited all other `extra=` payloads in `app/api/` + `app/services/` for the same trap.
Validated end-to-end (M3 DoD)
- `make clean && make up && make migrate` → boot logs show `metamorph.permissions.seeded {perms_created: 31, perms_total: 31, bindings: {admin: 31, redteam: 8, blueteam: 5}}`.
- `make test-api` → 39 pytest pass (1 health + 8 schema + 15 auth + 15 RBAC) in ~4 s.
- `make e2e` → 28 Playwright pass (8 M0 + 4 M1 + 8 M2 + 8 M3) in ~16 s.
- Spec-reviewer pass: PASS verdict, 2 minor fixes applied (above), 2 anticipations noted for M12/M14 (no current action).
Added — M2 (Auth, bootstrap, invitations)
- Crypto plumbing: `app.core.security` (Argon2id `time_cost=2 memory_cost=64MiB parallelism=2`, opaque-token SHA-256 helpers), `app.core.jwt_tokens` (HS256, claims `iss/sub/type/jti/iat/exp`, access 1h / refresh 30d).
- Auth services (`app.services.auth`): login, refresh with token rotation + reuse-detection chain revoke, logout (idempotent), change_password (forces logout-all).
- Invitation services (`app.services.invitations`): create, preview, accept, revoke. Token persisted only as SHA-256, default 7-day TTL.
- Bootstrap (`app.services.bootstrap` + `app.core.install_token`): seeds 3 system groups (`admin` / `redteam` / `blueteam`), mints a one-shot install token at first boot when `users` is empty, logs a banner with the raw token. CLI `flask --app app.cli metamorph print-install-token [--force]`.
- Auth middleware (`app.core.auth_decorators`): `@require_auth` populates `g.current_user`; `@require_perm("...")` checks atomic permissions; admin group bypasses the check (atomic perms land in M3).
- API endpoints:
  - `POST /api/v1/setup` (consume install token, create 1st admin) + `GET /api/v1/setup` (status).
  - `POST /api/v1/auth/login` + `POST /auth/refresh` + `POST /auth/logout` + `GET /auth/me` + `POST /auth/change-password`.
  - `POST /api/v1/invitations` (admin) + `GET /invitations` + `GET /invitations/preview/<token>` + `POST /invitations/accept/<token>` + `POST /invitations/<id>/revoke`.
  - `POST /api/v1/diag/reset` (test-only kill switch: wipes auth tables + mints fresh install token; only available in `dev` / `test`).
- Rate limiting (`flask-limiter`): 10/min/IP on `/auth/login`, `/auth/refresh`; 5/min on `/auth/change-password` and `/setup`; 10–20/min on invitation endpoints. Globally disabled when `APP_ENV=test`.
- Refresh cookie `metamorph_refresh`: HttpOnly + Secure + SameSite=Strict + Path=/api/v1/auth/.
- Frontend auth state (`frontend/src/lib/{api,auth}.ts`): access token in module memory, refresh in cookie, automatic 401-retry via `/auth/refresh` with reentrancy guard. `useAuth()` hook + `<RequireAuth>` route guard.
- Frontend pages: `/login`, `/setup`, `/register?token=…`, `/profile` (with change-password form), all in RTOps design. Protected layout: nav shows email + Logout when authenticated, Login + Setup links when not.
- Frontend deps: `@tanstack/react-query`, `react-router-dom`. Tanstack provider in `App.tsx` (will carry actual queries from M3+).
- Email validation (`app.api._validation.Email`): permissive RFC-shape regex that accepts internal TLDs (`.local`, `.corp`); `pydantic.EmailStr` was too strict for red-team labs.
- Testing:
  - `tests/test_auth_flow.py`: 15 pytest integration tests (24 backend total with M0/M1).
  - `e2e/tests/m2-auth.spec.ts`: 8 Playwright tests covering setup → login → me → invitation → register → 2nd login → RBAC 403 → refresh rotation → logout (20 e2e total).
  - `tasks/testing-m2.md`: manual + automated procedure.
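The "token persisted only as SHA-256" idea used for invitations and other opaque tokens can be illustrated with a small standard-library sketch. The function names and the 32-byte token length are assumptions; the actual helpers live in `app.core.security` and may differ.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_token(raw: str) -> str:
    """Return the hex SHA-256 of an opaque token (the only value stored)."""
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def mint_token() -> Tuple[str, str]:
    """Create a random token; return (raw, stored_hash).

    The raw value is shown to the user once (for example in an invitation
    URL) and never persisted. Token length is an assumption.
    """
    raw = secrets.token_urlsafe(32)
    return raw, hash_token(raw)


def token_matches(raw: str, stored_hash: str) -> bool:
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(hash_token(raw), stored_hash)
```

A plain SHA-256 is a reasonable fit here because the tokens are high-entropy random values; user passwords, by contrast, go through Argon2id as listed in the crypto plumbing entry.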
Fixed (post-M2 spec-review pass)
- Refresh cookie `Secure=True` unconditionally (`backend/app/api/auth.py`). Modern browsers treat `localhost` as a secure context, so dev/test still works. Closes the silent degradation found by the reviewer.
- `/auth/refresh` rate-limit lowered to 10/min/IP (`backend/app/api/auth.py`) to match spec §M2 ("10 req/min/IP on `/auth/*`").
- `/diag/reset` kept allowed in `dev` and `test` (a `make e2e` against a `make up` dev stack must be able to reset). Added a WARNING log when triggered in `dev` and a clear docstring; production envs (`prod` / `staging`) remain locked out.
Known scope-creep (intentional, not retracted)
- Rate-limits on `/setup` (5/min), `/invitations/preview` (20/min), `/invitations/accept` (10/min) and `/auth/change-password` (5/min) were added in M2 even though §M2 only mandated `/auth/*`. Defensible (these are abuse-attractor endpoints), and noted here so M14 doesn't double-spec them.
Added — M1 (DB schema & migrations)
- 26 tables + `alembic_version` covering auth/RBAC (8), MITRE (4), templates (4), missions (6), evidence (1), settings/detection-levels (2), notifications (1).
- SQLAlchemy 2.x declarative models with `Mapped[]` / `mapped_column()`, grouped under `backend/app/models/{auth,mitre,template,mission,evidence,setting,notification}.py`.
- Alembic init: `alembic.ini`, `alembic/env.py` reading `app.core.config.settings.database_url`, `alembic/script.py.mako`, naming convention `pk_/fk_/ck_/uq_/ix_` enforced via `MetaData(naming_convention=...)` on `app.db.base.Base`.
- Reusable mixins in `app.db.mixins`: `UuidPkMixin` (uuid4 server-side), `TimestampMixin` (`created_at` / `updated_at`, server-default + onupdate), `SoftDeleteMixin` (`deleted_at`, no auto-injected index; declared explicitly per table to avoid mixin-vs-class `__table_args__` clobbering).
- Postgres-specific features used: `JSONB` for `settings.value` and `notifications.payload`; native `Uuid` columns; partial indexes (`WHERE deleted_at IS NULL` on 9 tables; `WHERE read_at IS NULL` on `notifications`); CHECK constraints for status/state/opsec_level/mitre_kind enums; `exactly_one_mitre_fk` CHECK on `test_template_mitre_tags`.
- `mission_test_mitre_tags` deliberately denormalised (no FK to `mitre_*` tables): copies `mitre_external_id`, `mitre_name`, `mitre_url` at tag time so a later MITRE re-sync that drops an entry cannot purge a mission's tags. Companion `test_template_mitre_tags` keeps FKs since templates are editable. (Spec §11 risk addressed.)
- Backend `pyproject.toml` deps: SQLAlchemy ≥2, Alembic ≥1.13, psycopg[binary] ≥3.1.
- New Makefile targets: `migrate`, `migrate-down`, `migrate-revision MSG=…`, `migrate-status`. The Dockerfile now ships `alembic.ini` + `alembic/` so the api container can run migrations directly.
- Test stage in `backend/Dockerfile` (`--target test`): runtime image + dev extras + `tests/` dir. New `make test-api` target spins an ephemeral container against the live DB on the compose network. Backend tests no longer require any local Python toolchain.
- `tests/test_schema.py` (8 integration tests + the existing M0 health test = 9 total): expected tables, expected timestamp/soft-delete columns, partial-index presence, expected FK pairs, expected CHECK constraints, alembic-at-head, and a negative INSERT proving the `exactly_one_mitre_fk` CHECK fires.
- `tasks/testing-m1.md`: manual + automated verification procedure.
Fixed (post-M1 spec-review pass)
- Soft delete now consistent across snapshot-bearing tables: `mission_scenarios`, `mission_tests`, `mission_categories` gained `SoftDeleteMixin` + their `ix_<table>_active` partial index (M12 trash bin depends on this).
- `evidence_files` gained `TimestampMixin` (`created_at` / `updated_at`) on top of the domain `uploaded_at` (minimal audit everywhere, per M1 brief).
- `mission_members` gained `TimestampMixin`, replacing the bespoke `added_at` column.
- `scenario_template_tests` PK refactored to a UUID + `UNIQUE(scenario_template_id, position)` so the same test can appear at multiple positions in a scenario (chained operations).
- `SoftDeleteMixin.__table_args__` removed (silently clobbered by class `__table_args__`); each soft-delete table now declares `ix_<table>_active` explicitly. Documented in the mixin's docstring.
- `mission_test_mitre_tags` schema redesigned to denormalise MITRE labels (see "Added" entry above).
- Migration 0001 regenerated end-to-end after these fixes: `24765a5014b6` is the new HEAD.
Validated end-to-end (M1 DoD)
- `make clean && make up && make migrate` from an empty DB → 27 tables, 32 FK, 9 CHECK, 14 UQ, 12 partial indexes.
- `make test-api` → 9 pytest pass (1 health + 8 schema integration) in <1 s.
- `make e2e` → 12 Playwright pass (8 M0 smoke + 4 M1 db visibility) in 3 s.
Added (M1 visibility)
- New API endpoint `GET /api/v1/diag/db` exposes `alembic_revision` (short-hashable) and the public-schema `table_count`. Returns 503 with `{"reachable": false}` when Postgres is down.
- New `Database` card on the SPA home page consumes that endpoint, renders the revision short-hash and the count next to the existing `API` and `Roadmap` cards.
- Footer updated to `M0 bootstrap · M1 db schema`. Roadmap card now points to `M2 — Auth + JWT`.
- New e2e suite `e2e/tests/m1-db.spec.ts` (4 tests) covers the diag endpoint contract, the Database card rendering, and the footer/roadmap labels.
Added — M0 (bootstrap)
- Repo scaffolding: `.gitignore`, `.env.example`, `Makefile`, `docker-compose.yml`, `README.md`, `CHANGELOG.md`.
- `docker-compose.yml` with three services: `db` (postgres:16-alpine, no host port), `api` (Flask 3, port 8000), `front` (nginx serving the Vite bundle, port 80).
- Named volumes `metamorph_db` and `metamorph_evidence` for data persistence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout, `GET /api/v1/health` endpoint, multi-stage Dockerfile, `pyproject.toml` driven by `uv`.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps design tokens (`tasks/design.md`) translated into `tailwind.config.ts`, base UI primitives (`Card`, `Tag`, `SectionHeader`, `FlowNode`, `Button`), home page wired to `/api/v1/health`.
- Multi-stage frontend Dockerfile that builds the bundle and serves it via nginx, proxying `/api/*` to the api container.
- Pre-commit hook config: `ruff` for backend, `eslint` + `tsc --noEmit` for frontend.
Validated
- `docker compose config` parses (validated via `pyyaml` since Docker is not installed in the dev shell).
- Every env var referenced by the compose file is documented in `.env.example`.
- All Python source files parse cleanly (`ast.parse`).
- All TS/JSON config files parse cleanly.
Notes
- TLS termination is delegated to an external reverse proxy (per spec §6 NF-network). The compose stack exposes plain HTTP on `HOST_FRONT_PORT` (8080) and `HOST_API_PORT` (8000).
- The first-admin bootstrap token (M2) will be printed to the api container's stdout on first boot when the `users` table is empty.
- `tasks/spec.md` and `tasks/todo.md` remain authoritative; update them before changing scope.
Fixed (M0 DoD validation pass on real podman)
- FQDN image references in `docker-compose.yml`, `backend/Dockerfile`, `frontend/Dockerfile`. Podman on Fedora enforces `short-name-mode=enforcing` for pulls (no TTY ⇒ no prompt ⇒ failure). Replaced `postgres:16-alpine` / `python:3.12-slim` / `node:20-alpine` / `nginx:1.27-alpine` with their `docker.io/library/…` qualified equivalents. Docker accepts the same prefix transparently.
- `*.md` removed from `backend/.dockerignore` and `frontend/.dockerignore`: `pyproject.toml` declared `readme = "README.md"`, but the file was being filtered out of the build context, so `hatchling.build.build_wheel` raised `OSError: Readme file does not exist: README.md`. Also removed the `readme` field itself from `pyproject.toml` to decouple the build from the doc.
- `Card.tsx` type clash: `CardProps extends HTMLAttributes<HTMLDivElement>` redefined `title` as `ReactNode`, but the native `title` is `string`. `tsc -b` failed with TS2430 during `vite build`. Switched to `Omit<HTMLAttributes<HTMLDivElement>, 'title'>`.
- Explicit healthchecks added to compose `api` and `front`: podman-compose 1.x doesn't surface healthchecks declared only in the `Dockerfile` via `inspect`. Mirroring them in `docker-compose.yml` makes `make inspect-health` actually see `healthy` / `unhealthy` / `starting` on every engine.
- Suppressed `podman compose` external-provider banner via `PODMAN_COMPOSE_WARNING_LOGS=false` exported from the Makefile.
Validated end-to-end on podman 5.x (Fedora 43)
- `make up` → 3 containers, all 3 healthy after start_period.
- `make health` → `{"status":"ok","version":"0.1.0"}` via the front nginx proxy (port 8080) and direct API (port 8000).
- `make logs-api` → JSON-structured lines on stdout (`ts`, `level`, `logger`, `message`, custom fields).
- `make e2e` → 8/8 Playwright tests pass in 2.5 s. Reports: `e2e/playwright-report/index.html` (529 KB, self-contained) + `junit.xml` (`tests=8 failures=0 skipped=0 errors=0`).
Added (engine portability)
- Makefile auto-detects docker or podman at runtime and selects the matching compose driver (`docker compose`, `podman compose`, or legacy `podman-compose`). Override via `ENGINE=…` and/or `COMPOSE="…"`.
- New targets: `engine` (print detected runtime), `volumes` (list project-named volumes), `inspect-health` (health status of all 3 containers), `logs-api` (tail just the api), `health` (single curl probe). All engine-agnostic.
- `make help` now prints the active engine + compose driver in its footer.
- `tasks/testing-m0.md` and `README.md` rewritten to be engine-agnostic: raw `docker logs` / `docker volume ls` / `docker inspect` calls replaced with the new make targets.
Added (M0 testing)
- `e2e/` Playwright project with chromium, HTML + JUnit XML reporters, traces / screenshots / videos kept on retry. Reports land in `e2e/playwright-report/`.
- `e2e/tests/m0-smoke.spec.ts`: 8 smoke tests covering the front rendering, the API proxy, the design tokens, the absence of any runtime CDN traffic (spec §7), and the CORS contract.
- Makefile targets `e2e-install`, `e2e`, `e2e-report`, `e2e-up`, `wait-healthy`.
- `tasks/testing-m0.md`: step-by-step manual + automated verification procedure for M0.
- Convention added to `tasks/todo.md`: every milestone N delivers `tasks/testing-m<N>.md` + at least one `e2e/tests/m<N>-*.spec.ts`, and the spec-reviewer subagent runs before marking the milestone done.
Fixed (post-M0 spec-review pass)
- `.pre-commit-config.yaml` added at repo root: ruff + ruff-format on backend, eslint + tsc --noEmit + prettier --check on frontend, plus baseline whitespace/JSON/private-key checks. Documented `pre-commit install` in `README.md`.
- Self-hosted webfonts via `@fontsource/jetbrains-mono` and `@fontsource/ibm-plex-sans` (imported in `frontend/src/index.css`); dropped the Google Fonts `<link>` from `frontend/index.html` to honor spec §7 ("no runtime CDN").
- Refuse-to-boot guard in `backend/app/core/config.py`: when `APP_ENV != "dev"`, defaults / placeholders for `JWT_SECRET` and `POSTGRES_PASSWORD` raise at startup. New `APP_ENV` env var documented in `.env.example`, `README.md`, and `docker-compose.yml`.
- `make dev` now runs `dev-api` and `dev-front` in parallel via `make -j2` instead of just printing a hint.
- Removed dead `database_url` property from `Settings` (will be reintroduced in M1 with the SQLAlchemy/Alembic stack).
- Pinned Node engines to `>=20` in `frontend/package.json`.
- Reconciled M0 DoD wording in `tasks/todo.md` (HTTP via `HOST_FRONT_PORT`, with explicit note that prod TLS is external).
- Documented the `2xs` / `3xs` / `4xs` font-size aliases in `frontend/tailwind.config.ts` against the design.md §3 scale.
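The refuse-to-boot guard listed above could be sketched along these lines. The placeholder set, the function name, and the choice to default a missing `APP_ENV` to `dev` are all assumptions for illustration; the real check lives in `backend/app/core/config.py`.

```python
from collections.abc import Mapping

# Hypothetical placeholder values; the real defaults are defined in the
# project's .env.example and config module.
PLACEHOLDERS = frozenset({"", "changeme", "change-me"})
GUARDED_VARS = ("JWT_SECRET", "POSTGRES_PASSWORD")


def validate_secrets(env: Mapping) -> None:
    """Raise at startup if a non-dev environment still uses placeholders."""
    # Assumption: a missing APP_ENV is treated as "dev".
    if env.get("APP_ENV", "dev") == "dev":
        return
    offending = [name for name in GUARDED_VARS if env.get(name, "") in PLACEHOLDERS]
    if offending:
        raise RuntimeError(
            "refusing to boot with placeholder secrets: " + ", ".join(offending)
        )
```

Calling `validate_secrets(os.environ)` once in the app factory would make a misconfigured deployment fail immediately instead of running with a guessable JWT secret.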