Metamorph

Author	SHA1	Message	Date
Knacky	28b8855e88	feat(m7-amend2): implicit lifecycle — writes drive state, no workflow UI User: "Also remove the per-test workflow: when information is entered on the red-team side, it means the test has been executed and is therefore awaiting a blue-team review." Backend (update_mission_test_fields) - At the end of every PUT, inspect the touched-field set: - any red write on state in {pending, skipped, blocked} → state=executed + auto-stamp executed_at=now() if absent - any blue write on state=executed → state=reviewed_by_blue - /transition endpoint kept for back-fill/admin use, not called from UI. Frontend MissionTestPage - Removed the transition-buttons header block and the `transition` mutation. State pill stays as a passive indicator. - New labels: "Not started" / "Awaiting review" / "Reviewed" describe the implicit lifecycle, no longer exposing the state-machine concept. E2E - The SPA test that clicked `transition-executed` now verifies the implicit promotion: typing red fields and saving flips the pill from "Not started" → "Awaiting review", no button click required. Spec - §4 reword: "Cycle de vie implicite, piloté par les écritures" (implicit lifecycle, driven by writes) replaces the old "Workflow par test instance" (per-test-instance workflow) bullet. Tests - 3 new pytest: red_command-alone implicit execute + auto-stamp, blue write promotes executed→reviewed, blue write on pending no-op. - 142 pytest + 49 Playwright green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 16:09:26 +02:00
Knacky	9fc78e0832	feat(m7-amend): full-bleed scenario table with inline edit + docs Frontend half of the 2026-05-15 amendment (backend shipped in `447f152`). - `MissionScenarioTable` component: per-scenario <table> with 7 cols (Test | Procédure | Exécution | Source de log | Commentaires | Logs SIEM | Cyber Incident) + Actions cell. Read mode truncates; double-click toggles a row into edit mode where each cell becomes the right control. detection_level lives inside the Commentaires cell as a pill + select (no 8th column). - MissionDetailPage Tests tab uses the new component, lifts `editingTestId` so only one row across the whole mission is editable at a time. Esc reverts (prompt if dirty), double-click on a different row with a dirty draft also prompts. - Full-bleed escape via `calc(50% - 50vw)` (same recipe as the M4 MITRE picker). 7 dense columns breathe on wide screens, no horizontal scroll. - `draftDiff(test, draft)` returns `null` when nothing changed → no PUT on a no-op save. The diff carries only touched fields so the server's per-field perm gate stays clean. - Datetime semantics: both datetime-local inputs reuse the M7 verbatim recipe (`iso.slice(0, 16)` + `${local}:00Z`), zero TZ shift. Docs - tasks/testing-m7.md §3.0 documents the column matrix + edit workflow. - tasks/lessons.md captures the Pydantic ctx-serialisation pitfall, the naïve-datetime guard, the table-edit pattern. - CHANGELOG section moves "Frontend (in progress)" → "Frontend (shipped)" and details the diff. 49 Playwright tests still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:51:28 +02:00
Knacky	447f15213a	feat(m7): blue review fields + spec amendment + reviewer follow-ups User feedback after the M7 ship: blue team's Excel workflow had 5 extra fields we didn't capture. Per-test page also doesn't match their workflow — they need a tabular view, one table per scenario. Spec - tasks/spec.md amended (`revised: 2026-05-15`): §4 in-scope, §F6, §8 model bullet. §F6 now pins the column matrix, single-row-edit semantics, Esc-cancel, blur-confirm, and reconciles detection_level as a pill inside the Commentaires cell (no 8th column). - tasks/todo.md M7 section grew an "Amendement 2026-05-15" sub-block tracking backend ☑ and frontend ☐. Backend - Migration c2a8f4b1d6e9: 5 nullable columns on mission_tests (blue_log_source, blue_siem_logs, blue_incident_at, blue_incident_number, blue_incident_recipient_email). - _BLUE_FIELDS extended; update_mission_test_fields propagates each field; MissionTestDetailView + MissionTestView (the nested view in GET /missions/{id}) surface every annotation field, plus last_actor_*, updated_at, detection_level_key — O(1) batch lookup for detection-level keys and last-actor users keeps it scalable. - UpdateMissionTestPayload accepts each field with length caps (120/200_000/120/255). Reviewer follow-ups applied - blue_incident_at + executed_at now reject naïve datetimes (_ensure_aware_datetime) — Postgres would otherwise interpret them in the session TZ, defeating the M7 verbatim-time contract. - blue_incident_recipient_email goes through a permissive RFC-shape regex (_validate_email_shape) so internal/lab TLDs like .local / .corp / .test pass — Pydantic EmailStr is too strict (lessons.md M2 trap). - Project-wide: switched `e.errors()` to `e.errors(include_context=False, include_url=False)` because the AfterValidator-raised ValueError lands in ctx and Flask can't serialize it. 
Tests - 5 new pytest cases: blue user writes the 5 new fields, red user is individually 403'd on each, round-trip via GET, naïve datetime rejected, email shape validated (.local accepted, bad shape 400). - 138 pytest green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:45:18 +02:00
Knacky	5030f4bd83	docs(m7): backfill changelog + testing-m7 for the two post-merge UX fixes User feedback flagged that the doc didn't reflect the two hotfixes shipped after the M7 PR: - evidence whitelist surfaced in the dropzone + OS picker pre-filter - executed_at override fixed in non-UTC timezones (no more time-snap) Added a CHANGELOG entry per fix and a §3.5 in tasks/testing-m7.md walking through the timezone semantics of the datetime-local input. spec.md is left untouched — these are UX/implementation fixes, not contract changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 09:51:23 +02:00
Knacky	ed70458d8f	feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end: Backend - New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query. - `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels` read-only (CRUD lands in M8). - `mission_tests` service + `missions` API extension: - `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list - `PUT /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`) - `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate that fires before idempotency, `executed_at` auto-stamped on the way in - `GET /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge - `evidence` service + top-level `/evidence/<id>` API: - Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist - Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext> - Atomic `os.replace`, hex-validated SHA path component, root-dir guard - Membership-aware (404 on miss/forbidden, no existence leak) - `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and re-seeds detection levels as a safety net. Frontend - `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix. - `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output, comment, mark-executed + override toggle) and cyan border (detection-level select, comment, drag-and-drop evidence dropzone). Last-touched badge polls /activity every 15 s, gated on document.visibilityState. Per-field disable based on the user's red/blue perms (server stays the arbiter). 
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page. - `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth. - `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8. Tests - `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field gating, state-machine matrix incl. idempotent-side enforcement, executed_at override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide, activity polling with URL-encoded `since`, membership 404 vs admin bypass, cross-mission evidence access). - `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack (red-only/blue-only API gating, mark-executed + reviewed_by_blue side enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save + transition, non-member 404 message). afterAll restores stable admin and re-syncs MITRE. Docs - CHANGELOG.md: M7 section + post-M7 review-pass subsection. - README.md: status, feature blurb, roadmap, testing-m7 link. - tasks/testing-m7.md: manual + automated procedure with transition matrix and perm-gating table. - tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded query timestamps, perm-before-flush, atomic move, polling visibility gate). Test count: 133 pytest / 49 Playwright, all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 08:16:48 +02:00
Knacky	00b7557e30	feat(m6): missions + snapshot CRUD, membership visibility, status state machine Adds the mission layer that materialises template snapshots, plus the SPA list / 3-step wizard / detail page. Backend: - app/services/missions.py — create_mission snapshots scenarios, tests, MITRE tags in a 4-query write; list/get apply a non-admin membership filter that collapses to 404 (no existence leak); status state machine enforces draft → in_progress → completed → archived with archived as a sink; the non-admin creator is auto-added as role_hint='red' to retain visibility. - app/api/missions.py — 8 endpoints (list, get, create, update, add scenarios, set members, transition, soft-delete) with strict pydantic schemas. The transition endpoint splits the perm gate manually so archive requires mission.archive while other targets use mission.update. - app/api/users.py — new GET /users/roster returning (id, email, display_name) only, gated by user.read OR mission.create OR mission.update — lets non-admin wizard users see assignable peers without exposing the admin /users payload. - app/api/diag.py — /diag/reset truncates the mission_* tables before the template tables because the source_*_template_id FKs are ON DELETE SET NULL, which is cheaper to short-circuit by removing the children first. Frontend: - lib/missions.ts — typed client, queryKey factory, status accent map. - pages/MissionsListPage.tsx — list cards with status accent + filters (q, client, status). - pages/MissionsCreatePage.tsx — 3-step wizard (meta → scenarios → members) with member roster fed by /users/roster. - pages/MissionDetailPage.tsx — header + transition buttons (legal next states only) + Tests/Members/Synthesis/Export tabs. - Routes + nav entry (visible to anyone with mission.read or admin). 
Tests: - backend/tests/test_missions.py — 22 pytest covering snapshot fidelity, MITRE propagation, membership visibility, transition state machine, perm gating, member set replace, append scenarios, soft-delete, partial update, inverted-date rejection. - e2e/tests/m6-missions.spec.ts — 5 Playwright (snapshot freezing, non-admin visibility, status transitions + 409, SPA wizard end-to-end, list filter). Docs: - CHANGELOG, tasks/testing-m6.md, tasks/lessons.md (snapshot tradeoffs, membership=404 pattern, /diag/reset order, auto-creator add). - README + tasks/todo.md updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 15:07:32 +02:00
Knacky	ce4bd40551	fix(m5): post-review pass — AND filter, advisory lock, N+1, item caps, mutation cache Spec-reviewer + code-reviewer findings applied: Must-fix - Filter combinator AND-semantics: tactic+technique+subtechnique now intersect (one IN subquery per facet) instead of being pooled into one OR. Reviewers flagged both the wrong default semantics and the theoretical UUID-collision risk of pooling tactic/technique/sub UUIDs into a shared list across three columns. - Front-end mutation cache hygiene: updateMeta + setTests both `onSettled: invalidate` so a partial failure leaves the cache consistent. Should-fix - Per-scenario pg_advisory_xact_lock on set_scenario_tests — serialises concurrent reorders, mirrors M4 /mitre/sync pattern. - Backend/front consistency on duplicate tests in a scenario: the UNIQUE(scenario_id, position) constraint already allows the same test_template multiple times (chained ops), so the catalogue picker no longer excludes already-picked items. Nice-to-have - N+1 eradicated in test_template view rendering: _to_views_batch builds {uuid → MitreRow} maps in 3 queries up-front; list endpoint now issues 4 queries total regardless of list size. - Wire-level item length caps on tags (64) and expected_iocs (255) via Annotated[str, StringConstraints(...)] — returns 400 instead of bubbling up StringDataRightTruncation. - 4 new pytest covering the AND-filter, extra="forbid" rejection, empty mitre_tags clearing, and the 65-char tag cap. Total now 81 pytest + 38 e2e pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 20:05:00 +02:00
Knacky	a559823386	test(m5): playwright spec + docs (CHANGELOG, README, lessons, testing-m5) - 4 Playwright tests: API CRUD round-trip, scenario reorder via PUT, SPA list + opsec filter, SPA scenario list rendering with ordered tests. - afterAll restores the stable admin (admin@metamorph.local) per the test_admin memory rule. - CHANGELOG M5 section + Fixed subsections for the LogRecord 'name' collision and the React `currentTarget` vs `target` quirk. - README status bumps to M0-M5. - tasks/lessons.md captures the new patterns (sentinel pattern for partial-update, FK ordering in /diag/reset, dnd-kit stable IDs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 19:57:51 +02:00
Knacky	2c85f9b57e	docs(m4): reconcile CHANGELOG + testing-m4 with the flat matrix + CR fixes - CHANGELOG M4 Added: rewrote the frontend bullet to describe the actual flat ATT&CK matrix that ships (full-bleed, 15-col grid with minmax(7rem, 1fr), name-only cells, ▸/▾ chevron). The original entry still described the abandoned 3-column drill-down picker. - New "Fixed (post-M4 code-review pass)" subsection enumerating the six CR-driven fixes that landed in this branch (SSRF allowlist, advisory lock, typed contract, N+1 elimination, version clearing, error scrub + the test additions and e2e count pinning). - DoD counts: 53 → 58 pytest, 34 e2e unchanged. testing-m4.md follows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 19:19:44 +02:00
Knacky	7a69f10f3e	docs(m4): post-review polish — helper text + test counts Spec-reviewer PASS pointed two factual nits: - MitrePage helper text still referenced the old 3-column drill-down ("Pick a tactic on the left, then a technique..."). Reworded for the flat matrix with the ▸ glyph + hover-for-id idiom. - testing-m4.md + CHANGELOG were stale at 51/12; the actual counts are 53/14 after the GET /mitre/matrix tests landed. Reconciled. No code-path change, no e2e fallout — DoD remains 53 pytest + 34 Playwright. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 18:58:51 +02:00
Knacky	b52cb0e5e4	refactor(m4): full-bleed matrix + word-only line breaks Two follow-up tweaks per user feedback ("wrap on words, enlarge the frame"): - Full-bleed wrapper: the matrix breaks out of the page's max-w-page (1400px) constraint via `margin: 0 calc(50% - 50vw)` + `width: 100vw`, mirroring the 60px page padding internally. On wide viewports the picker now uses the ENTIRE viewport width, so column widths grow proportionally — names that used to wrap on 3 lines now fit on 1-2. - Word-only wrapping: replaced `break-words` (overflow-wrap: break-word, which falls back to mid-word breaks) with `break-normal hyphens-none` (overflow-wrap: normal + word-break: normal). Cells break only at word boundaries; if a single word is longer than the cell it overflows visually rather than splitting `Aut\nhentication`-style. The grid is configured `minmax(7rem, 1fr)` so the minimum column is wide enough for every single word in MITRE v19 names, and stretches with available space. - Spec §F2 rewritten as a bullet contract locking in: full-bleed, 15 cols minmax(7rem, 1fr), word-only wrap, font sans 12px / count 10px, headers/cells show name-only with external_id on hover + chips. Future spec-reviewer passes can grade against this. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 18:53:51 +02:00
Knacky	8742fb2b6e	refactor(m4): match attack.mitre.org sizing — equal-width cols, name-only cells Visual parity pass against attack.mitre.org/# per user feedback ("too dense, unreadable, I want the same representation"): - Layout switched from flex+fixed-width 224px columns to a CSS grid of `repeat(N, minmax(0, 1fr))` so the 15 tactic columns share the container width equally. No more horizontal scroll on a standard desktop. - Cells now show NAME ONLY (matches mitre.org). The external_id (TA00xx / T1xxx / T1xxx.xxx) is preserved in the chip selection bar at the top and in the `title` hover tooltip on every cell — surfaces on demand, doesn't consume cell real estate. - Font: switched to `font-sans` (IBM Plex Sans) at `text-xs` (12px) across cells, matching the mitre.org typography. Headers use the same family at the same size with a 10px sub-line for the technique count. - Chevron icons: ▸ (collapsed) / ▾ (expanded) — small, sub-technique count rendered inline beside the chevron. - Helper line below the matrix tells the user where the IDs went. Spec §F2 + testing-m4.md walkthrough rewritten to lock the new sizing rules in (font-xs, no external_id in cells, hover/chip for the ID, no horizontal scroll). spec-reviewer will see the matching contract. DoD: make e2e → 34 passed. Selectors (data-testid + aria-pressed) unchanged so the existing M4 e2e test still walks the new layout end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 18:41:11 +02:00
Knacky	7dbe2dbc28	refactor(m4): flatten the MITRE picker into the attack.mitre.org matrix The hierarchical 3-column drill-down was hard to scan and forced a stateful walk per tag. Replaced with a flat, columns-as-tactics matrix that mirrors attack.mitre.org/# — every cell is a one-click select target, with inline sub-technique expand via a `+N` chevron. - New endpoint GET /api/v1/mitre/matrix returns the full grid (tactics → techniques → sub-techniques nested) in a single ~55 KB response, so the SPA renders the whole matrix without firing 15 parallel queries. Two pytest tests added (nested structure + auth required). - MitreTagPicker.tsx rewritten as a horizontal-scrolling matrix: - Click a tactic header → select the tactic (cyan filled). - Click a technique cell → select the technique (orange filled). - Click the `+N` chevron → expand sub-techniques inline within the column. - Click a sub-technique → select (purple filled). - Single Filter field matches on external_id or name across all kinds. - Selection chips at the top, clickable to remove. - `aria-pressed` on every clickable cell for screen readers and Playwright. - e2e test updated to walk the new flow (click cell → assert aria-pressed, expand chevron, click sub, verify chip + JSON preview, filter to T1078). - Spec §F2 + §F12 + todo.md M4 entry updated to make the matrix layout the canonical UI for MITRE tagging (so future spec-reviewer passes accept it). - testing-m4.md walkthrough rewritten for the flat picker. DoD post-refactor: make test-api → 53 passed (was 51), make e2e → 34 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 18:32:20 +02:00
Knacky	37e9e03f02	docs(m4): CHANGELOG, README, lessons, spec drift fix, todo tick - CHANGELOG: added M4 section listing endpoints, CLI, volume, persisted settings, picker, and the post-spec-review fixes (custom-URL integrity requirement + /diag/reset consistency + spec drift). Includes the intentional decisions paragraph (seed-time download not image-baked, read endpoints unauthenticated-perm-wise, stdlib over httpx). - README: status bumped to M0–M4, added MITRE quickstart (make seed-mitre + air-gapped path with --source /data/mitre/<file> + --skip-checksum), testing-m<N>.md pointer updated to testing-m4.md, roadmap line. - tasks/spec.md §10 #4: amended "14 tactics Enterprise" → "≥14 tactics Enterprise (v19, the currently pinned version, ships 15)". - tasks/lessons.md: 7 M4 lessons captured (stdlib STIX parsing, decoupling DoD asserts from upstream versions, subtechnique parent resolution, single-transaction safety, custom-URL footgun mitigation, /diag/reset consistency, named-volume permission caveat, podman build cache surprise). - tasks/todo.md: M4 marked ☑. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 13:54:46 +02:00
Knacky	90036437cc	test(m4): pytest parser + endpoints + e2e tag picker - backend/tests/test_mitre.py: 12 integration tests using a hand-crafted minimal STIX bundle (no network in tests). Covers parser (revoked/deprecated skip, sub-technique parent linkage), seed idempotence, persisted settings, checksum mismatch path, all four read endpoints, perm enforcement on /mitre/sync, ILIKE search. - e2e/tests/m4-mitre.spec.ts: 6 Playwright tests against the live stack. beforeAll calls POST /mitre/sync once (real bundle, ~50 MB, ~1.1 s) then the suite validates tactics ≥14, T1003 has ≥5 sub-techniques, the picker walks tactic→technique→subtechnique with chip multi-select, and non-admin sees /mitre but no Sync card. - tasks/testing-m4.md: manual + automated checklist, air-gapped operator notes, volume-permission caveat for pre-existing root-owned volumes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 13:54:26 +02:00
Knacky	bb23bf3928	feat(m3): RBAC — atomic perms, groups, users, admin SPA pages Permission catalogue (services/permissions_seed.py) - 31 atomic codes across 10 families: user., group., invitation., test_template., scenario_template., mission. (incl. mission.write_red_fields + mission.write_blue_fields), detection_level.{read,update}, setting.{read,update}, mitre.sync. - Default bindings: admin = all 31; redteam = 8 (catalogue read + mission.{read,create,update,archive,write_red_fields} + detection_level.read); blueteam = 5 (catalogue read + mission.{read,write_blue_fields} + detection_level.read). - Seed runs at boot AND after /setup so a freshly truncated DB (via /diag/reset) gets the bindings back via the bootstrap path. Idempotent + additive (never removes a perm from a system group). Users admin (services/users.py + api/users.py) - list (q + is_active filter + pagination), get, patch (display_name / locale / is_active with tri-state sentinel for clear-vs-unset), soft-delete, set groups. - Last-admin protection on update (deactivate), delete, and group-strip (refusing to remove the admin group from the last active admin). Groups admin (services/groups.py + api/groups.py) - Full CRUD with system-group protection (no rename, no delete on admin/redteam/blueteam). - PUT /groups/{id}/permissions sets the perm list. - Admin system group's perm set is locked to the full catalogue (SystemGroupProtected → 409) — preserves the bypass invariant even if a future refactor moves to perm-based checks. Permissions read-only (api/permissions.py) - GET /permissions returns the catalogue (admin or group.read holders). /diag/reset extension - After truncate + token mint, the limiter is also reset (limiter.reset()) so the Playwright suite doesn't hit 10/min budgets across spec files. Guarded by limiter.enabled to no-op in APP_ENV=test. Rate-limit scope (core/rate_limit.py) - enabled = APP_ENV in ("prod", "staging"). A staging deployment serves humans, so it gets the limits too. 
Dev/test stay unthrottled for Playwright ergonomics. Spec §6 NF-security is an operator-facing requirement. Frontend chrome - components/RequireAdmin.tsx + ui/Modal.tsx (reusable centered dialog with accessible name + Escape + backdrop-click). - Layout.tsx shows Admin nav links only when is_admin === true. Server remains the arbiter — non-admins hitting /admin/* get redirected to /. Frontend pages - pages/AdminUsersPage.tsx, AdminGroupsPage.tsx, AdminInvitationsPage.tsx with edit modals using TanStack Query mutations + multi-select for perms grouped by family + copy-once invitation URL display. - lib/admin.ts: shared types + query keys + groupPermsByFamily helper. - lib/api.ts: apiPatch / apiPut / apiDelete added. Playwright config (e2e/playwright.config.ts) - workers: 1 + fullyParallel: false: spec files share the live Postgres, so concurrent /diag/reset calls clobber each other. Intra-file order preserved via test.describe.configure({ mode: 'serial' }). Testing - backend/tests/test_rbac.py: 15 integration tests (39 backend total — 1 health + 8 schema + 15 auth + 15 RBAC). - e2e/tests/m3-rbac.spec.ts: 8 Playwright tests covering DoD §10 #2/#3 (28 e2e total — 8 M0 + 4 M1 + 8 M2 + 8 M3). - tasks/testing-m3.md. DoD: make test-api → 39 passed, make e2e → 28 passed. Spec-reviewer pass applied (admin perm invariant + staging rate-limit scope). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 06:17:07 +02:00
Knacky	700b563297	feat(m2): auth, JWT, invitations, bootstrap, RTOps SPA pages Crypto + tokens - app/core/security.py: Argon2id PasswordHasher (time_cost=2, memory_cost=64 MiB, parallelism=2) + opaque-token SHA-256 helpers (raw token shown once, only the hash lives in the DB). - app/core/jwt_tokens.py: HS256, claims iss/sub/type/jti/iat/exp. Access 1h, refresh 30d. Services - services/auth.py: login, refresh with token rotation + reuse-detection chain revoke, logout (idempotent), change_password (forces logout-all). - services/invitations.py: create, preview, accept, revoke. Default 7d TTL. - services/bootstrap.py: seeds the 3 system groups (admin/redteam/blueteam), consumes the install token, attaches the first user to admin. - core/install_token.py: mints, persists in settings, marks consumed, regenerate hook for /diag/reset. API - POST /setup (consume install token, create 1st admin) + GET /setup (status). - POST /auth/{login,refresh,logout,change-password} + GET /auth/me. - POST /invitations + GET /invitations + GET /invitations/preview/<token> + POST /invitations/accept/<token> + POST /invitations/<id>/revoke. - POST /diag/reset: test-only kill switch (truncate auth tables + mint fresh install token). Allowed in dev too (with WARNING log) so the e2e suite can run against a make-up stack; production locked out. Middleware - @require_auth populates g.current_user (snapshot dataclass, session closed before request handler runs). - @require_perm(*codes): atomic perm union check; admin group bypasses. Perm catalogue lands in M3, scaffolding here. - flask-limiter: 10/min/IP on /auth/login & /auth/refresh, 5/min on /auth/change-password & /setup, 10–20/min on invitation endpoints. Disabled in APP_ENV=test. 
CLI - flask --app app.cli metamorph print-install-token [--force] - flask --app app.cli metamorph seed-mitre (M4 placeholder) Refresh cookie metamorph_refresh: HttpOnly + Secure (localhost is a secure context for modern browsers) + SameSite=Strict + Path=/api/v1/auth/. Email validation: app.api._validation.Email permissive RFC-shape regex so internal TLDs (.local/.corp/.test) are accepted — pydantic.EmailStr's deliverability check is too strict for red-team labs. Frontend - lib/{api,auth}.ts: access token in module memory, refresh cookie, automatic 401-retry via /auth/refresh, useAuth() hook. - components/{Layout,RequireAuth}.tsx + ui/{TextField,Alert}.tsx. - pages/{Login,Setup,Register,Profile}. Testing - tests/test_auth_flow.py: 15 integration tests (24 backend total). - e2e/tests/m2-auth.spec.ts: 8 Playwright tests (20 e2e total). - tasks/testing-m2.md. DoD: make test-api → 24 passed, make e2e → 20 passed; spec-reviewer pass applied (Secure unconditional, refresh limit 10/min/IP). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 06:16:48 +02:00
Knacky	e995853f0d	feat(m1): DB schema, migrations, diag visibility 23 tables + alembic_version covering the v1 data model: - Auth/RBAC (8): users, groups, permissions, user_groups, group_permissions, invitations, invitation_groups, refresh_tokens. - MITRE (4): mitre_tactics, mitre_techniques, mitre_subtechniques + the technique↔tactic many-to-many. - Templates (4): test_templates, test_template_mitre_tags (3 nullable FKs + CHECK exactly_one_mitre_fk), scenario_templates, scenario_template_tests (UUID PK + UNIQUE(scenario_id, position) so a test can appear at multiple positions). - Missions (6): missions, mission_members, mission_scenarios, mission_tests, mission_test_mitre_tags (deliberately denormalised — copies external_id + name + url, no FK to mitre_* — so a re-sync of the catalogue can't purge historical tags), mission_categories. - Evidence/settings/notifications (5): evidence_files, settings (JSONB value), detection_levels, notifications. SQLAlchemy 2.x with Mapped[]/mapped_column(), pk_/fk_/ck_/uq_/ix_ naming convention. Reusable mixins (UuidPkMixin, TimestampMixin, SoftDeleteMixin — no auto __table_args__ since classes silently clobber the mixin's). Soft delete: deleted_at + partial indexes ix_<table>_active WHERE deleted_at IS NULL on 9 tables (users, groups, test_templates, scenario_templates, missions, mission_scenarios, mission_tests, mission_categories, evidence_files). Notifications gets ix_..._unread WHERE read_at IS NULL. CHECK constraints for status / state / opsec_level / mitre_kind enums. New API endpoint GET /api/v1/diag/db: returns alembic_revision (short hash) and the public-schema table_count. 503 with {"reachable": false} on a DB outage. Database card on the SPA home consumes it. Test stage in backend/Dockerfile (--target test): runtime + dev extras + tests/. New make test-api spins an ephemeral pytest container against the live DB on the compose network. 
backend/tests/test_schema.py: 8 integration tests (tables, FK pairs, CHECK constraints, partial indexes, alembic-at-head, negative INSERT proving the exactly_one_mitre_fk CHECK fires). e2e/tests/m1-db.spec.ts: 4 Playwright tests covering the diag endpoint contract + the Database card + footer/roadmap labels. DoD: make clean && make up && make migrate → 23 tables, 32 FKs, 9 CHECKs, make test-api → 9 passed, make e2e → 12 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 06:16:24 +02:00
Knacky	f1fdf27012	feat(m0): bootstrap repo, design system, compose stack - Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml, README.md, CHANGELOG.md, pre-commit config. - Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence. - Backend skeleton: Flask app factory, JSON structured logging on stdout, GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv, Pydantic Settings with secret guard rails (refuses to boot in non-dev with placeholders), APP_ENV gating. - Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/Button), home page wired to /api/v1/health. - Engine-agnostic Makefile: auto-detects docker or podman, picks the matching compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/seed-mitre/print-install-token/e2e/inspect-health. - Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit reports + traces on retry. - Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md (14 milestones), tasks/testing-m0.md, tasks/lessons.md. DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and docker. TLS terminated by external reverse proxy (spec §6 NF-network). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 06:16:00 +02:00
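
The implicit lifecycle described in commit 28b8855e88 (writes drive state, no explicit transition call from the UI) could be sketched roughly as below. This is an illustration only, not the project's `update_mission_test_fields`: the function name `apply_implicit_lifecycle`, the `RED_FIELDS` / `BLUE_FIELDS` sets, and every field name other than `red_command`, `blue_log_source` and `blue_siem_logs` are assumptions, and the handling of a PUT that touches both red and blue fields at once is a guess the log does not settle.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical field classification; only red_command, blue_log_source and
# blue_siem_logs are named in the commit log, the rest are placeholders.
RED_FIELDS = {"red_command", "red_output", "red_comment"}
BLUE_FIELDS = {"blue_comment", "blue_log_source", "blue_siem_logs"}

NOT_STARTED = {"pending", "skipped", "blocked"}


def apply_implicit_lifecycle(
    state: str,
    executed_at: Optional[datetime],
    touched: set[str],
    now: Optional[datetime] = None,
) -> tuple[str, Optional[datetime]]:
    """Return the (state, executed_at) pair after a PUT touched `touched`."""
    now = now or datetime.now(timezone.utc)
    # Any red write on a not-yet-executed test promotes it to executed and
    # stamps executed_at only when it is absent.
    if touched & RED_FIELDS and state in NOT_STARTED:
        return "executed", executed_at or now
    # Any blue write on an executed test promotes it to reviewed_by_blue.
    if touched & BLUE_FIELDS and state == "executed":
        return "reviewed_by_blue", executed_at
    # Everything else (e.g. a blue write on a pending test) is a no-op.
    return state, executed_at
```

Keeping the rule a pure function of the current state and the touched-field set makes the three cases listed in the commit (red write executes, blue write reviews, blue write on pending is a no-op) easy to test without a database.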

19 Commits
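
The evidence pipeline from commit ed70458d8f (streaming upload, chunked SHA-256, 25 MB cap, content-addressed path, atomic `os.replace`, root-dir guard) might look something like the following sketch. The function name `store_evidence`, the chunk size, and the exception type are assumptions made for illustration; the ext+MIME whitelist and the membership checks mentioned in the log are left out.

```python
import hashlib
import os
import re
import tempfile
from pathlib import Path
from typing import BinaryIO

MAX_BYTES = 25 * 1024 * 1024   # 25 MB cap, as stated in the commit message
CHUNK = 64 * 1024              # assumed chunk size
_HEX64 = re.compile(r"[0-9a-f]{64}")


class EvidenceTooLarge(ValueError):
    """Hypothetical error raised when the upload exceeds the cap."""


def store_evidence(stream: BinaryIO, root: Path, mission: str, test: str, ext: str) -> Path:
    """Stream `stream` to <root>/<mission>/<test>/<sha256><ext> atomically."""
    root = root.resolve()
    target_dir = (root / mission / test).resolve()
    # Root-dir guard: refuse any path that escapes the evidence root.
    if not target_dir.is_relative_to(root):
        raise ValueError("path escapes evidence root")
    target_dir.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256()
    size = 0
    # Write to a temp file in the same directory so os.replace stays atomic.
    fd, tmp_name = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            while chunk := stream.read(CHUNK):
                size += len(chunk)
                if size > MAX_BYTES:
                    raise EvidenceTooLarge(f"upload exceeds {MAX_BYTES} bytes")
                digest.update(chunk)
                tmp.write(chunk)
        sha = digest.hexdigest()
        # Hex-validate the path component before it touches the filesystem.
        if not _HEX64.fullmatch(sha):
            raise ValueError("unexpected digest format")
        final = target_dir / f"{sha}{ext}"
        os.replace(tmp_name, final)
        return final
    except BaseException:
        # Clean up the partial temp file on any failure.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Hashing while streaming means the file is read once and never held fully in memory, and naming the file after its digest makes a repeated upload of identical bytes land on the same path.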