Commit Graph

10 Commits

Author SHA1 Message Date
Knacky
e1b51db25f fix(m6): post-review pass — cache prefix, snapshot lock, perm-before-parse, LIKE escape
Addresses spec-reviewer + code-reviewer feedback on the M6 bundle:

Critical:
- frontend/src/lib/missions.ts: add `listPrefix()` so TanStack invalidation
  catches every filtered list variant; the previous `list()` returned
  `['missions','list',{}]` and only matched the exact empty-filter cache,
  leaving filtered tables stale after create/transition/delete.
- backend/app/services/missions.py: acquire the same per-scenario
  `pg_advisory_xact_lock` key used by `set_scenario_tests` before
  snapshotting; without it a concurrent M5 reorder could freeze a torn
  snapshot under READ COMMITTED. Locks are acquired in sorted key order to
  avoid deadlocks with another snapshotter.
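
A minimal sketch of the sorted-acquisition idea, not the project's code. The
key derivation (`advisory_key`, hashing the scenario UUID into a signed 64-bit
integer) and the helper names are assumptions for illustration; the commit
only states that the key matches the one used by `set_scenario_tests`.

```python
import hashlib
import uuid


def advisory_key(scenario_id: uuid.UUID) -> int:
    """Hypothetical mapping from a scenario UUID to a signed 64-bit lock key."""
    digest = hashlib.sha256(scenario_id.bytes).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def ordered_lock_keys(scenario_ids: list[uuid.UUID]) -> list[int]:
    """Deduplicate and sort keys so every transaction locks in the same order."""
    return sorted({advisory_key(sid) for sid in scenario_ids})


# Each key would then be passed, in this order, to
#   SELECT pg_advisory_xact_lock(:key)
# inside the snapshot transaction.
```

Because two snapshotters touching the same scenarios always request the locks
in the same order, neither can hold a lock the other needs while waiting on
one the other holds.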

Important:
- backend/app/api/missions.py: `@require_perm("mission.update",
  "mission.archive")` on the transition endpoint so users without either
  perm get 403 before the body is parsed (no shape leak via 400).
- backend/app/services/missions.py: escape `%` / `_` / `\` in user-typed
  `q` / `client` LIKE search; users can no longer trigger wildcard
  semantics by typing literal `%`. Added `escape='\\'` arg on every .like().
- backend/app/services/missions.py: filter `MissionTest.deleted_at` and
  `MissionScenario.deleted_at` in the list-item and detail counts so M7+
  soft-deletes don't drift the totals silently.
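
The LIKE escaping described above could look roughly like this. The function
names `escape_like` and `contains_pattern` are hypothetical; only the set of
escaped characters and the backslash escape come from the commit.

```python
def escape_like(term: str, escape: str = "\\") -> str:
    """Escape LIKE metacharacters so user input matches literally."""
    # Escape the escape character first, otherwise the backslashes added
    # for % and _ would themselves be doubled.
    return (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def contains_pattern(term: str) -> str:
    """Build a 'contains' pattern; only the outer % act as wildcards."""
    return f"%{escape_like(term)}%"
```

The resulting pattern only behaves as intended when the query also declares
the escape character, which is why the commit adds `escape='\\'` to every
`.like()` call.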

Nits:
- backend/app/api/users.py: order `/users/roster` by email for stable
  rendering + deterministic e2e selectors.
- frontend/src/pages/MissionDetailPage.tsx: distinct accent per
  transition target (cyan/orange/green/teal) matching the status legend.
- e2e/tests/m6-missions.spec.ts: switch fragile `getByRole(name=/In
  Progress/i)` to the stable `mission-transition-in_progress` data-testid.

New tests:
- test_create_mission_rejects_soft_deleted_scenario
- test_transition_perm_gate_runs_before_payload_parse
- test_search_treats_wildcards_as_literals

Suite: 106 pytest passing (was 103), 43 Playwright passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 15:14:57 +02:00
Knacky
00b7557e30 feat(m6): missions + snapshot CRUD, membership visibility, status state machine
Adds the mission layer that materialises template snapshots, plus the SPA
list / 3-step wizard / detail page.

Backend:
- app/services/missions.py — create_mission snapshots scenarios, tests, MITRE
  tags in a 4-query write; list/get apply a non-admin membership filter that
  collapses to 404 (no existence leak); status state machine enforces
  draft → in_progress → completed → archived with archived as a sink; the
  non-admin creator is auto-added as role_hint='red' to retain visibility.
- app/api/missions.py — 8 endpoints (list, get, create, update, add
  scenarios, set members, transition, soft-delete) with strict pydantic
  schemas. The transition endpoint splits the perm gate manually so
  archive requires mission.archive while other targets use mission.update.
- app/api/users.py — new GET /users/roster returning (id, email,
  display_name) only, gated by user.read OR mission.create OR
  mission.update — lets non-admin wizard users see assignable peers
  without exposing the admin /users payload.
- app/api/diag.py — /diag/reset truncates the mission_* tables before the
  template tables because the source_*_template_id FKs are ON DELETE SET
  NULL, which is cheaper to short-circuit by removing the children first.
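
A sketch of the status state machine and the split permission gate, assuming
the chain is strictly linear as written above. The real transition table may
allow more edges; `LEGAL_TRANSITIONS`, `IllegalTransition`, `transition` and
`required_perm` are illustrative names, not the project's.

```python
LEGAL_TRANSITIONS: dict[str, frozenset[str]] = {
    "draft": frozenset({"in_progress"}),
    "in_progress": frozenset({"completed"}),
    "completed": frozenset({"archived"}),
    "archived": frozenset(),  # sink: no way out
}


class IllegalTransition(Exception):
    """Hypothetical error; an API layer could map it to HTTP 409."""


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the edge is not in the table."""
    if target not in LEGAL_TRANSITIONS.get(current, frozenset()):
        raise IllegalTransition(f"{current} -> {target} is not allowed")
    return target


def required_perm(target: str) -> str:
    """Archiving needs its own permission; other targets use mission.update."""
    return "mission.archive" if target == "archived" else "mission.update"
```

Keeping the table as data also gives the detail page a cheap way to render
only the legal next states for the current status.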

Frontend:
- lib/missions.ts — typed client, queryKey factory, status accent map.
- pages/MissionsListPage.tsx — list cards with status accent + filters
  (q, client, status).
- pages/MissionsCreatePage.tsx — 3-step wizard (meta → scenarios → members)
  with member roster fed by /users/roster.
- pages/MissionDetailPage.tsx — header + transition buttons (legal next
  states only) + Tests/Members/Synthesis/Export tabs.
- Routes + nav entry (visible to anyone with mission.read or admin).

Tests:
- backend/tests/test_missions.py — 22 pytest covering snapshot fidelity,
  MITRE propagation, membership visibility, transition state machine,
  perm gating, member set replace, append scenarios, soft-delete, partial
  update, inverted-date rejection.
- e2e/tests/m6-missions.spec.ts — 5 Playwright (snapshot freezing, non-admin
  visibility, status transitions + 409, SPA wizard end-to-end, list filter).

Docs:
- CHANGELOG, tasks/testing-m6.md, tasks/lessons.md (snapshot tradeoffs,
  membership=404 pattern, /diag/reset order, auto-creator add).
- README + tasks/todo.md updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 15:07:32 +02:00
Knacky
a559823386 test(m5): playwright spec + docs (CHANGELOG, README, lessons, testing-m5)
- 4 Playwright tests: API CRUD round-trip, scenario reorder via PUT, SPA
  list + opsec filter, SPA scenario list rendering with ordered tests.
- afterAll restores the stable admin (admin@metamorph.local) per the
  test_admin memory rule.
- CHANGELOG M5 section + Fixed subsections for the LogRecord 'name'
  collision and the React `currentTarget` vs `target` quirk.
- README status bumps to M0-M5.
- tasks/lessons.md captures the new patterns (sentinel pattern for
  partial-update, FK ordering in /diag/reset, dnd-kit stable IDs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 19:57:51 +02:00
Knacky
8b1de6e258 test(m4): cover the new security guards + pin e2e to exact MITRE v19 counts
- 5 new pytest covering paths the code-reviewer flagged as un-asserted:
    * `test_seed_refuses_file_url` — `file://` scheme rejected before I/O
      (was the SSRF-to-local-FS vector).
    * `test_seed_refuses_disallowed_https_host` — non-allowlisted HTTPS
      host rejected with `MitreSourceForbidden`.
    * `test_seed_refuses_custom_url_without_sha` — end-to-end guard that
      `seed_mitre(source=<custom URL>, expected_sha256=None,
      allow_unverified=False)` raises `MitreSeedError`.
    * `test_dotted_id_fallback_resolves_orphan_subtechnique` — STIX bundle
      without `relationship[subtechnique-of]` still attaches T1059.001 to
      T1059 via the dotted-id convention.
    * `test_seed_clears_version_when_source_is_not_default` — seed from a
      local path leaves `settings.mitre_version` NULL (no stale pin).
- Existing `test_checksum_mismatch_aborts` reworked to monkey-patch
  `_ensure_host_allowed` so `file://` can drive the test past the allowlist
  gate (was relying on file:// being accepted before CR1).
- Removed unused `uuid` import.
- e2e: assertions on `tactics_upserted`/`techniques_upserted`/
  `subtechniques_upserted` switched from `>= 14/180/400` thresholds to
  `=== 15/222/475` exact counts pinned to MITRE Enterprise v19.0 + 0
  orphans. Catches parser regressions that would silently include revoked
  rows. Bump alongside MITRE_VERSION when re-pinning.
- e2e: `Math.random()` → `crypto.randomUUID().slice(0, 8)` for unique
  test-run emails (collision-safe across parallel CI workers).
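
For orientation, a guard in the spirit of `_ensure_host_allowed` might look
like the sketch below. The allowlist contents and the function name
`ensure_source_allowed` are assumptions; only the `MitreSourceForbidden` name
and the two rejected cases (`file://`, non-allowlisted HTTPS host) come from
the tests above. Handling of plain local paths is left out.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; the real one is not shown in the commit.
ALLOWED_HOSTS = frozenset({"raw.githubusercontent.com"})


class MitreSourceForbidden(Exception):
    """Raised when a seed source URL is not acceptable."""


def ensure_source_allowed(url: str) -> None:
    """Validate scheme and host before any network or file I/O happens."""
    parts = urlsplit(url)
    # Reject file://, http://, ftp:// and so on.
    if parts.scheme != "https":
        raise MitreSourceForbidden(f"scheme {parts.scheme!r} is not allowed")
    host = (parts.hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        raise MitreSourceForbidden(f"host {host!r} is not allowlisted")
```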

DoD: 58 pytest pass (was 53), 34 Playwright pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 19:19:34 +02:00
Knacky
7dbe2dbc28 refactor(m4): flatten the MITRE picker into the attack.mitre.org matrix
The hierarchical 3-column drill-down was hard to scan and forced a stateful
walk per tag. Replaced with a flat, columns-as-tactics matrix that mirrors
attack.mitre.org/# — every cell is a one-click select target, with inline
sub-technique expand via a `+N` chevron.

- New endpoint GET /api/v1/mitre/matrix returns the full grid (tactics →
  techniques → sub-techniques nested) in a single ~55 KB response, so the
  SPA renders the whole matrix without firing 15 parallel queries. Two
  pytest tests added (nested structure + auth required).
- MitreTagPicker.tsx rewritten as a horizontal-scrolling matrix:
  - Click a tactic header → select the tactic (cyan filled).
  - Click a technique cell → select the technique (orange filled).
  - Click the `+N` chevron → expand sub-techniques inline within the column.
  - Click a sub-technique → select (purple filled).
  - Single Filter field matches on external_id or name across all kinds.
  - Selection chips at the top, clickable to remove.
  - `aria-pressed` on every clickable cell for screen readers and Playwright.
- e2e test updated to walk the new flow (click cell → assert aria-pressed,
  expand chevron, click sub, verify chip + JSON preview, filter to T1078).
- Spec §F2 + §F12 + todo.md M4 entry updated to make the matrix layout the
  canonical UI for MITRE tagging (so future spec-reviewer passes accept it).
- testing-m4.md walkthrough rewritten for the flat picker.

DoD post-refactor: make test-api → 53 passed (was 51), make e2e → 34 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 18:32:20 +02:00
Knacky
90036437cc test(m4): pytest parser + endpoints + e2e tag picker
- backend/tests/test_mitre.py: 12 integration tests using a hand-crafted
  minimal STIX bundle (no network in tests). Covers parser
  (revoked/deprecated skip, sub-technique parent linkage), seed idempotence,
  persisted settings, checksum mismatch path, all four read endpoints, perm
  enforcement on /mitre/sync, ILIKE search.
- e2e/tests/m4-mitre.spec.ts: 6 Playwright tests against the live stack.
  beforeAll calls POST /mitre/sync once (real bundle, ~50 MB, ~1.1 s) then
  the suite validates tactics ≥14, T1003 has ≥5 sub-techniques, the picker
  walks tactic→technique→subtechnique with chip multi-select, and non-admin
  sees /mitre but no Sync card.
- tasks/testing-m4.md: manual + automated checklist, air-gapped operator
  notes, volume-permission caveat for pre-existing root-owned volumes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:54:26 +02:00
Knacky
bb23bf3928 feat(m3): RBAC — atomic perms, groups, users, admin SPA pages
Permission catalogue (services/permissions_seed.py)
- 31 atomic codes across 10 families: user.*, group.*, invitation.*,
  test_template.*, scenario_template.*, mission.* (incl.
  mission.write_red_fields + mission.write_blue_fields),
  detection_level.{read,update}, setting.{read,update}, mitre.sync.
- Default bindings: admin = all 31; redteam = 8 (catalogue read + mission.
  {read,create,update,archive,write_red_fields} + detection_level.read);
  blueteam = 5 (catalogue read + mission.{read,write_blue_fields} +
  detection_level.read).
- Seed runs at boot AND after /setup so a freshly truncated DB (via
  /diag/reset) gets the bindings back via the bootstrap path. Idempotent +
  additive (never removes a perm from a system group).
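
The "idempotent + additive" property can be illustrated with plain sets. This
is a sketch under the assumption that bindings are modelled as a mapping from
group name to permission codes; `merge_default_perms` is a hypothetical name.

```python
def merge_default_perms(
    existing: dict[str, set[str]], defaults: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Return bindings with every default present and nothing removed."""
    merged = {group: set(perms) for group, perms in existing.items()}
    for group, perms in defaults.items():
        # Union only: a perm an operator added by hand is never dropped.
        merged.setdefault(group, set()).update(perms)
    return merged
```

Running the merge at boot and again after /setup gives the same result as
running it once, which is what makes the double call safe.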

Users admin (services/users.py + api/users.py)
- list (q + is_active filter + pagination), get, patch (display_name /
  locale / is_active with tri-state sentinel for clear-vs-unset),
  soft-delete, set groups.
- Last-admin protection on update (deactivate), delete, and group-strip
  (refusing to remove the admin group from the last active admin).

Groups admin (services/groups.py + api/groups.py)
- Full CRUD with system-group protection (no rename, no delete on
  admin/redteam/blueteam).
- PUT /groups/{id}/permissions sets the perm list.
- Admin system group's perm set is locked to the full catalogue
  (SystemGroupProtected → 409) — preserves the bypass invariant even if a
  future refactor moves to perm-based checks.

Permissions read-only (api/permissions.py)
- GET /permissions returns the catalogue (admin or group.read holders).

/diag/reset extension
- After truncate + token mint, the limiter is also reset (limiter.reset())
  so the Playwright suite doesn't hit 10/min budgets across spec files.
  Guarded by limiter.enabled to no-op in APP_ENV=test.

Rate-limit scope (core/rate_limit.py)
- enabled = APP_ENV in ("prod", "staging"). A staging deployment serves
  humans, so it gets the limits too. Dev/test stay unthrottled for
  Playwright ergonomics. Spec §6 NF-security is an operator-facing
  requirement.

Frontend chrome
- components/RequireAdmin.tsx + ui/Modal.tsx (reusable centered dialog
  with accessible name + Escape + backdrop-click).
- Layout.tsx shows Admin nav links only when is_admin === true. Server
  remains the arbiter — non-admins hitting /admin/* get redirected to /.

Frontend pages
- pages/AdminUsersPage.tsx, AdminGroupsPage.tsx, AdminInvitationsPage.tsx
  with edit modals using TanStack Query mutations + multi-select for perms
  grouped by family + copy-once invitation URL display.
- lib/admin.ts: shared types + query keys + groupPermsByFamily helper.
- lib/api.ts: apiPatch / apiPut / apiDelete added.

Playwright config (e2e/playwright.config.ts)
- workers: 1 + fullyParallel: false: spec files share the live Postgres,
  so concurrent /diag/reset calls clobber each other. Intra-file order
  preserved via test.describe.configure({ mode: 'serial' }).

Testing
- backend/tests/test_rbac.py: 15 integration tests (39 backend total — 1
  health + 8 schema + 15 auth + 15 RBAC).
- e2e/tests/m3-rbac.spec.ts: 8 Playwright tests covering DoD §10 #2/#3
  (28 e2e total — 8 M0 + 4 M1 + 8 M2 + 8 M3).
- tasks/testing-m3.md.

DoD: make test-api → 39 passed, make e2e → 28 passed. Spec-reviewer pass
applied (admin perm invariant + staging rate-limit scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:17:07 +02:00
Knacky
700b563297 feat(m2): auth, JWT, invitations, bootstrap, RTOps SPA pages
Crypto + tokens
- app/core/security.py: Argon2id PasswordHasher (time_cost=2, memory_cost=
  64 MiB, parallelism=2) + opaque-token SHA-256 helpers (raw token shown
  once, only the hash lives in the DB).
- app/core/jwt_tokens.py: HS256, claims iss/sub/type/jti/iat/exp. Access
  1h, refresh 30d.
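
A standard-library sketch of the opaque-token pattern (raw value shown once,
only the SHA-256 stored). The helper names `mint_opaque_token` and
`token_matches` are illustrative and not taken from app/core/security.py.

```python
import hashlib
import hmac
import secrets


def mint_opaque_token(nbytes: int = 32) -> tuple[str, str]:
    """Return (raw_token, sha256_hex). Only the hash would be persisted."""
    raw = secrets.token_urlsafe(nbytes)
    return raw, hashlib.sha256(raw.encode("utf-8")).hexdigest()


def token_matches(presented: str, stored_hash: str) -> bool:
    """Hash the presented token and compare in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

A fast hash is acceptable here because the token is high-entropy random data;
passwords still go through Argon2id as listed above.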

Services
- services/auth.py: login, refresh with token rotation + reuse-detection
  chain revoke, logout (idempotent), change_password (forces logout-all).
- services/invitations.py: create, preview, accept, revoke. Default 7d TTL.
- services/bootstrap.py: seeds the 3 system groups (admin/redteam/blueteam),
  consumes the install token, attaches the first user to admin.
- core/install_token.py: mints, persists in settings, marks consumed,
  regenerate hook for /diag/reset.
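
Refresh rotation with reuse detection, reduced to an in-memory sketch. All
names here (`RefreshStore`, `RefreshRecord`, `ReuseDetected`, the `family`
field) are hypothetical, and a real store would persist token hashes rather
than raw tokens.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefreshRecord:
    family: str  # all tokens descended from one login share a family
    used: bool = False
    revoked: bool = False


class ReuseDetected(Exception):
    """A refresh token that was already rotated has been presented again."""


class RefreshStore:
    def __init__(self) -> None:
        self._records: dict[str, RefreshRecord] = {}

    def issue(self, family: Optional[str] = None) -> str:
        token = secrets.token_urlsafe(32)
        self._records[token] = RefreshRecord(family or secrets.token_hex(8))
        return token

    def rotate(self, token: str) -> str:
        record = self._records.get(token)
        if record is None or record.revoked:
            raise KeyError("unknown or revoked refresh token")
        if record.used:
            # A token presented twice is treated as leaked: revoke the chain.
            for other in self._records.values():
                if other.family == record.family:
                    other.revoked = True
            raise ReuseDetected(record.family)
        record.used = True
        return self.issue(record.family)
```

Replaying an old token therefore invalidates the attacker's copy and the
legitimate client's newest token alike, forcing a fresh login.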

API
- POST /setup (consume install token, create 1st admin) + GET /setup
  (status).
- POST /auth/{login,refresh,logout,change-password} + GET /auth/me.
- POST /invitations + GET /invitations + GET /invitations/preview/<token> +
  POST /invitations/accept/<token> + POST /invitations/<id>/revoke.
- POST /diag/reset: test-only kill switch (truncate auth tables + mint
  fresh install token). Allowed in dev too (with WARNING log) so the e2e
  suite can run against a make-up stack; production locked out.

Middleware
- @require_auth populates g.current_user (snapshot dataclass, session
  closed before request handler runs).
- @require_perm(*codes): atomic perm union check; admin group bypasses.
  Perm catalogue lands in M3, scaffolding here.
- flask-limiter: 10/min/IP on /auth/login & /auth/refresh, 5/min on
  /auth/change-password & /setup, 10–20/min on invitation endpoints.
  Disabled in APP_ENV=test.

CLI
- flask --app app.cli metamorph print-install-token [--force]
- flask --app app.cli metamorph seed-mitre (M4 placeholder)

Refresh cookie metamorph_refresh: HttpOnly + Secure (localhost is a secure
context for modern browsers) + SameSite=Strict + Path=/api/v1/auth/.

Email validation: app.api._validation.Email uses a permissive RFC-shape regex
so internal TLDs (.local/.corp/.test) are accepted; pydantic.EmailStr's
deliverability check is too strict for red-team labs.
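
What "permissive RFC-shape" could mean in practice, as a sketch. The pattern
below is an assumption (one "@", no whitespace, at least one dot in the
domain) and is not the regex used by app.api._validation.Email.

```python
import re

# Hypothetical shape check: local@label.label[.label...], no whitespace.
EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s.]+(\.[^@\s.]+)+")


def looks_like_email(value: str) -> bool:
    """Accept anything email-shaped, including internal TLDs such as .local."""
    return EMAIL_SHAPE.fullmatch(value) is not None
```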

Frontend
- lib/{api,auth}.ts: access token in module memory, refresh cookie,
  automatic 401-retry via /auth/refresh, useAuth() hook.
- components/{Layout,RequireAuth}.tsx + ui/{TextField,Alert}.tsx.
- pages/{Login,Setup,Register,Profile}.

Testing
- tests/test_auth_flow.py: 15 integration tests (24 backend total).
- e2e/tests/m2-auth.spec.ts: 8 Playwright tests (20 e2e total).
- tasks/testing-m2.md.

DoD: make test-api → 24 passed, make e2e → 20 passed; spec-reviewer pass
applied (Secure unconditional, refresh limit 10/min/IP).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:48 +02:00
Knacky
e995853f0d feat(m1): DB schema, migrations, diag visibility
23 tables + alembic_version covering the v1 data model:
- Auth/RBAC (8): users, groups, permissions, user_groups, group_permissions,
  invitations, invitation_groups, refresh_tokens.
- MITRE (4): mitre_tactics, mitre_techniques, mitre_subtechniques + the
  technique↔tactic many-to-many.
- Templates (4): test_templates, test_template_mitre_tags (3 nullable FKs +
  CHECK exactly_one_mitre_fk), scenario_templates, scenario_template_tests
  (UUID PK + UNIQUE(scenario_id, position) so a test can appear at multiple
  positions).
- Missions (6): missions, mission_members, mission_scenarios, mission_tests,
  mission_test_mitre_tags (deliberately denormalised — copies external_id +
  name + url, no FK to mitre_* — so a re-sync of the catalogue can't purge
  historical tags), mission_categories.
- Evidence/settings/notifications (5): evidence_files, settings (JSONB
  value), detection_levels, notifications.
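
The exactly_one_mitre_fk idea can be tried out with the standard-library
sqlite3 module. This is only an illustration: the project targets Postgres
(where `num_nonnulls(...) = 1` would be one way to express the same check),
and the column names below are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE test_template_mitre_tags (
        id INTEGER PRIMARY KEY,
        tactic_id TEXT,
        technique_id TEXT,
        subtechnique_id TEXT,
        CONSTRAINT ck_exactly_one_mitre_fk CHECK (
            (tactic_id IS NOT NULL)
            + (technique_id IS NOT NULL)
            + (subtechnique_id IS NOT NULL) = 1
        )
    )
    """
)


def try_insert(tactic, technique, subtechnique) -> bool:
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO test_template_mitre_tags "
            "(tactic_id, technique_id, subtechnique_id) VALUES (?, ?, ?)",
            (tactic, technique, subtechnique),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Rows with zero or with two or more references are rejected by the database
itself, which is the behaviour the negative INSERT test in test_schema.py is
described as proving.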

SQLAlchemy 2.x with Mapped[]/mapped_column(), pk_/fk_/ck_/uq_/ix_ naming
convention. Reusable mixins (UuidPkMixin, TimestampMixin, SoftDeleteMixin;
no auto __table_args__, since a class's own __table_args__ silently clobbers
the mixin's).

Soft delete: deleted_at + partial indexes ix_<table>_active WHERE deleted_at
IS NULL on 9 tables (users, groups, test_templates, scenario_templates,
missions, mission_scenarios, mission_tests, mission_categories,
evidence_files). Notifications gets ix_..._unread WHERE read_at IS NULL.

CHECK constraints for status / state / opsec_level / mitre_kind enums.

New API endpoint GET /api/v1/diag/db: returns alembic_revision (short hash)
and the public-schema table_count. 503 with {"reachable": false} on a DB
outage. Database card on the SPA home consumes it.

Test stage in backend/Dockerfile (--target test): runtime + dev extras +
tests/. New make test-api spins an ephemeral pytest container against the
live DB on the compose network. backend/tests/test_schema.py: 8 integration
tests (tables, FK pairs, CHECK constraints, partial indexes, alembic-at-head,
negative INSERT proving the exactly_one_mitre_fk CHECK fires).

e2e/tests/m1-db.spec.ts: 4 Playwright tests covering the diag endpoint
contract + the Database card + footer/roadmap labels.

DoD: make clean && make up && make migrate → 23 tables, 32 FKs, 9 CHECKs,
make test-api → 9 passed, make e2e → 12 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:24 +02:00
Knacky
f1fdf27012 feat(m0): bootstrap repo, design system, compose stack
- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:00 +02:00