m7-execution/README.md

# Metamorph

Collaborative purple-team platform. Red team logs the tests they execute (procedure, command, timestamp); blue team annotates each test with detection evidence (alerts, logs, files). At the end of an engagement, Metamorph generates a standalone reveal.js slide deck classified by MITRE ATT&CK tactic.

> **Status**: M0–M7 delivered (bootstrap → DB schema → auth → RBAC → MITRE ATT&CK reference → test & scenario templates → missions snapshot → red/blue execution on a mission test). See `tasks/spec.md` for the full specification and `tasks/todo.md` for the milestone-by-milestone plan.

## Stack

- **Backend**: Python 3.12, Flask 3, SQLAlchemy 2 + Alembic (M1+), PostgreSQL 16.
- **Frontend**: React 18 + TypeScript + Vite + TailwindCSS (RTOps design tokens, see `tasks/design.md`).
- **Auth (M2+)**: JWT access (1h) + refresh (30d), Argon2id, invite-link enrollment.
- **RBAC (M3+)**: atomic permissions (31 codes) bundled into custom groups; 3 system groups seeded (`admin` / `redteam` / `blueteam`).
- **MITRE ATT&CK (M4+)**: Enterprise reference catalogue pinned to v19.0, seedable via `make seed-mitre`.
- **Template catalogue (M5+)**: reusable `test_templates` (markdown procedure, OPSEC level, free tags, expected IOCs, MITRE tags) + ordered `scenario_templates` with drag-and-drop reordering. Admin pages at `/admin/tests` and `/admin/scenarios`.
- **Missions (M6+)**: `missions` snapshot one or more scenario templates at creation time; template edits don't drift live missions (`mission_*` tables freeze every field, including MITRE tags). Non-admin members see only their own missions (membership filter, 404 on existence-leak attempts). Status state machine `draft → in_progress → completed → archived`, archive perm gated separately. SPA: list/filter at `/missions`, 3-step create wizard at `/missions/new`, detail page with Tests / Members / Synthesis / Export tabs.
- **Execution (M7+)**: per-test page `/missions/<id>/tests/<test_id>` with two zones — red (command/output/comment + mark-executed with override) and blue (detection-level select / comment / drag-and-drop evidence upload). Field-level perm gating: `mission.write_red_fields` / `mission.write_blue_fields` are server-enforced *per field*. State machine `pending↔skipped/blocked` + `pending→executed→reviewed_by_blue` with side-aware perms. Evidence pipeline: streaming upload to `${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>`, SHA256 + MIME + extension + 25 MB cap. 15 s activity polling via `/missions/<id>/activity?since=…` drives the "modified by X" badge. 4 default `detection_levels` seeded at boot.
- **Delivery**: docker-compose. TLS termination is expected to be handled by an external reverse proxy in production.

## Quickstart

Works with **Docker** *or* **Podman**. The Makefile auto-detects the available engine and picks the matching compose driver (`docker compose`, `podman compose`, or `podman-compose`).

Requires one of:

- Docker Engine 24+ with the Compose v2 plugin, **or**
- Podman 4.0+ with `podman compose` (or the legacy `podman-compose` ≥ 1.0.6)

```bash
git clone <this repo>
cd Metamorph
make engine               # confirm which engine the Makefile picked up
make env                  # creates .env from .env.example
$EDITOR .env              # set strong values for POSTGRES_PASSWORD and JWT_SECRET
make up                   # builds and starts api + db + front
make logs                 # tail logs
```

Override the auto-detection if you have both engines installed:

```bash
make up ENGINE=podman                       # force podman + auto-pick its compose driver
make up ENGINE=docker COMPOSE="docker compose"
COMPOSE=podman-compose make up              # force the legacy wrapper specifically
```

Then:

- Front: <http://localhost:8080>
- API health: <http://localhost:8080/api/v1/health> (proxied) or <http://localhost:8000/api/v1/health>

### First-time setup

```bash
make migrate                  # apply DB schema
make print-install-token      # prints the bootstrap admin token (logs banner)
# visit http://localhost:8080/setup, paste the token, create the admin account
make seed-mitre               # populate the MITRE ATT&CK reference (~50 MB, ~1 s)
```

The MITRE bundle is cached in the named volume `metamorph_mitre` (`/data/mitre/<file>.json` inside the api container). For air-gapped operators, pre-populate the volume with your own STIX 2.1 file and run `podman compose exec api flask --app app.cli metamorph seed-mitre --source /data/mitre/your-file.json --skip-checksum`.

To stop:

```bash
make down            # keep volumes
make clean           # also drop volumes (DESTRUCTIVE)
```

## Local dev (no Docker)

Requires:

- [uv](https://github.com/astral-sh/uv) for Python deps
- Node.js 20+ and `npm`
- A reachable Postgres (or `make up db` to run only the db container)

```bash
make dev-api     # in one terminal
make dev-front   # in another
```

## Environment variables

See `.env.example`. The most important ones:

| Variable           | Purpose                                              |
|--------------------|------------------------------------------------------|
| `APP_ENV`          | `dev` allows placeholder secrets; anything else (prod/staging) refuses to boot with defaults |
| `POSTGRES_*`       | DB credentials (used by `db` and `api`)              |
| `JWT_SECRET`       | HS256 signing key — generate 64+ random bytes (`python -c "import secrets; print(secrets.token_urlsafe(64))"`) |
| `LOG_LEVEL`        | `DEBUG` / `INFO` / `WARNING` / `ERROR`               |
| `FRONT_ORIGIN`     | Allowed CORS origin for the SPA                      |
| `EVIDENCE_DIR`     | Path inside the api container where uploads land     |
| `HOST_API_PORT`    | Host port mapped to the api (default 8000)           |
| `HOST_FRONT_PORT`  | Host port mapped to the front nginx (default 8080)   |

## Testing

- **Manual + automated checklist for the current milestone**: see [`tasks/testing-m<N>.md`](tasks/testing-m7.md) (current: `testing-m7.md`).
- **Backend unit tests**: `make test-api`
- **End-to-end (Playwright)**: `make e2e-install` (once), then `make up && make e2e`. Reports land in `e2e/playwright-report/` (HTML + JUnit XML); open with `make e2e-report`.

## Pre-commit hooks

After cloning, install hooks once:

```bash
pipx install pre-commit   # or: pip install --user pre-commit
pre-commit install
pre-commit run --all-files   # initial sweep
```

The hooks run `ruff` + `ruff-format` on the backend and `eslint` / `tsc --noEmit` / `prettier --check` on the frontend (see `.pre-commit-config.yaml`).

## Project layout

```
.
├── backend/             # Flask API
│   └── app/
│       ├── api/         # HTTP layer (blueprints)
│       ├── core/        # config, logging, errors
│       ├── db/          # SQLAlchemy session, migrations (M1+)
│       ├── models/      # ORM models (M1+)
│       ├── services/    # domain logic (M2+)
│       └── i18n/        # message catalogs (M13)
├── frontend/            # Vite + React + TS + Tailwind
│   └── src/components/ui/   # RTOps design system primitives
├── tasks/
│   ├── spec.md          # source of truth for requirements
│   ├── design.md        # RTOps design system
│   ├── todo.md          # milestone plan
│   └── lessons.md       # session retrospectives
├── docker-compose.yml
├── Makefile
└── CHANGELOG.md
```

## Roadmap

See `tasks/todo.md`. Current milestone: **M7 — Red & blue execution on a mission test** (done). Next: M8 (custom detection-level CRUD).

## License

TBD.
-												feat(m0): bootstrap repo, design system, compose stack

- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-11 06:16:00 +02:00
+								# Metamorph
 								Collaborative purple-team platform. Red team logs the tests they execute (procedure, command, timestamp); blue team annotates each test with detection evidence (alerts, logs, files). At the end of an engagement, Metamorph generates a standalone reveal.js slide deck classified by MITRE ATT&CK tactic.
-												feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll

DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:

Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
  ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
  read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
  - `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
  - `PUT  /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
    perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
  - `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
    and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
    that fires *before* idempotency, `executed_at` auto-stamped on the way in
  - `GET  /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
  - Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
  - Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
  - Atomic `os.replace`, hex-validated SHA path component, root-dir guard
  - Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
  re-seeds detection levels as a safety net.

Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
  comment, mark-executed + override toggle) and cyan border (detection-level
  select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
  /activity every 15 s, gated on document.visibilityState. Per-field disable
  based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.

Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
  gating, state-machine matrix incl. idempotent-side enforcement, executed_at
  override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
  activity polling with URL-encoded `since`, membership 404 vs admin bypass,
  cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
  (red-only/blue-only API gating, mark-executed + reviewed_by_blue side
  enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
  + transition, non-member 404 message). afterAll restores stable admin and
  re-syncs MITRE.

Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
  and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
  query timestamps, perm-before-flush, atomic move, polling visibility gate).

Test count: 133 pytest / 49 Playwright, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-14 08:16:48 +02:00
+								> **Status**: M0–M7 delivered (bootstrap → DB schema → auth → RBAC → MITRE ATT&CK reference → test & scenario templates → missions snapshot → red/blue execution on a mission test). See `tasks/spec.md` for the full specification and `tasks/todo.md` for the milestone-by-milestone plan.
-												feat(m0): bootstrap repo, design system, compose stack

- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-11 06:16:00 +02:00
 								## Stack
 								- **Backend**: Python 3.12, Flask 3, SQLAlchemy 2 + Alembic (M1+), PostgreSQL 16.
 								- **Frontend**: React 18 + TypeScript + Vite + TailwindCSS (RTOps design tokens, see `tasks/design.md`).
 								- **Auth (M2+)**: JWT access (1h) + refresh (30d), Argon2id, invite-link enrollment.
-												docs(m4): CHANGELOG, README, lessons, spec drift fix, todo tick

- CHANGELOG: added M4 section listing endpoints, CLI, volume, persisted
  settings, picker, and the post-spec-review fixes (custom-URL integrity
  requirement + /diag/reset consistency + spec drift). Includes the
  intentional decisions paragraph (seed-time download not image-baked, read
  endpoints unauthenticated-perm-wise, stdlib over httpx).
- README: status bumped to M0–M4, added MITRE quickstart (make seed-mitre +
  air-gapped path with --source /data/mitre/<file> + --skip-checksum),
  testing-m<N>.md pointer updated to testing-m4.md, roadmap line.
- tasks/spec.md §10 #4: amended "14 tactics Enterprise" → "≥14 tactics
  Enterprise (la v19 du pin actuel en ship 15)".
- tasks/lessons.md: 7 M4 lessons captured (stdlib STIX parsing, decoupling
  DoD asserts from upstream versions, subtechnique parent resolution, single-
  transaction safety, custom-URL footgun mitigation, /diag/reset consistency,
  named-volume permission caveat, podman build cache surprise).
- tasks/todo.md: M4 marked ☑.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 13:54:46 +02:00
+								- **RBAC (M3+)**: atomic permissions (31 codes) bundled into custom groups; 3 system groups seeded (`admin` / `redteam` / `blueteam`).
 								- **MITRE ATT&CK (M4+)**: Enterprise reference catalogue pinned to v19.0, seedable via `make seed-mitre`.
-												test(m5): playwright spec + docs (CHANGELOG, README, lessons, testing-m5)

- 4 Playwright tests: API CRUD round-trip, scenario reorder via PUT, SPA
  list + opsec filter, SPA scenario list rendering with ordered tests.
- afterAll restores the stable admin (admin@metamorph.local) per the
  test_admin memory rule.
- CHANGELOG M5 section + Fixed subsections for the LogRecord 'name'
  collision and the React `currentTarget` vs `target` quirk.
- README status bumps to M0-M5.
- tasks/lessons.md captures the new patterns (sentinel pattern for
  partial-update, FK ordering in /diag/reset, dnd-kit stable IDs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 19:57:51 +02:00
+								- **Template catalogue (M5+)**: reusable `test_templates` (markdown procedure, OPSEC level, free tags, expected IOCs, MITRE tags) + ordered `scenario_templates` with drag-and-drop reordering. Admin pages at `/admin/tests` and `/admin/scenarios`.
-												feat(m6): missions + snapshot CRUD, membership visibility, status state machine

Adds the mission layer that materialises template snapshots, plus the SPA
list / 3-step wizard / detail page.

Backend:
- app/services/missions.py — create_mission snapshots scenarios, tests, MITRE
  tags in a 4-query write; list/get apply a non-admin membership filter that
  collapses to 404 (no existence leak); status state machine enforces
  draft → in_progress → completed → archived with archived as a sink; the
  non-admin creator is auto-added as role_hint='red' to retain visibility.
- app/api/missions.py — 8 endpoints (list, get, create, update, add
  scenarios, set members, transition, soft-delete) with strict pydantic
  schemas. The transition endpoint splits the perm gate manually so
  archive requires mission.archive while other targets use mission.update.
- app/api/users.py — new GET /users/roster returning (id, email,
  display_name) only, gated by user.read OR mission.create OR
  mission.update — lets non-admin wizard users see assignable peers
  without exposing the admin /users payload.
- app/api/diag.py — /diag/reset truncates the mission_* tables before the
  template tables because the source_*_template_id FKs are ON DELETE SET
  NULL, which is cheaper to short-circuit by removing the children first.

Frontend:
- lib/missions.ts — typed client, queryKey factory, status accent map.
- pages/MissionsListPage.tsx — list cards with status accent + filters
  (q, client, status).
- pages/MissionsCreatePage.tsx — 3-step wizard (meta → scenarios → members)
  with member roster fed by /users/roster.
- pages/MissionDetailPage.tsx — header + transition buttons (legal next
  states only) + Tests/Members/Synthesis/Export tabs.
- Routes + nav entry (visible to anyone with mission.read or admin).

Tests:
- backend/tests/test_missions.py — 22 pytest covering snapshot fidelity,
  MITRE propagation, membership visibility, transition state machine,
  perm gating, member set replace, append scenarios, soft-delete, partial
  update, inverted-date rejection.
- e2e/tests/m6-missions.spec.ts — 5 Playwright (snapshot freezing, non-admin
  visibility, status transitions + 409, SPA wizard end-to-end, list filter).

Docs:
- CHANGELOG, tasks/testing-m6.md, tasks/lessons.md (snapshot tradeoffs,
  membership=404 pattern, /diag/reset order, auto-creator add).
- README + tasks/todo.md updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-13 15:07:32 +02:00
+								- **Missions (M6+)**: `missions` snapshot one or more scenario templates at creation time; template edits don't drift live missions (`mission_*` tables freeze every field, including MITRE tags). Non-admin members see only their own missions (membership filter, 404 on existence-leak attempts). Status state machine `draft → in_progress → completed → archived`, archive perm gated separately. SPA: list/filter at `/missions`, 3-step create wizard at `/missions/new`, detail page with Tests / Members / Synthesis / Export tabs.
-												feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll

DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:

Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
  ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
  read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
  - `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
  - `PUT  /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
    perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
  - `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
    and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
    that fires *before* idempotency, `executed_at` auto-stamped on the way in
  - `GET  /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
  - Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
  - Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
  - Atomic `os.replace`, hex-validated SHA path component, root-dir guard
  - Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
  re-seeds detection levels as a safety net.

Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
  comment, mark-executed + override toggle) and cyan border (detection-level
  select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
  /activity every 15 s, gated on document.visibilityState. Per-field disable
  based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.

Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
  gating, state-machine matrix incl. idempotent-side enforcement, executed_at
  override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
  activity polling with URL-encoded `since`, membership 404 vs admin bypass,
  cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
  (red-only/blue-only API gating, mark-executed + reviewed_by_blue side
  enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
  + transition, non-member 404 message). afterAll restores stable admin and
  re-syncs MITRE.

Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
  and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
  query timestamps, perm-before-flush, atomic move, polling visibility gate).

Test count: 133 pytest / 49 Playwright, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-14 08:16:48 +02:00
+								- **Execution (M7+)**: per-test page `/missions/<id>/tests/<test_id>` with two zones — red (command/output/comment + mark-executed with override) and blue (detection-level select / comment / drag-and-drop evidence upload). Field-level perm gating: `mission.write_red_fields` / `mission.write_blue_fields` are server-enforced *per field*. State machine `pending↔skipped/blocked` + `pending→executed→reviewed_by_blue` with side-aware perms. Evidence pipeline: streaming upload to `${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>`, SHA256 + MIME + extension + 25 MB cap. 15 s activity polling via `/missions/<id>/activity?since=…` drives the "modified by X" badge. 4 default `detection_levels` seeded at boot.
-												feat(m0): bootstrap repo, design system, compose stack

- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-11 06:16:00 +02:00
+								- **Delivery**: docker-compose. TLS termination is expected to be handled by an external reverse proxy in production.
 								## Quickstart
 								Works with **Docker** *or* **Podman**. The Makefile auto-detects the available engine and picks the matching compose driver (`docker compose`, `podman compose`, or `podman-compose`).
 								Requires one of:
 								- Docker Engine 24+ with the Compose v2 plugin, **or**
 								- Podman 4.0+ with `podman compose` (or the legacy `podman-compose` ≥ 1.0.6)
 								```bash
 								git clone <this repo>
 								cd Metamorph
 								make engine               # confirm which engine the Makefile picked up
 								make env                  # creates .env from .env.example
 								$EDITOR .env              # set strong values for POSTGRES_PASSWORD and JWT_SECRET
 								make up                   # builds and starts api + db + front
 								make logs                 # tail logs
 								```
 								Override the auto-detection if you have both engines installed:
 								```bash
 								make up ENGINE=podman                       # force podman + auto-pick its compose driver
 								make up ENGINE=docker COMPOSE="docker compose"
 								COMPOSE=podman-compose make up              # force the legacy wrapper specifically
 								```
 								Then:
 								- Front: <http://localhost:8080>
 								- API health: <http://localhost:8080/api/v1/health> (proxied) or <http://localhost:8000/api/v1/health>
-												docs(m4): CHANGELOG, README, lessons, spec drift fix, todo tick

- CHANGELOG: added M4 section listing endpoints, CLI, volume, persisted
  settings, picker, and the post-spec-review fixes (custom-URL integrity
  requirement + /diag/reset consistency + spec drift). Includes the
  intentional decisions paragraph (seed-time download not image-baked, read
  endpoints unauthenticated-perm-wise, stdlib over httpx).
- README: status bumped to M0–M4, added MITRE quickstart (make seed-mitre +
  air-gapped path with --source /data/mitre/<file> + --skip-checksum),
  testing-m<N>.md pointer updated to testing-m4.md, roadmap line.
- tasks/spec.md §10 #4: amended "14 tactics Enterprise" → "≥14 tactics
  Enterprise (la v19 du pin actuel en ship 15)".
- tasks/lessons.md: 7 M4 lessons captured (stdlib STIX parsing, decoupling
  DoD asserts from upstream versions, subtechnique parent resolution, single-
  transaction safety, custom-URL footgun mitigation, /diag/reset consistency,
  named-volume permission caveat, podman build cache surprise).
- tasks/todo.md: M4 marked ☑.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-12 13:54:46 +02:00
+								### First-time setup
 								```bash
 								make migrate                  # apply DB schema
 								make print-install-token      # prints the bootstrap admin token (logs banner)
 								# visit http://localhost:8080/setup, paste the token, create the admin account
 								make seed-mitre               # populate the MITRE ATT&CK reference (~50 MB, ~1 s)
 								```
 								The MITRE bundle is cached in the named volume `metamorph_mitre` (`/data/mitre/<file>.json` inside the api container). For air-gapped operators, pre-populate the volume with your own STIX 2.1 file and run `podman compose exec api flask --app app.cli metamorph seed-mitre --source /data/mitre/your-file.json --skip-checksum`.
-												feat(m0): bootstrap repo, design system, compose stack

- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-11 06:16:00 +02:00
+								To stop:
 								```bash
 								make down            # keep volumes
 								make clean           # also drop volumes (DESTRUCTIVE)
 								```
 								## Local dev (no Docker)
 								Requires:
 								- [uv](https://github.com/astral-sh/uv) for Python deps
 								- Node.js 20+ and `npm`
 								- A reachable Postgres (or `make up db` to run only the db container)
 								```bash
 								make dev-api     # in one terminal
 								make dev-front   # in another
 								```
 								## Environment variables
 								See `.env.example`. The most important ones:
 								| Variable           | Purpose                                              |
 								|--------------------|------------------------------------------------------|
 								| `APP_ENV`          | `dev` allows placeholder secrets; anything else (prod/staging) refuses to boot with defaults |
 								| `POSTGRES_*`       | DB credentials (used by `db` and `api`)              |
 								| `JWT_SECRET`       | HS256 signing key — generate 64+ random bytes (`python -c "import secrets; print(secrets.token_urlsafe(64))"`) |
 								| `LOG_LEVEL`        | `DEBUG` / `INFO` / `WARNING` / `ERROR`               |
 								| `FRONT_ORIGIN`     | Allowed CORS origin for the SPA                      |
 								| `EVIDENCE_DIR`     | Path inside the api container where uploads land     |
 								| `HOST_API_PORT`    | Host port mapped to the api (default 8000)           |
 								| `HOST_FRONT_PORT`  | Host port mapped to the front nginx (default 8080)   |
 								## Testing
-												feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll

DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:

Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
  ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
  read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
  - `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
  - `PUT  /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
    perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
  - `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
    and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
    that fires *before* idempotency, `executed_at` auto-stamped on the way in
  - `GET  /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
  - Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
  - Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
  - Atomic `os.replace`, hex-validated SHA path component, root-dir guard
  - Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
  re-seeds detection levels as a safety net.

Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
  comment, mark-executed + override toggle) and cyan border (detection-level
  select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
  /activity every 15 s, gated on document.visibilityState. Per-field disable
  based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.

Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
  gating, state-machine matrix incl. idempotent-side enforcement, executed_at
  override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
  activity polling with URL-encoded `since`, membership 404 vs admin bypass,
  cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
  (red-only/blue-only API gating, mark-executed + reviewed_by_blue side
  enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
  + transition, non-member 404 message). afterAll restores stable admin and
  re-syncs MITRE.

Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
  and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
  query timestamps, perm-before-flush, atomic move, polling visibility gate).

Test count: 133 pytest / 49 Playwright, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-14 08:16:48 +02:00
+								- **Manual + automated checklist for the current milestone**: see [`tasks/testing-m<N>.md`](tasks/testing-m7.md) (current: `testing-m7.md`).
-												feat(m0): bootstrap repo, design system, compose stack

- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-11 06:16:00 +02:00
+								- **Backend unit tests**: `make test-api`
 								- **End-to-end (Playwright)**: `make e2e-install` (once), then `make up && make e2e`. Reports land in `e2e/playwright-report/` (HTML + JUnit XML); open with `make e2e-report`.
 								## Pre-commit hooks
 								After cloning, install hooks once:
 								```bash
 								pipx install pre-commit   # or: pip install --user pre-commit
 								pre-commit install
 								pre-commit run --all-files   # initial sweep
 								```
 								The hooks run `ruff` + `ruff-format` on the backend and `eslint` / `tsc --noEmit` / `prettier --check` on the frontend (see `.pre-commit-config.yaml`).
 								## Project layout
 								```
 								.
 								├── backend/             # Flask API
 								│   └── app/
 								│       ├── api/         # HTTP layer (blueprints)
 								│       ├── core/        # config, logging, errors
 								│       ├── db/          # SQLAlchemy session, migrations (M1+)
 								│       ├── models/      # ORM models (M1+)
 								│       ├── services/    # domain logic (M2+)
 								│       └── i18n/        # message catalogs (M13)
 								├── frontend/            # Vite + React + TS + Tailwind
 								│   └── src/components/ui/   # RTOps design system primitives
 								├── tasks/
 								│   ├── spec.md          # source of truth for requirements
 								│   ├── design.md        # RTOps design system
 								│   ├── todo.md          # milestone plan
 								│   └── lessons.md       # session retrospectives
 								├── docker-compose.yml
 								├── Makefile
 								└── CHANGELOG.md
 								```
 								## Roadmap
-												feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll

DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:

Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
  ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
  read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
  - `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
  - `PUT  /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
    perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
  - `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
    and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
    that fires *before* idempotency, `executed_at` auto-stamped on the way in
  - `GET  /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
  - Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
  - Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
  - Atomic `os.replace`, hex-validated SHA path component, root-dir guard
  - Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
  re-seeds detection levels as a safety net.

Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
  comment, mark-executed + override toggle) and cyan border (detection-level
  select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
  /activity every 15 s, gated on document.visibilityState. Per-field disable
  based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.

Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
  gating, state-machine matrix incl. idempotent-side enforcement, executed_at
  override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
  activity polling with URL-encoded `since`, membership 404 vs admin bypass,
  cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
  (red-only/blue-only API gating, mark-executed + reviewed_by_blue side
  enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
  + transition, non-member 404 message). afterAll restores stable admin and
  re-syncs MITRE.

Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
  and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
  query timestamps, perm-before-flush, atomic move, polling visibility gate).

Test count: 133 pytest / 49 Playwright, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-14 08:16:48 +02:00
+								See `tasks/todo.md`. Current milestone: **M7 — Red & blue execution on a mission test** (done). Next: M8 (custom detection-level CRUD).
-												feat(m0): bootstrap repo, design system, compose stack

- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
  README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
  serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
  GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
  Pydantic Settings with secret guard rails (refuses to boot in non-dev with
  placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
  design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
  Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
  Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
  compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
  seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
  reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
  (14 milestones), tasks/testing-m0.md, tasks/lessons.md.

DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-05-11 06:16:00 +02:00
 								## License
 								TBD.