feat(m0): bootstrap repo, design system, compose stack
- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
Pydantic Settings with secret guard rails (refuses to boot in non-dev with
placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
(14 milestones), tasks/testing-m0.md, tasks/lessons.md.
DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:00 +02:00
---
type: todo
date: "2026-05-08"
tags: [todo, plan]
status: in_progress
project: Metamorph
spec: tasks/spec.md
---
# Metamorph — Plan d'implémentation
> Découpage en 14 milestones livrables indépendamment. Chaque milestone a une **DoD** vérifiable. Cocher au fil de l'eau, documenter les écarts dans `CHANGELOG.md`, retours d'expérience dans `tasks/lessons.md`.
## Convention
- ☐ = à faire · ☑ = fait · ⚠ = bloqué (commenter) · ↻ = en cours
- Branches : `feature/m<N>-<slug>` · commits : `feat(m<N>): …` / `fix(m<N>): …`
- Chaque PR doit : passer lint/typecheck, mettre à jour `CHANGELOG.md` , mettre à jour `README.md` si surface utilisateur.
- **Chaque milestone livre un fichier `tasks/testing-m<N>.md` ** (procédure manuelle + automatisée) **et au moins un spec Playwright `e2e/tests/m<N>-*.spec.ts` ** .
- À la fin de chaque milestone : lancer le subagent `spec-reviewer` (HARD RULE 4 du CLAUDE.md global) avant de marquer le milestone done.
---
## M0 — Bootstrap repo & infra ☐
**But** : squelette buildable de bout en bout sans aucune feature métier.
- ☐ `backend/` (Flask 3, Python 3.12, `pyproject.toml` avec uv ou poetry, structure `app/{api,core,db,models,services,i18n}` )
- ☐ `frontend/` (Vite + React 18 + TS strict, Tailwind 3, ESLint + Prettier, alias `@/` )
- ☐ Tokens design `tasks/design.md` traduits en `frontend/tailwind.config.ts` (palette CSS vars, typo JetBrains Mono / IBM Plex Sans, radii 3/4/6/10).
- ☐ Composants UI de base : `<Card>` , `<Tag>` , `<SectionHeader>` (avec `// ` ), `<FlowNode>` , `<Button>` — fidèles au design.
- ☐ `docker-compose.yml` : services `api` , `db` (postgres:16-alpine), `front` (nginx servant le bundle Vite).
- ☐ Dockerfile multi-stage par service ; volumes nommés `metamorph_db` , `metamorph_evidence` .
- ☐ `Makefile` : `dev` , `build` , `up` , `down` , `migrate` , `seed-mitre` , `lint` , `test` .
- ☐ Pré-commit hook : `ruff` (back), `eslint` +`tsc --noEmit` (front).
- ☐ `README.md` minimal (run en dev, run en prod, variables d'env attendues).
- ☐ `.gitignore` : `.env` , `*.exe` , `*.dll` , `__pycache__/` , `node_modules/` , `dist/` , `data/` .
- ☐ `.env.example` documenté (`POSTGRES_*` , `JWT_SECRET` , `LOG_LEVEL` , `FRONT_ORIGIN` ).
- ☐ Logs JSON structurés sur stdout (`python-json-logger` ).
**DoD** : `make up` démarre les 3 conteneurs ; `curl http://localhost:${HOST_FRONT_PORT:-8080}/api/v1/health` renvoie `{ "status": "ok", "version": "..." }` (proxifié par nginx via `api:8000` ) ; le front sur `:8080` affiche une page d'accueil au design RTOps ; * * `make e2e` passe les 8 tests Playwright** ; rapport HTML dans `e2e/playwright-report/` . Procédure complète : `tasks/testing-m0.md` . * En prod, la TLS est terminée par un reverse proxy externe (cf. spec §6 NF-network) — la stack compose ne sert que du HTTP. *
---
## M1 — Schéma DB & migrations Alembic ☐
**But** : modèle de données complet versionné, sans logique métier.
- ☐ Configurer SQLAlchemy 2.x + Alembic.
- ☐ Tables auth/RBAC : `users` , `groups` , `permissions` , `user_groups` , `group_permissions` , `invitations` , `refresh_tokens` .
- ☐ Tables MITRE : `mitre_tactics` , `mitre_techniques` , `mitre_subtechniques` (avec `external_id` , `name` , `description` , `url` ).
- ☐ Tables templates : `test_templates` , `test_template_mitre_tags` (jointure many-to-many tactic/technique/subtechnique), `scenario_templates` , `scenario_template_tests` (avec `position` ).
- ☐ Tables missions : `missions` , `mission_members` , `mission_scenarios` (snapshot), `mission_tests` (snapshot + state), `mission_test_mitre_tags` , `mission_categories` (custom).
- ☐ Tables exécution : `evidence_files` (FK `mission_test_id` , `sha256` , `mime` , `size_bytes` , `storage_path` , `original_filename` ).
- ☐ Tables paramétrage : `detection_levels` (clé, label_fr, label_en, color_token, position, is_default), `settings` (key/value).
- ☐ Table notifications : `notifications` (FK user, type, payload JSONB, read_at, created_at).
- ☐ Soft delete : colonne `deleted_at` partout sauf tables jointures simples ; index partiel `WHERE deleted_at IS NULL` .
- ☐ Audit minimal : `created_at` , `updated_at` partout.
- ☐ Migration initiale Alembic + commande `make migrate` .
**DoD** : `make migrate` applique le schéma sur une DB vide ; `\dt` montre toutes les tables ; les contraintes FK et les index sont en place.
---
## M2 — Auth, bootstrap, invitations ☑
**But** : un humain peut s'inscrire et se connecter.
- ☐ Hash mot de passe : `argon2-cffi` (params modérés, `time_cost=2, memory_cost=64MB` ).
- ☐ JWT : `pyjwt` , HS256, claims `sub` , `iat` , `exp` , `type` (access|refresh), `jti` . Access 1h, refresh 30j.
- ☐ Stockage refresh tokens en DB (rotation à chaque usage, révocation au logout).
- ☐ Endpoints : `POST /auth/login` , `POST /auth/refresh` , `POST /auth/logout` , `GET /auth/me` , `POST /auth/change-password` .
- ☐ Bootstrap : commande `flask metamorph print-install-token` génère + persiste un token unique au 1er démarrage (si table `users` vide), écrit dans les logs au boot.
- ☐ Endpoint `POST /setup` : consomme le token d'install, crée le 1er admin (groupe `admin` seedé).
- ☐ Invitations : `POST /invitations` (admin, génère token 7j), `GET /invitations/{token}` (preview), `POST /invitations/{token}/accept` (création compte avec password choisi).
- ☐ Middleware d'auth Flask (`@require_auth` , `@require_perm("...")` ).
- ☐ Rate-limit `flask-limiter` sur `/auth/*` (10/min/IP).
- ☐ Front : pages `/login` , `/setup` , `/register?token=…` , `/profile` . Stockage access en mémoire, refresh en cookie HTTPOnly Secure SameSite=Strict.
- ☐ Hook React `useAuth()` + interceptor TanStack Query (refresh auto sur 401).
- ☐ CORS strict (origin `FRONT_ORIGIN` ).
**DoD** : `flask metamorph print-install-token` → /setup → création admin → login → /auth/me OK ; admin crée invitation → user s'inscrit via lien → login OK ; `/auth/refresh` renouvelle correctement.
---
## M3 — RBAC : groupes, permissions, gestion users ☑
**But** : admin peut composer des groupes custom et y assigner des users.
- ☐ Seed des permissions atomiques (familles spec §4) :
- `user.{read,create,update,delete}` , `group.{read,create,update,delete}` , `invitation.{create,revoke,read}`
- `test_template.{read,create,update,delete}` , `scenario_template.{read,create,update,delete}`
- `mission.{read,create,update,archive,delete}` , `mission.write_red_fields` , `mission.write_blue_fields`
- `detection_level.{read,update}` , `setting.{read,update}` , `mitre.sync`
- ☐ Seed des 3 groupes par défaut (`admin` = toutes, `redteam` = templates(read) + missions(read,create,update) + write_red_fields, `blueteam` = templates(read) + missions(read) + write_blue_fields).
- ☐ Endpoints CRUD `groups` , `permissions` (lecture seule), `users` (admin), `users/{id}/groups` (assign).
- ☐ Décorateur `@require_perm` qui vérifie l'union des perms via tous les groupes du user.
- ☐ Front : page Admin > Users (liste, recherche, modale d'édition des groupes), Admin > Groups (CRUD + multi-select des perms), Admin > Invitations (liste, créer, révoquer).
- ☐ UI : on n'affiche pas les actions interdites (mais le serveur reste l'arbitre).
**DoD** : un admin peut créer un groupe `pentest-2026-Q2` avec uniquement `mission.read` + `mission.write_red_fields` , l'attribuer à Bob ; Bob voit les missions auxquelles il est membre mais ne peut pas écrire dans les champs blue (HTTP 403 au niveau API).
---
docs(m4): CHANGELOG, README, lessons, spec drift fix, todo tick
- CHANGELOG: added M4 section listing endpoints, CLI, volume, persisted
settings, picker, and the post-spec-review fixes (custom-URL integrity
requirement + /diag/reset consistency + spec drift). Includes the
intentional decisions paragraph (seed-time download not image-baked, read
endpoints unauthenticated-perm-wise, stdlib over httpx).
- README: status bumped to M0–M4, added MITRE quickstart (make seed-mitre +
air-gapped path with --source /data/mitre/<file> + --skip-checksum),
testing-m<N>.md pointer updated to testing-m4.md, roadmap line.
- tasks/spec.md §10 #4: amended "14 tactics Enterprise" → "≥14 tactics
Enterprise (la v19 du pin actuel en ship 15)".
- tasks/lessons.md: 7 M4 lessons captured (stdlib STIX parsing, decoupling
DoD asserts from upstream versions, subtechnique parent resolution, single-
transaction safety, custom-URL footgun mitigation, /diag/reset consistency,
named-volume permission caveat, podman build cache surprise).
- tasks/todo.md: M4 marked ☑.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:54:46 +02:00
## M4 — MITRE ATT&CK Enterprise ☑
feat(m0): bootstrap repo, design system, compose stack
- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
Pydantic Settings with secret guard rails (refuses to boot in non-dev with
placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
(14 milestones), tasks/testing-m0.md, tasks/lessons.md.
DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:00 +02:00
**But** : le référentiel ATT&CK est interrogeable et tagué sur les tests.
- ☐ Téléchargement initial du STIX bundle Enterprise depuis `github.com/mitre/cti` (vérifier hash, pin une version).
- ☐ Parser STIX → tables `mitre_tactics` / `mitre_techniques` / `mitre_subtechniques` (extraire `external_id` ATT&CK, `name` , `description` , `url` , relations technique↔tactic).
- ☐ Commande `flask metamorph seed-mitre [--source <path|url>]` .
- ☐ Endpoint `POST /mitre/sync` (perm `mitre.sync` ) qui re-pull depuis l'URL configurée (setting `mitre_source_url` ).
- ☐ Persister `mitre_last_sync` dans `settings` .
- ☐ Endpoint `GET /mitre/tactics` , `/mitre/techniques?tactic=…` , `/mitre/subtechniques?technique=…` (pagination + recherche full-text simple sur `name` ).
2026-05-12 18:32:20 +02:00
- ☐ Front : composant `<MitreTagPicker>` — matrice flat type `attack.mitre.org/#` (colonnes = tactics, cellules = techniques, chevron `+N` qui déplie les sub-techniques inline). Click = (dé)sélection, multi-niveaux cumulatif, chips en haut, recherche par `external_id` ou `name` . Alimenté par `GET /mitre/matrix` (one-shot, ~55 KB).
feat(m0): bootstrap repo, design system, compose stack
- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
Pydantic Settings with secret guard rails (refuses to boot in non-dev with
placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
(14 milestones), tasks/testing-m0.md, tasks/lessons.md.
DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:00 +02:00
**DoD** : après `make seed-mitre` , `GET /mitre/tactics` retourne 14 tactics Enterprise ; le picker permet de tagger un test avec « TA0002 / T1059.001 » et l'enregistrement est persistant.
---
2026-05-12 19:57:51 +02:00
## M5 — Templates : tests unitaires & scénarios ☑
feat(m0): bootstrap repo, design system, compose stack
- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
Pydantic Settings with secret guard rails (refuses to boot in non-dev with
placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
(14 milestones), tasks/testing-m0.md, tasks/lessons.md.
DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:00 +02:00
**But** : admin peut bâtir le catalogue réutilisable.
- ☐ Modèle `test_template` : nom, description, objectif, procédure (markdown), prérequis (markdown), résultat attendu red, détection attendue blue, niveau OPSEC (`enum low/med/high` ), tags libres (array text), IOCs attendus (array text), tags MITRE (multi).
- ☐ Endpoints CRUD `/test-templates` avec validation pydantic.
- ☐ Modèle `scenario_template` : nom, description, liste ordonnée de tests (`position` ).
- ☐ Endpoints CRUD `/scenario-templates` , `PUT /scenario-templates/{id}/tests` (réordonnancement).
- ☐ Front : page Admin > Tests (liste filtrable par tactic / OPSEC / tag), modale d'édition (form complet avec markdown editor — `@uiw/react-md-editor` ou équivalent léger).
- ☐ Front : page Admin > Scénarios, drag-and-drop avec `@dnd-kit/sortable` .
- ☐ Filtres : recherche full-text sur nom/desc, facettes MITRE/OPSEC/tags.
**DoD** : admin crée 5 tests + 1 scénario de 3 tests réordonnés ; recharge la page → ordre persistant ; suppression soft-delete d'un template n'efface pas les scénarios.
---
feat(m6): missions + snapshot CRUD, membership visibility, status state machine
Adds the mission layer that materialises template snapshots, plus the SPA
list / 3-step wizard / detail page.
Backend:
- app/services/missions.py — create_mission snapshots scenarios, tests, MITRE
tags in a 4-query write; list/get apply a non-admin membership filter that
collapses to 404 (no existence leak); status state machine enforces
draft → in_progress → completed → archived with archived as a sink; the
non-admin creator is auto-added as role_hint='red' to retain visibility.
- app/api/missions.py — 8 endpoints (list, get, create, update, add
scenarios, set members, transition, soft-delete) with strict pydantic
schemas. The transition endpoint splits the perm gate manually so
archive requires mission.archive while other targets use mission.update.
- app/api/users.py — new GET /users/roster returning (id, email,
display_name) only, gated by user.read OR mission.create OR
mission.update — lets non-admin wizard users see assignable peers
without exposing the admin /users payload.
- app/api/diag.py — /diag/reset truncates the mission_* tables before the
template tables because the source_*_template_id FKs are ON DELETE SET
NULL, which is cheaper to short-circuit by removing the children first.
Frontend:
- lib/missions.ts — typed client, queryKey factory, status accent map.
- pages/MissionsListPage.tsx — list cards with status accent + filters
(q, client, status).
- pages/MissionsCreatePage.tsx — 3-step wizard (meta → scenarios → members)
with member roster fed by /users/roster.
- pages/MissionDetailPage.tsx — header + transition buttons (legal next
states only) + Tests/Members/Synthesis/Export tabs.
- Routes + nav entry (visible to anyone with mission.read or admin).
Tests:
- backend/tests/test_missions.py — 22 pytest covering snapshot fidelity,
MITRE propagation, membership visibility, transition state machine,
perm gating, member set replace, append scenarios, soft-delete, partial
update, inverted-date rejection.
- e2e/tests/m6-missions.spec.ts — 5 Playwright (snapshot freezing, non-admin
visibility, status transitions + 409, SPA wizard end-to-end, list filter).
Docs:
- CHANGELOG, tasks/testing-m6.md, tasks/lessons.md (snapshot tradeoffs,
membership=404 pattern, /diag/reset order, auto-creator add).
- README + tasks/todo.md updated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 15:07:32 +02:00
## M6 — Missions & snapshot ☑
feat(m0): bootstrap repo, design system, compose stack
- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
Pydantic Settings with secret guard rails (refuses to boot in non-dev with
placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
(14 milestones), tasks/testing-m0.md, tasks/lessons.md.
DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:00 +02:00
**But** : transformer les templates en missions vivantes.
- ☐ Modèle `mission` : nom, client/cible (texte), date_start, date_end, status (`enum draft/in_progress/completed/archived` ), description (markdown), `visibility_mode` figé à `whitebox` v1.
- ☐ `mission_members` : (mission_id, user_id, role_hint `red|blue` ) — rôle hint informatif, l'autorisation reste portée par les permissions.
- ☐ Lors de la création/modification d'une mission, sélection de scénarios → **snapshot ** : copie complète des `scenario_templates` et `test_templates` dans `mission_scenarios` / `mission_tests` (y compris tags MITRE).
- ☐ `mission_tests` ajoute : `state` (`enum pending/executed/reviewed_by_blue/skipped/blocked` ), `executed_at` (nullable), `executed_at_override` (bool), `red_command` , `red_output` , `red_comment` , `blue_comment` , `detection_level_id` (nullable).
- ☐ Endpoints : `POST /missions` , `GET /missions` (filtré par perms + membership pour les non-admin), `GET /missions/{id}` (avec scénarios+tests), `PUT /missions/{id}` (métadonnées + ajout de scénarios → snapshot), `POST /missions/{id}/transition` (drift de status), `DELETE /missions/{id}` (soft).
- ☐ Front : page Missions (liste + filtres status/client/dates), création (wizard 3 étapes : meta → scénarios → membres), vue mission (header + onglets Tests / Membres / Synthèse / Export).
- ☐ Vue mission : tableau des tests avec colonnes Tactic | Test | Statut | Niveau de détection | Last update, actions selon perms.
**DoD** : red crée une mission avec 1 scénario de 3 tests, ajoute Alice (red) et Bob (blue) ; modification ultérieure d'un test_template ne change rien dans la mission (snapshot préservé).
---
feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll
DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:
Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
- `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
- `PUT /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
- `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
that fires *before* idempotency, `executed_at` auto-stamped on the way in
- `GET /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
- Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
- Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
- Atomic `os.replace`, hex-validated SHA path component, root-dir guard
- Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
re-seeds detection levels as a safety net.
Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
comment, mark-executed + override toggle) and cyan border (detection-level
select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
/activity every 15 s, gated on document.visibilityState. Per-field disable
based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.
Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
gating, state-machine matrix incl. idempotent-side enforcement, executed_at
override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
activity polling with URL-encoded `since`, membership 404 vs admin bypass,
cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
(red-only/blue-only API gating, mark-executed + reviewed_by_blue side
enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
+ transition, non-member 404 message). afterAll restores stable admin and
re-syncs MITRE.
Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
query timestamps, perm-before-flush, atomic move, polling visibility gate).
Test count: 133 pytest / 49 Playwright, all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:16:48 +02:00
## M7 — Saisie red & blue sur un test ☑
feat(m0): bootstrap repo, design system, compose stack
- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
Pydantic Settings with secret guard rails (refuses to boot in non-dev with
placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
(14 milestones), tasks/testing-m0.md, tasks/lessons.md.
DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:00 +02:00
**But** : exécution de la mission, le cœur du produit.
feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll
DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:
Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
- `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
- `PUT /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
- `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
that fires *before* idempotency, `executed_at` auto-stamped on the way in
- `GET /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
- Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
- Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
- Atomic `os.replace`, hex-validated SHA path component, root-dir guard
- Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
re-seeds detection levels as a safety net.
Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
comment, mark-executed + override toggle) and cyan border (detection-level
select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
/activity every 15 s, gated on document.visibilityState. Per-field disable
based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.
Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
gating, state-machine matrix incl. idempotent-side enforcement, executed_at
override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
activity polling with URL-encoded `since`, membership 404 vs admin bypass,
cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
(red-only/blue-only API gating, mark-executed + reviewed_by_blue side
enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
+ transition, non-member 404 message). afterAll restores stable admin and
re-syncs MITRE.
Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
query timestamps, perm-before-flush, atomic move, polling visibility gate).
Test count: 133 pytest / 49 Playwright, all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:16:48 +02:00
- ☑ Modale ou page dédiée `Mission > Test #N` avec deux zones distinctes (red / blue), bordures accentuées par couleur (rouge / cyan).
- ☑ Côté red : champ commande (mono), output (mono multiline), commentaire markdown, bouton « Marquer exécuté » qui set `state=executed` + `executed_at=now()` ; édition de `executed_at` derrière un toggle « override ».
- ☑ Côté blue : sélecteur `detection_level` , commentaire markdown, zone d'upload multi-fichiers (drag-and-drop).
- ☑ Upload preuves : `POST /missions/{id}/tests/{test_id}/evidence` (multipart, validation extension+MIME+taille≤25Mo, calcul SHA256, stockage `/data/evidence/<mission_id>/<test_id>/<sha256>{ext}` ).
- ☑ `GET /evidence/{id}` (download, vérif perm) ; `DELETE /evidence/{id}` (soft).
- ☑ Permissions : tout endpoint d'écriture vérifie `mission.write_red_fields` ou `mission.write_blue_fields` selon le champ touché ; les deux peuvent coexister sur un même groupe (pas exclusifs en code).
- ☑ Bouton « Statut » avec choix `executed` , `reviewed_by_blue` , `skipped` , `blocked` (transitions contrôlées : pending↔skipped/blocked, executed→reviewed_by_blue).
- ☑ Indicateur « modifié par X il y a Ns » : polling `GET /missions/{id}/activity?since=…` toutes les 15 s tant que la page est active.
feat(m0): bootstrap repo, design system, compose stack
- Repo scaffolding: .gitignore, .env.example, Makefile, docker-compose.yml,
README.md, CHANGELOG.md, pre-commit config.
- Three-service stack: api (Flask 3), db (postgres:16-alpine), front (nginx
serving the Vite bundle). Named volumes metamorph_db + metamorph_evidence.
- Backend skeleton: Flask app factory, JSON structured logging on stdout,
GET /api/v1/health, multi-stage Dockerfile, pyproject.toml driven by uv,
Pydantic Settings with secret guard rails (refuses to boot in non-dev with
placeholders), APP_ENV gating.
- Frontend skeleton: Vite + React 18 + TypeScript strict + TailwindCSS, RTOps
design tokens from tasks/design.md, self-hosted JetBrains Mono / IBM Plex
Sans via @fontsource, base UI primitives (Card/Tag/SectionHeader/FlowNode/
Button), home page wired to /api/v1/health.
- Engine-agnostic Makefile: auto-detects docker or podman, picks the matching
compose driver. Targets: up/down/build/rebuild/dev/lint/fmt/test/migrate/
seed-mitre/print-install-token/e2e/inspect-health.
- Playwright suite: e2e/tests/m0-smoke.spec.ts (8 tests) + HTML + JUnit
reports + traces on retry.
- Docs: tasks/spec.md (finalized after Q&A), tasks/design.md, tasks/todo.md
(14 milestones), tasks/testing-m0.md, tasks/lessons.md.
DoD: make up + make health + make e2e all pass on podman 5.x (Fedora) and
docker. TLS terminated by external reverse proxy (spec §6 NF-network).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:16:00 +02:00
**DoD** : red et blue saisissent en parallèle sans conflit ; un user sans `write_blue_fields` reçoit 403 sur les champs blue ; un fichier .evtx de 24 Mo est uploadé, un de 26 Mo est rejeté ; le hash SHA256 est correct.
---
## M8 — Niveaux de détection custom ☐
**But** : la taxonomie d'icônes du slide est paramétrable.
- ☐ Seed initial : `detected_blocked` (red), `detected_alert` (orange), `logged_only` (yellow), `not_detected` (rose).
- ☐ Endpoints `/detection-levels` : list, create, update (label_fr, label_en, color_token, position, is_default).
- ☐ Garde-fou : empêcher la suppression si utilisé dans des `mission_tests` (proposer désactivation).
- ☐ Front : page Admin > Settings > Detection Levels (table + modale, picker de color_token parmi les 10 accents du design).
**DoD** : admin renomme `not_detected` → `missed` , ajoute `false_positive` avec accent purple ; les missions existantes affichent les nouveaux libellés ; un blueteamer voit la nouvelle option dans le sélecteur.
---
## M9 — Notifications in-app ☐
**But** : red et blue savent quand l'autre a agi.
- ☐ Service `notify(user_id, type, payload)` appelé sur transitions clés : `test_executed` , `test_reviewed_by_blue` , `evidence_added` , `mission_status_changed` .
- ☐ Endpoints `GET /notifications?unread_only=…` , `POST /notifications/{id}/read` , `POST /notifications/read-all` .
- ☐ Front : badge dans le header avec compteur, dropdown listant les 20 dernières + lien vers la mission/test.
- ☐ Polling `GET /notifications?unread_only=true` toutes les 30 s (ou WebSocket plus tard, hors scope).
**DoD** : Bob (blue) reçoit un badge « Test #4 prêt à review » 30 s max après qu'Alice (red) clique « Marquer exécuté ».
---
## M10 — Génération du slide reveal.js ☐
**But** : livrable client de fin de mission.
- ☐ Backend : endpoint `GET /missions/{id}/slide.html` qui calcule l'agrégat (tests groupés par MITRE Tactic, comptages par detection_level, plus regroupements custom si configurés).
- ☐ Côté serveur, on émet **un seul fichier HTML standalone ** : reveal.js inliné (CSS + JS), tokens design.md inlinés, données JSON inlinées, **aucune ressource externe ** .
- ☐ Layout : slide titre, slide « Méthodologie », une slide par Tactic avec liste des techniques/tests + icône colorée par detection_level, slide synthèse (matrice tactic × detection_level), slide annexes (preuves référencées par titre, sans binaires).
- ☐ Bouton « Export PDF » dans le slide → `print-pdf` reveal.js (`window.print()` + media query reveal).
- ☐ Front : page Mission > Synthèse avec preview iframe + bouton « Télécharger HTML ».
- ☐ Conformité design : `// ` headings en cyan, accents par detection_level, JetBrains Mono partout.
**DoD** : on télécharge `mission-X.html` , on l'ouvre offline dans Firefox/Chrome, navigation reveal OK, export PDF côté navigateur produit un PDF lisible.
---
## M11 — Exports JSON & CSV ☐
**But** : sortie des données brutes pour archivage.
- ☐ `GET /missions/{id}/export.json` : mission + scénarios + tests + niveaux de détection + métadonnées preuves (sans binaires, mais avec hash + filename).
- ☐ `GET /missions/{id}/export.csv` : une ligne par test (cols : test_name, mitre_tactic, mitre_technique, mitre_subtechnique, executed_at, status, red_command, detection_level, blue_comment_excerpt).
- ☐ Front : boutons d'export sur la page mission, headers `Content-Disposition: attachment` .
**DoD** : `curl -OJ` sur les deux endpoints donne deux fichiers cohérents et complets ; le JSON peut être réimporté dans le futur (laisser cet import en backlog).
---
## M12 — Soft delete & purge admin ☐
**But** : aucune perte de donnée par accident, ménage explicite.
- ☐ Toutes les `DELETE` du back deviennent `UPDATE deleted_at` .
- ☐ Tous les `GET` filtrent `deleted_at IS NULL` par défaut, paramètre `?include_deleted=true` réservé aux admins.
- ☐ Endpoint `POST /admin/purge` (perm admin) avec body `{entity, ids}` qui DELETE physiquement (suppression fichiers preuves incluse).
- ☐ Commande `flask metamorph purge-soft-deleted --older-than 30d` (manuelle, pas de cron auto).
- ☐ Front : page Admin > Trash (filtrée par entity), bouton « Restaurer » + bouton « Purger ».
**DoD** : suppression d'un test depuis l'UI → disparait des listes mais reste en DB ; admin peut le restaurer ; admin peut le purger définitivement, le fichier evidence associé disparait du disque.
---
## M13 — i18n FR / EN ☐
**But** : commutation de langue par utilisateur.
- ☐ Backend : `flask-babel` , deux locales `fr` / `en` . Messages d'erreur API via `gettext` . Fichier `messages.pot` extrait via `pybabel extract` .
- ☐ Frontend : `react-i18next` , namespaces par page, fichiers `frontend/src/i18n/{fr,en}/*.json` .
- ☐ Préférence user : champ `users.locale` (default `fr` ), endpoint `PATCH /auth/me {locale}` , switch dans le header.
- ☐ Données MITRE conservées en EN (officielles, non traduites).
- ☐ Tous les libellés UI passent par `t('…')` — interdit le texte en dur.
**DoD** : Bob change sa langue en EN, recharge → toute l'UI en EN sauf les noms ATT&CK ; un message d'erreur API arrive aussi en EN.
---
## M14 — Polish, sécu, observabilité, doc ☐
**But** : prêt pour livraison.
- ☐ Logs JSON : `request_id` , `user_id` , `path` , `method` , `status` , `duration_ms` , `action` (libre côté service).
- ☐ Audit minimal : logger toute action sensible (`auth.login` , `mission.create` , `evidence.delete` , `admin.purge` ).
- ☐ Rate-limit confirmé sur `/auth/*` et `/invitations/*` .
- ☐ Headers sécu : `Strict-Transport-Security` (si reverse proxy le pose, sinon doc), `Content-Security-Policy` strict côté front, `X-Frame-Options: DENY` , `Referrer-Policy: same-origin` .
- ☐ Validation : tailles max body globale (Flask `MAX_CONTENT_LENGTH` ), schéma pydantic strict partout.
- ☐ `README.md` complet (déploiement, env, premier admin, sync MITRE, backup volumes).
- ☐ `CHANGELOG.md` à jour (Conventional changelog).
- ☐ Critères §10 de la spec : check 1 par 1 sur une démo end-to-end documentée dans `tasks/lessons.md` .
- ☐ Tests : pytest pour la logique critique (auth, RBAC, snapshot, upload, exports). Smoke E2E Playwright (non bloquant, but nice).
**DoD** : démo from-scratch sur Debian 13 — `git clone` → `make up` → setup admin → invite users → crée mission → exécute → annote → génère slide → export. Tous les 15 critères §10 spec validés.
---
## Backlog v2+ (rappel pour ne pas oublier)
- Bascule auth Keycloak/OIDC.
- API d'ingestion C2 externe (push automatique des résultats).
- Audit log détaillé + versioning par champ.
- 2FA TOTP self-service.
- Notifications mail.
- Intégration tunnel C2 (binaires fournis).
- Métriques Prometheus.
- Multi-tenancy / workspaces.
- Branding configurable (logos, couleurs).
---
## Hygiène de session
- Au début de chaque session : relire `tasks/lessons.md` , `CHANGELOG.md` , ce fichier.
- À la fin : mettre à jour les ☐/☑, ajouter une entrée `CHANGELOG.md` , capturer les apprentissages dans `tasks/lessons.md` .
- Pour tout doute architectural : repasser par AskUserQuestion avant d'ouvrir un éditeur.