feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll
DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:
Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
- `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
- `PUT /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
- `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
that fires *before* idempotency, `executed_at` auto-stamped on the way in
- `GET /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
- Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
- Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
- Atomic `os.replace`, hex-validated SHA path component, root-dir guard
- Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
re-seeds detection levels as a safety net.
Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
comment, mark-executed + override toggle) and cyan border (detection-level
select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
/activity every 15 s, gated on document.visibilityState. Per-field disable
based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.
Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
gating, state-machine matrix incl. idempotent-side enforcement, executed_at
override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
activity polling with URL-encoded `since`, membership 404 vs admin bypass,
cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
(red-only/blue-only API gating, mark-executed + reviewed_by_blue side
enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
+ transition, non-member 404 message). afterAll restores stable admin and
re-syncs MITRE.
Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
query timestamps, perm-before-flush, atomic move, polling visibility gate).
Test count: 133 pytest / 49 Playwright, all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -2,7 +2,7 @@
|
||||
|
||||
Collaborative purple-team platform. Red team logs the tests they execute (procedure, command, timestamp); blue team annotates each test with detection evidence (alerts, logs, files). At the end of an engagement, Metamorph generates a standalone reveal.js slide deck classified by MITRE ATT&CK tactic.
|
||||
|
||||
> **Status**: M0–M5 delivered (bootstrap → DB schema → auth → RBAC → MITRE ATT&CK reference → test & scenario templates). See `tasks/spec.md` for the full specification and `tasks/todo.md` for the milestone-by-milestone plan.
|
||||
> **Status**: M0–M7 delivered (bootstrap → DB schema → auth → RBAC → MITRE ATT&CK reference → test & scenario templates → missions snapshot → red/blue execution on a mission test). See `tasks/spec.md` for the full specification and `tasks/todo.md` for the milestone-by-milestone plan.
|
||||
|
||||
## Stack
|
||||
|
||||
@@ -13,6 +13,7 @@ Collaborative purple-team platform. Red team logs the tests they execute (proced
|
||||
- **MITRE ATT&CK (M4+)**: Enterprise reference catalogue pinned to v19.0, seedable via `make seed-mitre`.
|
||||
- **Template catalogue (M5+)**: reusable `test_templates` (markdown procedure, OPSEC level, free tags, expected IOCs, MITRE tags) + ordered `scenario_templates` with drag-and-drop reordering. Admin pages at `/admin/tests` and `/admin/scenarios`.
|
||||
- **Missions (M6+)**: `missions` snapshot one or more scenario templates at creation time; template edits don't drift live missions (`mission_*` tables freeze every field, including MITRE tags). Non-admin members see only their own missions (membership filter, 404 on existence-leak attempts). Status state machine `draft → in_progress → completed → archived`, archive perm gated separately. SPA: list/filter at `/missions`, 3-step create wizard at `/missions/new`, detail page with Tests / Members / Synthesis / Export tabs.
|
||||
- **Execution (M7+)**: per-test page `/missions/<id>/tests/<test_id>` with two zones — red (command/output/comment + mark-executed with override) and blue (detection-level select / comment / drag-and-drop evidence upload). Field-level perm gating: `mission.write_red_fields` / `mission.write_blue_fields` are server-enforced *per field*. State machine `pending↔skipped/blocked` + `pending→executed→reviewed_by_blue` with side-aware perms. Evidence pipeline: streaming upload to `${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>`, SHA256 + MIME + extension + 25 MB cap. 15 s activity polling via `/missions/<id>/activity?since=…` drives the "modified by X" badge. 4 default `detection_levels` seeded at boot.
|
||||
- **Delivery**: docker-compose. TLS termination is expected to be handled by an external reverse proxy in production.
|
||||
|
||||
## Quickstart
|
||||
@@ -95,7 +96,7 @@ See `.env.example`. The most important ones:
|
||||
|
||||
## Testing
|
||||
|
||||
- **Manual + automated checklist for the current milestone**: see [`tasks/testing-m<N>.md`](tasks/testing-m6.md) (current: `testing-m6.md`).
|
||||
- **Manual + automated checklist for the current milestone**: see [`tasks/testing-m<N>.md`](tasks/testing-m7.md) (current: `testing-m7.md`).
|
||||
- **Backend unit tests**: `make test-api`
|
||||
- **End-to-end (Playwright)**: `make e2e-install` (once), then `make up && make e2e`. Reports land in `e2e/playwright-report/` (HTML + JUnit XML); open with `make e2e-report`.
|
||||
|
||||
@@ -137,7 +138,7 @@ The hooks run `ruff` + `ruff-format` on the backend and `eslint` / `tsc --noEmit
|
||||
|
||||
## Roadmap
|
||||
|
||||
See `tasks/todo.md`. Current milestone: **M6 — Missions & snapshot** (done). Next: M7 (red/blue execution on a mission test).
|
||||
See `tasks/todo.md`. Current milestone: **M7 — Red & blue execution on a mission test** (done). Next: M8 (custom detection-level CRUD).
|
||||
|
||||
## License
|
||||
|
||||
|
||||
Reference in New Issue
Block a user