DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:
Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
- `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
- `PUT /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
- `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
and pending→executed→reviewed_by_blue (+ undo paths), side-aware perm gate
that fires *before* idempotency, `executed_at` auto-stamped on the way in
- `GET /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
- Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
- Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
- Atomic `os.replace`, hex-validated SHA path component, root-dir guard
- Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
re-seeds detection levels as a safety net.
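
The evidence storage bullets above can be pictured with a minimal sketch. It is not the project's actual service code: `store_evidence`, `EvidenceRejected`, `ALLOWED_EXTS`, `CHUNK_SIZE` and the reading of "25 MB" as 25 * 1024 * 1024 bytes are all assumptions made for illustration, and the MIME check is left out. It shows chunk-by-chunk SHA256, the size cap, a root-dir guard, and an atomic `os.replace` into a content-addressed path.

```python
import hashlib
import os
import re
import tempfile
from pathlib import Path
from typing import BinaryIO

MAX_BYTES = 25 * 1024 * 1024      # "25 MB cap" (exact unit is an assumption)
CHUNK_SIZE = 64 * 1024            # assumed chunk size
ALLOWED_EXTS = {".png", ".txt"}   # hypothetical whitelist, not the real one
_HEX64 = re.compile(r"[0-9a-f]{64}")


class EvidenceRejected(ValueError):
    """Upload violated the whitelist, the size cap, or the path guard."""


def store_evidence(stream: BinaryIO, root: Path, mission: str, test: str, ext: str) -> Path:
    ext = ext.lower()
    if ext not in ALLOWED_EXTS:
        raise EvidenceRejected(f"extension not allowed: {ext}")

    root = root.resolve()
    target_dir = (root / mission / test).resolve()
    # Root-dir guard: refuse any mission/test component that escapes the root.
    if not target_dir.is_relative_to(root):
        raise EvidenceRejected("path escapes evidence root")
    target_dir.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256()
    size = 0
    # Write to a temp file in the same directory so os.replace stays atomic.
    fd, tmp_name = tempfile.mkstemp(dir=target_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            while chunk := stream.read(CHUNK_SIZE):
                size += len(chunk)
                if size > MAX_BYTES:
                    raise EvidenceRejected("upload exceeds size cap")
                digest.update(chunk)
                tmp.write(chunk)
        sha = digest.hexdigest()
        # Hex-validate the component before it becomes part of a path.
        if not _HEX64.fullmatch(sha):
            raise EvidenceRejected("unexpected digest format")
        final = target_dir / f"{sha}{ext}"
        os.replace(tmp_name, final)
        return final
    finally:
        # After a successful replace the temp name no longer exists.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
```

Because the file name is derived from the digest, uploading the same bytes twice lands on the same path, and a rejected upload leaves no partial file behind.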
Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
comment, mark-executed + override toggle) and cyan border (detection-level
select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
/activity every 15 s, gated on document.visibilityState. Per-field disable
based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.
Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
gating, state-machine matrix incl. idempotent-side enforcement, executed_at
override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
activity polling with URL-encoded `since`, membership 404 vs admin bypass,
cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
(red-only/blue-only API gating, mark-executed + reviewed_by_blue side
enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
+ transition, non-member 404 message). afterAll restores stable admin and
re-syncs MITRE.
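
The state-machine and side-enforcement cases covered above can be sketched as follows. This is only an illustration: the exact undo edges, the rule mapping a transition to a side, and the names `State`, `ALLOWED`, `required_perm`, `transition`, `Forbidden` and `InvalidTransition` are assumptions. Only the state names, the two permission strings, and the "perm gate before idempotency" ordering come from this commit message.

```python
from __future__ import annotations

from enum import Enum


class State(str, Enum):
    PENDING = "pending"
    SKIPPED = "skipped"
    BLOCKED = "blocked"
    EXECUTED = "executed"
    REVIEWED_BY_BLUE = "reviewed_by_blue"


RED = "mission.write_red_fields"
BLUE = "mission.write_blue_fields"

# Assumed matrix: pending<->skipped/blocked, pending->executed->reviewed_by_blue,
# plus one undo edge for each forward step.
ALLOWED = {
    (State.PENDING, State.SKIPPED), (State.SKIPPED, State.PENDING),
    (State.PENDING, State.BLOCKED), (State.BLOCKED, State.PENDING),
    (State.PENDING, State.EXECUTED), (State.EXECUTED, State.PENDING),
    (State.EXECUTED, State.REVIEWED_BY_BLUE), (State.REVIEWED_BY_BLUE, State.EXECUTED),
}


class Forbidden(Exception):
    """Caller lacks the permission for the side that owns this transition."""


class InvalidTransition(Exception):
    """The (current, target) pair is not in the matrix."""


def required_perm(current: State, target: State) -> str:
    # Assumption: anything touching reviewed_by_blue belongs to blue, the rest to red.
    return BLUE if State.REVIEWED_BY_BLUE in (current, target) else RED


def transition(current: State, target: State, perms: set[str]) -> State:
    # Gate first, so a no-op request from the wrong side is still rejected.
    if required_perm(current, target) not in perms:
        raise Forbidden(f"{current.value} -> {target.value}")
    # Idempotent: asking for the state we are already in changes nothing.
    if current == target:
        return current
    if (current, target) not in ALLOWED:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

With this ordering, a blue-only user who re-sends `executed` on an already executed test gets a permission error rather than a silent success, which is the behaviour the idempotent-side tests are described as checking.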
Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
query timestamps, perm-before-flush, atomic move, polling visibility gate).
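
As a hedged illustration of the "URL-encoded query timestamps" lesson: one common way an ISO timestamp breaks in a query string is the `+` of a UTC offset being decoded as a space. Whether that was the exact failure seen here is an assumption, and `activity_url` is a hypothetical helper, not part of the codebase.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlsplit


def activity_url(base: str, mission_id: str, since: datetime) -> str:
    """Build the polling URL with `since` percent-encoded (hypothetical helper)."""
    query = urlencode({"since": since.isoformat()})
    return f"{base}/missions/{mission_id}/activity?{query}"


since = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)

# Naive string interpolation leaves the "+" of the offset unescaped.
raw = f"/missions/1/activity?since={since.isoformat()}"
encoded = activity_url("", "1", since)

# A standard query parser turns the bare "+" into a space.
print(parse_qs(urlsplit(raw).query)["since"][0])      # 2024-01-02T03:04:05 00:00
print(parse_qs(urlsplit(encoded).query)["since"][0])  # 2024-01-02T03:04:05+00:00
```

The encoded form carries `%2B` on the wire, so the server receives the offset intact and can parse the timestamp.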
Test count: 133 pytest / 49 Playwright, all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Metamorph e2e
End-to-end tests powered by Playwright. Each milestone in `tasks/todo.md` should add at least one spec file (`tests/m<N>-*.spec.ts`).
One-time setup
cd e2e
npm install
npm run install-browsers # downloads chromium (uses sudo for system deps)
Running against a live stack
# 1. Bring the stack up from the repo root:
cd .. && make up
# 2. Run the tests:
cd e2e && npm test
# 3. Open the HTML report:
npm run report # opens playwright-report/index.html in your browser
Or from the repo root:
make e2e # runs against the already-up stack
make e2e-report # opens the HTML report
make e2e-up # one-shot: make up + wait healthy + run tests
Auto-spawn mode
Set `PW_AUTOSTART=1` to have Playwright run `make up` itself before the tests start:
PW_AUTOSTART=1 npm test
Configuration
| Env var | Default | Purpose |
|---|---|---|
| `BASE_URL` | `http://localhost:8080` | The front nginx URL (which proxies `/api/*`) |
| `PW_AUTOSTART` | `0` | If `1`, spawn `make up` before the tests |
| `CI` | unset | When set, retries=2 and parallel workers=2 |
Reports
- HTML: `e2e/playwright-report/index.html`
- JUnit: `e2e/playwright-report/junit.xml` (CI ingestion)
- Trace: kept on first retry, opened with `npx playwright show-trace …`
Layout
e2e/
├── tests/
│   ├── m0-smoke.spec.ts   # bootstrap milestone (current)
│   └── m<N>-*.spec.ts     # one spec per milestone, added as features land
├── playwright.config.ts
├── tsconfig.json
└── package.json