Knacky ed70458d8f feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll

DoD M7 (spec §F5 + §F6 + §F8 + tasks/todo.md M7) covered end-to-end:

Backend
- New migration `91a4e7c6d2f3` adds `mission_tests.last_actor_id` (FK users
  ON DELETE SET NULL) and `ix_mission_tests_updated_at` for the polling query.
- `detection_levels`: 4 default rows seeded at boot, `GET /detection-levels`
  read-only (CRUD lands in M8).
- `mission_tests` service + `missions` API extension:
  - `GET /missions/{id}/tests/{test_id}` — full detail incl. evidence list
  - `PUT  /missions/{id}/tests/{test_id}` — patch red/blue fields with per-field
    perm classification (`mission.write_red_fields` vs `mission.write_blue_fields`)
  - `POST /missions/{id}/tests/{test_id}/transition` — pending↔skipped/blocked
    and pending→executed→reviewed_by_blue (+ undo paths); the side-aware perm
    gate fires *before* the idempotency check, and `executed_at` is auto-stamped
    on entry to `executed`
  - `GET  /missions/{id}/activity?since=<ISO>` — drives the 15 s polling badge
- `evidence` service + top-level `/evidence/<id>` API:
  - Streaming upload, SHA256 chunk-by-chunk, 25 MB cap, ext+MIME whitelist
  - Content-addressed storage at ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>
  - Atomic `os.replace`, hex-validated SHA path component, root-dir guard
  - Membership-aware (404 on miss/forbidden, no existence leak)
- `/diag/reset` now wipes ${EVIDENCE_DIR}/* in test mode (symlink-safe) and
  re-seeds detection levels as a safety net.
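
The transition rules listed above can be sketched as a lookup table plus a permission check that runs before the no-op short-circuit. This is a minimal illustration under stated assumptions, not the project's code: only the state names and the two permission codes come from this commit message. The `ALLOWED` edge list, the `SIDE_OF_TARGET` mapping, the choice of which side owns each edge, and the function name `transition` are all hypothetical.

```python
from datetime import datetime, timezone

RED = "mission.write_red_fields"
BLUE = "mission.write_blue_fields"

# Assumed edge list: (from_state, to_state) -> permission required.
ALLOWED = {
    ("pending", "skipped"): RED,
    ("skipped", "pending"): RED,
    ("pending", "blocked"): RED,
    ("blocked", "pending"): RED,
    ("pending", "executed"): RED,
    ("executed", "pending"): RED,            # undo path (assumed)
    ("executed", "reviewed_by_blue"): BLUE,
    ("reviewed_by_blue", "executed"): BLUE,  # undo path (assumed)
}

# Assumed: which side "owns" a target state, used for no-op requests.
SIDE_OF_TARGET = {
    "pending": RED, "skipped": RED, "blocked": RED, "executed": RED,
    "reviewed_by_blue": BLUE,
}


class Forbidden(Exception):
    """Caller lacks the permission for this side."""


class InvalidTransition(Exception):
    """The requested edge is not in the matrix."""


def transition(test: dict, target: str, perms: set[str], now=None) -> dict:
    current = test["state"]
    if target not in SIDE_OF_TARGET:
        raise InvalidTransition(f"unknown state {target!r}")
    if current == target:
        required = SIDE_OF_TARGET[target]
    else:
        required = ALLOWED.get((current, target))
        if required is None:
            raise InvalidTransition(f"{current} -> {target}")
    # The permission gate runs first, so a no-op request from the
    # wrong side is rejected instead of silently succeeding.
    if required not in perms:
        raise Forbidden(required)
    if current == target:
        return test  # idempotent: nothing to change
    updated = dict(test, state=target)
    if target == "executed":
        # Auto-stamp on entry; `now` stands in for the override value.
        updated["executed_at"] = now or datetime.now(timezone.utc)
    return updated
```

With this shape, a blue-only user repeating `executed` on an already executed test gets `Forbidden` rather than a quiet success, which is the ordering the commit message describes.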

Frontend
- `lib/missions.ts` — M7 types + queryKey factory + state-machine matrix.
- `pages/MissionTestPage.tsx` — two-zone layout: red border (command, output,
  comment, mark-executed + override toggle) and cyan border (detection-level
  select, comment, drag-and-drop evidence dropzone). Last-touched badge polls
  /activity every 15 s, gated on document.visibilityState. Per-field disable
  based on the user's red/blue perms (server stays the arbiter).
- `pages/MissionDetailPage.tsx` — test rows link to the new per-test page.
- `App.tsx` — registers /missions/:id/tests/:testId behind RequireAuth.
- `HomePage.tsx` — hero + roadmap card bumped to M7; next is M8.

Tests
- `backend/tests/test_mission_tests.py` — 27 pytest tests (red/blue field
  gating, state-machine matrix incl. idempotent-side enforcement, executed_at
  override, 24/26 MB upload + SHA256, MIME/ext whitelist, soft-delete hide,
  activity polling with URL-encoded `since`, membership 404 vs admin bypass,
  cross-mission evidence access).
- `e2e/tests/m7-execution.spec.ts` — 5 Playwright tests against the live stack
  (red-only/blue-only API gating, mark-executed + reviewed_by_blue side
  enforcement, 24 MB/26 MB upload + SHA256 round-trip, SPA per-test page save
  + transition, non-member 404 message). afterAll restores stable admin and
  re-syncs MITRE.

Docs
- CHANGELOG.md: M7 section + post-M7 review-pass subsection.
- README.md: status, feature blurb, roadmap, testing-m7 link.
- tasks/testing-m7.md: manual + automated procedure with transition matrix
  and perm-gating table.
- tasks/lessons.md: M7 retrospectives (LogRecord `created` trap, URL-encoded
  query timestamps, perm-before-flush, atomic move, polling visibility gate).
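
The "URL-encoded query timestamps" lesson can be reproduced with the standard library alone. An ISO timestamp with a positive UTC offset contains a `+`, which query-string decoding turns into a space unless it is percent-encoded. The sketch below is illustrative: the host, the `/api/v1/missions/1/activity` path, and the helper `received_since` are assumptions chosen to resemble the endpoint named above.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlencode, urlsplit

since = datetime(2026, 5, 14, 8, 16, 48, tzinfo=timezone(timedelta(hours=2)))
iso = since.isoformat()  # '2026-05-14T08:16:48+02:00'

base = "http://localhost:8000/api/v1/missions/1/activity"  # placeholder URL
raw_url = f"{base}?since={iso}"                   # '+' left unencoded
safe_url = f"{base}?" + urlencode({"since": iso})  # '+' becomes %2B


def received_since(url: str) -> str:
    """Decode the query string the way a typical server-side parser does."""
    return parse_qs(urlsplit(url).query)["since"][0]


broken = received_since(raw_url)   # the offset sign is lost
intact = received_since(safe_url)  # round-trips unchanged
```

Here `broken` ends in ` 02:00` (a space where the `+` was), so it no longer parses as the same instant, while `intact` equals the original string.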

Test count: 133 pytest / 49 Playwright, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-14 08:16:48 +02:00

| Path | Last commit | Date |
| --- | --- | --- |
| backend | feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll | 2026-05-14 08:16:48 +02:00 |
| e2e | feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll | 2026-05-14 08:16:48 +02:00 |
| frontend | feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll | 2026-05-14 08:16:48 +02:00 |
| tasks | feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll | 2026-05-14 08:16:48 +02:00 |
| .env.example | feat(m0): bootstrap repo, design system, compose stack | 2026-05-11 06:16:00 +02:00 |
| .gitignore | feat(m0): bootstrap repo, design system, compose stack | 2026-05-11 06:16:00 +02:00 |
| .pre-commit-config.yaml | feat(m0): bootstrap repo, design system, compose stack | 2026-05-11 06:16:00 +02:00 |
| CHANGELOG.md | feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll | 2026-05-14 08:16:48 +02:00 |
| docker-compose.yml | feat(m4): STIX parser + seed service + CLI | 2026-05-12 13:53:53 +02:00 |
| Makefile | feat(m0): bootstrap repo, design system, compose stack | 2026-05-11 06:16:00 +02:00 |
| README.md | feat(m7): per-test execution — red/blue zones, evidence pipeline, activity poll | 2026-05-14 08:16:48 +02:00 |
README.md

Metamorph

Collaborative purple-team platform. Red team logs the tests they execute (procedure, command, timestamp); blue team annotates each test with detection evidence (alerts, logs, files). At the end of an engagement, Metamorph generates a standalone reveal.js slide deck classified by MITRE ATT&CK tactic.

Status: M0–M7 delivered (bootstrap → DB schema → auth → RBAC → MITRE ATT&CK reference → test & scenario templates → missions snapshot → red/blue execution on a mission test). See tasks/spec.md for the full specification and tasks/todo.md for the milestone-by-milestone plan.

Stack

Backend: Python 3.12, Flask 3, SQLAlchemy 2 + Alembic (M1+), PostgreSQL 16.
Frontend: React 18 + TypeScript + Vite + TailwindCSS (RTOps design tokens, see tasks/design.md).
Auth (M2+): JWT access (1h) + refresh (30d), Argon2id, invite-link enrollment.
RBAC (M3+): atomic permissions (31 codes) bundled into custom groups; 3 system groups seeded (admin / redteam / blueteam).
MITRE ATT&CK (M4+): Enterprise reference catalogue pinned to v19.0, seedable via make seed-mitre.
Template catalogue (M5+): reusable test_templates (markdown procedure, OPSEC level, free tags, expected IOCs, MITRE tags) + ordered scenario_templates with drag-and-drop reordering. Admin pages at /admin/tests and /admin/scenarios.
Missions (M6+): missions snapshot one or more scenario templates at creation time; template edits don't drift live missions (mission_* tables freeze every field, including MITRE tags). Non-admin members see only their own missions (membership filter, 404 on existence-leak attempts). Status state machine draft → in_progress → completed → archived, archive perm gated separately. SPA: list/filter at /missions, 3-step create wizard at /missions/new, detail page with Tests / Members / Synthesis / Export tabs.
Execution (M7+): per-test page /missions/<id>/tests/<test_id> with two zones — red (command/output/comment + mark-executed with override) and blue (detection-level select / comment / drag-and-drop evidence upload). Field-level perm gating: mission.write_red_fields / mission.write_blue_fields are server-enforced per field. State machine pending↔skipped/blocked + pending→executed→reviewed_by_blue with side-aware perms. Evidence pipeline: streaming upload to ${EVIDENCE_DIR}/<mission>/<test>/<sha256><ext>, SHA256 + MIME + extension + 25 MB cap. 15 s activity polling via /missions/<id>/activity?since=… drives the "modified by X" badge. 4 default detection_levels seeded at boot.
Delivery: docker-compose. TLS termination is expected to be handled by an external reverse proxy in production.
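
The evidence pipeline described in the Execution entry above (streaming upload, SHA256, size cap, content-addressed path) can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the function name `store_evidence`, the chunk size, the extension whitelist, and reading "25 MB" as 25 × 1024 × 1024 bytes are all hypothetical, and the MIME check is omitted.

```python
import hashlib
import os
import re
import tempfile
from pathlib import Path
from typing import BinaryIO

MAX_BYTES = 25 * 1024 * 1024              # assumed reading of the 25 MB cap
CHUNK = 64 * 1024                         # assumed chunk size
ALLOWED_EXT = {".png", ".txt", ".log"}    # hypothetical whitelist
HEX64 = re.compile(r"[0-9a-f]{64}")


class TooLarge(Exception):
    """Upload exceeded the size cap."""


def store_evidence(stream: BinaryIO, root: Path, mission: str,
                   test: str, ext: str) -> Path:
    if ext not in ALLOWED_EXT:
        raise ValueError(f"extension {ext!r} not allowed")
    root = root.resolve()
    target_dir = (root / mission / test).resolve()
    # Root-dir guard: refuse any path that escapes the evidence root.
    if not target_dir.is_relative_to(root):
        raise ValueError("path escapes evidence root")
    target_dir.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256()
    size = 0
    fd, tmp_name = tempfile.mkstemp(dir=target_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            # Hash and write chunk by chunk; the file is never fully in memory.
            while chunk := stream.read(CHUNK):
                size += len(chunk)
                if size > MAX_BYTES:
                    raise TooLarge(f"more than {MAX_BYTES} bytes")
                digest.update(chunk)
                tmp.write(chunk)
        sha = digest.hexdigest()
        if not HEX64.fullmatch(sha):      # hex-validate the path component
            raise ValueError("unexpected digest format")
        final = target_dir / f"{sha}{ext}"
        os.replace(tmp_name, final)       # atomic within one filesystem
        return final
    except BaseException:
        # Remove the partial file on any failure before re-raising.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Writing to a temporary file in the destination directory keeps `os.replace` on a single filesystem, so readers see either no file or the complete one. Identical uploads map to the same path, which is the point of content addressing.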

Quickstart

Works with Docker or Podman. The Makefile auto-detects the available engine and picks the matching compose driver (docker compose, podman compose, or podman-compose).

Requires one of:

Docker Engine 24+ with the Compose v2 plugin, or
Podman 4.0+ with podman compose (or the legacy podman-compose ≥ 1.0.6)

git clone <this repo>
cd Metamorph
make engine               # confirm which engine the Makefile picked up
make env                  # creates .env from .env.example
$EDITOR .env              # set strong values for POSTGRES_PASSWORD and JWT_SECRET
make up                   # builds and starts api + db + front
make logs                 # tail logs

Override the auto-detection if you have both engines installed:

make up ENGINE=podman                       # force podman + auto-pick its compose driver
make up ENGINE=docker COMPOSE="docker compose"
COMPOSE=podman-compose make up              # force the legacy wrapper specifically

Then:

Front: http://localhost:8080
API health: http://localhost:8080/api/v1/health (proxied) or http://localhost:8000/api/v1/health

First-time setup

make migrate                  # apply DB schema
make print-install-token      # prints the bootstrap admin token (logs banner)
# visit http://localhost:8080/setup, paste the token, create the admin account
make seed-mitre               # populate the MITRE ATT&CK reference (~50 MB, ~1 s)

The MITRE bundle is cached in the named volume metamorph_mitre (/data/mitre/<file>.json inside the api container). For air-gapped operators, pre-populate the volume with your own STIX 2.1 file and run podman compose exec api flask --app app.cli metamorph seed-mitre --source /data/mitre/your-file.json --skip-checksum.

To stop:

make down            # keep volumes
make clean           # also drop volumes (DESTRUCTIVE)

Local dev (no Docker)

Requires:

uv for Python deps
Node.js 20+ and npm
A reachable Postgres (or make up db to run only the db container)

make dev-api     # in one terminal
make dev-front   # in another

Environment variables

See .env.example. The most important ones:

| Variable | Purpose |
| --- | --- |
| `APP_ENV` | `dev` allows placeholder secrets; anything else (prod/staging) refuses to boot with defaults |
| `POSTGRES_*` | DB credentials (used by `db` and `api`) |
| `JWT_SECRET` | HS256 signing key; generate 64+ random bytes (`python -c "import secrets; print(secrets.token_urlsafe(64))"`) |
| `LOG_LEVEL` | `DEBUG` / `INFO` / `WARNING` / `ERROR` |
| `FRONT_ORIGIN` | Allowed CORS origin for the SPA |
| `EVIDENCE_DIR` | Path inside the api container where uploads land |
| `HOST_API_PORT` | Host port mapped to the api (default 8000) |
| `HOST_FRONT_PORT` | Host port mapped to the front nginx (default 8080) |
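
The `APP_ENV` rule above (placeholders tolerated in `dev`, refused elsewhere) might look roughly like the check below. This is a hedged sketch only: the placeholder values, the `ConfigError` class, the function name `validate_secrets`, and treating an unset `APP_ENV` as non-dev are assumptions, not taken from the project's config module.

```python
from collections.abc import Mapping

# Hypothetical placeholder values; the real ones live in .env.example.
PLACEHOLDER_SECRETS = {"", "change-me", "changeme"}
GUARDED = ("POSTGRES_PASSWORD", "JWT_SECRET")


class ConfigError(RuntimeError):
    """Raised at boot when a secret still has a placeholder value."""


def validate_secrets(env: Mapping[str, str]) -> None:
    # Only an explicit "dev" relaxes the check (assumed behaviour).
    if env.get("APP_ENV") == "dev":
        return
    bad = [name for name in GUARDED
           if env.get(name, "") in PLACEHOLDER_SECRETS]
    if bad:
        raise ConfigError("placeholder secrets outside dev: " + ", ".join(bad))
```

Calling `validate_secrets(os.environ)` once at startup would make a misconfigured production deployment fail fast instead of running with a guessable signing key.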

Testing

Manual + automated checklist for the current milestone: see tasks/testing-m<N>.md (current: testing-m7.md).
Backend unit tests: make test-api
End-to-end (Playwright): make e2e-install (once), then make up && make e2e. Reports land in e2e/playwright-report/ (HTML + JUnit XML); open with make e2e-report.

Pre-commit hooks

After cloning, install hooks once:

pipx install pre-commit   # or: pip install --user pre-commit
pre-commit install
pre-commit run --all-files   # initial sweep

The hooks run ruff + ruff-format on the backend and eslint / tsc --noEmit / prettier --check on the frontend (see .pre-commit-config.yaml).

Project layout

.
├── backend/             # Flask API
│   └── app/
│       ├── api/         # HTTP layer (blueprints)
│       ├── core/        # config, logging, errors
│       ├── db/          # SQLAlchemy session, migrations (M1+)
│       ├── models/      # ORM models (M1+)
│       ├── services/    # domain logic (M2+)
│       └── i18n/        # message catalogs (M13)
├── frontend/            # Vite + React + TS + Tailwind
│   └── src/components/ui/   # RTOps design system primitives
├── tasks/
│   ├── spec.md          # source of truth for requirements
│   ├── design.md        # RTOps design system
│   ├── todo.md          # milestone plan
│   └── lessons.md       # session retrospectives
├── docker-compose.yml
├── Makefile
└── CHANGELOG.md

Roadmap

See tasks/todo.md. Current milestone: M7 — Red & blue execution on a mission test (done). Next: M8 (custom detection-level CRUD).

License

TBD.

Languages

Python 59.6%

TypeScript 38.1%

Makefile 1.2%

Dockerfile 0.6%

CSS 0.3%

Other 0.1%
