Introduce the SPEC section for the Mythic C2 integration layer. Covers RBAC (RT-only, SOC=403), per-engagement Fernet-encrypted config, c2_config + c2_task data model with ON DELETE CASCADE, full endpoint list, output mapping rules (append-only, idempotent), 2500 ms polling and the fake/real adapter selection via MIMIC_C2_ADAPTER. Also patch tasks/todo.md: fix pytest baseline (256 from main, not 253), make cascade-delete explicit, pin the MythicMeta/Mythic_Scripting source version and document defensive base64 handling. Closes spec-reviewer WARN-1 (SPEC ↔ plan parity), WARN-2 (cascade), INFO-1 (pinned source), INFO-3 (baseline).
Sprint 8 — C2 Layer: Mythic Integration
Branch: `sprint/8-c2` · Worktree: `.claude/worktrees/sprint-8-c2` · Base: `main@6ca614a`
SPEC.md phase 2: "once V1 is finished, we will add a layer that connects to a C2 (Mythic or in-house) so that simulations can be run through the C2."
§0 — Binding decisions (locked with user, 2026-06-10)
1. C2 target: Mythic 3.x, behind a thin `C2Adapter` interface (keeps the door open for a custom C2 later).
2. Scope: FULL bidirectional: execute + near-real-time tracking + history import. User explicitly overrode the "one-shot first" recommendation; overrun risk accepted (see R1).
3. C2 config: per-engagement (URL + API token). Token is write-only at the API level, never returned in clear.
4. UI anchor: execution lives in the simulation screen (Red Team card).
5. Realtime mechanism: short polling. Frontend: TanStack Query `refetchInterval` 2500 ms while any attached task is incomplete. Backend: poll-on-read, which refreshes non-completed tasks from Mythic when the task list is read. No scheduler, no new infra.
6. Secret storage: API token Fernet-encrypted in SQLite. Key from env var `MIMIC_ENCRYPTION_KEY` (mandatory to enable C2 features, never hardcoded: OPSEC rule).
7. History import: BOTH: auto-attach of tasks launched from Mimic AND manual browse/select of the callback's task history.
8. Validation: fully mocked, since no dev Mythic instance is available. pytest uses a mocked adapter; e2e uses a built-in `FakeAdapter` selected via `MIMIC_C2_ADAPTER=fake`.
9. RBAC: C2 is a RedTeam resource: admin + redteam get full access, SOC gets 403 on every C2 endpoint (same precedent as Templates and Export).
10. Workflow tie-in: launching a C2 execution auto-transitions the simulation `pending → in_progress` (same rule as a manual RT edit).
11. Output mapping: when a task completes (or is imported), its output is APPENDED to `execution_result`, prefixed by a `$ <command>` header line; `executed_at` is set from the first task's timestamp if empty; the command is appended to `commands` if not already present.
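
The output mapping rule (decision 11) can be sketched as below. This is a minimal illustration, not the project's implementation: `SimulationRT`, `mapped_task_ids` and `apply_output_mapping` are hypothetical names, and tracking already-mapped task ids is only one assumed way to get the "applied once" guarantee.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class SimulationRT:
    """Hypothetical stand-in for the Red Team fields of a simulation."""
    execution_result: str = ""
    executed_at: Optional[datetime] = None
    commands: list = field(default_factory=list)
    # Assumption: ids of tasks already mapped, used as the idempotence marker.
    mapped_task_ids: set = field(default_factory=set)


def apply_output_mapping(sim, task_id, command, output, timestamp):
    """Append one completed task to the simulation; a repeat call is a no-op."""
    if task_id in sim.mapped_task_ids:
        return False
    block = f"$ {command}\n{output}"
    if sim.execution_result:
        # Append-only: existing content is kept and never rewritten.
        sim.execution_result = f"{sim.execution_result.rstrip()}\n\n{block}"
    else:
        sim.execution_result = block
    if sim.executed_at is None:
        sim.executed_at = timestamp  # only the first mapped task sets it
    if command not in sim.commands:
        sim.commands.append(command)
    sim.mapped_task_ids.add(task_id)
    return True


sim = SimulationRT()
ts = datetime(2026, 6, 10, 12, 0, 0)
apply_output_mapping(sim, 101, "whoami", "lab\\operator", ts)
apply_output_mapping(sim, 101, "whoami", "lab\\operator", ts)  # ignored
```

After both calls the simulation holds a single `$ whoami` block, which is the property the mapping-idempotence tests would need to assert.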
§1 — Mythic adapter contract (pinned from MythicMeta/Mythic_Scripting client source)
- Transport: POST `https://<host>:7443/graphql`, header `apitoken: <token>` (Hasura behind nginx).
- Issue task: mutation `createTask(callback_id: <display_id>, command, params, tasking_location: "command_line")`.
- List callbacks: query on `callback` table: `id, display_id, active, host, user, domain, last_checkin`.
- Task status: query on `task` table: `display_id, status, completed` (client lib uses a subscription; we poll the query instead, per decision 5).
- Task output: query on `response` table ordered by id; `response_text` is base64-encoded, decode + concatenate.
- History: query `task` filtered by `callback_display_id`, paginated, with command + status + timestamps.
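
As a rough illustration of the transport contract above, the sketch below assembles (but does not send) a task-creation request. The endpoint, port, header name and mutation arguments come from the list above. The GraphQL variable types and the selected return fields (`status`, `display_id`, `error`) are assumptions that would need checking against the pinned client source. `build_graphql_request` and the host name are hypothetical.

```python
import json

MYTHIC_PORT = 7443  # from the transport line above

# Assumption: variable types and returned fields are illustrative only.
CREATE_TASK_MUTATION = """
mutation createTask($callback_id: Int!, $command: String!, $params: String!) {
  createTask(callback_id: $callback_id, command: $command, params: $params,
             tasking_location: "command_line") {
    status
    display_id
    error
  }
}
"""


def build_graphql_request(host, token, query, variables):
    """Return the URL, headers and JSON body for one GraphQL POST."""
    return {
        "url": f"https://{host}:{MYTHIC_PORT}/graphql",
        "headers": {"apitoken": token, "Content-Type": "application/json"},
        "body": json.dumps({"query": query, "variables": variables}),
    }


request = build_graphql_request(
    host="mythic.lab.example",      # placeholder host, not a real instance
    token="token-from-c2-config",   # placeholder, decrypted at call time
    query=CREATE_TASK_MUTATION,
    variables={"callback_id": 3, "command": "whoami", "params": ""},
)
```

Keeping request construction separate from the HTTP call is one way to unit-test the payload shape with zero live HTTP.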
C2Adapter interface (backend/app/services/c2/adapter.py):
- `test_connection() -> C2Health`
- `list_callbacks() -> list[C2Callback]`
- `create_task(callback_display_id, command, params) -> int` (task display_id)
- `get_task(task_display_id) -> C2TaskStatus`
- `get_task_output(task_display_id) -> str` (decoded)
- `list_callback_tasks(callback_display_id, page, page_size) -> C2TaskPage`
Implementations: MythicAdapter (requests, verify_tls flag from config — lab instances run self-signed), FakeAdapter (deterministic in-memory data, selected by MIMIC_C2_ADAPTER=fake, also powers e2e).
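
A trimmed sketch of how the interface and the fake implementation could fit together. It covers only four of the six methods, and the dataclass fields, the fake data and `select_adapter` are assumptions made for illustration. Only the method names and the `MIMIC_C2_ADAPTER=fake` switch come from the plan.

```python
import os
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class C2Callback:
    display_id: int
    host: str
    user: str
    active: bool


@dataclass(frozen=True)
class C2TaskStatus:
    display_id: int
    status: str
    completed: bool


class C2Adapter(Protocol):
    def list_callbacks(self) -> list: ...
    def create_task(self, callback_display_id: int, command: str, params: str) -> int: ...
    def get_task(self, task_display_id: int) -> C2TaskStatus: ...
    def get_task_output(self, task_display_id: int) -> str: ...


class FakeAdapter:
    """Deterministic in-memory adapter; every task completes immediately."""

    def __init__(self):
        self._tasks = {}
        self._next_id = 1

    def list_callbacks(self):
        return [C2Callback(display_id=1, host="WS-LAB-01", user="operator", active=True)]

    def create_task(self, callback_display_id, command, params=""):
        task_id = self._next_id
        self._next_id += 1
        self._tasks[task_id] = command
        return task_id

    def get_task(self, task_display_id):
        return C2TaskStatus(task_display_id, "completed", True)

    def get_task_output(self, task_display_id):
        return f"fake output for: {self._tasks[task_display_id]}"


def select_adapter(env=os.environ):
    """Pick the adapter from MIMIC_C2_ADAPTER; the real adapter is omitted here."""
    if env.get("MIMIC_C2_ADAPTER") == "fake":
        return FakeAdapter()
    raise NotImplementedError("MythicAdapter is out of scope for this sketch")


adapter = select_adapter({"MIMIC_C2_ADAPTER": "fake"})
task_id = adapter.create_task(1, "whoami", "")
```

Passing the environment as a parameter keeps the selection logic testable without mutating `os.environ`.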
§2 — Backend (backend-builder) — milestones M1→M4
Data model (1 Alembic migration):
- `c2_config`: id, engagement_id FK unique (ON DELETE CASCADE, same precedent as `simulations.engagement_id`), url, api_token_encrypted, verify_tls bool (default true), created_at, updated_at.
- `c2_task`: id, simulation_id FK (ON DELETE CASCADE), mythic_task_display_id int, callback_display_id int, command text, params text nullable, status text, completed bool, output text nullable, source enum(mimic,import), created_at, completed_at nullable.
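
The cascade behaviour can be checked with the standard-library `sqlite3` module. The sketch below is an assumption-laden approximation of the schema: the `engagements` table name, the column types and the omitted timestamp columns are illustrative. The real schema would be produced by the Alembic migration. One point worth noting is that SQLite only enforces `ON DELETE CASCADE` on connections where foreign keys have been switched on.

```python
import sqlite3

SCHEMA = """
CREATE TABLE engagements (id INTEGER PRIMARY KEY);
CREATE TABLE simulations (
    id INTEGER PRIMARY KEY,
    engagement_id INTEGER NOT NULL REFERENCES engagements(id) ON DELETE CASCADE
);
CREATE TABLE c2_config (
    id INTEGER PRIMARY KEY,
    engagement_id INTEGER NOT NULL UNIQUE
        REFERENCES engagements(id) ON DELETE CASCADE,
    url TEXT NOT NULL,
    api_token_encrypted TEXT NOT NULL,
    verify_tls INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE c2_task (
    id INTEGER PRIMARY KEY,
    simulation_id INTEGER NOT NULL REFERENCES simulations(id) ON DELETE CASCADE,
    mythic_task_display_id INTEGER NOT NULL,
    callback_display_id INTEGER NOT NULL,
    command TEXT NOT NULL,
    status TEXT NOT NULL,
    completed INTEGER NOT NULL DEFAULT 0,
    source TEXT NOT NULL CHECK (source IN ('mimic', 'import'))
);
"""

conn = sqlite3.connect(":memory:")
# Without this pragma SQLite accepts the FK clauses but does not enforce them.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO engagements (id) VALUES (1)")
conn.execute("INSERT INTO simulations (id, engagement_id) VALUES (10, 1)")
conn.execute(
    "INSERT INTO c2_config (engagement_id, url, api_token_encrypted) "
    "VALUES (1, 'https://mythic.lab.example:7443', 'ciphertext-placeholder')"
)
conn.execute(
    "INSERT INTO c2_task (simulation_id, mythic_task_display_id, "
    "callback_display_id, command, status, source) "
    "VALUES (10, 501, 3, 'whoami', 'submitted', 'mimic')"
)

# Deleting the engagement should remove its config, simulations and tasks.
conn.execute("DELETE FROM engagements WHERE id = 1")
remaining = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("c2_config", "simulations", "c2_task")
}
```

A migration test along these lines would catch the case where the cascade is declared but foreign-key enforcement is off on the application connection.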
Endpoints (all admin+redteam; SOC → 403):
- `GET/PUT/DELETE /api/engagements/<id>/c2-config`: GET returns `has_token: bool`, never the token.
- `POST /api/engagements/<id>/c2-config/test`: connectivity check via adapter.
- `GET /api/engagements/<id>/c2/callbacks`: active callbacks.
- `POST /api/simulations/<id>/c2/execute` `{callback_display_id, commands: [str]}`: one Mythic task per command, rows in `c2_task` (source=mimic), auto-transition pending→in_progress.
- `GET /api/simulations/<id>/c2/tasks`: poll-on-read: refresh incomplete tasks from Mythic; on completion fetch output and apply decision 11 mapping (idempotent, applied once).
- `GET /api/engagements/<id>/c2/callbacks/<cid>/history?page=`: paginated callback history for import.
- `POST /api/simulations/<id>/c2/import` `{task_display_ids: [int]}`: import selected tasks (source=import) + decision 11 mapping.
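
The poll-on-read behaviour behind the tasks endpoint might look like the following sketch. `TaskRow`, `RemoteStatus`, `StubAdapter` and `refresh_on_read` are hypothetical names. Persistence and the decision 11 mapping are left out, so only the "touch incomplete tasks, fetch output once" logic is shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRow:
    """Hypothetical in-memory view of a c2_task row."""
    mythic_task_display_id: int
    command: str
    status: str = "submitted"
    completed: bool = False
    output: Optional[str] = None


@dataclass(frozen=True)
class RemoteStatus:
    status: str
    completed: bool


class StubAdapter:
    """Hypothetical stub: reports every task as completed and counts calls."""

    def __init__(self):
        self.status_calls = 0

    def get_task(self, task_display_id):
        self.status_calls += 1
        return RemoteStatus(status="completed", completed=True)

    def get_task_output(self, task_display_id):
        return f"output-{task_display_id}"


def refresh_on_read(rows, adapter):
    """Refresh incomplete rows in place; return the rows that just completed."""
    newly_completed = []
    for row in rows:
        if row.completed:
            continue  # completed rows are never polled again
        remote = adapter.get_task(row.mythic_task_display_id)
        row.status = remote.status
        if remote.completed:
            row.output = adapter.get_task_output(row.mythic_task_display_id)
            row.completed = True
            newly_completed.append(row)
    return newly_completed


rows = [
    TaskRow(1, "whoami"),
    TaskRow(2, "hostname", status="completed", completed=True, output="WS-LAB-01"),
]
stub = StubAdapter()
first_read = refresh_on_read(rows, stub)   # task 1 completes here
second_read = refresh_on_read(rows, stub)  # nothing left to refresh
```

Returning only the newly completed rows gives the caller a natural place to apply the output mapping exactly once per task.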
Milestones: M1 crypto service (Fernet) + migration + config CRUD + test endpoint → M2 callbacks + execute → M3 poll-on-read status/output + mapping → M4 history + import. Tests: pytest with mocked adapter (~35-45 new), zero live HTTP. Crypto round-trip, RBAC 403 matrix, mapping idempotence, migration up/down.
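
One possible way to enumerate the RBAC 403 matrix is a table-driven list of role and route pairs, sketched below. The role identifiers (`admin`, `redteam`, `soc`) and the helper names are assumptions, while the routes are the ones listed in the endpoint section. In a pytest suite such a list would typically feed a parametrized test against the real app.

```python
from itertools import product

C2_ALLOWED_ROLES = frozenset({"admin", "redteam"})
ALL_ROLES = ("admin", "redteam", "soc")  # assumption: role identifiers

C2_ROUTES = (
    ("GET", "/api/engagements/<id>/c2-config"),
    ("PUT", "/api/engagements/<id>/c2-config"),
    ("DELETE", "/api/engagements/<id>/c2-config"),
    ("POST", "/api/engagements/<id>/c2-config/test"),
    ("GET", "/api/engagements/<id>/c2/callbacks"),
    ("POST", "/api/simulations/<id>/c2/execute"),
    ("GET", "/api/simulations/<id>/c2/tasks"),
    ("GET", "/api/engagements/<id>/c2/callbacks/<cid>/history"),
    ("POST", "/api/simulations/<id>/c2/import"),
)


def is_c2_forbidden(role):
    """True when the role must receive 403 on every C2 endpoint."""
    return role not in C2_ALLOWED_ROLES


def rbac_matrix():
    """Every (role, method, path) case with its expected forbidden flag."""
    return [
        (role, method, path, is_c2_forbidden(role))
        for role, (method, path) in product(ALL_ROLES, C2_ROUTES)
    ]


cases = rbac_matrix()
forbidden_cases = [case for case in cases if case[3]]
```

With three roles and nine routes the matrix has 27 cases, nine of which (all SOC) expect a 403.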
§3 — Frontend (frontend-builder)
- Engagement detail/form: "C2 configuration" card: URL, token (password input, write-only placeholder when `has_token`), verify-TLS checkbox, [Test connection] with result feedback. Admin+redteam only.
- SimulationFormPage RT card: [Execute via C2] button (hidden when no config; disabled when done) → modal: callback picker table (display_id, host, user, active, last_checkin; font-mono data cells) → commands textarea prefilled from `rt.commands` → Launch.
- C2 tasks panel (under RT card): table of attached tasks: command (mono), status badge (semantic tokens), completed_at (mono). `refetchInterval` 2500 ms while any incomplete; stops when all done.
- Import history modal: callback picker → paginated task list with checkboxes → [Import selected].
- Terminal-SOC compliance: rounded-none, zero transitions, hairline borders, mono for data only. Status colors via `success`/`warn` semantic tokens.
- Vitest ~15-20 new tests. `data-testid` on every new interactive surface for e2e.
§4 — Docs & hygiene
- SPEC.md C2 section = FIRST commit of the sprint (sprint 5/6 lesson: never close a sprint with SPEC uncommitted).
- README: `MIMIC_ENCRYPTION_KEY`, `MIMIC_C2_ADAPTER` env vars + docker compose example.
- CHANGELOG sprint 8 entry at close.
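
Since `MIMIC_ENCRYPTION_KEY` is mandatory for C2 features, a startup check along the following lines may be worth documenting next to the env vars. It is a sketch under assumptions: `C2ConfigError` and `load_encryption_key` are hypothetical names, and the only fact relied on is that a Fernet key is 32 bytes encoded as URL-safe base64. The check uses the standard library only and does not perform any encryption.

```python
import base64
import binascii
import os


class C2ConfigError(RuntimeError):
    """Raised when C2 features are requested without a usable environment."""


def load_encryption_key(env=os.environ):
    """Return the Fernet key bytes from the environment, or fail loudly."""
    raw = env.get("MIMIC_ENCRYPTION_KEY", "")
    if not raw:
        raise C2ConfigError("MIMIC_ENCRYPTION_KEY is not set; C2 features stay disabled")
    try:
        decoded = base64.urlsafe_b64decode(raw)
    except (binascii.Error, ValueError) as exc:
        raise C2ConfigError("MIMIC_ENCRYPTION_KEY is not valid URL-safe base64") from exc
    if len(decoded) != 32:
        raise C2ConfigError("MIMIC_ENCRYPTION_KEY must decode to exactly 32 bytes")
    return raw.encode("ascii")


# Usage: a throwaway key generated at runtime, never a hardcoded value.
throwaway = base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")
key_bytes = load_encryption_key({"MIMIC_ENCRYPTION_KEY": throwaway})
```

Failing at load time keeps a malformed key from surfacing later as an opaque decryption error on the first config read.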
§5 — Pipeline & sequencing
1. spec-reviewer pre-pass on this plan (before any code).
2. backend-builder M1+M2 → API contract frozen → frontend-builder starts (parallel with backend M3+M4).
3. design-reviewer on new UI surfaces (screenshots come from the Playwright run; e2e bootstraps its own users, which sidesteps the sprint-7 credentials wall).
4. code-reviewer must be respawned (killed 2026-06-10 after idle-loop malfunction) before the review phase.
5. test-verifier: new C2 Playwright specs (fake adapter) + full 223-spec re-run, which also clears the sprint 7 e2e debt (suite never re-ran after the design refresh).
§6 — Risks
- R1 (scope): full bidirectional in one sprint. Mitigation: M1→M4 are ordered; M4 (history import) is the cut line if we overrun. User accepted explicitly.
- R2 (schema fidelity): no live Mythic to validate against; adapter pinned to the official `MythicMeta/Mythic_Scripting` client source on master @ 2026-06-10 (verified via raw.githubusercontent.com). First real connection may need a small patch. README records the exact source URL alongside the "may need a patch" note. Adapter must defensively handle `response_text` base64 decode failures (binary output): on `binascii.Error` keep raw bytes hex-encoded with a prefix `<binary>` so `execution_result` is never silently corrupted.
- R3 (secret at rest): Fernet + env key; key rotation out of scope (documented limitation).
- R4 (polling load): poll-on-read touches only incomplete tasks of the open simulation, so the load is bounded.
- R5 (e2e drift): sprint 7 redesign never re-validated by Playwright; the full re-run in §5.5 surfaces any breakage, so budget fix time.
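
The defensive decoding described under R2 could be sketched as follows. `decode_response_text` and `join_output` are hypothetical names. The `binascii.Error` branch follows the plan. Treating decoded bytes that are not valid UTF-8 the same way is an additional assumption, not something the plan specifies.

```python
import base64
import binascii

BINARY_PREFIX = "<binary>"


def decode_response_text(response_text):
    """Decode one base64 response chunk without ever raising."""
    try:
        raw = base64.b64decode(response_text, validate=True)
    except (binascii.Error, ValueError):
        # Not valid base64: keep the received bytes, hex-encoded and flagged.
        received = response_text.encode("utf-8", "backslashreplace")
        return BINARY_PREFIX + received.hex()
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption beyond the plan: binary payloads get the same treatment.
        return BINARY_PREFIX + raw.hex()


def join_output(chunks):
    """Concatenate decoded chunks, already ordered by response id."""
    return "".join(decode_response_text(chunk) for chunk in chunks)


text_chunk = base64.b64encode(b"lab\\operator\n").decode("ascii")
binary_chunk = base64.b64encode(b"\xff\xfe").decode("ascii")
combined = join_output([text_chunk, "not base64!", binary_chunk])
```

Using `validate=True` matters here: without it `base64.b64decode` silently discards characters outside the alphabet, which could hide a malformed chunk instead of flagging it.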
§7 — Definition of Done
- pytest green (256 baseline on main + new), vitest green (139 baseline + new), Playwright full suite green (223 baseline + new C2 specs).
- spec-reviewer (plan) + design-reviewer (UI) + code-reviewer (diff) all APPROVED.
- SPEC.md, README.md, CHANGELOG.md updated. No secret, key or IP hardcoded anywhere.
- PR #11 opened via `make open-pr`.