
Knacky 813e69ee01 docs(spec): add C2 integration section (sprint 8 commit #1)

Introduce the SPEC section for the Mythic C2 integration layer.
Covers RBAC (RT-only, SOC=403), per-engagement Fernet-encrypted config,
c2_config + c2_task data model with ON DELETE CASCADE, full endpoint
list, output mapping rules (append-only, idempotent), 2500 ms polling
and the fake/real adapter selection via MIMIC_C2_ADAPTER.

Also patch tasks/todo.md: fix pytest baseline (256 from main, not 253),
make cascade-delete explicit, pin the MythicMeta/Mythic_Scripting source
version and document defensive base64 handling.

Closes spec-reviewer WARN-1 (SPEC ↔ plan parity), WARN-2 (cascade),
INFO-1 (pinned source), INFO-3 (baseline).

2026-06-10 19:07:35 +02:00


Sprint 8 — C2 Layer: Mythic Integration

Branch: sprint/8-c2 · Worktree: .claude/worktrees/sprint-8-c2 · Base: main @ 6ca614a

SPEC.md phase 2: "once V1 is finished, we will add a layer that connects to a C2 (Mythic or in-house) in order to run simulations through the C2."

§0 — Binding decisions (locked with user, 2026-06-10)

1. C2 target: Mythic 3.x, behind a thin C2Adapter interface (keeps the door open for a custom C2 later).
2. Scope: FULL bidirectional (execute + near-real-time tracking + history import). User explicitly overrode the "one-shot first" recommendation; overrun risk accepted (see R1).
3. C2 config: per-engagement (URL + API token). Token is write-only at the API level and is never returned in clear.
4. UI anchor: execution lives in the simulation screen (Red Team card).
5. Realtime mechanism: short polling. Frontend: TanStack Query refetchInterval of 2500 ms while any attached task is incomplete. Backend: poll-on-read, i.e. non-completed tasks are refreshed from Mythic whenever the task list is read. No scheduler, no new infra.
6. Secret storage: API token Fernet-encrypted in SQLite. Key from env var MIMIC_ENCRYPTION_KEY (mandatory to enable C2 features, never hardcoded, per the OPSEC rule).
7. History import: BOTH auto-attach of tasks launched from Mimic AND manual browse/select of the callback's task history.
8. Validation: fully mocked, since no dev Mythic instance is available. pytest uses a mocked adapter; e2e uses a built-in FakeAdapter selected via MIMIC_C2_ADAPTER=fake.
9. RBAC: C2 is a RedTeam resource. admin + redteam get full access; SOC gets 403 on every C2 endpoint (same precedent as Templates and Export).
10. Workflow tie-in: launching a C2 execution auto-transitions the simulation pending → in_progress (same rule as a manual RT edit).
11. Output mapping: when a task completes (or is imported), its output is APPENDED to execution_result, prefixed by a $ <command> header line; executed_at is set from the first task's timestamp if empty; the command is appended to commands if not already present.
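The output mapping rule (the last decision in the list above) could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: SimulationRecord, CompletedTask, the mapped flag and apply_output_mapping are hypothetical names, and the exact separator between appended blocks is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class SimulationRecord:
    """Hypothetical stand-in for the simulation's Red Team fields."""
    execution_result: str = ""
    executed_at: Optional[datetime] = None
    commands: list[str] = field(default_factory=list)


@dataclass
class CompletedTask:
    """Hypothetical view of a finished (or imported) c2_task row."""
    command: str
    output: str
    timestamp: datetime
    mapped: bool = False  # assumed flag that makes the mapping apply only once


def apply_output_mapping(sim: SimulationRecord, task: CompletedTask) -> bool:
    """Apply the mapping once; return True if the simulation was changed."""
    if task.mapped:
        return False  # already applied: a second call is a no-op (idempotent)

    # Append-only: existing text is kept, the new block goes at the end.
    needs_newline = sim.execution_result and not sim.execution_result.endswith("\n")
    separator = "\n" if needs_newline else ""
    body = task.output if task.output.endswith("\n") else task.output + "\n"
    sim.execution_result += f"{separator}$ {task.command}\n{body}"

    # executed_at is only filled when empty, so the first task's timestamp wins.
    if sim.executed_at is None:
        sim.executed_at = task.timestamp

    # The command list stays free of duplicates.
    if task.command not in sim.commands:
        sim.commands.append(task.command)

    task.mapped = True
    return True
```

Keeping the "already applied" marker on the task row rather than inspecting execution_result is one possible way to get idempotence; it avoids false positives when two tasks run the same command.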

§1 — Mythic adapter contract (pinned from MythicMeta/Mythic_Scripting client source)

Transport: POST https://<host>:7443/graphql, header apitoken: <token> (Hasura behind nginx).
Issue task: mutation createTask(callback_id: <display_id>, command, params, tasking_location: "command_line").
List callbacks: query on callback table — id, display_id, active, host, user, domain, last_checkin.
Task status: query on task table — display_id, status, completed (client lib uses a subscription; we poll the query instead, per decision 5).
Task output: query on response table ordered by id — response_text is base64-encoded, decode + concatenate.
History: query task filtered by callback_display_id, paginated, with command + status + timestamps.
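As an illustration of the transport and task-issuing entries above, the sketch below only builds the request (URL, headers, JSON body) without sending it. The argument names come from the contract above; the GraphQL variable types and the returned fields (status, display_id, error) are assumptions that would need to be checked against the pinned client source, and build_create_task_request is a hypothetical helper name.

```python
import json

# Assumed mutation text: argument names follow the contract above, while the
# variable types and the selection set are guesses to verify against the
# pinned Mythic_Scripting source.
CREATE_TASK_MUTATION = """
mutation createTask($callback_id: Int!, $command: String!, $params: String!, $tasking_location: String) {
  createTask(callback_id: $callback_id, command: $command, params: $params, tasking_location: $tasking_location) {
    status
    display_id
    error
  }
}
"""


def build_create_task_request(host: str, api_token: str, callback_display_id: int,
                              command: str, params: str = "") -> dict:
    """Return the URL, headers and JSON body for one createTask call (nothing is sent)."""
    variables = {
        "callback_id": callback_display_id,  # the contract passes the display_id here
        "command": command,
        "params": params,
        "tasking_location": "command_line",
    }
    return {
        "url": f"https://{host}:7443/graphql",
        "headers": {"apitoken": api_token, "Content-Type": "application/json"},
        "body": json.dumps({"query": CREATE_TASK_MUTATION, "variables": variables}),
    }
```

A real adapter would presumably hand these three values to requests.post(url, headers=..., data=..., verify=verify_tls); separating construction from sending keeps the payload testable with zero live HTTP.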

C2Adapter interface (backend/app/services/c2/adapter.py):

test_connection() -> C2Health
list_callbacks() -> list[C2Callback]
create_task(callback_display_id, command, params) -> int (task display_id)
get_task(task_display_id) -> C2TaskStatus
get_task_output(task_display_id) -> str (decoded)
list_callback_tasks(callback_display_id, page, page_size) -> C2TaskPage

Implementations: MythicAdapter (requests, verify_tls flag from config — lab instances run self-signed), FakeAdapter (deterministic in-memory data, selected by MIMIC_C2_ADAPTER=fake, also powers e2e).
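The interface and the fake implementation could look roughly like the sketch below. Method names and signatures follow the list above; the fields of C2Health, C2TaskStatus and C2TaskPage, the "completed" status string, the sample callback data and the select_adapter helper are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Protocol


@dataclass(frozen=True)
class C2Health:
    ok: bool
    detail: str = ""


@dataclass(frozen=True)
class C2Callback:
    display_id: int
    host: str
    user: str
    active: bool


@dataclass(frozen=True)
class C2TaskStatus:
    display_id: int
    status: str
    completed: bool


@dataclass(frozen=True)
class C2TaskPage:
    items: list[C2TaskStatus]
    page: int
    page_size: int
    total: int


class C2Adapter(Protocol):
    def test_connection(self) -> C2Health: ...
    def list_callbacks(self) -> list[C2Callback]: ...
    def create_task(self, callback_display_id: int, command: str, params: str) -> int: ...
    def get_task(self, task_display_id: int) -> C2TaskStatus: ...
    def get_task_output(self, task_display_id: int) -> str: ...
    def list_callback_tasks(self, callback_display_id: int, page: int, page_size: int) -> C2TaskPage: ...


class FakeAdapter:
    """Deterministic in-memory adapter: every task completes immediately."""

    def __init__(self) -> None:
        # task display_id -> (callback display_id, command)
        self._tasks: dict[int, tuple[int, str]] = {}

    def test_connection(self) -> C2Health:
        return C2Health(ok=True, detail="fake adapter")

    def list_callbacks(self) -> list[C2Callback]:
        return [C2Callback(display_id=1, host="WS01", user="alice", active=True)]

    def create_task(self, callback_display_id: int, command: str, params: str = "") -> int:
        display_id = len(self._tasks) + 1  # sequential ids keep runs reproducible
        self._tasks[display_id] = (callback_display_id, command)
        return display_id

    def get_task(self, task_display_id: int) -> C2TaskStatus:
        return C2TaskStatus(task_display_id, "completed", True)

    def get_task_output(self, task_display_id: int) -> str:
        return f"fake output for: {self._tasks[task_display_id][1]}"

    def list_callback_tasks(self, callback_display_id: int, page: int = 1,
                            page_size: int = 20) -> C2TaskPage:
        ids = sorted(i for i, (cb, _) in self._tasks.items() if cb == callback_display_id)
        start = (page - 1) * page_size
        items = [self.get_task(i) for i in ids[start:start + page_size]]
        return C2TaskPage(items, page, page_size, len(ids))


def select_adapter(env: Mapping[str, str] = os.environ) -> C2Adapter:
    """Pick the adapter from MIMIC_C2_ADAPTER; the real adapter is omitted here."""
    if env.get("MIMIC_C2_ADAPTER") == "fake":
        return FakeAdapter()
    raise NotImplementedError("MythicAdapter is outside the scope of this sketch")
```

Using a Protocol rather than an abstract base class is a design choice of this sketch: the fake and the real adapter then only need to match the method shapes, without sharing an inheritance tree.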

§2 — Backend (backend-builder) — milestones M1→M4

Data model (1 Alembic migration):

c2_config: id, engagement_id FK unique (ON DELETE CASCADE, same precedent as simulations.engagement_id), url, api_token_encrypted, verify_tls bool (default true), created_at, updated_at.
c2_task: id, simulation_id FK (ON DELETE CASCADE), mythic_task_display_id int, callback_display_id int, command text, params text nullable, status text, completed bool, output text nullable, source enum(mimic,import), created_at, completed_at nullable.
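To make the cascade behaviour concrete, here is a rough sketch of the two tables as plain SQLite DDL driven from Python's sqlite3 module. It is not the Alembic migration itself: the engagements table name, the reduced parent tables, the NOT NULL choices, and the CHECK constraint standing in for the source enum are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE engagements (id INTEGER PRIMARY KEY);
CREATE TABLE simulations (
    id INTEGER PRIMARY KEY,
    engagement_id INTEGER NOT NULL REFERENCES engagements(id) ON DELETE CASCADE
);
CREATE TABLE c2_config (
    id INTEGER PRIMARY KEY,
    engagement_id INTEGER NOT NULL UNIQUE REFERENCES engagements(id) ON DELETE CASCADE,
    url TEXT NOT NULL,
    api_token_encrypted TEXT NOT NULL,
    verify_tls INTEGER NOT NULL DEFAULT 1,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE c2_task (
    id INTEGER PRIMARY KEY,
    simulation_id INTEGER NOT NULL REFERENCES simulations(id) ON DELETE CASCADE,
    mythic_task_display_id INTEGER NOT NULL,
    callback_display_id INTEGER NOT NULL,
    command TEXT NOT NULL,
    params TEXT,
    status TEXT NOT NULL,
    completed INTEGER NOT NULL DEFAULT 0,
    output TEXT,
    source TEXT NOT NULL CHECK (source IN ('mimic', 'import')),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    completed_at TEXT
);
"""


def open_db() -> sqlite3.Connection:
    """Open an in-memory database with the sketch schema and FK enforcement on."""
    conn = sqlite3.connect(":memory:")
    # SQLite only honours ON DELETE CASCADE when foreign keys are enabled
    # on the connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With this schema, deleting an engagement removes its c2_config row directly and its c2_task rows through the simulations cascade. The PRAGMA is the easy part to forget: without it SQLite silently keeps orphan rows, which is worth covering in the migration up/down tests.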

Endpoints (all admin+redteam; SOC → 403):

GET/PUT/DELETE /api/engagements/<id>/c2-config — GET returns has_token: bool, never the token.
POST /api/engagements/<id>/c2-config/test — connectivity check via adapter.
GET /api/engagements/<id>/c2/callbacks — active callbacks.
POST /api/simulations/<id>/c2/execute {callback_display_id, commands: [str]} — one Mythic task per command, rows in c2_task (source=mimic), auto-transition pending→in_progress.
GET /api/simulations/<id>/c2/tasks — poll-on-read: refresh incomplete tasks from Mythic; on completion fetch output and apply decision 11 mapping (idempotent, applied once).
GET /api/engagements/<id>/c2/callbacks/<cid>/history?page= — paginated callback history for import.
POST /api/simulations/<id>/c2/import {task_display_ids: [int]} — import selected tasks (source=import) + decision 11 mapping.
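The poll-on-read behaviour of GET /api/simulations/<id>/c2/tasks might be sketched as below. TaskRow, RemoteStatus, StatusSource and refresh_on_read are hypothetical names, and the on_complete callback stands in for the output mapping step; only the rule itself (refresh incomplete tasks on read, fetch output once on completion) comes from the plan.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Protocol


@dataclass
class TaskRow:
    """Hypothetical in-memory view of a c2_task row."""
    mythic_task_display_id: int
    command: str
    status: str = "submitted"
    completed: bool = False
    output: Optional[str] = None


@dataclass(frozen=True)
class RemoteStatus:
    status: str
    completed: bool


class StatusSource(Protocol):
    """The two adapter calls this sketch needs."""
    def get_task(self, task_display_id: int) -> RemoteStatus: ...
    def get_task_output(self, task_display_id: int) -> str: ...


def refresh_on_read(rows: Iterable[TaskRow], adapter: StatusSource,
                    on_complete: Callable[[TaskRow], None]) -> int:
    """Refresh incomplete rows from the C2; return how many were queried."""
    queried = 0
    for row in rows:
        if row.completed:
            continue  # finished rows are never re-queried, which bounds the load
        remote = adapter.get_task(row.mythic_task_display_id)
        row.status = remote.status
        queried += 1
        if remote.completed:
            # Output is fetched only at the transition to completed.
            row.output = adapter.get_task_output(row.mythic_task_display_id)
            row.completed = True
            on_complete(row)  # runs once: later reads skip this row
    return queried
```

Because a completed row is skipped on every later read, the completion hook fires at most once per task, which is one simple way to satisfy the "idempotent, applied once" requirement.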

Milestones: M1 crypto service (Fernet) + migration + config CRUD + test endpoint → M2 callbacks + execute → M3 poll-on-read status/output + mapping → M4 history + import. Tests: pytest with mocked adapter (~35-45 new), zero live HTTP. Crypto round-trip, RBAC 403 matrix, mapping idempotence, migration up/down.

§3 — Frontend (frontend-builder)

Engagement detail/form: "C2 configuration" card — URL, token (password input, write-only placeholder when has_token), verify-TLS checkbox, [Test connection] with result feedback. Admin+redteam only.
SimulationFormPage RT card: [Execute via C2] button (hidden when no config; disabled when done) → modal: callback picker table (display_id, host, user, active, last_checkin — font-mono data cells) → commands textarea prefilled from rt.commands → Launch.
C2 tasks panel (under RT card): table of attached tasks with command (mono), status badge (semantic tokens), completed_at (mono). refetchInterval of 2500 ms while any task is incomplete; polling stops when all are done.
Import history modal: callback picker → paginated task list with checkboxes → [Import selected].
Terminal-SOC compliance: rounded-none, zero transitions, hairline borders, mono for data only. Status colors via success/warn semantic tokens.
Vitest ~15-20 new tests. data-testid on every new interactive surface for e2e.

§4 — Docs & hygiene

SPEC.md C2 section = FIRST commit of the sprint (sprint 5/6 lesson: never close a sprint with SPEC uncommitted).
README: MIMIC_ENCRYPTION_KEY, MIMIC_C2_ADAPTER env vars + docker compose example.
CHANGELOG sprint 8 entry at close.
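For the README's environment variable section, a startup check along these lines could illustrate what "mandatory to enable C2 features" means in practice. It relies only on the documented Fernet key format (a URL-safe base64 encoding of 32 bytes); C2Disabled and load_encryption_key are hypothetical names, and how the application reacts to a missing key is an assumption.

```python
import base64
import os
from typing import Mapping


class C2Disabled(RuntimeError):
    """Raised when C2 features cannot be enabled (hypothetical error type)."""


def load_encryption_key(env: Mapping[str, str] = os.environ) -> bytes:
    """Return the Fernet key from MIMIC_ENCRYPTION_KEY, or raise C2Disabled."""
    raw = env.get("MIMIC_ENCRYPTION_KEY", "")
    if not raw:
        raise C2Disabled("MIMIC_ENCRYPTION_KEY is not set; C2 features stay disabled")
    try:
        decoded = base64.urlsafe_b64decode(raw.encode("ascii"))
    except ValueError as exc:  # covers binascii.Error and non-ASCII input
        raise C2Disabled("MIMIC_ENCRYPTION_KEY is not valid URL-safe base64") from exc
    if len(decoded) != 32:
        raise C2Disabled("MIMIC_ENCRYPTION_KEY must decode to exactly 32 bytes")
    # The key is returned in its encoded form, which is what Fernet expects.
    return raw.encode("ascii")
```

Failing at load time with a clear message, rather than at the first encrypt call, keeps a misconfigured deployment from accepting a token it cannot store.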

§5 — Pipeline & sequencing

1. spec-reviewer pre-pass on this plan (before any code).
2. backend-builder M1+M2 → API contract frozen → frontend-builder starts (parallel with backend M3+M4).
3. design-reviewer on new UI surfaces (screenshots come from the Playwright run; e2e bootstraps its own users, which sidesteps the sprint-7 credentials wall).
4. code-reviewer must be respawned (killed 2026-06-10 after idle-loop malfunction) before the review phase.
5. test-verifier: new C2 Playwright specs (fake adapter) + full 223-spec re-run, which also clears the sprint 7 e2e debt (the suite never re-ran after the design refresh).

§6 — Risks

R1 — scope: full bidirectional in one sprint. Mitigation: M1→M4 are ordered; M4 (history import) is the cut line if we overrun. User accepted explicitly.
R2 — schema fidelity: no live Mythic to validate against; adapter pinned to the official MythicMeta/Mythic_Scripting client source on master @ 2026-06-10 (verified via raw.githubusercontent.com). First real connection may need a small patch. README records the exact source URL alongside the "may need a patch" note. Adapter must defensively handle response_text base64 decode failures (binary output): on binascii.Error keep raw bytes hex-encoded with a prefix <binary> so execution_result never silently corrupts.
R3 — secret at rest: Fernet + env key; key rotation out of scope (documented limitation).
R4 — polling load: poll-on-read touches only incomplete tasks of the open simulation — bounded.
R5 — e2e drift: sprint 7 redesign never re-validated by Playwright; the full re-run in §5.5 surfaces any breakage — budget fix time.
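The defensive decoding described in R2 could be sketched as follows. The binascii.Error branch follows the plan (here the received text itself is hex-encoded, which is one reading of "raw bytes"); the extra UnicodeDecodeError branch for payloads that are valid base64 but not valid UTF-8 is an assumption added for this sketch, as are the function names.

```python
import base64

BINARY_PREFIX = "<binary>"


def decode_response_text(response_text: str) -> str:
    """Decode one base64 response chunk without ever raising on bad input."""
    try:
        raw = base64.b64decode(response_text, validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        # Not valid base64: keep what was received, hex-encoded and flagged.
        return BINARY_PREFIX + response_text.encode("utf-8", errors="replace").hex()
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed extra case: valid base64 wrapping non-text bytes.
        return BINARY_PREFIX + raw.hex()


def join_responses(chunks: list[str]) -> str:
    """Decode chunks (already ordered by id) and concatenate them."""
    return "".join(decode_response_text(chunk) for chunk in chunks)
```

Hex-encoding with a visible prefix means a reader of execution_result can tell at a glance that a chunk was not text, and the original bytes remain recoverable.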

§7 — Definition of Done

pytest green (256 baseline on main + new), vitest green (139 baseline + new), Playwright full suite green (223 baseline + new C2 specs).
spec-reviewer (plan) + design-reviewer (UI) + code-reviewer (diff) all APPROVED.
SPEC.md, README.md, CHANGELOG.md updated. No secret, key or IP hardcoded anywhere.
PR #11 opened via make open-pr.
