mimic/tasks/todo.md

# Sprint 8 — C2 Layer: Mythic Integration

> Branch: `sprint/8-c2` · Worktree: `.claude/worktrees/sprint-8-c2` · Base: `main` @ `6ca614a`

SPEC.md phase 2: "once V1 is finished, we will add a layer for connecting to a C2
(Mythic or in-house) in order to run simulations through the C2."

## §0 — Binding decisions (locked with user, 2026-06-10)

1. **C2 target**: Mythic 3.x, behind a thin `C2Adapter` interface (keeps the door open for a custom C2 later).
2. **Scope**: FULL bidirectional — execute + near-real-time tracking + history import. User explicitly overrode the "one-shot first" recommendation; overrun risk accepted (see R1).
3. **C2 config**: per-engagement (URL + API token). Token is write-only at the API level — never returned in clear.
4. **UI anchor**: execution lives in the simulation screen (Red Team card).
5. **Realtime mechanism**: short polling. Frontend: TanStack Query `refetchInterval` of 2,500 ms while any attached task is incomplete. Backend: poll-on-read, meaning non-completed tasks are refreshed from Mythic whenever the task list is read. No scheduler, no new infra.
6. **Secret storage**: API token Fernet-encrypted in SQLite. Key from env var `MIMIC_ENCRYPTION_KEY` (mandatory to enable C2 features, never hardcoded — OPSEC rule).
7. **History import**: BOTH — auto-attach of tasks launched from Mimic AND manual browse/select of the callback's task history.
8. **Validation**: fully mocked — no dev Mythic instance available. pytest uses a mocked adapter; e2e uses a built-in `FakeAdapter` selected via `MIMIC_C2_ADAPTER=fake`.
9. **RBAC**: C2 is a RedTeam resource — admin + redteam full access, SOC gets 403 on every C2 endpoint (same precedent as Templates and Export).
10. **Workflow tie-in**: launching a C2 execution auto-transitions the simulation `pending → in_progress` (same rule as a manual RT edit).
11. **Output mapping**: when a task completes (or is imported), its output is APPENDED to `execution_result` prefixed by a `$ <command>` header line; `executed_at` is set from the first task's timestamp if empty; the command is appended to `commands` if not already present.
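
The mapping rule in decision 11 can be sketched as a pure function. This is a minimal illustration, not the planned implementation: `RedTeamFields` and `apply_output_mapping` are hypothetical names, `commands` is assumed to be a list of strings, and the caller is assumed to pass whichever task timestamp the team settles on.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RedTeamFields:
    """Hypothetical stand-in for the Red Team fields of a simulation."""
    execution_result: str = ""
    executed_at: Optional[str] = None
    commands: list[str] = field(default_factory=list)  # assumption: stored as a list


def apply_output_mapping(rt: RedTeamFields, command: str, output: str,
                         task_timestamp: str) -> RedTeamFields:
    """Apply decision 11 for one completed or imported task."""
    # Output is appended under a "$ <command>" header line.
    block = f"$ {command}\n{output}"
    if rt.execution_result:
        rt.execution_result = f"{rt.execution_result.rstrip()}\n\n{block}"
    else:
        rt.execution_result = block

    # executed_at is only set when empty, so the first mapped task wins.
    if not rt.executed_at:
        rt.executed_at = task_timestamp

    # The command is appended only if it is not already present.
    if command not in rt.commands:
        rt.commands.append(command)
    return rt
```

The function itself is not idempotent: calling it twice for the same task would append the output twice, so the caller still needs an "already mapped" guard per task.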

## §1 — Mythic adapter contract (pinned from MythicMeta/Mythic_Scripting client source)

- Transport: POST `https://<host>:7443/graphql`, header `apitoken: <token>` (Hasura behind nginx).
- Issue task: mutation `createTask(callback_id: <display_id>, command, params, tasking_location: "command_line")`.
- List callbacks: query on `callback` table — `id, display_id, active, host, user, domain, last_checkin`.
- Task status: query on `task` table — `display_id, status, completed` (client lib uses a subscription; we poll the query instead, per decision 5).
- Task output: query on `response` table ordered by id — `response_text` is **base64-encoded**, decode + concatenate.
- History: query `task` filtered by `callback_display_id`, paginated, with command + status + timestamps.
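
As a rough sketch of the transport described above, the request for the callback list could be assembled as follows. The query text is an illustrative assumption built from the field names listed in this section; it has not been validated against a live Mythic schema, and `build_graphql_request` is a hypothetical helper that only prepares the request without sending it.

```python
import json
from typing import Any, Optional

# Assumption: Hasura-style selection on the `callback` table, using the
# fields named in this section. Not validated against a live instance.
CALLBACKS_QUERY = """
query ListCallbacks {
  callback {
    id
    display_id
    active
    host
    user
    domain
    last_checkin
  }
}
"""


def build_graphql_request(base_url: str, api_token: str, query: str,
                          variables: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Return the URL, headers and JSON body for one GraphQL POST."""
    return {
        "url": f"{base_url.rstrip('/')}/graphql",
        # The token travels in the `apitoken` header, per the contract above.
        "headers": {"apitoken": api_token, "Content-Type": "application/json"},
        "body": json.dumps({"query": query, "variables": variables or {}}),
    }
```

Keeping request construction separate from the HTTP call makes it easy to assert on headers and payloads in pytest with zero live HTTP.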

`C2Adapter` interface (`backend/app/services/c2/adapter.py`):
- `test_connection() -> C2Health`
- `list_callbacks() -> list[C2Callback]`
- `create_task(callback_display_id, command, params) -> int` (task display_id)
- `get_task(task_display_id) -> C2TaskStatus`
- `get_task_output(task_display_id) -> str` (decoded)
- `list_callback_tasks(callback_display_id, page, page_size) -> C2TaskPage`

Implementations: `MythicAdapter` (requests, `verify_tls` flag from config — lab instances run self-signed), `FakeAdapter` (deterministic in-memory data, selected by `MIMIC_C2_ADAPTER=fake`, also powers e2e).
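
Adapter selection from the environment might look like the sketch below. `resolve_adapter_kind` is a hypothetical helper; treating an unset variable as the real Mythic adapter and rejecting unknown values are assumptions, since the plan only specifies the `fake` value.

```python
import os
from typing import Mapping


def resolve_adapter_kind(environ: Mapping[str, str] = os.environ) -> str:
    """Return "fake" or "mythic" based on MIMIC_C2_ADAPTER."""
    value = environ.get("MIMIC_C2_ADAPTER", "").strip().lower()
    if value == "fake":
        return "fake"
    # Assumption: unset (or an explicit "mythic") selects the real adapter.
    if value in ("", "mythic"):
        return "mythic"
    # Assumption: fail loudly on typos rather than silently hitting a live C2.
    raise ValueError(f"Unsupported MIMIC_C2_ADAPTER value: {value!r}")
```

Passing the environment as a parameter lets tests exercise each branch without mutating `os.environ`.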

## §2 — Backend (backend-builder) — milestones M1→M4

**Data model (1 Alembic migration):**
- `c2_config`: id, engagement_id FK **unique** (`ON DELETE CASCADE`, same precedent as `simulations.engagement_id`), url, api_token_encrypted, verify_tls bool (default true), created_at, updated_at.
- `c2_task`: id, simulation_id FK (`ON DELETE CASCADE`), mythic_task_display_id int, callback_display_id int, command text, params text nullable, status text, completed bool, output text nullable, source enum(`mimic`,`import`), created_at, completed_at nullable.
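
To make the constraints concrete, the two tables can be expressed as plain SQLite DDL and exercised with the standard-library `sqlite3` module. This is an illustration only, not the Alembic migration: the `engagements` and `simulations` stubs, the `NOT NULL` choices and the `CHECK` emulation of the enum are assumptions.

```python
import sqlite3

SCHEMA = """
-- Minimal stubs for the existing tables (assumption: real ones have more columns).
CREATE TABLE engagements (id INTEGER PRIMARY KEY);
CREATE TABLE simulations (
    id INTEGER PRIMARY KEY,
    engagement_id INTEGER NOT NULL REFERENCES engagements(id) ON DELETE CASCADE
);
CREATE TABLE c2_config (
    id INTEGER PRIMARY KEY,
    engagement_id INTEGER NOT NULL UNIQUE
        REFERENCES engagements(id) ON DELETE CASCADE,
    url TEXT NOT NULL,
    api_token_encrypted TEXT NOT NULL,
    verify_tls INTEGER NOT NULL DEFAULT 1,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE c2_task (
    id INTEGER PRIMARY KEY,
    simulation_id INTEGER NOT NULL REFERENCES simulations(id) ON DELETE CASCADE,
    mythic_task_display_id INTEGER NOT NULL,
    callback_display_id INTEGER NOT NULL,
    command TEXT NOT NULL,
    params TEXT,
    status TEXT NOT NULL,
    completed INTEGER NOT NULL DEFAULT 0,
    output TEXT,
    source TEXT NOT NULL CHECK (source IN ('mimic', 'import')),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    completed_at TEXT
);
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and cascades) when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

The pragma matters for the real application too: without it SQLite accepts the `ON DELETE CASCADE` clauses but never enforces them.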

**Endpoints (all admin+redteam; SOC → 403):**
- `GET/PUT/DELETE /api/engagements/<id>/c2-config` — GET returns `has_token: bool`, never the token.
- `POST /api/engagements/<id>/c2-config/test` — connectivity check via adapter.
- `GET /api/engagements/<id>/c2/callbacks` — active callbacks.
- `POST /api/simulations/<id>/c2/execute` `{callback_display_id, commands: [str]}` — one Mythic task per command, rows in `c2_task` (source=mimic), auto-transition pending→in_progress.
- `GET /api/simulations/<id>/c2/tasks` — poll-on-read: refresh incomplete tasks from Mythic; on completion fetch output and apply decision 11 mapping (idempotent, applied once).
- `GET /api/engagements/<id>/c2/callbacks/<cid>/history?page=` — paginated callback history for import.
- `POST /api/simulations/<id>/c2/import` `{task_display_ids: [int]}` — import selected tasks (source=import) + decision 11 mapping.
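
The poll-on-read behaviour of `GET /api/simulations/<id>/c2/tasks` can be sketched as a loop that only touches incomplete rows. `TaskRow`, `refresh_incomplete` and the `on_completed` hook are hypothetical names; the adapter is assumed to expose `get_task` and `get_task_output` as in §1.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TaskRow:
    """Hypothetical in-memory view of a `c2_task` row."""
    mythic_task_display_id: int
    command: str
    status: str = "submitted"
    completed: bool = False
    output: Optional[str] = None


def refresh_incomplete(rows: list[TaskRow], adapter,
                       on_completed: Callable[[TaskRow], None]) -> int:
    """Refresh incomplete rows from the C2; return how many were polled."""
    polled = 0
    for row in rows:
        if row.completed:
            # Completed rows are never re-fetched, which bounds the load
            # and keeps the mapping hook from firing twice.
            continue
        remote = adapter.get_task(row.mythic_task_display_id)
        row.status = remote.status
        polled += 1
        if remote.completed:
            row.output = adapter.get_task_output(row.mythic_task_display_id)
            row.completed = True
            # Fires exactly once per task: the place to apply decision 11.
            on_completed(row)
    return polled
```

Because the `completed` flag is persisted before the next read, a second call to the endpoint skips the task entirely, which is what makes the mapping "applied once".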

**Milestones:** M1 crypto service (Fernet) + migration + config CRUD + test endpoint → M2 callbacks + execute → M3 poll-on-read status/output + mapping → M4 history + import.
**Tests:** pytest with mocked adapter (~35-45 new), zero live HTTP. Crypto round-trip, RBAC 403 matrix, mapping idempotence, migration up/down.
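
For the M1 crypto service, a minimal sketch could look like the following. It assumes the third-party `cryptography` package is installed and that `MIMIC_ENCRYPTION_KEY` holds a key produced by `Fernet.generate_key()`; the function names and the `C2FeaturesDisabled` exception are hypothetical.

```python
import os
from typing import Mapping

ENV_KEY_NAME = "MIMIC_ENCRYPTION_KEY"


class C2FeaturesDisabled(RuntimeError):
    """Raised when the encryption key is missing (hypothetical name)."""


def load_key(environ: Mapping[str, str] = os.environ) -> bytes:
    """Read the Fernet key from the environment; never from source code."""
    key = environ.get(ENV_KEY_NAME, "").strip()
    if not key:
        raise C2FeaturesDisabled(f"{ENV_KEY_NAME} must be set to enable C2 features")
    return key.encode("ascii")


def encrypt_token(token: str, environ: Mapping[str, str] = os.environ) -> str:
    """Encrypt an API token for storage in `api_token_encrypted`."""
    # Imported lazily so that the module loads even without the dependency.
    from cryptography.fernet import Fernet
    return Fernet(load_key(environ)).encrypt(token.encode("utf-8")).decode("ascii")


def decrypt_token(ciphertext: str, environ: Mapping[str, str] = os.environ) -> str:
    """Decrypt a stored token just before it is handed to the adapter."""
    from cryptography.fernet import Fernet
    return Fernet(load_key(environ)).decrypt(ciphertext.encode("ascii")).decode("utf-8")
```

A round-trip test would generate a throwaway key with `Fernet.generate_key()`, pass it in a dict as `environ`, and assert that `decrypt_token(encrypt_token(t, env), env) == t`.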

## §3 — Frontend (frontend-builder)

- **Engagement detail/form**: "C2 configuration" card — URL, token (password input, write-only placeholder when `has_token`), verify-TLS checkbox, [Test connection] with result feedback. Admin+redteam only.
- **SimulationFormPage RT card**: [Execute via C2] button (hidden when no config; disabled when done) → modal: callback picker table (display_id, host, user, active, last_checkin — font-mono data cells) → commands textarea prefilled from `rt.commands` → Launch.
- **C2 tasks panel** (under RT card): table of attached tasks with command (mono), status badge (semantic tokens), completed_at (mono). `refetchInterval` of 2,500 ms while any task is incomplete; polling stops when all are done.
- **Import history modal**: callback picker → paginated task list with checkboxes → [Import selected].
- Terminal-SOC compliance: rounded-none, zero transitions, hairline borders, mono for data only. Status colors via `success`/`warn` semantic tokens.
- Vitest ~15-20 new tests. `data-testid` on every new interactive surface for e2e.

## §4 — Docs & hygiene

- **SPEC.md C2 section = FIRST commit** of the sprint (sprint 5/6 lesson: never close a sprint with SPEC uncommitted).
- README: `MIMIC_ENCRYPTION_KEY`, `MIMIC_C2_ADAPTER` env vars + docker compose example.
- CHANGELOG sprint 8 entry at close.

## §5 — Pipeline & sequencing

1. spec-reviewer pre-pass on this plan (before any code).
2. backend-builder M1+M2 → API contract frozen → frontend-builder starts (parallel with backend M3+M4).
3. design-reviewer on new UI surfaces (screenshots come from the Playwright run — e2e bootstraps its own users, which sidesteps the sprint-7 credentials wall).
4. **code-reviewer must be respawned** (killed 2026-06-10 after idle-loop malfunction) before the review phase.
5. test-verifier: new C2 Playwright specs (fake adapter) + **full 223-spec re-run** — also clears the sprint 7 e2e debt (suite never re-ran after the design refresh).

## §6 — Risks

- **R1 — scope**: full bidirectional in one sprint. Mitigation: M1→M4 are ordered; M4 (history import) is the cut line if we overrun. User accepted explicitly.
- **R2 — schema fidelity**: no live Mythic to validate against; adapter pinned to the official `MythicMeta/Mythic_Scripting` client source on **master @ 2026-06-10** (verified via raw.githubusercontent.com). First real connection may need a small patch. README records the exact source URL alongside the "may need a patch" note. Adapter must defensively handle `response_text` base64 decode failures (binary output): on `binascii.Error` keep raw bytes hex-encoded with a prefix `<binary> ` so `execution_result` never silently corrupts.
- **R3 — secret at rest**: Fernet + env key; key rotation out of scope (documented limitation).
- **R4 — polling load**: poll-on-read touches only incomplete tasks of the open simulation — bounded.
- **R5 — e2e drift**: sprint 7 redesign never re-validated by Playwright; the full re-run in §5 (step 5) surfaces any breakage, so budget fix time.
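
The defensive decoding called for in R2 might be sketched as below. The `binascii.Error` branch follows the plan, interpreting "raw bytes" as the undecodable `response_text` itself. The second fallback, for payloads that decode from base64 but are not valid UTF-8, is an added assumption, as are the helper names and the dict shape of the response rows.

```python
import base64

BINARY_PREFIX = "<binary> "


def decode_response_text(response_text: str) -> str:
    """Decode one base64 `response_text` without ever raising."""
    try:
        raw = base64.b64decode(response_text, validate=True)
    except ValueError:
        # binascii.Error is a ValueError subclass, so this also covers it.
        # Keep the undecodable payload, hex-encoded and clearly marked.
        return BINARY_PREFIX + response_text.encode("utf-8").hex()
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption beyond the plan: genuine binary output gets the same marker.
        return BINARY_PREFIX + raw.hex()


def join_responses(rows: list[dict]) -> str:
    """Concatenate decoded chunks in `id` order, as described in §1."""
    ordered = sorted(rows, key=lambda r: r["id"])
    return "".join(decode_response_text(r["response_text"]) for r in ordered)
```

Hex keeps the stored text printable and reversible, so `execution_result` stays valid text even when a task returns binary data.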

## §7 — Definition of Done

- pytest green (256 baseline on main + new), vitest green (139 baseline + new), Playwright full suite green (223 baseline + new C2 specs).
- spec-reviewer (plan) + design-reviewer (UI) + code-reviewer (diff) all APPROVED.
- SPEC.md, README.md, CHANGELOG.md updated. No secret, key or IP hardcoded anywhere.
- PR #11 opened via `make open-pr`.