mimic-big/tasks/spec-decisions.md
knacky 76f8443ac2 docs: sprint 2 surface in docs/api.md + D-015/D-016/D-017 + changelog
- `docs/api.md` extended with the sprint-2 surface: pagination envelope
  conventions, engagement members (GET/POST/DELETE), users (GET paginated
  with `?type=`, POST, PATCH, DELETE-soft), audit log viewer with its
  five filters. Anti-enumeration semantics (404 on foreign members) made
  explicit. Drive-by fix: `/engagements<eid>` → `/engagements/<eid>`.
- `tasks/spec-decisions.md` logs the three sprint-2 decisions verbatim:
  - **D-015** USER_MANAGE permission (wording from spec-analyst).
  - **D-016** pagination envelope shape (`{items, total, page, page_size}`).
  - **D-017** `engagement_member.role` stays a free-form label.
- `CHANGELOG.md` summarises the sprint with hashes / behaviours / decisions.
2026-05-23 15:53:45 +02:00


Spec decisions log

This file tracks implementation arbitrations on top of the frozen spec (Projects/Mimic — Spec.md in the RT-SecondBrain vault).

Format: one entry per decision, newest first.


2026-05-21 — Team kickoff decisions

D-001 — SOC collaboration hypothesis

Context. Devils-advocate flagged the sociological assumption that SOC analysts will do their scoring in the live cockpit. Decision. Hypothesis accepted as-is. No paper PoC. Risk owned by lead RT.

D-002 — Mimic deployment location

Context. Spec §6 NF-network did not pin where Mimic is physically deployed. Decision. Mimic runs on RT infrastructure. SOC client connects through the existing RT reverse proxy (Caddy, out of Mimic scope). Mimic → Mythic / Home C2 through outbound VPN. RT R&D (TTP library, stealthy variants) never sits on client premises.

D-003 — Authentication strategy

Context. Spec mentions OIDC Keycloak but lab onboarding cost is high. Decision. v1 ships local auth (username/password, bcrypt, Flask server-side sessions). v2 adds Keycloak OIDC. The RBAC model is group-based from day one, so OIDC will map claims to existing groups without touching application code. SOC sessions remain a distinct mechanism (soc_session.token_opaque bcrypt hash, clear token out-of-band).

D-004 — C2 credential storage (T2)

Context. Engagement.config_json (encrypted JSON column) vs dedicated table. Decision. Dedicated table c2_credential (id, engagement_id, c2_type, config_json_fernet, version, created_at, retired_at). Active row per engagement = retired_at IS NULL, highest version. Rotation = insert + retire previous. Fernet key in env, never in DB.
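
A minimal sketch of the "active row" lookup and the "insert + retire previous" rotation described above. It uses the standard-library sqlite3 module as a stand-in for the real database, an integer id instead of whatever key type the project uses, and treats the Fernet ciphertext as opaque bytes. The function names `active_credential` and `rotate_credential` are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

# Column names follow D-004; the column types are assumptions for this sketch.
SCHEMA = """
CREATE TABLE c2_credential (
    id INTEGER PRIMARY KEY,
    engagement_id TEXT NOT NULL,
    c2_type TEXT NOT NULL,
    config_json_fernet BLOB NOT NULL,
    version INTEGER NOT NULL,
    created_at TEXT NOT NULL,
    retired_at TEXT
)
"""


def active_credential(db, engagement_id):
    """Return (id, version, ciphertext) of the active row, or None."""
    return db.execute(
        "SELECT id, version, config_json_fernet FROM c2_credential "
        "WHERE engagement_id = ? AND retired_at IS NULL "
        "ORDER BY version DESC LIMIT 1",
        (engagement_id,),
    ).fetchone()


def rotate_credential(db, engagement_id, c2_type, ciphertext):
    """Retire the current row and insert the next version in one transaction."""
    now = datetime.now(timezone.utc).isoformat()
    with db:  # commits on success, rolls back on error
        (prev_version,) = db.execute(
            "SELECT COALESCE(MAX(version), 0) FROM c2_credential "
            "WHERE engagement_id = ?",
            (engagement_id,),
        ).fetchone()
        db.execute(
            "UPDATE c2_credential SET retired_at = ? "
            "WHERE engagement_id = ? AND retired_at IS NULL",
            (now, engagement_id),
        )
        db.execute(
            "INSERT INTO c2_credential "
            "(engagement_id, c2_type, config_json_fernet, version, created_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (engagement_id, c2_type, ciphertext, prev_version + 1, now),
        )
    return prev_version + 1
```

Retired rows are kept, so earlier versions remain available for audit while only one row per engagement is active.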

D-005 — Cleanup template variable sources (T3)

Context. Jinja {{outputs.X}} source ambiguity. Decision. Two accessors:

  • {{outputs.text}} → run_step.output_text (stdout/UTF-8 text).
  • {{outputs.blob("<key>")}} → reads from output_blob_ref, hard cap 10 MB (consistent with F8 evidence limit), UTF-8 decoding with latin-1 fallback, silent refusal + log entry if the blob is non-decodable. regex_extract always operates on the resulting string.
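
The decoding rule for {{outputs.blob("<key>")}} could look roughly like the sketch below. The helper name `decode_blob`, the logger name, the reading of 10 MB as 10 × 1024 × 1024 bytes, and the choice to refuse (rather than truncate) oversized blobs are all assumptions. Note that latin-1 maps every byte value to a character, so in this sketch the refusal path is only reached through the size cap.

```python
import logging
from typing import Optional

log = logging.getLogger("mimic.templating")  # hypothetical logger name

MAX_BLOB_BYTES = 10 * 1024 * 1024  # assumed reading of the 10 MB hard cap


def decode_blob(data: bytes) -> Optional[str]:
    """Decode blob bytes to text, or return None (and log) when refused."""
    if len(data) > MAX_BLOB_BYTES:
        log.warning("blob refused: %d bytes exceeds cap", len(data))
        return None
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Fallback: latin-1 accepts any byte sequence, so this never raises.
        return data.decode("latin-1")
```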

D-006 — SOC session token storage (T4)

Context. soc_session.token_opaque storage form. Decision. bcrypt hash. Clear token generated server-side at session creation, returned once in the API response, delivered out-of-band to the SOC analyst. Never re-displayable.

D-007 — Reverse proxy scope

Context. Mimic exposure to internet for SOC client access. Decision. Reverse proxy (Caddy + TLS + IP allowlist) handled by existing RT infrastructure. Mimic ships an HTTP listener on localhost only; the deployment playbook wires it behind the existing proxy.

D-008 — Group-based RBAC vs spec F11 fixed roles

Context. Spec F11 declares 3 fixed roles (rt_operator, rt_lead, soc_analyst) with an explicit permission matrix. Sprint 0 plan (B0.6, D-003) introduces group / permission / group_permission / user_group tables to prepare OIDC v2 claim-to-group mapping without code change. Decision. Group-based model accepted as an implementation layout, not a scope extension:

  • The 3 spec roles MUST exist as the 3 seeded groups at bootstrap (rt_operator, rt_lead, soc_analyst).
  • The F11 permission matrix is the canonical source: groups receive exactly the permissions of their matching role; no custom permissions UI v1.
  • Custom groups, group editing UI, or per-engagement group overrides = OUT of v1.
  • Any drift between seeded group permissions and the F11 matrix is a spec violation, not a configuration choice.

D-009 — ttp_version table forbidden (H32 reaffirmed)

Context. Sprint 0 plan (B0.2) lists ttp_version among the initial tables. Spec hypothesis H32 explicitly excludes this: "Replayability snapshot = run.snapshot_json only (no separate ttp_version table; MVP simplification)". Decision. Drop ttp_version from the initial migration. The ttp.version column (informational, §8) is kept. Replayability lives solely on run.snapshot_json. Re-introducing ttp_version requires an explicit spec amendment through the team-lead.

D-010 — Ansible for the deployment playbook

Context. Spec §7 names Docker only on the deploy line, but D-007 references a "deployment playbook" wiring Mimic behind the existing reverse proxy. The RT team uses Ansible for infrastructure automation across projects. Decision. Deployment artifacts are Docker images (built in repo) plus an Ansible playbook (lives outside the application repo, in the RT infra repo). Mimic itself ships only the Dockerfile and a sample compose for dev; production roll-out is Ansible-driven. The README stack line is updated accordingly.

D-011 — regex_extract Jinja2 filter semantics (resolves Q-001)

Context. D-005 introduced regex_extract on Jinja templates without fixing its match-mode, no-match behaviour, group selection, or engine flavour. Backend B0.5 (templating sandbox) is starting and needs a frozen signature. Decision.

  • Engine — google-re2 (D-005 reaffirmed). Linear-time, no backrefs, OPSEC-safe (no ReDoS).
  • Match mode — first match only.
  • No-match — raise TemplateError("regex_extract: no match for /<pattern>/"). No silent fallback. Drifting cleanup templates must fail loudly at step run time, not on the next mission.
  • Group selection — defaults to capture group 1; positional fallback to the full match when the pattern has no groups; named groups via name="<name>".
  • Signature — regex_extract(text, pattern, *, group=1, name=None).
  • Rationale — ATR/Caldera compatibility is not an objective (D-005). Fail-fast beats silent string corruption when a cleanup template touches a host with an unexpected output shape.
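
The semantics above can be sketched as follows. This illustration uses the standard-library `re` module as a stand-in for google-re2, so it does not carry the linear-time guarantee, and it defines a local `TemplateError` instead of importing the templating engine's exception. Behaviour for an unknown group name or a non-participating group is not fixed by the decision and is left to `re` here.

```python
import re


class TemplateError(Exception):
    """Local stand-in for the templating layer's error type."""


def regex_extract(text, pattern, *, group=1, name=None):
    """Return the first match of pattern in text, per the D-011 rules."""
    compiled = re.compile(pattern)
    match = compiled.search(text)  # first match only
    if match is None:
        # No silent fallback: fail loudly when nothing matches.
        raise TemplateError(f"regex_extract: no match for /{pattern}/")
    if name is not None:
        return match.group(name)   # named group requested explicitly
    if compiled.groups == 0:
        return match.group(0)      # no groups in pattern: full match
    return match.group(group)      # defaults to capture group 1
```

For example, `regex_extract("uid=1001(bob)", r"uid=(\d+)")` selects capture group 1, while the group-less pattern `r"\d+"` falls back to the full match.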

D-012 — output_blob_ref storage layout (resolves Q-002)

Context. §8 declares run_step.output_blob_ref without specifying pool, quota, format, or path. H20 says "local disk v1" only. Sprint 0 needs the layout locked because B0.5 already references {{ outputs.blob(...) }}. Decision.

  • Two separate pools
    • MIMIC_BLOB_ROOT (default /var/lib/mimic/blobs/) — binary outputs from C2Connector polling. Content-addressed layout: <aa>/<bb>/<sha256>.gz, where aa and bb are the first two hex-character pairs (the first two bytes) of the sha256 hex digest. Blobs are always gzip-compressed; uncompressed bytes are never written to disk.
    • MIMIC_EVIDENCE_ROOT (default /var/lib/mimic/evidence/) — user-uploaded evidence files (F8). Flat layout <engagement_id>/<evidence_id>.<ext>, no compression.
  • Cap per blob — 10 MB (consistent with F8 and D-005).
  • Quota — no in-app global quota v1. OS-level monitoring via Prometheus node_exporter. F12 archival pipeline will own retention/purge post-sprint-0.
  • Filesystem permissions — 0750, owned by the mimic system user.
  • Rationale — CAS deduplicates repeated C2 outputs (same whoami, same Get-Process snapshot) for free. Evidence stays flat because uploads are one-shot and tied to an engagement scope that we want to archive whole. Two pools mean we can wire independent quotas / retention policies in v2 without migration.
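
A minimal sketch of the content-addressed blob pool described above. The function names are hypothetical, and two points the decision leaves open are assumed here: the sha256 is computed over the uncompressed bytes (which is what makes identical outputs deduplicate), and the 10 MB cap applies to the uncompressed size. Filesystem ownership and the 0750 mode are left to the deployment.

```python
import gzip
import hashlib
from pathlib import Path

MAX_BLOB_BYTES = 10 * 1024 * 1024  # assumed reading of the 10 MB cap


def blob_path(root: Path, digest: str) -> Path:
    """Map a sha256 hex digest to <aa>/<bb>/<sha256>.gz under root."""
    return root / digest[:2] / digest[2:4] / f"{digest}.gz"


def store_blob(root: Path, data: bytes) -> str:
    """Store data gzip-compressed and return its sha256 hex digest."""
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError("blob exceeds the per-blob cap")
    digest = hashlib.sha256(data).hexdigest()
    path = blob_path(root, digest)
    if not path.exists():  # identical content is written only once
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(gzip.compress(data))
    return digest


def load_blob(root: Path, digest: str) -> bytes:
    """Read a blob back and return the original uncompressed bytes."""
    return gzip.decompress(blob_path(root, digest).read_bytes())
```

Storing the same `whoami` output twice returns the same digest and leaves a single file on disk, which is the deduplication the rationale refers to.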

Resolved open questions

  • Q-001 → D-011.
  • Q-002 → D-012.

D-013 — Hash-chain in audit_log from v1

Context. Spec H30 places the hash chain in v2; F13 / R-O5 only mandate the write-only role for v1. While implementing B0.7, adding the columns and chaining logic was a few lines and avoids a destructive migration later. Decision. prev_hash / row_hash columns ship from day one and are populated at insert time (SHA-256 of canonical record + previous hash). The chain verifier lands in v2. Cost is negligible (one SELECT + one SHA-256 per audit insert).
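
The chaining step can be illustrated with the sketch below. The canonical form (sorted-key compact JSON), the all-zero genesis hash, and the order of concatenation are assumptions, since the decision only states "SHA-256 of canonical record + previous hash". The `verify` function is included only to show why the chain is useful; the real verifier is deferred to v2.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed prev_hash for the very first row


def canonical(record: dict) -> bytes:
    """Deterministic byte form of a record (assumed: sorted compact JSON)."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def compute_row_hash(record: dict, prev_hash: str) -> str:
    """SHA-256 over the canonical record followed by the previous hash."""
    return hashlib.sha256(canonical(record) + prev_hash.encode("ascii")).hexdigest()


def append_row(chain: list, record: dict) -> dict:
    """Append a record, linking it to the last row's hash."""
    prev = chain[-1]["row_hash"] if chain else GENESIS_HASH
    row = {"record": record, "prev_hash": prev,
           "row_hash": compute_row_hash(record, prev)}
    chain.append(row)
    return row


def verify(chain: list) -> bool:
    """Recompute every hash in order; any edit breaks the chain."""
    prev = GENESIS_HASH
    for row in chain:
        if row["prev_hash"] != prev:
            return False
        if row["row_hash"] != compute_row_hash(row["record"], prev):
            return False
        prev = row["row_hash"]
    return True
```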

D-014 — Type-hinting strategy for the ORM

Context. Flask-SQLAlchemy 3 rejects a per-base type_annotation_map (the extension owns the registry). Decision. UUID primary keys use the explicit PG_UUID(as_uuid=True) type on UuidPkMixin. Foreign-key UUID columns rely on SQLAlchemy 2's built-in Uuid mapping via Mapped[uuid.UUID]. No type_annotation_map on the declarative base.

D-015 — User management permission

Decision: Add USER_MANAGE = "user.manage" to the Permission enum in backend/src/mimic/rbac/matrix.py. This permission gates all /api/v1/users CRUD endpoints (list, create, update/disable). It is granted exclusively to rt_lead (already holds ALL_PERMISSIONS — no change to GROUP_PERMISSIONS dict).

Why: The F11 matrix does not explicitly list "manage users" as a named permission, but spec §9 routes assign /admin (users, audit log) to Lead RT only. The CLI mimic-cli user create covered creation out-of-band, but sprint 2 adds a UI-facing REST endpoint, which requires a named permission for the @require_perm decorator and for testability.

How to apply: Backend uses @require_perm(Permission.USER_MANAGE) on all /api/v1/users endpoints. No change to GROUP_PERMISSIONS needed — rt_lead holds ALL_PERMISSIONS already. rt_operator and soc_analyst get 403 automatically.
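
A framework-free sketch of how such a gate behaves. The real decorator lives in the Flask backend and its signature is not shown in this log, so everything here is illustrative: the enum holds only the new permission, the rt_operator and soc_analyst sets are empty placeholders (the actual F11 matrix is not reproduced), and `Forbidden` and `list_users` are hypothetical names.

```python
from enum import Enum
from functools import wraps


class Permission(str, Enum):
    USER_MANAGE = "user.manage"


ALL_PERMISSIONS = frozenset(Permission)

# Placeholder matrix: only rt_lead's ALL_PERMISSIONS grant comes from D-015.
GROUP_PERMISSIONS = {
    "rt_lead": ALL_PERMISSIONS,
    "rt_operator": frozenset(),   # placeholder, not the real F11 set
    "soc_analyst": frozenset(),   # placeholder, not the real F11 set
}


class Forbidden(Exception):
    status_code = 403


def require_perm(perm):
    """Reject the call unless one of the caller's groups grants perm."""
    def decorator(view):
        @wraps(view)
        def wrapper(user_groups, *args, **kwargs):
            granted = set()
            for group in user_groups:
                granted |= GROUP_PERMISSIONS.get(group, frozenset())
            if perm not in granted:
                raise Forbidden(perm.value)
            return view(user_groups, *args, **kwargs)
        return wrapper
    return decorator


@require_perm(Permission.USER_MANAGE)
def list_users(user_groups):
    return ["alice", "bob"]  # stand-in for the real query
```

Because rt_lead already holds every member of the enum, adding USER_MANAGE to `Permission` is enough for the gate to pass for that group without touching the matrix.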

D-016 — Pagination envelope shape

Context. Sprint 2 adds two paginated endpoints (/users and /audit/log); sprint 3+ will paginate TTPs and scenarios. A consistent shape avoids two client-side parsers.

Decision. Standard envelope:

{ "items": [...], "total": <n>, "page": 1, "page_size": 50 }
  • Query params: ?page= (≥1, default 1), ?page_size= (default 50, max 200).
  • total is computed via a SELECT COUNT(*) against the same filtered query.
  • Existing non-paginated endpoints (GET /api/v1/engagements) are not migrated this sprint — changing them retroactively would break the frontend client that already shipped. They'll migrate together later via either a /api/v2/ bump or an opt-in ?paginate=true flag.

How to apply. mimic.schemas.pagination.Page[T] + PageQuery provide the shape and the validated query parsing; mimic.api._helpers.parse_page_query() is the canonical entrypoint inside blueprints.
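
The envelope and query rules can be sketched with standard-library dataclasses. These are standalone stand-ins that borrow the names above, not the project's actual `Page[T]`, `PageQuery`, or `parse_page_query()`. Rejecting out-of-range values with `ValueError` (rather than clamping them) is an assumption, and `paginate` is a hypothetical helper that slices an in-memory list where the real code would run a filtered query plus a COUNT(*).

```python
from dataclasses import asdict, dataclass
from typing import Generic, List, Mapping, Sequence, TypeVar

T = TypeVar("T")

DEFAULT_PAGE_SIZE = 50
MAX_PAGE_SIZE = 200


@dataclass(frozen=True)
class PageQuery:
    page: int = 1
    page_size: int = DEFAULT_PAGE_SIZE

    @property
    def offset(self) -> int:
        return (self.page - 1) * self.page_size


@dataclass
class Page(Generic[T]):
    items: List[T]
    total: int
    page: int
    page_size: int


def parse_page_query(args: Mapping[str, str]) -> PageQuery:
    """Parse ?page= and ?page_size= from query args, applying the bounds."""
    page = int(args.get("page", 1))
    page_size = int(args.get("page_size", DEFAULT_PAGE_SIZE))
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    return PageQuery(page=page, page_size=page_size)


def paginate(rows: Sequence[T], query: PageQuery) -> Page[T]:
    """Build the envelope; total counts all rows, not just this page."""
    window = rows[query.offset:query.offset + query.page_size]
    return Page(items=list(window), total=len(rows),
                page=query.page, page_size=query.page_size)
```

Calling `asdict()` on the returned `Page` yields a dict with exactly the four envelope keys, ready for JSON serialisation.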

D-017 — engagement_member.role as a free-form label

Context. The engagement_member.role column is String(40) (sprint 0). Sprint 2 needs to know what to validate at the API boundary.

Decision. Treat role as a free-form informational label, not as an authorization gate. Application-level RBAC stays the responsibility of the F11 group membership; role documents who-does-what on the engagement (e.g. "member", "lead-on-mission", "binôme A", "shadow"). Default to "member" when not provided. Validation: 1 to 40 chars, matching the String(40) column.

How to apply. EngagementMemberCreate uses a str field bounded to 1 to 40 chars, matching the String(40) column; no enum to maintain. If future code needs a typed role, introduce a separate column (do not repurpose this one).