mimic-big/docs/architecture.md
knacky df6294ed7b docs: align doc references with compose.yml rename (code-reviewer M1)
Three docs still referenced the old docker-compose.yml path. Replace
with compose.yml so a future reader cloning at this hash finds the
file at the documented path.

- CHANGELOG.md:31 — backend skeleton recap line.
- docs/architecture.md:28 — deployment artifacts note (D-010 scope).
- tasks/todo.md:9 — B0.1 task description.

Also adds a "CI follow-ups (sprint 1+)" section to tasks/todo.md
capturing the 3 MINOR + 6 NIT deferred from code-reviewer's review
of chore/podman-and-ci, plus a FERNET-KEY tracker for the secret
provisioning before c2_credential.config_fernet (D-004) is wired.
2026-05-22 19:49:16 +02:00


Mimic — sprint 0 architecture

This is the as-of-sprint-0 mirror of what's committed in feature/backend-skeleton and feature/frontend-skeleton. It does not invent beyond the code. If this document and the code disagree, the code wins: file a doc fix.

Authoritative sources outside this file:

  • Frozen spec: RT-SecondBrain/Projects/Mimic — Spec.md (vault).
  • Implementation arbitrations: tasks/spec-decisions.md (D-001..D-014).
  • Open questions: tasks/open-questions.md (Q-003..Q-005, deferred).

Repository layout

mimic/
├── backend/        # Flask + SQLAlchemy + Alembic + Jinja sandbox + RBAC + CLI
├── frontend/       # Vite + React 19 + TS strict + Tailwind 4 + TanStack Query 5
├── docs/           # This file (architecture). ADRs land in tasks/spec-decisions.md.
├── tasks/          # Sprint backlog (todo.md), decisions, open questions, lessons.
├── CHANGELOG.md    # Keep-a-Changelog flavoured.
└── README.md       # Entry point + status + stack.

Deployment artifacts (Ansible playbook, prod compose) live outside the repo in the RT infra repo (D-010). Mimic ships only Dockerfiles and a dev compose.yml.

Backend module tree

backend/src/mimic/
├── app.py            # Flask app factory (register blueprints, extensions, error handlers)
├── config.py         # Env-driven settings (no hardcoded secrets — NF-network)
├── extensions.py     # db (Flask-SQLAlchemy 3), login_manager, …
├── logging.py        # JSON structured logger
│
├── db/
│   ├── base.py             # Declarative Base + UuidPkMixin + TimestampsMixin
│   ├── types.py            # Python enums mirrored to Postgres ENUMs
│   ├── models/             # SQLAlchemy 2 typed mapped classes (§8 aggregates)
│   └── migrations/         # Alembic env + initial schema (202605210001)
│
├── rbac/
│   ├── matrix.py           # Permission enum + GROUP_PERMISSIONS (F11 source of truth)
│   └── decorators.py       # @require_perm Flask decorator
│
├── auth/
│   ├── identity.py         # current_user wiring (Flask-Login)
│   ├── password.py         # bcrypt helpers
│   └── soc_token.py        # 256-bit url-safe opaque tokens, bcrypt-hashed (D-006)
│
├── audit/
│   └── log.py              # Append-only writer + hash-chain (D-013)
│
├── templating/
│   ├── sandbox.py          # Jinja2 SandboxedEnvironment
│   └── filters.py          # regex_extract (google-re2, raise on no-match — D-011)
│
├── storage/
│   └── blob.py             # CAS sha256 + gzip pool (MIMIC_BLOB_ROOT — D-012)
│
├── connectors/
│   ├── base.py             # C2Connector ABC + dataclasses (Payload, TaskHandle, TaskResult)
│   ├── factory.py          # Factory keyed on engagement/scenario c2_type
│   └── payload_map.py      # payload_type → native command (Mythic populated, Home stub)
│
├── api/                    # Flat CRUD blueprints (sprint 0 — no orchestration yet)
│   ├── engagements.py
│   ├── hosts.py
│   ├── ttps.py
│   └── scenarios.py        # Enforces F3 invariant: host.c2_type == scenario.c2_type
│
├── schemas/                # Pydantic v2 schemas (request/response validation)
│
└── cli/                    # mimic-cli (Click)
    ├── db.py               # migrate / seed / dump / restore (NF-state R-O1)
    └── user.py             # user create

Frontend module tree

frontend/
├── src/
│   ├── main.tsx            # Vite entry, mounts <App />
│   ├── App.tsx             # Router root
│   ├── routes/             # Role-aware route definitions
│   ├── layout/             # Shell (sidebar, header, role-conditional menus)
│   ├── components/         # Wireframe components on mock data (F0.3)
│   ├── theme/              # Tailwind tokens (dark-first), Logo placeholder
│   └── lib/                # TanStack Query client + helpers
├── playwright.config.ts    # E2E skeleton (no real auth wired sprint 0)
└── vite.config.ts

Persistence — §8 aggregates

| Aggregate | Notes vs spec |
|---|---|
| user | §8 + display_name, last_login_at (bonus, OPSEC R-O3). |
| permission, group, group_permission, user_group | RBAC layout (D-003, D-008). 3 groups seeded by migration (rt_operator, rt_lead, soc_analyst). |
| engagement | §8 + free-text description. c2_type default = mythic. |
| engagement_member | Role is a free String(40); see "Known WARN" below. |
| c2_credential | Non-spec aggregate, arbitrated D-004 (Fernet-encrypted, versioned, rotation = insert + retire). |
| host | §8 verbatim. c2_type must match its scenario at run start (F3). |
| ttp | §8 + is_stealth_variant (R-O2 marker stripping) + is_published (TTP_PROMOTE F11). No ttp_version table (D-009 / H32). |
| scenario, scenario_step | §8 verbatim. (scenario_id, order_idx) unique. c2_type carried on scenario (H33). |
| run | snapshot_json (JSONB) is the single replay source (H32). |
| run_step | §8 + order_idx, resolved_payload_text (final payload with OPSEC marker; H34, audit-friendly). |
| run_step_cleanup | 1-1 with run_step via UNIQUE(run_step_id). Status enum pending/success/failed/partial (F15, R-T5). |
| detection, evidence | §8 verbatim. |
| report | content_sha256 referenced in PDF footer + JSON + MD (H19, H24). |
| soc_session | token_opaque renamed token_hash (bcrypt, D-006). Bonus last_used_at. |
| audit_log | §8 + prev_hash/row_hash (D-013: chain stored from v1, verifier in v2) + source_ip/user_agent/comment (forensic). |
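
The F3 invariant noted on host and scenario (host.c2_type == scenario.c2_type) can be pictured as a plain check. This is a minimal sketch, not the code in api/scenarios.py: the dataclasses, the C2TypeMismatch exception and the check_f3 helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    # Hypothetical, reduced view of the host aggregate.
    id: str
    c2_type: str


@dataclass(frozen=True)
class Scenario:
    # Hypothetical, reduced view of the scenario aggregate.
    id: str
    c2_type: str


class C2TypeMismatch(ValueError):
    """Hypothetical error raised when F3 is violated."""


def check_f3(scenario, hosts):
    """Raise if any targeted host uses a different C2 than the scenario."""
    mismatched = [h.id for h in hosts if h.c2_type != scenario.c2_type]
    if mismatched:
        raise C2TypeMismatch(
            f"scenario {scenario.id} is {scenario.c2_type}; "
            f"mismatched hosts: {', '.join(mismatched)}"
        )
```

Raising rather than filtering silently keeps a mismatch visible to the caller, which fits the document's fail-loud stance elsewhere (D-011).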

Postgres-level OPSEC

  • Audit append-only enforced at SQL level: mimic_audit_writer role gets INSERT only on audit_log; UPDATE/DELETE/TRUNCATE revoked from PUBLIC. Idempotent grants in the migration; the deployment playbook (Ansible) creates the roles (D-002, D-007, D-010).
  • Hash chain (D-013): every row stores row_hash = sha256(prev_hash || ts || actor_id || action || resource_type || resource_id || metadata_json). The verifier is not wired in sprint 0; the columns and writer logic ship now so that v2 can enable enforcement without a destructive migration.
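
To make the D-013 formula concrete, here is a minimal in-memory sketch of a chained writer and an offline replay check. It is not the code in audit/log.py. The genesis value, the field separator, the canonical JSON encoding and every function name below are assumptions; the document only specifies the field order and sha256.

```python
import hashlib
import json

# Assumption: the first row chains from an all-zero hash.
GENESIS_HASH = "0" * 64


def compute_row_hash(prev_hash, ts, actor_id, action,
                     resource_type, resource_id, metadata):
    """sha256 over the D-013 fields, in the documented order."""
    # Assumption: canonical JSON (sorted keys, compact) keeps the hash stable.
    metadata_json = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    fields = [prev_hash, ts, actor_id, action,
              resource_type, resource_id, metadata_json]
    # Assumption: a unit-separator between fields, so ("ab", "c") and
    # ("a", "bc") cannot collide. The doc only says "||" (concatenation).
    payload = "\x1f".join(fields).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_row(chain, **event):
    """Append an event, linking it to the previous row's hash."""
    prev_hash = chain[-1]["row_hash"] if chain else GENESIS_HASH
    row = dict(event, prev_hash=prev_hash)
    row["row_hash"] = compute_row_hash(prev_hash, **event)
    chain.append(row)
    return row


def verify_chain(chain):
    """Offline replay: index of the first inconsistent row, or None."""
    prev_hash = GENESIS_HASH
    for index, row in enumerate(chain):
        expected = compute_row_hash(
            prev_hash, row["ts"], row["actor_id"], row["action"],
            row["resource_type"], row["resource_id"], row["metadata"],
        )
        if row["prev_hash"] != prev_hash or row["row_hash"] != expected:
            return index
        prev_hash = row["row_hash"]
    return None
```

Editing any stored field of an earlier row changes its recomputed hash, so verify_chain reports that row's index.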

RBAC — F11 mirrored as code

backend/src/mimic/rbac/matrix.py is the canonical permission map. The spec's F11 table is transcribed 1:1 into GROUP_PERMISSIONS. The migration seeds exactly three groups (D-008):

| Group | Permission count | Notes |
|---|---|---|
| rt_operator | 10 | Includes ENGAGEMENT_READ (the (assignés) scope, i.e. assigned engagements only, is still to be applied at endpoint level). |
| rt_lead | All (~21) | ALL_PERMISSIONS. |
| soc_analyst | 3 | ENGAGEMENT_READ_OWN, DETECTION_ADD, REPORT_READ. |

Two F11 cells got a finer split (no semantic drift):

  • RUN_START ∥ RUN_CONTROL: both lead-only, together equivalent to the F11 cell "Démarrer / contrôler" (start / control).
  • ENGAGEMENT_READ (RT, full list) ∥ ENGAGEMENT_READ_OWN (SOC, own session scope).

Decorator: @require_perm(Permission.X) guards every Flask view. current_user is resolved by Flask-Login (local password, v1) or by a future Keycloak claim mapping (v2). SOC analysts authenticate through a separate token-based middleware (see "Authentication" below).
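
The matrix-plus-decorator pattern can be sketched without Flask. This is an illustration only. The real decorator reads current_user; here the caller's groups are passed explicitly. The enum lists only the permission names quoted in this document, and the rt_operator set is truncated to the one permission named here. PermissionDenied and permissions_for are hypothetical names.

```python
from enum import Enum
from functools import wraps


class Permission(Enum):
    # Only the names quoted in this document; the real enum has about 21.
    ENGAGEMENT_READ = "engagement_read"
    ENGAGEMENT_READ_OWN = "engagement_read_own"
    DETECTION_ADD = "detection_add"
    REPORT_READ = "report_read"
    RUN_START = "run_start"
    RUN_CONTROL = "run_control"


ALL_PERMISSIONS = frozenset(Permission)

GROUP_PERMISSIONS = {
    # Truncated: rt_operator holds 10 permissions, only one is named here.
    "rt_operator": frozenset({Permission.ENGAGEMENT_READ}),
    "rt_lead": ALL_PERMISSIONS,
    "soc_analyst": frozenset({
        Permission.ENGAGEMENT_READ_OWN,
        Permission.DETECTION_ADD,
        Permission.REPORT_READ,
    }),
}


class PermissionDenied(Exception):
    """Hypothetical error; a Flask view would typically answer 403."""


def permissions_for(groups):
    """Union of the permissions granted by each group name."""
    granted = set()
    for group in groups:
        granted |= GROUP_PERMISSIONS.get(group, frozenset())
    return granted


def require_perm(permission):
    """Reject the call unless the caller's groups grant `permission`."""
    def decorator(view):
        @wraps(view)
        def wrapper(user_groups, *args, **kwargs):
            if permission not in permissions_for(user_groups):
                raise PermissionDenied(permission.name)
            return view(user_groups, *args, **kwargs)
        return wrapper
    return decorator


@require_perm(Permission.RUN_START)
def start_run(user_groups, run_id):
    return f"run {run_id} started"
```

Because unknown group names resolve to an empty permission set, a group mapped from an external identity provider is denied by default until it appears in the matrix.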

Authentication

Two flows live side-by-side (D-003):

  • RT operators / leads — username + bcrypt password (v1) + Flask server-side session. v2: OIDC Keycloak claim-to-group mapping, no app code change (the RBAC tables already accept any group name).
  • SOC analysts — opaque 256-bit URL-safe tokens (secrets.token_urlsafe(32)), bcrypt-hashed in soc_session.token_hash, plain token returned once in the API response, delivered out-of-band (D-006). Scope: one engagement. Revocation = revoked_at set; immediate effect via DB check.
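
The SOC token lifecycle described above (mint once, store only a hash, scope to one engagement, revoke via revoked_at) can be sketched with the standard library. Loud assumption: D-006 specifies bcrypt, and this sketch swaps in PBKDF2-HMAC-SHA256 purely so it runs without third-party packages. The dict-based session, the salt field and the function names are hypothetical.

```python
import hashlib
import hmac
import os
import secrets
from datetime import datetime, timezone

PBKDF2_ROUNDS = 100_000  # illustrative work factor, not a project setting


def _hash_token(token, salt):
    # Stand-in for bcrypt (see lead-in); bcrypt manages its own salt.
    return hashlib.pbkdf2_hmac("sha256", token.encode("ascii"), salt,
                               PBKDF2_ROUNDS)


def issue_soc_token(engagement_id):
    """Return (plain_token, session); the plain token is shown only once."""
    token = secrets.token_urlsafe(32)  # 32 random bytes = 256 bits
    salt = os.urandom(16)
    session = {
        "engagement_id": engagement_id,
        "salt": salt,
        "token_hash": _hash_token(token, salt),
        "revoked_at": None,
        "last_used_at": None,
    }
    return token, session


def check_soc_token(token, session, engagement_id):
    """True only for a live session, the right engagement, the right token."""
    if session["revoked_at"] is not None:
        return False
    if session["engagement_id"] != engagement_id:
        return False
    candidate = _hash_token(token, session["salt"])
    if not hmac.compare_digest(candidate, session["token_hash"]):
        return False
    session["last_used_at"] = datetime.now(timezone.utc)
    return True
```

Checking revoked_at on every request is what gives revocation its immediate effect: no cache or token expiry has to elapse first.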

Mimic itself listens on localhost; HTTPS, TLS, and IP allowlisting are owned by the existing RT Caddy reverse proxy (D-007). Mimic-side: no HSTS, no cert mgmt.

Cleanup templating (F15)

Jinja2 SandboxedEnvironment in templating/sandbox.py with two custom accessors (D-005):

  • {{ outputs.text }} — pulls run_step.output_text (stdout, UTF-8 with latin-1 fallback, silent refusal on non-decodable).
  • {{ outputs.blob("<key>") }} — pulls a blob from MIMIC_BLOB_ROOT, hard cap 10 MB.

Custom filter regex_extract(text, pattern, *, group=1, name=None) — google-re2 (no backrefs, linear time), first match only, raises on no-match (D-011). Templating drift fails loudly at step run.
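
A minimal sketch of the filter's contract, using the signature given above. Assumption: the standard-library re module stands in for google-re2 so the example runs without extra packages. Unlike RE2, re accepts backreferences and gives no linear-time guarantee. The RegexExtractError name is hypothetical.

```python
import re  # stand-in for google-re2, see lead-in


class RegexExtractError(ValueError):
    """Hypothetical error raised when the pattern yields nothing."""


def regex_extract(text, pattern, *, group=1, name=None):
    """Return one capture group from the first match; raise on no-match."""
    match = re.search(pattern, text)  # first match only
    if match is None:
        raise RegexExtractError(f"pattern {pattern!r} matched nothing")
    value = match.group(name if name is not None else group)
    if value is None:
        # The pattern matched, but the requested group did not participate.
        raise RegexExtractError(f"group missing for pattern {pattern!r}")
    return value
```

With Jinja2, such a function would typically be registered through the environment's filters mapping, for example env.filters["regex_extract"] = regex_extract, after which a cleanup template could pipe {{ outputs.text }} through it.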

Resolved command lands in run_step_cleanup.resolved_command_text (the literal sent to the C2) and run_step.resolved_payload_text for the payload itself (audit + NF-OPSEC marker visibility).

C2 abstraction

              ┌────────────────────────────────────────────┐
              │ orchestrator (sprint > 0)                   │
              │   start_step(run_step) → polling 500 ms     │
              └────────────────────────────────────────────┘
                        │ uses
                        ▼
              ┌────────────────────────────────────────────┐
              │ connectors.factory  (keyed on c2_type)      │
              └────────────────────────────────────────────┘
                        │ instantiates
                        ▼
       ┌──────────────────────┐         ┌──────────────────────┐
       │ MythicConnector       │         │ HomeConnector         │
       │ (PR1 — pending docs) │         │ (PR2 — stub          │
       │ Mythic GraphQL+REST   │         │   NotImplementedError)│
       └──────────────────────┘         └──────────────────────┘
                        │
              authenticate / list_hosts / execute_task /
              get_task_result / cancel_task / execute_cleanup
              (stream_task_output optional v1, exploited v2)
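
The connector contract in the diagram can be sketched as an abstract base class plus a factory keyed on c2_type. This is an illustration, not the contents of connectors/base.py or connectors/factory.py. The dataclass fields, method parameters, registry name and get_connector are assumptions; only the class and method names come from the diagram.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Payload:
    payload_type: str      # hypothetical fields
    arguments: str = ""


@dataclass(frozen=True)
class TaskHandle:
    task_id: str           # hypothetical field


@dataclass(frozen=True)
class TaskResult:
    status: str            # hypothetical fields
    output_text: str = ""


class C2Connector(ABC):
    @abstractmethod
    def authenticate(self): ...

    @abstractmethod
    def list_hosts(self): ...

    @abstractmethod
    def execute_task(self, host_id, payload): ...

    @abstractmethod
    def get_task_result(self, handle): ...

    @abstractmethod
    def cancel_task(self, handle): ...

    @abstractmethod
    def execute_cleanup(self, host_id, command_text): ...

    def stream_task_output(self, handle):
        # Optional in v1, so it is not abstract.
        raise NotImplementedError("streaming is optional in v1")


def _stub(self, *args, **kwargs):
    raise NotImplementedError("Home connector is a stub (PR2)")


class HomeConnector(C2Connector):
    # Every required method exists but raises, mirroring the PR2 stub.
    authenticate = list_hosts = execute_task = _stub
    get_task_result = cancel_task = execute_cleanup = _stub


# A MythicConnector would be registered under "mythic" once PR1 lands.
_REGISTRY = {"home": HomeConnector}


def get_connector(c2_type):
    """Instantiate the connector registered for this c2_type."""
    connector_cls = _REGISTRY.get(c2_type)
    if connector_cls is None:
        raise ValueError(f"unknown c2_type: {c2_type!r}")
    return connector_cls()
```

Keeping the orchestrator dependent only on C2Connector means a new C2 is added by registering one more class in the factory.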

payload_type is a neutral internal enum (§7 of spec). Mapping to native commands lives in connectors/payload_map.py — Mythic populated, Home empty (blocked by PR2). UnsupportedPayloadType raised on miss → UI surfaces "incompatible C2".
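
A lookup with a loud miss could be as small as the sketch below. The enum members and the native command strings are placeholders invented for illustration, not the contents of connectors/payload_map.py. Only the UnsupportedPayloadType name and the "Mythic populated, Home empty" shape come from this document.

```python
from enum import Enum


class PayloadType(Enum):
    # Hypothetical members; the real enum follows §7 of the spec.
    SHELL_COMMAND = "shell_command"
    FILE_UPLOAD = "file_upload"


class UnsupportedPayloadType(Exception):
    pass


PAYLOAD_MAP = {
    # Placeholder native commands, for illustration only.
    "mythic": {
        PayloadType.SHELL_COMMAND: "shell",
        PayloadType.FILE_UPLOAD: "upload",
    },
    "home": {},  # empty until PR2
}


def native_command(c2_type, payload_type):
    """Translate a neutral payload_type into the C2's native command."""
    try:
        return PAYLOAD_MAP[c2_type][payload_type]
    except KeyError:
        raise UnsupportedPayloadType(
            f"{payload_type.name} is not supported by {c2_type}"
        ) from None
```

An empty mapping for Home makes every lookup raise, which is the behaviour the UI needs in order to show "incompatible C2" rather than send an untranslated command.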

Storage — file pools

Two filesystem pools (D-012):

$MIMIC_BLOB_ROOT     ── content-addressed (CAS) + gzip
  └── <aa>/<bb>/<sha256>.gz       run_step.output_blob_ref → <sha256>

$MIMIC_EVIDENCE_ROOT ── flat per engagement
  └── <engagement_id>/<evidence_id>.<ext>

Per-blob cap 10 MB. No global quota v1 — OS-level monitoring (node_exporter). F12 archival CLI will own retention (post-sprint-0).
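
The content-addressed layout can be sketched as follows. Assumptions: <aa> and <bb> are the first two hex pairs of the sha256 digest, the 10 MB cap is read as 10 MiB and applied to the uncompressed bytes, and the function names are hypothetical. storage/blob.py may differ on all three points.

```python
import gzip
import hashlib
from pathlib import Path

# Assumption: "10 MB" read as 10 MiB, measured before compression.
MAX_BLOB_BYTES = 10 * 1024 * 1024


def blob_path(root, digest):
    """<root>/<aa>/<bb>/<sha256>.gz, sharded on the first two hex pairs."""
    return Path(root) / digest[:2] / digest[2:4] / f"{digest}.gz"


def put_blob(root, data):
    """Store bytes under their sha256; identical content is stored once."""
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError("blob exceeds the 10 MB cap")
    digest = hashlib.sha256(data).hexdigest()
    path = blob_path(root, digest)
    if not path.exists():  # same content, same path: nothing to write
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(gzip.compress(data))
    return digest


def get_blob(root, digest):
    """Read back and decompress the bytes stored under `digest`."""
    return gzip.decompress(blob_path(root, digest).read_bytes())
```

The returned digest is the value a column such as run_step.output_blob_ref would hold; two steps producing identical output then share one file on disk.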

Sprint 0 happy-path flow (current scope)

RT operator logs in              ── auth/identity (bcrypt + Flask session)
        │
        ▼
GET /api/v1/engagements           ── api/engagements:list_engagements
                                     @require_perm(ENGAGEMENT_READ)
                                     [WARN: scope (assignés) not applied — see below]
        │
        ▼
POST /api/v1/engagements          ── creates draft engagement
POST /api/v1/engagements/:id/hosts ── seeds host inventory (manual v1)
POST /library/ttps                ── creates TTP draft
POST /engagements/:id/scenarios   ── composes scenario (c2_type fixed at create)
POST /engagements/:id/scenarios/:sid/steps ── adds ordered steps
        │
        ▼
[orchestration, F15 cleanup, F7 cotation, F9 report : sprint > 0]

WebSocket cockpit, run orchestrator, cleanup wiring, report renderer, OIDC, and the two real C2 connectors are all post-sprint-0.

Known WARN — to revisit later

  • audit_log chain has no runtime verifier. Columns and write logic ship per D-013, but tampering detection is v2. Until then, the chain is a forensic trail (replay offline), not an enforcement trail. Owner: whoever picks up the H30 v2 ticket.
  • engagement_member.role is String(40) — free text, no enum. Risk: future drift. Watch when implementing F11 enforcement on the member.manage endpoints.
  • GET /engagements ignores the (assignés) scope, i.e. the restriction to assigned engagements: @require_perm alone admits any rt_operator. The scope check (an engagement_member join) is a code-reviewer item, flagged MAJOR by team-lead. Sprint 0 leaves the endpoint flat by design; F11 closure ships with that fix.
  • Q-003 / Q-004 / Q-005 deferred — see tasks/open-questions.md. None block sprint 0; each carries a re-open when … trigger.

Decisions anticipated vs v2 (for future-me)

| Sprint 0 ships | Spec said | Why |
|---|---|---|
| audit_log.prev_hash / row_hash columns + chained writer | H30 puts hash chain in v2 | D-013: adding columns later is a destructive migration; verifier stays v2. |
| c2_credential table (versioned, retiring) | Spec §8 omits it | D-004: separating Fernet-encrypted blobs from the application engagement metadata is safer than embedding config_json. |
| Two storage pools (blobs/ CAS + evidence/ flat) | H20 says "local disk v1" | D-012: split keeps deduplication for C2 outputs and clean archival for evidence; OS-level quota only. |
| Group-based RBAC tables from day 1 | F11 lists fixed roles | D-003 + D-008: preserves F11 semantics exactly while making OIDC v2 a config change, not a code change. |

Pointers

  • Frozen spec: RT-SecondBrain/Projects/Mimic — Spec.md (vault).
  • Decisions log: tasks/spec-decisions.md (D-001..D-014).
  • Open questions: tasks/open-questions.md (Q-003..Q-005 deferred).
  • Sprint 0 backlog: tasks/todo.md.
  • Changes journal: CHANGELOG.md.