mimic-big

Author	SHA1	Message	Date
knacky	3a3e3ff0ec	feat(backend): wire created_by_id, audit log, F11 scope into CRUD (MA4/5/6) Three follow-ups on the flat CRUD blueprints triggered by code-review + spec-analyst (MA4, MA5, MA6). MA4 (`created_by_id`): engagements, TTPs and scenarios now record the creator from `current_user.id` instead of leaving the FK NULL. The new `api._helpers.current_user_id()` exposes the UUID safely (returns None when the request is unauthenticated, e.g. during /healthz). MA5 (audit log integration): `api._helpers.audit_write(...)` wraps the hash-chained `AuditWriter` and is called after every successful commit in the 4 blueprints (engagement / host / ttp / scenario incl. step), recording the actor, action, resource type/id, IP, user agent, and small metadata (field list, names, engagement scope). F13 "Toute mutation tracée" ("every mutation traced") now holds end-to-end. MA6 (RT operator scope on engagements): F11 limits RT operators to "engagements assignés" (assigned engagements). The previous implementation let them list / read every engagement and every nested resource. Fix: `is_rt_lead()` short-circuits the check for RT leads; otherwise a membership probe against `engagement_member` runs on every list/read and on `_engagement_or_404` in `hosts.py` and `scenarios.py`. Listings now `JOIN engagement_member` and filter by `current_user.id`. `audit_write` casts `db.session` (a `scoped_session` proxy) to the unwrapped `sqlalchemy.orm.Session` that `AuditWriter` expects; the two are interchangeable at runtime. The promotion-perm check on TTPs no longer needs a lazy `flask_login` import since the decorator scope already brings `current_user` in.	2026-05-22 05:24:54 +02:00
knacky	36c1ed5ffb	fix(backend): freeze F11 matrix inline in the initial migration (MA3) Code-review MAJOR MA3. The initial Alembic migration imported the live `mimic.rbac.matrix.GROUP_PERMISSIONS` to seed the `permission` / `group` / `group_permission` rows. That breaks the Alembic invariant "a migration produces the same schema regardless of when you replay it": a future tweak to the runtime matrix would silently change the seeded baseline on a fresh DB. Two changes: 1. The migration now carries an inline frozen snapshot of the F11 matrix (`_PERMISSIONS_FROZEN`, `_GROUP_PERMISSIONS_FROZEN`, `_GROUP_DESCRIPTIONS`). The seed reads from these tuples/dicts only. If the canonical matrix evolves, the next migration is responsible for the delta. 2. A new unit test `test_migration_seed_matches_current_matrix` enforces that the frozen seed equals the runtime `Permission` enum and `GROUP_PERMISSIONS` mapping. Drift now fails CI loudly with a hint to write a new migration instead of editing the existing one. Also: docstring no longer mentions `ttp_version` (M8 follow-up).	2026-05-22 05:24:37 +02:00
knacky	feadad850b	fix(backend): stream store_blob and enforce max_bytes mid-write (MA2) Code-review MAJOR MA2. The previous `store_blob(root, data: bytes)` signature forced the entire payload into RAM before the 10 MB cap was checked — a hostile-large output blob could OOM the worker before the limit even fired. New signature: `store_blob(root, stream, *, max_bytes=10_485_760)`. The implementation: - reads from `stream` in 64 KB chunks; - updates the sha256 + writes to `<root>/.tmp-<pid>-<rand>.gz` incrementally; - raises `BlobTooLarge(max_bytes)` as soon as the running total crosses the cap, then unlinks the partial temp file via `contextlib.suppress`; - atomic-renames the temp file to the CAS path `<aa>/<bb>/<sha256>.gz` once the stream finishes; - sets `0o750` on the directory and `0o640` on the file with explicit `os.chmod` (does not rely on the process umask). Updated unit tests cover: BlobTooLarge enforcement (with temp-file cleanup), multi-chunk happy path (1.5 MB payload exercising the 64 KB loop), and `max_bytes <= 0` validation.	2026-05-22 05:24:25 +02:00
knacky	6e803a482a	fix(backend): stop seeding the audit-writer role via postgres-init (MA1) Code-review MAJOR MA1. The previous `scripts/postgres-init/00-roles.sql` hardcoded a `CHANGE_ME` password for `mimic_audit_writer` and was bind-mounted into the dev Postgres container; on prod boxes this risks lingering as the real credential. - The init script was removed in the previous commit alongside the dropped scripts dir. - `docker-compose.yml` no longer mounts a `docker-entrypoint-initdb.d` directory; the audit-writer role provisioning is the Ansible playbook's responsibility (D-010). - `backend/README.md` documents the manual one-shot `CREATE ROLE` command for local dev with a placeholder password. Net effect: no `CHANGE_ME` credential reaches a container image / git history. The Alembic migration's `audit_log` grant block stays idempotent — it is a no-op when the role is absent.	2026-05-22 05:24:13 +02:00
knacky	90f8141cfc	fix(backend): make google-re2 a hard dependency, drop re fallback (B1) Code-review BLOCKER B1. Reaffirms D-011: a `re` stdlib fallback defeats the OPSEC-safe-regex guarantee because hostile C2 output can trigger catastrophic backtracking. The `[:1MB]` slice cap does not mitigate that — re-evaluating a malicious pattern over 1 MB of attacker-controlled text is still a worker freeze. - `mimic.templating.filters` now imports `re2` unconditionally and raises `RuntimeError` at module load if the binding is absent. No `re` import, no `_HAS_RE2` branch, no `_FALLBACK_MAX_INPUT`. - `pyproject.toml` already pinned `google-re2 >= 1.1, < 2.0`; this commit hardens the import path to actually enforce it. - New test `test_re2_is_required` asserts the binding is wired in.	2026-05-22 05:23:47 +02:00
knacky	adab8a58e7	chore(backend): mypy strict clean + ruff format pass Pre-merge sanity per devops checklist (ruff format --check, mypy --strict). Type fixes: - ORM models: `Mapped[dict]` → `Mapped[dict[str, Any]]` (audit, scenario, run, report, ttp, detection.artifact_files_json). Equivalent on Pydantic DTOs (TtpBase.params_schema_json, ScenarioStepBase.params_override_json). - Rename `TtpRead.current_version` → `TtpRead.version` to mirror the ORM column (which itself was renamed in D-009 cleanup). - Flask blueprints: add `-> ResponseReturnValue` to every view, plus typed UUID params on `_validate_step_consistency`. - `templating/filters.py`: rewrite the conditional re2 import so mypy can narrow the union (`ModuleType | None`); the runtime branch on `_re2 is not None` removes the unused-ignore that was triggered by warn_unused_ignores. - `pyproject.toml`: add `flask_login.*` and `pythonjsonlogger.*` to the `[[tool.mypy.overrides]]` `ignore_missing_imports` list (both ship without a typed marker). - Misc: drop stale `# type: ignore` comments (`app.py:36`, `rbac/decorators.py:35`) flagged by `warn_unused_ignores`. Keep `logging.JsonFormatter` ignore because the symbol exists at runtime but is not re-exported through the typed surface. Formatting: - `ruff format` applied (15 files normalized; line-length unchanged at 100). Verification on this commit: - `ruff check` → All checks passed. - `ruff format --check` → 68 files already formatted. - `mypy --strict src` → Success: no issues found in 54 source files. - `pytest tests/unit` → 49 passed.	2026-05-22 05:10:51 +02:00
knacky	12d131c826	feat(backend): add content-addressed gzip blob store (D-012) Two on-disk pools per D-012: - `MIMIC_BLOB_ROOT` (default `/var/lib/mimic/blobs/`) holds C2 polling output blobs, content-addressed gzip layout `<aa>/<bb>/<sha256>.gz`. - `MIMIC_EVIDENCE_ROOT` (default `/var/lib/mimic/evidence/`) reserved for user-uploaded evidence (flat per-engagement, no compression). Wired only in config + .env.example here; F8 endpoint lands later. `mimic.storage.blob`: - `blob_path(root, sha256_hex)` validates the digest and returns the CAS path. Raises ValueError on a malformed digest (length != 64 or non-hex). - `store_blob(root, data)` hashes, gzip-compresses, atomically writes to `<aa>/<bb>/<sha256>.gz` (0o750 dir perms, 0o640 file perms). Idempotent: duplicate writes leave mtime untouched. 5 new unit tests cover happy path, deduplication, idempotency, malformed digest, and the two-byte-pair directory layout.	2026-05-21 20:44:59 +02:00
knacky	162b6988f8	fix(backend): align regex_extract + outputs.blob() with D-011/D-012 D-011, `regex_extract(text, pattern, *, group=1, name=None)`: - engine google-re2 (linear-time, ReDoS-safe), `re` fallback with 1 MB cap. - first match only. - no match → raises Jinja2 `TemplateError` (no silent default: cleanup templates must fail loud when source string drifts). - default capture is group 1 with fallback to group(0) when the pattern has no groups; named groups via `name="<name>"`. D-012, `outputs.blob()`: - reads the gzip-compressed CAS file from `MIMIC_BLOB_ROOT`. - 10 MB cap is applied after decompression. - decode UTF-8 with latin-1 fallback; never raises (missing / corrupt / non-gzip blobs return empty string, logged at WARNING). Unit tests rewritten to cover both the new fail-loud regex contract and the gzip read path. 49 unit tests pass; ruff clean.	2026-05-21 20:44:48 +02:00
knacky	d470db97d9	fix(backend): align with D-008/D-009 (drop ttp_version, seed F11 matrix) D-009 reaffirms spec H32: no `ttp_version` table. Replayability lives solely on `run.snapshot_json`. The previous initial migration introduced a separate `ttp_version` aggregate by mistake — removed here. D-008 requires the bootstrap to seed exactly the three F11 groups (`rt_operator`, `rt_lead`, `soc_analyst`) with exactly the F11 permission matrix. The migration now: - inserts every `Permission` enum value into the `permission` table, - inserts the three groups with deterministic uuid5(NAMESPACE_DNS, ...) ids, - inserts the matching `group_permission` rows from GROUP_PERMISSIONS. Also renames `ttp.current_version` to `ttp.version` (matches §8 spec column name; the value remains informational per H32 / D-009).	2026-05-21 20:44:37 +02:00
knacky	887182cfd7	docs: update CHANGELOG + tasks for the backend skeleton sprint 0 - CHANGELOG.md: detail every B0.1..B0.8 deliverable + spec deltas D-008 (ttp_version coexists), D-009 (audit hash chain v1), D-010 (no type_annotation_map on declarative base). - tasks/todo.md: tick every B0.x item. - tasks/spec-decisions.md: log D-008, D-009, D-010 alongside the pre-existing D-001..D-007.	2026-05-21 20:39:06 +02:00
knacky	5d9415bb9f	test(backend): add pytest baseline (B0.8) Unit (SQLite, pure logic): - test_templating.py: Jinja2 sandbox, regex_extract, strict-undefined, sandbox blocks attribute-access escape, output blob 10 MB cap. - test_password.py: bcrypt hash + verify, empty / malformed handling. - test_soc_token.py: 256-bit url-safe token + bcrypt verification. - test_rbac_matrix.py: F11 invariants (lead ⊇ operator, SOC restricted to detection + report-read, audit_read & ttp_promote lead-only). - test_connector_factory.py: register / build / double-register-rejected, TaskStatus terminal helper, Mythic mapping vs empty Home mapping. - test_audit_hash.py: SHA-256 chain helper is deterministic and reacts to prev_hash / metadata changes. Integration scaffold (testcontainers Postgres): - tests/integration/conftest.py spins up postgres:16-alpine, monkeypatches MIMIC_DATABASE_URL, creates a Flask app + db.create_all. - test_healthz.py: end-to-end smoke through the Flask test client. 38 unit tests pass; ruff clean.	2026-05-21 20:36:03 +02:00
knacky	9fa4d61304	feat(backend): add Flask app factory, audit writer, flat CRUD + CLI (B0.7) - Flask app factory wires SQLAlchemy / Migrate / Login / SocketIO and registers every blueprint. /healthz smoke endpoint included. - Pydantic 2 DTOs (request/response) for engagement / host / TTP / scenario aggregates with from_attributes=True conversion. - Flat CRUD blueprints under /api/v1/: * engagements (list / create / get / put / delete-as-archive) * hosts (engagement-scoped CRUD) * library/ttps (CRUD; promote requires the lead-only TTP_PROMOTE) * scenarios + steps (F3 invariant enforced: host.c2_type must match scenario.c2_type at compose time, 400 otherwise). - @require_perm guards every endpoint per the F11 matrix. - audit/ writer is hash-chained from v1 (SHA-256 of canonical record plus previous hash). The SQL-level write-only role enforcement ships in the deploy playbook (idempotent grants run at migration time). - mimic-cli (click): user create (seeds RT operator/lead with group membership), db dump / db restore (manual pg_dump/pg_restore, R-O1). No orchestrator, no WebSocket, no report generation — those land after PR1/PR2/PR3.	2026-05-21 20:36:03 +02:00
knacky	7f4ad85a68	feat(backend): add local auth + group-based RBAC matching F11 (B0.6) - Permission enum + GroupName enum + GROUP_PERMISSIONS mapping mirror the F11 matrix in code (verifiable against the spec table in tests). - @require_perm decorator: 401 on anonymous, 403 on missing permission, passes through otherwise. Pure-function user_has() for unit-testing. - AuthUser (Flask-Login wrapper) resolves the permission set from a User's groups; load_user is the Flask-Login user_loader. - bcrypt password hashing helpers (12 rounds by default, configurable). - SOC opaque token (D-006): secrets.token_urlsafe(32), bcrypt-hashed at rest, plain value returned once at creation and never re-displayable. - Group-based RBAC from day one (D-003) — Keycloak OIDC in v2 maps onto the same group model.	2026-05-21 20:36:03 +02:00
knacky	104d73143a	feat(backend): add Jinja2 sandbox + regex_extract filter (B0.5) - CleanupRenderer wraps jinja2.sandbox.SandboxedEnvironment with StrictUndefined (no autoescape — shell context, not HTML). - Custom filter regex_extract(text, pattern, group=1, default='') uses google-re2 for linear-time matching (ReDoS-safe) and falls back to re with a 1 MB input cap when re2 is absent. - StepOutputs exposes {{ outputs.text }} and {{ outputs.blob('name') }}. blob() decodes UTF-8 with latin-1 fallback, hard-capped at 10 MB (consistent with F8 evidence limit, D-005). - render_cleanup() is the module-level convenience wrapper.	2026-05-21 20:36:03 +02:00
knacky	20112d61ff	feat(backend): add C2Connector ABC + payload mapping + factory (B0.4) - abstract C2Connector with authenticate / list_hosts / execute_task / get_task_result / cancel_task / execute_cleanup; stream_task_output optional v1 (NotImplementedError). - Payload / TaskHandle / TaskResult / TaskStatus frozen dataclasses. - UnsupportedPayloadType raised when no native command maps to the chosen (c2_type, payload_type) pair. - Mythic payload_type → native command map populated (spec §7 table). - HOME map left empty until PR2 is closed. - ConnectorFactory: register_connector decorator + build(c2_type) that instantiates + authenticates via an injected config resolver. No real Mythic / Home implementations land in this sprint.	2026-05-21 20:36:03 +02:00
knacky	22d37fb240	feat(backend): add §8 data model + Alembic baseline (B0.2, B0.3) - SQLAlchemy 2 typed mapped classes for every spec §8 aggregate: engagement, c2_credential, host, user, group, group_permission, user_group, engagement_member, ttp, ttp_version, scenario, scenario_step, run, run_step, run_step_cleanup, detection, evidence, report, soc_session, audit_log. - Shared mixins: UuidPkMixin (PG_UUID(as_uuid=True)) + TimestampsMixin. - StrEnum types covering every spec enum (C2Type, PayloadType, UserType, EngagementStatus, HostStatus, TtpSource, RunStatus, RunStepStatus, CleanupStatus, DetectionLevel, DetectionSource, EvidenceStatus). - Alembic baseline migration 202605210001_initial_schema: creates every table, enum, index, and idempotent grants for the audit_log write-only Postgres role (mimic_audit_writer). - Audit log carries prev_hash / row_hash from v1 (D-009). - ttp_version table coexists with run.snapshot_json (D-008, overrides H32).	2026-05-21 20:36:03 +02:00
knacky	a93c959444	chore(backend): bootstrap Python 3.12+ project skeleton (B0.1) - pyproject.toml with ruff + mypy strict + pytest + coverage >=70% - Makefile with Docker/Podman auto-detect - Multi-stage Dockerfile (python:3.12-slim-bookworm, non-root user) - docker-compose.yml for Postgres dev DB - alembic.ini wired to src/mimic/db/migrations - scripts/postgres-init/00-roles.sql seeds the audit writer role - .env.example documents every MIMIC_* var (no secrets committed)	2026-05-21 20:36:03 +02:00
knacky	2ead16114d	docs(spec): land D-011 (regex_extract) + D-012 (output_blob_ref storage) D-011 freezes the regex_extract Jinja filter signature `regex_extract(text, pattern, *, group=1, name=None)`, google-re2 engine, raise on no-match — unblocks backend B0.5 templating sandbox. D-012 splits storage in two pools: `blobs/` (CAS sha256 + gzip) for C2 binary outputs and `evidence/` (flat per engagement) for user uploads, 10 MB per-blob cap, no global quota v1. Q-001 and Q-002 removed from open-questions.md (resolved). Q-003/Q-004/Q-005 marked `deferred` with explicit re-open conditions.	2026-05-21 20:20:27 +02:00
knacky	524c6f1eb4	docs(spec): track open spec questions Q-001..Q-005 for sprint 0 Captures the four grey areas team-lead flagged in the sprint 0 brief (regex_extract semantics, output_blob_ref storage, /hosts/sync merge behaviour, payload_type↔home-C2 mapping) plus stale-host policy. No decisions taken: each entry lists options, a recommended default if no decision is reached, and a "becomes blocking when…" trigger. Resolved questions will move to spec-decisions.md as D-NNN entries.	2026-05-21 20:18:57 +02:00
knacky	4ecf4b0b0e	chore: tighten gitignore, align README stack, formalize D-010 (Ansible) - .gitignore: add Keycloak/Mythic/Fernet secret patterns (pfx, p12, token, kdbx, credentials.json, secrets.json, service-account*.json), MSVC artifacts (lib, exp, idb, ilk, tlog), dedup dist/build/ between Python and Node blocks. - README.md: align Storage line on H38 (testcontainers Postgres for Postgres-specific behavior, incl. unit tests of audit log / RBAC / write-only role). - README.md: align Deploy line on D-007/D-010: Docker + Ansible playbook, reverse proxy explicitly out-of-Mimic. - README.md: add proprietary internal use notice. - CHANGELOG.md: convert markdown link to inline URL (no dangling reference). - tasks/spec-decisions.md: add D-010 (Ansible for deployment playbook). Addresses code-reviewer M1/M2/M3 + N2/N3/N4/N6 on commit `047583e`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 20:16:40 +02:00
knacky	b144c041a7	docs: drop ttp_version from B0.2 + seed groups requirement per D-008/D-009 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 20:14:44 +02:00
knacky	d03ba062bf	docs(spec): add D-008 (group RBAC vs F11) and D-009 (no ttp_version table) D-008 frames the group-based RBAC layout as an OIDC-prep mechanism that must seed exactly the three F11 spec roles and their canonical permission matrix. Custom groups remain out of v1 scope. D-009 reaffirms H32: replayability lives only on run.snapshot_json. The ttp_version table listed in B0.2 must be dropped from the initial migration.	2026-05-21 20:13:14 +02:00
knacky	047583eb9c	chore: bootstrap repo skeleton with sprint 0 plan - .gitignore (Python, Node, RT/maldev hygiene, secrets) - README.md (project framing, stack, conventions, status) - CHANGELOG.md (team kickoff decisions Q1/Q2/Q3, T2/T3/T4, auth strategy) - tasks/spec-decisions.md (D-001..D-007 arbitrations on top of frozen spec) - tasks/todo.md (sprint 0 backlog: B0.* / F0.* / S0.* / R0.*) - tasks/lessons.md (empty, populated as work progresses) - backend/ frontend/ docs/ scaffolding Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 20:10:47 +02:00
knacky	030a018970	chore: init repo	2026-05-21 20:07:38 +02:00

24 Commits
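
The streaming write described in commit `feadad850b` (MA2) can be sketched as below. This is a minimal illustration, not the repository's code. The names `store_blob`, `blob_path`, `BlobTooLarge`, `max_bytes`, the 64 KB chunk size, the temp-file naming and the permission bits come from the commit messages. Hashing the uncompressed bytes, returning the hex digest, treating "crosses the cap" as strictly greater than `max_bytes`, and discarding the temp file on a duplicate are assumptions.

```python
import contextlib
import gzip
import hashlib
import os
import secrets
from pathlib import Path
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # 64 KB read size, per the commit message


class BlobTooLarge(Exception):
    """Raised as soon as the running total exceeds max_bytes."""

    def __init__(self, max_bytes: int) -> None:
        super().__init__(f"blob exceeds {max_bytes} bytes")
        self.max_bytes = max_bytes


def blob_path(root: Path, sha256_hex: str) -> Path:
    """Return the CAS path <root>/<aa>/<bb>/<sha256>.gz for a valid digest."""
    digest = sha256_hex.lower()
    if len(digest) != 64 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("malformed sha256 digest")
    return root / digest[:2] / digest[2:4] / f"{digest}.gz"


def store_blob(root: Path, stream: BinaryIO, *, max_bytes: int = 10_485_760) -> str:
    """Stream `stream` into the gzip CAS layout; return the sha256 hex digest.

    Assumption: the digest covers the uncompressed payload.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    root.mkdir(parents=True, exist_ok=True)
    tmp = root / f".tmp-{os.getpid()}-{secrets.token_hex(8)}.gz"
    hasher = hashlib.sha256()
    total = 0
    try:
        with gzip.open(tmp, "wb") as out:
            while chunk := stream.read(CHUNK_SIZE):
                total += len(chunk)
                if total > max_bytes:  # cap enforced mid-write, not after buffering
                    raise BlobTooLarge(max_bytes)
                hasher.update(chunk)
                out.write(chunk)
        digest = hasher.hexdigest()
        final = blob_path(root, digest)
        # Explicit modes on both directory levels; no reliance on the umask.
        for directory in (final.parent.parent, final.parent):
            directory.mkdir(exist_ok=True)
            os.chmod(directory, 0o750)
        if final.exists():
            tmp.unlink()  # duplicate content: leave the existing file untouched
        else:
            os.chmod(tmp, 0o640)
            os.replace(tmp, final)  # atomic rename within the same filesystem
        return digest
    except BaseException:
        # Remove the partial temp file on any failure, then re-raise.
        with contextlib.suppress(FileNotFoundError):
            tmp.unlink()
        raise
```

Calling `store_blob(Path("/tmp/blobs"), io.BytesIO(data))` with an in-memory stream exercises the same loop as a socket or file handle would. The cap check runs before each chunk is hashed or written, so an oversized payload never occupies more than one chunk of memory.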