Commit Graph

3 Commits

Author SHA1 Message Date
knacky
12d131c826 feat(backend): add content-addressed gzip blob store (D-012)
Two on-disk pools per D-012:
- `MIMIC_BLOB_ROOT` (default `/var/lib/mimic/blobs/`) holds C2 polling
  output blobs, content-addressed gzip layout `<aa>/<bb>/<sha256>.gz`.
- `MIMIC_EVIDENCE_ROOT` (default `/var/lib/mimic/evidence/`) reserved for
  user-uploaded evidence (flat per-engagement, no compression). Wired only
  in config + .env.example here; F8 endpoint lands later.

`mimic.storage.blob`:
- `blob_path(root, sha256_hex)` validates the digest and returns the CAS
  path. Raises ValueError on a malformed digest (length != 64 or non-hex).
- `store_blob(root, data)` hashes, gzip-compresses, atomically writes to
  `<aa>/<bb>/<sha256>.gz` (0o750 dir perms, 0o640 file perms). Idempotent:
  duplicate writes leave mtime untouched.

5 new unit tests cover happy path, deduplication, idempotency, malformed
digest, and the two-byte-pair directory layout.
2026-05-21 20:44:59 +02:00
knacky
162b6988f8 fix(backend): align regex_extract + outputs.blob() with D-011/D-012
D-011 — `regex_extract(text, pattern, *, group=1, name=None)`:
- engine google-re2 (linear-time, ReDoS-safe), `re` fallback with 1 MB cap.
- first match only.
- no match → raises Jinja2 `TemplateError` (no silent default — cleanup
  templates must fail loud when source string drifts).
- default capture is group 1 with fallback to group(0) when the pattern has
  no groups; named groups via `name="<name>"`.

D-012 — `outputs.blob()`:
- reads the gzip-compressed CAS file from `MIMIC_BLOB_ROOT`.
- 10 MB cap is applied **after** decompression.
- decode UTF-8 with latin-1 fallback; never raises (missing / corrupt /
  non-gzip blobs return empty string, logged at WARNING).

Unit tests rewritten to cover both the new fail-loud regex contract and
the gzip read path. 49 unit tests pass; ruff clean.
2026-05-21 20:44:48 +02:00
knacky
5d9415bb9f test(backend): add pytest baseline (B0.8)
Unit (SQLite, pure logic):
- test_templating.py: Jinja2 sandbox, regex_extract, strict-undefined,
  sandbox blocks attribute-access escape, output blob 10 MB cap.
- test_password.py: bcrypt hash + verify, empty / malformed handling.
- test_soc_token.py: 256-bit url-safe token + bcrypt verification.
- test_rbac_matrix.py: F11 invariants (lead ⊇ operator, SOC restricted
  to detection + report-read, audit_read & ttp_promote lead-only).
- test_connector_factory.py: register / build / double-register-rejected,
  TaskStatus terminal helper, Mythic mapping vs empty Home mapping.
- test_audit_hash.py: SHA-256 chain helper is deterministic and reacts
  to prev_hash / metadata changes.

Integration scaffold (testcontainers Postgres):
- tests/integration/conftest.py spins up postgres:16-alpine, monkeypatches
  MIMIC_DATABASE_URL, creates a Flask app + db.create_all.
- test_healthz.py: end-to-end smoke through the Flask test client.

38 unit tests pass; ruff clean.
2026-05-21 20:36:03 +02:00