docs(spec): land D-011 (regex_extract) + D-012 (output_blob_ref storage)
D-011 freezes the `regex_extract` Jinja filter: signature `regex_extract(text, pattern, *, group=1, name=None)`, google-re2 engine, raise on no-match. This unblocks the backend B0.5 templating sandbox. D-012 splits storage into two pools: `blobs/` (CAS sha256 + gzip) for C2 binary outputs and `evidence/` (flat per engagement) for user uploads, with a 10 MB per-blob cap and no global quota in v1. Q-001 and Q-002 removed from open-questions.md (resolved). Q-003/Q-004/Q-005 marked `deferred` with explicit re-open conditions.
@@ -90,3 +90,48 @@ simplification MVP)"*.
column (informational, §8) is kept. Replayability lives **solely** on
`run.snapshot_json`. Re-introducing `ttp_version` requires explicit spec amendment
through the team-lead.

### D-011 — `regex_extract` Jinja2 filter semantics (resolves Q-001)
**Context.** D-005 introduced `regex_extract` on Jinja templates without fixing
its match mode, no-match behaviour, group selection, or engine flavour. Backend
B0.5 (templating sandbox) is starting and needs a frozen signature.
**Decision.**
- **Engine** — `google-re2` (D-005 reaffirmed). Linear-time, no backrefs,
  OPSEC-safe (no ReDoS).
- **Match mode** — first match only.
- **No-match** — raise `TemplateError("regex_extract: no match for /<pattern>/")`.
  No silent fallback. Drifting cleanup templates must fail loudly at step run
  time, not on the next mission.
- **Group selection** — defaults to capture group 1; falls back to the full
  match when the pattern has no groups; named groups via `name="<name>"`.
- **Signature** — `regex_extract(text, pattern, *, group=1, name=None)`.
- **Rationale** — ATR/Caldera compatibility is not an objective (D-005).
  Fail-fast beats silent string corruption when a cleanup template touches a
  host with an unexpected output shape.
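
The semantics above can be sketched in a few lines of Python. This is a minimal illustration, not the B0.5 implementation. It makes three assumptions: the standard-library `re` module stands in for `google-re2` so the sketch runs without dependencies (unlike RE2, `re` accepts backreferences and is not linear-time); the local `TemplateError` class is a placeholder for the templating layer's exception; and `name` takes precedence over `group`, since D-011 does not say which wins when both are passed.

```python
import re  # stand-in only: D-011 mandates google-re2, not the stdlib engine


class TemplateError(Exception):
    """Placeholder for the templating layer's error type (assumption)."""


def regex_extract(text, pattern, *, group=1, name=None):
    """Return the selected group of the first match; raise if nothing matches."""
    match = re.search(pattern, text)  # first match only
    if match is None:
        # No silent fallback: the template must fail at step run time.
        raise TemplateError(f"regex_extract: no match for /{pattern}/")
    if name is not None:
        return match.group(name)  # assumption: a named group wins over `group`
    if match.re.groups == 0:
        return match.group(0)  # pattern has no capture groups: full match
    return match.group(group)


print(regex_extract("uid=1001(svc) gid=1001(svc)", r"uid=(\d+)"))
print(regex_extract("user: alice", r"user: (?P<user>\w+)", name="user"))
```

Registered as a Jinja2 filter, a template would call it as `{{ step_output | regex_extract("uid=(\\d+)") }}`, where `step_output` is a hypothetical variable name. Behaviour for an optional group that did not participate in the match is not covered by D-011 and is left undefined in this sketch.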

### D-012 — `output_blob_ref` storage layout (resolves Q-002)
**Context.** §8 declares `run_step.output_blob_ref` without specifying pool,
quota, format, or path. H20 says "local disk v1" only. Sprint 0 needs the layout
locked because B0.5 already references `{{ outputs.blob(...) }}`.
**Decision.**
- **Two separate pools** —
  - `MIMIC_BLOB_ROOT` (default `/var/lib/mimic/blobs/`) — binary outputs from
    `C2Connector` polling. **Content-addressed** layout: `<aa>/<bb>/<sha256>.gz`
    where `aa`/`bb` are the first two hex-digit pairs (the first two bytes) of
    the sha256 digest. Blobs are always gzip-compressed; uncompressed bytes are
    never stored on disk.
  - `MIMIC_EVIDENCE_ROOT` (default `/var/lib/mimic/evidence/`) — user-uploaded
    evidence files (F8). Flat layout `<engagement_id>/<evidence_id>.<ext>`, no
    compression.
- **Cap per blob** — 10 MB (consistent with F8 and D-005).
- **Quota** — no in-app global quota in v1. OS-level monitoring via Prometheus
  node_exporter. The F12 archival pipeline will own retention/purge post-sprint-0.
- **Filesystem permissions** — `0750`, owned by the `mimic` system user.
- **Rationale** — CAS deduplicates repeated C2 outputs (same `whoami`, same
  `Get-Process` snapshot) for free. Evidence stays flat because uploads are
  one-shot and tied to an engagement scope that we want to archive whole.
  Two pools mean we can wire independent quotas / retention policies in v2
  without migration.
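
The content-addressed layout for `MIMIC_BLOB_ROOT` can be illustrated with the sketch below. It is an illustration under stated assumptions, not the storage module itself: the function names are hypothetical, the sha256 is assumed to be computed over the uncompressed bytes, the 10 MB cap is assumed to mean 10 * 1024 * 1024 bytes measured before compression, and ownership, the `0750` mode, and atomic writes are left out.

```python
import gzip
import hashlib
from pathlib import Path

# Assumption: binary megabytes, checked against the uncompressed size.
MAX_BLOB_BYTES = 10 * 1024 * 1024


def blob_relpath(digest_hex):
    """Map a sha256 hex digest to `<aa>/<bb>/<sha256>.gz`."""
    return Path(digest_hex[0:2]) / digest_hex[2:4] / f"{digest_hex}.gz"


def store_blob(root, data):
    """Store `data` gzip-compressed under `root`; return its sha256 hex digest."""
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError("blob exceeds the 10 MB per-blob cap")
    digest = hashlib.sha256(data).hexdigest()  # assumption: hash of raw bytes
    path = Path(root) / blob_relpath(digest)
    if not path.exists():  # identical content already stored: nothing to write
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(gzip.compress(data))  # only compressed bytes hit disk
    return digest


def load_blob(root, digest_hex):
    """Read a blob back and return the original uncompressed bytes."""
    return gzip.decompress((Path(root) / blob_relpath(digest_hex)).read_bytes())
```

Storing the same `whoami` output twice yields the same digest and a single file on disk, which is the deduplication the rationale refers to. Whether `output_blob_ref` holds the bare digest or the relative path is not stated in D-012; this sketch returns the digest.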

#### Resolved open questions
- Q-001 → D-011.
- Q-002 → D-012.