tasks/open-questions.md

# Open spec questions

Structured questions raised in flight by the team. Each entry is a candidate for
team-lead escalation **before** code touches the area. Resolved questions move
to `spec-decisions.md` (with a `D-NNN` id) and are deleted from here.

Format:
- **Q-NNN** — one-line title
  - **Where it bites**: the F* or §-anchor in the spec
  - **What is silent**: the precise gap
  - **Options**: numbered alternatives with tradeoffs
  - **Recommended default if no decision**: the safest path forward
  - **Blocker?**: yes / not yet — when does it start blocking implementation

---

## Q-001 — `regex_extract` Jinja2 filter semantics (H26 / D-005)

- **Where it bites**: F15 cleanup templating + any future F8/F9 evidence
  templating that references `{{ outputs.text }}` through a regex.
- **What is silent**:
  - Multi-match behaviour (return first / all / named groups only?).
  - No-match behaviour (raise / return empty string / return `None` and let
    Jinja render it as `"None"`?).
  - Capture-group selection (whole match vs `\1` vs named).
  - Regex engine flavour. D-005 mentions `google-re2` (no backrefs, linear time
    — fits OPSEC) but the spec only says "regex" generically.
- **Options**:
  1. **Strict / first match, named groups required, no-match → raise** (loud
     failure, easy to detect templating bugs early).
  2. **First match, fall back to empty string on no-match** (silent — matches
     ATR / Caldera convention).
  3. **All matches as list** (powerful, but Jinja loops in a `cleanup_command`
     are a footgun).
- **Recommended default if no decision**: option 1 with `google-re2` (D-005),
  raise `TemplateError("regex_extract: no match for /<pattern>/")` so cleanup
  templates that drift get caught at template compile time.
- **Blocker?**: not yet. Becomes blocking when B0.5 implements
  `regex_extract` (Jinja sandbox).

## Q-002 — `output_blob_ref` storage layout and quota

- **Where it bites**: §8 `run_step.output_blob_ref`, §6 NF-state, H20 ("local
  disk v1"), F8 evidence (10 MB file cap) and `{{ outputs.blob() }}` accessor
  (D-005, 10 MB cap).
- **What is silent**:
  - Filesystem path layout (`/var/lib/mimic/blobs/<engagement_id>/<run_id>/...`?).
  - Total quota per engagement / global.
  - Retention vs rotation (kept forever? linked to engagement archival?).
  - Storage object structure: raw bytes? gzip? content-addressed (sha256 dir)?
  - Same file pool as evidence uploads (10 MB cap) or separate?
- **Options**:
  1. **Content-addressed (sha256 hex prefix tree) + gzip** + symlink from
     `run_step.output_blob_ref`. Deduplication, no quota tracking needed.
  2. **Per-engagement directory tree** with no compression. Simpler, easier to
     archive on engagement close, no dedup.
  3. **Two pools** (`blobs/` for C2 outputs, `evidence/` for user uploads) so
     access patterns and quotas stay independent.
- **Recommended default if no decision**: option 3 with option 1 layout for the
  `blobs/` pool (CAS + gzip), evidence stays plain in `evidence/`. Hard cap
  10 MB per blob, no global quota v1 — disk space monitored at OS level.
- **Blocker?**: not yet. Becomes blocking when backend implements F5 (run
  execution) or F8 (evidence upload).

## Q-003 — `/engagements/:id/hosts/sync` merge semantics

- **Where it bites**: F4 + §9 endpoint `/engagements/:id/hosts/sync`.
- **What is silent**:
  - Merge vs replace: do hosts that disappear from C2 get deleted, marked
    `status = "stale"`, or kept untouched?
  - What if a manually-added host conflicts (same hostname, different
    `c2_session_id`)?
  - Source of truth: the C2 session ID, the hostname, the IP, a tuple?
  - Cascading effect: a `host` referenced by `scenario_step.host_id` deleted
    mid-engagement breaks the scenario silently.
- **Options**:
  1. **Merge + mark stale**: insert new, update changed, mark missing as
     `status = "stale"` (never delete). Conflicts: manual entry wins,
     C2-synced entry is appended with a suffix and an audit_log entry.
  2. **Replace**: delete all sync-sourced rows, reinsert from C2. Manual
     entries kept untouched. Simple but loses history.
  3. **Merge by tuple `(hostname, c2_session_id)`** with status tracking,
     conflicts always raise and require manual resolution UI.
- **Recommended default if no decision**: option 1. Stale > deleted: a deleted
  host breaks `scenario_step.host_id` and the audit trail. Conflict policy:
  manual wins, C2 entry suffixed and audit-logged.
- **Blocker?**: not yet. Becomes blocking when backend implements F4 sync
  endpoint (post-sprint-0, post-PR1).

## Q-004 — `payload_type` → C2 home command mapping (depends on PR2)

- **Where it bites**: §7 enum `payload_type`, right column "C2 maison" is fully
  TBD; F2 import journal C2 home parser; B0.4 `C2Connector` factory.
- **What is silent**: entirety of the C2 home mapping table, plus the exact
  return semantics of `HomeConnector.execute_task / get_task_result / cancel_task
  / execute_cleanup` (PR2).
- **Options**: no in-flight options — this is owned by PR2.
- **Recommended default if no decision**: keep `HomeConnector` as a stub that
  raises `NotImplementedError` (per B0.4: "no real implementation"). Block any
  attempt to ship `HomeConnector` v1 logic until PR2 is closed.
- **Blocker?**: PR2 is the formal blocker. spec-analyst only needs to make
  sure no agent quietly invents a mapping table while PR2 is still open.

## Q-005 — Stale-host policy after engagement archival

- **Where it bites**: §8 `host.status`, engagement lifecycle, F12.
- **What is silent**: when an engagement is archived, what happens to its
  hosts? Detach session, freeze status, leave as-is?
- **Options**:
  1. Set `host.status = "archived"` cascading from `engagement.status`.
  2. Keep status untouched, rely on `engagement.status` upstream.
- **Recommended default if no decision**: option 2 — engagement state is the
  single source of truth, hosts inherit by JOIN.
- **Blocker?**: not yet. Cosmetic for sprint 0, becomes relevant when archival
  CLI lands.
docs(spec): track open spec questions Q-001..Q-005 for sprint 0 Captures the four grey areas team-lead flagged in the sprint 0 brief (regex_extract semantics, output_blob_ref storage, /hosts/sync merge behaviour, payload_type↔home-C2 mapping) plus stale-host policy. No decisions taken: each entry lists options, a recommended default if no decision is reached, and a "becomes blocking when…" trigger. Resolved questions will move to spec-decisions.md as D-NNN entries. 2026-05-21 20:15:07 +02:00			`# Open spec questions`

			`Structured questions raised in flight by the team. Each entry is a candidate for`
			`team-lead escalation before code touches the area. Resolved questions move`
			to `spec-decisions.md` (with a `D-NNN` id) and are deleted from here.

			`Format:`
			`- Q-NNN — one-line title`
			`- Where it bites: the F* or §-anchor in the spec`
			`- What is silent: the precise gap`
			`- Options: numbered alternatives with tradeoffs`
			`- Recommended default if no decision: the safest path forward`
			`- Blocker?: yes / not yet — when does it start blocking implementation`

			`---`

			## Q-001 — `regex_extract` Jinja2 filter semantics (H26 / D-005)

			`- Where it bites: F15 cleanup templating + any future F8/F9 evidence`
			templating that references `{{ outputs.text }}` through a regex.
			`- What is silent:`
			`- Multi-match behaviour (return first / all / named groups only?).`
			- No-match behaviour (raise / return empty string / return `None` and let
			Jinja render it as `"None"`?).
			- Capture-group selection (whole match vs `\1` vs named).
			- Regex engine flavour. D-005 mentions `google-re2` (no backrefs, linear time
			`— fits OPSEC) but the spec only says "regex" generically.`
			`- Options:`
			`1. Strict / first match, named groups required, no-match → raise (loud`
			`failure, easy to detect templating bugs early).`
			`2. First match, fall back to empty string on no-match (silent — matches`
			`ATR / Caldera convention).`
			3. All matches as list (powerful, but Jinja loops in a `cleanup_command`
			`are a footgun).`
			- Recommended default if no decision: option 1 with `google-re2` (D-005),
			raise `TemplateError("regex_extract: no match for /<pattern>/")` so cleanup
			`templates that drift get caught at template compile time.`
			`- Blocker?: not yet. Becomes blocking when B0.5 implements`
			`regex_extract` (Jinja sandbox).

			## Q-002 — `output_blob_ref` storage layout and quota

			- Where it bites: §8 `run_step.output_blob_ref`, §6 NF-state, H20 ("local
			disk v1"), F8 evidence (10 MB file cap) and `{{ outputs.blob() }}` accessor
			`(D-005, 10 MB cap).`
			`- What is silent:`
			- Filesystem path layout (`/var/lib/mimic/blobs/<engagement_id>/<run_id>/...`?).
			`- Total quota per engagement / global.`
			`- Retention vs rotation (kept forever? linked to engagement archival?).`
			`- Storage object structure: raw bytes? gzip? content-addressed (sha256 dir)?`
			`- Same file pool as evidence uploads (10 MB cap) or separate?`
			`- Options:`
			`1. Content-addressed (sha256 hex prefix tree) + gzip + symlink from`
			`run_step.output_blob_ref`. Deduplication, no quota tracking needed.
			`2. Per-engagement directory tree with no compression. Simpler, easier to`
			`archive on engagement close, no dedup.`
			3. Two pools (`blobs/` for C2 outputs, `evidence/` for user uploads) so
			`access patterns and quotas stay independent.`
			`- Recommended default if no decision: option 3 with option 1 layout for the`
			`blobs/` pool (CAS + gzip), evidence stays plain in `evidence/`. Hard cap
			`10 MB per blob, no global quota v1 — disk space monitored at OS level.`
			`- Blocker?: not yet. Becomes blocking when backend implements F5 (run`
			`execution) or F8 (evidence upload).`

			## Q-003 — `/engagements/:id/hosts/sync` merge semantics

			- Where it bites: F4 + §9 endpoint `/engagements/:id/hosts/sync`.
			`- What is silent:`
			`- Merge vs replace: do hosts that disappear from C2 get deleted, marked`
			`status = "stale"`, or kept untouched?
			`- What if a manually-added host conflicts (same hostname, different`
			`c2_session_id`)?
			`- Source of truth: the C2 session ID, the hostname, the IP, a tuple?`
			- Cascading effect: a `host` referenced by `scenario_step.host_id` deleted
			`mid-engagement breaks the scenario silently.`
			`- Options:`
			`1. Merge + mark stale: insert new, update changed, mark missing as`
			`status = "stale"` (never delete). Conflicts: manual entry wins,
			`C2-synced entry is appended with a suffix and an audit_log entry.`
			`2. Replace: delete all sync-sourced rows, reinsert from C2. Manual`
			`entries kept untouched. Simple but loses history.`
			3. Merge by tuple `(hostname, c2_session_id)` with status tracking,
			`conflicts always raise and require manual resolution UI.`
			`- Recommended default if no decision: option 1. Stale > deleted: a deleted`
			host breaks `scenario_step.host_id` and the audit trail. Conflict policy:
			`manual wins, C2 entry suffixed and audit-logged.`
			`- Blocker?: not yet. Becomes blocking when backend implements F4 sync`
			`endpoint (post-sprint-0, post-PR1).`

			## Q-004 — `payload_type` → C2 home command mapping (depends on PR2)

			- Where it bites: §7 enum `payload_type`, right column "C2 maison" is fully
			TBD; F2 import journal C2 home parser; B0.4 `C2Connector` factory.
			`- What is silent: entirety of the C2 home mapping table, plus the exact`
			return semantics of `HomeConnector.execute_task / get_task_result / cancel_task
			/ execute_cleanup` (PR2).
			`- Options: no in-flight options — this is owned by PR2.`
			- Recommended default if no decision: keep `HomeConnector` as a stub that
			raises `NotImplementedError` (per B0.4: "no real implementation"). Block any
			attempt to ship `HomeConnector` v1 logic until PR2 is closed.
			`- Blocker?: PR2 is the formal blocker. spec-analyst only needs to make`
			`sure no agent quietly invents a mapping table while PR2 is still open.`

			`## Q-005 — Stale-host policy after engagement archival`

			- Where it bites: §8 `host.status`, engagement lifecycle, F12.
			`- What is silent: when an engagement is archived, what happens to its`
			`hosts? Detach session, freeze status, leave as-is?`
			`- Options:`
			1. Set `host.status = "archived"` cascading from `engagement.status`.
			2. Keep status untouched, rely on `engagement.status` upstream.
			`- Recommended default if no decision: option 2 — engagement state is the`
			`single source of truth, hosts inherit by JOIN.`
			`- Blocker?: not yet. Cosmetic for sprint 0, becomes relevant when archival`
			`CLI lands.`