feat: sprint 6 — engagement export (md/csv/pdf) #9

Merged
knacky merged 20 commits from sprint/6-export into main 2026-06-09 16:19:02 +00:00

Summary

  • Engagement export: GET /api/engagements/<id>/export?format=md|csv|pdf. Closes the "replace the shared RT ↔ SOC Excel sheet" loop from the SPEC.
  • 3 formats delivered: Markdown (narrative handoff), CSV (machine-readable, with a formula-injection defense), PDF (client deliverable via WeasyPrint).
  • UI: split-button dropdown [Export ▼] on EngagementDetailPage, 3 items. Both halves open the menu (a semantic difference from sprint 5, where the left half navigated to blank, because there is no obvious "default" format).
  • RBAC, SOC zero access: admin + redteam can export; SOC does not see the button (absent from the DOM) and every /api/engagements/<id>/export* endpoint returns 403.
  • Security MEDIUM fix mid-sprint: CSV formula injection defused by _csv_safe() (apostrophe prefix on =/+/-/@/\t/\r). A red-team user could have injected a formula that executes on the SOC analyst's machine when the export is opened in Excel.

Test plan

  • Backend: 253/253 pytest (ruff + mypy clean): 226 sprint 1-5 baseline + 23 sprint 6 (endpoint + render + RBAC + security + filename defense-in-depth) + 4 post-code-review.
  • Frontend: 136/136 vitest (typecheck + lint clean): 121 baseline + 12 sprint 6 + 3 coverage-gap.
  • E2E Playwright: 223/223 green. Sprint 5 baseline = 201, +22 in sprint 6 (us29 8 tests, us30 3 tests, us31 5 tests + 6 supporting).

How to test locally

make build && make start                            # auto-podman, +50 MB image size (WeasyPrint deps)
make create-admin USER=alice PASS=changeme8         # first setup only
# Open http://127.0.0.1:5000 (explicit IPv4 in case IPv6 is the default)

Scenarios:

  1. Markdown export: log in as admin → engagement with ≥ 2 simulations → header → [Export ▼] → Markdown. The downloaded .md contains the engagement name, its dates, and the RT + SOC details of each simulation.
  2. CSV export: same flow → CSV. Open it in LibreOffice: 1 header row + N simulation rows, multiline commands correctly escaped, RT and SOC columns visible.
  3. PDF export: same flow → PDF. The file must open in a PDF viewer with a clean rendering (headings, sections, tables).
  4. CSV formula injection (security): create a simulation with name = "=cmd|'/c calc'!A1", export the CSV, open it in Excel/LibreOffice. The cell must display the literal text =cmd|'/c calc'!A1 (forced apostrophe) rather than execute the formula.
  5. SOC zero access: log in as SOC → engagement → the Export button is ABSENT from the header. Direct API test: curl -H "Authorization: Bearer <SOC_TOKEN>" "http://127.0.0.1:5000/api/engagements/1/export?format=md" → 403.
  6. Empty engagement: engagement with 0 simulations → export OK (header only; CSV = 1 header row).
  7. Normalized filename: an engagement named "Opération Spéciale" → Content-Disposition filename = engagement-<id>-operation-speciale-YYYYMMDD.<ext> (accents stripped via NFKD).
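
The filename normalization in scenario 7 can be illustrated with a minimal sketch. This is not the project's `_export_filename` helper: the function name `export_filename`, its signature, and the hyphen-collapsing regex are assumptions made for illustration; only the NFKD accent strip, the `engagement-<id>-<slug>-YYYYMMDD.<ext>` shape, and the "unnamed" fallback come from the PR notes.

```python
import re
import unicodedata
from datetime import date


def export_filename(engagement_id: int, name: str, ext: str, today: date) -> str:
    """Build engagement-<id>-<slug>-YYYYMMDD.<ext> from a free-text name (illustrative)."""
    # NFKD splits an accented letter into base letter + combining mark;
    # encoding to ASCII with errors="ignore" then drops the marks.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Assumption: every run of non-alphanumeric characters becomes one hyphen,
    # which also removes quotes and CR/LF from the header value.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")
    # Fallback for names with no alphanumeric character at all.
    slug = slug or "unnamed"
    return f"engagement-{engagement_id}-{slug}-{today:%Y%m%d}.{ext}"
```

For example, `export_filename(7, "Opération Spéciale", "csv", date(2026, 6, 9))` yields `engagement-7-operation-speciale-20260609.csv`.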

Notes

  • Single endpoint with a format query parameter rather than 3 separate routes: 1 RBAC gate to protect, 1 RBAC integration test.
  • PDF pipeline: WeasyPrint (Python HTML→PDF). The PDF is generated from the SAME DATA as the Markdown (not from the Markdown string) via _render_engagement_html(). Inline CSS ≤ 30 lines.
  • Dockerfile: +6 minimal libs for WeasyPrint (libcairo2 libpango-1.0-0 libpangoft2-1.0-0 libharfbuzz0b libfontconfig1 shared-mime-info). libgdk-pixbuf-2.0-0 excluded (text-only PDF, verified with weasyprint --info).
  • Process wins in sprint 6: SPEC.md committed as commit #1 of the sprint (the 4-sprint recurrence is finally killed); spec-reviewer 2-pass APPROVED before backend dispatch (0 mid-implementation addenda, as in sprint 5); persistent mimic team with the 7 agents idle (cross-sprint consistency from sprint 7 onward).

🤖 Generated with Claude Code

knacky added 13 commits 2026-06-08 16:35:43 +00:00
Specifies the new export feature contract:
- 3 formats: Markdown, CSV, PDF
- Engagement header + all simulations RT + SOC
- Single endpoint GET /api/engagements/<id>/export?format=md|csv|pdf
- RBAC admin + redteam (SOC zero access, consistent with Templates)
- Normalized filename engagement-<id>-<slug>-YYYYMMDD.<ext>

Committed as commit #1 of sprint 6 — applies lesson learned in sprints 3/4/5
where the SPEC section sat as uncommitted M SPEC.md until sprint-close
discovery. Per lessons.md §sprint-5 fix candidate "Stage SPEC.md as part
of the FIRST sprint commit, not as a separate later commit."
3 user stories scoped (US-29 export formats, US-30 SOC zero access,
US-31 format/engagement robustness). Backend extends engagements_bp
with GET /api/engagements/<id>/export?format=md|csv|pdf returning the
rendered file, no DB schema change. Frontend adds an
ExportEngagementButton split-button dropdown on EngagementDetailPage,
gated to admin+redteam.

Binding decisions locked with the user: 3 formats Markdown/CSV/PDF,
RBAC admin+redteam, engagement + all simulations RT+SOC, single
endpoint with format query param. WeasyPrint chosen for PDF (Python
HTML→PDF, ~50MB cairo/pango deps to add to Dockerfile, accepted).

Plan ready for spec-reviewer Pass 1.
Fixes applied:
- BLOCKER §2: EngagementDetailPage.test.tsx → "new" (does not exist
  yet), not "existing, to adapt".
- WARN §1: mandatory "first line of the summary" for backend-builder
  with the EXACT final path (anti-URL-drift, sprint 5 lesson).
- WARN §0/§1: slug with NFKD strip for accents + "unnamed" fallback
  for the edge case of a 100% non-alphanumeric name.
- WARN §2: ExportEngagementButton, BOTH halves open the dropdown
  (no default action, unlike NewSimulationDropdown).
- WARN §2: exports.ts throws Error on non-2xx for the toast pipeline.
- WARN §1: created_by rendered as username only in MD/CSV (not the dict).
- WARN §1: PDF generated from the DATA (not from the Markdown string).

NITs incorporated:
- gdk-pixbuf-2.0-0 removed from the minimal set (text-only PDF), with a
  note to confirm via weasyprint --info.
- data-testid="export-dropdown" on the wrapper for AC-30.1.
- AC-29.3: count rows via csv.reader, not file.split.
- §0 point 14: explicit btn-outline style (header consistency).
- MITRE-bundle-not-loaded test added to test_export_render.py.

Plan ready for spec-reviewer Pass 2.
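
The AC-29.3 note (count rows with csv.reader rather than by splitting the file) matters because a multiline cell spans several physical lines. A small sketch with made-up column names and values shows the difference:

```python
import csv
import io

# One header row plus two simulation rows. The first data row holds a
# multiline "commands" cell, which the csv module quotes on output.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "commands"])
writer.writerow(["sim-1", "whoami\nhostname"])
writer.writerow(["sim-2", "id"])
payload = buffer.getvalue()

# Naive line splitting counts physical lines, not CSV records.
naive_rows = payload.splitlines()

# csv.reader follows the quoting rules and reassembles the multiline cell.
parsed_rows = list(csv.reader(io.StringIO(payload, newline="")))
```

Here `naive_rows` has 4 entries while `parsed_rows` has the expected 3 (1 header + 2 simulations).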
- New module backend/app/services/export.py with render_engagement_markdown,
  render_engagement_csv, render_engagement_pdf, _render_engagement_html helper,
  and _export_filename slugifier (NFKD + fallback "unnamed").
- Extend engagements_bp with GET /api/engagements/<int:eid>/export?format=md|csv|pdf,
  gated @role_required("admin","redteam"). Returns 400 on missing/unknown format,
  404 on unknown engagement, correct Content-Type + Content-Disposition headers.
- Reuses _enrich_techniques and _enrich_tactics from serializers.py; resilient
  to MITRE bundle not loaded (returns empty tactics, no crash).
- Adds weasyprint>=60.0 to backend/requirements.txt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
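
The status codes described in this commit can be summarized as a framework-free decision function. This is only a sketch: `export_status`, `ALLOWED_ROLES`, and `CONTENT_TYPES` are hypothetical names, the exact Content-Type strings are assumptions, and the commit does not state in which order the 400 and 404 checks run (the order below is a guess).

```python
from typing import Optional

# Roles allowed to export, per the RBAC decision (admin + redteam).
ALLOWED_ROLES = {"admin", "redteam"}

# Assumed Content-Type per format value; the real headers may differ.
CONTENT_TYPES = {
    "md": "text/markdown; charset=utf-8",
    "csv": "text/csv; charset=utf-8",
    "pdf": "application/pdf",
}


def export_status(role: Optional[str], fmt: Optional[str], engagement_exists: bool) -> int:
    """Return the HTTP status a single export endpoint would answer with."""
    if role is None:
        return 401  # unauthenticated
    if role not in ALLOWED_ROLES:
        return 403  # e.g. SOC: zero access
    if fmt not in CONTENT_TYPES:
        return 400  # missing or unknown format
    if not engagement_exists:
        return 404  # unknown engagement
    return 200
```

Keeping every format behind one function (and one route) means the role check is written and tested once, which is the rationale given for the single-endpoint design.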
apt-get install libcairo2 libpango-1.0-0 libpangoft2-1.0-0 libharfbuzz0b
libfontconfig1 shared-mime-info — minimal set for text-only PDF rendering.
libgdk-pixbuf-2.0-0 excluded (no images in PDF, verified via weasyprint --info).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- test_export_engagement.py: 13 endpoint tests — RBAC (admin/redteam ok, SOC 403,
  401 unauthenticated), CSV column contract, CSV special char escaping, PDF magic bytes,
  400 on missing/unknown format, 404 on missing engagement, zero-simulations edge case,
  filename slugification.
- test_export_render.py: 10 unit tests on pure render functions — header fields,
  simulation order, techniques/tactics enrichment, SOC fields always rendered,
  backtick safety in commands, CSV header row, multi-technique pipe join, PDF magic
  bytes, MITRE bundle not loaded does not crash.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add split-button dropdown [Export ▼] on EngagementDetailPage that
downloads engagement as Markdown, CSV, or PDF via
GET /api/engagements/<id>/export?format=md|csv|pdf.

Both halves open the dropdown (no default left-click action).
RBAC-gated with canEditEngagements (admin + redteam only).
Loading state per item, toast on error, click-outside + Escape close.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
9 tests for ExportEngagementButton (render, open, close-outside,
Escape, per-format trigger, loading state, error toast).
3 RBAC tests for EngagementDetailPage (admin/redteam see Export,
soc does not). Total: 121 → 133 vitest passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Authenticated red-team users could craft any user-controlled string field
(name, description, commands, prerequisites, execution_result, log_source,
logs, soc_comment, incident_number, MITRE technique IDs) starting with =,
+, -, @, \t or \r. When the SOC analyst opens the exported CSV in Excel /
LibreOffice / Google Sheets — explicitly the consumption flow this sprint
optimizes for — the spreadsheet executes the field as a formula on the
SOC's machine.

Fix: new helper _csv_safe() prefixes a single apostrophe to any string
starting with a formula-trigger character, forcing the spreadsheet to
render the cell as text. Applied to every user-controlled field in
render_engagement_csv. Numeric and ISO-date fields are not wrapped.

Tests:
- test_render_engagement_csv_escapes_formula_injection_in_name
- test_render_engagement_csv_escapes_formula_injection_in_commands
- test_render_engagement_csv_does_not_alter_safe_strings

Result: 249 → 252 passing (the 1 remaining failure is pre-existing
test_index_without_built_frontend_returns_json, unrelated to this fix).

Flagged by security-guidance@claude-code-plugins automated review.
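
A minimal sketch of the apostrophe-prefix defense described above. It is not the actual `_csv_safe()` source: the names `csv_safe`, `FORMULA_TRIGGERS`, and `render_row` are illustrative, and only the trigger characters and the single-apostrophe prefix come from the commit message.

```python
import csv
import io

# Characters that make a spreadsheet treat a cell as a formula
# (list taken from the commit message).
FORMULA_TRIGGERS = ("=", "+", "-", "@", "\t", "\r")


def csv_safe(value: str) -> str:
    """Prefix an apostrophe so spreadsheets render the cell as text."""
    if value.startswith(FORMULA_TRIGGERS):
        return "'" + value
    return value


def render_row(cells: list) -> str:
    """Write one CSV row with every string cell passed through csv_safe."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([csv_safe(cell) for cell in cells])
    return buffer.getvalue()
```

With this sketch, `csv_safe("=cmd|'/c calc'!A1")` returns `'=cmd|'/c calc'!A1`, while a safe string such as `whoami` is returned unchanged.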
- NIT-1: remove dead _technique_names() and _technique_ids() helpers (no callers)
- NIT-2: rename engagement → _engagement in render_engagement_csv signature
- NIT-4: remove duplicate inline User import in test_export_csv_escapes_special_characters
- NIT-5: add comment on _CSV_FORMULA_TRIGGERS explaining \t and \r inclusion
- REUSE: replace custom _html_escape with stdlib html.escape (quote=True default)
- Remove now-unnecessary type: ignore comments on weasyprint (stubs resolve cleanly)
- Add test_export_filename_never_contains_quote_or_crlf defense-in-depth test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds 3 Playwright spec files covering all 13 ACs for the engagement
export feature:
- us29-export-formats.spec.ts (8 tests): dropdown, md/csv/pdf downloads,
  admin + redteam, filename convention
- us30-export-rbac.spec.ts (3 tests): SOC button absent, SOC 403, no-token 401
- us31-export-robustness.spec.ts (4 tests): missing format 400, bad format 400,
  unknown engagement 404, zero-sim export OK

Total: 201 → 223 Playwright tests. No regressions on sprints 1–5.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- README "Status" bumped to sprint 6 + test counts (253 backend, 136
  frontend, 223 e2e).
- CHANGELOG [Unreleased] section for sprint 6: backend, frontend, e2e,
  security, and changed-section notes (SPEC commit-first + mimic team).
- 6 sprint-6 lessons in tasks/lessons.md:
  1. SPEC.md commit-first tamed the 4-sprint recurrence
  2. Persistent team mimic + idle members > "never idle"
  3. Security plugin caught CSV formula injection mid-sprint
  4. Stdlib first before custom helpers
  5. Tests that mock at module level can't exercise the target's branches
  6. _engagement param for signature symmetry across render trio

This is the team-lead wrap-up commit. PR body in tasks/pr-body-sprint-6.md
will be ingested by make open-pr.
knacky added 4 commits 2026-06-08 17:23:03 +00:00
User decision 2026-06-08 (post-PR-9, pre-merge): the export schema is
now a fixed 7-column layout focused on the RT↔SOC handoff, applied
uniformly across Markdown / CSV / PDF.

Columns (French headers): Scénario, Test, Source de log,
Commentaires SOC, Exécution (multiline concat of executed_at +
commands + execution_result, no labels), Logs remontés au SIEM,
Cyber incident.

Removed from the export (intentional): simulation status, MITRE
techniques and tactics, prerequisites, id, created_at, updated_at.
The export is a handoff product, not a full data dump.

This is the spec change that drives the upcoming render refactor
in services/export.py. SPEC committed first per the sprint-6
positional fix (FIRST commit, not at sprint close).
All three renderers (MD, CSV, PDF) now emit a uniform 7-column table with
French headers matching the RT↔SOC handoff contract locked in SPEC.md fdab324.

Helpers added:
- _format_execution(sim): canonical 3-part concat (executed_at / commands / execution_result)
- _MD_HEADERS / _HTML_HEADERS / _CSV_HEADERS unified to the same 7 FR strings

Helpers removed (no longer called):
- _tactic_names() — MITRE tactics dropped from export
- _enrich_sim_techniques() — MITRE techniques dropped from export

Fields dropped from export: status, techniques, tactic_ids, prerequisites, id,
created_at, updated_at (intentional — focused RT↔SOC handoff, see SPEC §Export).

_csv_safe() still applied to all 7 user-controlled cells including Exécution concat.

Tests updated: 255 passed, ruff clean, mypy clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
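
The Exécution cell described above could be assembled roughly as follows. This is a sketch, not the project's `_format_execution`: the dict-based input, the name `format_execution`, and the choice to skip empty parts are assumptions; the header strings and the three-part, label-free concat come from the commit message.

```python
from typing import Mapping, Optional

# The 7 French headers shared by the MD, CSV and PDF renderers.
HEADERS = [
    "Scénario",
    "Test",
    "Source de log",
    "Commentaires SOC",
    "Exécution",
    "Logs remontés au SIEM",
    "Cyber incident",
]


def format_execution(sim: Mapping[str, Optional[str]]) -> str:
    """Concatenate executed_at, commands and execution_result, one per line."""
    parts = [sim.get("executed_at"), sim.get("commands"), sim.get("execution_result")]
    # Assumption: empty or missing parts are skipped rather than left as blank lines.
    return "\n".join(part for part in parts if part)
```

A simulation with all three fields set produces a three-line cell with no labels; one without a date produces only the commands and the result.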
Update AC-29.2 (Markdown) to assert | Scénario | GFM table header.
Update AC-29.3 (CSV) to assert exact 7 FR column names instead of 'name'.
Update AC-31.4 (empty engagement) MD to assert table absent, CSV header
to assert exact 7 FR columns.
Drop unused sim1/sim2 vars and makeClient import (NIT cleanup).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Post-review user decision (2026-06-08) switched the export payload to a
fixed 7-column FR handoff schema (Scénario / Test / Source de log /
Commentaires SOC / Exécution / Logs remontés au SIEM / Cyber incident).

Logged in CHANGELOG [Unreleased] Changed section with commit refs
(SPEC fdab324, backend 7335b9f, e2e aeb4bdb) and updated PR #9 body
counters: 255 pytest (was 253), 136 vitest unchanged, 223 e2e
unchanged.
knacky added 2 commits 2026-06-08 17:30:00 +00:00
Finding 1 — CSV multiline formula injection:
- Split _format_execution into _format_execution_text (MD/PDF, no sanitization) and
  _format_execution_csv (CSV, applies _csv_safe to each user-controlled component before join)
- Moved _CSV_FORMULA_TRIGGERS + _csv_safe above the format helpers (required by _format_execution_csv)
- Outer _csv_safe on the Exécution cell retained as belt-and-braces for the empty-date case
- New test: test_render_engagement_csv_defuses_formula_in_inner_execution_lines

Finding 2 — Stored XSS in Markdown table:
- _cell() in render_engagement_markdown now calls _html_escape() (quote=True, default)
  before pipe-escaping and \n→<br/> substitution — correct order preserved
- New test: test_render_engagement_markdown_escapes_html_in_table_cells

255 → 257 passed, ruff clean, mypy clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
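
Both findings can be sketched in a few lines. The names `csv_safe`, `format_execution_csv`, and `md_cell` are illustrative stand-ins for the helpers named in the commit, and the function signatures are assumptions; the per-component sanitization, the outer guard, and the escape order (HTML escape, then pipe escape, then newline to `<br/>`) follow the commit message.

```python
import html

FORMULA_TRIGGERS = ("=", "+", "-", "@", "\t", "\r")


def csv_safe(value: str) -> str:
    """Prefix an apostrophe when the value starts with a formula trigger."""
    return "'" + value if value.startswith(FORMULA_TRIGGERS) else value


def format_execution_csv(executed_at: str, commands: str, execution_result: str) -> str:
    """CSV variant: sanitize each user-controlled component before joining."""
    parts = [executed_at, csv_safe(commands), csv_safe(execution_result)]
    joined = "\n".join(part for part in parts if part)
    # Outer guard kept as belt-and-braces; it is a no-op when the first
    # component was already prefixed.
    return csv_safe(joined)


def md_cell(value: str) -> str:
    """Markdown table cell: escape HTML first so the <br/> added last survives."""
    escaped = html.escape(value)  # quote=True is the default
    escaped = escaped.replace("|", "\\|")
    return escaped.replace("\n", "<br/>")
```

Escaping HTML before inserting `<br/>` is what keeps the line break as real markup while user-supplied tags are neutralized.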
CSV multiline injection + Markdown stored-XSS regressions caught by
security-guidance@claude-code-plugins on the 7-column refactor.
Backend fix in 3a9d9d3 (257 pytest, ruff/mypy clean). PR #9 body
counter bumped 255 → 257.
knacky added 1 commit 2026-06-09 16:14:25 +00:00
Add @page { size: A4 landscape } to _CSS, reduce font-size to 11px,
and set table-layout: fixed + word-break: break-word so 7 columns
fit without overflow. Unit test asserts the landscape rule is present
in the rendered HTML.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
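
A rough sketch of how the landscape stylesheet could be wired into the HTML handed to WeasyPrint. The names `CSS`, `build_html`, and `render_pdf` are hypothetical, and the `width`, `border-collapse`, and `vertical-align` declarations are assumptions; only the `@page` rule, the 11px font size, `table-layout: fixed`, and `word-break: break-word` come from the commit message.

```python
import html

CSS = """
@page { size: A4 landscape; }
body { font-size: 11px; }
table { table-layout: fixed; width: 100%; border-collapse: collapse; }
th, td { word-break: break-word; vertical-align: top; }
"""


def build_html(title: str, headers: list, rows: list) -> str:
    """Render a single escaped table with the inline stylesheet."""
    head = "".join(f"<th>{html.escape(h)}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return (
        f"<html><head><meta charset='utf-8'><style>{CSS}</style></head>"
        f"<body><h1>{html.escape(title)}</h1>"
        f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
        "</body></html>"
    )


def render_pdf(document: str) -> bytes:
    """Convert the HTML string to PDF bytes (requires WeasyPrint to be installed)."""
    from weasyprint import HTML  # imported lazily so build_html stays dependency-free

    return HTML(string=document).write_pdf()
```

A unit test in the spirit of the one mentioned above only needs `build_html` and a substring check for the landscape rule, without invoking WeasyPrint.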
knacky merged commit e27babed5b into main 2026-06-09 16:19:02 +00:00
knacky deleted branch sprint/6-export 2026-06-09 16:19:02 +00:00
Reference: knacky/mimic#9