docs: log 2 MEDIUM security fixes in CHANGELOG (post-refactor)

CSV multiline injection + Markdown stored-XSS regressions caught by security-guidance@claude-code-plugins on the 7-column refactor. Backend fix in 3a9d9d3 (257 pytest, ruff/mypy clean). PR #9 body counter bumped 255 → 257.
fix(security): defuse CSV formula injection in multiline exécution cell + HTML-escape Markdown table cells
2026-06-08 19:29:59 +02:00 · 2026-06-08 19:29:10 +02:00
4 changed files with 92 additions and 27 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -36,7 +36,10 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
 ### Changed
 - 2026-06-07 — SPEC.md § Export d'engagement added (between § Templates de simulations and § Stacks techniques). Committed as the **first** sprint commit (`7aaa5cc`), applying the fix-candidate from sprint 5's recurrent "SPEC.md uncommitted at sprint close" lesson. Four-sprint recurrence finally broken.
 - 2026-06-08 — Team `mimic` (persistent `.claude/teams/mimic/config.json`) instantiated with the full 7-agent project roster (backend-builder, frontend-builder, spec-reviewer, code-reviewer, design-reviewer, test-verifier, devil-advocate). Agents are spawned with an idle prompt at sprint start and woken via SendMessage per phase — flip vs the old "spawn-with-task-only" policy that hit the "team roster is flat" gotcha when respawning. Persistent across sprints from sprint 7+.
- 2026-06-08 (post-review, pre-merge) — **Export schema switched to a fixed 7-column handoff layout** uniform across MD/CSV/PDF. Columns (FR headers): `Scénario`, `Test`, `Source de log`, `Commentaires SOC`, `Exécution` (multi-line concat without labels — `executed_at` → `commands` → `execution_result`), `Logs remontés au SIEM`, `Cyber incident`. Removed from the export (intentional, handoff-focused): simulation status, MITRE techniques/tactics, prerequisites, id, created_at, updated_at. Markdown switched from narrative-per-simulation to a GFM table. PDF switched from sectioned HTML to a single `<table>`. SPEC `fdab324`, backend refactor `7335b9f`, e2e adaptation `aeb4bdb`. Final counters: backend 255 pytest, frontend 136 vitest, e2e 223 Playwright.
+- 2026-06-08 (post-review, pre-merge) — **Export schema switched to a fixed 7-column handoff layout** uniform across MD/CSV/PDF. Columns (FR headers): `Scénario`, `Test`, `Source de log`, `Commentaires SOC`, `Exécution` (multi-line concat without labels — `executed_at` → `commands` → `execution_result`), `Logs remontés au SIEM`, `Cyber incident`. Removed from the export (intentional, handoff-focused): simulation status, MITRE techniques/tactics, prerequisites, id, created_at, updated_at. Markdown switched from narrative-per-simulation to a GFM table. PDF switched from sectioned HTML to a single `<table>`. SPEC `fdab324`, backend refactor `7335b9f`, e2e adaptation `aeb4bdb`. Final counters: backend 257 pytest, frontend 136 vitest, e2e 223 Playwright.
 - 2026-06-08 (post-refactor, pre-merge) — **Two MEDIUM security regressions fixed** in the 7-column refactor (`3a9d9d3`), flagged by `security-guidance@claude-code-plugins`:
  1. **CSV formula injection inside the multi-line `Exécution` cell**: `_csv_safe` only checks `cell[0]`. With `executed_at` non-null, the cell starts with a safe date digit, but inner lines (commands, execution_result) starting with `=`/`+`/`-`/`@` evaded defense. Fix: `_format_execution_csv()` applies `_csv_safe` per user-controlled component BEFORE the multi-line concat. Outer `_csv_safe` on the assembled cell retained as belt-and-braces.
  2. **Stored XSS in Markdown table cells**: the new GFM table allows inline HTML (we use it for `<br/>`). A `sim.commands = "<script>alert(1)</script>"` would be rendered raw by MD viewers that interpret inline HTML (Notion, Obsidian, GitHub preview). Fix: `_cell()` now calls `html.escape()` on each value BEFORE the pipe-escape and `\n` → `<br/>` substitution — mirrors the `_render_engagement_html` PDF defense. The `<br/>` we insert ourselves stays unescaped (it's not user-controlled). 2 dedicated regression tests added.
 ---
--- a/backend/app/services/export.py
+++ b/backend/app/services/export.py
@@ -30,7 +30,36 @@ def _creator(obj: object) -> str:
    return getattr(cb, "username", "") or ""
-def _format_execution(sim: Simulation) -> str:
+# ---------------------------------------------------------------------------
 # CSV formula-injection defense (defined early — used by _format_execution_csv)
 # ---------------------------------------------------------------------------
 # \t and \r included: Excel auto-trims leading whitespace, so a tab/CR prefix still
 # reaches the formula parser in some sheet versions.
 _CSV_FORMULA_TRIGGERS = ("=", "+", "-", "@", "\t", "\r")
 def _csv_safe(value: object) -> object:
    """Defuse spreadsheet formula injection by prefixing user-controlled cells.
    Excel / LibreOffice / Google Sheets interpret cells starting with =, +, -, @,
    \\t or \\r as formulas. Since this CSV is the engagement handoff to SOC and is
    explicitly opened in a spreadsheet app, an authenticated red-team user could
    craft a simulation field that executes on the SOC analyst's machine. Prefixing
    with a single apostrophe forces the spreadsheet to treat the cell as text.
    """
    if isinstance(value, str) and value and value[0] in _CSV_FORMULA_TRIGGERS:
        return "'" + value
    return value
 # ---------------------------------------------------------------------------
 # Execution cell helpers
 # ---------------------------------------------------------------------------
 def _format_execution_text(sim: Simulation) -> str:
    """Canonical 3-part execution concat for Markdown and PDF (no CSV sanitization)."""
    parts = [
        sim.executed_at.isoformat() if sim.executed_at else "",
        sim.commands or "",
@@ -39,6 +68,17 @@ def _format_execution(sim: Simulation) -> str:
    return "\n".join(parts)
 def _format_execution_csv(sim: Simulation) -> str:
    """Execution concat for CSV: each user-controlled component is formula-defused
    before joining so that inner lines starting with =, +, -, @ are safe."""
    parts = [
        sim.executed_at.isoformat() if sim.executed_at else "",
        str(_csv_safe(sim.commands or "")),
        str(_csv_safe(sim.execution_result or "")),
    ]
    return "\n".join(parts)
 # ---------------------------------------------------------------------------
 # Markdown
 # ---------------------------------------------------------------------------
@@ -91,11 +131,16 @@ def render_engagement_markdown(
    lines.append(separator)
    for sim in simulations:
        execution = _format_execution(sim).replace("\n", "<br/>")
        def _cell(value: str | None) -> str:
-            return (value or "").replace("|", "\\|").replace("\n", "<br/>")
+            # Escape HTML (including quotes) first to prevent stored XSS in MD renderers
            # that interpret inline HTML, then escape pipe (GFM table syntax),
            # then fold newlines to <br/> (our own safe markup, inserted after escape).
            s = _html_escape(value or "")
            s = s.replace("|", "\\|")
            s = s.replace("\n", "<br/>")
            return s
        execution = _format_execution_text(sim)
        row = "| " + " | ".join([
            _cell(sim.name),
            _cell(sim.description),
@@ -125,24 +170,6 @@ _CSV_HEADERS = [
    "Cyber incident",
 ]
 # \t and \r included: Excel auto-trims leading whitespace, so a tab/CR prefix still
 # reaches the formula parser in some sheet versions.
 _CSV_FORMULA_TRIGGERS = ("=", "+", "-", "@", "\t", "\r")
 def _csv_safe(value: object) -> object:
    """Defuse spreadsheet formula injection by prefixing user-controlled cells.
    Excel / LibreOffice / Google Sheets interpret cells starting with =, +, -, @,
    \\t or \\r as formulas. Since this CSV is the engagement handoff to SOC and is
    explicitly opened in a spreadsheet app, an authenticated red-team user could
    craft a simulation field that executes on the SOC analyst's machine. Prefixing
    with a single apostrophe forces the spreadsheet to treat the cell as text.
    """
    if isinstance(value, str) and value and value[0] in _CSV_FORMULA_TRIGGERS:
        return "'" + value
    return value
 def render_engagement_csv(
    _engagement: Engagement, simulations: list[Simulation]
@@ -152,13 +179,13 @@ def render_engagement_csv(
    writer.writerow(_CSV_HEADERS)
    for sim in simulations:
-        execution = _format_execution(sim)
+        execution = _format_execution_csv(sim)
        writer.writerow([
            _csv_safe(sim.name or ""),
            _csv_safe(sim.description or ""),
            _csv_safe(sim.log_source or ""),
            _csv_safe(sim.soc_comment or ""),
-            _csv_safe(execution),
+            _csv_safe(execution),  # belt-and-braces: outer check covers empty executed_at case
            _csv_safe(sim.logs or ""),
            _csv_safe(sim.incident_number or ""),
        ])
@@ -217,7 +244,7 @@ def _render_engagement_html(
        thead = "<thead><tr>" + "".join(f"<th>{h(col)}</th>" for col in _HTML_HEADERS) + "</tr></thead>"
        parts.append(f"<table>{thead}<tbody>")
        for sim in simulations:
-            execution_html = h(_format_execution(sim)).replace("\n", "<br/>")
+            execution_html = h(_format_execution_text(sim)).replace("\n", "<br/>")
            cells = [
                h(sim.name or ""),
                h(sim.description or ""),
--- a/backend/tests/test_export_render.py
+++ b/backend/tests/test_export_render.py
@@ -220,6 +220,23 @@ def test_render_engagement_csv_escapes_formula_injection_in_execution(app) -> No
        assert "HYPERLINK" in cells[4]
 def test_render_engagement_csv_defuses_formula_in_inner_execution_lines(app) -> None:
    """When executed_at is set, the cell starts with a safe date, but commands
    line may inject formulas. Each user-controlled component must be defused."""
    with app.app_context():
        eng = _make_engagement()
        sim = _make_sim(
            executed_at=datetime(2026, 6, 1, 10, 0, tzinfo=UTC),
            commands="=cmd|'/c calc'!A1",
            execution_result="@SUM(1)",
        )
        result = render_engagement_csv(eng, [sim])
        cells = list(_csv.reader(_io.StringIO(result)))[1]
        execution_cell = cells[4]  # Exécution column
        assert "'=cmd|'/c calc'!A1" in execution_cell
        assert "'@SUM(1)" in execution_cell
 def test_render_engagement_csv_does_not_alter_safe_strings(app) -> None:
    with app.app_context():
        eng = _make_engagement()
@@ -230,6 +247,24 @@ def test_render_engagement_csv_does_not_alter_safe_strings(app) -> None:
        assert "whoami /all" in cells[4]
 def test_render_engagement_markdown_escapes_html_in_table_cells(app) -> None:
    """User content in table cells must be HTML-escaped to prevent stored XSS
    when the .md is opened in a renderer that interprets inline HTML."""
    with app.app_context():
        eng = _make_engagement()
        sim = _make_sim(
            name="<script>alert(1)</script>",
            commands='<img src=x onerror="alert(1)">',
        )
        result = render_engagement_markdown(eng, [sim])
        assert "<script>" not in result
        assert 'onerror="alert' not in result
        assert "&lt;script&gt;" in result
        assert "&lt;img" in result
        # double-quotes in attribute values are also escaped
        assert "&quot;" in result
 # ---------------------------------------------------------------------------
 # PDF tests
 # ---------------------------------------------------------------------------
--- a/tasks/pr-body-sprint-6.md
+++ b/tasks/pr-body-sprint-6.md
@@ -7,7 +7,7 @@
 - **Security MEDIUM fix mid-sprint** : CSV formula injection défusée par `_csv_safe()` (apostrophe-prefix sur `=`/`+`/`-`/`@`/`\t`/`\r`). Le red team aurait pu injecter une formule qui s'exécute chez le SOC à l'ouverture de l'Excel.
 ## Test plan
- **Backend** : **255/255** pytest (`ruff` + `mypy` clean).
+- **Backend** : **257/257** pytest (`ruff` + `mypy` clean).
 - **Frontend** : **136/136** vitest (`typecheck` + `lint` clean).
 - **E2e Playwright** : **223/223** verts — baseline sprint 5 = 201, +22 sprint 6.