Commit Graph

44 Commits

Author SHA1 Message Date
knacky
dd5c508b04 fix(backend): JSON error envelope for every HTTPException + strict_slashes=False
Some checks failed
ci / backend (lint + typecheck + unit tests) (push) Failing after 0s
ci / frontend (lint + typecheck + build + unit tests) (push) Failing after 1s
Two issues spotted by ux-frontend consuming docs/api.md against the actual
code path:

1. `flask.abort(...)` returned the Werkzeug HTML error page for 400/403/404/
   422/etc. — only the 401 paths going through `api_error()` and the
   Flask-Login `unauthorized_handler` honoured the `{error, message}`
   envelope the contract promised. The frontend's `ApiClientError.body`
   parser was forced to fall back to a raw string, and the 422 case
   could not surface Pydantic per-field errors.

   Fix: register `@app.errorhandler(HTTPException)` that serialises every
   `HTTPException` to the same JSON envelope. 422s gain a `details: [...]`
   field holding the Pydantic `errors()` list (`loc` / `msg` / `type`),
   matching the shape now documented in `docs/api.md`.

   A `_HTTP_ERROR_CODES` lookup maps statuses to stable snake_case codes
   (`bad_request`, `not_found`, `method_not_allowed`,
   `validation_error`, `forbidden`, `internal_error`, ...). Unknown
   statuses fall back to `http_error`.

   `description` is `cast(object, ...)` because the Werkzeug stub pins it
   to `str | None` while `flask.abort(..., description=<list>)` is the
   officially supported way to smuggle a Pydantic errors list to the
   handler.

2. `@bp.get("")` on the engagements blueprint produced `/api/v1/engagements`
   (no slash). Hitting it with a trailing slash issued a 308 redirect,
   and some browsers drop the session cookie across that hop.

   Fix: `app.url_map.strict_slashes = False`. Both forms now match the
   same handler without redirect.
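
A minimal, framework-free sketch of the envelope logic described in item 1, assuming the handler only needs a status code and an optional description. The function name `error_envelope`, the helper `_default_message`, and the exact subset of statuses listed are illustrative assumptions; the real handler is registered through `@app.errorhandler(HTTPException)` and its map holds more entries than shown here.

```python
from http import HTTPStatus
from typing import Any

# Subset of the status -> code table named in the commit message.
# The real map has further entries (elided as "..." in the source).
_HTTP_ERROR_CODES: dict[int, str] = {
    400: "bad_request",
    403: "forbidden",
    404: "not_found",
    405: "method_not_allowed",
    422: "validation_error",
    500: "internal_error",
}


def _default_message(status: int) -> str:
    """Return the standard reason phrase, or a generic text for odd statuses."""
    try:
        return HTTPStatus(status).phrase
    except ValueError:
        return "HTTP error"


def error_envelope(status: int, description: object = None) -> dict[str, Any]:
    """Build the {error, message[, details]} body for an HTTP error.

    `description` is typed as `object` because it may be a plain string or,
    for 422s, a list of Pydantic-style error dicts (loc / msg / type).
    """
    code = _HTTP_ERROR_CODES.get(status, "http_error")  # unknown -> fallback
    body: dict[str, Any] = {"error": code, "message": _default_message(status)}
    if isinstance(description, list):
        # Validation errors travel in `details`; the message stays generic.
        body["details"] = description
    elif description:
        body["message"] = str(description)
    return body
```

Calling `error_envelope(404)` yields the `not_found` code with the standard reason phrase, while a list passed as `description` ends up under `details` untouched.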

5 new integration tests cover the new envelope shape (422 with details,
unknown 404, malformed-JSON 400) and the dual-slash matching. `docs/api.md`
rewritten to reflect the table of stable codes, the `details` shape, and
the no-trailing-slash convention. `CHANGELOG.md` gains a follow-up entry.

Verification: ruff check / mypy --strict / pytest tests/unit all green
(61 unit + 5 new integration).
2026-05-23 04:33:23 +02:00
knacky
dd321c2cd0 docs: add api.md contract for sprint 1 + update changelog
- docs/api.md: contract the frontend consumes — base URL, auth transport
  (Flask session cookie, credentials: include), uniform error envelope,
  MA6 tenant-scope behaviour (404 not 403), per-endpoint shape for
  /auth/{login,logout,me} and /engagements GET/POST/GET-by-id, plus a
  worked example walking through CLI bootstrap → login → POST engagement →
  list → logout.
- CHANGELOG.md: sprint-1 entry summarising the three endpoints, the dev-
  only CORS, the AuthUser extension, the audit rows, and the test
  coverage.
2026-05-23 04:22:03 +02:00
knacky
e1b381af4d test(backend): cover auth schemas + login/engagement E2E
Unit:
- test_auth_schemas: LoginRequest validation (min/max bounds, extra-fields
  policy) + serialize_current_user round-trip (RT lead permission set,
  RT operator subset, display_name None pass-through).

Integration (testcontainers Postgres, marked `integration`):
- test_login_then_create_and_list_engagement: full sprint-1 user journey —
  /me → 401, POST /login → 200, /me → 200, POST /engagements → 201,
  GET /engagements lists the new row, POST /logout → 204, /me → 401.
- test_login_rejects_bad_credentials: wrong password AND unknown user
  return the exact same 401 invalid_credentials envelope (no enumeration
  leak).
- test_logout_without_session_returns_401: /logout on anonymous returns
  the uniform not_authenticated envelope.

Unit total: 61 passed in 0.50s. Integration tests skip locally when
testcontainers is absent.
2026-05-23 04:21:55 +02:00
knacky
38b35c933a feat(backend): wire auth endpoints + dev CORS (sprint 1)
Three auth endpoints under /api/v1/auth/ + dev-only CORS so the Vite
frontend can drive the session cookie.

- POST /login validates local credentials and sets a Flask session cookie.
  Returns the CurrentUser shape on 200 (user_id, username=email,
  display_name, role, permissions, groups). Uniform 401 invalid_credentials
  on bad password or unknown user; a bcrypt round against a dummy hash runs
  even on unknown users so the request timing does not enumerate accounts.
  Audits an auth.login row and bumps user.last_login_at.
- POST /logout (login_required) clears the session, returns 204, audits an
  auth.logout row.
- GET /me returns the current principal or 401 not_authenticated. Used by
  the frontend at boot to rehydrate state.
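
The timing-equalisation idea in the POST /login bullet can be sketched with the standard library alone. This is not the project's code: it swaps bcrypt for `hashlib.pbkdf2_hmac` so the example runs without third-party packages, and the names `hash_password`, `verify_login`, and the in-memory `users` mapping are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional

_ITERATIONS = 100_000  # illustrative work factor, not the project's setting


def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2 as a stand-in for bcrypt."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, _ITERATIONS)
    return salt, digest


# Computed once at import time; used whenever the account does not exist.
_DUMMY_SALT, _DUMMY_DIGEST = hash_password("dummy-password")


def verify_login(users: dict[str, tuple[bytes, bytes]], email: str, password: str) -> bool:
    """Check credentials while doing the same hashing work for unknown users."""
    record = users.get(email)
    salt, expected = record if record is not None else (_DUMMY_SALT, _DUMMY_DIGEST)
    _, candidate = hash_password(password, salt)  # always runs one full hash
    matches = hmac.compare_digest(candidate, expected)
    # An unknown user never authenticates, even if the dummy password is guessed.
    return matches and record is not None
```

Both failure paths return the same value after the same amount of hashing, which is what lets the endpoint answer with one uniform 401 envelope.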

Side wiring:
- LoginManager.unauthorized_handler emits the same {error, message} JSON
  envelope so @login_required 401s match the rest of the API surface.
- api/_helpers gains `serialize_current_user(AuthUser) -> CurrentUser` and
  `api_error(code, message, status)` — used by the auth blueprint and
  available to follow-up endpoints.
- AuthUser carries display_name + user_type now; identity.load_user routes
  through a new `authuser_from_orm()` helper that the login endpoint also
  uses so /login and the user_loader produce identical shapes.
- Dev-only CORS via flask-cors on /api/*, gated on
  MIMIC_ENV=development AND MIMIC_CORS_ORIGINS non-empty. Prod keeps
  same-origin (reverse proxy fronts the SPA + API).
- LoginRequest + CurrentUser DTOs added to mimic.schemas.
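
The CORS gate above is a two-condition check. A small sketch, assuming `MIMIC_CORS_ORIGINS` is a comma-separated list (the separator is an assumption, and `dev_cors_origins` is a hypothetical helper name):

```python
from typing import Mapping


def dev_cors_origins(env: Mapping[str, str]) -> list[str]:
    """Return the allowed origins, or an empty list when CORS must stay off.

    CORS is only enabled when MIMIC_ENV is exactly "development" AND
    MIMIC_CORS_ORIGINS contains at least one non-blank entry.
    """
    if env.get("MIMIC_ENV") != "development":
        return []
    raw = env.get("MIMIC_CORS_ORIGINS", "")
    # Assumed format: comma-separated origins; blanks are ignored.
    return [origin.strip() for origin in raw.split(",") if origin.strip()]
```

An empty result means the app factory would skip the flask-cors setup entirely, which keeps production same-origin.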

No frontend-visible change to engagements (sprint-0 already shipped
created_by_id, audit log, F11 scope).
2026-05-23 04:21:44 +02:00
knacky
a8c5400f97 docs: add production deployment guide
Some checks failed
ci / backend (lint + typecheck + unit tests) (push) Failing after 0s
ci / frontend (lint + typecheck + build + unit tests) (push) Failing after 0s
Operational runbook for rolling Mimic to RT infrastructure. Scope is
the application repo only; the Ansible playbook (D-010) and Caddy
reverse proxy (D-007) are referenced as out-of-scope dependencies.

Sections:

- Host prerequisites (Podman 5, rootless, linger, PostgreSQL 16 reach).
- Filesystem layout: blobs + evidence pools at 0750 under the deploy
  user (D-012), log directory, Quadlet directory.
- Environment variables: split into "required in prod" (MIMIC_SECRET_KEY,
  MIMIC_FERNET_KEY, MIMIC_DATABASE_URL, MIMIC_DATABASE_AUDIT_URL,
  MIMIC_ENV) and "required with safe defaults" (cookie flags, log
  format, CORS origins, blob/evidence roots). Explicit note that the
  two database DSNs must point to two different Postgres roles to
  preserve the audit append-only contract (NF-AUDIT, code-reviewer N5).
- Secrets management: dedicated section addressing PR3 code-reviewer M2.
  File-based generation under ~/secrets with 0700 perms, systemd
  EnvironmentFile or future MIMIC_*_FILE indirection, vault back-up,
  Fernet key rotation requires re-encryption pass.
- Container images: pin policy `:X.Y.Z` (cross-references F-D1), exposed
  ports per layer (backend 5000 as uid 1001, frontend 8080 as uid 101).
- PostgreSQL setup: bootstrap of mimic_audit_writer role with the SQL
  the Ansible playbook runs, plus the fail-loud rationale if the role
  is missing. Alembic upgrade head invocation.
- Quadlet units: backend example with PublishPort 127.0.0.1:5000 (the
  external surface is Caddy, not the backend), EnvironmentFile,
  blob+evidence bind-mounts with `:Z` SELinux relabel.
- Smoke validation: three curl checks (Caddy-fronted /healthz, direct
  backend /healthz, audit DSN presence) with explicit "do not announce
  the release" gate on failure.
- Upgrade procedure: 5-step rolling restart anchored on Quadlet image
  tag edits + alembic upgrade as part of the entrypoint.
- Rollback procedure: image-only (additive schema) vs schema-affecting,
  with alembic downgrade against an explicit revision.
- Open items: explicit pointers to FERNET-KEY, F-D1, F-D2, F-D3
  trackers in tasks/todo.md so future operators see them.

No other file touched; no application code changed.
2026-05-23 03:15:46 +02:00
knacky
c44f8b90ad docs: archive Podman runner setup runbook + track F-D1..F-D5
Some checks failed
ci / backend (lint + typecheck + unit tests) (push) Failing after 1s
ci / frontend (lint + typecheck + build + unit tests) (push) Failing after 0s
Two changes scoped together since both stem from the post-PR2 wrap-up.

docs/podman-runner-setup.md (new, ~190 LOC):

Operational runbook for the gitea-runner host that drives CI. The first
attempt at install hit four traps that this archived version documents
so we don't lose the lesson:

 1. `act_runner register` performs a sanity ping against the container
    daemon before writing the credential. Without the Podman socket
    mounted on the *register one-shot*, register fails silently and no
    .runner file is produced. The runbook mounts the socket on both
    register and daemon containers.
 2. SELinux blocks rootless socket access by default. Quadlet
    SecurityLabelDisable=true (or --security-opt label=disable for the
    legacy CLI form) is the documented bypass. No-op on Debian, required
    on RHEL/Fedora hosts.
 3. The runner user UID is not 1000 on every host (gitea = 1005 here).
    Quadlet `%U` substitution makes the unit portable; hardcoded UIDs
    are explicitly called out as a sprint 0 mistake.
 4. `podman generate systemd` is officially deprecated. Quadlet is the
    only supported pattern going forward and is what this runbook ships;
    legacy alternative is omitted on purpose.

Also captures: token placeholder convention (<TOKEN_FROM_GITEA_UI>,
never the real value in archived docs), single-use semantics, the
"secrets via file, not chat" convention, the `:X.Y.Z` pin policy versus
`:latest` in prod (ties into follow-up F-D1), and a decommissioning
section that cleans up state without nuking the user-level Podman socket.

tasks/todo.md:

New section "Frontend follow-ups (sprint 1+)" with F-D1..F-D5 from
code-reviewer on `chore/frontend-dockerfile` (649194b). All deferred,
none blocking. F-D1 (digest pinning) is project-wide and explicitly
references the backend image and the runner image alongside the
frontend ones for a single chore commit.
2026-05-23 03:08:03 +02:00
knacky
649194b174 chore(frontend): add multi-stage Dockerfile + nginx SPA config
Some checks failed
ci / backend (lint + typecheck + unit tests) (push) Failing after 1s
ci / frontend (lint + typecheck + build + unit tests) (push) Failing after 0s
Production image for the frontend dist.

Stage 1 (build): node:22-alpine, `npm ci --ignore-scripts` from the
committed lockfile, `npm run build`. Output lands in /app/dist.

Stage 2 (runtime): docker.io/nginxinc/nginx-unprivileged:alpine.
- Upstream-maintained variant that runs as the nginx user (uid 101)
  out of the box. /var/cache/nginx and /var/run/nginx are pre-owned,
  no chown gymnastics needed in our layer. Vanilla nginx:alpine fails
  at startup as non-root because client_temp mkdir is denied.
- Listens on 8080 (non-privileged port, matches the unprivileged
  variant convention).
- nginx.conf serves /usr/share/nginx/html with SPA `try_files`
  fallback for client-side routing, long-cache headers on
  /assets/ (Vite hashed bundles), a plaintext /healthz endpoint
  for Caddy / Prometheus blackbox, and server_tokens off.

.dockerignore excludes node_modules, dist, .vite, coverage,
playwright-report, .env*, .git, editor dirs. Keeps .env.example.

Local smoke test validated with `podman build -t mimic-frontend:smoke .`
and `podman run -p 127.0.0.1:18080:8080`:
  /healthz -> 200 "ok"
  /        -> 200 index.html (508 B)
  /spa/x   -> 200 (SPA fallback)
  /assets  -> Cache-Control: max-age=31536000, public, immutable
2026-05-22 19:59:09 +02:00
knacky
359225e464 chore(ci): drop transient smoke workflow now that runner is validated
Some checks failed
ci / backend (lint + typecheck + unit tests) (push) Failing after 6s
ci / frontend (lint + typecheck + build + unit tests) (push) Failing after 3s
The smoke workflow was scoped from inception to validate that the
freshly registered gitea-runner picks up jobs with the "linux" label.
It ran green on push of chore/podman-and-ci. Removing per the
"transient, removed after validation" plan recorded in the original
commit (1380672).
2026-05-22 19:49:26 +02:00
knacky
df6294ed7b docs: align doc references with compose.yml rename (code-reviewer M1)
Three docs still referenced the old docker-compose.yml path. Replace
with compose.yml so a future reader cloning at this hash finds the
file at the documented path.

- CHANGELOG.md:31 — backend skeleton recap line.
- docs/architecture.md:28 — deployment artifacts note (D-010 scope).
- tasks/todo.md:9 — B0.1 task description.

Also adds a "CI follow-ups (sprint 1+)" section to tasks/todo.md
capturing the 3 MINOR + 6 NIT deferred from code-reviewer's review
of chore/podman-and-ci, plus a FERNET-KEY tracker for the secret
provisioning before c2_credential.config_fernet (D-004) is wired.
2026-05-22 19:49:16 +02:00
knacky
1380672c03 ci(gitea): add CI workflow + transient smoke validation
All checks were successful
smoke / hello (push) Successful in 0s
Two workflows under .gitea/workflows/:

- ci.yml — runs on push:main and every PR. Two parallel jobs:
  * backend (python:3.12-slim-bookworm): apt deps for psycopg + WeasyPrint,
    pip install -e backend[dev], ruff check + ruff format --check + mypy
    --strict src + pytest tests/unit. Postgres 16 service for any
    integration-style test, env wired via service hostname.
    FERNET_KEY_TEST sourced from Gitea repo secret (no plain value in CI).
  * frontend (node:22-alpine): npm ci, ESLint, TypeScript typecheck,
    Vitest, Vite build.
  Runner label: linux (matches gitea-runner registration).
  Out of scope sprint 0: testcontainers Postgres integration tests
  (Docker-in-Docker rootless setup deferred to nightly job) and
  Playwright E2E (deferred to sprint 1+).

- smoke.yml — transient. Triggers only on push to this branch
  (chore/podman-and-ci) and on workflow_dispatch. Validates that the
  newly registered gitea-runner picks up jobs with the "linux" label.
  Removed in a follow-up commit on this branch once green.
2026-05-22 19:42:23 +02:00
knacky
9ece352659 chore(backend): rename docker-compose.yml -> compose.yml + podman notes
Compose v2 canonical filename (compose.yml) is recognized by both
docker compose and podman compose without preference. The previous
docker-compose.yml worked but signalled a Docker-first stance, while
target deployment is Podman 5.8+ rootless.

- Rename backend/docker-compose.yml -> backend/compose.yml.
- backend/README.md `make db-up` comment uses $(CONTAINER) to mirror
  the Makefile auto-detect (lines 14-16: docker || podman).
- backend/README.md audit-writer bootstrap snippet hints at podman
  fallback explicitly with `command -v` runtime sniff.
- backend/compose.yml comment for audit-writer mentions both runtimes.

No functional change. Makefile $(COMPOSE) target unchanged: Compose v2
discovers compose.yml first in its search order.
2026-05-22 19:41:38 +02:00
knacky
ffcac42272 Merge branch 'feature/backend-skeleton' into main
Sprint 0 backend skeleton: Python 3.12 / Flask / SQLAlchemy 2 / Pydantic 2
/ Alembic / pytest. Data model §8 complete, C2Connector ABC, Jinja2
sandbox with google-re2 regex_extract (D-011), CAS gzip blob storage
(D-012), local auth + group-based RBAC (D-003/D-008), F11 tenant scoping,
audit log infrastructure with hash chain anticipated (D-013).

Quality gates: ruff/mypy strict/56 unit tests passing. LGTM code-reviewer
after 2 rounds of remediation (B1 BLOCKER + 6 MAJOR addressed).

Co-Authored-By: backend <backend@mimic.local>

* origin/feature/backend-skeleton:
  docs(backend): track sprint-0 follow-ups + flag integration migration gap
  feat(backend): wire created_by_id, audit log, F11 scope into CRUD (MA4/5/6)
  fix(backend): freeze F11 matrix inline in the initial migration (MA3)
  fix(backend): stream store_blob and enforce max_bytes mid-write (MA2)
  fix(backend): stop seeding the audit-writer role via postgres-init (MA1)
  fix(backend): make google-re2 a hard dependency, drop re fallback (B1)
  chore(backend): mypy strict clean + ruff format pass
  feat(backend): add content-addressed gzip blob store (D-012)
  fix(backend): align regex_extract + outputs.blob() with D-011/D-012
  fix(backend): align with D-008/D-009 (drop ttp_version, seed F11 matrix)
  docs: update CHANGELOG + tasks for the backend skeleton sprint 0
  test(backend): add pytest baseline (B0.8)
  feat(backend): add Flask app factory, audit writer, flat CRUD + CLI (B0.7)
  feat(backend): add local auth + group-based RBAC matching F11 (B0.6)
  feat(backend): add Jinja2 sandbox + regex_extract filter (B0.5)
  feat(backend): add C2Connector ABC + payload mapping + factory (B0.4)
  feat(backend): add §8 data model + Alembic baseline (B0.2, B0.3)
  chore(backend): bootstrap Python 3.12+ project skeleton (B0.1)
v0.1.0
2026-05-22 11:45:17 +02:00
knacky
0aee6f46eb Merge branch 'docs/architecture-sprint0' into main
Sprint 0 architecture documentation: docs/architecture.md mirroring the
backend/frontend committed code, with 'Known WARN' and 'Anticipated v2'
sections. LGTM code-reviewer + author spec-analyst.

Co-Authored-By: spec-analyst <spec-analyst@mimic.local>

* origin/docs/architecture-sprint0:
  docs: add docs/architecture.md (sprint 0 mirror)
2026-05-22 11:45:17 +02:00
knacky
e77ca906d4 docs(backend): track sprint-0 follow-ups + flag integration migration gap
- `tasks/todo.md`: B0.5 description updated (re2 hard dep, no fallback);
  add a "Backend follow-ups (sprint 1+)" section with M1-M7 + N1-N6 from
  the code-review verdict.
- `CHANGELOG.md`: backend skeleton bullets refreshed (no re fallback,
  streaming blob store, audit + scope on CRUD, 56 unit tests); new
  "Code-review remediation" subsection lists B1 / MA1-MA6 / N4 / N6 / M8
  with one-line rationale each.
- `tests/integration/conftest.py`: leave `db.create_all()` in place but
  add an inline TODO (N6) pointing at the Alembic switchover that will
  exercise the F11 seed + audit-log role grants in CI.
2026-05-22 05:25:04 +02:00
knacky
3a3e3ff0ec feat(backend): wire created_by_id, audit log, F11 scope into CRUD (MA4/5/6)
Three follow-ups on the flat CRUD blueprints triggered by code-review +
spec-analyst (MA4, MA5, MA6).

**MA4 — `created_by_id`** — engagements, TTPs and scenarios now record the
creator from `current_user.id` instead of leaving the FK NULL. The new
`api._helpers.current_user_id()` exposes the UUID safely (returns None when
the request is unauthenticated, e.g. during /healthz).

**MA5 — Audit log integration** — `api._helpers.audit_write(...)` wraps the
hash-chained `AuditWriter` and is called after every successful commit in
the 4 blueprints (engagement / host / ttp / scenario incl. step), recording
the actor, action, resource type/id, IP, user agent, and small metadata
(field list, names, engagement scope). F13 "Toute mutation tracée" (every
mutation is traced) now holds end-to-end.

**MA6 — RT operator scope on engagements** — F11 limits RT operators to
"engagements assignés" (assigned engagements). The previous implementation let them list / read
every engagement and every nested resource. Fix: `is_rt_lead()` short-
circuits the check for RT leads; otherwise a membership probe against
`engagement_member` runs on every list/read and on `_engagement_or_404` in
`hosts.py` and `scenarios.py`. Listings now `JOIN engagement_member` and
filter by `current_user.id`.
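
The MA6 scoping rule can be illustrated without a database. The sketch below models `engagement_member` as a set of `(engagement_id, user_id)` pairs; the `User` dataclass, `visible_engagements`, and the `LookupError` standing in for a 404 are assumptions made for illustration, while `is_rt_lead` and the `rt_lead` group name come from the commit history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: str
    groups: frozenset[str]


def is_rt_lead(user: User) -> bool:
    """RT leads bypass the membership probe."""
    return "rt_lead" in user.groups


def visible_engagements(
    user: User, engagement_ids: list[str], members: set[tuple[str, str]]
) -> list[str]:
    """Equivalent of the JOIN on engagement_member filtered by the user id."""
    if is_rt_lead(user):
        return list(engagement_ids)
    return [eid for eid in engagement_ids if (eid, user.id) in members]


def engagement_or_404(
    user: User, engagement_id: str, engagement_ids: list[str], members: set[tuple[str, str]]
) -> str:
    """Out-of-scope rows look exactly like missing rows (404, not 403)."""
    if engagement_id not in visible_engagements(user, engagement_ids, members):
        raise LookupError("engagement not found")  # stand-in for a 404
    return engagement_id
```

Raising the same error for "does not exist" and "not a member" avoids leaking which engagement ids are valid.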

`audit_write` casts `db.session` (a `scoped_session` proxy) to the unwrapped
`sqlalchemy.orm.Session` that `AuditWriter` expects; the two are
interchangeable at runtime.

The promotion-perm check on TTPs no longer needs a lazy `flask_login` import
since the decorator scope already brings `current_user` in.
2026-05-22 05:24:54 +02:00
knacky
36c1ed5ffb fix(backend): freeze F11 matrix inline in the initial migration (MA3)
Code-review MAJOR MA3. The initial Alembic migration imported the live
`mimic.rbac.matrix.GROUP_PERMISSIONS` to seed the `permission` / `group` /
`group_permission` rows. That breaks the Alembic invariant "a migration
produces the same schema regardless of when you replay it": a future tweak
to the runtime matrix would silently change the seeded baseline on a fresh
DB.

Two changes:

1. The migration now carries an *inline frozen snapshot* of the F11 matrix
   (`_PERMISSIONS_FROZEN`, `_GROUP_PERMISSIONS_FROZEN`, `_GROUP_DESCRIPTIONS`).
   The seed reads from these tuples/dicts only. If the canonical matrix
   evolves, the next migration is responsible for the delta.

2. A new unit test `test_migration_seed_matches_current_matrix` enforces
   that the frozen seed equals the runtime `Permission` enum and
   `GROUP_PERMISSIONS` mapping. Drift now fails CI loudly with a hint to
   write a new migration instead of editing the existing one.
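
A reduced sketch of the drift check in point 2, assuming a toy three-permission matrix. Only the identifiers `_PERMISSIONS_FROZEN`, `_GROUP_PERMISSIONS_FROZEN`, `Permission`, and `GROUP_PERMISSIONS` come from the commit; the permission values other than `ttp_promote` and `audit_read`, and the helper `seed_drift`, are hypothetical.

```python
from enum import Enum


class Permission(str, Enum):
    ENGAGEMENT_READ = "engagement_read"  # hypothetical permission name
    TTP_PROMOTE = "ttp_promote"
    AUDIT_READ = "audit_read"


# Runtime matrix (toy version of mimic.rbac.matrix.GROUP_PERMISSIONS).
GROUP_PERMISSIONS: dict[str, frozenset[Permission]] = {
    "rt_operator": frozenset({Permission.ENGAGEMENT_READ}),
    "rt_lead": frozenset(Permission),
}

# Frozen snapshot as it would live inside the migration file: plain strings,
# no import from the runtime package.
_PERMISSIONS_FROZEN = ("engagement_read", "ttp_promote", "audit_read")
_GROUP_PERMISSIONS_FROZEN = {
    "rt_operator": ("engagement_read",),
    "rt_lead": ("engagement_read", "ttp_promote", "audit_read"),
}


def seed_drift() -> list[str]:
    """Return human-readable differences between snapshot and runtime matrix."""
    problems: list[str] = []
    if set(_PERMISSIONS_FROZEN) != {p.value for p in Permission}:
        problems.append("permission list drifted; write a new migration")
    runtime = {g: {p.value for p in perms} for g, perms in GROUP_PERMISSIONS.items()}
    frozen = {g: set(perms) for g, perms in _GROUP_PERMISSIONS_FROZEN.items()}
    if runtime != frozen:
        problems.append("group matrix drifted; write a new migration")
    return problems
```

A unit test would then assert that `seed_drift()` is empty, so any edit to the runtime matrix fails CI until a follow-up migration carries the delta.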

Also: docstring no longer mentions `ttp_version` (M8 follow-up).
2026-05-22 05:24:37 +02:00
knacky
feadad850b fix(backend): stream store_blob and enforce max_bytes mid-write (MA2)
Code-review MAJOR MA2. The previous `store_blob(root, data: bytes)` signature
forced the entire payload into RAM before the 10 MB cap was checked — a
hostile-large output blob could OOM the worker before the limit even fired.

New signature: `store_blob(root, stream, *, max_bytes=10_485_760)`. The
implementation:
- reads from `stream` in 64 KB chunks;
- updates the sha256 + writes to `<root>/.tmp-<pid>-<rand>.gz` incrementally;
- raises `BlobTooLarge(max_bytes)` as soon as the running total crosses the
  cap, then unlinks the partial temp file via `contextlib.suppress`;
- atomic-renames the temp file to the CAS path `<aa>/<bb>/<sha256>.gz` once
  the stream finishes;
- sets `0o750` on the directory and `0o640` on the file with explicit
  `os.chmod` (does not rely on the process umask).
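
The steps listed above can be sketched as follows. This is an approximation under stated assumptions, not the committed implementation: returning the hex digest, skipping the write when the CAS file already exists, and applying `0o750` only to the leaf directory are all choices made for this example.

```python
import contextlib
import gzip
import hashlib
import os
import secrets
from pathlib import Path
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # 64 KB reads, as described in the commit


class BlobTooLarge(Exception):
    def __init__(self, max_bytes: int) -> None:
        super().__init__(f"blob exceeds {max_bytes} bytes")
        self.max_bytes = max_bytes


def store_blob(root: Path, stream: BinaryIO, *, max_bytes: int = 10_485_760) -> str:
    """Stream `stream` into <root>/<aa>/<bb>/<sha256>.gz and return the digest."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    root.mkdir(parents=True, exist_ok=True)
    tmp = root / f".tmp-{os.getpid()}-{secrets.token_hex(8)}.gz"
    digest = hashlib.sha256()
    total = 0
    try:
        with gzip.open(tmp, "wb") as gz:
            while chunk := stream.read(CHUNK_SIZE):
                total += len(chunk)
                if total > max_bytes:
                    # Fail as soon as the running total crosses the cap.
                    raise BlobTooLarge(max_bytes)
                digest.update(chunk)
                gz.write(chunk)
        sha = digest.hexdigest()
        final = root / sha[:2] / sha[2:4] / f"{sha}.gz"
        if final.exists():
            tmp.unlink()  # assumed dedup behaviour: keep the existing file
            return sha
        final.parent.mkdir(parents=True, exist_ok=True)
        os.chmod(final.parent, 0o750)  # explicit, independent of the umask
        os.replace(tmp, final)  # atomic rename within one filesystem
        os.chmod(final, 0o640)
        return sha
    except BaseException:
        # Remove the partial temp file on any failure, then re-raise.
        with contextlib.suppress(FileNotFoundError):
            tmp.unlink()
        raise
```

Because the hash and the gzip writer are both fed chunk by chunk, peak memory stays near one chunk regardless of the payload size.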

Updated unit tests cover: BlobTooLarge enforcement (with temp-file cleanup),
multi-chunk happy path (1.5 MB payload exercising the 64 KB loop), and
`max_bytes <= 0` validation.
2026-05-22 05:24:25 +02:00
knacky
6e803a482a fix(backend): stop seeding the audit-writer role via postgres-init (MA1)
Code-review MAJOR MA1. The previous `scripts/postgres-init/00-roles.sql`
hardcoded a `CHANGE_ME` password for `mimic_audit_writer` and was bind-mounted
into the dev Postgres container; on prod boxes this risks lingering as the
real credential.

- The init script was removed in the previous commit alongside the dropped
  scripts dir.
- `docker-compose.yml` no longer mounts a `docker-entrypoint-initdb.d`
  directory; the audit-writer role provisioning is the Ansible playbook's
  responsibility (D-010).
- `backend/README.md` documents the manual one-shot `CREATE ROLE` command
  for local dev with a placeholder password.

Net effect: no `CHANGE_ME` credential reaches a container image / git history.
The Alembic migration's `audit_log` grant block stays idempotent — it is a
no-op when the role is absent.
2026-05-22 05:24:13 +02:00
knacky
90f8141cfc fix(backend): make google-re2 a hard dependency, drop re fallback (B1)
Code-review BLOCKER B1. Reaffirms D-011: a `re` stdlib fallback defeats the
OPSEC-safe-regex guarantee because hostile C2 output can trigger catastrophic
backtracking. The `[:1MB]` slice cap does not mitigate that — re-evaluating
a malicious pattern over 1 MB of attacker-controlled text is still a worker
freeze.

- `mimic.templating.filters` now imports `re2` unconditionally and raises
  `RuntimeError` at module load if the binding is absent. No `re` import,
  no `_HAS_RE2` branch, no `_FALLBACK_MAX_INPUT`.
- `pyproject.toml` already pinned `google-re2 >= 1.1, < 2.0`; this commit
  hardens the import path to actually enforce it.
- New test `test_re2_is_required` asserts the binding is wired in.
2026-05-22 05:23:47 +02:00
knacky
05f60cde6d docs: add docs/architecture.md (sprint 0 mirror)
High-level architecture snapshot reflecting feature/backend-skeleton
@ 12d131c and feature/frontend-skeleton @ b505a65. Covers:

- Repo + backend + frontend module trees.
- §8 aggregates with delta annotations vs the frozen spec.
- F11 permission matrix mapping to rbac/matrix.py.
- Auth split (RT bcrypt session vs SOC opaque token) per D-003 / D-006.
- Cleanup templating (Jinja sandbox + regex_extract D-011 semantics).
- C2 abstraction layer + Mythic / Home stub.
- Storage pools layout (CAS blobs + flat evidence) per D-012.
- Sprint 0 happy-path flow + post-sprint scope boundary.
- Known WARN items (audit chain unverified, scope on /engagements,
  role free-text on engagement_member, deferred Q-003..Q-005).
- Anticipated-vs-v2 table summarising D-004 / D-008 / D-012 / D-013.

This is a living mirror — when code disagrees, code wins, file a doc fix.
2026-05-22 05:11:25 +02:00
knacky
adab8a58e7 chore(backend): mypy strict clean + ruff format pass
Pre-merge sanity per devops checklist (ruff format --check, mypy --strict).

Type fixes:
- ORM models: `Mapped[dict]` → `Mapped[dict[str, Any]]` (audit, scenario, run,
  report, ttp, detection.artifact_files_json). Equivalent on Pydantic DTOs
  (TtpBase.params_schema_json, ScenarioStepBase.params_override_json).
- Rename `TtpRead.current_version` → `TtpRead.version` to mirror the ORM
  column (which itself was renamed in D-009 cleanup).
- Flask blueprints: add `-> ResponseReturnValue` to every view, plus typed
  UUID params on `_validate_step_consistency`.
- `templating/filters.py`: rewrite the conditional re2 import so mypy can
  narrow the union (`ModuleType | None`); the runtime branch on `_re2 is not
  None` removes the unused-ignore that was triggered by warn_unused_ignores.
- `pyproject.toml`: add `flask_login.*` and `pythonjsonlogger.*` to the
  `[[tool.mypy.overrides]]` `ignore_missing_imports` list (both ship without
  typed marker).
- Misc: drop stale `# type: ignore` comments (`app.py:36`,
  `rbac/decorators.py:35`) flagged by `warn_unused_ignores`. Keep
  `logging.JsonFormatter` ignore because the symbol exists at runtime but is
  not re-exported through the typed surface.

Formatting:
- `ruff format` applied (15 files normalized; line-length unchanged at 100).

Verification on this commit:
- `ruff check`  → All checks passed.
- `ruff format --check` → 68 files already formatted.
- `mypy --strict src` → Success: no issues found in 54 source files.
- `pytest tests/unit` → 49 passed.
2026-05-22 05:10:51 +02:00
knacky
12d131c826 feat(backend): add content-addressed gzip blob store (D-012)
Two on-disk pools per D-012:
- `MIMIC_BLOB_ROOT` (default `/var/lib/mimic/blobs/`) holds C2 polling
  output blobs, content-addressed gzip layout `<aa>/<bb>/<sha256>.gz`.
- `MIMIC_EVIDENCE_ROOT` (default `/var/lib/mimic/evidence/`) reserved for
  user-uploaded evidence (flat per-engagement, no compression). Wired only
  in config + .env.example here; F8 endpoint lands later.

`mimic.storage.blob`:
- `blob_path(root, sha256_hex)` validates the digest and returns the CAS
  path. Raises ValueError on a malformed digest (length != 64 or non-hex).
- `store_blob(root, data)` hashes, gzip-compresses, atomically writes to
  `<aa>/<bb>/<sha256>.gz` (0o750 dir perms, 0o640 file perms). Idempotent:
  duplicate writes leave mtime untouched.
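
For reference, the digest validation and two-level layout of `blob_path` might look like the sketch below. Accepting only lowercase hex is an assumption; the commit only states that malformed digests (wrong length or non-hex) raise `ValueError`.

```python
from pathlib import Path

_HEX_DIGITS = frozenset("0123456789abcdef")  # lowercase only: an assumption


def blob_path(root: Path, sha256_hex: str) -> Path:
    """Map a SHA-256 hex digest to <root>/<aa>/<bb>/<sha256>.gz."""
    if len(sha256_hex) != 64 or not set(sha256_hex) <= _HEX_DIGITS:
        raise ValueError("expected a 64-character hexadecimal SHA-256 digest")
    return root / sha256_hex[:2] / sha256_hex[2:4] / f"{sha256_hex}.gz"
```

Validating the digest before building the path also rules out traversal input such as `../`, since only hex characters are accepted.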

5 new unit tests cover happy path, deduplication, idempotency, malformed
digest, and the two-byte-pair directory layout.
2026-05-21 20:44:59 +02:00
knacky
162b6988f8 fix(backend): align regex_extract + outputs.blob() with D-011/D-012
D-011 — `regex_extract(text, pattern, *, group=1, name=None)`:
- engine google-re2 (linear-time, ReDoS-safe), `re` fallback with 1 MB cap.
- first match only.
- no match → raises Jinja2 `TemplateError` (no silent default — cleanup
  templates must fail loud when source string drifts).
- default capture is group 1 with fallback to group(0) when the pattern has
  no groups; named groups via `name="<name>"`.
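
The group-selection rules above are easier to see in code. The sketch below deliberately uses the standard-library `re` module so it runs anywhere; it illustrates the selection semantics only and does not provide the linear-time guarantee that D-011 requires from google-re2. `ExtractError` is a local stand-in for the Jinja2 `TemplateError` named in the commit.

```python
import re
from typing import Optional


class ExtractError(ValueError):
    """Stand-in for the Jinja2 TemplateError raised by the real filter."""


def regex_extract(
    text: str, pattern: str, *, group: int = 1, name: Optional[str] = None
) -> str:
    """Return the selected capture from the first match, or fail loudly."""
    compiled = re.compile(pattern)
    match = compiled.search(text)  # first match only
    if match is None:
        # No silent default: a drifting source string must surface as an error.
        raise ExtractError(f"pattern {pattern!r} did not match")
    if name is not None:
        return match.group(name)  # named group takes precedence
    if compiled.groups == 0:
        return match.group(0)  # pattern has no groups: whole match
    return match.group(group)
```

For example, `regex_extract("pid=4242 pid=7", r"pid=(\d+)")` returns `"4242"`, and a pattern without groups such as `r"\d+"` returns the whole first match.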

D-012 — `outputs.blob()`:
- reads the gzip-compressed CAS file from `MIMIC_BLOB_ROOT`.
- 10 MB cap is applied **after** decompression.
- decode UTF-8 with latin-1 fallback; never raises (missing / corrupt /
  non-gzip blobs return empty string, logged at WARNING).
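
A rough sketch of the never-raising read path described above. The function name `read_blob_text` is hypothetical, and treating the 10 MB cap as truncation of the decompressed bytes (rather than rejection) is an assumption the commit message does not settle.

```python
import gzip
import logging
import zlib
from pathlib import Path

logger = logging.getLogger(__name__)

MAX_DECOMPRESSED = 10 * 1024 * 1024  # cap applied after decompression


def read_blob_text(path: Path, max_bytes: int = MAX_DECOMPRESSED) -> str:
    """Read a gzip blob as text; return "" instead of raising on any failure."""
    try:
        with gzip.open(path, "rb") as gz:
            raw = gz.read(max_bytes)  # reads at most max_bytes decompressed bytes
    except (OSError, EOFError, zlib.error) as exc:
        # Covers missing files, non-gzip input, truncated and corrupt streams.
        logger.warning("blob %s unreadable: %s", path, exc)
        return ""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("latin-1")
```

Capping the read on the decompressed side is what protects against a small gzip file that expands to a very large output.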

Unit tests rewritten to cover both the new fail-loud regex contract and
the gzip read path. 49 unit tests pass; ruff clean.
2026-05-21 20:44:48 +02:00
knacky
d470db97d9 fix(backend): align with D-008/D-009 (drop ttp_version, seed F11 matrix)
D-009 reaffirms spec H32: no `ttp_version` table. Replayability lives solely
on `run.snapshot_json`. The previous initial migration introduced a separate
`ttp_version` aggregate by mistake — removed here.

D-008 requires the bootstrap to seed exactly the three F11 groups
(`rt_operator`, `rt_lead`, `soc_analyst`) with exactly the F11 permission
matrix. The migration now:
- inserts every `Permission` enum value into the `permission` table,
- inserts the three groups with deterministic uuid5(NAMESPACE_DNS, ...) ids,
- inserts the matching `group_permission` rows from GROUP_PERMISSIONS.
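
Deterministic group ids of this kind can be derived with `uuid.uuid5`. The name string fed to `uuid5` is elided in the commit message, so the `"<group>.groups.mimic.invalid"` format below is purely a placeholder, as is the helper name `group_id`.

```python
import uuid

F11_GROUPS = ("rt_operator", "rt_lead", "soc_analyst")


def group_id(name: str) -> uuid.UUID:
    """Derive a stable id from the group name (placeholder name format)."""
    return uuid.uuid5(uuid.NAMESPACE_DNS, f"{name}.groups.mimic.invalid")


# Same input always yields the same UUID, on any host and at any time,
# so replaying the migration on a fresh database seeds identical rows.
GROUP_IDS = {name: group_id(name) for name in F11_GROUPS}
```

Stable ids let later migrations and fixtures reference a group by UUID without first querying for it.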

Also renames `ttp.current_version` to `ttp.version` (matches §8 spec column
name; the value remains informational per H32 / D-009).
2026-05-21 20:44:37 +02:00
ux-frontend
b505a654f8 fix(frontend): address M1-M3 polish from code-reviewer
M1 — Single SessionProvider via nested router.
  The previous router had two route entries with `path: '/'`
  (Navigate, AppShell) plus a separate `/login` entry, each wrapped in
  its own RootLayout. That instantiated SessionProvider three times,
  forking state the moment session writes diverged across siblings.
  Replaced by one Root route with SessionProvider + <Outlet />, and
  index/login/AppShell-children nested underneath. RootLayout (the
  per-tree wrapper) is now obsolete and deleted; the new Root component
  lives in src/routing/Root.tsx (addresses NIT N4 as a side effect).

M2 — Typo: "pollign" → "polling" in LiveCockpitPage masthead.

M3 — Replace asymmetric `?? 'rt_operator'` / `?? 'soc_analyst'`
  fallbacks in LiveCockpitPage with an explicit `if (!user) return null;`
  guard placed after all hooks (rules-of-hooks). AppShell already
  redirects unauthenticated visitors to /login, so the guard documents
  the invariant rather than introducing one.

NITs N1-N3, N5-N7 recorded in tasks/todo.md as sprint 1+ follow-ups.
2026-05-21 20:44:32 +02:00
knacky
887182cfd7 docs: update CHANGELOG + tasks for the backend skeleton sprint 0
- CHANGELOG.md: detail every B0.1..B0.8 deliverable + spec deltas
  D-008 (ttp_version coexists), D-009 (audit hash chain v1),
  D-010 (no type_annotation_map on declarative base).
- tasks/todo.md: tick every B0.x item.
- tasks/spec-decisions.md: log D-008, D-009, D-010 alongside the
  pre-existing D-001..D-007.
2026-05-21 20:39:06 +02:00
knacky
5d9415bb9f test(backend): add pytest baseline (B0.8)
Unit (SQLite, pure logic):
- test_templating.py: Jinja2 sandbox, regex_extract, strict-undefined,
  sandbox blocks attribute-access escape, output blob 10 MB cap.
- test_password.py: bcrypt hash + verify, empty / malformed handling.
- test_soc_token.py: 256-bit url-safe token + bcrypt verification.
- test_rbac_matrix.py: F11 invariants (lead ⊇ operator, SOC restricted
  to detection + report-read, audit_read & ttp_promote lead-only).
- test_connector_factory.py: register / build / double-register-rejected,
  TaskStatus terminal helper, Mythic mapping vs empty Home mapping.
- test_audit_hash.py: SHA-256 chain helper is deterministic and reacts
  to prev_hash / metadata changes.

Integration scaffold (testcontainers Postgres):
- tests/integration/conftest.py spins up postgres:16-alpine, monkeypatches
  MIMIC_DATABASE_URL, creates a Flask app + db.create_all.
- test_healthz.py: end-to-end smoke through the Flask test client.

38 unit tests pass; ruff clean.
2026-05-21 20:36:03 +02:00
knacky
9fa4d61304 feat(backend): add Flask app factory, audit writer, flat CRUD + CLI (B0.7)
- Flask app factory wires SQLAlchemy / Migrate / Login / SocketIO and
  registers every blueprint. /healthz smoke endpoint included.
- Pydantic 2 DTOs (request/response) for engagement / host / TTP /
  scenario aggregates with from_attributes=True conversion.
- Flat CRUD blueprints under /api/v1/:
  * engagements (list / create / get / put / delete-as-archive)
  * hosts (engagement-scoped CRUD)
  * library/ttps (CRUD; promote requires the lead-only TTP_PROMOTE)
  * scenarios + steps (F3 invariant enforced: host.c2_type must match
    scenario.c2_type at compose time, 400 otherwise).
- @require_perm guards every endpoint per the F11 matrix.
- audit/ writer is hash-chained from v1 (SHA-256 of canonical record
  plus previous hash). The SQL-level write-only role enforcement ships
  in the deploy playbook (idempotent grants run at migration time).
- mimic-cli (click): user create (seeds RT operator/lead with group
  membership), db dump / db restore (manual pg_dump/pg_restore, R-O1).
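
A minimal sketch of the hash chain described in the audit/ bullet above, not
the code shipped in this commit. The canonical form (sorted-key compact JSON),
the concatenation order, the all-zero genesis value and the helper names
(`row_hash`, `append_row`, `verify_chain`) are assumptions; only the
`prev_hash` / `row_hash` field names and the use of SHA-256 come from the log.

```python
import hashlib
import json

# Assumption: the first row chains onto an all-zero hash.
GENESIS_HASH = "0" * 64


def row_hash(record: dict, prev_hash: str) -> str:
    """SHA-256 over the previous hash followed by a canonical JSON record."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    payload = prev_hash.encode("ascii") + canonical.encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_row(rows: list, record: dict) -> None:
    """Append a record, linking it to the hash of the last stored row."""
    prev = rows[-1]["row_hash"] if rows else GENESIS_HASH
    rows.append(
        {"record": record, "prev_hash": prev, "row_hash": row_hash(record, prev)}
    )


def verify_chain(rows: list) -> bool:
    """Recompute every link; any edited record or broken link fails the check."""
    prev = GENESIS_HASH
    for row in rows:
        if row["prev_hash"] != prev:
            return False
        if row["row_hash"] != row_hash(row["record"], prev):
            return False
        prev = row["row_hash"]
    return True
```

Sorting keys before hashing is what makes the helper deterministic for equal
records, and feeding `prev_hash` into each digest is what makes an edit to an
early row invalidate every later one.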

No orchestrator, no WebSocket, no report generation — those land after
PR1/PR2/PR3.
2026-05-21 20:36:03 +02:00
knacky
7f4ad85a68 feat(backend): add local auth + group-based RBAC matching F11 (B0.6)
- Permission enum + GroupName enum + GROUP_PERMISSIONS mapping mirror
  the F11 matrix in code (verifiable against the spec table in tests).
- @require_perm decorator: 401 on anonymous, 403 on missing permission,
  passes through otherwise. Pure-function user_has() for unit-testing.
- AuthUser (Flask-Login wrapper) resolves the permission set from a
  User's groups; load_user is the Flask-Login user_loader.
- bcrypt password hashing helpers (12 rounds by default, configurable).
- SOC opaque token (D-006): secrets.token_urlsafe(32), bcrypt-hashed at
  rest, plain value returned once at creation and never re-displayable.
- Group-based RBAC from day one (D-003) — Keycloak OIDC in v2 maps onto
  the same group model.
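
The matrix-in-code idea above can be sketched as follows. This is an
illustration under assumptions, not the module from this commit: the role
names and the `audit_read` / `ttp_promote` / detection / report-read
permissions appear elsewhere in this log, while the remaining permission
names, the operator's exact set and `status_for` (a framework-free stand-in
for the `@require_perm` decision) are hypothetical.

```python
from enum import Enum


class Permission(str, Enum):
    ENGAGEMENT_READ = "engagement_read"  # hypothetical name
    RUN_EXECUTE = "run_execute"          # hypothetical name
    DETECTION_WRITE = "detection_write"
    REPORT_READ = "report_read"
    AUDIT_READ = "audit_read"
    TTP_PROMOTE = "ttp_promote"


class GroupName(str, Enum):
    RT_OPERATOR = "rt_operator"
    RT_LEAD = "rt_lead"
    SOC_ANALYST = "soc_analyst"


# Assumption: the operator set below is illustrative, not the spec's F11 table.
_OPERATOR = frozenset(
    {Permission.ENGAGEMENT_READ, Permission.RUN_EXECUTE, Permission.REPORT_READ}
)

GROUP_PERMISSIONS = {
    GroupName.RT_OPERATOR: _OPERATOR,
    # Lead is a superset of operator, plus the two lead-only permissions.
    GroupName.RT_LEAD: _OPERATOR | {Permission.AUDIT_READ, Permission.TTP_PROMOTE},
    GroupName.SOC_ANALYST: frozenset(
        {Permission.DETECTION_WRITE, Permission.REPORT_READ}
    ),
}


def user_has(groups, permission: Permission) -> bool:
    """Pure check: does any of the user's groups grant the permission?"""
    return any(permission in GROUP_PERMISSIONS.get(g, frozenset()) for g in groups)


def status_for(groups, permission: Permission) -> int:
    """401 for anonymous (groups is None), 403 when missing, 200 otherwise."""
    if groups is None:
        return 401
    return 200 if user_has(groups, permission) else 403
```

Keeping `user_has` free of any request context is what lets the matrix be
checked against the spec table in plain unit tests.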
2026-05-21 20:36:03 +02:00
knacky
104d73143a feat(backend): add Jinja2 sandbox + regex_extract filter (B0.5)
- CleanupRenderer wraps jinja2.sandbox.SandboxedEnvironment with
  StrictUndefined (no autoescape — shell context, not HTML).
- Custom filter regex_extract(text, pattern, group=1, default='') uses
  google-re2 for linear-time matching (ReDoS-safe) and falls back to
  re with a 1 MB input cap when re2 is absent.
- StepOutputs exposes {{ outputs.text }} and {{ outputs.blob('name') }}.
  blob() decodes UTF-8 with latin-1 fallback, hard-capped at 10 MB
  (consistent with F8 evidence limit, D-005).
- render_cleanup() is the module-level convenience wrapper.
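
A hedged sketch of the two helpers described above: the filter with its
re2-or-re engine selection, and the blob decoding with its size cap. It
follows the `regex_extract(text, pattern, group=1, default='')` signature
quoted in this commit. Assumptions: oversized input on the fallback path is
rejected rather than truncated, sizes are measured with `len()`, and
`decode_blob` is a hypothetical name for the decoding step behind
`outputs.blob()`.

```python
try:
    import re2 as _engine  # google-re2: linear-time matching
    _USING_RE2 = True
except ImportError:
    import re as _engine  # stdlib fallback: backtracking engine
    _USING_RE2 = False

MAX_FALLBACK_INPUT = 1024 * 1024   # 1 MB cap, enforced only on the re fallback
MAX_BLOB_BYTES = 10 * 1024 * 1024  # 10 MB hard cap on blobs


def regex_extract(text, pattern, group=1, default=""):
    """Return the requested capture group, or `default` when nothing matches."""
    if not _USING_RE2 and len(text) > MAX_FALLBACK_INPUT:
        # Assumption: reject instead of silently truncating the input.
        raise ValueError("input exceeds the 1 MB cap of the re fallback")
    match = _engine.search(pattern, text)
    if match is None:
        return default
    value = match.group(group)
    # A group that did not participate in the match yields None.
    return default if value is None else value


def decode_blob(raw: bytes) -> str:
    """Decode as UTF-8, falling back to latin-1, after enforcing the size cap."""
    if len(raw) > MAX_BLOB_BYTES:
        raise ValueError("blob exceeds the 10 MB cap")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("latin-1")
```

With Jinja2, a function like this would typically be exposed through the
environment's `filters` mapping, for example
`env.filters["regex_extract"] = regex_extract`.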
2026-05-21 20:36:03 +02:00
knacky
20112d61ff feat(backend): add C2Connector ABC + payload mapping + factory (B0.4)
- abstract C2Connector with authenticate / list_hosts / execute_task /
  get_task_result / cancel_task / execute_cleanup; stream_task_output
  optional v1 (NotImplementedError).
- Payload / TaskHandle / TaskResult / TaskStatus frozen dataclasses.
- UnsupportedPayloadType raised when no native command maps to the
  chosen (c2_type, payload_type) pair.
- Mythic payload_type → native command map populated (spec §7 table).
- HOME map left empty until PR2 is closed.
- ConnectorFactory: register_connector decorator + build(c2_type) that
  instantiates + authenticates via an injected config resolver.
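
The register / build flow above can be sketched roughly like this. It is a
simplified illustration, not the code from this commit: the ABC is trimmed to
two abstract methods, and the constructor shape, the module-level `_REGISTRY`,
the free function `build`, the `authenticated` flag and `FakeMythicConnector`
are all hypothetical.

```python
from abc import ABC, abstractmethod
from collections.abc import Callable
from dataclasses import dataclass
from enum import Enum


class C2Type(str, Enum):
    MYTHIC = "mythic"
    HOME = "home"


@dataclass(frozen=True)
class TaskHandle:
    task_id: str


class C2Connector(ABC):
    def __init__(self, config: dict) -> None:
        self.config = config
        self.authenticated = False

    @abstractmethod
    def authenticate(self) -> None: ...

    @abstractmethod
    def execute_task(self, host_id: str, command: str) -> TaskHandle: ...

    def stream_task_output(self, handle: TaskHandle):
        # Optional in v1: concrete connectors may leave this unimplemented.
        raise NotImplementedError("streaming is optional in v1")


_REGISTRY: dict[C2Type, type[C2Connector]] = {}


def register_connector(c2_type: C2Type):
    """Class decorator that records a connector; double registration is rejected."""
    def decorator(cls: type[C2Connector]) -> type[C2Connector]:
        if c2_type in _REGISTRY:
            raise ValueError(f"connector already registered for {c2_type.value}")
        _REGISTRY[c2_type] = cls
        return cls
    return decorator


def build(c2_type: C2Type, resolve_config: Callable[[C2Type], dict]) -> C2Connector:
    """Instantiate the registered connector with resolved config, then authenticate."""
    connector = _REGISTRY[c2_type](resolve_config(c2_type))
    connector.authenticate()
    return connector


@register_connector(C2Type.MYTHIC)
class FakeMythicConnector(C2Connector):
    """Stand-in used only for this sketch; no real Mythic calls are made."""

    def authenticate(self) -> None:
        self.authenticated = True

    def execute_task(self, host_id: str, command: str) -> TaskHandle:
        return TaskHandle(task_id=f"{host_id}:{command}")
```

Injecting the config resolver keeps credential lookup out of the connector
classes, so tests can pass a plain lambda instead of touching storage.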

No real Mythic / Home implementations land in this sprint.
2026-05-21 20:36:03 +02:00
knacky
22d37fb240 feat(backend): add §8 data model + Alembic baseline (B0.2, B0.3)
- SQLAlchemy 2 typed mapped classes for every spec §8 aggregate:
  engagement, c2_credential, host, user, group, group_permission,
  user_group, engagement_member, ttp, ttp_version, scenario,
  scenario_step, run, run_step, run_step_cleanup, detection, evidence,
  report, soc_session, audit_log.
- Shared mixins: UuidPkMixin (PG_UUID(as_uuid=True)) + TimestampsMixin.
- StrEnum types covering every spec enum (C2Type, PayloadType, UserType,
  EngagementStatus, HostStatus, TtpSource, RunStatus, RunStepStatus,
  CleanupStatus, DetectionLevel, DetectionSource, EvidenceStatus).
- Alembic baseline migration 202605210001_initial_schema: creates every
  table, enum, index, and idempotent grants for the audit_log
  write-only Postgres role (mimic_audit_writer).
- Audit log carries prev_hash / row_hash from v1 (D-009).
- ttp_version table coexists with run.snapshot_json (D-008,
  overrides H32).
2026-05-21 20:36:03 +02:00
knacky
a93c959444 chore(backend): bootstrap Python 3.12+ project skeleton (B0.1)
- pyproject.toml with ruff + mypy strict + pytest + coverage >=70%
- Makefile with Docker/Podman auto-detect
- Multi-stage Dockerfile (python:3.12-slim-bookworm, non-root user)
- docker-compose.yml for Postgres dev DB
- alembic.ini wired to src/mimic/db/migrations
- scripts/postgres-init/00-roles.sql seeds the audit writer role
- .env.example documents every MIMIC_* var (no secrets committed)
2026-05-21 20:36:03 +02:00
ux-frontend
12bc33469c feat(frontend): wireframes for 5 MVP screens + audit (F0.3)
Mock-data wireframes covering spec §5 / §9 surface area. All read from
src/mocks/fixtures.ts — no backend wiring yet. Each screen is built from
the design-system primitives (Panel, Pill, Button, label-system, status-dot)
and adheres to the instrumentation-grade visual grammar.

Screens:
- /login              LoginPage — RT vs SOC mode switch (segmented), role-tinted.
                       RT form picks rt_operator / rt_lead at sign-in (mock only).
                       SOC form takes a session token (out-of-band, D-006).
                       Left rail carries mission brief + platform telemetry.
- /engagements        EngagementsPage — mission roster table (codename, client,
                       status, c2_type, operators, SOC count, window).
- /runs               LiveCockpitPage — the cornerstone screen. 3-column layout:
                       steps timeline | step detail (resolved command,
                       output, evidence, detection) | side rail
                       (DetectionPanel for SOC; EvidencePanel +
                       DetectionPanel readonly + CleanupPanel for RT).
                       Control bar (F6 pause/skip/retry/abort) is lead-RT-only.
                       Stats header: steps done, detected/partial/missed counts.
- /scenarios          ScenarioComposerPage — 3-column composer:
                       filterable TTP library | ordered steps with delays
                       | inspector (params from params_schema_json, target
                       host list, jinja2 cleanup template preview).
                       c2_type locked at scenario level (D-F3 / H33).
- /library            TtpLibraryPage — catalog table with stealth-variant
                       flagging, source provenance (custom/atr/mission),
                       payload_type chip, tags. Import journal / ATR buttons.
- /reports            ReportPage — restricted MITRE matrix (techniques
                       played only, H29), narration timeline, integrity
                       hash footer (SHA-256, H19/H24/F9). PDF/JSON/MD
                       export buttons.
- /audit              AuditPage — append-only journal viewer (lead RT only,
                       F13). Tabular timestamp/actor/role/action/resource.

UX guardrails baked in:
- SOC analysts never see RT-only controls (conditional rendering, not just
  disabled state). UI layer mirrors backend RBAC but does not replace it.
- Layout density and dark-first palette tuned for long purple-team sessions
  (sober contrast, no flash, status colors carry information without
  being shouted).
- Live cockpit reserves a clear visual slot for cleanup-failed alerts
  (R-T5) — currently a Pill, real alert UX lands when the WebSocket is
  wired in sprint 1+.
2026-05-21 20:31:24 +02:00
ux-frontend
ef081c8c28 feat(frontend): role-aware app shell + routing skeleton (F0.4)
- Role enum (rt_operator, rt_lead, soc_analyst) aligned with spec §3 / F11.
  Frontend predicates (isRT, isLead, isSOC) drive layout only — backend
  remains the source of truth for permissions (D-008).
- SessionContext split into Provider (TSX) and hook (useSession) to satisfy
  react-refresh/only-export-components.
- AppShell composes StatusRail (link health, active run, UTC clock, build) +
  Sidebar (role-conditional nav with keyboard shortcut hints) + Outlet.
  Unauthenticated visitors redirect to /login.
- StatusRail uses a pulsing status-dot pattern and label-system micro-typography
  (uppercase 10px, 0.08em tracking) to evoke an instrument-panel header.
- Router (createBrowserRouter): /login outside the shell, all app routes
  nested inside the shell. RootLayout extracted to its own file for
  fast-refresh compliance.

Routes (sprint 0, flat):
  /login                 LoginPage
  /engagements           EngagementsPage
  /library               TtpLibraryPage (RT only — gated client-side, will
                                          be re-enforced by backend RBAC)
  /scenarios             ScenarioComposerPage (RT only)
  /runs                  LiveCockpitPage
  /reports               ReportPage
  /audit                 AuditPage (lead RT only)

Sub-routes under /engagements/:eid land in sprint 1+ when real scoping
arrives.
2026-05-21 20:31:01 +02:00
ux-frontend
1562478a54 feat(frontend): provisional design system tokens + Logo placeholder (F0.2)
Aesthetic direction: instrumentation-grade telemetry (mission-control / SDR ops),
NOT shadcn defaults, NOT generic AI/SaaS. Mature palette: graphite surface scale,
CRT-amber for RT accent, steel-blue for SOC accent, sage/ochre/rust for detection
status — no neon, no rainbow.

Token layout designed to host the PR3 graphic charter without component churn:
  1. Primitives (--mc-*)        raw OKLCH scales
  2. Semantics (--accent-*, --status-*, --state-*, --surface-*, --line-*, …)
  3. Tailwind @theme mapping    semantic tokens → utilities

Includes:
- src/styles/theme.css       full token surface + base reset + scrollbars + grain
- src/styles/fonts.css       IBM Plex @font-face (self-host only)
- src/styles/globals.css     entry CSS file
- Logo (full/compact/mark) with corner-mark composition
- Panel, Pill, Button primitives reading exclusively from semantic tokens
- Logo.test.tsx (3 cases, Vitest + Testing Library)
2026-05-21 20:30:41 +02:00
ux-frontend
80ca4641a3 feat(frontend): bootstrap Vite + React 19 + TS strict toolchain (F0.1)
- Vite 8 / React 19 / TS 6 strict (noUncheckedIndexedAccess, no baseUrl deprecation)
- Tailwind 4 via @tailwindcss/vite (no PostCSS step)
- TanStack Query 5, react-router-dom 7, Recharts, clsx
- Vitest + Testing Library + jsdom for unit tests
- Playwright skeleton + first smoke spec (login redirect)
- ESLint flat config: typescript-eslint type-checked, react-hooks, react-refresh, prettier
- Prettier config (semi, single quotes, 100-col, lf)
- IBM Plex font @font-face declarations targeting /fonts/ (self-host, no CDN — OPSEC)
2026-05-21 20:30:23 +02:00
knacky
2ead16114d docs(spec): land D-011 (regex_extract) + D-012 (output_blob_ref storage)
D-011 freezes the regex_extract Jinja filter signature
`regex_extract(text, pattern, *, group=1, name=None)`, google-re2 engine,
raise on no-match — unblocks backend B0.5 templating sandbox.

D-012 splits storage in two pools: `blobs/` (CAS sha256 + gzip) for C2
binary outputs and `evidence/` (flat per engagement) for user uploads,
10 MB per-blob cap, no global quota v1.
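
As an illustration of the `blobs/` pool only, a content-addressed store with
SHA-256 keys and gzip payloads might look like the sketch below. The digest
being taken over the uncompressed bytes, the two-character shard directory,
the `.gz` suffix and the `put_blob` / `get_blob` names are assumptions; D-012
as summarized here fixes only CAS sha256 + gzip and the 10 MB per-blob cap.

```python
import gzip
import hashlib
from pathlib import Path

MAX_BLOB_BYTES = 10 * 1024 * 1024  # 10 MB per-blob cap


def _blob_path(root: Path, digest: str) -> Path:
    # Assumption: shard by the first two hex characters to keep directories small.
    return root / "blobs" / digest[:2] / f"{digest}.gz"


def put_blob(root: Path, data: bytes) -> str:
    """Store data under its SHA-256 digest; identical content is written once."""
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError("blob exceeds the 10 MB cap")
    digest = hashlib.sha256(data).hexdigest()
    path = _blob_path(root, digest)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(gzip.compress(data))
    return digest


def get_blob(root: Path, digest: str) -> bytes:
    """Load and decompress a blob, re-checking it against its address."""
    data = gzip.decompress(_blob_path(root, digest).read_bytes())
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("blob content does not match its digest")
    return data
```

Addressing by content hash gives deduplication for free and lets a reader
detect on-disk corruption by recomputing the digest.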

Q-001 and Q-002 removed from open-questions.md (resolved).
Q-003/Q-004/Q-005 marked `deferred` with explicit re-open conditions.
2026-05-21 20:20:27 +02:00
knacky
524c6f1eb4 docs(spec): track open spec questions Q-001..Q-005 for sprint 0
Captures the four grey areas team-lead flagged in the sprint 0 brief
(regex_extract semantics, output_blob_ref storage, /hosts/sync merge
behaviour, payload_type↔home-C2 mapping) plus stale-host policy.

No decisions taken: each entry lists options, a recommended default
if no decision is reached, and a "becomes blocking when…" trigger.
Resolved questions will move to spec-decisions.md as D-NNN entries.
2026-05-21 20:18:57 +02:00
knacky
4ecf4b0b0e chore: tighten gitignore, align README stack, formalize D-010 (Ansible)
- .gitignore: add Keycloak/Mythic/Fernet secret patterns (pfx, p12, token, kdbx,
  credentials.json, secrets.json, service-account*.json), MSVC artifacts
  (lib, exp, idb, ilk, tlog), dedup dist/build/ between Python and Node blocks.
- README.md: align Storage line on H38 (testcontainers Postgres for Postgres-
  specific behavior, incl. unit tests of audit log / RBAC / write-only role).
- README.md: align Deploy line on D-007/D-010 — Docker + Ansible playbook,
  reverse proxy explicitly out-of-Mimic.
- README.md: add proprietary internal use notice.
- CHANGELOG.md: convert markdown link to inline URL (no dangling reference).
- tasks/spec-decisions.md: add D-010 (Ansible for deployment playbook).

Addresses code-reviewer M1/M2/M3 + N2/N3/N4/N6 on commit 047583e.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:16:40 +02:00
knacky
b144c041a7 docs: drop ttp_version from B0.2 + seed groups requirement per D-008/D-009
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:14:44 +02:00
knacky
d03ba062bf docs(spec): add D-008 (group RBAC vs F11) and D-009 (no ttp_version table)
D-008 frames the group-based RBAC layout as an OIDC-prep mechanism that must
seed exactly the three F11 spec roles and their canonical permission matrix.
Custom groups remain out of v1 scope.

D-009 reaffirms H32: replayability lives only on run.snapshot_json. The
ttp_version table listed in B0.2 must be dropped from the initial migration.
2026-05-21 20:13:14 +02:00
knacky
047583eb9c chore: bootstrap repo skeleton with sprint 0 plan
- .gitignore (Python, Node, RT/maldev hygiene, secrets)
- README.md (project framing, stack, conventions, status)
- CHANGELOG.md (team kickoff decisions Q1/Q2/Q3, T2/T3/T4, auth strategy)
- tasks/spec-decisions.md (D-001..D-007 arbitrations on top of frozen spec)
- tasks/todo.md (sprint 0 backlog: B0.* / F0.* / S0.* / R0.*)
- tasks/lessons.md (empty, populated as work progresses)
- backend/ frontend/ docs/ scaffolding

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:10:47 +02:00
knacky
030a018970 chore: init repo
2026-05-21 20:07:38 +02:00