docs: archive Podman runner setup runbook + track F-D1..F-D5 Two changes scoped together since both stem from the post-PR2 wrap-up. docs/podman-runner-setup.md (new, ~190 LOC): Operational runbook for the gitea-runner host that drives CI. The first attempt at install hit four traps that this archived version documents so we don't lose the lesson: 1. `act_runner register` performs a sanity ping against the container daemon before writing the credential. Without the Podman socket mounted on the *register one-shot*, register fails silently and no .runner file is produced. The runbook mounts the socket on both register and daemon containers. 2. SELinux blocks rootless socket access by default. Quadlet SecurityLabelDisable=true (or --security-opt label=disable for the legacy CLI form) is the documented bypass. No-op on Debian, required on RHEL/Fedora hosts. 3. The runner user UID is not 1000 on every host (gitea = 1005 here). Quadlet `%U` substitution makes the unit portable; hardcoded UIDs are explicitly called out as a sprint 0 mistake. 4. `podman generate systemd` is officially deprecated. Quadlet is the only supported pattern going forward and is what this runbook ships; legacy alternative is omitted on purpose. Also captures: token placeholder convention (<TOKEN_FROM_GITEA_UI>, never the real value in archived docs), single-use semantics, the "secrets via file, not chat" convention, the `:X.Y.Z` pin policy versus `:latest` in prod (ties into follow-up F-D1), and a decommissioning section that cleans up state without nuking the user-level Podman socket. tasks/todo.md: New section "Frontend follow-ups (sprint 1+)" with F-D1..F-D5 from code-reviewer on `chore/frontend-dockerfile` (649194b). All deferred, none blocking. F-D1 (digest pinning) is project-wide and explicitly references the backend image and the runner image alongside the frontend ones for a single chore commit.
2026-05-23 03:08:03 +02:00
# Gitea Actions runner — Podman rootless runbook
Archived setup procedure for the `gitea-runner` host that drives Mimic CI
(`.gitea/workflows/ci.yml`). Captures the corrections that emerged during
sprint 0 install so future operators don't re-discover the same traps.
## Target architecture
- **Host** : same VM as the Gitea server (sprint 0 deployment choice).
- **Container runtime** : Podman rootless under the existing `gitea` system
user. No new account, no rootful daemon.
- **Runner image** : `docker.io/gitea/act_runner:X.Y.Z` (pinned, see [Pin
policy](#pin-policy)).
- **Auto-start** : Quadlet (`~/.config/containers/systemd/<name>.container`)
— the upstream-recommended pattern since Podman 4.4. `podman generate
systemd` is officially deprecated; do not introduce it.
- **Label exposed to workflows** : `linux` (single, kept short, matches the
`runs-on: linux` line in `.gitea/workflows/ci.yml`).
## Prerequisites on the host
| Component | Requirement | Verify |
| --- | --- | --- |
| Podman | ≥ 4.4 (Quadlet support) | `podman --version` |
| Rootless mode | enabled | `podman info --format '{{.Host.Security.Rootless}}'` prints `true` |
| systemd user mode | linger on for the runner user | `loginctl show-user <user> \| grep Linger` |
| `podman.socket` user unit | available | `ls /usr/lib/systemd/user/podman.socket` |
| Gitea Actions | enabled in `app.ini` | `[actions] ENABLED = true` then restart |
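The Podman version floor from the table can be checked mechanically. The following is a minimal sketch, assuming GNU `sort -V` is available on the host; the function names `version_ge` and `podman_supports_quadlet` are illustrative and not part of Podman.

```bash
#!/usr/bin/env bash
# version_ge A B: succeed when dotted version A is >= B (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# podman_supports_quadlet "podman version 4.9.3": take the last word of the
# `podman --version` output and compare it with the 4.4 floor from the table.
podman_supports_quadlet() {
  local version="${1##* }"
  version_ge "$version" "4.4"
}
```

A possible call on the host: `podman_supports_quadlet "$(podman --version)" || echo "Podman too old for Quadlet" >&2`.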
If Gitea Actions was never activated, edit `/etc/gitea/app.ini`:
```ini
[actions]
ENABLED = true
[actions.log_compression]
ENABLED = true
```
Restart with `sudo systemctl restart gitea`. The UI exposes
`Site Administration → Actions → Runners` once enabled.
## Pin policy
**Never use `:latest` for the runner image in production.** Pin a concrete
`gitea/act_runner:X.Y.Z` tag and bump explicitly through this runbook. The
same policy is tracked for every other production image in
[`tasks/todo.md`](../tasks/todo.md) follow-up **F-D1** (digest pinning
roadmap).
To find the current release: <https://gitea.com/gitea/act_runner/releases>.
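The pin policy lends itself to a small guard before any pull or unit edit. This is a sketch only: `is_pinned_image` is a hypothetical helper, and the tag `0.2.11` in the usage line is an arbitrary example value, not a recommended version.

```bash
#!/usr/bin/env bash
# is_pinned_image REF: succeed only when REF ends in a numeric X.Y.Z tag.
# ":latest", a missing tag, or the literal placeholder "X.Y.Z" are all rejected.
is_pinned_image() {
  [[ "$1" =~ :[0-9]+\.[0-9]+\.[0-9]+$ ]]
}
```

For instance, `is_pinned_image docker.io/gitea/act_runner:0.2.11` succeeds while `is_pinned_image docker.io/gitea/act_runner:latest` fails. The check covers tags only; digest pinning (F-D1) would need a different pattern.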
## Step 1 — Switch to the runner user
```bash
sudo machinectl shell <user>@ # or: sudo -iu <user>
id # capture $UID for later substitution
podman info --format '{{.Host.Security.Rootless}}' # must print "true"
```
If `loginctl show-user <user> | grep Linger` reports `Linger=no`, run as
root **before** going further:
```bash
sudo loginctl enable-linger <user>
```
Without linger the Podman user-mode socket dies when `<user>` logs out and
the runner stops accepting jobs.
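The linger check can be scripted so it is not skipped on a fresh host. A minimal sketch, assuming the `Linger=yes` / `Linger=no` line format printed by `loginctl show-user`; the helper name `linger_enabled` is illustrative.

```bash
#!/usr/bin/env bash
# linger_enabled: read `loginctl show-user <user>` output on stdin and
# succeed only when it contains the exact line "Linger=yes".
linger_enabled() {
  grep -qx 'Linger=yes'
}
```

Possible use as root: `loginctl show-user <user> | linger_enabled || loginctl enable-linger <user>`.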
## Step 2 — Activate the Podman socket
```bash
systemctl --user enable --now podman.socket
systemctl --user status podman.socket
ls -la /run/user/$(id -u)/podman/podman.sock # exists, mode 0660
```
## Step 3 — Pull the runner image
```bash
podman pull docker.io/gitea/act_runner:X.Y.Z # replace X.Y.Z
```
## Step 4 — Generate a baseline config
```bash
mkdir -p ~/.config/act_runner ~/.local/share/act_runner
cd ~/.config/act_runner
podman run --rm docker.io/gitea/act_runner:X.Y.Z \
act_runner generate-config > config.yaml
```
Edit `~/.config/act_runner/config.yaml` — only these keys matter:
```yaml
runner:
  capacity: 2
  envs:
    DOCKER_HOST: "unix:///var/run/docker.sock"   # path as seen by the container
  labels:
    - "linux:docker://node:22-alpine"
container:
  network: "bridge"
  privileged: false
  docker_host: "unix:///var/run/docker.sock"
  options: "--security-opt label=disable"        # see SELinux note below
```
## Step 5 — Register the runner (single-use token)
> **Gotcha — register pings the container daemon.**
> Even though `act_runner register` writes no actual job, it sanity-checks
> at startup that it can reach a container runtime. Without the socket
> mounted, register fails with `cannot ping container daemon` and the
> credential file `~/.local/share/act_runner/.runner` is never written.
> Mount the socket on the register one-shot too — not only on the daemon.
Generate a registration token in `Site Administration → Actions → Runners
→ Create new Runner`, then run **once**:
```bash
podman run --rm \
  --security-opt label=disable \
  -v ~/.config/act_runner/config.yaml:/config.yaml \
  -v ~/.local/share/act_runner:/data \
  -v /run/user/$(id -u)/podman/podman.sock:/var/run/docker.sock \
  -w /data \
  -e GITEA_INSTANCE_URL=https://repo.try2get.in \
  -e GITEA_RUNNER_REGISTRATION_TOKEN=<TOKEN_FROM_GITEA_UI> \
  -e GITEA_RUNNER_NAME=gitea-runner \
  -e GITEA_RUNNER_LABELS=linux \
  docker.io/gitea/act_runner:X.Y.Z \
  act_runner register --no-interactive
```
The token is **single-use**: invalidated the moment `register` succeeds.
Generate a fresh one for each re-registration.
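Because a failed register leaves no credential file behind, it helps to test for that file before moving on. A minimal sketch, assuming the data directory layout used in this runbook; `runner_registered` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# runner_registered DATA_DIR: succeed when DATA_DIR/.runner exists and is
# non-empty, i.e. the register one-shot actually wrote the credential.
runner_registered() {
  [ -s "${1:?data directory required}/.runner" ]
}
```

Possible use right after the one-shot: `runner_registered ~/.local/share/act_runner || echo "register did not write .runner" >&2`.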
> **Secret handling convention.**
> Do not paste the registration token into chat, agent transcripts, or
> issue trackers. Drop it on disk (e.g. `~/runner-token.txt`), `cat` it
> into the environment for the one-shot above, then `shred -u` the file.
> This mirrors the team-lead "secrets via file, not chat" rule.
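One way to follow the file-based convention is a wrapper that exports the token only for the duration of one command and then destroys the file. This is a sketch under stated assumptions: `with_token_from_file` is a hypothetical helper, and it removes the file even when the command fails, so a failed attempt needs a fresh token file.

```bash
#!/usr/bin/env bash
# with_token_from_file FILE CMD...: run CMD with the token from FILE exported
# as GITEA_RUNNER_REGISTRATION_TOKEN, then overwrite and delete FILE.
with_token_from_file() {
  local file="$1"; shift
  local token status
  token="$(tr -d '[:space:]' < "$file")" || return 1
  [ -n "$token" ] || { echo "empty token file: $file" >&2; return 1; }
  # The assignment prefix scopes the variable to this one command.
  GITEA_RUNNER_REGISTRATION_TOKEN="$token" "$@"
  status=$?
  # Destroy the file regardless of the outcome; fall back to rm without shred.
  shred -u "$file" 2>/dev/null || rm -f "$file"
  return "$status"
}
```

With this wrapper the one-shot could pass `-e GITEA_RUNNER_REGISTRATION_TOKEN` without a value, which tells Podman to take the value from the calling environment, so the token never appears on the command line.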
Verify the runner appears in the Gitea UI at
`https://repo.try2get.in/-/admin/actions/runners` with status `idle`.
## Step 6 — Quadlet unit (auto-start)
`~/.config/containers/systemd/gitea-runner.container` :
```ini
[Unit]
Description=Gitea Actions Runner (Mimic) — Podman rootless
After=podman.socket
Requires=podman.socket
[Container]
Image=docker.io/gitea/act_runner:X.Y.Z
ContainerName=gitea-runner
SecurityLabelDisable=true
Volume=%h/.config/act_runner/config.yaml:/config.yaml:ro
Volume=%h/.local/share/act_runner:/data
Volume=/run/user/%U/podman/podman.sock:/var/run/docker.sock
WorkingDir=/data
Exec=act_runner daemon --config /config.yaml
[Service]
Restart=on-failure
RestartSec=5
TimeoutStartSec=900
[Install]
WantedBy=default.target
```
Notes:
- `%h` expands to the runner user's `$HOME`, `%U` to the runner user's UID.
Hardcoding `1000` (or any specific UID) was a sprint 0 mistake — the
actual `gitea` UID on this host is **1005**. Quadlet substitution makes
the unit portable across hosts.
- `SecurityLabelDisable=true` is the Quadlet equivalent of
`--security-opt label=disable`. It turns off SELinux label separation for
the container so the rootless container can read and write the host Podman
socket. On hosts without SELinux enforcement (vanilla Debian/Ubuntu) this
is a no-op; on RHEL/Fedora-like hosts it is required: without it, the
socket connect fails with "Permission denied".
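The hardcoded-UID mistake described in the notes above is easy to lint for. A minimal sketch, assuming the socket path shape `/run/user/<uid>/` used throughout this runbook; `has_hardcoded_uid` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# has_hardcoded_uid FILE: succeed when FILE references /run/user/<digits>/
# directly instead of going through the %U specifier.
has_hardcoded_uid() {
  grep -Eq '/run/user/[0-9]+/' "$1"
}
```

Possible use before activating the unit: `has_hardcoded_uid ~/.config/containers/systemd/gitea-runner.container && echo "replace the numeric UID with %U" >&2`.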
Activate:
```bash
systemctl --user daemon-reload
systemctl --user start gitea-runner.service # generated from .container
# No "systemctl --user enable" step: Quadlet units are generated and cannot be
# enabled. WantedBy=default.target in the .container file starts them at boot.
journalctl --user -u gitea-runner.service -e
```
Quadlet generates `gitea-runner.service` automatically; do not create it by
hand under `~/.config/systemd/user/`.
## Step 7 — Smoke validation
Push a transient workflow on a feature branch. Example used during sprint 0
(file lived at `.gitea/workflows/smoke.yml` on `chore/podman-and-ci`,
removed after green):
```yaml
name: smoke
on:
  push:
    branches: [chore/podman-and-ci]
  workflow_dispatch:
jobs:
  hello:
    runs-on: linux
    steps:
      - run: |
          echo "host: $(uname -a)"
          id
          head -3 /etc/os-release
```
Job picked up and green within ~10 s on the Gitea Actions tab → runner is
operational. Failures usually trace back to one of the gotchas captured
above (`journalctl --user -u gitea-runner.service -e` is authoritative).
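A first-pass triage of the journal can be sketched as a filter over the two error strings quoted in this runbook. The helper name `triage_runner_log` and the hint wording are illustrative, and real log lines may be phrased differently, so treat a silent result as "no known trap matched" rather than "healthy".

```bash
#!/usr/bin/env bash
# triage_runner_log: read log text on stdin and print one hint per known trap.
triage_runner_log() {
  local log
  log="$(cat)"
  if grep -q 'cannot ping container daemon' <<<"$log"; then
    echo "daemon unreachable: confirm podman.sock is mounted at /var/run/docker.sock"
  fi
  if grep -qi 'permission denied' <<<"$log"; then
    echo "possible SELinux denial: confirm SecurityLabelDisable=true in the unit"
  fi
  return 0
}
```

Possible use: `journalctl --user -u gitea-runner.service -e --no-pager | triage_runner_log`.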
## Step 8 — Repo secrets
CI consumes the following secrets, configured per repo at
`<repo>/settings/actions/secrets`:
| Secret | Use | Value |
| --- | --- | --- |
| `FERNET_KEY_TEST` | `MIMIC_FERNET_KEY` in CI jobs | `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"` once, fixed thereafter |
Never reuse production Fernet material in CI.
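Before pasting a value into the secrets form, its shape can be sanity-checked. A Fernet key is 32 random bytes in URL-safe base64, which gives 44 characters ending in a single `=`. The sketch below checks only that shape; `looks_like_fernet_key` is a hypothetical helper and does not prove the key came from a good random source.

```bash
#!/usr/bin/env bash
# looks_like_fernet_key KEY: succeed when KEY is 43 URL-safe base64 characters
# followed by one "=", the encoded length of 32 bytes.
looks_like_fernet_key() {
  [[ "$1" =~ ^[A-Za-z0-9_-]{43}=$ ]]
}
```

A standard base64 value containing `+` or `/` is rejected, which catches keys generated with the wrong alphabet.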
## Decommissioning
```bash
# As the runner user:
systemctl --user stop gitea-runner.service   # generated unit: removing the .container file disables it
rm ~/.config/containers/systemd/gitea-runner.container
systemctl --user daemon-reload
rm -rf ~/.config/act_runner ~/.local/share/act_runner
# Drop the runner entry in Gitea UI: Site Admin → Actions → Runners → Delete.
```
The Podman socket and linger setting stay — they are user-level and shared
with anything else the user runs.
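After decommissioning, a quick check can confirm that no runner state was left behind. A minimal sketch using the three paths from this runbook; `leftover_runner_state` is a hypothetical helper and takes the home directory as an optional argument.

```bash
#!/usr/bin/env bash
# leftover_runner_state [HOME_DIR]: print every runner path that still exists
# under HOME_DIR (defaults to $HOME). Prints nothing when cleanup is complete.
leftover_runner_state() {
  local home="${1:-$HOME}" path
  for path in \
    "$home/.config/containers/systemd/gitea-runner.container" \
    "$home/.config/act_runner" \
    "$home/.local/share/act_runner"; do
    if [ -e "$path" ]; then
      echo "$path"
    fi
  done
  return 0
}
```

It deliberately ignores the Podman socket and the linger setting, since both are meant to stay.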
## Cross-references
- Sprint 0 decisions: [`tasks/spec-decisions.md`](../tasks/spec-decisions.md)
(D-007 reverse proxy scope, D-010 Ansible playbook scope).
- CI workflow: [`.gitea/workflows/ci.yml`](../.gitea/workflows/ci.yml).
- Deferred CI work: [`tasks/todo.md`](../tasks/todo.md) section "CI
follow-ups (sprint 1+)".