# Corrosion Wire Protocol v2 Status: **Phase 0 implemented** (host heartbeat, host commands, going-offline beacon). Per-instance command/status subjects are reserved and specified here for Phase 1. ## Design One **host agent** per machine supervises **N game instances**. Subjects are scoped license-first, then by addressee: ``` corrosion.{license_id}.host.* host-level (the agent itself) corrosion.{license_id}.{instance_id}.* instance-level (one game server) ``` `instance_id` is a config-defined slug (`[a-z0-9_-]{1,64}`), validated at agent start. `host` is a reserved segment and can never be an instance id. Payloads are JSON. Every heartbeat carries `"schema": 2` so consumers can distinguish v2 from the legacy Go companion protocol (which used `corrosion.{license_id}.companion.heartbeat`, no schema field). ## Host-level subjects (Phase 0 — live) ### `corrosion.{license_id}.host.heartbeat` (agent → backend, publish) Published every `heartbeat_seconds` (default 60, jittered ±20%). ```json { "schema": 2, "timestamp": "2026-06-11T18:00:00Z", "agent": { "version": "2.0.0-alpha.1", "commit": "a8722a7", "os": "linux", "arch": "x86_64", "uptime_seconds": 86400 }, "host": { "hostname": "asgard-01", "cpu_percent": 12.5, "cpu_cores": 80, "mem_total_mb": 262144, "mem_used_mb": 81920, "uptime_seconds": 1209600, "disks": [ { "mount": "/", "total_mb": 1907729, "free_mb": 1532211 } ] }, "instances": [ { "id": "rust-main", "game": "rust", "label": "Main 2x Vanilla", "state": "configured", "root_disk_free_mb": 1532211 } ], "probe": { "timestamp": "2026-06-11T17:58:00Z", "results": [ { "name": "corrosion-cdn", "host": "cdn.corrosionmgmt.com", "port": 443, "ok": true, "latency_ms": 18 } ] } } ``` All telemetry is measured, never fabricated. Fields the agent cannot measure are omitted (`probe` before the first probe completes, `hostname` if unavailable). Phase 0 instance `state` values: `configured` (root path exists), `missing_root`. Phase 1 adds live process states: `running`, `stopped`, `crashed`, `starting`, `updating`. ### `corrosion.{license_id}.host.cmd` (backend → agent, request-reply) Request: `{ "func": "" }`. Reply: `{ "status": "success" | "error", ... }`. | func | Reply payload | | --------- | -------------------------------------------------------- | | `ping` | `version`, `commit`, `uptime_seconds` | | `probe` | `report` — fresh ProbeReport (also cached for heartbeat) | | `sysinfo` | `snapshot` — full heartbeat payload, collected on demand | Unknown funcs return `status: "error"` with a message listing supported funcs. ### `corrosion.{license_id}.host.going_offline` (agent → backend, publish) Best-effort beacon (500ms budget) on graceful shutdown so the panel can flip the host to offline immediately instead of waiting out heartbeat staleness. Payload: `{}`. ## Instance-level subjects (Phase 1 — reserved, not yet implemented) ### `corrosion.{license_id}.{instance_id}.cmd` (backend → agent, request-reply) Lifecycle and control for one game instance. Planned funcs: `start`, `stop`, `restart`, `status`, `rcon` (process-class games), `steam_update`, `oxide_install` (rust), plus game-adapter-specific commands (Dune: docker lifecycle, RabbitMQ bus commands, Coriolis reset). ### `corrosion.{license_id}.{instance_id}.status` (agent → backend, publish) State-change events (started/stopped/crashed) so the panel does not wait for the next heartbeat. ### `corrosion.{license_id}.{instance_id}.console` (agent → backend, publish) Live console/log lines for the panel console view. ### `corrosion.{license_id}.{instance_id}.files.cmd` (backend → agent, request-reply) VueFinder-style file manager ops, jailed to the instance root. Carries over the Go agent's jailed filemanager semantics (`fm_list`, `fm_save`, ...); the legacy UNJAILED `files.get/put/delete/list` API is retired and will not be ported. ## Backend mapping notes (Phase 0) - The NestJS NATS bridge subscribes `corrosion.*.host.heartbeat` and `corrosion.*.host.going_offline`. - Until the license→host→instance schema lands, the backend may map the host heartbeat onto the existing single `server_connections` row per license: `companion_last_seen` ← heartbeat arrival, `connection_status` ← connected/offline, resources ← `host.cpu_percent` / `mem_*` / first disk. Instance-level mapping activates with the fleet schema. ## Probing — scope honesty The Phase 0 prober measures **outbound** reachability from the host (TCP connect + latency). It cannot verify **inbound** port-forwarding (the thing players hit). Inbound verification requires a backend-side reverse probe service that attempts connections to the customer's public IP/ports on request; that is specified as a Phase 1+ feature and will reuse this report format with `direction: "inbound"`. ## Versioning - The agent embeds semver + git hash + build timestamp (`--version`, heartbeat `agent` block). - Schema changes bump `schema` and are additive where possible.