feat(host-agent): Rust rewrite Phase 0 — multi-instance foundation, v2 wire protocol, real telemetry
All checks were successful
Test Asgard Runner / test (push) Successful in 3s

New corrosion-host-agent/ crate (Go companion-agent stays as behavior
reference until parity). Wire protocol v2 per COA-B: instance-scoped
subjects corrosion.{license}.{instance}.* + host-level .host.* — spec
in PROTOCOL.md, designed for the license->host->instance fleet model.

- Multi-instance TOML config in the foundation, not retrofitted
- NATS layer on the Vigilance production profile (infinite reconnect,
  capped backoff, 30s ping, 8192-msg offline buffer)
- Heartbeat with real sysinfo telemetry — Go agent shipped hardcoded
  disk/cpu placeholders; this is the panel's first true Resources data
- Connectivity prober (outbound TCP, periodic + on-demand)
- Host cmd channel (ping/probe/sysinfo), going-offline beacon,
  CancellationToken shutdown
- Live-fire verified against production NATS; artifacts: 3.7MB static
  linux-musl, 3.8MB windows .exe (static CRT)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
Vantz Stockwell
2026-06-11 10:02:46 -04:00
parent 1abe57ca40
commit cea3d66cdd
18 changed files with 3292 additions and 0 deletions

View File

@@ -4,6 +4,21 @@ All notable changes to this project will be documented in this file.
## [Unreleased]
### Added (Corrosion Host Agent — Rust rewrite Phase 0 — 2026-06-11)
**New: `corrosion-host-agent/`** — Rust rewrite of the Go companion agent (which stays in-tree as the behavior reference until parity). Wire protocol v2 (COA-B, Commander-approved): instance-scoped subjects `corrosion.{license}.{instance}.*` with host-level `corrosion.{license}.host.*` — full spec in `corrosion-host-agent/PROTOCOL.md`.
- Multi-instance TOML config baked into the foundation (one agent supervises N game instances; rust/conan/soulmask/dune), env overrides for secrets, strict validation (subject-safe ids, reserved segments)
- NATS layer with the production-proven Vigilance profile: infinite reconnect w/ capped backoff, 30s ping, 8192-msg offline send buffer, `tls://` scheme support
- Host heartbeat with REAL telemetry via sysinfo (CPU/mem/disks/per-instance state) — the Go agent hardcoded disk=50000MB and cpu=0.0; this is the first true Resources data
- Connectivity prober (outbound TCP + latency, periodic jittered + on-demand) — first piece of the support-triage story
- Host command channel (`ping`/`probe`/`sysinfo`, request-reply), going-offline beacon, CancellationToken graceful shutdown
- Version embedding (semver + git hash + build ts) in `--version` and every heartbeat
- Verified live against production NATS: connected, heartbeats published, clean shutdown
- Deploy artifacts verified: 3.7MB fully-static linux-musl binary, 3.8MB windows .exe (static CRT, no VC++ redist needed)
**Next phases**: 1 = process-class adapter (spawn/RCON/SteamCMD/files for Rust/Conan/Soulmask) + NestJS v2 heartbeat consumer; 2 = Dune Docker adapter; 3 = signed self-update (release gate) + service install.
### Fixed (Site Audit — Fake Data, Resilience, Fonts — 2026-06-11)
**Frontend:**