Introduce a Supervisor trait (async-trait) so the agent manages games with different models behind one wire contract. ProcessSupervisor (spawned process: rust/conan/soulmask) and the new DockerComposeSupervisor (dune) both impl it; Agent.supervisors is now HashMap<String, Arc<dyn Supervisor>> and instancecmd dispatch is game-agnostic — start/stop/restart/status identical across games, selected by a per-game factory in main. InstanceState moved to the shared supervisor module. DockerComposeSupervisor drives docker-compose up-d / stop / restart against the instance's compose project, with -f/-p/single-service support and a configurable compose binary. New [instance.docker_compose] config block. First cut = lifecycle + cached state; container crash-detection + restart adoption deferred to Phase 3b (reconcilable with a compose ps probe). Trait choice (dyn over enum) per Commander: scales to future planes (kubectl, AMP/podman, SSH) as new struct+impl, no central match. 56 tests green (6 new docker-compose mock-binary tests + 5 refactored process tests), zero warnings. Live verification pending a real Dune stack. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2.3 KiB
2.3 KiB
Corrosion Host Agent
Rust rewrite of the Go companion agent (companion-agent/, retained as the
behavior reference until parity). One agent per machine supervises every game
instance on that host — Rust, Conan Exiles, Soulmask, Dune: Awakening.
- Wire protocol: see PROTOCOL.md (v2, instance-scoped subjects)
- Config: see agent.example.toml
Status — Phase 0
- Multi-instance TOML config + env overrides (
CORROSION_LICENSE_ID,CORROSION_NATS_URL,CORROSION_NATS_TOKEN) - NATS connection (infinite reconnect, capped backoff, 30s ping, offline send-buffering,
tls://support) - Host heartbeat with real telemetry (sysinfo: CPU, memory, disks) — no fabricated values
- Connectivity prober (outbound TCP, periodic + on-demand)
- Host command channel (
ping,probe,sysinfo) - Graceful shutdown (cancellation token, going-offline beacon, NATS flush)
- Phase 1a: process supervision — per-instance start/stop/restart/status over
{instance}.cmdrequest-reply, push state events on{instance}.status, crash detection with exit codes, live state in heartbeats (integration-tested with real processes + live-NATS contract test) - Phase 1b: RCON trait (WebRCON rust / TCP conan+soulmask), SteamCMD, jailed file manager
- [~] Phase 2: Dune Docker adapter — compose lifecycle done (
docker compose up -d/stop/restartvia theSupervisortrait +DockerComposeSupervisor); RabbitMQ admin bus + Postgres admin surface deferred. Container crash-detection + state adoption on agent restart land with Phase 3b. - Phase 3a: SIGNED self-update — minisign-verified download+swap+relaunch (NATS
updatefunc); embedded public key; CI signs releases - Phase 3b: service install (systemd/SCM), PID adoption
Build
cargo build --release # native
cargo build --release --target x86_64-unknown-linux-gnu # linux deploy target
cargo build --release --target x86_64-pc-windows-msvc # windows (cargo-xwin on non-Windows)
Run
corrosion-host-agent --config ./agent.toml # foreground
corrosion-host-agent --config ./agent.toml check # validate config only
corrosion-host-agent version # semver + git hash + build ts