feat(host-agent): Phase 2 — Dune docker-compose adapter via Supervisor trait

Introduce a Supervisor trait (async-trait) so the agent manages games with
different models behind one wire contract. ProcessSupervisor (spawned process:
rust/conan/soulmask) and the new DockerComposeSupervisor (dune) both impl it;
Agent.supervisors is now HashMap<String, Arc<dyn Supervisor>> and instancecmd
dispatch is game-agnostic — start/stop/restart/status identical across games,
selected by a per-game factory in main. InstanceState moved to the shared
supervisor module.
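
A minimal sketch of the dispatch shape described above, not the code in this commit. It is synchronous and std-only, whereas the real trait is async via async-trait. The method signatures, the `InstanceState` variants, the `FakeSupervisor` stand-in, the `dispatch` helper, and keying the map by instance id are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Lifecycle state reported back to the caller (variant names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstanceState {
    Stopped,
    Running,
}

/// Simplified, synchronous stand-in for the Supervisor trait.
pub trait Supervisor: Send + Sync {
    fn start(&self) -> Result<(), String>;
    fn stop(&self) -> Result<(), String>;
    fn status(&self) -> InstanceState;

    /// Default restart: stop, then start. An impl may override this.
    fn restart(&self) -> Result<(), String> {
        self.stop()?;
        self.start()
    }
}

/// Hypothetical in-memory supervisor that only tracks cached state.
pub struct FakeSupervisor {
    state: Mutex<InstanceState>,
}

impl FakeSupervisor {
    pub fn new() -> Self {
        Self { state: Mutex::new(InstanceState::Stopped) }
    }
}

impl Supervisor for FakeSupervisor {
    fn start(&self) -> Result<(), String> {
        *self.state.lock().unwrap() = InstanceState::Running;
        Ok(())
    }
    fn stop(&self) -> Result<(), String> {
        *self.state.lock().unwrap() = InstanceState::Stopped;
        Ok(())
    }
    fn status(&self) -> InstanceState {
        *self.state.lock().unwrap()
    }
}

/// Same shape as the map named in the commit message; the key is assumed
/// to be an instance id.
pub type Supervisors = HashMap<String, Arc<dyn Supervisor>>;

/// Game-agnostic dispatch: no match on the game, only on the func name.
pub fn dispatch(sups: &Supervisors, instance: &str, func: &str) -> Result<InstanceState, String> {
    let sup = sups
        .get(instance)
        .ok_or_else(|| format!("unknown instance: {}", instance))?;
    match func {
        "start" => sup.start()?,
        "stop" => sup.stop()?,
        "restart" => sup.restart()?,
        "status" => {}
        other => return Err(format!("unknown func: {}", other)),
    }
    Ok(sup.status())
}
```

Under these assumptions, supporting a new management model means adding one struct with an `impl Supervisor` and registering it in the map; `dispatch` itself does not change.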

DockerComposeSupervisor drives `docker compose up -d` / `stop` / `restart` against
the instance's compose project, with -f/-p/single-service support and a
configurable compose binary. New [instance.docker_compose] config block.
First cut = lifecycle + cached state; container crash-detection + restart
adoption deferred to Phase 3b (reconcilable with ).
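
As a rough illustration of the -f/-p/single-service support and the configurable compose binary, the sketch below builds an argv for each lifecycle action. The `ComposeConfig` fields, the `Action` enum, and `compose_argv` are hypothetical names and do not reflect the actual `[instance.docker_compose]` keys; only the `-f`, `-p`, `up -d`, `stop`, and `restart` arguments are real Docker Compose CLI usage.

```rust
/// Hypothetical settings; the real config block may use different keys.
pub struct ComposeConfig {
    /// Compose binary as argv, e.g. ["docker", "compose"] or ["docker-compose"].
    pub binary: Vec<String>,
    /// Optional compose file, passed with -f.
    pub file: Option<String>,
    /// Optional project name, passed with -p.
    pub project: Option<String>,
    /// Optional single service to target instead of the whole project.
    pub service: Option<String>,
}

/// Lifecycle actions mapped onto compose subcommands.
pub enum Action {
    Up,
    Stop,
    Restart,
}

/// Build the full command line for one lifecycle action.
pub fn compose_argv(cfg: &ComposeConfig, action: Action) -> Vec<String> {
    let mut argv = cfg.binary.clone();
    // Global options go before the subcommand.
    if let Some(file) = &cfg.file {
        argv.push("-f".to_string());
        argv.push(file.clone());
    }
    if let Some(project) = &cfg.project {
        argv.push("-p".to_string());
        argv.push(project.clone());
    }
    match action {
        Action::Up => {
            argv.push("up".to_string());
            argv.push("-d".to_string());
        }
        Action::Stop => argv.push("stop".to_string()),
        Action::Restart => argv.push("restart".to_string()),
    }
    // A trailing service name restricts the action to that service.
    if let Some(service) = &cfg.service {
        argv.push(service.clone());
    }
    argv
}
```

Keeping argv construction in a pure function like this makes it easy to check against a mock binary without a running Docker daemon.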

Trait choice (dyn over enum) per Commander: scales to future planes (kubectl,
AMP/podman, SSH) as new struct+impl, no central match.

56 tests green (6 new docker-compose mock-binary tests + 5 refactored process
tests), zero warnings. Live verification pending a real Dune stack.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Vantz Stockwell
2026-06-11 21:32:25 -04:00
parent 651a35d4be
commit 8334fbe4c6
17 changed files with 679 additions and 166 deletions


@@ -101,8 +101,16 @@ Payload: `{}`.
 Lifecycle and control for one game instance.
+The same `start`/`stop`/`restart`/`status` funcs work for **every** game: the
+agent picks a `Supervisor` impl per game — a spawned-process supervisor for
+Rust/Conan/Soulmask, a **docker-compose supervisor for Dune** (`docker compose
+up -d` / `stop` / `restart` against the instance's compose project, configured
+via `[instance.docker_compose]`). The wire contract is identical; only the
+management model behind it differs.
 Implemented funcs: `start`, `stop` (graceful with 30s budget, then force
-kill), `restart`, `status` (returns `state` + `uptime_seconds`), and
+kill — process supervisor; Dune maps stop to `docker compose stop`), `restart`,
+`status` (returns `state` + `uptime_seconds`), and
 `rcon` `{ "func": "rcon", "command": "<console command>" }` returns
 `{ "status": "success", "output": <server response> }`. Protocol per game:
 WebRCON (WebSocket JSON) for rust, Source RCON (Valve TCP) for
@@ -118,7 +126,10 @@ streaming progress lines to `corrosion.{license}.{instance}.steam_status`
 and replying on completion.
 Planned funcs: `oxide_install` (rust), plus game-adapter-specific
-commands (Dune: docker lifecycle, RabbitMQ bus commands, Coriolis reset).
+commands (Dune: RabbitMQ admin-bus commands, Coriolis reset, Postgres admin
+surface). Dune **lifecycle** is already covered by the shared
+start/stop/restart funcs above; container crash-detection and state adoption on
+agent restart land with Phase 3b.
 ### `corrosion.{license_id}.{instance_id}.steam_status` (agent → backend, publish) — LIVE