Introduce a Supervisor trait (async-trait) so the agent manages games with different models behind one wire contract.

ProcessSupervisor (spawned process: rust/conan/soulmask) and the new DockerComposeSupervisor (dune) both impl it. Agent.supervisors is now HashMap<String, Arc<dyn Supervisor>> and instancecmd dispatch is game-agnostic: start/stop/restart/status are identical across games, with the supervisor selected by a per-game factory in main. InstanceState moved to the shared supervisor module.

DockerComposeSupervisor drives docker-compose up -d / stop / restart against the instance's compose project, with -f/-p/single-service support and a configurable compose binary. New [instance.docker_compose] config block. First cut = lifecycle + cached state; container crash-detection + restart adoption are deferred to Phase 3b (reconcilable with a compose ps probe).

Trait choice (dyn over enum) per Commander: scales to future planes (kubectl, AMP/podman, SSH) as new struct+impl, no central match.

56 tests green (6 new docker-compose mock-binary tests + 5 refactored process tests), zero warnings. Live verification pending a real Dune stack.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
81 lines
3.0 KiB
Rust
//! The supervision abstraction.
//!
//! A `Supervisor` owns the lifecycle of one game instance. Different games are
//! managed in fundamentally different ways: Rust/Conan/Soulmask are spawned OS
//! processes ([`crate::process::ProcessSupervisor`]); Dune is a docker-compose
//! stack ([`crate::docker_compose::DockerComposeSupervisor`]); future planes
//! (kubectl, AMP/podman, SSH) will be their own impls. The instance command
//! dispatch (`instancecmd::dispatch`) talks only to this trait, so it never
//! learns which management model is behind a given instance.
//!
//! Trait objects (`Arc<dyn Supervisor>`) need object-safe, dynamically
//! dispatchable async methods; native `async fn` in traits is not yet
//! dyn-compatible, so we use `#[async_trait]` (the battle-tested ecosystem
//! standard) to box the returned futures. The cost (one heap alloc per
//! lifecycle call) is irrelevant for start/stop/restart, which happen seconds
//! to minutes apart.

use std::sync::Arc;

use anyhow::Result;
use serde::Serialize;
use tokio::sync::watch;

/// Observable lifecycle state of one instance. Shared vocabulary across every
/// supervisor impl; serialized verbatim into heartbeats and status events
/// (`{"state":"running", ...}`).
#[derive(Debug, Clone, PartialEq, Serialize)]
#[serde(rename_all = "snake_case", tag = "state")]
pub enum InstanceState {
    /// Not lifecycle-managed (a process instance with no executable, etc.).
    Unmanaged,
    Stopped,
    Starting,
    Running,
    Stopping,
    /// Exited/died without a stop request.
    Crashed {
        #[serde(skip_serializing_if = "Option::is_none")]
        exit_code: Option<i32>,
    },
}

impl InstanceState {
    pub fn as_label(&self) -> &'static str {
        match self {
            InstanceState::Unmanaged => "unmanaged",
            InstanceState::Stopped => "stopped",
            InstanceState::Starting => "starting",
            InstanceState::Running => "running",
            InstanceState::Stopping => "stopping",
            InstanceState::Crashed { .. } => "crashed",
        }
    }
}

/// Lifecycle control + state observation for one instance.
///
/// `start`/`stop`/`restart` take `self: Arc<Self>` so an impl can hand a clone
/// to a spawned monitor task; callers hold an `Arc<dyn Supervisor>` and
/// `clone()` before each call. `watch_state` exposes the same channel the
/// status-event publisher drains, so panel push events stay decoupled from the
/// heartbeat cadence.
#[async_trait::async_trait]
pub trait Supervisor: Send + Sync {
    /// The instance slug (a NATS subject segment).
    fn instance_id(&self) -> &str;

    /// Current cached state (cheap; no I/O).
    fn state(&self) -> InstanceState;

    /// Subscribe to state transitions.
    fn watch_state(&self) -> watch::Receiver<InstanceState>;

    /// Seconds since the instance entered `Running` (0 otherwise).
    async fn uptime_seconds(&self) -> u64;

    async fn start(self: Arc<Self>) -> Result<()>;
    async fn stop(self: Arc<Self>) -> Result<()>;
    async fn restart(self: Arc<Self>) -> Result<()>;
}
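The following is a minimal, std-only sketch of the two ideas the doc comments describe: a `self: Arc<Self>` receiver called through `Arc<dyn Supervisor>`, and lifecycle methods that return boxed futures, which is roughly the shape `#[async_trait]` generates. It is an illustration under stated assumptions, not the agent's actual code. The trait here is a cut-down stand-in (string states, `String` errors), and `MockSupervisor`, `demo_supervisors`, `dispatch`, the `"dune-1"` slug, and the tiny `block_on` executor are all hypothetical names introduced for this example. The real dispatch is presumably async and would `.await` instead of blocking.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Boxed, `Send` future: approximately what `#[async_trait]` returns for an `async fn`.
type BoxFuture<T> = Pin<Box<dyn Future<Output = T> + Send>>;

/// Simplified stand-in for the real trait (assumption: string states, `String` errors).
trait Supervisor: Send + Sync {
    fn state(&self) -> &'static str;
    fn start(self: Arc<Self>) -> BoxFuture<Result<(), String>>;
    fn stop(self: Arc<Self>) -> BoxFuture<Result<(), String>>;
}

/// Hypothetical in-memory supervisor; it only flips a cached state, no real I/O.
struct MockSupervisor {
    state: Mutex<&'static str>,
}

impl Supervisor for MockSupervisor {
    fn state(&self) -> &'static str {
        *self.state.lock().unwrap()
    }

    fn start(self: Arc<Self>) -> BoxFuture<Result<(), String>> {
        // The future owns the Arc, so it could also hand a clone to a monitor task.
        Box::pin(async move {
            *self.state.lock().unwrap() = "running";
            Ok::<(), String>(())
        })
    }

    fn stop(self: Arc<Self>) -> BoxFuture<Result<(), String>> {
        Box::pin(async move {
            *self.state.lock().unwrap() = "stopped";
            Ok::<(), String>(())
        })
    }
}

/// Waker that does nothing; sufficient for the busy-polling executor below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny polling executor so the sketch runs without an async runtime.
fn block_on<T>(mut fut: BoxFuture<T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}

/// Builds a one-entry supervisor map keyed by a made-up instance slug.
fn demo_supervisors() -> HashMap<String, Arc<dyn Supervisor>> {
    let mut sups: HashMap<String, Arc<dyn Supervisor>> = HashMap::new();
    let mock = MockSupervisor { state: Mutex::new("stopped") };
    sups.insert("dune-1".to_string(), Arc::new(mock));
    sups
}

/// Game-agnostic dispatch: looks up the trait object and never inspects its concrete type.
fn dispatch(
    sups: &HashMap<String, Arc<dyn Supervisor>>,
    id: &str,
    verb: &str,
) -> Result<&'static str, String> {
    let sup = sups.get(id).ok_or_else(|| format!("unknown instance: {id}"))?;
    match verb {
        // Clone the Arc before each call because the receiver consumes it.
        "start" => block_on(Arc::clone(sup).start())?,
        "stop" => block_on(Arc::clone(sup).stop())?,
        "status" => {}
        other => return Err(format!("unknown verb: {other}")),
    }
    Ok(sup.state())
}

fn main() {
    let sups = demo_supervisors();
    for verb in ["status", "start", "stop"] {
        println!("{verb}: {:?}", dispatch(&sups, "dune-1", verb));
    }
}
```

Adding another management model in this sketch means adding another struct with its own `impl Supervisor` and inserting it into the map; `dispatch` needs no new match arm per game, which is the trade-off the commit message gives for choosing `dyn` over an enum.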