5 Commits

Author SHA1 Message Date
Vantz Stockwell
d13f2cb8b1 feat(host-agent): Phase 2 — Dune docker-compose adapter via Supervisor trait
Some checks failed
CI / backend-types (push) Successful in 9s
CI / frontend-build (push) Successful in 15s
CI / agent-tests (push) Failing after 35s
CI / integration (push) Has been skipped
Build Host Agent (Rust) / build (push) Successful in 1m45s
Introduce a Supervisor trait (async-trait) so the agent manages games with
different models behind one wire contract. ProcessSupervisor (spawned process:
rust/conan/soulmask) and the new DockerComposeSupervisor (dune) both impl it;
Agent.supervisors is now HashMap<String, Arc<dyn Supervisor>> and instancecmd
dispatch is game-agnostic — start/stop/restart/status identical across games,
selected by a per-game factory in main. InstanceState moved to the shared
supervisor module.

DockerComposeSupervisor drives docker compose up -d / stop / restart against
the instance's compose project, with -f/-p/single-service support and a
configurable compose binary. New [instance.docker_compose] config block.
First cut = lifecycle + cached state; container crash-detection + restart
adoption deferred to Phase 3b (reconcilable with a compose ps probe).

Trait choice (dyn over enum) per Commander: scales to future planes (kubectl,
AMP/podman, SSH) as new struct+impl, no central match.

56 tests green (6 new docker-compose mock-binary tests + 5 refactored process
tests), zero warnings. Live verification pending a real Dune stack.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 21:33:00 -04:00
Vantz Stockwell
651a35d4be docs(reference): import Dune: Awakening server-manager references
All checks were successful
CI / backend-types (push) Successful in 10s
CI / frontend-build (push) Successful in 15s
CI / agent-tests (push) Successful in 39s
CI / integration (push) Successful in 22s
Phase 2 references for the host-agent Dune adapter, moved out of volatile /tmp
into docs/reference-repos/ (per Commander). Three upstream projects, .git +
node_modules + compiled binaries stripped (16MB source). Nested AI-instruction
files (.claude/, CLAUDE.md) removed so they don't pollute Corrosion sessions.

- icehunter/    dune-admin (Go+React) — 4 control planes; SETUP_DOCKER.md is the
                closest analog to our agent's Dune docker control plane (compose
                lifecycle, docker logs, RabbitMQ-via-exec, dune Postgres schema)
- adainrivers/  Rust/Tauri desktop — SSH+k8s BattleGroup control, maintenance
                daemon, in-game admin console (Rust idiom reference)
- the4rchangel/ Node web UI replacing battlegroup.bat — matches the Commander's
                Hyper-V self-host path + game-config schema

See docs/reference-repos/README.md for the full index + how we use each.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 21:08:05 -04:00
Vantz Stockwell
0715492ddf chore(panel): fleet-aware shell footer + drop dead vuefinder dep
All checks were successful
CI / backend-types (push) Successful in 9s
CI / frontend-build (push) Successful in 14s
CI / agent-tests (push) Successful in 49s
CI / integration (push) Successful in 22s
COA-B cleanup:
- Sidebar agent-health footer now reads the fleet store (host count / online
  count / per-host status + last heartbeat) instead of the single legacy
  server.connection row, which disagreed with the multi-host fleet. Removed the
  legacy useServerStore dependency from the shell.
- Removed the unused 'vuefinder' dependency (replaced by the native file
  manager): dep + main.ts plugin/CSS registration. Main JS chunk 588kB -> 165kB.

Recon reclassified the 'dead cmd.server v1' item: it is the LIVE license-level
command path (module config applies, plugin install, schedules, legacy
start/stop) served only by the Go agent — a Rust-agent parity gap, not dead
code. Left intact.

Build-green (vue-tsc) + boots clean in-browser (0 console errors).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 21:04:09 -04:00
Vantz Stockwell
4ef5db5b0d feat(panel): drive active game from deployed fleet instances
All checks were successful
CI / backend-types (push) Successful in 8s
CI / frontend-build (push) Successful in 16s
CI / agent-tests (push) Successful in 40s
CI / integration (push) Successful in 23s
The shell skin / sidebar nav / dashboard terminology now follow the games
actually deployed (game_instances.game, agent-reported) instead of a
localStorage-only toggle. syncActiveGameFromFleet() derives: one game ->
auto-skin to it; zero/multiple -> 'all' neutral. A manual GameSwitcher pick
persists and overrides the heuristic. Wired into DashboardLayout via a watch
on the fleet store.

No schema change: a license's games are the distinct games of its instances
(the normalized source of truth) — deliberately not duplicating into a
licenses.game column that would drift (Lesson 20).

Build-green (vue-tsc) + boots clean in-browser (0 console errors, theming
initializes). Authenticated auto-derive confirms live on next instance deploy.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 20:51:36 -04:00
Vantz Stockwell
bb71763714 docs: Lesson 28 — base64-encode multi-line CI secrets (minisign signing key)
All checks were successful
CI / backend-types (push) Successful in 9s
CI / frontend-build (push) Successful in 16s
CI / agent-tests (push) Successful in 39s
CI / integration (push) Successful in 21s
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 20:38:56 -04:00
1357 changed files with 239752 additions and 784 deletions

@@ -4,6 +4,35 @@ All notable changes to this project will be documented in this file.
## [Unreleased]
### Added (Host-agent Phase 2 — Dune docker-compose adapter — 2026-06-12)
**`Supervisor` trait abstraction (`corrosion-host-agent`):**
- Introduced `trait Supervisor` (via `async-trait`, the battle-tested ecosystem standard) so the agent can manage games with fundamentally different models behind one wire contract. `ProcessSupervisor` (spawned OS process — Rust/Conan/Soulmask) and the new `DockerComposeSupervisor` (Dune) both implement it; `Agent.supervisors` is now `HashMap<String, Arc<dyn Supervisor>>` and the instance command dispatch (`instancecmd::dispatch`) is fully game-agnostic — `start`/`stop`/`restart`/`status` are identical across games. A per-game factory in `main` selects the impl. `InstanceState` moved to the shared `supervisor` module.
- **Architecture call** (per Commander): chose the `dyn` trait over a zero-dependency enum because the Dune references point at *several* future management planes (kubectl, AMP/podman, SSH) — a trait makes each new plane "new struct + impl," no central match to edit.
**`DockerComposeSupervisor` (Dune: Awakening):**
- Drives `docker compose up -d` / `stop` / `restart` against the instance's compose project (a "battlegroup"), with `-f`/`-p`/single-service support and a configurable compose binary (`docker compose` default, `docker-compose` legacy). New `[instance.docker_compose]` config block (file/project/service/command, all optional). `steam_update` already rejected for Dune (Docker images, no SteamCMD).
- **Scope (first cut):** lifecycle + cached state. Deferred to Phase 3b (with process PID adoption): container crash-detection and state adoption on agent restart (both reconcilable with a `docker compose ps` probe).
- Verified: 6 new docker-compose tests (mock `docker` binary asserting exact invocations + state transitions + failure paths) + the 5 refactored process-supervisor tests; full agent suite 56 tests green, zero warnings. Live verification against a real Dune stack pending the Commander standing one up.
### Changed (Fleet-driven active game + signed-update CI fix — 2026-06-12)
**Frontend — active game follows the deployed fleet:**
- The panel's active game (shell skin + sidebar nav + dashboard terminology) is now **derived from the deployed instances** instead of a localStorage-only toggle. `syncActiveGameFromFleet()` reads the distinct `game` values of the license's instances (`game_instances.game`, reported by the host agent): exactly one game deployed → the shell auto-skins to it; zero or multiple → `all` (neutral house skin). Wired into `DashboardLayout` (the always-mounted admin shell) via a watch on the fleet store.
- A manual GameSwitcher pick still wins — it persists to `cc-active-game` and suppresses auto-derive (operator intent beats the heuristic). Un-overridden panels keep tracking the fleet across sessions.
- **No backend/schema change:** a license's game(s) are the distinct games of its instances — the normalized source of truth. Deliberately did NOT add a `licenses.game` column (would duplicate `game_instances.game` and drift; see Lesson 20).
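
The derivation rule in the bullets above can be sketched as a pure function. This is only an illustration, written in Rust for consistency with the rest of this changeset (the panel is a Vue/TypeScript app, and the real `syncActiveGameFromFleet()` is not shown in this diff). `ActiveGame` and `derive_active_game` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the panel's active-game value.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ActiveGame {
    /// Neutral house skin ("all").
    All,
    /// Skin for one specific game, e.g. "rust" or "dune".
    Game(String),
}

/// Derive the active game from the `game` value of each deployed instance.
/// `manual_pick` models a persisted GameSwitcher choice and always wins.
pub fn derive_active_game(instance_games: &[&str], manual_pick: Option<ActiveGame>) -> ActiveGame {
    if let Some(pick) = manual_pick {
        return pick;
    }
    // Collapse duplicates: three "dune" instances still count as one game.
    let distinct: BTreeSet<&str> = instance_games.iter().copied().collect();
    let mut games = distinct.into_iter();
    match (games.next(), games.next()) {
        // Exactly one distinct game deployed: skin to it.
        (Some(only), None) => ActiveGame::Game(only.to_string()),
        // Zero or several distinct games: stay neutral.
        _ => ActiveGame::All,
    }
}
```

Because the result is computed from the instance list on every call, there is no second copy of the game value to drift out of sync, which matches the reasoning given for not adding a `licenses.game` column.
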
**Frontend — sidebar agent-health footer is now fleet-aware:**
- The shell footer read a single legacy `server.connection` (one `server_connections` row), which disagreed with the multi-host fleet. Repointed it at the fleet store: one host → hostname + status + last-heartbeat; multiple → `{online}/{total} online` + total instance count. Tone aggregates (all online → healthy, some → degraded, none → offline). Dropped the legacy `useServerStore` dependency from the shell entirely.
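
The tone aggregation described above (all online, some online, none online) can be sketched as follows. This is a hedged illustration in Rust rather than the panel's TypeScript; `Tone` and `fleet_summary` are hypothetical names, and treating an empty fleet as offline is an assumption the changelog does not state.

```rust
/// Hypothetical footer tone, mirroring the three cases named in the changelog.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tone {
    Healthy,
    Degraded,
    Offline,
}

/// Aggregate per-host online flags into a tone plus an "{online}/{total} online" label.
pub fn fleet_summary(host_online: &[bool]) -> (Tone, String) {
    let total = host_online.len();
    let online = host_online.iter().filter(|up| **up).count();
    let tone = if total > 0 && online == total {
        Tone::Healthy
    } else if online > 0 {
        Tone::Degraded
    } else {
        // Assumption: an empty fleet is reported the same way as "none online".
        Tone::Offline
    };
    (tone, format!("{online}/{total} online"))
}
```
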
**Frontend — removed dead `vuefinder` dependency:**
- VueFinder was replaced by the native instance-scoped file manager but the plugin (and its CSS) were still globally registered in `main.ts` and shipped in the bundle. Removed the dep + the three `main.ts` lines. Side effect: the main JS chunk dropped **588 kB → 165 kB** (vuefinder bundled an entire unused file-manager UI).
**Recon note (not a change):** `corrosion.{license}.cmd.server` was on the cleanup list as "dead v1" — it is NOT. It remains the live license-level command path for all plugin/module config applies, plugin install, scheduled tasks, and legacy start/stop/restart, served only by the legacy Go agent. The Rust agent does not implement it yet — this is a **parity/migration gap** (Phase 2+), not dead code. Left intact.
**CI — signed host-agent build:**
- Fixed the `Sign artifacts (minisign)` step (`Error while loading the secret key file`): a minisign secret key is two lines and CI secret storage mangles the embedded newline. The job now base64-decodes the secret (single-line, mangling-proof) with auto-detect fallback to a raw key. `MINISIGN_SECRET_KEY` must be stored as `base64 < secret.key | tr -d '\n'`. Verified end-to-end: `agent-v2.0.0-alpha.8` Linux + Windows binaries validate against the agent's embedded public key; tampered byte rejected.
### Added (Host-Agent v2 Consumer + SEO Meta — 2026-06-11)
**Backend (NestJS):**

@@ -451,3 +451,5 @@ Things I discovered about myself building a sister platform across multiple sess
26. **A jail check at the entry point does not jail the recursive walk behind it — and my own "line-by-line" review missed it; the automated security review didn't.** The file manager's `jail()` correctly canonicalized and prefix-checked the top-level path, and I traced every escape vector through it and signed off. But `copy_recursive` then walked the directory tree with `fs::metadata` (which *follows* symlinks). A symlink planted inside the jail pointing at `/etc`, then a `copy` of its parent, would dereference it and pull external content *into* the jail to be read — a jail escape the entry check never sees, because the escape is reintroduced by a descendant during traversal. Fix: `symlink_metadata` (lstat) everywhere you recurse, and refuse/never-follow symlinks across the boundary. The transferable rule: **validate at the boundary AND at every step that re-derives a path** (recursion, `read_dir`, glob, archive extraction). And the humbling part — I was confident after reviewing the jail function; the security-review pass caught the HIGH I'd waved through. Trust adversarial verification over your own once-over on security-critical code, especially path/traversal logic.
27. **Validate infra config BEFORE it reaches a deploy — and know that `docker compose up -d <service>` will recreate other services whose definitions changed.** During the NATS auth cutover I ran `docker compose up -d api` to pick up new env. Because the *nats* service definition had also changed (a new volume mount), compose recreated **corrosion-nats too** — and it failed to start on a config error (`no_auth_user` nested inside `authorization{}` instead of at top level), taking the broker down for ~3 minutes with the backend in offline mode. Two lessons: (a) a broker/proxy/DB config file is code — lint it before it can reach a restart (`nats-server -t -c cfg` to test-parse, `nginx -t`, etc.), don't let the first validation be the production container's startup; (b) `compose up -d <one-service>` is not surgical — it reconciles that service's **dependencies** too, so a stale edit to a depended-on service ships when you didn't mean it to. When touching shared-infra config, restart that service explicitly and watch it come up before moving on. Recovery also surfaced a third gotcha: recreating a client (api) while its server (nats) is down leaves the client stuck on a cached DNS failure (`EAI_AGAIN`) — restart the client once the server is healthy.
28. **A multi-line secret in CI (minisign/SSH/PGP keys) must be stored base64-encoded — the runner mangles embedded newlines and the key silently fails to load.** The signed-update CI passed the toolchain build, downloaded minisign fine, then died at the sign step on `Error while loading the secret key file` (exit 2). The cause wasn't the key or minisign — a minisign secret key file is **two lines** (`untrusted comment:` + base64 blob), and Gitea/act_runner secret storage collapses the embedded newline so the reconstructed file is one unparseable line. The robust pattern: store the secret as `base64 < secret.key | tr -d '\n'` (single line, mangling-proof) and `base64 -d` it in the job, with auto-detect fallback so a correctly-stored raw key still works, and a loud `::error::` carrying the fix command if it's neither. This applies to **any** multi-line credential in CI, not just minisign. Two corollaries: (a) the tell is "the tool runs but can't load its key" — suspect newline-mangling before the key itself; (b) generating that base64 prints the **private key to the terminal/transcript** — for a supply-chain signing key, treat it as exposed and rotate before cutover (embed the new pubkey, re-store the new secret, retire the old). And verify the published artifact end-to-end against the *embedded* pubkey (`minisign -Vm bin -P <pub>`) plus a tampered-byte negative control — a green build that signs is not the same as a signature the agent will actually accept.
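
The auto-detect pattern from Lesson 28 can be sketched as below. The actual CI step is a shell script that is not part of this diff, so this Rust version is only an illustration of the logic: accept a correctly stored raw key, otherwise try base64, otherwise fail loudly with the fix command. `load_secret_key`, `is_raw_key` and `base64_decode` are hypothetical helpers, and the decoder is a minimal standard-alphabet implementation written for this example.

```rust
/// True when `text` looks like a two-line minisign key file:
/// an "untrusted comment:" header followed by a non-empty second line.
fn is_raw_key(text: &str) -> bool {
    let mut lines = text.lines();
    matches!(lines.next(), Some(first) if first.starts_with("untrusted comment:"))
        && lines.next().map_or(false, |l| !l.trim().is_empty())
}

/// Minimal standard-alphabet base64 decoder (padding optional, whitespace skipped).
/// Returns None on any byte outside the alphabet.
fn base64_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf: u32 = 0;
    let mut bits: u32 = 0;
    for b in input.bytes() {
        let v = match b {
            b'A'..=b'Z' => b - b'A',
            b'a'..=b'z' => b - b'a' + 26,
            b'0'..=b'9' => b - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            b'=' => break,
            b' ' | b'\n' | b'\r' | b'\t' => continue,
            _ => return None,
        };
        buf = (buf << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    Some(out)
}

/// Return the key file contents from a CI secret stored either raw or base64-encoded.
pub fn load_secret_key(secret: &str) -> Result<String, String> {
    if is_raw_key(secret) {
        return Ok(secret.to_string());
    }
    let decoded = base64_decode(secret.trim()).and_then(|bytes| String::from_utf8(bytes).ok());
    match decoded {
        Some(text) if is_raw_key(&text) => Ok(text),
        _ => Err(
            "secret is neither a raw two-line key nor base64 of one; \
             store it as: base64 < secret.key | tr -d '\\n'"
                .to_string(),
        ),
    }
}
```

A secret whose newline was collapsed fails both checks, so it lands in the error branch instead of reaching the signing tool as an unparseable one-line file.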

@@ -110,6 +110,17 @@ dependencies = [
"url",
]
[[package]]
name = "async-trait"
version = "0.1.89"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9035ad2d096bed7955a320ee7e2230574d28fd3c3a0f186cbea1ff3c7eed5dbb"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "atomic-waker"
version = "1.1.2"
@@ -276,10 +287,11 @@ checksum = "773648b94d0e5d620f64f280777445740e61fe701025087ec8b57f45c791888b"
[[package]]
name = "corrosion-host-agent"
version = "2.0.0-alpha.8"
version = "2.0.0-alpha.9"
dependencies = [
"anyhow",
"async-nats",
"async-trait",
"chrono",
"clap",
"futures",

@@ -1,6 +1,6 @@
[package]
name = "corrosion-host-agent"
version = "2.0.0-alpha.8"
version = "2.0.0-alpha.9"
edition = "2021"
description = "Corrosion Host Agent — multi-game ops runtime for self-hosted game servers"
license = "UNLICENSED"
@@ -23,6 +23,7 @@ chrono = { version = "0.4", features = ["serde", "clock"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "fmt"] }
anyhow = "1"
async-trait = "0.1"
clap = { version = "4.5", features = ["derive"] }
rand = "0.8"
tokio-tungstenite = "0.24"

@@ -101,8 +101,16 @@ Payload: `{}`.
Lifecycle and control for one game instance.
The same `start`/`stop`/`restart`/`status` funcs work for **every** game: the
agent picks a `Supervisor` impl per game — a spawned-process supervisor for
Rust/Conan/Soulmask, a **docker-compose supervisor for Dune** (`docker compose
up -d` / `stop` / `restart` against the instance's compose project, configured
via `[instance.docker_compose]`). The wire contract is identical; only the
management model behind it differs.
Implemented funcs: `start`, `stop` (graceful with 30s budget, then force
kill), `restart`, `status` (returns `state` + `uptime_seconds`), and
kill — process supervisor; Dune maps stop to `docker compose stop`), `restart`,
`status` (returns `state` + `uptime_seconds`), and
`rcon`: `{ "func": "rcon", "command": "<console command>" }` returns
`{ "status": "success", "output": <server response> }`. Protocol per game:
WebRCON (WebSocket JSON) for rust, Source RCON (Valve TCP) for
@@ -118,7 +126,10 @@ streaming progress lines to `corrosion.{license}.{instance}.steam_status`
and replying on completion.
Planned funcs: `oxide_install` (rust), plus game-adapter-specific
commands (Dune: docker lifecycle, RabbitMQ bus commands, Coriolis reset).
commands (Dune: RabbitMQ admin-bus commands, Coriolis reset, Postgres admin
surface). Dune **lifecycle** is already covered by the shared
start/stop/restart funcs above; container crash-detection and state adoption on
agent restart land with Phase 3b.
### `corrosion.{license_id}.{instance_id}.steam_status` (agent → backend, publish) — LIVE

@@ -20,7 +20,9 @@ instance on that host — Rust, Conan Exiles, Soulmask, Dune: Awakening.
crash detection with exit codes, live state in heartbeats
(integration-tested with real processes + live-NATS contract test)
- [ ] Phase 1b: RCON trait (WebRCON rust / TCP conan+soulmask), SteamCMD, jailed file manager
- [ ] Phase 2: Dune Docker adapter (compose lifecycle, RabbitMQ bus, Postgres admin)
- [~] Phase 2: Dune Docker adapter: **compose lifecycle done** (`docker compose up -d/stop/restart`
via the `Supervisor` trait + `DockerComposeSupervisor`); RabbitMQ admin bus + Postgres admin
surface deferred. Container crash-detection + state adoption on agent restart land with Phase 3b.
- [x] Phase 3a: SIGNED self-update — minisign-verified download+swap+relaunch (NATS `update` func); embedded public key; CI signs releases
- [ ] Phase 3b: service install (systemd/SCM), PID adoption

@@ -60,6 +60,24 @@ password = "changeme"
# Dune instances do not use SteamCMD (Docker images); the steam_update func
# will return a clear error if invoked on a dune instance.
# --- Dune: Awakening (container-managed) ---------------------------------
# Dune runs as a docker-compose stack, not a spawned process — leave
# `executable` unset and add an [instance.docker_compose] block. The agent
# drives `docker compose up -d / stop / restart` for start/stop/restart, and
# `steam_update` is rejected (Dune ships as Docker images).
#
# [[instance]]
# id = "dune-main"
# game = "dune"
# root = "/opt/dune" # directory the compose commands run in
# label = "Arrakis (battlegroup)"
#
# [instance.docker_compose]
# file = "docker-compose.yml" # -f; relative to root. Omit to use compose's discovery
# project = "dune-main" # -p; defaults to the instance id
# service = "gameserver" # limit lifecycle to one service; omit for the whole stack
# command = ["docker", "compose"] # default; use ["docker-compose"] for the legacy binary
[prober]
interval_seconds = 300
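
How the commented `[instance.docker_compose]` keys above turn into a compose invocation can be sketched as a pure argv builder. This follows the ordering the adapter documents (global `-f`/`-p` flags before the subcommand, optional service last), but `ComposeSettings` and `compose_argv` are hypothetical names for illustration, not the agent's API.

```rust
/// Hypothetical, trimmed-down mirror of the `[instance.docker_compose]` keys.
#[derive(Debug, Default, Clone)]
pub struct ComposeSettings {
    pub file: Option<String>,
    pub project: Option<String>,
    pub service: Option<String>,
    pub command: Option<Vec<String>>,
}

/// Build the full argv for one compose action such as `up -d` or `stop`.
pub fn compose_argv(
    instance_id: &str,
    cfg: &ComposeSettings,
    action: &str,
    action_args: &[&str],
) -> Vec<String> {
    // Binary + leading args; an empty override falls back to the default.
    let mut argv: Vec<String> = match &cfg.command {
        Some(cmd) if !cmd.is_empty() => cmd.clone(),
        _ => vec!["docker".to_string(), "compose".to_string()],
    };
    // Global flags come before the subcommand.
    if let Some(file) = &cfg.file {
        argv.push("-f".to_string());
        argv.push(file.clone());
    }
    argv.push("-p".to_string());
    // The project name defaults to the instance id.
    argv.push(cfg.project.clone().unwrap_or_else(|| instance_id.to_string()));
    argv.push(action.to_string());
    argv.extend(action_args.iter().map(|a| a.to_string()));
    // An optional single service limits the action to that service.
    if let Some(service) = &cfg.service {
        argv.push(service.clone());
    }
    argv
}
```

With every key omitted, `compose_argv("dune-main", &ComposeSettings::default(), "up", &["-d"])` yields `docker compose -p dune-main up -d`.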

@@ -7,16 +7,17 @@ use tokio::sync::RwLock;
use tokio_util::sync::CancellationToken;
use crate::config::Settings;
use crate::process::ProcessSupervisor;
use crate::prober::ProbeReport;
use crate::supervisor::Supervisor;
pub struct Agent {
pub cfg: Settings,
pub nats: async_nats::Client,
pub started: Instant,
pub last_probe: RwLock<Option<ProbeReport>>,
/// One supervisor per instance (unmanaged instances included — they
/// report `unmanaged` state and reject process commands).
pub supervisors: HashMap<String, Arc<ProcessSupervisor>>,
/// One supervisor per instance, keyed by instance id. The concrete impl
/// (process vs docker-compose) is chosen per game by the factory in main;
/// every subsystem talks to the `Supervisor` trait only.
pub supervisors: HashMap<String, Arc<dyn Supervisor>>,
pub shutdown: CancellationToken,
}
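
The `supervisors` map above, filled by a per-game factory, can be sketched like this. It is a simplified, synchronous stand-in: the real trait is async (via `async-trait`) and the real factory in `main` is not shown in this diff, so the trait shape, `describe_start`, `supervisor_for` and `build_supervisors` are all assumptions for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified, synchronous stand-in for the agent's `Supervisor` trait.
pub trait Supervisor: Send + Sync {
    fn instance_id(&self) -> &str;
    /// Human-readable description of what `start` would do for this impl.
    fn describe_start(&self) -> String;
}

pub struct ProcessSupervisor {
    id: String,
}

pub struct DockerComposeSupervisor {
    id: String,
}

impl Supervisor for ProcessSupervisor {
    fn instance_id(&self) -> &str {
        &self.id
    }
    fn describe_start(&self) -> String {
        format!("spawn process for {}", self.id)
    }
}

impl Supervisor for DockerComposeSupervisor {
    fn instance_id(&self) -> &str {
        &self.id
    }
    fn describe_start(&self) -> String {
        format!("docker compose -p {} up -d", self.id)
    }
}

/// Hypothetical per-game factory: the only place that knows the concrete types.
pub fn supervisor_for(game: &str, id: &str) -> Arc<dyn Supervisor> {
    match game {
        "dune" => Arc::new(DockerComposeSupervisor { id: id.to_string() }),
        _ => Arc::new(ProcessSupervisor { id: id.to_string() }),
    }
}

/// Build the instance-id keyed map from `(instance_id, game)` pairs.
pub fn build_supervisors(instances: &[(&str, &str)]) -> HashMap<String, Arc<dyn Supervisor>> {
    instances
        .iter()
        .map(|(id, game)| (id.to_string(), supervisor_for(game, id)))
        .collect()
}
```

Adding another management plane under this shape means one new struct, one `impl Supervisor`, and one new arm in the factory; callers that only see `Arc<dyn Supervisor>` do not change.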

@@ -10,6 +10,7 @@ use serde::Deserialize;
use std::collections::HashSet;
use std::path::{Path, PathBuf};
use crate::docker_compose::DockerComposeConfig;
use crate::rcon::RconConfig;
use crate::steamcmd::SteamcmdConfig;
@@ -76,6 +77,10 @@ pub struct InstanceConfig {
/// validate = false).
#[serde(default)]
pub steamcmd: Option<SteamcmdConfig>,
/// Docker-compose settings for container-managed games (Dune). Absent =
/// defaults apply (compose file in the instance root, project = instance id).
#[serde(default)]
pub docker_compose: Option<DockerComposeConfig>,
}
impl InstanceConfig {

@@ -0,0 +1,216 @@
//! Docker-compose instance supervision — the Dune: Awakening adapter.
//!
//! Dune does not ship as a SteamCMD-updated process like Rust/Conan/Soulmask;
//! it runs as Docker container(s) (game server + RabbitMQ broker + Postgres),
//! orchestrated as a compose stack (a "battlegroup"). So Dune lifecycle is
//! `docker compose up -d / stop / restart` against the instance's compose
//! project, not a spawned OS process. This supervisor implements the same
//! [`Supervisor`] trait `ProcessSupervisor` does, so the instance command
//! dispatch is identical — only the management model differs.
//!
//! Scope (first cut): lifecycle + cached state. Two parity items are deferred
//! to Phase 3b alongside process PID adoption: (1) crash detection (containers
//! give us no child handle — a `docker compose ps` poll loop would supply it);
//! (2) state adoption on agent restart (a running stack reports `stopped` until
//! the next lifecycle command). Both are reconcilable with a `ps` probe.
//!
//! Reference: docs/reference-repos/icehunter SETUP_DOCKER.md (the docker
//! control plane this mirrors).
use std::path::PathBuf;
use std::process::Stdio;
use std::sync::Arc;
use std::time::Instant;
use anyhow::{bail, Context, Result};
use serde::Deserialize;
use tokio::process::Command;
use tokio::sync::{watch, Mutex};
use crate::config::InstanceConfig;
use crate::supervisor::{InstanceState, Supervisor};
/// Per-instance docker-compose settings (`[instance.docker_compose]`). All
/// fields optional — defaults cover the common "one compose file in the
/// instance root" case.
#[derive(Debug, Clone, Default, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct DockerComposeConfig {
/// Compose file (`-f`). Relative paths resolve against the run dir. Default:
/// compose's own discovery (docker-compose.yml in the run dir).
#[serde(default)]
pub file: Option<PathBuf>,
/// Compose project name (`-p`). Default: the instance id.
#[serde(default)]
pub project: Option<String>,
/// Limit lifecycle ops to one service. Default: every service in the file.
#[serde(default)]
pub service: Option<String>,
/// Override the compose binary invocation. Default: `["docker","compose"]`.
/// Use `["docker-compose"]` for the legacy standalone binary.
#[serde(default)]
pub command: Option<Vec<String>>,
}
struct Inner {
started_at: Option<Instant>,
}
pub struct DockerComposeSupervisor {
instance_id: String,
/// Directory the compose commands run in (relative `-f`/file paths resolve
/// against it).
run_dir: PathBuf,
compose_file: Option<PathBuf>,
project: String,
service: Option<String>,
/// Compose binary + leading args, e.g. `["docker","compose"]`.
command: Vec<String>,
inner: Mutex<Inner>,
state_tx: watch::Sender<InstanceState>,
}
impl DockerComposeSupervisor {
pub fn new(cfg: &InstanceConfig) -> Arc<Self> {
let dc = cfg.docker_compose.clone().unwrap_or_default();
let run_dir = cfg
.working_dir
.clone()
.unwrap_or_else(|| cfg.root.clone());
let command = dc
.command
.filter(|c| !c.is_empty())
.unwrap_or_else(|| vec!["docker".to_string(), "compose".to_string()]);
let (state_tx, _) = watch::channel(InstanceState::Stopped);
Arc::new(Self {
instance_id: cfg.id.clone(),
run_dir,
compose_file: dc.file,
project: dc.project.unwrap_or_else(|| cfg.id.clone()),
service: dc.service,
command,
inner: Mutex::new(Inner { started_at: None }),
state_tx,
})
}
fn set_state(&self, state: InstanceState) {
let _ = self.state_tx.send_replace(state);
}
/// Run one compose subcommand (`up`/`stop`/`restart`/...), bailing with the
/// captured stderr on non-zero exit. Global flags (`-f`, `-p`) precede the
/// subcommand; the optional single service is appended last.
async fn run(&self, action: &str, action_args: &[&str]) -> Result<()> {
let mut cmd = Command::new(&self.command[0]);
cmd.args(&self.command[1..]);
if let Some(file) = &self.compose_file {
cmd.arg("-f").arg(file);
}
cmd.arg("-p").arg(&self.project);
cmd.arg(action);
cmd.args(action_args);
if let Some(service) = &self.service {
cmd.arg(service);
}
cmd.current_dir(&self.run_dir)
.stdin(Stdio::null())
.stdout(Stdio::piped())
.stderr(Stdio::piped());
let output = cmd
.output()
.await
.with_context(|| format!("running `{} {action}` (is docker installed and on PATH?)", self.command.join(" ")))?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
let stdout = String::from_utf8_lossy(&output.stdout);
let detail = if !stderr.trim().is_empty() {
stderr.trim()
} else {
stdout.trim()
};
bail!("compose {action} failed ({}): {detail}", output.status);
}
Ok(())
}
}
#[async_trait::async_trait]
impl Supervisor for DockerComposeSupervisor {
fn instance_id(&self) -> &str {
&self.instance_id
}
fn state(&self) -> InstanceState {
self.state_tx.borrow().clone()
}
fn watch_state(&self) -> watch::Receiver<InstanceState> {
self.state_tx.subscribe()
}
async fn uptime_seconds(&self) -> u64 {
let inner = self.inner.lock().await;
match (&*self.state_tx.borrow(), inner.started_at) {
(InstanceState::Running, Some(t)) => t.elapsed().as_secs(),
_ => 0,
}
}
async fn start(self: Arc<Self>) -> Result<()> {
if matches!(
*self.state_tx.borrow(),
InstanceState::Running | InstanceState::Starting
) {
bail!("instance '{}' is already running", self.instance_id);
}
self.set_state(InstanceState::Starting);
match self.run("up", &["-d"]).await {
Ok(()) => {
self.inner.lock().await.started_at = Some(Instant::now());
self.set_state(InstanceState::Running);
tracing::info!("instance '{}' compose up -d", self.instance_id);
Ok(())
}
Err(e) => {
self.set_state(InstanceState::Stopped);
Err(e)
}
}
}
async fn stop(self: Arc<Self>) -> Result<()> {
self.set_state(InstanceState::Stopping);
match self.run("stop", &[]).await {
Ok(()) => {
self.inner.lock().await.started_at = None;
self.set_state(InstanceState::Stopped);
tracing::info!("instance '{}' compose stop", self.instance_id);
Ok(())
}
Err(e) => {
// Stop failed — the stack is most likely still up.
self.set_state(InstanceState::Running);
Err(e)
}
}
}
async fn restart(self: Arc<Self>) -> Result<()> {
self.set_state(InstanceState::Starting);
match self.run("restart", &[]).await {
Ok(()) => {
self.inner.lock().await.started_at = Some(Instant::now());
self.set_state(InstanceState::Running);
tracing::info!("instance '{}' compose restart", self.instance_id);
Ok(())
}
Err(e) => {
self.set_state(InstanceState::Stopped);
Err(e)
}
}
}
}
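
The cached-state behaviour of `start`/`stop`/`restart` above can be summarised as a small transition table. The sketch below restates it as a pure function for readability; it models only the four states that appear in this file (the real `InstanceState` may have more variants), and `Action` and `settle` are hypothetical names.

```rust
/// The four states used by the compose supervisor in this diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstanceState {
    Stopped,
    Starting,
    Running,
    Stopping,
}

#[derive(Debug, Clone, Copy)]
pub enum Action {
    Start,
    Stop,
    Restart,
}

/// Return the two states published for one lifecycle command:
/// the transient state set before compose runs, then the settled state.
/// `compose_ok` is whether the compose subcommand exited successfully.
pub fn settle(
    current: InstanceState,
    action: Action,
    compose_ok: bool,
) -> Result<[InstanceState; 2], String> {
    use InstanceState::*;
    let (transient, on_ok, on_err) = match action {
        Action::Start => {
            // Only `start` is guarded against a stack that is already up.
            if matches!(current, Running | Starting) {
                return Err("instance is already running".to_string());
            }
            (Starting, Running, Stopped)
        }
        // A failed stop most likely leaves the stack up, so fall back to Running.
        Action::Stop => (Stopping, Stopped, Running),
        Action::Restart => (Starting, Running, Stopped),
    };
    Ok([transient, if compose_ok { on_ok } else { on_err }])
}
```

Since the state is cached rather than probed, these fallbacks are best guesses; the module comment defers reconciling them against the real container state to the Phase 3b `ps` probe.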

@@ -13,9 +13,9 @@ use serde_json::json;
use std::sync::Arc;
use crate::agent::Agent;
use crate::process::ProcessSupervisor;
use crate::subjects;
use crate::steamcmd;
use crate::supervisor::Supervisor;
#[derive(Debug, Deserialize)]
struct InstanceCommand {
@@ -26,8 +26,8 @@ struct InstanceCommand {
}
/// Forward every supervisor state change as a status event.
pub async fn publish_state_changes(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>) {
let subject = subjects::instance_status(&agent.cfg.license_id, &sup.instance_id);
pub async fn publish_state_changes(agent: Arc<Agent>, sup: Arc<dyn Supervisor>) {
let subject = subjects::instance_status(&agent.cfg.license_id, sup.instance_id());
let mut rx = sup.watch_state();
let cancel = agent.shutdown.clone();
@@ -40,13 +40,13 @@ pub async fn publish_state_changes(agent: Arc<Agent>, sup: Arc<ProcessSupervisor
let state = rx.borrow().clone();
let event = json!({
"timestamp": Utc::now().to_rfc3339_opts(SecondsFormat::Secs, true),
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
"event": state,
});
match serde_json::to_vec(&event) {
Ok(bytes) => {
if let Err(e) = agent.nats.publish(subject.clone(), bytes.into()).await {
tracing::warn!("status publish failed for '{}': {e}", sup.instance_id);
tracing::warn!("status publish failed for '{}': {e}", sup.instance_id());
}
}
Err(e) => tracing::error!("status serialize failed: {e}"),
@@ -58,8 +58,8 @@ pub async fn publish_state_changes(agent: Arc<Agent>, sup: Arc<ProcessSupervisor
}
/// Request-reply command handler for one instance.
pub async fn run(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>) -> anyhow::Result<()> {
let subject = subjects::instance_cmd(&agent.cfg.license_id, &sup.instance_id);
pub async fn run(agent: Arc<Agent>, sup: Arc<dyn Supervisor>) -> anyhow::Result<()> {
let subject = subjects::instance_cmd(&agent.cfg.license_id, sup.instance_id());
let mut sub = agent.nats.subscribe(subject.clone()).await?;
tracing::info!("instance command handler listening on {subject}");
@@ -74,13 +74,13 @@ pub async fn run(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>) -> anyhow::Resu
tokio::spawn(async move { handle(agent, sup, msg).await });
}
None => {
tracing::warn!("instance command subscription ended for '{}'", sup.instance_id);
tracing::warn!("instance command subscription ended for '{}'", sup.instance_id());
break;
}
}
}
_ = cancel.cancelled() => {
tracing::info!("instance command handler stopping for '{}'", sup.instance_id);
tracing::info!("instance command handler stopping for '{}'", sup.instance_id());
break;
}
}
@@ -88,7 +88,7 @@ pub async fn run(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>) -> anyhow::Resu
Ok(())
}
async fn handle(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>, msg: async_nats::Message) {
async fn handle(agent: Arc<Agent>, sup: Arc<dyn Supervisor>, msg: async_nats::Message) {
let Some(reply) = msg.reply.clone() else {
tracing::warn!("instance command without reply subject ignored");
return;
@@ -113,20 +113,22 @@ async fn handle(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>, msg: async_nats:
async fn dispatch(
agent: &Arc<Agent>,
sup: &Arc<ProcessSupervisor>,
sup: &Arc<dyn Supervisor>,
cmd: &InstanceCommand,
) -> serde_json::Value {
let func = cmd.func.as_str();
// start/stop/restart take `self: Arc<Self>` (they may hand a clone to a
// monitor task), so clone the Arc before the consuming call.
let outcome = match func {
"start" => sup.start().await.map(|_| "starting"),
"stop" => sup.stop().await.map(|_| "stopped"),
"restart" => sup.restart().await.map(|_| "restarted"),
"start" => sup.clone().start().await.map(|_| "starting"),
"stop" => sup.clone().stop().await.map(|_| "stopped"),
"restart" => sup.clone().restart().await.map(|_| "restarted"),
"status" => {
return json!({
"status": "success",
"func": "status",
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
"state": sup.state(),
"uptime_seconds": sup.uptime_seconds().await,
});
@@ -139,15 +141,15 @@ async fn dispatch(
.cfg
.instances
.iter()
.find(|i| i.id == sup.instance_id);
.find(|i| i.id == sup.instance_id());
let rcon_cfg = inst_cfg.and_then(|i| i.rcon.as_ref());
let Some(rcon_cfg) = rcon_cfg else {
return json!({
"status": "error",
"func": "rcon",
"instance_id": sup.instance_id,
"message": format!("instance '{}' has no rcon configured", sup.instance_id),
"instance_id": sup.instance_id(),
"message": format!("instance '{}' has no rcon configured", sup.instance_id()),
});
};
@@ -155,7 +157,7 @@ async fn dispatch(
return json!({
"status": "error",
"func": "rcon",
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
"message": "rcon func requires a 'command' field",
});
};
@@ -165,13 +167,13 @@ async fn dispatch(
Ok(output) => json!({
"status": "success",
"func": "rcon",
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
"output": output,
}),
Err(e) => json!({
"status": "error",
"func": "rcon",
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
"message": format!("{e:#}"),
}),
};
@@ -181,14 +183,14 @@ async fn dispatch(
// settings. The supervisor only carries process-control state, not
// the full config, so we reach into agent.cfg.instances here as the
// rcon dispatch does.
let inst_cfg = agent.cfg.instances.iter().find(|i| i.id == sup.instance_id);
let inst_cfg = agent.cfg.instances.iter().find(|i| i.id == sup.instance_id());
let Some(inst_cfg) = inst_cfg else {
return json!({
"status": "error",
"func": "steam_update",
"instance_id": sup.instance_id,
"message": format!("no config found for instance '{}'", sup.instance_id),
"instance_id": sup.instance_id(),
"message": format!("no config found for instance '{}'", sup.instance_id()),
});
};
@@ -209,7 +211,7 @@ async fn dispatch(
};
let license = agent.cfg.license_id.clone();
let instance_id = sup.instance_id.clone();
let instance_id = sup.instance_id().to_string();
let nats = agent.nats.clone();
// Publish each progress line to the steam_status subject.
@@ -240,12 +242,12 @@ async fn dispatch(
Ok(()) => json!({
"status": "success",
"func": "steam_update",
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
}),
Err(e) => json!({
"status": "error",
"func": "steam_update",
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
"message": format!("{e:#}"),
}),
};
@@ -262,14 +264,14 @@ async fn dispatch(
Ok(result) => json!({
"status": "success",
"func": func,
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
"result": result,
"state": sup.state(),
}),
Err(e) => json!({
"status": "error",
"func": func,
"instance_id": sup.instance_id,
"instance_id": sup.instance_id(),
"message": format!("{e:#}"),
}),
}
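
The clone-before-call pattern in `dispatch` can be illustrated in isolation. The following is a minimal sketch, not the agent's code: the trait is a simplified, synchronous stand-in (no `async-trait`, no tokio), and `FakeSupervisor` and the string states are hypothetical names introduced only for this example. It shows why a method taking `self: Arc<Self>` consumes the handle, so a caller that holds `&Arc<dyn Supervisor>` clones it first.

```rust
use std::sync::{Arc, Mutex};

// Simplified, synchronous stand-in for the agent's `Supervisor` trait
// (assumption: the real trait is async and has more methods).
trait Supervisor: Send + Sync {
    fn instance_id(&self) -> &str;
    fn state(&self) -> String;
    // `self: Arc<Self>` consumes one Arc handle per call.
    fn start(self: Arc<Self>) -> Result<(), String>;
    fn stop(self: Arc<Self>) -> Result<(), String>;
}

// Hypothetical in-memory supervisor used only for illustration.
struct FakeSupervisor {
    id: String,
    state: Mutex<String>,
}

impl Supervisor for FakeSupervisor {
    fn instance_id(&self) -> &str {
        &self.id
    }

    fn state(&self) -> String {
        self.state.lock().unwrap().clone()
    }

    fn start(self: Arc<Self>) -> Result<(), String> {
        let mut state = self.state.lock().unwrap();
        if *state == "running" {
            return Err(format!("instance '{}' is already running", self.id));
        }
        *state = "running".to_string();
        Ok(())
    }

    fn stop(self: Arc<Self>) -> Result<(), String> {
        let mut state = self.state.lock().unwrap();
        if *state != "running" {
            return Err(format!("instance '{}' is not running", self.id));
        }
        *state = "stopped".to_string();
        Ok(())
    }
}

// The caller only borrows the Arc, so it clones before each consuming call.
fn dispatch(sup: &Arc<dyn Supervisor>, func: &str) -> Result<&'static str, String> {
    match func {
        "start" => sup.clone().start().map(|_| "starting"),
        "stop" => sup.clone().stop().map(|_| "stopped"),
        other => Err(format!(
            "unknown func '{}' for instance '{}'",
            other,
            sup.instance_id()
        )),
    }
}
```

Cloning an `Arc` only bumps a reference count, so the extra `clone()` per lifecycle call is cheap next to the work of starting or stopping a server.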


@@ -4,6 +4,7 @@
pub mod agent;
pub mod bus;
pub mod config;
pub mod docker_compose;
pub mod filemanager;
pub mod hostcmd;
pub mod instancecmd;
@@ -12,6 +13,7 @@ pub mod process;
pub mod rcon;
pub mod steamcmd;
pub mod subjects;
pub mod supervisor;
pub mod telemetry;
pub mod update;
pub mod version;


@@ -5,8 +5,8 @@
//! game adapters arrive in Phase 1+ (see PROTOCOL.md).
use corrosion_host_agent::{
agent, bus, config, filemanager, hostcmd, instancecmd, prober, process, subjects, telemetry,
version,
agent, bus, config, docker_compose, filemanager, hostcmd, instancecmd, prober, process,
subjects, supervisor, telemetry, version,
};
use anyhow::{Context, Result};
@@ -92,10 +92,20 @@ async fn run(settings: config::Settings) -> Result<()> {
let nats = bus::connect(&settings).await?;
let supervisors = settings
// Per-game supervisor factory: container-managed games (Dune) get a
// docker-compose supervisor; everything else is a spawned-process
// supervisor. Both satisfy the `Supervisor` trait, so the rest of the agent
// is game-agnostic.
let supervisors: std::collections::HashMap<String, Arc<dyn supervisor::Supervisor>> = settings
.instances
.iter()
.map(|inst| (inst.id.clone(), process::ProcessSupervisor::new(inst)))
.map(|inst| {
let sup: Arc<dyn supervisor::Supervisor> = match inst.game.as_str() {
"dune" => docker_compose::DockerComposeSupervisor::new(inst),
_ => process::ProcessSupervisor::new(inst),
};
(inst.id.clone(), sup)
})
.collect();
let agent = Arc::new(Agent {
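
The per-game factory above relies on each `match` arm being coerced to the same trait-object type. A self-contained sketch of that idea follows. The trait, the two unit structs, the `kind` method and the two-field `InstanceConfig` are simplified placeholders invented for this example, not the agent's real definitions.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Placeholder trait: the real `Supervisor` has lifecycle methods instead.
trait Supervisor: Send + Sync {
    fn kind(&self) -> &'static str;
}

// Placeholder supervisors with no state, for illustration only.
struct ProcessSupervisor;
struct DockerComposeSupervisor;

impl Supervisor for ProcessSupervisor {
    fn kind(&self) -> &'static str {
        "process"
    }
}

impl Supervisor for DockerComposeSupervisor {
    fn kind(&self) -> &'static str {
        "docker-compose"
    }
}

// Hypothetical two-field config; the real one carries many more settings.
struct InstanceConfig {
    id: String,
    game: String,
}

// Pick a supervisor per instance; the annotated binding lets both arms
// coerce from their concrete `Arc<T>` to `Arc<dyn Supervisor>`.
fn build_supervisors(instances: &[InstanceConfig]) -> HashMap<String, Arc<dyn Supervisor>> {
    instances
        .iter()
        .map(|inst| {
            let sup: Arc<dyn Supervisor> = match inst.game.as_str() {
                "dune" => Arc::new(DockerComposeSupervisor),
                _ => Arc::new(ProcessSupervisor),
            };
            (inst.id.clone(), sup)
        })
        .collect()
}
```

Adding another management model under this design means one more struct, one more `impl`, and one more arm in the factory; the code that consumes the map does not change.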


@@ -1,14 +1,16 @@
//! Per-instance game-server process supervision.
//!
//! One `ProcessSupervisor` per process-managed instance. Lifecycle mirrors the
//! proven Go agent behavior — graceful SIGTERM with a 30s budget before force
//! kill, a monitor task that reaps the child and records crash-vs-stop — with
//! two fixes the Go version needed: args are a proper list (no naive space
//! splitting), and every state change is observable through a watch channel
//! so the panel gets push events instead of waiting for the next heartbeat.
//! One `ProcessSupervisor` per process-managed instance (Rust/Conan/Soulmask).
//! Lifecycle mirrors the proven Go agent behavior — graceful SIGTERM with a 30s
//! budget before force kill, a monitor task that reaps the child and records
//! crash-vs-stop — with two fixes the Go version needed: args are a proper list
//! (no naive space splitting), and every state change is observable through a
//! watch channel so the panel gets push events instead of waiting for the next
//! heartbeat. Lifecycle control is exposed through the [`Supervisor`] trait so
//! the command dispatch is identical across process- and container-managed
//! games.
use anyhow::{bail, Context, Result};
use serde::Serialize;
use std::path::PathBuf;
use std::process::Stdio;
use std::sync::Arc;
@@ -17,39 +19,11 @@ use tokio::process::{Child, Command};
use tokio::sync::{watch, Mutex};
use crate::config::InstanceConfig;
use crate::supervisor::{InstanceState, Supervisor};
const GRACEFUL_STOP_BUDGET: Duration = Duration::from_secs(30);
const RESTART_PAUSE: Duration = Duration::from_secs(2);
#[derive(Debug, Clone, PartialEq, Serialize)]
#[serde(rename_all = "snake_case", tag = "state")]
pub enum InstanceState {
/// Not process-managed (no executable configured).
Unmanaged,
Stopped,
Starting,
Running,
Stopping,
/// Process exited without a stop request.
Crashed {
#[serde(skip_serializing_if = "Option::is_none")]
exit_code: Option<i32>,
},
}
impl InstanceState {
pub fn as_label(&self) -> &'static str {
match self {
InstanceState::Unmanaged => "unmanaged",
InstanceState::Stopped => "stopped",
InstanceState::Starting => "starting",
InstanceState::Running => "running",
InstanceState::Stopping => "stopping",
InstanceState::Crashed { .. } => "crashed",
}
}
}
struct Inner {
child: Option<Child>,
started_at: Option<Instant>,
@@ -59,7 +33,7 @@ struct Inner {
}
pub struct ProcessSupervisor {
pub instance_id: String,
instance_id: String,
executable: Option<PathBuf>,
args: Vec<String>,
working_dir: Option<PathBuf>,
@@ -90,72 +64,6 @@ impl ProcessSupervisor {
})
}
pub fn state(&self) -> InstanceState {
self.state_tx.borrow().clone()
}
pub fn watch_state(&self) -> watch::Receiver<InstanceState> {
self.state_tx.subscribe()
}
pub async fn uptime_seconds(&self) -> u64 {
let inner = self.inner.lock().await;
match (&*self.state_tx.borrow(), inner.started_at) {
(InstanceState::Running, Some(t)) => t.elapsed().as_secs(),
_ => 0,
}
}
pub async fn start(self: &Arc<Self>) -> Result<()> {
let Some(exe) = self.executable.clone() else {
bail!("instance '{}' has no executable configured", self.instance_id);
};
if !exe.exists() {
bail!("executable not found: {}", exe.display());
}
let mut inner = self.inner.lock().await;
if matches!(*self.state_tx.borrow(), InstanceState::Running | InstanceState::Starting) {
bail!("instance '{}' is already running", self.instance_id);
}
self.set_state(InstanceState::Starting);
let workdir = self
.working_dir
.clone()
.or_else(|| exe.parent().map(|p| p.to_path_buf()))
.unwrap_or_else(|| PathBuf::from("."));
let child = Command::new(&exe)
.args(&self.args)
.current_dir(&workdir)
.stdin(Stdio::null())
.stdout(Stdio::inherit())
.stderr(Stdio::inherit())
.spawn()
.with_context(|| format!("spawning {}", exe.display()))?;
let pid = child.id();
inner.child = Some(child);
inner.started_at = Some(Instant::now());
inner.stop_requested = false;
drop(inner);
self.set_state(InstanceState::Running);
tracing::info!(
"instance '{}' started: {} (pid {:?})",
self.instance_id,
exe.display(),
pid
);
// Monitor: reap the child and classify the exit.
let sup = Arc::clone(self);
tokio::spawn(async move { sup.monitor().await });
Ok(())
}
async fn monitor(self: Arc<Self>) {
// Take a waiter without holding the lock across the whole child
// lifetime: Child::wait needs &mut, so the child stays in inner and
@@ -201,7 +109,85 @@ impl ProcessSupervisor {
}
}
pub async fn stop(self: &Arc<Self>) -> Result<()> {
fn set_state(&self, state: InstanceState) {
// send_replace never fails even with zero receivers.
let _ = self.state_tx.send_replace(state);
}
}
#[async_trait::async_trait]
impl Supervisor for ProcessSupervisor {
fn instance_id(&self) -> &str {
&self.instance_id
}
fn state(&self) -> InstanceState {
self.state_tx.borrow().clone()
}
fn watch_state(&self) -> watch::Receiver<InstanceState> {
self.state_tx.subscribe()
}
async fn uptime_seconds(&self) -> u64 {
let inner = self.inner.lock().await;
match (&*self.state_tx.borrow(), inner.started_at) {
(InstanceState::Running, Some(t)) => t.elapsed().as_secs(),
_ => 0,
}
}
async fn start(self: Arc<Self>) -> Result<()> {
let Some(exe) = self.executable.clone() else {
bail!("instance '{}' has no executable configured", self.instance_id);
};
if !exe.exists() {
bail!("executable not found: {}", exe.display());
}
let mut inner = self.inner.lock().await;
if matches!(*self.state_tx.borrow(), InstanceState::Running | InstanceState::Starting) {
bail!("instance '{}' is already running", self.instance_id);
}
self.set_state(InstanceState::Starting);
let workdir = self
.working_dir
.clone()
.or_else(|| exe.parent().map(|p| p.to_path_buf()))
.unwrap_or_else(|| PathBuf::from("."));
let child = Command::new(&exe)
.args(&self.args)
.current_dir(&workdir)
.stdin(Stdio::null())
.stdout(Stdio::inherit())
.stderr(Stdio::inherit())
.spawn()
.with_context(|| format!("spawning {}", exe.display()))?;
let pid = child.id();
inner.child = Some(child);
inner.started_at = Some(Instant::now());
inner.stop_requested = false;
drop(inner);
self.set_state(InstanceState::Running);
tracing::info!(
"instance '{}' started: {} (pid {:?})",
self.instance_id,
exe.display(),
pid
);
// Monitor: reap the child and classify the exit.
let sup = Arc::clone(&self);
tokio::spawn(async move { sup.monitor().await });
Ok(())
}
async fn stop(self: Arc<Self>) -> Result<()> {
let mut inner = self.inner.lock().await;
if inner.child.is_none() {
bail!("instance '{}' is not running", self.instance_id);
@@ -263,16 +249,14 @@ impl ProcessSupervisor {
Ok(())
}
pub async fn restart(self: &Arc<Self>) -> Result<()> {
if !matches!(*self.state_tx.borrow(), InstanceState::Stopped | InstanceState::Crashed { .. } | InstanceState::Unmanaged) {
self.stop().await?;
async fn restart(self: Arc<Self>) -> Result<()> {
if !matches!(
*self.state_tx.borrow(),
InstanceState::Stopped | InstanceState::Crashed { .. } | InstanceState::Unmanaged
) {
self.clone().stop().await?;
}
tokio::time::sleep(RESTART_PAUSE).await;
self.start().await
}
fn set_state(&self, state: InstanceState) {
// send_replace never fails even with zero receivers.
let _ = self.state_tx.send_replace(state);
}
}


@@ -0,0 +1,80 @@
//! The supervision abstraction.
//!
//! A `Supervisor` owns the lifecycle of one game instance. Different games are
//! managed in fundamentally different ways — Rust/Conan/Soulmask are spawned OS
//! processes ([`crate::process::ProcessSupervisor`]); Dune is a docker-compose
//! stack ([`crate::docker_compose::DockerComposeSupervisor`]); future planes
//! (kubectl, AMP/podman, SSH) will be their own impls. The instance command
//! dispatch (`instancecmd::dispatch`) talks only to this trait, so it never
//! learns which management model is behind a given instance.
//!
//! Trait objects (`Arc<dyn Supervisor>`) need object-safe, dynamically
//! dispatchable async methods; native `async fn` in traits is not yet
//! dyn-compatible, so we use `#[async_trait]` (the battle-tested ecosystem
//! standard) to box the returned futures. The cost — one heap alloc per
//! lifecycle call — is irrelevant for start/stop/restart, which happen seconds
//! to minutes apart.
use std::sync::Arc;
use anyhow::Result;
use serde::Serialize;
use tokio::sync::watch;
/// Observable lifecycle state of one instance. Shared vocabulary across every
/// supervisor impl; serialized verbatim into heartbeats and status events
/// (`{"state":"running", ...}`).
#[derive(Debug, Clone, PartialEq, Serialize)]
#[serde(rename_all = "snake_case", tag = "state")]
pub enum InstanceState {
/// Not lifecycle-managed (a process instance with no executable, etc.).
Unmanaged,
Stopped,
Starting,
Running,
Stopping,
/// Exited/died without a stop request.
Crashed {
#[serde(skip_serializing_if = "Option::is_none")]
exit_code: Option<i32>,
},
}
impl InstanceState {
pub fn as_label(&self) -> &'static str {
match self {
InstanceState::Unmanaged => "unmanaged",
InstanceState::Stopped => "stopped",
InstanceState::Starting => "starting",
InstanceState::Running => "running",
InstanceState::Stopping => "stopping",
InstanceState::Crashed { .. } => "crashed",
}
}
}
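
For readers unfamiliar with serde's internally tagged layout, the wire shape that `tag = "state"` with `rename_all = "snake_case"` and `skip_serializing_if` should produce can be sketched by hand. The `to_json` helper below is a hypothetical, std-only stand-in written for this example; the real code derives `Serialize` and does not contain such a method.

```rust
// Std-only mirror of the enum; the real type derives `serde::Serialize`.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum InstanceState {
    Unmanaged,
    Stopped,
    Starting,
    Running,
    Stopping,
    Crashed { exit_code: Option<i32> },
}

impl InstanceState {
    fn as_label(&self) -> &'static str {
        match self {
            InstanceState::Unmanaged => "unmanaged",
            InstanceState::Stopped => "stopped",
            InstanceState::Starting => "starting",
            InstanceState::Running => "running",
            InstanceState::Stopping => "stopping",
            InstanceState::Crashed { .. } => "crashed",
        }
    }

    // Hypothetical helper: the variant name goes into the "state" key and
    // `exit_code` is left out entirely when it is `None`.
    fn to_json(&self) -> String {
        match self {
            InstanceState::Crashed { exit_code: Some(code) } => {
                format!("{{\"state\":\"crashed\",\"exit_code\":{code}}}")
            }
            other => format!("{{\"state\":\"{}\"}}", other.as_label()),
        }
    }
}
```

Because the tag lives inside the object, a consumer can read `state` first and only look for `exit_code` when the label is `crashed`.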
/// Lifecycle control + state observation for one instance.
///
/// `start`/`stop`/`restart` take `self: Arc<Self>` so an impl can hand a clone
/// to a spawned monitor task; callers hold an `Arc<dyn Supervisor>` and
/// `clone()` before each call. `watch_state` exposes the same channel the
/// status-event publisher drains, so panel push events stay decoupled from the
/// heartbeat cadence.
#[async_trait::async_trait]
pub trait Supervisor: Send + Sync {
/// The instance slug (a NATS subject segment).
fn instance_id(&self) -> &str;
/// Current cached state (cheap; no I/O).
fn state(&self) -> InstanceState;
/// Subscribe to state transitions.
fn watch_state(&self) -> watch::Receiver<InstanceState>;
/// Seconds since the instance entered `Running` (0 otherwise).
async fn uptime_seconds(&self) -> u64;
async fn start(self: Arc<Self>) -> Result<()>;
async fn stop(self: Arc<Self>) -> Result<()>;
async fn restart(self: Arc<Self>) -> Result<()>;
}


@@ -129,7 +129,7 @@ pub async fn collect(agent: &Agent, sys: &mut System) -> HeartbeatPayload {
let mut instances = Vec::with_capacity(agent.cfg.instances.len());
for inst in &agent.cfg.instances {
let (state, uptime_seconds) = match agent.supervisors.get(&inst.id) {
Some(sup) if !matches!(sup.state(), crate::process::InstanceState::Unmanaged) => {
Some(sup) if !matches!(sup.state(), crate::supervisor::InstanceState::Unmanaged) => {
(sup.state().as_label().to_string(), sup.uptime_seconds().await)
}
_ => {


@@ -0,0 +1,156 @@
//! DockerComposeSupervisor tests. A fake `docker` script records the exact
//! arguments it was invoked with and returns a controllable exit code, so we
//! assert the compose invocations + state transitions with no real Docker
//! daemon — the same mock-the-external-binary approach the steamcmd tests use.
#![cfg(unix)]
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};
use corrosion_host_agent::config::InstanceConfig;
use corrosion_host_agent::docker_compose::{DockerComposeConfig, DockerComposeSupervisor};
use corrosion_host_agent::supervisor::{InstanceState, Supervisor};
/// Write a fake `docker` executable that appends its args (space-joined) to
/// `args_log` and exits with the integer in `exit_file` (0 if absent).
fn fake_docker(dir: &Path, args_log: &Path, exit_file: &Path) -> PathBuf {
let script = dir.join("fakedocker");
let body = format!(
"#!/bin/sh\nprintf '%s\\n' \"$*\" >> '{}'\nexit \"$(cat '{}' 2>/dev/null || echo 0)\"\n",
args_log.display(),
exit_file.display(),
);
std::fs::write(&script, body).unwrap();
let mut perms = std::fs::metadata(&script).unwrap().permissions();
perms.set_mode(0o755);
std::fs::set_permissions(&script, perms).unwrap();
script
}
fn dune_instance(command: Vec<String>, service: Option<String>) -> InstanceConfig {
InstanceConfig {
id: "dune-main".to_string(),
game: "dune".to_string(),
root: PathBuf::from("/tmp"),
label: None,
executable: None,
args: vec![],
working_dir: None,
rcon: None,
steamcmd: None,
docker_compose: Some(DockerComposeConfig {
file: Some(PathBuf::from("docker-compose.yml")),
project: Some("duneproj".to_string()),
service,
command: Some(command),
}),
}
}
#[tokio::test]
async fn start_runs_compose_up_detached_and_sets_running() {
let dir = tempfile::tempdir().unwrap();
let args_log = dir.path().join("args.log");
let exit_file = dir.path().join("exit");
let docker = fake_docker(dir.path(), &args_log, &exit_file);
let sup = DockerComposeSupervisor::new(&dune_instance(
vec![docker.to_string_lossy().into_owned()],
None,
));
assert_eq!(sup.state(), InstanceState::Stopped);
sup.clone().start().await.expect("compose up should succeed");
assert_eq!(sup.state(), InstanceState::Running);
let logged = std::fs::read_to_string(&args_log).unwrap();
assert!(logged.contains("up -d"), "expected `up -d`; got: {logged}");
assert!(logged.contains("-p duneproj"), "expected project flag; got: {logged}");
assert!(logged.contains("-f docker-compose.yml"), "expected file flag; got: {logged}");
}
#[tokio::test]
async fn stop_runs_compose_stop_and_sets_stopped() {
let dir = tempfile::tempdir().unwrap();
let args_log = dir.path().join("args.log");
let exit_file = dir.path().join("exit");
let docker = fake_docker(dir.path(), &args_log, &exit_file);
let sup = DockerComposeSupervisor::new(&dune_instance(
vec![docker.to_string_lossy().into_owned()],
None,
));
sup.clone().start().await.expect("up");
sup.clone().stop().await.expect("compose stop should succeed");
assert_eq!(sup.state(), InstanceState::Stopped);
assert_eq!(sup.uptime_seconds().await, 0);
let logged = std::fs::read_to_string(&args_log).unwrap();
assert!(logged.lines().any(|l| l.contains("stop")), "expected a `stop` call; got: {logged}");
}
#[tokio::test]
async fn restart_runs_compose_restart() {
let dir = tempfile::tempdir().unwrap();
let args_log = dir.path().join("args.log");
let exit_file = dir.path().join("exit");
let docker = fake_docker(dir.path(), &args_log, &exit_file);
let sup = DockerComposeSupervisor::new(&dune_instance(
vec![docker.to_string_lossy().into_owned()],
None,
));
sup.clone().restart().await.expect("compose restart should succeed");
assert_eq!(sup.state(), InstanceState::Running);
let logged = std::fs::read_to_string(&args_log).unwrap();
assert!(logged.contains("restart"), "expected `restart`; got: {logged}");
}
#[tokio::test]
async fn single_service_is_targeted() {
let dir = tempfile::tempdir().unwrap();
let args_log = dir.path().join("args.log");
let exit_file = dir.path().join("exit");
let docker = fake_docker(dir.path(), &args_log, &exit_file);
let sup = DockerComposeSupervisor::new(&dune_instance(
vec![docker.to_string_lossy().into_owned()],
Some("gameserver".to_string()),
));
sup.clone().start().await.expect("up");
let logged = std::fs::read_to_string(&args_log).unwrap();
assert!(
logged.contains("up -d gameserver"),
"service must be appended after `up -d`; got: {logged}"
);
}
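
The assertions in these tests imply an argument layout of roughly `<command> -f <file> -p <project> <action> [-d] [service]`. The supervisor's own argument builder is not part of this excerpt, so the following is only a sketch of how such an argv could be assembled. `ComposeConfig` and `compose_argv` are hypothetical names, and the ordering of `-f` before `-p` is an assumption; the tests check each flag separately and do not pin their relative order.

```rust
use std::path::PathBuf;

// Hypothetical config shape, loosely modeled on the fields the tests set.
struct ComposeConfig {
    command: Vec<String>,
    file: Option<PathBuf>,
    project: Option<String>,
    service: Option<String>,
}

// Build the full argv for one compose action ("up", "stop", "restart").
fn compose_argv(cfg: &ComposeConfig, action: &str) -> Vec<String> {
    // Start from the configurable base command (for example a docker binary).
    let mut argv = cfg.command.clone();
    // Project-level flags go before the subcommand.
    if let Some(file) = &cfg.file {
        argv.push("-f".to_string());
        argv.push(file.to_string_lossy().into_owned());
    }
    if let Some(project) = &cfg.project {
        argv.push("-p".to_string());
        argv.push(project.clone());
    }
    argv.push(action.to_string());
    // Only `up` is detached; stop/restart take no such flag.
    if action == "up" {
        argv.push("-d".to_string());
    }
    // An optional single service narrows the action to that service.
    if let Some(service) = &cfg.service {
        argv.push(service.clone());
    }
    argv
}
```

Keeping the builder a pure function of the config makes it testable without spawning anything, which complements the mock-binary tests that check what actually reaches the process boundary.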
#[tokio::test]
async fn compose_failure_errors_and_reverts_state() {
let dir = tempfile::tempdir().unwrap();
let args_log = dir.path().join("args.log");
let exit_file = dir.path().join("exit");
std::fs::write(&exit_file, "1").unwrap(); // make the fake docker fail
let docker = fake_docker(dir.path(), &args_log, &exit_file);
let sup = DockerComposeSupervisor::new(&dune_instance(
vec![docker.to_string_lossy().into_owned()],
None,
));
let err = sup.clone().start().await.expect_err("nonzero compose exit must fail");
assert!(err.to_string().contains("compose up failed"), "got: {err}");
assert_eq!(sup.state(), InstanceState::Stopped, "failed start must revert to Stopped");
}
#[tokio::test]
async fn missing_docker_binary_errors_cleanly() {
let sup = DockerComposeSupervisor::new(&dune_instance(
vec!["/nonexistent/docker-xyz".to_string()],
None,
));
let err = sup.clone().start().await.expect_err("missing docker must fail");
assert!(err.to_string().contains("docker"), "error should mention docker: {err}");
assert_eq!(sup.state(), InstanceState::Stopped);
}
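
The mock-the-external-binary approach used by these tests can also be exercised end to end with only the standard library. This sketch assumes a Unix host with `/bin/sh` and a temp directory that allows execution; `write_fake_binary` and `run_and_capture` are hypothetical helper names and are not part of the test suite above.

```rust
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};
use std::process::Command;

// Write a shell script that appends its space-joined args to `args_log`
// and exits with a fixed code.
fn write_fake_binary(dir: &Path, args_log: &Path, exit_code: i32) -> std::io::Result<PathBuf> {
    let script = dir.join("fakebin");
    let body = format!(
        "#!/bin/sh\nprintf '%s\\n' \"$*\" >> '{}'\nexit {}\n",
        args_log.display(),
        exit_code
    );
    fs::write(&script, body)?;
    fs::set_permissions(&script, fs::Permissions::from_mode(0o755))?;
    Ok(script)
}

// Run the fake binary once and return (exited successfully, logged args).
fn run_and_capture(exit_code: i32, args: &[&str]) -> std::io::Result<(bool, String)> {
    // Per-process, per-exit-code scratch directory under the system temp dir.
    let dir = std::env::temp_dir().join(format!(
        "fakebin-demo-{}-{}",
        std::process::id(),
        exit_code
    ));
    fs::create_dir_all(&dir)?;
    let args_log = dir.join("args.log");
    let _ = fs::remove_file(&args_log); // start from an empty log
    let script = write_fake_binary(&dir, &args_log, exit_code)?;
    let status = Command::new(&script).args(args).status()?;
    let logged = fs::read_to_string(&args_log)?;
    fs::remove_dir_all(&dir)?;
    Ok((status.success(), logged))
}
```

The point of the technique is that the code under test runs its real spawning path, while the test controls both what the "binary" reports back and what it observed.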


@@ -8,7 +8,8 @@ use std::path::PathBuf;
use std::time::Duration;
use corrosion_host_agent::config::InstanceConfig;
use corrosion_host_agent::process::{InstanceState, ProcessSupervisor};
use corrosion_host_agent::process::ProcessSupervisor;
use corrosion_host_agent::supervisor::{InstanceState, Supervisor};
fn managed_instance(executable: &str, args: &[&str]) -> InstanceConfig {
InstanceConfig {
@@ -21,6 +22,7 @@ fn managed_instance(executable: &str, args: &[&str]) -> InstanceConfig {
working_dir: None,
rcon: None,
steamcmd: None,
docker_compose: None,
}
}
@@ -47,15 +49,15 @@ async fn start_status_stop_lifecycle() {
let sup = ProcessSupervisor::new(&managed_instance("/bin/sleep", &["300"]));
assert_eq!(sup.state(), InstanceState::Stopped);
sup.start().await.expect("start should succeed");
sup.clone().start().await.expect("start should succeed");
assert_eq!(sup.state(), InstanceState::Running);
tokio::time::sleep(Duration::from_millis(1100)).await;
assert!(sup.uptime_seconds().await >= 1, "uptime should advance");
// Double-start must be rejected while running.
assert!(sup.start().await.is_err(), "double start must fail");
assert!(sup.clone().start().await.is_err(), "double start must fail");
sup.stop().await.expect("stop should succeed");
sup.clone().stop().await.expect("stop should succeed");
let state = wait_for_state(&sup, |s| matches!(s, InstanceState::Stopped), Duration::from_secs(5)).await;
assert_eq!(state, InstanceState::Stopped);
assert_eq!(sup.uptime_seconds().await, 0);
@@ -64,7 +66,7 @@ async fn start_status_stop_lifecycle() {
#[tokio::test]
async fn unexpected_exit_is_crashed_with_code() {
let sup = ProcessSupervisor::new(&managed_instance("/bin/sh", &["-c", "sleep 0.2; exit 7"]));
sup.start().await.expect("start should succeed");
sup.clone().start().await.expect("start should succeed");
let state = wait_for_state(
&sup,
@@ -78,16 +80,16 @@ async fn unexpected_exit_is_crashed_with_code() {
#[tokio::test]
async fn restart_from_crashed_recovers() {
let sup = ProcessSupervisor::new(&managed_instance("/bin/sh", &["-c", "exit 1"]));
sup.start().await.expect("start should succeed");
sup.clone().start().await.expect("start should succeed");
wait_for_state(&sup, |s| matches!(s, InstanceState::Crashed { .. }), Duration::from_secs(5)).await;
// Restart from crashed must work (panel "Restart" after a crash).
// Use a long-lived command this time by replacing the supervisor — the
// command is fixed per supervisor, so emulate via a fresh one.
let sup2 = ProcessSupervisor::new(&managed_instance("/bin/sleep", &["300"]));
sup2.restart().await.expect("restart from stopped should start");
sup2.clone().restart().await.expect("restart from stopped should start");
assert_eq!(sup2.state(), InstanceState::Running);
sup2.stop().await.expect("cleanup stop");
sup2.clone().stop().await.expect("cleanup stop");
}
#[tokio::test]
@@ -96,14 +98,14 @@ async fn unmanaged_instance_rejects_process_commands() {
cfg.executable = None;
let sup = ProcessSupervisor::new(&cfg);
assert_eq!(sup.state(), InstanceState::Unmanaged);
assert!(sup.start().await.is_err(), "unmanaged start must fail");
assert!(sup.stop().await.is_err(), "unmanaged stop must fail");
assert!(sup.clone().start().await.is_err(), "unmanaged start must fail");
assert!(sup.clone().stop().await.is_err(), "unmanaged stop must fail");
}
#[tokio::test]
async fn missing_executable_fails_cleanly() {
let sup = ProcessSupervisor::new(&managed_instance("/nonexistent/bin/gameserver", &[]));
let err = sup.start().await.expect_err("must fail");
let err = sup.clone().start().await.expect_err("must fail");
assert!(err.to_string().contains("not found"), "error should say not found: {err}");
assert_eq!(sup.state(), InstanceState::Stopped, "failed start must not leave Starting state");
}


@@ -0,0 +1,69 @@
# Reference Repos
Third-party Dune: Awakening server-management projects, kept here as **behavior
references** for Phase 2 (the Corrosion host-agent Dune adapter + future panel
Dune features). These are NOT Corrosion code and are not built or shipped — they
are read-only references. `.git` histories, `node_modules`, and compiled
binaries were stripped on import (the 38 MB `icehunter/web/dune-admin` build
artifact and a Tauri `.icns` are intentionally absent).
> Imported 2026-06-12 from `/tmp/dune-re`. Each was a separate upstream repo;
> see each project's own `LICENSE` and `README.md`. Treat as documentation.
## Why these are here
Dune: Awakening does **not** use SteamCMD or a plain game-server process like
Rust/Conan/Soulmask. It ships as **Docker container(s)** fronted by a **RabbitMQ
broker** (admin + game vhosts) and a **PostgreSQL** admin database (`dune`
schema), orchestrated as a "**battlegroup**". The game process is
`DuneSandboxServer-Linux-Shipping` (one per partition). Server settings live in
INI files (`UserEngine.ini` / `UserGame.ini`) and only take effect after a
restart. Our Dune adapter must model that container/broker/DB world instead of
the process+SteamCMD model — these repos are how that world actually works in
the wild.
## The references
### `icehunter/` — `dune-admin` (Go backend + React SPA)
The richest ops reference. A web admin panel with **four interchangeable control
planes**: `docker`, `kubectl`, `local`, and `amp` (CubeCoders AMP / podman).
Most relevant to us:
- **`SETUP_DOCKER.md`** — the Docker control plane: `docker start/stop/restart`
for lifecycle, `docker logs -f` for streaming, `docker exec` into the broker
container for RabbitMQ (`rabbitmqctl`) commands, direct TCP to the `dune`
Postgres. Optional SSH tunnelling when the admin is off-host. **This is the
closest analog to what the Corrosion host-agent Dune adapter must do.**
- `cmd/dune-admin/control_docker.go` / `control_kubectl.go` / `control_local.go`
/ `control_amp.go` — the `ControlPlane` interface and its implementations
(the start/stop/restart/status/log/broker abstraction we mirror as a Rust
game-adapter trait).
- `db.go` / `model.go` — the full Dune admin data model (players, bases,
inventory, exchange/market) for when Corrosion grows a richer Dune admin
surface beyond lifecycle.
- `CLAUDE.md` — upstream's own engineering notes; the AMP section documents the
INI-vs-API server-settings gotcha (AMP regenerates INIs on start).
### `adainrivers/` — Dune Dedicated Server Manager (Rust / Tauri desktop)
**The Rust reference.** Manages already-provisioned servers over **SSH +
Kubernetes** ("BattleGroup" start/stop/restart/update), with secure SSH tunnels
to Director / File Browser / Postgres / PgHero, an in-game admin console (item
grants, vehicle spawns, journey/XP tags), and a bundled **`dune-server-service`**
daemon for scheduled maintenance (timed restarts with in-game warnings, backups,
update apply). Closest to our stack idiomatically — read it for Rust patterns on
SSH control, the maintenance-daemon design, and the in-game command surface.
### `the4rchangel/` — Dune: Awakening Server Manager (Node.js local web UI)
**Matches the Commander's exact self-host path.** A local dashboard that
replaces the `battlegroup.bat` terminal menu — guided VM import (Hyper-V),
network, SSH, bootstrap, then daily ops: battlegroup start/stop/restart/update,
character editor, visual game-config editor (PvP, sandstorms, sandworms, mining
rates, decay, building limits), monitoring, DB access. Read it to understand the
`battlegroup.bat` workflow our agent has to drive on a Windows/Hyper-V host.
## How we use them
- **Lifecycle/control** → mirror `icehunter`'s `ControlPlane` docker provider as
the agent's Dune game-adapter (compose/`docker` lifecycle, `docker logs`
console, reject SteamCMD).
- **Rust idioms / maintenance daemon / SSH** → `adainrivers`.
- **`battlegroup.bat` reality / setup flow / game-config schema** → `the4rchangel`.


@@ -0,0 +1,71 @@
name: CI
on:
workflow_dispatch:
env:
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
jobs:
checks:
name: Workspace checks (${{ matrix.platform }})
runs-on: ${{ matrix.platform }}
strategy:
fail-fast: false
matrix:
platform: [windows-latest, ubuntu-22.04, macos-latest]
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install Rust
uses: dtolnay/rust-toolchain@stable
- name: Install Node
uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: app/package-lock.json
- name: Install Linux Tauri dependencies
if: matrix.platform == 'ubuntu-22.04'
run: |
sudo apt-get update
sudo apt-get install -y libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf pkg-config libssl-dev
- name: Install frontend dependencies
working-directory: app
run: npm ci
- name: Rust format
run: cargo fmt --all -- --check
- name: Rust check
run: cargo check --workspace
- name: Rust tests
run: cargo test --workspace
- name: Core API docs
run: cargo doc -p dune-manager-core --no-deps
- name: Frontend build
working-directory: app
run: npm run build
- name: Tauri shell check
run: cargo check -p dune-dedicated-server-manager-app
- name: Secret and machine-constant scan
if: matrix.platform == 'windows-latest'
shell: pwsh
run: |
rg -n -S "I:|AutoUpdate|192\.168\.2\.|menna|dune-awakening|C:\\WINDOWS\\System32\\OpenSSH|C:\\Windows\\System32\\OpenSSH|change-me-before-exposing|c05564d|d177d3bbc40be761|qRmQx|FuncomLiveServices__ServiceAuthToken" . -g "!app/**/target/**" -g "!crates/**/target/**" -g "!target/**" -g "!app/node_modules/**" -g "!app/dist/**" -g "!*.md" -g "!app/steamcmd/**" -g "!app/dune-server/**" -g "!app/vm/**" -g "!app/vm-*/**" -g "!vm/**" -g "!.tmp/**"
if ($LASTEXITCODE -eq 0) {
throw "Secret or machine-specific constant scan found matches."
}
if ($LASTEXITCODE -ne 1) {
exit $LASTEXITCODE
}
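
The exit-code handling in the scan step inverts the usual convention: ripgrep exits 0 when it found a match, 1 when it found none, and a different code when the search itself failed. A small sketch of that decision table, written in Rust for consistency with the rest of this change, is shown below; `scan_verdict` is a hypothetical name and the error strings are illustrative.

```rust
// Map a ripgrep exit code to a pass/fail verdict for a "must not match" scan.
fn scan_verdict(rg_exit_code: i32) -> Result<(), String> {
    match rg_exit_code {
        // A match means a forbidden pattern is present: fail the job.
        0 => Err("secret or machine-specific constant scan found matches".to_string()),
        // No match is the desired outcome.
        1 => Ok(()),
        // Anything else is a scanner error and is surfaced with its code.
        other => Err(format!("scanner failed with exit code {other}")),
    }
}
```

Treating "no match" and "scanner error" differently matters here: collapsing them would let a broken pattern or missing tool pass silently as a clean scan.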


@@ -0,0 +1,203 @@
name: Release
on:
push:
tags:
- "v*.*.*"
workflow_dispatch:
inputs:
version:
description: "Version to release, for example 0.1.0"
required: true
type: string
permissions:
contents: write
env:
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
jobs:
linux-service-binary:
name: Build dune-server-service (musl)
runs-on: ubuntu-22.04
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@stable
with:
targets: x86_64-unknown-linux-musl
- name: Install Zig
uses: mlugg/setup-zig@v1
with:
version: 0.13.0
- name: Install cargo-zigbuild
run: cargo install --locked cargo-zigbuild
- name: Resolve release version
shell: bash
env:
WORKFLOW_VERSION: ${{ inputs.version }}
run: |
version="$WORKFLOW_VERSION"
if [ -z "$version" ]; then
version="${GITHUB_REF_NAME#v}"
fi
if [ -z "$version" ]; then
echo "could not resolve release version" >&2
exit 1
fi
echo "RELEASE_VERSION=$version" >> "$GITHUB_ENV"
echo "RELEASE_TAG=v$version" >> "$GITHUB_ENV"
- name: Build musl binary
run: |
cargo zigbuild -p dune-server-service --release --target x86_64-unknown-linux-musl
strip target/x86_64-unknown-linux-musl/release/dune-server-service
- name: Stage release artifacts
run: |
mkdir -p release-artifacts
cp target/x86_64-unknown-linux-musl/release/dune-server-service release-artifacts/dune-server-service
cp crates/dune-server-service/systemd/dune-server-service.service release-artifacts/dune-server-service.service
cp crates/dune-server-service/openrc/dune-server-service release-artifacts/dune-server-service.openrc
- name: Upload artifact for desktop bundle
uses: actions/upload-artifact@v4
with:
name: dune-server-service-musl
path: release-artifacts/
retention-days: 7
- name: Resolve release notes
if: startsWith(github.ref, 'refs/tags/v')
shell: bash
run: |
notes_path="release-notes/${RELEASE_VERSION}.md"
if [ -f "$notes_path" ]; then
echo "RELEASE_BODY_PATH=$notes_path" >> "$GITHUB_ENV"
else
tmp=$(mktemp)
printf 'Release v%s. No release-notes/%s.md was provided — see the commit log for details.\n' \
"$RELEASE_VERSION" "$RELEASE_VERSION" > "$tmp"
echo "RELEASE_BODY_PATH=$tmp" >> "$GITHUB_ENV"
fi
- name: Attach to GitHub release
if: startsWith(github.ref, 'refs/tags/v')
uses: softprops/action-gh-release@v2
with:
tag_name: ${{ env.RELEASE_TAG }}
body_path: ${{ env.RELEASE_BODY_PATH }}
files: |
release-artifacts/dune-server-service
release-artifacts/dune-server-service.service
release-artifacts/dune-server-service.openrc
desktop-app:
name: Build ${{ matrix.name }} app
needs: linux-service-binary
runs-on: ${{ matrix.platform }}
strategy:
fail-fast: false
matrix:
include:
- name: Windows
platform: windows-latest
args: --bundles nsis
- name: Linux
platform: ubuntu-22.04
args: --bundles appimage,deb
- name: macOS Apple Silicon
platform: macos-latest
args: --target aarch64-apple-darwin --bundles dmg
- name: macOS Intel
platform: macos-latest
args: --target x86_64-apple-darwin --bundles dmg
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install Rust
uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ startsWith(matrix.name, 'macOS') && 'aarch64-apple-darwin,x86_64-apple-darwin' || '' }}
- name: Install Node
uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: app/package-lock.json
- name: Install Linux Tauri dependencies
if: matrix.platform == 'ubuntu-22.04'
run: |
sudo apt-get update
sudo apt-get install -y libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf pkg-config libssl-dev
- name: Install frontend dependencies
working-directory: app
run: npm ci
- name: Download bundled dune-server-service binary
uses: actions/download-artifact@v4
with:
name: dune-server-service-musl
path: app/src-tauri/binaries/
- name: Resolve release version
shell: pwsh
env:
WORKFLOW_VERSION: ${{ inputs.version }}
run: |
$version = $env:WORKFLOW_VERSION
if ([string]::IsNullOrWhiteSpace($version)) {
$version = "${{ github.ref_name }}".TrimStart("v")
}
if ([string]::IsNullOrWhiteSpace($version)) {
throw "Release version could not be resolved."
}
"RELEASE_VERSION=$version" | Out-File -FilePath $env:GITHUB_ENV -Append
"RELEASE_TAG=v$version" | Out-File -FilePath $env:GITHUB_ENV -Append
- name: Prepare release config
shell: pwsh
run: |
$version = $env:RELEASE_VERSION
Push-Location app
npm version --no-git-tag-version --allow-same-version $version
Pop-Location
$tauriConfigPath = "app/src-tauri/tauri.conf.json"
$config = Get-Content $tauriConfigPath -Raw
$config = $config -replace '"version":\s*"[^"]+"', ('"version": "' + $version + '"')
# Release builds publish signed updater artifacts; the checked-in
# default keeps this off so local debug builds do not require
# TAURI_SIGNING_PRIVATE_KEY.
$config = $config -replace '"createUpdaterArtifacts":\s*false', '"createUpdaterArtifacts": true'
Set-Content -Path $tauriConfigPath -Value $config -NoNewline
# The body is set by the linux-service-binary job's softprops step.
# tauri-action only uploads desktop bundles + the signed updater
# artifacts here; we don't pass releaseBody to avoid clobbering.
- name: Build and publish Tauri release
uses: tauri-apps/tauri-action@v0
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
TAURI_SIGNING_PRIVATE_KEY: ${{ secrets.TAURI_SIGNING_PRIVATE_KEY }}
TAURI_SIGNING_PRIVATE_KEY_PASSWORD: ${{ secrets.TAURI_SIGNING_PRIVATE_KEY_PASSWORD }}
VITE_ENABLE_STARTUP_UPDATE_CHECK: "true"
with:
projectPath: app
tagName: ${{ env.RELEASE_TAG }}
releaseName: "Dune Dedicated Server Manager ${{ env.RELEASE_TAG }}"
releaseDraft: false
prerelease: false
args: ${{ matrix.args }}
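
Both jobs in this workflow derive the release version by dropping the leading `v` from the ref name, once in bash (`${GITHUB_REF_NAME#v}`) and once in PowerShell (`.TrimStart("v")`). The two are not strictly equivalent: the bash expansion removes a single `v` prefix, while `TrimStart` removes every leading `v`. A minimal Rust sketch of the distinction follows; the function names are hypothetical and not part of this repository.

```rust
/// Mirrors bash `${name#v}`: drop at most one leading `v`.
fn strip_one_v(ref_name: &str) -> &str {
    ref_name.strip_prefix('v').unwrap_or(ref_name)
}

/// Mirrors .NET `TrimStart('v')`: drop every leading `v`.
fn trim_all_v(ref_name: &str) -> &str {
    ref_name.trim_start_matches('v')
}
```

For ordinary tags such as `v0.3.16` both functions return `0.3.16`. They only diverge for names with repeated leading `v` characters, for example a ref named `vv1`.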

View File

@@ -0,0 +1,68 @@
# Dependencies
node_modules/
app/node_modules/
# Frontend build
dist/
app/dist/
app/src-tauri/gen/schemas/
# Rust/Tauri build outputs
target/
src-tauri/target/
app/src-tauri/target/
manager-api/target/
# Local environment
.env
.env.*
!.env.example
# Logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
# Markdown files are scratch notes for now; only README.md and docs/ stay tracked
*.md
!README.md
!docs/
!docs/*.md
docs/rabbitmq-protocol.md
# Release notes go on GitHub releases via the release workflow.
!release-notes/
!release-notes/*.md
# Editor and OS noise
.idea/
.vscode/
*.swp
*.swo
Thumbs.db
Desktop.ini
# Local app/runtime data and secrets
.tmp/
.playwright-mcp/
app/default-config.json
app/steamcmd/
app/dune-server/
dune-server/
app/vm/
app/vm-*/
app/src-tauri/dune-server/
app/src-tauri/vm/
app/src-tauri/resources/manager-api/dune-manager-api
app/src-tauri/resources/manager-api/dune-manager-api.exe
vm/
*.pem
*.key
sshKey
codex_vm_ed25519_dropbear
codex_vm_ed25519_dropbear.pub
snapshots/
keys/
initial-setup-log.txt
secrets/

7156
docs/reference-repos/adainrivers/Cargo.lock generated Normal file

File diff suppressed because it is too large.

View File

@@ -0,0 +1,7 @@
[workspace]
members = ["crates/dune-manager-core", "crates/dune-server-service", "app/src-tauri"]
resolver = "2"
[workspace.dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"

View File

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 gaming.tools
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -0,0 +1,59 @@
# Dune Dedicated Server Manager
A desktop manager for existing Dune Awakening dedicated servers.
![Dashboard — BattleGroup status, lifecycle actions, management service, and tunnel controls](images/ss-1.png)
The app manages already-provisioned Dune dedicated servers over SSH and
Kubernetes control commands. It does not install the game server, create VMs,
configure Hyper-V, provision Ubuntu, or manage external tools such as SteamCMD.
## Features
- Remote server profile management with SSH private-key authentication
- BattleGroup status, start, stop, restart, and update controls
- Component diagnostics, log viewing, and safe restart actions
- Secure Director, File Browser, PostgreSQL, and PgHero access through local SSH tunnels
- Bundled `dune-server-service` daemon for on-host scheduled maintenance (daily restarts with in-game warnings, automated backups, server update check + apply) — installed over SSH straight from the Management card
- Admin console for in-game actions: item grants, vehicle spawns, skill/journey/XP tags, player lookup with live pawn location, and a logged history of every published command
- Automated tasks tab with editable schedule settings (daily restart time, warning lead/frequency, update apply lead, IANA timezone) — saving auto-restarts the service so changes apply immediately
- Welcome Package automation: a per-player onboarding chain (item grants, water refill, welcome whisper) driven by Postgres player detection, tracked in the management service's SQLite ledger, and configurable from the Welcome Package tab with both a visual editor and a raw JSON mode
![Admin tab — granting items to online players with a searchable Funcom item picker](images/ss-2.png)
More management features coming soon.
## Install
Download the latest release for your operating system from GitHub Releases.
- Windows: run the NSIS installer.
- Linux: use the AppImage or Debian package.
- macOS: use the DMG for your Mac architecture.
After launching the app, add an existing server profile with its host, SSH user,
and private key path, then refresh it to detect BattleGroups and management
endpoints.
## Managed Server Assumptions
The target server must already be installed and reachable over SSH. The app
expects the Dune Kubernetes resources and vendor management scripts to exist on
the server before you add it.
Required player-facing/server ports depend on your own server deployment. A
typical dedicated-server deployment uses:
- UDP 7777-7810 for game servers
- TCP 31982 for RMQ
If you find a bug or run into other issues, please open an issue here:
https://github.com/adainrivers/dune-dedicated-server-manager/issues
## Building From Source
See [Building From Source](docs/building-from-source.md).
## License
MIT License. See [LICENSE](LICENSE).

View File

@@ -0,0 +1,15 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Dune Dedicated Server Manager</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Funnel+Display:wght@400;500;600;700&family=Geist:wght@300;400;500;600;700&family=Geist+Mono:wght@400;500;600&display=swap" rel="stylesheet">
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>

File diff suppressed because it is too large.

View File

@@ -0,0 +1,32 @@
{
"name": "dune-dedicated-server-manager-app",
"private": true,
"version": "0.3.16",
"type": "module",
"scripts": {
"dev": "vite --host 127.0.0.1 --port 1420",
"build": "tsc && vite build",
"preview": "vite preview --host 127.0.0.1 --port 1420",
"tauri": "tauri"
},
"dependencies": {
"@radix-ui/react-icons": "^1.3.2",
"@radix-ui/themes": "^3.2.1",
"@tauri-apps/api": "^2.0.0",
"@tauri-apps/plugin-dialog": "^2.7.1",
"@tauri-apps/plugin-process": "^2.3.1",
"@tauri-apps/plugin-shell": "^2.3.5",
"@tauri-apps/plugin-updater": "^2.10.1",
"markdown-to-jsx": "^9.8.1",
"react": "^18.3.1",
"react-dom": "^18.3.1"
},
"devDependencies": {
"@tauri-apps/cli": "^2.0.0",
"@types/react": "^18.3.12",
"@types/react-dom": "^18.3.1",
"@vitejs/plugin-react": "^4.3.3",
"typescript": "^5.6.3",
"vite": "^5.4.10"
}
}

View File

@@ -0,0 +1,26 @@
[package]
name = "dune-dedicated-server-manager-app"
version = "0.2.0"
description = "Desktop shell for Dune Dedicated Server Manager"
authors = ["Dune Dedicated Server Manager"]
edition = "2021"
[lib]
name = "dune_dedicated_server_manager_app_lib"
crate-type = ["staticlib", "cdylib", "rlib"]
[build-dependencies]
tauri-build = { version = "2", features = [] }
[dependencies]
dune-manager-core = { path = "../../crates/dune-manager-core" }
tauri = { version = "2", features = ["devtools"] }
serde = { workspace = true }
serde_json = { workspace = true }
tauri-plugin-dialog = "2"
tauri-plugin-updater = "2"
tauri-plugin-process = "2"
tauri-plugin-shell = "2"
base64 = "0.22"
chrono = { version = "0.4", default-features = false, features = ["clock", "std"] }
reqwest = { version = "0.12", default-features = false, features = ["json"] }

View File

@@ -0,0 +1,6 @@
# Populated by CI from the `linux-service-binary` job artifact, or locally
# via `cargo zigbuild -p dune-server-service --release --target
# x86_64-unknown-linux-musl` + manual copy. Not tracked.
dune-server-service
dune-server-service.service
dune-server-service.openrc

View File

@@ -0,0 +1,23 @@
# Bundled service binaries
This directory holds the Linux `dune-server-service` binary (musl-static), its
systemd unit, and its OpenRC init script. They are populated by the
`linux-service-binary` job in `.github/workflows/release.yml` and bundled into
the desktop installer as Tauri resources.
For local debug builds the directory can be empty — the `install_management_service`
Tauri command surfaces a friendly error when the resource is missing.
For a local end-to-end test, build the service yourself:
```powershell
rustup target add x86_64-unknown-linux-musl
cargo install --locked cargo-zigbuild
cargo zigbuild -p dune-server-service --release --target x86_64-unknown-linux-musl
Copy-Item target\x86_64-unknown-linux-musl\release\dune-server-service `
app\src-tauri\binaries\dune-server-service
Copy-Item crates\dune-server-service\systemd\dune-server-service.service `
app\src-tauri\binaries\dune-server-service.service
Copy-Item crates\dune-server-service\openrc\dune-server-service `
app\src-tauri\binaries\dune-server-service.openrc
```
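
The "friendly error when the resource is missing" behaviour can be pictured roughly as below. This is a sketch only: `bundled_resource` is a hypothetical helper, and the real `install_management_service` command may resolve and report the resource differently.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the path of a bundled file, or a readable
/// error telling the developer how to populate the directory.
fn bundled_resource(binaries_dir: &Path, file_name: &str) -> Result<PathBuf, String> {
    let path = binaries_dir.join(file_name);
    if path.is_file() {
        Ok(path)
    } else {
        Err(format!(
            "Bundled resource `{file_name}` is missing from {}. \
             Build it with cargo zigbuild and copy it into place.",
            binaries_dir.display()
        ))
    }
}
```

Returning a `String` error matches the `Result<_, String>` shape that the Tauri commands in this app use, so the message can be shown to the user unchanged.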

View File

@@ -0,0 +1,67 @@
fn main() {
expose_dune_server_service_version();
rerun_if_bundled_binaries_change();
tauri_build::build();
}
/// Tauri's resource-copy step only fires when Cargo decides build.rs needs to
/// re-run, which by default doesn't watch arbitrary files. Without these
/// `rerun-if-changed` lines, refreshing the bundled `dune-server-service`
/// binary or its systemd/openrc units in `binaries/` after a previous build
/// produces a stale `target/release/binaries/` copy — the running exe then
/// pushes the OLD binary on Install/Update, with no visible signal.
fn rerun_if_bundled_binaries_change() {
let dir = std::path::Path::new(env!("CARGO_MANIFEST_DIR")).join("binaries");
// Watch the directory itself so file additions/deletions also trigger a rerun.
println!("cargo:rerun-if-changed={}", dir.display());
if let Ok(entries) = std::fs::read_dir(&dir) {
for entry in entries.flatten() {
let path = entry.path();
// Skip README, .gitignore, and similar bookkeeping files.
if matches!(
path.file_name().and_then(|n| n.to_str()),
Some("README.md") | Some(".gitignore")
) {
continue;
}
println!("cargo:rerun-if-changed={}", path.display());
}
}
}
fn expose_dune_server_service_version() {
let cargo_toml = std::path::Path::new(env!("CARGO_MANIFEST_DIR"))
.join("../../crates/dune-server-service/Cargo.toml");
println!("cargo:rerun-if-changed={}", cargo_toml.display());
let contents = std::fs::read_to_string(&cargo_toml)
.unwrap_or_else(|err| panic!("reading {}: {err}", cargo_toml.display()));
let version = parse_package_version(&contents).unwrap_or_else(|| {
panic!(
"could not find [package].version in {}",
cargo_toml.display()
)
});
println!("cargo:rustc-env=DUNE_SERVER_SERVICE_VERSION={version}");
}
fn parse_package_version(toml: &str) -> Option<String> {
let mut in_package = false;
for line in toml.lines() {
let trimmed = line.trim();
if trimmed.starts_with('[') {
in_package = trimmed == "[package]";
continue;
}
if !in_package {
continue;
}
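if let Some(rest) = trimmed.strip_prefix("version") {
// Only accept an exact `version = "..."` key. Lookalike keys such as
// `version.workspace = true` are skipped rather than aborting the scan.
let Some(rest) = rest.trim_start().strip_prefix('=') else {
continue;
};
let rest = rest.trim_start().trim_start_matches('"');
let end = rest.find('"')?;
return Some(rest[..end].to_string());
}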
}
None
}
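
To make the table-scoping of the parser above concrete, here is a standalone copy of the same line-based approach together with a sample manifest. The name `package_version` and the `SAMPLE` manifest are illustrative assumptions, not code from this repository.

```rust
/// Standalone copy of the line-based approach: find `version = "..."`
/// inside the `[package]` table only.
fn package_version(manifest: &str) -> Option<String> {
    let mut in_package = false;
    for line in manifest.lines() {
        let trimmed = line.trim();
        if trimmed.starts_with('[') {
            // Any table header switches scope; only `[package]` enables matching.
            in_package = trimmed == "[package]";
            continue;
        }
        if !in_package {
            continue;
        }
        if let Some(rest) = trimmed.strip_prefix("version") {
            let rest = rest.trim_start().strip_prefix('=')?.trim_start();
            let rest = rest.trim_start_matches('"');
            let end = rest.find('"')?;
            return Some(rest[..end].to_string());
        }
    }
    None
}

/// Hypothetical sample manifest used only for this illustration.
const SAMPLE: &str =
    "[workspace.package]\nversion = \"9.9.9\"\n\n[package]\nname = \"demo\"\nversion = \"0.3.16\"\n";
```

With `SAMPLE`, the `version` under `[workspace.package]` is ignored and the one under `[package]` is returned. A manifest with no `[package]` table yields `None`.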

View File

@@ -0,0 +1,7 @@
{
"$schema": "../gen/schemas/desktop-schema.json",
"identifier": "default",
"description": "Default desktop app permissions",
"windows": ["main"],
"permissions": ["core:default", "dialog:allow-open", "process:default", "shell:allow-open", "updater:default"]
}


View File

@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<foreground android:drawable="@mipmap/ic_launcher_foreground"/>
<background android:drawable="@color/ic_launcher_background"/>
</adaptive-icon>


View File

@@ -0,0 +1,4 @@
<?xml version="1.0" encoding="utf-8"?>
<resources>
<color name="ic_launcher_background">#fff</color>
</resources>

View File

@@ -0,0 +1,254 @@
use dune_manager_core::orchestration::{
is_started_state, BattlegroupManagementOrchestrator, BattlegroupRef, BattlegroupState,
RusshRunner, StructuredKubectl, VendorBattlegroupWrapper,
};
use crate::commands::shared::{command_error_message, runner_for_remote_kind};
use crate::commands::status_data::read_remote_server_status;
use crate::dto::{RemoteBattlegroupStatus, RemoteServerActionRequest, RemoteServerStatus};
use crate::logging::TauriOperationSink;
type Manager = BattlegroupManagementOrchestrator<
StructuredKubectl<RusshRunner>,
VendorBattlegroupWrapper<RusshRunner>,
>;
fn manager_from_runner(runner: &RusshRunner) -> Manager {
let kubernetes = StructuredKubectl::new(runner.clone());
// Pass the actual SSH login user so the wrapper knows when to insert
// `sudo -n -u dune -H bash -lc ...`. Defaulting to "dune" here was a
// silent root-style fallback: when the operator registered the server
// under e.g. `ubuntu`, the wrapper skipped impersonation and the script
// tried to read/write /home/dune as ubuntu, which fails noisily.
let ssh_user = runner.target().user.clone();
let wrapper = VendorBattlegroupWrapper::with_ssh_user(runner.clone(), ssh_user);
BattlegroupManagementOrchestrator::new(kubernetes, wrapper)
}
#[tauri::command]
pub async fn start_remote_battlegroup(
app: tauri::AppHandle,
request: RemoteServerActionRequest,
) -> Result<RemoteServerStatus, String> {
run_remote_battlegroup_action(app, request, false).await
}
#[tauri::command]
pub async fn stop_remote_battlegroup(
app: tauri::AppHandle,
request: RemoteServerActionRequest,
) -> Result<RemoteServerStatus, String> {
run_remote_battlegroup_action(app, request, true).await
}
#[tauri::command]
pub async fn restart_remote_battlegroup(
app: tauri::AppHandle,
request: RemoteServerActionRequest,
) -> Result<RemoteServerStatus, String> {
let worker_app = app.clone();
tauri::async_runtime::spawn_blocking(move || {
let mut sink = TauriOperationSink::new(worker_app);
sink.info("bg.restart", "Restarting remote battlegroup.");
let runner = runner_for_remote_kind(
request.server_type.as_deref(),
request.host,
request.user,
request.key_path,
Some(request.port),
)?;
let battlegroup = BattlegroupRef {
namespace: request.namespace,
name: request.battlegroup_name,
};
let manager = manager_from_runner(&runner);
manager
.restart_and_wait_director(&battlegroup, 240, &mut sink)
.map_err(command_error_message)?;
sink.info("bg.restart", "Refreshing battlegroup state.");
read_remote_server_status(&runner, &battlegroup.namespace, &battlegroup.name)
.map_err(command_error_message)
})
.await
.map_err(|err| format!("Remote battlegroup restart worker failed: {err}"))?
}
#[tauri::command]
pub async fn update_remote_battlegroup(
app: tauri::AppHandle,
request: RemoteServerActionRequest,
) -> Result<RemoteServerStatus, String> {
let worker_app = app.clone();
tauri::async_runtime::spawn_blocking(move || {
let mut sink = TauriOperationSink::new(worker_app);
sink.info("bg.update", "Running vendor wrapper update.");
let runner = runner_for_remote_kind(
request.server_type.as_deref(),
request.host,
request.user,
request.key_path,
Some(request.port),
)?;
run_battlegroup_update_with_runner(
&runner,
&mut sink,
request.namespace,
request.battlegroup_name,
)
})
.await
.map_err(|err| format!("Remote battlegroup update worker failed: {err}"))?
}
pub async fn run_remote_battlegroup_action(
app: tauri::AppHandle,
request: RemoteServerActionRequest,
stop: bool,
) -> Result<RemoteServerStatus, String> {
let worker_app = app.clone();
tauri::async_runtime::spawn_blocking(move || {
let mut sink = TauriOperationSink::new(worker_app);
sink.info("bg.check", "Checking remote battlegroup state.");
let runner = runner_for_remote_kind(
request.server_type.as_deref(),
request.host,
request.user,
request.key_path,
Some(request.port),
)?;
run_battlegroup_action_with_runner(
&runner,
&mut sink,
request.namespace,
request.battlegroup_name,
stop,
)
})
.await
.map_err(|err| format!("Remote battlegroup action worker failed: {err}"))?
}
fn run_battlegroup_action_with_runner(
runner: &RusshRunner,
sink: &mut TauriOperationSink,
namespace: String,
battlegroup_name: String,
stop: bool,
) -> Result<RemoteServerStatus, String> {
let battlegroup = BattlegroupRef {
namespace,
name: battlegroup_name,
};
let manager = manager_from_runner(runner);
// Pre-flight no-op guard. Read the BattleGroup state from the stable
// kubectl JSON schema (same source as the dashboard) rather than the
// vendor wrapper's `status` text: that text layout drifts across Funcom
// releases and was being misparsed into bogus phases (e.g. status="World",
// director="2/2"), which made `is_started_state` wrongly report the BG as
// not running and refuse a perfectly valid Stop (#19).
let before = read_remote_server_status(runner, &battlegroup.namespace, &battlegroup.name)
.map_err(command_error_message)?;
let before_bg = &before.battlegroup;
let before_started = is_started_state(&battlegroup_state_from_status(before_bg));
if stop && !before_started {
return Err(format!(
"Battlegroup is not running (status={}, stop={}, database={}, gateway={}, director={}).",
before_bg.phase,
before_bg.stop,
before_bg.database_phase,
before_bg.server_group_phase,
before_bg.director_phase
));
}
if !stop && before_started {
return Err("Battlegroup is already started.".to_string());
}
if stop {
manager
.stop(&battlegroup, sink)
.map_err(command_error_message)?;
} else {
manager
.start_and_wait_director(&battlegroup, 180, sink)
.map_err(command_error_message)?;
}
sink.info("bg.check", "Refreshing battlegroup state.");
read_remote_server_status(runner, &battlegroup.namespace, &battlegroup.name)
.map_err(command_error_message)
}
/// Adapts the structured `RemoteBattlegroupStatus` (read from the BattleGroup
/// CR JSON) into the core `BattlegroupState` so the shared `is_started_state`
/// phase vocabulary stays the single source of truth. `server_stats` is not
/// consulted by `is_started_state`, so it is left empty.
fn battlegroup_state_from_status(status: &RemoteBattlegroupStatus) -> BattlegroupState {
BattlegroupState {
stop: status.stop,
phase: status.phase.clone(),
database_phase: status.database_phase.clone(),
server_group_phase: status.server_group_phase.clone(),
director_phase: status.director_phase.clone(),
uptime: status.uptime.clone(),
server_stats: Vec::new(),
}
}
fn run_battlegroup_update_with_runner(
runner: &RusshRunner,
sink: &mut TauriOperationSink,
namespace: String,
battlegroup_name: String,
) -> Result<RemoteServerStatus, String> {
let battlegroup = BattlegroupRef {
namespace,
name: battlegroup_name,
};
let manager = manager_from_runner(runner);
sink.warn(
"bg.update",
"Running vendor `battlegroup update` (steamcmd + operators + maps + images).",
);
let stdout = manager
.update(&battlegroup, sink)
.map_err(command_error_message)?;
if !stdout.trim().is_empty() {
sink.info("bg.update", stdout.trim().to_string());
}
sink.info("bg.update", "Refreshing battlegroup state.");
read_remote_server_status(runner, &battlegroup.namespace, &battlegroup.name)
.map_err(command_error_message)
}
#[cfg(test)]
mod tests {
use super::*;
fn status(phase: &str, sgp: &str, director: &str, stop: bool) -> RemoteBattlegroupStatus {
RemoteBattlegroupStatus {
stop,
phase: phase.to_string(),
database_phase: "Ready".to_string(),
server_group_phase: sgp.to_string(),
director_phase: director.to_string(),
uptime: "8h45m".to_string(),
server_stats: Vec::new(),
}
}
#[test]
fn reconciling_bg_counts_as_started_so_stop_is_allowed() {
// #19: the structured kubectl read reports phase=Reconciling,
// serverGroupPhase=Running, directorPhase=Healthy while the BG is up.
// The stop guard must treat this as started (previously the wrapper
// text-parse produced status="World"/director="2/2" and refused).
let s = status("Reconciling", "Running", "Healthy", false);
assert!(is_started_state(&battlegroup_state_from_status(&s)));
}
#[test]
fn stopped_bg_is_not_started() {
assert!(!is_started_state(&battlegroup_state_from_status(&status(
"Stopped", "Stopped", "", true
))));
}
}
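
The pre-flight no-op guard in `run_battlegroup_action_with_runner` reduces to a small decision over two booleans. A sketch of that decision as a pure function follows; `preflight_guard` is a hypothetical name, and the first error message is shortened compared with the one in the file.

```rust
/// Hypothetical pure version of the pre-flight guard: `stop` is the
/// requested action, `started` is the state read before acting.
fn preflight_guard(stop: bool, started: bool) -> Result<(), String> {
    match (stop, started) {
        // Stop requested, but nothing is running.
        (true, false) => Err("Battlegroup is not running.".to_string()),
        // Start requested, but it is already up.
        (false, true) => Err("Battlegroup is already started.".to_string()),
        // Stop on a running BG, or start on a stopped BG: proceed.
        _ => Ok(()),
    }
}
```

Isolating the decision this way would let it be unit-tested without an SSH runner, in the same spirit as the `is_started_state` tests above.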

View File

@@ -0,0 +1,168 @@
use dune_manager_core::models::CommandResult;
use dune_manager_core::orchestration::{RemoteCommandRunner, RusshRunner};
use dune_manager_core::security::redact_text;
use crate::commands::shared::{command_error_message, runner_for_remote_kind, sh_single_quoted};
use crate::dto::{
RemoteComponentLogRequest, RemoteComponentLogResult, RemoteComponentRestartRequest,
RemoteComponentRestartResult,
};
#[tauri::command]
pub async fn remote_component_log_tail(
request: RemoteComponentLogRequest,
) -> Result<RemoteComponentLogResult, String> {
tauri::async_runtime::spawn_blocking(move || {
let runner = runner_for_remote_kind(
request.server_type.as_deref(),
request.host,
request.user,
request.key_path,
Some(request.port),
)?;
read_remote_component_log_tail(
&runner,
&request.namespace,
&request.component,
request.tail,
)
.map_err(command_error_message)
})
.await
.map_err(|err| format!("Remote component log worker failed: {err}"))?
}
#[tauri::command]
pub async fn restart_remote_component(
request: RemoteComponentRestartRequest,
) -> Result<RemoteComponentRestartResult, String> {
tauri::async_runtime::spawn_blocking(move || {
let runner = runner_for_remote_kind(
request.server_type.as_deref(),
request.host,
request.user,
request.key_path,
Some(request.port),
)?;
restart_remote_component_inner(&runner, &request.namespace, &request.component)
.map_err(command_error_message)
})
.await
.map_err(|err| format!("Remote component restart worker failed: {err}"))?
}
fn read_remote_component_log_tail(
runner: &RusshRunner,
namespace: &str,
component: &str,
tail: u32,
) -> CommandResult<RemoteComponentLogResult> {
let component = component.trim();
let (mode, pattern) = component_pod_selection(component)?;
let tail = tail.clamp(20, 500);
let script = format!(
r#"
ns={ns}
mode={mode}
pattern={pattern}
tail_lines={tail}
component={component}
if [ "$mode" = "role" ]; then
pods=$(sudo kubectl get pods -n "$ns" -l "role=$pattern" --no-headers -o custom-columns=NAME:.metadata.name 2>/dev/null || true)
elif [ "$mode" = "roles" ]; then
pods=$(sudo kubectl get pods -n "$ns" --no-headers -o custom-columns=NAME:.metadata.name,ROLE:.metadata.labels.role 2>/dev/null | grep -E "$pattern" | awk '{{print $1}}' || true)
else
pods=$(sudo kubectl get pods -n "$ns" --no-headers -o custom-columns=NAME:.metadata.name 2>/dev/null | grep -- "$pattern" || true)
fi
if [ -z "$pods" ]; then
echo "No pods found for $component."
exit 0
fi
for pod in $pods; do
echo "== $pod =="
sudo kubectl logs -n "$ns" "$pod" --all-containers --tail="$tail_lines" 2>&1 || true
done
"#,
ns = sh_single_quoted(namespace),
mode = sh_single_quoted(mode),
pattern = sh_single_quoted(pattern),
tail = tail,
component = sh_single_quoted(component),
);
let output = runner.run_script(&script)?;
Ok(RemoteComponentLogResult {
component: component.to_string(),
output: redact_text(&output),
})
}
fn restart_remote_component_inner(
runner: &RusshRunner,
namespace: &str,
component: &str,
) -> CommandResult<RemoteComponentRestartResult> {
let component = component.trim();
let (mode, pattern) = component_pod_selection(component)?;
let script = format!(
r#"
ns={ns}
mode={mode}
pattern={pattern}
component={component}
if [ "$mode" = "role" ]; then
pods=$(sudo kubectl get pods -n "$ns" -l "role=$pattern" --no-headers -o custom-columns=NAME:.metadata.name 2>/dev/null || true)
elif [ "$mode" = "roles" ]; then
pods=$(sudo kubectl get pods -n "$ns" --no-headers -o custom-columns=NAME:.metadata.name,ROLE:.metadata.labels.role 2>/dev/null | grep -E "$pattern" | awk '{{print $1}}' || true)
else
pods=$(sudo kubectl get pods -n "$ns" --no-headers -o custom-columns=NAME:.metadata.name 2>/dev/null | grep -- "$pattern" || true)
fi
if [ -z "$pods" ]; then
echo "No pods found for $component."
exit 0
fi
for pod in $pods; do
echo "Restarting $pod"
sudo kubectl delete pod -n "$ns" "$pod" --wait=false
done
"#,
ns = sh_single_quoted(namespace),
mode = sh_single_quoted(mode),
pattern = sh_single_quoted(pattern),
component = sh_single_quoted(component),
);
let output = runner.run_script(&script)?;
Ok(RemoteComponentRestartResult {
component: component.to_string(),
output: redact_text(&output),
})
}
fn component_pod_selection(component: &str) -> CommandResult<(&'static str, &'static str)> {
match component {
"database" => Ok(("role", "igw-database")),
"database-utilities" => Ok((
"roles",
"igw-database-utility|igw-database-monitor|igw-database-pghero",
)),
"message-queue" => Ok(("role", "igw-message-queue")),
"director" => Ok(("role", "igw-battlegroup-director")),
"gateway" | "gateway-resource" => Ok(("role", "igw-server-gateway")),
"text-router" => Ok(("role", "igw-text-router")),
"file-browser" => Ok(("role", "igw-filebrowser")),
"server-group" => Ok(("role", "igw-server")),
"map-survival-1" => Ok(("name", "-sg-survival-1-")),
"map-overmap" => Ok(("name", "-sg-overmap-")),
"map-deepdesert" => Ok(("name", "-sg-deepdesert-")),
"map-social-arrakeen" => Ok(("name", "-sg-sh-arrakeen-")),
"map-social-harkovillage" => Ok(("name", "-sg-sh-harkovillage-")),
_ => Err(dune_manager_core::errors::failure(format!(
"Unknown component key: {component}"
))),
}
}
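
Both scripts above rely on `sh_single_quoted` to make interpolated values safe for the remote shell. Its implementation lives in `commands::shared` and is not shown in this part of the diff. The following is a sketch of the usual POSIX single-quote escaping such a helper would be expected to perform, under the assumption that it wraps the value in single quotes; the name `sh_single_quoted_sketch` is hypothetical.

```rust
/// Hypothetical stand-in for `sh_single_quoted`; the real helper lives in
/// `commands::shared` and is not shown here.
fn sh_single_quoted_sketch(value: &str) -> String {
    // Inside single quotes the shell treats everything literally, so the only
    // character needing care is `'` itself: close the quote, emit an escaped
    // quote, and reopen.
    format!("'{}'", value.replace('\'', r"'\''"))
}
```

With this scheme a value such as `$(reboot)` reaches the script as inert text, which matters because `namespace` and `component` originate from the frontend request.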

View File

@@ -0,0 +1,34 @@
use dune_manager_core::orchestration::RemoteCommandRunner;
use crate::commands::shared::{command_error_message, runner_for_remote_kind};
use crate::commands::status_data::remote_records_from_battlegroups;
use crate::dto::{RemoteConnectionRequest, RemoteServerRecord};
#[tauri::command]
pub async fn detect_remote_ubuntu_servers(
request: RemoteConnectionRequest,
) -> Result<Vec<RemoteServerRecord>, String> {
tauri::async_runtime::spawn_blocking(move || {
let request = RemoteConnectionRequest {
server_type: Some("ubuntu".to_string()),
..request
};
let user = request.user.clone().unwrap_or_default();
let runner = runner_for_remote_kind(
request.server_type.as_deref(),
request.host.clone(),
user,
request.key_path.clone(),
Some(request.port),
)?;
let value = runner
.run_json(
"sudo kubectl get battlegroups -A -o json",
"remote ubuntu battlegroups",
)
.map_err(command_error_message)?;
Ok(remote_records_from_battlegroups(&request, &value))
})
.await
.map_err(|err| format!("Remote server detection worker failed: {err}"))?
}


@@ -0,0 +1,36 @@
//! Frontend-facing helpers for the persisted operation log file.
use std::sync::Arc;
use tauri::State;
use crate::log_file::LogFile;
/// Appends a single row to the persisted operation log.
///
/// Frontend-originated log rows (those produced directly by React without a
/// matching Rust event) call this so the on-disk log mirrors the in-memory
/// view exactly.
#[tauri::command]
pub fn record_operation_log(
log_file: State<'_, Arc<LogFile>>,
level: String,
scope: String,
message: String,
) -> Result<(), String> {
let allowed_levels = ["debug", "info", "warn", "error"];
let normalized = if allowed_levels.contains(&level.as_str()) {
level.as_str()
} else {
"info"
};
log_file
.append(normalized, &scope, &message)
.map_err(|err| err.to_string())
}
/// Returns the absolute path of the directory containing operation.log.
#[tauri::command]
pub fn get_logs_folder(log_file: State<'_, Arc<LogFile>>) -> String {
log_file.dir().to_string_lossy().into_owned()
}
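
The level handling in `record_operation_log` can be read as a small pure function: anything outside the allow-list falls back to `info`. A minimal sketch, assuming the same four levels; `ALLOWED_LEVELS` and `normalize_level` are hypothetical names that do not appear in the file above.

```rust
/// Levels accepted by the on-disk operation log (copied from the command above).
const ALLOWED_LEVELS: [&str; 4] = ["debug", "info", "warn", "error"];

/// Returns `level` unchanged when it is allow-listed, otherwise "info".
/// The comparison is case-sensitive, matching the command's behaviour.
fn normalize_level(level: &str) -> &str {
    if ALLOWED_LEVELS.contains(&level) {
        level
    } else {
        "info"
    }
}
```

Pulling the check out like this would make the fallback testable without a `tauri::State` or a real `LogFile`.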


@@ -0,0 +1,490 @@
use std::time::Duration;
use reqwest::Client;
use serde_json::Value;
use tauri::Manager;
use crate::state::TunnelRegistry;
pub fn ensure_client(app: &tauri::AppHandle) -> Client {
if let Some(client) = app.try_state::<Client>() {
return client.inner().clone();
}
let client = Client::builder()
.timeout(Duration::from_secs(20))
.build()
.expect("reqwest client builds");
app.manage(client.clone());
client
}
fn tunnel_local_port(registry: &TunnelRegistry, tunnel_id: &str) -> Result<u16, String> {
let tunnels = registry
.tunnels
.lock()
.map_err(|_| "tunnel registry unavailable".to_string())?;
let tunnel = tunnels
.get(tunnel_id.trim())
.ok_or_else(|| format!("no active tunnel id={tunnel_id}"))?;
Ok(tunnel.status.local_port)
}
async fn get_json(client: &Client, port: u16, path: &str) -> Result<Value, String> {
let url = format!("http://127.0.0.1:{port}{path}");
let resp = client
.get(&url)
.send()
.await
.map_err(|err| format!("GET {path}: {err}"))?;
if !resp.status().is_success() {
let status = resp.status();
let body_text = resp.text().await.unwrap_or_default();
return Err(format!("GET {path} -> {status}: {body_text}"));
}
resp.json::<Value>()
.await
.map_err(|err| format!("decoding {path}: {err}"))
}
async fn post_json(client: &Client, port: u16, path: &str, body: &Value) -> Result<Value, String> {
let url = format!("http://127.0.0.1:{port}{path}");
let resp = client
.post(&url)
.json(body)
.send()
.await
.map_err(|err| format!("POST {path}: {err}"))?;
if !resp.status().is_success() {
let status = resp.status();
let body_text = resp.text().await.unwrap_or_default();
return Err(format!("POST {path} -> {status}: {body_text}"));
}
resp.json::<Value>()
.await
.map_err(|err| format!("decoding {path}: {err}"))
}
#[tauri::command]
pub async fn ms_health(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(&client, port, "/api/health").await
}
#[tauri::command]
pub async fn ms_list_runs(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
limit: Option<u32>,
task: Option<String>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
let mut path = String::from("/api/runs");
let mut sep = '?';
if let Some(l) = limit {
path.push(sep);
path.push_str(&format!("limit={l}"));
sep = '&';
}
if let Some(t) = task {
path.push(sep);
// Percent-encode the task name, as the other string query parameters are.
path.push_str(&format!("task={}", urlencoding(&t)));
}
get_json(&client, port, &path).await
}
#[tauri::command]
pub async fn ms_list_logs(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
limit: Option<u32>,
run_id: Option<i64>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
let mut path = String::from("/api/logs");
let mut sep = '?';
if let Some(l) = limit {
path.push(sep);
path.push_str(&format!("limit={l}"));
sep = '&';
}
if let Some(r) = run_id {
path.push(sep);
path.push_str(&format!("runId={r}"));
}
get_json(&client, port, &path).await
}
#[tauri::command]
pub async fn ms_trigger_run(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
task: String,
options: Option<Value>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
let mut body = serde_json::Map::new();
body.insert("task".to_string(), Value::String(task));
if let Some(opts) = options {
body.insert("options".to_string(), opts);
}
post_json(&client, port, "/api/runs/trigger", &Value::Object(body)).await
}
#[tauri::command]
pub async fn ms_list_commands(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(&client, port, "/api/admin/commands").await
}
#[tauri::command]
pub async fn ms_search_items(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
q: Option<String>,
limit: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(
&client,
port,
&search_path("/api/admin/items", q.as_deref(), limit),
)
.await
}
#[tauri::command]
pub async fn ms_search_vehicles(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
q: Option<String>,
limit: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(
&client,
port,
&search_path("/api/admin/vehicles", q.as_deref(), limit),
)
.await
}
#[tauri::command]
pub async fn ms_search_skill_modules(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
q: Option<String>,
limit: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(
&client,
port,
&search_path("/api/admin/skill-modules", q.as_deref(), limit),
)
.await
}
#[tauri::command]
pub async fn ms_search_journey_nodes(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
q: Option<String>,
limit: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(
&client,
port,
&search_path("/api/admin/journey-nodes", q.as_deref(), limit),
)
.await
}
#[tauri::command]
pub async fn ms_search_xp_event_tags(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
q: Option<String>,
limit: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(
&client,
port,
&search_path("/api/admin/xp-event-tags", q.as_deref(), limit),
)
.await
}
#[tauri::command]
pub async fn ms_search_players(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
q: Option<String>,
limit: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(
&client,
port,
&search_path("/api/admin/players", q.as_deref(), limit),
)
.await
}
#[tauri::command]
pub async fn ms_cluster(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(&client, port, "/api/admin/cluster").await
}
#[tauri::command]
pub async fn ms_player_location(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
fls_id: String,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
let path = format!("/api/admin/player-location?flsId={}", urlencoding(&fls_id));
get_json(&client, port, &path).await
}
#[tauri::command]
pub async fn ms_get_config(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(&client, port, "/api/config").await
}
#[tauri::command]
pub async fn ms_set_config(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
config: Value,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
post_json(&client, port, "/api/config", &config).await
}
#[tauri::command]
pub async fn ms_list_timezones(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(&client, port, "/api/timezones").await
}
#[tauri::command]
pub async fn ms_cron_preview(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
expr: String,
count: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
let mut path = format!("/api/cron/preview?expr={}", urlencoding(&expr));
if let Some(c) = count {
path.push_str(&format!("&count={c}"));
}
get_json(&client, port, &path).await
}
#[tauri::command]
pub async fn ms_dump_prune_preview(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
get_json(&client, port, "/api/maintenance/dump-prune").await
}
#[tauri::command]
pub async fn ms_dump_prune_execute(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
items: Value,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
let body = serde_json::json!({ "items": items });
post_json(&client, port, "/api/maintenance/dump-prune", &body).await
}
#[tauri::command]
pub async fn ms_history(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
limit: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
let path = match limit {
Some(l) => format!("/api/admin/history?limit={l}"),
None => String::from("/api/admin/history"),
};
get_json(&client, port, &path).await
}
#[tauri::command]
pub async fn ms_welcome_grants(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
limit: Option<u32>,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
let path = match limit {
Some(l) => format!("/api/admin/welcome-grants?limit={l}"),
None => String::from("/api/admin/welcome-grants"),
};
get_json(&client, port, &path).await
}
#[tauri::command]
pub async fn ms_welcome_grant_retry(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
player_id: String,
package_version: String,
account_id: i64,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
post_json(
&client,
port,
"/api/admin/welcome-grants/retry",
&serde_json::json!({
"playerId": player_id,
"packageVersion": package_version,
"accountId": account_id,
}),
)
.await
}
#[tauri::command]
pub async fn ms_publish(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
command: String,
fields: Value,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
post_json(
&client,
port,
"/api/admin/publish",
&serde_json::json!({ "command": command, "fields": fields }),
)
.await
}
#[tauri::command]
pub async fn ms_welcome_whisper(
app: tauri::AppHandle,
registry: tauri::State<'_, TunnelRegistry>,
tunnel_id: String,
recipient_player_id: String,
source_player_id: String,
message: String,
) -> Result<Value, String> {
let port = tunnel_local_port(&registry, &tunnel_id)?;
let client = ensure_client(&app);
post_json(
&client,
port,
"/api/admin/welcome-whisper",
&serde_json::json!({
"recipientPlayerId": recipient_player_id,
"sourcePlayerId": source_player_id,
"message": message,
}),
)
.await
}
fn search_path(base: &str, q: Option<&str>, limit: Option<u32>) -> String {
let mut out = base.to_string();
let mut sep = '?';
if let Some(qq) = q {
out.push(sep);
out.push_str(&format!("q={}", urlencoding(qq)));
sep = '&';
}
if let Some(l) = limit {
out.push(sep);
out.push_str(&format!("limit={l}"));
}
out
}
fn urlencoding(input: &str) -> String {
let mut out = String::with_capacity(input.len());
for c in input.chars() {
match c {
'A'..='Z' | 'a'..='z' | '0'..='9' | '-' | '_' | '.' | '~' => out.push(c),
_ => {
let mut buf = [0u8; 4];
for byte in c.encode_utf8(&mut buf).bytes() {
out.push_str(&format!("%{:02X}", byte));
}
}
}
}
out
}
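
Several commands above (`ms_list_runs`, `ms_list_logs`, `search_path`) each hand-roll the same `?`/`&` separator logic. One possible consolidation is sketched below, assuming all parameter values can be rendered as strings first. `percent_encode` and `with_query` are hypothetical names, not functions from this file; `percent_encode` is intended to behave like `urlencoding` above, working on bytes instead of chars.

```rust
/// Percent-encodes every byte outside the RFC 3986 unreserved set
/// (ASCII letters, digits, '-', '_', '.', '~').
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            // Non-ASCII characters arrive here one UTF-8 byte at a time.
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Hypothetical helper: appends each present parameter to `base`,
/// using '?' before the first one and '&' before the rest.
fn with_query(base: &str, params: &[(&str, Option<String>)]) -> String {
    let mut out = base.to_string();
    let mut sep = '?';
    for (key, value) in params {
        if let Some(value) = value {
            out.push(sep);
            out.push_str(key);
            out.push('=');
            out.push_str(&percent_encode(value));
            sep = '&';
        }
    }
    out
}
```

With a helper like this, a call such as `with_query("/api/runs", &[("limit", limit.map(|l| l.to_string())), ("task", task)])` would encode every value by construction, so no individual command could forget to.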


@@ -0,0 +1,670 @@
use std::path::PathBuf;
use base64::Engine as _;
use dune_manager_core::orchestration::{RemoteCommandRunner, RusshRunner, RusshTarget};
use serde::{Deserialize, Serialize};
use tauri::{Emitter, Manager};
use crate::commands::shared::{command_error_message, sh_single_quoted};
const REMOTE_BINARY_PATH: &str = "/opt/dune-server-service/dune-server-service";
const REMOTE_SYSTEMD_UNIT_PATH: &str = "/etc/systemd/system/dune-server-service.service";
const REMOTE_OPENRC_PATH: &str = "/etc/init.d/dune-server-service";
const BUNDLED_VERSION: &str = env!("DUNE_SERVER_SERVICE_VERSION");
#[derive(Debug, Clone, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct ManagementInstallRequest {
pub host: String,
pub user: String,
pub key_path: Option<String>,
#[serde(default = "default_ssh_port")]
pub port: u16,
/// Optional command-auth token. If None, install only refreshes the binary.
pub command_auth_token: Option<String>,
}
#[derive(Debug, Clone, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct ManagementConnRequest {
pub host: String,
pub user: String,
pub key_path: Option<String>,
#[serde(default = "default_ssh_port")]
pub port: u16,
}
#[derive(Debug, Clone, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct ManagementInstallResult {
pub installed: bool,
pub started: bool,
pub init_system: String,
pub installed_version: Option<String>,
pub message: String,
}
#[derive(Debug, Clone, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct ManagementServiceStatus {
pub installed: bool,
pub active: bool,
pub init_system: String,
pub installed_version: Option<String>,
pub bundled_version: String,
pub journal_tail: String,
}
#[derive(Debug, Clone, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct InstallProgressEvent {
pub step: String,
pub status: String,
pub message: Option<String>,
}
fn default_ssh_port() -> u16 {
22
}
#[derive(Debug, Clone)]
struct ServiceAccount {
user: String,
group: String,
home: String,
}
fn target_from_conn(req: &ManagementConnRequest) -> Result<RusshTarget, String> {
let mut target = RusshTarget::new(
PathBuf::from(
req.key_path
.as_deref()
.unwrap_or_default()
.trim()
.to_string(),
),
req.user.trim().to_string(),
req.host.trim().to_string(),
);
if req.port != 0 {
target.port = req.port;
}
target.validate().map_err(|err| err.message)?;
Ok(target)
}
fn target_from_install(req: &ManagementInstallRequest) -> Result<RusshTarget, String> {
let conn = ManagementConnRequest {
host: req.host.clone(),
user: req.user.clone(),
key_path: req.key_path.clone(),
port: req.port,
};
target_from_conn(&conn)
}
fn resolve_resource(app: &tauri::AppHandle, path: &str) -> Result<PathBuf, String> {
let resource = app
.path()
.resolve(path, tauri::path::BaseDirectory::Resource)
.map_err(|err| format!("resolving bundled {path}: {err}"))?;
if !resource.exists() {
return Err(format!("bundled {path} missing at {}", resource.display()));
}
Ok(resource)
}
#[tauri::command]
pub async fn install_management_service(
app: tauri::AppHandle,
request: ManagementInstallRequest,
) -> Result<ManagementInstallResult, String> {
let binary_path = resolve_resource(&app, "binaries/dune-server-service")?;
let unit_path = resolve_resource(&app, "binaries/dune-server-service.service")?;
let openrc_path = resolve_resource(&app, "binaries/dune-server-service.openrc")?;
let target = target_from_install(&request)?;
let token = request.command_auth_token.clone();
let app_handle = app.clone();
tauri::async_runtime::spawn_blocking(move || {
install_inner(
&app_handle,
&target,
&binary_path,
&unit_path,
&openrc_path,
token.as_deref(),
)
})
.await
.map_err(|err| format!("install worker failed: {err}"))?
}
#[tauri::command]
pub fn management_service_bundled_version() -> String {
BUNDLED_VERSION.trim().to_string()
}
#[tauri::command]
pub async fn uninstall_management_service(request: ManagementConnRequest) -> Result<(), String> {
let target = target_from_conn(&request)?;
tauri::async_runtime::spawn_blocking(move || uninstall_inner(&target))
.await
.map_err(|err| format!("uninstall worker failed: {err}"))?
}
#[tauri::command]
pub async fn restart_management_service(request: ManagementConnRequest) -> Result<(), String> {
let target = target_from_conn(&request)?;
tauri::async_runtime::spawn_blocking(move || {
let script = "set -eu\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
if command -v systemctl >/dev/null 2>&1; then\n \
sudo systemctl restart dune-server-service.service\n\
elif command -v rc-service >/dev/null 2>&1; then\n \
sudo rc-service dune-server-service restart\n\
else\n \
echo \"no supported init system\" >&2\n \
exit 1\n\
fi\n\
exit 0\n";
let runner = RusshRunner::new(target.clone());
runner
.run_script(script)
.map_err(command_error_message)
.map(|_| ())
})
.await
.map_err(|err| format!("restart worker failed: {err}"))?
}
#[tauri::command]
pub async fn management_service_status(
request: ManagementConnRequest,
) -> Result<ManagementServiceStatus, String> {
let target = target_from_conn(&request)?;
tauri::async_runtime::spawn_blocking(move || status_inner(&target))
.await
.map_err(|err| format!("status worker failed: {err}"))?
}
fn install_inner(
app: &tauri::AppHandle,
target: &RusshTarget,
binary_path: &std::path::Path,
unit_path: &std::path::Path,
openrc_path: &std::path::Path,
token: Option<&str>,
) -> Result<ManagementInstallResult, String> {
let runner = RusshRunner::new(target.clone());
let account = discover_service_account(&runner, &target.user)?;
emit_progress(app, "stop-old", "running", None);
let stop_script = "set +e\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
sudo systemctl disable --now server-management-service.service >/dev/null 2>&1 || true\n\
sudo systemctl stop dune-server-service.service >/dev/null 2>&1 || true\n\
sudo rc-service dune-server-service stop >/dev/null 2>&1 || true\n\
exit 0\n";
runner
.run_script(stop_script)
.map_err(|err| step_err(app, "stop-old", err))?;
emit_progress(app, "stop-old", "ok", None);
emit_progress(app, "prepare-host", "running", None);
// Pre-create every directory the systemd unit lists under
// `ReadWritePaths=`. systemd sets up a mount namespace BEFORE the binary
// runs, and a missing path there is fatal (exit 226/NAMESPACE — see the
// "/root/.steam: No such file or directory" failure mode). The service's
// sqlite + OpenRC supervisor also need the state dir and log file owned
// by the service user up front; missing them produces the silent
// not-starting symptom (goofycoolguy / MadBuffoon / issues #5, #6).
let prepare_script = format!(
"set -eu\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
sudo install -d -m 0755 -o {user} -g {group} {home}/.dune\n\
sudo install -d -m 0700 -o {user} -g {group} {state_dir}\n\
sudo install -d -m 0755 -o {user} -g {group} {home}/.local\n\
sudo install -d -m 0755 -o {user} -g {group} {home}/.local/bin\n\
sudo install -d -m 0755 -o {user} -g {group} {home}/.steam\n\
sudo install -d -m 0755 -o {user} -g {group} {home}/Steam\n\
sudo touch /var/log/dune-server-service.log\n\
sudo chown {user}:{group} /var/log/dune-server-service.log\n\
sudo chmod 0644 /var/log/dune-server-service.log\n",
user = sh_single_quoted(&account.user),
group = sh_single_quoted(&account.group),
home = sh_single_quoted(&account.home),
state_dir = sh_single_quoted(&format!("{}/.dune/state", account.home)),
);
runner
.run_script(&prepare_script)
.map_err(|err| step_err(app, "prepare-host", err))?;
emit_progress(app, "prepare-host", "ok", None);
let binary_bytes = std::fs::read(binary_path)
.map_err(|err| format!("reading resource {}: {err}", binary_path.display()))?;
// The bytes are already in memory, so their length is the upload size.
let binary_size = binary_bytes.len() as u64;
let size_msg = if binary_size > 0 {
format!("{:.1} MB", binary_size as f64 / 1024.0 / 1024.0)
} else {
"unknown size".to_string()
};
emit_progress(
app,
"upload-binary",
"running",
Some(format!(
"streaming {size_msg} from {} to {REMOTE_BINARY_PATH}",
binary_path.display()
)),
);
let upload_script = format!(
"set -eu\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
sudo install -d -m 0755 /opt/dune-server-service\n\
tmp=$(mktemp /tmp/dune-server-service.XXXXXX)\n\
trap 'rm -f \"$tmp\"' EXIT\n\
cat > \"$tmp\"\n\
actual=$(wc -c < \"$tmp\" | tr -d '[:space:]')\n\
if [ \"$actual\" != {expected_bytes} ]; then\n \
echo \"upload byte-count mismatch: expected {expected_bytes}, got $actual\" >&2\n \
exit 42\n\
fi\n\
sudo install -m 0755 -o root -g root \"$tmp\" {dest}\n\
installed=$(sudo stat -c '%s bytes mode=%a owner=%U:%G' {dest})\n\
echo \"remote install: $installed\"\n",
expected_bytes = binary_bytes.len(),
dest = sh_single_quoted(REMOTE_BINARY_PATH),
);
let upload_stdout = runner
.run_with_stdin(
&format!("sh -c {}", sh_single_quoted(&upload_script)),
&binary_bytes,
)
.map_err(|err| step_err(app, "upload-binary", err))?;
let upload_msg = if upload_stdout.trim().is_empty() {
size_msg
} else {
format!("{size_msg}; {}", upload_stdout.trim())
};
emit_progress(app, "upload-binary", "ok", Some(upload_msg));
if let Some(t) = token {
emit_progress(app, "write-token", "running", None);
let token_b64 = base64::engine::general_purpose::STANDARD.encode(t.as_bytes());
let token_path = format!("{}/.dune/state/command-auth-token", account.home);
// Stage to a real temp file before `sudo install` instead of piping
// through `sudo install /dev/stdin ...`. On Ubuntu hosts with sudo
// `Defaults use_pty` (default on 24.04+), root-to-root sudo allocates
// a pty and the piped bytes never reach the child's fd 0, which
// surfaces as `install: No such file or directory` even though both
// /dev/stdin and the destination dir exist. The temp-file pattern
// sidesteps the pty entirely.
let token_script = format!(
"set -eu\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
sudo install -d -m 0700 -o {user} -g {group} {state_dir}\n\
tmp=$(mktemp /tmp/dune-token.XXXXXX)\n\
trap 'rm -f \"$tmp\"' EXIT\n\
echo {b64} | base64 -d > \"$tmp\"\n\
sudo install -m 0600 -o {user} -g {group} \"$tmp\" {dest}\n",
user = sh_single_quoted(&account.user),
group = sh_single_quoted(&account.group),
state_dir = sh_single_quoted(&format!("{}/.dune/state", account.home)),
b64 = sh_single_quoted(&token_b64),
dest = sh_single_quoted(&token_path),
);
runner
.run_script(&token_script)
.map_err(|err| step_err(app, "write-token", err))?;
emit_progress(app, "write-token", "ok", None);
} else {
emit_progress(
app,
"write-token",
"ok",
Some("skipped (no token)".to_string()),
);
}
emit_progress(app, "install-init", "running", None);
let unit_b64 = base64::engine::general_purpose::STANDARD
.encode(render_systemd_unit(unit_path, &account)?.as_bytes());
let openrc_b64 = base64::engine::general_purpose::STANDARD
.encode(render_openrc_unit(openrc_path, &account)?.as_bytes());
// Stage unit content + drop-in to real temp files before `sudo install`.
// The previous `echo b64 | base64 -d | sudo install /dev/stdin ...` shape
// breaks on hosts where sudoers has `Defaults use_pty` enabled (default
// on Ubuntu 24.04+): root-to-root sudo allocates a pty and the piped
// bytes never reach the child's fd 0, surfacing as
// `install: No such file or directory`. mktemp + sudo install <tmp>
// sidesteps the pty entirely.
let init_script = format!(
"set -eu\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
tmp_unit=$(mktemp /tmp/dune-unit.XXXXXX)\n\
tmp_dropin=$(mktemp /tmp/dune-dropin.XXXXXX)\n\
tmp_openrc=$(mktemp /tmp/dune-openrc.XXXXXX)\n\
trap 'rm -f \"$tmp_unit\" \"$tmp_dropin\" \"$tmp_openrc\"' EXIT\n\
if command -v systemctl >/dev/null 2>&1; then\n \
echo SYSTEMD\n \
echo {unit_b64} | base64 -d > \"$tmp_unit\"\n \
sudo install -m 0644 -o root -g root \"$tmp_unit\" {unit_dest}\n \
sudo install -d -m 0755 /etc/systemd/system/dune-server-service.service.d\n \
printf '%s\\n' '[Service]' 'NoNewPrivileges=false' 'MemoryDenyWriteExecute=false' > \"$tmp_dropin\"\n \
sudo install -m 0644 -o root -g root \"$tmp_dropin\" /etc/systemd/system/dune-server-service.service.d/zz-dune-steamcmd-compat.conf\n \
sudo systemctl daemon-reload\n \
sudo systemctl reset-failed dune-server-service.service >/dev/null 2>&1 || true\n\
elif command -v rc-service >/dev/null 2>&1; then\n \
echo OPENRC\n \
echo {openrc_b64} | base64 -d > \"$tmp_openrc\"\n \
sudo install -m 0755 -o root -g root \"$tmp_openrc\" {openrc_dest}\n \
sudo rc-update add dune-server-service default >/dev/null 2>&1 || true\n\
else\n \
echo \"no supported init system found (need systemd or openrc)\" >&2\n \
exit 1\n\
fi\n",
unit_b64 = sh_single_quoted(&unit_b64),
unit_dest = sh_single_quoted(REMOTE_SYSTEMD_UNIT_PATH),
openrc_b64 = sh_single_quoted(&openrc_b64),
openrc_dest = sh_single_quoted(REMOTE_OPENRC_PATH),
);
let init_stdout = runner
.run_script(&init_script)
.map_err(|err| step_err(app, "install-init", err))?;
let mut init_system = String::from("unknown");
for line in init_stdout.lines() {
match line.trim() {
"SYSTEMD" => init_system = "systemd".to_string(),
"OPENRC" => init_system = "openrc".to_string(),
_ => {}
}
}
emit_progress(app, "install-init", "ok", Some(init_system.clone()));
emit_progress(app, "start-service", "running", None);
let start_script = "set -eu\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
if command -v systemctl >/dev/null 2>&1; then\n \
sudo systemctl enable --now dune-server-service.service\n\
elif command -v rc-service >/dev/null 2>&1; then\n \
sudo rc-service dune-server-service restart >/dev/null 2>&1 || sudo rc-service dune-server-service start\n\
fi\n";
runner
.run_script(start_script)
.map_err(|err| step_err(app, "start-service", err))?;
emit_progress(app, "start-service", "ok", None);
emit_progress(app, "verify", "running", None);
// `STATE=...` line carries the canonical systemctl/openrc state. When the
// unit is anything other than active we also tail the journal so the UI
// surfaces *why* — empty parentheses helped nobody.
let verify_script = "set +e\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
if command -v systemctl >/dev/null 2>&1; then\n \
sleep 1\n \
state=$(sudo systemctl is-active dune-server-service.service 2>/dev/null | tr -d '\\r\\n')\n \
[ -z \"$state\" ] && state=unknown\n \
echo \"STATE=$state\"\n \
if [ \"$state\" != active ]; then\n \
echo '--- journalctl ---'\n \
sudo journalctl -u dune-server-service.service -n 20 --no-pager 2>&1 | tail -n 20\n \
fi\n\
elif command -v rc-service >/dev/null 2>&1; then\n \
sleep 1\n \
if sudo rc-service dune-server-service status >/dev/null 2>&1; then echo STATE=active; else echo STATE=inactive; fi\n \
if [ -f /var/log/dune-server-service.log ]; then\n \
echo '--- supervisor log ---'\n \
sudo tail -n 20 /var/log/dune-server-service.log 2>&1\n \
fi\n\
else\n \
echo STATE=unknown\n\
fi\n\
/opt/dune-server-service/dune-server-service --version 2>/dev/null || true\n\
exit 0\n";
let verify_stdout = runner
.run_script(verify_script)
.map_err(|err| step_err(app, "verify", err))?;
let mut active_state = String::new();
let mut installed_version: Option<String> = None;
let mut diagnostic_lines: Vec<String> = Vec::new();
let mut collecting_diag = false;
for line in verify_stdout.lines() {
let trimmed = line.trim();
if let Some(state) = trimmed.strip_prefix("STATE=") {
active_state = state.to_string();
continue;
}
if trimmed.starts_with("--- ") && trimmed.ends_with(" ---") {
collecting_diag = true;
continue;
}
if let Some(version) = trimmed.strip_prefix("dune-server-service ") {
installed_version = Some(version.trim().to_string());
continue;
}
if collecting_diag && !trimmed.is_empty() {
diagnostic_lines.push(trimmed.to_string());
}
}
let started = active_state == "active";
let verify_msg = match (started, &installed_version) {
(true, Some(v)) => Some(format!("active, version {v}")),
(true, None) => Some("active".to_string()),
(false, _) => {
let header = if active_state.is_empty() {
"not active".to_string()
} else {
format!("not active ({active_state})")
};
if diagnostic_lines.is_empty() {
Some(header)
} else {
// Keep the tail short so the toast/log stays readable; full
// detail is still on the host via `journalctl -u ...`.
let tail: Vec<String> = diagnostic_lines
.iter()
.rev()
.take(6)
.rev()
.cloned()
.collect();
Some(format!("{header}\n{}", tail.join("\n")))
}
}
};
emit_progress(
app,
"verify",
if started { "ok" } else { "error" },
verify_msg.clone(),
);
Ok(ManagementInstallResult {
installed: true,
started,
init_system: init_system.clone(),
installed_version,
message: format!("installed via {init_system}; active={active_state}"),
})
}
fn discover_service_account(
runner: &RusshRunner,
_registered_user: &str,
) -> Result<ServiceAccount, String> {
// The Dune service ALWAYS runs as the vendor's `dune` user with home
// `/home/dune`, no matter which account the operator SSH'd in as. SSH
// login may be root / ubuntu / a custom sudoer; install steps escalate
// via `sudo install -o dune -g dune` and the systemd/openrc unit pins
// User=dune. We still call getent on the host to fail loudly if `dune`
// isn't provisioned yet (e.g. vendor setup wasn't run).
let script = "set -eu\n\
user=dune\n\
home=$(getent passwd \"$user\" | awk -F: '{print $6}')\n\
group=$(id -gn \"$user\" 2>/dev/null || echo dune)\n\
if [ -z \"$home\" ]; then\n \
echo \"dune user not found on host — run the vendor setup first\" >&2\n \
exit 1\n\
fi\n\
printf 'USER=%s\\nGROUP=%s\\nHOME=%s\\n' \"$user\" \"$group\" \"$home\"\n";
let stdout = runner.run_script(script).map_err(command_error_message)?;
let mut account = ServiceAccount {
user: String::new(),
group: String::new(),
home: String::new(),
};
for line in stdout.lines() {
if let Some(value) = line.strip_prefix("USER=") {
account.user = value.trim().to_string();
} else if let Some(value) = line.strip_prefix("GROUP=") {
account.group = value.trim().to_string();
} else if let Some(value) = line.strip_prefix("HOME=") {
account.home = value.trim().trim_end_matches('/').to_string();
}
}
if account.user.is_empty() || account.group.is_empty() || account.home.is_empty() {
return Err(format!(
"could not resolve service account from remote output: {stdout}"
));
}
Ok(account)
}
fn render_systemd_unit(path: &std::path::Path, account: &ServiceAccount) -> Result<String, String> {
let unit = std::fs::read_to_string(path)
.map_err(|err| format!("reading resource {}: {err}", path.display()))?;
let home = account.home.as_str();
Ok(unit
.replace("User=dune", &format!("User={}", account.user))
.replace("Group=dune", &format!("Group={}", account.group))
.replace("/home/dune/.local/bin", &format!("{home}/.local/bin"))
.replace("/home/dune/.dune", &format!("{home}/.dune"))
.replace("/home/dune/.steam", &format!("{home}/.steam"))
.replace("/home/dune/Steam", &format!("{home}/Steam"))
.replace(
"Environment=\"DUNE_SERVICE_HOME=/home/dune\"",
&format!("Environment=\"DUNE_SERVICE_HOME={home}\""),
))
}
fn render_openrc_unit(path: &std::path::Path, account: &ServiceAccount) -> Result<String, String> {
let unit = std::fs::read_to_string(path)
.map_err(|err| format!("reading resource {}: {err}", path.display()))?;
let home = account.home.as_str();
Ok(unit
.replace(
"command_user=\"dune:dune\"",
&format!("command_user=\"{}:{}\"", account.user, account.group),
)
.replace(
"--owner dune:dune",
&format!("--owner {}:{}", account.user, account.group),
)
.replace("/home/dune/.dune", &format!("{home}/.dune"))
.replace(
"DUNE_SERVICE_HOME=\"${DUNE_SERVICE_HOME:-/home/dune}\"",
&format!("DUNE_SERVICE_HOME=\"${{DUNE_SERVICE_HOME:-{home}}}\""),
))
}
fn emit_progress(app: &tauri::AppHandle, step: &str, status: &str, message: Option<String>) {
let payload = InstallProgressEvent {
step: step.to_string(),
status: status.to_string(),
message,
};
let _ = app.emit("management-install-progress", payload);
}
fn step_err(
app: &tauri::AppHandle,
step: &str,
err: dune_manager_core::models::CommandFailure,
) -> String {
let msg = command_error_message(err);
emit_progress(app, step, "error", Some(msg.clone()));
msg
}
fn uninstall_inner(target: &RusshTarget) -> Result<(), String> {
let script = "set -eu\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
if command -v systemctl >/dev/null 2>&1; then\n \
sudo systemctl disable --now dune-server-service.service >/dev/null 2>&1 || true\n \
sudo rm -f /etc/systemd/system/dune-server-service.service\n \
sudo systemctl daemon-reload\n\
fi\n\
if command -v rc-service >/dev/null 2>&1; then\n \
sudo rc-service dune-server-service stop >/dev/null 2>&1 || true\n \
sudo rc-update del dune-server-service default >/dev/null 2>&1 || true\n \
sudo rm -f /etc/init.d/dune-server-service\n\
fi\n\
sudo rm -rf /opt/dune-server-service\n\
exit 0\n";
let runner = RusshRunner::new(target.clone());
runner
.run_script(script)
.map_err(command_error_message)
.map(|_| ())
}
fn status_inner(target: &RusshTarget) -> Result<ManagementServiceStatus, String> {
let script = "set +e\n\
export PATH=/sbin:/usr/sbin:/usr/local/sbin:$PATH\n\
if [ -x /opt/dune-server-service/dune-server-service ]; then\n \
echo INSTALLED=yes\n \
/opt/dune-server-service/dune-server-service --version 2>/dev/null | head -n 1\n\
else\n \
echo INSTALLED=no\n\
fi\n\
if command -v systemctl >/dev/null 2>&1; then\n \
echo INIT=systemd\n \
sudo systemctl is-active dune-server-service.service\n\
elif command -v rc-service >/dev/null 2>&1; then\n \
echo INIT=openrc\n \
sudo rc-service dune-server-service status >/dev/null 2>&1 && echo active || echo inactive\n\
else\n \
echo INIT=none\n\
fi\n\
exit 0\n";
let runner = RusshRunner::new(target.clone());
let stdout = runner.run_script(script).map_err(command_error_message)?;
let mut installed = false;
let mut active = false;
let mut init_system = String::from("unknown");
let mut installed_version: Option<String> = None;
for line in stdout.lines() {
let trimmed = line.trim();
match trimmed {
"INSTALLED=yes" => installed = true,
"INSTALLED=no" => installed = false,
"INIT=systemd" => init_system = "systemd".to_string(),
"INIT=openrc" => init_system = "openrc".to_string(),
"INIT=none" => init_system = "none".to_string(),
"active" => active = true,
"inactive" => active = false,
other if other.starts_with("dune-server-service ") => {
installed_version = other
.strip_prefix("dune-server-service ")
.map(|s| s.trim().to_string());
}
_ => {}
}
}
Ok(ManagementServiceStatus {
installed,
active,
init_system,
installed_version,
bundled_version: BUNDLED_VERSION.trim().to_string(),
journal_tail: String::new(),
})
}
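
The marker-line protocol that `status_inner` parses can be exercised without an SSH session. The following is a minimal sketch, not the project's implementation: `ParsedStatus` and `parse_status_output` are hypothetical names introduced here for illustration, and the function mirrors the `match` above on a plain string.

```rust
/// Hypothetical stand-in for the parsed fields of `ManagementServiceStatus`.
#[derive(Debug, Default, PartialEq)]
struct ParsedStatus {
    installed: bool,
    active: bool,
    init_system: String,
    installed_version: Option<String>,
}

/// Parses the marker lines printed by the status probe script.
/// Unrecognised lines (for example `failed` or `activating`) are ignored,
/// so `active` stays false unless the literal line `active` is seen.
fn parse_status_output(stdout: &str) -> ParsedStatus {
    let mut status = ParsedStatus {
        init_system: "unknown".to_string(),
        ..ParsedStatus::default()
    };
    for line in stdout.lines().map(str::trim) {
        if let Some(flag) = line.strip_prefix("INSTALLED=") {
            status.installed = flag == "yes";
        } else if let Some(init) = line.strip_prefix("INIT=") {
            // Only the three values the probe script can print are accepted.
            if matches!(init, "systemd" | "openrc" | "none") {
                status.init_system = init.to_string();
            }
        } else if let Some(version) = line.strip_prefix("dune-server-service ") {
            status.installed_version = Some(version.trim().to_string());
        } else if line == "active" {
            status.active = true;
        } else if line == "inactive" {
            status.active = false;
        }
    }
    status
}
```

Keeping the parser separate from the runner call would let the line protocol be unit-tested with canned output, in the same way the status mapping tests later in this diff use canned JSON.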

@@ -0,0 +1,39 @@
mod battlegroup;
mod component;
mod discovery;
mod logs;
mod management_api;
mod management_service;
mod preflight;
pub(crate) mod shared;
mod status;
mod status_data;
mod status_helpers;
mod status_naming;
mod tunnel;
mod tunnel_helpers;
pub use battlegroup::{
restart_remote_battlegroup, start_remote_battlegroup, stop_remote_battlegroup,
update_remote_battlegroup,
};
pub use component::{remote_component_log_tail, restart_remote_component};
pub use discovery::detect_remote_ubuntu_servers;
pub use logs::{get_logs_folder, record_operation_log};
pub use management_api::{
ms_cluster, ms_cron_preview, ms_dump_prune_execute, ms_dump_prune_preview, ms_get_config,
ms_health, ms_history, ms_list_commands, ms_list_logs, ms_list_runs, ms_list_timezones,
ms_player_location, ms_publish, ms_search_items, ms_search_journey_nodes, ms_search_players,
ms_search_skill_modules, ms_search_vehicles, ms_search_xp_event_tags, ms_set_config,
ms_trigger_run, ms_welcome_grant_retry, ms_welcome_grants, ms_welcome_whisper,
};
pub use management_service::{
install_management_service, management_service_bundled_version, management_service_status,
restart_management_service, uninstall_management_service,
};
pub use preflight::check_remote_sudo;
pub use status::{remote_server_components, remote_server_status};
pub use tunnel::{
server_tunnel_status, start_custom_tunnel, start_server_tunnel, stop_all_tunnels,
stop_server_tunnel,
};

@@ -0,0 +1,104 @@
//! Pre-attach connectivity + sudo checks executed against a candidate host.
use std::path::PathBuf;
use dune_manager_core::orchestration::{RemoteCommandRunner, RusshRunner, RusshTarget};
use serde::Serialize;
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct PreflightCheck {
/// SSH connection + key authentication succeeded.
pub ssh_ok: bool,
/// The SSH user can `sudo -n -u dune` without a password.
pub sudo_to_dune_ok: bool,
/// The `dune` user itself has passwordless sudo for arbitrary commands.
pub dune_nopasswd_ok: bool,
/// Whether the SSH login user IS `dune` (no impersonation needed).
pub is_dune_login: bool,
/// Raw stdout/stderr collected from the probe script — surfaced in the
/// UI when something fails so the operator can see exactly what
/// happened on the host.
pub raw_output: String,
}
#[derive(Debug, serde::Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct PreflightRequest {
pub host: String,
pub user: String,
pub key_path: String,
#[serde(default)]
pub port: Option<u16>,
}
/// Probes connectivity, SSH auth, and the various sudo capabilities we
/// rely on. The result is used to gate the attach flow with a clear error
/// banner when something is missing.
#[tauri::command]
pub async fn check_remote_sudo(request: PreflightRequest) -> Result<PreflightCheck, String> {
let host = request.host.trim().to_string();
let user = request.user.trim().to_string();
let key_path = request.key_path.trim().to_string();
let port = request.port;
if host.is_empty() || user.is_empty() || key_path.is_empty() {
return Err("Host, user, and SSH key path are required.".to_string());
}
tauri::async_runtime::spawn_blocking(move || run_preflight(host, user, key_path, port))
.await
.map_err(|err| format!("Preflight worker failed: {err}"))?
}
fn run_preflight(
host: String,
user: String,
key_path: String,
port: Option<u16>,
) -> Result<PreflightCheck, String> {
let mut target = RusshTarget::new(PathBuf::from(&key_path), user.clone(), host.clone());
if let Some(p) = port {
target.port = p;
}
target.validate().map_err(|err| err.message)?;
let runner = RusshRunner::new(target);
let probe = r#"set +e
echo SSH_OK
if sudo -n -u dune true >/dev/null 2>&1; then echo SUDO_TO_DUNE_OK; else echo SUDO_TO_DUNE_FAILED; fi
if sudo -n -u dune sudo -n true >/dev/null 2>&1; then echo DUNE_NOPASSWD_OK; else echo DUNE_NOPASSWD_FAILED; fi
echo PREFLIGHT_DONE
"#;
let stdout = runner.run_script(probe).map_err(|err| {
// Connection / auth failures land here. Surface them to the UI so
// the operator can fix host/key before retrying.
if !err.stderr.trim().is_empty() {
format!("{}: {}", err.message, err.stderr.trim())
} else {
err.message
}
})?;
let ssh_ok = stdout.contains("SSH_OK");
let is_dune_login = user == "dune";
// When the SSH login is already dune, we do not need a sudo-to-dune
// hop; treat it as ok regardless of the probe outcome.
let sudo_to_dune_ok = is_dune_login || stdout.contains("SUDO_TO_DUNE_OK");
let dune_nopasswd_ok = if is_dune_login {
// `sudo -n -u dune sudo -n true` may be rejected when the outer
// sudo refuses self-targeting. Fall back to a direct `sudo -n true`
// check when the operator is already logged in as dune. Re-run a
// quick second probe.
let direct = r#"if sudo -n true >/dev/null 2>&1; then echo DUNE_NOPASSWD_OK; else echo DUNE_NOPASSWD_FAILED; fi"#;
runner
.run_script(direct)
.map(|out| out.contains("DUNE_NOPASSWD_OK"))
.unwrap_or(false)
} else {
stdout.contains("DUNE_NOPASSWD_OK")
};
Ok(PreflightCheck {
ssh_ok,
sudo_to_dune_ok,
dune_nopasswd_ok,
is_dune_login,
raw_output: stdout,
})
}
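
The marker handling in `run_preflight` can be sketched as a pure function. This is an illustration under stated assumptions: `ProbeFlags` and `parse_probe_output` are hypothetical names, the sketch matches whole lines rather than using `contains`, and it skips the second direct `sudo -n true` probe that the real code runs for `dune` logins.

```rust
/// Hypothetical summary of the probe markers (not a type from the source).
#[derive(Debug, PartialEq)]
struct ProbeFlags {
    ssh_ok: bool,
    sudo_to_dune_ok: bool,
    dune_nopasswd_ok: bool,
}

/// Derives the flags from probe stdout by matching whole marker lines.
fn parse_probe_output(stdout: &str, login_user: &str) -> ProbeFlags {
    let has = |marker: &str| stdout.lines().any(|line| line.trim() == marker);
    let is_dune_login = login_user == "dune";
    ProbeFlags {
        ssh_ok: has("SSH_OK"),
        // No sudo hop is needed when the login user is already `dune`.
        sudo_to_dune_ok: is_dune_login || has("SUDO_TO_DUNE_OK"),
        dune_nopasswd_ok: has("DUNE_NOPASSWD_OK"),
    }
}
```

Matching whole lines is slightly stricter than substring matching: a marker that happens to appear inside other output (for example a login banner) would not be counted.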

@@ -0,0 +1,47 @@
use std::path::PathBuf;
use dune_manager_core::models::CommandFailure;
use dune_manager_core::orchestration::{RusshRunner, RusshTarget};
pub fn remote_runner(
host: String,
user: String,
key_path: String,
port: Option<u16>,
) -> Result<RusshRunner, String> {
let mut target = RusshTarget::new(PathBuf::from(key_path), user, host);
if let Some(p) = port {
target.port = p;
}
target.validate().map_err(|err| err.message)?;
Ok(RusshRunner::new(target))
}
pub fn runner_for_remote_kind(
_server_type: Option<&str>,
host: String,
user: String,
key_path: Option<String>,
port: Option<u16>,
) -> Result<RusshRunner, String> {
let key_path = key_path
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty())
.ok_or_else(|| "SSH private key is required for remote Ubuntu servers.".to_string())?;
remote_runner(host, user, key_path, port)
}
pub fn command_error_message(err: CommandFailure) -> String {
let mut parts = vec![err.message];
if !err.stderr.trim().is_empty() {
parts.push(err.stderr);
}
if !err.stdout.trim().is_empty() {
parts.push(err.stdout);
}
parts.join("\n")
}
pub fn sh_single_quoted(value: &str) -> String {
format!("'{}'", value.replace('\'', "'\"'\"'"))
}
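
To show what `sh_single_quoted` produces, here is a small self-contained sketch. The quoting function is copied from above; `kubectl_get_battlegroup` is a hypothetical helper added only for illustration, built around the command string used by the status reader in this change.

```rust
/// Wraps `value` in single quotes for POSIX sh. Each embedded `'` becomes
/// `'"'"'`: close the quote, emit a double-quoted apostrophe, reopen it.
fn sh_single_quoted(value: &str) -> String {
    format!("'{}'", value.replace('\'', "'\"'\"'"))
}

/// Hypothetical helper (not in the source): builds a kubectl command line
/// with both user-supplied arguments quoted.
fn kubectl_get_battlegroup(namespace: &str, name: &str) -> String {
    format!(
        "sudo kubectl get battlegroup -n {} {} -o json",
        sh_single_quoted(namespace),
        sh_single_quoted(name),
    )
}
```

With this quoting, a namespace such as `ns; rm -rf /` reaches the remote shell as one literal argument instead of being split into a second command.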

@@ -0,0 +1,40 @@
use crate::commands::shared::{command_error_message, runner_for_remote_kind};
use crate::commands::status_data::{read_remote_server_components, read_remote_server_status};
use crate::dto::{RemoteServerActionRequest, RemoteServerComponent, RemoteServerStatus};
#[tauri::command]
pub async fn remote_server_status(
request: RemoteServerActionRequest,
) -> Result<RemoteServerStatus, String> {
tauri::async_runtime::spawn_blocking(move || {
let runner = runner_for_remote_kind(
request.server_type.as_deref(),
request.host,
request.user,
request.key_path,
Some(request.port),
)?;
read_remote_server_status(&runner, &request.namespace, &request.battlegroup_name)
.map_err(command_error_message)
})
.await
.map_err(|err| format!("Remote status worker failed: {err}"))?
}
#[tauri::command]
pub async fn remote_server_components(
request: RemoteServerActionRequest,
) -> Result<Vec<RemoteServerComponent>, String> {
tauri::async_runtime::spawn_blocking(move || {
let runner = runner_for_remote_kind(
request.server_type.as_deref(),
request.host,
request.user,
request.key_path,
Some(request.port),
)?;
read_remote_server_components(&runner, &request.namespace).map_err(command_error_message)
})
.await
.map_err(|err| format!("Remote component diagnostics worker failed: {err}"))?
}

@@ -0,0 +1,694 @@
use dune_manager_core::errors::failure;
use dune_manager_core::models::CommandResult;
use dune_manager_core::orchestration::{RemoteCommandRunner, RusshRunner};
use serde_json::Value;
use crate::commands::shared::sh_single_quoted;
use crate::commands::status_helpers::{pod_component, server_resource_components};
use crate::commands::status_naming::friendly_map_name;
use crate::dto::{
RemoteBattlegroupServerStat, RemoteBattlegroupStatus, RemoteServerComponent,
RemoteServerPackageStatus, RemoteServerStatus,
};
pub fn read_remote_server_status(
runner: &RusshRunner,
namespace: &str,
battlegroup_name: &str,
) -> CommandResult<RemoteServerStatus> {
// The vendor wrapper's `status` text output is the source of truth in
// older operator versions, but the format keeps shifting across Funcom
// releases (newer wrappers show the partial world name in "Status",
// "N/M" ratios under "Director", and semantic words like "Healthy"
// under "Uptime" — none of which match the older
// `Running/Running/Running/Running/1h2m` shape we used to parse).
// Read the BattleGroup CR's `status` object directly so we stay
// pinned to the stable Kubernetes schema instead of the rotating
// text rendering.
let bg = runner.run_json(
&format!(
"sudo kubectl get battlegroup -n {} {} -o json",
sh_single_quoted(namespace),
sh_single_quoted(battlegroup_name),
),
"remote battlegroup",
)?;
// Per-partition live data (player count, gamePhase, ready) lives on a
// separate ServerStats CRD published by the Funcom operator — the same
// source `F:\Dune\Server\gt-server-status\gt_server_status.py` consumes.
// Failing to fetch this is non-fatal; the table just shows blank
// players where it can't be merged.
let stats = runner
.run_json(
&format!(
"sudo kubectl get serverstats -n {} -o json",
sh_single_quoted(namespace),
),
"remote serverstats",
)
.unwrap_or_else(|_| Value::Null);
let battlegroup = battlegroup_status_from_json_with_stats(&bg, &stats).ok_or_else(|| {
failure(format!(
"BattleGroup `{battlegroup_name}` returned no status object yet (likely still initialising)"
))
})?;
let package = read_guest_package_status(runner, namespace, battlegroup_name)?;
Ok(RemoteServerStatus {
battlegroup,
package,
})
}
/// Maps a raw `kubectl get battlegroup ... -o json` payload into the UI's
/// `RemoteBattlegroupStatus` and merges per-partition live data from a
/// `kubectl get serverstats` JSON payload (currently only the player
/// count is merged). Pass `Value::Null` when no stats are available.
pub(crate) fn battlegroup_status_from_json_with_stats(
bg: &Value,
serverstats: &Value,
) -> Option<RemoteBattlegroupStatus> {
bg.get("metadata")?.get("name")?.as_str()?;
let spec = bg.get("spec").cloned().unwrap_or(Value::Null);
let status = bg.get("status").cloned().unwrap_or(Value::Null);
let stop = spec
.get("stop")
.and_then(Value::as_bool)
.or_else(|| status.get("stop").and_then(Value::as_bool))
.unwrap_or(false);
// Funcom's CR carries `status.startTimestamp` at the BG level (when the
// BG first scheduled) but not per-server. We render it on every row as a
// best-effort age — accurate when partitions all came up together, off
// by however long a partition has restarted independently.
let bg_age = status
.get("startTimestamp")
.and_then(Value::as_str)
.map(format_age_since_iso)
.unwrap_or_default();
let stats_by_partition = index_serverstats_by_partition(serverstats);
let server_stats = status
.get("servers")
.and_then(Value::as_array)
.map(|servers| {
servers
.iter()
.map(|s| server_stat_from_json(s, &bg_age, &stats_by_partition))
.collect()
})
.unwrap_or_default();
// Database/director phases are nested in the live CR, not top-level
// fields. Fall back to top-level keys for older operator builds.
let database_phase = status
.get("database")
.and_then(|d| d.get("phase"))
.and_then(Value::as_str)
.map(str::to_string)
.unwrap_or_else(|| string_field(&status, "databasePhase"));
let director_phase = status
.get("utilities")
.and_then(|u| u.get("director"))
.and_then(|d| d.get("phase"))
.and_then(Value::as_str)
.map(str::to_string)
.unwrap_or_else(|| string_field(&status, "directorPhase"));
// Uptime: the CR no longer exposes a pre-formatted string, so we compute
// it from `status.startTimestamp` (the same field used for per-row age).
// If an older operator sets a literal `uptime` string, that value wins.
let uptime_literal = string_field(&status, "uptime");
let uptime = if uptime_literal.is_empty() {
bg_age.clone()
} else {
uptime_literal
};
Some(RemoteBattlegroupStatus {
stop,
phase: string_field(&status, "phase"),
database_phase,
server_group_phase: string_field(&status, "serverGroupPhase"),
director_phase,
uptime,
server_stats,
})
}
#[derive(Default, Clone)]
struct PartitionStats {
players: Option<i64>,
}
/// Build a `partition_index -> PartitionStats` map from a `kubectl get
/// serverstats -n <ns> -o json` payload. The Funcom operator emits one
/// ServerStats CR per partition with `spec.area.partition` as the id and
/// `status.runtime.players` as the live count. Same source the
/// `gt_server_status.py` cron script consumes.
fn index_serverstats_by_partition(stats: &Value) -> std::collections::HashMap<i64, PartitionStats> {
let mut out = std::collections::HashMap::new();
let Some(items) = stats.get("items").and_then(Value::as_array) else {
return out;
};
for item in items {
let partition = item
.get("spec")
.and_then(|s| s.get("area"))
.and_then(|a| a.get("partition"))
.and_then(Value::as_i64);
let Some(partition) = partition else { continue };
let players = item
.get("status")
.and_then(|s| s.get("runtime"))
.and_then(|r| r.get("players"))
.and_then(Value::as_i64);
out.insert(partition, PartitionStats { players });
}
out
}
fn string_field(value: &Value, key: &str) -> String {
match value.get(key) {
Some(Value::String(s)) => s.clone(),
Some(Value::Number(n)) => n.to_string(),
Some(Value::Bool(b)) => b.to_string(),
_ => String::new(),
}
}
fn server_stat_from_json(
server: &Value,
bg_age: &str,
stats_by_partition: &std::collections::HashMap<i64, PartitionStats>,
) -> RemoteBattlegroupServerStat {
// The Funcom operator names this field `partitionMap` in the BattleGroup
// CR's `status.servers[]` (confirmed against backed-up live CR YAML).
// Older or alternate operators have used `map` or `name`, so we keep
// those as fallbacks. Without any of them `friendly_map_name` falls back
// to the generic "Game Server" label, which is what we want to avoid here.
let raw_map = server
.get("partitionMap")
.and_then(Value::as_str)
.or_else(|| server.get("map").and_then(Value::as_str))
.or_else(|| server.get("name").and_then(Value::as_str))
.unwrap_or_default();
let partition_index = server
.get("partitionIndex")
.and_then(Value::as_u64)
.or_else(|| server.get("ordinalIndex").and_then(Value::as_u64));
let friendly = friendly_map_name(raw_map, raw_map);
let labelled = match partition_index {
Some(idx) => format!("{friendly} #{idx}"),
None => friendly,
};
let ready_str = match server.get("ready") {
Some(Value::Bool(b)) => b.to_string(),
Some(Value::String(s)) => s.clone(),
Some(Value::Number(n)) => n.to_string(),
_ => String::new(),
};
// The BG CR's status.servers[] entries don't carry a player count and
// usually no start time. Prefer a per-server `startTimestamp` when one is
// present, otherwise inherit the BG-level age. The player count is merged
// from the matching ServerStats CR (keyed by partitionIndex).
let age = if let Some(start) = server.get("startTimestamp").and_then(Value::as_str) {
format_age_since_iso(start)
} else {
bg_age.to_string()
};
let players = partition_index
.and_then(|idx| i64::try_from(idx).ok())
.and_then(|idx| stats_by_partition.get(&idx))
.and_then(|s| s.players)
.map(|n| n.to_string())
.unwrap_or_default();
RemoteBattlegroupServerStat {
map: labelled,
phase: string_field(server, "phase"),
ready: ready_str,
players,
age,
}
}
/// Format an RFC 3339 timestamp like `"2026-05-22T01:27:53Z"` as a compact
/// elapsed-time string (`5d 3h`, `2h 17m`, `45m`, `12s`). Returns empty
/// string when parsing fails — the UI just shows an empty cell.
fn format_age_since_iso(iso_ts: &str) -> String {
let parsed = chrono::DateTime::parse_from_rfc3339(iso_ts.trim());
let Ok(start) = parsed else {
return String::new();
};
let now = chrono::Utc::now();
let diff = now.signed_duration_since(start.with_timezone(&chrono::Utc));
let secs = diff.num_seconds().max(0);
if secs < 60 {
return format!("{secs}s");
}
let minutes = secs / 60;
if minutes < 60 {
return format!("{minutes}m");
}
let hours = minutes / 60;
let mins_rem = minutes % 60;
if hours < 24 {
return format!("{hours}h {mins_rem}m");
}
let days = hours / 24;
let hours_rem = hours % 24;
format!("{days}d {hours_rem}h")
}
fn read_guest_package_status(
runner: &RusshRunner,
namespace: &str,
battlegroup_name: &str,
) -> CommandResult<RemoteServerPackageStatus> {
let script = r#"
set -u
download=/home/dune/.dune/download
manifest="$download/steamapps/appmanifest_4754530.acf"
ns=__NAMESPACE__
bg=__BATTLEGROUP__
read_vdf_value() {
key="$1"
file="$2"
[ -f "$file" ] || return 0
awk -F '"' -v wanted="$key" '$2 == wanted { print $4; exit }' "$file" 2>/dev/null || true
}
read_file() {
file="$1"
[ -f "$file" ] || return 0
head -n 1 "$file" 2>/dev/null | tr -d '\r\n'
}
printf 'installedBuildId=%s\n' "$(read_vdf_value buildid "$manifest")"
printf 'battlegroupVersion=%s\n' "$(read_file "$download/images/battlegroup/version.txt")"
printf 'operatorVersion=%s\n' "$(read_file "$download/images/operators/version.txt")"
live_image=$(sudo kubectl get battlegroup "$bg" -n "$ns" -o jsonpath='{..image}' 2>/dev/null | tr ' ' '\n' | awk -F: '/self-hosting\/(igw-server|seabass-server):/ { print $NF; exit }' || true)
printf 'liveBattlegroupVersion=%s\n' "$live_image"
"#
.replace("__NAMESPACE__", &sh_single_quoted(namespace))
.replace("__BATTLEGROUP__", &sh_single_quoted(battlegroup_name));
let output = runner.run_script(&script)?;
let value = |key: &str| {
output.lines().find_map(|line| {
let (name, value) = line.split_once('=')?;
(name == key && !value.trim().is_empty()).then(|| value.trim().to_string())
})
};
Ok(RemoteServerPackageStatus {
installed_build_id: value("installedBuildId"),
battlegroup_version: value("battlegroupVersion"),
live_battlegroup_version: value("liveBattlegroupVersion"),
operator_version: value("operatorVersion"),
})
}
pub fn read_remote_server_components(
runner: &RusshRunner,
namespace: &str,
) -> CommandResult<Vec<RemoteServerComponent>> {
let pods = runner.run_json(
&format!(
"sudo kubectl get pods -n {} -o json",
sh_single_quoted(namespace)
),
"remote server pods",
)?;
let resources = runner.run_json(
&format!(
"sudo kubectl get servergroups,servergateways,serversets -n {} -o json",
sh_single_quoted(namespace)
),
"remote server resources",
)?;
let mut components = vec![
pod_component("Database", "database", &pods, |role, name| {
role.contains("database") && !name.contains("-util-")
}),
pod_component(
"Database utilities",
"database-utilities",
&pods,
|role, _| {
role.contains("database-utility")
|| role.contains("database-monitor")
|| role.contains("database-pghero")
},
),
pod_component("Message Queue", "message-queue", &pods, |role, name| {
role.contains("message-queue") || name.contains("-mq-")
}),
pod_component("Director", "director", &pods, |role, name| {
role.contains("battlegroup-director") || name.contains("-bgd-")
}),
pod_component("Gateway", "gateway", &pods, |role, name| {
role.contains("server-gateway") || name.contains("-sgw-")
}),
pod_component("Text Router", "text-router", &pods, |role, name| {
role.contains("text-router") || name.contains("-tr-")
}),
pod_component("File Browser", "file-browser", &pods, |role, name| {
role.contains("filebrowser") || name.contains("-fb-")
}),
];
components.extend(server_resource_components(&resources));
Ok(components
.into_iter()
.filter(|component| component.state != "Not present")
.collect())
}
pub fn remote_records_from_battlegroups(
request: &crate::dto::RemoteConnectionRequest,
value: &Value,
) -> Vec<crate::dto::RemoteServerRecord> {
value
.get("items")
.and_then(Value::as_array)
.into_iter()
.flatten()
.filter_map(|item| remote_record_from_battlegroup(request, item))
.collect()
}
fn remote_record_from_battlegroup(
request: &crate::dto::RemoteConnectionRequest,
item: &Value,
) -> Option<crate::dto::RemoteServerRecord> {
let namespace = item
.get("metadata")?
.get("namespace")?
.as_str()?
.to_string();
let battlegroup_name = item.get("metadata")?.get("name")?.as_str()?.to_string();
let title = item
.get("spec")
.and_then(|spec| spec.get("title"))
.and_then(Value::as_str)
.unwrap_or(&battlegroup_name)
.to_string();
let phase = item
.get("status")
.and_then(|status| status.get("phase"))
.and_then(Value::as_str)
.unwrap_or("Unknown")
.to_string();
let server_type = request
.server_type
.as_deref()
.unwrap_or("ubuntu")
.trim()
.to_string();
let user = request
.user
.as_deref()
.map(str::trim)
.unwrap_or_default()
.to_string();
Some(crate::dto::RemoteServerRecord {
id: remote_record_id(&server_type, &request.host, request.key_path.as_deref()),
name: title,
host: request.host.clone(),
user,
key_path: request.key_path.clone().unwrap_or_default(),
port: request.port,
server_type,
namespace,
battlegroup_name: battlegroup_name.clone(),
world_unique_name: battlegroup_name,
phase,
})
}
fn remote_record_id(_server_type: &str, host: &str, key_path: Option<&str>) -> String {
format!(
"ubuntu:{}:{}",
host.trim().to_lowercase(),
key_path.unwrap_or_default().trim().to_lowercase()
)
}
#[cfg(test)]
mod tests {
use super::*;
use serde_json::json;
fn bg(spec: Value, status: Value) -> Value {
json!({
"metadata": {"name": "sh-test-bg", "namespace": "funcom-seabass-sh-test"},
"spec": spec,
"status": status,
})
}
fn bg_status(bg: &Value) -> Option<RemoteBattlegroupStatus> {
battlegroup_status_from_json_with_stats(bg, &Value::Null)
}
#[test]
fn maps_reconciling_bg_with_null_director_phase() {
// Mirrors the user-reported payload: phase Reconciling, gateway
// Running, director not yet populated. Prior text-parse path was
// confusing the UI into greying the Director tunnel; under direct
// kubectl read the director_phase is just "" which the UI treats
// as "ready enough".
let value = bg(
json!({"stop": false}),
json!({
"phase": "Reconciling",
"serverGroupPhase": "Running",
"directorPhase": Value::Null,
"stop": Value::Null,
}),
);
let dto = bg_status(&value).expect("status maps");
assert!(!dto.stop);
assert_eq!(dto.phase, "Reconciling");
assert_eq!(dto.server_group_phase, "Running");
assert_eq!(dto.director_phase, "");
assert_eq!(dto.uptime, "");
}
#[test]
fn falls_back_to_status_stop_when_spec_missing() {
let value = bg(json!({}), json!({"phase": "Stopped", "stop": true}));
let dto = bg_status(&value).expect("status maps");
assert!(dto.stop);
assert_eq!(dto.phase, "Stopped");
}
#[test]
fn server_stats_pulled_from_status_servers_array() {
let value = bg(
json!({"stop": false}),
json!({
"phase": "Running",
"servers": [
{"map": "Survival_1", "phase": "Running", "ready": true},
{"name": "DeepDesert_1", "phase": "Stopped", "ready": false},
]
}),
);
let dto = bg_status(&value).expect("status maps");
assert_eq!(dto.server_stats.len(), 2);
assert_eq!(
dto.server_stats[0].map,
friendly_map_name("Survival_1", "Survival_1")
);
assert_eq!(dto.server_stats[0].phase, "Running");
assert_eq!(dto.server_stats[0].ready, "true");
// Players empty when no ServerStats CR is supplied — that data lives
// on a separate CRD and is merged via `_with_stats`.
assert_eq!(dto.server_stats[0].players, "");
assert_eq!(
dto.server_stats[1].map,
friendly_map_name("DeepDesert_1", "DeepDesert_1")
);
assert_eq!(dto.server_stats[1].ready, "false");
assert_eq!(dto.server_stats[1].age, "");
}
#[test]
fn server_stats_merge_player_count_from_serverstats_crd() {
// Mirrors the data shape gt_server_status.py reads: each ServerStats
// CR has spec.area.partition matching the BG's partitionIndex, and
// status.runtime.players is the live count.
let value = bg(
json!({"stop": false}),
json!({
"phase": "Healthy",
"servers": [
{"partitionMap": "Survival_1", "partitionIndex": 1, "phase": "Running", "ready": true},
{"partitionMap": "Survival_1", "partitionIndex": 31, "phase": "Running", "ready": true},
{"partitionMap": "Overmap", "partitionIndex": 2, "phase": "Running", "ready": true},
],
}),
);
let stats = json!({
"items": [
{"spec": {"area": {"partition": 1, "map": "Survival_1"}}, "status": {"runtime": {"players": 7}}},
{"spec": {"area": {"partition": 31, "map": "Survival_1"}}, "status": {"runtime": {"players": 0}}},
{"spec": {"area": {"partition": 2, "map": "Overmap"}}, "status": {"runtime": {"players": 3}}},
],
});
let dto = battlegroup_status_from_json_with_stats(&value, &stats).expect("status maps");
assert_eq!(dto.server_stats[0].players, "7");
assert_eq!(dto.server_stats[1].players, "0");
assert_eq!(dto.server_stats[2].players, "3");
}
#[test]
fn server_stats_player_count_blank_when_partition_missing_from_stats() {
let value = bg(
json!({"stop": false}),
json!({
"servers": [
{"partitionMap": "Survival_1", "partitionIndex": 1, "phase": "Running", "ready": true},
],
}),
);
let stats = json!({"items": []});
let dto = battlegroup_status_from_json_with_stats(&value, &stats).expect("status maps");
assert_eq!(dto.server_stats[0].players, "");
}
#[test]
fn server_stats_use_partition_map_and_index_from_real_cr() {
// Mirrors the actual Funcom operator status.servers[] shape captured
// from a live BattleGroup CR backup. Pre-fix the map column showed
// "Game Server" for every row because we were reading `map`/`name`
// instead of `partitionMap`.
let value = bg(
json!({"stop": false}),
json!({
"phase": "Healthy",
"servers": [
{
"partitionMap": "Survival_1",
"partitionIndex": 1,
"phase": "Running",
"ready": true,
},
{
"partitionMap": "Survival_1",
"partitionIndex": 31,
"phase": "Running",
"ready": true,
},
{
"partitionMap": "Overmap",
"partitionIndex": 2,
"phase": "Running",
"ready": true,
},
]
}),
);
let dto = bg_status(&value).expect("status maps");
assert_eq!(dto.server_stats.len(), 3);
assert_eq!(dto.server_stats[0].map, "Hagga Basin #1");
assert_eq!(dto.server_stats[1].map, "Hagga Basin #31");
assert_eq!(dto.server_stats[2].map, "Overmap #2");
assert!(dto.server_stats.iter().all(|s| s.phase == "Running"));
assert!(dto.server_stats.iter().all(|s| s.ready == "true"));
}
#[test]
fn returns_none_when_not_a_battlegroup_resource() {
let value = json!({"kind": "Pod", "spec": {}, "status": {}});
assert!(bg_status(&value).is_none());
}
#[test]
fn bg_start_timestamp_propagates_to_every_server_row_when_per_server_missing() {
// Simulate a BG-level `status.startTimestamp` one minute in the past; no
// server entry carries its own timestamp.
let one_min_ago = (chrono::Utc::now() - chrono::Duration::minutes(1))
.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
let value = bg(
json!({"stop": false}),
json!({
"phase": "Running",
"startTimestamp": one_min_ago,
"servers": [
{"partitionMap": "Survival_1", "partitionIndex": 1, "phase": "Running", "ready": true},
{"partitionMap": "Overmap", "partitionIndex": 2, "phase": "Running", "ready": true},
],
}),
);
let dto = bg_status(&value).expect("status maps");
// All rows pick up the same BG-level age.
assert_eq!(dto.server_stats.len(), 2);
for row in &dto.server_stats {
assert!(
row.age == "1m" || row.age == "60s",
"row age was {:?}",
row.age
);
}
}
#[test]
fn database_director_phases_pulled_from_nested_status() {
// Live CR shape: status.database.phase + status.utilities.director.phase,
// not top-level databasePhase/directorPhase.
let value = bg(
json!({"stop": false}),
json!({
"phase": "Healthy",
"serverGroupPhase": "Running",
"database": {"phase": "Ready", "address": "1.2.3.4:15432"},
"utilities": {
"director": {"phase": "Healthy", "address": "1.2.3.4:30393"},
},
}),
);
let dto = bg_status(&value).expect("status maps");
assert_eq!(dto.database_phase, "Ready");
assert_eq!(dto.director_phase, "Healthy");
}
#[test]
fn uptime_derived_from_start_timestamp_when_no_literal() {
let one_hr_ago =
(chrono::Utc::now() - chrono::Duration::hours(1) - chrono::Duration::minutes(2))
.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
let value = bg(
json!({"stop": false}),
json!({"phase": "Healthy", "startTimestamp": one_hr_ago}),
);
let dto = bg_status(&value).expect("status maps");
assert_eq!(dto.uptime, "1h 2m");
}
#[test]
fn uptime_prefers_literal_string_when_older_operator_set_it() {
let value = bg(
json!({"stop": false}),
json!({
"phase": "Healthy",
"uptime": "1h2m",
"startTimestamp": "2026-05-22T01:27:53Z",
}),
);
let dto = bg_status(&value).expect("status maps");
assert_eq!(dto.uptime, "1h2m");
}
#[test]
fn format_age_since_iso_handles_common_shapes() {
assert_eq!(format_age_since_iso(""), "");
assert_eq!(format_age_since_iso("not a timestamp"), "");
let recent = (chrono::Utc::now() - chrono::Duration::seconds(30))
.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
assert!(format_age_since_iso(&recent).ends_with('s'));
let hours =
(chrono::Utc::now() - chrono::Duration::hours(3) - chrono::Duration::minutes(15))
.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
assert_eq!(format_age_since_iso(&hours), "3h 15m");
let days = (chrono::Utc::now() - chrono::Duration::days(5) - chrono::Duration::hours(7))
.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
assert_eq!(format_age_since_iso(&days), "5d 7h");
}
}
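
The assertions above pin down the strings `format_age_since_iso` is expected to produce ("…s", "3h 15m", "5d 7h") but this part of the diff does not show how an elapsed duration is bucketed. A minimal sketch of that bucketing, assuming the two largest non-zero units are printed and that sub-hour ages are shown as bare minutes: `format_elapsed` is a hypothetical helper over whole seconds, not the crate's implementation, and it skips timestamp parsing entirely.

```rust
/// Hypothetical helper (not part of this diff): formats an elapsed number of
/// seconds in the style the tests above expect, e.g. "30s", "3h 15m", "5d 7h".
fn format_elapsed(total_secs: u64) -> String {
    let days = total_secs / 86_400;
    let hours = (total_secs % 86_400) / 3_600;
    let minutes = (total_secs % 3_600) / 60;
    let seconds = total_secs % 60;

    if days > 0 {
        // Days and hours; minutes and seconds are dropped.
        format!("{days}d {hours}h")
    } else if hours > 0 {
        // Hours and minutes; seconds are dropped.
        format!("{hours}h {minutes}m")
    } else if minutes > 0 {
        // Assumption: under an hour, only whole minutes are shown.
        format!("{minutes}m")
    } else {
        format!("{seconds}s")
    }
}
```

Under these assumptions, an elapsed time of 3 hours 15 minutes (11 700 seconds) formats as `3h 15m`, which is the shape `format_age_since_iso_handles_common_shapes` asserts.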


@@ -0,0 +1,271 @@
use serde_json::Value;
use crate::commands::status_naming::{friendly_map_name, serverset_log_key};
use crate::dto::RemoteServerComponent;
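/// Summarizes the pods selected by `matches(role, name)` as one component.
/// A pod counts as ready when all of its containers are ready or its phase is
/// `Succeeded`; container restarts are summed, and waiting/terminated reasons
/// other than `Completed` are collected for the detail lines.
pub fn pod_component(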
label: &str,
log_key: &str,
pods: &Value,
matches: impl Fn(&str, &str) -> bool,
) -> RemoteServerComponent {
let mut total = 0usize;
let mut ready = 0usize;
let mut restarts = 0u64;
let mut reasons = Vec::new();
let mut phases = Vec::new();
// Iterate the pod list by reference; a missing `items` array yields no pods.
for item in pods["items"].as_array().into_iter().flatten() {
let name = item["metadata"]["name"].as_str().unwrap_or_default();
let role = item["metadata"]["labels"]["role"]
.as_str()
.unwrap_or_default();
if !matches(role, name) {
continue;
}
total += 1;
let phase = item["status"]["phase"].as_str().unwrap_or_default();
if !phase.is_empty() {
phases.push(phase.to_string());
}
let statuses = item["status"]["containerStatuses"]
.as_array()
.cloned()
.unwrap_or_default();
let pod_ready = !statuses.is_empty()
&& statuses
.iter()
.all(|status| status["ready"].as_bool().unwrap_or(false));
if pod_ready || phase == "Succeeded" {
ready += 1;
}
for status in statuses {
restarts += status["restartCount"].as_u64().unwrap_or_default();
if let Some(reason) = status["state"]["waiting"]["reason"].as_str() {
reasons.push(reason.to_string());
}
if let Some(reason) = status["state"]["terminated"]["reason"].as_str() {
if reason != "Completed" {
reasons.push(reason.to_string());
}
}
}
}
if total == 0 {
return component(
label,
log_key,
"system",
"Not present",
"gray",
"No matching runtime component was found.",
vec![],
);
}
let details = compact_details(vec![
format!("{ready}/{total} pods ready"),
if restarts > 0 {
format!("{restarts} container restarts")
} else {
String::new()
},
if reasons.is_empty() {
String::new()
} else {
format!("Reason: {}", reasons.join(", "))
},
]);
if ready == total && reasons.is_empty() {
component(
label,
log_key,
"system",
"Ready",
"green",
"All pods are ready.",
details,
)
} else if reasons.iter().any(|reason| is_bad_reason(reason))
|| phases.iter().any(|phase| phase == "Failed")
{
component(
label,
log_key,
"system",
"Problem",
"red",
"One or more pods are failing.",
details,
)
} else {
component(
label,
log_key,
"system",
"Starting",
"amber",
"Waiting for pods to become ready.",
details,
)
}
}
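/// Builds components for ServerGroup, ServerGateway, and visible ServerSet
/// resources, sorted by resource name; other kinds are ignored.
pub fn server_resource_components(resources: &Value) -> Vec<RemoteServerComponent> {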
let mut items = resources["items"].as_array().cloned().unwrap_or_default();
items.sort_by(|left, right| {
left["metadata"]["name"]
.as_str()
.unwrap_or_default()
.cmp(right["metadata"]["name"].as_str().unwrap_or_default())
});
let mut output = Vec::new();
for item in items {
let kind = item["kind"].as_str().unwrap_or_default();
let name = item["metadata"]["name"].as_str().unwrap_or_default();
match kind {
"ServerGroup" => output.push(server_group_component(&item)),
"ServerGateway" => output.push(resource_phase_component("Gateway Resource", &item)),
"ServerSet" => {
if should_show_serverset(&item) {
output.push(serverset_component(name, &item));
}
}
_ => {}
}
}
output
}
fn server_group_component(item: &Value) -> RemoteServerComponent {
let phase = item["status"]["phase"].as_str().unwrap_or("Unknown");
phase_component(
"Server Group",
"server-group",
"system",
phase,
format!("Server Group reports {phase}."),
vec![],
)
}
fn resource_phase_component(label: &str, item: &Value) -> RemoteServerComponent {
let phase = item["status"]["phase"].as_str().unwrap_or("Unknown");
phase_component(
label,
"gateway-resource",
"system",
phase,
format!("{label} reports {phase}."),
vec![],
)
}
fn serverset_component(name: &str, item: &Value) -> RemoteServerComponent {
let map = item["spec"]["map"].as_str().unwrap_or_default();
let label = friendly_map_name(map, name);
let phase = item["status"]["phase"].as_str().unwrap_or("Unknown");
let target = item["status"]["targetReplicas"]
.as_u64()
.unwrap_or_default();
let ready = item["status"]["readyReplicas"].as_u64().unwrap_or_default();
let completed = item["status"]["completedReplicas"]
.as_u64()
.unwrap_or_default();
let pods = item["status"]["pods"]
.as_array()
.cloned()
.unwrap_or_default();
let game_ready = pods
.iter()
.filter(|pod| pod["ready"].as_bool().unwrap_or(false))
.count();
let details = compact_details(vec![
format!("{ready}/{target} Kubernetes-ready replicas"),
format!("{completed}/{target} completed game replicas"),
format!("{game_ready}/{target} game-ready servers"),
]);
let summary =
if phase == "Initializing" && ready >= target && target > 0 && game_ready < target as usize
{
"Game process is running, but game readiness has not completed.".to_string()
} else {
format!("{label} reports {phase}.")
};
phase_component(
&label,
&serverset_log_key(name, map),
"map",
phase,
summary,
details,
)
}
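/// A ServerSet is hidden only when it is `Stopped`, has no target replicas,
/// and is not one of the core maps (Survival_1, Overmap, DeepDesert_1).
fn should_show_serverset(item: &Value) -> bool {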
let phase = item["status"]["phase"].as_str().unwrap_or_default();
let target = item["status"]["targetReplicas"]
.as_u64()
.unwrap_or_default();
let map = item["spec"]["map"].as_str().unwrap_or_default();
phase != "Stopped" || target > 0 || matches!(map, "Survival_1" | "Overmap" | "DeepDesert_1")
}
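/// Maps a resource phase (ASCII case-insensitive) to a display state and tone.
/// Unrecognized phases are reported as `Unknown` with an amber tone.
fn phase_component(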
label: &str,
log_key: &str,
category: &str,
phase: &str,
summary: String,
details: Vec<String>,
) -> RemoteServerComponent {
let normalized = phase.to_ascii_lowercase();
let (state, tone) = match normalized.as_str() {
"healthy" | "running" | "ready" | "available" => ("Ready", "green"),
"stopped" | "suspended" => ("Stopped", "gray"),
"initializing" | "reconciling" | "pending" | "starting" => ("Starting", "amber"),
"failed" | "error" | "degraded" => ("Problem", "red"),
_ => ("Unknown", "amber"),
};
component(label, log_key, category, state, tone, summary, details)
}
fn component(
name: &str,
log_key: &str,
category: &str,
state: &str,
tone: &str,
summary: impl Into<String>,
details: Vec<String>,
) -> RemoteServerComponent {
RemoteServerComponent {
name: name.to_string(),
log_key: log_key.to_string(),
category: category.to_string(),
state: state.to_string(),
tone: tone.to_string(),
summary: summary.into(),
details,
}
}
fn compact_details(values: Vec<String>) -> Vec<String> {
values
.into_iter()
.filter(|value| !value.trim().is_empty())
.collect()
}
fn is_bad_reason(reason: &str) -> bool {
matches!(
reason,
"CrashLoopBackOff"
| "ImagePullBackOff"
| "ErrImagePull"
| "CreateContainerConfigError"
| "CreateContainerError"
| "RunContainerError"
| "OOMKilled"
| "Error"
)
}

View File

@@ -0,0 +1,62 @@
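/// Returns a display name for a map. Known maps are recognized by `map` or by
/// a substring of the resource name; otherwise the raw map name is used with
/// underscores replaced by spaces, or "Game Server" when `map` is empty.
pub fn friendly_map_name(map: &str, fallback_name: &str) -> String {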
let normalized = map.to_ascii_lowercase();
if normalized == "survival_1" || fallback_name.contains("survival-1") {
return "Hagga Basin".to_string();
}
if normalized == "overmap" || fallback_name.contains("overmap") {
return "Overmap".to_string();
}
if normalized.contains("deepdesert") || fallback_name.contains("deepdesert") {
return "Deep Desert".to_string();
}
if fallback_name.contains("sh-arrakeen") {
return "Social Hub: Arrakeen".to_string();
}
if fallback_name.contains("sh-harkovillage") {
return "Social Hub: Harko Village".to_string();
}
if !map.is_empty() {
return map.replace('_', " ");
}
"Game Server".to_string()
}
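/// Returns the log key for a ServerSet. Known maps get fixed keys; anything
/// else falls back to `map-<sanitized map name>`.
pub fn serverset_log_key(name: &str, map: &str) -> String {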
let combined = format!("{name} {map}").to_ascii_lowercase();
if map.eq_ignore_ascii_case("Survival_1") || combined.contains("survival-1") {
return "map-survival-1".to_string();
}
if map.eq_ignore_ascii_case("Overmap") || combined.contains("overmap") {
return "map-overmap".to_string();
}
if combined.contains("deepdesert") || combined.contains("deep-desert") {
return "map-deepdesert".to_string();
}
if combined.contains("sh-arrakeen") {
return "map-social-arrakeen".to_string();
}
if combined.contains("sh-harkovillage") {
return "map-social-harkovillage".to_string();
}
format!("map-{}", sanitize_component_key(map))
}
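/// Lowercases ASCII alphanumerics, replaces every other character with `-`,
/// trims leading and trailing dashes, and returns "unknown" for an empty result.
fn sanitize_component_key(value: &str) -> String {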
let key = value
.chars()
.map(|character| {
if character.is_ascii_alphanumeric() {
character.to_ascii_lowercase()
} else {
'-'
}
})
.collect::<String>()
.trim_matches('-')
.to_string();
if key.is_empty() {
"unknown".to_string()
} else {
key
}
}
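
`friendly_map_name` and `serverset_log_key` repeat the same list of known maps as two parallel if-chains, so a new map has to be added in both places. One way to keep the key mapping in a single place is a lookup table, sketched below. `KNOWN_LOG_KEYS` and `known_log_key` are hypothetical names, not part of this diff, and the sketch matches by substring only, so it is slightly looser than the exact `Survival_1` comparison above.

```rust
/// Hypothetical lookup table: substrings to search for, paired with the fixed
/// log key. The needles and keys are copied from the if-chain in
/// `serverset_log_key`.
const KNOWN_LOG_KEYS: &[(&[&str], &str)] = &[
    (&["survival_1", "survival-1"], "map-survival-1"),
    (&["overmap"], "map-overmap"),
    (&["deepdesert", "deep-desert"], "map-deepdesert"),
    (&["sh-arrakeen"], "map-social-arrakeen"),
    (&["sh-harkovillage"], "map-social-harkovillage"),
];

/// Returns the fixed key for a known map, or `None` so the caller can fall
/// back to a sanitized key built from the map name.
fn known_log_key(name: &str, map: &str) -> Option<&'static str> {
    // Search the resource name and the map name together, ignoring ASCII case.
    let combined = format!("{name} {map}").to_ascii_lowercase();
    KNOWN_LOG_KEYS
        .iter()
        .find(|(needles, _)| needles.iter().any(|needle| combined.contains(*needle)))
        .map(|(_, key)| *key)
}
```

A caller would try `known_log_key(name, map)` first and use the `map-<sanitized>` fallback only when it returns `None`. Entries are checked in table order, which mirrors the order of the original if-chain.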


@@ -0,0 +1,301 @@
use std::io::{Read, Write};
use std::net::TcpStream;
use std::path::PathBuf;
use std::time::Duration;
use dune_manager_core::orchestration::{LocalForwarder, RusshTarget};
use crate::commands::tunnel_helpers::{
discover_database_tunnel_port, discover_director_tunnel_port, discover_pg_hero_tunnel_port,
normalize_tunnel_service, tunnel_target, tunnel_url,
};
use crate::dto::{
CustomTunnelStartRequest, ServerTunnelStartRequest, ServerTunnelStatus, ServerTunnelStopRequest,
};
use crate::state::{ManagedTunnel, TunnelRegistry};
const MANAGEMENT_API_PORT: u16 = 29187;
const LEGACY_MANAGEMENT_API_PORT: u16 = 8787;
#[tauri::command]
pub async fn start_server_tunnel(
registry: tauri::State<'_, TunnelRegistry>,
request: ServerTunnelStartRequest,
) -> Result<ServerTunnelStatus, String> {
let registry = registry.inner().clone();
tauri::async_runtime::spawn_blocking(move || start_server_tunnel_inner(&registry, request))
.await
.map_err(|err| format!("Tunnel worker failed: {err}"))?
}
#[tauri::command]
pub async fn stop_server_tunnel(
registry: tauri::State<'_, TunnelRegistry>,
request: ServerTunnelStopRequest,
) -> Result<(), String> {
let registry = registry.inner().clone();
tauri::async_runtime::spawn_blocking(move || {
stop_server_tunnel_inner(&registry, &request.tunnel_id)
})
.await
.map_err(|err| format!("Tunnel stop worker failed: {err}"))?
}
#[tauri::command]
pub async fn server_tunnel_status(
registry: tauri::State<'_, TunnelRegistry>,
request: ServerTunnelStopRequest,
) -> Result<Option<ServerTunnelStatus>, String> {
let registry = registry.inner().clone();
tauri::async_runtime::spawn_blocking(move || {
existing_running_tunnel(&registry, request.tunnel_id.trim())
})
.await
.map_err(|err| format!("Tunnel status worker failed: {err}"))?
}
#[tauri::command]
pub async fn stop_all_tunnels(registry: tauri::State<'_, TunnelRegistry>) -> Result<(), String> {
registry.stop_all();
Ok(())
}
#[tauri::command]
pub async fn start_custom_tunnel(
registry: tauri::State<'_, TunnelRegistry>,
request: CustomTunnelStartRequest,
) -> Result<ServerTunnelStatus, String> {
let registry = registry.inner().clone();
tauri::async_runtime::spawn_blocking(move || start_custom_tunnel_inner(&registry, request))
.await
.map_err(|err| format!("Tunnel worker failed: {err}"))?
}
fn start_custom_tunnel_inner(
registry: &TunnelRegistry,
request: CustomTunnelStartRequest,
) -> Result<ServerTunnelStatus, String> {
let tunnel_id = request.tunnel_id.trim();
if tunnel_id.is_empty() {
return Err("Tunnel id is required.".to_string());
}
if let Some(status) = existing_running_tunnel(registry, tunnel_id)? {
return Ok(status);
}
let target = match request.server_kind.trim() {
"ubuntu" => {
let mut t = RusshTarget::new(
PathBuf::from(request.key_path.as_deref().unwrap_or_default().trim()),
request.user.trim().to_string(),
request.host.trim().to_string(),
);
if request.port != 0 {
t.port = request.port;
}
t.validate().map_err(|err| err.message)?;
t
}
other => return Err(format!("Unsupported remote server kind: {other}")),
};
let forwarder = LocalForwarder::start(
&target,
request.local_port,
"127.0.0.1",
request.remote_port,
)
.map_err(|err| err.message)?;
let local_port = forwarder.local_port();
let url = match request.protocol.trim() {
"https" => format!("https://127.0.0.1:{local_port}/"),
"postgresql" => format!("postgresql://127.0.0.1:{local_port}/"),
_ => format!("http://127.0.0.1:{local_port}/"),
};
let status = ServerTunnelStatus {
tunnel_id: tunnel_id.to_string(),
service: "custom".to_string(),
local_port,
remote_port: request.remote_port,
url,
};
let mut tunnels = registry
.tunnels
.lock()
.map_err(|_| "Tunnel registry is unavailable.".to_string())?;
if let Some(existing) = tunnels.remove(tunnel_id) {
existing.forwarder.stop();
}
tunnels.insert(
tunnel_id.to_string(),
ManagedTunnel {
forwarder,
status: status.clone(),
},
);
Ok(status)
}
fn start_server_tunnel_inner(
registry: &TunnelRegistry,
request: ServerTunnelStartRequest,
) -> Result<ServerTunnelStatus, String> {
let tunnel_id = request.tunnel_id.trim();
if tunnel_id.is_empty() {
return Err("Tunnel id is required.".to_string());
}
if let Some(status) = existing_running_tunnel(registry, tunnel_id)? {
return Ok(status);
}
let target = tunnel_target(&request)?;
let service = normalize_tunnel_service(&request.service)?;
let remote_port = match service.as_str() {
"director" => discover_director_tunnel_port(&target, &request.namespace)?,
"fileBrowser" => 18888,
"database" => discover_database_tunnel_port(&target, &request.namespace)?,
"pgHero" => discover_pg_hero_tunnel_port(&target, &request.namespace)?,
"managementApi" => MANAGEMENT_API_PORT,
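// Return an error rather than panicking inside the blocking worker if a new
// service name is ever accepted upstream without a port mapping here.
other => return Err(format!("Unsupported tunnel service: {other}")),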
};
if service == "managementApi" {
return start_management_api_tunnel(registry, tunnel_id, &target, &service);
}
let forwarder =
LocalForwarder::start(&target, 0, "127.0.0.1", remote_port).map_err(|err| err.message)?;
let local_port = forwarder.local_port();
let status = ServerTunnelStatus {
tunnel_id: tunnel_id.to_string(),
url: tunnel_url(&service, local_port),
service,
local_port,
remote_port,
};
let mut tunnels = registry
.tunnels
.lock()
.map_err(|_| "Tunnel registry is unavailable.".to_string())?;
if let Some(existing) = tunnels.remove(tunnel_id) {
existing.forwarder.stop();
}
tunnels.insert(
tunnel_id.to_string(),
ManagedTunnel {
forwarder,
status: status.clone(),
},
);
Ok(status)
}
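/// Tries the current management API port first, then the legacy port. The
/// first tunnel whose health probe succeeds is registered and returned; the
/// forwarder for a port that fails the probe is stopped before moving on.
fn start_management_api_tunnel(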
registry: &TunnelRegistry,
tunnel_id: &str,
target: &RusshTarget,
service: &str,
) -> Result<ServerTunnelStatus, String> {
let mut last_error = String::new();
for remote_port in [MANAGEMENT_API_PORT, LEGACY_MANAGEMENT_API_PORT] {
let forwarder = LocalForwarder::start(target, 0, "127.0.0.1", remote_port)
.map_err(|err| err.message)?;
let local_port = forwarder.local_port();
match probe_management_api(local_port) {
Ok(()) => {
let status = ServerTunnelStatus {
tunnel_id: tunnel_id.to_string(),
url: tunnel_url(service, local_port),
service: service.to_string(),
local_port,
remote_port,
};
let mut tunnels = registry
.tunnels
.lock()
.map_err(|_| "Tunnel registry is unavailable.".to_string())?;
if let Some(existing) = tunnels.remove(tunnel_id) {
existing.forwarder.stop();
}
tunnels.insert(
tunnel_id.to_string(),
ManagedTunnel {
forwarder,
status: status.clone(),
},
);
return Ok(status);
}
Err(err) => {
last_error = format!("127.0.0.1:{remote_port}: {err}");
forwarder.stop();
}
}
}
Err(format!(
"management service did not answer on port {MANAGEMENT_API_PORT} or legacy port {LEGACY_MANAGEMENT_API_PORT}; last probe: {last_error}"
))
}
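/// Sends a minimal `GET /api/health` request through the forwarded local port
/// and succeeds only if the response starts with an HTTP 200 status line.
fn probe_management_api(local_port: u16) -> Result<(), String> {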
let addr = format!("127.0.0.1:{local_port}");
let timeout = Duration::from_millis(1500);
let socket_addr: std::net::SocketAddr =
addr.parse().map_err(|err| format!("bad addr: {err}"))?;
let mut stream = TcpStream::connect_timeout(&socket_addr, timeout)
.map_err(|err| format!("connect failed: {err}"))?;
stream.set_read_timeout(Some(timeout)).ok();
stream.set_write_timeout(Some(timeout)).ok();
stream
.write_all(b"GET /api/health HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: close\r\n\r\n")
.map_err(|err| format!("write failed: {err}"))?;
let mut buf = [0u8; 256];
let n = stream
.read(&mut buf)
.map_err(|err| format!("read failed: {err}"))?;
if n == 0 {
return Err("remote closed without an HTTP response".to_string());
}
let head = String::from_utf8_lossy(&buf[..n]);
if head.starts_with("HTTP/1.1 200") || head.starts_with("HTTP/1.0 200") {
Ok(())
} else {
Err(format!("unexpected health response: {}", head.trim()))
}
}
fn stop_server_tunnel_inner(registry: &TunnelRegistry, tunnel_id: &str) -> Result<(), String> {
let mut tunnels = registry
.tunnels
.lock()
.map_err(|_| "Tunnel registry is unavailable.".to_string())?;
if let Some(tunnel) = tunnels.remove(tunnel_id.trim()) {
tunnel.forwarder.stop();
}
Ok(())
}
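/// Returns the status of a registered tunnel that is still running. A tunnel
/// whose forwarder has finished is removed from the registry and reported as
/// absent.
fn existing_running_tunnel(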
registry: &TunnelRegistry,
tunnel_id: &str,
) -> Result<Option<ServerTunnelStatus>, String> {
let mut tunnels = registry
.tunnels
.lock()
.map_err(|_| "Tunnel registry is unavailable.".to_string())?;
let Some(tunnel) = tunnels.get(tunnel_id) else {
return Ok(None);
};
if tunnel.forwarder.is_finished() {
if let Some(stale) = tunnels.remove(tunnel_id) {
stale.forwarder.stop();
}
Ok(None)
} else {
Ok(Some(tunnel.status.clone()))
}
}
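
`probe_management_api` decides health by comparing the start of the response against two literal prefixes. A slightly more general check parses the status code out of the status line, which also makes it straightforward to report the code in an error. The sketch below uses only the standard library; `http_status_code` and `is_healthy` are hypothetical helpers, not part of this diff, and unlike the original they accept any `HTTP/` version string.

```rust
/// Hypothetical helper: extracts the status code from the first line of a raw
/// HTTP response head, e.g. "HTTP/1.1 200 OK" yields Some(200).
fn http_status_code(head: &str) -> Option<u16> {
    // `lines()` splits on "\n" and strips a trailing "\r".
    let status_line = head.lines().next()?;
    let mut parts = status_line.split_whitespace();
    let version = parts.next()?;
    if !version.starts_with("HTTP/") {
        // Not an HTTP response (for example, an SSH or database banner).
        return None;
    }
    parts.next()?.parse().ok()
}

/// Hypothetical helper: treats only a 200 status as healthy, matching the
/// intent of the prefix comparison in `probe_management_api`.
fn is_healthy(head: &str) -> bool {
    http_status_code(head) == Some(200)
}
```

With this shape, a non-HTTP service answering on the forwarded port yields `None` rather than a failed prefix comparison, so the caller can distinguish "wrong service" from "unhealthy service" if that turns out to be useful.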

Some files were not shown because too many files have changed in this diff.