13 Commits

Author SHA1 Message Date
Vantz Stockwell
9e5e828c8d fix(docker): nginx healthcheck uses 127.0.0.1 not localhost — IPv4-only listener
All checks were successful
CI / backend-types (push) Successful in 10s
CI / frontend-build (push) Successful in 16s
CI / agent-tests (push) Successful in 44s
CI / integration (push) Successful in 21s
corrosion-nginx reported (unhealthy) despite serving the panel fine:
nginx listens 0.0.0.0:80 (IPv4 only, no listen [::]:80), but
'localhost' resolves to ::1 first inside the container, so the probe
got connection-refused. Verified: 127.0.0.1:80 serves the SPA. Probe
now targets IPv4 explicitly. No nginx config change — the panel was
never broken, only the healthcheck's hostname resolution.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 11:43:01 -04:00
Vantz Stockwell
fccd5c61c5 docs(claude): Lessons 24-25 — onModuleInit-before-connect dead subscriptions + resurrected-path crash
All checks were successful
CI / backend-types (push) Successful in 9s
CI / frontend-build (push) Successful in 16s
CI / agent-tests (push) Successful in 39s
CI / integration (push) Successful in 22s
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 11:17:02 -04:00
Vantz Stockwell
c72a280361 fix(api): WS gateways crashed on first forwarded event — WebSocket.OPEN undefined at runtime
All checks were successful
CI / backend-types (push) Successful in 9s
CI / frontend-build (push) Successful in 16s
CI / agent-tests (push) Successful in 40s
CI / integration (push) Successful in 20s
Lesson 10 in the flesh: the onApplicationBootstrap fix made the NATS->
WS bridge actually deliver events for the first time, which instantly
crashed the API. esModuleInterop is off, so 'import WebSocket from ws'
compiles to ws_1.default = undefined; WebSocket.OPEN threw
'Cannot read properties of undefined' and killed the process on the
first heartbeat forward. All three WS guard sites (nats-bridge x2,
console gateway) switched to the import-agnostic instance constant
client.OPEN. Latent in every build — never hit because the bridge was
dead-on-boot until today.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 11:11:29 -04:00
Vantz Stockwell
a3b4b5cc7d fix(api): NATS subscriptions moved to onApplicationBootstrap — they silently no-oped before connect
All checks were successful
CI / backend-types (push) Successful in 10s
CI / frontend-build (push) Successful in 16s
CI / agent-tests (push) Successful in 47s
CI / integration (push) Successful in 22s
Production bug caught live: provider onModuleInit order put bridge/
consumer subscription hooks BEFORE NatsService finished connecting, so
every subscribe() hit the [OFFLINE] no-op path — the WS bridge has been
dead-on-boot in every production build, and the new v2 consumer never
saw a heartbeat (server_connections stayed empty under a live agent).
onApplicationBootstrap is guaranteed to run after all module inits,
including the awaited NATS connect.

The new CI contract suite fails on exactly this class of bug.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 11:02:52 -04:00
Vantz Stockwell
4e184ca571 ci: full test gate — types, frontend build, agent tests, agent<->backend contract suite
Some checks failed
CI / backend-types (push) Successful in 11s
CI / frontend-build (push) Successful in 17s
CI / agent-tests (push) Successful in 1m48s
CI / integration (push) Has been cancelled
ci.yml runs on every push to main: backend tsc, frontend vue-tsc+vite,
cargo test (cached), then an integration job with postgres:16 + nats
service containers — real migrations applied to a fresh DB, real
backend booted (admin seed provides the license), real agent binary
spawned. contract-tests/agent-backend.contract.mjs proves the entire
v2 pipeline: heartbeat shape + measured telemetry, auto-registered
server_connections row flipping connected, instance start/stop/status
round-trips with push events, and the offline beacon flipping the row
back. This is the test that could not run before a production rebuild
until now — it now runs before every push lands.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:59:44 -04:00
Vantz Stockwell
fde0926d52 feat(host-agent): Phase 1b RCON — WebRCON (rust) + Source RCON (conan/soulmask)
rcon func on the instance command channel: WebSocket JSON WebRCON with
Identifier correlation (skips chat/log noise frames) and full Valve
Source RCON over TCP (auth, exec, multi-packet reassembly via empty
probe, 1MiB cap). Protocol inferred from game, explicit kind override
in [instance.rcon]. Always 127.0.0.1 — agent is co-located.

Hardening from review: WebRCON password never interpolated into error
contexts/logs (redacted URL); probe-tolerant termination — a quiet
period after received data ends the response for servers that don't
echo the probe (Soulmask conformance unverified), so data is never
discarded on probe timeout.

13/13 tests green incl. mock Source-RCON server (auth/multi-packet/
errors) and mock WebRCON server (noise-frame skipping).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:53:52 -04:00
Vantz Stockwell
4d99c9d99d feat(frontend): validate persisted session on app boot
A stale or revoked token previously rendered the full panel chrome and
only collapsed on the first API call. App boot now calls /auth/me
through useApi (401 -> refresh -> logout already handled there); user
profile refreshes on success, and non-auth failures (network, 5xx)
never log the user out.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:49:21 -04:00
Vantz Stockwell
b8f0ccba3c fix(frontend): env-driven marketing host detection
Exact-match on 'corrosionmgmt.com' meant www. or any staging host
silently served the panel instead of the marketing site. Hosts now come
from VITE_MARKETING_HOSTS (comma-separated, defaults cover bare + www).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:47:15 -04:00
Vantz Stockwell
068a476f39 feat(host-agent): Phase 1a process supervision — instance start/stop/restart/status + push state events
Per-instance ProcessSupervisor: tokio child spawn with proper arg list
(fixes Go's naive space-splitting), graceful SIGTERM with 30s budget
then force kill, monitor task classifying ordered-stop vs crash (exit
code captured), watch-channel state observable everywhere. Instance cmd
channel live on corrosion.{license}.{instance}.cmd (start/stop/restart/
status) with state events pushed on {instance}.status (keep-latest
semantics, documented). Heartbeats now carry live process state +
uptime per instance. Crate restructured lib+bin for integration tests.

Verified: 5 integration tests with real OS processes (lifecycle, crash
exit-code, restart recovery, unmanaged rejection, clean spawn failure)
+ live-NATS contract test (request-reply roundtrips, double-start
rejection, push events, heartbeat state) — all green.

Known limitation (documented): no PID adoption yet — agent restart
orphans a running game process to 'stopped' until panel restart.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:44:24 -04:00
Vantz Stockwell
f706c3c47e docs(claude): host-agent reality — active Rust crate, tag scheme, runner container truth, command corrections
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:37:37 -04:00
Vantz Stockwell
4c9c322c29 feat(seo): per-route titles + meta descriptions; ci: honest runner test
Every page previously titled 'Corrosion Management' with zero meta -
marketing invisible to search and link previews. Router afterEach now
sets title/description/og per route (no new deps); marketing pages get
real content-backed descriptions, panel views mechanical titles.
index.html carries defaults for pre-JS crawlers. Verified in-browser
per page via Playwright.

test-runner.yml: per-tool presence checks instead of green-lighting
missing toolchains; workflow_dispatch instead of every push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:35:58 -04:00
Vantz Stockwell
47fa72763c feat(api): host-agent protocol v2 consumer — heartbeat persistence, auto-register, staleness sweep
Nothing persisted agent heartbeats before: companion_last_seen was
written once at setup and connection_status stayed 'connected' forever.
HostAgentConsumerService now consumes corrosion.*.host.heartbeat
(updates last_seen + status, auto-creates the bare_metal connection row
on first contact), host.going_offline (graceful offline), and sweeps
connections offline after 180s of heartbeat silence. License-existence
tenant validation with caching per NATS-consumer doctrine. WS bridge
forwards host_heartbeat/host_going_offline to the panel.

Contract-verified against production NATS with the backend's own nats
lib: v2 subjects, schema 2, real telemetry, offline beacon.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:35:58 -04:00
Vantz Stockwell
b455bf9f14 ci(host-agent): bootstrap Rust in the runner container; roll to alpha.2
All checks were successful
Build Host Agent (Rust) / build (push) Successful in 1m29s
Test Asgard Runner / test (push) Successful in 3s
Asgard runner executes jobs in bare node:20-bullseye (no Rust, no sudo)
- install rustup + musl/mingw cross toolchains per-run, same pattern as
setup-go in the Go pipeline. agent-v2.0.0-alpha.1 predates this fix;
forward-only doctrine: version rolls to alpha.2 rather than re-pushing
the tag.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:15:36 -04:00
34 changed files with 2177 additions and 79 deletions

View File

@@ -42,3 +42,6 @@ FRONTEND_URL=http://localhost:5174
# Frontend (Vite — must be prefixed with VITE_) # Frontend (Vite — must be prefixed with VITE_)
VITE_PANEL_URL=https://panel.corrosionmgmt.com VITE_PANEL_URL=https://panel.corrosionmgmt.com
# Hostnames that serve the marketing site (comma-separated); all other hosts get the panel
VITE_MARKETING_HOSTS=corrosionmgmt.com,www.corrosionmgmt.com

View File

@@ -35,11 +35,16 @@ jobs:
exit 1 exit 1
fi fi
- name: Install cross toolchains # The Asgard runner executes jobs in a bare node:20-bullseye container
# (no Rust, no sudo, runs as root) — bootstrap the toolchain per-run,
# same pattern as actions/setup-go in the Go pipeline.
- name: Install Rust + cross toolchains
run: | run: |
sudo apt-get update -qq apt-get update -qq
sudo apt-get install -y -qq musl-tools gcc-mingw-w64-x86-64 apt-get install -y -qq build-essential musl-tools gcc-mingw-w64-x86-64 curl
rustup target add x86_64-unknown-linux-musl x86_64-pc-windows-gnu curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable --profile minimal
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
"$HOME/.cargo/bin/rustup" target add x86_64-unknown-linux-musl x86_64-pc-windows-gnu
- name: Build Linux AMD64 (static musl) - name: Build Linux AMD64 (static musl)
run: | run: |

122
.gitea/workflows/ci.yml Normal file
View File

@@ -0,0 +1,122 @@
name: CI
# Test gate for every push to main. The deploy story: main must be green here
# before the stack is rebuilt (deploy workflow enforces it once SSH transport
# secrets land). Jobs run in the runner's bare node:20-bullseye container —
# toolchains bootstrap per-run.
on:
push:
branches: [main]
pull_request:
jobs:
backend-types:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Type-check NestJS backend
run: |
cd backend-nest
npm ci --no-audit --no-fund 2>&1 | tail -2
npx tsc --noEmit
frontend-build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Build frontend (vue-tsc gate + vite)
run: |
cd frontend
npm ci --no-audit --no-fund 2>&1 | tail -2
npm run build
agent-tests:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Cache cargo
uses: actions/cache@v4
with:
path: |
~/.cargo/registry
~/.cargo/git
corrosion-host-agent/target
key: cargo-${{ hashFiles('corrosion-host-agent/Cargo.lock') }}
- name: Install Rust
run: |
apt-get update -qq && apt-get install -y -qq build-essential curl
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable --profile minimal
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- name: Test agent
run: |
cd corrosion-host-agent
cargo test
- name: Upload agent binary for integration
uses: actions/upload-artifact@v3
with:
name: agent-debug
path: corrosion-host-agent/target/debug/corrosion-host-agent
integration:
runs-on: ubuntu-latest
needs: agent-tests
services:
postgres:
image: postgres:16
env:
POSTGRES_USER: corrosion
POSTGRES_PASSWORD: citest
POSTGRES_DB: corrosion
nats:
image: nats:2.10-alpine
steps:
- uses: actions/checkout@v4
- name: Download agent binary
uses: actions/download-artifact@v3
with:
name: agent-debug
path: agent-bin
- name: Apply migrations to fresh DB
run: |
apt-get update -qq && apt-get install -y -qq postgresql-client
until PGPASSWORD=citest psql -h postgres -U corrosion -d corrosion -c 'SELECT 1' >/dev/null 2>&1; do sleep 1; done
for f in $(ls backend/migrations/*.sql | sort); do
echo "applying $f"
PGPASSWORD=citest psql -h postgres -U corrosion -d corrosion -v ON_ERROR_STOP=1 -q -f "$f"
done
- name: Build + boot backend
run: |
cd backend-nest
npm ci --no-audit --no-fund 2>&1 | tail -2
npm run build
DATABASE_URL=postgres://corrosion:citest@postgres:5432/corrosion \
NATS_URL=nats://nats:4222 \
JWT_SECRET=ci-secret ENCRYPTION_KEY=ci-encryption-key \
ADMIN_EMAIL=ci@corrosion.test ADMIN_PASSWORD=ci-password-123 ADMIN_USERNAME=CI \
nohup node dist/main.js > /tmp/backend.log 2>&1 &
for i in $(seq 1 30); do
code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/api/auth/login -X POST -H 'Content-Type: application/json' -d '{}' || true)
[ "$code" = "400" ] && echo "backend up" && exit 0
sleep 2
done
echo "backend failed to come up"; cat /tmp/backend.log; exit 1
- name: Run agent↔backend contract suite
run: |
chmod +x agent-bin/corrosion-host-agent
LICENSE_ID=$(PGPASSWORD=citest psql -h postgres -U corrosion -d corrosion -t -A -c 'SELECT id FROM licenses LIMIT 1')
echo "license under test: $LICENSE_ID"
[ -n "$LICENSE_ID" ] || { echo "admin seed did not create a license"; cat /tmp/backend.log; exit 1; }
LICENSE_ID="$LICENSE_ID" \
DATABASE_URL=postgres://corrosion:citest@postgres:5432/corrosion \
NATS_URL=nats://nats:4222 \
AGENT_BIN=$PWD/agent-bin/corrosion-host-agent \
node contract-tests/agent-backend.contract.mjs
- name: Backend log on failure
if: failure()
run: cat /tmp/backend.log || true

View File

@@ -1,5 +1,6 @@
name: Test Asgard Runner name: Test Asgard Runner
on: [push] # On-demand only — no reason to spin a container on every push.
on: [workflow_dispatch]
jobs: jobs:
test: test:
@@ -17,8 +18,15 @@ jobs:
echo "Memory: $(free -h | grep Mem | awk '{print $2}')" echo "Memory: $(free -h | grep Mem | awk '{print $2}')"
echo "Disk: $(df -h / | tail -1 | awk '{print $4}')" echo "Disk: $(df -h / | tail -1 | awk '{print $4}')"
echo "===========================================" echo "==========================================="
echo "Go: $(go version)" # Jobs run in a bare node:20-bullseye container: toolchains are NOT
echo "Rust: $(rustc --version)" # preinstalled — workflows must bootstrap them (setup-go, rustup).
echo "Docker: $(docker --version)" # Report presence honestly instead of green-lighting a missing tool.
for tool in go rustc docker node; do
if command -v "$tool" >/dev/null 2>&1; then
echo "$tool: $($tool --version 2>&1 | head -1)"
else
echo "$tool: NOT PRESENT (workflows must install per-run)"
fi
done
echo "===========================================" echo "==========================================="
echo "✅ Asgard runner is OPERATIONAL" echo "✅ Asgard runner reachable — container is node:20-bullseye, bootstrap toolchains per-run"

View File

@@ -4,6 +4,19 @@ All notable changes to this project will be documented in this file.
## [Unreleased] ## [Unreleased]
### Added (Host-Agent v2 Consumer + SEO Meta — 2026-06-11)
**Backend (NestJS):**
- `HostAgentConsumerService` (new) — consumes wire protocol v2: `corrosion.*.host.heartbeat` updates `companion_last_seen` + `connection_status='connected'` (auto-registers the connection row on first contact); `host.going_offline` flips offline; a 60s staleness sweep marks hosts offline after 180s of silence. Previously NOTHING persisted heartbeats — `connection_status` was set once at setup and never changed again. Tenant-validated (UUID + license existence, cached) per NATS-consumer doctrine
- `NatsBridgeService` — bridges `host_heartbeat` / `host_going_offline` events to the panel WebSocket
- Verified by contract test: real agent → production NATS → captured with the backend's own `nats` lib under the real license; subjects, schema 2, real telemetry, offline beacon all confirmed
**Frontend:**
- Per-route document titles + meta descriptions (router `afterEach`, no new deps): six marketing pages get real titles/descriptions/OG tags (previously every page was "Corrosion Management" with zero meta — invisible to search and link previews); panel views get mechanical "{View} — Corrosion" titles
**CI:**
- `test-runner.yml` — honest per-tool presence checks (was printing "OPERATIONAL" while every toolchain probe failed); on-demand trigger instead of every push
### Added (Corrosion Host Agent — Rust rewrite Phase 0 — 2026-06-11) ### Added (Corrosion Host Agent — Rust rewrite Phase 0 — 2026-06-11)
**New: `corrosion-host-agent/`** — Rust rewrite of the Go companion agent (which stays in-tree as the behavior reference until parity). Wire protocol v2 (COA-B, Commander-approved): instance-scoped subjects `corrosion.{license}.{instance}.*` with host-level `corrosion.{license}.host.*` — full spec in `corrosion-host-agent/PROTOCOL.md`. **New: `corrosion-host-agent/`** — Rust rewrite of the Go companion agent (which stays in-tree as the behavior reference until parity). Wire protocol v2 (COA-B, Commander-approved): instance-scoped subjects `corrosion.{license}.{instance}.*` with host-level `corrosion.{license}.host.*` — full spec in `corrosion-host-agent/PROTOCOL.md`.

View File

@@ -55,7 +55,12 @@ frontend/ # Vue 3 + TypeScript
package.json package.json
vite.config.ts # Proxies /api to :3000 vite.config.ts # Proxies /api to :3000
companion-agent/ # Go binary for bare metal servers corrosion-host-agent/ # Rust host agent (ACTIVE) — multi-game ops runtime
src/ # main, config, bus (NATS), telemetry, prober, hostcmd
PROTOCOL.md # Wire protocol v2 spec (instance-scoped subjects)
agent.example.toml # Multi-instance config reference
companion-agent/ # Go binary (LEGACY — behavior reference until Rust parity)
cmd/agent/ # main.go entry point cmd/agent/ # main.go entry point
internal/ # Core agent logic (nats, commands, process) internal/ # Core agent logic (nats, commands, process)
Makefile # Build for Linux/Windows Makefile # Build for Linux/Windows
@@ -91,14 +96,16 @@ cd backend-nest && npx tsc --noEmit # Type-check without building
# Frontend # Frontend
cd frontend && npm run dev # Vite dev server (port 5174) cd frontend && npm run dev # Vite dev server (port 5174)
cd frontend && npm run build # Production build → dist/ cd frontend && npm run build # vue-tsc -b && vite build (type-check included; no separate lint/type-check scripts exist)
cd frontend && npm run lint # ESLint
cd frontend && npm run type-check # TypeScript checking (vue-tsc)
# Companion Agent (Go) # Host Agent (Rust — ACTIVE)
cd corrosion-host-agent && cargo check # Fast validation
cd corrosion-host-agent && cargo build --release --target x86_64-unknown-linux-musl # Static Linux binary
cd corrosion-host-agent && cargo xwin build --release --target x86_64-pc-windows-msvc # Windows (local)
# CI: push tag agent-vX.Y.Z (must match Cargo.toml version) → Asgard builds → CDN /host-agent/alpha/
# Companion Agent (Go — LEGACY, behavior reference until Rust parity)
cd companion-agent && make build # Build for current platform cd companion-agent && make build # Build for current platform
cd companion-agent && make linux # Cross-compile for Linux
cd companion-agent && make windows # Cross-compile for Windows
# Docker (from docker/ directory — Commander ALWAYS builds with --no-cache) # Docker (from docker/ directory — Commander ALWAYS builds with --no-cache)
docker compose build --no-cache && docker compose up -d # Full rebuild + start docker compose build --no-cache && docker compose up -d # Full rebuild + start
@@ -374,7 +381,8 @@ Default to Sonnet. Escalate to Opus when the problem demands it, not as a comfor
- Treat every change as production deployment (`corrosionmgmt.com`) - Treat every change as production deployment (`corrosionmgmt.com`)
- Document why, not just what, in commits and CHANGELOG - Document why, not just what, in commits and CHANGELOG
- **Always commit and push when done touching code — never ask, never wait for permission** - **Always commit and push when done touching code — never ask, never wait for permission**
- **Tag companion agent builds when Go code in `companion-agent/` is modified** — increment from latest tag (currently v1.0.3), push tag to trigger CI build + CDN upload - **Tag agent builds when agent code is modified** — Rust agent: `agent-vX.Y.Z` (must match `corrosion-host-agent/Cargo.toml`; CI publishes to CDN `/host-agent/alpha/`, while `/latest/` stays on the Go build until cutover). Legacy Go agent: `vX.Y.Z`. Tags roll FORWARD only — never reuse or re-push a tag; cut the next version
- **The Asgard CI runner executes jobs in a bare `node:20-bullseye` container** — no Rust/Go/Docker/sudo preinstalled; workflows must bootstrap toolchains per-run (setup-go, rustup via curl)
## Development Notes ## Development Notes
@@ -435,3 +443,7 @@ Things I discovered about myself building a sister platform across multiple sess
22. **Build-green is not render-correct — visually verify UI work before calling it done.** The entire design-system re-skin (50+ files, six green commits) rendered almost completely unstyled in the browser — white background, no surfaces, no accent — because the design tokens never loaded. `vue-tsc -b` + `vite build` passed clean the whole time; CSS that *compiles* can still apply *zero* styles. One Playwright screenshot of the login exposed it in seconds. When the deliverable is visual, a green build is necessary but not sufficient: load it in a real browser (Playwright on the dev server at :5174), screenshot it, and assert on `getComputedStyle` — don't trust compilation alone. This is Lesson 17 with teeth. 22. **Build-green is not render-correct — visually verify UI work before calling it done.** The entire design-system re-skin (50+ files, six green commits) rendered almost completely unstyled in the browser — white background, no surfaces, no accent — because the design tokens never loaded. `vue-tsc -b` + `vite build` passed clean the whole time; CSS that *compiles* can still apply *zero* styles. One Playwright screenshot of the login exposed it in seconds. When the deliverable is visual, a green build is necessary but not sufficient: load it in a real browser (Playwright on the dev server at :5174), screenshot it, and assert on `getComputedStyle` — don't trust compilation alone. This is Lesson 17 with teeth.
23. **Tailwind v4 silently drops a nested `@import` barrel placed after `@import "tailwindcss"`.** `style.css` did `@import "tailwindcss"; @import "./styles/corrosion.css";` where corrosion.css was a barrel of eight `@import` token files. Once Tailwind v4 expands the tailwindcss import in place, the barrel's inner @imports no longer precede all statements, so PostCSS drops them — emitting only an easily-ignored "@import must precede all other statements" warning. Result: every design token resolved empty and the whole panel rendered unstyled. Import token/design CSS files **directly and contiguously** in the entry stylesheet; never via a nested barrel after the Tailwind import. The build warning you wave off as "pre-existing" may be the entire feature silently failing. 23. **Tailwind v4 silently drops a nested `@import` barrel placed after `@import "tailwindcss"`.** `style.css` did `@import "tailwindcss"; @import "./styles/corrosion.css";` where corrosion.css was a barrel of eight `@import` token files. Once Tailwind v4 expands the tailwindcss import in place, the barrel's inner @imports no longer precede all statements, so PostCSS drops them — emitting only an easily-ignored "@import must precede all other statements" warning. Result: every design token resolved empty and the whole panel rendered unstyled. Import token/design CSS files **directly and contiguously** in the entry stylesheet; never via a nested barrel after the Tailwind import. The build warning you wave off as "pre-existing" may be the entire feature silently failing.
24. **`onModuleInit` runs before async `onModuleInit` of dependencies completes — register NATS/external subscriptions in `onApplicationBootstrap`.** `NatsService.onModuleInit` connects to NATS (async); `NatsBridgeService`/`HostAgentConsumerService` registered their subscriptions in their own `onModuleInit`, which fired while the connection was still null — so every `subscribe()` hit the `[OFFLINE]` no-op path and the WS bridge was dead-on-boot in *every* production build, silently. Nest guarantees `onApplicationBootstrap` runs only after all module init (including the awaited connect) finishes. Anything that depends on another provider's async startup belongs in bootstrap, not init. The tell: a subscription that "should be there" but the handler never fires and there's no error — trace the *startup ordering*, not the handler.
25. **Fixing a dead code path detonates the live code behind it — budget for the second bug.** The moment Lesson 24's fix made the NATS→WS bridge actually deliver events, the API crashed on the first forwarded heartbeat: `WebSocket.OPEN` was `undefined` at runtime because `esModuleInterop` is off, so `import WebSocket from 'ws'` compiled to `ws_1.default` (undefined). That crash had sat behind the dead bridge since the gateway was written — never hit because no event ever reached it. When you resurrect a path that was silently no-op, everything downstream of it is effectively *untested code running for the first time in production*. Verify the whole chain end-to-end (I watched the DB row appear, then flip offline), don't stop at "the subscription fires now." This is Lesson 10 with a fuse on it. Import-runtime gotcha worth remembering: when `esModuleInterop` is off, prefer instance constants (`client.OPEN`) over class statics (`WebSocket.OPEN`) for `ws`.

View File

@@ -49,6 +49,9 @@ import { EarlyAccessModule } from './modules/early-access/early-access.module';
// Shared Services // Shared Services
import { NatsService } from './services/nats.service'; import { NatsService } from './services/nats.service';
import { NatsBridgeService } from './services/nats-bridge.service'; import { NatsBridgeService } from './services/nats-bridge.service';
import { HostAgentConsumerService } from './services/host-agent-consumer.service';
import { ServerConnection } from './entities/server-connection.entity';
import { License } from './entities/license.entity';
import { SteamService } from './services/steam.service'; import { SteamService } from './services/steam.service';
// Gateway // Gateway
@@ -91,6 +94,9 @@ import { NatsBridgeGateway } from './gateways/nats-bridge.gateway';
// Scheduler // Scheduler
ScheduleModule.forRoot(), ScheduleModule.forRoot(),
// Repositories for app-level shared services (host-agent consumer)
TypeOrmModule.forFeature([ServerConnection, License]),
// Feature Modules // Feature Modules
AuthModule, AuthModule,
UsersModule, UsersModule,
@@ -134,6 +140,7 @@ import { NatsBridgeGateway } from './gateways/nats-bridge.gateway';
// Shared services // Shared services
NatsService, NatsService,
NatsBridgeService, NatsBridgeService,
HostAgentConsumerService,
SteamService, SteamService,
// WebSocket gateway // WebSocket gateway

View File

@@ -71,7 +71,10 @@ export class NatsBridgeGateway implements OnGatewayConnection, OnGatewayDisconne
// Subscribe to NATS events for this license // Subscribe to NATS events for this license
const listener = (event: string, data: unknown) => { const listener = (event: string, data: unknown) => {
if (client.readyState === WebSocket.OPEN) { // client.OPEN (instance constant) — NOT WebSocket.OPEN: with
// esModuleInterop off, the default `ws` import is undefined at
// runtime, so the static crashes. The instance constant is safe.
if (client.readyState === client.OPEN) {
client.send(JSON.stringify({ client.send(JSON.stringify({
type: 'event', type: 'event',
license_id: payload.license_id, license_id: payload.license_id,

View File

@@ -108,7 +108,9 @@ export class ConsoleGateway implements OnGatewayConnection, OnGatewayDisconnect
const message = JSON.stringify({ event, data }); const message = JSON.stringify({ event, data });
for (const client of clients) { for (const client of clients) {
if (client.readyState === WebSocket.OPEN) { // client.OPEN, not WebSocket.OPEN — esModuleInterop is off so the
// default `ws` import is undefined at runtime (would crash on forward).
if (client.readyState === client.OPEN) {
client.send(message); client.send(message);
} }
} }

View File

@@ -0,0 +1,154 @@
import { Injectable, Logger, OnApplicationBootstrap } from '@nestjs/common';
import { Interval } from '@nestjs/schedule';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository } from 'typeorm';
import { NatsService } from './nats.service';
import { ServerConnection } from '../entities/server-connection.entity';
import { License } from '../entities/license.entity';
/**
* Consumes Corrosion wire protocol v2 host-agent subjects
* (corrosion-host-agent/PROTOCOL.md) and keeps server_connections truthful.
*
* Before this service existed, NOTHING persisted agent heartbeats:
* companion_last_seen was written once at setup and connection_status stayed
* 'connected' forever. Now: heartbeat -> last_seen + connected (row
* auto-created on first contact), going_offline beacon -> offline, and a
* staleness sweep marks hosts offline when heartbeats stop arriving.
*/
@Injectable()
export class HostAgentConsumerService implements OnApplicationBootstrap {
private readonly logger = new Logger(HostAgentConsumerService.name);
/** licenseId -> cache expiry epoch-ms. Positive = exists, absent = unknown. */
private knownLicenses = new Map<string, number>();
/** Unknown/garbage license ids we already warned about (anti log-spam). */
private warnedUnknown = new Set<string>();
private static readonly UUID_RE =
/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
private static readonly LICENSE_CACHE_TTL_MS = 5 * 60_000;
/** 3x the agent's default 60s heartbeat (which jitters to max 72s). */
private static readonly OFFLINE_AFTER_MS = 180_000;
constructor(
private readonly nats: NatsService,
@InjectRepository(ServerConnection)
private readonly connectionRepository: Repository<ServerConnection>,
@InjectRepository(License)
private readonly licenseRepository: Repository<License>,
) {}
// Bootstrap, not module-init: subscriptions registered before NatsService
// finished connecting silently no-op (see NatsBridgeService note).
onApplicationBootstrap() {
this.nats.subscribe('corrosion.*.host.heartbeat', (data, subject) => {
const licenseId = subject.split('.')[1];
void this.onHeartbeat(licenseId).catch((err) =>
this.logger.error(`heartbeat handling failed for ${licenseId}: ${err.message}`, err.stack),
);
void data; // payload telemetry is bridged to the browser; persistence here is liveness only
});
this.nats.subscribe('corrosion.*.host.going_offline', (_data, subject) => {
const licenseId = subject.split('.')[1];
void this.onGoingOffline(licenseId).catch((err) =>
this.logger.error(`going_offline handling failed for ${licenseId}: ${err.message}`, err.stack),
);
});
this.logger.log('Host agent (protocol v2) consumer subscriptions initialized');
}
private async onHeartbeat(licenseId: string): Promise<void> {
if (!(await this.isValidTenant(licenseId))) return;
const now = new Date();
const existing = await this.connectionRepository.findOne({
where: { license_id: licenseId },
});
if (existing) {
await this.connectionRepository.update(
{ id: existing.id },
{ companion_last_seen: now, connection_status: 'connected', updated_at: now },
);
if (existing.connection_status !== 'connected') {
this.logger.log(`host agent for license ${licenseId} is back online`);
}
} else {
// First contact from a host agent: auto-register the connection so the
// panel lights up without a manual setup step.
await this.connectionRepository.save(
this.connectionRepository.create({
license_id: licenseId,
connection_type: 'bare_metal',
connection_status: 'connected',
companion_last_seen: now,
}),
);
this.logger.log(`host agent registered for license ${licenseId} (first heartbeat)`);
}
}
private async onGoingOffline(licenseId: string): Promise<void> {
if (!(await this.isValidTenant(licenseId))) return;
await this.connectionRepository.update(
{ license_id: licenseId },
{ connection_status: 'offline', updated_at: new Date() },
);
this.logger.log(`host agent for license ${licenseId} went offline (graceful beacon)`);
}
/**
* Heartbeats stopping must flip the panel to offline — an agent that
* crashes or loses network never sends the goodbye beacon.
*/
@Interval(60_000)
async sweepStaleConnections(): Promise<void> {
const threshold = new Date(Date.now() - HostAgentConsumerService.OFFLINE_AFTER_MS);
const result = await this.connectionRepository
.createQueryBuilder()
.update(ServerConnection)
.set({ connection_status: 'offline', updated_at: () => 'NOW()' })
.where('connection_status = :connected', { connected: 'connected' })
.andWhere('companion_last_seen IS NOT NULL')
.andWhere('companion_last_seen < :threshold', { threshold })
.execute();
if (result.affected) {
this.logger.warn(`marked ${result.affected} stale host connection(s) offline`);
}
}
/**
* Tenant validation: the subject segment must be a real license UUID.
* NATS consumers must never write rows for subjects an arbitrary publisher
* invented. Existence is cached to avoid a query per heartbeat.
*/
private async isValidTenant(licenseId: string): Promise<boolean> {
if (!HostAgentConsumerService.UUID_RE.test(licenseId)) {
this.warnUnknownOnce(licenseId, 'not a UUID');
return false;
}
const cachedUntil = this.knownLicenses.get(licenseId);
if (cachedUntil && cachedUntil > Date.now()) return true;
const exists = await this.licenseRepository.exist({ where: { id: licenseId } });
if (!exists) {
this.warnUnknownOnce(licenseId, 'no such license');
return false;
}
this.knownLicenses.set(licenseId, Date.now() + HostAgentConsumerService.LICENSE_CACHE_TTL_MS);
return true;
}
private warnUnknownOnce(licenseId: string, reason: string): void {
if (this.warnedUnknown.has(licenseId)) return;
this.warnedUnknown.add(licenseId);
this.logger.warn(`ignoring host-agent traffic for invalid license '${licenseId}' (${reason})`);
}
}

View File

@@ -1,3 +1,4 @@
export { NatsService } from './nats.service'; export { NatsService } from './nats.service';
export { NatsBridgeService } from './nats-bridge.service'; export { NatsBridgeService } from './nats-bridge.service';
export { HostAgentConsumerService } from './host-agent-consumer.service';
export { SteamService } from './steam.service'; export { SteamService } from './steam.service';

View File

@@ -1,14 +1,19 @@
import { Injectable, OnModuleInit, Logger } from '@nestjs/common'; import { Injectable, OnApplicationBootstrap, Logger } from '@nestjs/common';
import { NatsService } from './nats.service'; import { NatsService } from './nats.service';
@Injectable() @Injectable()
export class NatsBridgeService implements OnModuleInit { export class NatsBridgeService implements OnApplicationBootstrap {
private readonly logger = new Logger(NatsBridgeService.name); private readonly logger = new Logger(NatsBridgeService.name);
private listeners: Map<string, Set<(event: string, data: unknown) => void>> = new Map(); private listeners: Map<string, Set<(event: string, data: unknown) => void>> = new Map();
constructor(private nats: NatsService) {} constructor(private nats: NatsService) {}
onModuleInit() { // Subscriptions MUST happen in onApplicationBootstrap, not onModuleInit:
// provider onModuleInit order is not guaranteed, and these hooks once ran
// before NatsService connected — every subscribe() silently no-oped and the
// WS bridge was dead from boot. Bootstrap runs after ALL module inits
// (including the awaited NATS connect) complete.
onApplicationBootstrap() {
this.nats.subscribe('corrosion.*.companion.heartbeat', (data, subject) => { this.nats.subscribe('corrosion.*.companion.heartbeat', (data, subject) => {
const licenseId = subject.split('.')[1]; const licenseId = subject.split('.')[1];
this.emit(licenseId, 'heartbeat', data); this.emit(licenseId, 'heartbeat', data);
@@ -44,6 +49,17 @@ export class NatsBridgeService implements OnModuleInit {
this.emit(licenseId, 'oxide_status', data); this.emit(licenseId, 'oxide_status', data);
}); });
// Wire protocol v2 (corrosion-host-agent) — host-level telemetry
this.nats.subscribe('corrosion.*.host.heartbeat', (data, subject) => {
const licenseId = subject.split('.')[1];
this.emit(licenseId, 'host_heartbeat', data);
});
this.nats.subscribe('corrosion.*.host.going_offline', (data, subject) => {
const licenseId = subject.split('.')[1];
this.emit(licenseId, 'host_going_offline', data);
});
this.logger.log('NATS bridge subscriptions initialized'); this.logger.log('NATS bridge subscriptions initialized');
} }

View File

@@ -0,0 +1,152 @@
// Full-pipeline contract test: Rust host agent → NATS → NestJS consumer → Postgres.
//
// Proves the wire protocol v2 chain end to end against a REAL backend and DB:
// 1. agent heartbeat arrives with schema 2 + measured telemetry
// 2. backend auto-registers the server_connections row and marks it connected
// 3. instance command channel round-trips (start/status/stop) with push events
// 4. graceful agent shutdown publishes the offline beacon and the row flips offline
//
// Required env:
// LICENSE_ID — existing license uuid (CI: from the admin seed)
// DATABASE_URL — postgres connection string for assertions
// NATS_URL — broker both agent and backend use (default nats://localhost:4222)
// AGENT_BIN — path to the corrosion-host-agent binary
//
// Uses the backend's own node_modules (nats, pg) so the client libs under test
// are exactly what production runs.
import { createRequire } from 'node:module';
import { spawn } from 'node:child_process';
import { writeFileSync, mkdtempSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
const repoRoot = join(dirname(fileURLToPath(import.meta.url)), '..');
const require = createRequire(join(repoRoot, 'backend-nest', 'node_modules', 'x.js'));
const { connect, StringCodec } = require('nats');
const { Client: PgClient } = require('pg');
const LICENSE = process.env.LICENSE_ID;
const NATS_URL = process.env.NATS_URL ?? 'nats://localhost:4222';
const DATABASE_URL = process.env.DATABASE_URL;
const AGENT_BIN = process.env.AGENT_BIN ?? join(repoRoot, 'corrosion-host-agent', 'target', 'debug', 'corrosion-host-agent');
if (!LICENSE || !DATABASE_URL) {
console.error('LICENSE_ID and DATABASE_URL are required');
process.exit(2);
}
const sc = StringCodec();
const errs = [];
const check = (cond, msg) => { if (!cond) errs.push(msg); };
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));
async function pollDb(pg, predicate, label, timeoutMs = 30_000) {
const deadline = Date.now() + timeoutMs;
for (;;) {
const { rows } = await pg.query(
'SELECT connection_type, connection_status, companion_last_seen FROM server_connections WHERE license_id = $1',
[LICENSE],
);
if (predicate(rows)) return rows;
if (Date.now() > deadline) {
errs.push(`${label}: timeout after ${timeoutMs}ms — rows: ${JSON.stringify(rows)}`);
return rows;
}
await sleep(1000);
}
}
const main = async () => {
const pg = new PgClient({ connectionString: DATABASE_URL });
await pg.connect();
const nc = await connect({ servers: NATS_URL });
const heartbeats = [];
const statusEvents = [];
(async () => { for await (const m of nc.subscribe(`corrosion.${LICENSE}.host.heartbeat`)) heartbeats.push(JSON.parse(sc.decode(m.data))); })();
(async () => { for await (const m of nc.subscribe(`corrosion.${LICENSE}.ci-instance.status`)) statusEvents.push(JSON.parse(sc.decode(m.data))); })();
// --- spawn the real agent ---
const dir = mkdtempSync(join(tmpdir(), 'cha-contract-'));
const cfgPath = join(dir, 'agent.toml');
writeFileSync(cfgPath, `
[agent]
license_id = "${LICENSE}"
nats_url = "${NATS_URL}"
heartbeat_seconds = 10
log_level = "info"
[[instance]]
id = "ci-instance"
game = "rust"
root = "/tmp"
label = "Contract CI"
executable = "/bin/sleep"
args = ["300"]
`);
const agent = spawn(AGENT_BIN, ['--config', cfgPath], { stdio: ['ignore', 'inherit', 'inherit'] });
const agentExited = new Promise((r) => agent.on('exit', r));
// --- 1. heartbeat shape + real telemetry ---
const hbDeadline = Date.now() + 20_000;
while (heartbeats.length === 0 && Date.now() < hbDeadline) await sleep(500);
check(heartbeats.length > 0, 'no heartbeat within 20s');
if (heartbeats.length) {
const hb = heartbeats[0];
check(hb.schema === 2, `schema != 2: ${hb.schema}`);
check(typeof hb.host?.cpu_percent === 'number', 'missing host.cpu_percent');
check(hb.host?.mem_total_mb > 0, 'mem_total_mb not measured');
check(Array.isArray(hb.host?.disks) && hb.host.disks.length > 0, 'no disks reported');
check(hb.instances?.[0]?.id === 'ci-instance', 'instance missing from heartbeat');
check(!!hb.agent?.version && !!hb.agent?.commit, 'agent version/commit missing');
}
// --- 2. backend auto-registers + connects ---
const rows = await pollDb(pg, (r) => r.length === 1 && r[0].connection_status === 'connected', 'auto-register connected');
if (rows.length === 1) {
check(rows[0].connection_type === 'bare_metal', `connection_type: ${rows[0].connection_type}`);
check(rows[0].companion_last_seen !== null, 'companion_last_seen not set');
}
// --- 3. instance command channel ---
const cmd = async (payload) =>
JSON.parse(sc.decode((await nc.request(`corrosion.${LICENSE}.ci-instance.cmd`, sc.encode(JSON.stringify(payload)), { timeout: 8000 })).data));
const st0 = await cmd({ func: 'status' });
check(st0.state?.state === 'stopped', `initial state: ${JSON.stringify(st0.state)}`);
const start = await cmd({ func: 'start' });
check(start.status === 'success', `start: ${JSON.stringify(start)}`);
await sleep(1000);
const st1 = await cmd({ func: 'status' });
check(st1.state?.state === 'running', `post-start state: ${JSON.stringify(st1.state)}`);
check((await cmd({ func: 'start' })).status === 'error', 'double start must error');
check((await cmd({ func: 'bogus' })).status === 'error', 'unknown func must error');
const stop = await cmd({ func: 'stop' });
check(stop.status === 'success', `stop: ${JSON.stringify(stop)}`);
await sleep(1000);
const seq = statusEvents.map((e) => e.event?.state);
check(seq.includes('running') && seq.includes('stopped'), `status events incomplete: ${seq.join(',')}`);
// --- 4. graceful shutdown → offline beacon → DB flips offline ---
agent.kill('SIGTERM');
await Promise.race([agentExited, sleep(8000)]);
await pollDb(pg, (r) => r.length === 1 && r[0].connection_status === 'offline', 'beacon offline', 20_000);
await nc.close();
await pg.end();
if (errs.length) {
console.error('\nCONTRACT FAIL:');
errs.forEach((e) => console.error(' -', e));
process.exit(1);
}
console.log('\nCONTRACT PASS: heartbeat shape, auto-register, connected/offline lifecycle, instance command channel, push events');
process.exit(0);
};
main().catch((e) => {
console.error('contract test crashed:', e);
process.exit(1);
});

View File

@@ -149,6 +149,12 @@ version = "3.20.3"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "72f5acc6cb2ba439de613abc23857ec3d78374d8ed5ac84e9d11336e87da8649" checksum = "72f5acc6cb2ba439de613abc23857ec3d78374d8ed5ac84e9d11336e87da8649"
[[package]]
name = "byteorder"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fd0f2584146f6f2ef48085050886acf353beff7305ebd1ae69500e27c67f64b"
[[package]] [[package]]
name = "bytes" name = "bytes"
version = "1.11.1" version = "1.11.1"
@@ -258,18 +264,20 @@ checksum = "773648b94d0e5d620f64f280777445740e61fe701025087ec8b57f45c791888b"
[[package]] [[package]]
name = "corrosion-host-agent" name = "corrosion-host-agent"
version = "2.0.0-alpha.1" version = "2.0.0-alpha.2"
dependencies = [ dependencies = [
"anyhow", "anyhow",
"async-nats", "async-nats",
"chrono", "chrono",
"clap", "clap",
"futures", "futures",
"libc",
"rand", "rand",
"serde", "serde",
"serde_json", "serde_json",
"sysinfo", "sysinfo",
"tokio", "tokio",
"tokio-tungstenite",
"tokio-util", "tokio-util",
"toml", "toml",
"tracing", "tracing",
@@ -580,6 +588,22 @@ version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea" checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea"
[[package]]
name = "http"
version = "1.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6970f50e31d6fc17d3fa27329444bfa74e196cf62e95052a3f6fee181dba6425"
dependencies = [
"bytes",
"itoa",
]
[[package]]
name = "httparse"
version = "1.10.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6dbf3de79e51f3d586ab4cb9d5c3e2c14aa28ed23d180cf89b4df0454a69cc87"
[[package]] [[package]]
name = "iana-time-zone" name = "iana-time-zone"
version = "0.1.65" version = "0.1.65"
@@ -1276,6 +1300,17 @@ dependencies = [
"serde", "serde",
] ]
[[package]]
name = "sha1"
version = "0.10.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e3bf829a2d51ab4a5ddf1352d8470c140cadc8301b2ae1789db023f01cedd6ba"
dependencies = [
"cfg-if",
"cpufeatures",
"digest",
]
[[package]] [[package]]
name = "sha2" name = "sha2"
version = "0.10.9" version = "0.10.9"
@@ -1528,6 +1563,18 @@ dependencies = [
"tokio", "tokio",
] ]
[[package]]
name = "tokio-tungstenite"
version = "0.24.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "edc5f74e248dc973e0dbb7b74c7e0d6fcc301c694ff50049504004ef4d0cdcd9"
dependencies = [
"futures-util",
"log",
"tokio",
"tungstenite",
]
[[package]] [[package]]
name = "tokio-util" name = "tokio-util"
version = "0.7.18" version = "0.7.18"
@@ -1654,6 +1701,24 @@ dependencies = [
"tokio", "tokio",
] ]
[[package]]
name = "tungstenite"
version = "0.24.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "18e5b8366ee7a95b16d32197d0b2604b43a0be89dc5fac9f8e96ccafbaedda8a"
dependencies = [
"byteorder",
"bytes",
"data-encoding",
"http",
"httparse",
"log",
"rand",
"sha1",
"thiserror",
"utf-8",
]
[[package]] [[package]]
name = "typenum" name = "typenum"
version = "1.20.1" version = "1.20.1"
@@ -1684,6 +1749,12 @@ dependencies = [
"serde", "serde",
] ]
[[package]]
name = "utf-8"
version = "0.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "09cc8ee72d2a9becf2f2febe0205bbed8fc6615b7cb429ad062dc7b7ddd036a9"
[[package]] [[package]]
name = "utf8_iter" name = "utf8_iter"
version = "1.0.4" version = "1.0.4"

View File

@@ -1,6 +1,6 @@
[package] [package]
name = "corrosion-host-agent" name = "corrosion-host-agent"
version = "2.0.0-alpha.1" version = "2.0.0-alpha.2"
edition = "2021" edition = "2021"
description = "Corrosion Host Agent — multi-game ops runtime for self-hosted game servers" description = "Corrosion Host Agent — multi-game ops runtime for self-hosted game servers"
license = "UNLICENSED" license = "UNLICENSED"
@@ -25,6 +25,10 @@ tracing-subscriber = { version = "0.3", features = ["env-filter", "fmt"] }
anyhow = "1" anyhow = "1"
clap = { version = "4.5", features = ["derive"] } clap = { version = "4.5", features = ["derive"] }
rand = "0.8" rand = "0.8"
tokio-tungstenite = "0.24"
[target.'cfg(unix)'.dependencies]
libc = "0.2"
# Size-optimized release: single static binary living next to RAM-heavy game # Size-optimized release: single static binary living next to RAM-heavy game
# servers. Panic stays 'unwind' so a panicking task surfaces through its # servers. Panic stays 'unwind' so a panicking task surfaces through its

View File

@@ -1,8 +1,9 @@
# Corrosion Wire Protocol v2 # Corrosion Wire Protocol v2
Status: **Phase 0 implemented** (host heartbeat, host commands, going-offline Status: **Phase 0 + Phase 1 process control implemented** (host heartbeat,
beacon). Per-instance command/status subjects are reserved and specified here host commands, going-offline beacon, per-instance start/stop/restart/status
for Phase 1. with push state events). RCON, SteamCMD, file ops, and game adapters are
specified but not yet implemented.
## Design ## Design
@@ -70,9 +71,10 @@ All telemetry is measured, never fabricated. Fields the agent cannot measure
are omitted (`probe` before the first probe completes, `hostname` if are omitted (`probe` before the first probe completes, `hostname` if
unavailable). unavailable).
Phase 0 instance `state` values: `configured` (root path exists), Instance `state` values — process-managed (an `executable` is configured):
`missing_root`. Phase 1 adds live process states: `running`, `stopped`, `running`, `stopped`, `starting`, `stopping`, `crashed`; unmanaged
`crashed`, `starting`, `updating`. (telemetry-only): `configured` (root exists), `missing_root`. Each instance
also reports `uptime_seconds` (0 unless running).
### `corrosion.{license_id}.host.cmd` (backend → agent, request-reply) ### `corrosion.{license_id}.host.cmd` (backend → agent, request-reply)
@@ -92,19 +94,40 @@ Best-effort beacon (500ms budget) on graceful shutdown so the panel can flip
the host to offline immediately instead of waiting out heartbeat staleness. the host to offline immediately instead of waiting out heartbeat staleness.
Payload: `{}`. Payload: `{}`.
## Instance-level subjects (Phase 1 — reserved, not yet implemented) ## Instance-level subjects
### `corrosion.{license_id}.{instance_id}.cmd` (backend → agent, request-reply) ### `corrosion.{license_id}.{instance_id}.cmd` (backend → agent, request-reply) — LIVE
Lifecycle and control for one game instance. Planned funcs: `start`, `stop`, Lifecycle and control for one game instance.
`restart`, `status`, `rcon` (process-class games), `steam_update`,
`oxide_install` (rust), plus game-adapter-specific commands (Dune: docker
lifecycle, RabbitMQ bus commands, Coriolis reset).
### `corrosion.{license_id}.{instance_id}.status` (agent → backend, publish) Implemented funcs: `start`, `stop` (graceful with 30s budget, then force
kill), `restart`, `status` (returns `state` + `uptime_seconds`), and
`rcon``{ "func": "rcon", "command": "<console command>" }` returns
`{ "status": "success", "output": <server response> }`. Protocol per game:
WebRCON (WebSocket JSON) for rust, Source RCON (Valve TCP) for
conan/soulmask; explicit `kind` override available in the instance's
`[instance.rcon]` config. Always targets 127.0.0.1 (agent is co-located).
Errors reply `{ "status": "error", "message": ... }` — including start on an
unmanaged instance, double start, missing rcon config, and unknown funcs.
State-change events (started/stopped/crashed) so the panel does not wait for Planned funcs: `steam_update`, `oxide_install` (rust), plus
the next heartbeat. game-adapter-specific commands (Dune: docker lifecycle, RabbitMQ bus
commands, Coriolis reset).
### `corrosion.{license_id}.{instance_id}.status` (agent → backend, publish) — LIVE
State-change events so the panel does not wait for the next heartbeat.
Payload: `{ "timestamp", "instance_id", "event": { "state": ..., "exit_code"? } }`.
Semantics: **keep-latest state sync**, not a lossless transition ledger —
near-instant transient states (e.g. `starting` when spawn succeeds
immediately) may coalesce into the following state. Consumers should treat
each event as "current state is now X".
Known Phase 1 limitation: the supervisor does not yet persist/adopt PIDs — if
the agent itself restarts while a game server is running, the game process
survives but reports `stopped` until restarted through the panel. PID
adoption is queued with the service-install work.
### `corrosion.{license_id}.{instance_id}.console` (agent → backend, publish) ### `corrosion.{license_id}.{instance_id}.console` (agent → backend, publish)

View File

@@ -15,7 +15,11 @@ instance on that host — Rust, Conan Exiles, Soulmask, Dune: Awakening.
- [x] Connectivity prober (outbound TCP, periodic + on-demand) - [x] Connectivity prober (outbound TCP, periodic + on-demand)
- [x] Host command channel (`ping`, `probe`, `sysinfo`) - [x] Host command channel (`ping`, `probe`, `sysinfo`)
- [x] Graceful shutdown (cancellation token, going-offline beacon, NATS flush) - [x] Graceful shutdown (cancellation token, going-offline beacon, NATS flush)
- [ ] Phase 1: process-class game adapter (spawn/RCON/SteamCMD/files) — Rust, Conan, Soulmask - [x] Phase 1a: process supervision — per-instance start/stop/restart/status over
`{instance}.cmd` request-reply, push state events on `{instance}.status`,
crash detection with exit codes, live state in heartbeats
(integration-tested with real processes + live-NATS contract test)
- [ ] Phase 1b: RCON trait (WebRCON rust / TCP conan+soulmask), SteamCMD, jailed file manager
- [ ] Phase 2: Dune Docker adapter (compose lifecycle, RabbitMQ bus, Postgres admin) - [ ] Phase 2: Dune Docker adapter (compose lifecycle, RabbitMQ bus, Postgres admin)
- [ ] Phase 3: signed self-update (enforced ed25519 — release gate), service install, supervisor split - [ ] Phase 3: signed self-update (enforced ed25519 — release gate), service install, supervisor split

View File

@@ -23,11 +23,28 @@ game = "rust" # rust | conan | soulmask | dune
root = "/opt/rustserver" root = "/opt/rustserver"
label = "Main 2x Vanilla" label = "Main 2x Vanilla"
# RCON lets the panel send console commands to the running server.
# For rust the protocol is WebRCON (WebSocket JSON); for conan/soulmask it is
# Source RCON (Valve TCP binary). `kind` is optional — it is inferred from
# the game name when absent.
#
# The [instance.rcon] sub-table MUST immediately follow the [[instance]] entry
# it belongs to (standard TOML array-of-tables scoping rule).
[instance.rcon]
port = 28016
password = "changeme"
# kind = "webrcon" # explicit override; omit to infer from game
# [[instance]] # [[instance]]
# id = "soulmask-main" # id = "soulmask-main"
# game = "soulmask" # game = "soulmask"
# root = "/opt/soulmask/main" # root = "/opt/soulmask/main"
# label = "Cloud Mist Forest (cluster main)" # label = "Cloud Mist Forest (cluster main)"
#
# [instance.rcon]
# port = 19000
# password = "changeme"
# # kind = "source" # inferred automatically for soulmask
[prober] [prober]
interval_seconds = 300 interval_seconds = 300

View File

@@ -1,10 +1,13 @@
//! Shared agent handle: every subsystem task holds an `Arc<Agent>`. //! Shared agent handle: every subsystem task holds an `Arc<Agent>`.
use std::collections::HashMap;
use std::sync::Arc;
use std::time::Instant; use std::time::Instant;
use tokio::sync::RwLock; use tokio::sync::RwLock;
use tokio_util::sync::CancellationToken; use tokio_util::sync::CancellationToken;
use crate::config::Settings; use crate::config::Settings;
use crate::process::ProcessSupervisor;
use crate::prober::ProbeReport; use crate::prober::ProbeReport;
pub struct Agent { pub struct Agent {
@@ -12,5 +15,8 @@ pub struct Agent {
pub nats: async_nats::Client, pub nats: async_nats::Client,
pub started: Instant, pub started: Instant,
pub last_probe: RwLock<Option<ProbeReport>>, pub last_probe: RwLock<Option<ProbeReport>>,
/// One supervisor per instance (unmanaged instances included — they
/// report `unmanaged` state and reject process commands).
pub supervisors: HashMap<String, Arc<ProcessSupervisor>>,
pub shutdown: CancellationToken, pub shutdown: CancellationToken,
} }

View File

@@ -10,6 +10,8 @@ use serde::Deserialize;
use std::collections::HashSet; use std::collections::HashSet;
use std::path::{Path, PathBuf}; use std::path::{Path, PathBuf};
use crate::rcon::RconConfig;
/// Instance ids share the NATS subject namespace with host-level segments. /// Instance ids share the NATS subject namespace with host-level segments.
const RESERVED_INSTANCE_IDS: &[&str] = &["host", "cmd", "files", "update", "agent"]; const RESERVED_INSTANCE_IDS: &[&str] = &["host", "cmd", "files", "update", "agent"];
@@ -49,6 +51,33 @@ pub struct InstanceConfig {
/// Optional human label shown in the panel. /// Optional human label shown in the panel.
#[serde(default)] #[serde(default)]
pub label: Option<String>, pub label: Option<String>,
/// Game server executable. Relative paths resolve against `root`.
/// Absent = unmanaged instance (telemetry only, no process control).
#[serde(default)]
pub executable: Option<PathBuf>,
/// Arguments as a proper list — no shell splitting, quoted values survive.
#[serde(default)]
pub args: Vec<String>,
/// Working directory for the process. Defaults to the executable's directory.
#[serde(default)]
pub working_dir: Option<PathBuf>,
/// RCON connection settings for this instance. Absent = rcon unavailable.
/// Protocol defaults to WebRcon for rust, Source for conan/soulmask.
#[serde(default)]
pub rcon: Option<RconConfig>,
}
impl InstanceConfig {
/// Absolute executable path, if this instance is process-managed.
pub fn resolved_executable(&self) -> Option<PathBuf> {
self.executable.as_ref().map(|exe| {
if exe.is_absolute() {
exe.clone()
} else {
self.root.join(exe)
}
})
}
} }
#[derive(Debug, Clone, Default, Deserialize)] #[derive(Debug, Clone, Default, Deserialize)]

View File

@@ -0,0 +1,201 @@
//! Per-instance command channel + state-change events.
//!
//! Each process-managed instance gets a request-reply subscriber on
//! `corrosion.{license}.{instance_id}.cmd` (funcs: start/stop/restart/status/rcon)
//! and a publisher task that pushes every supervisor state change to
//! `corrosion.{license}.{instance_id}.status` — the panel sees crashes when
//! they happen, not when the next heartbeat ambles in.
use chrono::{SecondsFormat, Utc};
use futures::StreamExt;
use serde::Deserialize;
use serde_json::json;
use std::sync::Arc;
use crate::agent::Agent;
use crate::process::ProcessSupervisor;
use crate::subjects;
#[derive(Debug, Deserialize)]
struct InstanceCommand {
func: String,
/// Payload for funcs that carry a text argument (e.g. rcon).
#[serde(default)]
command: Option<String>,
}
/// Forward every supervisor state change as a status event.
pub async fn publish_state_changes(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>) {
let subject = subjects::instance_status(&agent.cfg.license_id, &sup.instance_id);
let mut rx = sup.watch_state();
let cancel = agent.shutdown.clone();
loop {
tokio::select! {
changed = rx.changed() => {
if changed.is_err() {
break;
}
let state = rx.borrow().clone();
let event = json!({
"timestamp": Utc::now().to_rfc3339_opts(SecondsFormat::Secs, true),
"instance_id": sup.instance_id,
"event": state,
});
match serde_json::to_vec(&event) {
Ok(bytes) => {
if let Err(e) = agent.nats.publish(subject.clone(), bytes.into()).await {
tracing::warn!("status publish failed for '{}': {e}", sup.instance_id);
}
}
Err(e) => tracing::error!("status serialize failed: {e}"),
}
}
_ = cancel.cancelled() => break,
}
}
}
/// Request-reply command handler for one instance.
pub async fn run(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>) -> anyhow::Result<()> {
let subject = subjects::instance_cmd(&agent.cfg.license_id, &sup.instance_id);
let mut sub = agent.nats.subscribe(subject.clone()).await?;
tracing::info!("instance command handler listening on {subject}");
let cancel = agent.shutdown.clone();
loop {
tokio::select! {
msg = sub.next() => {
match msg {
Some(msg) => {
let agent = agent.clone();
let sup = sup.clone();
tokio::spawn(async move { handle(agent, sup, msg).await });
}
None => {
tracing::warn!("instance command subscription ended for '{}'", sup.instance_id);
break;
}
}
}
_ = cancel.cancelled() => {
tracing::info!("instance command handler stopping for '{}'", sup.instance_id);
break;
}
}
}
Ok(())
}
async fn handle(agent: Arc<Agent>, sup: Arc<ProcessSupervisor>, msg: async_nats::Message) {
let Some(reply) = msg.reply.clone() else {
tracing::warn!("instance command without reply subject ignored");
return;
};
let response = match serde_json::from_slice::<InstanceCommand>(&msg.payload) {
Ok(cmd) => dispatch(&agent, &sup, &cmd).await,
Err(e) => json!({ "status": "error", "message": format!("invalid command payload: {e}") }),
};
let bytes = match serde_json::to_vec(&response) {
Ok(b) => b,
Err(e) => {
tracing::error!("response serialize failed: {e}");
return;
}
};
if let Err(e) = agent.nats.publish(reply, bytes.into()).await {
tracing::warn!("response publish failed: {e}");
}
}
async fn dispatch(
agent: &Arc<Agent>,
sup: &Arc<ProcessSupervisor>,
cmd: &InstanceCommand,
) -> serde_json::Value {
let func = cmd.func.as_str();
let outcome = match func {
"start" => sup.start().await.map(|_| "starting"),
"stop" => sup.stop().await.map(|_| "stopped"),
"restart" => sup.restart().await.map(|_| "restarted"),
"status" => {
return json!({
"status": "success",
"func": "status",
"instance_id": sup.instance_id,
"state": sup.state(),
"uptime_seconds": sup.uptime_seconds().await,
});
}
"rcon" => {
// Look up the InstanceConfig for this supervisor so we can access
// rcon settings and the game name without changing the supervisor's
// data model.
let inst_cfg = agent
.cfg
.instances
.iter()
.find(|i| i.id == sup.instance_id);
let rcon_cfg = inst_cfg.and_then(|i| i.rcon.as_ref());
let Some(rcon_cfg) = rcon_cfg else {
return json!({
"status": "error",
"func": "rcon",
"instance_id": sup.instance_id,
"message": format!("instance '{}' has no rcon configured", sup.instance_id),
});
};
let Some(command) = cmd.command.as_deref() else {
return json!({
"status": "error",
"func": "rcon",
"instance_id": sup.instance_id,
"message": "rcon func requires a 'command' field",
});
};
let game = inst_cfg.map(|i| i.game.as_str()).unwrap_or("rust");
return match crate::rcon::send_command(rcon_cfg, game, command).await {
Ok(output) => json!({
"status": "success",
"func": "rcon",
"instance_id": sup.instance_id,
"output": output,
}),
Err(e) => json!({
"status": "error",
"func": "rcon",
"instance_id": sup.instance_id,
"message": format!("{e:#}"),
}),
};
}
other => {
return json!({
"status": "error",
"message": format!("unknown func '{other}' (supported: start, stop, restart, status, rcon)"),
});
}
};
match outcome {
Ok(result) => json!({
"status": "success",
"func": func,
"instance_id": sup.instance_id,
"result": result,
"state": sup.state(),
}),
Err(e) => json!({
"status": "error",
"func": func,
"instance_id": sup.instance_id,
"message": format!("{e:#}"),
}),
}
}

View File

@@ -0,0 +1,14 @@
//! Corrosion Host Agent library surface — modules are public so integration
//! tests can drive subsystems (notably the process supervisor) directly.
pub mod agent;
pub mod bus;
pub mod config;
pub mod hostcmd;
pub mod instancecmd;
pub mod prober;
pub mod process;
pub mod rcon;
pub mod subjects;
pub mod telemetry;
pub mod version;

View File

@@ -4,14 +4,9 @@
//! connectivity prober, host command channel. Process control, file ops, and //! connectivity prober, host command channel. Process control, file ops, and
//! game adapters arrive in Phase 1+ (see PROTOCOL.md). //! game adapters arrive in Phase 1+ (see PROTOCOL.md).
mod agent; use corrosion_host_agent::{
mod bus; agent, bus, config, hostcmd, instancecmd, prober, process, subjects, telemetry, version,
mod config; };
mod hostcmd;
mod prober;
mod subjects;
mod telemetry;
mod version;
use anyhow::{Context, Result}; use anyhow::{Context, Result};
use clap::{Parser, Subcommand}; use clap::{Parser, Subcommand};
@@ -96,11 +91,18 @@ async fn run(settings: config::Settings) -> Result<()> {
let nats = bus::connect(&settings).await?; let nats = bus::connect(&settings).await?;
let supervisors = settings
.instances
.iter()
.map(|inst| (inst.id.clone(), process::ProcessSupervisor::new(inst)))
.collect();
let agent = Arc::new(Agent { let agent = Arc::new(Agent {
cfg: settings, cfg: settings,
nats, nats,
started: Instant::now(), started: Instant::now(),
last_probe: RwLock::new(None), last_probe: RwLock::new(None),
supervisors,
shutdown: CancellationToken::new(), shutdown: CancellationToken::new(),
}); });
@@ -115,6 +117,21 @@ async fn run(settings: config::Settings) -> Result<()> {
} }
})); }));
} }
for sup in agent.supervisors.values() {
{
let agent = agent.clone();
let sup = sup.clone();
handles.push(tokio::spawn(async move {
if let Err(e) = instancecmd::run(agent, sup).await {
tracing::error!("instance command handler failed: {e:#}");
}
}));
}
handles.push(tokio::spawn(instancecmd::publish_state_changes(
agent.clone(),
sup.clone(),
)));
}
wait_for_shutdown_signal().await; wait_for_shutdown_signal().await;
tracing::info!("shutdown signal received"); tracing::info!("shutdown signal received");

View File

@@ -0,0 +1,278 @@
//! Per-instance game-server process supervision.
//!
//! One `ProcessSupervisor` per process-managed instance. Lifecycle mirrors the
//! proven Go agent behavior — graceful SIGTERM with a 30s budget before force
//! kill, a monitor task that reaps the child and records crash-vs-stop — with
//! two fixes the Go version needed: args are a proper list (no naive space
//! splitting), and every state change is observable through a watch channel
//! so the panel gets push events instead of waiting for the next heartbeat.
use anyhow::{bail, Context, Result};
use serde::Serialize;
use std::path::PathBuf;
use std::process::Stdio;
use std::sync::Arc;
use std::time::{Duration, Instant};
use tokio::process::{Child, Command};
use tokio::sync::{watch, Mutex};
use crate::config::InstanceConfig;
const GRACEFUL_STOP_BUDGET: Duration = Duration::from_secs(30);
const RESTART_PAUSE: Duration = Duration::from_secs(2);
#[derive(Debug, Clone, PartialEq, Serialize)]
#[serde(rename_all = "snake_case", tag = "state")]
pub enum InstanceState {
/// Not process-managed (no executable configured).
Unmanaged,
Stopped,
Starting,
Running,
Stopping,
/// Process exited without a stop request.
Crashed {
#[serde(skip_serializing_if = "Option::is_none")]
exit_code: Option<i32>,
},
}
impl InstanceState {
pub fn as_label(&self) -> &'static str {
match self {
InstanceState::Unmanaged => "unmanaged",
InstanceState::Stopped => "stopped",
InstanceState::Starting => "starting",
InstanceState::Running => "running",
InstanceState::Stopping => "stopping",
InstanceState::Crashed { .. } => "crashed",
}
}
}
struct Inner {
child: Option<Child>,
started_at: Option<Instant>,
/// True while a stop was requested — the monitor uses it to distinguish
/// an ordered shutdown from a crash.
stop_requested: bool,
}
pub struct ProcessSupervisor {
pub instance_id: String,
executable: Option<PathBuf>,
args: Vec<String>,
working_dir: Option<PathBuf>,
inner: Mutex<Inner>,
state_tx: watch::Sender<InstanceState>,
}
impl ProcessSupervisor {
pub fn new(cfg: &InstanceConfig) -> Arc<Self> {
let executable = cfg.resolved_executable();
let initial = if executable.is_some() {
InstanceState::Stopped
} else {
InstanceState::Unmanaged
};
let (state_tx, _) = watch::channel(initial);
Arc::new(Self {
instance_id: cfg.id.clone(),
executable,
args: cfg.args.clone(),
working_dir: cfg.working_dir.clone(),
inner: Mutex::new(Inner {
child: None,
started_at: None,
stop_requested: false,
}),
state_tx,
})
}
pub fn state(&self) -> InstanceState {
self.state_tx.borrow().clone()
}
pub fn watch_state(&self) -> watch::Receiver<InstanceState> {
self.state_tx.subscribe()
}
pub async fn uptime_seconds(&self) -> u64 {
let inner = self.inner.lock().await;
match (&*self.state_tx.borrow(), inner.started_at) {
(InstanceState::Running, Some(t)) => t.elapsed().as_secs(),
_ => 0,
}
}
pub async fn start(self: &Arc<Self>) -> Result<()> {
let Some(exe) = self.executable.clone() else {
bail!("instance '{}' has no executable configured", self.instance_id);
};
if !exe.exists() {
bail!("executable not found: {}", exe.display());
}
let mut inner = self.inner.lock().await;
if matches!(*self.state_tx.borrow(), InstanceState::Running | InstanceState::Starting) {
bail!("instance '{}' is already running", self.instance_id);
}
self.set_state(InstanceState::Starting);
let workdir = self
.working_dir
.clone()
.or_else(|| exe.parent().map(|p| p.to_path_buf()))
.unwrap_or_else(|| PathBuf::from("."));
let child = Command::new(&exe)
.args(&self.args)
.current_dir(&workdir)
.stdin(Stdio::null())
.stdout(Stdio::inherit())
.stderr(Stdio::inherit())
.spawn()
.with_context(|| format!("spawning {}", exe.display()))?;
let pid = child.id();
inner.child = Some(child);
inner.started_at = Some(Instant::now());
inner.stop_requested = false;
drop(inner);
self.set_state(InstanceState::Running);
tracing::info!(
"instance '{}' started: {} (pid {:?})",
self.instance_id,
exe.display(),
pid
);
// Monitor: reap the child and classify the exit.
let sup = Arc::clone(self);
tokio::spawn(async move { sup.monitor().await });
Ok(())
}
async fn monitor(self: Arc<Self>) {
// Take a waiter without holding the lock across the whole child
// lifetime: Child::wait needs &mut, so the child stays in inner and
// we poll it.
loop {
let status = {
let mut inner = self.inner.lock().await;
let Some(child) = inner.child.as_mut() else { return };
match child.try_wait() {
Ok(Some(status)) => Some(status),
Ok(None) => None,
Err(e) => {
tracing::error!("instance '{}' wait failed: {e}", self.instance_id);
return;
}
}
};
match status {
Some(status) => {
let mut inner = self.inner.lock().await;
inner.child = None;
inner.started_at = None;
let ordered = inner.stop_requested;
inner.stop_requested = false;
drop(inner);
if ordered {
self.set_state(InstanceState::Stopped);
tracing::info!("instance '{}' stopped ({status})", self.instance_id);
} else {
let exit_code = status.code();
self.set_state(InstanceState::Crashed { exit_code });
tracing::warn!(
"instance '{}' exited unexpectedly ({status}) — marked crashed",
self.instance_id
);
}
return;
}
None => tokio::time::sleep(Duration::from_millis(500)).await,
}
}
}
pub async fn stop(self: &Arc<Self>) -> Result<()> {
let mut inner = self.inner.lock().await;
if inner.child.is_none() {
bail!("instance '{}' is not running", self.instance_id);
}
inner.stop_requested = true;
self.set_state(InstanceState::Stopping);
let child = inner.child.as_mut().expect("checked above");
// Graceful first: SIGTERM on unix; Windows has no SIGTERM equivalent
// for console processes, so it goes straight to kill there.
#[cfg(unix)]
if let Some(pid) = child.id() {
unsafe {
libc::kill(pid as i32, libc::SIGTERM);
}
}
#[cfg(not(unix))]
{
let _ = child.start_kill();
}
drop(inner);
// Wait for the monitor to observe the exit; force kill on budget.
let mut rx = self.watch_state();
let deadline = tokio::time::timeout(GRACEFUL_STOP_BUDGET, async {
loop {
if matches!(*rx.borrow(), InstanceState::Stopped) {
return;
}
if rx.changed().await.is_err() {
return;
}
}
})
.await;
if deadline.is_err() {
tracing::warn!(
"instance '{}' ignored SIGTERM for {}s — force killing",
self.instance_id,
GRACEFUL_STOP_BUDGET.as_secs()
);
let mut inner = self.inner.lock().await;
if let Some(child) = inner.child.as_mut() {
let _ = child.start_kill();
}
drop(inner);
let mut rx = self.watch_state();
let _ = tokio::time::timeout(Duration::from_secs(5), async {
while !matches!(*rx.borrow(), InstanceState::Stopped) {
if rx.changed().await.is_err() {
break;
}
}
})
.await;
}
Ok(())
}
pub async fn restart(self: &Arc<Self>) -> Result<()> {
if !matches!(*self.state_tx.borrow(), InstanceState::Stopped | InstanceState::Crashed { .. } | InstanceState::Unmanaged) {
self.stop().await?;
}
tokio::time::sleep(RESTART_PAUSE).await;
self.start().await
}
fn set_state(&self, state: InstanceState) {
// send_replace never fails even with zero receivers.
let _ = self.state_tx.send_replace(state);
}
}

View File

@@ -0,0 +1,320 @@
//! RCON client: game-server remote-console over WebRCON (Rust) or Source RCON (Conan/Soulmask).
//!
//! The agent runs co-located with the game server, so every connection targets
//! 127.0.0.1 — no TLS is needed and latency is sub-millisecond. Two protocols
//! are supported because the Rust game ships its own WebSocket-based WebRCON
//! while Conan Exiles and Soulmask use the Valve Source RCON wire format over
//! plain TCP.
//!
//! The protocol selection is explicit in the config (`kind`) but can be inferred
//! from the game name when absent — callers supply the `game` field they already
//! have in `InstanceConfig`.
use anyhow::{bail, Context, Result};
use futures::{SinkExt, StreamExt};
use rand::Rng;
use serde::Deserialize;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;
use tokio::time::{timeout, Duration};
/// WebRCON is the Facepunch WebSocket protocol (Rust game).
/// Source RCON is the Valve wire protocol used by Conan Exiles and Soulmask.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum RconKind {
WebRcon,
Source,
}
#[derive(Debug, Clone, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct RconConfig {
/// Protocol override. When absent the kind is resolved from `game`.
#[serde(default)]
pub kind: Option<RconKind>,
pub port: u16,
pub password: String,
}
impl RconConfig {
/// Resolve the concrete protocol, falling back to a per-game default when
/// `kind` is not set. rust → WebRcon; conan + soulmask → Source.
pub fn resolved_kind(&self, game: &str) -> RconKind {
if let Some(k) = self.kind {
return k;
}
match game {
"conan" | "soulmask" => RconKind::Source,
// rust is the primary game; anything unknown defaults to WebRcon
// — operators can always override with an explicit `kind`.
_ => RconKind::WebRcon,
}
}
}
const CONNECT_TIMEOUT: Duration = Duration::from_secs(5);
const RESPONSE_TIMEOUT: Duration = Duration::from_secs(10);
/// Send `command` to the game server and return its text response.
///
/// The agent runs on the same host as the game server, so the target address
/// is always 127.0.0.1:{port}. Connection and response deadlines are fixed at
/// 5 s and 10 s respectively — enough headroom for a loaded server while still
/// catching hung connections quickly.
pub async fn send_command(cfg: &RconConfig, game: &str, command: &str) -> Result<String> {
match cfg.resolved_kind(game) {
RconKind::WebRcon => webrcon_exec(cfg, command).await,
RconKind::Source => source_rcon_exec(cfg, command).await,
}
}
// ---------------------------------------------------------------------------
// WebRCON (Rust game) — WebSocket JSON protocol
// ---------------------------------------------------------------------------
/// WebRCON request/response envelope. The server also emits chat/log frames
/// on this socket with Identifier == 0; those are skipped.
#[derive(serde::Serialize)]
struct WebRconRequest<'a> {
#[serde(rename = "Identifier")]
identifier: i32,
#[serde(rename = "Message")]
message: &'a str,
#[serde(rename = "Name")]
name: &'static str,
}
#[derive(serde::Deserialize)]
struct WebRconResponse {
#[serde(rename = "Identifier")]
identifier: i32,
#[serde(rename = "Message")]
message: String,
}
async fn webrcon_exec(cfg: &RconConfig, command: &str) -> Result<String> {
use tokio_tungstenite::connect_async;
use tokio_tungstenite::tungstenite::Message as WsMsg;
// The Rust game server embeds the password in the WebSocket URL path —
// never interpolate the real URL into errors or logs.
let url = format!("ws://127.0.0.1:{}/{}", cfg.port, cfg.password);
let redacted = format!("ws://127.0.0.1:{}/<redacted>", cfg.port);
// Wrap the entire connection + exchange in the connect timeout — we want
// the timeout to cover TCP handshake + WS upgrade, not just the send.
let (mut ws, _) = timeout(CONNECT_TIMEOUT, connect_async(&url))
.await
.context("connect timeout")?
.with_context(|| format!("WebRCON connect to {redacted}"))?;
// Use a random positive i32 so correlation is unambiguous even when
// multiple callers share a port (future concurrency).
let id: i32 = rand::thread_rng().gen_range(1..=i32::MAX);
let req = WebRconRequest { identifier: id, message: command, name: "Corrosion" };
let payload = serde_json::to_string(&req).context("serialize WebRCON request")?;
ws.send(WsMsg::Text(payload))
.await
.context("send WebRCON command")?;
tracing::debug!("WebRCON sent id={id} command={command:?}");
// Read frames until we see our Identifier — skip chat/log noise (id 0 or
// any other value that isn't ours).
let result = timeout(RESPONSE_TIMEOUT, async {
loop {
match ws.next().await {
Some(Ok(WsMsg::Text(text))) => {
match serde_json::from_str::<WebRconResponse>(&text) {
Ok(resp) if resp.identifier == id => return Ok(resp.message),
Ok(_) => {
// Not our response (chat, log, another caller's frame).
tracing::trace!("WebRCON skipping frame with different Identifier");
continue;
}
Err(e) => {
tracing::trace!("WebRCON non-JSON frame ignored: {e}");
continue;
}
}
}
Some(Ok(WsMsg::Close(_))) => bail!("WebRCON server closed connection"),
Some(Ok(_)) => continue, // binary/ping/pong — skip
Some(Err(e)) => return Err(anyhow::anyhow!(e).context("WebRCON read error")),
None => bail!("WebRCON stream ended without response"),
}
}
})
.await
.context("WebRCON response timeout")??;
// Close cleanly; a send error here is cosmetic — we already have our data.
let _ = ws.close(None).await;
Ok(result)
}
// ---------------------------------------------------------------------------
// Source RCON (Conan Exiles, Soulmask) — Valve TCP binary protocol
//
// Packet layout (all fields little-endian):
// i32 size — byte count of the remaining packet (id + type + body + 2 nulls)
// i32 id — caller-chosen correlation id; auth failure returns -1
// i32 type — 0=RESPONSE_VALUE, 2=EXECCOMMAND/AUTH_RESPONSE, 3=AUTH
// [u8] body — UTF-8 command or response text
// u8 0x00 — body null terminator
// u8 0x00 — padding null terminator
//
// Multi-packet handling: after sending the command we also send an empty
// RESPONSE_VALUE probe with a distinct id. We collect all RESPONSE_VALUE
// packets belonging to the command id and stop when we receive the probe's
// response. This is the standard technique specified in the Valve wiki.
// ---------------------------------------------------------------------------
const RCON_TYPE_AUTH: i32 = 3;
const RCON_TYPE_AUTH_RESPONSE: i32 = 2;
const RCON_TYPE_EXECCOMMAND: i32 = 2;
const RCON_TYPE_RESPONSE_VALUE: i32 = 0;
/// Maximum accumulated response body (guards against misbehaving servers).
const MAX_RESPONSE_BYTES: usize = 1024 * 1024; // 1 MiB
async fn source_rcon_exec(cfg: &RconConfig, command: &str) -> Result<String> {
let addr = format!("127.0.0.1:{}", cfg.port);
let stream = timeout(CONNECT_TIMEOUT, TcpStream::connect(&addr))
.await
.context("connect timeout")?
.with_context(|| format!("Source RCON connect to {addr}"))?;
let mut stream = stream;
// --- Auth ---
let auth_id: i32 = rand::thread_rng().gen_range(1..=i32::MAX);
send_packet(&mut stream, auth_id, RCON_TYPE_AUTH, cfg.password.as_bytes()).await?;
// The server sends two responses to AUTH: first an empty RESPONSE_VALUE,
// then an AUTH_RESPONSE. We skip the first and read until AUTH_RESPONSE.
timeout(RESPONSE_TIMEOUT, async {
loop {
let (id, ptype, _body) = recv_packet(&mut stream).await?;
if ptype == RCON_TYPE_AUTH_RESPONSE {
if id == -1 {
bail!("Source RCON auth failed: wrong password");
}
tracing::debug!("Source RCON authenticated (id={id})");
return Ok(());
}
// Skip the empty RESPONSE_VALUE that precedes AUTH_RESPONSE.
}
#[allow(unreachable_code)]
Ok::<(), anyhow::Error>(())
})
.await
.context("Source RCON auth timeout")??;
// --- Command ---
let cmd_id: i32 = rand::thread_rng().gen_range(1..=i32::MAX);
// Probe id must differ from cmd_id.
let probe_id: i32 = loop {
let id: i32 = rand::thread_rng().gen_range(1..=i32::MAX);
if id != cmd_id {
break id;
}
};
send_packet(&mut stream, cmd_id, RCON_TYPE_EXECCOMMAND, command.as_bytes()).await?;
// Empty RESPONSE_VALUE probe — the server echoes it after processing the
// preceding command, signalling end-of-response.
send_packet(&mut stream, probe_id, RCON_TYPE_RESPONSE_VALUE, b"").await?;
// Not every server is probe-conformant (Soulmask unverified): once we hold
// response data, a short per-read quiet period also terminates — never
// discard a response we already received just because the probe echo
// didn't come back.
const QUIET_PERIOD: Duration = Duration::from_millis(1500);
let response = timeout(RESPONSE_TIMEOUT, async {
let mut body_accum: Vec<u8> = Vec::new();
loop {
let next = if body_accum.is_empty() {
recv_packet(&mut stream).await.map(Some)
} else {
match timeout(QUIET_PERIOD, recv_packet(&mut stream)).await {
Ok(res) => res.map(Some),
Err(_elapsed) => Ok(None), // quiet after data — done
}
};
let Some((id, ptype, body)) = next? else {
break;
};
if ptype != RCON_TYPE_RESPONSE_VALUE {
continue; // unexpected packet type — skip
}
if id == probe_id {
// Probe echoed back — all command response packets have arrived.
break;
}
if id == cmd_id {
if body_accum.len() + body.len() > MAX_RESPONSE_BYTES {
bail!("Source RCON response exceeded {MAX_RESPONSE_BYTES} bytes");
}
body_accum.extend_from_slice(&body);
}
// Skip packets with other ids (shouldn't happen but be defensive).
}
Ok::<Vec<u8>, anyhow::Error>(body_accum)
})
.await
.context("Source RCON response timeout")??;
String::from_utf8(response).context("Source RCON response is not valid UTF-8")
}
/// Write a Source RCON packet to the stream.
async fn send_packet(stream: &mut TcpStream, id: i32, ptype: i32, body: &[u8]) -> Result<()> {
// size = id(4) + type(4) + body(n) + 2 null terminators
let size = (4 + 4 + body.len() + 2) as i32;
let mut buf: Vec<u8> = Vec::with_capacity(4 + size as usize);
buf.extend_from_slice(&size.to_le_bytes());
buf.extend_from_slice(&id.to_le_bytes());
buf.extend_from_slice(&ptype.to_le_bytes());
buf.extend_from_slice(body);
buf.push(0x00);
buf.push(0x00);
stream.write_all(&buf).await.context("Source RCON write")?;
Ok(())
}
/// Read one Source RCON packet; returns (id, type, body).
async fn recv_packet(stream: &mut TcpStream) -> Result<(i32, i32, Vec<u8>)> {
let mut size_buf = [0u8; 4];
stream
.read_exact(&mut size_buf)
.await
.context("Source RCON read size")?;
let size = i32::from_le_bytes(size_buf) as usize;
// Minimum packet: id(4) + type(4) + 2 null terminators = 10 bytes.
if size < 10 {
bail!("Source RCON: malformed packet (size={size})");
}
if size > MAX_RESPONSE_BYTES + 16 {
bail!("Source RCON: packet too large ({size} bytes)");
}
let mut payload = vec![0u8; size];
stream
.read_exact(&mut payload)
.await
.context("Source RCON read payload")?;
let id = i32::from_le_bytes(payload[0..4].try_into().unwrap());
let ptype = i32::from_le_bytes(payload[4..8].try_into().unwrap());
// Body is everything between the two fields and the two trailing nulls.
let body_end = size.saturating_sub(2); // strip 2 null terminators
let body = payload[8..body_end].to_vec();
Ok((id, ptype, body))
}

View File

@@ -17,14 +17,12 @@ pub fn host_going_offline(license: &str) -> String {
format!("corrosion.{license}.host.going_offline") format!("corrosion.{license}.host.going_offline")
} }
/// Phase 1: per-instance command channel (start/stop/restart/rcon/...). /// Per-instance command channel (start/stop/restart/status; rcon et al. to come).
#[allow(dead_code)]
pub fn instance_cmd(license: &str, instance: &str) -> String { pub fn instance_cmd(license: &str, instance: &str) -> String {
format!("corrosion.{license}.{instance}.cmd") format!("corrosion.{license}.{instance}.cmd")
} }
/// Phase 1: per-instance state-change events. /// Per-instance state-change events.
#[allow(dead_code)]
pub fn instance_status(license: &str, instance: &str) -> String { pub fn instance_status(license: &str, instance: &str) -> String {
format!("corrosion.{license}.{instance}.status") format!("corrosion.{license}.{instance}.status")
} }

View File

@@ -65,9 +65,10 @@ pub struct InstanceInfo {
pub game: String, pub game: String,
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
pub label: Option<String>, pub label: Option<String>,
/// Phase 0 states: `configured` (root exists) or `missing_root`. /// Process-managed: running/stopped/starting/stopping/crashed.
/// Phase 1 adds live process states (running/stopped/crashed). /// Unmanaged (no executable configured): configured/missing_root.
pub state: String, pub state: String,
pub uptime_seconds: u64,
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
pub root_disk_free_mb: Option<u64>, pub root_disk_free_mb: Option<u64>,
} }
@@ -125,21 +126,30 @@ pub async fn collect(agent: &Agent, sys: &mut System) -> HeartbeatPayload {
}) })
.collect(); .collect();
let instances = agent let mut instances = Vec::with_capacity(agent.cfg.instances.len());
.cfg for inst in &agent.cfg.instances {
.instances let (state, uptime_seconds) = match agent.supervisors.get(&inst.id) {
.iter() Some(sup) if !matches!(sup.state(), crate::process::InstanceState::Unmanaged) => {
.map(|inst| { (sup.state().as_label().to_string(), sup.uptime_seconds().await)
}
_ => {
let exists = inst.root.exists(); let exists = inst.root.exists();
InstanceInfo { (
if exists { "configured" } else { "missing_root" }.to_string(),
0,
)
}
};
instances.push(InstanceInfo {
id: inst.id.clone(), id: inst.id.clone(),
game: inst.game.clone(), game: inst.game.clone(),
label: inst.label.clone(), label: inst.label.clone(),
state: if exists { "configured" } else { "missing_root" }.to_string(), state,
uptime_seconds,
root_disk_free_mb: disk_free_for_path(&disks, &inst.root), root_disk_free_mb: disk_free_for_path(&disks, &inst.root),
});
} }
}) let instances = instances;
.collect();
HeartbeatPayload { HeartbeatPayload {
schema: 2, schema: 2,

View File

@@ -0,0 +1,353 @@
//! RCON integration tests using in-process mock servers.
//!
//! Real OS sockets on ephemeral ports — no mocking framework. Each test
//! binds a listener, spawns a task that speaks the expected protocol, then
//! exercises `rcon::send_command` and asserts on the result. Tests are
//! unix-only because the musl cross-compile target and the CI runner are both
//! Linux; the production use case is also Linux-only (game servers don't run
//! on macOS or Windows in production).
//!
//! We use `#[cfg(unix)]` to keep parity with the supervisor integration tests.
#![cfg(unix)]
use corrosion_host_agent::rcon::{RconConfig, RconKind};
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};
// ---------------------------------------------------------------------------
// Source RCON helpers — duplicate the wire-format encode/decode locally so
// the tests own the mock server without depending on the production code path.
// ---------------------------------------------------------------------------
/// Build a Source RCON packet: [size(4LE) | id(4LE) | type(4LE) | body | 0x00 0x00]
fn encode_packet(id: i32, ptype: i32, body: &[u8]) -> Vec<u8> {
let size = (4 + 4 + body.len() + 2) as i32;
let mut out = Vec::with_capacity(4 + size as usize);
out.extend_from_slice(&size.to_le_bytes());
out.extend_from_slice(&id.to_le_bytes());
out.extend_from_slice(&ptype.to_le_bytes());
out.extend_from_slice(body);
out.push(0x00);
out.push(0x00);
out
}
/// Read one Source RCON packet from a TcpStream.
async fn read_packet(stream: &mut TcpStream) -> (i32, i32, Vec<u8>) {
let mut size_buf = [0u8; 4];
stream.read_exact(&mut size_buf).await.unwrap();
let size = i32::from_le_bytes(size_buf) as usize;
let mut payload = vec![0u8; size];
stream.read_exact(&mut payload).await.unwrap();
let id = i32::from_le_bytes(payload[0..4].try_into().unwrap());
let ptype = i32::from_le_bytes(payload[4..8].try_into().unwrap());
let body_end = size.saturating_sub(2);
let body = payload[8..body_end].to_vec();
(id, ptype, body)
}
const SOURCE_TYPE_AUTH: i32 = 3;
const SOURCE_TYPE_AUTH_RESPONSE: i32 = 2;
const SOURCE_TYPE_EXECCOMMAND: i32 = 2;
const SOURCE_TYPE_RESPONSE_VALUE: i32 = 0;
// ---------------------------------------------------------------------------
// Mock Source RCON server
// ---------------------------------------------------------------------------
/// Run a Source RCON server that accepts password "goodpw", rejects others,
/// and responds to the first EXECCOMMAND with `response_body`.
///
/// If `split_at` is Some(n) the body is split: the first `n` bytes arrive in
/// one RESPONSE_VALUE packet and the remainder in a second — testing multi-
/// packet reassembly.
async fn run_source_mock(
mut stream: TcpStream,
accept_password: &str,
command_response: &[u8],
split_at: Option<usize>,
) {
// --- Auth phase ---
let (auth_id, ptype, body) = read_packet(&mut stream).await;
assert_eq!(ptype, SOURCE_TYPE_AUTH, "expected AUTH packet");
let password = String::from_utf8_lossy(&body);
if password != accept_password {
// Send empty RESPONSE_VALUE then AUTH_RESPONSE with id = -1 (failure).
let empty = encode_packet(auth_id, SOURCE_TYPE_RESPONSE_VALUE, b"");
stream.write_all(&empty).await.unwrap();
let fail = encode_packet(-1, SOURCE_TYPE_AUTH_RESPONSE, b"");
stream.write_all(&fail).await.unwrap();
return;
}
// Success: empty RESPONSE_VALUE then AUTH_RESPONSE with the auth id.
let empty = encode_packet(auth_id, SOURCE_TYPE_RESPONSE_VALUE, b"");
stream.write_all(&empty).await.unwrap();
let ok = encode_packet(auth_id, SOURCE_TYPE_AUTH_RESPONSE, b"");
stream.write_all(&ok).await.unwrap();
// --- Command phase ---
let (cmd_id, cmd_ptype, _cmd_body) = read_packet(&mut stream).await;
assert_eq!(cmd_ptype, SOURCE_TYPE_EXECCOMMAND, "expected EXECCOMMAND");
// Read the probe packet (empty RESPONSE_VALUE with a different id).
let (probe_id, probe_ptype, _) = read_packet(&mut stream).await;
assert_eq!(probe_ptype, SOURCE_TYPE_RESPONSE_VALUE, "expected probe packet");
// Send the command response, optionally split across two packets.
if let Some(n) = split_at {
let (part1, part2) = command_response.split_at(n.min(command_response.len()));
let p1 = encode_packet(cmd_id, SOURCE_TYPE_RESPONSE_VALUE, part1);
stream.write_all(&p1).await.unwrap();
let p2 = encode_packet(cmd_id, SOURCE_TYPE_RESPONSE_VALUE, part2);
stream.write_all(&p2).await.unwrap();
} else {
let p = encode_packet(cmd_id, SOURCE_TYPE_RESPONSE_VALUE, command_response);
stream.write_all(&p).await.unwrap();
}
// Echo the probe to signal end-of-response.
let probe_echo = encode_packet(probe_id, SOURCE_TYPE_RESPONSE_VALUE, b"");
stream.write_all(&probe_echo).await.unwrap();
}
// ---------------------------------------------------------------------------
// Source RCON tests
// ---------------------------------------------------------------------------
#[tokio::test]
async fn source_rcon_auth_and_exec_returns_response() {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let port = listener.local_addr().unwrap().port();
tokio::spawn(async move {
let (stream, _) = listener.accept().await.unwrap();
run_source_mock(stream, "goodpw", b"Hello from server", None).await;
});
let cfg = RconConfig { kind: Some(RconKind::Source), port, password: "goodpw".to_string() };
let result = corrosion_host_agent::rcon::send_command(&cfg, "conan", "status")
.await
.expect("command should succeed");
assert_eq!(result, "Hello from server");
}
#[tokio::test]
async fn source_rcon_wrong_password_returns_auth_error() {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let port = listener.local_addr().unwrap().port();
tokio::spawn(async move {
let (stream, _) = listener.accept().await.unwrap();
run_source_mock(stream, "goodpw", b"should not see this", None).await;
});
let cfg = RconConfig { kind: Some(RconKind::Source), port, password: "wrongpw".to_string() };
let err = corrosion_host_agent::rcon::send_command(&cfg, "conan", "status")
.await
.expect_err("wrong password should fail");
assert!(
err.to_string().to_lowercase().contains("auth"),
"error should mention auth failure, got: {err}"
);
}
#[tokio::test]
async fn source_rcon_multi_packet_response_concatenated() {
// Build a body large enough to split meaningfully across two packets.
// Use repeating ASCII so the result is valid UTF-8 and easy to verify.
// 200 'A's then 200 'B's = 400 bytes, split at 200.
let body: Vec<u8> = std::iter::repeat_n(b'A', 200)
.chain(std::iter::repeat_n(b'B', 200))
.collect();
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let port = listener.local_addr().unwrap().port();
let body_clone = body.clone();
tokio::spawn(async move {
let (stream, _) = listener.accept().await.unwrap();
run_source_mock(stream, "goodpw", &body_clone, Some(200)).await;
});
let cfg = RconConfig { kind: Some(RconKind::Source), port, password: "goodpw".to_string() };
let result = corrosion_host_agent::rcon::send_command(&cfg, "soulmask", "showplayers")
.await
.expect("multi-packet command should succeed");
let expected = String::from_utf8(body).unwrap();
assert_eq!(result, expected, "full body should be concatenated across both packets");
}
#[tokio::test]
async fn source_rcon_connect_timeout_to_unreachable_port() {
// Bind a listener but never accept — the connection will time out during
// the RCON auth phase because nothing is reading from the socket.
// We use a port that is bound (so TCP connect itself succeeds) but then
// the mock simply drops the stream, forcing a read error, which should
// surface as an error (not a panic or hang).
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let port = listener.local_addr().unwrap().port();
// Accept the TCP connection but immediately drop it — simulates a port
// that accepts but never speaks RCON.
tokio::spawn(async move {
let (_stream, _) = listener.accept().await.unwrap();
// _stream dropped here — EOF on the client's read
});
let cfg =
RconConfig { kind: Some(RconKind::Source), port, password: "goodpw".to_string() };
let err = corrosion_host_agent::rcon::send_command(&cfg, "conan", "status")
.await
.expect_err("closed connection should fail");
// We just need it to fail and not hang; error message varies by OS.
let _ = err;
}
// ---------------------------------------------------------------------------
// WebRCON mock server
// ---------------------------------------------------------------------------
/// Run a WebRCON mock: send one noise frame (Identifier 0), then respond to
/// the first real request with the given output.
async fn run_webrcon_mock(stream: tokio::net::TcpStream, output: &str) {
use futures::{SinkExt, StreamExt};
use tokio_tungstenite::accept_async;
use tokio_tungstenite::tungstenite::Message as WsMsg;
let mut ws = accept_async(stream).await.expect("WS handshake failed");
// Send noise (chat frame, Identifier 0) before the real request arrives.
let noise = serde_json::json!({
"Identifier": 0,
"Message": "Player X joined",
"Name": "Server",
"Type": "Chat"
});
ws.send(WsMsg::Text(noise.to_string()))
.await
.unwrap();
// Read the command request.
let msg = ws.next().await.unwrap().unwrap();
let text = match msg {
WsMsg::Text(t) => t,
other => panic!("expected Text frame, got {other:?}"),
};
let req: serde_json::Value = serde_json::from_str(&text).unwrap();
let req_id = req["Identifier"].as_i64().unwrap() as i32;
// Reply with the same Identifier so the client correlates correctly.
let reply = serde_json::json!({
"Identifier": req_id,
"Message": output,
"Type": "Generic",
});
ws.send(WsMsg::Text(reply.to_string())).await.unwrap();
}
// ---------------------------------------------------------------------------
// WebRCON tests
// ---------------------------------------------------------------------------
#[tokio::test]
async fn webrcon_skips_noise_and_returns_correct_message() {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let port = listener.local_addr().unwrap().port();
tokio::spawn(async move {
let (stream, _) = listener.accept().await.unwrap();
run_webrcon_mock(stream, "Players: 42/100").await;
});
// Password is embedded in the URL path — any non-empty string works with
// our mock.
let cfg = RconConfig {
kind: Some(RconKind::WebRcon),
port,
password: "testpw".to_string(),
};
let result = corrosion_host_agent::rcon::send_command(&cfg, "rust", "playercount")
.await
.expect("WebRCON command should succeed");
assert_eq!(result, "Players: 42/100");
}
// ---------------------------------------------------------------------------
// TOML parsing test — pins [[instance]] + [instance.rcon] sub-table syntax
// ---------------------------------------------------------------------------
#[test]
fn toml_instance_with_rcon_parses_correctly() {
let toml = r#"
[agent]
license_id = "test-license"
nats_url = "nats://localhost:4222"
[[instance]]
id = "rust-main"
game = "rust"
root = "/opt/rustserver"
[instance.rcon]
port = 28016
password = "secretpassword"
kind = "webrcon"
"#;
let cfg: corrosion_host_agent::config::ConfigFile =
toml::from_str(toml).expect("TOML should parse");
assert_eq!(cfg.instances.len(), 1);
let inst = &cfg.instances[0];
assert_eq!(inst.id, "rust-main");
let rcon = inst.rcon.as_ref().expect("rcon should be present");
assert_eq!(rcon.port, 28016);
assert_eq!(rcon.password, "secretpassword");
assert_eq!(rcon.kind, Some(corrosion_host_agent::rcon::RconKind::WebRcon));
}
#[test]
fn toml_instance_without_rcon_defaults_to_none() {
let toml = r#"
[agent]
license_id = "test-license"
nats_url = "nats://localhost:4222"
[[instance]]
id = "conan-main"
game = "conan"
root = "/opt/conan"
"#;
let cfg: corrosion_host_agent::config::ConfigFile =
toml::from_str(toml).expect("TOML should parse");
assert!(cfg.instances[0].rcon.is_none(), "absent rcon should be None");
}
#[test]
fn resolved_kind_infers_from_game_name() {
use corrosion_host_agent::rcon::{RconConfig, RconKind};
let cfg_no_kind = RconConfig { kind: None, port: 28016, password: "x".to_string() };
assert_eq!(cfg_no_kind.resolved_kind("rust"), RconKind::WebRcon);
assert_eq!(cfg_no_kind.resolved_kind("conan"), RconKind::Source);
assert_eq!(cfg_no_kind.resolved_kind("soulmask"), RconKind::Source);
assert_eq!(cfg_no_kind.resolved_kind("dune"), RconKind::WebRcon); // fallback
// Explicit kind always wins.
let cfg_source = RconConfig { kind: Some(RconKind::Source), ..cfg_no_kind.clone() };
assert_eq!(cfg_source.resolved_kind("rust"), RconKind::Source);
let cfg_webrcon = RconConfig { kind: Some(RconKind::WebRcon), ..cfg_no_kind };
assert_eq!(cfg_webrcon.resolved_kind("conan"), RconKind::WebRcon);
}

View File

@@ -0,0 +1,108 @@
//! Process supervisor integration tests using real OS processes.
//! Unix-only test doubles (/bin/sleep, /bin/sh) — the supervisor logic under
//! test is platform-shared; Windows-specific stop semantics get covered when
//! the Windows service work lands.
#![cfg(unix)]
use std::path::PathBuf;
use std::time::Duration;
use corrosion_host_agent::config::InstanceConfig;
use corrosion_host_agent::process::{InstanceState, ProcessSupervisor};
fn managed_instance(executable: &str, args: &[&str]) -> InstanceConfig {
InstanceConfig {
id: "test-instance".to_string(),
game: "rust".to_string(),
root: PathBuf::from("/tmp"),
label: None,
executable: Some(PathBuf::from(executable)),
args: args.iter().map(|s| s.to_string()).collect(),
working_dir: None,
rcon: None,
}
}
async fn wait_for_state(
sup: &std::sync::Arc<ProcessSupervisor>,
want: fn(&InstanceState) -> bool,
budget: Duration,
) -> InstanceState {
let deadline = tokio::time::Instant::now() + budget;
loop {
let state = sup.state();
if want(&state) {
return state;
}
if tokio::time::Instant::now() > deadline {
panic!("timed out waiting for state; last = {state:?}");
}
tokio::time::sleep(Duration::from_millis(100)).await;
}
}
#[tokio::test]
async fn start_status_stop_lifecycle() {
let sup = ProcessSupervisor::new(&managed_instance("/bin/sleep", &["300"]));
assert_eq!(sup.state(), InstanceState::Stopped);
sup.start().await.expect("start should succeed");
assert_eq!(sup.state(), InstanceState::Running);
tokio::time::sleep(Duration::from_millis(1100)).await;
assert!(sup.uptime_seconds().await >= 1, "uptime should advance");
// Double-start must be rejected while running.
assert!(sup.start().await.is_err(), "double start must fail");
sup.stop().await.expect("stop should succeed");
let state = wait_for_state(&sup, |s| matches!(s, InstanceState::Stopped), Duration::from_secs(5)).await;
assert_eq!(state, InstanceState::Stopped);
assert_eq!(sup.uptime_seconds().await, 0);
}
#[tokio::test]
async fn unexpected_exit_is_crashed_with_code() {
let sup = ProcessSupervisor::new(&managed_instance("/bin/sh", &["-c", "sleep 0.2; exit 7"]));
sup.start().await.expect("start should succeed");
let state = wait_for_state(
&sup,
|s| matches!(s, InstanceState::Crashed { .. }),
Duration::from_secs(5),
)
.await;
assert_eq!(state, InstanceState::Crashed { exit_code: Some(7) });
}
#[tokio::test]
async fn restart_from_crashed_recovers() {
let sup = ProcessSupervisor::new(&managed_instance("/bin/sh", &["-c", "exit 1"]));
sup.start().await.expect("start should succeed");
wait_for_state(&sup, |s| matches!(s, InstanceState::Crashed { .. }), Duration::from_secs(5)).await;
// Restart from crashed must work (panel "Restart" after a crash).
// Use a long-lived command this time by replacing the supervisor — the
// command is fixed per supervisor, so emulate via a fresh one.
let sup2 = ProcessSupervisor::new(&managed_instance("/bin/sleep", &["300"]));
sup2.restart().await.expect("restart from stopped should start");
assert_eq!(sup2.state(), InstanceState::Running);
sup2.stop().await.expect("cleanup stop");
}
#[tokio::test]
async fn unmanaged_instance_rejects_process_commands() {
let mut cfg = managed_instance("/bin/sleep", &["300"]);
cfg.executable = None;
let sup = ProcessSupervisor::new(&cfg);
assert_eq!(sup.state(), InstanceState::Unmanaged);
assert!(sup.start().await.is_err(), "unmanaged start must fail");
assert!(sup.stop().await.is_err(), "unmanaged stop must fail");
}
#[tokio::test]
async fn missing_executable_fails_cleanly() {
let sup = ProcessSupervisor::new(&managed_instance("/nonexistent/bin/gameserver", &[]));
let err = sup.start().await.expect_err("must fail");
assert!(err.to_string().contains("not found"), "error should say not found: {err}");
assert_eq!(sup.state(), InstanceState::Stopped, "failed start must not leave Starting state");
}

View File

@@ -87,7 +87,10 @@ services:
api: api:
condition: service_started condition: service_started
healthcheck: healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:80/ || exit 1"] # 127.0.0.1, not localhost: nginx listens IPv4-only (0.0.0.0:80) but
# `localhost` resolves to ::1 first inside the container → the probe hit
# nothing and reported unhealthy while the panel served fine on IPv4.
test: ["CMD-SHELL", "wget -q --spider http://127.0.0.1:80/ || exit 1"]
interval: 10s interval: 10s
timeout: 5s timeout: 5s
retries: 3 retries: 3

View File

@@ -9,6 +9,9 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="theme-color" content="#0a0a0a" /> <meta name="theme-color" content="#0a0a0a" />
<title>Corrosion Management</title> <title>Corrosion Management</title>
<meta name="description" content="Management panel for self-hosted survival game servers — Rust, Dune: Awakening, Conan Exiles, Soulmask. Wipe automation, plugins, monitoring. Bring your own server." />
<meta property="og:title" content="Corrosion — Game Server Operations for Self-Hosted Communities" />
<meta property="og:description" content="Management panel for self-hosted survival game servers — Rust, Dune: Awakening, Conan Exiles, Soulmask. Wipe automation, plugins, monitoring. Bring your own server." />
<!-- Fonts via <link>, NOT a CSS @import — the bundler drops @import rules <!-- Fonts via <link>, NOT a CSS @import — the bundler drops @import rules
that land mid-file after concatenation, silently shipping system fonts --> that land mid-file after concatenation, silently shipping system fonts -->
<link rel="preconnect" href="https://fonts.googleapis.com" /> <link rel="preconnect" href="https://fonts.googleapis.com" />

View File

@@ -1,7 +1,14 @@
<script setup lang="ts"> <script setup lang="ts">
import { onMounted } from 'vue'
import { RouterView } from 'vue-router' import { RouterView } from 'vue-router'
import ToastNotification from '@/components/ToastNotification.vue' import ToastNotification from '@/components/ToastNotification.vue'
import ErrorBoundary from '@/components/ErrorBoundary.vue' import ErrorBoundary from '@/components/ErrorBoundary.vue'
import { useAuthStore } from '@/stores/auth'
// Validate any persisted session against the API on boot — a stale token
// should bounce to login immediately, not after the first failed call.
const auth = useAuthStore()
onMounted(() => { void auth.validateSession() })
</script> </script>
<template> <template>

View File

@@ -1,11 +1,28 @@
import { createRouter, createWebHistory, type RouteRecordRaw } from 'vue-router' import { createRouter, createWebHistory, type RouteRecordRaw } from 'vue-router'
import { useAuthStore } from '@/stores/auth' import { useAuthStore } from '@/stores/auth'
// Extend vue-router's RouteMeta so title/description are typed throughout
declare module 'vue-router' {
interface RouteMeta {
title?: string
description?: string
requiresAuth?: boolean
guest?: boolean
superAdmin?: boolean
}
}
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// Domain detection — runs once at module load // Domain detection — runs once at module load
// Env-driven so www./staging hosts route correctly; an exact-match literal
// here once meant any non-canonical marketing host silently got the panel.
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
const hostname = typeof window !== 'undefined' ? window.location.hostname : '' const hostname = typeof window !== 'undefined' ? window.location.hostname : ''
const isMarketingDomain = hostname === 'corrosionmgmt.com' const marketingHosts = (import.meta.env.VITE_MARKETING_HOSTS ?? 'corrosionmgmt.com,www.corrosionmgmt.com')
.split(',')
.map((h: string) => h.trim().toLowerCase())
.filter(Boolean)
const isMarketingDomain = marketingHosts.includes(hostname.toLowerCase())
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// Marketing page children — shared between both domain route sets // Marketing page children — shared between both domain route sets
@@ -15,31 +32,55 @@ const marketingChildren: RouteRecordRaw[] = [
path: '', path: '',
name: 'landing', name: 'landing',
component: () => import('@/views/marketing/LandingView.vue'), component: () => import('@/views/marketing/LandingView.vue'),
meta: {
title: 'Corrosion — Game Server Operations for Self-Hosted Communities',
description: 'Management panel for self-hosted survival game servers — Rust, Dune: Awakening, Conan Exiles, Soulmask. Wipe automation, plugins, monitoring. Bring your own server.',
},
}, },
{ {
path: 'pricing', path: 'pricing',
name: 'pricing', name: 'pricing',
component: () => import('@/views/marketing/PricingView.vue'), component: () => import('@/views/marketing/PricingView.vue'),
meta: {
title: 'Pricing — Corrosion',
description: 'Plans from $9.99/mo (Hobby, 15 servers) to Network ($99.99+/mo, 50+ servers). Non-commercial and commercial tiers. No hosting fees — bring your own server.',
},
}, },
{ {
path: 'how-it-works', path: 'how-it-works',
name: 'how-it-works', name: 'how-it-works',
component: () => import('@/views/marketing/HowItWorksView.vue'), component: () => import('@/views/marketing/HowItWorksView.vue'),
meta: {
title: 'How It Works — Corrosion',
description: 'Install one host agent on Windows or Linux. It connects outbound-only to Corrosion — no inbound ports, no SSH. Manage every game instance from the browser.',
},
}, },
{ {
path: 'faq', path: 'faq',
name: 'faq', name: 'faq',
component: () => import('@/views/marketing/FaqView.vue'), component: () => import('@/views/marketing/FaqView.vue'),
meta: {
title: 'FAQ — Corrosion',
description: 'Honest answers: Corrosion is self-service (BYOS, no hosting). Support is docs + community; 1:1 at $125/hr. Supports Rust, Dune, Conan Exiles, Soulmask.',
},
}, },
{ {
path: 'roadmap', path: 'roadmap',
name: 'roadmap', name: 'roadmap',
component: () => import('@/views/marketing/RoadmapView.vue'), component: () => import('@/views/marketing/RoadmapView.vue'),
meta: {
title: 'Roadmap — Corrosion',
description: 'Phase 1 shipped: core control plane, auto-wiper, plugin management. In progress: Dune, Conan, Soulmask multi-game blueprints. Planned: API access, integrations.',
},
}, },
{ {
path: 'early-access', path: 'early-access',
name: 'early-access', name: 'early-access',
component: () => import('@/views/marketing/EarlyAccessView.vue'), component: () => import('@/views/marketing/EarlyAccessView.vue'),
meta: {
title: 'Early Access — Corrosion',
description: 'Join the early access list. Get full control plane access — wipe automation, plugin management, real-time console — and lock in launch pricing.',
},
}, },
] ]
@@ -53,25 +94,25 @@ const panelRoutes: RouteRecordRaw[] = [
path: '/login', path: '/login',
name: 'login', name: 'login',
component: () => import('@/views/auth/LoginView.vue'), component: () => import('@/views/auth/LoginView.vue'),
meta: { guest: true }, meta: { guest: true, title: 'Sign in — Corrosion' },
}, },
{ {
path: '/register', path: '/register',
name: 'register', name: 'register',
component: () => import('@/views/auth/RegisterView.vue'), component: () => import('@/views/auth/RegisterView.vue'),
meta: { guest: true }, meta: { guest: true, title: 'Create account — Corrosion' },
}, },
{ {
path: '/forgot-password', path: '/forgot-password',
name: 'forgot-password', name: 'forgot-password',
component: () => import('@/views/auth/ForgotPasswordView.vue'), component: () => import('@/views/auth/ForgotPasswordView.vue'),
meta: { guest: true }, meta: { guest: true, title: 'Reset password — Corrosion' },
}, },
{ {
path: '/setup', path: '/setup',
name: 'setup-wizard', name: 'setup-wizard',
component: () => import('@/views/auth/SetupWizardView.vue'), component: () => import('@/views/auth/SetupWizardView.vue'),
meta: { requiresAuth: true }, meta: { requiresAuth: true, title: 'Setup — Corrosion' },
}, },
// Admin dashboard routes (with sidebar layout) // Admin dashboard routes (with sidebar layout)
@@ -84,217 +125,254 @@ const panelRoutes: RouteRecordRaw[] = [
path: '', path: '',
name: 'dashboard', name: 'dashboard',
component: () => import('@/views/admin/DashboardView.vue'), component: () => import('@/views/admin/DashboardView.vue'),
meta: { title: 'Dashboard — Corrosion' },
}, },
{ {
path: 'server', path: 'server',
name: 'server', name: 'server',
component: () => import('@/views/admin/ServerView.vue'), component: () => import('@/views/admin/ServerView.vue'),
meta: { title: 'Server — Corrosion' },
}, },
{ {
path: 'console', path: 'console',
name: 'console', name: 'console',
component: () => import('@/views/admin/ConsoleView.vue'), component: () => import('@/views/admin/ConsoleView.vue'),
meta: { title: 'Console — Corrosion' },
}, },
{ {
path: 'players', path: 'players',
name: 'players', name: 'players',
component: () => import('@/views/admin/PlayersView.vue'), component: () => import('@/views/admin/PlayersView.vue'),
meta: { title: 'Players — Corrosion' },
}, },
{ {
path: 'plugins', path: 'plugins',
name: 'plugins', name: 'plugins',
component: () => import('@/views/admin/PluginsView.vue'), component: () => import('@/views/admin/PluginsView.vue'),
meta: { title: 'Plugins — Corrosion' },
}, },
{ {
path: 'files', path: 'files',
name: 'files', name: 'files',
component: () => import('@/views/admin/FileManagerView.vue'), component: () => import('@/views/admin/FileManagerView.vue'),
meta: { title: 'Files — Corrosion' },
}, },
{ {
path: 'plugin-configs', path: 'plugin-configs',
name: 'plugin-configs', name: 'plugin-configs',
component: () => import('@/views/admin/PluginConfigsView.vue'), component: () => import('@/views/admin/PluginConfigsView.vue'),
meta: { title: 'Plugin Configs — Corrosion' },
}, },
{ {
path: 'loot-builder', path: 'loot-builder',
name: 'loot-builder', name: 'loot-builder',
component: () => import('@/views/admin/LootBuilderView.vue'), component: () => import('@/views/admin/LootBuilderView.vue'),
meta: { title: 'Loot Builder — Corrosion' },
}, },
{ {
path: 'teleport-config', path: 'teleport-config',
name: 'teleport-config', name: 'teleport-config',
component: () => import('@/views/admin/TeleportConfigView.vue'), component: () => import('@/views/admin/TeleportConfigView.vue'),
meta: { title: 'Teleport Config — Corrosion' },
}, },
{ {
path: 'gather-manager', path: 'gather-manager',
name: 'gather-manager', name: 'gather-manager',
component: () => import('@/views/admin/GatherManagerView.vue'), component: () => import('@/views/admin/GatherManagerView.vue'),
meta: { title: 'Gather Manager — Corrosion' },
}, },
{ {
path: 'autodoors', path: 'autodoors',
name: 'autodoors', name: 'autodoors',
component: () => import('@/views/admin/AutoDoorsView.vue'), component: () => import('@/views/admin/AutoDoorsView.vue'),
meta: { title: 'Auto Doors — Corrosion' },
}, },
{ {
path: 'kits', path: 'kits',
name: 'kits-config', name: 'kits-config',
component: () => import('@/views/admin/KitsView.vue'), component: () => import('@/views/admin/KitsView.vue'),
meta: { title: 'Kits — Corrosion' },
}, },
{ {
path: 'furnace-splitter', path: 'furnace-splitter',
name: 'furnace-splitter', name: 'furnace-splitter',
component: () => import('@/views/admin/FurnaceSplitterView.vue'), component: () => import('@/views/admin/FurnaceSplitterView.vue'),
meta: { title: 'Furnace Splitter — Corrosion' },
}, },
{ {
path: 'better-chat', path: 'better-chat',
name: 'better-chat', name: 'better-chat',
component: () => import('@/views/admin/BetterChatView.vue'), component: () => import('@/views/admin/BetterChatView.vue'),
meta: { title: 'Better Chat — Corrosion' },
}, },
{ {
path: 'timed-execute', path: 'timed-execute',
name: 'timed-execute', name: 'timed-execute',
component: () => import('@/views/admin/TimedExecuteView.vue'), component: () => import('@/views/admin/TimedExecuteView.vue'),
meta: { title: 'Timed Execute — Corrosion' },
}, },
{ {
path: 'raidable-bases', path: 'raidable-bases',
name: 'raidable-bases', name: 'raidable-bases',
component: () => import('@/views/admin/RaidableBasesView.vue'), component: () => import('@/views/admin/RaidableBasesView.vue'),
meta: { title: 'Raidable Bases — Corrosion' },
}, },
{ {
path: 'wipes', path: 'wipes',
name: 'wipes', name: 'wipes',
component: () => import('@/views/admin/WipesView.vue'), component: () => import('@/views/admin/WipesView.vue'),
meta: { title: 'Wipes — Corrosion' },
}, },
{ {
path: 'wipes/profiles', path: 'wipes/profiles',
name: 'wipe-profiles', name: 'wipe-profiles',
component: () => import('@/views/admin/WipeProfilesView.vue'), component: () => import('@/views/admin/WipeProfilesView.vue'),
meta: { title: 'Wipe Profiles — Corrosion' },
}, },
{ {
path: 'wipes/calendar', path: 'wipes/calendar',
name: 'wipe-calendar', name: 'wipe-calendar',
component: () => import('@/views/admin/WipeCalendarView.vue'), component: () => import('@/views/admin/WipeCalendarView.vue'),
meta: { title: 'Wipe Calendar — Corrosion' },
}, },
{ {
path: 'wipes/history', path: 'wipes/history',
name: 'wipe-history', name: 'wipe-history',
component: () => import('@/views/admin/WipeHistoryView.vue'), component: () => import('@/views/admin/WipeHistoryView.vue'),
meta: { title: 'Wipe History — Corrosion' },
}, },
{ {
path: 'wipes/analytics', path: 'wipes/analytics',
name: 'wipe-analytics', name: 'wipe-analytics',
component: () => import('@/views/admin/WipeAnalyticsView.vue'), component: () => import('@/views/admin/WipeAnalyticsView.vue'),
meta: { title: 'Wipe Analytics — Corrosion' },
}, },
{ {
path: 'maps', path: 'maps',
name: 'maps', name: 'maps',
component: () => import('@/views/admin/MapsView.vue'), component: () => import('@/views/admin/MapsView.vue'),
meta: { title: 'Maps — Corrosion' },
}, },
{ {
path: 'maps/analytics', path: 'maps/analytics',
name: 'map-analytics', name: 'map-analytics',
component: () => import('@/views/admin/MapAnalyticsView.vue'), component: () => import('@/views/admin/MapAnalyticsView.vue'),
meta: { title: 'Map Analytics — Corrosion' },
}, },
{ {
path: 'chat', path: 'chat',
name: 'chat', name: 'chat',
component: () => import('@/views/admin/ChatLogView.vue'), component: () => import('@/views/admin/ChatLogView.vue'),
meta: { title: 'Chat Log — Corrosion' },
}, },
{ {
path: 'analytics', path: 'analytics',
name: 'analytics', name: 'analytics',
component: () => import('@/views/admin/AnalyticsView.vue'), component: () => import('@/views/admin/AnalyticsView.vue'),
meta: { title: 'Analytics — Corrosion' },
}, },
{ {
path: 'retention', path: 'retention',
name: 'retention', name: 'retention',
component: () => import('@/views/admin/PlayerRetentionView.vue'), component: () => import('@/views/admin/PlayerRetentionView.vue'),
meta: { title: 'Player Retention — Corrosion' },
}, },
{ {
path: 'notifications', path: 'notifications',
name: 'notifications', name: 'notifications',
component: () => import('@/views/admin/NotificationsView.vue'), component: () => import('@/views/admin/NotificationsView.vue'),
meta: { title: 'Notifications — Corrosion' },
}, },
{ {
path: 'team', path: 'team',
name: 'team', name: 'team',
component: () => import('@/views/admin/TeamView.vue'), component: () => import('@/views/admin/TeamView.vue'),
meta: { title: 'Team — Corrosion' },
}, },
{ {
path: 'store/config', path: 'store/config',
name: 'store-config', name: 'store-config',
component: () => import('@/views/admin/StoreConfigView.vue'), component: () => import('@/views/admin/StoreConfigView.vue'),
meta: { title: 'Store Config — Corrosion' },
}, },
{ {
path: 'store/items', path: 'store/items',
name: 'store-items', name: 'store-items',
component: () => import('@/views/admin/StoreItemsView.vue'), component: () => import('@/views/admin/StoreItemsView.vue'),
meta: { title: 'Store Items — Corrosion' },
}, },
{ {
path: 'store/revenue', path: 'store/revenue',
name: 'store-revenue', name: 'store-revenue',
component: () => import('@/views/admin/StoreRevenueView.vue'), component: () => import('@/views/admin/StoreRevenueView.vue'),
meta: { title: 'Store Revenue — Corrosion' },
}, },
{ {
path: 'modules', path: 'modules',
name: 'modules', name: 'modules',
component: () => import('@/views/admin/ModuleStoreView.vue'), component: () => import('@/views/admin/ModuleStoreView.vue'),
meta: { title: 'Modules — Corrosion' },
}, },
{ {
path: 'settings', path: 'settings',
name: 'settings', name: 'settings',
component: () => import('@/views/admin/SettingsView.vue'), component: () => import('@/views/admin/SettingsView.vue'),
meta: { title: 'Settings — Corrosion' },
}, },
{ {
path: 'schedules', path: 'schedules',
name: 'schedules', name: 'schedules',
component: () => import('@/views/admin/SchedulesView.vue'), component: () => import('@/views/admin/SchedulesView.vue'),
meta: { title: 'Schedules — Corrosion' },
}, },
{ {
path: 'migration', path: 'migration',
name: 'migration', name: 'migration',
component: () => import('@/views/admin/MigrationView.vue'), component: () => import('@/views/admin/MigrationView.vue'),
meta: { title: 'Migration — Corrosion' },
}, },
{ {
path: 'changelog', path: 'changelog',
name: 'changelog', name: 'changelog',
component: () => import('@/views/admin/ChangelogView.vue'), component: () => import('@/views/admin/ChangelogView.vue'),
meta: { title: 'Changelog — Corrosion' },
}, },
{ {
path: 'alerts', path: 'alerts',
name: 'alerts', name: 'alerts',
component: () => import('@/views/admin/AlertsView.vue'), component: () => import('@/views/admin/AlertsView.vue'),
meta: { title: 'Alerts — Corrosion' },
}, },
// Platform Admin views (super-admin only) // Platform Admin views (super-admin only)
{ {
path: 'admin', path: 'admin',
name: 'platform-admin', name: 'platform-admin',
component: () => import('@/views/platform-admin/AdminDashboard.vue'), component: () => import('@/views/platform-admin/AdminDashboard.vue'),
meta: { superAdmin: true }, meta: { superAdmin: true, title: 'Admin — Corrosion' },
}, },
{ {
path: 'admin/licenses', path: 'admin/licenses',
name: 'platform-licenses', name: 'platform-licenses',
component: () => import('@/views/platform-admin/AdminLicenses.vue'), component: () => import('@/views/platform-admin/AdminLicenses.vue'),
meta: { superAdmin: true }, meta: { superAdmin: true, title: 'Admin: Licenses — Corrosion' },
}, },
{ {
path: 'admin/subscriptions', path: 'admin/subscriptions',
name: 'platform-subscriptions', name: 'platform-subscriptions',
component: () => import('@/views/platform-admin/AdminSubscriptions.vue'), component: () => import('@/views/platform-admin/AdminSubscriptions.vue'),
meta: { superAdmin: true }, meta: { superAdmin: true, title: 'Admin: Subscriptions — Corrosion' },
}, },
{ {
path: 'admin/users', path: 'admin/users',
name: 'platform-users', name: 'platform-users',
component: () => import('@/views/platform-admin/AdminUsers.vue'), component: () => import('@/views/platform-admin/AdminUsers.vue'),
meta: { superAdmin: true }, meta: { superAdmin: true, title: 'Admin: Users — Corrosion' },
}, },
{ {
path: 'admin/servers', path: 'admin/servers',
name: 'platform-servers', name: 'platform-servers',
component: () => import('@/views/platform-admin/AdminServers.vue'), component: () => import('@/views/platform-admin/AdminServers.vue'),
meta: { superAdmin: true }, meta: { superAdmin: true, title: 'Admin: Servers — Corrosion' },
}, },
], ],
}, },
@@ -329,6 +407,7 @@ const panelRoutes: RouteRecordRaw[] = [
path: '/status', path: '/status',
name: 'status', name: 'status',
component: () => import('@/views/public/StatusPageView.vue'), component: () => import('@/views/public/StatusPageView.vue'),
meta: { title: 'Status — Corrosion' },
}, },
// Catch-all // Catch-all
@@ -366,6 +445,7 @@ const marketingRoutes: RouteRecordRaw[] = [
path: '/status', path: '/status',
name: 'status', name: 'status',
component: () => import('@/views/public/StatusPageView.vue'), component: () => import('@/views/public/StatusPageView.vue'),
meta: { title: 'Status — Corrosion' },
}, },
// Catch-all: unknown routes → landing page // Catch-all: unknown routes → landing page
@@ -383,6 +463,38 @@ const router = createRouter({
routes: isMarketingDomain ? marketingRoutes : panelRoutes, routes: isMarketingDomain ? marketingRoutes : panelRoutes,
}) })
// ---------------------------------------------------------------------------
// Document title + meta description/OG update on every navigation
// ---------------------------------------------------------------------------
function setOrClearMeta(selector: string, attr: string, value: string): void {
let el = document.querySelector<HTMLMetaElement>(selector)
if (!el) {
el = document.createElement('meta')
// Parse the selector to set the right attribute (name="..." or property="...")
const nameMatch = selector.match(/\[name="([^"]+)"\]/)
const propMatch = selector.match(/\[property="([^"]+)"\]/)
if (nameMatch?.[1]) el.setAttribute('name', nameMatch[1])
if (propMatch?.[1]) el.setAttribute('property', propMatch[1])
document.head.appendChild(el)
}
el.setAttribute(attr, value)
}
router.afterEach((to) => {
// Title
document.title = to.meta.title ?? 'Corrosion Management'
// Description
const desc = to.meta.description ?? ''
setOrClearMeta('meta[name="description"]', 'content', desc)
// OG title
setOrClearMeta('meta[property="og:title"]', 'content', to.meta.title ?? 'Corrosion Management')
// OG description
setOrClearMeta('meta[property="og:description"]', 'content', desc)
})
// Auth guard — only meaningful on panel domain (marketing has no requiresAuth routes) // Auth guard — only meaningful on panel domain (marketing has no requiresAuth routes)
router.beforeEach((to, _from, next) => { router.beforeEach((to, _from, next) => {
const auth = useAuthStore() const auth = useAuthStore()

View File

@@ -58,6 +58,27 @@ export const useAuthStore = defineStore('auth', () => {
permissions.value = {} permissions.value = {}
} }
/**
* Validate the persisted session against the API on app boot. Without this,
* a stale/revoked token renders the full panel chrome and only collapses on
* the first real API call. useApi's 401 path (refresh → retry → logout)
* does the heavy lifting; any non-auth failure (network, 5xx) keeps the
* session — never log users out because the API blipped.
* Dynamic import avoids a static auth-store ↔ useApi module cycle.
*/
async function validateSession(): Promise<void> {
if (!accessToken.value) return
try {
const { useApi } = await import('@/composables/useApi')
const me = await useApi().get<Partial<User>>('/auth/me')
if (user.value && me && typeof me === 'object') {
user.value = { ...user.value, ...me }
}
} catch {
// 401 → refresh → logout/redirect already handled inside useApi.
}
}
function hasModule(moduleSlug: string): boolean { function hasModule(moduleSlug: string): boolean {
return license.value?.modules_enabled?.includes(moduleSlug) ?? false return license.value?.modules_enabled?.includes(moduleSlug) ?? false
} }
@@ -92,6 +113,7 @@ export const useAuthStore = defineStore('auth', () => {
setAuth, setAuth,
setLicense, setLicense,
logout, logout,
validateSession,
hasModule, hasModule,
hasPermission, hasPermission,
} }