wipe_schedules rows existed but nothing read or fired them — an operator could
set a wipe schedule and it would never trigger (the headline auto-wipe feature
was inert; the manual trigger worked, the scheduler did not).
- WipesService now implements OnModuleInit/OnModuleDestroy with a 60s executor
(mirrors SchedulesService): bootstraps next_scheduled_run, then fires every
active schedule whose next_scheduled_run <= now via triggerWipe(...'scheduled')
-> instancesService.wipeForLicense -> the agent wipe handler, advancing
next_scheduled_run from the cron each cycle (advances even on failure so a
broken schedule can't re-fire every 60s).
- triggerWipe parameterized with triggerType ('manual' | 'scheduled') so
wipe_history records the real origin.
- Extracted nextCronDate into src/common/cron.util.ts (shared by the event and
wipe schedulers; was duplicated/private). Cron is evaluated in UTC; the
per-schedule timezone column is still not honored, a known limitation shared by
both schedulers (follow-up: tz-aware cron lib).
Backend tsc green. Scheduling logic is at parity with the in-production event
scheduler; live end-to-end verification (a scheduled wipe deleting real files)
is pending until a game stack + agent are connected.
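A minimal sketch of one executor tick as described above, not the actual WipesService code. The WipeSchedule shape, runDueWipes, and the injected triggerWipe/nextRunAfter callbacks are hypothetical names for illustration; the real service reads wipe_schedules and uses nextCronDate.

```typescript
// Hypothetical row shape; the real wipe_schedules columns may differ.
interface WipeSchedule {
  id: string;
  active: boolean;
  nextScheduledRun: Date;
}

type TriggerType = "manual" | "scheduled";

// One tick: fire every active schedule that is due, and always advance
// nextScheduledRun so a failing schedule cannot re-fire on the next tick.
async function runDueWipes(
  schedules: WipeSchedule[],
  now: Date,
  triggerWipe: (s: WipeSchedule, type: TriggerType) => Promise<void>,
  nextRunAfter: (s: WipeSchedule, from: Date) => Date,
): Promise<string[]> {
  const fired: string[] = [];
  for (const s of schedules) {
    if (!s.active || s.nextScheduledRun.getTime() > now.getTime()) continue;
    try {
      await triggerWipe(s, "scheduled");
      fired.push(s.id);
    } catch {
      // Swallowed so one broken schedule does not stop the rest of the tick.
    } finally {
      s.nextScheduledRun = nextRunAfter(s, now);
    }
  }
  return fired;
}
```

Advancing in the finally block is what keeps a broken schedule from firing again 60s later.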
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Player id and ban reason flowed unsanitized into the single-line RCON command,
so a control char (newline/CR) in 'reason' could break the framing and inject a
second console command — an RBAC-escalation vector (a Moderator-role user could
run arbitrary RCON via the ban reason field).
- validate player id against a safe token charset /^[A-Za-z0-9_.:-]{1,64}$/ and
reject otherwise (multi-game safe — not a Rust-only SteamID64 regex, so
Conan/Funcom and Dune ids still pass)
- strip C0 control chars from reason, collapse whitespace, cap at 200 chars
- coerce ban duration to a non-negative integer
Flagged by automated commit security review. Backend tsc green.
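The three rules above can be sketched as follows. This is an illustration, not the committed code: the function names are hypothetical, and replacing control characters with a space before collapsing (rather than deleting them outright) is a choice made for this sketch.

```typescript
// Charset quoted from the commit; deliberately not a SteamID64-only pattern.
const PLAYER_ID = /^[A-Za-z0-9_.:-]{1,64}$/;

function sanitizePlayerId(id: string): string {
  if (!PLAYER_ID.test(id)) throw new Error("invalid player id");
  return id;
}

function sanitizeReason(reason: string): string {
  return reason
    .replace(/[\u0000-\u001f]/g, " ") // C0 controls, including CR and LF
    .replace(/\s+/g, " ")             // collapse whitespace runs
    .trim()
    .slice(0, 200);                   // cap at 200 chars
}

function sanitizeDuration(value: unknown): number {
  const n = Math.floor(Number(value));
  return Number.isFinite(n) && n > 0 ? n : 0;
}
```

With the newline gone, a reason such as "griefing\nkick all" stays inside the single ban command instead of starting a second console line.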
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The legacy Go agent was never deployed, so the entire backend command surface
published to a dead cmd.server/cmd.wipe/files.cmd void. Route it all to the
Rust agent's instance-scoped subjects.
Agent (corrosion-host-agent, alpha.10):
- New src/wipe.rs + 'wipe' func on {instance}.cmd: stop -> delete game files by
type (map/blueprint/full, with optional backup) -> restart. Jailed to the
instance root, symlink-safe (lstat, no cross-boundary follow — Lesson 26).
8 tests incl. jail-escape + symlink-skip proofs. Agent suite 64 tests green.
Backend (NestJS):
- InstancesService is now @Global with license-scoped convenience wrappers
(lifecycleForLicense/rconForLicense/writeFileForLicense/readFileForLicense/
deleteFileForLicense/wipeForLicense) + resolveDefaultInstance (license ->
primary instance).
- Routed to the agent: servers start/stop/restart/command; players kick/banid/
unban via RCON; schedules restart/announce/command/plugin-reload; wipes ->
wipeForLicense (real wipe now); plugins reload/unload/upload via rcon+file
ops; all 9 plugin-config module applies -> writeFileForLicense + oxide.reload
rcon, imports -> readFileForLicense (server:// prefix stripped).
- Honestly gated (need agent funcs not yet built): server deploy-from-panel,
Oxide install, one-click uMod install -> 503 coming-soon instead of dead
publishes.
Backend tsc green; agent cargo test green (64).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Multi-game rebrand (no more Rust-only leftovers): game-neutral setup wizard +
deploy/store defaults; player-id labels driven by game profile (Steam ID only
for Rust); blueprint wipe type + verify-plugins gated to uMod games; oxide
command examples + Rust-only plugin pages (AutoDoors/FurnaceSplitter/BetterChat)
guarded behind mods==='umod' with empty-states for other games.
Honesty: webstore checkout shows coming-soon (backend now 503s); 'integrated
webstore' marketed as coming-soon; Discord references neutralized to
community/webhook; migration FAQ marked in-development; analytics dev phase
labels removed; Network pricing tier set to Custom/Contact (was a confusing
duplicate of Operator); docs/PRICING.md rewritten to match live subscriptions.
UX/bugs: fixed ServerView oxide-status operator-precedence bug; dead 'Deploy
server' button wired; non-functional topbar search removed; alert()/confirm()
replaced with toasts across schedules/alerts/migration/public store+server;
analytics chart arrays null-guarded; production console.logs gated to DEV.
Frontend build (vue-tsc + vite) green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- analytics: getMapAnalytics queried map.name but the map_library column is
display_name (no name column) — every map-analytics call 500'd. Fixed select
+ groupBy to map.display_name.
- setup: guard ENCRYPTION_KEY length before AES-256-GCM createCipheriv — an
unset key crashed bare-metal setup with an opaque 'Invalid key length' 500;
now returns a clear 503. Also stop falsely marking bare-metal connected on
completeSetup; leave offline until the agent's first heartbeat.
- webstore: public checkout returned a FAKE PayPal order token + sandbox URL
that resolves to nowhere. Refuse honestly with 503 (payments coming soon)
instead of faking a transaction.
- store: module purchase wrote a fake txn_<ts> implying a charge; record it
honestly as a free Beta grant (transaction_id=beta-free-grant, amount 0).
Backend tsc green.
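A sketch of the key-length guard from the setup fix above. It assumes ENCRYPTION_KEY is stored as 64 hex characters (32 bytes, the AES-256 key size); the real encoding, the ServiceUnavailable class, and requireAes256Key are assumptions for illustration only.

```typescript
// Hypothetical error type carrying the HTTP status the handler would return.
class ServiceUnavailable extends Error {
  readonly status = 503;
}

// Validate before createCipheriv ever sees the key, so a missing or short
// key produces a clear 503 instead of an opaque 'Invalid key length' 500.
function requireAes256Key(raw: string | undefined): Uint8Array {
  if (!raw || !/^[0-9a-fA-F]{64}$/.test(raw)) {
    throw new ServiceUnavailable("ENCRYPTION_KEY must be 32 bytes (64 hex chars)");
  }
  const key = new Uint8Array(32);
  for (let i = 0; i < 32; i++) {
    key[i] = parseInt(raw.slice(i * 2, i * 2 + 2), 16);
  }
  return key;
}
```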
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Introduce a Supervisor trait (async-trait) so the agent manages games with
different models behind one wire contract. ProcessSupervisor (spawned process:
rust/conan/soulmask) and the new DockerComposeSupervisor (dune) both impl it;
Agent.supervisors is now HashMap<String, Arc<dyn Supervisor>> and instancecmd
dispatch is game-agnostic — start/stop/restart/status identical across games,
selected by a per-game factory in main. InstanceState moved to the shared
supervisor module.
DockerComposeSupervisor drives docker-compose up -d / stop / restart against
the instance's compose project, with -f/-p/single-service support and a
configurable compose binary. New [instance.docker_compose] config block.
First cut = lifecycle + cached state; container crash-detection + restart
adoption deferred to Phase 3b (reconcilable with a compose ps probe).
Trait choice (dyn over enum) per Commander: scales to future planes (kubectl,
AMP/podman, SSH) as new struct+impl, no central match.
56 tests green (6 new docker-compose mock-binary tests + 5 refactored process
tests), zero warnings. Live verification pending a real Dune stack.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phase 2 references for the host-agent Dune adapter, moved out of volatile /tmp
into docs/reference-repos/ (per Commander). Three upstream projects, .git +
node_modules + compiled binaries stripped (16MB source). Nested AI-instruction
files (.claude/, CLAUDE.md) removed so they don't pollute Corrosion sessions.
- icehunter/ dune-admin (Go+React) — 4 control planes; SETUP_DOCKER.md is the
closest analog to our agent's Dune docker control plane (compose
lifecycle, docker logs, RabbitMQ-via-exec, dune Postgres schema)
- adainrivers/ Rust/Tauri desktop — SSH+k8s BattleGroup control, maintenance
daemon, in-game admin console (Rust idiom reference)
- the4rchangel/ Node web UI replacing battlegroup.bat — matches the Commander's
Hyper-V self-host path + game-config schema
See docs/reference-repos/README.md for the full index + how we use each.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
COA-B cleanup:
- Sidebar agent-health footer now reads the fleet store (host count / online
count / per-host status + last heartbeat) instead of the single legacy
server.connection row, which disagreed with the multi-host fleet. Removed the
legacy useServerStore dependency from the shell.
- Removed the unused 'vuefinder' dependency (replaced by the native file
manager): dep + main.ts plugin/CSS registration. Main JS chunk 588kB -> 165kB.
Recon reclassified the 'dead cmd.server v1' item: it is the LIVE license-level
command path (module config applies, plugin install, schedules, legacy
start/stop) served only by the Go agent — a Rust-agent parity gap, not dead
code. Left intact.
Build-green (vue-tsc) + boots clean in-browser (0 console errors).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The shell skin / sidebar nav / dashboard terminology now follow the games
actually deployed (game_instances.game, agent-reported) instead of a
localStorage-only toggle. syncActiveGameFromFleet() derives: one game ->
auto-skin to it; zero/multiple -> 'all' neutral. A manual GameSwitcher pick
persists and overrides the heuristic. Wired into DashboardLayout via a watch
on the fleet store.
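The derivation rule can be sketched like this. deriveActiveGame is a hypothetical stand-in for the logic inside syncActiveGameFromFleet(); the store wiring and persistence are omitted.

```typescript
// instanceGames: the game of every deployed instance (duplicates allowed).
// manualPick: a persisted GameSwitcher choice, if the operator made one.
function deriveActiveGame(instanceGames: string[], manualPick?: string): string {
  if (manualPick) return manualPick; // a manual pick overrides the heuristic
  const distinct = new Set(instanceGames);
  const [only] = Array.from(distinct);
  // Exactly one distinct game -> skin to it; zero or several -> neutral.
  return distinct.size === 1 && only !== undefined ? only : "all";
}
```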
No schema change: a license's games are the distinct games of its instances
(the normalized source of truth) — deliberately not duplicating into a
licenses.game column that would drift (Lesson 20).
Build-green (vue-tsc) + boots clean in-browser (0 console errors, theming
initializes). Authenticated auto-derive will be confirmed live on the next
instance deploy.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The 'Sign artifacts' step failed on alpha.7 with 'Error while loading the
secret key file' (exit 2): minisign downloaded and ran, but the reconstructed
key file was unparseable. A minisign secret key is two lines (comment + base64
blob); Gitea/act_runner secret storage mangles the embedded newline, collapsing
it to one line. Decode the secret as base64 (single-line, mangling-proof) with
auto-detect fallback to a raw two-line key. Fails loudly with the fix command
if the secret is neither form.
Requires re-storing MINISIGN_SECRET_KEY as: base64 < secret.key | tr -d '\n'
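The decode-with-fallback logic, sketched in TypeScript for illustration only (the actual step lives in the CI workflow and may be implemented differently). resolveMinisignKey is a hypothetical name; the "untrusted comment:" prefix check relies on the standard minisign key file layout.

```typescript
// A usable key is exactly two lines: a comment line, then the base64 blob.
function isTwoLineKey(text: string): boolean {
  const lines = text.trim().split("\n");
  return lines.length === 2 && (lines[0] ?? "").startsWith("untrusted comment:");
}

function resolveMinisignKey(secret: string): string {
  const trimmed = secret.trim();
  // Preferred form: the whole key file stored as one base64 line.
  if (/^[A-Za-z0-9+\/]+={0,2}$/.test(trimmed)) {
    let decoded = "";
    try {
      decoded = atob(trimmed);
    } catch {
      decoded = "";
    }
    if (isTwoLineKey(decoded)) return decoded.trim() + "\n";
  }
  // Fallback: a raw key whose newline survived secret storage.
  if (isTwoLineKey(trimmed)) return trimmed + "\n";
  throw new Error(
    "MINISIGN_SECRET_KEY is neither base64 nor a two-line key; " +
      "re-store it with: base64 < secret.key | tr -d '\\n'",
  );
}
```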
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
alpha.6 signing failed: 'E: Unable to locate package minisign' —
minisign isn't packaged for node:20-bullseye. Download the official
static linux binary instead. Forward to alpha.7 (alpha.6 published
nothing).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Agent only ever runs a binary whose minisign signature verifies against
the EMBEDDED public key. NATS host.cmd func 'update' {url}: download
binary + .minisig from the CDN -> verify against embedded pubkey ->
atomic swap (.old rollback) -> relaunch. URL allowlist (https +
cdn.corrosionmgmt.com only, rejects userinfo-bypass), 100MiB cap. Closes the
supply-chain hole: even a malicious CDN upload can't run unsigned.
CI: build-host-agent.yml signs every artifact with MINISIGN_SECRET_KEY
(Gitea secret) and publishes .minisig alongside; the step FAILS the
build if the secret is absent (refuses to ship unsigned). Bumped to
alpha.6.
6 deterministic tests (accept valid / reject tampered+garbage+empty sig,
URL allowlist incl userinfo-bypass, atomic swap+rollback). Fixtures
signed with the real release key so tests need no key at runtime. Full
suite 50/50 green; musl + native build clean.
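The URL allowlist check, illustrated in TypeScript (the agent itself is Rust; this is not its code). isAllowedUpdateUrl is a hypothetical name, and rejecting any URL that carries userinfo at all is a stricter simplification chosen for this sketch.

```typescript
const UPDATE_HOST = "cdn.corrosionmgmt.com";

function isAllowedUpdateUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // unparseable input is never allowed
  }
  // Compare the parsed hostname, never a substring of the raw string:
  // in "https://cdn.corrosionmgmt.com@evil.example/" the real host is
  // evil.example and the CDN name is only the userinfo part.
  return (
    url.protocol === "https:" &&
    url.hostname === UPDATE_HOST &&
    url.username === "" &&
    url.password === ""
  );
}
```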
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
FileManagerView rewritten as a native DS browser on the per-instance
file bridge: instance selector, breadcrumb nav, dir-first listing
(name/size/modified), folder drill-down, inline file editor (read/save),
toolbar (new folder/file/refresh), per-row rename + delete-confirm.
New files store wraps the /instances/:id/files* endpoints. VueFinder
import + RemoteDriver fully removed — no more retired-protocol /api/files.
Honest empty (no instance -> Server page) + error (retry) states, never
the global error boundary.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Rounds out the per-instance file bridge to the agent's full jailed file
manager so a real file browser can be built on it: POST
:id/files/{delete,rename,mkdir,mkfile,move,copy}, all via requestScoped
(license-scoped reply) on the new agent {op,path} protocol, gated by the
files.manage permission. The broken legacy VueFinder /api/files (retired Go
fm_* protocol, wrong subject, default _INBOX) is superseded by this; frontend
rewrite next.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The Server page's config-honesty note now leads somewhere real: a
Configuration file panel that loads a config file from the instance
(prefilled with the game's primaryConfigFile hint — server.cfg,
ServerSettings.ini, GameXishu.json), edits it in a mono textarea, and
saves it straight to the host through the jailed agent file bridge.
Not-found is handled gracefully (empty editor to create). Works across
games; gameProfiles gains primaryConfigFile per game.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
GET /api/instances/:id/files (list) + /file (read), PUT /file (write) —
tenant-guarded, routed through requestScoped to the per-instance
corrosion.{license}.{instance}.files.cmd using the new agent's {op,path}
protocol (jailed to the instance root, symlink-safe). files.view /
files.manage perms. Foundation for the per-game config editor and for
fixing the legacy VueFinder File Manager (which still speaks the retired
Go fm_* protocol on the wrong subject and is broken under per-license
auth — separate reconciliation).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The agent reply is authoritative for the action just taken; the fleet
DB only updates on the next heartbeat (~10s), so the immediate refetch
read a stale state and reverted the UI (Start -> still Stopped). Now
apply the reply's state/uptime directly to the instance.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The Server page now manages the selected GAME INSTANCE, not the legacy
host connection. New instances store flattens the fleet and drives the
command bridge. New 'Game instance' panel: real state badge
(running/stopped/crashed/configured), uptime, host, and an instance
selector when >1. Start/Stop/Restart/Refresh wired to POST
/api/instances/:id/lifecycle — gated on the actual instance state (not
host connectivity), with telemetry-only instances flagged. Works across
all four games (state + lifecycle are game-agnostic).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Self-service host removal. DELETE /api/fleet/hosts/:id (server.manage,
tenant-guarded): refuses while the host is 'connected' (409 — a live
agent re-registers on its next heartbeat, stop it first), deletes the
host's game_instances explicitly (FK is SET NULL, would otherwise
orphan them; instance_stats cascade), and clears the legacy
server_connections row if it was the license's last host. Fleet view:
offline host cards get a Remove button with inline confirm + toast.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Server page Host-agent panel now fetches GET
/api/servers/agent-credentials and renders the real agent.toml (license UUID,
nats_user, nats_password) instead of the broken LICENSE_ID=license_key env
commands that would never connect. Password masked by default with a
reveal toggle; copy-to-clipboard uses the real value. Setup commands
point at --config /etc/corrosion/agent.toml.
Configuration panel: World size / Current seed (Rust-only Facepunch
concepts) gated behind isRust; Conan/Soulmask/Dune get an honest note
pointing to the File Manager for their real config files instead of
fake Rust fields.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Backend layer wiring the panel to the host agent's per-instance command
channel (the unblocker for the Server-page rework):
- NatsService.requestScoped(): request-reply with a LICENSE-SCOPED reply
subject (corrosion.{license}.reply.<id>) so per-license-scoped agents
(no _INBOX permission) can actually reply — the design from the NATS
auth work, now exercised.
- InstancesModule: POST /api/instances/:id/lifecycle {action} (start/
stop/restart/status/steam_update, server.manage) and POST :id/rcon
{command} (server.console). Tenant-guarded via game_instances.
- GET /api/servers/agent-credentials: derives the agent's NATS user/
password (HMAC) so a customer can configure their agent — closes the
post-auth setup gap.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Caught during the live cutover: nats-server rejects 'unknown field
no_auth_user' when it is nested in the authorization block, taking the
whole broker down. Both the generator (open stage) and the committed
bootstrap default emitted it nested. Moved to top level. Enforce-stage
output was unaffected (no no_auth_user), which is what the live broker
now runs.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Two HIGH findings from automated review on the generator, both fixed:
1. Cross-tenant inbox access: per-license users were granted _INBOX.>,
letting license A subscribe to license B's request-reply responses.
Now scoped to corrosion.{license}.> ONLY; replies must ride the
license namespace (corrosion.{license}.reply.<id>) — documented in
PROTOCOL.md. Agent unchanged (responds to msg.reply); constraint is
on the requester (internal user has full >).
2. Default-open auth bypass: generator defaulted to stage=open with a
full-access anonymous user — a stale regen left the broker wide open.
Now defaults to enforce (secure by default); the explicit 'open'
migration stage maps anonymous to a harmless corrosion.unclaimed.>
namespace, never real tenant subjects. Committed bootstrap default
hardened the same way.
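To make finding 1 concrete, a sketch of the reply-subject shape and a token-wise namespace check. Both helper names are hypothetical, and the check assumes license ids contain no dots; the broker enforces the real permission, this only mirrors its matching rule.

```typescript
// Reply subjects ride the license namespace instead of _INBOX.
function replySubject(licenseId: string, requestId: string): string {
  return `corrosion.${licenseId}.reply.${requestId}`;
}

// True when `subject` falls under corrosion.{license}.> . The comparison is
// per token, so license "abc" never matches subjects of license "abcd", and
// ">" needs at least one token after the license id.
function inLicenseNamespace(licenseId: string, subject: string): boolean {
  const tokens = subject.split(".");
  return tokens.length >= 3 && tokens[0] === "corrosion" && tokens[1] === licenseId;
}
```

Under this rule license A can receive its own replies but has no subject on which to observe license B's.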
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Closes the open broker (anonymous publish to any tenant's corrosion.*).
Per-license isolation via NATS user/password + subject permissions:
each license -> user=license_id, password=HMAC-SHA256(license_id,
NATS_TOKEN_SECRET), scoped to corrosion.{license_id}.> + _INBOX. Backend
uses a privileged internal user.
- Agent (alpha.5): nats_user/nats_password config + env, user_and_password
auth; falls back to token/anonymous (transition-safe)
- Backend: connects with NATS_INTERNAL_USER/PASSWORD when set, else anon
- scripts/generate-nats-auth.mjs: regenerates nats-auth.conf from the
licenses table; NATS_AUTH_STAGE=open keeps a no_auth_user fallback
(verify creds first), =enforce rejects anonymous
- committed nats-auth.conf is the SAFE OPEN default (no secrets); the
host copy carries real users and is not committed
- compose: NATS_INTERNAL_USER/PASSWORD/NATS_TOKEN_SECRET, mount nats-auth.conf
Entirely non-breaking until secrets+config deployed; staged cutover next.
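A sketch of the per-license password derivation. The commit only states HMAC-SHA256(license_id, NATS_TOKEN_SECRET); treating the secret as the HMAC key, the license id as the message, and hex as the output encoding are all assumptions here, and natsPasswordFor is a hypothetical name.

```typescript
import { createHmac } from "node:crypto";

// Assumption: secret = HMAC key, license id = message, hex output.
function natsPasswordFor(licenseId: string, tokenSecret: string): string {
  return createHmac("sha256", tokenSecret).update(licenseId).digest("hex");
}
```

Because the password is derived, the generator and the backend can both recompute it from the licenses table and one secret, with no per-license password storage.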
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Tenant-scoped fleet read: GET /api/fleet returns agent_hosts (host
metrics) each with their game_instances, plus a summary
(host/instance/online counts). FleetView lists host cards (status, CPU/
mem/disk/uptime/last-heartbeat) with their instances (game, state badge,
uptime); honest empty state -> Server page when no hosts. New 'Fleet'
sidebar nav item across all four game profiles, /fleet route. Store
follows the no-throw-on-fetch pattern (error state, never bricks). The
marketing hero is now backed by the live fleet tables.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Migration 022 adds agent_hosts / game_instances / instance_clusters /
instance_stats (named agent_hosts to avoid the existing B2B hosts
table). HostAgentConsumerService now parses the full v2 heartbeat and
upserts an agent_hosts row (host metrics: cpu/mem/disk/agent version,
keyed by license_id+hostname until enrollment) plus one game_instances
row per heartbeat instance entry (state + uptime, the billing unit).
Legacy server_connections write retained so the current panel keeps
working — additive migration, nothing breaks. Staleness sweep + offline
beacon now flip agent_hosts too. cluster_id FK reserved for Soulmask/
Dune. Migration applied to live DB; tsc green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Automated security review (HIGH) caught a jail-escape my own review
missed: copy_recursive used fs::metadata (follows symlinks). A symlink
inside the jail pointing to e.g. /etc, then a 'copy' of its parent dir,
would dereference it and pull external content INTO the jail where it
could be read — a read-escape exfiltration. jail() validates only the
top-level src/dest; the recursive walk reintroduced the escape.
Fix: copy_recursive uses symlink_metadata and refuses any symlink
('symlinks are not followed across the jail boundary'). list() likewise
switched to symlink_metadata so it reports the link, never the
dereferenced target's size/type (info leak). Two regression tests added:
copy-symlink-exfil (asserts no external content lands inside) and
list-no-deref. 44/44 tests green. Rolled forward to alpha.4 (vulnerable
alpha.3 superseded).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
steam_update func runs SteamCMD per game (rust/conan/soulmask app-ids;
dune rejected), streaming stdout to {instance}.steam_status. Jailed
file manager on {instance}.files.cmd: list/read/write/delete/rename/
mkdir/mkfile/move/copy, all confined to instance root via two-stage
lexical-normalize + canonicalize (defeats ../ traversal AND symlink
escape — incl chained symlinks). Replaces the Go agent's UNJAILED
legacy files API (retired, not ported). 5MiB read cap.
42/42 tests green: 24 filemanager incl 7 jail-escape attempts
(dotdot, deep dotdot, absolute, symlink-inside, direct symlink,
chained symlink), 5 steamcmd app-id (cfg-gated win/linux soulmask).
Jail logic reviewed line-by-line: Path::starts_with is component-wise
(no sibling-prefix bypass), non-existent suffix components can't be
symlinks, leading .. normalizes to / and fails the prefix check.
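The lexical half of the jail check, illustrated in TypeScript (the agent is Rust and uses Path::starts_with; this is not its code). Function names are hypothetical, paths are treated as absolute POSIX paths, and the canonicalize stage that defeats symlinks is not shown.

```typescript
// Split a path into components, resolving "." and ".." lexically.
function normalizeComponents(path: string): string[] {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      out.pop(); // popping an empty list clamps a leading ".." at "/"
      continue;
    }
    out.push(part);
  }
  return out;
}

// Component-wise prefix test: "/srv/inst-evil" is NOT inside "/srv/inst",
// which a plain string startsWith would get wrong.
function isInsideJail(root: string, candidate: string): boolean {
  const r = normalizeComponents(root);
  const c = normalizeComponents(candidate);
  return c.length >= r.length && r.every((part, i) => c[i] === part);
}
```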
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
corrosion-nginx reported (unhealthy) despite serving the panel fine:
nginx listens 0.0.0.0:80 (IPv4 only, no listen [::]:80), but
'localhost' resolves to ::1 first inside the container, so the probe
got connection-refused. Verified: 127.0.0.1:80 serves the SPA. Probe
now targets IPv4 explicitly. No nginx config change — the panel was
never broken, only the healthcheck's hostname resolution.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Lesson 10 in the flesh: the onApplicationBootstrap fix made the NATS->
WS bridge actually deliver events for the first time, which instantly
crashed the API. esModuleInterop is off, so 'import WebSocket from ws'
compiles to ws_1.default = undefined; WebSocket.OPEN threw
'Cannot read properties of undefined' and killed the process on the
first heartbeat forward. All three WS guard sites (nats-bridge x2,
console gateway) switched to the import-agnostic instance constant
client.OPEN. Latent in every build — never hit because the bridge was
dead-on-boot until today.
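A sketch of the import-agnostic guard. SocketLike is a hypothetical structural type covering only what the guard needs, and forwardIfOpen is an illustrative name, not the bridge's actual function.

```typescript
// ws sockets expose OPEN on each instance as well as on the class, so the
// guard never has to touch the (possibly undefined) default import.
interface SocketLike {
  readonly readyState: number;
  readonly OPEN: number;
  send(data: string): void;
}

function forwardIfOpen(client: SocketLike, payload: unknown): boolean {
  if (client.readyState !== client.OPEN) return false;
  client.send(JSON.stringify(payload));
  return true;
}
```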
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Production bug caught live: provider onModuleInit order put bridge/
consumer subscription hooks BEFORE NatsService finished connecting, so
every subscribe() hit the [OFFLINE] no-op path — the WS bridge has been
dead-on-boot in every production build, and the new v2 consumer never
saw a heartbeat (server_connections stayed empty under a live agent).
onApplicationBootstrap is guaranteed to run after all module inits,
including the awaited NATS connect.
The new CI contract suite fails on exactly this class of bug.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
ci.yml runs on every push to main: backend tsc, frontend vue-tsc+vite,
cargo test (cached), then an integration job with postgres:16 + nats
service containers — real migrations applied to a fresh DB, real
backend booted (admin seed provides the license), real agent binary
spawned. contract-tests/agent-backend.contract.mjs proves the entire
v2 pipeline: heartbeat shape + measured telemetry, auto-registered
server_connections row flipping connected, instance start/stop/status
round-trips with push events, and the offline beacon flipping the row
back. Until now this check could only happen after a production rebuild; it
now runs on every push to main.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
rcon func on the instance command channel: WebSocket JSON WebRCON with
Identifier correlation (skips chat/log noise frames) and full Valve
Source RCON over TCP (auth, exec, multi-packet reassembly via empty
probe, 1MiB cap). Protocol inferred from game, explicit kind override
in [instance.rcon]. Always 127.0.0.1 — agent is co-located.
Hardening from review: WebRCON password never interpolated into error
contexts/logs (redacted URL); probe-tolerant termination — a quiet
period after received data ends the response for servers that don't
echo the probe (Soulmask conformance unverified), so data is never
discarded on probe timeout.
13/13 tests green incl. mock Source-RCON server (auth/multi-packet/
errors) and mock WebRCON server (noise-frame skipping).
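For reference, Source RCON packet framing sketched in TypeScript (the agent is Rust; this is an illustration, not its code). The layout follows the published Valve protocol: a little-endian size, id, and type, then a NUL-terminated body plus one extra NUL. encodeRconPacket is a hypothetical name.

```typescript
const SERVERDATA_AUTH = 3;
const SERVERDATA_EXECCOMMAND = 2;
const SERVERDATA_RESPONSE_VALUE = 0;

function encodeRconPacket(id: number, type: number, body: string): Uint8Array {
  const bodyBytes = new TextEncoder().encode(body);
  // size counts everything after the size field itself:
  // id (4) + type (4) + body + body terminator (1) + empty string (1)
  const size = 4 + 4 + bodyBytes.length + 2;
  const buf = new Uint8Array(4 + size);
  const view = new DataView(buf.buffer);
  view.setInt32(0, size, true);
  view.setInt32(4, id, true);
  view.setInt32(8, type, true);
  buf.set(bodyBytes, 12);
  return buf; // the two trailing bytes are already zero
}
```

The empty probe mentioned above would be an empty-body SERVERDATA_RESPONSE_VALUE packet sent right after the command; a server that echoes it marks the end of a multi-packet response.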
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
A stale or revoked token previously rendered the full panel chrome and
only collapsed on the first API call. App boot now calls /auth/me
through useApi (401 -> refresh -> logout already handled there); user
profile refreshes on success, and non-auth failures (network, 5xx)
never log the user out.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Exact-match on 'corrosionmgmt.com' meant www. or any staging host
silently served the panel instead of the marketing site. Hosts now come
from VITE_MARKETING_HOSTS (comma-separated, defaults cover bare + www).
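A sketch of the host-list parsing. The function names are hypothetical and the default list is inferred from "bare + www"; in the app the value would come from import.meta.env.VITE_MARKETING_HOSTS.

```typescript
// Assumed defaults: the bare domain plus www.
const DEFAULT_MARKETING_HOSTS = ["corrosionmgmt.com", "www.corrosionmgmt.com"];

function parseMarketingHosts(envValue: string | undefined): string[] {
  const hosts = (envValue ?? "")
    .split(",")
    .map((h) => h.trim().toLowerCase())
    .filter((h) => h.length > 0);
  return hosts.length > 0 ? hosts : DEFAULT_MARKETING_HOSTS;
}

function isMarketingHost(hostname: string, hosts: string[]): boolean {
  return hosts.indexOf(hostname.toLowerCase()) !== -1;
}
```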
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Per-instance ProcessSupervisor: tokio child spawn with proper arg list
(fixes Go's naive space-splitting), graceful SIGTERM with 30s budget
then force kill, monitor task classifying ordered-stop vs crash (exit
code captured), watch-channel state observable everywhere. Instance cmd
channel live on corrosion.{license}.{instance}.cmd (start/stop/restart/
status) with state events pushed on {instance}.status (keep-latest
semantics, documented). Heartbeats now carry live process state +
uptime per instance. Crate restructured lib+bin for integration tests.
Verified: 5 integration tests with real OS processes (lifecycle, crash
exit-code, restart recovery, unmanaged rejection, clean spawn failure)
+ live-NATS contract test (request-reply roundtrips, double-start
rejection, push events, heartbeat state) — all green.
Known limitation (documented): no PID adoption yet — agent restart
orphans a running game process to 'stopped' until panel restart.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Every page was previously titled 'Corrosion Management' with zero meta, so
marketing was invisible to search and link previews. Router afterEach now
sets title/description/og per route (no new deps); marketing pages get
real content-backed descriptions, panel views mechanical titles.
index.html carries defaults for pre-JS crawlers. Verified in-browser
per page via Playwright.
test-runner.yml: per-tool presence checks instead of green-lighting
missing toolchains; workflow_dispatch instead of every push.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Nothing persisted agent heartbeats before: companion_last_seen was
written once at setup and connection_status stayed 'connected' forever.
HostAgentConsumerService now consumes corrosion.*.host.heartbeat
(updates last_seen + status, auto-creates the bare_metal connection row
on first contact), host.going_offline (graceful offline), and sweeps
connections offline after 180s of heartbeat silence. License-existence
tenant validation with caching per NATS-consumer doctrine. WS bridge
forwards host_heartbeat/host_going_offline to the panel.
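The staleness sweep can be sketched as below. ConnectionRow and sweepStale are hypothetical names operating on in-memory rows; the real service updates server_connections in the database.

```typescript
const STALE_AFTER_MS = 180 * 1000; // 180s of heartbeat silence

interface ConnectionRow {
  licenseId: string;
  lastSeenMs: number;
  status: "connected" | "offline";
}

// Flip every connected row whose last heartbeat is older than the budget
// and return the affected license ids.
function sweepStale(rows: ConnectionRow[], nowMs: number): string[] {
  const flipped: string[] = [];
  for (const row of rows) {
    if (row.status === "connected" && nowMs - row.lastSeenMs > STALE_AFTER_MS) {
      row.status = "offline";
      flipped.push(row.licenseId);
    }
  }
  return flipped;
}
```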
Contract-verified against production NATS with the backend's own nats
lib: v2 subjects, schema 2, real telemetry, offline beacon.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Asgard runner executes jobs in bare node:20-bullseye (no Rust, no sudo), so
install rustup + musl/mingw cross toolchains per-run, same pattern as
setup-go in the Go pipeline. agent-v2.0.0-alpha.1 predates this fix;
forward-only doctrine: version rolls to alpha.2 rather than re-pushing
the tag.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Separate tag namespace from the Go pipeline (v*.*.*) per the
blast-radius doctrine; artifacts publish to /host-agent/alpha/ and a
versioned dir, leaving /host-agent/latest/ on the Go build until
cutover. Linux = static musl, Windows = mingw (msvc/cargo-xwin stays
the local release path). Tag-vs-Cargo.toml version gate included.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
New corrosion-host-agent/ crate (Go companion-agent stays as behavior
reference until parity). Wire protocol v2 per COA-B: instance-scoped
subjects corrosion.{license}.{instance}.* + host-level .host.* — spec
in PROTOCOL.md, designed for the license->host->instance fleet model.
- Multi-instance TOML config in the foundation, not retrofitted
- NATS layer on the Vigilance production profile (infinite reconnect,
capped backoff, 30s ping, 8192-msg offline buffer)
- Heartbeat with real sysinfo telemetry — Go agent shipped hardcoded
disk/cpu placeholders; this is the panel's first true Resources data
- Connectivity prober (outbound TCP, periodic + on-demand)
- Host cmd channel (ping/probe/sysinfo), going-offline beacon,
CancellationToken shutdown
- Live-fire verified against production NATS; artifacts: 3.7MB static
linux-musl, 3.8MB windows .exe (static CRT)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>