corrosion-admin-panel

Author	SHA1	Message	Date
Vantz Stockwell	62bc9cd2a3	feat(wipes): wire the auto-wiper — scheduled wipes now actually fire Some checks failed CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 15s Details CI / agent-tests (push) Failing after 30s Details CI / integration (push) Has been skipped Details wipe_schedules rows existed but nothing read or fired them — an operator could set a wipe schedule and it would never trigger (the headline auto-wipe feature was inert; the manual trigger worked, the scheduler did not). - WipesService now implements OnModuleInit/OnModuleDestroy with a 60s executor (mirrors SchedulesService): bootstraps next_scheduled_run, then fires every active schedule whose next_scheduled_run <= now via triggerWipe(...'scheduled') -> instancesService.wipeForLicense -> the agent wipe handler, advancing next_scheduled_run from the cron each cycle (advances even on failure so a broken schedule can't re-fire every 60s). - triggerWipe parameterized with triggerType ('manual' | 'scheduled') so wipe_history records the real origin. - Extracted nextCronDate into src/common/cron.util.ts (shared by the event and wipe schedulers; was duplicated/private). Cron is evaluated UTC — the per-schedule timezone column is still not honored, a known limitation shared by both schedulers (follow-up: tz-aware cron lib). Backend tsc green. Scheduling logic is at parity with the in-production event scheduler; live end-to-end (a scheduled wipe deleting real files) verifies when a game stack + agent are connected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 01:50:49 -04:00
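The executor cycle described in 62bc9cd2a3 can be sketched roughly as follows. This is a minimal illustration, not the code from WipesService: the `WipeSchedule` shape, `runWipeCycle`, and the `NextRun`/`Trigger` callbacks are hypothetical stand-ins for the entity, the shared nextCronDate helper, and triggerWipe, and the trigger is synchronous here for brevity where the real service would await it.

```typescript
// Hypothetical shapes: the commit names wipe_schedules.next_scheduled_run but
// does not show the entity or service code.
interface WipeSchedule {
  id: string;
  active: boolean;
  nextScheduledRun: Date | null;
}

// Stand-in for the shared nextCronDate helper (signature assumed).
type NextRun = (schedule: WipeSchedule, after: Date) => Date;

// Stand-in for triggerWipe(..., 'scheduled'); throws when the wipe fails.
type Trigger = (schedule: WipeSchedule) => void;

// One executor cycle: returns the ids of schedules that fired successfully.
function runWipeCycle(
  schedules: WipeSchedule[],
  now: Date,
  nextRun: NextRun,
  trigger: Trigger,
): string[] {
  const fired: string[] = [];
  for (const schedule of schedules) {
    if (!schedule.active) continue;

    // Bootstrap: a schedule with no next run gets one and waits for it.
    if (schedule.nextScheduledRun === null) {
      schedule.nextScheduledRun = nextRun(schedule, now);
      continue;
    }
    if (schedule.nextScheduledRun.getTime() > now.getTime()) continue;

    try {
      trigger(schedule);
      fired.push(schedule.id);
    } catch {
      // A failed wipe is not retried this cycle; the real service would log it.
    } finally {
      // Advance even on failure so a broken schedule cannot re-fire every cycle.
      schedule.nextScheduledRun = nextRun(schedule, now);
    }
  }
  return fired;
}
```

The `finally` block carries the "advances even on failure" rule: whether the trigger throws or not, the schedule's next run moves forward before the next cycle.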
Vantz Stockwell	e23b6a7e69	feat(brand): chemistry rebrand across panel + marketing Some checks failed CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Failing after 34s Details CI / integration (push) Has been skipped Details The logged-in panel is now Catalyst Console (by Corrosion); the marketing site keeps Corrosion as the platform/company and introduces the lexicon. - Wordmark: panel/auth Logo lockup -> 'Catalyst' / 'by Corrosion'; the shared C-core house mark (CorrosionMark) is untouched. Marketing nav/footer keep the 'Corrosion' wordmark. - Titles: panel routes -> '{View} · Catalyst'; auth -> Catalyst; document.title fallback + index.html -> 'Catalyst Console'. Marketing titles stay '— Corrosion'. - Host agent user-facing copy -> 're-Agent' across panel + marketing (the binary filename / CDN URLs / config paths / domains are UNCHANGED — that's the separate infra/binary-rename sprint; 'Download re-Agent' fetching corrosion-host-agent-* is the intended intermediate state). - Deploy-recipe 'blueprint/template' -> 'Formula/Formulae' in marketing + roadmap; Rust in-game 'blueprint wipe' kept (game term). - docs/BRANDING.md added (Oracle review + locked lexicon). vue-tsc + vite green; rendered clean both faces (Catalyst panel / Corrosion marketing), 0 console errors. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 01:19:01 -04:00
Vantz Stockwell	215355d1cb	fix(security): prevent RCON command injection in player kick/ban/unban (HIGH) Some checks failed CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Failing after 29s Details CI / integration (push) Has been skipped Details Player id and ban reason flowed unsanitized into the single-line RCON command, so a control char (newline/CR) in 'reason' could break the framing and inject a second console command — an RBAC-escalation vector (a Moderator-role user could run arbitrary RCON via the ban reason field). - validate player id against a safe token charset /^[A-Za-z0-9_.:-]{1,64}$/ and reject otherwise (multi-game safe — not a Rust-only SteamID64 regex, so Conan/Funcom and Dune ids still pass) - strip C0 control chars from reason, collapse whitespace, cap at 200 chars - coerce ban duration to a non-negative integer Flagged by automated commit security review. Backend tsc green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 22:36:44 -04:00
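The three input rules listed in 215355d1cb could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, only the id regex is quoted from the commit message, and control characters are replaced with a space (then collapsed) rather than deleted so that words on either side of a newline do not fuse.

```typescript
// Charset quoted from the commit message; everything else here is illustrative.
const PLAYER_ID_RE = /^[A-Za-z0-9_.:-]{1,64}$/;

// Reject any player id outside the safe token charset.
function assertSafePlayerId(id: string): string {
  if (!PLAYER_ID_RE.test(id)) {
    throw new Error("invalid player id");
  }
  return id;
}

// Neutralise the ban reason so it cannot break single-line RCON framing.
function sanitizeReason(reason: string): string {
  return reason
    .replace(/[\u0000-\u001f]/g, " ") // C0 control chars, including CR and LF
    .replace(/\s+/g, " ")             // collapse whitespace runs
    .trim()
    .slice(0, 200);                   // cap at 200 chars
}

// Coerce a ban duration to a non-negative integer; anything unusable becomes 0.
function coerceDuration(value: unknown): number {
  const n = Math.floor(Number(value));
  return Number.isFinite(n) && n > 0 ? n : 0;
}
```

With these rules a reason such as `"griefing\nsay pwned"` ends up as a single line, so the second console command is never framed as its own RCON line.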
Vantz Stockwell	440474290b	feat: wire the panel command surface to the live Rust agent + wipe handler All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 15s Details CI / agent-tests (push) Successful in 1m35s Details Build Host Agent (Rust) / build (push) Successful in 1m48s Details CI / integration (push) Successful in 23s Details The legacy Go agent was never deployed, so the entire backend command surface published to a dead cmd.server/cmd.wipe/files.cmd void. Route it all to the Rust agent's instance-scoped subjects. Agent (corrosion-host-agent, alpha.10): - New src/wipe.rs + 'wipe' func on {instance}.cmd: stop -> delete game files by type (map/blueprint/full, with optional backup) -> restart. Jailed to the instance root, symlink-safe (lstat, no cross-boundary follow — Lesson 26). 8 tests incl. jail-escape + symlink-skip proofs. Agent suite 64 tests green. Backend (NestJS): - InstancesService is now @Global with license-scoped convenience wrappers (lifecycleForLicense/rconForLicense/writeFileForLicense/readFileForLicense/deleteFileForLicense/wipeForLicense) + resolveDefaultInstance (license -> primary instance). - Routed to the agent: servers start/stop/restart/command; players kick/banid/unban via RCON; schedules restart/announce/command/plugin-reload; wipes -> wipeForLicense (real wipe now); plugins reload/unload/upload via rcon+file ops; all 9 plugin-config module applies -> writeFileForLicense + oxide.reload rcon, imports -> readFileForLicense (server:// prefix stripped). - Honestly gated (need agent funcs not yet built): server deploy-from-panel, Oxide install, one-click uMod install -> 503 coming-soon instead of dead publishes. Backend tsc green; agent cargo test green (64). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> agent-v2.0.0-alpha.10	2026-06-11 22:30:18 -04:00
Vantz Stockwell	6f783bfac8	feat(panel): Beta sweep — multi-game coherence, honesty, UX fixes All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 15s Details CI / agent-tests (push) Successful in 45s Details CI / integration (push) Successful in 22s Details Multi-game rebrand (no more Rust-only leftovers): game-neutral setup wizard + deploy/store defaults; player-id labels driven by game profile (Steam ID only for Rust); blueprint wipe type + verify-plugins gated to uMod games; oxide command examples + Rust-only plugin pages (AutoDoors/FurnaceSplitter/BetterChat) guarded behind mods==='umod' with empty-states for other games. Honesty: webstore checkout shows coming-soon (backend now 503s); 'integrated webstore' marketed as coming-soon; Discord references neutralized to community/webhook; migration FAQ marked in-development; analytics dev phase labels removed; Network pricing tier set to Custom/Contact (was a confusing duplicate of Operator); docs/PRICING.md rewritten to match live subscriptions. UX/bugs: fixed ServerView oxide-status operator-precedence bug; dead 'Deploy server' button wired; non-functional topbar search removed; alert()/confirm() replaced with toasts across schedules/alerts/migration/public store+server; analytics chart arrays null-guarded; production console.logs gated to DEV. Frontend build (vue-tsc + vite) green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 22:06:10 -04:00
Vantz Stockwell	f2ea415840	fix(api): Beta hardening — real 500 fix, encryption guard, honest payments All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 15s Details CI / agent-tests (push) Successful in 1m36s Details CI / integration (push) Successful in 23s Details - analytics: getMapAnalytics queried map.name but the map_library column is display_name (no name column) — every map-analytics call 500'd. Fixed select + groupBy to map.display_name. - setup: guard ENCRYPTION_KEY length before AES-256-GCM createCipheriv — an unset key crashed bare-metal setup with an opaque 'Invalid key length' 500; now returns a clear 503. Also stop falsely marking bare-metal connected on completeSetup; leave offline until the agent's first heartbeat. - webstore: public checkout returned a FAKE PayPal order token + sandbox URL that resolves to nowhere. Refuse honestly with 503 (payments coming soon) instead of faking a transaction. - store: module purchase wrote a fake txn_<ts> implying a charge; record it honestly as a free Beta grant (transaction_id=beta-free-grant, amount 0). Backend tsc green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 21:53:22 -04:00
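A key-length guard of the kind f2ea415840 describes might be sketched like this. The assumptions are loud ones: the key is taken to be hex-encoded (the commit does not say how ENCRYPTION_KEY is encoded), and `ServiceUnavailableError`, `loadEncryptionKey`, and `encryptSecret` are hypothetical names rather than the setup service's real ones.

```typescript
import { createCipheriv, randomBytes } from "node:crypto";

// Hypothetical error type standing in for the framework's 503 exception.
class ServiceUnavailableError extends Error {
  readonly status = 503;
}

// Assumption: ENCRYPTION_KEY is hex-encoded. AES-256 needs exactly 32 key bytes.
function loadEncryptionKey(raw: string | undefined): Buffer {
  const key = Buffer.from(raw ?? "", "hex");
  if (key.length !== 32) {
    throw new ServiceUnavailableError(
      "ENCRYPTION_KEY is missing or not 32 bytes; encryption is unavailable",
    );
  }
  return key;
}

// Check the key first so a bad key surfaces as a clear 503, not an opaque
// 'Invalid key length' thrown from createCipheriv.
function encryptSecret(plaintext: string, rawKey: string | undefined) {
  const key = loadEncryptionKey(rawKey);
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}
```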
Vantz Stockwell	d13f2cb8b1	feat(host-agent): Phase 2 — Dune docker-compose adapter via Supervisor trait Some checks failed CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 15s Details CI / agent-tests (push) Failing after 35s Details CI / integration (push) Has been skipped Details Build Host Agent (Rust) / build (push) Successful in 1m45s Details Introduce a Supervisor trait (async-trait) so the agent manages games with different models behind one wire contract. ProcessSupervisor (spawned process: rust/conan/soulmask) and the new DockerComposeSupervisor (dune) both impl it; Agent.supervisors is now HashMap<String, Arc<dyn Supervisor>> and instancecmd dispatch is game-agnostic — start/stop/restart/status identical across games, selected by a per-game factory in main. InstanceState moved to the shared supervisor module. DockerComposeSupervisor drives docker-compose up -d / stop / restart against the instance's compose project, with -f/-p/single-service support and a configurable compose binary. New [instance.docker_compose] config block. First cut = lifecycle + cached state; container crash-detection + restart adoption deferred to Phase 3b (reconcilable with a compose ps probe). Trait choice (dyn over enum) per Commander: scales to future planes (kubectl, AMP/podman, SSH) as new struct+impl, no central match. 56 tests green (6 new docker-compose mock-binary tests + 5 refactored process tests), zero warnings. Live verification pending a real Dune stack. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> agent-v2.0.0-alpha.9	2026-06-11 21:33:00 -04:00
Vantz Stockwell	651a35d4be	docs(reference): import Dune: Awakening server-manager references All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 15s Details CI / agent-tests (push) Successful in 39s Details CI / integration (push) Successful in 22s Details Phase 2 references for the host-agent Dune adapter, moved out of volatile /tmp into docs/reference-repos/ (per Commander). Three upstream projects, .git + node_modules + compiled binaries stripped (16MB source). Nested AI-instruction files (.claude/, CLAUDE.md) removed so they don't pollute Corrosion sessions. - icehunter/ dune-admin (Go+React) — 4 control planes; SETUP_DOCKER.md is the closest analog to our agent's Dune docker control plane (compose lifecycle, docker logs, RabbitMQ-via-exec, dune Postgres schema) - adainrivers/ Rust/Tauri desktop — SSH+k8s BattleGroup control, maintenance daemon, in-game admin console (Rust idiom reference) - the4rchangel/ Node web UI replacing battlegroup.bat — matches the Commander's Hyper-V self-host path + game-config schema See docs/reference-repos/README.md for the full index + how we use each. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 21:08:05 -04:00
Vantz Stockwell	0715492ddf	chore(panel): fleet-aware shell footer + drop dead vuefinder dep All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 14s Details CI / agent-tests (push) Successful in 49s Details CI / integration (push) Successful in 22s Details COA-B cleanup: - Sidebar agent-health footer now reads the fleet store (host count / online count / per-host status + last heartbeat) instead of the single legacy server.connection row, which disagreed with the multi-host fleet. Removed the legacy useServerStore dependency from the shell. - Removed the unused 'vuefinder' dependency (replaced by the native file manager): dep + main.ts plugin/CSS registration. Main JS chunk 588kB -> 165kB. Recon reclassified the 'dead cmd.server v1' item: it is the LIVE license-level command path (module config applies, plugin install, schedules, legacy start/stop) served only by the Go agent — a Rust-agent parity gap, not dead code. Left intact. Build-green (vue-tsc) + boots clean in-browser (0 console errors). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 21:04:09 -04:00
Vantz Stockwell	4ef5db5b0d	feat(panel): drive active game from deployed fleet instances All checks were successful CI / backend-types (push) Successful in 8s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 40s Details CI / integration (push) Successful in 23s Details The shell skin / sidebar nav / dashboard terminology now follow the games actually deployed (game_instances.game, agent-reported) instead of a localStorage-only toggle. syncActiveGameFromFleet() derives: one game -> auto-skin to it; zero/multiple -> 'all' neutral. A manual GameSwitcher pick persists and overrides the heuristic. Wired into DashboardLayout via a watch on the fleet store. No schema change: a license's games are the distinct games of its instances (the normalized source of truth) — deliberately not duplicating into a licenses.game column that would drift (Lesson 20). Build-green (vue-tsc) + boots clean in-browser (0 console errors, theming initializes). Authenticated auto-derive confirms live on next instance deploy. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 20:51:36 -04:00
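The derivation rule stated in 4ef5db5b0d (one game skins to that game, zero or several fall back to 'all', a manual pick wins) fits in a few lines. `deriveActiveGame` below is a hypothetical pure helper for illustration; the real syncActiveGameFromFleet() reads the fleet store and persists the manual choice, neither of which is shown in the commit.

```typescript
// instanceGames: the `game` value of every deployed instance for the license.
// manualPick: a GameSwitcher choice, if the operator made one.
function deriveActiveGame(instanceGames: string[], manualPick?: string | null): string {
  // A manual pick overrides the heuristic.
  if (manualPick) return manualPick;

  const [only] = Array.from(new Set(instanceGames));
  const distinctCount = new Set(instanceGames).size;

  // Exactly one distinct game: skin to it. Zero or multiple: neutral 'all'.
  return distinctCount === 1 && only !== undefined ? only : "all";
}
```

Because the result is computed from the instances on every call, there is no stored per-license game value that could drift from what is actually deployed.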
Vantz Stockwell	bb71763714	docs: Lesson 28 — base64-encode multi-line CI secrets (minisign signing key) All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 39s Details CI / integration (push) Successful in 21s Details Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 20:38:56 -04:00
Vantz Stockwell	f18b45e3f2	fix(ci): base64-decode minisign secret key (CI mangles multi-line); bump alpha.8 Some checks failed CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 1m30s Details CI / integration (push) Failing after 13s Details Build Host Agent (Rust) / build (push) Successful in 1m45s Details The 'Sign artifacts' step failed on alpha.7 with 'Error while loading the secret key file' (exit 2): minisign downloaded and ran, but the reconstructed key file was unparseable. A minisign secret key is two lines (comment + base64 blob); Gitea/act_runner secret storage mangles the embedded newline, collapsing it to one line. Decode the secret as base64 (single-line, mangling-proof) with auto-detect fallback to a raw two-line key. Fails loudly with the fix command if the secret is neither form. Requires re-storing MINISIGN_SECRET_KEY as: base64 < secret.key | tr -d '\n' Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> agent-v2.0.0-alpha.8	2026-06-11 20:31:48 -04:00
Vantz Stockwell	702de24e28	fix(ci): fetch minisign static binary (not in bullseye apt); bump alpha.7 Some checks failed CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 15s Details CI / agent-tests (push) Successful in 43s Details Build Host Agent (Rust) / build (push) Failing after 1m33s Details CI / integration (push) Successful in 22s Details alpha.6 signing failed: 'E: Unable to locate package minisign' — minisign isn't packaged for node:20-bullseye. Download the official static linux binary instead. Forward to alpha.7 (alpha.6 published nothing). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> agent-v2.0.0-alpha.7	2026-06-11 20:18:08 -04:00
Vantz Stockwell	6b3e805ac2	feat(host-agent): Phase 3a signed self-update (minisign) + CI signing gate Some checks failed CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 1m27s Details CI / integration (push) Successful in 21s Details Build Host Agent (Rust) / build (push) Failing after 1m33s Details Agent only ever runs a binary whose minisign signature verifies against the EMBEDDED public key. NATS host.cmd func 'update' {url}: download binary + .minisig from the CDN -> verify against embedded pubkey -> atomic swap (.old rollback) -> relaunch. URL allowlist (https + cdn.corrosionmgmt.com only, rejects userinfo-bypass), 100MiB cap. Closes the supply-chain hole: even a malicious CDN upload can't run unsigned. CI: build-host-agent.yml signs every artifact with MINISIGN_SECRET_KEY (Gitea secret) and publishes .minisig alongside; the step FAILS the build if the secret is absent (refuses to ship unsigned). Bumped to alpha.6. 6 deterministic tests (accept valid / reject tampered+garbage+empty sig, URL allowlist incl userinfo-bypass, atomic swap+rollback). Fixtures signed with the real release key so tests need no key at runtime. Full suite 50/50 green; musl + native build clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 20:00:36 -04:00
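The update-URL allowlist in 6b3e805ac2 lives in the Rust agent; the check is illustrated here in TypeScript only to keep one language across these sketches. `isAllowedUpdateUrl` is a hypothetical name, and rejecting every URL that carries userinfo is an assumption that is at least as strict as the "rejects userinfo-bypass" rule in the commit.

```typescript
// Host named in the commit message.
const ALLOWED_UPDATE_HOST = "cdn.corrosionmgmt.com";

// Accept only https URLs on the CDN host, with no userinfo component.
function isAllowedUpdateUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  return (
    url.protocol === "https:" &&
    // Compare the parsed hostname, never a substring of the raw string:
    // in "https://cdn.corrosionmgmt.com@evil.example/" the real host is evil.example.
    url.hostname === ALLOWED_UPDATE_HOST &&
    url.username === "" &&
    url.password === ""
  );
}
```

The allowlist is only the first gate; per the commit, the downloaded binary must still pass minisign verification against the embedded public key before it is swapped in.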
Vantz Stockwell	7c84912ff5	chore(frontend): bump version 1.0.0 -> 1.0.1 All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 50s Details CI / integration (push) Successful in 28s Details Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 19:38:52 -04:00
Vantz Stockwell	355a53f6e3	feat(files): native instance-scoped file browser (replaces broken VueFinder) All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 42s Details CI / integration (push) Successful in 22s Details FileManagerView rewritten as a native DS browser on the per-instance file bridge: instance selector, breadcrumb nav, dir-first listing (name/size/modified), folder drill-down, inline file editor (read/save), toolbar (new folder/file/refresh), per-row rename + delete-confirm. New files store wraps the /instances/:id/files* endpoints. VueFinder import + RemoteDriver fully removed — no more retired-protocol /api/files. Honest empty (no instance -> Server page) + error (retry) states, never the global error boundary. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 19:31:01 -04:00
Vantz Stockwell	589516a021	feat(api): complete per-instance file op-set (delete/rename/mkdir/mkfile/move/copy) All checks were successful CI / backend-types (push) Successful in 8s Details CI / frontend-build (push) Successful in 15s Details CI / agent-tests (push) Successful in 54s Details CI / integration (push) Successful in 25s Details Rounds out the per-instance file bridge to the agent's full jailed file manager so a real file browser can be built on it: POST :id/files/{delete,rename,mkdir,mkfile,move,copy}, all via requestScoped (license-scoped reply) on the new agent {op,path} protocol. files.manage. The broken legacy VueFinder /api/files (retired Go fm_* protocol, wrong subject, default _INBOX) is superseded by this — frontend rewrite next. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 19:24:31 -04:00
Vantz Stockwell	f60e6abd33	feat(server): config file editor — read/edit/save a host config file per instance All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 44s Details CI / integration (push) Successful in 21s Details The Server page's config-honesty note now leads somewhere real: a Configuration file panel that loads a config file from the instance (prefilled with the game's primaryConfigFile hint — server.cfg, ServerSettings.ini, GameXishu.json), edits it in a mono textarea, and saves it straight to the host through the jailed agent file bridge. Not-found is handled gracefully (empty editor to create). Works across games; gameProfiles gains primaryConfigFile per game. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 19:07:59 -04:00
Vantz Stockwell	877fadcb6c	feat(api): per-instance file bridge — list/read/write via the new agent file manager All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 44s Details CI / integration (push) Successful in 21s Details GET /api/instances/:id/files (list) + /file (read), PUT /file (write) — tenant-guarded, routed through requestScoped to the per-instance corrosion.{license}.{instance}.files.cmd using the new agent's {op,path} protocol (jailed to the instance root, symlink-safe). files.view / files.manage perms. Foundation for the per-game config editor and for fixing the legacy VueFinder File Manager (which still speaks the retired Go fm_* protocol on the wrong subject and is broken under per-license auth — separate reconciliation). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 19:00:28 -04:00
Vantz Stockwell	e897a4802f	fix(server): apply lifecycle reply state optimistically (heartbeat lag) All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 42s Details CI / integration (push) Successful in 21s Details The agent reply is authoritative for the action just taken; the fleet DB only updates on the next heartbeat (~10s), so the immediate refetch read a stale state and reverted the UI (Start -> still Stopped). Now apply the reply's state/uptime directly to the instance. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 18:41:19 -04:00
Vantz Stockwell	c0b20f2f78	feat(server): instance-centric controls — real per-instance state + lifecycle All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 55s Details CI / integration (push) Successful in 22s Details The Server page now manages the selected GAME INSTANCE, not the legacy host connection. New instances store flattens the fleet and drives the command bridge. New 'Game instance' panel: real state badge (running/stopped/crashed/configured), uptime, host, and an instance selector when >1. Start/Stop/Restart/Refresh wired to POST /api/instances/:id/lifecycle — gated on the actual instance state (not host connectivity), with telemetry-only instances flagged. Works across all four games (state + lifecycle are game-agnostic). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 18:37:53 -04:00
Vantz Stockwell	06e832fca1	feat(fleet): remove host — DELETE /api/fleet/hosts/:id + Fleet card action All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 36s Details CI / integration (push) Successful in 21s Details Self-service host removal. DELETE /api/fleet/hosts/:id (server.manage, tenant-guarded): refuses while the host is 'connected' (409 — a live agent re-registers on its next heartbeat, stop it first), deletes the host's game_instances explicitly (FK is SET NULL, would otherwise orphan them; instance_stats cascade), and clears the legacy server_connections row if it was the license's last host. Fleet view: offline host cards get a Remove button with inline confirm + toast. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 18:21:04 -04:00
Vantz Stockwell	009ceb86ad	feat(server): real agent credentials + agent.toml setup; per-game config honesty All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 17s Details CI / agent-tests (push) Successful in 45s Details CI / integration (push) Successful in 22s Details Server page Host-agent panel now fetches GET /api/servers/agent-credentials and renders the real agent.toml (license UUID, nats_user, nats_password) instead of the broken LICENSE_ID=license_key env commands that would never connect. Password masked by default with a reveal toggle; copy-to-clipboard uses the real value. Setup commands point at --config /etc/corrosion/agent.toml. Configuration panel: World size / Current seed (Rust-only Facepunch concepts) gated behind isRust; Conan/Soulmask/Dune get an honest note pointing to the File Manager for their real config files instead of fake Rust fields. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 13:23:47 -04:00
Vantz Stockwell	6f31c41dc3	feat(api): instance command bridge + agent credentials endpoint All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 43s Details CI / integration (push) Successful in 21s Details Backend layer wiring the panel to the host agent's per-instance command channel (the unblocker for the Server-page rework): - NatsService.requestScoped(): request-reply with a LICENSE-SCOPED reply subject (corrosion.{license}.reply.<id>) so per-license-scoped agents (no _INBOX permission) can actually reply — the design from the NATS auth work, now exercised. - InstancesModule: POST /api/instances/:id/lifecycle {action} (start/stop/restart/status/steam_update, server.manage) and POST :id/rcon {command} (server.console). Tenant-guarded via game_instances. - GET /api/servers/agent-credentials: derives the agent's NATS user/ password (HMAC) so a customer can configure their agent — closes the post-auth setup gap. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 13:05:22 -04:00
Vantz Stockwell	99433a09d1	docs(claude): Lesson 27 — lint infra config before deploy; compose up -d recreates changed deps All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 42s Details CI / integration (push) Successful in 22s Details Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 12:53:06 -04:00
Vantz Stockwell	b442ef4102	fix(api): consumer rejects malformed heartbeats with no host block (no phantom hosts) All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 41s Details CI / integration (push) Successful in 21s Details Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 12:49:53 -04:00
Vantz Stockwell	856106174a	fix(nats): no_auth_user is top-level, not inside authorization{} — broke broker startup All checks were successful CI / backend-types (push) Successful in 9s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 43s Details CI / integration (push) Successful in 22s Details Caught during the live cutover: nats-server rejects 'unknown field no_auth_user' when it is nested in the authorization block, taking the whole broker down. Both the generator (open stage) and the committed bootstrap default emitted it nested. Moved to top level. Enforce-stage output was unaffected (no no_auth_user), which is what the live broker now runs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 12:47:14 -04:00
Vantz Stockwell	463908b18e	fix(nats): security review — secure-by-default + per-tenant inbox isolation All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 43s Details CI / integration (push) Successful in 23s Details Two HIGH findings from automated review on the generator, both fixed: 1. Cross-tenant inbox access: per-license users were granted _INBOX.>, letting license A subscribe to license B's request-reply responses. Now scoped to corrosion.{license}.> ONLY; replies must ride the license namespace (corrosion.{license}.reply.<id>) — documented in PROTOCOL.md. Agent unchanged (responds to msg.reply); constraint is on the requester (internal user has full >). 2. Default-open auth bypass: generator defaulted to stage=open with a full-access anonymous user — a stale regen left the broker wide open. Now defaults to enforce (secure by default); the explicit 'open' migration stage maps anonymous to a harmless corrosion.unclaimed.> namespace, never real tenant subjects. Committed bootstrap default hardened the same way. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 12:39:31 -04:00
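The inbox-isolation fix in 463908b18e comes down to which reply subjects a per-license user may touch. The sketch below illustrates the idea with a simplified matcher; `replySubject` and `isInLicenseNamespace` are hypothetical helpers, and the prefix check only approximates the broker's `corrosion.{license}.>` permission (it is not NATS's actual subject-matching code).

```typescript
// Reply subject that rides the license namespace, per the commit message.
function replySubject(licenseId: string, requestId: string): string {
  return `corrosion.${licenseId}.reply.${requestId}`;
}

// Simplified stand-in for the permission `corrosion.{license}.>`:
// the subject must start with the license prefix and have at least one more token.
function isInLicenseNamespace(licenseId: string, subject: string): boolean {
  const prefix = `corrosion.${licenseId}.`;
  return subject.startsWith(prefix) && subject.length > prefix.length;
}
```

Under this rule a user for license A can reach `corrosion.A.reply.<id>` but neither `corrosion.B.reply.<id>` nor any `_INBOX.` subject, which is why the requester has to supply a license-scoped reply subject instead of the default inbox.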
Vantz Stockwell	00cff51ce5	feat(nats): per-license auth mechanism — agent user/password, scoped broker, generator (non-breaking) All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 17s Details CI / agent-tests (push) Successful in 1m23s Details Build Host Agent (Rust) / build (push) Successful in 1m38s Details CI / integration (push) Successful in 23s Details Closes the open broker (anonymous publish to any tenant's corrosion.*). Per-license isolation via NATS user/password + subject permissions: each license -> user=license_id, password=HMAC-SHA256(license_id, NATS_TOKEN_SECRET), scoped to corrosion.{license_id}.> + _INBOX. Backend uses a privileged internal user. - Agent (alpha.5): nats_user/nats_password config + env, user_and_password auth; falls back to token/anonymous (transition-safe) - Backend: connects with NATS_INTERNAL_USER/PASSWORD when set, else anon - scripts/generate-nats-auth.mjs: regenerates nats-auth.conf from the licenses table; NATS_AUTH_STAGE=open keeps a no_auth_user fallback (verify creds first), =enforce rejects anonymous - committed nats-auth.conf is the SAFE OPEN default (no secrets); the host copy carries real users and is not committed - compose: NATS_INTERNAL_USER/PASSWORD/NATS_TOKEN_SECRET, mount nats-auth.conf Entirely non-breaking until secrets+config deployed; staged cutover next. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> agent-v2.0.0-alpha.5	2026-06-11 12:33:27 -04:00
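The credential derivation in 00cff51ce5 is written as HMAC-SHA256(license_id, NATS_TOKEN_SECRET). A minimal sketch follows, assuming the secret is the HMAC key, the license id is the message, and the digest is hex-encoded; the commit specifies none of those three details, and `deriveNatsPassword` and `natsCredentials` are hypothetical names.

```typescript
import { createHmac } from "node:crypto";

// Assumption: key = NATS_TOKEN_SECRET, message = license id, hex output.
function deriveNatsPassword(licenseId: string, tokenSecret: string): string {
  return createHmac("sha256", tokenSecret).update(licenseId).digest("hex");
}

// Per the commit, the NATS user is the license id itself.
function natsCredentials(licenseId: string, tokenSecret: string) {
  return {
    user: licenseId,
    password: deriveNatsPassword(licenseId, tokenSecret),
  };
}
```

Because the password is a pure function of the license id and one server-side secret, the broker config generator and the credentials endpoint can each recompute it without storing per-license passwords.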
Vantz Stockwell	7a07d600e7	feat(fleet): Phase B — fleet overview UI + GET /api/fleet read endpoint Tenant-scoped fleet read: GET /api/fleet returns agent_hosts (host metrics) each with their game_instances, plus a summary (host/instance/online counts). FleetView lists host cards (status, CPU/mem/disk/uptime/last-heartbeat) with their instances (game, state badge, uptime); honest empty state -> Server page when no hosts. New 'Fleet' sidebar nav item across all four game profiles, /fleet route. Store follows the no-throw-on-fetch pattern (error state, never bricks). The marketing hero made real from the live fleet tables. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 12:32:55 -04:00
Vantz Stockwell	4a4ae7a5d4	docs(claude): Lesson 26 — jail-at-entry doesn't jail the recursive walk (security review caught what my review missed) All checks were successful CI / backend-types (push) Successful in 10s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 41s Details CI / integration (push) Successful in 21s Details Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 12:04:23 -04:00
Vantz Stockwell	930f655bf5	feat(api): fleet data model Phase A — License -> Host -> Instance All checks were successful CI / backend-types (push) Successful in 14s Details CI / frontend-build (push) Successful in 16s Details CI / agent-tests (push) Successful in 42s Details CI / integration (push) Successful in 22s Details Migration 022 adds agent_hosts / game_instances / instance_clusters / instance_stats (named agent_hosts to avoid the existing B2B hosts table). HostAgentConsumerService now parses the full v2 heartbeat and upserts an agent_hosts row (host metrics: cpu/mem/disk/agent version, keyed by license_id+hostname until enrollment) plus one game_instances row per heartbeat instance entry (state + uptime, the billing unit). Legacy server_connections write retained so the current panel keeps working — additive migration, nothing breaks. Staleness sweep + offline beacon now flip agent_hosts too. cluster_id FK reserved for Soulmask/Dune. Migration applied to live DB; tsc green. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 12:00:52 -04:00
Vantz Stockwell	700dc2254d	fix(host-agent): SECURITY — file manager copy/list no longer follow symlinks out of the jail [CI: some checks failed; backend-types successful in 9s, frontend-build successful in 17s, agent-tests successful in 1m21s, Build Host Agent (Rust) / build successful in 1m34s, integration cancelled] Automated security review (HIGH) caught a jail-escape my own review missed: copy_recursive used fs::metadata (follows symlinks). A symlink inside the jail pointing to e.g. /etc, then a 'copy' of its parent dir, would dereference it and pull external content INTO the jail where it could be read — a read-escape exfiltration. jail() validates only the top-level src/dest; the recursive walk reintroduced the escape. Fix: copy_recursive uses symlink_metadata and refuses any symlink ('symlinks are not followed across the jail boundary'). list() likewise switched to symlink_metadata so it reports the link, never the dereferenced target's size/type (info leak). Two regression tests added: copy-symlink-exfil (asserts no external content lands inside) and list-no-deref. 44/44 tests green. Rolled forward to alpha.4 (vulnerable alpha.3 superseded). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> (tag: agent-v2.0.0-alpha.4)	2026-06-11 11:57:08 -04:00
Vantz Stockwell	7fdca2cd4f	chore(host-agent): bump to 2.0.0-alpha.3 (RCON + supervision + SteamCMD + file manager) [CI: all checks successful; backend-types 9s, frontend-build 16s, agent-tests 1m26s, Build Host Agent (Rust) / build 1m35s, integration 21s] Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> (tag: agent-v2.0.0-alpha.3)	2026-06-11 11:52:05 -04:00
Vantz Stockwell	18f978dde1	feat(host-agent): Phase 1c — SteamCMD update + jailed file manager steam_update func runs SteamCMD per game (rust/conan/soulmask app-ids; dune rejected), streaming stdout to {instance}.steam_status. Jailed file manager on {instance}.files.cmd: list/read/write/delete/rename/mkdir/mkfile/move/copy, all confined to instance root via two-stage lexical-normalize + canonicalize (defeats ../ traversal AND symlink escape — incl chained symlinks). Replaces the Go agent's UNJAILED legacy files API (retired, not ported). 5MiB read cap. 42/42 tests green: 24 filemanager incl 7 jail-escape attempts (dotdot, deep dotdot, absolute, symlink-inside, direct symlink, chained symlink), 5 steamcmd app-id (cfg-gated win/linux soulmask). Jail logic reviewed line-by-line: Path::starts_with is component-wise (no sibling-prefix bypass), non-existent suffix components can't be symlinks, leading .. normalizes to / and fails the prefix check. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 11:51:46 -04:00
Vantz Stockwell	9e5e828c8d	fix(docker): nginx healthcheck uses 127.0.0.1 not localhost — IPv4-only listener [CI: all checks successful; backend-types 10s, frontend-build 16s, agent-tests 44s, integration 21s] corrosion-nginx reported (unhealthy) despite serving the panel fine: nginx listens 0.0.0.0:80 (IPv4 only, no listen [::]:80), but 'localhost' resolves to ::1 first inside the container, so the probe got connection-refused. Verified: 127.0.0.1:80 serves the SPA. Probe now targets IPv4 explicitly. No nginx config change — the panel was never broken, only the healthcheck's hostname resolution. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 11:43:01 -04:00
Vantz Stockwell	fccd5c61c5	docs(claude): Lessons 24-25 — onModuleInit-before-connect dead subscriptions + resurrected-path crash [CI: all checks successful; backend-types 9s, frontend-build 16s, agent-tests 39s, integration 22s] Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 11:17:02 -04:00
Vantz Stockwell	c72a280361	fix(api): WS gateways crashed on first forwarded event — WebSocket.OPEN undefined at runtime [CI: all checks successful; backend-types 9s, frontend-build 16s, agent-tests 40s, integration 20s] Lesson 10 in the flesh: the onApplicationBootstrap fix made the NATS->WS bridge actually deliver events for the first time, which instantly crashed the API. esModuleInterop is off, so 'import WebSocket from ws' compiles to ws_1.default = undefined; WebSocket.OPEN threw 'Cannot read properties of undefined' and killed the process on the first heartbeat forward. All three WS guard sites (nats-bridge x2, console gateway) switched to the import-agnostic instance constant client.OPEN. Latent in every build — never hit because the bridge was dead-on-boot until today. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 11:11:29 -04:00
Vantz Stockwell	a3b4b5cc7d	fix(api): NATS subscriptions moved to onApplicationBootstrap — they silently no-oped before connect [CI: all checks successful; backend-types 10s, frontend-build 16s, agent-tests 47s, integration 22s] Production bug caught live: provider onModuleInit order put bridge/consumer subscription hooks BEFORE NatsService finished connecting, so every subscribe() hit the [OFFLINE] no-op path — the WS bridge has been dead-on-boot in every production build, and the new v2 consumer never saw a heartbeat (server_connections stayed empty under a live agent). onApplicationBootstrap is guaranteed to run after all module inits, including the awaited NATS connect. The new CI contract suite fails on exactly this class of bug. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 11:02:52 -04:00
Vantz Stockwell	4e184ca571	ci: full test gate — types, frontend build, agent tests, agent<->backend contract suite [CI: some checks failed; backend-types successful in 11s, frontend-build successful in 17s, agent-tests successful in 1m48s, integration cancelled] ci.yml runs on every push to main: backend tsc, frontend vue-tsc+vite, cargo test (cached), then an integration job with postgres:16 + nats service containers — real migrations applied to a fresh DB, real backend booted (admin seed provides the license), real agent binary spawned. contract-tests/agent-backend.contract.mjs proves the entire v2 pipeline: heartbeat shape + measured telemetry, auto-registered server_connections row flipping connected, instance start/stop/status round-trips with push events, and the offline beacon flipping the row back. This is the test that could not run before a production rebuild until now — it now runs before every push lands. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:59:44 -04:00
Vantz Stockwell	fde0926d52	feat(host-agent): Phase 1b RCON — WebRCON (rust) + Source RCON (conan/soulmask) rcon func on the instance command channel: WebSocket JSON WebRCON with Identifier correlation (skips chat/log noise frames) and full Valve Source RCON over TCP (auth, exec, multi-packet reassembly via empty probe, 1MiB cap). Protocol inferred from game, explicit kind override in [instance.rcon]. Always 127.0.0.1 — agent is co-located. Hardening from review: WebRCON password never interpolated into error contexts/logs (redacted URL); probe-tolerant termination — a quiet period after received data ends the response for servers that don't echo the probe (Soulmask conformance unverified), so data is never discarded on probe timeout. 13/13 tests green incl. mock Source-RCON server (auth/multi-packet/errors) and mock WebRCON server (noise-frame skipping). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:53:52 -04:00
Vantz Stockwell	4d99c9d99d	feat(frontend): validate persisted session on app boot A stale or revoked token previously rendered the full panel chrome and only collapsed on the first API call. App boot now calls /auth/me through useApi (401 -> refresh -> logout already handled there); user profile refreshes on success, and non-auth failures (network, 5xx) never log the user out. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:49:21 -04:00
Vantz Stockwell	b8f0ccba3c	fix(frontend): env-driven marketing host detection Exact-match on 'corrosionmgmt.com' meant www. or any staging host silently served the panel instead of the marketing site. Hosts now come from VITE_MARKETING_HOSTS (comma-separated, defaults cover bare + www). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:47:15 -04:00
Vantz Stockwell	068a476f39	feat(host-agent): Phase 1a process supervision — instance start/stop/restart/status + push state events Per-instance ProcessSupervisor: tokio child spawn with proper arg list (fixes Go's naive space-splitting), graceful SIGTERM with 30s budget then force kill, monitor task classifying ordered-stop vs crash (exit code captured), watch-channel state observable everywhere. Instance cmd channel live on corrosion.{license}.{instance}.cmd (start/stop/restart/status) with state events pushed on {instance}.status (keep-latest semantics, documented). Heartbeats now carry live process state + uptime per instance. Crate restructured lib+bin for integration tests. Verified: 5 integration tests with real OS processes (lifecycle, crash exit-code, restart recovery, unmanaged rejection, clean spawn failure) + live-NATS contract test (request-reply roundtrips, double-start rejection, push events, heartbeat state) — all green. Known limitation (documented): no PID adoption yet — agent restart orphans a running game process to 'stopped' until panel restart. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:44:24 -04:00
Vantz Stockwell	f706c3c47e	docs(claude): host-agent reality — active Rust crate, tag scheme, runner container truth, command corrections Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:37:37 -04:00
Vantz Stockwell	4c9c322c29	feat(seo): per-route titles + meta descriptions; ci: honest runner test Every page previously titled 'Corrosion Management' with zero meta - marketing invisible to search and link previews. Router afterEach now sets title/description/og per route (no new deps); marketing pages get real content-backed descriptions, panel views mechanical titles. index.html carries defaults for pre-JS crawlers. Verified in-browser per page via Playwright. test-runner.yml: per-tool presence checks instead of green-lighting missing toolchains; workflow_dispatch instead of every push. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:35:58 -04:00
Vantz Stockwell	47fa72763c	feat(api): host-agent protocol v2 consumer — heartbeat persistence, auto-register, staleness sweep Nothing persisted agent heartbeats before: companion_last_seen was written once at setup and connection_status stayed 'connected' forever. HostAgentConsumerService now consumes corrosion.*.host.heartbeat (updates last_seen + status, auto-creates the bare_metal connection row on first contact), host.going_offline (graceful offline), and sweeps connections offline after 180s of heartbeat silence. License-existence tenant validation with caching per NATS-consumer doctrine. WS bridge forwards host_heartbeat/host_going_offline to the panel. Contract-verified against production NATS with the backend's own nats lib: v2 subjects, schema 2, real telemetry, offline beacon. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:35:58 -04:00
Vantz Stockwell	b455bf9f14	ci(host-agent): bootstrap Rust in the runner container; roll to alpha.2 [CI: all checks successful; Build Host Agent (Rust) / build 1m29s, Test Asgard Runner / test 3s] Asgard runner executes jobs in bare node:20-bullseye (no Rust, no sudo) - install rustup + musl/mingw cross toolchains per-run, same pattern as setup-go in the Go pipeline. agent-v2.0.0-alpha.1 predates this fix; forward-only doctrine: version rolls to alpha.2 rather than re-pushing the tag. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> (tag: agent-v2.0.0-alpha.2)	2026-06-11 10:15:36 -04:00
Vantz Stockwell	4abf0ab889	ci(host-agent): Rust agent build pipeline on agent-v* tags -> CDN alpha channel [CI: some checks failed; Build Host Agent (Rust) / build failing after 3s, Test Asgard Runner / test successful in 3s] Separate tag namespace from the Go pipeline (v..*) per the blast-radius doctrine; artifacts publish to /host-agent/alpha/ and a versioned dir, leaving /host-agent/latest/ on the Go build until cutover. Linux = static musl, Windows = mingw (msvc/cargo-xwin stays the local release path). Tag-vs-Cargo.toml version gate included. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> (tag: agent-v2.0.0-alpha.1)	2026-06-11 10:09:43 -04:00
Vantz Stockwell	cea3d66cdd	feat(host-agent): Rust rewrite Phase 0 — multi-instance foundation, v2 wire protocol, real telemetry [CI: all checks successful; Test Asgard Runner / test 3s] New corrosion-host-agent/ crate (Go companion-agent stays as behavior reference until parity). Wire protocol v2 per COA-B: instance-scoped subjects corrosion.{license}.{instance}.* + host-level .host.* — spec in PROTOCOL.md, designed for the license->host->instance fleet model. - Multi-instance TOML config in the foundation, not retrofitted - NATS layer on the Vigilance production profile (infinite reconnect, capped backoff, 30s ping, 8192-msg offline buffer) - Heartbeat with real sysinfo telemetry — Go agent shipped hardcoded disk/cpu placeholders; this is the panel's first true Resources data - Connectivity prober (outbound TCP, periodic + on-demand) - Host cmd channel (ping/probe/sysinfo), going-offline beacon, CancellationToken shutdown - Live-fire verified against production NATS; artifacts: 3.7MB static linux-musl, 3.8MB windows .exe (static CRT) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:02:46 -04:00
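
Note on the jail check described in 18f978dde1: the reasoning in that message (component-wise Path::starts_with, a leading .. normalizing to / and failing the prefix check) can be illustrated with a minimal sketch of the lexical stage only. This is not the agent's code. The function names `normalize` and `jail`, the Unix-style paths, and the assumption that `root` is already absolute and normalized are all hypothetical, and the second stage the commit mentions (canonicalize, which is what handles symlinks) is deliberately left out.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: lexically resolve `requested` against `root`
/// without touching the filesystem. `.` is dropped and `..` pops one
/// component; popping at `/` is a no-op, so the result never climbs above it.
fn normalize(root: &Path, requested: &str) -> PathBuf {
    let mut out = PathBuf::new();
    // Joining an absolute `requested` replaces `root` entirely, so an
    // absolute request is judged on its own and fails the prefix check.
    for comp in root.join(requested).components() {
        match comp {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Hypothetical helper: accept the path only if it stays under `root`.
/// `Path::starts_with` compares whole components, so `/srv/inst-evil`
/// does not count as being inside `/srv/inst`.
fn jail(root: &Path, requested: &str) -> Option<PathBuf> {
    let candidate = normalize(root, requested);
    if candidate.starts_with(root) {
        Some(candidate)
    } else {
        None
    }
}
```

With a root of /srv/inst, a request for saves/world.db resolves inside the jail, while ../../etc/passwd, /etc/passwd and ../inst-evil/x are all rejected. A plain string prefix comparison would have accepted the last one, which is the sibling-prefix bypass the commit message rules out. As 700dc2254d shows, a check like this covers only the entry path; any recursive walk below it still needs its own symlink handling.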

218 Commits