CHANGELOG — Corrosion Admin Panel
All notable changes to this project will be documented in this file.
[Unreleased]
Added (Host-agent Phase 2 — Dune docker-compose adapter — 2026-06-12)
Supervisor trait abstraction (corrosion-host-agent):
- Introduced trait Supervisor (via async-trait, the battle-tested ecosystem standard) so the agent can manage games with fundamentally different models behind one wire contract. ProcessSupervisor (spawned OS process — Rust/Conan/Soulmask) and the new DockerComposeSupervisor (Dune) both implement it; Agent.supervisors is now HashMap<String, Arc<dyn Supervisor>> and the instance command dispatch (instancecmd::dispatch) is fully game-agnostic — start/stop/restart/status are identical across games. A per-game factory in main selects the impl. InstanceState moved to the shared supervisor module.
- Architecture call (per Commander): chose the dyn trait over a zero-dependency enum because the Dune references point at several future management planes (kubectl, AMP/podman, SSH) — a trait makes each new plane "new struct + impl," no central match to edit.
DockerComposeSupervisor (Dune: Awakening):
- Drives docker compose up -d / stop / restart against the instance's compose project (a "battlegroup"), with -f/-p/single-service support and a configurable compose binary (docker compose default, docker-compose legacy). New [instance.docker_compose] config block (file/project/service/command, all optional). steam_update already rejected for Dune (Docker images, no SteamCMD).
- Scope (first cut): lifecycle + cached state. Deferred to Phase 3b (with process PID adoption): container crash-detection and state adoption on agent restart (both reconcilable with a docker compose ps probe).
- Verified: 6 new docker-compose tests (mock docker binary asserting exact invocations + state transitions + failure paths) + the 5 refactored process-supervisor tests; full agent suite 56 tests green, zero warnings. Live verification against a real Dune stack pending the Commander standing one up.
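The command line the DockerComposeSupervisor entry describes can be illustrated with a small sketch. The agent itself is written in Rust; this TypeScript version only shows the argument ordering, and the `DockerComposeConfig` shape and `composeArgv` name are assumptions based on the file/project/service/command fields listed above, not the agent's actual code.

```typescript
// Hypothetical mirror of the [instance.docker_compose] block: every field is optional.
interface DockerComposeConfig {
  file?: string;    // passed as -f
  project?: string; // passed as -p
  service?: string; // restricts the verb to a single service
  command?: string; // compose binary: "docker compose" (default) or "docker-compose"
}

type Verb = "start" | "stop" | "restart";

// Build the argv for one lifecycle verb; element 0 is the program to execute.
function composeArgv(cfg: DockerComposeConfig, verb: Verb): string[] {
  const argv = (cfg.command ?? "docker compose").trim().split(/\s+/);
  // -f and -p are global compose options, so they go before the subcommand.
  if (cfg.file) argv.push("-f", cfg.file);
  if (cfg.project) argv.push("-p", cfg.project);
  // "start" maps to `up -d`; stop and restart map to themselves.
  const sub = verb === "start" ? ["up", "-d"] : [verb];
  for (const part of sub) argv.push(part);
  if (cfg.service) argv.push(cfg.service);
  return argv;
}
```

With an empty config, `composeArgv({}, "start")` yields `docker compose up -d`; setting `command` to `docker-compose` switches to the legacy binary without touching the rest of the argument list.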
Changed (Fleet-driven active game + signed-update CI fix — 2026-06-12)
Frontend — active game follows the deployed fleet:
- The panel's active game (shell skin + sidebar nav + dashboard terminology) is now derived from the deployed instances instead of a localStorage-only toggle. syncActiveGameFromFleet() reads the distinct game values of the license's instances (game_instances.game, reported by the host agent): exactly one game deployed → the shell auto-skins to it; zero or multiple → all (neutral house skin). Wired into DashboardLayout (the always-mounted admin shell) via a watch on the fleet store.
- A manual GameSwitcher pick still wins — it persists to cc-active-game and suppresses auto-derive (operator intent beats the heuristic). Un-overridden panels keep tracking the fleet across sessions.
- No backend/schema change: a license's game(s) are the distinct games of its instances — the normalized source of truth. Deliberately did NOT add a licenses.game column (would duplicate game_instances.game and drift; see Lesson 20).
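The derivation rule above reduces to a small pure function. This is a minimal sketch: `deriveActiveGame` is a hypothetical name, and the manual pick is passed in as a parameter where the real code reads it from the cc-active-game key.

```typescript
// Hypothetical pure core of syncActiveGameFromFleet().
// instanceGames: the game value of every instance on the license.
// manualPick: the operator's GameSwitcher choice, or null when none was made.
function deriveActiveGame(instanceGames: string[], manualPick: string | null): string {
  // Operator intent beats the heuristic.
  if (manualPick !== null) return manualPick;
  // Collapse to distinct game values.
  const distinct = instanceGames.filter((g, i) => instanceGames.indexOf(g) === i);
  // Exactly one game deployed selects its skin; zero or several fall back to "all".
  return distinct.length === 1 ? distinct[0] : "all";
}
```

Keeping the rule free of store and storage access makes the three cases (one game, none, several) easy to unit-test on their own.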
Frontend — sidebar agent-health footer is now fleet-aware:
- The shell footer read a single legacy server.connection (one server_connections row), which disagreed with the multi-host fleet. Repointed it at the fleet store: one host → hostname + status + last-heartbeat; multiple → {online}/{total} online + total instance count. Tone aggregates (all online → healthy, some → degraded, none → offline). Dropped the legacy useServerStore dependency from the shell entirely.
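The tone aggregation can be sketched as follows. The `HostSummary` shape and function names are assumptions for illustration, and treating an empty fleet as offline is a guess that the entry above does not confirm.

```typescript
type Tone = "healthy" | "degraded" | "offline";

// Hypothetical per-host view of the fleet store.
interface HostSummary {
  hostname: string;
  online: boolean;
}

// all online -> healthy, some -> degraded, none -> offline.
function aggregateTone(hosts: HostSummary[]): Tone {
  const online = hosts.filter((h) => h.online).length;
  if (hosts.length > 0 && online === hosts.length) return "healthy";
  return online > 0 ? "degraded" : "offline";
}

// One host shows its hostname; several show "{online}/{total} online".
function footerLabel(hosts: HostSummary[]): string {
  if (hosts.length === 1) return hosts[0].hostname;
  const online = hosts.filter((h) => h.online).length;
  return online + "/" + hosts.length + " online";
}
```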
Frontend — removed dead vuefinder dependency:
- VueFinder was replaced by the native instance-scoped file manager but the plugin (and its CSS) were still globally registered in main.ts and shipped in the bundle. Removed the dep + the three main.ts lines. Side effect: the main JS chunk dropped 588 kB → 165 kB (vuefinder bundled an entire unused file-manager UI).
Recon note (not a change): corrosion.{license}.cmd.server was on the cleanup list as "dead v1" — it is NOT. It remains the live license-level command path for all plugin/module config applies, plugin install, scheduled tasks, and legacy start/stop/restart, served only by the legacy Go agent. The Rust agent does not implement it yet — this is a parity/migration gap (Phase 2+), not dead code. Left intact.
CI — signed host-agent build:
- Fixed the Sign artifacts (minisign) step (Error while loading the secret key file): a minisign secret key is two lines and CI secret storage mangles the embedded newline. The job now base64-decodes the secret (single-line, mangling-proof) with auto-detect fallback to a raw key. MINISIGN_SECRET_KEY must be stored as base64 < secret.key | tr -d '\n'. Verified end-to-end: agent-v2.0.0-alpha.8 Linux + Windows binaries validate against the agent's embedded public key; tampered byte rejected.
Added (Host-Agent v2 Consumer + SEO Meta — 2026-06-11)
Backend (NestJS):
- HostAgentConsumerService (new) — consumes wire protocol v2: corrosion.*.host.heartbeat updates companion_last_seen + connection_status='connected' (auto-registers the connection row on first contact); host.going_offline flips offline; a 60s staleness sweep marks hosts offline after 180s of silence. Previously NOTHING persisted heartbeats — connection_status was set once at setup and never changed again. Tenant-validated (UUID + license existence, cached) per NATS-consumer doctrine
- NatsBridgeService — bridges host_heartbeat/host_going_offline events to the panel WebSocket
- Verified by contract test: real agent → production NATS → captured with the backend's own nats lib under the real license; subjects, schema 2, real telemetry, offline beacon all confirmed
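The heartbeat and staleness logic described for HostAgentConsumerService can be sketched in memory. The real service persists to the database and is driven by NATS; the `HostRow` shape, the `hostRows` store and the function names here are hypothetical.

```typescript
const STALE_AFTER_MS = 180_000; // 180s of silence marks a host offline

interface HostRow {
  lastSeenMs: number;
  status: "connected" | "offline";
}

// In-memory stand-in for the connection table, keyed by license id.
const hostRows: Record<string, HostRow> = {};

// First contact auto-registers the row; later heartbeats refresh it.
function onHeartbeat(licenseId: string, nowMs: number): void {
  hostRows[licenseId] = { lastSeenMs: nowMs, status: "connected" };
}

// The going-offline beacon flips the host immediately.
function onGoingOffline(licenseId: string): void {
  const row = hostRows[licenseId];
  if (row) row.status = "offline";
}

// Meant to run every 60s; returns the ids that were flipped offline.
function sweepStale(nowMs: number): string[] {
  const flipped: string[] = [];
  for (const id of Object.keys(hostRows)) {
    const row = hostRows[id];
    if (row.status === "connected" && nowMs - row.lastSeenMs > STALE_AFTER_MS) {
      row.status = "offline";
      flipped.push(id);
    }
  }
  return flipped;
}
```

With a 60s sweep and a 180s threshold, a host that dies without sending the beacon is marked offline at most about 240s after its last heartbeat.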
Frontend:
- Per-route document titles + meta descriptions (router afterEach, no new deps): six marketing pages get real titles/descriptions/OG tags (previously every page was "Corrosion Management" with zero meta — invisible to search and link previews); panel views get mechanical "{View} — Corrosion" titles
CI:
- test-runner.yml — honest per-tool presence checks (was printing "OPERATIONAL" while every toolchain probe failed); on-demand trigger instead of every push
Added (Corrosion Host Agent — Rust rewrite Phase 0 — 2026-06-11)
New: corrosion-host-agent/ — Rust rewrite of the Go companion agent (which stays in-tree as the behavior reference until parity). Wire protocol v2 (COA-B, Commander-approved): instance-scoped subjects corrosion.{license}.{instance}.* with host-level corrosion.{license}.host.* — full spec in corrosion-host-agent/PROTOCOL.md.
- Multi-instance TOML config baked into the foundation (one agent supervises N game instances; rust/conan/soulmask/dune), env overrides for secrets, strict validation (subject-safe ids, reserved segments)
- NATS layer with the production-proven Vigilance profile: infinite reconnect w/ capped backoff, 30s ping, 8192-msg offline send buffer, tls:// scheme support
- Host heartbeat with REAL telemetry via sysinfo (CPU/mem/disks/per-instance state) — the Go agent hardcoded disk=50000MB and cpu=0.0; this is the first true Resources data
- Connectivity prober (outbound TCP + latency, periodic jittered + on-demand) — first piece of the support-triage story
- Host command channel (ping/probe/sysinfo, request-reply), going-offline beacon, CancellationToken graceful shutdown
- Version embedding (semver + git hash + build ts) in --version and every heartbeat
- Verified live against production NATS: connected, heartbeats published, clean shutdown
- Deploy artifacts verified: 3.7MB fully-static linux-musl binary, 3.8MB windows .exe (static CRT, no VC++ redist needed)
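The "subject-safe ids, reserved segments" validation matters because instance ids become NATS subject tokens. The sketch below shows one way to do it. The exact character set and the reserved list are assumptions (PROTOCOL.md is the authority); `host` is reserved here only because corrosion.{license}.host.* is the host-level namespace.

```typescript
// Assumed rules: lowercase alphanumerics plus "-" and "_". This excludes ".",
// "*", ">" and whitespace, all of which carry meaning in NATS subjects.
const SUBJECT_SAFE = /^[a-z0-9_-]+$/;
const RESERVED_SEGMENTS = ["host"]; // assumed reserved list

// Returns a problem description, or null when the id is acceptable.
function validateInstanceId(id: string): string | null {
  if (!SUBJECT_SAFE.test(id)) return "id is not subject-safe";
  if (RESERVED_SEGMENTS.indexOf(id) !== -1) return "id is a reserved segment";
  return null;
}

// Build an instance-scoped subject: corrosion.{license}.{instance}.{suffix}
function instanceSubject(license: string, instance: string, suffix: string): string {
  const problem = validateInstanceId(instance);
  if (problem !== null) throw new Error(problem + ": " + instance);
  return "corrosion." + license + "." + instance + "." + suffix;
}
```

Rejecting bad ids at config load, before any subject is built, keeps an instance named `host` from shadowing the host-level channel.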
Next phases: 1 = process-class adapter (spawn/RCON/SteamCMD/files for Rust/Conan/Soulmask) + NestJS v2 heartbeat consumer; 2 = Dune Docker adapter; 3 = signed self-update (release gate) + service install.
Fixed (Site Audit — Fake Data, Resilience, Fonts — 2026-06-11)
Frontend:
- SetupWizardView.vue — Replaced fake install instructions (get.corrosionmgmt.com | sh install script and corrosion-agent binary, neither of which exists) with the real host-agent download + run commands matching ServerView; multi-game copy on the completion step
- Marketing views (Landing, Pricing, HowItWorks, Roadmap, EarlyAccess) — Replaced "View live demo" CTA (no demo exists; it linked to the panel login) with an honest "Sign in" link
- ErrorBoundary.vue — Error state now resets on route change (previously one failed view bricked the entire SPA, including marketing pages, until manual reload); added content variant
- DashboardLayout.vue — Routed views are now wrapped in a content-scoped ErrorBoundary so the sidebar/topbar survive a view failure instead of the whole panel unmounting
- index.html / styles/tokens/fonts.css — Google Fonts moved from CSS @import to <link> tags. The bundler silently dropped the mid-bundle @import, so production shipped system fallback fonts (Geist/JetBrains Mono/Oxanium never loaded)
- StatusPageView.vue — Platform KPIs show "—" until the first successful fetch instead of fake zeros
- LoginView.vue — Added missing "Forgot password?" link (route + backend endpoint already existed)
Backend (NestJS):
- AdminSeedService (new, auth module) — Bootstraps a super-admin user + active license from ADMIN_EMAIL/ADMIN_PASSWORD/ADMIN_USERNAME/ADMIN_LICENSE_KEY when the users table is empty. A fresh deploy previously had a schema but no possible login. Compose already passes the env vars
Purpose: Findings from the full-site fake-data audit. Show real data or honest empty states — never invented values, dead URLs, or fabricated zeros.
Fixed (Safe Formatting Utilities — 2026-02-15)
Frontend:
- AnalyticsView.vue — Replaced unsafe .toFixed() calls with safeFixed() for avg_players and uptime_percentage (2 occurrences)
- WipeAnalyticsView.vue — Replaced unsafe .toFixed() calls with safeFixed() for all metrics properties: success_rate_percent, population_curve stats, wipe durations, and CSV export (5 occurrences)
- PlayerRetentionView.vue — Replaced unsafe .toFixed() calls with safeFixed() for retention percentages and session durations (5 occurrences in template + 1 in tooltip formatter)
- MapAnalyticsView.vue — Replaced unsafe .toFixed() calls with safeFixed() for rotation effectiveness, map performance metrics, and table display (6 occurrences)
Purpose: Prevents runtime errors from calling .toFixed() on null/undefined values in analytics views. Uses the safe formatting utilities from @/utils/formatters.ts with optional chaining for all numeric display operations.
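For reference, a helper of this kind might look like the sketch below. The real signature in @/utils/formatters.ts is not shown in this changelog, so the parameter order, the default digit count and the "n/a" fallback are assumptions.

```typescript
// Hypothetical shape of safeFixed(): format a number, or return a fallback
// when the value is null, undefined, NaN or infinite.
function safeFixed(
  value: number | null | undefined,
  digits: number = 1,
  fallback: string = "n/a",
): string {
  if (typeof value !== "number" || !isFinite(value)) return fallback;
  return value.toFixed(digits);
}
```

A call such as `safeFixed(metrics?.success_rate_percent, 1)` then renders the fallback instead of throwing when the metrics object has not loaded yet.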
Fixed (NestJS Entity Alignment — 2026-02-15)
Backend (NestJS):
- NotificationsConfig entity — Renamed email_alerts_enabled → email_enabled, added email_address column
- NotificationsConfig entity — Renamed notification columns to match service expectations: notify_on_start, notify_on_stop, notify_on_crash, notify_on_wipe_start, notify_on_wipe_complete, notify_on_wipe_failure, notify_on_player_threshold, player_threshold
- ScheduledTask entity — Renamed is_active → is_enabled, added last_run column
- TeamMember entity — Renamed accepted_at → joined_at to match service expectations
- Role entity — Added description column
- JwtStrategy — Updated to reference joined_at instead of accepted_at
- Resolved all 12 TypeScript compilation errors caused by entity/service column mismatches
Added (Frontend Gap Closure — 2026-02-15)
New Views:
- SchedulesView.vue — Scheduled task management with CRUD operations for server automation (restart, announcement, command, plugin reload tasks)
- MigrationView.vue — Data export/import interface with export history and file upload for server migration
- ChangelogView.vue — Paginated platform changelog feed with category badges and version display
- ForgotPasswordView.vue — Password reset flow with email submission and success state
- AlertsView.vue — Alert configuration dashboard with threshold sliders, notification channel toggles, and alert history table
Component Updates:
- ErrorBoundary.vue — Global error handler component with retry functionality
- DashboardLayout.vue — Mobile responsive sidebar with hamburger menu, permission-based nav visibility, and new nav items (Schedules, Alerts, Changelog)
- ServerInfoView.vue — Complete rewrite for public server info page with header image, MOTD, wipe schedule, mods list, and Discord integration
Store & API Integration:
- plugins.ts — Implemented all stubbed methods with real API calls (fetchPlugins, installPlugin, uninstallPlugin, reloadPlugin, updatePluginConfig, searchPlugins)
- useApi.ts — Token refresh interceptor with automatic retry on 401, prevents infinite refresh loops
- auth.ts — Added hasPermission() helper with basic permission checking
Router:
- Added routes: /schedules, /migration, /changelog, /alerts, /forgot-password
- Added catch-all route redirecting to home
- All new routes under authenticated dashboard layout
App Structure:
- Wrapped root <RouterView> with ErrorBoundary for global error handling
Purpose: Closes frontend implementation gaps identified during Phase 4. Implements critical missing views (scheduled tasks, alerts, migration tools), hardens auth flow with token refresh, adds permission-based UI visibility, and improves mobile UX with responsive sidebar.
Added (NestJS Backend — Core Modules)
Auth Module (modules/auth/):
- Complete authentication system with JWT and refresh tokens
- DTOs: LoginDto, RegisterDto, RefreshTokenDto, VerifyTotpDto, UpdateProfileDto, ForgotPasswordDto, ResetPasswordDto
- JwtStrategy — Passport strategy with user lookup, license resolution, and role permissions injection
- AuthService — Full auth lifecycle:
  - register() — User creation with auto-generated license key (CORR-XXXX-XXXX-XXXX format)
  - login() — Credential validation, TOTP verification, token generation
  - refresh() — Access token refresh from valid refresh token
  - setupTotp() — TOTP secret generation with QR code (otpauth + qrcode libraries)
  - verifyTotp() — TOTP code validation and 2FA enablement
  - getProfile() / updateProfile() — User profile management
  - forgotPassword() / resetPassword() — Password recovery stubs (SMTP integration pending)
- AuthController — 9 REST endpoints:
  - POST /auth/login — Email/password login with optional TOTP
  - POST /auth/register — New user registration with auto-license creation
  - POST /auth/refresh — Token refresh
  - POST /auth/2fa/setup — Generate TOTP QR code (authenticated)
  - POST /auth/2fa/verify — Enable 2FA (authenticated)
  - GET /auth/me — Current user profile (authenticated)
  - PUT /auth/profile — Update profile (authenticated)
  - POST /auth/forgot-password — Request password reset (public)
  - POST /auth/reset-password — Reset with token (public)
- Password hashing via argon2, TOTP via otpauth with 30-second window validation
- License key auto-generation on registration (random hex parts)
- JWT payload includes: sub (user ID), email, username, is_super_admin, license_id, permissions
- Strategy enriches JWT with license context (owner or team member lookup) and role permissions
Users Module (modules/users/):
- Simple CRUD wrapper around User repository
- UsersService — findById(), findByEmail(), findAll() with password_hash excluded from select
- UsersController — Admin-only endpoints:
  - GET /users — List all users (requires users.view permission)
  - GET /users/:id — Get user by ID (requires users.view permission)
- Password fields excluded from all query results
Licenses Module (modules/licenses/):
- License management with owner authorization
- DTO: ValidateKeyDto — License key validation input
- LicensesService:
  - findById() — License lookup with owner/super admin authorization check
  - findByKey() — Key-based lookup
  - findByOwner() — All licenses owned by user
  - create() — New license generation with CORR-XXXX-XXXX-XXXX format
  - validateKey() — Public key validation returning status and metadata
- LicensesController:
  - GET /licenses/:id — Get license (owner or super admin only)
  - POST /licenses/validate-key — Public key validation endpoint
- License key format: CORR-{4-hex}-{4-hex}-{4-hex} (e.g., CORR-A1B2-C3D4-E5F6)
- Ownership enforced: non-super-admin users can only access their own licenses
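The key format can be made concrete with a short sketch. The function names are hypothetical, uppercase hex is inferred from the example key, and the `Math.random` default is for illustration only; a real generator should draw from a cryptographically secure source.

```typescript
// CORR- followed by three groups of four uppercase hex digits.
const LICENSE_KEY_RE = /^CORR-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}$/;

// randomByte is injectable so the sketch can be tested deterministically.
function generateLicenseKey(
  randomByte: () => number = () => Math.floor(Math.random() * 256),
): string {
  // Each group is two random bytes rendered as four hex digits.
  const part = (): string => {
    let s = "";
    for (let i = 0; i < 2; i++) {
      const hex = (randomByte() & 0xff).toString(16).toUpperCase();
      s += hex.length === 1 ? "0" + hex : hex;
    }
    return s;
  };
  return "CORR-" + part() + "-" + part() + "-" + part();
}

function isWellFormedLicenseKey(key: string): boolean {
  return LICENSE_KEY_RE.test(key);
}
```

A format check like this only rejects malformed input early; whether a key actually exists is still a database lookup, as validateKey() does.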
Patterns Applied:
- All DTOs use class-validator decorators (@IsEmail, @IsString, @MinLength, etc.)
- All controllers use @ApiTags and @ApiBearerAuth for Swagger documentation
- All routes use @ApiOperation for endpoint descriptions
- Custom decorators: @Public(), @CurrentUser(), @CurrentTenant(), @RequirePermission()
- Entity imports from ../../entities/ directory
- ConfigService for environment variables (JWT_SECRET, JWT_ACCESS_EXPIRY_SECONDS, JWT_REFRESH_EXPIRY_SECONDS)
- Multi-tenant isolation: License lookup respects ownership unless super admin
- JwtStrategy enriches request.user with license_id and permissions for downstream guards
Security:
- Argon2 password hashing (not bcrypt — more resistant to GPU attacks)
- TOTP 6-digit codes with ±1 period window validation
- Refresh tokens with separate expiry (default 7 days vs 1 hour access token)
- Password fields never returned in API responses
- License access requires ownership or super admin flag
Status: Core auth, users, and licenses modules operational. Registration creates user + license atomically. Login returns JWT with license context. TOTP 2FA flow complete. Password reset stubbed pending SMTP integration. All endpoints documented via Swagger.
Added (Phase 4 — Module Licensing Backend)
Backend Infrastructure:
- Migration 009_module_licensing.sql — Module marketplace database schema:
  - modules table — Registry of available modules (slug, name, description, category, price, features, version, plugin URL)
  - module_purchases table — License-module ownership tracking with transaction logging
  - module_installations table — Deployment status tracking (pending, installing, installed, failed)
  - Seed data: Loot Manager module ($9.99) with features array
- backend/src/models/modules.rs — Domain models:
  - Module struct with rust_decimal pricing support
  - ModuleWithOwnership — Catalog display with is_purchased flag
  - ModulePurchase, ModuleInstallation — Purchase and deployment records
  - PurchasedModule — Combined view for user's module library
- backend/src/db/modules.rs — Data access layer (11 query functions):
  - get_module_catalog() — All available modules
  - get_catalog_with_ownership(license_id) — Annotated catalog with purchase status
  - get_purchased_modules(license_id) — User's module library with installation status
  - is_module_purchased(license_id, module_id) — Ownership validation
  - record_module_purchase() — Transaction logging with PayPal ID support
  - get_module_installation_status() / update_installation_status() — Deployment tracking
  - get_module_by_id() / get_module_by_slug() — Module lookup
- backend/src/api/modules.rs — REST endpoints with auth middleware:
  - GET /api/modules/catalog — Returns modules with is_purchased flag for current license
  - GET /api/modules/my-modules — Purchased modules with installation details
  - POST /api/modules/purchase — Records purchase (stub transaction for Phase 4 MVP — payment integration deferred to XO's direct touch)
  - POST /api/modules/install — Triggers module installation via ModuleInstaller service
  - GET /api/modules/:module_id/installation-status — Real-time deployment status polling
- Router integration in main.rs at /api/modules with JWT auth requirement
- Cargo.toml dependency: rust_decimal for DECIMAL field support
Multi-Tenancy Enforcement:
- All queries scoped by license_id from JWT claims
- Foreign key constraints enforce license-module binding
- Purchase validation prevents cross-tenant access
- Installation status isolated per license
Payment Integration Strategy:
- Purchase endpoint stubs transaction with "STUB_TRANSACTION" ID
- PayPal integration deferred to XO's direct implementation
- transaction_id and amount_paid fields ready for real gateway
Status: Module licensing backend operational. Catalog queryable, purchases recordable, ownership enforceable, installation status trackable. Payment gateway integration pending.
Added (Phase 4 — Loot Manager Plugin Skeleton)
Plugin Skeleton:
- plugin/modules/LootManager.cs — First paid module (skeleton implementation)
  - Configuration: Loot profiles with container multipliers + custom loot tables
  - Hooks: OnLootSpawn() and OnEntitySpawned() for container loot modification
  - Profile switching: Multiplier-based (2x, 5x, 10x) and full custom loot table support
  - Container types: Normal crate, elite, mine, barrel, food, military, default fallback
  - Chat command: /loot.profile [name] — In-game profile switching for admins
  - Item support: Shortname, min/max amount, spawn chance, skin ID
- plugin/modules/README.md — Module documentation
  - Price: $9.99
  - Features: Visual loot table editor (dashboard integration TBD), profile switching, skin support
  - Installation: Auto-deploy via module store (implementation TBD)
Database:
- Migration 009 already includes Loot Manager seed data in modules table
Status: Skeleton complete. Hooks functional. Profile switching works via chat command. Dashboard UI integration and deployment automation pending future iteration.
Added (Phase 4 — Module Auto-Installation Pipeline)
Backend Service:
- backend/src/services/module_installer.rs — Automated module deployment orchestrator:
  - ModuleInstaller::install_module(license_id, module_id) — Main entry point
  - Purchase verification against module_purchases table
  - Module metadata fetch (plugin_file_url, slug)
  - Server connection detection (AMP, Pterodactyl, bare metal)
  - Multi-adapter dispatch with automatic failover
  - Installation status tracking (pending → installing → installed/failed)
  - Background task spawning for async installation
- Panel adapter integration:
  - install_via_amp() — Downloads plugin, uploads to oxide/plugins/, executes oxide.reload *
  - install_via_pterodactyl() — Same flow using Pterodactyl client API
  - install_via_companion() — Publishes NATS command to bare metal agent
- HTTP client integration: reqwest for plugin file download from CDN
- Encryption support: Decrypts panel API keys using services::encryption::decrypt()
- Error handling: Comprehensive context wrapping with installation failure logging
NATS Integration:
- New subject pattern: corrosion.{license_id}.cmd.module.install
- Request/reply timeout: 60 seconds for companion agent response
- Expected payload: { "module_id": "loot-manager", "download_url": "https://cdn.corrosionmgmt.com/modules/LootManager.cs", "filename": "LootManager.cs", "target_path": "oxide/plugins/" }
- Expected response: { "module_id": "loot-manager", "success": true|false, "error": "optional error message" }
- Subject pattern already covered by existing corrosion.*.cmd.> wildcard in STREAM_AGENT_COMMANDS
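The request/reply contract above can be written down as types plus a guard for the reply. The field names come from the payloads listed; the interface and function names are hypothetical, and the backend that actually publishes this request is written in Rust.

```typescript
interface ModuleInstallRequest {
  module_id: string;
  download_url: string;
  filename: string;
  target_path: string;
}

interface ModuleInstallResponse {
  module_id: string;
  success: boolean;
  error?: string; // only expected when success is false
}

function moduleInstallSubject(licenseId: string): string {
  return "corrosion." + licenseId + ".cmd.module.install";
}

// The example request from the contract, typed.
const exampleRequest: ModuleInstallRequest = {
  module_id: "loot-manager",
  download_url: "https://cdn.corrosionmgmt.com/modules/LootManager.cs",
  filename: "LootManager.cs",
  target_path: "oxide/plugins/",
};

// Narrow an untrusted reply (for example, parsed JSON) to ModuleInstallResponse.
function isModuleInstallResponse(v: unknown): v is ModuleInstallResponse {
  if (typeof v !== "object" || v === null) return false;
  const r = v as Record<string, unknown>;
  return typeof r.module_id === "string"
    && typeof r.success === "boolean"
    && (r.error === undefined || typeof r.error === "string");
}
```

Validating the reply before trusting `success` guards against an agent on an older contract version marking an installation as complete by accident.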
API Updates:
- backend/src/api/modules.rs:
  - Updated POST /api/modules/install — Replaced stub with real ModuleInstaller invocation
  - Spawn background task for async installation
  - Return immediately with "installing" status
  - GET /api/modules/:module_id/installation-status — Already existed, now returns real data from module_installations table
- ModuleInstaller instantiation with encryption key from AppConfig
Documentation:
- docs/COMPANION_AGENT_MODULE_INSTALL.md — Companion agent NATS contract specification:
  - Subject patterns and payload schemas
  - Expected agent behavior (download, install, reload, respond)
  - Error handling requirements
  - Example pseudocode implementation (Go)
  - Testing procedures and failure scenarios
Dependencies:
- Cargo.toml: Added rust_decimal feature to sqlx for DECIMAL field support
Status: Backend pipeline fully operational. Modules install automatically to AMP/Pterodactyl servers. Companion agent NATS contract documented. Companion agent implementation (Go) pending future iteration.
Added (Phase 4 — Module Store Frontend)
Frontend:
- Complete ModuleStoreView.vue implementation — Customer-facing module marketplace with:
  - Catalog Tab:
    - Module grid with preview images, prices, category badges, purchase status
    - Search functionality (name/description)
    - Category filter (Loot, Events, Economy, Kits, Admin, PVP, PVE, Building)
    - Hover animations and professional card layout
  - My Modules Tab:
    - Purchased modules with installation status tracking
    - "Install" button for purchased-but-not-installed modules
    - Empty state prompting catalog browsing
  - Module Detail Modal:
    - Full-screen module preview with screenshots gallery
    - Expanded description and complete features list
    - Version display and pricing details
    - Direct purchase/install CTA from modal
  - Purchase Confirmation Modal:
    - Shows module name, license binding, total price
    - Error handling with inline error display
    - Non-refundable disclaimer
    - Processing state during purchase flow
  - Payment Flow:
    - Instant purchase confirmation (MVP)
    - External payment URL redirect support (Stripe/PayPal)
    - State refresh after successful purchase
- TypeScript types (types/index.ts):
  - Module interface with full marketplace metadata (id, slug, name, description, price, category, images, features, version, purchase/install status)
  - PurchaseRequest interface for API integration
- API Integration:
  - GET /api/modules/catalog — Browse all available modules
  - GET /api/modules/my-modules — Fetch purchased modules for current license
  - POST /api/modules/purchase — Initiate module purchase (returns payment URL or instant confirmation)
  - POST /api/modules/install — Trigger deployment to game server
Design Details:
- Professional marketplace UI using existing Tailwind patterns
- Color-coded category badges (8 categories supported)
- Preview image with hover scale effect
- "Purchased" badge overlay on owned modules
- Three-state purchase flow: Not Purchased → Purchased → Installed
- Mobile-responsive grid (1/2/3 columns)
- Empty states for zero results and zero purchases
- Price display prominently in catalog cards and modals
Purpose: Enables server admins to browse, preview, purchase, and install premium gameplay modules (Loot systems, Events, Economy plugins, Kits) directly from the dashboard. Customers pay real money here — UI polish critical.
Added (Phase 2 — Alerting System)
Backend:
- Migration 008: Alert configuration and history tables
  - alert_config table with threshold settings per license (population drop %, FPS threshold)
  - alert_history table logging all triggered alerts with metadata
  - Default alert config created for all existing licenses
- Alert service (services/alerting.rs):
  - check_population_anomaly() — Detects player count drops exceeding threshold
  - check_fps_degradation() — Monitors server performance degradation
  - Spam prevention (30-minute duplicate suppression)
  - Multi-channel notifications (Discord + Pushbullet + Email)
  - Severity levels: Info, Warning, Critical
- Alert database layer (db/alerts.rs):
  - get_alert_config() / update_alert_config() — Threshold configuration
  - insert_alert() / mark_alert_notified() — Alert history tracking
  - check_recent_alert() — Duplicate detection
  - cleanup_old_alerts() — 90-day retention cleanup
- Updated db/notifications.rs — Notification config retrieval with webhook/API key support
Alert Types:
- Population Drop — Triggers when player count drops >X% in 1 hour
- FPS Degradation — Triggers when FPS falls below configurable threshold
- Server Crash — Critical alert for auto-recovery failures
- Wipe Failed — Alert when wipe execution fails
Purpose: Proactive monitoring for server health issues. Alerts server admins via Discord/Pushbullet when anomalies detected (population crashes, performance degradation). Configurable thresholds per license.
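The population-drop rule and the 30-minute duplicate suppression can be sketched as two small functions. The alert service itself is Rust (services/alerting.rs); the names, the percentage formula and the key scheme below are assumptions used to illustrate the behaviour described.

```typescript
const SUPPRESS_MS = 30 * 60 * 1000; // 30-minute duplicate suppression

// True when the player count fell by more than thresholdPct over the window.
function isPopulationDrop(hourAgo: number, now: number, thresholdPct: number): boolean {
  if (hourAgo <= 0) return false; // nothing to drop from
  const dropPct = ((hourAgo - now) / hourAgo) * 100;
  return dropPct > thresholdPct;
}

// lastFired maps a key such as "licenseId:alertType" (hypothetical scheme) to
// the time of the last notification in milliseconds.
function shouldNotify(lastFired: Record<string, number>, key: string, nowMs: number): boolean {
  const prev = lastFired[key];
  if (prev !== undefined && nowMs - prev < SUPPRESS_MS) return false;
  lastFired[key] = nowMs; // only notifications that go out reset the window
  return true;
}
```

Because suppressed alerts do not refresh the timestamp, a condition that persists produces one notification every 30 minutes instead of going silent after the first one.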
Added (Phase 2 — Wipe Performance Analytics)
Backend:
- backend/src/db/wipes.rs — Comprehensive wipe analytics query layer:
  - get_wipe_success_rate() — Success vs failure rate over time range
  - get_average_wipe_duration() — Average execution time for successful wipes
  - get_wipe_to_peak_population() — Hours from wipe completion to peak player count (24h window)
  - get_population_curve_by_cycle() — Day 1 vs Day 2 vs Day 3 average player counts post-wipe
  - get_optimal_wipe_timing() — Recommends best day of week + hour based on historical peak populations
  - get_wipe_analytics_entries() — Detailed per-wipe records for charting (duration, peak pop, success)
  - All queries use hourly aggregates (server_stats_hourly) with 90-day retention
- backend/src/api/analytics.rs — Wipe performance endpoint:
  - GET /api/analytics/wipes/performance?range=90d — Returns full wipe performance metrics
  - Supports range params: 6d, 12d, 90d, all (converted to wipe count estimates)
  - Response includes: success rate, avg duration, population curve, optimal timing, individual wipe entries
Frontend:
- WipeAnalyticsView.vue — Complete wipe performance dashboard:
  - ECharts Visualizations:
    - Wipe success timeline (scatter plot: green = success, red = failed)
    - Population curve bar chart (Day 1/Day 2/Day 3 average players post-wipe)
    - Wipe duration trend (line chart showing execution time evolution)
  - Insight Cards:
    - Success rate percentage with total wipe count
    - Average wipe duration (formatted as minutes:seconds)
    - Peak population day identifier
    - Optimal wipe timing recommendation (day + hour)
  - Actionable Recommendations Banner:
    - Optimal wipe day/hour based on post-wipe player peaks
    - Weekly vs bi-weekly wipe suggestion (if Day 1 >> Day 2 population)
    - Duration optimization alerts (if avg > 10 minutes)
    - Rollback protection warnings (if failures detected)
  - Time range selector: Last 6 wipes / Last 12 wipes / All time
  - CSV export functionality
- Added route /wipes/analytics to router
- TypeScript interfaces: WipePerformanceMetrics, WipeAnalyticsEntry, PopulationCurve
Purpose: Answers critical questions: "How long do wipes take? When do players peak post-wipe? What's my success rate? When should I schedule wipes for max population?" Enables data-driven wipe timing optimization and operational insights.
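The insight-card formatting and the recommendation triggers can be sketched as below. The 10-minute duration limit and the failure check come from the entries above; the `dropRatio` of 2 standing in for "Day 1 >> Day 2", the type name and the message strings are assumptions.

```typescript
interface PopulationCurveAvg {
  day1: number;
  day2: number;
  day3: number;
}

// Format a duration in seconds as minutes:seconds, e.g. 754 -> "12:34".
function formatDuration(totalSeconds: number): string {
  const s = Math.max(0, Math.round(totalSeconds));
  const sec = s % 60;
  return Math.floor(s / 60) + ":" + (sec < 10 ? "0" : "") + sec;
}

// Collect the banner messages that apply to the current metrics.
function recommendations(
  curve: PopulationCurveAvg,
  avgDurationSec: number,
  failedWipes: number,
  dropRatio: number = 2, // assumed meaning of "Day 1 >> Day 2"
): string[] {
  const out: string[] = [];
  if (curve.day2 > 0 && curve.day1 / curve.day2 >= dropRatio) {
    out.push("Review wipe cadence (weekly vs bi-weekly)");
  }
  if (avgDurationSec > 600) out.push("Average wipe duration exceeds 10 minutes");
  if (failedWipes > 0) out.push("Failures detected: check rollback protection");
  return out;
}
```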
Added (Phase 3 — Public Status Page)
Backend:
- Migration 007: Added status_page_description TEXT column to public_site_config
- Public API models (models/public.rs):
  - PublicServerStatus — Server status with live stats for public display
  - PlatformHealth — Platform-wide health metrics (total servers, online count, total players, uptime)
  - StatusPageResponse — Complete status page data structure
  - PublicSiteConfig — Full public site configuration model
- Public database queries (db/public.rs):
  - get_public_servers() — Retrieves all opted-in servers with current stats, uptime percentages (24h/7d/30d), wipe schedules
  - get_platform_health() — Calculates platform-wide aggregate metrics
  - calculate_uptime_percentage() — Uptime calculation from hourly stats
  - format_cron_expression() — Human-readable wipe schedule formatting
  - get_public_site_config() / create_public_site_config() / update_public_site_config() — Config management
- Public API endpoint (api/public.rs):
  - GET /api/public/status — Public status page data (no auth required)
- Settings API (api/settings.rs):
  - GET /api/settings/public-site — Fetch public site config (auth required)
  - PUT /api/settings/public-site — Update status page opt-in and description (auth required)
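One plausible reading of "uptime calculation from hourly stats" is the share of recorded hourly buckets in which the server was online. The sketch below follows that reading. The real calculate_uptime_percentage() is a Rust database query, so the `HourlyStat` shape and the handling of missing hours here are assumptions.

```typescript
interface HourlyStat {
  hourStartMs: number; // start of the hourly bucket
  online: boolean;     // whether the server was up in that bucket
}

// Percentage of recorded buckets inside the window in which the server was online.
// Returns null when the window has no samples, so the caller can show an
// empty state instead of a made-up number.
function calculateUptimePercentage(
  stats: HourlyStat[],
  windowHours: number,
  nowMs: number,
): number | null {
  const cutoff = nowMs - windowHours * 3600 * 1000;
  const inWindow = stats.filter((s) => s.hourStartMs >= cutoff && s.hourStartMs <= nowMs);
  if (inWindow.length === 0) return null;
  const up = inWindow.filter((s) => s.online).length;
  return (up / inWindow.length) * 100;
}
```

Calling it with `windowHours` of 24, 168 and 720 would give the 24h/7d/30d badges shown on the status page.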
Frontend:
- StatusPageView.vue — Complete public status page with:
  - Platform health header (total servers, online now, total players, platform uptime)
  - Server grid with status indicators (green/yellow/red), player counts, uptime badges (24h/7d/30d)
  - Wipe schedule display with countdown timers
  - Server search/filter functionality
  - Auto-refresh every 10 seconds via polling
  - Mobile-responsive grid layout
  - "Powered by Corrosion" footer with panel link
- Settings dashboard integration (SettingsView.vue):
  - New "Public Status" tab with toggle for show_on_status_page
  - Text area for status_page_description
  - Save endpoint integration
Infrastructure:
- nginx already configured for `status.corrosionmgmt.com` routing
- Router already configured with `/status` route on both panel and marketing domains
Purpose: Public-facing marketing page showcasing all Corrosion servers. Drives platform visibility and attracts new customers ("I want this for my server too").
Added (Phase 2.2 — Player Retention Analytics)
Backend:
- Migration `004_player_sessions.sql`: Player session tracking table with indexes for retention queries
- Player session tracking and retention analysis (`backend/src/db/player_sessions.rs`):
  - `track_player_join()` / `track_player_leave()`: Record individual player sessions
  - `calculate_retention_after_wipe()`: Calculate 24h/48h/72h return rates per wipe
  - `get_unique_player_count()` / `get_avg_session_duration()`: Session metrics
  - `get_new_vs_returning_ratio()`: New vs returning player analysis
  - `get_recent_wipe_retention_metrics()`: Multi-wipe retention trends
  - `cleanup_old_player_sessions()`: 90-day retention cleanup
- Plugin event endpoints (`backend/src/api/plugin.rs`):
  - `POST /api/plugin/player-event`: Track player join/leave events
  - `POST /api/plugin/checkin`: Plugin registration on server start
- Extended `backend/src/api/analytics.rs` with retention endpoints:
  - `GET /api/analytics/retention?wipe_count=6`: Multi-wipe retention metrics
  - `GET /api/analytics/retention/export`: CSV export of retention data
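The exact definition behind the 24h/48h/72h return rates is not spelled out here. As a rough sketch, one reading is: take the players who joined during the first 24 hours after a wipe as the cohort, then count how many of them started another session at or after wipe + N hours. That cohort definition, the `Session` struct, and the function name `retention_after_wipe` are assumptions for illustration. The real `calculate_retention_after_wipe()` works against the database and may differ.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory view of a player session row.
pub struct Session {
    pub player_id: u64,
    /// Join time as Unix seconds.
    pub joined_at: i64,
}

const HOUR: i64 = 3600;

/// Share (0..=100) of the post-wipe cohort that joined again at or
/// after `wipe_at + after_hours`. Returns `None` for an empty cohort.
pub fn retention_after_wipe(
    sessions: &[Session],
    wipe_at: i64,
    after_hours: i64,
) -> Option<f64> {
    // Assumed cohort: anyone who joined within 24h of the wipe.
    let cohort: HashSet<u64> = sessions
        .iter()
        .filter(|s| s.joined_at >= wipe_at && s.joined_at < wipe_at + 24 * HOUR)
        .map(|s| s.player_id)
        .collect();
    if cohort.is_empty() {
        return None;
    }
    let threshold = wipe_at + after_hours * HOUR;
    // A cohort member "returned" if any of their sessions starts at or
    // after the threshold. The set deduplicates repeat visits.
    let returned: HashSet<u64> = sessions
        .iter()
        .filter(|s| s.joined_at >= threshold && cohort.contains(&s.player_id))
        .map(|s| s.player_id)
        .collect();
    Some(returned.len() as f64 / cohort.len() as f64 * 100.0)
}
```

Calling it with `after_hours` set to 24, 48, and 72 would yield the three points of one wipe's retention curve under this definition.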
Frontend:
- Complete retention analytics dashboard (`PlayerRetentionView.vue`):
  - ECharts retention curve (24h/48h/72h lines across multiple wipes)
  - Summary cards: unique players, avg session duration, new vs returning ratio
  - Wipe selector (last 3/6/10/20 wipes)
  - Detailed wipe table with retention percentages
  - CSV export functionality
- Added route `/retention` to router
- TypeScript interfaces: `WipeRetentionMetric`, `SessionSummary`, `RetentionResponse`
Plugin:
- Updated `CorrosionCompanion.cs` to track player events via `/api/plugin/player-event`
- Modified `OnPlayerConnected` / `OnPlayerDisconnected` hooks with `license_key` authentication
Purpose: Answers critical question: "What percentage of players return 24h/48h/72h after a wipe?" Enables data-driven wipe timing optimization and player retention analysis.
Added (Phase 2.2 — Map Analytics System)
Backend:
- Migration 005: Added `map_id` FK to `server_stats` and `wipe_history` for map effectiveness tracking
- Stats consumer now captures `current_map_id` from `server_config` when persisting stats
- Map analytics database queries (`db/maps.rs`):
  - `get_map_analytics()`: Returns performance metrics per map (avg/peak players, times used, effectiveness score)
  - `get_map_population_trends()`: Player count trends per map over wipe cycles
  - Effectiveness scoring algorithm: `(avg_players / peak_players) * 100`
- Analytics API endpoint (`api/analytics.rs`):
  - `GET /api/analytics/maps?range=90d`: Map performance summary with rotation effectiveness
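The effectiveness score listed above, `(avg_players / peak_players) * 100`, can be sketched as a small pure function. The function name and the choice to return `None` when a map has no recorded peak are assumptions. The changelog does not say how `db/maps.rs` handles a zero peak.

```rust
/// Effectiveness score per the changelog formula:
/// (avg_players / peak_players) * 100.
/// Returns `None` when the peak is zero or negative (assumed handling),
/// which avoids a division by zero for maps with no recorded players.
pub fn effectiveness_score(avg_players: f64, peak_players: f64) -> Option<f64> {
    if peak_players <= 0.0 {
        return None;
    }
    Some(avg_players / peak_players * 100.0)
}
```

For example, a map averaging 45 players with a peak of 60 scores 75, meaning the population stays at three quarters of its peak on average.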
Frontend:
- Complete map effectiveness dashboard (`MapAnalyticsView.vue`) with:
  - Summary cards: Best performing map, rotation effectiveness %, total maps tracked
  - ECharts bar chart comparing avg vs peak players per map
  - Sortable performance table with effectiveness color coding (green ≥80%, yellow ≥60%, red <60%)
  - Actionable insights section recommending rotation improvements
  - CSV export functionality
  - Time range selector (30d/90d/all)
- TypeScript types: `MapPerformanceMetrics`, `MapAnalyticsSummary`
- Router: Added `/maps/analytics` route under admin dashboard
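The color coding in the performance table (green ≥80%, yellow ≥60%, red <60%) amounts to a threshold lookup. The actual view is a Vue component, so the Rust sketch below only illustrates the banding logic. The `Band` enum and `band_for` function are hypothetical names.

```rust
/// Color band for an effectiveness score, using the thresholds
/// from the changelog: green >= 80, yellow >= 60, red otherwise.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Band {
    Green,
    Yellow,
    Red,
}

pub fn band_for(score: f64) -> Band {
    if score >= 80.0 {
        Band::Green
    } else if score >= 60.0 {
        Band::Yellow
    } else {
        // Also catches NaN, since every comparison with NaN is false.
        Band::Red
    }
}
```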
Purpose: Answers "Which maps drive the most players? Is my rotation working?" Enables data-driven map selection for wipe day.
Added (Phase 2 — Data Aggregation Pipeline)
Backend:
- Stats ingestion consumer service (`stats_consumer.rs`) subscribing to `corrosion.*.stats` NATS subject
- Complete stats database queries (`db/stats.rs`) with support for:
  - Raw stats insertion and retrieval
  - Hourly aggregation queries
  - Analytics summary calculations (peak/avg players, uptime)
  - Data retention cleanup (7 days raw, 90 days hourly)
- Hourly stats aggregation scheduler job (runs at :05 past every hour)
- Daily cleanup scheduler job (runs at 03:00 UTC)
- Analytics API endpoints (`api/analytics.rs`):
  - `GET /api/analytics/summary`: Peak/avg players, uptime percentage
  - `GET /api/analytics/timeseries`: Time-series data for charting (hourly/raw granularity)
  - `GET /api/analytics/export`: CSV export of server stats
- Background service initialization in `main.rs` (stats consumer + scheduler)
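The hourly aggregation job presumably runs as SQL against the stats tables. To make the rollup concrete, here is an in-memory sketch of the same idea: group raw samples by the hour they fall in, then compute the average and peak player count per bucket. `RawStat`, `HourlyRollup`, and `rollup_hourly` are hypothetical names, and the real tables almost certainly carry more columns.

```rust
use std::collections::BTreeMap;

/// Hypothetical raw sample: Unix-second timestamp plus player count.
#[derive(Debug, Clone, Copy)]
pub struct RawStat {
    pub ts: i64,
    pub players: u32,
}

/// Hypothetical hourly aggregate row.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct HourlyRollup {
    pub hour_start: i64,
    pub avg_players: f64,
    pub peak_players: u32,
    pub samples: u32,
}

/// Groups raw samples into hour buckets, ordered by hour.
pub fn rollup_hourly(raw: &[RawStat]) -> Vec<HourlyRollup> {
    // hour_start -> (sum of players, peak players, sample count)
    let mut buckets: BTreeMap<i64, (u64, u32, u32)> = BTreeMap::new();
    for s in raw {
        // div_euclid floors correctly even for negative timestamps.
        let hour_start = s.ts.div_euclid(3600) * 3600;
        let entry = buckets.entry(hour_start).or_insert((0, 0, 0));
        entry.0 += u64::from(s.players);
        entry.1 = entry.1.max(s.players);
        entry.2 += 1;
    }
    buckets
        .into_iter()
        .map(|(hour_start, (sum, peak, n))| HourlyRollup {
            hour_start,
            avg_players: sum as f64 / f64::from(n),
            peak_players: peak,
            samples: n,
        })
        .collect()
}
```

Running at :05 past the hour gives the previous hour's samples time to land before they are rolled up. That is one likely reason for the offset, though the changelog does not state it.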
Frontend:
- Analytics TypeScript types (`AnalyticsSummary`, `TimeseriesData`, `HourlyStats`)
- Complete `AnalyticsView.vue` implementation with:
  - Real-time data fetching from analytics API
  - Apache ECharts integration for Player Count and Server Performance charts
  - Time range selector (24h/7d/30d)
  - CSV export functionality
  - Loading states and responsive layout
Infrastructure:
- Made `NatsBridge.jetstream` public for service consumer access
Added (Sovereign Infrastructure Stack)
Services Deployed:
- Gitea (git.corrosionmgmt.com): Self-hosted Git with Actions support
  - Container: `corrosion-gitea` on port 8090 (HTTP) and 8095 (SSH)
  - SQLite database (self-contained, persistent)
  - Replaces GitHub dependency for source control
  - Gitea Actions enabled for CI/CD
- SeaweedFS (cdn.corrosionmgmt.com): S3-compatible object storage and CDN
  - Container: `corrosion-cdn` with integrated Master/Volume/Filer/S3
  - Filer UI at port 8091 (cdn.corrosionmgmt.com)
  - Master UI at port 8093 (admin.cdn.corrosionmgmt.com)
  - S3 API at port 8092 (internal access)
  - Purpose: Map hosting, plugin packages, companion binaries, backups
- Gitea Act Runner (asgard build server): CI/CD execution environment
  - Runs on Ryzen 9 7945HX (16C/32T, 64GB DDR5)
  - Docker-based job execution
  - Go 1.21+ and Rust toolchains available
  - Connects to public Gitea instance remotely
CI/CD Workflows:
- `test-runner.yml`: Runner capability validation (hostname, resources, toolchains)
- Production companion agent build pipeline (`build-companion.yml`):
  - Triggers on version tags (v*..)
  - Cross-compiles for Linux AMD64 and Windows AMD64
  - Generates SHA256 checksums
  - Creates Gitea release with auto-generated installation instructions
  - Uploads binaries and checksums as release assets
Documentation:
- `infra/docker-compose.yml`: Infrastructure stack definition
- `infra/README.md`: Deployment guide and architecture overview
- `infra/NPM-CONFIG.md`: Nginx Proxy Manager configuration
- `infra/ASGARD-RUNNER.md`: Act runner setup guide
Repository Migration:
- Migrated from GitHub to self-hosted Gitea
- Remote updated to `git@git.corrosionmgmt.com:vantzs/corrosion-admin-panel.git`
- All future development on sovereign infrastructure
Technical Details
Data Flow:
Plugin/Agent publishes stats (60s interval)
→ NATS JetStream (corrosion.*.stats)
→ StatsConsumerService persists to server_stats table
→ Hourly aggregation job rolls up to server_stats_hourly
→ Analytics API queries aggregated data
→ Frontend renders charts via ECharts
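The consumer subscribes to `corrosion.*.stats`. In NATS, `*` matches exactly one dot-separated token and `>` matches one or more trailing tokens. The sketch below reimplements that matching rule in plain Rust to show which subjects the subscription would receive. It is illustrative only, since the NATS server does this matching itself, and the `srv42` token in the usage note is a made-up server identifier.

```rust
/// Returns true if `subject` matches `pattern` under NATS wildcard
/// rules: `*` matches exactly one token, `>` matches one or more
/// remaining tokens. Assumes `>` only appears as the last pattern token.
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut p = pattern.split('.');
    let mut s = subject.split('.');
    loop {
        match (p.next(), s.next()) {
            // `>` swallows the rest, but needs at least one token.
            (Some(">"), Some(_)) => return true,
            // `*` accepts any single non-empty token.
            (Some("*"), Some(tok)) if !tok.is_empty() => continue,
            // Literal tokens must be equal.
            (Some(a), Some(b)) if a == b => continue,
            // Both exhausted at the same time: full match.
            (None, None) => return true,
            // Length mismatch or differing literal.
            _ => return false,
        }
    }
}
```

With this rule, `corrosion.srv42.stats` matches `corrosion.*.stats`, while `corrosion.srv42.events` and `corrosion.a.b.stats` do not.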
Database Schema:
- `server_stats` table (raw stats, 7-day retention)
- `server_stats_hourly` table (aggregated hourly data, 90-day retention)
Scheduler Jobs:
- Hourly aggregation: `0 5 * * * *` (at :05 past every hour)
- Daily cleanup: `0 0 3 * * *` (at 03:00 UTC)
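Both schedules use a six-field cron form whose first field is seconds (seconds, minutes, hours, day of month, month, day of week), which is how `0 5 * * * *` comes to mean ":05 past every hour". As a rough sketch, the function below turns the two shapes used here into readable text. The name `describe_cron` is hypothetical, it is not the project's `format_cron_expression()`, and it deliberately rejects anything beyond these simple hourly and daily forms.

```rust
/// Describes a six-field cron expression (sec min hour dom mon dow)
/// for the two simple shapes used by the scheduler jobs.
/// Returns `None` for anything more complex.
pub fn describe_cron(expr: &str) -> Option<String> {
    let f: Vec<&str> = expr.split_whitespace().collect();
    if f.len() != 6 {
        return None;
    }
    // Only "second 0, every day" schedules are handled in this sketch.
    if f[0] != "0" || f[3] != "*" || f[4] != "*" || f[5] != "*" {
        return None;
    }
    let min: u32 = f[1].parse().ok()?;
    if min > 59 {
        return None;
    }
    if f[2] == "*" {
        return Some(format!("hourly at :{:02}", min));
    }
    let hour: u32 = f[2].parse().ok()?;
    if hour > 23 {
        return None;
    }
    // UTC is taken from the changelog's description of the cleanup job.
    Some(format!("daily at {:02}:{:02} UTC", hour, min))
}
```

A five-field expression such as `5 * * * *` is rejected rather than guessed at, since shifting the fields by one would silently change the schedule.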
Installation Notes
Frontend:
cd frontend && npm install echarts
Backend:
No additional dependencies beyond existing Cargo.toml.
Deferred to Phase 2.2
- Player retention tracking (new vs returning players, session duration)
- Wipe-correlated analytics
- Player activity heatmaps (time-of-day patterns)
- Anomaly alerting system
[2025-02-15] — Phase 1 Complete
Added (Phase 1 — Foundation)
Backend Services:
- Core control plane (Axum + Tokio)
- Auto-wiper with rollback (`wipe_engine.rs`)
- Plugin management system
- WebSocket/NATS bridge for real-time data
- Companion agent adapter (bare metal server management)
- Panel adapters (AMP + Pterodactyl)
Frontend:
- Vue 3 dashboard with 19 admin sub-views
- Wipe management UI with real-time progress
- Toast notification system
- Plugin management interface
- Public server site
Infrastructure:
- PostgreSQL schema (migrations 001-003)
- NATS JetStream streams (6 streams configured)
- Docker Compose deployment (4 services)
- JWT auth with refresh tokens, TOTP 2FA
Companion Agent:
- Go binary for bare metal server management
- NATS-based command execution
- Process lifecycle control
- File operations support
uMod Plugin:
- C# plugin for Rust game server integration
- Stats publishing every 60 seconds
- Server lifecycle event reporting
Commits
- `c5d0571` feat: Complete Phase 1 frontend — WebSocket + Wipe feature end-to-end
- `590765f` feat: Complete Phase 1 backend services and WebSocket/NATS bridge
- `8320591` docs: Update companion agent language choice to Go
- `3c39345` docs: Add CLAUDE.md and Claude Code settings
- `81eeb3b` docs: Add AGENTS.md roster and resource discipline
Format: type: Short description
Types: feat, fix, docs, refactor, test, chore, perf, ci