CHANGELOG — Corrosion Admin Panel
All notable changes to this project will be documented in this file.
[Unreleased]
Added (Phase 2 — Alerting System)
Backend:
- Migration 008: Alert configuration and history tables
  - alert_config table with threshold settings per license (population drop %, FPS threshold)
  - alert_history table logging all triggered alerts with metadata
  - Default alert config created for all existing licenses
- Alert service (services/alerting.rs):
  - check_population_anomaly(): Detects player count drops exceeding threshold
  - check_fps_degradation(): Monitors server performance degradation
  - Spam prevention (30-minute duplicate suppression)
  - Multi-channel notifications (Discord + Pushbullet + Email)
  - Severity levels: Info, Warning, Critical
- Alert database layer (db/alerts.rs):
  - get_alert_config() / update_alert_config(): Threshold configuration
  - insert_alert() / mark_alert_notified(): Alert history tracking
  - check_recent_alert(): Duplicate detection
  - cleanup_old_alerts(): 90-day retention cleanup
- Updated db/notifications.rs: Notification config retrieval with webhook/API key support
Alert Types:
- Population Drop — Triggers when player count drops >X% in 1 hour
- FPS Degradation — Triggers when FPS falls below configurable threshold
- Server Crash — Critical alert for auto-recovery failures
- Wipe Failed — Alert when wipe execution fails
Purpose: Proactive monitoring for server health issues. Alerts server admins via Discord/Pushbullet when anomalies are detected (population crashes, performance degradation). Thresholds are configurable per license.
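The population-drop check and the 30-minute duplicate suppression described above could be sketched roughly as follows. This is a minimal illustration, not the contents of services/alerting.rs: the names population_drop_percent, exceeds_drop_threshold, and AlertSuppressor are hypothetical, and the in-memory map stands in for whatever check_recent_alert() does against alert_history.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Percentage drop from `previous` to `current` players.
/// Returns 0.0 when there is no baseline or no drop.
pub fn population_drop_percent(previous: u32, current: u32) -> f64 {
    if previous == 0 || current >= previous {
        return 0.0;
    }
    f64::from(previous - current) * 100.0 / f64::from(previous)
}

/// True when the drop is strictly greater than the configured threshold
/// (the changelog describes the trigger as "drops >X%").
pub fn exceeds_drop_threshold(previous: u32, current: u32, threshold_percent: f64) -> bool {
    population_drop_percent(previous, current) > threshold_percent
}

/// Hypothetical in-memory stand-in for duplicate suppression.
pub struct AlertSuppressor {
    window: Duration,
    last_sent: HashMap<String, Instant>,
}

impl AlertSuppressor {
    pub fn new(window: Duration) -> Self {
        Self { window, last_sent: HashMap::new() }
    }

    /// Returns true (and records `now`) if no alert with the same key
    /// was sent within the suppression window.
    pub fn should_notify(&mut self, key: &str, now: Instant) -> bool {
        match self.last_sent.get(key) {
            Some(&last) if now.duration_since(last) < self.window => false,
            _ => {
                self.last_sent.insert(key.to_string(), now);
                true
            }
        }
    }
}
```

A caller would build a key from the server and alert type (for example "server-1:population_drop", an assumed format) and only dispatch notifications when should_notify returns true.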
Added (Phase 2 — Wipe Performance Analytics)
Backend:
- backend/src/db/wipes.rs: Comprehensive wipe analytics query layer
  - get_wipe_success_rate(): Success vs failure rate over time range
  - get_average_wipe_duration(): Average execution time for successful wipes
  - get_wipe_to_peak_population(): Hours from wipe completion to peak player count (24h window)
  - get_population_curve_by_cycle(): Day 1 vs Day 2 vs Day 3 average player counts post-wipe
  - get_optimal_wipe_timing(): Recommends best day of week + hour based on historical peak populations
  - get_wipe_analytics_entries(): Detailed per-wipe records for charting (duration, peak pop, success)
  - All queries use hourly aggregates (server_stats_hourly) with 90-day retention
- backend/src/api/analytics.rs: Wipe performance endpoint
  - GET /api/analytics/wipes/performance?range=90d: Returns full wipe performance metrics
  - Supports range params: 6d, 12d, 90d, all (converted to wipe count estimates)
  - Response includes: success rate, avg duration, population curve, optimal timing, individual wipe entries
Frontend:
- WipeAnalyticsView.vue: Complete wipe performance dashboard
  - ECharts Visualizations:
    - Wipe success timeline (scatter plot: green = success, red = failed)
    - Population curve bar chart (Day 1/Day 2/Day 3 average players post-wipe)
    - Wipe duration trend (line chart showing execution time evolution)
  - Insight Cards:
    - Success rate percentage with total wipe count
    - Average wipe duration (formatted as minutes:seconds)
    - Peak population day identifier
    - Optimal wipe timing recommendation (day + hour)
  - Actionable Recommendations Banner:
    - Optimal wipe day/hour based on post-wipe player peaks
    - Weekly vs bi-weekly wipe suggestion (if Day 1 >> Day 2 population)
    - Duration optimization alerts (if avg > 10 minutes)
    - Rollback protection warnings (if failures detected)
  - Time range selector: Last 6 wipes / Last 12 wipes / All time
  - CSV export functionality
- Added route /wipes/analytics to router
- TypeScript interfaces: WipePerformanceMetrics, WipeAnalyticsEntry, PopulationCurve
Purpose: Answers critical questions: "How long do wipes take? When do players peak post-wipe? What's my success rate? When should I schedule wipes for max population?" Enables data-driven wipe timing optimization and operational insights.
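Some of the insight-card arithmetic above (success rate, minutes:seconds formatting, and the "avg > 10 minutes" duration alert) is simple enough to sketch. The function names below are hypothetical and the code is written in Rust purely for illustration; the dashboard itself is a Vue component and may compute these values differently.

```rust
/// Success rate as a percentage, or None when no wipes were recorded.
pub fn success_rate_percent(successes: u32, total: u32) -> Option<f64> {
    if total == 0 {
        None
    } else {
        Some(f64::from(successes) * 100.0 / f64::from(total))
    }
}

/// Formats a duration in seconds as minutes:seconds, e.g. 754 -> "12:34".
pub fn format_duration_mmss(total_seconds: u64) -> String {
    format!("{}:{:02}", total_seconds / 60, total_seconds % 60)
}

/// Mirrors the "Duration optimization alerts (if avg > 10 minutes)" rule.
pub fn needs_duration_alert(avg_seconds: u64) -> bool {
    avg_seconds > 10 * 60
}
```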
Added (Phase 3 — Public Status Page)
Backend:
- Migration 007: Added status_page_description TEXT column to public_site_config
- Public API models (models/public.rs):
  - PublicServerStatus: Server status with live stats for public display
  - PlatformHealth: Platform-wide health metrics (total servers, online count, total players, uptime)
  - StatusPageResponse: Complete status page data structure
  - PublicSiteConfig: Full public site configuration model
- Public database queries (db/public.rs):
  - get_public_servers(): Retrieves all opted-in servers with current stats, uptime percentages (24h/7d/30d), wipe schedules
  - get_platform_health(): Calculates platform-wide aggregate metrics
  - calculate_uptime_percentage(): Uptime calculation from hourly stats
  - format_cron_expression(): Human-readable wipe schedule formatting
  - get_public_site_config() / create_public_site_config() / update_public_site_config(): Config management
- Public API endpoint (api/public.rs):
  - GET /api/public/status: Public status page data (no auth required)
- Settings API (api/settings.rs):
  - GET /api/settings/public-site: Fetch public site config (auth required)
  - PUT /api/settings/public-site: Update status page opt-in and description (auth required)
Frontend:
- StatusPageView.vue: Complete public status page with:
  - Platform health header (total servers, online now, total players, platform uptime)
  - Server grid with status indicators (green/yellow/red), player counts, uptime badges (24h/7d/30d)
  - Wipe schedule display with countdown timers
  - Server search/filter functionality
  - Auto-refresh every 10 seconds via polling
  - Mobile-responsive grid layout
  - "Powered by Corrosion" footer with panel link
- Settings dashboard integration (SettingsView.vue):
  - New "Public Status" tab with toggle for show_on_status_page
  - Text area for status_page_description
  - Save endpoint integration
Infrastructure:
- nginx already configured for status.corrosionmgmt.com routing
- Router already configured with /status route on both panel and marketing domains
Purpose: Public-facing marketing page showcasing all Corrosion servers. Drives platform visibility and attracts new customers ("I want this for my server too").
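The changelog says uptime percentages are calculated from hourly stats but does not give the formula. One plausible approach, shown here as a sketch under assumptions, is to divide the number of samples in which the server was online by the total number of samples in the window. The HourlyStat fields and the uptime_percentage name are hypothetical and are not taken from db/public.rs.

```rust
/// Hypothetical shape of one hourly row: how many stat samples were
/// expected/received in the hour and how many reported the server online.
pub struct HourlyStat {
    pub samples: u32,
    pub online_samples: u32,
}

/// Uptime over a window of hourly rows (e.g. 24, 168, or 720 rows for
/// the 24h/7d/30d badges). Returns None when the window has no samples.
pub fn uptime_percentage(hours: &[HourlyStat]) -> Option<f64> {
    let total: u64 = hours.iter().map(|h| u64::from(h.samples)).sum();
    if total == 0 {
        return None;
    }
    let online: u64 = hours.iter().map(|h| u64::from(h.online_samples)).sum();
    Some(online as f64 * 100.0 / total as f64)
}
```

Returning None for an empty window lets the caller decide how to display a server with no data instead of reporting a misleading 0% or 100%.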
Added (Phase 2.2 — Player Retention Analytics)
Backend:
- Migration 004_player_sessions.sql: Player session tracking table with indexes for retention queries
- backend/src/db/player_sessions.rs: Complete player session tracking and retention analysis
  - track_player_join() / track_player_leave(): Record individual player sessions
  - calculate_retention_after_wipe(): Calculate 24h/48h/72h return rates per wipe
  - get_unique_player_count() / get_avg_session_duration(): Session metrics
  - get_new_vs_returning_ratio(): New vs returning player analysis
  - get_recent_wipe_retention_metrics(): Multi-wipe retention trends
  - cleanup_old_player_sessions(): 90-day retention cleanup
- backend/src/api/plugin.rs: Plugin event endpoints
  - POST /api/plugin/player-event: Track player join/leave events
  - POST /api/plugin/checkin: Plugin registration on server start
- Extended backend/src/api/analytics.rs with retention endpoints:
  - GET /api/analytics/retention?wipe_count=6: Multi-wipe retention metrics
  - GET /api/analytics/retention/export: CSV export of retention data
Frontend:
- PlayerRetentionView.vue: Complete retention analytics dashboard
  - ECharts retention curve (24h/48h/72h lines across multiple wipes)
  - Summary cards: unique players, avg session duration, new vs returning ratio
  - Wipe selector (last 3/6/10/20 wipes)
  - Detailed wipe table with retention percentages
  - CSV export functionality
- Added route /retention to router
- TypeScript interfaces: WipeRetentionMetric, SessionSummary, RetentionResponse
Plugin:
- Updated CorrosionCompanion.cs to track player events via /api/plugin/player-event
- Modified OnPlayerConnected / OnPlayerDisconnected hooks with license_key authentication
Purpose: Answers critical question: "What percentage of players return 24h/48h/72h after a wipe?" Enables data-driven wipe timing optimization and player retention analysis.
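The exact definition of a "return" is not spelled out here. One plausible reading, sketched below, treats the 24h/48h/72h figures as cumulative: the share of a baseline player set that joined at least once between the wipe and the end of the window. Session, return_rate_percent, and the choice of baseline are all assumptions for illustration, not the logic of calculate_retention_after_wipe().

```rust
use std::collections::HashSet;

/// Hypothetical session record: player identifier and join time (Unix seconds).
pub struct Session {
    pub player_id: String,
    pub joined_at: u64,
}

/// Share of `baseline` players with at least one join in
/// [wipe_at, wipe_at + window_hours]. Returns None for an empty baseline.
pub fn return_rate_percent(
    baseline: &HashSet<String>,
    sessions: &[Session],
    wipe_at: u64,
    window_hours: u64,
) -> Option<f64> {
    if baseline.is_empty() {
        return None;
    }
    let cutoff = wipe_at + window_hours * 3600;
    // Collect distinct returning players so repeat sessions count once.
    let returned: HashSet<&str> = sessions
        .iter()
        .filter(|s| s.joined_at >= wipe_at && s.joined_at <= cutoff)
        .filter(|s| baseline.contains(&s.player_id))
        .map(|s| s.player_id.as_str())
        .collect();
    Some(returned.len() as f64 * 100.0 / baseline.len() as f64)
}
```

Calling the function with window_hours of 24, 48, and 72 yields the three points of a retention curve for one wipe.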
Added (Phase 2.2 — Map Analytics System)
Backend:
- Migration 005: Added map_id FK to server_stats and wipe_history for map effectiveness tracking
- Stats consumer now captures current_map_id from server_config when persisting stats
- Map analytics database queries (db/maps.rs):
  - get_map_analytics(): Returns performance metrics per map (avg/peak players, times used, effectiveness score)
  - get_map_population_trends(): Player count trends per map over wipe cycles
  - Effectiveness scoring algorithm: (avg_players / peak_players) * 100
- Analytics API endpoint (api/analytics.rs):
  - GET /api/analytics/maps?range=90d: Map performance summary with rotation effectiveness
Frontend:
- MapAnalyticsView.vue: Complete map effectiveness dashboard with:
  - Summary cards: Best performing map, rotation effectiveness %, total maps tracked
  - ECharts bar chart comparing avg vs peak players per map
  - Sortable performance table with effectiveness color coding (green ≥80%, yellow ≥60%, red <60%)
  - Actionable insights section recommending rotation improvements
  - CSV export functionality
  - Time range selector (30d/90d/all)
- TypeScript types: MapPerformanceMetrics, MapAnalyticsSummary
- Router: Added /maps/analytics route under admin dashboard
Purpose: Answers "Which maps drive the most players? Is my rotation working?" Enables data-driven map selection for wipe day.
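The effectiveness formula, (avg_players / peak_players) * 100, and the color bands used in the performance table (green ≥80%, yellow ≥60%, red <60%) can be expressed compactly. The function and enum names below are hypothetical, and the handling of a zero peak is an assumption since the changelog does not cover that case.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum EffectivenessBand {
    Green,
    Yellow,
    Red,
}

/// (avg_players / peak_players) * 100, or None when the peak is zero
/// (assumed behaviour: a map that never had players has no score).
pub fn effectiveness_score(avg_players: f64, peak_players: f64) -> Option<f64> {
    if peak_players <= 0.0 {
        None
    } else {
        Some(avg_players / peak_players * 100.0)
    }
}

/// Maps a score to the color coding described for the table.
pub fn effectiveness_band(score: f64) -> EffectivenessBand {
    if score >= 80.0 {
        EffectivenessBand::Green
    } else if score >= 60.0 {
        EffectivenessBand::Yellow
    } else {
        EffectivenessBand::Red
    }
}
```

For example, a map averaging 45 players against a peak of 60 scores 75 and lands in the yellow band.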
Added (Phase 2 — Data Aggregation Pipeline)
Backend:
- Stats ingestion consumer service (stats_consumer.rs) subscribing to corrosion.*.stats NATS subject
- Complete stats database queries (db/stats.rs) with support for:
  - Raw stats insertion and retrieval
  - Hourly aggregation queries
  - Analytics summary calculations (peak/avg players, uptime)
  - Data retention cleanup (7 days raw, 90 days hourly)
- Hourly stats aggregation scheduler job (runs at :05 past every hour)
- Daily cleanup scheduler job (runs at 03:00 UTC)
- Analytics API endpoints (api/analytics.rs):
  - GET /api/analytics/summary: Peak/avg players, uptime percentage
  - GET /api/analytics/timeseries: Time-series data for charting (hourly/raw granularity)
  - GET /api/analytics/export: CSV export of server stats
- Background service initialization in main.rs (stats consumer + scheduler)
Frontend:
- Analytics TypeScript types (AnalyticsSummary, TimeseriesData, HourlyStats)
- Complete AnalyticsView.vue implementation with:
  - Real-time data fetching from analytics API
  - Apache ECharts integration for Player Count and Server Performance charts
  - Time range selector (24h/7d/30d)
  - CSV export functionality
  - Loading states and responsive layout
Infrastructure:
- Made NatsBridge.jetstream public for service consumer access
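The hourly roll-up in this pipeline is presumably done in SQL, but the idea can be shown in memory: bucket raw samples by the start of their hour, then compute averages and peaks per bucket. RawStat, HourlyAggregate, and aggregate_hourly are hypothetical names, and the chosen fields (players, fps) are assumptions based on the metrics mentioned in this changelog.

```rust
use std::collections::BTreeMap;

/// Hypothetical raw sample (one per 60s publish interval).
pub struct RawStat {
    pub timestamp: u64, // Unix seconds
    pub players: u32,
    pub fps: f64,
}

#[derive(Debug, PartialEq)]
pub struct HourlyAggregate {
    pub hour_start: u64,
    pub avg_players: f64,
    pub peak_players: u32,
    pub avg_fps: f64,
    pub samples: u32,
}

/// Groups samples by the hour they fall in and summarises each group.
/// Output is ordered by hour because BTreeMap iterates in key order.
pub fn aggregate_hourly(raw: &[RawStat]) -> Vec<HourlyAggregate> {
    let mut buckets: BTreeMap<u64, Vec<&RawStat>> = BTreeMap::new();
    for stat in raw {
        let hour_start = stat.timestamp - stat.timestamp % 3600;
        buckets.entry(hour_start).or_default().push(stat);
    }
    buckets
        .into_iter()
        .map(|(hour_start, stats)| {
            let n = stats.len() as f64;
            HourlyAggregate {
                hour_start,
                avg_players: stats.iter().map(|s| f64::from(s.players)).sum::<f64>() / n,
                peak_players: stats.iter().map(|s| s.players).max().unwrap_or(0),
                avg_fps: stats.iter().map(|s| s.fps).sum::<f64>() / n,
                samples: stats.len() as u32,
            }
        })
        .collect()
}
```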
Added (Sovereign Infrastructure Stack)
Services Deployed:
- Gitea (git.corrosionmgmt.com): Self-hosted Git with Actions support
  - Container: corrosion-gitea on port 8090 (HTTP) and 8095 (SSH)
  - SQLite database (self-contained, persistent)
  - Replaces GitHub dependency for source control
  - Gitea Actions enabled for CI/CD
- SeaweedFS (cdn.corrosionmgmt.com): S3-compatible object storage and CDN
  - Container: corrosion-cdn with integrated Master/Volume/Filer/S3
  - Filer UI at port 8091 (cdn.corrosionmgmt.com)
  - Master UI at port 8093 (admin.cdn.corrosionmgmt.com)
  - S3 API at port 8092 (internal access)
  - Purpose: Map hosting, plugin packages, companion binaries, backups
- Gitea Act Runner (asgard build server): CI/CD execution environment
  - Runs on Ryzen 9 7945HX (16C/32T, 64GB DDR5)
  - Docker-based job execution
  - Go 1.21+ and Rust toolchains available
  - Connects to public Gitea instance remotely
CI/CD Workflows:
- test-runner.yml: Runner capability validation (hostname, resources, toolchains)
- build-companion.yml: Production companion agent build pipeline
  - Triggers on version tags (v*..)
  - Cross-compiles for Linux AMD64 and Windows AMD64
  - Generates SHA256 checksums
  - Creates Gitea release with auto-generated installation instructions
  - Uploads binaries and checksums as release assets
Documentation:
- infra/docker-compose.yml: Infrastructure stack definition
- infra/README.md: Deployment guide and architecture overview
- infra/NPM-CONFIG.md: Nginx Proxy Manager configuration
- infra/ASGARD-RUNNER.md: Act runner setup guide
Repository Migration:
- Migrated from GitHub to self-hosted Gitea
- Remote updated to git@git.corrosionmgmt.com:vantzs/corrosion-admin-panel.git
- All future development on sovereign infrastructure
Technical Details
Data Flow:
Plugin/Agent publishes stats (60s interval)
→ NATS JetStream (corrosion.*.stats)
→ StatsConsumerService persists to server_stats table
→ Hourly aggregation job rolls up to server_stats_hourly
→ Analytics API queries aggregated data
→ Frontend renders charts via ECharts
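The subject corrosion.*.stats in the flow above relies on NATS wildcard rules: subjects are dot-separated tokens, * matches exactly one token, and > matches one or more trailing tokens. The matcher below is a standalone sketch of those rules for illustration only; subject_matches is a hypothetical helper, and real subscriptions are matched by the NATS server rather than by application code.

```rust
/// Returns true if `subject` matches `pattern` under NATS-style wildcards:
/// `*` matches exactly one non-empty token, `>` (last token only) matches
/// one or more remaining tokens.
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut p = pattern.split('.');
    let mut s = subject.split('.');
    loop {
        match (p.next(), s.next()) {
            // '>' must be the final pattern token and needs at least one subject token.
            (Some(">"), Some(_)) => return p.next().is_none(),
            (Some("*"), Some(tok)) => {
                if tok.is_empty() {
                    return false;
                }
            }
            (Some(a), Some(b)) => {
                if a != b {
                    return false;
                }
            }
            // Both exhausted at the same time: every token matched.
            (None, None) => return true,
            // One side has tokens left over.
            _ => return false,
        }
    }
}
```

Under these rules a single consumer subscribed to corrosion.*.stats receives stats from every publisher whose subject has exactly one token between "corrosion" and "stats".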
Database Schema:
- server_stats table (raw stats, 7-day retention)
- server_stats_hourly table (aggregated hourly data, 90-day retention)
Scheduler Jobs:
- Hourly aggregation: 0 5 * * * * (at :05 past every hour)
- Daily cleanup: 0 0 3 * * * (at 03:00 UTC)
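Both expressions have six fields, and their annotations imply the order second, minute, hour, day of month, month, day of week. The sketch below checks only that reading for the simple cases used here (plain numbers and *). TimeParts and fires_at are hypothetical; the scheduler library actually used by the backend is not named in this changelog and will support far more syntax.

```rust
/// Hypothetical time-of-day components for the sketch.
pub struct TimeParts {
    pub second: u32,
    pub minute: u32,
    pub hour: u32,
}

/// A field matches if it is `*` or a number equal to `value`.
fn field_matches(field: &str, value: u32) -> bool {
    field == "*" || field.parse::<u32>().map_or(false, |v| v == value)
}

/// Evaluates a six-field expression (sec min hour dom month dow) against
/// a time of day. Returns None if the expression does not have six fields
/// or uses anything other than `*` in the three date fields, which this
/// sketch does not handle.
pub fn fires_at(expr: &str, t: &TimeParts) -> Option<bool> {
    let fields: Vec<&str> = expr.split_whitespace().collect();
    if fields.len() != 6 || fields[3..].iter().any(|f| *f != "*") {
        return None;
    }
    Some(
        field_matches(fields[0], t.second)
            && field_matches(fields[1], t.minute)
            && field_matches(fields[2], t.hour),
    )
}
```

With this reading, 0 5 * * * * fires at second 0 of minute 5 in any hour, and 0 0 3 * * * fires once a day at 03:00:00.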
Installation Notes
Frontend:
cd frontend && npm install echarts
Backend:
No additional dependencies beyond existing Cargo.toml.
Deferred to Phase 2.2
- Player retention tracking (new vs returning players, session duration)
- Wipe-correlated analytics
- Player activity heatmaps (time-of-day patterns)
- Anomaly alerting system
[2025-02-15] — Phase 1 Complete
Added (Phase 1 — Foundation)
Backend Services:
- Core control plane (Axum + Tokio)
- Auto-wiper with rollback (wipe_engine.rs)
- Plugin management system
- WebSocket/NATS bridge for real-time data
- Companion agent adapter (bare metal server management)
- Panel adapters (AMP + Pterodactyl)
Frontend:
- Vue 3 dashboard with 19 admin sub-views
- Wipe management UI with real-time progress
- Toast notification system
- Plugin management interface
- Public server site
Infrastructure:
- PostgreSQL schema (migrations 001-003)
- NATS JetStream streams (6 streams configured)
- Docker Compose deployment (4 services)
- JWT auth with refresh tokens, TOTP 2FA
Companion Agent:
- Go binary for bare metal server management
- NATS-based command execution
- Process lifecycle control
- File operations support
uMod Plugin:
- C# plugin for Rust game server integration
- Stats publishing every 60 seconds
- Server lifecycle event reporting
Commits
- c5d0571: feat: Complete Phase 1 frontend - WebSocket + Wipe feature end-to-end
- 590765f: feat: Complete Phase 1 backend services and WebSocket/NATS bridge
- 8320591: docs: Update companion agent language choice to Go
- 3c39345: docs: Add CLAUDE.md and Claude Code settings
- 81eeb3b: docs: Add AGENTS.md roster and resource discipline
Format: type: Short description
Types: feat, fix, docs, refactor, test, chore, perf, ci