The WIRELESS_REGISTRATIONS stream had a 256MB MaxBytes cap in a 256MB
container, leaving no headroom and guaranteeing a crash under load.
ALERT_EVENTS and OPERATION_EVENTS had no byte limit at all.
- Reduce WIRELESS_REGISTRATIONS MaxBytes from 256MB to 128MB
- Add 16MB MaxBytes cap to ALERT_EVENTS and OPERATION_EVENTS
- Bump NATS container memory limit from 256MB to 384MB
- Add restart: unless-stopped to NATS in base compose
- Bump version to 9.8.2
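As a rough sanity check on the new limits, the sketch below adds up the per-stream MaxBytes caps and compares the total with the container memory limit. It assumes the three streams named above are the only ones that matter and that a full stream counts entirely against container memory; the helper name `headroom_mb` is hypothetical and not part of the repo.

```python
# Hypothetical budget check; the numbers come from this commit message.
STREAM_MAX_BYTES_MB = {
    "WIRELESS_REGISTRATIONS": 128,
    "ALERT_EVENTS": 16,
    "OPERATION_EVENTS": 16,
}
CONTAINER_LIMIT_MB = 384


def headroom_mb(stream_caps_mb, container_limit_mb):
    """Return the memory left if every stream fills to its MaxBytes cap."""
    return container_limit_mb - sum(stream_caps_mb.values())


if __name__ == "__main__":
    # New configuration: 384 - (128 + 16 + 16)
    print(headroom_mb(STREAM_MAX_BYTES_MB, CONTAINER_LIMIT_MB))
    # Old configuration, counting only the one capped stream: 256 - 256
    print(headroom_mb({"WIRELESS_REGISTRATIONS": 256}, 256))
```

Under those assumptions the old setup had zero headroom even before counting the two uncapped streams, while the new one leaves 224MB for the server itself.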

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Root cause: stale NATS JetStream consumers accumulated across API
restarts, causing 13+ consumers to fight over messages in a single
Python async event loop (100% CPU).
Fixes:
- Add performance indexes on devices(tenant_id, hostname),
  devices(tenant_id, status), and key_access_log(tenant_id, created_at);
  this drops devices seq_scans from 402k to 6 per interval
- Remove redundant ORDER BY t.name from fleet summary SQL
  (tenant name sort is client-side; the clause was forcing a cross-table sort)
- Bump NATS memory limit from 128MB to 256MB (usage was at 118MB of 128MB)
- Increase dev poll interval from 60s to 120s for the 400+ device fleet
The stream purge and restart brought API CPU from 100% to 0.3%.
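To illustrate why a composite index removes the sequential scans, here is a minimal sketch that uses SQLite from the Python standard library purely as a stand-in. The project's actual database, schema, and index names are not shown in this message, so the column types, the index name `idx_devices_tenant_hostname`, and the `plan` helper are all assumptions.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the planner's detail strings for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # 4th column holds the plan detail


conn = sqlite3.connect(":memory:")
# Assumed, simplified schema; only the indexed columns match the commit.
conn.execute(
    "CREATE TABLE devices ("
    "id INTEGER PRIMARY KEY, tenant_id TEXT, hostname TEXT, status TEXT)"
)
query = "SELECT * FROM devices WHERE tenant_id = ? AND hostname = ?"

# Without an index the planner has to scan the whole table.
before = plan(conn, query, ("t1", "router-1"))

conn.execute(
    "CREATE INDEX idx_devices_tenant_hostname "
    "ON devices (tenant_id, hostname)"
)

# With the composite index the planner can search by both columns.
after = plan(conn, query, ("t1", "router-1"))

if __name__ == "__main__":
    print(before)
    print(after)
```

The first plan reports a table scan and the second reports a search through the new index, which is the same kind of change the seq_scans counter reflects.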

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Absolute paths (/Volumes/ssd01/mikrotik/docker-data/) are machine-specific
and won't work on any other system. Use ./docker-data/ so the repo works
wherever it's cloned.
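The difference can be sketched as follows, assuming relative host paths are anchored to the directory that holds the compose file. The function `resolve_volume` and the clone locations are hypothetical and only model that behavior.

```python
from pathlib import PurePosixPath


def resolve_volume(compose_dir, host_path):
    """Anchor a relative host path to the compose file's directory.

    Absolute paths are returned unchanged, which is why they only
    work on the machine they were written for.
    """
    path = PurePosixPath(host_path)
    if path.is_absolute():
        return path
    return PurePosixPath(compose_dir) / path


if __name__ == "__main__":
    # Two hypothetical clone locations of the same repo.
    for clone in ("/home/alice/repo", "/srv/checkout"):
        print(resolve_volume(clone, "./docker-data/"))
        print(resolve_volume(clone, "/Volumes/ssd01/mikrotik/docker-data/"))
```

The relative path follows the clone, while the absolute path points at /Volumes/ssd01/mikrotik/docker-data regardless of where the repo lives.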
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>