Reverted from MapLibre/PMTiles to Leaflet with nginx-proxied OSM raster
tiles: the MapLibre approach had unresolvable CSP and theme compatibility
issues. The proxy keeps all browser requests local (no third-party requests).
Also:
- Add CPE signal strength and parent AP name to fleet summary SQL
and map popup cards (e.g. "Signal: -62 dBm to ap-shady-north")
- Add .dockerignore to exclude 8GB PMTiles and node_modules from
Docker build context (was causing 10+ minute builds)
- Configure mailpit SMTP in dev compose
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
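For illustration, the popup line quoted in the commit above ("Signal: -62 dBm to ap-shady-north") could be assembled as in the sketch below. The function name and its handling of missing values are assumptions; the real popup card code is not shown in this log.

```python
from typing import Optional


def format_signal_line(signal_dbm: Optional[int], parent_ap: Optional[str]) -> str:
    """Build the popup text for a CPE (hypothetical helper, not the project's code)."""
    if signal_dbm is None:
        # No signal reading available: render nothing rather than "undefined".
        return ""
    if parent_ap:
        return f"Signal: {signal_dbm} dBm to {parent_ap}"
    # Signal is known but the parent AP name is not.
    return f"Signal: {signal_dbm} dBm"


print(format_signal_line(-62, "ap-shady-north"))
```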
- Replace OpenStreetMap CDN with self-hosted Protomaps PMTiles
(Wisconsin + Florida regional extracts, served from nginx)
- Add protomaps-leaflet for vector tile rendering in dark theme
- Update CSP to remove openstreetmap.org, add blob: for vector workers
- Add nginx location block for /tiles/ with byte range support
- Mount tiles directory as volume (not baked into image)
- Remove alert_fired/alert_resolved toast notifications that spammed
"undefined" at fleet scale; the dashboard still updates via query invalidation
- Add *.pmtiles to .gitignore (large binaries)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: stale NATS JetStream consumers accumulated across API
restarts, causing 13+ consumers to fight over messages in a single
Python async event loop (100% CPU).
Fixes:
- Add performance indexes on devices(tenant_id, hostname),
devices(tenant_id, status), key_access_log(tenant_id, created_at).
This drops sequential scans on devices from 402k to 6 per interval
- Remove redundant ORDER BY t.name from fleet summary SQL
(tenant name sort is client-side, was forcing a cross-table sort)
- Bump NATS memory limit from 128MB to 256MB (usage was at 118MB of 128MB)
- Increase dev poll interval from 60s to 120s for 400+ device fleet
The stream purge + restart brought API CPU from 100% to 0.3%.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
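The effect of the devices(tenant_id, status) index from the commit above can be reproduced in miniature. This is only a sketch: it uses the standard-library sqlite3 module instead of the project's PostgreSQL database, and the table layout, index name, and sample rows are made up for the demonstration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); only the detail text matters here.
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE devices ("
    "id INTEGER PRIMARY KEY, tenant_id TEXT, hostname TEXT, status TEXT)"
)
# Hypothetical fleet of 400 devices spread over four tenants.
conn.executemany(
    "INSERT INTO devices (tenant_id, hostname, status) VALUES (?, ?, ?)",
    [
        (f"tenant-{i % 4}", f"cpe-{i}", "online" if i % 3 else "offline")
        for i in range(400)
    ],
)

sql = "SELECT hostname FROM devices WHERE tenant_id = ? AND status = ?"
params = ("tenant-1", "online")

# Without the index the planner has to scan the whole table.
plan_before = query_plan(conn, sql, params)

# Index name is an assumption; the column pair matches the commit message.
conn.execute("CREATE INDEX idx_devices_tenant_status ON devices (tenant_id, status)")
plan_after = query_plan(conn, sql, params)

print(plan_before)
print(plan_after)
```

On PostgreSQL the equivalent check would be EXPLAIN on the fleet summary query together with the seq_scan counters the commit refers to.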
- Bind tunnel listeners to 0.0.0.0 instead of 127.0.0.1 so tunnels
are reachable through reverse proxies and container networks
- Reduce port range to 49000-49004 (5 concurrent tunnels)
- Derive WinBox URI host from request Host header instead of
hardcoding 127.0.0.1, enabling use behind reverse proxies
- Add README security warning about default encryption keys
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
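Deriving the tunnel host from the request Host header, as described in the commit above, might look like the sketch below. The helper names and the fallback value are assumptions, and the exact WinBox URI format is deliberately left out because this log does not show it.

```python
from urllib.parse import urlsplit


def host_from_header(host_header: str, fallback: str = "127.0.0.1") -> str:
    """Extract the hostname from a Host header value, dropping any port."""
    try:
        # Prefixing "//" makes urlsplit treat the value as a network location.
        hostname = urlsplit("//" + host_header.strip()).hostname
    except ValueError:
        # Malformed values (e.g. unbalanced IPv6 brackets) use the fallback.
        hostname = None
    return hostname or fallback


def tunnel_address(host_header: str, tunnel_port: int) -> str:
    """Combine the request's host with the allocated tunnel port."""
    host = host_from_header(host_header)
    if ":" in host:
        # IPv6 literals need brackets when followed by a port.
        host = f"[{host}]"
    return f"{host}:{tunnel_port}"


print(tunnel_address("nms.example.com:8443", 49000))
```

Because the Host header is supplied by the client, a deployment would normally validate it against a list of expected hostnames before using it.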
- Add WebSocket upgrade map to nginx and proxy /ws/ssh to poller:8080
- Update CSP connect-src to allow ws: and wss: for terminal connections
- Add tunnel port range 49000-49100, SSH relay env vars, ulimits, and healthcheck to poller in both override and prod compose files
- Increase poller memory limit to 512M in prod for tunnel/SSH overhead
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs fixed:
1. audit_service.py: log_action() inserted into audit_logs using the
caller's DB session but never committed. Any router that called
db.commit() before log_action() (firmware, devices, config_editor,
alerts, certificates) had its audit rows silently rolled back when
the request session closed.
Fix: log_action now opens its own AdminAsyncSessionLocal session and
commits it itself, making audit persistence independent of the
caller's transaction. The 'db' parameter is kept for backward
compatibility but is unused. Affects 5 routers (firmware, devices,
config_editor, alerts, certificates).
2. docker-compose.override.yml: /data/firmware-cache had no volume
mount so the directory didn't exist in the container, causing
firmware downloads to fail with Permission denied.
Fix: bind-mount docker-data/firmware-cache:/data/firmware-cache
so firmware images survive container restarts.
- poller/docker-entrypoint.sh: convert from CRLF+BOM to LF (UTF-8 no BOM).
Windows saved the file with a UTF-8 BOM, which made the Linux kernel
reject the shebang with 'exec format error', crashing the poller.
- infrastructure/openbao/init.sh: same CRLF -> LF fix
- poller/Dockerfile: add sed to strip CRLF and BOM at image build time
as a defensive measure for future Windows edits
- docker-compose.override.yml: add 'restart: on-failure' to api and poller
so they recover from the postgres startup race (TimescaleDB restarts
postgres after initdb, briefly causing connection refused on first boot)
- .gitattributes: enforce LF for all text/script/code files so git
normalises line endings on checkout and prevents this class of bug
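The normalisation applied to the entrypoint scripts above amounts to two byte-level steps: drop a leading UTF-8 BOM and convert CRLF to LF. A minimal Python sketch of the same transformation follows; the commit itself uses sed in the Dockerfile, and the script contents here are hypothetical.

```python
import codecs


def normalize_script(raw: bytes) -> bytes:
    """Strip a leading UTF-8 BOM and convert CRLF line endings to LF."""
    if raw.startswith(codecs.BOM_UTF8):
        # With a BOM in front, the file no longer begins with "#!",
        # so the kernel does not recognise the shebang.
        raw = raw[len(codecs.BOM_UTF8):]
    return raw.replace(b"\r\n", b"\n")


# Hypothetical entrypoint as a Windows editor might have saved it.
broken = codecs.BOM_UTF8 + b"#!/bin/sh\r\nexec /app/poller\r\n"
fixed = normalize_script(broken)

print(broken[:5], fixed[:5])
```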