Commit Graph

34 Commits

Author SHA1 Message Date
Jason Staack
4092806fbc fix(license): remove unlimited flag, device limit must always be a number
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:32:39 -05:00
Jason Staack
fdc8d9cb68 feat(license): add BSL license enforcement with device limit indicator
- Add LICENSE_DEVICES env var (default 250, matches BSL 1.1 free tier)
- Add /api/settings/license endpoint returning device count vs limit
- Header shows flashing red "502/500 licensed" badge when over limit
- About page shows license tier, device count, and over-limit warning
- Nothing is crippled — all features work regardless of device count
- Bump version to 9.7.1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:28:56 -05:00
Jason Staack
21f2934906 fix(map): revert to Leaflet + proxied OSM tiles, add CPE signal to popups
Reverted from MapLibre/PMTiles to Leaflet with nginx-proxied OSM raster
tiles — the MapLibre approach had unresolvable CSP and theme compat
issues. The proxy keeps all browser requests local (no third-party).

Also:
- Add CPE signal strength and parent AP name to fleet summary SQL
  and map popup cards (e.g. "Signal: -62 dBm to ap-shady-north")
- Add .dockerignore to exclude 8GB PMTiles and node_modules from
  Docker build context (was causing 10+ minute builds)
- Configure mailpit SMTP in dev compose

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:47:15 -05:00
Jason Staack
79899840ca feat(map): self-hosted MapLibre GL + PMTiles vector map
Replace Leaflet + OSM raster tiles with MapLibre GL JS + PMTiles:
- Full continental US vector tiles (8GB PMTiles, zoom 0-14 with overzoom)
- Dark theme via @protomaps/basemaps (official supported path)
- Clustered device markers with status colors (green/yellow/red)
- Popup cards show CPU, memory, wireless client count + avg signal
- Font glyphs proxied through nginx, sprites served locally
- Zero third-party requests from the browser
- Fleet summary SQL now includes wireless client count and avg signal
  via LEFT JOIN LATERAL on wireless_links

Also removes alert toast spam and fixes map container height.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 20:16:07 -05:00
Jason Staack
1042319a08 perf: fix API CPU saturation at 400+ devices
Root cause: stale NATS JetStream consumers accumulated across API
restarts, causing 13+ consumers to fight over messages in a single
Python async event loop (100% CPU).

Fixes:
- Add performance indexes on devices(tenant_id, hostname),
  devices(tenant_id, status), key_access_log(tenant_id, created_at)
  — drops devices seq_scans from 402k to 6 per interval
- Remove redundant ORDER BY t.name from fleet summary SQL
  (tenant name sort is client-side, was forcing a cross-table sort)
- Bump NATS memory limit from 128MB to 256MB (was at 118/128)
- Increase dev poll interval from 60s to 120s for 400+ device fleet

The stream purge + restart brought API CPU from 100% to 0.3%.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 18:06:40 -05:00
Jason Staack
dffea763f6 fix(sites): fix site CRUD crashes and silent form errors
- Fix AttributeError in sites router: CurrentUser has `user_id` not `id`
  (create/update/delete all crashed with 500)
- Add onError handlers with toast notifications to SiteFormDialog

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:42:58 -05:00
Jason Staack
6a5829e0ff style: ruff format 10 python files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:49:59 -05:00
Jason Staack
9d6b68760f fix(lint): remove unused imports and extraneous f-string prefix
Ruff auto-fix: unused Optional imports in sectors router and link
schemas, unused Site import in device service, unused datetime
imports in trend detector, unused text import in site service,
and f-string without placeholders in signal history service.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:45:47 -05:00
Jason Staack
124a72582b feat(15-01): add signal history and site alert services, routers, and main.py wiring
- Create signal_history_service with TimescaleDB time_bucket queries for 24h/7d/30d ranges
- Create site_alert_service with full CRUD for rules, events list/resolve, and active count
- Create signal_history router with GET endpoint for time-bucketed signal data
- Create site_alerts router with CRUD endpoints for rules and event management
- Wire both routers into main.py with /api prefix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:18:02 -05:00
Jason Staack
430cab98a8 feat(14-01): add site_id device filter, wireless data endpoints, and frontend API clients
- Add site_id and sector_id query parameters to devices list endpoint
- Add get_device_registrations and get_device_rf_stats to link_service
- Add RegistrationResponse, RFStatsResponse schemas to link.py
- Add /registrations and /rf-stats endpoints to links router
- Add sectorsApi frontend client (list, create, update, delete, assignDevice)
- Add wirelessApi frontend client (links, registrations, RF stats, unknown clients)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:42:08 -05:00
Jason Staack
ea5afe3408 feat(14-01): add sector CRUD backend with migration, model, service, and router
- Create sectors table migration (034) with RLS and devices.sector_id FK
- Add Sector ORM model with site_id and tenant_id foreign keys
- Add SectorCreate/Update/Response/ListResponse Pydantic schemas
- Implement sector_service with CRUD and device assignment functions
- Add sectors router with GET/POST/PUT/DELETE and device sector assignment
- Register sectors router in main.py
- Add sector_id and sector_name to Device model and DeviceResponse

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:40:44 -05:00
Jason Staack
0434d31030 feat(13-03): add link service, schemas, router, and wire subscribers into lifespan
- LinkResponse/UnknownClientResponse Pydantic schemas with from_attributes
- Link service with get_links, get_device_links, get_site_links, get_unknown_clients
- Unknown clients query uses DISTINCT ON for latest registration per MAC
- 4 REST endpoints: tenant links, device links, site links, unknown clients
- Interface and link discovery subscribers wired into FastAPI lifespan start/stop
- Links router registered at /api prefix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:12:06 -05:00
Jason Staack
7afd918e2f feat(11-01): create site service, router, and wire into app
- Add site_service with CRUD, health rollup, device assignment functions
- Add sites router with 8 endpoints (CRUD + assign/unassign/bulk-assign)
- RBAC: viewer for reads, operator for writes, tenant_admin for delete
- Wire sites_router into main.py with /api prefix
- Health rollup computes device_count, online_count, online_percent per site

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 21:38:54 -05:00
Jason Staack
6713a8cf5b feat(audit): make device names clickable in audit log
Add device_id to the audit log API response and frontend type, then
use DeviceLink to make device hostnames navigable in AuditLogTable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 11:16:21 -05:00
Jason Staack
1800330545 feat: expand config editor menu tree and add WiFi wave2 template
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:27:38 -05:00
Jason Staack
7a563fecd2 fix: resolve ruff lint and formatting issues
Remove unused timedelta import from test_wireless_api.py and
auto-format metrics.py to pass ruff format check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 20:09:14 -05:00
Jason Staack
8bffe3b4d0 feat: add wireless-issues API endpoints for dashboard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 20:03:36 -05:00
Jason Staack
7ef849550c feat: seed default wireless alert rules on tenant creation 2026-03-15 20:02:00 -05:00
Jason Staack
06a41ca9bf fix(lint): resolve all ruff lint errors
Add ruff config to exclude alembic E402, SQLAlchemy F821, and pre-existing
E501 line-length issues. Auto-fix 69 unused imports and 2 f-strings without
placeholders. Manually fix 8 unused variables. Apply ruff format to 127 files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 22:17:50 -05:00
Jason Staack
2ad0367c91 fix(vpn): backport VPN fixes from production debugging
- Fix _commit_and_sync infinite recursion
- Use admin session for subnet_index allocation (bypass RLS)
- Auto-set VPN endpoint from CORS_ORIGINS hostname
- Remove server address field from VPN setup UI
- Add DELETE endpoint and button for VPN config removal
- Add wg-reload watcher for reliable config hot-reload via wg syncconf
- Add wg_status.json writer for live peer handshake status in UI
- Per-tenant SNAT for poller-to-device routing through VPN
- Restrict VPN→eth0 forwarding to Docker networks only (block exit node abuse)
- Use 10.10.0.0/16 allowed-address in RouterOS commands
- Fix structlog event= conflict (use audit=True)
- Export backup_scheduler proxy for firmware/upgrade imports
2026-03-14 20:59:14 -05:00
Jason Staack
b5f9bf14df fix(vpn): commit before sync_wireguard_config to ensure data visibility
sync_wireguard_config opens its own AdminAsyncSessionLocal connection
which cannot see uncommitted data from the caller's transaction. Add
_commit_and_sync helper that commits first, then regenerates wg0.conf.

Also removes the unused db parameter from sync_wireguard_config.
2026-03-14 16:42:17 -05:00
Jason Staack
b4a7494016 feat(vpn): update API error handling for subnet exhaustion and IP validation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 16:36:46 -05:00
Jason Staack
17d9d3e00f feat(vpn): regenerate wg0.conf on tenant deletion
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 16:31:33 -05:00
Jason Staack
970501e453 feat: implement Remote WinBox worker, API, frontend integration, OpenBao persistence, and supporting docs 2026-03-14 09:05:14 -05:00
Jason Staack
1a1ceb2cb1 feat(10-01): add audit event logging to config backup operations
- config_snapshot_created event after successful snapshot INSERT
- config_snapshot_skipped_duplicate event on dedup match
- config_diff_generated event after diff INSERT
- config_backup_manual_trigger event on manual trigger success
- All log_action calls wrapped in try/except for safety

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:44:00 -05:00
Jason Staack
af7007df13 feat(06-02): add snapshot view and diff retrieval endpoints
- GET /config/{snapshot_id} returns decrypted full config with RBAC
- GET /config/{snapshot_id}/diff returns unified diff text with RBAC
- 404 for missing snapshots/diffs, 500 for Transit decrypt failure
- Both endpoints enforce viewer+ role and config:read scope

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:03:32 -05:00
Jason Staack
5c56344d74 feat(06-01): add config-history endpoint with RBAC and main.py registration
- GET /api/tenants/{tid}/devices/{did}/config-history endpoint
- Viewer+ RBAC with config:read scope
- Pagination via limit/offset query params (defaults 50/0)
- Router registered in main.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:59:37 -05:00
Jason Staack
00f0a8b507 feat(04-01): add config snapshot trigger endpoint with NATS request-reply
- POST /tenants/{tid}/devices/{did}/config-snapshot/trigger endpoint
- Requires operator role, rate limited 10/minute
- Returns 201 success, 404 device not found, 409 lock held, 502 failure, 504 timeout
- Reuses NATS connection from routeros_proxy module
- 6 tests covering all response paths including connection errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:10:25 -05:00
Jason Staack
c2eea6847f fix: WinBox tunnel bind address, port range, and proxy support
- Bind tunnel listeners to 0.0.0.0 instead of 127.0.0.1 so tunnels
  are reachable through reverse proxies and container networks
- Reduce port range to 49000-49004 (5 concurrent tunnels)
- Derive WinBox URI host from request Host header instead of
  hardcoding 127.0.0.1, enabling use behind reverse proxies
- Add README security warning about default encryption keys

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:03:53 -05:00
Jason Staack
7aaaeaa1d1 fix: address spec compliance gaps - tenant check, XFF fallback, rate limiting
- Gap 1: Add tenant ID verification after device lookup in SSH relay handleSSH,
  closing cross-tenant token reuse vulnerability
- Gap 2: Add X-Forwarded-For fallback (last entry) when X-Real-IP is absent in
  SSH relay source IP extraction; import strings package
- Gap 3: Add @limiter.limit("10/minute") to POST /winbox-session and POST
  /ssh-session using existing slowapi pattern from app.middleware.rate_limit
- Gap 4: Add TODO comment in open_ssh_session explaining that SSH session count
  enforcement is at the poller level; no NATS subject exists yet for API-side
  pre-check

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 15:51:14 -05:00
Jason Staack
4860fad643 feat(api): add remote access endpoints for WinBox tunnels and SSH sessions
Implements four operator-gated endpoints under /api/tenants/{tenant_id}/devices/{device_id}/:
- POST /winbox-session: opens a WinBox tunnel via NATS request-reply to poller
- POST /ssh-session: mints a single-use Redis token (120s TTL) for WebSocket SSH relay
- DELETE /winbox-session/{tunnel_id}: idempotently closes a WinBox tunnel
- GET /sessions: lists active WinBox tunnels via NATS tunnel.status.list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 15:39:24 -05:00
Cog
57e754bb27 fix: implement vault key decryption on login + fix token refresh via cookie
Three bugs fixed:

1. Phase 30 (auth.ts): After SRP login the encrypted_key_set was returned
   from the server but the vault key and RSA private key were never unwrapped
   with the AUK. keyStore.getVaultKey() was always null, causing Tier 1
   config-backup diffs to crash with a TypeError.
   Fix: unwrap vault key and private key using crypto.subtle.unwrapKey after
   successful SRP verification. Non-fatal: warns to console if decryption
   fails so login always succeeds.

2. Token refresh (auth.py): The /refresh endpoint required refresh_token in
   the request body, but the frontend never stored or sent it. After the 15-
   minute access token TTL, all authenticated API calls would fail silently
   because the interceptor sent an empty body and received 422 (not 401),
   so the retry loop never fired.
   Fix: login/srpVerify now set an httpOnly refresh_token cookie scoped to
   /api/auth/refresh. The refresh endpoint now accepts the token from either
   cookie (preferred) or body (legacy). Logout clears both cookies.
   RefreshRequest.refresh_token is now Optional to allow empty-body calls.

3. Silent token rotation: the /refresh endpoint now also rotates the refresh
   token cookie on each use (issues a fresh token), reducing the window for
   stolen refresh token replay.
2026-03-12 14:05:40 -05:00
Jason Staack
2605a97331 fix: use user.user_id instead of user.id in SMTP settings save
CurrentUser object uses user_id attribute, not id. Caused AttributeError
on PUT /api/settings/smtp.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 21:08:21 -05:00
Jason Staack
b840047e19 feat: The Other Dude v9.0.1 — full-featured email system
ci: add GitHub Pages deployment workflow for docs site

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 19:30:44 -05:00