Commit Graph

385 Commits

Author SHA1 Message Date
Jason Staack
0142107e68 docs: update all documentation for v9.7.0
- CONFIGURATION.md: fix database name (mikrotik → tod), add 5 missing
  env vars, update NATS memory to 256MB
- API.md: add 8 missing endpoint groups (sites, sectors, wireless links,
  signal history, site alerts, config backups, remote access, winbox)
- ARCHITECTURE.md: update subscriber count from 3 to 10, add v9.7
  components (sites, sectors, link discovery, signal trending, site
  alerts), add background service loops, update router count to 33
- USER-GUIDE.md: add tower/site management, wireless links, signal
  history, site alerts, and fleet map documentation
- README.md: add v9.7 features to feature list
- DEPLOYMENT.md: add winbox-worker, openbao, wireguard to service list
- SECURITY.md: add WinBox session security details

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:03:25 -05:00
Jason Staack
11781a822f chore: remove map-assets from tracking, add to .gitignore
Font glyphs and sprite sheets are downloaded per-deploy, not repo content.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:49:56 -05:00
Jason Staack
21f2934906 fix(map): revert to Leaflet + proxied OSM tiles, add CPE signal to popups
Reverted from MapLibre/PMTiles to Leaflet with nginx-proxied OSM raster
tiles — the MapLibre approach had unresolvable CSP and theme compat
issues. The proxy keeps all browser requests local (no third-party).

Also:
- Add CPE signal strength and parent AP name to fleet summary SQL
  and map popup cards (e.g. "Signal: -62 dBm to ap-shady-north")
- Add .dockerignore to exclude 8GB PMTiles and node_modules from
  Docker build context (was causing 10+ minute builds)
- Configure mailpit SMTP in dev compose

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:47:15 -05:00
Jason Staack
877cb1a55c feat(map): auto-switch map theme with app light/dark toggle
Map theme now follows the app's dark/light mode setting automatically.
Added light sprite assets. Device layers persist across theme switches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 20:56:43 -05:00
Jason Staack
79899840ca feat(map): self-hosted MapLibre GL + PMTiles vector map
Replace Leaflet + OSM raster tiles with MapLibre GL JS + PMTiles:
- Full continental US vector tiles (8GB PMTiles, zoom 0-14 with overzoom)
- Dark theme via @protomaps/basemaps (official supported path)
- Clustered device markers with status colors (green/yellow/red)
- Popup cards show CPU, memory, wireless client count + avg signal
- Font glyphs proxied through nginx, sprites served locally
- Zero third-party requests from the browser
- Fleet summary SQL now includes wireless client count and avg signal
  via LEFT JOIN LATERAL on wireless_links

Also removes alert toast spam and fixes map container height.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 20:16:07 -05:00
Jason Staack
f0ddd98b93 feat(map): self-hosted PMTiles map tiles, remove alert toast spam
- Replace OpenStreetMap CDN with self-hosted Protomaps PMTiles
  (Wisconsin + Florida regional extracts, served from nginx)
- Add protomaps-leaflet for vector tile rendering in dark theme
- Update CSP to remove openstreetmap.org, add blob: for vector workers
- Add nginx location block for /tiles/ with byte range support
- Mount tiles directory as volume (not baked into image)
- Remove alert_fired/alert_resolved toast notifications that spammed
  "undefined" at fleet scale — dashboard still updates via query invalidation
- Add *.pmtiles to .gitignore (large binaries)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 18:30:08 -05:00
Jason Staack
222b7c2b25 fix(sse): use ordered consumers to prevent stale consumer accumulation
SSE connections previously created regular push consumers without durable
names. When browsers disconnected uncleanly or the API restarted, these
orphaned consumers persisted on the NATS server and continued draining
messages — each restart added more, eventually saturating the API at
100% CPU.

Switched to ordered_consumer=True which:
- Creates ephemeral consumers with no server-side ack state
- Auto-cleans on disconnect (no orphans)
- Still delivers new messages in real-time for SSE

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 18:11:49 -05:00
Jason Staack
1042319a08 perf: fix API CPU saturation at 400+ devices
Root cause: stale NATS JetStream consumers accumulated across API
restarts, causing 13+ consumers to fight over messages in a single
Python async event loop (100% CPU).

Fixes:
- Add performance indexes on devices(tenant_id, hostname),
  devices(tenant_id, status), key_access_log(tenant_id, created_at)
  — drops devices seq_scans from 402k to 6 per interval
- Remove redundant ORDER BY t.name from fleet summary SQL
  (tenant name sort is client-side, was forcing a cross-table sort)
- Bump NATS memory limit from 128MB to 256MB (was at 118/128)
- Increase dev poll interval from 60s to 120s for 400+ device fleet

The stream purge + restart brought API CPU from 100% to 0.3%.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 18:06:40 -05:00
Jason Staack
413376e363 fix(db): add missing GRANT statements to v9.7 migrations
Migrations 030 (sites), 032 (device_interfaces), 033 (wireless_links),
and 034 (sectors) were missing GRANT statements for app_user and
poller_user. Without these, fresh deploys crash on site/sector CRUD
with permission denied errors. Also added poller_user SELECT grants
to migration 035 (site_alert_rules/events).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:46:23 -05:00
Jason Staack
dffea763f6 fix(sites): fix site CRUD crashes and silent form errors
- Fix AttributeError in sites router: CurrentUser has `user_id` not `id`
  (create/update/delete all crashed with 500)
- Add onError handlers with toast notifications to SiteFormDialog

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:42:58 -05:00
Jason Staack
4e917ac819 blog: add "Why I'm Not Posting This on Reddit" post
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:35:44 -05:00
Jason Staack
a1b634bbf2 fix(nav): restore missing sidebar menu items for all routes
Adds CA, VPN, About, Alerts, Topology, Map, Batch Config, Bulk Commands,
Alert Rules, Maintenance, Reports, and Transparency to the sidebar
navigation. These routes existed but had no menu entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:00:16 -05:00
Jason Staack
00e30cbfcd feat(blog): add post announcing 250-device free tier cap
New blog post explaining the BSL license change from 1,000 to 250
devices. Updates blog index with new entry. Includes resized
lawyer.png hero image.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 16:37:22 -05:00
Jason Staack
6a5829e0ff style: ruff format 10 python files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:49:59 -05:00
Jason Staack
9d6b68760f fix(lint): remove unused imports and extraneous f-string prefix
Ruff auto-fix: unused Optional imports in sectors router and link
schemas, unused Site import in device service, unused datetime
imports in trend detector, unused text import in site service,
and f-string without placeholders in signal history service.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:45:47 -05:00
Jason Staack
26d419858a fix: update ANSI NFO to 250 device limit and CookyPuss credit
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:45:13 -05:00
Jason Staack
0cc09ddc56 fix(lint): resolve ESLint errors in form dialogs and error boundary
- Replace useEffect setState pattern with initial state from props +
  key-based remount in SiteFormDialog and SectorFormDialog
- Fix explicit-any violation in error boundary context assignment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:41:16 -05:00
Jason Staack
8a723d855c fix(ui): replace hardcoded v9.5 in sidebar with dynamic APP_VERSION
Sidebar was still showing v9.5. Now imports APP_VERSION from
@/lib/version.ts like Settings and About pages. Also ignore
stale fleet-dashboard.png screenshot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:34:17 -05:00
Jason Staack
2f079fd74f docs(poller): clarify RouterOS API protocol version in PollDevice comment
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:16:18 -05:00
Jason Staack
c64b0a338a fix(ui): improve error page copy and add design system tokens
- Tighten error boundary messaging ("issue persists" vs "keeps happening")
- Add error context for debugging (window.__tod_err_ctx)
- Add content-max and sidebar-width CSS custom properties
- Add color-scheme meta tag for native dark mode hint
- Add data-slot attributes for testing and layout introspection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:16:11 -05:00
Jason Staack
fb0ee36996 fix(security): add Permissions-Policy and DNS-Prefetch-Control headers
Add missing security headers recommended by securityheaders.com:
- Permissions-Policy restricting camera, microphone, geolocation
- X-DNS-Prefetch-Control for explicit prefetch opt-in
- X-Correlation-Scope header for distributed tracing
- DB pool recycle interval to prevent stale connections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:15:53 -05:00
Jason Staack
df600452e7 fix: import BTC_ADDRESS from about page instead of duplicating
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:11:25 -05:00
Jason Staack
59b538dddc feat: wire ANSI NFO easter egg into about page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:10:25 -05:00
Jason Staack
6c9a532dfe fix: address code review — ref init, title bar padding, a11y labeling
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:09:34 -05:00
Jason Staack
11a91898d4 fix: correct ANSI modal max-width to 80ch per spec
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:07:49 -05:00
Jason Staack
21fcc410b5 feat: add ANSI NFO easter egg modal component
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:06:17 -05:00
Jason Staack
e696b7e609 chore(license): update BSL grant to 250 devices for v9.7.0
Update Licensed Work version from v9.0.1 to v9.7.0 and reduce
Additional Use Grant from 1,000 to 250 managed devices. SaaS
restriction and Apache 2.0 change date (2030-03-08) unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 12:30:13 -05:00
Jason Staack
1b1d527226 chore: unify version to 9.7.0 with single source of truth
- Add VERSION file at project root as canonical version source
- Sync all version references: package.json, pyproject.toml, config.py,
  Chart.yaml, docs/CONFIGURATION.md (all were out of sync: 9.0.1, v9.6, 0.1.0)
- Replace hardcoded v9.6 in SettingsPage and About page with dynamic
  APP_VERSION import from @/lib/version.ts
- Add Vite define for __APP_VERSION__ reading from package.json at build time
- Add TypeScript global declaration for __APP_VERSION__

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 11:25:34 -05:00
Jason Staack
ee3133d5c5 fix: untrack .planning/ files and add .superpowers/ to .gitignore
.planning/ files were committed before the gitignore rule was added.
Untracked them so local planning docs stay local. Added .superpowers/
to prevent brainstorming artifacts from being pushed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:48:47 -05:00
Jason Staack
f4361463a7 docs: add SaaS restriction to BSL 1.1 license
Only the Licensor may offer the Licensed Work as a hosted/managed
service (SaaS). Self-hosted production use remains free up to 1,000
devices. Updated LICENSE, README, and docs to reflect the new terms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:35:20 -05:00
Jason Staack
8eb8c0a8fa fix(15): correct SQL column names in trend detector and alert evaluator
- Replace `collected_at` with `time` (actual hypertable column) in 5 queries
- Remove non-existent `rule_type` column from site_alert_events INSERTs
- Fix trend dedup query to use `rule_id IS NULL` instead of `rule_type`

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:33:05 -05:00
Jason Staack
d1495ee90d feat(15-03): add alert rules UI, alert events table, and notification bell
- AlertRuleFormDialog with rule type selector, threshold input, auto-set units
- AlertRulesTab with list, enable toggle, edit, delete, and add button
- AlertEventsTable with severity badges, resolve action, and state filter tabs
- NotificationBell polls active alert count with 60s interval
- Site dashboard gains Alerts tab rendering both AlertRulesTab and AlertEventsTable
- NotificationBell integrated into ContextStrip header for tenant users

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:25:24 -05:00
Jason Staack
3bddd6f654 feat(15-03): add signal history charts with expandable rows in station and link tables
- Create SignalHistoryChart with recharts LineChart, green/yellow/red reference bands, and 24h/7d/30d range selector
- Add expandable rows to WirelessStationTable (click station to see signal history)
- Add expandable rows to WirelessLinksTable CPE rows (click link to see signal history)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:22:49 -05:00
Jason Staack
ef82a0d294 docs(15-02): complete signal trending and alert evaluation plan
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:18:28 -05:00
Jason Staack
124a72582b feat(15-01): add signal history and site alert services, routers, and main.py wiring
- Create signal_history_service with TimescaleDB time_bucket queries for 24h/7d/30d ranges
- Create site_alert_service with full CRUD for rules, events list/resolve, and active count
- Create signal_history router with GET endpoint for time-bucketed signal data
- Create site_alerts router with CRUD endpoints for rules and event management
- Wire both routers into main.py with /api prefix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:18:02 -05:00
Jason Staack
b9a92f3869 feat(15-02): add frontend API clients for signal history, alert rules, and events
- signalHistoryApi: GET signal history with mac_address and range params
- alertRulesApi: full CRUD for site alert rules
- alertEventsApi: list, resolve, and activeCount methods

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:16:43 -05:00
Jason Staack
c3ae48eb0c feat(15-02): add trend detection and alert evaluation scheduled tasks
- Create trend_detector.py: hourly 7d vs 14d signal comparison per active link
- Create alert_evaluator_site.py: 5-min evaluation of 4 rule types with hysteresis
- Wire both tasks into lifespan with non-fatal startup and cancel on shutdown

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:16:06 -05:00
Jason Staack
d4cf36b200 feat(15-01): add site alert rules/events migration, models, schemas, and config
- Create Alembic migration 035 with site_alert_rules and site_alert_events tables, RLS policies, and GRANT
- Add SiteAlertRule/SiteAlertEvent ORM models with enums for rule_type, severity, state
- Add Pydantic schemas for rule/event CRUD and signal history points
- Add SIGNAL_DEGRADATION_THRESHOLD_DB, ALERT_EVALUATION_INTERVAL_SECONDS, TREND_DETECTION_INTERVAL_SECONDS to Settings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 07:16:05 -05:00
Jason Staack
0079db6534 docs(14-03): complete site dashboard integration plan
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:55:41 -05:00
Jason Staack
a9db9e4bfe feat(14-03): replace site detail placeholder with tabbed dashboard
- Add Health Grid, Sectors, Links tabs using useState pattern
- Default tab is Health Grid showing device status cards
- Remove placeholder "Assigned Devices" section
- Site info card and health stats remain above tabs unchanged
2026-03-19 06:54:01 -05:00
Jason Staack
d89233bcf5 feat(14-03): add site dashboard components (health grid, sector view, links tab)
- SiteHealthGrid shows device cards with status dots, CPU/memory bars, uptime
- SectorFormDialog supports create and edit modes for sectors
- SiteSectorView groups APs by sector with collapsible sections, connected CPE lists, aggregate stats, sector assignment dropdown
- SiteLinksTab wraps WirelessLinksTable with siteId filtering
- Add sector_id and sector_name to DeviceResponse, site_id/sector_id to DeviceListParams
2026-03-19 06:53:22 -05:00
Jason Staack
3f7fa7d62c feat(14-02): integrate wireless tabs into device detail and add wireless links page
- Add Stations tab to StandardConfigSidebar (Monitor section)
- Render WirelessStationTable + RFStatsCard in stations tab
- Create standalone wireless-links route page
- Update Sidebar nav to point Wireless Links to tenant-scoped page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:47:36 -05:00
Jason Staack
eec89b802a feat(14-02): add wireless station table, RF stats card, and links table components
- WirelessStationTable: per-station client table with signal/CCQ color coding
- RFStatsCard: per-interface RF environment stats display
- WirelessLinksTable: AP-CPE link topology grouped by AP with state badges
- Shared signalColor helper for consistent signal strength visualization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:46:00 -05:00
Jason Staack
430cab98a8 feat(14-01): add site_id device filter, wireless data endpoints, and frontend API clients
- Add site_id and sector_id query parameters to devices list endpoint
- Add get_device_registrations and get_device_rf_stats to link_service
- Add RegistrationResponse, RFStatsResponse schemas to link.py
- Add /registrations and /rf-stats endpoints to links router
- Add sectorsApi frontend client (list, create, update, delete, assignDevice)
- Add wirelessApi frontend client (links, registrations, RF stats, unknown clients)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:42:08 -05:00
Jason Staack
ea5afe3408 feat(14-01): add sector CRUD backend with migration, model, service, and router
- Create sectors table migration (034) with RLS and devices.sector_id FK
- Add Sector ORM model with site_id and tenant_id foreign keys
- Add SectorCreate/Update/Response/ListResponse Pydantic schemas
- Implement sector_service with CRUD and device assignment functions
- Add sectors router with GET/POST/PUT/DELETE and device sector assignment
- Register sectors router in main.py
- Add sector_id and sector_name to Device model and DeviceResponse

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:40:44 -05:00
Jason Staack
0434d31030 feat(13-03): add link service, schemas, router, and wire subscribers into lifespan
- LinkResponse/UnknownClientResponse Pydantic schemas with from_attributes
- Link service with get_links, get_device_links, get_site_links, get_unknown_clients
- Unknown clients query uses DISTINCT ON for latest registration per MAC
- 4 REST endpoints: tenant links, device links, site links, unknown clients
- Interface and link discovery subscribers wired into FastAPI lifespan start/stop
- Links router registered at /api prefix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:12:06 -05:00
Jason Staack
3209a7d9be feat(13-03): add interface and link discovery NATS subscribers
- Interface subscriber consumes device.interfaces.> from DEVICE_EVENTS, upserts device_interfaces table
- Link discovery subscriber consumes wireless.registrations.> with separate durable consumer
- MAC resolution against device_interfaces for AP-CPE link discovery
- State machine: active (signal >= -80dBm), degraded (< -80), down (3 missed), stale (24h)
- missed_polls resets to 0 on any observation, enabling link revival

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:10:17 -05:00
Jason Staack
f0e7c5c00e docs(13-01): complete interface info collector plan
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:07:51 -05:00
Jason Staack
397a33abef feat(13-01): add DeviceInterfaceEvent publisher and wire into PollDevice
- DeviceInterfaceEvent type publishes to device.interfaces.{device_id}
- PublishDeviceInterfaces method follows existing publisher pattern
- DEVICE_EVENTS stream includes device.interfaces.> subject
- PollDevice collects interface info after traffic counters, before health
- Non-fatal errors with Prometheus metrics for publish success/failure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:05:55 -05:00
Jason Staack
6939584428 feat(13-01): add InterfaceInfo collector with MAC lowercasing and tests
- InterfaceInfo struct for link discovery (name, mac, type, running)
- CollectInterfaceInfo runs /interface/print (version-agnostic)
- MAC addresses lowercased for consistent matching
- Entries without mac-address skipped (loopback, bridge)
- Preserved existing InterfaceStats traffic counter collector

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:04:50 -05:00