SQLAlchemy couldn't resolve ForeignKey("snmp_profiles.id") because
there's no SNMPProfile ORM model — profiles are managed via raw SQL.
The FK constraint exists at the DB level via migration 039. The ORM
column is now a plain UUID.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The app_user role had no INSERT/UPDATE/DELETE on credential_profiles,
snmp_profiles, or snmp_metrics — causing 'permission denied' when
creating credential profiles or SNMP profiles from the UI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DeviceCreate now accepts device_type, snmp_port, snmp_version,
snmp_profile_id, credential_profile_id, and community string.
Username/password are optional (not needed for SNMP devices).
A model validator ensures at least one credential method is provided.
DeviceResponse and DeviceUpdate include the same SNMP fields so
list/detail endpoints return them and users can modify them.
The create_device service skips TCP probe for SNMP devices (UDP),
encrypts inline community strings via Transit, and sets all SNMP
columns on the Device ORM object.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
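The "at least one credential method" rule from the DeviceCreate change above can be sketched with the standard library. This is a stand-in for the Pydantic model validator, not the project's schema: the class name DeviceCreateSketch, the ip_address and snmp_community field names, and the exact rule (username plus password, a credential profile, or a community string) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeviceCreateSketch:
    """Hypothetical stand-in for DeviceCreate; not the real Pydantic model."""

    ip_address: str  # assumed field, not named in the commit text
    device_type: str = "routeros"
    username: Optional[str] = None
    password: Optional[str] = None
    credential_profile_id: Optional[str] = None
    snmp_community: Optional[str] = None  # assumed name for the community string

    def __post_init__(self) -> None:
        # Assumed rule: any one of these three methods is enough.
        has_login = bool(self.username and self.password)
        has_profile = self.credential_profile_id is not None
        has_community = bool(self.snmp_community)
        if not (has_login or has_profile or has_community):
            raise ValueError("at least one credential method is required")
```

With this shape, an SNMP device can be created with only a community string, while a request with no credentials at all is rejected before it reaches the service layer.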
The tod-mib-parser binary at /app/ was overwritten by the Dockerfile's
COPY backend/ . step. Move it to /usr/local/bin/ and update the config path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Build tod-mib-parser in both poller and API Dockerfiles
- Bundle 16 standard MIBs (IF-MIB, HOST-RESOURCES, SNMPv2, etc.)
- Pass --search-path /app/mibs to parser so dependencies resolve
- Users no longer need to upload standard MIBs manually
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SQLAlchemy's text() interprets :name::type as two named parameters.
Fixes syntax errors in link discovery, signal history, and SNMP profile
CRUD that caused 500 errors at runtime.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SQLAlchemy's text() interprets ::jsonb as a named parameter binding.
Use CAST(:profile_data AS jsonb) to avoid the collision.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
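The CAST form described above might look like the following. This is a sketch only: the table and column names come from the commit text, while the WHERE clause, the profile_id parameter name, and the sample values are assumptions.

```python
from sqlalchemy import text

# Hypothetical statement. CAST(... AS jsonb) keeps the bind parameter
# free of the "::" cast syntax that collides with text()'s ":name" markers.
stmt = text(
    "UPDATE snmp_profiles "
    "SET profile_data = CAST(:profile_data AS jsonb) "
    "WHERE id = :profile_id"
)

# bindparams() raises ArgumentError for a name that text() did not
# recognize as a parameter, so this also confirms both names were parsed.
bound = stmt.bindparams(profile_data='{"oids": []}', profile_id="example-id")
params = bound.compile().params
```

Binding the values explicitly is only done here to make the parsed parameter names visible; passing a parameter dict to execute() works the same way.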
Adds SNMP Device Profiles card to SettingsPage for discoverability.
Adds device_count correlated subquery to profile list SQL and schema
field so the frontend profile cards show accurate device counts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- POST /snmp-profiles/parse-mib: upload MIB file, subprocess-call tod-mib-parser, return OID tree JSON
- POST /snmp-profiles/{id}/test: test profile connectivity via NATS discovery probe to poller
- New snmp_proxy service module following routeros_proxy.py lazy NATS pattern
- Pydantic schemas: MIBParseResponse, ProfileTestRequest, ProfileTestResponse, ProfileTestOIDResult
- MIB_PARSER_PATH config setting with /app/tod-mib-parser default
- MIB parse errors return 422, not 500; temp file cleanup in finally block
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
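The parse-mib flow above (temp file, subprocess call, cleanup in a finally block, parser errors surfaced separately from server errors) can be sketched as follows. The function and exception names are hypothetical, and the invocation shape (file path as the last argument, JSON on stdout, non-zero exit on failure) is an assumption about the parser, not its documented interface.

```python
import json
import os
import subprocess
import tempfile


class MibParseError(Exception):
    """Parser rejected the upload; a router could map this to HTTP 422."""


def parse_mib(parser_cmd: list, mib_bytes: bytes) -> dict:
    """Write the upload to a temp file, run the parser on it, return its JSON."""
    fd, path = tempfile.mkstemp(suffix=".mib")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(mib_bytes)
        proc = subprocess.run(
            parser_cmd + [path], capture_output=True, text=True, timeout=30
        )
        if proc.returncode != 0:
            raise MibParseError(proc.stderr.strip() or "parser failed")
        try:
            return json.loads(proc.stdout)
        except json.JSONDecodeError as exc:
            raise MibParseError("parser returned invalid JSON") from exc
    finally:
        # Runs on success and on every error path, so uploads never linger.
        os.unlink(path)
```

Keeping the command as a caller-supplied list means the configured parser path and any extra flags stay outside this helper.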
- POST /tenants/{tenant_id}/devices/bulk endpoint with rate limiting
- bulk_add_with_profile service validates profile ownership and type compatibility
- Duplicate IP check prevents adding same IP twice in one tenant
- TCP reachability check for RouterOS devices, skipped for SNMP (UDP)
- Per-device result reporting with partial success support
- Device model updated with device_type, snmp_port, snmp_version, snmp_profile_id columns
- Audit logging for bulk add operations
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
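The per-device reporting described above can be sketched as a loop that records one result per entry instead of failing the whole request. The names BulkResultSketch and bulk_add_sketch, the error strings, and the injected tcp_reachable callable are all hypothetical; the real service also validates the profile and writes audit logs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass
class BulkResultSketch:
    ip_address: str
    success: bool
    error: Optional[str] = None


def bulk_add_sketch(
    ips: Iterable[str],
    existing_ips: Set[str],
    device_type: str,
    tcp_reachable: Callable[[str], bool],
) -> List[BulkResultSketch]:
    """Return one result per requested IP; failures do not abort the batch."""
    seen = set(existing_ips)
    results = []
    for ip in ips:
        if ip in seen:
            results.append(BulkResultSketch(ip, False, "duplicate IP"))
            continue
        # SNMP runs over UDP, so a TCP connect test says nothing about it.
        if device_type != "snmp" and not tcp_reachable(ip):
            results.append(BulkResultSketch(ip, False, "unreachable"))
            continue
        seen.add(ip)
        results.append(BulkResultSketch(ip, True))
    return results
```

Tracking accepted addresses in the same set as the tenant's existing ones catches duplicates inside a single request as well as against stored devices.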
- BulkAddWithProfileRequest with credential_profile_id, device_type, defaults
- BulkAddDeviceEntry with IP address validation
- BulkAddDefaults for type-appropriate port/TLS defaults
- BulkAddDeviceResult and BulkAddWithProfileResult for per-device reporting
- Existing BulkAddRequest preserved for backward compatibility
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
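The IP address validation on BulkAddDeviceEntry could be backed by the standard library's ipaddress module. This is a minimal sketch; the function name validate_ip is hypothetical, and the real schema presumably wires an equivalent check into a Pydantic validator.

```python
import ipaddress


def validate_ip(value: str) -> str:
    """Return the normalized address, or raise ValueError if it is not an IP."""
    return str(ipaddress.ip_address(value.strip()))
```

Returning the normalized string rather than the input means two spellings of the same address compare equal in a later duplicate check.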
- Service with CRUD + Transit encryption for all new credential writes
- Router with 6 endpoints under /tenants/{tenant_id}/credential-profiles
- Delete returns HTTP 409 with device_count when devices reference profile
- Registered credential_profiles_router in main.py
- DeviceUpdate schema accepts optional credential_profile_id
- update_device validates profile belongs to tenant before assigning
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
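The delete guard above (HTTP 409 with device_count while devices still reference the profile) can be sketched with in-memory data. The function name, the response body keys other than device_count, and the 204 success status are assumptions; the real service queries the database.

```python
from typing import Dict, Iterable, Tuple


def delete_profile_sketch(
    profile_id: str,
    profiles: Dict[str, dict],
    device_profile_ids: Iterable[str],
) -> Tuple[int, dict]:
    """Refuse to delete a profile that devices still reference."""
    device_count = sum(1 for pid in device_profile_ids if pid == profile_id)
    if device_count:
        # The count tells the caller how many devices must be reassigned first.
        return 409, {"detail": "profile in use", "device_count": device_count}
    profiles.pop(profile_id, None)
    return 204, {}
```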
- SQLAlchemy model mapping to credential_profiles table (migration 037)
- CredentialProfileCreate with model_validator enforcing per-type required fields
- CredentialProfileUpdate with conditional validation on type change
- CredentialProfileResponse without any credential fields (write-only)
- Device model updated with credential_profile_id FK and relationship
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add _insert_snmp_custom_metrics handler for custom SNMP OID events
- Insert all 9 columns into snmp_metrics hypertable
- Change unknown metric types from ACK to NAK for redelivery safety
- Prevents permanent data loss during deployment ordering mismatches
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
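The ACK-to-NAK change above can be sketched as a dispatcher that returns the acknowledgement decision. The function name, the event shape, and the "type" key are hypothetical; in the real subscriber the decision would be applied to the JetStream message.

```python
from typing import Callable, Dict

Handler = Callable[[dict], None]


def dispatch_metric(event: dict, handlers: Dict[str, Handler]) -> str:
    """Run the handler for the event type and return "ack" or "nak"."""
    handler = handlers.get(event.get("type", ""))
    if handler is None:
        # Unknown type: request redelivery instead of dropping the event,
        # so a newer API deployed later can still process it.
        return "nak"
    handler(event)
    return "ack"
```

Acknowledging an unknown type would discard it permanently, which is the data-loss case the commit describes when the poller is deployed ahead of the API.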
- devices table: device_type (default 'routeros'), snmp_port (default 161),
snmp_version, snmp_profile_id FK -> snmp_profiles, credential_profile_id
FK -> credential_profiles, with lock_timeout = 3s for safe ALTER
- snmp_metrics: hypertable with 90-day retention, composite index on
(device_id, metric_name, time DESC), RLS with tenant isolation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- credential_profiles: UUID PK, tenant_id FK with CASCADE, credential_type,
encrypted credential fields, unique(tenant_id, name), RLS, poller_user GRANT
- snmp_profiles: UUID PK, nullable tenant_id for system profiles, profile_data
JSONB, partial unique indexes for tenant vs system name uniqueness, RLS with
system profile visibility to all tenants, poller_user GRANT
- 6 system seed profiles: generic-snmp, network-switch, network-router,
wireless-ap, ups-device, mikrotik-snmp with full OID collection definitions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add LICENSE_DEVICES env var (default 250, matches BSL 1.1 free tier)
- Add /api/settings/license endpoint returning device count vs limit
- Header shows flashing red "502/500 licensed" badge when over limit
- About page shows license tier, device count, and over-limit warning
- Nothing is crippled — all features work regardless of device count
- Bump version to 9.7.1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
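The license check above compares a device count against LICENSE_DEVICES and only reports the result. A minimal sketch, assuming a hypothetical license_status helper and response keys (the real endpoint's field names are not given in the commit):

```python
import os
from typing import Mapping, Optional

DEFAULT_LICENSE_DEVICES = 250  # default stated in the commit


def license_status(device_count: int, env: Optional[Mapping[str, str]] = None) -> dict:
    """Report device count against the limit; nothing is blocked when over."""
    source = os.environ if env is None else env
    limit = int(source.get("LICENSE_DEVICES", DEFAULT_LICENSE_DEVICES))
    return {
        "device_count": device_count,
        "limit": limit,
        "over_limit": device_count > limit,
    }
```

The over_limit flag is informational, which matches the commit's note that all features keep working regardless of device count.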
Reverted from MapLibre/PMTiles to Leaflet with nginx-proxied OSM raster
tiles. The MapLibre approach had unresolvable CSP and theme compatibility
issues. The proxy keeps all browser requests local (no third-party requests).
Also:
- Add CPE signal strength and parent AP name to fleet summary SQL
and map popup cards (e.g. "Signal: -62 dBm to ap-shady-north")
- Add .dockerignore to exclude 8GB PMTiles and node_modules from
Docker build context (was causing 10+ minute builds)
- Configure mailpit SMTP in dev compose
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace Leaflet + OSM raster tiles with MapLibre GL JS + PMTiles:
- Full continental US vector tiles (8GB PMTiles, zoom 0-14 with overzoom)
- Dark theme via @protomaps/basemaps (official supported path)
- Clustered device markers with status colors (green/yellow/red)
- Popup cards show CPU, memory, wireless client count + avg signal
- Font glyphs proxied through nginx, sprites served locally
- Zero third-party requests from the browser
- Fleet summary SQL now includes wireless client count and avg signal
via LEFT JOIN LATERAL on wireless_links
Also removes alert toast spam and fixes map container height.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SSE connections previously created regular push consumers without durable
names. When browsers disconnected uncleanly or the API restarted, these
orphaned consumers persisted on the NATS server and continued draining
messages — each restart added more, eventually saturating the API at
100% CPU.
Switched to ordered_consumer=True which:
- Creates ephemeral consumers with no server-side ack state
- Auto-cleans on disconnect (no orphans)
- Still delivers new messages in real-time for SSE
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: stale NATS JetStream consumers accumulated across API
restarts, causing 13+ consumers to fight over messages in a single
Python async event loop (100% CPU).
Fixes:
- Add performance indexes on devices(tenant_id, hostname),
devices(tenant_id, status), key_access_log(tenant_id, created_at)
— drops devices seq_scans from 402k to 6 per interval
- Remove redundant ORDER BY t.name from fleet summary SQL
(tenant name sort is client-side, was forcing a cross-table sort)
- Bump NATS memory limit from 128MB to 256MB (usage was at 118MB of 128MB)
- Increase dev poll interval from 60s to 120s for 400+ device fleet
The stream purge + restart brought API CPU from 100% to 0.3%.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migrations 030 (sites), 032 (device_interfaces), 033 (wireless_links),
and 034 (sectors) were missing GRANT statements for app_user and
poller_user. Without these, fresh deploys crash on site/sector CRUD
with permission denied errors. Also added poller_user SELECT grants
to migration 035 (site_alert_rules/events).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix AttributeError in sites router: CurrentUser has `user_id` not `id`
(create/update/delete all crashed with 500)
- Add onError handlers with toast notifications to SiteFormDialog
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ruff auto-fix: remove unused Optional imports in sectors router and link
schemas, unused Site import in device service, unused datetime
imports in trend detector, and unused text import in site service,
and fix an f-string without placeholders in signal history service.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add missing security headers recommended by securityheaders.com, along
with tracing and DB pool changes:
- Permissions-Policy restricting camera, microphone, geolocation
- X-DNS-Prefetch-Control for explicit prefetch opt-in
- X-Correlation-Scope header for distributed tracing
- DB pool recycle interval to prevent stale connections
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add VERSION file at project root as canonical version source
- Sync all version references: package.json, pyproject.toml, config.py,
Chart.yaml, docs/CONFIGURATION.md (all were out of sync: 9.0.1, v9.6, 0.1.0)
- Replace hardcoded v9.6 in SettingsPage and About page with dynamic
APP_VERSION import from @/lib/version.ts
- Add Vite define for __APP_VERSION__ reading from package.json at build time
- Add TypeScript global declaration for __APP_VERSION__
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace `collected_at` with `time` (actual hypertable column) in 5 queries
- Remove non-existent `rule_type` column from site_alert_events INSERTs
- Fix trend dedup query to use `rule_id IS NULL` instead of `rule_type`
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create signal_history_service with TimescaleDB time_bucket queries for 24h/7d/30d ranges
- Create site_alert_service with full CRUD for rules, events list/resolve, and active count
- Create signal_history router with GET endpoint for time-bucketed signal data
- Create site_alerts router with CRUD endpoints for rules and event management
- Wire both routers into main.py with /api prefix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create trend_detector.py: hourly 7d vs 14d signal comparison per active link
- Create alert_evaluator_site.py: 5-min evaluation of 4 rule types with hysteresis
- Wire both tasks into lifespan with non-fatal startup and cancel on shutdown
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
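One possible reading of the "7d vs 14d signal comparison" above is to compare the mean signal of the last 7 days against the mean over the last 14 days. The sketch below assumes that reading; the function name is hypothetical, and threshold_db stands in for SIGNAL_DEGRADATION_THRESHOLD_DB, whose actual value and exact semantics are not given in the commit.

```python
from statistics import mean
from typing import Sequence


def is_degrading(
    last_7d_dbm: Sequence[float],
    last_14d_dbm: Sequence[float],
    threshold_db: float,
) -> bool:
    """True when the 7-day mean is at least threshold_db below the 14-day mean."""
    if not last_7d_dbm or not last_14d_dbm:
        # Not enough history to compare; treat as no trend.
        return False
    return mean(last_14d_dbm) - mean(last_7d_dbm) >= threshold_db
```

Signal values are negative dBm, so a drop from -72.5 to -76 counts as 3.5 dB of degradation.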
- Create Alembic migration 035 with site_alert_rules and site_alert_events tables, RLS policies, and GRANT
- Add SiteAlertRule/SiteAlertEvent ORM models with enums for rule_type, severity, state
- Add Pydantic schemas for rule/event CRUD and signal history points
- Add SIGNAL_DEGRADATION_THRESHOLD_DB, ALERT_EVALUATION_INTERVAL_SECONDS, TREND_DETECTION_INTERVAL_SECONDS to Settings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create sectors table migration (034) with RLS and devices.sector_id FK
- Add Sector ORM model with site_id and tenant_id foreign keys
- Add SectorCreate/Update/Response/ListResponse Pydantic schemas
- Implement sector_service with CRUD and device assignment functions
- Add sectors router with GET/POST/PUT/DELETE and device sector assignment
- Register sectors router in main.py
- Add sector_id and sector_name to Device model and DeviceResponse
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- LinkResponse/UnknownClientResponse Pydantic schemas with from_attributes
- Link service with get_links, get_device_links, get_site_links, get_unknown_clients
- Unknown clients query uses DISTINCT ON for latest registration per MAC
- 4 REST endpoints: tenant links, device links, site links, unknown clients
- Interface and link discovery subscribers wired into FastAPI lifespan start/stop
- Links router registered at /api prefix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Interface subscriber consumes device.interfaces.> from DEVICE_EVENTS, upserts device_interfaces table
- Link discovery subscriber consumes wireless.registrations.> with separate durable consumer
- MAC resolution against device_interfaces for AP-CPE link discovery
- State machine: active (signal >= -80dBm), degraded (< -80), down (3 missed), stale (24h)
- missed_polls resets to 0 on any observation, enabling link revival
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
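The link state rules above can be sketched as two transitions, one per observation and one per missed poll. The thresholds come from the commit text; the class and function names are hypothetical, and reading "stale (24h)" as "24 hours since the last observation" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

DEGRADED_BELOW_DBM = -80
MISSED_POLLS_DOWN = 3
STALE_AFTER = timedelta(hours=24)  # assumed to count from the last observation


@dataclass
class LinkSketch:
    state: str = "discovered"
    missed_polls: int = 0
    last_seen: Optional[datetime] = None


def observe(link: LinkSketch, signal_dbm: float, now: datetime) -> None:
    """Any observation resets missed_polls, which lets a down link revive."""
    link.missed_polls = 0
    link.last_seen = now
    link.state = "active" if signal_dbm >= DEGRADED_BELOW_DBM else "degraded"


def miss(link: LinkSketch, now: datetime) -> None:
    """Record a poll in which the link was not seen."""
    link.missed_polls += 1
    if link.last_seen is not None and now - link.last_seen >= STALE_AFTER:
        link.state = "stale"
    elif link.missed_polls >= MISSED_POLLS_DOWN:
        link.state = "down"
```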
- Migration 033 creates wireless_links with state machine, missed_polls, RLS
- WirelessLink model with LinkState enum (discovered/active/degraded/down/stale)
- Register DeviceInterface, WirelessLink, LinkState in models __init__
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Migration 032 creates device_interfaces with RLS, MAC index, unique(device_id, name)
- DeviceInterface SQLAlchemy model with all columns and device relationship
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- wireless_registration_subscriber.py: consumes wireless.registrations.> from WIRELESS_REGISTRATIONS stream
- Inserts per-client rows into wireless_registrations hypertable
- Inserts RF monitor data into rf_monitor_stats hypertable
- Uses AdminAsyncSessionLocal to bypass RLS for cross-tenant writes
- Durable consumer: api-wireless-reg-consumer with retry logic
- Wired into FastAPI lifespan with non-fatal startup and graceful shutdown
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- wireless_registrations hypertable with per-client columns (mac, signal, rates, uptime)
- rf_monitor_stats hypertable for RF environment data (noise floor, channel width, tx power)
- RLS tenant_isolation with super_admin bypass on both tables
- Composite indexes: device+time, mac+time (for Phase 13 link discovery)
- 30-day retention policies on both hypertables
- GRANTs for app_user and poller_user
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add site_id (Optional[UUID]) and site_name (Optional[str]) to backend DeviceResponse schema
- Include site fields in _build_device_response helper
- Add selectinload(Device.site) to _device_with_relations for eager loading
- Add site_id and site_name to frontend DeviceResponse interface
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add site_service with CRUD, health rollup, device assignment functions
- Add sites router with 8 endpoints (CRUD + assign/unassign/bulk-assign)
- RBAC: viewer for reads, operator for writes, tenant_admin for delete
- Wire sites_router into main.py with /api prefix
- Health rollup computes device_count, online_count, online_percent per site
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
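The health rollup fields above can be computed from a site's device statuses. A minimal sketch, assuming a hypothetical site_health helper, an "online" status value, and one-decimal rounding; the real service presumably aggregates in SQL.

```python
from typing import Iterable


def site_health(statuses: Iterable[str]) -> dict:
    """Compute device_count, online_count, and online_percent for one site."""
    items = list(statuses)
    device_count = len(items)
    online_count = sum(1 for status in items if status == "online")
    # An empty site reports 0.0 rather than dividing by zero.
    online_percent = (
        round(100.0 * online_count / device_count, 1) if device_count else 0.0
    )
    return {
        "device_count": device_count,
        "online_count": online_count,
        "online_percent": online_percent,
    }
```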
- Add migration 030 with sites table, RLS policy, and device site_id FK
- Add Site SQLAlchemy model with tenant isolation
- Add site_id nullable FK and relationship to Device model
- Add sites relationship to Tenant model
- Register Site in models __init__.py
- Add SiteCreate, SiteUpdate, SiteResponse, SiteListResponse schemas
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add device_id to the audit log API response and frontend type, then
use DeviceLink to make device hostnames navigable in AuditLogTable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
kms_service.py does not exist and Transit encryption was never
implemented for SMTP passwords, making the decrypt_transit code path
unreachable. Remove it entirely and leave only the Fernet fallback.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>