Ruff auto-fix: unused Optional imports in sectors router and link
schemas, unused Site import in device service, unused datetime
imports in trend detector, unused text import in site service,
and f-string without placeholders in signal history service.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace `collected_at` with `time` (actual hypertable column) in 5 queries
- Remove non-existent `rule_type` column from site_alert_events INSERTs
- Fix trend dedup query to use `rule_id IS NULL` instead of `rule_type`
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create signal_history_service with TimescaleDB time_bucket queries for 24h/7d/30d ranges
- Create site_alert_service with full CRUD for rules, events list/resolve, and active count
- Create signal_history router with GET endpoint for time-bucketed signal data
- Create site_alerts router with CRUD endpoints for rules and event management
- Wire both routers into main.py with /api prefix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create trend_detector.py: hourly 7d vs 14d signal comparison per active link
- Create alert_evaluator_site.py: 5-min evaluation of 4 rule types with hysteresis
- Wire both tasks into lifespan with non-fatal startup and cancel on shutdown
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create sectors table migration (034) with RLS and devices.sector_id FK
- Add Sector ORM model with site_id and tenant_id foreign keys
- Add SectorCreate/Update/Response/ListResponse Pydantic schemas
- Implement sector_service with CRUD and device assignment functions
- Add sectors router with GET/POST/PUT/DELETE and device sector assignment
- Register sectors router in main.py
- Add sector_id and sector_name to Device model and DeviceResponse
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- LinkResponse/UnknownClientResponse Pydantic schemas with from_attributes
- Link service with get_links, get_device_links, get_site_links, get_unknown_clients
- Unknown clients query uses DISTINCT ON for latest registration per MAC
- 4 REST endpoints: tenant links, device links, site links, unknown clients
- Interface and link discovery subscribers wired into FastAPI lifespan start/stop
- Links router registered at /api prefix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Interface subscriber consumes device.interfaces.> from DEVICE_EVENTS, upserts device_interfaces table
- Link discovery subscriber consumes wireless.registrations.> with separate durable consumer
- MAC resolution against device_interfaces for AP-CPE link discovery
- State machine: active (signal >= -80dBm), degraded (< -80), down (3 missed), stale (24h)
- missed_polls resets to 0 on any observation, enabling link revival
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- wireless_registration_subscriber.py: consumes wireless.registrations.> from WIRELESS_REGISTRATIONS stream
- Inserts per-client rows into wireless_registrations hypertable
- Inserts RF monitor data into rf_monitor_stats hypertable
- Uses AdminAsyncSessionLocal to bypass RLS for cross-tenant writes
- Durable consumer: api-wireless-reg-consumer with retry logic
- Wired into FastAPI lifespan with non-fatal startup and graceful shutdown
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add site_id (Optional[UUID]) and site_name (Optional[str]) to backend DeviceResponse schema
- Include site fields in _build_device_response helper
- Add selectinload(Device.site) to _device_with_relations for eager loading
- Add site_id and site_name to frontend DeviceResponse interface
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add site_service with CRUD, health rollup, device assignment functions
- Add sites router with 8 endpoints (CRUD + assign/unassign/bulk-assign)
- RBAC: viewer for reads, operator for writes, tenant_admin for delete
- Wire sites_router into main.py with /api prefix
- Health rollup computes device_count, online_count, online_percent per site
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
kms_service.py does not exist and Transit encryption was never
implemented for SMTP passwords, making the decrypt_transit code path
unreachable. Remove it entirely and leave only the Fernet fallback.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix _commit_and_sync infinite recursion
- Use admin session for subnet_index allocation (bypass RLS)
- Auto-set VPN endpoint from CORS_ORIGINS hostname
- Remove server address field from VPN setup UI
- Add DELETE endpoint and button for VPN config removal
- Add wg-reload watcher for reliable config hot-reload via wg syncconf
- Add wg_status.json writer for live peer handshake status in UI
- Per-tenant SNAT for poller-to-device routing through VPN
- Restrict VPN→eth0 forwarding to Docker networks only (block exit node abuse)
- Use 10.10.0.0/16 allowed-address in RouterOS commands
- Fix structlog event= conflict (use audit=True)
- Export backup_scheduler proxy for firmware/upgrade imports
sync_wireguard_config opens its own AdminAsyncSessionLocal connection
which cannot see uncommitted data from the caller's transaction. Add
_commit_and_sync helper that commits first, then regenerates wg0.conf.
Also removes the unused db parameter from sync_wireguard_config.
- config_snapshot_created event after successful snapshot INSERT
- config_snapshot_skipped_duplicate event on dedup match
- config_diff_generated event after diff INSERT
- config_backup_manual_trigger event on manual trigger success
- All log_action calls wrapped in try/except for safety
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add CONFIG_RETENTION_DAYS setting (default 90) to config.py
- Create retention_service.py with cleanup_expired_snapshots (parameterized SQL via make_interval)
- APScheduler IntervalTrigger runs cleanup every 24h with 1h jitter
- Prometheus counter and histogram for observability
- CASCADE FKs handle diff/change deletion automatically
- All 4 unit tests pass
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- get_snapshot queries snapshot by id/device/tenant, decrypts via Transit
- get_snapshot_diff queries diff by new_snapshot_id with device/tenant filter
- Both return None for missing data (404-safe)
- 4 new tests with mocked Transit and DB sessions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Service queries router_config_changes JOIN router_config_diffs for timeline
- Returns paginated entries with component, summary, timestamp, diff metadata
- ORDER BY created_at DESC with limit/offset pagination
- 4 tests covering formatting, empty results, pagination, and ordering
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Diff INSERT now uses RETURNING id to capture diff_id
- parse_diff_changes called after diff commit, results stored in router_config_changes
- Change parser errors are best-effort (logged, never block diff storage)
- Added tests for change storage and parser error resilience
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- parse_diff_changes() extracts component, summary, raw_line from unified diffs
- RouterOS path detection converts /ip firewall filter to ip/firewall/filter
- Human-readable summaries: Added/Removed/Modified N component rules
- Fallback to system/general when no path headers found
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add RETURNING id to snapshot INSERT for new_snapshot_id capture
- Call generate_and_store_diff after successful commit (best-effort)
- Outer try/except safety net ensures snapshot ack never blocked by diff
- Update subscriber tests to mock diff service
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- NATS subscriber for config.snapshot.> on DEVICE_EVENTS stream
- Dedup by SHA256 hash against latest snapshot per device
- OpenBao Transit encryption before INSERT (plaintext never stored)
- Malformed/orphan messages acked and discarded safely
- Transit failure causes nak for NATS retry
- Prometheus metrics: ingested, dedup_skipped, errors, duration
- All 6 unit tests pass
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Poller publishes session end events via JetStream when SSH sessions
close (normal disconnect or idle timeout). Backend subscribes with a
durable consumer and writes ssh_session_end audit log entries with
duration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs fixed:
1. audit_service.py: log_action() inserted into audit_logs using the
caller's DB session but never committed. Any router that called
db.commit() before log_action() (firmware, devices, config_editor,
alerts, certificates) had its audit rows silently rolled back when
the request session closed.
Fix: log_action now opens its own AdminAsyncSessionLocal and self-
commits, making audit persistence independent of the caller's
transaction. The 'db' parameter is kept for backward compat but
unused. Affects 5 routers (firmware, devices, config_editor,
alerts, certificates).
2. docker-compose.override.yml: /data/firmware-cache had no volume
mount so the directory didn't exist in the container, causing
firmware downloads to fail with Permission denied.
Fix: bind-mount docker-data/firmware-cache:/data/firmware-cache
so firmware images survive container restarts.
execute_cli was passing the full CLI string (e.g. '/ping address=8.8.8.8
count=4') as a single command to the Go poller. go-routeros expects the
command path and args separately. Now splits into command + prefixed args.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When use_tls=false, the old logic set start_tls=true for any port != 25,
which broke plain SMTP servers like Mailpit. Now:
- Port 465: implicit TLS
- use_tls=true on other ports: STARTTLS
- use_tls=false: plain SMTP (no TLS)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>