Commit Graph

260 Commits

Author SHA1 Message Date
Jason Staack
00bdde9975 test(09-01): add failing tests for retention cleanup service
- Test cleanup deletes expired snapshots
- Test snapshots within retention window are kept
- Test deleted count is returned
- Test empty table handled gracefully

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:32:20 -05:00
Jason Staack
be41add4e9 feat(08-02): add snapshot download button to config history timeline
- Add SnapshotResponse interface and getSnapshot API method
- Add deviceName prop to ConfigHistorySection
- Add download handler that fetches snapshot and triggers .rsc file download
- Add Download icon button on each timeline entry with stopPropagation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:23:55 -05:00
Jason Staack
8a64596d8b docs(08-01): complete diff viewer plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:22:22 -05:00
Jason Staack
2cf426fa63 feat(08-01): wire diff viewer into config history timeline
- Add click handlers to timeline entries to open diff viewer
- Render DiffViewer inline above timeline when snapshot selected
- Add hover state and cursor-pointer to timeline entries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:20:52 -05:00
Jason Staack
dda00fbd23 feat(08-01): add diff viewer component and API client
- Add DiffResponse interface and getDiff method to configHistoryApi
- Create DiffViewer component with unified diff rendering
- Green highlighting for added lines, red for removed lines
- Blue styling for hunk headers, loading skeleton, error state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:20:24 -05:00
Jason Staack
bd09590c01 docs(07-01): complete config history UI timeline plan
- SUMMARY.md with plan execution results
- STATE.md updated to phase 7 complete
- ROADMAP.md and REQUIREMENTS.md updated

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:13:36 -05:00
Jason Staack
36861fffea feat(07-01): wire ConfigHistorySection into device detail page
- Import and render ConfigHistorySection below Interface Utilization
- Configuration history now visible on device overview tab

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:12:16 -05:00
Jason Staack
6bd24517ba feat(07-01): add config history API client and timeline component
- Add ConfigChangeEntry interface and configHistoryApi.list() to api.ts
- Create ConfigHistorySection with timeline, loading skeleton, and empty state
- Poll every 60s via TanStack Query refetchInterval

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:11:46 -05:00
Jason Staack
e8bf994e7d docs(06-02): complete snapshot view and diff retrieval plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:04:48 -05:00
Jason Staack
af7007df13 feat(06-02): add snapshot view and diff retrieval endpoints
- GET /config/{snapshot_id} returns decrypted full config with RBAC
- GET /config/{snapshot_id}/diff returns unified diff text with RBAC
- 404 for missing snapshots/diffs, 500 for Transit decrypt failure
- Both endpoints enforce viewer+ role and config:read scope

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:03:32 -05:00
Jason Staack
83cd661efc feat(06-02): add get_snapshot and get_snapshot_diff service functions
- get_snapshot queries snapshot by id/device/tenant, decrypts via Transit
- get_snapshot_diff queries diff by new_snapshot_id with device/tenant filter
- Both return None for missing data (404-safe)
- 4 new tests with mocked Transit and DB sessions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:02:58 -05:00
Jason Staack
7c0aa1b359 docs(06-01): complete config history timeline plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:01:12 -05:00
Jason Staack
5c56344d74 feat(06-01): add config-history endpoint with RBAC and main.py registration
- GET /api/tenants/{tid}/devices/{did}/config-history endpoint
- Viewer+ RBAC with config:read scope
- Pagination via limit/offset query params (defaults 50/0)
- Router registered in main.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:59:37 -05:00
Jason Staack
f7d5aec4ec feat(06-01): add config history service with TDD tests
- Service queries router_config_changes JOIN router_config_diffs for timeline
- Returns paginated entries with component, summary, timestamp, diff metadata
- ORDER BY created_at DESC with limit/offset pagination
- 4 tests covering formatting, empty results, pagination, and ordering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:58:51 -05:00
Jason Staack
3416cc90a5 docs(05-02): complete structured change parser plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:38:16 -05:00
Jason Staack
122b5917f4 feat(05-02): wire change parser into diff service with RETURNING id
- Diff INSERT now uses RETURNING id to capture diff_id
- parse_diff_changes called after diff commit, results stored in router_config_changes
- Change parser errors are best-effort (logged, never block diff storage)
- Added tests for change storage and parser error resilience

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:37:09 -05:00
Jason Staack
b167831105 feat(05-02): implement config change parser for RouterOS diffs
- parse_diff_changes() extracts component, summary, raw_line from unified diffs
- RouterOS path detection converts /ip firewall filter to ip/firewall/filter
- Human-readable summaries: Added/Removed/Modified N component rules
- Fallback to system/general when no path headers found

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:35:48 -05:00
Jason Staack
7fddf35fc5 test(05-02): add failing tests for config change parser
- 6 tests covering component extraction, summaries, multi-section, removals, modifications, fallback, raw_line

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:35:19 -05:00
Jason Staack
4e083a9606 docs(05-01): complete config diff service plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:34:16 -05:00
Jason Staack
eb76343d04 feat(05-01): wire diff generation into snapshot subscriber
- Add RETURNING id to snapshot INSERT for new_snapshot_id capture
- Call generate_and_store_diff after successful commit (best-effort)
- Outer try/except safety net ensures snapshot ack never blocked by diff
- Update subscriber tests to mock diff service

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:32:40 -05:00
Jason Staack
72d0ae2856 feat(05-01): implement config diff service with Transit decrypt and difflib
- generate_and_store_diff decrypts old+new snapshots, produces unified diff
- Stores diff in router_config_diffs with line counts
- Best-effort: decrypt/DB errors logged, never raised
- Prometheus metrics: generated_total, errors_total, duration_seconds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:31:28 -05:00
Jason Staack
79453fa115 test(05-01): add failing tests for config diff service
- 5 tests: diff generation, first snapshot skip, decrypt failure, line counts, empty diff skip

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:30:52 -05:00
Jason Staack
db5bb3fa96 docs(04-01): complete manual backup trigger plan
- Summary with 12 tests (6 Go, 6 Python), all passing
- STATE.md updated: Phase 4 complete, decisions logged
- ROADMAP.md updated: Phase 4 plan progress
- REQUIREMENTS.md: COLL-04 marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:12:33 -05:00
Jason Staack
00f0a8b507 feat(04-01): add config snapshot trigger endpoint with NATS request-reply
- POST /tenants/{tid}/devices/{did}/config-snapshot/trigger endpoint
- Requires operator role, rate limited 10/minute
- Returns 201 success, 404 device not found, 409 lock held, 502 failure, 504 timeout
- Reuses NATS connection from routeros_proxy module
- 6 tests covering all response paths including connection errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:10:25 -05:00
Jason Staack
0e664150e7 test(04-01): add failing tests for config snapshot trigger endpoint
- Test success returns 201 with sha256_hash
- Test NATS timeout returns 504
- Test poller failure returns 502
- Test device not found returns 404
- Test lock contention returns 409

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:08:13 -05:00
Jason Staack
0851eced36 feat(04-01): implement BackupResponder with extracted CollectAndPublish
- Create BackupResponder for NATS request-reply on config.backup.trigger
- Extract public CollectAndPublish from BackupScheduler returning sha256 hash
- Define BackupExecutor/BackupLocker/DeviceGetter interfaces for testability
- Create RedisBackupLocker adapter wrapping redislock.Client
- Wire BackupResponder into main.go lifecycle
- All 6 tests pass with in-process NATS server

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:07:35 -05:00
Jason Staack
9e102fda20 test(04-01): add failing tests for BackupResponder
- Test subscribe registers subscription
- Test valid request returns success with sha256_hash
- Test lock held returns locked status
- Test invalid JSON returns error
- Test Stop unsubscribes cleanly
- Test device not found returns failed status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:04:44 -05:00
Jason Staack
bf3fb509ed docs(03-01): complete config snapshot subscriber plan
- SUMMARY.md with task commits and decisions
- STATE.md updated to Phase 3 complete
- ROADMAP.md progress updated
- REQUIREMENTS.md: STOR-02 marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:49:43 -05:00
Jason Staack
0db06419e7 feat(03-01): wire config snapshot subscriber into main.py lifespan
- Start config_snapshot_subscriber in lifespan startup (non-fatal)
- Stop config_snapshot_subscriber in lifespan shutdown
- Placed after push_rollback_subscriber (near config-related subscribers)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:47:51 -05:00
Jason Staack
3ab9f27d49 feat(03-01): implement config snapshot subscriber with dedup and encryption
- NATS subscriber for config.snapshot.> on DEVICE_EVENTS stream
- Dedup by SHA256 hash against latest snapshot per device
- OpenBao Transit encryption before INSERT (plaintext never stored)
- Malformed/orphan messages acked and discarded safely
- Transit failure causes nak for NATS retry
- Prometheus metrics: ingested, dedup_skipped, errors, duration
- All 6 unit tests pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:47:07 -05:00
Jason Staack
9d8274158a test(03-01): add failing tests for config snapshot subscriber
- 6 tests: new snapshot stored, duplicate skipped, encrypt failure naks,
  malformed acked, orphan device acked, first-snapshot stored
- Tests mock NATS msg, AdminAsyncSessionLocal, OpenBaoTransitService

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:45:41 -05:00
Jason Staack
d456fe58e9 docs(02-02): complete backup scheduler plan
- SUMMARY.md with execution metrics and decisions
- STATE.md updated: Phase 2 complete, 3 plans done
- ROADMAP.md updated: Phase 2 marked complete
- REQUIREMENTS.md: COLL-03, COLL-05 marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:57:47 -05:00
Jason Staack
d34817a36c feat(02-02): wire BackupScheduler into main.go lifecycle
- Create BackupScheduler with all dependencies injected
- Run as goroutine parallel to status poll scheduler
- Shares same context for graceful shutdown via SIGINT/SIGTERM
- Startup logged with interval, max_concurrent, command_timeout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:55:06 -05:00
Jason Staack
2653a32d6f feat(02-02): implement BackupScheduler with per-device goroutines and concurrency control
- BackupScheduler manages per-device backup goroutines independently from status poll
- First backup uses 30-300s random jitter delay to spread load
- Concurrency limited by buffered channel semaphore (configurable max)
- Per-device Redis lock prevents duplicate backups across pods
- Auth failures and host key mismatches block retries with clear warnings
- Transient errors use 5m/15m/1h exponential backoff with cap
- Offline devices skipped via Redis status key check
- TOFU fingerprint stored on first successful SSH connection
- Config output validated, normalized, hashed, published to NATS
- SSHHostKeyUpdater interface added to interfaces.go
- All 12 backup unit tests pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:54:23 -05:00
Jason Staack
a884b0945d test(02-02): add failing tests for BackupScheduler
- Jitter range, backoff sequence, shouldRetry blocking logic
- Online-only gating via Redis, concurrency semaphore behavior
- Reconciliation start/stop device lifecycle

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:52:29 -05:00
Jason Staack
7ff3178b84 docs(02-01): complete config backup primitives plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:50:27 -05:00
Jason Staack
4ae39d2cb3 feat(02-01): add config backup env vars, NATS event, device SSH fields, migration, metrics
- Config: CONFIG_BACKUP_INTERVAL (21600s), CONFIG_BACKUP_MAX_CONCURRENT (10), CONFIG_BACKUP_COMMAND_TIMEOUT (60s)
- NATS: ConfigSnapshotEvent type, PublishConfigSnapshot method, config.snapshot.> stream subject
- Device: SSHPort/SSHHostKeyFingerprint fields, UpdateSSHHostKey method, updated queries/scans
- Migration 028: ssh_port, ssh_host_key_fingerprint, timestamp columns with poller_user grants
- Metrics: ConfigBackupTotal (counter), ConfigBackupDuration (histogram), ConfigBackupActive (gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:48:12 -05:00
Jason Staack
f1abb75cab feat(02-01): add SSH executor with TOFU host key verification and config normalizer
- SSH RunCommand with typed error classification (auth, hostkey, timeout, connection refused, truncated)
- TOFU host key callback: accept-on-first-connect, verify-on-subsequent, reject-on-mismatch
- NormalizeConfig strips timestamps, normalizes line endings, trims whitespace, collapses blanks
- HashConfig returns 64-char lowercase hex SHA256 of normalized config
- 22 unit tests covering all error kinds, TOFU flows, normalization edge cases, idempotency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:46:04 -05:00
Jason Staack
33f888a6e2 docs(02): create phase plan for poller config collection
Two plans covering SSH executor, config normalization, NATS publishing,
backup scheduler, and main.go wiring for periodic RouterOS config backup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:39:47 -05:00
Jason Staack
a7a17a5ecd feat(01-01): add Alembic migration 027 for config snapshot tables with RLS
- Create router_config_snapshots table with Transit ciphertext storage
- Create router_config_diffs table with snapshot pair FK references
- Create router_config_changes table for parsed semantic changes
- Add RLS tenant isolation (ENABLE + FORCE + USING + WITH CHECK) on all 3
- Add GRANT SELECT/INSERT/DELETE to app_user on all 3
- Add performance indexes: device+collected_at, device+hash, snapshot pair, diff_id

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:04:18 -05:00
Jason Staack
8fe275e6f3 feat(01-01): add RouterConfigSnapshot/Diff/Change ORM models and tests
- Add RouterConfigSnapshot model with Transit ciphertext config_text
  and SHA-256 plaintext hash for deduplication
- Add RouterConfigDiff model for unified diffs between snapshots
- Add RouterConfigChange model for parsed semantic changes
- Export all three from app.models barrel file
- Add unit tests for importability, table names, columns, and types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:03:43 -05:00
Jason Staack
7e2d637e0d docs: initialize project 2026-03-12 19:37:09 -05:00
Jason Staack
70126980a4 docs: map existing codebase 2026-03-12 19:33:26 -05:00
Jason Staack
5beede9502 Merge branch 'feature/remote-access' 2026-03-12 19:04:14 -05:00
Jason Staack
c2eea6847f fix: WinBox tunnel bind address, port range, and proxy support
- Bind tunnel listeners to 0.0.0.0 instead of 127.0.0.1 so tunnels
  are reachable through reverse proxies and container networks
- Reduce port range to 49000-49004 (5 concurrent tunnels)
- Derive WinBox URI host from request Host header instead of
  hardcoding 127.0.0.1, enabling use behind reverse proxies
- Add README security warning about default encryption keys

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:03:53 -05:00
Jason Staack
8cce0ef750 docs: add v9.5 remote access to website docs
WinBox tunnels, SSH terminal, NATS request-reply architecture,
session management, security notes, and updated port tables.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 17:02:02 -05:00
Jason Staack
acf1790bed feat: add audit.session.end NATS pipeline for SSH session tracking
Poller publishes session end events via JetStream when SSH sessions
close (normal disconnect or idle timeout). Backend subscribes with a
durable consumer and writes ssh_session_end audit log entries with
duration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 16:07:10 -05:00
Jason Staack
7aaaeaa1d1 fix: address spec compliance gaps - tenant check, XFF fallback, rate limiting
- Gap 1: Add tenant ID verification after device lookup in SSH relay handleSSH,
  closing cross-tenant token reuse vulnerability
- Gap 2: Add X-Forwarded-For fallback (last entry) when X-Real-IP is absent in
  SSH relay source IP extraction; import strings package
- Gap 3: Add @limiter.limit("10/minute") to POST /winbox-session and POST
  /ssh-session using existing slowapi pattern from app.middleware.rate_limit
- Gap 4: Add TODO comment in open_ssh_session explaining that SSH session count
  enforcement is at the poller level; no NATS subject exists yet for API-side
  pre-check

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 15:51:14 -05:00
Jason Staack
a4e1c78744 docs: update documentation for v9.5 remote access feature
Add tunnel manager, SSH relay, new env vars, security model, and
Remote Access key feature entry across ARCHITECTURE, DEPLOYMENT,
SECURITY, CONFIGURATION, and README.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 15:47:03 -05:00
Jason Staack
d2471278ab feat(frontend): integrate WinBox and SSH buttons into device page
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 15:45:14 -05:00