docs(03-01): complete config snapshot subscriber plan

- SUMMARY.md with task commits and decisions
- STATE.md updated to Phase 3 complete
- ROADMAP.md progress updated
- REQUIREMENTS.md: STOR-02 marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -17,7 +17,7 @@
 ### Storage
 
 - [x] **STOR-01**: API stores config snapshots in `router_config_snapshots` table with SHA256 hash
-- [ ] **STOR-02**: Duplicate snapshots (same hash as previous) are skipped, no diff generated
+- [x] **STOR-02**: Duplicate snapshots (same hash as previous) are skipped, no diff generated
 - [ ] **STOR-03**: Snapshots retained for 90 days (configurable via `CONFIG_RETENTION_DAYS`)
 - [ ] **STOR-04**: Older snapshots automatically deleted by retention cleanup
 - [x] **STOR-05**: Snapshots encrypted at rest, accessible only through RBAC
@@ -75,7 +75,7 @@
 | COLL-05 | Phase 2: Poller Config Collection | Complete |
 | COLL-06 | Phase 2: Poller Config Collection | Complete |
 | STOR-01 | Phase 1: Database Schema | Complete |
-| STOR-02 | Phase 3: Snapshot Ingestion | Pending |
+| STOR-02 | Phase 3: Snapshot Ingestion | Complete |
 | STOR-03 | Phase 9: Retention & Cleanup | Pending |
 | STOR-04 | Phase 9: Retention & Cleanup | Pending |
 | STOR-05 | Phase 1: Database Schema | Complete |
@@ -56,17 +56,17 @@ Plans:
 - [ ] 02-02-PLAN.md — Backup scheduler with per-device goroutines, concurrency control, retry logic, and main.go wiring
 
 ### Phase 3: Snapshot Ingestion
-**Goal**: Backend receives config snapshots from NATS, computes SHA256 hash, and stores new snapshots while skipping duplicates
+**Goal**: Backend receives config snapshots from NATS, encrypts via Transit, deduplicates by SHA256, and stores new snapshots
 **Depends on**: Phase 1, Phase 2
 **Requirements**: STOR-02
 **Success Criteria** (what must be TRUE):
 1. Backend NATS subscriber consumes `config.snapshot.create` messages and persists snapshots to `router_config_snapshots`
 2. When a snapshot has the same SHA256 hash as the device's most recent snapshot, it is skipped (no new row, no diff)
 3. Each stored snapshot includes device_id, tenant_id, config_text (encrypted), sha256_hash, and collected_at timestamp
-**Plans**: TBD
+**Plans**: 1 plan
 
 Plans:
-- [ ] 03-01: NATS subscriber for config snapshot ingestion with deduplication
+- [ ] 03-01-PLAN.md — NATS subscriber for config snapshot ingestion with dedup, encryption, and main.py wiring
 
 ### Phase 4: Manual Backup Trigger
 **Goal**: Operators can trigger an immediate config backup for a specific device through the API
@@ -2,15 +2,15 @@
 gsd_state_version: 1.0
 milestone: v9.6
 milestone_name: milestone
-status: in_progress
-stopped_at: Completed 02-02-PLAN.md
-last_updated: "2026-03-13T01:55:37Z"
-last_activity: 2026-03-13 -- Completed 02-02 backup scheduler (per-device goroutines, concurrency, main.go wiring)
+status: completed
+stopped_at: Completed 03-01-PLAN.md
+last_updated: "2026-03-13T02:48:59.037Z"
+last_activity: 2026-03-13 -- Completed 02-02 backup scheduler with per-device goroutines and main.go wiring
 progress:
   total_phases: 10
-  completed_phases: 2
-  total_plans: 3
-  completed_plans: 3
+  completed_phases: 3
+  total_plans: 4
+  completed_plans: 4
   percent: 100
 ---
 
@@ -21,23 +21,23 @@ progress:
 See: .planning/PROJECT.md (updated 2026-03-12)
 
 **Core value:** Operators can see exactly what changed on a router and when, with reliable config snapshots for download
-**Current focus:** Phase 2: Poller Config Collection
+**Current focus:** Phase 3: Snapshot Ingestion -- COMPLETE
 
 ## Current Position
 
-Phase: 2 of 10 (Poller Config Collection) -- COMPLETE
-Plan: 2 of 2 in current phase (02-02 complete)
-Status: Phase 2 complete
-Last activity: 2026-03-13 -- Completed 02-02 backup scheduler with per-device goroutines and main.go wiring
+Phase: 3 of 10 (Snapshot Ingestion) -- COMPLETE
+Plan: 1 of 1 in current phase (03-01 complete)
+Status: Phase 3 complete
+Last activity: 2026-03-13 -- Completed 03-01 config snapshot subscriber with dedup, Transit encryption, and NATS ingestion
 
 Progress: [██████████] 100%
 
 ## Performance Metrics
 
 **Velocity:**
-- Total plans completed: 3
+- Total plans completed: 4
 - Average duration: 4min
-- Total execution time: 0.20 hours
+- Total execution time: 0.27 hours
 
 **By Phase:**
 
@@ -45,10 +45,11 @@ Progress: [██████████] 100%
 |-------|-------|-------|----------|
 | 01-database-schema | 1 | 3min | 3min |
 | 02-poller-config-collection | 2 | 9min | 4.5min |
+| 03-snapshot-ingestion | 1 | 4min | 4min |
 
 **Recent Trend:**
-- Last 5 plans: none
-- Trend: N/A
+- Last 5 plans: 3min, 4min, 5min, 4min
+- Trend: stable
 
 *Updated after each plan completion*
 
@@ -68,6 +69,8 @@ Recent decisions affecting current work:
 - [02-02] BackupScheduler runs independently from status poll scheduler with separate goroutines
 - [02-02] Buffered channel semaphore for concurrency control (Go idiom, no external deps)
 - [02-02] Devices with no Redis status key assumed potentially online for first backup
+- [Phase 03]: Trust poller-provided SHA256 hash (no recompute on backend)
+- [Phase 03]: Transit failure causes nak (NATS retry), plaintext never stored as fallback
 
 ### Pending Todos
 
@@ -79,6 +82,6 @@ None yet.
 
 ## Session Continuity
 
-Last session: 2026-03-13T01:55:37Z
-Stopped at: Completed 02-02-PLAN.md (Phase 2 complete)
-Resume file: Next phase (03)
+Last session: 2026-03-13T02:48:59.034Z
+Stopped at: Completed 03-01-PLAN.md
+Resume file: None
108
.planning/phases/03-snapshot-ingestion/03-01-SUMMARY.md
Normal file
@@ -0,0 +1,108 @@
---
phase: 03-snapshot-ingestion
plan: 01
subsystem: api
tags: [nats, jetstream, openbao, transit, encryption, postgresql, prometheus, dedup]

# Dependency graph
requires:
  - phase: 01-database-schema
    provides: RouterConfigSnapshot model and router_config_snapshots table
  - phase: 02-poller-config-collection
    provides: Go poller publishes config.snapshot.> NATS messages
provides:
  - NATS subscriber consuming config.snapshot.> messages
  - SHA256 dedup preventing duplicate snapshot storage
  - OpenBao Transit encryption of config text before INSERT
  - Prometheus metrics for ingestion monitoring
affects: [04-diff-engine, snapshot-api, config-timeline]

# Tech tracking
tech-stack:
  added: [prometheus_client]
  patterns: [nats-subscriber-with-dedup, transit-encrypt-before-insert]

key-files:
  created:
    - backend/app/services/config_snapshot_subscriber.py
    - backend/tests/test_config_snapshot_subscriber.py
  modified:
    - backend/app/main.py

key-decisions:
  - "Trust poller-provided SHA256 hash (no recompute on backend)"
  - "Raw SQL for dedup SELECT and INSERT (consistent with nats_subscriber.py pattern)"
  - "OpenBao Transit service instantiated per-message with close() for connection hygiene"

patterns-established:
  - "Config snapshot ingestion: dedup by SHA256 -> encrypt -> INSERT -> ack"
  - "Transit failure causes nak (NATS retry), plaintext never stored as fallback"

requirements-completed: [STOR-02]

# Metrics
duration: 4min
completed: 2026-03-13
---

# Phase 3 Plan 1: Config Snapshot Subscriber Summary

**NATS subscriber ingesting config snapshots with SHA256 dedup, OpenBao Transit encryption, and Prometheus metrics**

## Performance

- **Duration:** 4 min
- **Started:** 2026-03-13T02:44:01Z
- **Completed:** 2026-03-13T02:48:08Z
- **Tasks:** 2
- **Files modified:** 3

## Accomplishments
- NATS subscriber consuming config.snapshot.> on DEVICE_EVENTS stream with durable consumer
- SHA256 dedup: duplicate snapshots silently skipped at debug level with Prometheus counter
- OpenBao Transit encryption: plaintext never stored in PostgreSQL, Transit failure causes nak
- Malformed and orphan device messages acked and discarded safely with warning logs
- 6 unit tests covering all handler paths (new, duplicate, encrypt fail, malformed, orphan, first)
- Wired into main.py lifespan with non-fatal startup pattern
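
The handler paths listed above (dedup by SHA256, encrypt, INSERT, ack, with nak on Transit failure and ack-and-discard for malformed or orphan messages) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `config_snapshot_subscriber.py`: `FakeMsg`, `InMemoryStore`, `handle_snapshot`, the JSON payload field names, and the injected `encrypt` callable are all hypothetical, chosen so the example runs without NATS, PostgreSQL, or OpenBao.

```python
import asyncio
import json
from dataclasses import dataclass, field


@dataclass
class FakeMsg:
    """Hypothetical stand-in for a JetStream message; records ack or nak."""
    data: bytes
    outcome: str = ""

    async def ack(self):
        self.outcome = "ack"

    async def nak(self):
        self.outcome = "nak"


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the devices and router_config_snapshots tables."""
    devices: set
    rows: list = field(default_factory=list)

    def latest_hash(self, device_id):
        # Most recent snapshot for this device is the last appended row.
        for row in reversed(self.rows):
            if row["device_id"] == device_id:
                return row["sha256_hash"]
        return None


async def handle_snapshot(msg, store, encrypt):
    """Process one snapshot message and return the path taken (assumed labels)."""
    try:
        p = json.loads(msg.data)
        device_id, tenant_id = p["device_id"], p["tenant_id"]
        config_text, sha256_hash = p["config_text"], p["sha256_hash"]
        collected_at = p["collected_at"]
    except (ValueError, KeyError, TypeError):
        await msg.ack()  # malformed: redelivery cannot fix it, so discard
        return "malformed"

    if device_id not in store.devices:
        await msg.ack()  # orphan device: discard
        return "orphan"

    # The poller-provided hash is trusted; it is not recomputed here.
    if store.latest_hash(device_id) == sha256_hash:
        await msg.ack()  # duplicate of the latest snapshot: no new row
        return "duplicate"

    try:
        ciphertext = await encrypt(config_text)
    except Exception:
        await msg.nak()  # ask NATS to redeliver; plaintext is never stored
        return "encrypt_failed"

    store.rows.append({
        "device_id": device_id,
        "tenant_id": tenant_id,
        "config_text": ciphertext,
        "sha256_hash": sha256_hash,
        "collected_at": collected_at,
    })
    await msg.ack()
    return "stored"
```

Returning a label for each path is only a convenience for testing the sketch; a real handler would more likely increment a metrics counter at those points.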

## Task Commits

Each task was committed atomically:

1. **Task 1 (RED): Failing tests** - `9d82741` (test)
2. **Task 1 (GREEN): Config snapshot subscriber** - `3ab9f27` (feat)
3. **Task 2: Wire into main.py lifespan** - `0db0641` (feat)

_TDD task had RED + GREEN commits_

## Files Created/Modified
- `backend/app/services/config_snapshot_subscriber.py` - NATS subscriber with dedup, encryption, metrics
- `backend/tests/test_config_snapshot_subscriber.py` - 6 unit tests for all handler paths
- `backend/app/main.py` - Lifespan wiring for start/stop
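
The `main.py` wiring itself is not part of this commit view. One plausible shape for a "non-fatal startup" lifespan, sketched here with only the standard library, is a context manager that logs a subscriber startup failure and lets the application keep running. The names `lifespan`, `start_subscriber`, `stop_subscriber`, and the `state` dict are assumptions for illustration; a web framework's lifespan hook would normally receive the application object instead.

```python
import asyncio
import logging
from contextlib import asynccontextmanager

logger = logging.getLogger("lifespan_sketch")


@asynccontextmanager
async def lifespan(state, start_subscriber, stop_subscriber):
    """Start the subscriber if possible; never block app startup on it."""
    handle = None
    try:
        handle = await start_subscriber()
        state["subscriber_running"] = True
    except Exception as exc:
        # Non-fatal: log a warning and continue without the subscriber.
        logger.warning("config snapshot subscriber failed to start: %s", exc)
        state["subscriber_running"] = False
    try:
        yield state
    finally:
        # Stop only what was actually started.
        if handle is not None:
            await stop_subscriber(handle)
            state["subscriber_running"] = False
```

The trade-off in this shape is that an unreachable broker degrades ingestion rather than taking the whole API down, so it is worth pairing with a metric or health flag such as the `subscriber_running` value used here.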

## Decisions Made
- Trust poller-provided SHA256 hash (no recompute on backend) -- per project decision
- Raw SQL for dedup SELECT and INSERT -- consistent with existing nats_subscriber.py pattern
- OpenBao Transit service instantiated per-message with close() -- connection hygiene
- config_text never appears in any log statement -- contains passwords and keys
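
As an illustration of the raw-SQL dedup decision above, the sketch below compares the incoming hash with the device's most recent stored hash and inserts only when they differ. It uses an in-memory SQLite database as a stand-in for PostgreSQL so it can run anywhere; the column list follows the roadmap's success criteria, while the exact schema, the query text, and the `store_if_changed` helper are assumptions rather than the project's actual code.

```python
import sqlite3

# Simplified stand-in schema; the real table lives in PostgreSQL.
SCHEMA = """
CREATE TABLE router_config_snapshots (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    device_id TEXT NOT NULL,
    tenant_id TEXT NOT NULL,
    config_text TEXT NOT NULL,   -- ciphertext only, never plaintext
    sha256_hash TEXT NOT NULL,
    collected_at TEXT NOT NULL
)
"""

LATEST_HASH_SQL = """
SELECT sha256_hash FROM router_config_snapshots
WHERE device_id = ?
ORDER BY collected_at DESC, id DESC
LIMIT 1
"""

INSERT_SQL = """
INSERT INTO router_config_snapshots
    (device_id, tenant_id, config_text, sha256_hash, collected_at)
VALUES (?, ?, ?, ?, ?)
"""


def store_if_changed(conn, device_id, tenant_id, ciphertext, sha256_hash, collected_at):
    """Insert a row unless the device's most recent snapshot has the same hash."""
    row = conn.execute(LATEST_HASH_SQL, (device_id,)).fetchone()
    if row is not None and row[0] == sha256_hash:
        return False  # duplicate of the latest snapshot: skip
    conn.execute(
        INSERT_SQL,
        (device_id, tenant_id, ciphertext, sha256_hash, collected_at),
    )
    conn.commit()
    return True
```

Because only the most recent hash is checked, a config that changes and later reverts is stored again, which matches the "same hash as previous" wording of STOR-02.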

## Deviations from Plan

None - plan executed exactly as written.

## Issues Encountered

None.

## User Setup Required

None - no external service configuration required.

## Next Phase Readiness
- Config snapshot subscriber ready to receive messages from Go poller
- RouterConfigSnapshot rows will be available for diff engine (Phase 4)
- Prometheus metrics exposed for monitoring ingestion rate and errors

---
*Phase: 03-snapshot-ingestion*
*Completed: 2026-03-13*