docs(09-01): complete retention cleanup plan

- Create 09-01-SUMMARY.md with execution results
- Update STATE.md with phase 9 position and decisions
- Update ROADMAP.md with phase 9 progress
- Mark STOR-03 and STOR-04 requirements complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jason Staack
2026-03-12 23:35:37 -05:00
parent 4d62bc9499
commit 50211d1853
4 changed files with 123 additions and 22 deletions


@@ -18,8 +18,8 @@
 - [x] **STOR-01**: API stores config snapshots in `router_config_snapshots` table with SHA256 hash
 - [x] **STOR-02**: Duplicate snapshots (same hash as previous) are skipped, no diff generated
-- [ ] **STOR-03**: Snapshots retained for 90 days (configurable via `CONFIG_RETENTION_DAYS`)
-- [ ] **STOR-04**: Older snapshots automatically deleted by retention cleanup
+- [x] **STOR-03**: Snapshots retained for 90 days (configurable via `CONFIG_RETENTION_DAYS`)
+- [x] **STOR-04**: Older snapshots automatically deleted by retention cleanup
 - [x] **STOR-05**: Snapshots encrypted at rest, accessible only through RBAC
 ### Diff & Parsing
@@ -76,8 +76,8 @@
 | COLL-06 | Phase 2: Poller Config Collection | Complete |
 | STOR-01 | Phase 1: Database Schema | Complete |
 | STOR-02 | Phase 3: Snapshot Ingestion | Complete |
-| STOR-03 | Phase 9: Retention & Cleanup | Pending |
-| STOR-04 | Phase 9: Retention & Cleanup | Pending |
+| STOR-03 | Phase 9: Retention & Cleanup | Complete |
+| STOR-04 | Phase 9: Retention & Cleanup | Complete |
 | STOR-05 | Phase 1: Database Schema | Complete |
 | DIFF-01 | Phase 5: Diff Engine | Complete |
 | DIFF-02 | Phase 5: Diff Engine | Complete |


@@ -20,7 +20,7 @@ Decimal phases appear between their surrounding integers in numeric order.
 - [x] **Phase 6: History API** - REST endpoints for timeline, snapshot view, and diff retrieval with RBAC (completed 2026-03-13)
 - [x] **Phase 7: Config History UI** - Timeline section on device page with change summaries (completed 2026-03-13)
 - [ ] **Phase 8: Diff Viewer & Download** - Unified diff display with syntax highlighting and .rsc download
-- [ ] **Phase 9: Retention & Cleanup** - 90-day retention policy with automatic snapshot deletion
+- [x] **Phase 9: Retention & Cleanup** - 90-day retention policy with automatic snapshot deletion (completed 2026-03-13)
 - [ ] **Phase 10: Audit & Observability** - Audit event logging for all config backup operations
 ## Phase Details
@@ -147,10 +147,10 @@ Plans:
 1. Snapshots older than 90 days (default) are automatically deleted along with their associated diffs and changes
 2. Retention period is configurable via `CONFIG_RETENTION_DAYS` environment variable
 3. Cleanup runs on a scheduled interval without blocking normal operations
-**Plans**: TBD
+**Plans**: 1 plan
 Plans:
-- [ ] 09-01: Retention cleanup scheduler and cascading deletion
+- [ ] 09-01-PLAN.md — Retention cleanup service with APScheduler, configurable retention period, and cascading deletion
 ### Phase 10: Audit & Observability
 **Goal**: All config backup operations are logged as audit events for compliance and troubleshooting
@@ -182,5 +182,5 @@ Note: Phase 9 depends only on Phase 3 and Phase 10 depends on Phases 3/4/5, so P
 | 6. History API | 2/2 | Complete | 2026-03-13 |
 | 7. Config History UI | 1/1 | Complete | 2026-03-13 |
 | 8. Diff Viewer & Download | 1/2 | In Progress| |
-| 9. Retention & Cleanup | 0/1 | Not started | - |
+| 9. Retention & Cleanup | 1/1 | Complete | 2026-03-13 |
 | 10. Audit & Observability | 0/1 | Not started | - |


@@ -3,15 +3,15 @@ gsd_state_version: 1.0
 milestone: v9.6
 milestone_name: milestone
 status: completed
-stopped_at: Completed 08-02-PLAN.md
-last_updated: "2026-03-13T04:24:44.396Z"
-last_activity: 2026-03-13 -- Completed 08-02 snapshot download
+stopped_at: Completed 09-01-PLAN.md
+last_updated: "2026-03-13T04:34:12Z"
+last_activity: 2026-03-13 -- Completed 09-01 retention cleanup
 progress:
 total_phases: 10
-completed_phases: 8
-total_plans: 12
-completed_plans: 12
-percent: 92
+completed_phases: 9
+total_plans: 13
+completed_plans: 13
+percent: 100
 ---
 # Project State
@@ -21,14 +21,14 @@ progress:
 See: .planning/PROJECT.md (updated 2026-03-12)
 **Core value:** Operators can see exactly what changed on a router and when, with reliable config snapshots for download
-**Current focus:** Phase 8: Diff Viewer & Download
+**Current focus:** Phase 9: Retention & Cleanup -- COMPLETE
 ## Current Position
-Phase: 8 of 10 (Diff Viewer & Download) -- COMPLETE
-Plan: 2 of 2 in current phase
-Status: Phase 08 complete
-Last activity: 2026-03-13 -- Completed 08-02 snapshot download
+Phase: 9 of 10 (Retention & Cleanup) -- COMPLETE
+Plan: 1 of 1 in current phase
+Status: Phase 09 complete
+Last activity: 2026-03-13 -- Completed 09-01 retention cleanup
 Progress: [██████████] 100%
@@ -60,6 +60,7 @@ Progress: [██████████] 100%
 | Phase 07 P01 | 3min | 2 tasks | 3 files |
 | Phase 08 P01 | 1min | 2 tasks | 3 files |
 | Phase 08 P02 | 1min | 1 tasks | 3 files |
+| Phase 09 P01 | 2min | 2 tasks | 4 files |
 ## Accumulated Context
@@ -94,6 +95,8 @@ Recent decisions affecting current work:
 - [Phase 08]: DiffViewer rendered inline above timeline (not modal) for context preservation
 - [Phase 08]: Line classification function for unified diff: +green, -red, @@blue, ---/+++ muted
 - [Phase 08]: Blob URL download pattern consistent with existing exportMyData and auditLogsApi.exportCsv patterns
+- [Phase 09]: make_interval(days => :days) for parameterized PostgreSQL interval in retention cleanup
+- [Phase 09]: 24h IntervalTrigger with 1h jitter for stagger; AdminAsyncSessionLocal for cross-tenant cleanup
 ### Pending Todos
@@ -105,6 +108,6 @@ None yet.
 ## Session Continuity
-Last session: 2026-03-13T04:24:44.393Z
-Stopped at: Completed 08-02-PLAN.md
+Last session: 2026-03-13T04:34:12Z
+Stopped at: Completed 09-01-PLAN.md
 Resume file: None


@@ -0,0 +1,98 @@
---
phase: 09-retention-cleanup
plan: 01
subsystem: database
tags: [apscheduler, retention, postgresql, prometheus, cascade-delete]
# Dependency graph
requires:
- phase: 01-database-schema
provides: router_config_snapshots table with CASCADE FK constraints
provides:
- Automatic retention cleanup of expired config snapshots
- CONFIG_RETENTION_DAYS env var for configurable retention period
- Prometheus metrics for cleanup observability
affects: []
# Tech tracking
tech-stack:
added: []
patterns: [APScheduler IntervalTrigger for periodic maintenance jobs]
key-files:
created:
- backend/app/services/retention_service.py
- backend/tests/test_retention_service.py
modified:
- backend/app/config.py
- backend/app/main.py
key-decisions:
- "make_interval(days => :days) for parameterized PostgreSQL interval (no string concatenation)"
- "24h IntervalTrigger with 1h jitter to stagger cleanup across instances"
- "AdminAsyncSessionLocal (bypasses RLS) since retention is cross-tenant system operation"
patterns-established:
- "IntervalTrigger pattern for periodic maintenance jobs (vs CronTrigger for scheduled backups)"
requirements-completed: [STOR-03, STOR-04]
# Metrics
duration: 2min
completed: 2026-03-13
---
# Phase 9 Plan 1: Retention Cleanup Summary
**Daily APScheduler job deletes config snapshots older than CONFIG_RETENTION_DAYS (default 90) with CASCADE FK cleanup of diffs and changes**
## Performance
- **Duration:** 2 min
- **Started:** 2026-03-13T04:31:48Z
- **Completed:** 2026-03-13T04:34:12Z
- **Tasks:** 2
- **Files modified:** 4
## Accomplishments
- Retention service with parameterized SQL DELETE using make_interval for safe interval binding
- APScheduler IntervalTrigger running every 24h with 1h jitter for stagger
- Prometheus counter and histogram for cleanup observability
- Wired into main.py lifespan with non-fatal startup pattern
## Task Commits
Each task was committed atomically:
1. **Task 1 (RED): Add failing tests** - `00bdde9` (test)
2. **Task 1 (GREEN): Implement retention service + config setting** - `a9f7a45` (feat)
3. **Task 2: Wire retention scheduler into lifespan** - `4d62bc9` (feat)
## Files Created/Modified
- `backend/app/services/retention_service.py` - Retention cleanup logic, scheduler, Prometheus metrics
- `backend/tests/test_retention_service.py` - 4 unit tests for cleanup function
- `backend/app/config.py` - Added CONFIG_RETENTION_DAYS setting (default 90)
- `backend/app/main.py` - Wired start/stop retention scheduler into lifespan
## Decisions Made
- Used make_interval(days => :days) for parameterized PostgreSQL interval (avoids string concatenation SQL injection risk)
- 24h IntervalTrigger with 1h jitter to stagger cleanup across instances
- AdminAsyncSessionLocal bypasses RLS since retention is a cross-tenant system operation
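The first decision can be illustrated with a short sketch. It assumes the snapshot timestamp column is named `created_at`, which the summary does not state. The helper `build_retention_delete` is hypothetical. The point shown is that the day count travels as a bound parameter and never appears in the SQL string itself.

```python
from typing import Dict, Tuple

# Assumption: the timestamp column is `created_at`; the real column may differ.
RETENTION_DELETE_SQL = (
    "DELETE FROM router_config_snapshots "
    "WHERE created_at < NOW() - make_interval(days => :days)"
)


def build_retention_delete(retention_days: int) -> Tuple[str, Dict[str, int]]:
    """Return the DELETE statement and its bind parameters.

    The statement text is constant; only the parameter dict varies.
    """
    if isinstance(retention_days, bool) or not isinstance(retention_days, int):
        raise TypeError("retention_days must be an int")
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return RETENTION_DELETE_SQL, {"days": retention_days}
```

With SQLAlchemy, the pair would typically be passed as `session.execute(text(sql), params)`. Rows in the diff and change tables are then removed by the CASCADE foreign keys rather than by separate DELETE statements.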
## Deviations from Plan
None - plan executed exactly as written.
## Issues Encountered
None
## User Setup Required
None - no external service configuration required. CONFIG_RETENTION_DAYS defaults to 90 if not set.
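For illustration, the default behaviour could look like the sketch below. It reads the variable directly from the environment; the actual `config.py` may use a settings class instead, and `load_retention_days` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

DEFAULT_RETENTION_DAYS = 90


def load_retention_days(env: Optional[Mapping[str, str]] = None) -> int:
    """Read CONFIG_RETENTION_DAYS, falling back to 90 when unset or empty."""
    source = os.environ if env is None else env
    raw = source.get("CONFIG_RETENTION_DAYS")
    if raw is None or raw.strip() == "":
        return DEFAULT_RETENTION_DAYS
    days = int(raw)  # raises ValueError on non-numeric input
    if days < 1:
        raise ValueError("CONFIG_RETENTION_DAYS must be at least 1")
    return days
```

Rejecting values below 1 is a design choice of this sketch, not something the summary specifies. It guards against a misconfiguration that would delete every snapshot on the next run.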
## Next Phase Readiness
- Retention cleanup is fully operational, ready for phase 10
- No blockers
---
*Phase: 09-retention-cleanup*
*Completed: 2026-03-13*