the-other-dude/.planning/phases/02-poller-config-collection/02-02-SUMMARY.md
Jason Staack d456fe58e9 docs(02-02): complete backup scheduler plan
- SUMMARY.md with execution metrics and decisions
- STATE.md updated: Phase 2 complete, 3 plans done
- ROADMAP.md updated: Phase 2 marked complete
- REQUIREMENTS.md: COLL-03, COLL-05 marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:57:47 -05:00

phase: 02-poller-config-collection
plan: 02
subsystem: poller
tags: ssh, backup, scheduler, nats, routeros, concurrency, tofu, redis

requires:
  • phase: 02-poller-config-collection/01
    provides: SSH executor, config normalizer, NATS ConfigSnapshotEvent, Prometheus metrics, config fields

provides:
  • BackupScheduler with per-device goroutines managing periodic SSH config collection
  • Concurrency-limited config backup pipeline (SSH -> normalize -> hash -> NATS publish)
  • TOFU host key verification with persistent fingerprint storage
  • Auth/hostkey error blocking with transient error exponential backoff
  • SSHHostKeyUpdater consumer-side interface

affects: 03-backend-snapshot-consumer, api, poller

tech-stack:
  • added: none listed
  • patterns: per-device goroutine lifecycle, buffered channel semaphore, Redis online gating

key-files (created/modified):
  • poller/internal/poller/backup_scheduler.go
  • poller/internal/poller/backup_scheduler_test.go
  • poller/internal/poller/interfaces.go
  • poller/cmd/poller/main.go

key-decisions:
  • BackupScheduler runs independently from status poll scheduler with separate goroutines
  • Semaphore uses buffered channel pattern matching existing codebase style
  • Device with no Redis status key assumed potentially online (first poll not yet completed)

patterns-established:
  • Backup goroutine pattern: jitter -> initial backup -> ticker loop with gating checks
  • Error classification: auth/hostkey block retries, transient errors use exponential backoff

requirements-completed: COLL-01, COLL-03, COLL-05, COLL-06
duration: 4min
completed: 2026-03-13

Phase 2 Plan 2: Backup Scheduler Summary

BackupScheduler orchestrating periodic SSH config collection with per-device goroutines, concurrency semaphore, TOFU verification, and NATS publishing

Performance

  • Duration: 4 min
  • Started: 2026-03-13T01:51:27Z
  • Completed: 2026-03-13T01:55:37Z
  • Tasks: 2
  • Files modified: 4

Accomplishments

  • BackupScheduler manages per-device backup goroutines with 30-300s initial jitter
  • Concurrency limited by configurable buffered channel semaphore (default 10)
  • Auth failures and host key mismatches permanently block retries with clear log warnings
  • Transient errors use stepped backoff (5m/15m/1h cap)
  • Full pipeline wired into main.go running parallel to existing status poll scheduler

Task Commits

Each task was committed atomically:

  1. Task 1: BackupScheduler with per-device goroutines - a884b09 (test) + 2653a32 (feat) -- TDD red/green
  2. Task 2: Wire BackupScheduler into main.go - d34817a (feat)

Files Created/Modified

  • poller/internal/poller/backup_scheduler.go - BackupScheduler with per-device goroutines, concurrency control, SSH collection, NATS publishing
  • poller/internal/poller/backup_scheduler_test.go - Unit tests for jitter, backoff, retry blocking, online gating, semaphore, reconciliation
  • poller/internal/poller/interfaces.go - Added SSHHostKeyUpdater consumer-side interface
  • poller/cmd/poller/main.go - BackupScheduler initialization and goroutine startup

Decisions Made

  • BackupScheduler runs independently from status poll scheduler -- separate goroutine pool, no shared state
  • Semaphore uses buffered channel pattern (consistent with Go idioms, no external deps)
  • Devices with no Redis status key assumed potentially online to avoid blocking first backup
  • A nil check on the locker allows tests to run without Redis lock infrastructure

Deviations from Plan

None - plan executed exactly as written.

Issues Encountered

None

User Setup Required

None - no external service configuration required.

Next Phase Readiness

  • Config backup pipeline complete: SSH -> normalize -> hash -> NATS publish
  • Backend snapshot consumer (Phase 3) can subscribe to config.snapshot.create.> to receive snapshots
  • Pre-existing integration test failures in poller package (missing certificate_authorities table) are unrelated to this work
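For readers less familiar with NATS subjects, the trailing `>` in config.snapshot.create.> matches one or more remaining tokens, while `*` matches exactly one. The sketch below re-implements that matching rule in plain Go purely for illustration; `subjectMatches` is a hypothetical helper, not part of the NATS client library, and the example subjects (including the `device-42` token) are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// subjectMatches reports whether subject matches a NATS-style pattern.
// "*" matches exactly one token; ">" must be the last pattern token and
// matches one or more remaining tokens.
func subjectMatches(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// Valid only as the final token, and needs at least one token left.
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) {
			return false
		}
		if tok != "*" && tok != s[i] {
			return false
		}
	}
	return len(p) == len(s)
}

func main() {
	pattern := "config.snapshot.create.>"
	for _, subject := range []string{
		"config.snapshot.create.device-42", // hypothetical subject
		"config.snapshot.create",
		"config.snapshot.delete.device-42",
	} {
		fmt.Println(subject, subjectMatches(pattern, subject))
	}
}
```

In practice the Phase 3 consumer would pass the wildcard subject to the NATS client's subscribe call and let the server do this matching.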

Phase: 02-poller-config-collection
Completed: 2026-03-13