Files
the-other-dude/.planning/phases/02-poller-config-collection/02-01-SUMMARY.md
Jason Staack 7ff3178b84 docs(02-01): complete config backup primitives plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:50:27 -05:00

5.8 KiB

phase, plan, subsystem, tags, requires, provides, affects, tech-stack, key-files, key-decisions, patterns-established, requirements-completed, duration, completed
phase plan subsystem tags requires provides affects tech-stack key-files key-decisions patterns-established requirements-completed duration completed
02-poller-config-collection 01 poller
ssh
tofu
routeros
config-normalization
sha256
nats
prometheus
alembic
phase provides
01-database-schema router_config_snapshots table for storing backup data
SSH command executor with TOFU host key verification and typed error classification
Config normalizer with deterministic SHA256 hashing
ConfigSnapshotEvent NATS event type and PublishConfigSnapshot method
Config backup environment variables (interval, concurrency, timeout)
Device model SSH fields (port, host key fingerprint) with UpdateSSHHostKey method
Alembic migration 028 for devices table SSH columns
Prometheus metrics for config backup observability
02-02-backup-scheduler
03-backend-subscriber
added patterns
TOFU host key verification via SHA256 fingerprint comparison
Config normalization pipeline: line endings, timestamp strip, whitespace trim, blank collapse
SSH error classification into typed SSHErrorKind enum
created modified
poller/internal/device/ssh_executor.go
poller/internal/device/ssh_executor_test.go
poller/internal/device/normalize.go
poller/internal/device/normalize_test.go
backend/alembic/versions/028_device_ssh_host_key.py
poller/internal/config/config.go
poller/internal/bus/publisher.go
poller/internal/store/devices.go
poller/internal/observability/metrics.go
TOFU fingerprint format matches ssh-keygen: SHA256:base64(sha256(pubkey))
NormalizationVersion=1 constant included in NATS payloads for future re-processing
UpdateSSHHostKey sets first_seen via COALESCE to preserve original observation time
SSH error classification: classifySSHError inspects error strings for auth/hostkey/timeout/refused patterns
Config normalization: version-tracked deterministic pipeline for RouterOS export output
COLL-01
COLL-02
COLL-06
5min 2026-03-13

Phase 02 Plan 01: Config Backup Primitives Summary

SSH executor with TOFU host key verification, RouterOS config normalizer with SHA256 hashing, NATS snapshot event, and Alembic migration for device SSH columns

Performance

  • Duration: 5 min
  • Started: 2026-03-13T01:43:33Z
  • Completed: 2026-03-13T01:48:38Z
  • Tasks: 2
  • Files modified: 9

Accomplishments

  • SSH RunCommand executor with context-aware dialing, TOFU host key callback, and 6-kind typed error classification
  • Deterministic config normalizer: strips RouterOS timestamps, normalizes line endings, trims whitespace, collapses blanks, computes SHA256 hash
  • 22 unit tests covering error classification, TOFU flows (first connect/match/mismatch), normalization edge cases, idempotency
  • Config backup env vars, NATS ConfigSnapshotEvent, device model SSH extensions, migration 028, Prometheus metrics

Task Commits

Each task was committed atomically:

  1. Task 1: SSH executor, normalizer, and their tests - f1abb75 (feat)
  2. Task 2: Config env vars, NATS event type, device model extensions, Alembic migration, metrics - 4ae39d2 (feat)

Note: Task 1 used TDD -- tests written first (RED), implementation second (GREEN).

Files Created/Modified

  • poller/internal/device/ssh_executor.go - RunCommand SSH executor with TOFU host key verification and typed errors
  • poller/internal/device/ssh_executor_test.go - Unit tests for SSH error classification, TOFU callbacks, CommandResult
  • poller/internal/device/normalize.go - NormalizeConfig and HashConfig for RouterOS export output
  • poller/internal/device/normalize_test.go - Table-driven tests for normalization pipeline edge cases
  • poller/internal/config/config.go - Added ConfigBackupIntervalSeconds, ConfigBackupMaxConcurrent, ConfigBackupCommandTimeoutSeconds
  • poller/internal/bus/publisher.go - Added ConfigSnapshotEvent type, PublishConfigSnapshot method, config.snapshot.> stream subject
  • poller/internal/store/devices.go - Added SSHPort/SSHHostKeyFingerprint fields, UpdateSSHHostKey method, updated queries
  • poller/internal/observability/metrics.go - Added ConfigBackupTotal, ConfigBackupDuration, ConfigBackupActive metrics
  • backend/alembic/versions/028_device_ssh_host_key.py - Migration adding ssh_port, ssh_host_key_fingerprint, timestamp columns

Decisions Made

  • TOFU fingerprint format uses SHA256:base64(sha256(pubkey)) to match ssh-keygen output format
  • NormalizationVersion=1 constant is included in NATS payloads so consumers can detect algorithm changes
  • UpdateSSHHostKey uses COALESCE on ssh_host_key_first_seen to preserve original observation timestamp

Deviations from Plan

Auto-fixed Issues

1. [Rule 1 - Bug] Fixed test key generation approach

  • Found during: Task 1 (GREEN phase)
  • Issue: Embedded OpenSSH PEM test key had padding errors ("ssh: padding not as expected")
  • Fix: Switched to programmatic ed25519 key generation via crypto/ed25519.GenerateKey
  • Files modified: poller/internal/device/ssh_executor_test.go
  • Verification: All 22 tests pass
  • Committed in: f1abb75 (Task 1 commit)

Total deviations: 1 auto-fixed (1 bug) Impact on plan: Minimal -- test infrastructure fix only, no production code change.

Issues Encountered

None beyond the test key generation fix documented above.

User Setup Required

None - no external service configuration required.

Next Phase Readiness

  • All primitives ready for Plan 02 (backup scheduler) to wire together
  • SSH executor, normalizer, NATS event, device model, config, and metrics are independently tested and compilable
  • Migration 028 ready to apply before deploying the backup scheduler

Phase: 02-poller-config-collection Completed: 2026-03-13