15 Commits

Author SHA1 Message Date
Jason Staack
e1d81b40ac fix: cap NATS JetStream streams to prevent OOM crash
WIRELESS_REGISTRATIONS stream had a 256MB MaxBytes cap in a 256MB
container — guaranteed to crash under load. ALERT_EVENTS and
OPERATION_EVENTS had no byte limit at all.

- Reduce WIRELESS_REGISTRATIONS MaxBytes from 256MB to 128MB
- Add 16MB MaxBytes cap to ALERT_EVENTS and OPERATION_EVENTS
- Bump NATS container memory limit from 256MB to 384MB
- Add restart: unless-stopped to NATS in base compose
- Bump version to 9.8.2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 07:52:07 -05:00
Jason Staack
ffd2629ff0 feat(18-05): add SoftwareVersion field to DeviceStatusEvent
- Additive field with omitempty tag for SNMP device identification
- Existing RouterOS events produce identical JSON (field not set)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 19:32:49 -05:00
Jason Staack
4c667fd454 feat(18-04): implement DiscoveryResponder for SNMP device probes
- NATS request-reply on device.discover.snmp with queue group discover-workers
- Probes sysObjectID.0, sysDescr.0, sysName.0 via SNMP GET
- Credentials from request payload (not database), never logged
- 5-second timeout on both connect and GET operations
- Supports SNMP v1, v2c, and v3 with all auth/priv protocols
- Builds gosnmp client inline to avoid snmp->bus import cycle

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 19:28:28 -05:00
Jason Staack
727598cc78 test(18-04): add failing tests for DiscoveryResponder
- Subscribe/unsubscribe lifecycle
- Invalid JSON returns error response
- Missing ip_address returns descriptive error
- Response JSON field names match spec
- Invalid SNMP version rejected
- Default port 161 when zero

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 19:25:23 -05:00
Jason Staack
0697563a13 feat(18-01): add counter cache with Redis delta computation and SNMPMetricsEvent
- Implement computeCounterDelta for Counter32/Counter64 with wraparound handling
- Sanity threshold discards deltas > 90% of max value (device reset detection)
- CounterCache uses Redis MGET/MSET pipelining for efficient state persistence
- Counter keys use "snmp:counter:{device_id}:{oid}" format with 600s TTL
- Add SNMPMetricsEvent and SNMPMetricEntry structs to bus package
- Add PublishSNMPMetrics publishing to "device.metrics.snmp_custom.{device_id}"
- Full test coverage: 10 counter tests including miniredis integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 19:21:25 -05:00
Jason Staack
397a33abef feat(13-01): add DeviceInterfaceEvent publisher and wire into PollDevice
- DeviceInterfaceEvent type publishes to device.interfaces.{device_id}
- PublishDeviceInterfaces method follows existing publisher pattern
- DEVICE_EVENTS stream includes device.interfaces.> subject
- PollDevice collects interface info after traffic counters, before health
- Non-fatal errors with Prometheus metrics for publish success/failure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:05:55 -05:00
Jason Staack
caa33ca8d7 feat(12-01): add RF monitor collector, WIRELESS_REGISTRATIONS stream, wire into poll cycle
- RFMonitorStats struct for per-interface RF data (noise floor, channel width, TX power)
- CollectRFMonitor with v6/v7 RouterOS version routing
- WIRELESS_REGISTRATIONS NATS stream with 30-day retention (separate from DEVICE_EVENTS)
- WirelessRegistrationEvent type and PublishWirelessRegistrations method
- Poll cycle collects per-client registrations and RF stats, publishes combined event

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 05:38:14 -05:00
Jason Staack
05e5595c2b fix(poller): add 64MB cap on DEVICE_EVENTS NATS stream
Without a byte limit, the stream grows unbounded within its 24h
max_age window. At 101 devices polling every 60s, it hits 128MB
in ~10 hours and OOMs the NATS container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 05:52:09 -05:00
Jason Staack
970501e453 feat: implement Remote WinBox worker, API, frontend integration, OpenBao persistence, and supporting docs 2026-03-14 09:05:14 -05:00
Jason Staack
0851eced36 feat(04-01): implement BackupResponder with extracted CollectAndPublish
- Create BackupResponder for NATS request-reply on config.backup.trigger
- Extract public CollectAndPublish from BackupScheduler returning sha256 hash
- Define BackupExecutor/BackupLocker/DeviceGetter interfaces for testability
- Create RedisBackupLocker adapter wrapping redislock.Client
- Wire BackupResponder into main.go lifecycle
- All 6 tests pass with in-process NATS server

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:07:35 -05:00
Jason Staack
9e102fda20 test(04-01): add failing tests for BackupResponder
- Test subscribe registers subscription
- Test valid request returns success with sha256_hash
- Test lock held returns locked status
- Test invalid JSON returns error
- Test Stop unsubscribes cleanly
- Test device not found returns failed status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:04:44 -05:00
Jason Staack
4ae39d2cb3 feat(02-01): add config backup env vars, NATS event, device SSH fields, migration, metrics
- Config: CONFIG_BACKUP_INTERVAL (21600s), CONFIG_BACKUP_MAX_CONCURRENT (10), CONFIG_BACKUP_COMMAND_TIMEOUT (60s)
- NATS: ConfigSnapshotEvent type, PublishConfigSnapshot method, config.snapshot.> stream subject
- Device: SSHPort/SSHHostKeyFingerprint fields, UpdateSSHHostKey method, updated queries/scans
- Migration 028: ssh_port, ssh_host_key_fingerprint, timestamp columns with poller_user grants
- Metrics: ConfigBackupTotal (counter), ConfigBackupDuration (histogram), ConfigBackupActive (gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:48:12 -05:00
Jason Staack
acf1790bed feat: add audit.session.end NATS pipeline for SSH session tracking
Poller publishes session end events via JetStream when SSH sessions
close (normal disconnect or idle timeout). Backend subscribes with a
durable consumer and writes ssh_session_end audit log entries with
duration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 16:07:10 -05:00
Jason Staack
d3d3e36192 feat(poller): add NATS tunnel responder for WinBox tunnel management
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 15:30:34 -05:00
Jason Staack
b840047e19 feat: The Other Dude v9.0.1 — full-featured email system
ci: add GitHub Pages deployment workflow for docs site

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 19:30:44 -05:00