- Bind tunnel listeners to 0.0.0.0 instead of 127.0.0.1 so tunnels
are reachable through reverse proxies and container networks
- Reduce port range to 49000-49004 (5 concurrent tunnels)
- Derive WinBox URI host from request Host header instead of
hardcoding 127.0.0.1, enabling use behind reverse proxies
- Add README security warning about default encryption keys
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Covers WinBox tunnels, SSH terminal, NATS request-reply architecture,
session management, security notes, and updated port tables.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Poller publishes session end events via JetStream when SSH sessions
close (normal disconnect or idle timeout). Backend subscribes with a
durable consumer and writes ssh_session_end audit log entries with
duration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Gap 1: Add tenant ID verification after device lookup in SSH relay handleSSH,
  closing a cross-tenant token reuse vulnerability
- Gap 2: Add X-Forwarded-For fallback (last entry) when X-Real-IP is absent in
SSH relay source IP extraction; import strings package
- Gap 3: Add @limiter.limit("10/minute") to POST /winbox-session and POST
/ssh-session using existing slowapi pattern from app.middleware.rate_limit
- Gap 4: Add TODO comment in open_ssh_session explaining that SSH session count
enforcement is at the poller level; no NATS subject exists yet for API-side
pre-check
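
The source-IP fallback from Gap 2 can be sketched as below. This is an illustrative Python rendering of the logic only; the actual relay is Go, and the function name `client_ip` and the plain-dict header lookup are assumptions made for the example.

```python
from typing import Mapping


def client_ip(headers: Mapping[str, str], peer_addr: str) -> str:
    """Prefer X-Real-IP, then the last X-Forwarded-For entry, then the socket peer."""
    real_ip = headers.get("X-Real-IP", "").strip()
    if real_ip:
        return real_ip

    # X-Forwarded-For is a comma-separated chain; the last entry is the one
    # appended by the proxy closest to this service.
    forwarded = headers.get("X-Forwarded-For", "")
    entries = [part.strip() for part in forwarded.split(",") if part.strip()]
    if entries:
        return entries[-1]

    # No proxy headers at all: fall back to the TCP peer address.
    return peer_addr
```

Taking the last entry rather than the first means a client cannot spoof its address by sending its own X-Forwarded-For header, provided the nearest proxy appends to the chain.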
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add WebSocket upgrade map to nginx and proxy /ws/ssh to poller:8080
- Update CSP connect-src to allow ws: and wss: for terminal connections
- Add tunnel port range 49000-49100, SSH relay env vars, ulimits, and healthcheck to poller in both override and prod compose files
- Increase poller memory limit to 512M in prod for tunnel/SSH overhead
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements four operator-gated endpoints under /api/tenants/{tenant_id}/devices/{device_id}/:
- POST /winbox-session: opens a WinBox tunnel via NATS request-reply to poller
- POST /ssh-session: mints a single-use Redis token (120s TTL) for WebSocket SSH relay
- DELETE /winbox-session/{tunnel_id}: idempotently closes a WinBox tunnel
- GET /sessions: lists active WinBox tunnels via NATS tunnel.status.list
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add TunnelManager, TunnelResponder, SSH relay server, and SSH relay HTTP
server to the poller startup sequence with env-configurable port ranges,
idle timeouts, and session limits. Extends graceful shutdown to cover the
HTTP server (5s context), tunnel manager, and SSH relay server via defer.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the SSH relay server (Task 2.1) that validates single-use
Redis tokens via GETDEL, dials SSH to the target device with PTY,
and bridges WebSocket binary/text frames to SSH stdin/stdout/stderr
with idle timeout and per-user/per-device session limits.
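
A minimal sketch of the single-use token semantics, assuming an in-memory dict as a stand-in for Redis. The class name `SingleUseTokens` and its methods are hypothetical; the point is only that reading and deleting happen in one atomic step, which is what GETDEL provides.

```python
import secrets
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class SingleUseTokens:
    """In-memory stand-in for a Redis key with a TTL that is read via GETDEL."""

    def __init__(self, ttl_seconds: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._items: Dict[str, Tuple[float, str]] = {}

    def mint(self, payload: str) -> str:
        """Store the payload under a fresh random token and return the token."""
        token = secrets.token_urlsafe(32)
        with self._lock:
            self._items[token] = (self._clock() + self._ttl, payload)
        return token

    def consume(self, token: str) -> Optional[str]:
        """Return the payload once; later calls and expired tokens yield None."""
        with self._lock:
            # pop() reads and deletes in one step under the lock.
            entry = self._items.pop(token, None)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            return None
        return payload
```

Because the lookup removes the entry, two concurrent WebSocket upgrades presenting the same token cannot both succeed.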
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements Manager which orchestrates WinBox tunnel lifecycle: open,
close, idle cleanup, and status queries. Uses PortPool and Tunnel from
Tasks 1.2/1.3. DeviceStore and CredentialCache wired in for Task 1.5.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements Tunnel type that listens on a local port, accepts WinBox client
connections, dials the remote RouterOS device, and proxies traffic
bidirectionally. Uses activityReader to atomically update LastActive on
each read for idle timeout detection. Per-connection contexts derive from
the tunnel context so Close() terminates all connections cleanly.
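
The activity-tracking idea can be illustrated with a small wrapper. This is a Python sketch under stated assumptions, not the Go activityReader itself: the class name, the `idle_for` helper, and the injectable clock are all hypothetical.

```python
import threading
import time
from typing import BinaryIO, Callable


class ActivityReader:
    """Wraps a readable stream and records when data was last read."""

    def __init__(self, raw: BinaryIO,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._raw = raw
        self._clock = clock
        self._lock = threading.Lock()
        self._last_active = clock()

    def read(self, n: int = -1) -> bytes:
        data = self._raw.read(n)
        if data:
            # Only reads that returned bytes count as activity.
            with self._lock:
                self._last_active = self._clock()
        return data

    def idle_for(self) -> float:
        """Seconds since the last read that returned data."""
        with self._lock:
            return self._clock() - self._last_active
```

An idle-cleanup loop can then compare `idle_for()` against the configured timeout without touching the proxied connection.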
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements PortPool with mutex-protected allocation, bind verification
to skip ports already in use by the OS, and release-for-reuse semantics.
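
A rough Python sketch of the allocation strategy described above (the real PortPool is Go). The `probe` parameter and the `can_bind` helper are assumptions added so the bind check can be swapped out in tests.

```python
import socket
import threading
from typing import Callable, Optional, Set


def can_bind(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if the OS currently lets us bind the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


class PortPool:
    """Hands out ports from an inclusive range, skipping ports the OS has in use."""

    def __init__(self, start: int, end: int,
                 probe: Callable[[int], bool] = can_bind) -> None:
        self._ports = range(start, end + 1)
        self._probe = probe
        self._lock = threading.Lock()
        self._in_use: Set[int] = set()

    def allocate(self) -> Optional[int]:
        """Return the lowest free port, or None when the pool is exhausted."""
        with self._lock:
            for port in self._ports:
                if port in self._in_use:
                    continue
                if not self._probe(port):
                    continue  # bound by some other process; try the next one
                self._in_use.add(port)
                return port
        return None

    def release(self, port: int) -> None:
        """Make a previously allocated port available again."""
        with self._lock:
            self._in_use.discard(port)
```

The probe only shows the port was free at that instant; the caller still has to handle a failed listen if another process grabs the port in between.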
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Six-chunk TDD implementation plan for WinBox TCP tunnels and SSH terminal relay through the Go poller. Covers tunnel manager, SSH relay, API endpoints, infrastructure, frontend, and documentation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs fixed:
1. audit_service.py: log_action() inserted into audit_logs using the
caller's DB session but never committed. Any router that called
db.commit() before log_action() (firmware, devices, config_editor,
alerts, certificates) had its audit rows silently rolled back when
the request session closed.
Fix: log_action now opens its own AdminAsyncSessionLocal and self-
commits, making audit persistence independent of the caller's
transaction. The 'db' parameter is kept for backward compatibility
but is unused. Affects the five routers listed above.
2. docker-compose.override.yml: /data/firmware-cache had no volume
mount so the directory didn't exist in the container, causing
firmware downloads to fail with Permission denied.
Fix: bind-mount docker-data/firmware-cache:/data/firmware-cache
so firmware images survive container restarts.
Without a resolver directive, nginx resolves upstream hostnames once at
startup and caches the IP forever. When the API container restarts it gets
a new Docker-assigned IP, causing 502 Bad Gateway until nginx is reloaded.
Fix:
- Add 'resolver 127.0.0.11 valid=10s' (Docker embedded DNS)
- Use a variable in proxy_pass (assigned with a 'set' directive) so nginx
  re-resolves the upstream at request time via the resolver above,
  caching the answer for at most 10s instead of forever
- Variable proxy_pass passes the full request URI as-is, so /api/...
correctly maps to http://api:8000/api/... without double-pathing
- poller/docker-entrypoint.sh: convert from CRLF+BOM to LF (UTF-8, no BOM).
  Windows saved the file with a UTF-8 BOM, which made the Linux kernel
  reject the shebang with 'exec format error', crashing the poller.
- infrastructure/openbao/init.sh: same CRLF -> LF fix
- poller/Dockerfile: add sed to strip CRLF and BOM at image build time
as a defensive measure for future Windows edits
- docker-compose.override.yml: add 'restart: on-failure' to api and poller
so they recover from the postgres startup race (TimescaleDB restarts
postgres after initdb, briefly causing connection refused on first boot)
- .gitattributes: enforce LF for all text/script/code files so git
normalises line endings on checkout and prevents this class of bug
Three bugs fixed:
1. Phase 30 (auth.ts): After SRP login the encrypted_key_set was returned
from the server but the vault key and RSA private key were never unwrapped
with the AUK. keyStore.getVaultKey() was always null, causing Tier 1
config-backup diffs to crash with a TypeError.
Fix: unwrap vault key and private key using crypto.subtle.unwrapKey after
successful SRP verification. Non-fatal: warns to console if decryption
fails so login always succeeds.
2. Token refresh (auth.py): The /refresh endpoint required refresh_token in
the request body, but the frontend never stored or sent it. After the 15-
minute access token TTL, all authenticated API calls would fail silently
because the interceptor sent an empty body and received 422 (not 401),
so the retry loop never fired.
Fix: login/srpVerify now set an httpOnly refresh_token cookie scoped to
/api/auth/refresh. The refresh endpoint now accepts the token from either
cookie (preferred) or body (legacy). Logout clears both cookies.
RefreshRequest.refresh_token is now Optional to allow empty-body calls.
3. Silent token rotation: the /refresh endpoint now also rotates the refresh
token cookie on each use (issues a fresh token), reducing the window for
stolen refresh token replay.
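
The cookie-first lookup from fix 2 amounts to the following, shown as a framework-free Python sketch. The function name `pick_refresh_token` and the dict-shaped inputs are assumptions; only the `refresh_token` name comes from the description above.

```python
from typing import Mapping, Optional


def pick_refresh_token(cookies: Mapping[str, str],
                       body: Optional[Mapping[str, Optional[str]]]) -> Optional[str]:
    """Prefer the httpOnly cookie; fall back to the legacy request body field."""
    from_cookie = cookies.get("refresh_token")
    if from_cookie:
        return from_cookie
    if body:
        return body.get("refresh_token") or None
    return None
```

Returning None for an empty request lets the endpoint answer with 401 rather than a validation error, which is the status the frontend retry logic reacts to.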
The Secret Key encoder used 26 base-30 characters which can only
represent 30^26 ≈ 2^127.58 values. Since the key is 128 bits,
~25% of generated keys silently lost their high bits during
formatting, making the Emergency Kit key unable to reconstruct
the original bytes on a new browser.
Changed KEY_CHAR_LENGTH from 26 to 27 (30^27 > 2^128). Parser
accepts both old 26-char and new 27-char keys for backward
compatibility. Format: A3-XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXX
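
The arithmetic can be checked directly. The sketch below is Python rather than the frontend code, and the 30-symbol `ALPHABET` is a hypothetical choice for illustration; only the key length and the A3 grouping come from the description above. Unlike the old encoder, this version raises instead of silently dropping high bits.

```python
ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWX"  # hypothetical 30-symbol alphabet
BASE = len(ALPHABET)
KEY_CHAR_LENGTH = 27

# 26 characters cannot cover every 128-bit value; 27 can.
assert BASE == 30
assert BASE ** 26 < 2 ** 128 < BASE ** 27


def encode(value: int, length: int = KEY_CHAR_LENGTH) -> str:
    """Encode a non-negative integer as a fixed-width base-30 string."""
    if not 0 <= value < BASE ** length:
        raise ValueError("value does not fit in the requested length")
    digits = []
    for _ in range(length):
        value, remainder = divmod(value, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode(text: str) -> int:
    """Inverse of encode; accepts any length, so 26-char keys still parse."""
    value = 0
    for char in text:
        value = value * BASE + ALPHABET.index(char)
    return value


def format_key(value: int) -> str:
    """Render as A3-XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXX."""
    body = encode(value)
    groups = [body[i:i + 6] for i in range(0, len(body), 6)]
    return "A3-" + "-".join(groups)
```

With 27 characters the largest 128-bit value round-trips through `encode` and `decode`, while asking for 26 characters raises a ValueError for it.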
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restore original 6-step quick start with comments. Increase arch flow
box contrast (bg-deep background, stronger border) and arrow size.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Hero: tighter 3-line intro focused on the problem
- What It Does: updated section label
- Safe Config: panic-revert language, fleet-wide templates
- Who This Is For: expanded audience descriptions
- Architecture: new section with vertical flow diagram
- Quick Start: simplified to 3 commands
- CTA: open source + self-hosted, closing tagline
- Slow the gradient fill animation from 1.2s to 2s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Gradient on left half of background, white on right. Animation sweeps
from white to gradient. Uses 'both' fill mode for correct state during
delay and after completion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The gradient sweeps left-to-right across "Centralized Management"
after a 0.3s delay, transitioning from plain text to the teal-burgundy
gradient over 1.2s.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each list gets a dynamically generated keyframe where only 1/N of the
cycle is active. Bullets are staggered 0.8s apart so they take turns
pulsing in sequence, looping forever.
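
The timing works out as in the sketch below. This is a Python illustration of the arithmetic, not the page script; it assumes the full cycle lasts N times the stagger, which is what makes each bullet's active window exactly 1/N of the loop. The function name and the returned dict keys are hypothetical.

```python
from typing import Dict, List, Union


def pulse_schedule(n_items: int,
                   stagger_ms: int = 800) -> Dict[str, Union[int, float, List[int]]]:
    """Compute loop length, active share, and per-bullet delays for N bullets."""
    if n_items < 1:
        raise ValueError("need at least one bullet")
    return {
        # Assumed: one full loop gives every bullet exactly one turn.
        "cycle_ms": n_items * stagger_ms,
        # Share of the keyframe timeline during which a bullet pulses.
        "active_percent": 100.0 / n_items,
        # Bullet i starts its animation i * stagger later than bullet 0.
        "delays_ms": [i * stagger_ms for i in range(n_items)],
    }
```

For a four-item list this gives a 3.2s loop in which each bullet is active for the first 25% of its own timeline.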
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Teal bullet dots pulse with a staggered throb when list items scroll
into view. Uses IntersectionObserver with 120ms stagger per item.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Section labels, titles, descriptions, and closing statements are now
centered. Bullet lists remain left-aligned within their centered
container for readability. Fixes visual disconnect between centered
hero and left-justified content sections.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace marketing-heavy hero, feature cards, and architecture diagram
with straightforward copy aimed at real MikroTik operators. New sections:
What It Does, Safe Configuration, Built for Real Operators, Designed for
Scale, and Open Source CTA.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace verbose ASCII architecture in README with clean linear flow.
Remove tech stack badge grid from landing page.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Banner on landing page, docs page, and GitHub README warning that the
software is in active development and not yet ready for production use.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
execute_cli was passing the full CLI string (e.g. '/ping address=8.8.8.8
count=4') as a single command to the Go poller. go-routeros expects the
command path and args separately. execute_cli now splits the string into the
command path and a list of '='-prefixed arguments.
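
A minimal sketch of the split, assuming whitespace-separated arguments and the '=' prefix that RouterOS API attribute words use. The function name `split_cli` is hypothetical, and quoted values containing spaces are not handled here.

```python
from typing import List, Tuple


def split_cli(cli: str) -> Tuple[str, List[str]]:
    """Split '/ping address=8.8.8.8 count=4' into a command path and API args."""
    parts = cli.split()
    if not parts:
        raise ValueError("empty command")
    command, raw_args = parts[0], parts[1:]
    # Each key=value pair becomes an attribute word of the form '=key=value'.
    return command, ["=" + arg for arg in raw_args]
```

For the example in this message, the result is the path `/ping` plus the two words `=address=8.8.8.8` and `=count=4`.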
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The target input showed "8.8.8.8" as placeholder text but the actual
value was empty. Clicking Ping/Traceroute silently returned because
the empty target guard fired. Users saw the placeholder and assumed
the tool was broken.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The server-generated PDF had a placeholder for the Secret Key that was
never filled in client-side, making the Emergency Kit useless. Users
who relied on it could not recover their Secret Key on new devices.
Now generates the PDF entirely client-side via browser print dialog,
with the real Secret Key embedded. No server round-trip, key never
leaves the browser.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a user logs in from a browser with an outdated Secret Key in
IndexedDB (e.g. after server rebuild/re-enrollment), the SRP handshake
fails with 401 but the Secret Key input field was never shown — leaving
the user stuck with no way to enter their current key.
Now detects stale-key 401s and prompts for manual Secret Key entry.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The FleetTable empty state navigated with ?add=true but the devices page
never read that param. Now it opens the AddDeviceForm when add=true is
in the search params.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>