the-other-dude/docs/superpowers/specs/2026-03-14-setup-script-design.md
Jason Staack c7c9f4d71e docs: add setup script implementation plan
7-task plan covering database rename, login page fix, setup.py
wizard with OpenBao bootstrap, sequential builds, and health checks.
Also fixes spec OpenBao timeout to 60s.
2026-03-14 09:52:58 -05:00


TOD Production Setup Script — Design Spec

Overview

An interactive Python setup wizard (setup.py) that walks a sysadmin through configuring and deploying TOD (The Other Dude) for production. The script minimizes manual configuration by auto-generating secrets, capturing OpenBao credentials automatically, building images sequentially, and verifying service health.

Target audience: Technical sysadmins unfamiliar with this specific project.

Design Decisions

  • Python 3.10+ — already required by the stack, enables rich input handling and colored output.
  • Linear wizard with opinionated defaults — grouped sections, auto-generate everything possible, only prompt for genuine human decisions.
  • Integrated OpenBao bootstrap — script starts the OpenBao container, captures unseal key and root token, updates .env.prod automatically (no manual copy-paste).
  • Sequential image builds — builds api, poller, frontend, winbox-worker one at a time to avoid OOM on low-RAM machines.
  • Re-runnable — safe to run again; detects existing .env.prod and offers to overwrite, back up (.env.prod.backup.<ISO timestamp>), or abort.

Prerequisite: Database Rename

The codebase currently uses mikrotik as the database name. Before the setup script can use tod, these files must be updated:

  • docker-compose.yml — default POSTGRES_DB and healthcheck (pg_isready -d)
  • docker-compose.prod.yml — hardcoded poller DATABASE_URL (change to ${POLLER_DATABASE_URL})
  • docker-compose.staging.yml — if applicable
  • scripts/init-postgres.sql: GRANT CONNECT ON DATABASE statements
  • .env.example — all URL references

The setup script will use POSTGRES_DB=tod. These file changes are part of the implementation, not runtime.

Additionally, docker-compose.prod.yml hardcodes the poller's DATABASE_URL. This must be changed to DATABASE_URL: ${POLLER_DATABASE_URL} so the setup script's generated value is used.

Script Flow

Phase 1: Pre-flight Checks

  • Verify Python 3.10+
  • Verify Docker Engine and Docker Compose v2 are installed and the daemon is running
  • Check for existing .env.prod — if found, offer: overwrite / back up and create new / abort
  • Warn if less than 4 GB of RAM is available
  • Check if key ports are in use (5432, 6379, 4222, 8001, 3000, 51820) and warn
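The Python version and port checks above can be sketched with the standard library alone. This is a minimal sketch, not the final implementation: the function names are hypothetical, and the probe only detects TCP listeners on localhost. Port 51820 is normally a WireGuard UDP port, so a TCP probe would likely not notice a running WireGuard instance.

```python
import socket
import sys

# Ports listed in the pre-flight checks of this spec.
KEY_PORTS = (5432, 6379, 4222, 8001, 3000, 51820)


def python_ok(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.10 or newer."""
    return tuple(version_info[:2]) >= (3, 10)


def tcp_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising.
        return sock.connect_ex((host, port)) == 0


def ports_in_use(ports=KEY_PORTS) -> list[int]:
    """Collect busy ports so the wizard can warn without blocking."""
    return [port for port in ports if tcp_port_in_use(port)]
```

Because a busy port only produces a warning, the wizard can print the result of ports_in_use() and continue.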

Phase 2: Interactive Configuration (Linear Wizard)

Six sections, presented in order:

2.1 Database

| Prompt | Default | Notes |
| --- | --- | --- |
| PostgreSQL superuser password | (required, no default) | Validated non-empty, min 12 chars |

Auto-generated:

  • POSTGRES_DB=tod
  • app_user password via secrets.token_urlsafe(24) (yields 32 URL-safe base64 chars)
  • poller_user password via secrets.token_urlsafe(24) (yields 32 URL-safe base64 chars)
  • DATABASE_URL=postgresql+asyncpg://postgres:<pw>@postgres:5432/tod
  • SYNC_DATABASE_URL=postgresql+psycopg2://postgres:<pw>@postgres:5432/tod
  • APP_USER_DATABASE_URL=postgresql+asyncpg://app_user:<app_pw>@postgres:5432/tod
  • POLLER_DATABASE_URL=postgres://poller_user:<poller_pw>@postgres:5432/tod
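The derivation above could look roughly like the following. The function name and the optional password arguments are hypothetical. One assumption goes beyond the spec: the typed superuser password is percent-encoded before being placed in a URL, since characters such as @ or / would otherwise make the URL ambiguous. Whether every consumer decodes the password correctly would need to be verified.

```python
import secrets
from typing import Optional
from urllib.parse import quote

DB_TARGET = "postgres:5432/tod"


def build_database_settings(
    superuser_password: str,
    app_pw: Optional[str] = None,
    poller_pw: Optional[str] = None,
) -> dict[str, str]:
    """Derive the database entries of .env.prod from one typed password."""
    if len(superuser_password) < 12:
        raise ValueError("superuser password must be at least 12 characters")
    # 24 random bytes encode to 32 URL-safe characters, safe inside a URL.
    app_pw = app_pw or secrets.token_urlsafe(24)
    poller_pw = poller_pw or secrets.token_urlsafe(24)
    # Assumption: percent-encode the typed password for use in URLs only.
    su = quote(superuser_password, safe="")
    return {
        "POSTGRES_DB": "tod",
        "POSTGRES_PASSWORD": superuser_password,
        "DATABASE_URL": f"postgresql+asyncpg://postgres:{su}@{DB_TARGET}",
        "SYNC_DATABASE_URL": f"postgresql+psycopg2://postgres:{su}@{DB_TARGET}",
        "APP_USER_DATABASE_URL": f"postgresql+asyncpg://app_user:{app_pw}@{DB_TARGET}",
        "POLLER_DATABASE_URL": f"postgres://poller_user:{poller_pw}@{DB_TARGET}",
    }
```

POSTGRES_PASSWORD keeps the raw value because the PostgreSQL container reads it directly rather than through a URL.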

2.2 Security

No prompts. Auto-generated:

  • JWT_SECRET_KEY via secrets.token_urlsafe(64) (yields 86 URL-safe base64 chars)
  • CREDENTIAL_ENCRYPTION_KEY via base64(secrets.token_bytes(32)) (yields 44 base64 chars, including padding)
  • Display both values to the user with a "save these somewhere safe" note
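A minimal sketch of the two generators, assuming that "base64" here means the standard alphabet with padding (the spec does not say which variant the application expects). The function name is hypothetical.

```python
import base64
import secrets


def generate_security_keys() -> dict[str, str]:
    """Generate the two secrets that need no user input."""
    return {
        # 64 random bytes, URL-safe base64 without padding.
        "JWT_SECRET_KEY": secrets.token_urlsafe(64),
        # 32 random bytes, standard base64 with padding (assumed variant).
        "CREDENTIAL_ENCRYPTION_KEY": base64.b64encode(
            secrets.token_bytes(32)
        ).decode("ascii"),
    }
```

Decoding CREDENTIAL_ENCRYPTION_KEY gives back exactly 32 bytes, which is the size a 256-bit key needs.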

2.3 Admin Account

| Prompt | Default | Notes |
| --- | --- | --- |
| Admin email | admin@the-other-dude.dev | Validated as email-like |
| Admin password | (enter or press Enter to generate) | Min 12 chars if manual; generated passwords are 24 chars |
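The two prompts above could be resolved along these lines. The helper names and the regular expression are assumptions: "email-like" is read here as "something@something.tld" with no whitespace, which is deliberately loose. The raw strings would come from input() and getpass.getpass().

```python
import re
import secrets

DEFAULT_ADMIN_EMAIL = "admin@the-other-dude.dev"
# Loose shape check only; this does not prove the address is deliverable.
EMAIL_LIKE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def resolve_admin_email(raw: str) -> str:
    """Use the default on empty input, otherwise require an email-like value."""
    email = raw.strip() or DEFAULT_ADMIN_EMAIL
    if not EMAIL_LIKE.fullmatch(email):
        raise ValueError(f"not an email-like address: {email!r}")
    return email


def resolve_admin_password(raw: str) -> str:
    """Generate a 24-char password on empty input, else enforce the minimum."""
    if raw == "":
        # 18 random bytes encode to exactly 24 URL-safe characters.
        return secrets.token_urlsafe(18)
    if len(raw) < 12:
        raise ValueError("admin password must be at least 12 characters")
    return raw
```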

2.4 Email (Optional)

| Prompt | Default | Notes |
| --- | --- | --- |
| Configure SMTP now? | No | If no, skip with reminder |
| SMTP host | (required if yes) | |
| SMTP port | 587 | |
| SMTP username | (optional) | |
| SMTP password | (optional) | |
| From address | (required if yes) | |
| Use TLS? | Yes | |

2.5 Web / Domain

| Prompt | Default | Notes |
| --- | --- | --- |
| Production domain | (required) | e.g. tod.staack.com |

Auto-derived:

  • APP_BASE_URL=https://<domain>
  • CORS_ORIGINS=https://<domain>

2.6 Summary & Confirmation

Display all settings grouped by section. Secrets are partially masked (first 8 chars + ...). Ask for confirmation before writing.
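The masking rule can be sketched as a small helper. The handling of values with 8 or fewer characters is an assumption (the spec does not cover it); such values are hidden entirely here so that a short secret is never shown in full.

```python
def mask_secret(value: str, visible: int = 8) -> str:
    """Show the first `visible` characters followed by '...'."""
    if len(value) <= visible:
        # Assumption: never reveal a secret that is shorter than the prefix.
        return "*" * len(value)
    return value[:visible] + "..."
```

For example, mask_secret("abcdefghijklmnop") returns "abcdefgh...".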

Phase 3: Write .env.prod

Write the file with section comments and timestamp header. Also generate scripts/init-postgres-prod.sql with the generated app_user and poller_user passwords baked in (PostgreSQL init scripts don't support env var substitution).

Format:

# ============================================================
# TOD Production Environment — generated by setup.py
# Generated: <ISO timestamp>
# ============================================================

# --- Database ---
POSTGRES_DB=tod
POSTGRES_USER=postgres
POSTGRES_PASSWORD=<input>
DATABASE_URL=postgresql+asyncpg://postgres:<pw>@postgres:5432/tod
SYNC_DATABASE_URL=postgresql+psycopg2://postgres:<pw>@postgres:5432/tod
APP_USER_DATABASE_URL=postgresql+asyncpg://app_user:<app_pw>@postgres:5432/tod
POLLER_DATABASE_URL=postgres://poller_user:<poller_pw>@postgres:5432/tod

# --- Security ---
JWT_SECRET_KEY=<generated>
CREDENTIAL_ENCRYPTION_KEY=<generated>

# --- OpenBao (KMS) ---
OPENBAO_ADDR=http://openbao:8200
OPENBAO_TOKEN=PLACEHOLDER_RUN_SETUP
BAO_UNSEAL_KEY=PLACEHOLDER_RUN_SETUP

# --- Admin Bootstrap ---
FIRST_ADMIN_EMAIL=<input>
FIRST_ADMIN_PASSWORD=<input-or-generated>

# --- Email ---
# <configured block or "unconfigured" note>
SMTP_HOST=
SMTP_PORT=587
SMTP_USER=
SMTP_PASSWORD=
SMTP_USE_TLS=true
SMTP_FROM_ADDRESS=noreply@example.com

# --- Web ---
APP_BASE_URL=https://<domain>
CORS_ORIGINS=https://<domain>

# --- Application ---
ENVIRONMENT=production
LOG_LEVEL=info
DEBUG=false
APP_NAME=TOD - The Other Dude

# --- Storage ---
GIT_STORE_PATH=/data/git-store
FIRMWARE_CACHE_DIR=/data/firmware-cache
WIREGUARD_CONFIG_PATH=/data/wireguard
WIREGUARD_GATEWAY=wireguard
CONFIG_RETENTION_DAYS=90

# --- Redis & NATS ---
REDIS_URL=redis://redis:6379/0
NATS_URL=nats://nats:4222

# --- Poller ---
POLL_INTERVAL_SECONDS=60
CONNECTION_TIMEOUT_SECONDS=10
COMMAND_TIMEOUT_SECONDS=30

# --- Remote Access ---
TUNNEL_PORT_MIN=49000
TUNNEL_PORT_MAX=49100
TUNNEL_IDLE_TIMEOUT=300
SSH_RELAY_PORT=8080
SSH_IDLE_TIMEOUT=900

# --- Config Backup ---
CONFIG_BACKUP_INTERVAL=21600
CONFIG_BACKUP_MAX_CONCURRENT=10
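
The layout above can be produced from a mapping of section names to key/value pairs. This is a sketch under assumptions: render_env and its parameters are hypothetical names, the title line is passed in by the caller, and the rule width is illustrative.

```python
from datetime import datetime, timezone

RULE = "# " + "=" * 60


def render_env(
    title: str,
    sections: dict[str, dict[str, str]],
    generated_at: datetime,
) -> str:
    """Render KEY=value lines grouped under '# --- Section ---' comments."""
    lines = [RULE, f"# {title}", f"# Generated: {generated_at.isoformat()}", RULE]
    for name, values in sections.items():
        lines += ["", f"# --- {name} ---"]
        lines += [f"{key}={value}" for key, value in values.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_env(
        "TOD Production Environment",
        {"Database": {"POSTGRES_DB": "tod", "POSTGRES_USER": "postgres"}},
        datetime.now(timezone.utc),
    ))
```

Since the file holds secrets, writing it with restrictive permissions (for example mode 0o600) would be a sensible addition, although the spec does not require it.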

Phase 4: OpenBao Bootstrap

  1. Start postgres and openbao containers only: docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.prod up -d postgres openbao
  2. Wait for openbao container to be healthy (timeout 60s)
  3. Run docker compose logs openbao 2>&1 and parse the OPENBAO_TOKEN= and BAO_UNSEAL_KEY= lines using regex (init.sh prints these to stdout during container startup, which is captured in Docker logs)
  4. Update .env.prod by replacing the PLACEHOLDER_RUN_SETUP values with the captured credentials
  5. On failure: .env.prod retains placeholders, print instructions for manual capture via docker compose logs openbao
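
Steps 3 and 4 can be sketched as two pure functions over text, which keeps them testable without Docker. The exact log format printed by init.sh is not shown in this spec, so the patterns below assume each credential appears as KEY=value with no whitespace in the value. The function names are hypothetical.

```python
import re
from typing import Optional

PLACEHOLDER = "PLACEHOLDER_RUN_SETUP"
# Assumption: values contain no whitespace and end at the end of the token.
TOKEN_RE = re.compile(r"OPENBAO_TOKEN=(\S+)")
UNSEAL_RE = re.compile(r"BAO_UNSEAL_KEY=(\S+)")


def parse_openbao_credentials(logs: str) -> Optional[dict[str, str]]:
    """Extract both credentials from container logs, or None if either is missing."""
    tokens = TOKEN_RE.findall(logs)
    unseal_keys = UNSEAL_RE.findall(logs)
    if not tokens or not unseal_keys:
        return None
    # Take the last occurrence in case the container logged more than once.
    return {"OPENBAO_TOKEN": tokens[-1], "BAO_UNSEAL_KEY": unseal_keys[-1]}


def apply_credentials(env_text: str, creds: dict[str, str]) -> str:
    """Replace the placeholder values in the .env.prod text."""
    for key, value in creds.items():
        env_text = env_text.replace(f"{key}={PLACEHOLDER}", f"{key}={value}")
    return env_text
```

When parse_openbao_credentials returns None, the wizard leaves .env.prod untouched and prints the manual capture instructions from step 5.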

Phase 5: Build Images

Build sequentially to avoid OOM:

docker compose -f docker-compose.yml -f docker-compose.prod.yml build api
docker compose -f docker-compose.yml -f docker-compose.prod.yml build poller
docker compose -f docker-compose.yml -f docker-compose.prod.yml build frontend
docker compose -f docker-compose.yml -f docker-compose.prod.yml build winbox-worker

Show progress for each. On failure: stop, report which image failed, suggest rerunning.
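
A minimal sketch of the sequential build loop, assuming the wizard shells out through subprocess.run. The run parameter is a hypothetical injection point that makes the loop testable without Docker.

```python
import subprocess
from typing import Optional

COMPOSE = [
    "docker", "compose",
    "-f", "docker-compose.yml",
    "-f", "docker-compose.prod.yml",
]
IMAGES = ("api", "poller", "frontend", "winbox-worker")


def build_images(images=IMAGES, run=subprocess.run) -> Optional[str]:
    """Build one image at a time; return the first failing image, or None."""
    for index, image in enumerate(images, start=1):
        print(f"[{index}/{len(images)}] Building {image} ...")
        result = run([*COMPOSE, "build", image])
        if result.returncode != 0:
            # Stop immediately so later builds do not hide the failure.
            return image
    return None
```

The return value tells the caller which image to name in the retry suggestion.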

Phase 6: Start Stack

docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.prod up -d

Phase 7: Health Check

  • Poll service health for up to 60 seconds
  • Report status of: postgres, redis, nats, openbao, api, poller, frontend, winbox-worker
  • On success: print access URL (https://<domain>) and admin credentials
  • On timeout: report which services are unhealthy, suggest docker compose logs <service>
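
The polling loop can be sketched independently of how status is obtained. Here get_status is a hypothetical callable returning a mapping of service name to health string. A real implementation might wrap docker compose ps --format json, whose output shape differs between Compose releases, and services without a healthcheck report no health value, so they would need separate handling.

```python
import time

SERVICES = (
    "postgres", "redis", "nats", "openbao",
    "api", "poller", "frontend", "winbox-worker",
)


def wait_for_healthy(
    get_status,
    services=SERVICES,
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep=time.sleep,
    clock=time.monotonic,
) -> list[str]:
    """Poll until all services are healthy or the timeout passes.

    Returns the services that are still unhealthy (empty list on success).
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        unhealthy = [name for name in services if status.get(name) != "healthy"]
        if not unhealthy or clock() >= deadline:
            return unhealthy
        sleep(interval)
```

An empty result leads to the success message; a non-empty one lists the services to inspect with docker compose logs <service>.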

Database Init Script

scripts/init-postgres.sql hardcodes app_password and poller_password. Since PostgreSQL's docker-entrypoint-initdb.d scripts don't support environment variable substitution, the setup script generates scripts/init-postgres-prod.sql with the actual generated passwords baked in. The docker-compose.prod.yml volume mount will be updated to use this file instead.
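
One way to generate the production script is to treat the existing file as a template and substitute the two hardcoded passwords. This sketch assumes they appear as the quoted SQL literals 'app_password' and 'poller_password'; the actual contents of scripts/init-postgres.sql are not shown in this spec, and the function name is hypothetical.

```python
def render_prod_init_sql(template_sql: str, app_pw: str, poller_pw: str) -> str:
    """Substitute generated passwords for the hardcoded development ones."""
    for placeholder in ("'app_password'", "'poller_password'"):
        if placeholder not in template_sql:
            raise ValueError(f"template does not contain {placeholder}")
    for password in (app_pw, poller_pw):
        # token_urlsafe output never contains these, but fail loudly if a
        # caller passes something that would break out of the SQL literal.
        if "'" in password or "\\" in password:
            raise ValueError("password contains characters unsafe in a SQL literal")
    return (
        template_sql
        .replace("'app_password'", f"'{app_pw}'")
        .replace("'poller_password'", f"'{poller_pw}'")
    )
```

The result would be written to scripts/init-postgres-prod.sql, which should be kept out of version control because it contains live credentials.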

Login Page Fix

frontend/src/routes/login.tsx lines 235-241 contain a "First time?" hint showing .env credential names. This will be wrapped in {import.meta.env.DEV && (...)} so it only appears in development builds. Vite's production build strips DEV-gated code entirely.

Error Handling

| Scenario | Behavior |
| --- | --- |
| Docker not installed/running | Fail early with clear message |
| Existing .env.prod | Offer: overwrite / back up / abort |
| Port already in use | Warn (non-blocking) with which port and likely culprit |
| OpenBao init fails | .env.prod retains placeholders, print manual capture steps |
| Image build fails | Stop, show failed image, suggest retry command |
| Health check timeout (60s) | Report unhealthy services, suggest log commands |
| Ctrl+C before Phase 3 | Graceful exit, no files written |
| Ctrl+C during/after Phase 3 | .env.prod exists (possibly with placeholders), noted on exit |

Re-runnability

  • Detects existing .env.prod and offers choices
  • Never silently regenerates secrets: if valid ones already exist, offers to keep or regenerate them
  • OpenBao re-init is idempotent (init.sh handles already-initialized state)
  • Image rebuilds are safe (Docker layer caching)
  • Backup naming: .env.prod.backup.<ISO timestamp>
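
The backup step could be sketched as follows. The spec only says "ISO timestamp"; this sketch assumes the ISO 8601 basic format (no colons) so the file name stays valid on every filesystem. The function name is hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_env_file(env_path: Path, now: datetime) -> Path:
    """Copy .env.prod to .env.prod.backup.<timestamp> and return the new path."""
    # Assumption: ISO 8601 basic format, e.g. 20260314T095258.
    stamp = now.strftime("%Y%m%dT%H%M%S")
    backup_path = env_path.with_name(f"{env_path.name}.backup.{stamp}")
    # copy2 preserves permissions and modification time along with the content.
    shutil.copy2(env_path, backup_path)
    return backup_path
```

Copying rather than renaming leaves the original in place until the wizard has a complete replacement to write.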

Dependencies

  • Python 3.10+ (stdlib only — no pip packages required)
  • Docker Engine 24+
  • Docker Compose v2
  • Stdlib modules: secrets, subprocess, shutil, json, re, datetime, pathlib, getpass, socket (for port checks)