TOD Production Setup Script — Design Spec
Overview
An interactive Python setup wizard (setup.py) that walks a sysadmin through configuring and deploying TOD (The Other Dude) for production. The script minimizes manual configuration by auto-generating secrets, capturing OpenBao credentials automatically, building images sequentially, and verifying service health.
Target audience: Technical sysadmins unfamiliar with this specific project.
Design Decisions
- Python 3.10+: already required by the stack, enables rich input handling and colored output.
- Linear wizard with opinionated defaults: grouped sections, auto-generate everything possible, only prompt for genuine human decisions.
- Integrated OpenBao bootstrap: script starts the OpenBao container, captures unseal key and root token, updates .env.prod automatically (no manual copy-paste).
- Sequential image builds: builds api, poller, frontend, winbox-worker one at a time to avoid OOM on low-RAM machines.
- Re-runnable: safe to run again; detects existing .env.prod and offers to overwrite, back up (.env.prod.backup.<ISO timestamp>), or abort.
Prerequisite: Database Rename
The codebase currently uses mikrotik as the database name. Before the setup script can use tod, these files must be updated:
- docker-compose.yml: default POSTGRES_DB and healthcheck (pg_isready -d)
- docker-compose.prod.yml: hardcoded poller DATABASE_URL (change to ${POLLER_DATABASE_URL})
- docker-compose.staging.yml: if applicable
- scripts/init-postgres.sql: GRANT CONNECT ON DATABASE statements
- .env.example: all URL references
The setup script will use POSTGRES_DB=tod. These file changes are part of the implementation, not runtime.
Additionally, docker-compose.prod.yml hardcodes the poller's DATABASE_URL. This must be changed to DATABASE_URL: ${POLLER_DATABASE_URL} so the setup script's generated value is used.
Script Flow
Phase 1: Pre-flight Checks
- Verify Python 3.10+
- Verify Docker Engine and Docker Compose v2 are installed and the daemon is running
- Check for existing .env.prod; if found, offer: overwrite / back up and create new / abort
- Warn if less than 4GB RAM available
- Check if key ports are in use (5432, 6379, 4222, 8001, 3000, 51820) and warn
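The version and port checks above could be sketched with the standard library alone. This is a minimal illustration, not the script's actual code: the helper names (python_ok, port_in_use, busy_ports) are hypothetical, and the probe only detects TCP listeners on localhost, so a UDP service (WireGuard on 51820 normally uses UDP) would need a separate check.

```python
import socket
import sys

# Ports named in the pre-flight list of this spec.
KEY_PORTS = (5432, 6379, 4222, 8001, 3000, 51820)


def python_ok(minimum=(3, 10)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports=KEY_PORTS):
    """Return the subset of ports that already have a TCP listener."""
    return [port for port in ports if port_in_use(port)]
```

Since the spec treats a busy port as a warning rather than an error, a caller would print the result of busy_ports() and continue.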
Phase 2: Interactive Configuration (Linear Wizard)
Six sections, presented in order:
2.1 Database
| Prompt | Default | Notes |
|---|---|---|
| PostgreSQL superuser password | (required, no default) | Validated non-empty, min 12 chars |
Auto-generated:
- POSTGRES_DB=tod
- app_user password via secrets.token_urlsafe(24) (yields 32 URL-safe base64 chars)
- poller_user password via secrets.token_urlsafe(24) (yields 32 URL-safe base64 chars)
- DATABASE_URL=postgresql+asyncpg://postgres:<pw>@postgres:5432/tod
- SYNC_DATABASE_URL=postgresql+psycopg2://postgres:<pw>@postgres:5432/tod
- APP_USER_DATABASE_URL=postgresql+asyncpg://app_user:<app_pw>@postgres:5432/tod
- POLLER_DATABASE_URL=postgres://poller_user:<poller_pw>@postgres:5432/tod
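A rough sketch of how these values might be derived follows. The function name build_database_settings is hypothetical. The percent-encoding of the superuser password is an assumption: the spec does not say how special characters in a human-chosen password are handled, but an unescaped '@' or '/' would break the URLs.

```python
import secrets
from urllib.parse import quote


def build_database_settings(superuser_password):
    """Return (settings, app_pw, poller_pw) for the Database section."""
    # 24 random bytes encode to exactly 32 URL-safe characters.
    app_pw = secrets.token_urlsafe(24)
    poller_pw = secrets.token_urlsafe(24)
    # Assumption: percent-encode the prompted password for use inside URLs.
    pw = quote(superuser_password, safe="")
    target = "postgres:5432/tod"
    settings = {
        "POSTGRES_DB": "tod",
        "POSTGRES_PASSWORD": superuser_password,
        "DATABASE_URL": f"postgresql+asyncpg://postgres:{pw}@{target}",
        "SYNC_DATABASE_URL": f"postgresql+psycopg2://postgres:{pw}@{target}",
        "APP_USER_DATABASE_URL": f"postgresql+asyncpg://app_user:{app_pw}@{target}",
        "POLLER_DATABASE_URL": f"postgres://poller_user:{poller_pw}@{target}",
    }
    return settings, app_pw, poller_pw
```

The generated passwords are returned separately because the init SQL file needs them as well; token_urlsafe output contains only letters, digits, '-' and '_', so it needs no encoding.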
2.2 Security
No prompts. Auto-generated:
- JWT_SECRET_KEY via secrets.token_urlsafe(64) (yields 86 URL-safe base64 chars)
- CREDENTIAL_ENCRYPTION_KEY via base64(secrets.token_bytes(32)) (yields 44 base64 chars)
- Display both values to the user with a "save these somewhere safe" note
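The two secrets could be produced as in the sketch below. The function name is hypothetical, and the use of the standard base64 alphabet (rather than the URL-safe one) for CREDENTIAL_ENCRYPTION_KEY is an assumption, since the spec only says "base64".

```python
import base64
import secrets


def generate_security_settings():
    """Return the auto-generated values for the Security section."""
    # 64 random bytes encode to 86 URL-safe characters (padding is stripped).
    jwt_secret = secrets.token_urlsafe(64)
    # 32 random bytes encode to 44 characters, including one '=' of padding.
    # Assumption: standard base64 alphabet.
    encryption_key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    return {
        "JWT_SECRET_KEY": jwt_secret,
        "CREDENTIAL_ENCRYPTION_KEY": encryption_key,
    }
```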
2.3 Admin Account
| Prompt | Default | Notes |
|---|---|---|
| Admin email | admin@the-other-dude.dev | Validated as email-like |
| Admin password | (enter or press Enter to generate) | Min 12 chars if manual; generated passwords are 24 chars |
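One possible reading of the two validation rules in this table is sketched below. The regular expression is a deliberately loose "email-like" check rather than full address validation, and both helper names are hypothetical.

```python
import re
import secrets

# Loose shape check: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_email_like(value):
    """Return True if value looks roughly like an email address."""
    return bool(EMAIL_RE.match(value))


def resolve_admin_password(entered, min_length=12):
    """Return the entered password, or generate one if the input is empty."""
    if not entered:
        # 18 random bytes encode to exactly 24 URL-safe characters.
        return secrets.token_urlsafe(18)
    if len(entered) < min_length:
        raise ValueError(f"password must be at least {min_length} characters")
    return entered
```

In the wizard, the ValueError would typically lead to a re-prompt instead of ending the run.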
2.4 Email (Optional)
| Prompt | Default | Notes |
|---|---|---|
| Configure SMTP now? | No | If no, skip with reminder |
| SMTP host | (required if yes) | |
| SMTP port | 587 | |
| SMTP username | (optional) | |
| SMTP password | (optional) | |
| From address | (required if yes) | |
| Use TLS? | Yes | |
2.5 Web / Domain
| Prompt | Default | Notes |
|---|---|---|
| Production domain | (required) | e.g. tod.staack.com |
Auto-derived:
- APP_BASE_URL=https://<domain>
- CORS_ORIGINS=https://<domain>
2.6 Summary & Confirmation
Display all settings grouped by section. Secrets are partially masked (first 8 chars + ...). Ask for confirmation before writing.
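The masking rule could look like the sketch below. The helper name mask_secret is hypothetical, and hiding values of eight characters or fewer entirely is an assumption added so that short secrets are never shown in full.

```python
def mask_secret(value, visible=8):
    """Return the first `visible` characters followed by '...'."""
    if len(value) <= visible:
        # Assumption: a value this short would be fully revealed, so hide it.
        return "..."
    return value[:visible] + "..."
```

For example, mask_secret("abcdefghijklmnop") returns "abcdefgh...".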
Phase 3: Write .env.prod
Write the file with section comments and timestamp header. Also generate scripts/init-postgres-prod.sql with the generated app_user and poller_user passwords baked in (PostgreSQL init scripts don't support env var substitution).
Format:
# ============================================================
# TOD Production Environment — generated by setup.py
# Generated: <ISO timestamp>
# ============================================================
# --- Database ---
POSTGRES_DB=tod
POSTGRES_USER=postgres
POSTGRES_PASSWORD=<input>
DATABASE_URL=postgresql+asyncpg://postgres:<pw>@postgres:5432/tod
SYNC_DATABASE_URL=postgresql+psycopg2://postgres:<pw>@postgres:5432/tod
APP_USER_DATABASE_URL=postgresql+asyncpg://app_user:<app_pw>@postgres:5432/tod
POLLER_DATABASE_URL=postgres://poller_user:<poller_pw>@postgres:5432/tod
# --- Security ---
JWT_SECRET_KEY=<generated>
CREDENTIAL_ENCRYPTION_KEY=<generated>
# --- OpenBao (KMS) ---
OPENBAO_ADDR=http://openbao:8200
OPENBAO_TOKEN=PLACEHOLDER_RUN_SETUP
BAO_UNSEAL_KEY=PLACEHOLDER_RUN_SETUP
# --- Admin Bootstrap ---
FIRST_ADMIN_EMAIL=<input>
FIRST_ADMIN_PASSWORD=<input-or-generated>
# --- Email ---
# <configured block or "unconfigured" note>
SMTP_HOST=
SMTP_PORT=587
SMTP_USER=
SMTP_PASSWORD=
SMTP_USE_TLS=true
SMTP_FROM_ADDRESS=noreply@example.com
# --- Web ---
APP_BASE_URL=https://<domain>
CORS_ORIGINS=https://<domain>
# --- Application ---
ENVIRONMENT=production
LOG_LEVEL=info
DEBUG=false
APP_NAME=TOD - The Other Dude
# --- Storage ---
GIT_STORE_PATH=/data/git-store
FIRMWARE_CACHE_DIR=/data/firmware-cache
WIREGUARD_CONFIG_PATH=/data/wireguard
WIREGUARD_GATEWAY=wireguard
CONFIG_RETENTION_DAYS=90
# --- Redis & NATS ---
REDIS_URL=redis://redis:6379/0
NATS_URL=nats://nats:4222
# --- Poller ---
POLL_INTERVAL_SECONDS=60
CONNECTION_TIMEOUT_SECONDS=10
COMMAND_TIMEOUT_SECONDS=30
# --- Remote Access ---
TUNNEL_PORT_MIN=49000
TUNNEL_PORT_MAX=49100
TUNNEL_IDLE_TIMEOUT=300
SSH_RELAY_PORT=8080
SSH_IDLE_TIMEOUT=900
# --- Config Backup ---
CONFIG_BACKUP_INTERVAL=21600
CONFIG_BACKUP_MAX_CONCURRENT=10
Phase 4: OpenBao Bootstrap
- Start postgres and openbao containers only: docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.prod up -d postgres openbao
- Wait for openbao container to be healthy (timeout 60s)
- Run docker compose logs openbao 2>&1 and parse the OPENBAO_TOKEN= and BAO_UNSEAL_KEY= lines using regex (init.sh prints these to stdout during container startup, which is captured in Docker logs)
- Update .env.prod by replacing the PLACEHOLDER_RUN_SETUP values with the captured credentials
- On failure: .env.prod retains placeholders, print instructions for manual capture via docker compose logs openbao
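The parse-and-replace steps might be sketched as follows. The exact log line format printed by init.sh is not shown in this spec, so the patterns assume only that each key appears as KEY=value with no whitespace in the value. Both function names are hypothetical.

```python
import re

PLACEHOLDER = "PLACEHOLDER_RUN_SETUP"
PATTERNS = {
    "OPENBAO_TOKEN": re.compile(r"OPENBAO_TOKEN=(\S+)"),
    "BAO_UNSEAL_KEY": re.compile(r"BAO_UNSEAL_KEY=(\S+)"),
}


def parse_openbao_credentials(log_text):
    """Return both credentials from the log text, or None if either is missing."""
    found = {}
    for key, pattern in PATTERNS.items():
        matches = pattern.findall(log_text)
        if not matches:
            return None
        # Use the most recent occurrence if the value was printed repeatedly.
        found[key] = matches[-1]
    return found


def fill_placeholders(env_text, credentials):
    """Replace placeholder values in .env text; leave other lines untouched."""
    lines = []
    for line in env_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in credentials and value == PLACEHOLDER:
            line = f"{key}={credentials[key]}"
        lines.append(line)
    return "\n".join(lines) + "\n"
```

A None result from parse_openbao_credentials corresponds to the failure path: .env.prod keeps its placeholders and the script prints the manual capture instructions.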
Phase 5: Build Images
Build sequentially to avoid OOM:
docker compose -f docker-compose.yml -f docker-compose.prod.yml build api
docker compose -f docker-compose.yml -f docker-compose.prod.yml build poller
docker compose -f docker-compose.yml -f docker-compose.prod.yml build frontend
docker compose -f docker-compose.yml -f docker-compose.prod.yml build winbox-worker
Show progress for each. On failure: stop, report which image failed, suggest rerunning.
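A minimal sketch of the sequential build loop is shown below. The function name build_images and its run parameter are hypothetical; run defaults to subprocess.run and exists only so the loop can be exercised without Docker.

```python
import subprocess

COMPOSE = [
    "docker", "compose",
    "-f", "docker-compose.yml",
    "-f", "docker-compose.prod.yml",
]
IMAGES = ["api", "poller", "frontend", "winbox-worker"]


def build_images(images=IMAGES, run=subprocess.run):
    """Build one image at a time; return the first failed image or None."""
    for index, image in enumerate(images, start=1):
        print(f"[{index}/{len(images)}] Building {image}...")
        result = run(COMPOSE + ["build", image])
        if result.returncode != 0:
            retry = " ".join(COMPOSE + ["build", image])
            print(f"Build failed for {image}. Retry with: {retry}")
            # Stop at the first failure so later builds do not hide it.
            return image
    return None
```

Because each call blocks until its build finishes, only one build runs at any time, which is what keeps peak memory use down.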
Phase 6: Start Stack
docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.prod up -d
Phase 7: Health Check
- Poll service health for up to 60 seconds
- Report status of: postgres, redis, nats, openbao, api, poller, frontend, winbox-worker
- On success: print access URL (https://<domain>) and admin credentials
- On timeout: report which services are unhealthy, suggest docker compose logs <service>
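The polling loop could be structured as in this sketch. How health is actually queried is left open here: probe is a hypothetical callable that returns a mapping of service name to True/False, for instance built from Docker's reported container health.

```python
import time

SERVICES = [
    "postgres", "redis", "nats", "openbao",
    "api", "poller", "frontend", "winbox-worker",
]


def wait_for_healthy(probe, services=SERVICES, timeout=60.0, interval=2.0):
    """Poll until all services are healthy or the timeout expires.

    Returns the list of services still unhealthy (empty on success).
    """
    deadline = time.monotonic() + timeout
    while True:
        status = probe()
        # A service missing from the probe result counts as unhealthy.
        unhealthy = [name for name in services if not status.get(name, False)]
        if not unhealthy or time.monotonic() >= deadline:
            return unhealthy
        time.sleep(interval)
```

An empty return value maps to the success path; otherwise the returned names are the services to mention alongside the docker compose logs <service> hint.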
Database Init Script
scripts/init-postgres.sql hardcodes app_password and poller_password. Since PostgreSQL's docker-entrypoint-initdb.d scripts don't support environment variable substitution, the setup script generates scripts/init-postgres-prod.sql with the actual generated passwords baked in. The docker-compose.prod.yml volume mount will be updated to use this file instead.
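One way to produce the production init script is to substitute the hardcoded passwords in the development script's text, as sketched here. This assumes the development script contains the quoted SQL literals 'app_password' and 'poller_password'; its actual contents are not shown in this spec, and the function name is hypothetical.

```python
def render_prod_init_sql(template_sql, app_pw, poller_pw):
    """Return template_sql with the hardcoded passwords replaced."""

    def sql_literal(value):
        # Double any single quotes so the value stays one SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    replacements = {
        "'app_password'": sql_literal(app_pw),
        "'poller_password'": sql_literal(poller_pw),
    }
    rendered = template_sql
    for old, new in replacements.items():
        if old not in rendered:
            # Fail loudly rather than write a script with a default password.
            raise ValueError(f"expected literal {old} not found in template")
        rendered = rendered.replace(old, new)
    return rendered
```

The result would be written to scripts/init-postgres-prod.sql. Since that file contains live passwords, restricting its permissions is worth considering, although the spec does not require it.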
Login Page Fix
frontend/src/routes/login.tsx lines 235-241 contain a "First time?" hint showing .env credential names. This will be wrapped in {import.meta.env.DEV && (...)} so it only appears in development builds. Vite's production build strips DEV-gated code entirely.
Error Handling
| Scenario | Behavior |
|---|---|
| Docker not installed/running | Fail early with clear message |
| Existing .env.prod | Offer: overwrite / back up / abort |
| Port already in use | Warn (non-blocking) with which port and likely culprit |
| OpenBao init fails | .env.prod retains placeholders, print manual capture steps |
| Image build fails | Stop, show failed image, suggest retry command |
| Health check timeout (60s) | Report unhealthy services, suggest log commands |
| Ctrl+C before Phase 3 | Graceful exit, no files written |
| Ctrl+C during/after Phase 3 | .env.prod exists (possibly with placeholders), noted on exit |
Re-runnability
- Detects existing .env.prod and offers choices
- Won't regenerate secrets if valid ones exist (offers to keep or regenerate)
- OpenBao re-init is idempotent (init.sh handles already-initialized state)
- Image rebuilds are safe (Docker layer caching)
- Backup naming: .env.prod.backup.<ISO timestamp>
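The backup step might be implemented roughly as below. The function name backup_env_file is hypothetical, and second-level precision in local time is an assumption, since the spec only says "ISO timestamp".

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_env_file(env_path):
    """Copy env_path to <name>.backup.<ISO timestamp>; return the new path."""
    env_path = Path(env_path)
    # Assumption: local time with second precision, e.g. 2025-01-31T14:05:09.
    stamp = datetime.now().isoformat(timespec="seconds")
    backup_path = env_path.with_name(f"{env_path.name}.backup.{stamp}")
    # copy2 also preserves file metadata such as the modification time.
    shutil.copy2(env_path, backup_path)
    return backup_path
```

The colons in an ISO timestamp are valid in file names on Linux, which is the expected host for a Docker deployment. A different separator would be needed if the script ever had to run on Windows.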
Dependencies
- Python 3.10+ (stdlib only — no pip packages required)
- Docker Engine 24+
- Docker Compose v2
- Stdlib modules: secrets, subprocess, shutil, json, re, datetime, pathlib, getpass, socket (for port checks)