# TOD Production Setup Script: Design Spec

## Overview

An interactive Python setup wizard (`setup.py`) that walks a sysadmin through configuring and deploying TOD (The Other Dude) for production. The script minimizes manual configuration by auto-generating secrets, capturing OpenBao credentials automatically, building images sequentially, and verifying service health.

**Target audience:** Technical sysadmins unfamiliar with this specific project.

## Design Decisions

- **Python 3.10+**: already required by the stack, enables rich input handling and colored output.
- **Linear wizard with opinionated defaults**: grouped sections, auto-generate everything possible, only prompt for genuine human decisions.
- **Integrated OpenBao bootstrap**: script starts the OpenBao container, captures unseal key and root token, updates `.env.prod` automatically (no manual copy-paste).
- **Sequential image builds**: builds api, poller, frontend, winbox-worker one at a time to avoid OOM on low-RAM machines.
- **Re-runnable**: safe to run again; detects existing `.env.prod` and offers to overwrite, back up (`.env.prod.backup.`), or abort.

## Prerequisite: Database Rename

The codebase currently uses `mikrotik` as the database name. Before the setup script can use `tod`, these files must be updated:

- `docker-compose.yml`: default `POSTGRES_DB` and healthcheck (`pg_isready -d`)
- `docker-compose.prod.yml`: hardcoded poller `DATABASE_URL` (change to `${POLLER_DATABASE_URL}`)
- `docker-compose.staging.yml`: if applicable
- `scripts/init-postgres.sql`: `GRANT CONNECT ON DATABASE` statements
- `.env.example`: all URL references

The setup script will use `POSTGRES_DB=tod`. These file changes are part of the implementation, not runtime.

Additionally, `docker-compose.prod.yml` hardcodes the poller's `DATABASE_URL`. This must be changed to `DATABASE_URL: ${POLLER_DATABASE_URL}` so the setup script's generated value is used.
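The rename checklist above can be verified mechanically before the setup script is used. The following is a minimal sketch, assuming the listed paths are relative to the repository root; `find_stale_db_references` and `RENAME_TARGETS` are hypothetical names introduced for illustration and are not part of the spec.

```python
from pathlib import Path

# Files named in the rename checklist; paths are assumed to be relative to the repo root.
RENAME_TARGETS = [
    "docker-compose.yml",
    "docker-compose.prod.yml",
    "docker-compose.staging.yml",
    "scripts/init-postgres.sql",
    ".env.example",
]


def find_stale_db_references(repo_root, old_name="mikrotik", targets=RENAME_TARGETS):
    """Return (relative_path, line_number, line) for each line still containing old_name."""
    hits = []
    for rel in targets:
        path = Path(repo_root) / rel
        if not path.is_file():
            continue  # e.g. no staging compose file in this checkout
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if old_name in line:
                hits.append((rel, number, line.strip()))
    return hits
```

Calling `find_stale_db_references(".")` from the repository root returns an empty list once no listed file mentions the old name. This is a plain substring match, so any hit that is not a database name (for example, a reference to MikroTik devices) would need manual review.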
## Script Flow

### Phase 1: Pre-flight Checks

- Verify Python 3.10+
- Verify Docker Engine and Docker Compose v2 are installed and the daemon is running
- Check for existing `.env.prod`; if found, offer: overwrite / back up and create new / abort
- Warn if less than 4GB RAM available
- Check if key ports are in use (5432, 6379, 4222, 8001, 3000, 51820) and warn

### Phase 2: Interactive Configuration (Linear Wizard)

Six sections, presented in order:

#### 2.1 Database

| Prompt | Default | Notes |
|--------|---------|-------|
| PostgreSQL superuser password | (required, no default) | Validated non-empty, min 12 chars |

Auto-generated:

- `POSTGRES_DB=tod`
- `app_user` password via `secrets.token_urlsafe(24)` (yields 32 URL-safe base64 chars)
- `poller_user` password via `secrets.token_urlsafe(24)` (yields 32 URL-safe base64 chars)
- `DATABASE_URL=postgresql+asyncpg://postgres:@postgres:5432/tod`
- `SYNC_DATABASE_URL=postgresql+psycopg2://postgres:@postgres:5432/tod`
- `APP_USER_DATABASE_URL=postgresql+asyncpg://app_user:@postgres:5432/tod`
- `POLLER_DATABASE_URL=postgres://poller_user:@postgres:5432/tod`

#### 2.2 Security

No prompts. Auto-generated:

- `JWT_SECRET_KEY` via `secrets.token_urlsafe(64)` (yields 86 URL-safe base64 chars)
- `CREDENTIAL_ENCRYPTION_KEY` via `base64(secrets.token_bytes(32))` (yields 44 base64 chars)
- Display both values to the user with a "save these somewhere safe" note

#### 2.3 Admin Account

| Prompt | Default | Notes |
|--------|---------|-------|
| Admin email | `admin@the-other-dude.dev` | Validated as email-like |
| Admin password | (enter or press Enter to generate) | Min 12 chars if manual; generated passwords are 24 chars |

#### 2.4 Email (Optional)

| Prompt | Default | Notes |
|--------|---------|-------|
| Configure SMTP now? | No | If no, skip with reminder |
| SMTP host | (required if yes) | |
| SMTP port | 587 | |
| SMTP username | (optional) | |
| SMTP password | (optional) | |
| From address | (required if yes) | |
| Use TLS? | Yes | |

#### 2.5 Web / Domain

| Prompt | Default | Notes |
|--------|---------|-------|
| Production domain | (required) | e.g. `tod.staack.com` |

Auto-derived:

- `APP_BASE_URL=https://`
- `CORS_ORIGINS=https://`

#### 2.6 Summary & Confirmation

Display all settings grouped by section. Secrets are partially masked (first 8 chars + `...`). Ask for confirmation before writing.

### Phase 3: Write `.env.prod`

Write the file with section comments and a timestamp header. Also generate `scripts/init-postgres-prod.sql` with the generated `app_user` and `poller_user` passwords baked in (PostgreSQL `.sql` init scripts don't support env var substitution).

Format:

```bash
# ============================================================
# TOD Production Environment — generated by setup.py
# Generated:
# ============================================================

# --- Database ---
POSTGRES_DB=tod
POSTGRES_USER=postgres
POSTGRES_PASSWORD=
DATABASE_URL=postgresql+asyncpg://postgres:@postgres:5432/tod
SYNC_DATABASE_URL=postgresql+psycopg2://postgres:@postgres:5432/tod
APP_USER_DATABASE_URL=postgresql+asyncpg://app_user:@postgres:5432/tod
POLLER_DATABASE_URL=postgres://poller_user:@postgres:5432/tod

# --- Security ---
JWT_SECRET_KEY=
CREDENTIAL_ENCRYPTION_KEY=

# --- OpenBao (KMS) ---
OPENBAO_ADDR=http://openbao:8200
OPENBAO_TOKEN=PLACEHOLDER_RUN_SETUP
BAO_UNSEAL_KEY=PLACEHOLDER_RUN_SETUP

# --- Admin Bootstrap ---
FIRST_ADMIN_EMAIL=
FIRST_ADMIN_PASSWORD=

# --- Email ---
# SMTP_HOST=
SMTP_PORT=587
SMTP_USER=
SMTP_PASSWORD=
SMTP_USE_TLS=true
SMTP_FROM_ADDRESS=noreply@example.com

# --- Web ---
APP_BASE_URL=https://
CORS_ORIGINS=https://

# --- Application ---
ENVIRONMENT=production
LOG_LEVEL=info
DEBUG=false
APP_NAME=TOD - The Other Dude

# --- Storage ---
GIT_STORE_PATH=/data/git-store
FIRMWARE_CACHE_DIR=/data/firmware-cache
WIREGUARD_CONFIG_PATH=/data/wireguard
WIREGUARD_GATEWAY=wireguard
CONFIG_RETENTION_DAYS=90

# --- Redis & NATS ---
REDIS_URL=redis://redis:6379/0
NATS_URL=nats://nats:4222

# --- Poller ---
POLL_INTERVAL_SECONDS=60
CONNECTION_TIMEOUT_SECONDS=10
COMMAND_TIMEOUT_SECONDS=30

# --- Remote Access ---
TUNNEL_PORT_MIN=49000
TUNNEL_PORT_MAX=49100
TUNNEL_IDLE_TIMEOUT=300
SSH_RELAY_PORT=8080
SSH_IDLE_TIMEOUT=900

# --- Config Backup ---
CONFIG_BACKUP_INTERVAL=21600
CONFIG_BACKUP_MAX_CONCURRENT=10
```

### Phase 4: OpenBao Bootstrap

1. Start postgres and openbao containers only: `docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.prod up -d postgres openbao`
2. Wait for the openbao container to be healthy (timeout 60s)
3. Run `docker compose logs openbao 2>&1` and parse the `OPENBAO_TOKEN=` and `BAO_UNSEAL_KEY=` lines using regex (init.sh prints these to stdout during container startup, which is captured in Docker logs)
4. Update `.env.prod` by replacing the `PLACEHOLDER_RUN_SETUP` values with the captured credentials
5. On failure: `.env.prod` retains placeholders; print instructions for manual capture via `docker compose logs openbao`

### Phase 5: Build Images

Build sequentially to avoid OOM:

```
docker compose -f docker-compose.yml -f docker-compose.prod.yml build api
docker compose -f docker-compose.yml -f docker-compose.prod.yml build poller
docker compose -f docker-compose.yml -f docker-compose.prod.yml build frontend
docker compose -f docker-compose.yml -f docker-compose.prod.yml build winbox-worker
```

Show progress for each. On failure: stop, report which image failed, suggest rerunning.

### Phase 6: Start Stack

```
docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.prod up -d
```

### Phase 7: Health Check

- Poll service health for up to 60 seconds
- Report status of: postgres, redis, nats, openbao, api, poller, frontend, winbox-worker
- On success: print access URL (`https://`) and admin credentials
- On timeout: report which services are unhealthy, suggest `docker compose logs `

## Database Init Script

`scripts/init-postgres.sql` hardcodes `app_password` and `poller_password`.
Since `.sql` scripts in PostgreSQL's `docker-entrypoint-initdb.d` are run without environment variable substitution, the setup script generates `scripts/init-postgres-prod.sql` with the actual generated passwords baked in. The `docker-compose.prod.yml` volume mount will be updated to use this file instead.

## Login Page Fix

`frontend/src/routes/login.tsx` lines 235-241 contain a "First time?" hint showing `.env` credential names. This will be wrapped in `{import.meta.env.DEV && (...)}` so it only appears in development builds. Vite's production build strips DEV-gated code entirely.

## Error Handling

| Scenario | Behavior |
|----------|----------|
| Docker not installed/running | Fail early with clear message |
| Existing `.env.prod` | Offer: overwrite / back up / abort |
| Port already in use | Warn (non-blocking) with which port and likely culprit |
| OpenBao init fails | `.env.prod` retains placeholders, print manual capture steps |
| Image build fails | Stop, show failed image, suggest retry command |
| Health check timeout (60s) | Report unhealthy services, suggest log commands |
| Ctrl+C before Phase 3 | Graceful exit, no files written |
| Ctrl+C during/after Phase 3 | `.env.prod` exists (possibly with placeholders), noted on exit |

## Re-runnability

- Detects existing `.env.prod` and offers choices
- Won't regenerate secrets if valid ones exist (offers to keep or regenerate)
- OpenBao re-init is idempotent (init.sh handles already-initialized state)
- Image rebuilds are safe (Docker layer caching)
- Backup naming: `.env.prod.backup.`

## Dependencies

- Python 3.10+ (stdlib only; no pip packages required)
- Docker Engine 24+
- Docker Compose v2
- Stdlib modules: `secrets`, `subprocess`, `shutil`, `json`, `re`, `datetime`, `pathlib`, `getpass`, `socket` (for port checks)
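As an illustration of the stdlib-only approach, the secret generation from sections 2.1 and 2.2 and the masking from section 2.6 might look like the sketch below. The function names and the `APP_USER_PASSWORD` / `POLLER_USER_PASSWORD` dictionary keys are hypothetical, and the use of standard (not URL-safe) base64 for `CREDENTIAL_ENCRYPTION_KEY` is an assumption; the spec only says `base64(secrets.token_bytes(32))`.

```python
import base64
import secrets


def generate_secrets():
    """Generate the values the spec marks as auto-generated (key names are illustrative)."""
    return {
        # 24 random bytes encode to exactly 32 URL-safe base64 characters.
        "APP_USER_PASSWORD": secrets.token_urlsafe(24),
        "POLLER_USER_PASSWORD": secrets.token_urlsafe(24),
        # 64 random bytes encode to 86 characters once padding is stripped.
        "JWT_SECRET_KEY": secrets.token_urlsafe(64),
        # 32 random bytes encode to 44 characters including one '=' pad.
        # Assumption: standard base64 alphabet rather than the URL-safe variant.
        "CREDENTIAL_ENCRYPTION_KEY": base64.b64encode(secrets.token_bytes(32)).decode("ascii"),
    }


def mask(value, visible=8):
    """Return the first `visible` characters followed by '...' for the summary screen."""
    return value[:visible] + "..."
```

Because `secrets.token_urlsafe` emits only letters, digits, `-`, and `_`, the generated `app_user` and `poller_user` passwords can be embedded in the database URLs without percent-encoding.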