# TOD Production Setup Script — Design Spec

## Overview

An interactive Python setup wizard (`setup.py`) that walks a sysadmin through configuring and deploying TOD (The Other Dude) for production. The script minimizes manual configuration by auto-generating secrets, capturing OpenBao credentials automatically, building images sequentially, and verifying service health.

**Target audience:** Technical sysadmins unfamiliar with this specific project.

## Design Decisions

- **Python 3.10+** — already required by the stack, enables rich input handling and colored output.
- **Linear wizard with opinionated defaults** — grouped sections, auto-generate everything possible, only prompt for genuine human decisions.
- **Integrated OpenBao bootstrap** — script starts the OpenBao container, captures unseal key and root token, updates `.env.prod` automatically (no manual copy-paste).
- **Sequential image builds** — builds api, poller, frontend, winbox-worker one at a time to avoid OOM on low-RAM machines.
- **Re-runnable** — safe to run again; detects existing `.env.prod` and offers to overwrite, back up (`.env.prod.backup.<ISO timestamp>`), or abort.

## Prerequisite: Database Rename

The codebase currently uses `mikrotik` as the database name. Before the setup script can use `tod`, these files must be updated:

- `docker-compose.yml` — default `POSTGRES_DB` and healthcheck (`pg_isready -d`)
- `docker-compose.prod.yml` — hardcoded poller `DATABASE_URL` (change to `${POLLER_DATABASE_URL}`)
- `docker-compose.staging.yml` — if applicable
- `scripts/init-postgres.sql` — `GRANT CONNECT ON DATABASE` statements
- `.env.example` — all URL references

The setup script will use `POSTGRES_DB=tod`. These file changes are made once as part of the implementation; the script does not perform them at runtime.

The `docker-compose.prod.yml` change is required for the wizard to work at all: the poller's hardcoded `DATABASE_URL` must become `DATABASE_URL: ${POLLER_DATABASE_URL}`, otherwise the value generated by the setup script is ignored.

## Script Flow

### Phase 1: Pre-flight Checks

- Verify Python 3.10+
- Verify Docker Engine and Docker Compose v2 are installed and the daemon is running
- Check for existing `.env.prod` — if found, offer: overwrite / back up and create new / abort
- Warn if less than 4 GB of RAM is available
- Check whether key ports are in use (5432, 6379, 4222, 8001, 3000, 51820) and warn

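A minimal sketch of how some of the pre-flight checks above might look with only the standard library. The function names are illustrative and not taken from `setup.py`. The RAM warning and the `.env.prod` prompt are left out, and the port probe only attempts a TCP bind, so a UDP listener (51820 is presumably WireGuard, which runs over UDP) would not be detected by it.

```python
import shutil
import socket
import subprocess
import sys

# Ports listed in the pre-flight section of this spec.
PORTS_TO_CHECK = (5432, 6379, 4222, 8001, 3000, 51820)


def python_ok(version=sys.version_info) -> bool:
    """True if the interpreter is Python 3.10 or newer."""
    return tuple(version[:2]) >= (3, 10)


def docker_ok() -> bool:
    """True if the docker CLI exists, the daemon answers, and Compose v2 is present."""
    if shutil.which("docker") is None:
        return False
    for cmd in (["docker", "info"], ["docker", "compose", "version"]):
        if subprocess.run(cmd, capture_output=True).returncode != 0:
            return False
    return True


def port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Try to bind the TCP port; a failed bind is treated as 'already in use'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return True
    return False


def busy_ports() -> list[int]:
    """Ports from the spec's list that could not be bound (warn, do not abort)."""
    return [p for p in PORTS_TO_CHECK if port_in_use(p)]
```

Because a busy port only produces a warning in this spec, `busy_ports()` returns a list for the caller to print rather than raising.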
### Phase 2: Interactive Configuration (Linear Wizard)

Six sections, presented in order:

#### 2.1 Database

| Prompt | Default | Notes |
|--------|---------|-------|
| PostgreSQL superuser password | (required, no default) | Validated non-empty, min 12 chars |

Auto-generated:

- `POSTGRES_DB=tod`
- `app_user` password via `secrets.token_urlsafe(24)` (yields 32 URL-safe base64 chars)
- `poller_user` password via `secrets.token_urlsafe(24)` (yields 32 URL-safe base64 chars)
- `DATABASE_URL=postgresql+asyncpg://postgres:<pw>@postgres:5432/tod`
- `SYNC_DATABASE_URL=postgresql+psycopg2://postgres:<pw>@postgres:5432/tod`
- `APP_USER_DATABASE_URL=postgresql+asyncpg://app_user:<app_pw>@postgres:5432/tod`
- `POLLER_DATABASE_URL=postgres://poller_user:<poller_pw>@postgres:5432/tod`

#### 2.2 Security

No prompts. Auto-generated:

- `JWT_SECRET_KEY` via `secrets.token_urlsafe(64)` (yields 86 URL-safe base64 chars)
- `CREDENTIAL_ENCRYPTION_KEY` via `base64(secrets.token_bytes(32))` (yields 44 base64 chars)
- Display both values to the user with a "save these somewhere safe" note

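In Python the two keys above come down to a couple of standard-library calls. The helper name `generate_security_keys` is illustrative, and standard (not URL-safe) base64 is assumed for the encryption key.

```python
import base64
import secrets


def generate_security_keys() -> dict[str, str]:
    """Generate the two secrets described in section 2.2 (hypothetical helper)."""
    return {
        # 64 random bytes, URL-safe base64 without padding -> 86 characters
        "JWT_SECRET_KEY": secrets.token_urlsafe(64),
        # 32 random bytes, standard base64 with padding -> 44 characters
        "CREDENTIAL_ENCRYPTION_KEY": base64.b64encode(
            secrets.token_bytes(32)
        ).decode("ascii"),
    }
```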
#### 2.3 Admin Account

| Prompt | Default | Notes |
|--------|---------|-------|
| Admin email | `admin@the-other-dude.dev` | Validated as email-like |
| Admin password | (enter or press Enter to generate) | Min 12 chars if manual; generated passwords are 24 chars |

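One way the validation rules in this table could be expressed. This is a sketch under assumptions: the spec only says "email-like", so the regex here is deliberately loose, and using `secrets.token_urlsafe(18)` to obtain a 24-character password is a choice made for this example rather than something the spec prescribes.

```python
import re
import secrets

# Loose "email-like" check: something@something.tld, with no spaces or extra "@".
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_email_like(value: str) -> bool:
    return EMAIL_RE.match(value) is not None


def resolve_admin_password(entered: str) -> str:
    """Return the entered password, or generate one if the user pressed Enter."""
    if not entered:
        return secrets.token_urlsafe(18)  # 18 bytes -> 24 characters
    if len(entered) < 12:
        raise ValueError("admin password must be at least 12 characters")
    return entered
```

In the wizard itself the password would presumably be read with `getpass.getpass()` so it is not echoed to the terminal.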
#### 2.4 Email (Optional)

| Prompt | Default | Notes |
|--------|---------|-------|
| Configure SMTP now? | No | If no, skip with reminder |
| SMTP host | (required if yes) | |
| SMTP port | 587 | |
| SMTP username | (optional) | |
| SMTP password | (optional) | |
| From address | (required if yes) | |
| Use TLS? | Yes | |

#### 2.5 Web / Domain

| Prompt | Default | Notes |
|--------|---------|-------|
| Production domain | (required) | e.g. `tod.staack.com` |

Auto-derived:

- `APP_BASE_URL=https://<domain>`
- `CORS_ORIGINS=https://<domain>`

#### 2.6 Summary & Confirmation

Display all settings grouped by section. Secrets are partially masked (first 8 chars + `...`). Ask for confirmation before writing.

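The masking rule is small enough to show in full. `mask_secret` is a hypothetical name, and hiding values of 8 characters or fewer entirely is an assumption added here so that a short secret is never displayed whole.

```python
def mask_secret(value: str, visible: int = 8) -> str:
    """Show the first `visible` characters followed by '...'."""
    if len(value) <= visible:
        # Assumption: never reveal a short secret in full.
        return "..."
    return value[:visible] + "..."
```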
### Phase 3: Write `.env.prod`

Write the file with section comments and a timestamp header. Also generate `scripts/init-postgres-prod.sql` with the generated `app_user` and `poller_user` passwords baked in (PostgreSQL init scripts don't support env var substitution).

Format:

```bash
# ============================================================
# TOD Production Environment — generated by setup.py
# Generated: <ISO timestamp>
# ============================================================

# --- Database ---
POSTGRES_DB=tod
POSTGRES_USER=postgres
POSTGRES_PASSWORD=<input>
DATABASE_URL=postgresql+asyncpg://postgres:<pw>@postgres:5432/tod
SYNC_DATABASE_URL=postgresql+psycopg2://postgres:<pw>@postgres:5432/tod
APP_USER_DATABASE_URL=postgresql+asyncpg://app_user:<app_pw>@postgres:5432/tod
POLLER_DATABASE_URL=postgres://poller_user:<poller_pw>@postgres:5432/tod

# --- Security ---
JWT_SECRET_KEY=<generated>
CREDENTIAL_ENCRYPTION_KEY=<generated>

# --- OpenBao (KMS) ---
OPENBAO_ADDR=http://openbao:8200
OPENBAO_TOKEN=PLACEHOLDER_RUN_SETUP
BAO_UNSEAL_KEY=PLACEHOLDER_RUN_SETUP

# --- Admin Bootstrap ---
FIRST_ADMIN_EMAIL=<input>
FIRST_ADMIN_PASSWORD=<input-or-generated>

# --- Email ---
# <configured block or "unconfigured" note>
SMTP_HOST=
SMTP_PORT=587
SMTP_USER=
SMTP_PASSWORD=
SMTP_USE_TLS=true
SMTP_FROM_ADDRESS=noreply@example.com

# --- Web ---
APP_BASE_URL=https://<domain>
CORS_ORIGINS=https://<domain>

# --- Application ---
ENVIRONMENT=production
LOG_LEVEL=info
DEBUG=false
APP_NAME=TOD - The Other Dude

# --- Storage ---
GIT_STORE_PATH=/data/git-store
FIRMWARE_CACHE_DIR=/data/firmware-cache
WIREGUARD_CONFIG_PATH=/data/wireguard
WIREGUARD_GATEWAY=wireguard
CONFIG_RETENTION_DAYS=90

# --- Redis & NATS ---
REDIS_URL=redis://redis:6379/0
NATS_URL=nats://nats:4222

# --- Poller ---
POLL_INTERVAL_SECONDS=60
CONNECTION_TIMEOUT_SECONDS=10
COMMAND_TIMEOUT_SECONDS=30

# --- Remote Access ---
TUNNEL_PORT_MIN=49000
TUNNEL_PORT_MAX=49100
TUNNEL_IDLE_TIMEOUT=300
SSH_RELAY_PORT=8080
SSH_IDLE_TIMEOUT=900

# --- Config Backup ---
CONFIG_BACKUP_INTERVAL=21600
CONFIG_BACKUP_MAX_CONCURRENT=10
```

### Phase 4: OpenBao Bootstrap

1. Start postgres and openbao containers only: `docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.prod up -d postgres openbao`
2. Wait for openbao container to be healthy (timeout 60s)
3. Run `docker compose logs openbao 2>&1` and parse the `OPENBAO_TOKEN=` and `BAO_UNSEAL_KEY=` lines using regex (init.sh prints these to stdout during container startup, which is captured in Docker logs)
4. Update `.env.prod` by replacing the `PLACEHOLDER_RUN_SETUP` values with the captured credentials
5. On failure: `.env.prod` retains placeholders, print instructions for manual capture via `docker compose logs openbao`

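Steps 3 and 4 might be implemented along these lines. The function names are hypothetical, and the sample log format in the usage note (a `openbao-1  | ` prefix on each line) is an assumption about what `docker compose logs` emits, not something this spec documents. Passing `--no-color` to `docker compose logs` is likely worth doing so that ANSI escape codes do not end up inside the captured values.

```python
import re

PLACEHOLDER = "PLACEHOLDER_RUN_SETUP"
CREDENTIAL_KEYS = ("OPENBAO_TOKEN", "BAO_UNSEAL_KEY")


def extract_credentials(logs: str) -> dict[str, str]:
    """Pull KEY=value pairs out of the OpenBao container logs.

    If a key is printed more than once, the last occurrence wins.
    Raises ValueError when a key is missing so the caller can fall
    back to the manual-capture instructions.
    """
    found = {}
    for key in CREDENTIAL_KEYS:
        matches = re.findall(rf"\b{key}=(\S+)", logs)
        if not matches:
            raise ValueError(f"{key} not found in OpenBao logs")
        found[key] = matches[-1]
    return found


def fill_placeholders(env_text: str, creds: dict[str, str]) -> str:
    """Replace KEY=PLACEHOLDER_RUN_SETUP lines with the captured values."""
    for key, value in creds.items():
        env_text = env_text.replace(f"{key}={PLACEHOLDER}", f"{key}={value}")
    return env_text
```

A caller would read `.env.prod`, pass its text through `fill_placeholders`, and write it back only if `extract_credentials` succeeded, which keeps the placeholders intact on failure as step 5 requires.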
### Phase 5: Build Images

Build sequentially to avoid OOM:

```
docker compose -f docker-compose.yml -f docker-compose.prod.yml build api
docker compose -f docker-compose.yml -f docker-compose.prod.yml build poller
docker compose -f docker-compose.yml -f docker-compose.prod.yml build frontend
docker compose -f docker-compose.yml -f docker-compose.prod.yml build winbox-worker
```

Show progress for each. On failure: stop, report which image failed, suggest rerunning.

### Phase 6: Start Stack

```
docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.prod up -d
```

### Phase 7: Health Check

- Poll service health for up to 60 seconds
- Report status of: postgres, redis, nats, openbao, api, poller, frontend, winbox-worker
- On success: print access URL (`https://<domain>`) and admin credentials
- On timeout: report which services are unhealthy, suggest `docker compose logs <service>`

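The polling behavior can be sketched independently of how a single service's health is actually queried. In this example `check` is a hypothetical callable supplied by the caller (for instance, something that inspects container health through the Docker CLI); the spec does not say how that lookup is done.

```python
import time

SERVICES = ["postgres", "redis", "nats", "openbao",
            "api", "poller", "frontend", "winbox-worker"]


def wait_for_health(check, services=SERVICES, timeout=60.0, interval=2.0,
                    clock=time.monotonic, sleep=time.sleep) -> list[str]:
    """Poll until every service is healthy or the timeout expires.

    `check(name)` must return True when the service is healthy.
    Returns the services still unhealthy at the end (empty list = success).
    """
    deadline = clock() + timeout
    pending = list(services)
    while True:
        # Services that have reported healthy once are not polled again.
        pending = [name for name in pending if not check(name)]
        if not pending or clock() >= deadline:
            return pending
        sleep(interval)
```

An empty result maps to the success branch (print the URL and admin credentials); a non-empty one lists exactly the services for which `docker compose logs <service>` should be suggested.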
## Database Init Script

`scripts/init-postgres.sql` hardcodes `app_password` and `poller_password`. Since PostgreSQL's `docker-entrypoint-initdb.d` scripts don't support environment variable substitution, the setup script generates `scripts/init-postgres-prod.sql` with the actual generated passwords baked in. The `docker-compose.prod.yml` volume mount will be updated to use this file instead.

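One possible way to produce the production init script is to treat the existing file as a template and substitute the two hardcoded passwords. This sketch assumes they appear in `scripts/init-postgres.sql` as the quoted SQL literals `'app_password'` and `'poller_password'`; the actual file contents are not shown in this spec, so the helper names and the template line in the usage note are hypothetical.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_init_sql(template: str, app_pw: str, poller_pw: str) -> str:
    """Swap the hardcoded password literals for the generated ones."""
    return (
        template
        .replace("'app_password'", sql_literal(app_pw))
        .replace("'poller_password'", sql_literal(poller_pw))
    )
```

For example, a template line such as `CREATE ROLE app_user LOGIN PASSWORD 'app_password';` would come out with the generated password in place of the literal. Since the output file contains live credentials, it should presumably be excluded from version control.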
## Login Page Fix

`frontend/src/routes/login.tsx` lines 235-241 contain a "First time?" hint showing `.env` credential names. This will be wrapped in `{import.meta.env.DEV && (...)}` so it only appears in development builds. Vite's production build strips DEV-gated code entirely.

## Error Handling

| Scenario | Behavior |
|----------|----------|
| Docker not installed/running | Fail early with clear message |
| Existing `.env.prod` | Offer: overwrite / back up / abort |
| Port already in use | Warn (non-blocking) with which port and likely culprit |
| OpenBao init fails | `.env.prod` retains placeholders, print manual capture steps |
| Image build fails | Stop, show failed image, suggest retry command |
| Health check timeout (60s) | Report unhealthy services, suggest log commands |
| Ctrl+C before Phase 3 | Graceful exit, no files written |
| Ctrl+C during/after Phase 3 | `.env.prod` exists (possibly with placeholders), noted on exit |

## Re-runnability

- Detects existing `.env.prod` and offers choices
- Won't regenerate secrets if valid ones exist (offers to keep or regenerate)
- OpenBao re-init is idempotent (init.sh handles already-initialized state)
- Image rebuilds are safe (Docker layer caching)
- Backup naming: `.env.prod.backup.<ISO timestamp>`

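Two small helpers illustrate how a re-run might name the backup and decide whether an existing secret is worth keeping. All names are hypothetical. The compact ISO 8601 form without colons is an assumption chosen for filesystem friendliness, and "valid" is interpreted here simply as non-empty and not the placeholder.

```python
from datetime import datetime, timezone

PLACEHOLDER = "PLACEHOLDER_RUN_SETUP"


def backup_name(name: str, now: datetime) -> str:
    """Build '<name>.backup.<timestamp>'; `now` is expected to be in UTC."""
    return f"{name}.backup.{now.strftime('%Y%m%dT%H%M%SZ')}"


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, ignoring blanks and '#' comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so base64 padding in values survives.
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def has_valid_secret(env: dict[str, str], key: str) -> bool:
    """True if the key holds a real value rather than nothing or the placeholder."""
    value = env.get(key, "")
    return bool(value) and value != PLACEHOLDER
```

At run time the timestamp would come from `datetime.now(timezone.utc)`, and the copy itself could be made with `shutil.copy2`.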
## Dependencies

- Python 3.10+ (stdlib only — no pip packages required)
- Docker Engine 24+
- Docker Compose v2
- Stdlib modules: `secrets`, `base64` (for the encryption key), `subprocess`, `shutil`, `json`, `re`, `datetime`, `pathlib`, `getpass`, `socket` (for port checks)