diff --git a/docs/plans/2026-05-24-second-environment-design.md b/docs/plans/2026-05-24-second-environment-design.md new file mode 100644 index 0000000..5d909f5 --- /dev/null +++ b/docs/plans/2026-05-24-second-environment-design.md @@ -0,0 +1,276 @@ +# Second Docker Environment (`env2`) — Design + +**Date:** 2026-05-24 +**Status:** Approved — ready for implementation plan +**Purpose:** Stand up a second, concurrently-running ScadaLink cluster on the same machine so the new Transport (#24) feature can be exercised end-to-end against a real second environment (export from one UI, import into the other). + +## Goal + +A sibling `docker-env2/` directory with `deploy.sh` / `teardown.sh` / `seed-sites.sh` / `init-db.sh` that brings up a minimal but fully-functional second cluster — its own central + site, its own ConfigurationDB — alongside the existing `docker/` stack. Both environments run concurrently and share the commodity infra services (MSSQL container, LDAP, SMTP, OPC UA, REST API). No application code changes; this is purely deploy tooling. + +## Non-Goals + +- Not a fully air-gapped twin (LDAP/SMTP/OPC UA/REST API are shared). +- Not a full mirror of primary's three-site topology — env2 has one site (`site-x`). +- Not a multi-tenant abstraction or `--env` flag retrofit on `docker/deploy.sh` — kept as two independent script trees for clarity. +- No new automated tests — env2 enables manual verification via [`2026-05-24-second-environment-verification.md`](2026-05-24-second-environment-verification.md) (created during implementation). 
+ +## Architecture Overview + +``` + (host machine) + + Primary stack (already existing — unchanged) Env2 stack (new) + ┌────────────────────────────────────┐ ┌──────────────────────────────┐ + │ Traefik :9000 ◄── 9001/9002 UI │ │ Traefik :9100 ◄── 9101/9102 UI│ + │ Central A/B (9011/9012 Akka) │ │ Central A/B (9111/9112 Akka) │ + │ Site-A/B/C (9021..9044) │ │ Site-X (9121/9122 Akka, │ + └─────────────┬──────────────────────┘ │ 9123/9124 gRPC) │ + │ └──────────┬───────────────────┘ + │ │ + ▼ scadalink-net (shared bridge network) ◄──────┘ + ┌──────────────────────────────────────────────────────────────┐ + │ scadalink-mssql ScadaLinkConfig (primary DB) │ + │ ScadaLinkMachineData (primary DB) │ + │ ScadaLinkConfig2 (env2 DB) ← new │ + │ ScadaLinkMachineData2(env2 DB) ← new │ + │ scadalink-ldap (shared — same test users) │ + │ scadalink-smtp (shared Mailpit) │ + │ scadalink-opcua (shared) │ + │ scadalink-restapi (shared) │ + └──────────────────────────────────────────────────────────────┘ +``` + +Both stacks attach to the same `scadalink-net` Docker bridge so env2's app containers can reach the infra services by container hostname (`scadalink-mssql`, `scadalink-ldap`, etc.). Akka clusters are independent — each side's `SeedNodes` lists only its own central nodes, so they never gossip-merge despite sharing the network. + +## Topology & Port Allocation + +| Role | Container name | Host Web | Host Akka | Host gRPC | Notes | +|----------------|-----------------------------|----------|-----------|-----------|-------| +| Traefik LB | `scadalink-env2-traefik` | 9100 | — | — | Dashboard on host 8181 | +| Central A | `scadalink-env2-central-a` | 9101 | 9111 | — | | +| Central B | `scadalink-env2-central-b` | 9102 | 9112 | — | | +| Site-X A | `scadalink-env2-site-x-a` | — | 9121 | 9123 | | +| Site-X B | `scadalink-env2-site-x-b` | — | 9122 | 9124 | | + +Pattern: env2 host ports are primary + 100 (e.g. primary central-a 9001 → env2 central-a 9101). Confirmed free at design time. 
Identifier `site-x` distinguishes env2's single site from primary's `site-a/b/c` in logs/UI (technically not required — each central has its own ConfigurationDB — but clearer for operators). + +## Infrastructure & Databases + +**Shared `scadalink-net` + shared `scadalink-mssql` container, separate logical databases:** + +- New databases: `ScadaLinkConfig2`, `ScadaLinkMachineData2`. +- Reuse the existing `scadalink_app` SQL login with `db_owner` on both — one credential to manage. +- DB creation handled by a new `infra/mssql/setup-env2.sql` (idempotent, `IF NOT EXISTS`-guarded). +- Two activation paths: +   1. **Fresh MSSQL volume** — mount `setup-env2.sql` alongside the existing `setup.sql` in `/docker-entrypoint-initdb.d/` so it runs automatically on first startup. +   2. **Already-running MSSQL** — `docker-env2/init-db.sh` runs `sqlcmd` inside the container via `docker exec` to apply the same script. No MSSQL restart is needed. +- EF Core migrations auto-apply on env2 central startup (matches the primary's `Development` env var pattern), so tables are created on first deploy. + +**Reused as-is:** LDAP (same `multi-role`/`admin`/`designer`/`deployer` test users), SMTP (Mailpit — env2's emails appear in the same inbox at http://localhost:8025, distinguishable by env2's `FromAddress`), OPC UA, REST API. These are all commodity services; duplicating them would gain no meaningful isolation. 
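To make the "idempotent, `IF NOT EXISTS`-guarded" requirement concrete, here is a minimal sketch of what `setup-env2.sql` might contain, expressed as a bash generator so it stays in the same language as the lifecycle scripts. The helper names `emit_create_db_sql` and `emit_env2_setup_sql` are hypothetical, and the real script may be structured differently; granting `db_owner` to `scadalink_app` is deliberately left out of the sketch.

```bash
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not part of the design): prints one guarded
# CREATE DATABASE batch. Re-running the emitted SQL is a no-op when
# the database already exists.
emit_create_db_sql() {
  local db="$1"
  cat <<SQL
IF NOT EXISTS (SELECT 1 FROM sys.databases WHERE name = N'${db}')
    CREATE DATABASE [${db}];
GO
SQL
}

# Emit one guarded batch per env2 database named in this design.
emit_env2_setup_sql() {
  local db
  for db in ScadaLinkConfig2 ScadaLinkMachineData2; do
    emit_create_db_sql "$db"
  done
}

emit_env2_setup_sql
```

The output could be redirected to a file and applied through `sqlcmd` the same way on either activation path; because each batch is guarded, applying it twice should leave the databases unchanged.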
+ +Connection strings in env2 central appsettings: +``` +ConfigurationDb: Server=scadalink-mssql,1433;Database=ScadaLinkConfig2;User Id=scadalink_app;Password=ScadaLink_Dev1#;TrustServerCertificate=true +MachineDataDb: Server=scadalink-mssql,1433;Database=ScadaLinkMachineData2;User Id=scadalink_app;Password=ScadaLink_Dev1#;TrustServerCertificate=true +``` + +## Directory Layout + +``` +docker-env2/ +├── docker-compose.yml # 5 services: 2 central + 2 site + Traefik +├── deploy.sh # build (reuses docker/build.sh) + init-db + compose up +├── teardown.sh # compose down (preserves data + logs) +├── seed-sites.sh # CLI creates site-x against http://localhost:9100 +├── init-db.sh # sqlcmd exec against scadalink-mssql +├── central-node-a/ +│ └── appsettings.Central.json +├── central-node-b/ +│ └── appsettings.Central.json +├── site-x-node-a/ +│ ├── appsettings.Site.json +│ ├── data/ # gitignored — SQLite store-and-forward + site-event-log +│ └── logs/ # gitignored +├── site-x-node-b/ +│ ├── appsettings.Site.json +│ ├── data/ +│ └── logs/ +└── traefik/ + ├── traefik.yml # dashboard on :8080 (host 8181) + └── dynamic.yml # service URLs → scadalink-env2-central-a/b +``` + +Mirrors `docker/`'s shape exactly so operator muscle memory carries over. + +## Per-Node Appsettings — Key Differences from Primary + +Each env2 appsettings file is a near-clone of the primary equivalent with these targeted overrides: + +| Field | Primary | Env2 | +|--------------------------------------------|-----------------------------------------------------|-----------------------------------------------------| +| `Node.NodeHostname` | `scadalink-central-a` / `scadalink-site-a-a` / ... | `scadalink-env2-central-a` / `scadalink-env2-site-x-a` / ... 
| +| `Cluster.SeedNodes` | primary central hostnames | env2 central hostnames | +| `Communication.CentralContactPoints` (site)| primary central hostnames | env2 central hostnames | +| `Node.SiteId` (site) | `site-a` / `site-b` / `site-c` | `site-x` | +| `Database.ConfigurationDb` | `ScadaLinkConfig` | `ScadaLinkConfig2` | +| `Database.MachineDataDb` | `ScadaLinkMachineData` | `ScadaLinkMachineData2` | +| `Notification.FromAddress` | `scada-notifications@company.com` | `scada-notifications-env2@company.com` | +| `Security.JwtSigningKey` | primary signing key | a distinct env2 signing key | +| `Transport.SourceEnvironment` | `docker-cluster` | `docker-cluster-env2` | + +`Transport.SourceEnvironment` is the field that ends up stamped into exported bundle manifests, so a bundle visibly self-identifies which environment produced it. + +## Lifecycle Scripts + +### `docker-env2/deploy.sh` +```bash +#!/bin/bash +set -euo pipefail +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" + +echo "=== ScadaLink Env2 Docker Deploy ===" + +# Reuse the primary build (same scadalink:latest image) +"$SCRIPT_DIR/../docker/build.sh" + +# Ensure env2 databases exist on the shared scadalink-mssql +"$SCRIPT_DIR/init-db.sh" + +echo "Deploying env2 containers..." +docker compose -f "$SCRIPT_DIR/docker-compose.yml" up -d --force-recreate +docker compose -f "$SCRIPT_DIR/docker-compose.yml" ps + +echo "Access points:" +echo " Central (Traefik LB): http://localhost:9100" +echo " Central UI (node A): http://localhost:9101" +echo " Central UI (node B): http://localhost:9102" +echo " Traefik dashboard: http://localhost:8181" +echo "Seed site: docker-env2/seed-sites.sh" +``` + +### `docker-env2/init-db.sh` +```bash +#!/bin/bash +set -euo pipefail +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" + +if ! docker ps --format '{{.Names}}' | grep -q '^scadalink-mssql$'; then + echo "ERROR: scadalink-mssql is not running. 
Start it: cd infra && docker compose up -d" >&2 + exit 1 +fi + +echo "Applying env2 database setup..." +docker exec -i scadalink-mssql /opt/mssql-tools18/bin/sqlcmd \ + -S localhost -U sa -P 'ScadaLink_Dev1#' -C \ + < "$SCRIPT_DIR/../infra/mssql/setup-env2.sql" + +echo "Env2 databases ready." +``` + +### `docker-env2/seed-sites.sh` +```bash +#!/bin/bash +set -euo pipefail +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" +CLI="dotnet run --project $PROJECT_ROOT/src/ScadaLink.CLI --" +AUTH="--username multi-role --password password" +URL="--url http://localhost:9100" + +echo "Creating Site-X on env2..." +$CLI $URL $AUTH site create \ + --name "Env2 Site X" \ + --identifier "site-x" \ + --description "Env2 test site - two-node cluster" \ + --node-a-address "akka.tcp://scadalink@scadalink-env2-site-x-a:8082" \ + --node-b-address "akka.tcp://scadalink@scadalink-env2-site-x-b:8082" \ + --grpc-node-a-address "http://scadalink-env2-site-x-a:8083" \ + --grpc-node-b-address "http://scadalink-env2-site-x-b:8083" \ +|| echo " (Site-X may already exist)" +``` + +### `docker-env2/teardown.sh` +```bash +#!/bin/bash +set -euo pipefail +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +docker compose -f "$SCRIPT_DIR/docker-compose.yml" down +``` + +### Operator Workflow + +| Action | Command(s) | +|----------------------------------|-----------| +| First-time env2 bring-up | `bash docker-env2/deploy.sh && bash docker-env2/seed-sites.sh` | +| Iterate on env2 after code edit | `bash docker-env2/deploy.sh` | +| Iterate on both envs | `bash docker/deploy.sh && bash docker-env2/deploy.sh` (build cached on 2nd) | +| Wipe env2 DB for clean re-import | `docker exec scadalink-mssql sqlcmd ... DROP DATABASE ScadaLinkConfig2; DROP DATABASE ScadaLinkMachineData2;` then `bash docker-env2/deploy.sh` | +| Stop env2 only | `bash docker-env2/teardown.sh` | + +## Transport Testing Workflow — The Whole Point of env2 + +**Golden-path demo:** + +1. 
Set up primary with at least a few templates + one deployed instance. +2. `bash docker-env2/deploy.sh && bash docker-env2/seed-sites.sh` — env2 ConfigurationDB is empty. +3. Browser → http://localhost:9000 → `multi-role` login → **Design → Export Bundle** → select templates → review → set passphrase → download `.scadabundle`. +4. Browser → http://localhost:9101 → log in → **Admin → Import Bundle** → upload file → enter passphrase → review diff (all "Create" rows) → confirm. +5. Verify: env2's `Design → Templates` shows imported items; `Audit → Configuration Audit Log` shows rows tagged with the matching `BundleImportId`. +6. Deploy an imported template to env2's `site-x` to prove runtime-validity end-to-end. + +**Manual tests env2 enables that mock-based tests cannot:** +- Conflict-resolution UI on re-import (Skip / Overwrite / Rename per row). +- Cross-environment audit correlation via `BundleImportId` chip. +- Schema-version gating (`SchemaVersionMajor` mismatch). +- Wrong-passphrase rejection + `MaxUnlockAttemptsPerSession=3` lockout. +- Round-trip parity: export from primary → import into env2 → export from env2 → re-import into primary with Skip-on-conflict. Revision hashes should match. + +**What env2 does NOT test:** +- Multi-site Transport scenarios (env2 has one site by design). +- Site-clustered Transport flows (Transport is central-only). +- True air-gapped network isolation (env2 shares MSSQL/LDAP/SMTP — out of scope). + +## Error Handling & Edge Cases + +- **`init-db.sh`** fails fast with a clear message if `scadalink-mssql` isn't running. +- **`deploy.sh`** runs with `set -euo pipefail` so any failed step halts cleanly. +- **MSSQL volume reset** — both the docker-entrypoint mount and the exec-based `init-db.sh` apply the same idempotent script; either path leaves env2 DBs ready. +- **Cluster cross-talk** — primary and env2 use the same Akka system name `scadalink` but disjoint seed-node hostnames, so the gossip protocols cannot merge. 
As a defensive measure, env2 appsettings are written from scratch rather than generated from the primary files with `sed`. +- **gRPC streaming** — env2 central uses container-name DNS (`http://scadalink-env2-site-x-a:8083`) for site-x streams, populated by `seed-sites.sh`. +- **Cookie/JWT bleed** — a distinct `JwtSigningKey` means a token issued by one env fails validation in the other, and the two UIs are served from different origins (`localhost:9000` vs `localhost:9100`). Browsers scope cookies by host rather than port, so the signing key is the actual barrier that keeps sessions from crossing envs. +- **Port collision** — env2's host port range `91XX` does not overlap primary's `90XX`; all 10 env2 host ports (the nine `91XX` ports plus the Traefik dashboard on 8181) were confirmed free at design time. If an operator later remaps, Compose surfaces `bind: address already in use`. + +## Testing + +**No new automated tests are added.** This is infrastructure tooling — the Transport feature already has 39 unit + 26 integration tests. The deliverable is a manual verification checklist at `docs/plans/2026-05-24-second-environment-verification.md` mirroring the Transport manual checklist, walking through the golden path from the Transport Testing Workflow section above. + +**First-deploy smoke test:** +1. `docker ps` shows 5 new `scadalink-env2-*` containers. +2. `curl http://localhost:9101/health/ready` returns green. +3. `curl http://localhost:9100/health/active` confirms Traefik routes to the active node. +4. Browser to http://localhost:9100 → `multi-role` login → Dashboard renders, Sites page is empty. +5. Run `docker-env2/seed-sites.sh` → site-x appears; health turns green within ~30s. + +## Documentation Updates + +- New: `docker-env2/README.md` — operator quick-start, copying the structure of `docker/README.md`. +- Update: `README.md` (project root) — add a "Second Environment" callout pointing to `docker-env2/README.md` with a one-sentence purpose statement. +- Update: `CLAUDE.md` — add `docker-env2/` to the Project Structure section so future sessions discover it. +- Update: `infra/README.md` — note that `setup-env2.sql` is mounted alongside `setup.sql`. + +## Out-of-Scope Future Extensions + +- A third or fourth environment by the same pattern (just bump prefix + port offset). 
+- `--env` flag retrofit on `docker/deploy.sh` if the directory duplication grows painful — not worth doing for just two environments. +- Air-gapped twin with its own MSSQL/LDAP/SMTP — straightforward extension of the same pattern if isolation requirements ever tighten. + +## Acceptance Criteria + +- [ ] `bash docker-env2/deploy.sh` brings up 5 containers cleanly on a machine where primary is already running. +- [ ] `bash docker-env2/seed-sites.sh` registers `site-x` and the site cluster reaches healthy state. +- [ ] http://localhost:9100 serves the env2 Central UI through Traefik with failover between 9101/9102. +- [ ] env2 reads/writes only `ScadaLinkConfig2` / `ScadaLinkMachineData2`; primary's DBs untouched after env2 deploy. +- [ ] `bash docker/deploy.sh && bash docker-env2/deploy.sh` succeeds in sequence; both stacks run concurrently. +- [ ] A bundle exported from primary can be imported into env2, with audit rows tagged by `BundleImportId` and visible in env2's Configuration Audit Log. +- [ ] Manual verification checklist completes end-to-end.
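The "5 containers" criterion above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the container names from the Topology & Port Allocation table; the helper `missing_env2_containers` is hypothetical and not part of the planned script set.

```bash
#!/bin/bash
set -euo pipefail

# The five env2 container names from the Topology & Port Allocation table.
EXPECTED_ENV2_CONTAINERS=(
  scadalink-env2-traefik
  scadalink-env2-central-a
  scadalink-env2-central-b
  scadalink-env2-site-x-a
  scadalink-env2-site-x-b
)

# Hypothetical helper: reads container names on stdin (one per line) and
# prints every expected env2 container that is absent from that list.
# Returns 1 if any container is missing, 0 otherwise.
missing_env2_containers() {
  local running name missing=0
  running="$(cat)"
  for name in "${EXPECTED_ENV2_CONTAINERS[@]}"; do
    # -x matches the whole line, -F treats the name as a fixed string.
    if ! grep -qxF "$name" <<<"$running"; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Intended usage on a host with Docker available:
#   docker ps --format '{{.Names}}' | missing_env2_containers
```

Reading names from stdin keeps the check independent of Docker itself, so the same function works against a saved `docker ps` listing when filling in the manual verification checklist.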