scadalink-design/docs/plans/2026-05-24-second-environment-design.md
Joseph Doherty 2fd3426fc2 docs(plans): add second environment (env2) design
Brainstorming output for a sibling docker-env2/ tree that brings up a
minimal second cluster (2 central + 1 site x 2 nodes + Traefik) on the
same machine alongside the primary docker/ stack. Shares the existing
scadalink-net network and scadalink-mssql container but uses separate
logical databases (ScadaLinkConfig2 / ScadaLinkMachineData2) so the
Transport (#24) feature can be exercised end-to-end with real
cross-environment exports and imports.
2026-05-24 07:03:02 -04:00


Second Docker Environment (env2) — Design

Date: 2026-05-24
Status: Approved, ready for implementation plan
Purpose: Stand up a second, concurrently running ScadaLink cluster on the same machine so the new Transport (#24) feature can be exercised end-to-end against a real second environment (export from one UI, import into the other).

Goal

A sibling docker-env2/ directory with deploy.sh / teardown.sh / seed-sites.sh / init-db.sh that brings up a minimal but fully-functional second cluster — its own central + site, its own ConfigurationDB — alongside the existing docker/ stack. Both environments run concurrently and share the commodity infra services (MSSQL container, LDAP, SMTP, OPC UA, REST API). No application code changes; this is purely deploy tooling.

Non-Goals

  • Not a fully air-gapped twin (LDAP/SMTP/OPC UA/REST API are shared).
  • Not a full mirror of primary's three-site topology — env2 has one site (site-x).
  • Not a multi-tenant abstraction or --env flag retrofit on docker/deploy.sh — kept as two independent script trees for clarity.
  • No new automated tests — env2 enables manual verification via 2026-05-24-second-environment-verification.md (created during implementation).

Architecture Overview

                          (host machine)

  Primary stack (already existing — unchanged)        Env2 stack (new)
  ┌────────────────────────────────────┐              ┌──────────────────────────────┐
  │ Traefik :9000  ◄── 9001/9002 UI    │              │ Traefik :9100 ◄── 9101/9102 UI│
  │ Central A/B    (9011/9012 Akka)    │              │ Central A/B  (9111/9112 Akka) │
  │ Site-A/B/C (9021..9044)            │              │ Site-X       (9121/9122 Akka, │
  └─────────────┬──────────────────────┘              │              9123/9124 gRPC)  │
                │                                     └──────────┬───────────────────┘
                │                                                │
                ▼  scadalink-net (shared bridge network)  ◄──────┘
        ┌──────────────────────────────────────────────────────────────┐
        │ scadalink-mssql        ScadaLinkConfig      (primary DB)     │
        │                        ScadaLinkMachineData (primary DB)     │
        │                        ScadaLinkConfig2     (env2 DB) ← new  │
        │                        ScadaLinkMachineData2(env2 DB) ← new  │
        │ scadalink-ldap         (shared — same test users)            │
        │ scadalink-smtp         (shared Mailpit)                      │
        │ scadalink-opcua        (shared)                              │
        │ scadalink-restapi      (shared)                              │
        └──────────────────────────────────────────────────────────────┘

Both stacks attach to the same scadalink-net Docker bridge so env2's app containers can reach the infra services by container hostname (scadalink-mssql, scadalink-ldap, etc.). Akka clusters are independent — each side's SeedNodes lists only its own central nodes, so they never gossip-merge despite sharing the network.

Topology & Port Allocation

Role        Container name            Host Web  Host Akka  Host gRPC  Notes
Traefik LB  scadalink-env2-traefik    9100      -          -          Dashboard on host 8181
Central A   scadalink-env2-central-a  9101      9111       -
Central B   scadalink-env2-central-b  9102      9112       -
Site-X A    scadalink-env2-site-x-a   -         9121       9123
Site-X B    scadalink-env2-site-x-b   -         9122       9124

Pattern: env2 host ports are primary + 100 (e.g. primary central-a 9001 → env2 central-a 9101). Confirmed free at design time. Identifier site-x distinguishes env2's single site from primary's site-a/b/c in logs/UI (technically not required — each central has its own ConfigurationDB — but clearer for operators).
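
The primary + 100 rule can be written as a one-line helper, which is handy when cross-checking a compose file against the table. This is an illustrative sketch: env2_port is a hypothetical name, not something in the repo, and the primary ports in the loop are the ones shown in the architecture diagram.

```shell
#!/bin/bash
# Hypothetical helper (not in the repo): map a primary host port to its
# env2 counterpart using the "primary + 100" rule.
env2_port() {
  echo $(( $1 + 100 ))
}

# Primary Traefik, central UI, and central Akka ports from the diagram.
for p in 9000 9001 9002 9011 9012; do
  echo "primary $p -> env2 $(env2_port "$p")"
done
```

The Traefik dashboard port (8181) is left out because the source does not say it follows the same offset.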

Infrastructure & Databases

Shared scadalink-net + shared scadalink-mssql container, separate logical databases:

  • New databases: ScadaLinkConfig2, ScadaLinkMachineData2.
  • Reuse the existing scadalink_app SQL login with db_owner on both — one credential to manage.
  • DB creation handled by a new infra/mssql/setup-env2.sql (idempotent, IF NOT EXISTS-guarded).
  • Two activation paths:
    1. Fresh MSSQL volume: mount setup-env2.sql alongside the existing setup.sql in /docker-entrypoint-initdb.d/ so it runs automatically on first startup.
    2. Already-running MSSQL: docker-env2/init-db.sh runs sqlcmd inside the container (via docker exec) to apply the same script. No MSSQL restart.
  • EF Core migrations auto-apply on env2 central startup (matches the primary's Development env var pattern) → tables created on first deploy.
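
One possible shape for the idempotent setup script is sketched below as a shell function that prints the T-SQL, so it can be redirected into infra/mssql/setup-env2.sql. The contents are an assumption, not the actual script: it assumes the scadalink_app login already exists (as stated above), uses DB_ID and sys.database_principals as the existence guards, and separates batches with GO because USE cannot target a database created earlier in the same batch. The function name emit_env2_sql is hypothetical.

```shell
#!/bin/bash
# Sketch only: print guarded T-SQL for both env2 databases.
# The real setup-env2.sql may differ; emit_env2_sql is a hypothetical name.
emit_env2_sql() {
  for db in ScadaLinkConfig2 ScadaLinkMachineData2; do
    cat <<SQL
IF DB_ID(N'$db') IS NULL
    CREATE DATABASE [$db];
GO
USE [$db];
IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name = N'scadalink_app')
    CREATE USER [scadalink_app] FOR LOGIN [scadalink_app];
ALTER ROLE db_owner ADD MEMBER [scadalink_app];
GO
SQL
  done
}

emit_env2_sql
```

Because every CREATE is guarded, both activation paths can apply the same file repeatedly; re-adding an existing role member is expected to be a no-op.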

Reused as-is: LDAP (same multi-role/admin/designer/deployer test users), SMTP (Mailpit — env2's emails appear in the same inbox at http://localhost:8025, distinguishable by env2's FromAddress), OPC UA, REST API. All stateless commodities; isolation gains nothing by duplicating them.

Connection strings in env2 central appsettings:

ConfigurationDb: Server=scadalink-mssql,1433;Database=ScadaLinkConfig2;User Id=scadalink_app;Password=ScadaLink_Dev1#;TrustServerCertificate=true
MachineDataDb:   Server=scadalink-mssql,1433;Database=ScadaLinkMachineData2;User Id=scadalink_app;Password=ScadaLink_Dev1#;TrustServerCertificate=true

Directory Layout

docker-env2/
├── docker-compose.yml              # 5 services: 2 central + 2 site + Traefik
├── deploy.sh                       # build (reuses docker/build.sh) + init-db + compose up
├── teardown.sh                     # compose down (preserves data + logs)
├── seed-sites.sh                   # CLI creates site-x against http://localhost:9100
├── init-db.sh                      # sqlcmd exec against scadalink-mssql
├── central-node-a/
│   └── appsettings.Central.json
├── central-node-b/
│   └── appsettings.Central.json
├── site-x-node-a/
│   ├── appsettings.Site.json
│   ├── data/                       # gitignored — SQLite store-and-forward + site-event-log
│   └── logs/                       # gitignored
├── site-x-node-b/
│   ├── appsettings.Site.json
│   ├── data/
│   └── logs/
└── traefik/
    ├── traefik.yml                 # dashboard on :8080 (host 8181)
    └── dynamic.yml                 # service URLs → scadalink-env2-central-a/b

Mirrors docker/'s shape exactly so operator muscle memory carries over.
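
As a starting point for implementation, the skeleton above could be created with a small helper like the following. This is a sketch under assumptions: scaffold_env2 is a hypothetical name, it only creates empty directories and placeholder scripts, and it never overwrites a file that already exists.

```shell
#!/bin/bash
# Sketch: create the empty docker-env2/ skeleton under a given root directory.
# scaffold_env2 is a hypothetical helper; file contents are left to the implementer.
scaffold_env2() {
  root="$1/docker-env2"
  mkdir -p "$root/central-node-a" "$root/central-node-b" "$root/traefik"
  for node in site-x-node-a site-x-node-b; do
    # data/ and logs/ are the gitignored runtime directories.
    mkdir -p "$root/$node/data" "$root/$node/logs"
  done
  for script in deploy.sh teardown.sh seed-sites.sh init-db.sh; do
    # Create a placeholder only if the script is not already there.
    [ -e "$root/$script" ] || : > "$root/$script"
    chmod +x "$root/$script"
  done
}

# Usage (from the repository root): scaffold_env2 .
```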

Per-Node Appsettings — Key Differences from Primary

Each env2 appsettings file is a near-clone of the primary equivalent with these targeted overrides:

Field                                      Primary                                         Env2
Node.NodeHostname                          scadalink-central-a / scadalink-site-a-a / ...  scadalink-env2-central-a / scadalink-env2-site-x-a / ...
Cluster.SeedNodes                          primary central hostnames                       env2 central hostnames
Communication.CentralContactPoints (site)  primary central hostnames                       env2 central hostnames
Node.SiteId (site)                         site-a / site-b / site-c                        site-x
Database.ConfigurationDb                   ScadaLinkConfig                                 ScadaLinkConfig2
Database.MachineDataDb                     ScadaLinkMachineData                            ScadaLinkMachineData2
Notification.FromAddress                   scada-notifications@company.com                 scada-notifications-env2@company.com
Security.JwtSigningKey                     primary signing key                             a distinct env2 signing key
Transport.SourceEnvironment                docker-cluster                                  docker-cluster-env2

Transport.SourceEnvironment is the field that ends up stamped into exported bundle manifests, so a bundle visibly self-identifies which environment produced it.
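
Since a stray primary hostname or database name in an env2 file would quietly point env2 at the wrong cluster, a grep-based check over the env2 appsettings can catch copy-paste leftovers. The sketch below is an assumption about how such a check might look; check_env2_appsettings is a hypothetical helper, and it relies only on the naming rules in the table (env2 hostnames carry the scadalink-env2- prefix, env2 databases end in 2).

```shell
#!/bin/bash
# Sketch: flag env2 appsettings files that still mention primary-only values.
# check_env2_appsettings is a hypothetical helper, not part of the repo.
check_env2_appsettings() {
  rc=0
  for f in "$@"; do
    # Primary hostnames look like scadalink-central-a or scadalink-site-a-a;
    # env2 names start with scadalink-env2-, so they do not match this pattern.
    if grep -Eq 'scadalink-(central|site)-' "$f"; then
      echo "$f: references a primary hostname" >&2
      rc=1
    fi
    # Primary database names lack the trailing 2.
    if grep -Eq 'ScadaLink(Config|MachineData)([^2]|$)' "$f"; then
      echo "$f: references a primary database" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage: check_env2_appsettings docker-env2/*/appsettings.*.json
```

Shared infra hostnames such as scadalink-mssql are deliberately not flagged, because env2 is meant to reference them.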

Lifecycle Scripts

docker-env2/deploy.sh

#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

echo "=== ScadaLink Env2 Docker Deploy ==="

# Reuse the primary build (same scadalink:latest image)
"$SCRIPT_DIR/../docker/build.sh"

# Ensure env2 databases exist on the shared scadalink-mssql
"$SCRIPT_DIR/init-db.sh"

echo "Deploying env2 containers..."
docker compose -f "$SCRIPT_DIR/docker-compose.yml" up -d --force-recreate
docker compose -f "$SCRIPT_DIR/docker-compose.yml" ps

echo "Access points:"
echo "  Central (Traefik LB): http://localhost:9100"
echo "  Central UI (node A):  http://localhost:9101"
echo "  Central UI (node B):  http://localhost:9102"
echo "  Traefik dashboard:    http://localhost:8181"
echo "Seed site:  docker-env2/seed-sites.sh"

docker-env2/init-db.sh

#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

if ! docker ps --format '{{.Names}}' | grep -q '^scadalink-mssql$'; then
  echo "ERROR: scadalink-mssql is not running. Start it: cd infra && docker compose up -d" >&2
  exit 1
fi

echo "Applying env2 database setup..."
# -b makes sqlcmd exit non-zero on a SQL error, so set -e can halt the script
docker exec -i scadalink-mssql /opt/mssql-tools18/bin/sqlcmd \
    -S localhost -U sa -P 'ScadaLink_Dev1#' -C -b \
    < "$SCRIPT_DIR/../infra/mssql/setup-env2.sql"

echo "Env2 databases ready."

docker-env2/seed-sites.sh

#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
# Arrays keep each argument intact even if PROJECT_ROOT contains spaces
CLI=(dotnet run --project "$PROJECT_ROOT/src/ScadaLink.CLI" --)
AUTH=(--username multi-role --password password)
URL=(--url http://localhost:9100)

echo "Creating Site-X on env2..."
"${CLI[@]}" "${URL[@]}" "${AUTH[@]}" site create \
    --name "Env2 Site X" \
    --identifier "site-x" \
    --description "Env2 test site - two-node cluster" \
    --node-a-address "akka.tcp://scadalink@scadalink-env2-site-x-a:8082" \
    --node-b-address "akka.tcp://scadalink@scadalink-env2-site-x-b:8082" \
    --grpc-node-a-address "http://scadalink-env2-site-x-a:8083" \
    --grpc-node-b-address "http://scadalink-env2-site-x-b:8083" \
|| echo "  (Site-X may already exist)"

docker-env2/teardown.sh

#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
docker compose -f "$SCRIPT_DIR/docker-compose.yml" down

Operator Workflow

Action                            Command(s)
First-time env2 bring-up          bash docker-env2/deploy.sh && bash docker-env2/seed-sites.sh
Iterate on env2 after code edit   bash docker-env2/deploy.sh
Iterate on both envs              bash docker/deploy.sh && bash docker-env2/deploy.sh (build cached on 2nd)
Wipe env2 DB for clean re-import  docker exec scadalink-mssql sqlcmd ... DROP DATABASE ScadaLinkConfig2; DROP DATABASE ScadaLinkMachineData2; then bash docker-env2/deploy.sh
Stop env2 only                    bash docker-env2/teardown.sh

Transport Testing Workflow — The Whole Point of env2

Golden-path demo:

  1. Set up primary with at least a few templates + one deployed instance.
  2. bash docker-env2/deploy.sh && bash docker-env2/seed-sites.sh — env2 ConfigurationDB is empty.
  3. Browser → http://localhost:9000 → multi-role login → Design → Export Bundle → select templates → review → set passphrase → download .scadabundle.
  4. Browser → http://localhost:9101 → log in → Admin → Import Bundle → upload file → enter passphrase → review diff (all "Create" rows) → confirm.
  5. Verify: env2's Design → Templates shows imported items; Audit → Configuration Audit Log shows rows tagged with the matching BundleImportId.
  6. Deploy an imported template to env2's site-x to prove runtime-validity end-to-end.

Manual tests env2 enables that mock-based tests cannot:

  • Conflict-resolution UI on re-import (Skip / Overwrite / Rename per row).
  • Cross-environment audit correlation via BundleImportId chip.
  • Schema-version gating (SchemaVersionMajor mismatch).
  • Wrong-passphrase rejection + MaxUnlockAttemptsPerSession=3 lockout.
  • Round-trip parity: export from primary → import into env2 → export from env2 → re-import into primary with Skip-on-conflict. Revision hashes should match.

What env2 does NOT test:

  • Multi-site Transport scenarios (env2 has one site by design).
  • Site-clustered Transport flows (Transport is central-only).
  • True air-gapped network isolation (env2 shares MSSQL/LDAP/SMTP — out of scope).

Error Handling & Edge Cases

  • init-db.sh fails fast with a clear message if scadalink-mssql isn't running.
  • deploy.sh runs with set -euo pipefail so any failed step halts cleanly.
  • MSSQL volume reset — both the docker-entrypoint mount and the exec-based init-db.sh apply the same idempotent script; either path leaves env2 DBs ready.
  • Cluster cross-talk — primary and env2 use the same Akka system name scadalink but disjoint seed-node hostnames, so the gossip protocols cannot merge. Defensive: env2 appsettings are written from scratch, not sed'd from primary.
  • gRPC streaming — env2 central uses container-name DNS (http://scadalink-env2-site-x-a:8083) for site-x streams, populated by seed-sites.sh.
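  • Cookie/JWT bleed: browsers scope cookies by host, not port, so a cookie set by localhost:9000 can also be sent to localhost:9100. The distinct JwtSigningKey is what keeps the envs apart: a token issued by one env fails validation in the other, so sessions cannot cross envs.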
  • Port collision — host port range 91XX non-overlapping with primary's 90XX; confirmed all 10 ports free at design time. If an operator later remaps, Compose surfaces bind: address already in use.
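
The "confirmed free" check can be repeated before a deploy with a small filter over the host's listening ports. The sketch below is illustrative: colliding_ports is a hypothetical helper that reads one port number per line on stdin and prints any that match the ten env2 host ports listed in this document.

```shell
#!/bin/bash
# The ten env2 host ports named in this document.
ENV2_PORTS="8181 9100 9101 9102 9111 9112 9121 9122 9123 9124"

# Sketch: read listening port numbers (one per line) on stdin and print any
# that collide with env2's ports. colliding_ports is a hypothetical helper.
colliding_ports() {
  while read -r port; do
    for p in $ENV2_PORTS; do
      [ "$port" = "$p" ] && echo "$port"
    done
  done
  return 0
}

# Example with fixed input; 8080 is not an env2 port, so it is filtered out.
printf '9100\n8080\n9124\n' | colliding_ports
```

On a Linux host the input could come from something like `ss -ltnH | awk '{print $4}' | sed 's/.*://'`. That pipeline is an assumption about the operator's machine; other platforms need a different source of listening ports.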

Testing

No new automated tests are added. This is infrastructure tooling, and the Transport feature already has 39 unit + 26 integration tests. The deliverable is a manual verification checklist at docs/plans/2026-05-24-second-environment-verification.md that mirrors the Transport manual checklist and walks through the golden path described under Transport Testing Workflow.

First-deploy smoke test:

  1. docker ps shows 5 new scadalink-env2-* containers.
  2. curl http://localhost:9101/health/ready returns green.
  3. curl http://localhost:9100/health/active → Traefik routes to active node.
  4. Browser to http://localhost:9100 → multi-role login → Dashboard renders, Sites page is empty.
  5. Run docker-env2/seed-sites.sh → site-x appears; health turns green within ~30s.
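
The scriptable parts of the smoke test (steps 1 to 3) can be wrapped in a retry loop, since containers take a moment to become ready. The helper below is a sketch: wait_until is a hypothetical name, and treating an HTTP success from curl -f as "green" is an assumption about the health endpoints.

```shell
#!/bin/bash
# Sketch: retry a command until it succeeds or the attempts run out.
# wait_until is a hypothetical helper; usage: wait_until ATTEMPTS DELAY_SECONDS CMD...
wait_until() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example wiring (commented out because it needs the running env2 stack):
# [ "$(docker ps --filter name=scadalink-env2- --format '{{.Names}}' | wc -l)" -eq 5 ]
# wait_until 30 1 curl -fsS http://localhost:9101/health/ready
# wait_until 30 1 curl -fsS http://localhost:9100/health/active
```

Steps 4 and 5 stay manual, since they depend on what the UI renders.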

Documentation Updates

  • New: docker-env2/README.md — operator quick-start, copying the structure of docker/README.md.
  • Update: README.md (project root) — add a "Second Environment" callout pointing to docker-env2/README.md with a one-sentence purpose statement.
  • Update: CLAUDE.md — add docker-env2/ to the Project Structure section so future sessions discover it.
  • Update: infra/README.md — note that setup-env2.sql is mounted alongside setup.sql.

Out-of-Scope Future Extensions

  • A third / fourth environment by the same pattern (just bump prefix + port offset).
  • --env flag retrofit on docker/deploy.sh if the directory duplication grows painful — not worth doing for just two environments.
  • Air-gapped twin with its own MSSQL/LDAP/SMTP — straightforward extension of the same pattern if isolation requirements ever tighten.

Acceptance Criteria

  • bash docker-env2/deploy.sh brings up 5 containers cleanly on a machine where primary is already running.
  • bash docker-env2/seed-sites.sh registers site-x and the site cluster reaches healthy state.
  • http://localhost:9100 serves the env2 Central UI through Traefik with failover between 9101/9102.
  • env2 reads/writes only ScadaLinkConfig2 / ScadaLinkMachineData2; primary's DBs untouched after env2 deploy.
  • bash docker/deploy.sh && bash docker-env2/deploy.sh succeeds in sequence; both stacks run concurrently.
  • A bundle exported from primary can be imported into env2, with audit rows tagged by BundleImportId and visible in env2's Configuration Audit Log.
  • Manual verification checklist completes end-to-end.