docs(plans): add second environment (env2) design
Brainstorming output for a sibling docker-env2/ tree that brings up a minimal second cluster (2 central + 1 site x 2 nodes + Traefik) on the same machine alongside the primary docker/ stack. Shares the existing scadalink-net network and scadalink-mssql container but uses separate logical databases (ScadaLinkConfig2 / ScadaLinkMachineData2) so the Transport (#24) feature can be exercised end-to-end with real cross-environment exports and imports.
docs/plans/2026-05-24-second-environment-design.md
@@ -0,0 +1,276 @@
# Second Docker Environment (`env2`): Design

**Date:** 2026-05-24
**Status:** Approved, ready for implementation plan
**Purpose:** Stand up a second, concurrently running ScadaLink cluster on the same machine so the new Transport (#24) feature can be exercised end-to-end against a real second environment (export from one UI, import into the other).

## Goal

A sibling `docker-env2/` directory with `deploy.sh`, `teardown.sh`, `seed-sites.sh`, and `init-db.sh` that brings up a minimal but fully functional second cluster (its own central nodes and site, its own ConfigurationDB) alongside the existing `docker/` stack. Both environments run concurrently and share the commodity infra services (MSSQL container, LDAP, SMTP, OPC UA, REST API). No application code changes; this is purely deploy tooling.

## Non-Goals

- Not a fully air-gapped twin (the MSSQL container, LDAP, SMTP, OPC UA, and REST API are shared).
- Not a full mirror of primary's three-site topology: env2 has one site (`site-x`).
- Not a multi-tenant abstraction or `--env` flag retrofit on `docker/deploy.sh`; the two environments stay as independent script trees for clarity.
- No new automated tests: env2 enables manual verification via [`2026-05-24-second-environment-verification.md`](2026-05-24-second-environment-verification.md) (created during implementation).

## Architecture Overview

```
                              (host machine)

Primary stack (already existing, unchanged)   Env2 stack (new)
┌────────────────────────────────────┐        ┌────────────────────────────────┐
│ Traefik :9000 ◄── 9001/9002 UI     │        │ Traefik :9100 ◄── 9101/9102 UI │
│ Central A/B (9011/9012 Akka)       │        │ Central A/B (9111/9112 Akka)   │
│ Site-A/B/C (9021..9044)            │        │ Site-X (9121/9122 Akka,        │
└─────────────┬──────────────────────┘        │         9123/9124 gRPC)        │
              │                               └──────────┬─────────────────────┘
              │                                          │
              ▼  scadalink-net (shared bridge network) ◄─┘
┌──────────────────────────────────────────────────────────────┐
│ scadalink-mssql    ScadaLinkConfig       (primary DB)        │
│                    ScadaLinkMachineData  (primary DB)        │
│                    ScadaLinkConfig2      (env2 DB)  ← new    │
│                    ScadaLinkMachineData2 (env2 DB)  ← new    │
│ scadalink-ldap     (shared, same test users)                 │
│ scadalink-smtp     (shared Mailpit)                          │
│ scadalink-opcua    (shared)                                  │
│ scadalink-restapi  (shared)                                  │
└──────────────────────────────────────────────────────────────┘
```

Both stacks attach to the same `scadalink-net` Docker bridge so env2's app containers can reach the infra services by container hostname (`scadalink-mssql`, `scadalink-ldap`, etc.). The Akka clusters are independent: each side's `SeedNodes` lists only its own central nodes, so they never gossip-merge despite sharing the network.

## Topology & Port Allocation

| Role       | Container name             | Host Web | Host Akka | Host gRPC | Notes                  |
|------------|----------------------------|----------|-----------|-----------|------------------------|
| Traefik LB | `scadalink-env2-traefik`   | 9100     | -         | -         | Dashboard on host 8181 |
| Central A  | `scadalink-env2-central-a` | 9101     | 9111      | -         |                        |
| Central B  | `scadalink-env2-central-b` | 9102     | 9112      | -         |                        |
| Site-X A   | `scadalink-env2-site-x-a`  | -        | 9121      | 9123      |                        |
| Site-X B   | `scadalink-env2-site-x-b`  | -        | 9122      | 9124      |                        |

Pattern: env2 host ports are primary + 100 (e.g. primary central-a 9001 → env2 central-a 9101). All were confirmed free at design time. The identifier `site-x` distinguishes env2's single site from primary's `site-a/b/c` in logs and the UI; this is not technically required, since each central has its own ConfigurationDB, but it is clearer for operators.
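
The offset rule can be sketched as a small shell helper. This is only an illustration: `ENV2_PORT_OFFSET` and `env2_port` are hypothetical names, not part of the repo, and the primary ports in the loop are the ones shown in the architecture diagram.

```bash
#!/bin/bash
# Sketch: derive an env2 host port from the corresponding primary host port.
# ENV2_PORT_OFFSET and env2_port are illustrative names (assumptions).
ENV2_PORT_OFFSET=100

env2_port() {
  # $1 = primary host port (e.g. 9001 for central-a's web UI)
  echo $(( $1 + ENV2_PORT_OFFSET ))
}

# Primary ports taken from the architecture diagram in this document.
for primary in 9000 9001 9002 9011 9012; do
  echo "primary ${primary} -> env2 $(env2_port "$primary")"
done
```

The same helper would extend to a third environment by changing the offset, which is the pattern the future-extensions section alludes to.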

## Infrastructure & Databases

**Shared `scadalink-net` + shared `scadalink-mssql` container, separate logical databases:**

- New databases: `ScadaLinkConfig2`, `ScadaLinkMachineData2`.
- Reuse the existing `scadalink_app` SQL login with `db_owner` on both, so there is one credential to manage.
- DB creation is handled by a new `infra/mssql/setup-env2.sql` (idempotent, `IF NOT EXISTS`-guarded).
- Two activation paths:
  1. **Fresh MSSQL volume:** mount `setup-env2.sql` alongside the existing `setup.sql` in `/docker-entrypoint-initdb.d/` so it runs automatically on first startup.
  2. **Already-running MSSQL:** `docker-env2/init-db.sh` runs `sqlcmd` inside the container via `docker exec` to apply the same script. No MSSQL restart.
- EF Core migrations auto-apply on env2 central startup (matching the primary's `Development` env var pattern), so tables are created on first deploy.
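
As a rough illustration of what an idempotent, `IF NOT EXISTS`-guarded setup script could contain, the sketch below prints T-SQL for each env2 database. The helper name `emit_env2_db_sql` is hypothetical and the actual contents of `setup-env2.sql` are not specified in this design; it assumes the `scadalink_app` login already exists on the server.

```bash
#!/bin/bash
# Sketch: print idempotent T-SQL for one env2 database.
# emit_env2_db_sql is a hypothetical helper; the real setup-env2.sql may differ.
emit_env2_db_sql() {
  local db="$1"
  cat <<SQL
IF NOT EXISTS (SELECT 1 FROM sys.databases WHERE name = N'${db}')
    CREATE DATABASE [${db}];
GO
USE [${db}];
GO
IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name = N'scadalink_app')
    CREATE USER [scadalink_app] FOR LOGIN [scadalink_app];
GO
ALTER ROLE db_owner ADD MEMBER [scadalink_app];
GO
SQL
}

# Emit the script for both env2 databases named in this design.
for name in ScadaLinkConfig2 ScadaLinkMachineData2; do
  emit_env2_db_sql "$name"
done
```

Because every statement is guarded or naturally repeatable, running the output twice should leave the server in the same state, which is what lets both activation paths share one script.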

**Reused as-is:** LDAP (same `multi-role`/`admin`/`designer`/`deployer` test users), SMTP (Mailpit; env2's emails appear in the same inbox at http://localhost:8025, distinguishable by env2's `FromAddress`), OPC UA, REST API. All are stateless commodities, so duplicating them would add no isolation benefit.

Connection strings in env2 central appsettings:

```
ConfigurationDb: Server=scadalink-mssql,1433;Database=ScadaLinkConfig2;User Id=scadalink_app;Password=ScadaLink_Dev1#;TrustServerCertificate=true
MachineDataDb:   Server=scadalink-mssql,1433;Database=ScadaLinkMachineData2;User Id=scadalink_app;Password=ScadaLink_Dev1#;TrustServerCertificate=true
```

## Directory Layout

```
docker-env2/
├── docker-compose.yml            # 5 services: 2 central + 2 site + Traefik
├── deploy.sh                     # build (reuses docker/build.sh) + init-db + compose up
├── teardown.sh                   # compose down (preserves data + logs)
├── seed-sites.sh                 # CLI creates site-x against http://localhost:9100
├── init-db.sh                    # sqlcmd exec against scadalink-mssql
├── central-node-a/
│   └── appsettings.Central.json
├── central-node-b/
│   └── appsettings.Central.json
├── site-x-node-a/
│   ├── appsettings.Site.json
│   ├── data/                     # gitignored: SQLite store-and-forward + site-event-log
│   └── logs/                     # gitignored
├── site-x-node-b/
│   ├── appsettings.Site.json
│   ├── data/
│   └── logs/
└── traefik/
    ├── traefik.yml               # dashboard on :8080 (host 8181)
    └── dynamic.yml               # service URLs → scadalink-env2-central-a/b
```

This mirrors the shape of `docker/` exactly, so operator muscle memory carries over.
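
For orientation, the directory skeleton above could be created with a helper along these lines. `scaffold_env2` is a hypothetical function, not a script in the repo; it only creates directories and leaves the appsettings, Compose, and Traefik files to be written by hand.

```bash
#!/bin/bash
# Sketch: create the env2 directory skeleton under a given project root.
# scaffold_env2 is a hypothetical helper (assumption), not part of the repo.
scaffold_env2() {
  local root="$1/docker-env2"
  mkdir -p \
    "$root/central-node-a" \
    "$root/central-node-b" \
    "$root/site-x-node-a/data" "$root/site-x-node-a/logs" \
    "$root/site-x-node-b/data" "$root/site-x-node-b/logs" \
    "$root/traefik"
}
```

Calling `scaffold_env2 .` from the project root would produce the tree; `mkdir -p` makes the call safe to repeat.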

## Per-Node Appsettings: Key Differences from Primary

Each env2 appsettings file is a near-clone of the primary equivalent with these targeted overrides:

| Field                                       | Primary                                            | Env2                                                         |
|---------------------------------------------|----------------------------------------------------|--------------------------------------------------------------|
| `Node.NodeHostname`                         | `scadalink-central-a` / `scadalink-site-a-a` / ... | `scadalink-env2-central-a` / `scadalink-env2-site-x-a` / ... |
| `Cluster.SeedNodes`                         | primary central hostnames                          | env2 central hostnames                                       |
| `Communication.CentralContactPoints` (site) | primary central hostnames                          | env2 central hostnames                                       |
| `Node.SiteId` (site)                        | `site-a` / `site-b` / `site-c`                     | `site-x`                                                     |
| `Database.ConfigurationDb`                  | `ScadaLinkConfig`                                  | `ScadaLinkConfig2`                                           |
| `Database.MachineDataDb`                    | `ScadaLinkMachineData`                             | `ScadaLinkMachineData2`                                      |
| `Notification.FromAddress`                  | `scada-notifications@company.com`                  | `scada-notifications-env2@company.com`                       |
| `Security.JwtSigningKey`                    | primary signing key                                | a distinct env2 signing key                                  |
| `Transport.SourceEnvironment`               | `docker-cluster`                                   | `docker-cluster-env2`                                        |

`Transport.SourceEnvironment` is the field that ends up stamped into exported bundle manifests, so a bundle visibly self-identifies which environment produced it.
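
Since a stray primary hostname or database name in an env2 file would silently point env2 at primary resources, a quick guard can help. The sketch below is one possible check, assuming the hostname and connection-string shapes shown in this document; `find_primary_refs` and `check_env2_appsettings` are hypothetical helpers.

```bash
#!/bin/bash
# Sketch: flag env2 appsettings lines that still point at primary resources.
# Both function names are hypothetical; patterns follow the override table.
find_primary_refs() {
  # Matches primary hostnames (scadalink-central-*, scadalink-site-*) and the
  # primary database names (no "2" suffix). Prints offending lines, if any.
  grep -nE 'scadalink-(central|site)-|Database=ScadaLink(Config|MachineData);' "$1"
}

check_env2_appsettings() {
  # $1 = path to docker-env2/; returns 1 if any file references primary.
  local status=0 f
  for f in "$1"/*/appsettings.*.json; do
    [ -e "$f" ] || continue
    if find_primary_refs "$f"; then
      echo "primary reference found in $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running `check_env2_appsettings docker-env2` before a deploy would print any offending lines with their line numbers. env2 hostnames such as `scadalink-env2-central-a` do not match because `env2-` follows the `scadalink-` prefix.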

## Lifecycle Scripts

### `docker-env2/deploy.sh`

```bash
#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

echo "=== ScadaLink Env2 Docker Deploy ==="

# Reuse the primary build (same scadalink:latest image)
"$SCRIPT_DIR/../docker/build.sh"

# Ensure env2 databases exist on the shared scadalink-mssql
"$SCRIPT_DIR/init-db.sh"

echo "Deploying env2 containers..."
docker compose -f "$SCRIPT_DIR/docker-compose.yml" up -d --force-recreate
docker compose -f "$SCRIPT_DIR/docker-compose.yml" ps

echo "Access points:"
echo "  Central (Traefik LB): http://localhost:9100"
echo "  Central UI (node A):  http://localhost:9101"
echo "  Central UI (node B):  http://localhost:9102"
echo "  Traefik dashboard:    http://localhost:8181"
echo "Seed site: docker-env2/seed-sites.sh"
```

### `docker-env2/init-db.sh`

```bash
#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

# Capture the names first: piping straight into `grep -q` under pipefail can
# misreport a running container if grep exits before docker finishes writing.
running="$(docker ps --format '{{.Names}}')"
if ! grep -qx 'scadalink-mssql' <<<"$running"; then
    echo "ERROR: scadalink-mssql is not running. Start it: cd infra && docker compose up -d" >&2
    exit 1
fi

echo "Applying env2 database setup..."
# -C trusts the server certificate; -b makes sqlcmd exit non-zero on a SQL error.
docker exec -i scadalink-mssql /opt/mssql-tools18/bin/sqlcmd \
    -S localhost -U sa -P 'ScadaLink_Dev1#' -C -b \
    < "$SCRIPT_DIR/../infra/mssql/setup-env2.sql"

echo "Env2 databases ready."
```

### `docker-env2/seed-sites.sh`

```bash
#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
# Arrays keep each argument intact even if PROJECT_ROOT contains spaces.
CLI=(dotnet run --project "$PROJECT_ROOT/src/ScadaLink.CLI" --)
AUTH=(--username multi-role --password password)
URL=(--url http://localhost:9100)

echo "Creating Site-X on env2..."
# Node addresses use container-name DNS on scadalink-net.
"${CLI[@]}" "${URL[@]}" "${AUTH[@]}" site create \
    --name "Env2 Site X" \
    --identifier "site-x" \
    --description "Env2 test site - two-node cluster" \
    --node-a-address "akka.tcp://scadalink@scadalink-env2-site-x-a:8082" \
    --node-b-address "akka.tcp://scadalink@scadalink-env2-site-x-b:8082" \
    --grpc-node-a-address "http://scadalink-env2-site-x-a:8083" \
    --grpc-node-b-address "http://scadalink-env2-site-x-b:8083" \
    || echo "  (Site-X may already exist)"
```

### `docker-env2/teardown.sh`

```bash
#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
docker compose -f "$SCRIPT_DIR/docker-compose.yml" down
```

### Operator Workflow

| Action                           | Command(s) |
|----------------------------------|------------|
| First-time env2 bring-up         | `bash docker-env2/deploy.sh && bash docker-env2/seed-sites.sh` |
| Iterate on env2 after code edit  | `bash docker-env2/deploy.sh` |
| Iterate on both envs             | `bash docker/deploy.sh && bash docker-env2/deploy.sh` (the second build is cached) |
| Wipe env2 DB for clean re-import | `docker exec scadalink-mssql sqlcmd ... DROP DATABASE ScadaLinkConfig2; DROP DATABASE ScadaLinkMachineData2;` then `bash docker-env2/deploy.sh` |
| Stop env2 only                   | `bash docker-env2/teardown.sh` |

## Transport Testing Workflow: The Whole Point of env2

**Golden-path demo:**

1. Set up primary with at least a few templates + one deployed instance.
2. `bash docker-env2/deploy.sh && bash docker-env2/seed-sites.sh`. At this point env2's ConfigurationDB is empty.
3. Browser → http://localhost:9000 → `multi-role` login → **Design → Export Bundle** → select templates → review → set passphrase → download `.scadabundle`.
4. Browser → http://localhost:9101 → log in → **Admin → Import Bundle** → upload file → enter passphrase → review diff (all "Create" rows) → confirm.
5. Verify: env2's `Design → Templates` shows imported items; `Audit → Configuration Audit Log` shows rows tagged with the matching `BundleImportId`.
6. Deploy an imported template to env2's `site-x` to prove runtime validity end-to-end.

**Manual tests env2 enables that mock-based tests cannot:**

- Conflict-resolution UI on re-import (Skip / Overwrite / Rename per row).
- Cross-environment audit correlation via `BundleImportId` chip.
- Schema-version gating (`SchemaVersionMajor` mismatch).
- Wrong-passphrase rejection + `MaxUnlockAttemptsPerSession=3` lockout.
- Round-trip parity: export from primary → import into env2 → export from env2 → re-import into primary with Skip-on-conflict. Revision hashes should match.

**What env2 does NOT test:**

- Multi-site Transport scenarios (env2 has one site by design).
- Site-clustered Transport flows (Transport is central-only).
- True air-gapped network isolation (env2 shares MSSQL/LDAP/SMTP, so this is out of scope).

## Error Handling & Edge Cases

- **`init-db.sh`** fails fast with a clear message if `scadalink-mssql` isn't running.
- **`deploy.sh`** runs with `set -euo pipefail` so any failed step halts the deploy.
- **MSSQL volume reset:** both the docker-entrypoint mount and the exec-based `init-db.sh` apply the same idempotent script; either path leaves the env2 DBs ready.
- **Cluster cross-talk:** primary and env2 use the same Akka system name `scadalink` but disjoint seed-node hostnames, so the two clusters cannot gossip-merge. As a defensive measure, env2 appsettings are written from scratch, not sed'd from primary.
- **gRPC streaming:** env2 central uses container-name DNS (`http://scadalink-env2-site-x-a:8083`) for site-x streams, populated by `seed-sites.sh`.
- **Cookie/JWT bleed:** each environment has its own `JwtSigningKey`, so a token issued by one fails signature validation in the other. This matters because browsers do not isolate cookies by port, so `localhost:9000` and `localhost:9100` alone would not keep sessions apart.
- **Port collision:** env2's `91XX` host ports do not overlap primary's `90XX` range; all 10 env2 host ports (nine in `91XX` plus the Traefik dashboard on 8181) were confirmed free at design time. If an operator later remaps onto a port already in use, Compose surfaces `bind: address already in use`.
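
The "confirmed free" check for host ports can be repeated before a first deploy with a pre-flight sketch like the one below. The helper names are hypothetical, the port list is copied from the Topology table, and the probe relies on bash's `/dev/tcp` feature, so it only detects listeners reachable on `127.0.0.1`.

```bash
#!/bin/bash
# Sketch: pre-flight check that env2's host ports are free before `compose up`.
# env2_host_ports, port_in_use, and check_env2_ports are hypothetical helpers.
env2_host_ports() {
  # Host ports from the Topology table, plus the Traefik dashboard.
  printf '%s\n' 9100 9101 9102 9111 9112 9121 9122 9123 9124 8181
}

port_in_use() {
  # bash-only: opening /dev/tcp succeeds only if something accepts the connection.
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

check_env2_ports() {
  local busy=0 port
  for port in $(env2_host_ports); do
    if port_in_use "$port"; then
      echo "port $port is already in use" >&2
      busy=1
    fi
  done
  return "$busy"
}
```

Once env2 is running, `check_env2_ports` will report its own ports as busy, so it is only meaningful before the stack is up or after `teardown.sh`.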

## Testing

**No new automated tests are added.** This is infrastructure tooling, and the Transport feature already has 39 unit + 26 integration tests. The deliverable is a manual verification checklist at `docs/plans/2026-05-24-second-environment-verification.md`, mirroring the Transport manual checklist and walking through the golden path from the Transport Testing Workflow section.

**First-deploy smoke test:**

1. `docker ps` shows 5 new `scadalink-env2-*` containers.
2. `curl http://localhost:9101/health/ready` reports a healthy (green) status.
3. `curl http://localhost:9100/health/active` succeeds, confirming that Traefik routes to the active node.
4. Browser to http://localhost:9100 → `multi-role` login → Dashboard renders, Sites page is empty.
5. Run `docker-env2/seed-sites.sh` → site-x appears; health turns green within ~30s.
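
Because readiness takes a little while after `compose up`, the health checks above are easier to script with a polling helper. The following is a minimal sketch; `retry_until` is a hypothetical function, and using it with `curl -f` assumes the health endpoints return a non-2xx status while the node is not ready, which this design does not state.

```bash
#!/bin/bash
# Sketch: poll a command until it succeeds or the attempts run out.
# retry_until is a hypothetical helper, not part of the repo.
retry_until() {
  # $1 = max attempts, $2 = delay in seconds, remaining args = command to run
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `retry_until 30 1 curl -fsS http://localhost:9101/health/ready` would wait roughly 30 seconds for node A before giving up.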

## Documentation Updates

- New: `docker-env2/README.md`, an operator quick-start that copies the structure of `docker/README.md`.
- Update: `README.md` (project root). Add a "Second Environment" callout pointing to `docker-env2/README.md` with a one-sentence purpose statement.
- Update: `CLAUDE.md`. Add `docker-env2/` to the Project Structure section so future sessions discover it.
- Update: `infra/README.md`. Note that `setup-env2.sql` is mounted alongside `setup.sql`.

## Out-of-Scope Future Extensions

- A third or fourth environment by the same pattern (just bump the prefix and port offset).
- An `--env` flag retrofit on `docker/deploy.sh` if the directory duplication grows painful; not worth doing for just two environments.
- An air-gapped twin with its own MSSQL/LDAP/SMTP, a straightforward extension of the same pattern if isolation requirements ever tighten.

## Acceptance Criteria

- [ ] `bash docker-env2/deploy.sh` brings up 5 containers cleanly on a machine where primary is already running.
- [ ] `bash docker-env2/seed-sites.sh` registers `site-x` and the site cluster reaches a healthy state.
- [ ] http://localhost:9100 serves the env2 Central UI through Traefik with failover between 9101/9102.
- [ ] env2 reads/writes only `ScadaLinkConfig2` / `ScadaLinkMachineData2`; primary's DBs are untouched after env2 deploy.
- [ ] `bash docker/deploy.sh && bash docker-env2/deploy.sh` succeeds in sequence; both stacks run concurrently.
- [ ] A bundle exported from primary can be imported into env2, with audit rows tagged by `BundleImportId` and visible in env2's Configuration Audit Log.
- [ ] Manual verification checklist completes end-to-end.