Adds a one-shot cluster-seed service to docker-dev/docker-compose.yml
that pre-populates the three Akka clusters' scope rows in the shared
OtOpcUa ConfigDb so operators don't have to click through /clusters +
/hosts on every fresh bring-up.

Seed contents:

  ServerCluster  MAIN (Warm/2), SITE-A (Warm/2), SITE-B (Warm/2)
  ClusterNode    driver-a + driver-b → MAIN
                 site-a-1 + site-a-2 → SITE-A
                 site-b-1 + site-b-2 → SITE-B

NodeCount + RedundancyMode honour the CK_ServerCluster check constraint.
ApplicationUri follows the urn:OtOpcUa:<NodeId> convention; uniqueness
across the fleet satisfies UX_ClusterNode_ApplicationUri.

Mechanism:

- docker-dev/seed/seed-clusters.sql — idempotent INSERTs (IF NOT EXISTS
  guards on every row).
- docker-dev/seed/entrypoint.sh — bash wrapper that waits for SQL to
  accept connections, then polls until dbo.ServerCluster exists (the
  host containers' EF auto-migration creates it on first boot), then
  applies the SQL script.
- cluster-seed service uses mcr.microsoft.com/mssql-tools as the base
  image (bash + sqlcmd available), restart: "no" so it runs once.

Re-running `docker compose up` is safe: the seed exits cleanly on the
second run because every INSERT is guarded.

Manual re-seed: `docker compose run --rm cluster-seed`.
# docker-dev
Mac-friendly multi-cluster OtOpcUa fleet for manual UI exercise + integration smoke tests. Spins up **three isolated Akka clusters** + SQL Server + OpenLDAP + Traefik on the same Compose network. All three clusters share the single `OtOpcUa` ConfigDb — multi-tenancy is enforced by per-row `ServerCluster.ClusterId` scoping. Akka.Cluster gossip stays isolated between meshes because their seed-node lists are disjoint, even though they share the same system name `otopcua`.
## Stack
### Shared infrastructure

| Service | Role | Ports |
|---|---|---|
| `sql` | SQL Server 2022 — single `OtOpcUa` ConfigDb shared by all three clusters | host `14330` → container `1433` |
| `ldap` | OpenLDAP with dev users `alice` / `bob` | host `3893` → container `1389` |
| `traefik` | Routes :80 by Host header / PathPrefix | host `80`, dashboard `8080` |

### Main cluster — split admin/driver roles

| Service | Role | Ports |
|---|---|---|
| `admin-a` | `OTOPCUA_ROLES=admin`, cluster seed | internal `9000` |
| `admin-b` | `OTOPCUA_ROLES=admin`, joins admin-a | internal `9000` |
| `driver-a` | `OTOPCUA_ROLES=driver` | host `4840` → container `4840` |
| `driver-b` | `OTOPCUA_ROLES=driver` | host `4841` → container `4840` |

### Site A cluster — 2-node fused admin+driver

| Service | Role | Ports |
|---|---|---|
| `site-a-1` | `OTOPCUA_ROLES=admin,driver`, cluster seed | host `4842` → container `4840` |
| `site-a-2` | `OTOPCUA_ROLES=admin,driver`, joins site-a-1 | host `4843` → container `4840` |

### Site B cluster — 2-node fused admin+driver

| Service | Role | Ports |
|---|---|---|
| `site-b-1` | `OTOPCUA_ROLES=admin,driver`, cluster seed | host `4844` → container `4840` |
| `site-b-2` | `OTOPCUA_ROLES=admin,driver`, joins site-b-1 | host `4845` → container `4840` |

All containers bind Akka remoting to port `4053` inside their own network namespace; the `PublicHostname` of each matches its Compose service name. Akka mesh isolation is enforced purely by disjoint seed lists. Configuration-side isolation is enforced by `ServerCluster.ClusterId` — see "Multi-tenancy" below.
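
To make the disjoint-seed-list idea concrete, the sketch below prints the seed-node address each mesh would bootstrap from. The `seed_uri` helper is hypothetical and the `akka.tcp` scheme is an assumption; the system name `otopcua`, port `4053`, and the three seed hosts come from the tables above.

```bash
#!/usr/bin/env bash
# Illustrative only: seed_uri is a hypothetical helper, and the akka.tcp
# scheme is an assumption about the remoting transport in use.
seed_uri() {
  printf 'akka.tcp://otopcua@%s:4053\n' "$1"
}

# One seed node per mesh, taken from the "cluster seed" rows in the tables.
for seed in admin-a site-a-1 site-b-1; do
  seed_uri "$seed"
done
```

No address from one mesh appears in another mesh's seed list, which is what keeps the three meshes apart despite the shared system name.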
## Multi-tenancy
All eight host nodes write to the same `OtOpcUa` ConfigDb. The `ServerCluster` table differentiates the three Akka meshes: each Akka cluster maps to one row, and each `ClusterNode` row's `ClusterId` ties the runtime node back to its owning cluster scope.

A one-shot `cluster-seed` Compose service (image `mcr.microsoft.com/mssql-tools`) waits for SQL + the EF auto-migration to complete and then INSERTs the rows below. The seed is **idempotent** — `IF NOT EXISTS` guards every insert — so re-runs on `docker compose up` are no-ops:

| Akka mesh | `ServerCluster.ClusterId` | `ClusterNode.NodeId` rows |
|---|---|---|
| Main | `MAIN` | `driver-a`, `driver-b` (OPC UA publishers) |
| Site A | `SITE-A` | `site-a-1`, `site-a-2` |
| Site B | `SITE-B` | `site-b-1`, `site-b-2` |

`ClusterNode` is the table for **OPC UA-publishing nodes** (not every Akka cluster member), which is why the main cluster's `admin-a` / `admin-b` don't get rows — they're control-plane-only.

Each `ClusterNode.NodeId` matches the node's `Cluster__PublicHostname` env value (Compose service name) — that's the lookup the runtime uses to resolve its own membership. `ApplicationUri` follows the `urn:OtOpcUa:<NodeId>` convention.
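
As a rough illustration of what one guarded row might look like, the sketch below prints an `IF NOT EXISTS` insert for a node. It is not the contents of `seed/seed-clusters.sql`: the `emit_node_insert` helper is hypothetical, the `dbo` schema for `ClusterNode` is assumed, and the real table very likely has more required columns than the three shown.

```bash
#!/usr/bin/env bash
# Sketch only: emit_node_insert is a hypothetical helper and the column list
# is an assumption; the real dbo.ClusterNode schema may require more columns.
emit_node_insert() {
  node_id=$1
  cluster_id=$2
  # The guard makes the statement a no-op when the row already exists.
  cat <<SQL
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = N'${node_id}')
    INSERT INTO dbo.ClusterNode (NodeId, ClusterId, ApplicationUri)
    VALUES (N'${node_id}', N'${cluster_id}', N'urn:OtOpcUa:${node_id}');
SQL
}

emit_node_insert driver-a MAIN
emit_node_insert site-a-1 SITE-A
```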

The SQL lives at `seed/seed-clusters.sql`; the wait-and-apply wrapper lives at `seed/entrypoint.sh`. To re-seed manually:

```bash
docker compose -f docker-dev/docker-compose.yml run --rm cluster-seed
```
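
The wait-and-apply flow described above can be sketched roughly as follows. This is not the actual `seed/entrypoint.sh`: the `retry`, `table_exists`, and `wait_and_seed` helpers, the `sa` login, the `SA_PASSWORD` variable, and the `/seed` path are all assumptions, and `sqlcmd` is assumed to be on `PATH`.

```bash
#!/usr/bin/env bash
# Sketch of a wait-and-apply wrapper. Hypothetical: helper names, the sa
# login, SA_PASSWORD, and /seed/seed-clusters.sql. From this README: the
# `sql` service name, the OtOpcUa database, and dbo.ServerCluster.

# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts run out.
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Succeeds once the EF migration has created dbo.ServerCluster.
# OBJECT_ID prints a number when the table exists and NULL otherwise.
table_exists() {
  sqlcmd -S sql -U sa -P "$SA_PASSWORD" -d OtOpcUa -b -h -1 -W \
    -Q "SET NOCOUNT ON; SELECT OBJECT_ID(N'dbo.ServerCluster', N'U');" \
    | grep -q '^[0-9][0-9]*$'
}

wait_and_seed() {
  # 1. Wait for SQL Server to accept connections.
  retry 60 2 sqlcmd -S sql -U sa -P "$SA_PASSWORD" -b -Q "SELECT 1" >/dev/null || return 1
  # 2. Wait for the host containers' migration to create the table.
  retry 60 2 table_exists || return 1
  # 3. Apply the idempotent seed script.
  sqlcmd -S sql -U sa -P "$SA_PASSWORD" -d OtOpcUa -b -i /seed/seed-clusters.sql
}
```

Calling `wait_and_seed` as the container's last step would give the run-once behaviour; the `-b` flag makes `sqlcmd` exit non-zero on a SQL error, so a failed seed is visible in the container's exit code.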
## Bring up

```bash
# from the repo root
docker compose -f docker-dev/docker-compose.yml up -d --build

# wait ~20 seconds for SQL to come up + all three clusters to form

# "open" is the macOS launcher; on Linux use xdg-open instead
open http://localhost              # main cluster admin UI
open http://site-a.localhost       # site A admin UI
open http://site-b.localhost       # site B admin UI
open http://localhost:8080         # Traefik dashboard
```

On macOS, `*.localhost` resolves to `127.0.0.1` automatically. On Linux add `127.0.0.1 site-a.localhost site-b.localhost` to `/etc/hosts` if your resolver doesn't.

The first build takes a few minutes (.NET SDK image + restore + publish). Subsequent rebuilds are faster with Docker's layer cache.

## Auth (dev only)

Use one of the LDAP dev users from `LDAP_USERS` in `docker-compose.yml`:

| Username | Password |
|---|---|
| `alice` | `alice123` |
| `bob` | `bob123` |

The compose file places every dev user in `ou=FleetAdmin`, so the dev role mapping resolves to `FleetAdmin`.

## Tear down

```bash
docker compose -f docker-dev/docker-compose.yml down -v
```

The `-v` drops the SQL + LDAP volumes; remove it to keep ConfigDb state across restarts.

## Failover smoke

1. Watch the Traefik dashboard at `http://localhost:8080`. Both `admin-a` and `admin-b` should be listed as healthy in the `otopcua-admin` service.
2. `docker compose -f docker-dev/docker-compose.yml stop admin-a` — `admin-b` should pick up the admin role-leader within ~15 s (Akka split-brain stable-after). Traefik will route traffic to `admin-b` once its `/health/active` returns 200.
3. `docker compose -f docker-dev/docker-compose.yml start admin-a` — `admin-a` rejoins as a follower; `admin-b` keeps the leader role until something disturbs it.

## Notes

- This compose is for the **local Mac/Linux developer rig**. The team's CI + soak runs go to the remote docker host at `10.100.0.35` (see `docs/v2/dev-environment.md`); the file there mirrors this one with adjusted port bindings.
- The OPC UA driver endpoints are reachable directly from the host (Traefik is only in front of the admin HTTP surface):
  - Main: `opc.tcp://localhost:4840` (driver-a), `opc.tcp://localhost:4841` (driver-b)
  - Site A: `opc.tcp://localhost:4842` (site-a-1), `opc.tcp://localhost:4843` (site-a-2)
  - Site B: `opc.tcp://localhost:4844` (site-b-1), `opc.tcp://localhost:4845` (site-b-2)
- Galaxy + Wonderware drivers can't run in Linux containers (they need the Windows-only mxaccessgw + Historian SDK). On non-Windows, `DriverInstanceActor.ShouldStub(driverType, roles)` returns `true` for those types and the actor goes straight to a `Stubbed` state that returns deterministic success.
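
For a quick check that the six OPC UA ports listed above are published, a plain TCP probe is usually enough. The sketch below is one way to do it; `port_open` and `check_endpoints` are hypothetical helpers, and the probe relies on bash's `/dev/tcp` feature. It only shows that something is listening and does not speak OPC UA.

```bash
#!/usr/bin/env bash
# Sketch: TCP-connect probe of each published OPC UA port. Helper names are
# hypothetical; the host ports come from the list above.

# port_open <host> <port>: succeeds if a TCP connection can be opened.
# Runs the connect in a child bash so the file descriptor is closed on exit.
port_open() {
  bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

check_endpoints() {
  for entry in driver-a:4840 driver-b:4841 site-a-1:4842 site-a-2:4843 \
               site-b-1:4844 site-b-2:4845; do
    name=${entry%%:*}   # text before the colon
    port=${entry##*:}   # text after the colon
    if port_open localhost "$port"; then
      echo "$name opc.tcp://localhost:$port up"
    else
      echo "$name opc.tcp://localhost:$port down"
    fi
  done
}

check_endpoints
```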