Mac-friendly OtOpcUa fleet for manual UI exercise and integration smoke tests. It spins up a single Akka mesh (hub-and-spoke topology), SQL Server, and Traefik on the same Compose network. All six host nodes share one OtOpcUa ConfigDb; logical separation between MAIN, SITE-A, and SITE-B is enforced by per-row ServerCluster.ClusterId scoping, not by mesh isolation.

Stack

Shared infrastructure

Service	Role	Ports
`sql`	SQL Server 2022 — single `OtOpcUa` ConfigDb shared by all nodes	host `14330` → container `1433`
`traefik`	Routes `:80` by PathPrefix to central admin nodes	host `9200` → container `80`, dashboard `8089`

Authentication uses the shared GLAuth on the Linux Docker host at 10.100.0.35:3893 (baseDN dc=zb,dc=local). Only the central admin nodes authenticate users. Sign in as multi-role / password to get all three OtOpcUa roles (Administrator, Designer, Viewer), or use any other shared test user with password password. Group→role mappings are seeded by seed/seed-clusters.sql (OtOpcUa-Admins→Administrator, OtOpcUa-Designers→Designer, OtOpcUa-Viewers→Viewer). The shared GLAuth source of truth and deploy runbook live in scadaproj/infra/glauth/.

Central nodes — fused admin+driver (MAIN cluster, UI + deploy singleton)

Service	Roles	Ports
`central-1`	`OTOPCUA_ROLES=admin,driver`, Akka mesh seed	host `4840` → container `4840`; internal `9000`
`central-2`	`OTOPCUA_ROLES=admin,driver`, joins central-1	host `4841` → container `4840`; internal `9000`

central-1 and central-2 are the only nodes that host the Admin UI and the deploy singleton. They are also the OPC UA publishers for the MAIN cluster. Traefik routes all PathPrefix(/) traffic to whichever central node has the leader role.

Site A nodes — driver-only (SITE-A cluster)

Service	Roles	Ports
`site-a-1`	`OTOPCUA_ROLES=driver`, joins the single mesh	host `4842` → container `4840`
`site-a-2`	`OTOPCUA_ROLES=driver`, joins the single mesh	host `4843` → container `4840`

Site B nodes — driver-only (SITE-B cluster)

Service	Roles	Ports
`site-b-1`	`OTOPCUA_ROLES=driver`, joins the single mesh	host `4844` → container `4840`
`site-b-2`	`OTOPCUA_ROLES=driver`, joins the single mesh	host `4845` → container `4840`

Site nodes serve no UI and authenticate no users. The central cluster manages and deploys to them over the shared Akka mesh. All six nodes bind Akka remoting to port 4053 inside their own network namespace; PublicHostname for each matches its Compose service name.

Multi-tenancy

All six host nodes write to the same OtOpcUa ConfigDb. The ServerCluster table differentiates the three logical clusters: each logical cluster maps to one ServerCluster row, and each ClusterNode row's ClusterId ties the runtime node back to its owning cluster scope.

A one-shot cluster-seed Compose service (image mcr.microsoft.com/mssql-tools) waits for the OtOpcUa ConfigDb schema to exist and then INSERTs the rows below. The host nodes do not auto-migrate, so you apply the EF migrations once yourself (see First-time setup). The seed is idempotent: IF NOT EXISTS guards every insert, so re-runs on docker compose up are no-ops:

Logical cluster	`ServerCluster.ClusterId`	`ClusterNode.NodeId` rows
Main	`MAIN`	`central-1`, `central-2` (OPC UA publishers + admin UI)
Site A	`SITE-A`	`site-a-1`, `site-a-2`
Site B	`SITE-B`	`site-b-1`, `site-b-2`

Each ClusterNode.NodeId matches the node's Cluster__PublicHostname env value (Compose service name) — that's the lookup the runtime uses to resolve its own membership. ApplicationUri follows the urn:OtOpcUa:<NodeId> convention.
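As a quick cross-check of the table above, the expected membership rows can be generated in the shell and compared by eye (or with diff) against whatever the database returns. This is only a sketch: the function names are hypothetical, and it assumes ApplicationUri is stored per node next to NodeId, which this README does not state explicitly.

```shell
# Hypothetical helpers; names are illustrative and not part of the repo.

# Logical cluster (ServerCluster.ClusterId) that owns a given NodeId.
cluster_for_node() {
  case "$1" in
    central-1|central-2) echo "MAIN" ;;
    site-a-1|site-a-2)   echo "SITE-A" ;;
    site-b-1|site-b-2)   echo "SITE-B" ;;
    *) echo "unknown node: $1" >&2; return 1 ;;
  esac
}

# ApplicationUri following the urn:OtOpcUa:<NodeId> convention.
application_uri() {
  printf 'urn:OtOpcUa:%s\n' "$1"
}

# Print one tab-separated line per node: NodeId, ClusterId, ApplicationUri.
expected_membership() {
  for node in central-1 central-2 site-a-1 site-a-2 site-b-1 site-b-2; do
    printf '%s\t%s\t%s\n' "$node" "$(cluster_for_node "$node")" "$(application_uri "$node")"
  done
}

# Usage: expected_membership
```

Because NodeId equals the Compose service name, the same list doubles as the set of valid Cluster__PublicHostname values.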

The SQL lives at seed/seed-clusters.sql; the wait-and-apply wrapper lives at seed/entrypoint.sh. To re-seed manually:

docker compose -f docker-dev/docker-compose.yml run --rm cluster-seed
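The wait-and-apply behaviour described above might look roughly like the following. This is a hedged sketch and not the contents of seed/entrypoint.sh: the function names, the SQL_HOST and SEED_FILE variables, the probe query, and the retry limit are all assumptions, and SA_PASSWORD must be set by the caller.

```shell
# Hypothetical wait-and-apply wrapper; NOT the real seed/entrypoint.sh.
SQL_HOST="${SQL_HOST:-sql}"                        # Compose service name of SQL Server
SEED_FILE="${SEED_FILE:-seed/seed-clusters.sql}"   # idempotent seed script

run_sql() {
  # -b makes sqlcmd exit non-zero when a batch raises an error.
  sqlcmd -S "$SQL_HOST" -U sa -P "$SA_PASSWORD" -d OtOpcUa -b "$@"
}

wait_and_seed() {
  max_attempts="${1:-60}"
  attempt=1
  # Probe until dbo.ServerCluster can be queried, i.e. the first migration has run.
  until run_sql -Q "SELECT TOP 1 ClusterId FROM dbo.ServerCluster" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "schema still missing after ${max_attempts} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep 2
  done
  # The schema is visible; apply the seed (safe to repeat thanks to IF NOT EXISTS).
  run_sql -i "$SEED_FILE"
}

# Usage: SA_PASSWORD=... wait_and_seed 60
```

Probing a single table only shows that the first migration has landed, which is why a seed that races an in-progress migration can still fail and needs a re-run.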

Galaxy / MxAccess gateway

The seed also pre-creates a SystemPlatform Namespace + a GalaxyMxGateway DriverInstance in the MAIN cluster pointing at http://10.100.0.48:5120. The API key is resolved from the GALAXY_MXGW_API_KEY env var set on every driver-role container in compose; override via GALAXY_MXGW_API_KEY=... docker compose up -d to swap keys without editing the compose file.

The DriverHost actor doesn't spawn drivers from raw DriverInstance rows on its own — the v2 deploy lifecycle requires a sealed Deployment before drivers materialise. After first bring-up, sign in to the Admin UI and click Deploy current configuration on /deployments to compose the seeded rows into an artifact and dispatch it. The Galaxy driver instance will start its gRPC connection to the gateway on the next deploy ack.

Bring up

# from the repo root
docker compose -f docker-dev/docker-compose.yml up -d --build

# wait ~20 seconds for SQL to come up + the mesh to form

open http://localhost:9200                 # Admin UI (Traefik → central-1 or central-2)
open http://localhost:8089                 # Traefik dashboard

The first build takes a few minutes (.NET SDK image + restore + publish). Subsequent rebuilds are faster with Docker's layer cache.
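Instead of waiting a fixed ~20 seconds, a script can poll until the admin surface answers. The sketch below assumes the /health/active endpoint of the central nodes is reachable through Traefik on localhost:9200; the function name and the timeout values are hypothetical.

```shell
# Hypothetical readiness poll; assumes /health/active is exposed via Traefik.
wait_for_admin() {
  url="${1:-http://localhost:9200/health/active}"
  timeout_s="${2:-120}"
  waited=0
  # curl --fail returns non-zero on connection errors and on HTTP >= 400.
  until curl --fail --silent --output /dev/null "$url"; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "gave up after ${timeout_s}s waiting for ${url}" >&2
      return 1
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "ready after ~${waited}s"
}

# Usage: wait_for_admin && open http://localhost:9200
```

On a brand-new SQL volume this will probably time out until the EF migrations have been applied, since the nodes keep retrying the database until the schema exists.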

First-time setup (or after `down -v`)

The host nodes do not auto-create the ConfigDb schema — on a brand-new SQL volume you must apply the EF migrations once, then (re)run the seed. (The auto-started cluster-seed polls for dbo.ServerCluster, which the first migration creates, so if it runs mid-migration it can fail against an intermediate schema — just re-run it after migrations finish.)

# 1. bring the stack up (SQL + nodes; nodes retry the DB until the schema exists)
docker compose -f docker-dev/docker-compose.yml up -d --build

# 2. create + migrate the OtOpcUa ConfigDb (one time; the design-time factory reads OTOPCUA_CONFIG_CONNECTION)
OTOPCUA_CONFIG_CONNECTION="Server=localhost,14330;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;" \
  dotnet ef database update \
    --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration \
    --startup-project src/Core/ZB.MOM.WW.OtOpcUa.Configuration

# 3. apply the cluster/namespace/driver seed against the now-complete schema (idempotent)
docker compose -f docker-dev/docker-compose.yml run --rm cluster-seed

After the schema + seed exist, a plain docker compose ... up -d is enough — the named SQL volume keeps both across restarts (only down -v wipes them, which is when you repeat the steps above).

Auth (dev only)

Central nodes authenticate against the shared GLAuth at 10.100.0.35:3893 (baseDN dc=zb,dc=local). DevStubMode is not active. Sign in with any test user (password password); multi-role / password returns all three roles (Administrator, Designer, Viewer). Group→role mappings are seeded by seed/seed-clusters.sql. The GLAuth source of truth + deploy runbook is in scadaproj/infra/glauth/. Do not enable DevStubMode outside local debugging — production must always bind a real LDAP backend.

Headless deploy

POST http://localhost:9200/api/deployments
X-Api-Key: docker-dev-deploy-key
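The request above can be issued from a shell with curl. This is a minimal sketch: the function name is hypothetical, and it assumes the endpoint needs no request body, which this README does not confirm.

```shell
# Hypothetical wrapper around the headless deploy request shown above.
trigger_deploy() {
  base_url="${1:-http://localhost:9200}"
  api_key="${2:-docker-dev-deploy-key}"
  # --fail turns HTTP errors (4xx/5xx) into a non-zero exit status.
  curl --fail --silent --show-error \
    -X POST \
    -H "X-Api-Key: ${api_key}" \
    "${base_url}/api/deployments"
}

# Usage: trigger_deploy
#        trigger_deploy http://localhost:9200 some-other-key
```

If the endpoint turns out to expect a JSON payload, a --data argument and a Content-Type header would have to be added.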

Tear down

docker compose -f docker-dev/docker-compose.yml down -v

The -v flag drops the SQL volume; omit it to keep ConfigDb state across restarts. There is no local LDAP volume because LDAP is the shared external GLAuth on 10.100.0.35:3893.

Failover smoke

Watch the Traefik dashboard at http://localhost:8089. Both central-1 and central-2 should be listed as healthy in the otopcua-admin service.
docker compose -f docker-dev/docker-compose.yml stop central-1 — central-2 should pick up the admin role-leader within ~15 s (Akka split-brain stable-after). Traefik will route traffic to central-2 once its /health/active returns 200.
docker compose -f docker-dev/docker-compose.yml start central-1 — central-1 rejoins as a follower; central-2 keeps the leader role until something disturbs it.
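The manual steps above can be approximated by a small script that stops central-1 and measures how long the admin surface takes to answer again. Treat it as a sketch under assumptions: the function name, the HEALTH_URL variable, and the idea of probing /health/active through Traefik on port 9200 are not taken from the repo.

```shell
# Hypothetical failover smoke; polls the admin surface through Traefik.
COMPOSE_FILE_PATH="${COMPOSE_FILE_PATH:-docker-dev/docker-compose.yml}"
HEALTH_URL="${HEALTH_URL:-http://localhost:9200/health/active}"

failover_smoke() {
  deadline_s="${1:-60}"
  docker compose -f "$COMPOSE_FILE_PATH" stop central-1 || return 1
  elapsed=0
  while [ "$elapsed" -lt "$deadline_s" ]; do
    if curl --fail --silent --output /dev/null "$HEALTH_URL"; then
      echo "admin surface healthy again after ~${elapsed}s"
      # Bring central-1 back; it is expected to rejoin as a follower.
      docker compose -f "$COMPOSE_FILE_PATH" start central-1
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "no healthy admin node within ${deadline_s}s" >&2
  return 1
}

# Usage: failover_smoke 60
```

If central-2 already held the leader role before the stop, the probe succeeds almost immediately, so a reading near 0 s does not by itself prove that a handover happened; a reading far above the ~15 s noted above is the case worth investigating.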

Notes

This compose is for the local Mac/Linux developer rig. The team's CI + soak runs go to the remote docker host at 10.100.0.35 (see docs/v2/dev-environment.md); the file there mirrors this one with adjusted port bindings.
The OPC UA endpoints are reachable directly from the host (Traefik is only in front of the admin HTTP surface):
- Main: opc.tcp://localhost:4840 (central-1), opc.tcp://localhost:4841 (central-2)
- Site A: opc.tcp://localhost:4842 (site-a-1), opc.tcp://localhost:4843 (site-a-2)
- Site B: opc.tcp://localhost:4844 (site-b-1), opc.tcp://localhost:4845 (site-b-2)
Galaxy + Wonderware drivers can't run in Linux containers (they need the Windows-only mxaccessgw + Historian SDK). On non-Windows, DriverInstanceActor.ShouldStub(driverType, roles) returns true for those types and the actor goes straight to a Stubbed state that returns deterministic success.
SQL persistence: ConfigDb state survives container restarts (named Docker volume). Drop the volume with down -v for a clean slate.
