docker-dev
Mac-friendly OtOpcUa fleet for manual UI exercise and integration smoke tests. Spins up a single Akka mesh (hub-and-spoke topology), SQL Server, and Traefik on the same Compose network. All six host nodes share one OtOpcUa ConfigDb; logical separation between MAIN, SITE-A, and SITE-B is enforced by per-row ServerCluster.ClusterId scoping, not by mesh isolation.
Stack
Shared infrastructure
| Service | Role | Ports |
|---|---|---|
| sql | SQL Server 2022; single OtOpcUa ConfigDb shared by all nodes | host 14330 → container 1433 |
| traefik | Routes :80 by PathPrefix to central admin nodes | host 80, dashboard 8089 |
Authentication uses the shared GLAuth on the Linux Docker host at 10.100.0.35:3893 (baseDN dc=zb,dc=local). Only the central admin nodes authenticate users. Sign in as multi-role / password to get all three OtOpcUa roles (Administrator, Designer, Viewer), or use any other shared test user with password password. Group→role mappings are seeded by seed/seed-clusters.sql (OtOpcUa-Admins→Administrator, OtOpcUa-Designers→Designer, OtOpcUa-Viewers→Viewer). The shared GLAuth source of truth and deploy runbook live in scadaproj/infra/glauth/.
Central nodes — fused admin+driver (MAIN cluster, UI + deploy singleton)
| Service | Roles | Ports |
|---|---|---|
| central-1 | OTOPCUA_ROLES=admin,driver, Akka mesh seed | host 4840 → container 4840; internal 9000 |
| central-2 | OTOPCUA_ROLES=admin,driver, joins central-1 | host 4841 → container 4840; internal 9000 |
central-1 and central-2 are the only nodes that host the Admin UI and the deploy singleton. They are also the OPC UA publishers for the MAIN cluster. Traefik routes all PathPrefix(/) traffic to whichever central node has the leader role.
Site A nodes — driver-only (SITE-A cluster)
| Service | Roles | Ports |
|---|---|---|
| site-a-1 | OTOPCUA_ROLES=driver, joins the single mesh | host 4842 → container 4840 |
| site-a-2 | OTOPCUA_ROLES=driver, joins the single mesh | host 4843 → container 4840 |
Site B nodes — driver-only (SITE-B cluster)
| Service | Roles | Ports |
|---|---|---|
| site-b-1 | OTOPCUA_ROLES=driver, joins the single mesh | host 4844 → container 4840 |
| site-b-2 | OTOPCUA_ROLES=driver, joins the single mesh | host 4845 → container 4840 |
Site nodes serve no UI and authenticate no users. The central cluster manages and deploys to them over the shared Akka mesh. All six nodes bind Akka remoting to port 4053 inside their own network namespace; PublicHostname for each matches its Compose service name.
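The host-port assignments in the tables above can be condensed into a small lookup for scripts that need to reach a specific node. This is a minimal sketch: the function name opcua_endpoint is hypothetical and not part of the repo; only the service names and host ports come from the tables.

```shell
# Map a Compose service name to its host-side OPC UA endpoint.
# Ports are the host bindings from the tables above; the function
# itself is an illustrative helper, not something the repo ships.
opcua_endpoint() {
  local port
  case "$1" in
    central-1) port=4840 ;;
    central-2) port=4841 ;;
    site-a-1)  port=4842 ;;
    site-a-2)  port=4843 ;;
    site-b-1)  port=4844 ;;
    site-b-2)  port=4845 ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
  echo "opc.tcp://localhost:${port}"
}

# Usage:
#   opcua_endpoint site-b-1   # prints opc.tcp://localhost:4844
```

Inside the Compose network every node listens on container port 4840, so this mapping only matters for clients running on the host.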
Multi-tenancy
All six host nodes write to the same OtOpcUa ConfigDb. The ServerCluster table differentiates the three logical clusters: each maps to one row, and each ClusterNode row's ClusterId ties the runtime node back to its owning cluster scope.
A one-shot cluster-seed Compose service (image mcr.microsoft.com/mssql-tools) waits for the OtOpcUa ConfigDb schema to exist (the host nodes do not auto-migrate — you apply EF migrations once; see First-time setup) and then INSERTs the rows below. The seed is idempotent — IF NOT EXISTS guards every insert — so re-runs on docker compose up are no-ops:
| Logical cluster | ServerCluster.ClusterId | ClusterNode.NodeId rows |
|---|---|---|
| Main | MAIN | central-1, central-2 (OPC UA publishers + admin UI) |
| Site A | SITE-A | site-a-1, site-a-2 |
| Site B | SITE-B | site-b-1, site-b-2 |
Each ClusterNode.NodeId matches the node's Cluster__PublicHostname env value (Compose service name) — that's the lookup the runtime uses to resolve its own membership. ApplicationUri follows the urn:OtOpcUa:<NodeId> convention.
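To make the naming conventions concrete, the sketch below derives the expected ClusterId and ApplicationUri for each NodeId. The function names are hypothetical, and matching on the service-name prefix is only an illustrative shortcut: the runtime resolves membership from the seeded ClusterNode rows, not from name patterns.

```shell
# Derive the owning ClusterId from a NodeId (the Compose service name).
# Prefix matching is an assumption made for illustration only.
cluster_id_for() {
  case "$1" in
    central-*) echo "MAIN" ;;
    site-a-*)  echo "SITE-A" ;;
    site-b-*)  echo "SITE-B" ;;
    *) echo "unknown node: $1" >&2; return 1 ;;
  esac
}

# Build the ApplicationUri following the urn:OtOpcUa:<NodeId> convention.
application_uri_for() {
  echo "urn:OtOpcUa:$1"
}

# Print one "NodeId ClusterId ApplicationUri" line per expected node.
expected_nodes() {
  local node
  for node in central-1 central-2 site-a-1 site-a-2 site-b-1 site-b-2; do
    echo "$node $(cluster_id_for "$node") $(application_uri_for "$node")"
  done
}
```

Calling expected_nodes prints six lines, which can be compared by eye against the rows the seed inserts.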
The SQL lives at seed/seed-clusters.sql; the wait-and-apply wrapper lives at seed/entrypoint.sh. To re-seed manually:
docker compose -f docker-dev/docker-compose.yml run --rm cluster-seed
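For readers unfamiliar with the wait-and-apply pattern, here is a generic retry helper of the kind such a wrapper might use. It is a sketch of the general technique and not the contents of seed/entrypoint.sh; retry_until, schema_ready, and apply_seed are hypothetical names.

```shell
# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds or MAX_ATTEMPTS is reached.
retry_until() {
  local max_attempts="$1"
  local delay="$2"
  local attempt=1
  shift 2
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage (schema_ready and apply_seed are hypothetical functions that
# would check for dbo.ServerCluster and run the seed SQL respectively):
#   retry_until 30 2 schema_ready && apply_seed
```

Bounding the number of attempts keeps a one-shot container from hanging forever when the schema never appears.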
Galaxy / MxAccess gateway
The seed also pre-creates a SystemPlatform Namespace + a GalaxyMxGateway DriverInstance in the MAIN cluster pointing at http://10.100.0.48:5120. The API key is resolved from the GALAXY_MXGW_API_KEY env var set on every driver-role container in compose; override via GALAXY_MXGW_API_KEY=... docker compose up -d to swap keys without editing the compose file.
The DriverHost actor doesn't spawn drivers from raw DriverInstance rows on its own — the v2 deploy lifecycle requires a sealed Deployment before drivers materialise. After first bring-up, sign in to the Admin UI and click Deploy current configuration on /deployments to compose the seeded rows into an artifact and dispatch it. The Galaxy driver instance will start its gRPC connection to the gateway on the next deploy ack.
Bring up
# from the repo root
docker compose -f docker-dev/docker-compose.yml up -d --build
# wait ~20 seconds for SQL to come up + the mesh to form
open http://localhost:9200 # Admin UI (Traefik → central-1 or central-2)
open http://localhost:8089 # Traefik dashboard
The first build takes a few minutes (.NET SDK image + restore + publish). Subsequent rebuilds are faster with Docker's layer cache.
First-time setup (or after down -v)
The host nodes do not auto-create the ConfigDb schema — on a brand-new SQL volume you must apply the EF migrations once, then (re)run the seed. (The auto-started cluster-seed polls for dbo.ServerCluster, which the first migration creates, so if it runs mid-migration it can fail against an intermediate schema — just re-run it after migrations finish.)
# 1. bring the stack up (SQL + nodes; nodes retry the DB until the schema exists)
docker compose -f docker-dev/docker-compose.yml up -d --build
# 2. create + migrate the OtOpcUa ConfigDb (one time; the design-time factory reads OTOPCUA_CONFIG_CONNECTION)
OTOPCUA_CONFIG_CONNECTION="Server=localhost,14330;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;" \
dotnet ef database update \
--project src/Core/ZB.MOM.WW.OtOpcUa.Configuration \
--startup-project src/Core/ZB.MOM.WW.OtOpcUa.Configuration
# 3. apply the cluster/namespace/driver seed against the now-complete schema (idempotent)
docker compose -f docker-dev/docker-compose.yml run --rm cluster-seed
After the schema + seed exist, a plain docker compose ... up -d is enough — the named SQL volume keeps both across restarts (only down -v wipes them, which is when you repeat the steps above).
Auth (dev only)
Central nodes authenticate against the shared GLAuth at 10.100.0.35:3893 (baseDN dc=zb,dc=local). DevStubMode is not active. Sign in with any test user (password password); multi-role / password returns all three roles (Administrator, Designer, Viewer). Group→role mappings are seeded by seed/seed-clusters.sql. The GLAuth source of truth + deploy runbook is in scadaproj/infra/glauth/. Do not enable DevStubMode outside local debugging — production must always bind a real LDAP backend.
Headless deploy
POST http://localhost:9200/api/deployments
X-Api-Key: docker-dev-deploy-key
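The request above can be issued from a shell with curl. This is a minimal sketch: the trigger_deploy wrapper is hypothetical, and whether the endpoint expects a request body or what it returns is not stated here, so the function only reports the HTTP status code.

```shell
# trigger_deploy [BASE_URL] [API_KEY]
# POSTs to /api/deployments and prints only the HTTP status code.
# Defaults mirror the URL and key shown above; the wrapper is illustrative.
trigger_deploy() {
  local base_url="${1:-http://localhost:9200}"
  local api_key="${2:-docker-dev-deploy-key}"
  curl -sS -o /dev/null -w '%{http_code}' \
    -X POST \
    -H "X-Api-Key: ${api_key}" \
    "${base_url}/api/deployments"
}

# Usage:
#   trigger_deploy                                   # local defaults
#   trigger_deploy http://localhost:9200 "$MY_KEY"   # explicit URL and key
```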
Tear down
docker compose -f docker-dev/docker-compose.yml down -v
The -v flag drops the SQL volume; omit it to keep ConfigDb state across restarts. There is no local LDAP volume because LDAP is the shared external GLAuth on 10.100.0.35:3893.
Failover smoke
- Watch the Traefik dashboard at http://localhost:8089. Both central-1 and central-2 should be listed as healthy in the otopcua-admin service.
- Run docker compose -f docker-dev/docker-compose.yml stop central-1. central-2 should pick up the admin role-leader within ~15 s (Akka split-brain stable-after). Traefik will route traffic to central-2 once its /health/active returns 200.
- Run docker compose -f docker-dev/docker-compose.yml start central-1. central-1 rejoins as a follower; central-2 keeps the leader role until something disturbs it.
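To put a rough number on the failover window, a polling helper can report how long it takes for an HTTP endpoint to answer 200 again after central-1 is stopped. This is a sketch under assumptions: wait_for_200 is a hypothetical helper, and which URL to poll (and whether it returns 200 without signing in) depends on your Traefik routing.

```shell
# wait_for_200 URL TIMEOUT_SECONDS
# Polls URL about once per second and prints the approximate number of
# seconds elapsed when it first returns HTTP 200. Returns 1 on timeout.
wait_for_200() {
  local url="$1"
  local timeout="$2"
  local elapsed=0
  local code
  while [ "$elapsed" -le "$timeout" ]; do
    # A failed connection makes curl report 000; keep polling in that case.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$url" || true)
    if [ "$code" = "200" ]; then
      echo "$elapsed"
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "no 200 from ${url} within ${timeout}s" >&2
  return 1
}

# Usage (the URL is an assumption; substitute a path that answers 200):
#   docker compose -f docker-dev/docker-compose.yml stop central-1
#   wait_for_200 http://localhost:9200/ 30
```

The printed value counts polling iterations, so it undercounts slightly when individual requests are slow.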
Notes
- This compose is for the local Mac/Linux developer rig. The team's CI + soak runs go to the remote docker host at 10.100.0.35 (see docs/v2/dev-environment.md); the file there mirrors this one with adjusted port bindings.
- The OPC UA endpoints are reachable directly from the host (Traefik is only in front of the admin HTTP surface):
  - Main: opc.tcp://localhost:4840 (central-1), opc.tcp://localhost:4841 (central-2)
  - Site A: opc.tcp://localhost:4842 (site-a-1), opc.tcp://localhost:4843 (site-a-2)
  - Site B: opc.tcp://localhost:4844 (site-b-1), opc.tcp://localhost:4845 (site-b-2)
- Galaxy + Wonderware drivers can't run in Linux containers (they need the Windows-only mxaccessgw + Historian SDK). On non-Windows, DriverInstanceActor.ShouldStub(driverType, roles) returns true for those types and the actor goes straight to a Stubbed state that returns deterministic success.
- SQL persistence: ConfigDb state survives container restarts (named Docker volume). Drop the volume with down -v for a clean slate.