Files
lmxopcua/docker-dev
Joseph Doherty 7dfbca6469
Some checks failed
v2-ci / build (push) Failing after 47s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
feat(opcua): materialise SystemPlatform tags (Galaxy) as OPC UA variables
Closes the gap where Tag rows with EquipmentId=NULL + Namespace.Kind=SystemPlatform
(Galaxy hierarchy) existed in ConfigDb but were never surfaced in the OPC UA
address space. Now they materialise as Variable nodes under a folder named for
their FolderPath, browseable through any OPC UA client.

Layers touched:

- IOpcUaAddressSpaceSink: new EnsureVariable(nodeId, parentFolderId, displayName,
  dataType) signature on the sink interface, NullSink, DeferredSink, SdkSink.
- OtOpcUaNodeManager.EnsureVariable: creates a BaseDataVariableState parented
  under the named folder (or root), initial Value=null +
  StatusCode=BadWaitingForInitialData; resolves Tag.DataType strings to the
  matching OPC UA built-in NodeId. Idempotent.
- Phase7CompositionResult: new GalaxyTags collection of GalaxyTagPlan records
  carrying (TagId, DriverInstanceId, FolderPath, DisplayName, DataType,
  MxAccessRef). Constructor overloads keep existing call sites compiling.
- Phase7Composer.Compose: now takes Tag + Namespace inputs, filters for
  SystemPlatform-namespace tags with EquipmentId=NULL, emits GalaxyTagPlan
  rows with MXAccess ref "FolderPath.Name".
- Phase7Plan: new AddedGalaxyTags / RemovedGalaxyTags / ChangedGalaxyTags
  collections + GalaxyTagDelta record; IsEmpty + needsRebuild updated.
- Phase7Planner.Compute: diffs GalaxyTags by TagId via existing DiffById helper.
- DeploymentArtifact.ParseComposition: reads the Tags + Namespaces +
  DriverInstances arrays the ConfigComposer already emits, applies the same
  SystemPlatform filter, returns the same GalaxyTagPlan list as the composer
  so artifact-side and compose-side plans agree.
- Phase7Applier: new MaterialiseGalaxyTags pass that ensures one folder per
  distinct FolderPath then one Variable per tag. NodeId for the variable is
  "<FolderPath>.<Name>" matching the MXAccess ref so the future Galaxy
  SubscribeBulk wiring can address them directly.
- OpcUaPublishActor.RebuildAddressSpace: invokes MaterialiseGalaxyTags after
  MaterialiseHierarchy. _lastApplied initialiser updated for the new ctor.
- seed-clusters.sql: pre-existing TestMachine_001.TestAlarm001..003 rows
  needed no change — the composer/applier now picks them up automatically.

Verified end-to-end via docker-dev: deploy click → driver-a logs
"Phase7Applier: Galaxy tags materialised (tags=3, folders=1)" → OPC UA Client
CLI browses the three Variable nodes under TestMachine_001 folder. Reads
return BadWaitingForInitialData status (expected — Galaxy driver's
SubscribeBulk wiring to push values into the nodes is the remaining
follow-up).
2026-05-26 15:43:22 -04:00
..

docker-dev

Mac-friendly multi-cluster OtOpcUa fleet for manual UI exercise + integration smoke tests. Spins up three isolated Akka clusters + SQL Server + OpenLDAP + Traefik on the same Compose network. All three clusters share the single OtOpcUa ConfigDb — multi-tenancy is enforced by per-row ServerCluster.ClusterId scoping. Akka.Cluster gossip stays isolated between meshes because their seed-node lists are disjoint, even though they share the same system name otopcua.

Stack

Shared infrastructure

Service Role Ports
sql SQL Server 2022 — single OtOpcUa ConfigDb shared by all three clusters host 14330 → container 1433
traefik Routes :80 by Host header / PathPrefix host 80, dashboard 8089

Authentication runs in DevStubMode — every host container has Authentication__Ldap__DevStubMode=true set, so the LDAP service is not part of the dev compose right now (the bitnami/openldap:2.6 image was retired and the legacy tag crashes mid-setup with exit 68). Any non-empty username/password signs in as FleetAdmin. To restore a real LDAP service, drop the env var and add an openldap-compatible image back to compose.

Main cluster — split admin/driver roles

Service Role Ports
admin-a OTOPCUA_ROLES=admin, cluster seed internal 9000
admin-b OTOPCUA_ROLES=admin, joins admin-a internal 9000
driver-a OTOPCUA_ROLES=driver host 4840 → container 4840
driver-b OTOPCUA_ROLES=driver host 4841 → container 4840

Site A cluster — 2-node fused admin+driver

Service Role Ports
site-a-1 OTOPCUA_ROLES=admin,driver, cluster seed host 4842 → container 4840
site-a-2 OTOPCUA_ROLES=admin,driver, joins site-a-1 host 4843 → container 4840

Site B cluster — 2-node fused admin+driver

Service Role Ports
site-b-1 OTOPCUA_ROLES=admin,driver, cluster seed host 4844 → container 4840
site-b-2 OTOPCUA_ROLES=admin,driver, joins site-b-1 host 4845 → container 4840

All containers bind Akka remoting to port 4053 inside their own network namespace; the PublicHostname of each matches its Compose service name. Akka mesh isolation is enforced purely by disjoint seed lists. Configuration-side isolation is enforced by ServerCluster.ClusterId — see "Multi-tenancy" below.

Multi-tenancy

All eight host nodes write to the same OtOpcUa ConfigDb. The ServerCluster table differentiates the three Akka meshes: each Akka cluster maps to one row, and each ClusterNode row's ClusterId ties the runtime node back to its owning cluster scope.

A one-shot cluster-seed Compose service (image mcr.microsoft.com/mssql-tools) waits for SQL + the EF auto-migration to complete and then INSERTs the rows below. The seed is idempotentIF NOT EXISTS guards every insert — so re-runs on docker compose up are no-ops:

Akka mesh ServerCluster.ClusterId ClusterNode.NodeId rows
Main MAIN driver-a, driver-b (OPC UA publishers)
Site A SITE-A site-a-1, site-a-2
Site B SITE-B site-b-1, site-b-2

ClusterNode is the table for OPC UA-publishing nodes (not every Akka cluster member), which is why the main cluster's admin-a / admin-b don't get rows — they're control-plane-only.

Each ClusterNode.NodeId matches the node's Cluster__PublicHostname env value (Compose service name) — that's the lookup the runtime uses to resolve its own membership. ApplicationUri follows the urn:OtOpcUa:<NodeId> convention.

The SQL lives at seed/seed-clusters.sql; the wait-and-apply wrapper lives at seed/entrypoint.sh. To re-seed manually:

docker compose -f docker-dev/docker-compose.yml run --rm cluster-seed

Galaxy / MxAccess gateway

The seed also pre-creates a SystemPlatform Namespace + a GalaxyMxGateway DriverInstance in the MAIN cluster pointing at http://10.100.0.48:5120. The API key is resolved from the GALAXY_MXGW_API_KEY env var set on every driver-role container in compose; override via GALAXY_MXGW_API_KEY=... docker compose up -d to swap keys without editing the compose file.

The DriverHost actor doesn't spawn drivers from raw DriverInstance rows on its own — the v2 deploy lifecycle requires a sealed Deployment before drivers materialise. After first bring-up, sign in to the Admin UI and click Deploy current configuration on /deployments to compose the seeded rows into an artifact and dispatch it. The Galaxy driver instance will start its gRPC connection to the gateway on the next deploy ack.

Bring up

# from the repo root
docker compose -f docker-dev/docker-compose.yml up -d --build

# wait ~20 seconds for SQL to come up + all three clusters to form

open http://localhost                      # main cluster admin UI
open http://site-a.localhost               # site A admin UI
open http://site-b.localhost               # site B admin UI
open http://localhost:8089                 # Traefik dashboard

On macOS, *.localhost resolves to 127.0.0.1 automatically. On Linux add 127.0.0.1 site-a.localhost site-b.localhost to /etc/hosts if your resolver doesn't.

The first build takes a few minutes (.NET SDK image + restore + publish). Subsequent rebuilds are faster with Docker's layer cache.

Auth (dev only)

Authentication__Ldap__DevStubMode=true is set on every host container, so any non-empty username/password signs in as a FleetAdmin user without contacting an LDAP server. Do not ship this configuration to production — set DevStubMode=false and wire a real LDAP backend before any non-dev deployment.

Tear down

docker compose -f docker-dev/docker-compose.yml down -v

The -v drops the SQL + LDAP volumes; remove it to keep ConfigDb state across restarts.

Failover smoke

  1. Watch the Traefik dashboard at http://localhost:8089. Both admin-a and admin-b should be listed as healthy in the otopcua-admin service.
  2. docker compose -f docker-dev/docker-compose.yml stop admin-aadmin-b should pick up the admin role-leader within ~15 s (Akka split-brain stable-after). Traefik will route traffic to admin-b once its /health/active returns 200.
  3. docker compose -f docker-dev/docker-compose.yml start admin-aadmin-a rejoins as a follower; admin-b keeps the leader role until something disturbs it.

Notes

  • This compose is for the local Mac/Linux developer rig. The team's CI + soak runs go to the remote docker host at 10.100.0.35 (see docs/v2/dev-environment.md); the file there mirrors this one with adjusted port bindings.
  • The OPC UA driver endpoints are reachable directly from the host (Traefik is only in front of the admin HTTP surface):
    • Main: opc.tcp://localhost:4840 (driver-a), opc.tcp://localhost:4841 (driver-b)
    • Site A: opc.tcp://localhost:4842 (site-a-1), opc.tcp://localhost:4843 (site-a-2)
    • Site B: opc.tcp://localhost:4844 (site-b-1), opc.tcp://localhost:4845 (site-b-2)
  • Galaxy + Wonderware drivers can't run in Linux containers (they need the Windows-only mxaccessgw + Historian SDK). On non-Windows, DriverInstanceActor.ShouldStub(driverType, roles) returns true for those types and the actor goes straight to a Stubbed state that returns deterministic success.