fix(deploy): ClusterNode NodeId uses host:port + Traefik sticky cookie
Some checks failed
v2-ci / build (push) Failing after 41s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped

Two bring-up issues found while clicking through the operator Deploy flow
on the docker-dev stack:

- ConfigPublishCoordinator computes expected-ack NodeIds from
  Akka.Cluster.State.Members as "{host}:{port}" (e.g. "driver-a:4053") to
  match ClusterRoleInfo's NodeId derivation. The seed had been using the
  bare service name ("driver-a"), so NodeDeploymentState INSERT hit FK
  violation 547 on NodeDeploymentState.NodeId → ClusterNode.NodeId. Seed
  now writes the full host:port form for every ClusterNode row.

- Blazor Server uses SignalR (WebSocket upgrade after the initial GET).
  Without sticky sessions, Traefik round-robins admin-a/admin-b and the
  WebSocket upgrade lands on the wrong backend, returning "No Connection
  with that ID: Status code '404'" so @onclick handlers never fire on the
  client. Added sticky.cookie (otopcua_lb, SameSite=Lax) to all three
  Traefik service loadBalancers so each session pins to one node.

Verified end-to-end: clicked "Deploy current configuration" on
/deployments → Deployment row sealed in ~70ms → driver-a + driver-b
spawn GalaxyMxGateway driver (stub=False) → GalaxyDriver connects to
http://10.100.0.48:5120 with the seeded ApiKeySecretRef=env:GALAXY_MXGW_API_KEY.
This commit is contained in:
Joseph Doherty
2026-05-26 15:10:11 -04:00
parent 60beb9128e
commit 44b8a9c7ff
2 changed files with 41 additions and 12 deletions

View File

@@ -55,45 +55,50 @@ IF NOT EXISTS (SELECT 1 FROM dbo.ServerCluster WHERE ClusterId = 'SITE-B')
------------------------------------------------------------------------------
-- ClusterNode — main cluster OPC UA publishers
--
-- NodeId is "<compose-service>:4053" so it matches what ClusterRoleInfo +
-- ConfigPublishCoordinator derive from Akka.Cluster.Get(system).State.Members
-- (member.Address.Host:Port). NodeDeploymentState.NodeId is FK-bound to
-- ClusterNode.NodeId; mismatched values cause FK 547 on deploy.
------------------------------------------------------------------------------
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'driver-a')
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'driver-a:4053')
INSERT INTO dbo.ClusterNode
(NodeId, ClusterId, Host, OpcUaPort, DashboardPort, ApplicationUri, ServiceLevelBase, Enabled, CreatedBy)
VALUES ('driver-a', 'MAIN', 'driver-a', 4840, 8081, 'urn:OtOpcUa:driver-a', 200, 1, 'docker-dev-seed');
VALUES ('driver-a:4053', 'MAIN', 'driver-a', 4840, 8081, 'urn:OtOpcUa:driver-a', 200, 1, 'docker-dev-seed');
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'driver-b')
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'driver-b:4053')
INSERT INTO dbo.ClusterNode
(NodeId, ClusterId, Host, OpcUaPort, DashboardPort, ApplicationUri, ServiceLevelBase, Enabled, CreatedBy)
VALUES ('driver-b', 'MAIN', 'driver-b', 4840, 8081, 'urn:OtOpcUa:driver-b', 150, 1, 'docker-dev-seed');
VALUES ('driver-b:4053', 'MAIN', 'driver-b', 4840, 8081, 'urn:OtOpcUa:driver-b', 150, 1, 'docker-dev-seed');
------------------------------------------------------------------------------
-- ClusterNode — site A
------------------------------------------------------------------------------
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'site-a-1')
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'site-a-1:4053')
INSERT INTO dbo.ClusterNode
(NodeId, ClusterId, Host, OpcUaPort, DashboardPort, ApplicationUri, ServiceLevelBase, Enabled, CreatedBy)
VALUES ('site-a-1', 'SITE-A', 'site-a-1', 4840, 8081, 'urn:OtOpcUa:site-a-1', 200, 1, 'docker-dev-seed');
VALUES ('site-a-1:4053', 'SITE-A', 'site-a-1', 4840, 8081, 'urn:OtOpcUa:site-a-1', 200, 1, 'docker-dev-seed');
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'site-a-2')
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'site-a-2:4053')
INSERT INTO dbo.ClusterNode
(NodeId, ClusterId, Host, OpcUaPort, DashboardPort, ApplicationUri, ServiceLevelBase, Enabled, CreatedBy)
VALUES ('site-a-2', 'SITE-A', 'site-a-2', 4840, 8081, 'urn:OtOpcUa:site-a-2', 150, 1, 'docker-dev-seed');
VALUES ('site-a-2:4053', 'SITE-A', 'site-a-2', 4840, 8081, 'urn:OtOpcUa:site-a-2', 150, 1, 'docker-dev-seed');
------------------------------------------------------------------------------
-- ClusterNode — site B
------------------------------------------------------------------------------
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'site-b-1')
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'site-b-1:4053')
INSERT INTO dbo.ClusterNode
(NodeId, ClusterId, Host, OpcUaPort, DashboardPort, ApplicationUri, ServiceLevelBase, Enabled, CreatedBy)
VALUES ('site-b-1', 'SITE-B', 'site-b-1', 4840, 8081, 'urn:OtOpcUa:site-b-1', 200, 1, 'docker-dev-seed');
VALUES ('site-b-1:4053', 'SITE-B', 'site-b-1', 4840, 8081, 'urn:OtOpcUa:site-b-1', 200, 1, 'docker-dev-seed');
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'site-b-2')
IF NOT EXISTS (SELECT 1 FROM dbo.ClusterNode WHERE NodeId = 'site-b-2:4053')
INSERT INTO dbo.ClusterNode
(NodeId, ClusterId, Host, OpcUaPort, DashboardPort, ApplicationUri, ServiceLevelBase, Enabled, CreatedBy)
VALUES ('site-b-2', 'SITE-B', 'site-b-2', 4840, 8081, 'urn:OtOpcUa:site-b-2', 150, 1, 'docker-dev-seed');
VALUES ('site-b-2:4053', 'SITE-B', 'site-b-2', 4840, 8081, 'urn:OtOpcUa:site-b-2', 150, 1, 'docker-dev-seed');
------------------------------------------------------------------------------
-- Galaxy MxAccess gateway — MAIN cluster

View File

@@ -28,6 +28,14 @@ http:
services:
otopcua-admin:
loadBalancer:
# Blazor Server uses SignalR; the WebSocket upgrade must hit the same
# backend that owns the circuit ID. Sticky cookie keeps each session
# pinned to one node so the post-handshake WebSocket doesn't 404.
sticky:
cookie:
name: otopcua_lb
httpOnly: true
sameSite: lax
servers:
- url: "http://admin-a:9000"
- url: "http://admin-b:9000"
@@ -38,6 +46,14 @@ http:
otopcua-site-a:
loadBalancer:
# Blazor Server uses SignalR; the WebSocket upgrade must hit the same
# backend that owns the circuit ID. Sticky cookie keeps each session
# pinned to one node so the post-handshake WebSocket doesn't 404.
sticky:
cookie:
name: otopcua_lb
httpOnly: true
sameSite: lax
servers:
- url: "http://site-a-1:9000"
- url: "http://site-a-2:9000"
@@ -48,6 +64,14 @@ http:
otopcua-site-b:
loadBalancer:
# Blazor Server uses SignalR; the WebSocket upgrade must hit the same
# backend that owns the circuit ID. Sticky cookie keeps each session
# pinned to one node so the post-handshake WebSocket doesn't 404.
sticky:
cookie:
name: otopcua_lb
httpOnly: true
sameSite: lax
servers:
- url: "http://site-b-1:9000"
- url: "http://site-b-2:9000"