12 Commits

Author SHA1 Message Date
Joseph Doherty 1d729fb0f8 feat: adopt shared ZB.MOM.WW.Health probes (preserve tiers + OtOpcUaCompat policy) 2026-06-01 13:36:28 -04:00
Joseph Doherty 0b99aceacb build: reference ZB.MOM.WW.Health packages from the Gitea feed 2026-06-01 13:30:13 -04:00
Joseph Doherty d57b42bcd6 chore: gitignore local credentials file and runtime PKI store
v2-ci / build (push) Failing after 45s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
sql_login.txt holds DB creds and the Host pki/ dir is the runtime OPC UA
certificate store (private keys + issued/trusted certs); neither belongs
in source control, and ignoring them prevents an accidental git add .
2026-05-31 10:27:59 -04:00
Joseph Doherty 5e87f7e16f docs(alarms): record 2026-05-31 live re-confirmation of native alarm feed
v2-ci / build (push) Failing after 41s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
Independently re-ran the D.1 alarm-source smoke against the live gateway
(10.100.0.48:5120) to back the native MxAccess alarm-event claim with a
fresh empirical run, not just the original 2026-05-29 capture.
2026-05-31 10:12:47 -04:00
Joseph Doherty 695fa6408b docs(alarms): record native alarms verified working; add D.1 smoke
v2-ci / build (push) Failing after 47s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
The 2026-04-30 alarm plan banners claimed worker-side native alarm
subscription was blocked on a COM-bitness finding. That's stale: the
mxaccessgw .NET client now has true MxAccess alarm-event support, and a
live StreamAlarms check (+ new Skip-gated GatewayGalaxyAlarmFeedLiveTests
through the lmxopcua consumer) confirms native alarms — operator comment,
category, severity, timestamps — flow end-to-end. Reconcile both plan docs
to reality and add docs/plans/alarms-d1-smoke-artifact.md as the D.1
alarm-source deliverable. Historian-write live smoke + full server->A&C
round-trip remain (Windows parity rig only).
2026-05-31 09:59:01 -04:00
Joseph Doherty 61193629b6 fix(adminui): wire Test Connect probes + live panels on admin-only nodes
v2-ci / build (push) Failing after 36s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
Both bugs surfaced only on split-role deployments (the MAIN cluster's
admin-only nodes), where the AdminUI runs without the driver role.

- Test Connect returned "No probe registered" for every driver: the
  IDriverProbe set was registered only under the driver role, but the
  admin-operations singleton that consumes it is pinned to admin. Extract
  AddOtOpcUaDriverProbes() (idempotent via TryAddEnumerable) and call it
  in the hasAdmin path too.

- Live driver-status/alerts/script-log panels showed "SignalR error:
  Connection refused": these Blazor Server components opened a HubConnection
  to their own hub via the browser's public URL, which server-side code
  can't reach behind Traefik (host :9200 -> container :9000). Read the
  in-process source directly instead -- DriverStatus via
  IDriverStatusSnapshotStore.SnapshotChanged, Alerts/ScriptLog via a new
  IInProcessBroadcaster<T>. Fleet status was unaffected (reads DB/ActorSystem).

Adds unit tests for probe registration, the snapshot-store event, and the
broadcaster.
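
The in-process replacement for the self-connecting HubConnection can be pictured as a plain event fan-out. A minimal sketch, assuming `IInProcessBroadcaster<T>` exposes a `Received` event of type `Action<T>`. The `Received` name appears in the Alerts component; the delegate type, the `Publish` method, and the `InProcessBroadcaster<T>` / `BroadcasterDemo` names are illustrative assumptions, not the repository's actual API:

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical shape: only the interface name and the Received event are taken from the diff.
public interface IInProcessBroadcaster<T>
{
    event Action<T>? Received;

    // Assumed publish entry point (the real producer side is not shown in the source).
    void Publish(T item);
}

public sealed class InProcessBroadcaster<T> : IInProcessBroadcaster<T>
{
    public event Action<T>? Received;

    // Invokes every current subscriber; a null event means nobody is listening.
    public void Publish(T item) => Received?.Invoke(item);
}

public static class BroadcasterDemo
{
    // Subscribes, publishes once, unsubscribes, publishes again.
    // Only the first item is delivered to the handler.
    public static List<string> Run()
    {
        var seen = new List<string>();
        var bus = new InProcessBroadcaster<string>();
        Action<string> handler = seen.Add;

        bus.Received += handler;
        bus.Publish("raise");

        bus.Received -= handler;   // mirrors the component's Dispose()
        bus.Publish("clear");      // no subscriber left, so nothing is recorded

        return seen;
    }
}
```

A component consuming such an event would still need to marshal list mutation and re-rendering onto its own synchronization context (the Alerts page does this with `InvokeAsync`), since the publisher's thread is not the circuit's.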
2026-05-29 16:38:32 -04:00
Joseph Doherty e3a27422a1 fix(adminui): Galaxy editor 500 — read DriverConfig case-insensitively + null-safe FromRecord
v2-ci / build (push) Failing after 39s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
GalaxyDriverPage deserialized DriverConfig with case-sensitive camelCase opts, but the
persisted/seeded config is PascalCase (the runtime reads it case-insensitively). So all four
nested option records read as null -> FromRecord NRE (HTTP 500) on edit, and the form would
have shown defaults instead of the real config (risking a clobber on save). Fix: add
PropertyNameCaseInsensitive=true (matches the runtime) so real values load, plus null-coalesce
the nested records in FromRecord as defense-in-depth. Regression test asserts the seeded
PascalCase config loads its real values.
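
The failure mode and the fix can be reproduced with `System.Text.Json` alone. A minimal sketch, assuming a simplified config shape: `DriverConfig`, `GatewayOptions`, and `DriverConfigLoader` here are hypothetical stand-ins, not the real `GalaxyDriverPage` types. Only `PropertyNamingPolicy` and `PropertyNameCaseInsensitive` are actual serializer options:

```csharp
#nullable enable
using System.Text.Json;

// Hypothetical nested option record and top-level config (the real shapes are not shown in the source).
public sealed class GatewayOptions
{
    public string? Endpoint { get; init; }
}

public sealed class DriverConfig
{
    public GatewayOptions? Gateway { get; init; }
}

public static class DriverConfigLoader
{
    // camelCase policy, case-sensitive matching: a PascalCase "Gateway" key is silently ignored.
    private static readonly JsonSerializerOptions Strict = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    };

    // Same policy, but property names match regardless of case.
    private static readonly JsonSerializerOptions Lenient = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        PropertyNameCaseInsensitive = true,
    };

    public static DriverConfig? LoadStrict(string json) =>
        JsonSerializer.Deserialize<DriverConfig>(json, Strict);

    public static DriverConfig? LoadLenient(string json) =>
        JsonSerializer.Deserialize<DriverConfig>(json, Lenient);

    // Defense-in-depth: never hand a null nested record to the form model.
    public static GatewayOptions GatewayOrDefault(DriverConfig? config) =>
        config?.Gateway ?? new GatewayOptions();
}
```

With a PascalCase payload such as `{"Gateway":{"Endpoint":"http://example.invalid"}}`, `LoadStrict` leaves `Gateway` null while `LoadLenient` populates it, which is the difference between the HTTP 500 and a correctly loaded form.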
2026-05-29 12:45:44 -04:00
Joseph Doherty 32d7fd7cc9 fix(galaxy): complete PR 7.2 rename — use canonical GalaxyMxGateway driver type
v2-ci / build (push) Failing after 48s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
The driver/factory/seed use 'GalaxyMxGateway' (legacy 'Galaxy' was retired),
but the AdminUI editor router, GalaxyDriverPage, address picker, identity
dropdown, the Galaxy browser/probe, and DraftValidator still keyed on 'Galaxy'.
Result: the seeded GalaxyMxGateway driver couldn't be edited ('no editor
registered'), UI-created Galaxy drivers wrote a type with no factory, and a
SystemPlatform-bound GalaxyMxGateway driver failed publish validation.
Align all stragglers to GalaxyMxGateway (+ failing-test-first DraftValidator
coverage). ShouldStub's 'Galaxy' legacy safety-net left intact.
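
The publish-validation rule that tripped on the stale key is a small namespace/driver compatibility check. A sketch of that rule under stated assumptions: the switch mirrors the `DraftValidator` hunk in this change, but `DriverTypes`, `NamespaceCompat`, and the `Other` enum member are hypothetical names introduced for illustration:

```csharp
// Hypothetical enum; only SystemPlatform and Equipment appear in the source.
public enum NamespaceKind
{
    SystemPlatform,
    Equipment,
    Other,
}

public static class DriverTypes
{
    // Canonical driver type key after the rename (the legacy "Galaxy" key is retired).
    public const string GalaxyMxGateway = "GalaxyMxGateway";
}

public static class NamespaceCompat
{
    // SystemPlatform namespaces accept only the Galaxy gateway driver;
    // Equipment namespaces accept anything except it; other kinds are unconstrained.
    public static bool IsCompatible(NamespaceKind kind, string driverType) => kind switch
    {
        NamespaceKind.SystemPlatform => driverType == DriverTypes.GalaxyMxGateway,
        NamespaceKind.Equipment => driverType != DriverTypes.GalaxyMxGateway,
        _ => true,
    };
}
```

Keying every caller on one constant is what keeps a rename like this from leaving stragglers. Where a project cannot reference the assembly that owns the constant, a commented literal is the fallback, as the `GalaxyDriverBrowser` hunk does.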
2026-05-29 12:31:55 -04:00
Joseph Doherty de666b24c3 test: fix Galaxy-tag Phase7 test fixtures + S7 CLI enum; add MaterialiseGalaxyTags coverage
v2-ci / build (push) Failing after 38s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
Completes the test side of the in-progress Galaxy-tag workstream:
- Phase7ApplierTests / Phase7ApplierHierarchyTests: supply the now-required
  Galaxy-tag args to Phase7Plan / Phase7CompositionResult.
- Add genuine coverage for Phase7Applier.MaterialiseGalaxyTags (folder-per-distinct-path,
  variable-per-tag node-id derivation, folder dedupe) + added-Galaxy-tags-trigger-rebuild.
- S7.Cli.Tests: use the project's S7CpuType (CLI option type) instead of S7.Net.CpuType.
Whole solution now builds 0/0; OpcUaServer.Tests 52, S7.Cli.Tests 36 green.
2026-05-29 12:18:01 -04:00
Joseph Doherty a4fb97aef8 chore(docker-dev): remap Traefik to host port 9200
v2-ci / build (push) Failing after 2m6s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
Host :80 collides with the sister scadabridge-traefik dev stack; bind the
OtOpcUa Traefik :80 entrypoint to host 9200 instead (admin UI now at
http://localhost:9200). Dashboard already on 8089 to avoid the same clash.
2026-05-29 12:09:21 -04:00
Joseph Doherty da4634d67e fix(tests,cli): implement IOpcUaAddressSpaceSink.EnsureVariable in test fakes; fix CLI CS1587
v2-ci / build (push) Failing after 44s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
Resolves the 12 reported build errors (7 CS0535 sink fakes + 5 CLI CS1587).
Runtime.Tests green (74). NOTE: OpcUaServer.Tests still has pre-existing CS7036
errors from the in-progress Galaxy-tag workstream (Phase7Plan/Phase7CompositionResult
new required params) — separate, test-only, not addressed here.
2026-05-29 10:19:32 -04:00
Joseph Doherty 869be660fd fix(adminui): strip stale Phase C.2 / rebuild-plan roadmap notes from cluster list pages
v2-ci / build (push) Failing after 49s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
Removes the internal-roadmap deferral banners (the original request that
seeded this work); kept the genuinely useful operator descriptions.
2026-05-29 10:12:15 -04:00
54 changed files with 1085 additions and 337 deletions
+6
@@ -42,3 +42,9 @@ config_cache*.db
# Client CLI/UI runtime scratch (last-connected endpoint cache)
session.dat
# Secrets / local credentials — never commit
sql_login.txt
# OPC UA certificate store (runtime PKI: own/trusted/issued/rejected certs + keys)
src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/
+3
@@ -96,6 +96,9 @@
<PackageVersion Include="xunit" Version="2.9.2" />
<PackageVersion Include="xunit.runner.visualstudio" Version="3.0.2" />
<PackageVersion Include="xunit.v3" Version="1.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.Akka" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.EntityFrameworkCore" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.MxGateway.Client" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.MxGateway.Contracts" Version="0.1.0" />
</ItemGroup>
+13
@@ -3,5 +3,18 @@
<packageSources>
<add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
<add key="local-mxgw" value="./nuget-packages" />
<add key="dohertj2-gitea" value="https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" />
</packageSources>
<packageSourceMapping>
<packageSource key="nuget.org">
<package pattern="*" />
</packageSource>
<packageSource key="local-mxgw">
<package pattern="ZB.MOM.WW.MxGateway.*" />
</packageSource>
<packageSource key="dohertj2-gitea">
<package pattern="ZB.MOM.WW.Health" />
<package pattern="ZB.MOM.WW.Health.*" />
</packageSource>
</packageSourceMapping>
</configuration>
+1 -1
@@ -248,7 +248,7 @@ services:
- --providers.file.watch=true
- --api.insecure=true
ports:
- "80:80"
- "9200:80" # host port 9200 → traefik :80 entrypoint (80 conflicts with scadabridge-traefik)
- "8089:8080" # 8080 conflicts with the sister scadalink dev stack
volumes:
- ./traefik-dynamic.yml:/etc/traefik/dynamic.yml:ro
+106
@@ -0,0 +1,106 @@
# Alarms D.1 — smoke artifact
> **Status (2026-05-29): alarm-source leg VERIFIED. Historian-write leg still
> pending the Windows sidecar + live AVEVA Historian.**
>
> **Re-confirmed 2026-05-31** against the same gateway (`http://10.100.0.48:5120`):
> the Skip-gated live test passed again, pulling a native `Raise` transition
> (`Galaxy!TestArea.TestMachine_001.TestAlarm001`, raw sev 500 → OPC UA 750/High,
> category `TestArea`, operator comment `Test alarm #1`) through the production
> consumer. Independent re-run, not the original capture.
>
> This is the D.1 deliverable called for by `docs/plans/alarms-worker-wiring-plan.md`
> — captured evidence that a live Galaxy alarm reaches lmxopcua through the native
> gateway path (not the sub-attribute fallback). It supersedes the "A.2 blocked"
> banners in `alarms-over-gateway.md` / `alarms-worker-wiring-plan.md`, which were
> written 2026-04-30 before the gateway's alarm feed was working.
## What was verified
The mxaccessgw gateway **does** serve native MxAccess alarms today, and the lmxopcua
consumer ingests them with full fidelity — **including operator-comment**, the field
the 2026-04-30 plan flagged as "the only v1 regression."
Verified from the macOS dev box against the live gateway at `http://10.100.0.48:5120`
(reachable; `nc -z` succeeds). No acknowledge / no writes were issued — read-only
`StreamAlarms`.
### 1. Gateway boundary — raw `StreamAlarms` (`ZB.MOM.WW.MxGateway.Client`)
A standalone client streamed the active-alarm snapshot: **20 active alarms**, each
carrying native metadata. Sample (one of 20):
```json
{ "alarmFullReference": "Galaxy!TestArea.TestMachine_001.TestAlarm001",
"sourceObjectReference": "TestMachine_001.TestAlarm001",
"alarmTypeName": "DSC", "severity": 500,
"currentState": "ALARM_CONDITION_STATE_ACTIVE", "category": "TestArea",
"lastTransitionTimestamp": "2026-05-24T16:04:10.856Z",
"operatorComment": "Test alarm #1" }
```
Followed by the `SnapshotComplete` marker. `operatorComment`, `category`, `severity`,
`currentState`, and `lastTransitionTimestamp` are all populated.
### 2. lmxopcua consumer — `GatewayGalaxyAlarmFeed` → `GalaxyAlarmTransition`
The Skip-gated live test
`Runtime/GatewayGalaxyAlarmFeedLiveTests.Live_gateway_delivers_native_alarm_transitions_through_the_consumer`
wires the real `MxGatewayClient.StreamAlarmsAsync` into the production consumer seam
and **passes**. Captured output (`D1_SMOKE_OUT`):
```
# consumer transitions observed: 2+
Raise Galaxy!TestArea.TestMachine_001.TestAlarm001 | sev=750(High) raw=500 | cat=TestArea | comment='Test alarm #1' | xitionUtc=2026-05-24T16:04:10.856Z
Raise Galaxy!TestArea.TestMachine_003.TestAlarm001 | sev=750(High) raw=500 | cat=TestArea | comment='Test alarm #1' | xitionUtc=2026-05-07T18:14:00.594Z
```
The consumer preserves `operatorComment` + `category` + transition timestamp and
applies the OPC UA severity-bucket mapping (`MxAccessSeverityMapper`: raw 500 →
OPC UA 750, bucket `High`).
### 3. Full chain to the OPC UA Part 9 surface (code-path verified)
`GalaxyDriver.OnAlarmFeedTransition` maps `GalaxyAlarmTransition` →
`AlarmEventArgs`, carrying `OperatorComment`, `OriginalRaiseTimestampUtc`,
`AlarmCategory`, and the severity bucket onto `IAlarmSource.OnAlarmEvent`.
`AlarmEventArgs` already declares those fields — so the **E.7 contract extension is
done**, not pending. The server's Part-9 condition layer consumes `IAlarmSource`
via `AlarmSurfaceInvoker` → `GenericDriverNodeManager`. Unit coverage:
`GalaxyDriverAlarmSourceTests`, `GatewayGalaxyAlarmFeedTests`.
## How to re-run
```bash
export MXGW_ENDPOINT="http://10.100.0.48:5120"
export GALAXY_MXGW_API_KEY="<dev key from docker-dev/docker-compose.yml>"
export D1_SMOKE_OUT="/tmp/d1-consumer-transitions.txt" # optional capture
dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests \
--filter "FullyQualifiedName~GatewayGalaxyAlarmFeedLiveTests"
```
Without the env vars the test `Skip`s, so normal `dotnet test` runs are unaffected.
## Not covered here (still open)
1. **Scripted-alarm historian write-back → AVEVA Historian** (C.1's live leg). The
`SdkAlarmHistorianWriteBackend` (real `HistorianAccess.AddStreamedValue` path) is
implemented and unit-tested, but its `Live_*` write smoke needs the Windows
historian sidecar + a live AVEVA Historian — neither reachable from the macOS dev
box. Capture this leg on the Windows parity rig.
2. **Running-server → OPC UA A&C client round-trip.** This artifact proves the driver
consumer end; it does not exercise a full OtOpcUa server surfacing the condition to
an OPC UA client, because the docker-dev stack stubs the Galaxy driver on Linux
(`DriverInstanceActor.ShouldStub`). Capture on the Windows parity rig (or a Linux
host with `ShouldStub` overridden to point the real driver at the gateway).
## Mechanism — true MxAccess alarm-event support
The gateway delivers these alarms via **true MxAccess alarm-event support** in the
mxaccessgw .NET client — a real alarm-event subscription, **not** the value-driven
sub-attribute fallback. (Confirmed by the gateway maintainer; the client-side stream
check above can only observe the resulting feed, which is why this artifact records the
mechanism here rather than inferring it.) So A.2 is implemented as originally specified:
`MX_EVENT_FAMILY_ON_ALARM_TRANSITION` carries genuine native alarm-event metadata, and
the operator-comment / original-raise-time / category fields are first-class — not
reconstructed from attribute reads.
+33 -16
@@ -9,24 +9,41 @@
> the new RPCs; the sub-attribute fallback path keeps Galaxy alarms
> functional today.
>
> ⚠️ **Worker-side native alarm subscription blocked on a dev-rig
> finding (2026-04-30):** the MXAccess COM Toolkit at
> **UPDATE 2026-05-29 — native alarm feed VERIFIED working; the
> 2026-04-30 "blocked" finding below is superseded.** A live
> `StreamAlarms` check against the gateway at `10.100.0.48:5120`
> returned the active-alarm snapshot (20 alarms) with full native
> metadata — `severity`, `category`, `currentState`,
> `lastTransitionTimestamp`, **and `operatorComment`** (the field the
> note below called "the only v1 regression"). The lmxopcua consumer
> (`GatewayGalaxyAlarmFeed` → `GalaxyAlarmTransition` →
> `AlarmEventArgs` → `IAlarmSource`) ingests it with full fidelity and
> the OPC UA severity-bucket mapping applied — proven by the passing
> Skip-gated live test `GatewayGalaxyAlarmFeedLiveTests`. `AlarmEventArgs`
> already carries operator-comment / original-raise-time / category, so
> **E.7 is done too**. See `docs/plans/alarms-d1-smoke-artifact.md` for
> the captured evidence. The gateway delivers this via **true MxAccess
> alarm-event support** in the mxaccessgw .NET client (a real
> alarm-event subscription — **not** the sub-attribute fallback), so A.2
> is implemented as originally specified. Still open: the scripted-alarm
> → AVEVA Historian write-back live smoke (C.1's `Live_*` leg) and a full
> running-server → OPC UA A&C round-trip — both need the Windows parity rig.
>
> ⚠️ **[SUPERSEDED — kept for history] Worker-side native alarm
> subscription blocked on a dev-rig finding (2026-04-30):** the MXAccess
> COM Toolkit at
> `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll`
> exposes no alarm-event family — only `OnDataChange`,
> `OnWriteComplete`, `OperationComplete`, `OnBufferedDataChange`.
> exposed no alarm-event family — only `OnDataChange`,
> `OnWriteComplete`, `OperationComplete`, `OnBufferedDataChange` — and
> AVEVA's `aaAlarmManagedClient` / `ArchestrAAlarmsAndEvents.SDK`
> assemblies are x64-only and incompatible with the worker's x86
> bitness. **Operator decision needed before
> `MX_EVENT_FAMILY_ON_ALARM_TRANSITION` carries any events:** either
> accept the value-driven sub-attribute path as the production
> architecture (operator-comment fidelity is the only v1 regression)
> or add an x64 alarm-helper sub-process alongside the worker. See
> `src/MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs` in the
> mxaccessgw repo for the architectural notes. Live
> `aahClientManaged` alarm-event write call site
> (`SdkAlarmHistorianWriteBackend` placeholder from PR C.1) and the
> D.1 smoke artifact ship once those decisions resolve. The
> remainder of this document is preserved as the design record.
> assemblies are x64-only vs. the worker's x86 bitness. The operator
> decision (accept the value-driven sub-attribute path, or add an x64
> alarm-helper sub-process) has since been resolved on the gateway side
> — `MX_EVENT_FAMILY_ON_ALARM_TRANSITION` now carries events (verified
> above). The C.1 `SdkAlarmHistorianWriteBackend` is **no longer a
> placeholder** — it writes through the real
> `HistorianAccess.AddStreamedValue` path (only its live-rig write
> smoke remains).
Coordinated epic across two repos:
+27 -10
@@ -1,5 +1,18 @@
# Alarms Worker Wiring Plan
> ✅ **UPDATE 2026-05-29 — the blocker below is RESOLVED on the gateway side; this
> plan is largely complete.** A live `StreamAlarms` check against `10.100.0.48:5120`
> returns the active-alarm snapshot with full native metadata **including
> `operatorComment`**, and the lmxopcua consumer ingests it end-to-end (passing live
> test `GatewayGalaxyAlarmFeedLiveTests`). So **A.2 / A.3 / A.4** are functionally done
> at the gateway boundary (the worker now emits native alarm transitions and the client
> exposes `AcknowledgeAlarm` / `QueryActiveAlarms` RPCs). **C.1** ships real code
> (`SdkAlarmHistorianWriteBackend` → `HistorianAccess.AddStreamedValue`). **D.1**'s
> alarm-source leg is captured in `docs/plans/alarms-d1-smoke-artifact.md`. Only two
> things remain, both needing the Windows parity rig: C.1's live historian-write smoke
> and a full running-server → OPC UA A&C round-trip. The per-item detail below is kept
> as the historical record of the original blocked state.
>
> **Context**: The alarms-over-gateway epic shipped 19 PRs across the
> `lmxopcua` and `mxaccessgw` repos (merged 2026-04-30). Contracts are live;
> the sub-attribute fallback path keeps Galaxy alarms functional today. Four
@@ -16,7 +29,7 @@
---
## Dev-rig finding that blocks everything (2026-04-30)
## Dev-rig finding that blocks everything (2026-04-30) — [SUPERSEDED 2026-05-29]
During PR A.2 work the following was discovered on the dev box:
@@ -318,16 +331,20 @@ fallback as production).
## Summary of blocks
| Item | Blocked by | Estimated effort once unblocked |
|------|-----------|--------------------------------|
| A.2 | Architectural decision (x64 alarm-helper vs. sub-attribute fallback as production) | 2–3 days implementation; 1 day tests |
| A.3 | A.2 delivering WorkerEvent bodies | 1–2 days |
| A.4 | A.2 (active-alarm query needs AlarmClient session) | 1 day |
| C.1 | aahClientManaged SDK access (available on dev box); NOT blocked by A.2 | 1–2 days |
| D.1 | A.2 + A.3 + C.1 all passing on parity rig | 0.5 day (smoke + artifact capture) |
> **Resolved as of 2026-05-29** — see the update banner at the top and
> `docs/plans/alarms-d1-smoke-artifact.md`. Original status table kept for history.
C.1 can proceed in parallel with A.2 / A.3 since the sidecar's `aahClientManaged`
is x64 and does not share the worker bitness constraint.
| Item | Status (2026-05-29) | Original block |
|------|--------------------|----------------|
| A.2 | ✅ **True MxAccess alarm-event support** in the gateway client (real alarm-event subscription, not the sub-attribute fallback); verified via live `StreamAlarms` with operator-comment fidelity | Architectural decision (x64 alarm-helper vs. sub-attribute fallback) |
| A.3 | ✅ Dispatch + `AcknowledgeAlarm` RPC present on the client surface | A.2 delivering WorkerEvent bodies |
| A.4 | ✅ `QueryActiveAlarms` RPC present on the client surface | A.2 (active-alarm query needs AlarmClient session) |
| C.1 | ✅ Code shipped (`AddStreamedValue` path); ⏳ live historian-write smoke needs the Windows rig | aahClientManaged SDK access |
| D.1 | ◑ Alarm-source leg captured (`alarms-d1-smoke-artifact.md`); ⏳ historian-write leg + full server→A&C round-trip need the Windows rig | A.2 + A.3 + C.1 all passing on parity rig |
The gateway delivers operator-comment fidelity through **true MxAccess alarm-event
support** in the mxaccessgw .NET client — a real alarm-event subscription, not the
value-driven sub-attribute path. The sub-attribute fallback is now legacy.
---
@@ -172,8 +172,8 @@ public static class DraftValidator
var compat = ns.Kind switch
{
NamespaceKind.SystemPlatform => di.DriverType == "Galaxy",
NamespaceKind.Equipment => di.DriverType != "Galaxy",
NamespaceKind.SystemPlatform => di.DriverType == "GalaxyMxGateway",
NamespaceKind.Equipment => di.DriverType != "GalaxyMxGateway",
_ => true,
};
@@ -12,20 +12,20 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli.Commands;
[Command("subscribe", Description = "Watch a PCCC file address via polled subscription until Ctrl+C.")]
public sealed class SubscribeCommand : AbLegacyCommandBase
{
[CommandOption("address", 'a', Description = "PCCC file address — same format as `read`.", IsRequired = true)]
/// <summary>Gets or sets the PCCC file address to subscribe to.</summary>
[CommandOption("address", 'a', Description = "PCCC file address — same format as `read`.", IsRequired = true)]
public string Address { get; init; } = default!;
/// <summary>Gets or sets the data type of the address.</summary>
[CommandOption("type", 't', Description =
"Bit / Int / Long / Float / AnalogInt / String / TimerElement / CounterElement / " +
"ControlElement (default Int).")]
/// <summary>Gets or sets the data type of the address.</summary>
public AbLegacyDataType DataType { get; init; } = AbLegacyDataType.Int;
/// <summary>Gets or sets the polling interval in milliseconds.</summary>
[CommandOption("interval-ms", 'i', Description =
"Publishing interval in milliseconds (default 1000). PollGroupEngine floors " +
"sub-250ms values.")]
/// <summary>Gets or sets the polling interval in milliseconds.</summary>
public int IntervalMs { get; init; } = 1000;
/// <inheritdoc />
@@ -13,14 +13,14 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Commands;
[Command("probe", Description = "Verify the S7 endpoint is reachable and a sample read succeeds.")]
public sealed class ProbeCommand : S7CommandBase
{
/// <summary>Gets or sets the S7 address to probe.</summary>
[CommandOption("address", 'a', Description =
"Probe address (default MW0 — merker word 0). DB1.DBW0 if your PLC project " +
"reserves a fingerprint DB.")]
/// <summary>Gets or sets the S7 address to probe.</summary>
public string Address { get; init; } = "MW0";
[CommandOption("type", Description = "Probe data type (default Int16).")]
/// <summary>Gets or sets the data type of the probe address.</summary>
[CommandOption("type", Description = "Probe data type (default Int16).")]
public S7DataType DataType { get; init; } = S7DataType.Int16;
/// <inheritdoc />
@@ -42,8 +42,10 @@ public sealed class GalaxyDriverBrowser : IDriverBrowser
_logger = logger ?? NullLogger<GalaxyDriverBrowser>.Instance;
}
/// <summary>Driver type key — matches the AdminUI's persisted "Galaxy" value.</summary>
public string DriverType => "Galaxy";
/// <summary>Driver type key — matches the AdminUI's persisted "GalaxyMxGateway" value.</summary>
// Hardcoded literal: this project references Driver.Galaxy.Contracts, not Driver.Galaxy,
// so GalaxyDriverFactoryExtensions.DriverTypeName isn't available here.
public string DriverType => "GalaxyMxGateway";
/// <summary>
/// Deserializes a <see cref="GalaxyDriverOptions"/> blob, opens a transient
@@ -25,7 +25,7 @@ public sealed class GalaxyDriverProbe : IDriverProbe
/// <inheritdoc />
// Matches DriverInstance.DriverType strings set by the AdminUI's GalaxyDriverPage.
public string DriverType => "Galaxy";
public string DriverType => GalaxyDriverFactoryExtensions.DriverTypeName;
/// <inheritdoc />
public async Task<DriverProbeResult> ProbeAsync(string configJson, TimeSpan timeout, CancellationToken ct)
@@ -4,11 +4,10 @@
and the AB CIP ALMD bridge. *@
@attribute [Microsoft.AspNetCore.Authorization.Authorize]
@rendermode RenderMode.InteractiveServer
@using Microsoft.AspNetCore.SignalR.Client
@using ZB.MOM.WW.OtOpcUa.AdminUI.Hubs
@using ZB.MOM.WW.OtOpcUa.Commons.Messages.Alerts
@inject NavigationManager Nav
@implements IAsyncDisposable
@inject IInProcessBroadcaster<AlarmTransitionEvent> Alarms
@implements IDisposable
<div class="d-flex justify-content-between align-items-center mb-3">
<h4 class="mb-0">Alerts</h4>
@@ -73,36 +72,26 @@ else
private const int Capacity = 200;
private readonly List<AlarmTransitionEvent> _rows = new();
private HubConnection? _hub;
private bool _connected;
protected override async Task OnInitializedAsync()
protected override void OnInitialized()
{
_hub = new HubConnectionBuilder()
.WithUrl(Nav.ToAbsoluteUri(AlertHub.Endpoint))
.WithAutomaticReconnect()
.Build();
// Live alarm tail straight from the in-process broadcaster (fed by AlertSignalRBridge off the
// 'alerts' DPS topic). A Blazor Server component can't self-connect a SignalR HubConnection
// behind a reverse proxy — see IInProcessBroadcaster — so we subscribe in-process instead.
Alarms.Received += OnAlarm;
_connected = true;
}
_hub.On<AlarmTransitionEvent>(AlertHub.MethodName, evt =>
private void OnAlarm(AlarmTransitionEvent evt) =>
// Marshal both the mutation and the re-render onto the circuit sync context so this can't
// race ClearAsync (which runs there) over the shared _rows list.
InvokeAsync(() =>
{
_rows.Insert(0, evt);
if (_rows.Count > Capacity) _rows.RemoveAt(_rows.Count - 1);
InvokeAsync(StateHasChanged);
StateHasChanged();
});
_hub.Closed += _ => { _connected = false; return InvokeAsync(StateHasChanged); };
_hub.Reconnected += _ => { _connected = true; return InvokeAsync(StateHasChanged); };
try
{
await _hub.StartAsync();
_connected = true;
}
catch
{
// Connection failures (admin-only deployment, hub not mapped, etc.) leave the page
// showing "disconnected" — operator action: reload or talk to the host operator.
}
}
private async Task ClearAsync()
{
@@ -119,8 +108,5 @@ else
_ => "chip-idle",
};
public async ValueTask DisposeAsync()
{
if (_hub is not null) await _hub.DisposeAsync();
}
public void Dispose() => Alarms.Received -= OnAlarm;
}
@@ -21,9 +21,8 @@ else
{
<section class="panel notice rise" style="animation-delay:.02s">
ACL rows grant LDAP groups specific <span class="mono">NodePermissions</span> on a scope
(a folder, an equipment, a tag). Q4 of the AdminUI rebuild plan dropped per-cluster role
grants in favour of fleet-wide LDAP-group → role mapping; ACLs here are the finer-grained
per-node scope. Live editing lands in a Phase C.2 follow-up.
(a folder, an equipment, a tag). Per-cluster role grants were dropped in favour of
fleet-wide LDAP-group → role mapping; ACLs here are the finer-grained per-node scope.
</section>
<section class="panel rise mt-3" style="animation-delay:.08s">
@@ -19,12 +19,6 @@
}
else
{
<section class="panel notice rise" style="animation-delay:.02s">
Per Q1 of the AdminUI rebuild plan, typed driver editors (Modbus, FOCAS) are deferred.
The expanded view below shows raw JSON config. Live editing — including a generic JSON
editor and per-driver-type forms when operators ask — lands in a Phase C.2 follow-up.
</section>
<section class="panel rise mt-3" style="animation-delay:.08s">
<div class="panel-head">@_rows.Count driver instance@(_rows.Count == 1 ? "" : "s")</div>
@if (_rows.Count == 0)
@@ -25,7 +25,7 @@ else
<section class="panel notice rise" style="animation-delay:.02s">
Equipment rows are scoped to a UNS line and bound to a single driver. EquipmentId is
system-generated (decision #125); browse identifiers are MachineCode (operator) + ZTag
(ERP). Live editing lands in a Phase C.2 follow-up.
(ERP).
</section>
<section class="panel rise mt-3" style="animation-delay:.08s">
@@ -21,8 +21,7 @@ else
{
<section class="panel notice rise" style="animation-delay:.02s">
Namespaces are content (decision #123) — they're served at the OPC UA endpoint and bound
to driver instances. NamespaceUri must be unique fleet-wide. Live editing lands in a
Phase C.2 follow-up.
to driver instances. NamespaceUri must be unique fleet-wide.
</section>
<section class="panel rise mt-3" style="animation-delay:.08s">
@@ -21,7 +21,7 @@ else
{
<section class="panel notice rise" style="animation-delay:.02s">
Tags are bound to a driver instance and (optionally) an equipment + poll group. The view
below shows the first @PageSize tags by Name; full pagination + search land in Phase C.2.
below shows the first @PageSize tags by Name.
</section>
<div class="d-flex align-items-center mb-3 gap-2 mt-3">
@@ -20,8 +20,7 @@ else
{
<section class="panel notice rise" style="animation-delay:.02s">
UNS levels: Enterprise (cluster) → Site (cluster) → Area → Line → Equipment. Areas and
lines are cluster-scoped; equipment hangs under a single line. Live editing lands in a
Phase C.2 follow-up.
lines are cluster-scoped; equipment hangs under a single line.
</section>
<section class="panel rise mt-3" style="animation-delay:.08s">
@@ -59,7 +59,7 @@ else
["TwinCat"] = typeof(TwinCATDriverPage),
["Focas"] = typeof(FocasDriverPage),
["OpcUaClient"] = typeof(OpcUaClientDriverPage),
["Galaxy"] = typeof(GalaxyDriverPage),
["GalaxyMxGateway"] = typeof(GalaxyDriverPage),
["Historian.Wonderware"] = typeof(HistorianWonderwareDriverPage),
};
@@ -208,13 +208,14 @@ else
[Parameter] public string ClusterId { get; set; } = "";
[Parameter] public string? DriverInstanceId { get; set; }
private const string DriverTypeKey = "Galaxy";
private const string DriverTypeKey = "GalaxyMxGateway";
private bool IsNew => string.IsNullOrEmpty(DriverInstanceId);
private static readonly System.Text.Json.JsonSerializerOptions _jsonOpts = new()
{
PropertyNamingPolicy = System.Text.Json.JsonNamingPolicy.CamelCase,
PropertyNameCaseInsensitive = true,
UnmappedMemberHandling = System.Text.Json.Serialization.JsonUnmappedMemberHandling.Skip,
WriteIndented = false,
};
@@ -408,26 +409,36 @@ else
// GalaxyDriverOptions top-level
public int ProbeTimeoutSeconds { get; set; } = 30;
public static GalaxyFormModel FromRecord(GalaxyDriverOptions r) => new()
public static GalaxyFormModel FromRecord(GalaxyDriverOptions r)
{
GatewayEndpoint = r.Gateway.Endpoint,
GatewayApiKeySecretRef = r.Gateway.ApiKeySecretRef,
GatewayUseTls = r.Gateway.UseTls,
GatewayCaCertificatePath = r.Gateway.CaCertificatePath,
GatewayConnectTimeoutSeconds = r.Gateway.ConnectTimeoutSeconds,
GatewayDefaultCallTimeoutSeconds = r.Gateway.DefaultCallTimeoutSeconds,
GatewayStreamTimeoutSeconds = r.Gateway.StreamTimeoutSeconds,
MxClientName = r.MxAccess.ClientName,
MxPublishingIntervalMs = r.MxAccess.PublishingIntervalMs,
MxWriteUserId = r.MxAccess.WriteUserId,
MxEventPumpChannelCapacity = r.MxAccess.EventPumpChannelCapacity,
RepositoryDiscoverPageSize = r.Repository.DiscoverPageSize,
RepositoryWatchDeployEvents = r.Repository.WatchDeployEvents,
ReconnectInitialBackoffMs = r.Reconnect.InitialBackoffMs,
ReconnectMaxBackoffMs = r.Reconnect.MaxBackoffMs,
ReconnectReplayOnSessionLost = r.Reconnect.ReplayOnSessionLost,
// Null-coalesce each nested record to its default so that persisted configs
// that pre-date a section (e.g. no Reconnect key, or PascalCase keys that
// don't match the camelCase deserializer) don't cause a NullReferenceException.
var gw = r.Gateway ?? new GalaxyGatewayOptions("https://localhost:5001", "env:MX_API_KEY");
var mx = r.MxAccess ?? new GalaxyMxAccessOptions("OtOpcUa");
var repo = r.Repository ?? new GalaxyRepositoryOptions();
var rc = r.Reconnect ?? new GalaxyReconnectOptions();
return new()
{
GatewayEndpoint = gw.Endpoint,
GatewayApiKeySecretRef = gw.ApiKeySecretRef,
GatewayUseTls = gw.UseTls,
GatewayCaCertificatePath = gw.CaCertificatePath,
GatewayConnectTimeoutSeconds = gw.ConnectTimeoutSeconds,
GatewayDefaultCallTimeoutSeconds = gw.DefaultCallTimeoutSeconds,
GatewayStreamTimeoutSeconds = gw.StreamTimeoutSeconds,
MxClientName = mx.ClientName,
MxPublishingIntervalMs = mx.PublishingIntervalMs,
MxWriteUserId = mx.WriteUserId,
MxEventPumpChannelCapacity = mx.EventPumpChannelCapacity,
RepositoryDiscoverPageSize = repo.DiscoverPageSize,
RepositoryWatchDeployEvents = repo.WatchDeployEvents,
ReconnectInitialBackoffMs = rc.InitialBackoffMs,
ReconnectMaxBackoffMs = rc.MaxBackoffMs,
ReconnectReplayOnSessionLost = rc.ReplayOnSessionLost,
ProbeTimeoutSeconds = r.ProbeTimeoutSeconds,
};
}
public GalaxyDriverOptions ToRecord() => new(
Gateway: new GalaxyGatewayOptions(
@@ -3,11 +3,10 @@
VirtualTagActor / ScriptedAlarmActor script execution. Engine emit lands with F8 + F9. *@
@attribute [Microsoft.AspNetCore.Authorization.Authorize]
@rendermode RenderMode.InteractiveServer
@using Microsoft.AspNetCore.SignalR.Client
@using ZB.MOM.WW.OtOpcUa.AdminUI.Hubs
@using ZB.MOM.WW.OtOpcUa.Commons.Messages.Logging
@inject NavigationManager Nav
@implements IAsyncDisposable
@inject IInProcessBroadcaster<ScriptLogEntry> ScriptLogs
@implements IDisposable
<div class="d-flex justify-content-between align-items-center mb-3">
<h4 class="mb-0">Script log</h4>
@@ -87,7 +86,6 @@ else
private const int Capacity = 500;
private readonly List<ScriptLogEntry> _rows = new();
private HubConnection? _hub;
private bool _connected;
private string _levelFilter = "";
private string _scriptFilter = "";
@@ -115,32 +113,24 @@ else
}
}
protected override async Task OnInitializedAsync()
protected override void OnInitialized()
{
_hub = new HubConnectionBuilder()
.WithUrl(Nav.ToAbsoluteUri(ScriptLogHub.Endpoint))
.WithAutomaticReconnect()
.Build();
// Live tail straight from the in-process broadcaster (fed by ScriptLogSignalRBridge off the
// 'script-logs' DPS topic). Blazor Server can't self-connect a SignalR HubConnection behind
// a reverse proxy — see IInProcessBroadcaster — so we subscribe in-process instead.
ScriptLogs.Received += OnEntry;
_connected = true;
}
_hub.On<ScriptLogEntry>(ScriptLogHub.MethodName, entry =>
private void OnEntry(ScriptLogEntry entry) =>
// Marshal both the mutation and the re-render onto the circuit sync context so this can't
// race ClearAsync (which runs there) over the shared _rows list.
InvokeAsync(() =>
{
_rows.Insert(0, entry);
if (_rows.Count > Capacity) _rows.RemoveAt(_rows.Count - 1);
InvokeAsync(StateHasChanged);
StateHasChanged();
});
_hub.Closed += _ => { _connected = false; return InvokeAsync(StateHasChanged); };
_hub.Reconnected += _ => { _connected = true; return InvokeAsync(StateHasChanged); };
try
{
await _hub.StartAsync();
_connected = true;
}
catch
{
// Connection error — page shows "disconnected".
}
}
private async Task ClearAsync()
{
@@ -156,8 +146,5 @@ else
_ => "chip-idle",
};
public async ValueTask DisposeAsync()
{
if (_hub is not null) await _hub.DisposeAsync();
}
public void Dispose() => ScriptLogs.Received -= OnEntry;
}
@@ -20,7 +20,7 @@ else
<section class="panel notice rise" style="animation-delay:.02s">
Virtual tags evaluate a script per equipment instance and publish the result as an OPC UA
variable. ChangeTriggered = re-evaluate when any dependency changes; TimerIntervalMs
re-evaluates on a periodic timer. Live editing lands in a Phase C.2-equivalent follow-up.
re-evaluates on a periodic timer.
</section>
<section class="panel rise mt-3" style="animation-delay:.08s">
@@ -35,7 +35,7 @@
<option value="TwinCat">TwinCat</option>
<option value="Focas">Focas</option>
<option value="OpcUaClient">OpcUaClient</option>
<option value="Galaxy">Galaxy</option>
<option value="GalaxyMxGateway">Galaxy</option>
<option value="Historian.Wonderware">Historian.Wonderware</option>
</InputSelect>
<div class="form-text">Cannot be changed after creation — drives the actor type that owns this instance.</div>
@@ -4,14 +4,14 @@
DriverOperator-gated Reconnect/Restart buttons appear for authorised users. *@
@implements IAsyncDisposable
@using Microsoft.AspNetCore.Authorization
@using Microsoft.AspNetCore.SignalR.Client
@using ZB.MOM.WW.OtOpcUa.AdminUI.Hubs
@using ZB.MOM.WW.OtOpcUa.Commons.Interfaces
@using ZB.MOM.WW.OtOpcUa.Commons.Messages.Admin
@using ZB.MOM.WW.OtOpcUa.Commons.Messages.Drivers
@inject NavigationManager Nav
@inject AuthenticationStateProvider AuthState
@inject IAuthorizationService AuthorizationService
@inject IAdminOperationsClient AdminOps
@inject IDriverStatusSnapshotStore StatusStore
<section class="panel rise mt-3" style="animation-delay:.04s; @(_stale ? "opacity:0.5;" : "")">
<div class="panel-head d-flex align-items-center gap-2">
@@ -139,7 +139,6 @@
[Parameter] public string ClusterId { get; set; } = "";
[Parameter] public bool Enabled { get; set; } = true;
private HubConnection? _hub;
private DriverHealthChanged? _snapshot;
private DateTime _lastUpdateUtc = DateTime.MinValue;
private bool _stale;
@@ -180,30 +179,44 @@
InvokeAsync(StateHasChanged);
}, null, TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(5));
_hub = new HubConnectionBuilder()
.WithUrl(Nav.ToAbsoluteUri("/hubs/driverstatus"))
.WithAutomaticReconnect()
.Build();
_hub.On<DriverHealthChanged>("status", snap =>
// Read live status straight from the in-process snapshot store rather than opening a
// self-targeted SignalR connection. This component runs server-side (Blazor
// InteractiveServer), so a HubConnection to the browser's public URL (e.g.
// http://localhost:9200 behind Traefik) would dial that port from *inside* the container —
// where only Kestrel's :9000 listens — and fail with "Connection refused". The store is fed
// on every admin node by DriverStatusSignalRBridge (a per-node DistributedPubSub
// subscriber), so the local singleton is always current regardless of which replica serves
// this circuit.
try
{
StatusStore.SnapshotChanged += OnSnapshotChanged;
if (StatusStore.TryGet(DriverInstanceId, out var snap))
{
_snapshot = snap;
_lastUpdateUtc = DateTime.UtcNow;
}
}
catch (Exception ex)
{
_error = ex.Message;
}
finally
{
_connecting = false;
}
}
// Invoked by the snapshot store (on the bridge actor's thread) for every driver instance;
// ignore snapshots for other instances and marshal onto the render sync context.
private void OnSnapshotChanged(DriverHealthChanged snap)
{
if (!string.Equals(snap.DriverInstanceId, DriverInstanceId, StringComparison.Ordinal))
return;
_snapshot = snap;
_lastUpdateUtc = DateTime.UtcNow;
_stale = false;
InvokeAsync(StateHasChanged);
});
try
{
await _hub.StartAsync();
_connecting = false;
await _hub.InvokeAsync("JoinDriver", DriverInstanceId);
}
catch (Exception ex)
{
_connecting = false;
_error = ex.Message;
}
}
private async Task ReconnectAsync()
@@ -285,12 +298,13 @@
public async ValueTask DisposeAsync()
{
// Drain BOTH timers first so an in-flight callback can't invoke StateHasChanged on
// a component whose hub has already been released. System.Threading.Timer's async
// dispose awaits any in-flight callback (.NET 6+).
// Unsubscribe first so the singleton store can't invoke a handler on a disposed component.
StatusStore.SnapshotChanged -= OnSnapshotChanged;
// Drain BOTH timers so an in-flight callback can't invoke StateHasChanged on a component
// that's already gone. System.Threading.Timer's async dispose awaits any in-flight
// callback (.NET 6+).
if (_timer is not null) await _timer.DisposeAsync();
if (_opResultClearTimer is not null) await _opResultClearTimer.DisposeAsync();
if (_hub is not null) await _hub.DisposeAsync();
}
// Map DriverState string → chip CSS class using the 4 defined theme variants.
@@ -126,7 +126,7 @@
try
{
var json = GetConfigJson() ?? "{}";
var result = await BrowserService.OpenAsync("Galaxy", json, default);
var result = await BrowserService.OpenAsync("GalaxyMxGateway", json, default);
if (result.Ok) _token = result.Token;
else _openError = result.Message;
}
@@ -17,22 +17,26 @@ public sealed class AlertSignalRBridge : ReceiveActor
public const string TopicName = "alerts";
private readonly IHubContext<AlertHub> _hub;
private readonly IInProcessBroadcaster<AlarmTransitionEvent> _broadcaster;
private readonly ILoggingAdapter _log = Context.GetLogger();
/// <summary>
/// Creates actor props for the AlertSignalRBridge.
/// </summary>
/// <param name="hub">The SignalR hub context to send alerts to.</param>
public static Props Props(IHubContext<AlertHub> hub) =>
Akka.Actor.Props.Create(() => new AlertSignalRBridge(hub));
/// <param name="broadcaster">In-process fan-out read directly by the Blazor Server Alerts page.</param>
public static Props Props(IHubContext<AlertHub> hub, IInProcessBroadcaster<AlarmTransitionEvent> broadcaster) =>
Akka.Actor.Props.Create(() => new AlertSignalRBridge(hub, broadcaster));
/// <summary>
/// Initializes a new instance of the AlertSignalRBridge actor.
/// </summary>
/// <param name="hub">The SignalR hub context to send alerts to.</param>
public AlertSignalRBridge(IHubContext<AlertHub> hub)
/// <param name="broadcaster">In-process fan-out read directly by the Blazor Server Alerts page.</param>
public AlertSignalRBridge(IHubContext<AlertHub> hub, IInProcessBroadcaster<AlarmTransitionEvent> broadcaster)
{
_hub = hub;
_broadcaster = broadcaster;
ReceiveAsync<AlarmTransitionEvent>(ForwardAsync);
Receive<SubscribeAck>(_ => { /* DPS confirmation */ });
}
@@ -43,6 +47,9 @@ public sealed class AlertSignalRBridge : ReceiveActor
private async Task ForwardAsync(AlarmTransitionEvent msg)
{
// In-process fan-out first — this is what the Blazor Server Alerts page reads. The hub push
// is kept for any out-of-process (e.g. WASM) SignalR client.
_broadcaster.Publish(msg);
try
{
await _hub.Clients.All.SendAsync(AlertHub.MethodName, msg);
@@ -13,14 +13,21 @@ public static class HubServiceCollectionExtensions
public const string DriverStatusSignalRBridgeName = "driver-status-signalr-bridge";
/// <summary>
/// Registers services required by the driver-status hub pipeline:
/// <see cref="IDriverStatusSnapshotStore"/> as a singleton backed by
/// <see cref="InMemoryDriverStatusSnapshotStore"/>.
/// Registers the in-process live-push services the AdminUI's Blazor Server panels read
/// directly (instead of self-connecting a SignalR <c>HubConnection</c>, which fails behind a
/// reverse proxy — see <see cref="IInProcessBroadcaster{T}"/>):
/// <list type="bullet">
/// <item><see cref="IDriverStatusSnapshotStore"/> — last-value snapshot per driver.</item>
/// <item><see cref="IInProcessBroadcaster{T}"/> — append-stream fan-out (alarm
/// transitions, script-log lines). Registered as an open generic so each closed type
/// resolves to its own singleton shared by the bridge actor and the consuming component.</item>
/// </list>
/// </summary>
/// <param name="services">The service collection.</param>
public static IServiceCollection AddOtOpcUaDriverStatusServices(this IServiceCollection services)
{
services.AddSingleton<IDriverStatusSnapshotStore, InMemoryDriverStatusSnapshotStore>();
services.AddSingleton(typeof(IInProcessBroadcaster<>), typeof(InProcessBroadcaster<>));
return services;
}
@@ -48,11 +55,13 @@ public static class HubServiceCollectionExtensions
registry.Register<FleetStatusSignalRBridgeKey>(fleetBridge);
var alertHub = resolver.GetService<IHubContext<AlertHub>>();
var alertBridge = system.ActorOf(AlertSignalRBridge.Props(alertHub), AlertSignalRBridgeName);
var alertBroadcaster = resolver.GetService<IInProcessBroadcaster<Commons.Messages.Alerts.AlarmTransitionEvent>>();
var alertBridge = system.ActorOf(AlertSignalRBridge.Props(alertHub, alertBroadcaster), AlertSignalRBridgeName);
registry.Register<AlertSignalRBridgeKey>(alertBridge);
var scriptLogHub = resolver.GetService<IHubContext<ScriptLogHub>>();
var scriptLogBridge = system.ActorOf(ScriptLogSignalRBridge.Props(scriptLogHub), ScriptLogSignalRBridgeName);
var scriptLogBroadcaster = resolver.GetService<IInProcessBroadcaster<Commons.Messages.Logging.ScriptLogEntry>>();
var scriptLogBridge = system.ActorOf(ScriptLogSignalRBridge.Props(scriptLogHub, scriptLogBroadcaster), ScriptLogSignalRBridgeName);
registry.Register<ScriptLogSignalRBridgeKey>(scriptLogBridge);
var driverStatusHub = resolver.GetService<IHubContext<DriverStatusHub>>();
@@ -6,10 +6,21 @@ namespace ZB.MOM.WW.OtOpcUa.AdminUI.Hubs;
/// Singleton last-snapshot-per-instance cache. Populated by
/// <c>DriverStatusSignalRBridge</c> as it forwards DPS messages; read by
/// <see cref="DriverStatusHub.JoinDriver"/> so newly-joined clients see current state
/// without waiting for the next change event.
/// without waiting for the next change event, and subscribed to directly by the Blazor
/// Server <c>DriverStatusPanel</c> via <see cref="SnapshotChanged"/>.
/// </summary>
public interface IDriverStatusSnapshotStore
{
void Upsert(DriverHealthChanged snapshot);
bool TryGet(string driverInstanceId, out DriverHealthChanged snapshot);
/// <summary>
/// Raised after every <see cref="Upsert"/> with the just-stored snapshot. Lets in-process
/// consumers (the Blazor Server <c>DriverStatusPanel</c>) receive live updates by reading
/// this singleton directly instead of opening a self-targeted SignalR connection — which a
/// server-side Blazor component cannot reach when the public URL (e.g. a reverse-proxy port)
/// differs from the local Kestrel bind. Handlers run on the caller's thread (the bridge
/// actor), so subscribers must marshal to their own sync context.
/// </summary>
event Action<DriverHealthChanged>? SnapshotChanged;
}
@@ -0,0 +1,41 @@
namespace ZB.MOM.WW.OtOpcUa.AdminUI.Hubs;
/// <summary>
/// A singleton, in-process fan-out for live event streams (alarm transitions, script-log
/// lines). A per-node SignalR bridge actor subscribes to the cluster's DistributedPubSub topic
/// and calls <see cref="Publish"/>; Blazor Server components subscribe to <see cref="Received"/>
/// to render the live tail.
/// <para>
/// This exists because the AdminUI runs as Blazor <em>Server</em>: a component opening a
/// SignalR <c>HubConnection</c> to its own hub would dial the browser's public URL from
/// server-side code, which is unreachable behind a reverse proxy (e.g. Traefik mapping host
/// :9200 → container :9000) and so fails with "Connection refused". Reading this in-process
/// broadcaster instead avoids the network hop entirely. Mirrors the
/// <c>IDriverStatusSnapshotStore.SnapshotChanged</c> pattern for stream (vs. last-value) feeds.
/// </para>
/// </summary>
/// <typeparam name="T">The event payload type (e.g. AlarmTransitionEvent, ScriptLogEntry).</typeparam>
public interface IInProcessBroadcaster<T>
{
/// <summary>
/// Raised once per <see cref="Publish"/> with the published item. Handlers run on the
/// caller's thread (the bridge actor), so subscribers must marshal to their own sync
/// context (Blazor's <c>InvokeAsync</c>).
/// </summary>
event Action<T>? Received;
/// <summary>Fan the item out to all current <see cref="Received"/> subscribers.</summary>
void Publish(T item);
}
/// <summary>Thread-safe singleton implementation of <see cref="IInProcessBroadcaster{T}"/>.</summary>
/// <typeparam name="T">The event payload type.</typeparam>
public sealed class InProcessBroadcaster<T> : IInProcessBroadcaster<T>
{
/// <inheritdoc />
public event Action<T>? Received;
/// <inheritdoc />
// Capture-then-invoke (via ?.) so a concurrent unsubscribe can't null the delegate mid-raise.
public void Publish(T item) => Received?.Invoke(item);
}
@@ -11,9 +11,16 @@ public sealed class InMemoryDriverStatusSnapshotStore : IDriverStatusSnapshotSto
{
private readonly ConcurrentDictionary<string, DriverHealthChanged> _byInstance = new();
/// <inheritdoc />
public event Action<DriverHealthChanged>? SnapshotChanged;
/// <inheritdoc />
public void Upsert(DriverHealthChanged snapshot)
=> _byInstance[snapshot.DriverInstanceId] = snapshot;
{
_byInstance[snapshot.DriverInstanceId] = snapshot;
// Capture-then-invoke so a concurrent unsubscribe can't null the delegate mid-raise.
SnapshotChanged?.Invoke(snapshot);
}
/// <inheritdoc />
public bool TryGet(string driverInstanceId, out DriverHealthChanged snapshot)
@@ -15,18 +15,22 @@ public sealed class ScriptLogSignalRBridge : ReceiveActor
public const string TopicName = "script-logs";
private readonly IHubContext<ScriptLogHub> _hub;
private readonly IInProcessBroadcaster<ScriptLogEntry> _broadcaster;
private readonly ILoggingAdapter _log = Context.GetLogger();
/// <summary>Creates a Props instance for the ScriptLogSignalRBridge.</summary>
/// <param name="hub">The SignalR hub context for sending messages to clients.</param>
public static Props Props(IHubContext<ScriptLogHub> hub) =>
Akka.Actor.Props.Create(() => new ScriptLogSignalRBridge(hub));
/// <param name="broadcaster">In-process fan-out read directly by the Blazor Server Script log page.</param>
public static Props Props(IHubContext<ScriptLogHub> hub, IInProcessBroadcaster<ScriptLogEntry> broadcaster) =>
Akka.Actor.Props.Create(() => new ScriptLogSignalRBridge(hub, broadcaster));
/// <summary>Initializes a new instance of the <see cref="ScriptLogSignalRBridge"/> class.</summary>
/// <param name="hub">The SignalR hub context for sending messages to clients.</param>
public ScriptLogSignalRBridge(IHubContext<ScriptLogHub> hub)
/// <param name="broadcaster">In-process fan-out read directly by the Blazor Server Script log page.</param>
public ScriptLogSignalRBridge(IHubContext<ScriptLogHub> hub, IInProcessBroadcaster<ScriptLogEntry> broadcaster)
{
_hub = hub;
_broadcaster = broadcaster;
ReceiveAsync<ScriptLogEntry>(ForwardAsync);
Receive<SubscribeAck>(_ => { /* DPS confirmation */ });
}
@@ -37,6 +41,9 @@ public sealed class ScriptLogSignalRBridge : ReceiveActor
private async Task ForwardAsync(ScriptLogEntry msg)
{
// In-process fan-out first — this is what the Blazor Server Script log page reads. The hub
// push is kept for any out-of-process (e.g. WASM) SignalR client.
_broadcaster.Publish(msg);
try
{
await _hub.Clients.All.SendAsync(ScriptLogHub.MethodName, msg);
@@ -1,4 +1,5 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Logging;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
@@ -23,8 +24,10 @@ using HistorianProbe = Driver.Historian.Wonderware.Client.WonderwareHistorianDri
/// over it. Replaces the F7 seam's <c>NullDriverFactory</c> default so deploys actually
/// materialise real <see cref="IDriver"/> instances on driver-role nodes.
///
/// Skipped entirely on admin-only nodes — they never run drivers, so the registry doesn't
/// need to exist (Program.cs guards via the <c>hasDriver</c> flag).
/// The factory registry is skipped on admin-only nodes — they never run drivers, so it doesn't
/// need to exist (Program.cs guards via the <c>hasDriver</c> flag). The driver <em>probe</em>
/// set is the exception: it backs the AdminUI Test Connect button and so must also be wired on
/// admin nodes — see <see cref="AddOtOpcUaDriverProbes"/>.
/// </summary>
public static class DriverFactoryBootstrap
{
@@ -46,16 +49,42 @@ public static class DriverFactoryBootstrap
services.AddSingleton<IDriverFactory>(sp =>
new DriverFactoryRegistryAdapter(sp.GetRequiredService<DriverFactoryRegistry>()));
// One IDriverProbe per driver type — wired into AdminOperationsActor via DI enumeration.
services.AddSingleton<IDriverProbe, ModbusProbe>();
services.AddSingleton<IDriverProbe, AbCipProbe>();
services.AddSingleton<IDriverProbe, AbLegacyProbe>();
services.AddSingleton<IDriverProbe, S7Probe>();
services.AddSingleton<IDriverProbe, TwinCATProbe>();
services.AddSingleton<IDriverProbe, FocasProbe>();
services.AddSingleton<IDriverProbe, OpcUaProbe>();
services.AddSingleton<IDriverProbe, GalaxyProbe>();
services.AddSingleton<IDriverProbe, HistorianProbe>();
// Driver nodes also carry the probe set so a fused admin,driver node has it; the admin-only
// case is covered by Program.cs calling AddOtOpcUaDriverProbes() in the hasAdmin block.
services.AddOtOpcUaDriverProbes();
return services;
}
/// <summary>
/// Register one <see cref="IDriverProbe"/> per driver type. These back the AdminUI's
/// "Test Connect" button: the <c>admin-operations</c> cluster singleton resolves
/// <see cref="IEnumerable{T}"/> of <see cref="IDriverProbe"/> and dispatches by DriverType.
/// <para>
/// That singleton is role-pinned to <c>admin</c>, so this MUST be wired on admin nodes —
/// including admin-only nodes that lack the <c>driver</c> role (e.g. the MAIN cluster's
/// admin-a/admin-b). Probes are lightweight (cheap connect, no persistent state) and don't
/// need the driver-factory registry, so they register independently of
/// <see cref="AddOtOpcUaDriverFactories"/>.
/// </para>
/// <para>
/// Uses <c>TryAddEnumerable</c> so a fused admin,driver node — which reaches this from both
/// the driver-factory path and the admin path — registers each probe exactly once. A
/// duplicate would make the singleton's <c>ToDictionary(p =&gt; p.DriverType)</c> throw.
/// </para>
/// </summary>
/// <param name="services">The service collection to register driver probes with.</param>
public static IServiceCollection AddOtOpcUaDriverProbes(this IServiceCollection services)
{
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, ModbusProbe>());
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, AbCipProbe>());
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, AbLegacyProbe>());
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, S7Probe>());
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, TwinCATProbe>());
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, FocasProbe>());
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, OpcUaProbe>());
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, GalaxyProbe>());
services.TryAddEnumerable(ServiceDescriptor.Singleton<IDriverProbe, HistorianProbe>());
return services;
}
@@ -1,39 +0,0 @@
using Microsoft.Extensions.Diagnostics.HealthChecks;
using ZB.MOM.WW.OtOpcUa.Commons.Interfaces;
namespace ZB.MOM.WW.OtOpcUa.Host.Health;
/// <summary>
/// Reports Healthy on the admin-role leader, Degraded on a non-leader admin member. Used by
/// the <c>/health/active</c> endpoint so external load balancers can route admin-singleton
/// traffic to the current leader (cookie sessions still work on either node — DataProtection
/// keys are shared).
/// </summary>
public sealed class AdminRoleLeaderHealthCheck : IHealthCheck
{
private readonly IClusterRoleInfo _roleInfo;
/// <summary>Initializes a new instance of the AdminRoleLeaderHealthCheck class.</summary>
/// <param name="roleInfo">The cluster role information provider.</param>
public AdminRoleLeaderHealthCheck(IClusterRoleInfo roleInfo)
{
_roleInfo = roleInfo;
}
/// <summary>Checks the health status of the admin role leader.</summary>
/// <param name="context">The health check context.</param>
/// <param name="cancellationToken">The cancellation token.</param>
/// <returns>A task representing the health check operation.</returns>
public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
{
if (!_roleInfo.HasRole("admin"))
return Task.FromResult(HealthCheckResult.Healthy("Node does not carry admin role"));
var leader = _roleInfo.RoleLeader("admin");
var isLeader = leader is not null && leader.Value.Equals(_roleInfo.LocalNode);
return Task.FromResult(isLeader
? HealthCheckResult.Healthy($"Admin leader ({_roleInfo.LocalNode})")
: HealthCheckResult.Degraded($"Admin member but not leader (leader={leader?.Value ?? "<unknown>"})"));
}
}
@@ -1,35 +0,0 @@
using Akka.Actor;
using Akka.Cluster;
using Microsoft.Extensions.Diagnostics.HealthChecks;
namespace ZB.MOM.WW.OtOpcUa.Host.Health;
public sealed class AkkaClusterHealthCheck : IHealthCheck
{
private readonly ActorSystem _system;
/// <summary>
/// Initializes a new instance of the AkkaClusterHealthCheck class.
/// </summary>
/// <param name="system">The Akka actor system to check cluster health for.</param>
public AkkaClusterHealthCheck(ActorSystem system)
{
_system = system;
}
/// <summary>
/// Checks the health of the Akka cluster asynchronously.
/// </summary>
/// <param name="context">The health check context.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
{
var cluster = Akka.Cluster.Cluster.Get(_system);
var selfUp = cluster.State.Members.Any(m =>
m.Address == cluster.SelfAddress && m.Status == MemberStatus.Up);
return Task.FromResult(selfUp
? HealthCheckResult.Healthy($"Self Up; {cluster.State.Members.Count} member(s)")
: HealthCheckResult.Degraded("Self not yet Up in cluster"));
}
}
@@ -1,38 +0,0 @@
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using ZB.MOM.WW.OtOpcUa.Configuration;
namespace ZB.MOM.WW.OtOpcUa.Host.Health;
public sealed class DatabaseHealthCheck : IHealthCheck
{
private readonly IDbContextFactory<OtOpcUaConfigDbContext> _dbFactory;
/// <summary>
/// Initializes a new instance of the <see cref="DatabaseHealthCheck"/> class.
/// </summary>
/// <param name="dbFactory">The database context factory for the config database.</param>
public DatabaseHealthCheck(IDbContextFactory<OtOpcUaConfigDbContext> dbFactory)
{
_dbFactory = dbFactory;
}
/// <summary>
/// Checks the health of the configuration database.
/// </summary>
/// <param name="context">The health check context.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
{
try
{
await using var db = await _dbFactory.CreateDbContextAsync(cancellationToken);
await db.Deployments.AsNoTracking().Take(1).ToListAsync(cancellationToken);
return HealthCheckResult.Healthy("ConfigDb reachable");
}
catch (Exception ex)
{
return HealthCheckResult.Unhealthy("ConfigDb unreachable", ex);
}
}
}
@@ -1,25 +1,40 @@
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Routing;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using ZB.MOM.WW.Health;
using ZB.MOM.WW.Health.Akka;
using ZB.MOM.WW.Health.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
namespace ZB.MOM.WW.OtOpcUa.Host.Health;
public static class HealthEndpoints
{
/// <summary>
/// Registers the standard ASP.NET Core health-check infrastructure plus the OtOpcUa-specific
/// probes. Mirrors ScadaLink's three-tier pattern: <c>ready</c> = boot ok; <c>active</c> =
/// fully serving traffic; <c>healthz</c> = bare process liveness.
/// Registers the shared ZB.MOM.WW health probes. Tier semantics preserved: configdb + akka on
/// ready+active; admin-leader on active only.
/// </summary>
/// <param name="services">The service collection to register health checks with.</param>
public static IServiceCollection AddOtOpcUaHealth(this IServiceCollection services)
{
services.AddHealthChecks()
.AddCheck<DatabaseHealthCheck>("configdb", tags: new[] { "ready", "active" })
.AddCheck<AkkaClusterHealthCheck>("akka", tags: new[] { "ready", "active" })
.AddCheck<AdminRoleLeaderHealthCheck>("admin-leader", tags: new[] { "active" });
.AddTypeActivatedCheck<DatabaseHealthCheck<OtOpcUaConfigDbContext>>(
"configdb",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready, ZbHealthTags.Active },
args: new DatabaseHealthCheckOptions<OtOpcUaConfigDbContext>
{
ProbeQuery = static (db, ct) => db.Deployments.AsNoTracking().Take(1).ToListAsync(ct),
})
.AddTypeActivatedCheck<AkkaClusterHealthCheck>(
"akka",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready, ZbHealthTags.Active },
args: AkkaClusterStatusPolicy.OtOpcUaCompat)
.AddTypeActivatedCheck<ActiveNodeHealthCheck>(
"admin-leader",
failureStatus: null,
tags: new[] { ZbHealthTags.Active },
args: "admin");
return services;
}
@@ -27,21 +42,7 @@ public static class HealthEndpoints
/// <param name="app">The endpoint route builder.</param>
public static IEndpointRouteBuilder MapOtOpcUaHealth(this IEndpointRouteBuilder app)
{
// AllowAnonymous on all three — Traefik / k8s liveness probes / load-balancers
// hit these without credentials. Without it the AddOtOpcUaAuth fallback policy
// 401s every probe and Traefik marks every backend unhealthy.
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
Predicate = c => c.Tags.Contains("ready"),
}).AllowAnonymous();
app.MapHealthChecks("/health/active", new HealthCheckOptions
{
Predicate = c => c.Tags.Contains("active"),
}).AllowAnonymous();
app.MapHealthChecks("/healthz", new HealthCheckOptions
{
Predicate = _ => false, // process-liveness only — no probes run.
}).AllowAnonymous();
app.MapZbHealth();
return app;
}
}
@@ -122,6 +122,11 @@ if (hasAdmin)
// Auth + AdminUI surface only mounted on admin-role nodes. Driver-only nodes have no UI.
builder.Services.AddOtOpcUaAuth(builder.Configuration);
builder.Services.AddAdminUI();
// Test Connect probes back the AdminUI driver pages. The admin-operations singleton (role-pinned
// to admin) resolves IEnumerable<IDriverProbe>, so admin-only nodes — which skip the hasDriver
// block above — must wire the probe set here too, or every Test Connect returns "No probe
// registered". Idempotent on fused admin,driver nodes (TryAddEnumerable de-dups).
builder.Services.AddOtOpcUaDriverProbes();
// Flow AuthenticationState through cascading parameters so <AuthorizeView/> works
// inside interactive components (NavSidebar's session block).
builder.Services.AddCascadingAuthenticationState();
@@ -27,6 +27,9 @@
</PackageReference>
<PackageReference Include="OpenTelemetry.Extensions.Hosting"/>
<PackageReference Include="OpenTelemetry.Exporter.Prometheus.AspNetCore"/>
<PackageReference Include="ZB.MOM.WW.Health" />
<PackageReference Include="ZB.MOM.WW.Health.Akka" />
<PackageReference Include="ZB.MOM.WW.Health.EntityFrameworkCore" />
</ItemGroup>
<ItemGroup>
@@ -122,15 +122,45 @@ public sealed class DraftValidatorTests
DraftValidator.Validate(draft).ShouldContain(e => e.Code == "EquipmentIdNotDerived");
}
/// <summary>Verifies that Galaxy driver cannot be placed in Equipment namespace.</summary>
/// <summary>Verifies that the canonical Galaxy driver type (GalaxyMxGateway, per PR 7.2 —
/// it was "Galaxy" pre-PR-7.2) is allowed in a SystemPlatform namespace, i.e. produces no
/// kind-mismatch error.</summary>
[Fact]
public void Galaxy_driver_in_Equipment_namespace_is_rejected()
public void GalaxyMxGateway_driver_in_SystemPlatform_namespace_is_allowed()
{
var draft = new DraftSnapshot
{
GenerationId = 1, ClusterId = "c",
Namespaces = [new Namespace { NamespaceId = "ns-1", ClusterId = "c", NamespaceUri = "urn:x", Kind = NamespaceKind.SystemPlatform }],
DriverInstances = [new DriverInstance { DriverInstanceId = "d-1", ClusterId = "c", NamespaceId = "ns-1", Name = "drv", DriverType = "GalaxyMxGateway", DriverConfig = "{}" }],
};
DraftValidator.Validate(draft).ShouldNotContain(e => e.Code == "DriverNamespaceKindMismatch");
}
/// <summary>Verifies that the canonical Galaxy driver type cannot be placed in an Equipment namespace.</summary>
[Fact]
public void GalaxyMxGateway_driver_in_Equipment_namespace_is_rejected()
{
var draft = new DraftSnapshot
{
GenerationId = 1, ClusterId = "c",
Namespaces = [new Namespace { NamespaceId = "ns-1", ClusterId = "c", NamespaceUri = "urn:x", Kind = NamespaceKind.Equipment }],
DriverInstances = [new DriverInstance { DriverInstanceId = "d-1", ClusterId = "c", NamespaceId = "ns-1", Name = "drv", DriverType = "Galaxy", DriverConfig = "{}" }],
DriverInstances = [new DriverInstance { DriverInstanceId = "d-1", ClusterId = "c", NamespaceId = "ns-1", Name = "drv", DriverType = "GalaxyMxGateway", DriverConfig = "{}" }],
};
DraftValidator.Validate(draft).ShouldContain(e => e.Code == "DriverNamespaceKindMismatch");
}
/// <summary>Verifies that a non-Galaxy driver cannot be placed in a SystemPlatform namespace.</summary>
[Fact]
public void NonGalaxy_driver_in_SystemPlatform_namespace_is_rejected()
{
var draft = new DraftSnapshot
{
GenerationId = 1, ClusterId = "c",
Namespaces = [new Namespace { NamespaceId = "ns-1", ClusterId = "c", NamespaceUri = "urn:x", Kind = NamespaceKind.SystemPlatform }],
DriverInstances = [new DriverInstance { DriverInstanceId = "d-1", ClusterId = "c", NamespaceId = "ns-1", Name = "drv", DriverType = "ModbusTcp", DriverConfig = "{}" }],
};
DraftValidator.Validate(draft).ShouldContain(e => e.Code == "DriverNamespaceKindMismatch");
@@ -145,7 +175,7 @@ public sealed class DraftValidatorTests
{
GenerationId = 1, ClusterId = "c-A",
Namespaces = [new Namespace { NamespaceId = "ns-1", ClusterId = "c-B", NamespaceUri = "urn:x", Kind = NamespaceKind.Equipment }],
DriverInstances = [new DriverInstance { DriverInstanceId = "d-1", ClusterId = "c-A", NamespaceId = "ns-1", Name = "drv", DriverType = "Galaxy", DriverConfig = "{}" }],
DriverInstances = [new DriverInstance { DriverInstanceId = "d-1", ClusterId = "c-A", NamespaceId = "ns-1", Name = "drv", DriverType = "GalaxyMxGateway", DriverConfig = "{}" }],
Equipment = [new Equipment { EquipmentUuid = uuid, EquipmentId = "EQ-wrong", Name = "BAD NAME", DriverInstanceId = "d-1", UnsLineId = "line-a", MachineCode = "m" }],
};
@@ -2,8 +2,8 @@ using CliFx.Attributes;
using CliFx.Infrastructure;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.S7;
using ZB.MOM.WW.OtOpcUa.Driver.S7.Cli;
using S7NetCpuType = global::S7.Net.CpuType;
namespace ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests;
@@ -41,7 +41,7 @@ public sealed class S7CommandBaseBuildOptionsTests
{
Host = "10.0.0.5",
Port = 102,
CpuType = S7NetCpuType.S71500,
CpuType = S7CpuType.S71500,
Rack = 0,
Slot = 0,
TimeoutMs = 5000,
@@ -72,7 +72,7 @@ public sealed class S7CommandBaseBuildOptionsTests
{
Host = "plc.shop.local",
Port = 4102,
CpuType = S7NetCpuType.S7300,
CpuType = S7CpuType.S7300,
Rack = 1,
Slot = 2,
TimeoutMs = 3000,
@@ -82,7 +82,7 @@ public sealed class S7CommandBaseBuildOptionsTests
options.Host.ShouldBe("plc.shop.local");
options.Port.ShouldBe(4102);
options.CpuType.ShouldBe(S7NetCpuType.S7300);
options.CpuType.ShouldBe(S7CpuType.S7300);
options.Rack.ShouldBe((short)1);
options.Slot.ShouldBe((short)2);
}
@@ -16,10 +16,10 @@ public sealed class GalaxyDriverBrowserTests
{
private readonly GalaxyDriverBrowser _sut = new();
/// <summary>The DriverType key must match the AdminUI's persisted "Galaxy" value
/// <summary>The DriverType key must match the AdminUI's persisted "GalaxyMxGateway" value
/// so the factory wire-up picks the right browser implementation.</summary>
[Fact]
public void DriverType_is_Galaxy() => _sut.DriverType.ShouldBe("Galaxy");
public void DriverType_is_GalaxyMxGateway() => _sut.DriverType.ShouldBe("GalaxyMxGateway");
/// <summary>An empty Gateway.Endpoint must fail fast with a clear, endpoint-mentioning
/// message rather than surfacing a downstream gRPC URI parse error.</summary>
@@ -0,0 +1,82 @@
using ZB.MOM.WW.MxGateway.Client;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests.Runtime;
/// <summary>
/// D.1 smoke (alarm-source leg): drives the REAL gateway <c>StreamAlarms</c> feed through the
/// production lmxopcua consumer (<see cref="GatewayGalaxyAlarmFeed"/>) and asserts native alarm
/// transitions — with operator comment, category, original raise time, and the mapped OPC UA
/// severity bucket preserved — reach the driver-side boundary that feeds
/// <c>IAlarmSource.OnAlarmEvent</c>.
/// <para>
/// Skip-gated: runs only when <c>MXGW_ENDPOINT</c> + <c>GALAXY_MXGW_API_KEY</c> are set to a
/// reachable gateway. Captured 2026-05-29 against <c>10.100.0.48:5120</c> — see
/// <c>docs/plans/alarms-d1-smoke-artifact.md</c>. Set <c>D1_SMOKE_OUT</c> to dump the observed
/// transitions to a file for artifact capture.
/// </para>
/// </summary>
[Trait("Category", "Integration")]
public sealed class GatewayGalaxyAlarmFeedLiveTests
{
[Fact]
public async Task Live_gateway_delivers_native_alarm_transitions_through_the_consumer()
{
var endpoint = Environment.GetEnvironmentVariable("MXGW_ENDPOINT");
var apiKey = Environment.GetEnvironmentVariable("GALAXY_MXGW_API_KEY");
if (string.IsNullOrWhiteSpace(endpoint) || string.IsNullOrWhiteSpace(apiKey))
Assert.Skip("Set MXGW_ENDPOINT + GALAXY_MXGW_API_KEY to run the live gateway alarm-feed smoke.");
var client = MxGatewayClient.Create(new MxGatewayClientOptions
{
Endpoint = new Uri(endpoint!, UriKind.Absolute),
ApiKey = apiKey!,
UseTls = false,
ConnectTimeout = TimeSpan.FromSeconds(10),
DefaultCallTimeout = TimeSpan.FromSeconds(30),
StreamTimeout = TimeSpan.FromSeconds(30),
});
var observed = new List<GalaxyAlarmTransition>();
var gotOne = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
// Wire the live client's StreamAlarms method group into the production consumer seam.
await using var feed = new GatewayGalaxyAlarmFeed(client.StreamAlarmsAsync, clientName: "D1Smoke");
feed.OnAlarmTransition += (_, t) =>
{
lock (observed) { observed.Add(t); }
gotOne.TrySetResult(true);
};
feed.Start();
// The stream opens with the active-alarm snapshot, so we expect ≥1 transition promptly.
await Task.WhenAny(gotOne.Task, Task.Delay(TimeSpan.FromSeconds(20), TestContext.Current.CancellationToken));
List<GalaxyAlarmTransition> snapshot;
lock (observed) snapshot = observed.ToList();
snapshot.ShouldNotBeEmpty(
"Live gateway should deliver at least the active-alarm snapshot through the lmxopcua consumer.");
var first = snapshot[0];
first.AlarmFullReference.ShouldNotBeNullOrWhiteSpace();
first.OpcUaSeverity.ShouldBeGreaterThan(0); // severity bucket mapping applied by the consumer
foreach (var t in snapshot.Take(8))
TestContext.Current.SendDiagnosticMessage(
$"{t.TransitionKind,-11} {t.AlarmFullReference} sev={t.OpcUaSeverity}({t.SeverityBucket}) cat={t.Category} comment='{t.OperatorComment}'");
TestContext.Current.SendDiagnosticMessage($"TOTAL consumer transitions observed: {snapshot.Count}");
// Deterministic artifact capture (only when D1_SMOKE_OUT is set).
var outPath = Environment.GetEnvironmentVariable("D1_SMOKE_OUT");
if (!string.IsNullOrWhiteSpace(outPath))
{
var lines = snapshot.Take(50).Select(t =>
$"{t.TransitionKind,-11} {t.AlarmFullReference} | sev={t.OpcUaSeverity}({t.SeverityBucket}) raw={t.RawMxAccessSeverity} | cat={t.Category} | comment='{t.OperatorComment}' | xitionUtc={t.TransitionTimestampUtc:o}");
await File.WriteAllLinesAsync(outPath!,
new[] { $"# consumer transitions observed: {snapshot.Count}" }.Concat(lines),
TestContext.Current.CancellationToken);
}
}
}
@@ -25,6 +25,9 @@
<ItemGroup>
<PackageReference Include="ZB.MOM.WW.MxGateway.Contracts" />
<!-- Client package: only the Skip-gated live alarm-feed smoke (GatewayGalaxyAlarmFeedLiveTests)
constructs a real MxGatewayClient. Unit tests use the fake stream-factory seam. -->
<PackageReference Include="ZB.MOM.WW.MxGateway.Client" />
</ItemGroup>
</Project>
@@ -0,0 +1,59 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.AdminUI.Hubs;
using ZB.MOM.WW.OtOpcUa.Commons.Messages.Drivers;
namespace ZB.MOM.WW.OtOpcUa.AdminUI.Tests;
/// <summary>
/// Covers the in-process push contract the Blazor Server <c>DriverStatusPanel</c> relies on:
/// <see cref="IDriverStatusSnapshotStore.SnapshotChanged"/> fires on every
/// <see cref="IDriverStatusSnapshotStore.Upsert"/>, and <c>TryGet</c> returns the latest.
/// The panel subscribes to this store directly instead of opening a self-targeted SignalR
/// connection (which a server-side component can't reach behind a reverse proxy).
/// </summary>
public sealed class DriverStatusSnapshotStoreTests
{
private static DriverHealthChanged Snap(string instance, string state = "Healthy") =>
new("MAIN", instance, state, null, null, 0, new DateTime(2026, 5, 29, 0, 0, 0, DateTimeKind.Utc));
[Fact]
public void Upsert_raises_SnapshotChanged_with_the_stored_snapshot()
{
var store = new InMemoryDriverStatusSnapshotStore();
var received = new List<DriverHealthChanged>();
store.SnapshotChanged += received.Add;
var snap = Snap("drv-1", "Faulted");
store.Upsert(snap);
received.Count.ShouldBe(1);
received[0].ShouldBeSameAs(snap);
}
[Fact]
public void Upsert_then_TryGet_returns_the_latest_snapshot()
{
var store = new InMemoryDriverStatusSnapshotStore();
store.Upsert(Snap("drv-1", "Healthy"));
store.Upsert(Snap("drv-1", "Degraded"));
store.TryGet("drv-1", out var latest).ShouldBeTrue();
latest.State.ShouldBe("Degraded");
}
[Fact]
public void Unsubscribed_handler_stops_receiving_after_removal()
{
var store = new InMemoryDriverStatusSnapshotStore();
var count = 0;
void Handler(DriverHealthChanged _) => count++;
store.SnapshotChanged += Handler;
store.Upsert(Snap("drv-1"));
store.SnapshotChanged -= Handler;
store.Upsert(Snap("drv-1"));
count.ShouldBe(1);
}
}
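The store implementation itself is not part of this diff. As a rough illustration of the contract these tests pin down (latest-value lookup plus a change event raised on every upsert), a minimal sketch might look like the following. `Snapshot` and `SketchSnapshotStore` are hypothetical names invented here; the real `InMemoryDriverStatusSnapshotStore` and `DriverHealthChanged` may differ in shape and threading behaviour.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for DriverHealthChanged; only the fields the sketch needs.
public sealed record Snapshot(string InstanceId, string State);

// Hypothetical sketch of an in-process snapshot store, not the repository's implementation.
public sealed class SketchSnapshotStore
{
    private readonly ConcurrentDictionary<string, Snapshot> _latest = new(StringComparer.Ordinal);

    // Raised after every Upsert with the snapshot that was just stored.
    public event Action<Snapshot>? SnapshotChanged;

    public void Upsert(Snapshot snapshot)
    {
        _latest[snapshot.InstanceId] = snapshot;   // last write wins per instance
        SnapshotChanged?.Invoke(snapshot);         // no-op when nobody is subscribed
    }

    public bool TryGet(string instanceId, out Snapshot? snapshot)
        => _latest.TryGetValue(instanceId, out snapshot);
}

public static class SnapshotDemo
{
    public static void Main()
    {
        var store = new SketchSnapshotStore();
        store.SnapshotChanged += s => Console.WriteLine($"{s.InstanceId} -> {s.State}");

        store.Upsert(new Snapshot("drv-1", "Healthy"));
        store.Upsert(new Snapshot("drv-1", "Degraded"));

        store.TryGet("drv-1", out var latest);
        Console.WriteLine(latest?.State);
    }
}
```

A server-side component that subscribes to such an event would normally unsubscribe when it is disposed, which is the behaviour the third test above exercises.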
@@ -2,18 +2,29 @@ using System.Text.Json;
using System.Text.Json.Serialization;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.AdminUI.Components.Pages.Clusters.Drivers;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Config;
namespace ZB.MOM.WW.OtOpcUa.AdminUI.Tests;
public sealed class GalaxyDriverPageFormSerializationTests
{
// Case-sensitive options (camelCase, no PropertyNameCaseInsensitive), as GalaxyDriverPage._jsonOpts was before the case-insensitive fix.
private static readonly JsonSerializerOptions _opts = new()
{
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
WriteIndented = false,
};
// Matches the page's _jsonOpts exactly: camelCase + case-insensitive read + UnmappedMemberHandling.Skip.
private static readonly JsonSerializerOptions _pageOpts = new()
{
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
PropertyNameCaseInsensitive = true,
UnmappedMemberHandling = JsonUnmappedMemberHandling.Skip,
WriteIndented = false,
};
[Fact]
public void RoundTrip_PreservesKnownFields()
{
@@ -92,4 +103,135 @@ public sealed class GalaxyDriverPageFormSerializationTests
back.ProbeTimeoutSeconds.ShouldBe(20);
back.Gateway.Endpoint.ShouldBe("https://localhost:5001");
}
/// <summary>
/// Regression test: the seed SQL stores PascalCase JSON. With
/// <c>PropertyNameCaseInsensitive = true</c> the page must read the real values, not
/// fall back to defaults. FAILS against case-sensitive opts; PASSES with the fix.
/// </summary>
[Fact]
public void Seeded_pascalcase_config_loads_real_values()
{
// Exact JSON from docker-dev/seed/seed-clusters.sql (lines 130-151).
var seededJson = """
{
"Gateway": {
"Endpoint": "http://10.100.0.48:5120",
"ApiKeySecretRef": "env:GALAXY_MXGW_API_KEY",
"UseTls": false,
"ConnectTimeoutSeconds": 10,
"DefaultCallTimeoutSeconds": 30
},
"MxAccess": {
"ClientName": "OtOpcUa-MAIN-docker-dev",
"PublishingIntervalMs": 1000
},
"Repository": {
"DiscoverPageSize": 5000,
"WatchDeployEvents": true
},
"Reconnect": {
"InitialBackoffMs": 500,
"MaxBackoffMs": 30000,
"ReplayOnSessionLost": true
}
}
""";
// Deserialize with page-mirrored opts (camelCase + case-insensitive, as fixed).
var driverOpts = JsonSerializer.Deserialize<GalaxyDriverOptions>(seededJson, _pageOpts);
driverOpts.ShouldNotBeNull();
var form = GalaxyDriverPage.GalaxyFormModel.FromRecord(driverOpts!);
// Assert REAL seeded values — not defaults.
form.GatewayEndpoint.ShouldBe("http://10.100.0.48:5120");
form.GatewayApiKeySecretRef.ShouldBe("env:GALAXY_MXGW_API_KEY");
form.GatewayUseTls.ShouldBeFalse();
form.MxClientName.ShouldBe("OtOpcUa-MAIN-docker-dev");
form.RepositoryDiscoverPageSize.ShouldBe(5000);
form.ReconnectInitialBackoffMs.ShouldBe(500);
}
/// <summary>
/// Defence-in-depth: a config that genuinely OMITS a section (no Reconnect key at all)
/// must not throw — <see cref="GalaxyDriverPage.GalaxyFormModel.FromRecord"/> must
/// null-coalesce the missing section to its default value.
/// </summary>
[Fact]
public void FromRecord_with_omitted_section_uses_defaults()
{
// Only gateway section present — Reconnect intentionally absent.
var partialJson = """
{
"gateway": {
"endpoint": "opc://x",
"apiKeySecretRef": "env:K"
}
}
""";
var driverOpts = JsonSerializer.Deserialize<GalaxyDriverOptions>(partialJson, _pageOpts);
driverOpts.ShouldNotBeNull();
// FromRecord must not throw even though Reconnect (and the other omitted sections) are null.
var form = Should.NotThrow(() => GalaxyDriverPage.GalaxyFormModel.FromRecord(driverOpts!));
// Omitted Reconnect section falls back to GalaxyReconnectOptions() defaults.
var defaultRc = new GalaxyReconnectOptions();
form.ReconnectInitialBackoffMs.ShouldBe(defaultRc.InitialBackoffMs);
form.ReconnectMaxBackoffMs.ShouldBe(defaultRc.MaxBackoffMs);
form.ReconnectReplayOnSessionLost.ShouldBe(defaultRc.ReplayOnSessionLost);
}
/// <summary>
/// Confirms that <see cref="GalaxyDriverPage.GalaxyFormModel.FromRecord"/> still
/// round-trips correctly when all nested records are populated (non-regressed path).
/// </summary>
[Fact]
public void FromRecord_with_fully_populated_options_round_trips()
{
var original = new GalaxyDriverOptions(
Gateway: new GalaxyGatewayOptions(
Endpoint: "https://gw.example.com:5001",
ApiKeySecretRef: "env:MY_KEY",
UseTls: true,
CaCertificatePath: null,
ConnectTimeoutSeconds: 12,
DefaultCallTimeoutSeconds: 40,
StreamTimeoutSeconds: 0),
MxAccess: new GalaxyMxAccessOptions(
ClientName: "OtOpcUa-Test",
PublishingIntervalMs: 750,
WriteUserId: 2,
EventPumpChannelCapacity: 25_000),
Repository: new GalaxyRepositoryOptions(
DiscoverPageSize: 3000,
WatchDeployEvents: false),
Reconnect: new GalaxyReconnectOptions(
InitialBackoffMs: 800,
MaxBackoffMs: 45_000,
ReplayOnSessionLost: false))
{
ProbeTimeoutSeconds = 20,
};
var form = GalaxyDriverPage.GalaxyFormModel.FromRecord(original);
form.GatewayEndpoint.ShouldBe("https://gw.example.com:5001");
form.GatewayApiKeySecretRef.ShouldBe("env:MY_KEY");
form.GatewayUseTls.ShouldBeTrue();
form.GatewayConnectTimeoutSeconds.ShouldBe(12);
form.GatewayDefaultCallTimeoutSeconds.ShouldBe(40);
form.MxClientName.ShouldBe("OtOpcUa-Test");
form.MxPublishingIntervalMs.ShouldBe(750);
form.MxWriteUserId.ShouldBe(2);
form.MxEventPumpChannelCapacity.ShouldBe(25_000);
form.RepositoryDiscoverPageSize.ShouldBe(3000);
form.RepositoryWatchDeployEvents.ShouldBeFalse();
form.ReconnectInitialBackoffMs.ShouldBe(800);
form.ReconnectMaxBackoffMs.ShouldBe(45_000);
form.ReconnectReplayOnSessionLost.ShouldBeFalse();
form.ProbeTimeoutSeconds.ShouldBe(20);
}
}
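The regression test above depends on how `System.Text.Json` matches property names. The sketch below isolates that behaviour with two hypothetical types (`GatewayOptions`, `DriverOptions`) that are far smaller than the real `GalaxyDriverOptions`; the endpoint value is made up. With a camelCase naming policy and case-sensitive matching, PascalCase keys match nothing and are skipped, so the nested section stays null.

```csharp
using System;
using System.Text.Json;

// Hypothetical options types for illustration only.
public sealed class GatewayOptions { public string? Endpoint { get; set; } }
public sealed class DriverOptions { public GatewayOptions? Gateway { get; set; } }

public static class CaseDemo
{
    public static void Main()
    {
        // PascalCase keys, like the seeded configuration in the test above.
        const string pascalJson = """{"Gateway":{"Endpoint":"http://gw.example:5120"}}""";

        // camelCase policy, case-sensitive: expects "gateway", so "Gateway" is unmapped.
        var strict = new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        };

        // Same policy, but names are compared case-insensitively on read.
        var lenient = new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
            PropertyNameCaseInsensitive = true,
        };

        var a = JsonSerializer.Deserialize<DriverOptions>(pascalJson, strict);
        var b = JsonSerializer.Deserialize<DriverOptions>(pascalJson, lenient);

        Console.WriteLine(a?.Gateway is null);     // True: section silently left at its default
        Console.WriteLine(b?.Gateway?.Endpoint);   // http://gw.example:5120
    }
}
```

The strict case fails silently rather than throwing, which is why a form built from the result would show defaults instead of the stored values.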
@@ -0,0 +1,51 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.AdminUI.Hubs;
namespace ZB.MOM.WW.OtOpcUa.AdminUI.Tests;
/// <summary>
/// Covers the in-process fan-out the Blazor Server Alerts / Script log pages rely on:
/// <see cref="IInProcessBroadcaster{T}.Publish"/> raises <c>Received</c> for every current
/// subscriber, and unsubscribing stops delivery. These pages read this broadcaster directly
/// instead of opening a self-targeted SignalR connection (unreachable behind a reverse proxy).
/// </summary>
public sealed class InProcessBroadcasterTests
{
[Fact]
public void Publish_raises_Received_for_all_current_subscribers()
{
var broadcaster = new InProcessBroadcaster<string>();
var a = new List<string>();
var b = new List<string>();
broadcaster.Received += a.Add;
broadcaster.Received += b.Add;
broadcaster.Publish("evt-1");
a.ShouldBe(["evt-1"]);
b.ShouldBe(["evt-1"]);
}
[Fact]
public void Unsubscribed_handler_stops_receiving()
{
var broadcaster = new InProcessBroadcaster<string>();
var received = new List<string>();
void Handler(string s) => received.Add(s);
broadcaster.Received += Handler;
broadcaster.Publish("first");
broadcaster.Received -= Handler;
broadcaster.Publish("second");
received.ShouldBe(["first"]);
}
[Fact]
public void Publish_with_no_subscribers_does_not_throw()
{
var broadcaster = new InProcessBroadcaster<int>();
Should.NotThrow(() => broadcaster.Publish(42));
}
}
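The broadcaster under test is not shown in this diff. A minimal sketch that would satisfy the three behaviours above, assuming the event is a plain `Action<T>`, could be as small as this. `SketchBroadcaster<T>` is a hypothetical name, and the real `InProcessBroadcaster<T>` may add locking or exception isolation that this version does not have.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of an in-process fan-out, not the repository's implementation.
public sealed class SketchBroadcaster<T>
{
    public event Action<T>? Received;

    // The null-conditional invoke makes publishing with no subscribers a no-op.
    public void Publish(T item) => Received?.Invoke(item);
}

public static class BroadcasterDemo
{
    public static void Main()
    {
        var broadcaster = new SketchBroadcaster<string>();
        var received = new List<string>();
        void Handler(string s) => received.Add(s);

        broadcaster.Publish("dropped");      // nobody is listening yet
        broadcaster.Received += Handler;
        broadcaster.Publish("first");
        broadcaster.Received -= Handler;
        broadcaster.Publish("second");       // handler already removed

        Console.WriteLine(string.Join(",", received)); // first
    }
}
```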
@@ -0,0 +1,69 @@
using Microsoft.Extensions.DependencyInjection;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Host.Drivers;
namespace ZB.MOM.WW.OtOpcUa.Host.IntegrationTests;
/// <summary>
/// Guards the Test Connect wiring contract: every driver type editable in the AdminUI must have
/// a registered <see cref="IDriverProbe"/>, resolvable from the same DI container that hosts the
/// <c>admin-operations</c> cluster singleton. The singleton is role-pinned to <c>admin</c>, so on
/// a split-role deployment (the MAIN cluster's admin-only nodes) the probes must be wired by the
/// admin path — not only the driver path — or every Test Connect button returns
/// "No probe registered for driver type X".
/// </summary>
public sealed class DriverProbeRegistrationTests
{
// The canonical "all drivers" set — one entry per AdminUI typed driver page's DriverTypeKey.
// Keep in sync with the DriverTypeKey constants in
// src/Server/.../Components/Pages/Clusters/Drivers/*DriverPage.razor.
private static readonly string[] AdminUiDriverTypeKeys =
[
"ModbusTcp",
"AbCip",
"AbLegacy",
"S7",
"TwinCat", // page key; probe reports "TwinCAT" — must resolve case-insensitively
"Focas", // page key; probe reports "FOCAS" — must resolve case-insensitively
"OpcUaClient",
"GalaxyMxGateway",
"Historian.Wonderware",
];
[Fact]
public void AddOtOpcUaDriverProbes_registers_a_probe_for_every_AdminUI_driver_type()
{
var services = new ServiceCollection();
services.AddOtOpcUaDriverProbes();
using var sp = services.BuildServiceProvider();
var probes = sp.GetServices<IDriverProbe>().ToList();
// No duplicate DriverType — AdminOperationsActor builds a dictionary keyed by DriverType
// (case-insensitive) and would throw on a duplicate key, crashing the singleton.
var byType = probes.ToDictionary(p => p.DriverType, StringComparer.OrdinalIgnoreCase);
foreach (var key in AdminUiDriverTypeKeys)
byType.ContainsKey(key).ShouldBeTrue($"No IDriverProbe registered for AdminUI driver type '{key}'.");
}
[Fact]
public void AddOtOpcUaDriverProbes_is_idempotent()
{
// A fused admin,driver node calls the registration from both the driver-factory path and the
// admin path. TryAddEnumerable must de-dup so the probe set stays unique (else the actor's
// ToDictionary throws).
var services = new ServiceCollection();
services.AddOtOpcUaDriverProbes();
services.AddOtOpcUaDriverProbes();
using var sp = services.BuildServiceProvider();
var probes = sp.GetServices<IDriverProbe>().ToList();
var distinctTypes = probes.Select(p => p.DriverType).Distinct(StringComparer.OrdinalIgnoreCase).Count();
probes.Count.ShouldBe(distinctTypes, "Duplicate IDriverProbe registrations — TryAddEnumerable should de-dup.");
distinctTypes.ShouldBe(AdminUiDriverTypeKeys.Length);
}
}
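The idempotency test relies on `TryAddEnumerable` from `Microsoft.Extensions.DependencyInjection.Extensions`, which skips a descriptor when the same service and implementation type pair is already registered. The sketch below shows that de-duplication in isolation. `IProbe`, `ModbusProbe`, `S7Probe`, and `AddSketchProbes` are hypothetical stand-ins; the body of the real `AddOtOpcUaDriverProbes` does not appear in this diff.

```csharp
using System;
using System.Linq;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;

// Hypothetical probe contract and implementations for illustration only.
public interface IProbe { string DriverType { get; } }
public sealed class ModbusProbe : IProbe { public string DriverType => "ModbusTcp"; }
public sealed class S7Probe : IProbe { public string DriverType => "S7"; }

public static class ProbeRegistration
{
    // Safe to call from more than one startup path: repeats are ignored.
    public static IServiceCollection AddSketchProbes(this IServiceCollection services)
    {
        services.TryAddEnumerable(ServiceDescriptor.Singleton<IProbe, ModbusProbe>());
        services.TryAddEnumerable(ServiceDescriptor.Singleton<IProbe, S7Probe>());
        return services;
    }
}

public static class ProbeDemo
{
    public static void Main()
    {
        var services = new ServiceCollection();
        services.AddSketchProbes();
        services.AddSketchProbes(); // second call adds nothing

        using var sp = services.BuildServiceProvider();
        var byType = sp.GetServices<IProbe>()
            .ToDictionary(p => p.DriverType, StringComparer.OrdinalIgnoreCase);

        Console.WriteLine(byType.Count); // 2
    }
}
```

Had the registration used plain `AddSingleton` instead, the second call would double the probe list and the `ToDictionary` call would throw on the duplicate key, which is the failure mode the test comments describe.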
@@ -84,6 +84,9 @@ public sealed class DeferredAddressSpaceSinkTests
public void EnsureFolder(string folderNodeId, string? parentNodeId, string displayName)
=> CallQueue.Enqueue($"EF:{folderNodeId}");
/// <inheritdoc />
public void EnsureVariable(string variableNodeId, string? parentFolderNodeId, string displayName, string dataType)
=> CallQueue.Enqueue($"EV:{variableNodeId}");
/// <inheritdoc />
public void RebuildAddressSpace() => CallQueue.Enqueue("RB");
}
}
@@ -34,7 +34,8 @@ public sealed class Phase7ApplierHierarchyTests : IDisposable
UnsLines: new[] { new UnsLineProjection("line-1", "area-1", "Cell A") },
EquipmentNodes: new[] { new EquipmentNode("eq-1", "Pump-1", "line-1") },
DriverInstancePlans: Array.Empty<DriverInstancePlan>(),
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>());
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>(),
GalaxyTags: Array.Empty<GalaxyTagPlan>());
applier.MaterialiseHierarchy(composition);
@@ -57,7 +58,8 @@ public sealed class Phase7ApplierHierarchyTests : IDisposable
UnsLines: Array.Empty<UnsLineProjection>(),
EquipmentNodes: new[] { new EquipmentNode("eq-orphan", "Orphan", UnsLineId: "") },
DriverInstancePlans: Array.Empty<DriverInstancePlan>(),
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>());
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>(),
GalaxyTags: Array.Empty<GalaxyTagPlan>());
applier.MaterialiseHierarchy(composition);
@@ -91,7 +93,8 @@ public sealed class Phase7ApplierHierarchyTests : IDisposable
UnsLines: new[] { new UnsLineProjection("line-1", "area-A", "Line 1") },
EquipmentNodes: new[] { new EquipmentNode("eq-1", "Eq 1", "line-1"), new EquipmentNode("eq-2", "Eq 2", "line-1") },
DriverInstancePlans: Array.Empty<DriverInstancePlan>(),
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>()));
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>(),
GalaxyTags: Array.Empty<GalaxyTagPlan>()));
sdkServer.NodeManager!.FolderCount.ShouldBe(5); // 2 areas + 1 line + 2 equipment
@@ -101,7 +104,8 @@ public sealed class Phase7ApplierHierarchyTests : IDisposable
UnsLines: new[] { new UnsLineProjection("line-1", "area-A", "Line 1") },
EquipmentNodes: new[] { new EquipmentNode("eq-1", "Eq 1", "line-1"), new EquipmentNode("eq-2", "Eq 2", "line-1") },
DriverInstancePlans: Array.Empty<DriverInstancePlan>(),
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>()));
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>(),
GalaxyTags: Array.Empty<GalaxyTagPlan>()));
sdkServer.NodeManager!.FolderCount.ShouldBe(5);
}
@@ -149,6 +153,12 @@ public sealed class Phase7ApplierHierarchyTests : IDisposable
/// <param name="displayName">The display name of the folder.</param>
public void EnsureFolder(string folderNodeId, string? parentNodeId, string displayName)
=> _calls.Enqueue((folderNodeId, parentNodeId, displayName));
/// <summary>Ensures a variable exists (stub implementation for testing).</summary>
/// <param name="variableNodeId">The node ID of the variable.</param>
/// <param name="parentFolderNodeId">The node ID of the parent folder, or null for root.</param>
/// <param name="displayName">The display name of the variable.</param>
/// <param name="dataType">The OPC UA built-in type name.</param>
public void EnsureVariable(string variableNodeId, string? parentFolderNodeId, string displayName, string dataType) { }
/// <summary>Rebuilds the address space (stub implementation for testing).</summary>
public void RebuildAddressSpace() { }
}
@@ -58,7 +58,10 @@ public sealed class Phase7ApplierTests
ChangedDrivers: Array.Empty<Phase7Plan.DriverDelta>(),
AddedAlarms: Array.Empty<ScriptedAlarmPlan>(),
RemovedAlarms: Array.Empty<ScriptedAlarmPlan>(),
ChangedAlarms: Array.Empty<Phase7Plan.AlarmDelta>());
ChangedAlarms: Array.Empty<Phase7Plan.AlarmDelta>(),
AddedGalaxyTags: Array.Empty<GalaxyTagPlan>(),
RemovedGalaxyTags: Array.Empty<GalaxyTagPlan>(),
ChangedGalaxyTags: Array.Empty<Phase7Plan.GalaxyTagDelta>());
var outcome = applier.Apply(plan);
@@ -89,7 +92,10 @@ public sealed class Phase7ApplierTests
},
AddedAlarms: Array.Empty<ScriptedAlarmPlan>(),
RemovedAlarms: Array.Empty<ScriptedAlarmPlan>(),
ChangedAlarms: Array.Empty<Phase7Plan.AlarmDelta>());
ChangedAlarms: Array.Empty<Phase7Plan.AlarmDelta>(),
AddedGalaxyTags: Array.Empty<GalaxyTagPlan>(),
RemovedGalaxyTags: Array.Empty<GalaxyTagPlan>(),
ChangedGalaxyTags: Array.Empty<Phase7Plan.GalaxyTagDelta>());
var outcome = applier.Apply(plan);
@@ -111,10 +117,102 @@ public sealed class Phase7ApplierTests
outcome.RebuildCalled.ShouldBeTrue();
}
/// <summary>Verifies MaterialiseGalaxyTags creates one folder per distinct FolderPath and one
/// variable per tag, with root-level tags hung directly under the namespace root.</summary>
[Fact]
public void MaterialiseGalaxyTags_creates_folder_per_distinct_path_and_variable_per_tag()
{
var sink = new RecordingSink();
var applier = new Phase7Applier(sink, NullLogger<Phase7Applier>.Instance);
var composition = new Phase7CompositionResult(
UnsAreas: Array.Empty<UnsAreaProjection>(),
UnsLines: Array.Empty<UnsLineProjection>(),
EquipmentNodes: Array.Empty<EquipmentNode>(),
DriverInstancePlans: Array.Empty<DriverInstancePlan>(),
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>(),
GalaxyTags: new[]
{
new GalaxyTagPlan("t1", "drv", "section.area", "Temperature", "Float", "section.area.Temperature"),
new GalaxyTagPlan("t2", "drv", "", "Pressure", "Int32", "Pressure"),
});
applier.MaterialiseGalaxyTags(composition);
// One folder for the single distinct non-empty FolderPath; the root-level tag adds none.
sink.FolderCalls.ShouldHaveSingleItem().ShouldBe(("section.area", null, "section.area"));
// Foldered tag → NodeId is its MxAccessRef under the FolderPath parent.
// Root-level tag → NodeId is its DisplayName under the root (null parent).
sink.VariableCalls.ShouldContain(("section.area.Temperature", "section.area", "Temperature", "Float"));
sink.VariableCalls.ShouldContain(("Pressure", (string?)null, "Pressure", "Int32"));
sink.VariableCalls.Count.ShouldBe(2);
}
/// <summary>Verifies that two tags sharing a FolderPath produce a single EnsureFolder call
/// (deduped) but one EnsureVariable per tag.</summary>
[Fact]
public void MaterialiseGalaxyTags_dedupes_folders_for_tags_sharing_a_path()
{
var sink = new RecordingSink();
var applier = new Phase7Applier(sink, NullLogger<Phase7Applier>.Instance);
var composition = new Phase7CompositionResult(
UnsAreas: Array.Empty<UnsAreaProjection>(),
UnsLines: Array.Empty<UnsLineProjection>(),
EquipmentNodes: Array.Empty<EquipmentNode>(),
DriverInstancePlans: Array.Empty<DriverInstancePlan>(),
ScriptedAlarmPlans: Array.Empty<ScriptedAlarmPlan>(),
GalaxyTags: new[]
{
new GalaxyTagPlan("t1", "drv", "line.cell", "Speed", "Float", "line.cell.Speed"),
new GalaxyTagPlan("t2", "drv", "line.cell", "Torque", "Float", "line.cell.Torque"),
});
applier.MaterialiseGalaxyTags(composition);
sink.FolderCalls.ShouldHaveSingleItem().ShouldBe(("line.cell", null, "line.cell"));
sink.VariableCalls.Count.ShouldBe(2);
sink.VariableCalls.ShouldContain(("line.cell.Speed", "line.cell", "Speed", "Float"));
sink.VariableCalls.ShouldContain(("line.cell.Torque", "line.cell", "Torque", "Float"));
}
/// <summary>Verifies that added Galaxy tags in an otherwise-empty plan trigger an address-space rebuild.</summary>
[Fact]
public void Added_galaxy_tags_trigger_rebuild()
{
var sink = new RecordingSink();
var applier = new Phase7Applier(sink, NullLogger<Phase7Applier>.Instance);
var plan = new Phase7Plan(
AddedEquipment: Array.Empty<EquipmentNode>(),
RemovedEquipment: Array.Empty<EquipmentNode>(),
ChangedEquipment: Array.Empty<Phase7Plan.EquipmentDelta>(),
AddedDrivers: Array.Empty<DriverInstancePlan>(),
RemovedDrivers: Array.Empty<DriverInstancePlan>(),
ChangedDrivers: Array.Empty<Phase7Plan.DriverDelta>(),
AddedAlarms: Array.Empty<ScriptedAlarmPlan>(),
RemovedAlarms: Array.Empty<ScriptedAlarmPlan>(),
ChangedAlarms: Array.Empty<Phase7Plan.AlarmDelta>(),
AddedGalaxyTags: new[]
{
new GalaxyTagPlan("t1", "drv", "section.area", "Temperature", "Float", "section.area.Temperature"),
},
RemovedGalaxyTags: Array.Empty<GalaxyTagPlan>(),
ChangedGalaxyTags: Array.Empty<Phase7Plan.GalaxyTagDelta>());
var outcome = applier.Apply(plan);
outcome.RebuildCalled.ShouldBeTrue();
outcome.AddedNodes.ShouldBe(1);
sink.RebuildCalls.ShouldBe(1);
}
private static Phase7Plan EmptyPlan => new(
Array.Empty<EquipmentNode>(), Array.Empty<EquipmentNode>(), Array.Empty<Phase7Plan.EquipmentDelta>(),
Array.Empty<DriverInstancePlan>(), Array.Empty<DriverInstancePlan>(), Array.Empty<Phase7Plan.DriverDelta>(),
Array.Empty<ScriptedAlarmPlan>(), Array.Empty<ScriptedAlarmPlan>(), Array.Empty<Phase7Plan.AlarmDelta>(),
Array.Empty<GalaxyTagPlan>(), Array.Empty<GalaxyTagPlan>(), Array.Empty<Phase7Plan.GalaxyTagDelta>());
private static Phase7Plan WithEquipmentRemoval(params string[] ids) => new(
AddedEquipment: Array.Empty<EquipmentNode>(),
@@ -125,7 +223,10 @@ public sealed class Phase7ApplierTests
ChangedDrivers: Array.Empty<Phase7Plan.DriverDelta>(),
AddedAlarms: Array.Empty<ScriptedAlarmPlan>(),
RemovedAlarms: Array.Empty<ScriptedAlarmPlan>(),
ChangedAlarms: Array.Empty<Phase7Plan.AlarmDelta>(),
AddedGalaxyTags: Array.Empty<GalaxyTagPlan>(),
RemovedGalaxyTags: Array.Empty<GalaxyTagPlan>(),
ChangedGalaxyTags: Array.Empty<Phase7Plan.GalaxyTagDelta>());
private sealed class RecordingSink : IOpcUaAddressSpaceSink
{
@@ -133,6 +234,8 @@ public sealed class Phase7ApplierTests
public ConcurrentQueue<(string NodeId, bool Active, bool Acknowledged)> AlarmQueue { get; } = new();
/// <summary>Gets the queue of folder creation calls.</summary>
public ConcurrentQueue<(string NodeId, string? Parent, string DisplayName)> FolderQueue { get; } = new();
/// <summary>Gets the queue of variable creation calls.</summary>
public ConcurrentQueue<(string NodeId, string? Parent, string DisplayName, string DataType)> VariableQueue { get; } = new();
/// <summary>Gets the number of rebuild calls made on this sink.</summary>
public int RebuildCalls;
@@ -140,6 +243,8 @@ public sealed class Phase7ApplierTests
public List<(string NodeId, bool Active, bool Acknowledged)> AlarmWrites => AlarmQueue.ToList();
/// <summary>Gets the list of recorded folder creation calls.</summary>
public List<(string NodeId, string? Parent, string DisplayName)> FolderCalls => FolderQueue.ToList();
/// <summary>Gets the list of recorded variable creation calls.</summary>
public List<(string NodeId, string? Parent, string DisplayName, string DataType)> VariableCalls => VariableQueue.ToList();
/// <summary>Records a value write (no-op in this recording sink).</summary>
/// <param name="nodeId">The node ID.</param>
@@ -160,6 +265,13 @@ public sealed class Phase7ApplierTests
/// <param name="displayName">The display name for the folder.</param>
public void EnsureFolder(string folderNodeId, string? parentNodeId, string displayName)
=> FolderQueue.Enqueue((folderNodeId, parentNodeId, displayName));
/// <summary>Records a variable creation call.</summary>
/// <param name="variableNodeId">The variable node ID.</param>
/// <param name="parentFolderNodeId">The parent folder node ID, if any.</param>
/// <param name="displayName">The display name for the variable.</param>
/// <param name="dataType">The OPC UA built-in type name.</param>
public void EnsureVariable(string variableNodeId, string? parentFolderNodeId, string displayName, string dataType)
=> VariableQueue.Enqueue((variableNodeId, parentFolderNodeId, displayName, dataType));
/// <summary>Records a rebuild address space call.</summary>
public void RebuildAddressSpace() => Interlocked.Increment(ref RebuildCalls);
}
@@ -192,6 +304,12 @@ public sealed class Phase7ApplierTests
/// <param name="parentNodeId">The parent folder node ID, if any.</param>
/// <param name="displayName">The display name for the folder.</param>
public void EnsureFolder(string folderNodeId, string? parentNodeId, string displayName) { }
/// <summary>No-op variable creation call.</summary>
/// <param name="variableNodeId">The variable node ID.</param>
/// <param name="parentFolderNodeId">The parent folder node ID, if any.</param>
/// <param name="displayName">The display name for the variable.</param>
/// <param name="dataType">The OPC UA built-in type name.</param>
public void EnsureVariable(string variableNodeId, string? parentFolderNodeId, string displayName, string dataType) { }
/// <summary>No-op rebuild address space call.</summary>
public void RebuildAddressSpace() { }
}
@@ -206,6 +206,12 @@ public sealed class OtOpcUaTelemetryHookTests : RuntimeActorTestBase
/// <param name="parentNodeId">The parent folder node identifier.</param>
/// <param name="displayName">The display name for the folder.</param>
public void EnsureFolder(string folderNodeId, string? parentNodeId, string displayName) { }
/// <summary>Ensures a variable exists (stub implementation).</summary>
/// <param name="variableNodeId">The variable node identifier.</param>
/// <param name="parentFolderNodeId">The parent folder node identifier.</param>
/// <param name="displayName">The display name for the variable.</param>
/// <param name="dataType">The OPC UA built-in type name.</param>
public void EnsureVariable(string variableNodeId, string? parentFolderNodeId, string displayName, string dataType) { }
/// <summary>Rebuilds address space (recorded via span).</summary>
public void RebuildAddressSpace() { /* recorded via span */ }
}
@@ -161,6 +161,13 @@ public sealed class OpcUaPublishActorRebuildTests : RuntimeActorTestBase
/// <param name="displayName">The display name of the folder.</param>
public void EnsureFolder(string folderNodeId, string? parentNodeId, string displayName)
=> Calls.Enqueue($"EF:{folderNodeId}");
/// <summary>Records a variable ensure call.</summary>
/// <param name="variableNodeId">The variable node ID.</param>
/// <param name="parentFolderNodeId">The parent folder node ID, or null if this is a root variable.</param>
/// <param name="displayName">The display name of the variable.</param>
/// <param name="dataType">The OPC UA built-in type name.</param>
public void EnsureVariable(string variableNodeId, string? parentFolderNodeId, string displayName, string dataType)
=> Calls.Enqueue($"EV:{variableNodeId}");
/// <summary>Records a rebuild address space call.</summary>
public void RebuildAddressSpace() => Interlocked.Increment(ref RebuildCalls);
}
@@ -182,6 +182,13 @@ public sealed class OpcUaPublishActorTests : RuntimeActorTestBase
/// <param name="displayName">The display name of the folder.</param>
public void EnsureFolder(string folderNodeId, string? parentNodeId, string displayName) { }
/// <summary>Ensures a variable exists (no-op in test).</summary>
/// <param name="variableNodeId">The OPC UA variable node identifier.</param>
/// <param name="parentFolderNodeId">The parent folder node identifier, or null for root.</param>
/// <param name="displayName">The display name of the variable.</param>
/// <param name="dataType">The OPC UA built-in type name.</param>
public void EnsureVariable(string variableNodeId, string? parentFolderNodeId, string displayName, string dataType) { }
/// <summary>Records a rebuild call.</summary>
public void RebuildAddressSpace() => Interlocked.Increment(ref RebuildCalls);
}