Files
lmxopcua/docs/plans/alarms-d1-smoke-artifact.md
T
Joseph Doherty 695fa6408b
v2-ci / build (push) Failing after 47s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
docs(alarms): record native alarms verified working; add D.1 smoke
The 2026-04-30 alarm plan banners claimed worker-side native alarm
subscription was blocked on a COM-bitness finding. That's stale: the
mxaccessgw .NET client now has true MxAccess alarm-event support, and a
live StreamAlarms check (+ new Skip-gated GatewayGalaxyAlarmFeedLiveTests
through the lmxopcua consumer) confirms native alarms — operator comment,
category, severity, timestamps — flow end-to-end. Reconcile both plan docs
to reality and add docs/plans/alarms-d1-smoke-artifact.md as the D.1
alarm-source deliverable. Historian-write live smoke + full server->A&C
round-trip remain (Windows parity rig only).
2026-05-31 09:59:01 -04:00

101 lines
5.1 KiB
Markdown

# Alarms D.1 — smoke artifact
> **Status (2026-05-29): alarm-source leg VERIFIED. Historian-write leg still
> pending the Windows sidecar + live AVEVA Historian.**
>
> This is the D.1 deliverable called for by `docs/plans/alarms-worker-wiring-plan.md`
> — captured evidence that a live Galaxy alarm reaches lmxopcua through the native
> gateway path (not the sub-attribute fallback). It supersedes the "A.2 blocked"
> banners in `alarms-over-gateway.md` / `alarms-worker-wiring-plan.md`, which were
> written 2026-04-30 before the gateway's alarm feed was working.
## What was verified
The mxaccessgw gateway **does** serve native MxAccess alarms today, and the lmxopcua
consumer ingests them with full fidelity — **including operator-comment**, the field
the 2026-04-30 plan flagged as "the only v1 regression."
Verified from the macOS dev box against the live gateway at `http://10.100.0.48:5120`
(reachable; `nc -z` succeeds). No acknowledge / no writes were issued — read-only
`StreamAlarms`.
### 1. Gateway boundary — raw `StreamAlarms` (`ZB.MOM.WW.MxGateway.Client`)
A standalone client streamed the active-alarm snapshot: **20 active alarms**, each
carrying native metadata. Sample (one of 20):
```json
{ "alarmFullReference": "Galaxy!TestArea.TestMachine_001.TestAlarm001",
"sourceObjectReference": "TestMachine_001.TestAlarm001",
"alarmTypeName": "DSC", "severity": 500,
"currentState": "ALARM_CONDITION_STATE_ACTIVE", "category": "TestArea",
"lastTransitionTimestamp": "2026-05-24T16:04:10.856Z",
"operatorComment": "Test alarm #1" }
```
Followed by the `SnapshotComplete` marker. `operatorComment`, `category`, `severity`,
`currentState`, and `lastTransitionTimestamp` are all populated.
### 2. lmxopcua consumer — `GatewayGalaxyAlarmFeed` → `GalaxyAlarmTransition`
The Skip-gated live test
`Runtime/GatewayGalaxyAlarmFeedLiveTests.Live_gateway_delivers_native_alarm_transitions_through_the_consumer`
wires the real `MxGatewayClient.StreamAlarmsAsync` into the production consumer seam
and **passes**. Captured output (`D1_SMOKE_OUT`):
```
# consumer transitions observed: 2+
Raise Galaxy!TestArea.TestMachine_001.TestAlarm001 | sev=750(High) raw=500 | cat=TestArea | comment='Test alarm #1' | xitionUtc=2026-05-24T16:04:10.856Z
Raise Galaxy!TestArea.TestMachine_003.TestAlarm001 | sev=750(High) raw=500 | cat=TestArea | comment='Test alarm #1' | xitionUtc=2026-05-07T18:14:00.594Z
```
The consumer preserves `operatorComment` + `category` + transition timestamp and
applies the OPC UA severity-bucket mapping (`MxAccessSeverityMapper`: raw 500 →
OPC UA 750, bucket `High`).
### 3. Full chain to the OPC UA Part 9 surface (code-path verified)
`GalaxyDriver.OnAlarmFeedTransition` maps `GalaxyAlarmTransition`
`AlarmEventArgs`, carrying `OperatorComment`, `OriginalRaiseTimestampUtc`,
`AlarmCategory`, and the severity bucket onto `IAlarmSource.OnAlarmEvent`.
`AlarmEventArgs` already declares those fields — so the **E.7 contract extension is
done**, not pending. The server's Part-9 condition layer consumes `IAlarmSource`
via `AlarmSurfaceInvoker``GenericDriverNodeManager`. Unit coverage:
`GalaxyDriverAlarmSourceTests`, `GatewayGalaxyAlarmFeedTests`.
## How to re-run
```bash
export MXGW_ENDPOINT="http://10.100.0.48:5120"
export GALAXY_MXGW_API_KEY="<dev key from docker-dev/docker-compose.yml>"
export D1_SMOKE_OUT="/tmp/d1-consumer-transitions.txt" # optional capture
dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests \
--filter "FullyQualifiedName~GatewayGalaxyAlarmFeedLiveTests"
```
Without the env vars the test `Skip`s, so normal `dotnet test` runs are unaffected.
## Not covered here (still open)
1. **Scripted-alarm historian write-back → AVEVA Historian** (C.1's live leg). The
`SdkAlarmHistorianWriteBackend` (real `HistorianAccess.AddStreamedValue` path) is
implemented and unit-tested, but its `Live_*` write smoke needs the Windows
historian sidecar + a live AVEVA Historian — neither reachable from the macOS dev
box. Capture this leg on the Windows parity rig.
2. **Running-server → OPC UA A&C client round-trip.** This artifact proves the driver
consumer end; it does not exercise a full OtOpcUa server surfacing the condition to
an OPC UA client, because the docker-dev stack stubs the Galaxy driver on Linux
(`DriverInstanceActor.ShouldStub`). Capture on the Windows parity rig (or a Linux
host with `ShouldStub` overridden to point the real driver at the gateway).
## Mechanism — true MxAccess alarm-event support
The gateway delivers these alarms via **true MxAccess alarm-event support** in the
mxaccessgw .NET client — a real alarm-event subscription, **not** the value-driven
sub-attribute fallback. (Confirmed by the gateway maintainer; the client-side stream
check above can only observe the resulting feed, which is why this artifact records the
mechanism here rather than inferring it.) So A.2 is implemented as originally specified:
`MX_EVENT_FAMILY_ON_ALARM_TRANSITION` carries genuine native alarm-event metadata, and
the operator-comment / original-raise-time / category fields are first-class — not
reconstructed from attribute reads.