Compare commits
1 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 695fa6408b | | |
@@ -0,0 +1,100 @@
# Alarms D.1 — smoke artifact

> **Status (2026-05-29): alarm-source leg VERIFIED. Historian-write leg still
> pending the Windows sidecar + live AVEVA Historian.**
>
> This is the D.1 deliverable called for by `docs/plans/alarms-worker-wiring-plan.md`
> — captured evidence that a live Galaxy alarm reaches lmxopcua through the native
> gateway path (not the sub-attribute fallback). It supersedes the "A.2 blocked"
> banners in `alarms-over-gateway.md` / `alarms-worker-wiring-plan.md`, which were
> written 2026-04-30 before the gateway's alarm feed was working.

## What was verified

The mxaccessgw gateway **does** serve native MxAccess alarms today, and the lmxopcua
consumer ingests them with full fidelity — **including operator-comment**, the field
the 2026-04-30 plan flagged as "the only v1 regression."

Verified from the macOS dev box against the live gateway at `http://10.100.0.48:5120`
(reachable; `nc -z` succeeds). No acknowledge / no writes were issued — read-only
`StreamAlarms`.

### 1. Gateway boundary — raw `StreamAlarms` (`ZB.MOM.WW.MxGateway.Client`)

A standalone client streamed the active-alarm snapshot: **20 active alarms**, each
carrying native metadata. Sample (one of 20):

```json
{ "alarmFullReference": "Galaxy!TestArea.TestMachine_001.TestAlarm001",
  "sourceObjectReference": "TestMachine_001.TestAlarm001",
  "alarmTypeName": "DSC", "severity": 500,
  "currentState": "ALARM_CONDITION_STATE_ACTIVE", "category": "TestArea",
  "lastTransitionTimestamp": "2026-05-24T16:04:10.856Z",
  "operatorComment": "Test alarm #1" }
```

Followed by the `SnapshotComplete` marker. `operatorComment`, `category`, `severity`,
`currentState`, and `lastTransitionTimestamp` are all populated.
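
As a rough way to repeat the "all populated" check on a saved record, the sketch below greps a file for each of the five fields named above. It assumes the sample has been written to a JSON file; `check_alarm_fields` is a hypothetical helper name (not part of either repo), and the pattern match is a heuristic rather than a real JSON parser.

```bash
# Hypothetical helper: succeeds only if every listed field has a non-empty value.
check_alarm_fields() {
  local file="$1" field missing=0
  for field in operatorComment category severity currentState lastTransitionTimestamp; do
    # Populated means: the key is followed by a non-empty string or a number.
    if ! grep -Eq "\"$field\"[[:space:]]*:[[:space:]]*(\"[^\"]+\"|[0-9]+)" "$file"; then
      echo "missing or empty: $field" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_alarm_fields sample.json` returns non-zero and names the offending field on stderr when, for example, `operatorComment` is an empty string.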

### 2. lmxopcua consumer — `GatewayGalaxyAlarmFeed` → `GalaxyAlarmTransition`

The Skip-gated live test
`Runtime/GatewayGalaxyAlarmFeedLiveTests.Live_gateway_delivers_native_alarm_transitions_through_the_consumer`
wires the real `MxGatewayClient.StreamAlarmsAsync` into the production consumer seam
and **passes**. Captured output (`D1_SMOKE_OUT`):

```
# consumer transitions observed: 2+
Raise Galaxy!TestArea.TestMachine_001.TestAlarm001 | sev=750(High) raw=500 | cat=TestArea | comment='Test alarm #1' | xitionUtc=2026-05-24T16:04:10.856Z
Raise Galaxy!TestArea.TestMachine_003.TestAlarm001 | sev=750(High) raw=500 | cat=TestArea | comment='Test alarm #1' | xitionUtc=2026-05-07T18:14:00.594Z
```

The consumer preserves `operatorComment` + `category` + transition timestamp and
applies the OPC UA severity-bucket mapping (`MxAccessSeverityMapper`: raw 500 →
OPC UA 750, bucket `High`).
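
The capture file can be sanity-checked from the shell. This is a minimal sketch, assuming the `D1_SMOKE_OUT` file uses the ` | `-separated layout shown above; `summarize_transitions` is a hypothetical helper, not part of the repo, and a comment that itself contains ` | ` would confuse the field split.

```bash
# Hypothetical helper: count transition lines and how many carry an empty comment.
summarize_transitions() {
  local file="$1"
  awk -F' [|] ' -v q="'" '
    /^#/ || NF == 0 { next }            # skip the header comment and blank lines
    {
      total++
      has_comment = 0
      for (i = 1; i <= NF; i++)
        # A usable comment field starts with comment=<quote> and is not just two quotes.
        if (index($i, "comment=" q) == 1 && $i != "comment=" q q) has_comment = 1
      if (!has_comment) empty++
    }
    END { printf "transitions=%d empty_comments=%d\n", total, empty }
  ' "$file"
}
```

Run against the two-line capture above, `summarize_transitions /tmp/d1-consumer-transitions.txt` should print `transitions=2 empty_comments=0`.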

### 3. Full chain to the OPC UA Part 9 surface (code-path verified)

`GalaxyDriver.OnAlarmFeedTransition` maps `GalaxyAlarmTransition` →
`AlarmEventArgs`, carrying `OperatorComment`, `OriginalRaiseTimestampUtc`,
`AlarmCategory`, and the severity bucket onto `IAlarmSource.OnAlarmEvent`.
`AlarmEventArgs` already declares those fields — so the **E.7 contract extension is
done**, not pending. The server's Part-9 condition layer consumes `IAlarmSource`
via `AlarmSurfaceInvoker` → `GenericDriverNodeManager`. Unit coverage:
`GalaxyDriverAlarmSourceTests`, `GatewayGalaxyAlarmFeedTests`.

## How to re-run

```bash
export MXGW_ENDPOINT="http://10.100.0.48:5120"
export GALAXY_MXGW_API_KEY="<dev key from docker-dev/docker-compose.yml>"
export D1_SMOKE_OUT="/tmp/d1-consumer-transitions.txt"   # optional capture
dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests \
  --filter "FullyQualifiedName~GatewayGalaxyAlarmFeedLiveTests"
```

Without the env vars the test `Skip`s, so normal `dotnet test` runs are unaffected.
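
Because a skipped test still exits cleanly, it can help to confirm the gate is actually open before trusting a green run. The sketch below is one possible preflight, not something shipped in the repo: `preflight` is a hypothetical function name, and it assumes `MXGW_ENDPOINT` has the `scheme://host:port` shape used above (an endpoint without an explicit port is not handled).

```bash
# Hypothetical preflight: fail if either gating variable is unset,
# otherwise print "host port" parsed from MXGW_ENDPOINT.
preflight() {
  local var value hostport
  for var in MXGW_ENDPOINT GALAXY_MXGW_API_KEY; do
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "unset: $var (the live test would Skip)" >&2
      return 1
    fi
  done
  hostport="${MXGW_ENDPOINT#*://}"   # drop the scheme
  hostport="${hostport%%/*}"         # drop any path
  echo "${hostport%%:*} ${hostport##*:}"
}
```

One way to use it is `set -- $(preflight) && nc -z "$1" "$2"` ahead of the `dotnet test` command, which repeats the `nc -z` reachability check recorded earlier in this artifact.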

## Not covered here (still open)

1. **Scripted-alarm historian write-back → AVEVA Historian** (C.1's live leg). The
   `SdkAlarmHistorianWriteBackend` (real `HistorianAccess.AddStreamedValue` path) is
   implemented and unit-tested, but its `Live_*` write smoke needs the Windows
   historian sidecar + a live AVEVA Historian — neither reachable from the macOS dev
   box. Capture this leg on the Windows parity rig.
2. **Running-server → OPC UA A&C client round-trip.** This artifact proves the driver
   consumer end; it does not exercise a full OtOpcUa server surfacing the condition to
   an OPC UA client, because the docker-dev stack stubs the Galaxy driver on Linux
   (`DriverInstanceActor.ShouldStub`). Capture on the Windows parity rig (or a Linux
   host with `ShouldStub` overridden to point the real driver at the gateway).

## Mechanism — true MxAccess alarm-event support

The gateway delivers these alarms via **true MxAccess alarm-event support** in the
mxaccessgw .NET client — a real alarm-event subscription, **not** the value-driven
sub-attribute fallback. (Confirmed by the gateway maintainer; the client-side stream
check above can only observe the resulting feed, which is why this artifact records the
mechanism here rather than inferring it.) So A.2 is implemented as originally specified:
`MX_EVENT_FAMILY_ON_ALARM_TRANSITION` carries genuine native alarm-event metadata, and
the operator-comment / original-raise-time / category fields are first-class — not
reconstructed from attribute reads.

@@ -9,24 +9,41 @@
 > the new RPCs; the sub-attribute fallback path keeps Galaxy alarms
 > functional today.
 >
-> ⚠️ **Worker-side native alarm subscription blocked on a dev-rig
-> finding (2026-04-30):** the MXAccess COM Toolkit at
+> ✅ **UPDATE 2026-05-29 — native alarm feed VERIFIED working; the
+> 2026-04-30 "blocked" finding below is superseded.** A live
+> `StreamAlarms` check against the gateway at `10.100.0.48:5120`
+> returned the active-alarm snapshot (20 alarms) with full native
+> metadata — `severity`, `category`, `currentState`,
+> `lastTransitionTimestamp`, **and `operatorComment`** (the field the
+> note below called "the only v1 regression"). The lmxopcua consumer
+> (`GatewayGalaxyAlarmFeed` → `GalaxyAlarmTransition` →
+> `AlarmEventArgs` → `IAlarmSource`) ingests it with full fidelity and
+> the OPC UA severity-bucket mapping applied — proven by the passing
+> Skip-gated live test `GatewayGalaxyAlarmFeedLiveTests`. `AlarmEventArgs`
+> already carries operator-comment / original-raise-time / category, so
+> **E.7 is done too**. See `docs/plans/alarms-d1-smoke-artifact.md` for
+> the captured evidence. The gateway delivers this via **true MxAccess
+> alarm-event support** in the mxaccessgw .NET client (a real
+> alarm-event subscription — **not** the sub-attribute fallback), so A.2
+> is implemented as originally specified. Still open: the scripted-alarm
+> → AVEVA Historian write-back live smoke (C.1's `Live_*` leg) and a full
+> running-server → OPC UA A&C round-trip — both need the Windows parity rig.
+>
+> ⚠️ **[SUPERSEDED — kept for history] Worker-side native alarm
+> subscription blocked on a dev-rig finding (2026-04-30):** the MXAccess
+> COM Toolkit at
 > `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll`
-> exposes no alarm-event family — only `OnDataChange`,
-> `OnWriteComplete`, `OperationComplete`, `OnBufferedDataChange`.
+> exposed no alarm-event family — only `OnDataChange`,
+> `OnWriteComplete`, `OperationComplete`, `OnBufferedDataChange` — and
 > AVEVA's `aaAlarmManagedClient` / `ArchestrAAlarmsAndEvents.SDK`
-> assemblies are x64-only and incompatible with the worker's x86
-> bitness. **Operator decision needed before
-> `MX_EVENT_FAMILY_ON_ALARM_TRANSITION` carries any events:** either
-> accept the value-driven sub-attribute path as the production
-> architecture (operator-comment fidelity is the only v1 regression)
-> or add an x64 alarm-helper sub-process alongside the worker. See
-> `src/MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs` in the
-> mxaccessgw repo for the architectural notes. Live
-> `aahClientManaged` alarm-event write call site
-> (`SdkAlarmHistorianWriteBackend` placeholder from PR C.1) and the
-> D.1 smoke artifact ship once those decisions resolve. The
-> remainder of this document is preserved as the design record.
+> assemblies are x64-only vs. the worker's x86 bitness. The operator
+> decision (accept the value-driven sub-attribute path, or add an x64
+> alarm-helper sub-process) has since been resolved on the gateway side
+> — `MX_EVENT_FAMILY_ON_ALARM_TRANSITION` now carries events (verified
+> above). The C.1 `SdkAlarmHistorianWriteBackend` is **no longer a
+> placeholder** — it writes through the real
+> `HistorianAccess.AddStreamedValue` path (only its live-rig write
+> smoke remains).
 
 Coordinated epic across two repos:
 
@@ -1,5 +1,18 @@
 # Alarms Worker Wiring Plan
 
+> ✅ **UPDATE 2026-05-29 — the blocker below is RESOLVED on the gateway side; this
+> plan is largely complete.** A live `StreamAlarms` check against `10.100.0.48:5120`
+> returns the active-alarm snapshot with full native metadata **including
+> `operatorComment`**, and the lmxopcua consumer ingests it end-to-end (passing live
+> test `GatewayGalaxyAlarmFeedLiveTests`). So **A.2 / A.3 / A.4** are functionally done
+> at the gateway boundary (the worker now emits native alarm transitions and the client
+> exposes `AcknowledgeAlarm` / `QueryActiveAlarms` RPCs). **C.1** ships real code
+> (`SdkAlarmHistorianWriteBackend` → `HistorianAccess.AddStreamedValue`). **D.1**'s
+> alarm-source leg is captured in `docs/plans/alarms-d1-smoke-artifact.md`. Only two
+> things remain, both needing the Windows parity rig: C.1's live historian-write smoke
+> and a full running-server → OPC UA A&C round-trip. The per-item detail below is kept
+> as the historical record of the original blocked state.
+>
 > **Context**: The alarms-over-gateway epic shipped 19 PRs across the
 > `lmxopcua` and `mxaccessgw` repos (merged 2026-04-30). Contracts are live;
 > the sub-attribute fallback path keeps Galaxy alarms functional today. Four
@@ -16,7 +29,7 @@
 
 ---
 
-## Dev-rig finding that blocks everything (2026-04-30)
+## Dev-rig finding that blocks everything (2026-04-30) — [SUPERSEDED 2026-05-29]
 
 During PR A.2 work the following was discovered on the dev box:
 
@@ -318,16 +331,20 @@ fallback as production).
 
 ## Summary of blocks
 
-| Item | Blocked by | Estimated effort once unblocked |
-|------|-----------|--------------------------------|
-| A.2 | Architectural decision (x64 alarm-helper vs. sub-attribute fallback as production) | 2–3 days implementation; 1 day tests |
-| A.3 | A.2 delivering WorkerEvent bodies | 1–2 days |
-| A.4 | A.2 (active-alarm query needs AlarmClient session) | 1 day |
-| C.1 | aahClientManaged SDK access (available on dev box); NOT blocked by A.2 | 1–2 days |
-| D.1 | A.2 + A.3 + C.1 all passing on parity rig | 0.5 day (smoke + artifact capture) |
+> **Resolved as of 2026-05-29** — see the update banner at the top and
+> `docs/plans/alarms-d1-smoke-artifact.md`. Original status table kept for history.
 
-C.1 can proceed in parallel with A.2 / A.3 since the sidecar's `aahClientManaged`
-is x64 and does not share the worker bitness constraint.
+| Item | Status (2026-05-29) | Original block |
+|------|--------------------|----------------|
+| A.2 | ✅ **True MxAccess alarm-event support** in the gateway client (real alarm-event subscription, not the sub-attribute fallback); verified via live `StreamAlarms` with operator-comment fidelity | Architectural decision (x64 alarm-helper vs. sub-attribute fallback) |
+| A.3 | ✅ Dispatch + `AcknowledgeAlarm` RPC present on the client surface | A.2 delivering WorkerEvent bodies |
+| A.4 | ✅ `QueryActiveAlarms` RPC present on the client surface | A.2 (active-alarm query needs AlarmClient session) |
+| C.1 | ✅ Code shipped (`AddStreamedValue` path); ⏳ live historian-write smoke needs the Windows rig | aahClientManaged SDK access |
+| D.1 | ◑ Alarm-source leg captured (`alarms-d1-smoke-artifact.md`); ⏳ historian-write leg + full server→A&C round-trip need the Windows rig | A.2 + A.3 + C.1 all passing on parity rig |
+
+The gateway delivers operator-comment fidelity through **true MxAccess alarm-event
+support** in the mxaccessgw .NET client — a real alarm-event subscription, not the
+value-driven sub-attribute path. The sub-attribute fallback is now legacy.
 
 ---
 
+82
@@ -0,0 +1,82 @@
using ZB.MOM.WW.MxGateway.Client;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests.Runtime;

/// <summary>
/// D.1 smoke (alarm-source leg): drives the REAL gateway <c>StreamAlarms</c> feed through the
/// production lmxopcua consumer (<see cref="GatewayGalaxyAlarmFeed"/>) and asserts native alarm
/// transitions — with operator comment, category, original raise time, and the mapped OPC UA
/// severity bucket preserved — reach the driver-side boundary that feeds
/// <c>IAlarmSource.OnAlarmEvent</c>.
/// <para>
/// Skip-gated: runs only when <c>MXGW_ENDPOINT</c> + <c>GALAXY_MXGW_API_KEY</c> are set to a
/// reachable gateway. Captured 2026-05-29 against <c>10.100.0.48:5120</c> — see
/// <c>docs/plans/alarms-d1-smoke-artifact.md</c>. Set <c>D1_SMOKE_OUT</c> to dump the observed
/// transitions to a file for artifact capture.
/// </para>
/// </summary>
[Trait("Category", "Integration")]
public sealed class GatewayGalaxyAlarmFeedLiveTests
{
    [Fact]
    public async Task Live_gateway_delivers_native_alarm_transitions_through_the_consumer()
    {
        var endpoint = Environment.GetEnvironmentVariable("MXGW_ENDPOINT");
        var apiKey = Environment.GetEnvironmentVariable("GALAXY_MXGW_API_KEY");
        if (string.IsNullOrWhiteSpace(endpoint) || string.IsNullOrWhiteSpace(apiKey))
            Assert.Skip("Set MXGW_ENDPOINT + GALAXY_MXGW_API_KEY to run the live gateway alarm-feed smoke.");

        var client = MxGatewayClient.Create(new MxGatewayClientOptions
        {
            Endpoint = new Uri(endpoint!, UriKind.Absolute),
            ApiKey = apiKey!,
            UseTls = false,
            ConnectTimeout = TimeSpan.FromSeconds(10),
            DefaultCallTimeout = TimeSpan.FromSeconds(30),
            StreamTimeout = TimeSpan.FromSeconds(30),
        });

        var observed = new List<GalaxyAlarmTransition>();
        var gotOne = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

        // Wire the live client's StreamAlarms method group into the production consumer seam.
        await using var feed = new GatewayGalaxyAlarmFeed(client.StreamAlarmsAsync, clientName: "D1Smoke");
        feed.OnAlarmTransition += (_, t) =>
        {
            lock (observed) { observed.Add(t); }
            gotOne.TrySetResult(true);
        };
        feed.Start();

        // The stream opens with the active-alarm snapshot, so we expect ≥1 transition promptly.
        await Task.WhenAny(gotOne.Task, Task.Delay(TimeSpan.FromSeconds(20), TestContext.Current.CancellationToken));

        List<GalaxyAlarmTransition> snapshot;
        lock (observed) snapshot = observed.ToList();

        snapshot.ShouldNotBeEmpty(
            "Live gateway should deliver at least the active-alarm snapshot through the lmxopcua consumer.");
        var first = snapshot[0];
        first.AlarmFullReference.ShouldNotBeNullOrWhiteSpace();
        first.OpcUaSeverity.ShouldBeGreaterThan(0); // severity bucket mapping applied by the consumer

        foreach (var t in snapshot.Take(8))
            TestContext.Current.SendDiagnosticMessage(
                $"{t.TransitionKind,-11} {t.AlarmFullReference} sev={t.OpcUaSeverity}({t.SeverityBucket}) cat={t.Category} comment='{t.OperatorComment}'");
        TestContext.Current.SendDiagnosticMessage($"TOTAL consumer transitions observed: {snapshot.Count}");

        // Deterministic artifact capture (only when D1_SMOKE_OUT is set).
        var outPath = Environment.GetEnvironmentVariable("D1_SMOKE_OUT");
        if (!string.IsNullOrWhiteSpace(outPath))
        {
            var lines = snapshot.Take(50).Select(t =>
                $"{t.TransitionKind,-11} {t.AlarmFullReference} | sev={t.OpcUaSeverity}({t.SeverityBucket}) raw={t.RawMxAccessSeverity} | cat={t.Category} | comment='{t.OperatorComment}' | xitionUtc={t.TransitionTimestampUtc:o}");
            await File.WriteAllLinesAsync(outPath!,
                new[] { $"# consumer transitions observed: {snapshot.Count}" }.Concat(lines),
                TestContext.Current.CancellationToken);
        }
    }
}
+3
@@ -25,6 +25,9 @@
 
   <ItemGroup>
     <PackageReference Include="ZB.MOM.WW.MxGateway.Contracts" />
+    <!-- Client package: only the Skip-gated live alarm-feed smoke (GatewayGalaxyAlarmFeedLiveTests)
+         constructs a real MxGatewayClient. Unit tests use the fake stream-factory seam. -->
+    <PackageReference Include="ZB.MOM.WW.MxGateway.Client" />
   </ItemGroup>
 
 </Project>