alarms-over-gateway: wire worker AlarmClient + pin SDK call site (4 inert scaffolds + D.1 smoke) #420
Tracking the remaining work after PR #419 reconciled the plan banner against the audited source. The architectural decision was already resolved on 2026-04-30 (aaAlarmManagedClient.AlarmClient is x86 net48, same bitness as the worker; API surface discovered via reflection probe). What remains is wiring.

mxaccessgw repo: worker AlarmClient wiring
A.2: replace the MxAccessAlarmEventSink.Attach no-op with a real subscription. Per the file's own xmldoc:
- AlarmClient.RegisterConsumer(hWnd, productName, applicationName, version, retainHidden) against the worker's existing STA hWnd at session startup.
- AlarmClient.Subscribe(provider, fromPri, toPri, queryType, sortFlags, filterMask, filterSpec) with the Galaxy provider name and a permissive priority/filter range.
- MxAccessAlarmEventSink.EnqueueTransition after pulling each changed alarm via GetStatistics + GetAlarmExtendedRec.

A.3 worker dispatcher: replace NotWiredAlarmRpcDispatcher. Build a WorkerAlarmRpcDispatcher that translates AcknowledgeAlarmRequest into a worker command calling AlarmClient.AlarmAckByGUID(alarmGuid, comment, oprName, oprNode, oprDomain, oprFullName) with the OPC UA operator's resolved identity. Swap it into DI in place of the NotWired impl.

A.4 worker dispatcher: QueryActiveAlarms server-streaming reply. Walk AlarmClient's active-alarm collection (use GetStatistics to enumerate hAlarm handles, then GetAlarmExtendedRec per handle) and stream ActiveAlarmSnapshot messages back through the existing command-reply channel.

lmxopcua repo: sidecar SDK pin
C.1: pin SdkAlarmHistorianWriteBackend.WriteBatchAsync. Replace the placeholder RetryPlease body with the live aahClientManaged alarm-event write call. The outcome-mapping helper AahClientManagedAlarmEventWriter.MapOutcome is already shared, so the smoke-pinned change is small. Performed on the dev rig as part of D.1.

D.1 smoke artifact
Capture docs/plans/artifacts/d1-rollout-YYYY-MM-DD.md per the test plan in docs/plans/alarms-over-gateway.md Track D: log tails from all three services after refresh, plus the three functional verifications (Galaxy-native alarm, scripted alarm, sub-attribute fallback). The directory does not exist yet.

Acceptance criteria
- […] MX_EVENT_FAMILY_ON_ALARM_TRANSITION carries it.
- WorkerAlarmRpcDispatcher implemented; NotWiredAlarmRpcDispatcher removed from DI.
- SdkAlarmHistorianWriteBackend calls the live aahClientManaged write API.
- docs/plans/alarms-over-gateway.md banner updated to ✅ historical record (final pass).

References
- docs/plans/alarms-over-gateway.md
- mxaccessgw: src/MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs (architecture pinned + API surface discovered)
- mxaccessgw: MxGateway.Worker.Tests, AlarmClientDiscoveryTests.DumpAlarmClientPublicSurface (Skip-gated)

Update 2026-05-01: A.2 architecture finding
Attempted to start A.2 by wiring MxAccessAlarmEventSink against the existing PR A.5 AlarmClientConsumer. Ran the Skip-gated reflection probe (MxGateway.Worker.Tests.AlarmClientDiscoveryTests.DumpAlarmClientPublicSurface) against the deployed aaAlarmManagedClient.dll (v1.0.7368.41290) and discovered:

The aaAlarmManagedClient.AlarmClient class has zero public events. PR A.5's xmldoc claim that the AVEVA alarm client exposes a managed-event surface is wrong against this assembly. The actual notification mechanism is WM_APP messaging: RegisterConsumer(hWnd, ...) takes a window handle for a reason. AVEVA's alarm provider WM_APP-pokes the registered window, then GetStatistics + GetAlarmExtendedRec pull the change set on each poke.

Practical impact
- AlarmClientConsumer.AlarmRecordReceived has no production caller. RaiseAlarmRecordReceived is invoked only from tests.
- Subscribe(...) returns OK from RegisterConsumer + Subscribe, but no notifications reach the consumer at runtime because no real window is attached.
- The MX_EVENT_FAMILY_ON_ALARM_TRANSITION family is reserved on the wire but cannot carry events until A.2 lands a real WM_APP pump.
- AcknowledgeByGuid and SnapshotActiveAlarms are pull-style and remain correct as written.

What landed
mxaccessgw PR #118, a doc-only commit recording the finding:
- docs/AlarmClientDiscovery.md with the reflection probe summary, full AlarmClient method list, and open questions for the A.2 implementation.
- AlarmClientConsumer.cs xmldoc fixed (managed-event premise → WM_APP).
- MxAccessAlarmEventSink.cs xmldoc fixed ("verify on dev rig" hedge → resolved finding + expanded open-questions list).

Open questions blocking implementation
Documented in mxaccessgw/docs/AlarmClientDiscovery.md, "Implications for A.2":
- […] gateway.md) or a runtime probe (subclass a window, log every WM arriving while a live alarm is fired, identify the AVEVA one). Worth doing once on the dev rig and checking the result in.
- wParam/lParam semantics. Probably none: the pattern is "got poked, pull state via GetStatistics." Confirm during the probe.
- […] StaRuntime already runs a pump there. If AVEVA assumes a UI thread inside GetStatistics, the alarm path may need its own STA.
- Subscribe(szSubscription, ...) takes an AVEVA-syntax string for the alarm provider. The configured Galaxy name is already known to the worker via the existing data session, so reuse it.

Next A.2 PR is a code change that:
- […] WindowProc that intercepts the AVEVA WM_APP message and routes change-enumeration into MxAccessAlarmEventSink.EnqueueTransition.
- Replaces AlarmClientConsumer.Subscribe's hWnd: 0 placeholder with the real window handle.

The doc-only PR keeps that future code-change PR tightly scoped, because the discovery / re-architecture rationale is already captured.
Update 2026-05-01 — live runtime probe results
Added AlarmClientWmProbeTests.cs to mxaccessgw and ran it against the live AVEVA install on this dev rig. Results are in PR #118 (which now also includes the probe code + revised findings in docs/AlarmClientDiscovery.md):
- RegisterConsumer and Subscribe both return 0 (success). The lifecycle calls are valid against the deployed assembly.
- A registered-message-class WM (ID 0xC275 in this OS session) fires every ~1 second after Subscribe completes. Constant wParam=0x1100, constant lParam=0x079E46D8 (looks like a stable internal pointer) for all 20 hits with no manual alarm fired. The constant payload + 1Hz cadence suggests a heartbeat/keepalive, not a per-change notification.
- Critically: this WM is delivered to AVEVA's own internal window (hwnd=0x18032E), NOT to the consumer hWnd we registered. The consumer window receives only the standard WM_CREATE/WM_DESTROY lifecycle sequence, with nothing from AVEVA in between.

What this changes
The WM_APP-pump design from the original plan banner does not match how AVEVA actually delivers notifications. The hWnd parameter to RegisterConsumer appears to be a registration identity only; AVEVA's notification path runs entirely against its own internal window. On the evidence so far (no alarm was fired during the probe), a worker hWnd would not receive any of AVEVA's alarm traffic.

New A.2 design options
Replace the previous WM_APP-pump approach with one of:
1. Poll GetStatistics on a 500ms (or configurable) timer in the worker's STA and react to whatever change set it reports. No window plumbing needed. Latency floor = poll period. In the same range as AVEVA's own ~1Hz internal heartbeat cadence. Cheap to implement and robust against AVEVA-internal change.
2. SetWindowSubclass on AVEVA's hwnd, intercept WM 0xC275 on AVEVA's thread. Lower latency but invasive, fragile across AVEVA upgrades, requires same-process/thread coupling. Likely a non-starter.

Recommendation: option 1 (polling). The unanswered question is whether GetStatistics is safe to call outside AVEVA's own message-pump thread, which is confirmable with a follow-up probe.

Open follow-up probes (documented in docs/AlarmClientDiscovery.md)
1. […] GetStatistics returns non-empty arrays. This needs human interaction with the System Platform IDE.
2. GetStatistics threading-affinity test.
3. […] aaAlarmManagedClient.dll IL for RegisterConsumer to find whether WNAL_Register's callback surface is wrapped (the alarmlst.dll strings include WNAL_CallBack and an 'Invalid callbacks' error, suggesting the underlying C API takes callbacks the managed wrapper might hide).

Next concrete step is probe (1): fire a real alarm with the probe running. Without that, option 1 vs 2 is a guess; with it we'll know whether GetStatistics actually reports per-change deltas or whether AVEVA's notification layer is fundamentally one-way-into-AVEVA-internal.
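
The polling option (option 1) reduces to a snapshot-and-diff loop. Below is a minimal sketch of that diff logic only, written in Python for brevity rather than the worker's C#. AlarmPoller, AlarmRecord, read_snapshot and enqueue_transition are hypothetical names: read_snapshot stands in for the GetStatistics + GetAlarmExtendedRec pull, enqueue_transition stands in for MxAccessAlarmEventSink.EnqueueTransition, and the state labels are placeholders. It assumes each pull returns the full active-alarm set, which the probes have not yet confirmed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class AlarmRecord:
    """Hypothetical flattened view of one active alarm."""
    guid: str
    state: str  # placeholder label, e.g. "UNACK_ALM" or "ACK_ALM"


# (previous, current): previous is None for a newly seen alarm,
# current is None when an alarm has left the active set.
Transition = Tuple[Optional[AlarmRecord], Optional[AlarmRecord]]


class AlarmPoller:
    """Diffs successive snapshots and emits one transition per change."""

    def __init__(
        self,
        read_snapshot: Callable[[], Iterable[AlarmRecord]],
        enqueue_transition: Callable[[Transition], None],
    ) -> None:
        self._read_snapshot = read_snapshot  # assumed: full active set per call
        self._enqueue = enqueue_transition
        self._last: Dict[str, AlarmRecord] = {}

    def poll_once(self) -> int:
        """Run one poll cycle; return the number of transitions emitted."""
        current = {rec.guid: rec for rec in self._read_snapshot()}
        changes = 0
        # New alarms and state changes on known alarms.
        for guid, rec in current.items():
            prev = self._last.get(guid)
            if prev is None or prev.state != rec.state:
                self._enqueue((prev, rec))
                changes += 1
        # Alarms that disappeared since the previous poll.
        for guid, prev in self._last.items():
            if guid not in current:
                self._enqueue((prev, None))
                changes += 1
        self._last = current
        return changes
```

In the worker, a timer on the STA would call poll_once once per poll period (500ms in the proposal), so detection latency is bounded by that period. A state change that occurs and reverts between two polls would not be seen by this approach.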