Replaces NotWiredAlarmRpcDispatcher in DI with a production
implementation that issues the new MxCommandKind.{AcknowledgeAlarm,
QueryActiveAlarms} commands across the IPC and unwraps the resulting
MxCommandReply into the public RPC types.
QueryActiveAlarms is fully wired: builds the QueryActiveAlarmsCommand
(forwarding alarm_filter_prefix), invokes it on the resolved
GatewaySession's worker client, and yields each ActiveAlarmSnapshot
from the QueryActiveAlarmsReplyPayload as the RPC stream. Worker
failures and missing sessions yield an empty stream, matching the
ConditionRefresh contract clients already speak.
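The query path described above can be sketched as a generator. This is a minimal Python illustration of the behaviour, not the gateway's C# code: FakeWorkerClient, WorkerError, and the dict-based session registry are hypothetical stand-ins, and snapshots are reduced to plain strings.

```python
from typing import Dict, Iterator, List, Optional


class WorkerError(Exception):
    """Hypothetical stand-in for a worker-side failure."""


class FakeWorkerClient:
    """Hypothetical worker client that answers QueryActiveAlarms."""

    def __init__(self, alarms: List[str], fail: bool = False) -> None:
        self._alarms = alarms
        self._fail = fail

    def query_active_alarms(self, alarm_filter_prefix: str) -> List[str]:
        if self._fail:
            raise WorkerError("worker unavailable")
        # The prefix is forwarded as-is; an empty prefix matches everything.
        return [a for a in self._alarms if a.startswith(alarm_filter_prefix)]


def stream_active_alarms(
    sessions: Dict[str, FakeWorkerClient],
    session_id: str,
    alarm_filter_prefix: str = "",
) -> Iterator[str]:
    """Yield each snapshot; missing sessions and worker failures yield nothing."""
    client: Optional[FakeWorkerClient] = sessions.get(session_id)
    if client is None:
        return  # missing session: empty stream
    try:
        snapshots = client.query_active_alarms(alarm_filter_prefix)
    except WorkerError:
        return  # worker failure: empty stream
    yield from snapshots
```

The caller never sees an exception for these two cases, only a stream that ends immediately.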
AcknowledgeAlarm is partially wired: the public RPC takes
AlarmFullReference (Provider!Group.Tag), but the worker's wnwrap
consumer acks by GUID. Strategy:
- If AlarmFullReference parses as a canonical GUID, forward it
directly through MxCommandKind.AcknowledgeAlarm. Native status
flows back via MxCommandReply.Hresult and the dedicated
AcknowledgeAlarmReplyPayload.NativeStatus.
- Otherwise, return InvalidRequest with a clear diagnostic naming the
follow-up — reference→GUID lookup needs a worker-side AlarmAckByName
command wrapping wwAlarmConsumerClass.AlarmAckByName.
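The GUID-vs-reference branching can be sketched as follows. This is a Python illustration under assumptions: "canonical" is taken to mean the hyphenated 8-4-4-4-12 hex form, and AckResult with its status labels is a hypothetical type, not the real reply message.

```python
import uuid
from dataclasses import dataclass


@dataclass
class AckResult:
    status: str            # "Ok" or "InvalidRequest" (illustrative labels)
    diagnostic: str = ""
    alarm_guid: str = ""


def is_canonical_guid(text: str) -> bool:
    """True only for the hyphenated 8-4-4-4-12 hex form (assumed meaning)."""
    try:
        # uuid.UUID also accepts braces and un-hyphenated hex, so compare
        # the canonical rendering against the input to reject those forms.
        return str(uuid.UUID(text)) == text.lower()
    except ValueError:
        return False


def route_acknowledge(alarm_full_reference: str) -> AckResult:
    """Forward GUID-shaped references; reject everything else with a diagnostic."""
    if is_canonical_guid(alarm_full_reference):
        return AckResult(status="Ok", alarm_guid=alarm_full_reference)
    return AckResult(
        status="InvalidRequest",
        diagnostic="ack-by-reference needs a worker-side AlarmAckByName command",
    )
```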
DI: SessionServiceCollectionExtensions registers WorkerAlarmRpcDispatcher
as the default IAlarmRpcDispatcher; MxAccessGatewayService picks it up
via constructor injection. NotWiredAlarmRpcDispatcher is retained for
test fixtures that want the no-side-effect fake.
Tests: 7 new unit tests cover session-not-found short-circuit, GUID-vs-
reference branching, native-status propagation, worker MxaccessFailure
diagnostic propagation, and snapshot-stream yielding. Server test
suite total: 288 passed / 0 failed. Solution builds clean.
End-to-end alarms-over-gateway pipeline status:
- consumer → sink → queue (A.2 + A.3 in-process slice)
- worker IPC commands (A.3 worker slice)
- gateway dispatcher (this slice)
Remaining for full E2E:
- Auto-issue SubscribeAlarms on session open (or add a public
SubscribeAlarms RPC). Without this trigger the consumer never
starts and Acknowledge/Query return "not subscribed".
- AlarmAckByName worker command for ack-by-reference.
- End-to-end live test against the dev rig.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the worker-side IPC surface for the alarm subsystem so the gateway
can drive the AlarmDispatcher across the named-pipe boundary. Adds four
proto MxCommandKind values + matching command messages and two
MxCommandReply payload variants:
- SubscribeAlarmsCommand(subscription_expression)
- UnsubscribeAlarmsCommand
- AcknowledgeAlarmCommand(alarm_guid, comment, operator_user/node/domain/full_name)
- QueryActiveAlarmsCommand(alarm_filter_prefix)
- AcknowledgeAlarmReplyPayload(native_status)
- QueryActiveAlarmsReplyPayload(repeated ActiveAlarmSnapshot snapshots)
Worker plumbing:
- New IAlarmCommandHandler interface + AlarmCommandHandler production
impl. Lazy-creates an AlarmDispatcher (with a wnwrap-backed consumer
by default) on the first SubscribeAlarms; routes Acknowledge / QueryActive /
Unsubscribe through it. Idempotent under repeated Unsubscribe; rejects
a second Subscribe without an intervening Unsubscribe; cleans up the
consumer if the underlying Subscribe call throws.
- MxAccessCommandExecutor: 4 new switch arms map MxCommandKind values to
IAlarmCommandHandler calls. Acknowledge surfaces the AVEVA native
status into both MxCommandReply.Hresult and the dedicated
AcknowledgeAlarmReplyPayload.NativeStatus so gateway-side consumers
can echo it without unpacking the outer envelope. Invalid GUIDs and
missing payloads return InvalidRequest; handler exceptions return
MxaccessFailure with the exception message in DiagnosticMessage.
- MxAccessStaSession: new constructor overload accepts an
alarmCommandHandlerFactory; it's invoked on the STA thread during
StartAsync and the resulting handler is passed into the executor.
ShutdownGracefullyAsync + Dispose tear it down on the STA before the
data-side cleanup runs.
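The handler lifecycle rules above (lazy creation, rejected double Subscribe, idempotent Unsubscribe, cleanup when Subscribe throws) can be sketched in Python. All names here are hypothetical; the real AlarmCommandHandler is C# and wraps an AlarmDispatcher.

```python
class FakeConsumer:
    """Hypothetical consumer; fail_on_subscribe simulates a throwing Subscribe."""

    def __init__(self, fail_on_subscribe=False):
        self.fail_on_subscribe = fail_on_subscribe
        self.disposed = False

    def subscribe(self, expression):
        if self.fail_on_subscribe:
            raise RuntimeError("subscribe failed")

    def dispose(self):
        self.disposed = True


class AlarmCommandHandlerSketch:
    def __init__(self, consumer_factory):
        self._consumer_factory = consumer_factory
        self._consumer = None  # created lazily on the first subscribe

    @property
    def is_subscribed(self):
        return self._consumer is not None

    def subscribe(self, expression):
        if self._consumer is not None:
            raise RuntimeError("already subscribed; unsubscribe first")
        consumer = self._consumer_factory()
        try:
            consumer.subscribe(expression)
        except Exception:
            consumer.dispose()  # do not leak the consumer if Subscribe throws
            raise
        self._consumer = consumer

    def unsubscribe(self):
        if self._consumer is None:
            return  # idempotent: repeated calls are harmless
        self._consumer.dispose()
        self._consumer = None
```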
Tests: 20 new unit tests covering AlarmCommandHandler lazy lifecycle
(Subscribe/Unsubscribe/Acknowledge/Query/Dispose, error paths) and the
executor's 4 alarm switch arms (OK/InvalidRequest/MxaccessFailure paths,
hresult propagation, prefix filtering). Worker test suite total: 192
passed / 3 skipped (live probes) / 1 pre-existing structure-test fail
(untouched).
Deferred to next slice: gateway-side WorkerAlarmRpcDispatcher that
replaces NotWiredAlarmRpcDispatcher, builds + sends these commands across
the IPC, and unwraps the resulting MxCommandReply into AcknowledgeAlarmReply
/ ActiveAlarmSnapshot stream.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the in-process plumbing that connects WnWrapAlarmConsumer's
AlarmTransitionEmitted stream to the worker's MxAccessEventQueue via
MxAccessAlarmEventSink. With this change a transition raised by the
consumer lands as an OnAlarmTransitionEvent proto on the queue,
SessionId attached, ready for IPC dispatch.
Mapping: provider!group.tag → AlarmFullReference, tag → SourceObjectReference,
priority → severity, wnwrap STATE → AlarmConditionState (Active /
ActiveAcked / Inactive — wnwrap's ack-vs-unack-on-cleared distinction
collapses since OPC UA Part 9 doesn't model it). State delta drives
AlarmTransitionKind via the existing AlarmRecordTransitionMapper table.
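The state collapse can be sketched as a lookup table. This is a Python illustration: UNACK_ALM and UNACK_RTN are state strings seen in the live capture, while ACK_ALM and ACK_RTN are assumed names for the acknowledged counterparts.

```python
from enum import Enum


class AlarmConditionState(Enum):
    ACTIVE = "Active"
    ACTIVE_ACKED = "ActiveAcked"
    INACTIVE = "Inactive"


# ACK_ALM / ACK_RTN are assumed state names; only the UNACK_* pair is
# confirmed by captured payloads.
_STATE_TABLE = {
    "UNACK_ALM": AlarmConditionState.ACTIVE,
    "ACK_ALM": AlarmConditionState.ACTIVE_ACKED,
    # Both cleared states collapse to Inactive.
    "UNACK_RTN": AlarmConditionState.INACTIVE,
    "ACK_RTN": AlarmConditionState.INACTIVE,
}


def map_condition_state(wnwrap_state: str) -> AlarmConditionState:
    """Map a wnwrap STATE string to the condition state; KeyError if unknown."""
    return _STATE_TABLE[wnwrap_state.strip().upper()]
```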
Holding off on the proto IPC additions (SubscribeAlarms /
AcknowledgeAlarm / QueryActiveAlarms commands + WorkerAlarmRpcDispatcher)
for a follow-up — those touch every layer of the worker IPC and warrant
their own PR. This slice proves the consumer→sink→queue pipeline
end-to-end with unit tests and clears the path for the proto additions
to plug in cleanly.
Tests: 10 new unit tests cover field-by-field mapping, the
"unchanged-state-doesn't-emit" filter, the state→transition kind table,
Subscribe / Acknowledge passthrough, SnapshotActiveAlarms → proto
ActiveAlarmSnapshot mapping, and handler detachment on Dispose. All
passing; worker test suite total: 172 passed / 3 skipped / 1
pre-existing structure-test fail (untouched).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch the worker's alarm-consumer surface from `aaAlarmManagedClient.AlarmClient`
to `WNWRAPCONSUMERLib.wwAlarmConsumerClass` (CLSID 7AB52E5F-…) hosted by
`wnwrapConsumer.dll`. The new path returns alarm records as a BSTR XML
payload via `GetXmlCurrentAlarms2`, bypassing the FILETIME→DateTime
auto-marshaling that crashed `GetHighPriAlarm` with
ArgumentOutOfRangeException on every poll. A live capture ran 60/60 polls
clean against `\DESKTOP-6JL3KKO\Galaxy!DEV` while a System Platform
script flipped TestMachine_001.TestAlarm001 every 10s; the GUID,
priority, state (UNACK_ALM ↔ UNACK_RTN), and ASCII-formatted timestamps
arrived end-to-end.
Implementation:
- `Interop.WNWRAPCONSUMERLib.dll` generated via tlbimp, checked in under
`lib/` so dev boxes don't need the SDK to build.
- New `WnWrapAlarmConsumer` (replaces `AlarmClientConsumer`): owns a
500ms polling timer, parses `GetXmlCurrentAlarms2` output, diffs the
snapshot keyed by alarm GUID, and raises one
`MxAlarmTransitionEvent` per state change. Includes the
Initialize→Register-before-Subscribe ordering fix found during
Discovery probe runs.
- New library-agnostic types `MxAlarmSnapshotRecord` /
`MxAlarmStateKind` / `MxAlarmTransitionEvent` so the proto-build
path is testable without an AVEVA install.
- `AlarmRecordTransitionMapper` retired the COM-coupled
`MapTransitionKind(eAlmTransitions)`; new pure helpers
`ParseStateKind`, `MapTransition(prev, curr)`, and
`ParseTransitionTimestampUtc` cover XML decode + state-delta logic.
- `IMxAccessAlarmConsumer` event surface changed from
`EventHandler<AlarmRecord>` to `EventHandler<MxAlarmTransitionEvent>`
and `SnapshotActiveAlarms()` returns `MxAlarmSnapshotRecord` —
decoupling the interface from any specific COM library.
- Worker csproj drops `aaAlarmManagedClient` / `IAlarmMgrDataProvider`
refs; adds `Interop.WNWRAPCONSUMERLib`.
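The GUID-keyed snapshot diff at the heart of the polling consumer can be sketched in Python. Snapshots are reduced to {guid: state} dicts here, and emitting a transition with a None state when an alarm disappears from the snapshot is an assumption of this sketch, not confirmed behaviour of `WnWrapAlarmConsumer`.

```python
from typing import Dict, List, Optional, Tuple

# (alarm GUID, previous state, current state)
Transition = Tuple[str, Optional[str], Optional[str]]


def diff_snapshots(
    previous: Dict[str, str], current: Dict[str, str]
) -> List[Transition]:
    """Return one transition per alarm whose state changed between polls."""
    transitions: List[Transition] = []
    for guid, state in current.items():
        prev = previous.get(guid)
        if prev != state:  # new alarm or changed state
            transitions.append((guid, prev, state))
    for guid, prev in previous.items():
        if guid not in current:  # assumption: a vanished alarm is reported too
            transitions.append((guid, prev, None))
    return transitions
```

An unchanged snapshot produces an empty list, so a steady-state poll raises no events.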
Tests:
- 36 new unit tests (state-string mapping, prev/current → proto kind
decision table, timestamp UTC reassembly, XML payload parser, 32-char
hex GUID round-trip) covering everything that doesn't touch the live
COM surface — all passing.
- Skip-gated `WnWrapConsumerProbeTests.ProbeWnWrapConsumer` archives
the live capture flow for regression / future probes.
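The 32-char hex GUID round-trip those tests cover can be illustrated with Python's standard uuid module. The helper names are hypothetical, and rendering in upper case is an assumption about the payload format.

```python
import uuid


def parse_hex_guid(text: str) -> uuid.UUID:
    """Parse a 32-character hex string (no hyphens or braces) into a UUID."""
    if len(text) != 32:
        raise ValueError("expected 32 hex characters")
    return uuid.UUID(hex=text)


def format_hex_guid(value: uuid.UUID) -> str:
    """Render a UUID back as 32 hex characters (upper case assumed)."""
    return value.hex.upper()
```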
Docs:
- `docs/AlarmClientDiscovery.md` "Option A — captured" section records
sample XML payloads, the mangled `SetXmlAlarmQuery` round-trip
(prefer `Subscribe` for filtering), the `GetStatistics`
AccessViolationException quirk, and the worker-integration outline.
Pre-existing failure noted (separate):
`MxAccessInteropReference_ExistsOnlyInWorkerProject` was already
failing on HEAD — the test project still references `ArchestrA.MxAccess`
for the Skip-gated discovery probes. Not regressed by this change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reflection on aaAlarmManagedClient.AlarmClient shows it implements
only IDisposable (no [ComImport] interface, no class GUID) and
has a single field "CwwAlarmConsumer* m_almUnmanaged". So
AlarmClient is a C++/CLI managed wrapper around a native C++
class -- NOT a COM-interop class. The DateTime conversion happens
INSIDE AVEVA's wrapper IL, not at the .NET-COM marshaling
boundary. There's no separate COM interface to QI to.
Revised approach (in docs/AlarmClientDiscovery.md):
A. wnwrapConsumer.dll -- separate standalone COM library AVEVA
ships at "C:\Program Files (x86)\Common Files\ArchestrA"
exposing WNWRAPCONSUMERLib.wwAlarmConsumerClass with
SetXmlAlarmQuery / GetXmlCurrentAlarms. XML-string output
bypasses FILETIME marshaling entirely. Best fit -- real COM,
self-contained, conventional production-grade approach.
B. Patch aaAlarmManagedClient.dll IL -- direct but modifies a
vendor binary, brittle to upgrades.
C. Reflect into m_almUnmanaged and call native vtable directly
-- requires reverse-engineering the C++ class layout.
Picking A. Probe restored to Skip; next commit starts the
wnwrapConsumer integration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two findings that turn the alarm capture path on:
1. Subscription expression: \<MachineName>\Galaxy!<Area> is the
canonical AlarmClient subscription format per ArchestrA docs:
\Node\Provider!Area!Filter, with Provider literally "Galaxy"
(not the Galaxy name) and Node being the machine name. For
this rig: \DESKTOP-6JL3KKO\Galaxy!DEV catches alarms.
2. InitializeConsumer before RegisterConsumer — discovered
earlier; bug-fix for PR A.5's AlarmClientConsumer.
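The subscription format from finding 1 can be sketched as a small builder. This is a Python illustration with a hypothetical helper name; it follows the \Node\Provider!Area!Filter shape quoted above and treats the filter segment as optional.

```python
def build_subscription_expression(
    node: str, area: str, provider: str = "Galaxy", alarm_filter: str = ""
) -> str:
    """Build \\Node\\Provider!Area[!Filter]; provider is literally "Galaxy"."""
    if not node or not area:
        raise ValueError("node and area are required")
    expression = "\\" + node + "\\" + provider + "!" + area
    if alarm_filter:
        expression += "!" + alarm_filter
    return expression
```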
With these in place, GetHighPriAlarm found a record on every
poll for 60s straight (117/117 calls), but every call throws
ArgumentOutOfRangeException: Not a valid Win32 FileTime, because
AlarmRecord has five DateTime fields (ar_Time / ar_OrigTime /
ar_AckTime / ar_RtnTime / ar_SubTime) and AVEVA writes sentinel
FILETIME values for unset ones (e.g., ar_AckTime on an
unacknowledged alarm). The aaAlarmManagedClient.dll auto-marshals
FILETIME -> DateTime and rejects out-of-range values.
GetStatistics still reports total=0 active=0 even with
GetHighPriAlarm returning records — those two APIs have
different views. The active read API for current alarms is
GetHighPriAlarm, not GetStatistics's change array.
So the consumer chain works. The blocking issue is now
extracting the payload past the AVEVA-shipped DateTime
auto-marshaling. Three approaches for the next PR:
1. Patch aaAlarmManagedClient.dll via ildasm/ilasm round-trip.
2. Define a custom [ComImport] interface with safe-blittable
types and Marshal.QueryInterface to it.
3. Use IDispatch late binding to bypass strong-typed marshaling.
Option 2 is cleanest; needs the AlarmClient COM IID.
Probe changes:
- Subscription expression set to \<MachineName>\Galaxy!DEV.
- GetHighPriAlarm tally counters (ok-with-record vs throw).
- 117 throws / 0 ok-with-record over 60s confirms alarms are
flowing continuously while the user's flip script runs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sixth probe iteration with every consumer-side knob exhausted:
- Subscriptions tried (all rc=0): \Galaxy!, \Galaxy!*, \Galaxy!,
\Galaxy!TestArea, \.\Galaxy!.
- Read channels polled at 500ms: GetStatistics, GetHighPriAlarm,
SFCreateSnapshot + SFGetStatistics.
- Filters: priority 0..32767, qtSummary + qtHistory both tried,
asAlarmActiveNow.
- AlarmRecord pre-init to FILETIME epoch to dodge marshaler bug
on default(DateTime).
Result: every read API returns empty for the entire 60s window
even with TestMachine_001.TestAlarm001 firing every 10s and
aaObjectViewer confirming InAlarm transitions. The
aaAlarmManagedClient.AlarmClient is not the receive surface
AVEVA's alarm pipeline routes to in this Galaxy configuration.
The consumer chain is verified working end-to-end: Initialize +
Register + Subscribe all succeed, GetProviders finds the
provider, the WM 0xC275 heartbeat fires at 1Hz to AVEVA's
internal hwnd. There is simply no alarm data flowing through
this consumer surface.
Next investigation is not consumer-side: either find the SDK
aaObjectViewer's alarm panel uses, or query the historian
event storage directly. If alarms only flow via the historian
path on this customer's Galaxy, the worker's PR A.5 architecture
is a dead-end and A.2 needs a different transport.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tried every documented subscription knob with InitializeConsumer
present + provider visible at status 100:
- qtSummary AND qtHistory (the only eQueryType values).
- Priority 1..999 AND 0..32767.
- FilterMask/Spec asNone AND asAlarmActiveNow.
eAlarmFilterState is single-state-valued (asNone=0,
asAlarmActiveNow=1, asAlarmAcked=2, asShelved=3), not flag bits,
so the filter surface is exhausted.
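The difference between a single-valued enum and flag bits can be shown with a Python mirror of the values listed above (the Python names are illustrative). OR-ing two members does not produce a combined filter; it produces the value of a different member.

```python
from enum import IntEnum


class AlarmFilterState(IntEnum):
    AS_NONE = 0
    AS_ALARM_ACTIVE_NOW = 1
    AS_ALARM_ACKED = 2
    AS_SHELVED = 3


# Bitwise OR of "active now" and "acked" is 3, which is simply asShelved,
# so the values cannot be combined like flags.
combined = AlarmFilterState.AS_ALARM_ACTIVE_NOW | AlarmFilterState.AS_ALARM_ACKED
collides_with = AlarmFilterState(combined)
```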
GetStatistics continued to report total=0 active=0 codes=[7]
for every poll across all combinations.
User confirmation: the BoolAlarm extension on
TestMachine_001.TestAlarm001 is evaluating (the $Alarm.InAlarm
sub-attribute flips true/false in lockstep with the script
writes, visible in aaObjectViewer). So the consumer chain is
verified working end-to-end on our side. What's missing is
producer-side publication into the aaAlarmManagedClient stream.
Probable causes (config, not code):
- BoolAlarm extension's "publish to alarm manager" / "Active" /
"Enabled" flag may be off.
- Alarm-vs-event mode setting may have it routing to events,
not alarms.
- Platform alarm area may not match the consumer's subscription
scope.
Resolution path: check the BoolAlarm extension's config in System
Platform IDE; check aaObjectViewer's Active Alarms panel (not
attribute panel) to see if the alarm appears there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
InitializeConsumer was the missing call. Adding it before
RegisterConsumer makes the \Galaxy! provider appear in
GetProviders (status 0 -> 100 within 500ms). Without Initialize,
GetProviders returns an empty list even though everything else
returns rc=0 (success).
Probe trace 2026-05-01:
InitializeConsumer -> 0
RegisterConsumer -> 0
GetProviders [after Register] -> count=0 list=[]
Subscribe('\Galaxy!') -> 0
GetProviders [after Subscribe] -> count=1 list=[ 0 \Galaxy!]
GetProviders [poll #1] -> count=1 list=[100 \Galaxy!]
Despite the provider being at "100% query complete" for the
entire 60s window, GetStatistics continued to report
total=0 active=0 codes=[7] -- no alarm transitions reached the
consumer even with a System Platform script flipping
TestMachine_001.TestAlarm001 every 10s during the run.
So the consumer chain works end-to-end. What's missing is alarm
traffic from the producer side. The next discriminator is
whether ObjectViewer (or another live consumer) sees the alarm
fire while the script runs.
API-ordering bug fix to apply to PR A.5's AlarmClientConsumer
regardless of how A.2 lands: AlarmClientConsumer.Subscribe
should call InitializeConsumer before RegisterConsumer (currently
omits Initialize entirely, which means the provider chain is
never visible from the worker either). That fix removes a
fundamental bug that is independent of the polling-vs-callback question.
Probe changes:
- Added InitializeConsumer call before RegisterConsumer.
- Added LogProviders helper that logs only on change; called
after Register, after Subscribe, and on every poll. Easier
to spot when the provider chain transitions from empty to
populated.
- Restored Skip-gating after run.
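The log-only-on-change behaviour of the LogProviders helper can be sketched in Python. The class name, the (status, name) tuple shape, and the message format are hypothetical; only the change-detection idea is taken from the probe.

```python
from typing import Callable, List, Optional, Tuple

Provider = Tuple[int, str]  # (status, provider name)


class ProviderChangeLogger:
    """Logs the provider list only when it differs from the last logged one."""

    def __init__(self, sink: Callable[[str], None]) -> None:
        self._sink = sink
        self._last: Optional[List[Provider]] = None  # nothing logged yet

    def log(self, label: str, providers: List[Provider]) -> bool:
        if providers == self._last:
            return False  # unchanged: stay quiet
        self._last = list(providers)
        self._sink(
            f"GetProviders [{label}] -> count={len(providers)} list={providers}"
        )
        return True
```

Calling it after Register, after Subscribe, and on every poll then produces one line per actual change, such as the status moving from 0 to 100.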
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extended AlarmClientWmProbeTests to call AlarmClient.GetProviders
after RegisterConsumer. Run 2026-05-01:
GetProviders -> rc=0 count=0 list=[]
Zero alarm providers visible to the consumer. This explains every
preceding probe run — no providers means no alarm events,
regardless of subscription expression or value writes upstream.
Even with a System Platform script flipping
TestMachine_001.TestAlarm001 every 10s during the run,
GetStatistics reported no transitions, no positions[] entries,
no field changes after t=0.85s.
Possible causes (dev-rig configuration, not code):
1. No $Alarm extension on the test bool — flipping the value
writes a value but doesn't fire an alarm.
2. AVEVA alarm-manager service (aaAlarmMgr or equivalent) not
running on this rig.
3. Process security context — providers registered under a
service account aren't visible to a consumer running under
a normal user account.
A.2 implementation is blocked on this until at least one provider
is visible. Once a provider exists, the polling-vs-callback
question is answerable in one probe run; without a provider both
paths return the same "nothing happening" answer.
Probe changes:
- Added in-process MxAccess Write attempt (TriggerWriteValue) —
hit TargetParameterCountException so the Write signature is
not (handle, item, value); reflection diag added but not
resolved. Now disabled in favor of external trigger.
- Added GetProviders enumeration after RegisterConsumer.
- Removed firePrint/clearPrint markers; probe is observe-only.
- Added ArchestrA.MxAccess reference to the test project.
Also updated docs/AlarmClientDiscovery.md with the
alarm-provider-visibility section explaining what's blocked
and why.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extended AlarmClientWmProbeTests.ProbeAlarmClientWmMessages to also
call GetStatistics every ~2s during the pump window. Re-ran on the
dev rig 2026-05-01:
- GetStatistics is safely callable from the same thread that did
RegisterConsumer + Subscribe. Every poll (9 calls / 20s window)
returned rc=0, no exceptions.
- Galaxy currently has zero active alarms. total=0 active=0
suppressed=0 newAlarms=0 across every poll. positions[] and
handles[] arrays were empty.
- changes=1 codes=[7] was constant across all polls, matching the
constant 1 Hz WM 0xC275 cadence — same heartbeat semantics
exposed through both the WM path and the pull API.
Confirms the polling design is mechanically viable: GetStatistics
threading-affinity is fine and the call is cheap. The remaining
unknown is whether GetStatistics populates positions[] / handles[]
with real entries when an alarm actually fires. Proving that
requires triggering an alarm — next probe is an MxAccess write to a
$Alarm-extended boolean tag (reference pending).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Added MxGateway.Worker.Tests/AlarmClientWmProbeTests.cs as a Skip-gated
runtime probe. Run on the dev rig 2026-05-01 against the live AVEVA
install (Galaxy reachable, no manual alarm fired). Findings:
- RegisterConsumer(hWnd, ...) and Subscribe("\Galaxy!", ...) both
return 0 (success). Calls are valid against the deployed assembly.
- A registered-message-class WM (ID 0xC275 in this OS session) fires
every ~1 second after Subscribe completes. Constant wParam=0x1100,
constant lParam=0x079E46D8 — looks like a heartbeat / keepalive,
not a per-change notification.
- Critically, this WM is delivered to AVEVA's own internal window
(hwnd=0x18032E), NOT to the consumer hWnd we registered. The
consumer window receives only the standard WM_CREATE / WM_DESTROY
sequence; no AVEVA traffic in between.
This invalidates the WM_APP-pump design previously documented. The
hWnd parameter to RegisterConsumer appears to be a registration
identity only — AVEVA's notification path runs entirely against
AVEVA's own internal window.
Two viable A.2 designs replace the previous one:
1. Polling. Call GetStatistics on a 500ms timer in the worker's STA
and react to whatever change set it reports. No window plumbing
needed. Latency floor = poll period. Matches AVEVA's own
internal heartbeat cadence.
2. Hook AVEVA's internal window. Discover AVEVA's own hwnd,
SetWindowSubclass on it, intercept WM 0xC275 on AVEVA's thread.
Higher fidelity, lower latency, but invasive and fragile across
AVEVA upgrades — likely a non-starter.
Recommendation in docs/AlarmClientDiscovery.md is option 1 (polling)
unless a follow-up probe with a real fired alarm shows AVEVA does
post change-specific WMs to a different hWnd.
Open follow-up probes documented:
- Fire a real Galaxy alarm during pump and check whether WM 0xC275
cadence changes or GetStatistics returns non-empty arrays.
- GetStatistics threading affinity test.
- Hook AVEVA's internal window 0x18032E.
- Decompile aaAlarmManagedClient IL for RegisterConsumer to find
whether WNAL_Register's callback surface is wrapped.
Test project changes:
- Added Reference to aaAlarmManagedClient + IAlarmMgrDataProvider
(Private=true so the DLL gets copied into bin for test load).
- Test-suite-wide: 127 real tests still pass; both alarm-related
Skip-gated tests skip cleanly.
Code change to the probe is additive — the worker is unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reflection probe of the deployed aaAlarmManagedClient.dll
(v1.0.7368.41290) on 2026-05-01 confirmed the public AlarmClient class
exposes zero public events. The PR A.5 design that AlarmClientConsumer
is built on (managed-event surface, no message pump) does not hold
against this assembly.
The actual notification mechanism is WM_APP messaging:
RegisterConsumer(hWnd, ...) takes a window handle because AVEVA's alarm
provider WM_APP-pokes the registered window, then GetStatistics +
GetAlarmExtendedRec pull the change set on each poke.
Practical impact:
- AlarmClientConsumer.AlarmRecordReceived has no production caller.
RaiseAlarmRecordReceived is invoked only from tests. Subscribe(...)
returns OK from RegisterConsumer + Subscribe but no notifications
reach the consumer at runtime because no window is attached.
- Until A.2 lands a hidden message-only window + WindowProc that routes
WM_APP into MxAccessAlarmEventSink.EnqueueTransition, the gateway's
MX_EVENT_FAMILY_ON_ALARM_TRANSITION family cannot carry events.
- AcknowledgeByGuid and SnapshotActiveAlarms are pull-style and remain
correct as written.
Changes:
- docs/AlarmClientDiscovery.md (new) — reflection probe summary, full
AlarmClient method list, open questions for A.2 implementation.
- AlarmClientConsumer.cs xmldoc — replaced the inaccurate "managed
event surface" claim with the WM_APP finding; flagged
AlarmRecordReceived as unreachable in production until the WM_APP
pump lands.
- MxAccessAlarmEventSink.cs xmldoc — replaced the "verify on dev rig"
hedge in the wiring plan with the resolved finding; expanded the
open-questions list (WM_APP message ID, wParam/lParam semantics, STA
affinity, subscription scope) so the next A.2 PR knows what the
dev-rig probe needs to answer.
No functional code change for the worker (docs and xmldoc only); worker builds clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the inline diagnostic strings in PR A.3's AcknowledgeAlarm
+ QueryActiveAlarms handlers with an IAlarmRpcDispatcher seam.
- IAlarmRpcDispatcher (new) — gateway-side abstraction over the
worker-RPC path that fronts AlarmClient.AlarmAckByGUID and the
active-alarm walk. AcknowledgeAsync returns the
AcknowledgeAlarmReply directly; QueryActiveAlarmsAsync yields an
IAsyncEnumerable<ActiveAlarmSnapshot>.
- NotWiredAlarmRpcDispatcher (new, default impl) — returns
PROTOCOL_STATUS_OK with a structured worker-pending diagnostic
on Acknowledge, yields an empty stream on QueryActiveAlarms.
Same observable shape as PR A.3, but the integration seam is
now in code instead of hardcoded inside the handler.
- MxAccessGatewayService — handlers delegate to the dispatcher.
Constructor accepts an optional IAlarmRpcDispatcher (default
NotWiredAlarmRpcDispatcher); a future WorkerAlarmRpcDispatcher
registration in DI swaps in the live worker-IPC routing without
changing the public RPC surface.
- 2 new dispatcher tests pin the not-wired contract; 279 → 281
total tests, all green.
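The dispatcher seam can be sketched in Python with a Protocol and an optional constructor argument. All names are illustrative stand-ins for the C# types, replies are reduced to dicts, and the diagnostic text is made up for the example.

```python
import asyncio
from typing import AsyncIterator, Optional, Protocol


class AlarmRpcDispatcher(Protocol):
    async def acknowledge(self, session_id: str, reference: str) -> dict: ...

    def query_active_alarms(
        self, session_id: str, prefix: str
    ) -> AsyncIterator[str]: ...


class NotWiredDispatcher:
    """Default: OK with a worker-pending diagnostic, and an empty stream."""

    async def acknowledge(self, session_id: str, reference: str) -> dict:
        return {"status": "PROTOCOL_STATUS_OK", "diagnostic": "worker dispatch pending"}

    async def query_active_alarms(self, session_id: str, prefix: str):
        return
        yield  # unreachable; marks this as an async generator that yields nothing


class GatewayServiceSketch:
    def __init__(self, dispatcher: Optional[AlarmRpcDispatcher] = None) -> None:
        # A live dispatcher can be swapped in without touching the handlers.
        self._dispatcher = dispatcher or NotWiredDispatcher()

    async def acknowledge_alarm(self, session_id: str, reference: str) -> dict:
        return await self._dispatcher.acknowledge(session_id, reference)

    async def query_active_alarms(self, session_id: str, prefix: str = ""):
        async for snapshot in self._dispatcher.query_active_alarms(session_id, prefix):
            yield snapshot
```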
Worker-side dispatch (translating Acknowledge / QueryActiveAlarms
to the IPC method that calls IMxAccessAlarmConsumer from PR A.5)
is the dev-rig follow-up — it depends on validating the AVEVA
GetAlarmChangesCompleted event subscription against a live alarm
provider before pinning a wire format.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the worker-side consumer for AVEVA alarm transitions over the
aaAlarmManagedClient API discovered in the prior foundation PR.
- IAlarmMgrDataProvider.dll referenced — exposes AlarmRecord +
eAlmTransitions / eQueryType / eSortFlags / eAlarmFilterState.
Both DLLs (aaAlarmManagedClient + IAlarmMgrDataProvider) load in
the worker's existing net48 x86 process; no new bitness boundary.
- IMxAccessAlarmConsumer abstraction — Subscribe / AcknowledgeByGuid
/ SnapshotActiveAlarms / AlarmRecordReceived event. Test seam.
- AlarmClientConsumer production wrapper — RegisterConsumer +
Subscribe + AlarmAckByGUID + GetStatistics-based active-alarm
walk, all delegated to AlarmClient. Uses AVEVA's managed event
surface (GetAlarmChangesCompleted on IAlarmMgrDataProvider) so
no Windows message pump is required — plain .NET events arrive
on the alarm-client's internal callback thread.
- AlarmRecordTransitionMapper — pure-function helpers:
MapTransitionKind(eAlmTransitions): ALM→Raise, ACK→Acknowledge,
RTN→Clear, others (SUB/ENB/DIS/SUP/REL/REMOVE)→Unspecified
so EventPump's decoding-failure counter records them.
ComposeFullReference(provider, group, name): Provider!Group.Name
format matching AVEVA's standard alarm-reference syntax.
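The two pure helpers can be mirrored in a few lines of Python. This is an illustration of the mapping described above, using plain strings in place of the eAlmTransitions enum and the proto transition kinds; handling of empty parts in the reference is not modelled.

```python
# ALM / ACK / RTN map to the three modelled kinds; everything else
# (SUB, ENB, DIS, SUP, REL, REMOVE) falls through to "Unspecified".
_TRANSITION_KIND = {"ALM": "Raise", "ACK": "Acknowledge", "RTN": "Clear"}


def map_transition_kind(transition: str) -> str:
    return _TRANSITION_KIND.get(transition, "Unspecified")


def compose_full_reference(provider: str, group: str, name: str) -> str:
    """Compose Provider!Group.Name."""
    return f"{provider}!{group}.{name}"
```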
To be pinned during dev-rig validation (subsequent commits):
1. Confirm RegisterConsumer accepts hWnd=0 — if it requires a real
hwnd, the worker creates a hidden message-only window and
passes that handle. The managed event surface should make
this irrelevant but the AVEVA API is older than its managed
wrapper.
2. Wire AlarmClientConsumer.AlarmRecordReceived: the AVEVA
IAlarmMgrDataProvider.GetAlarmChangesCompleted event needs to
be hooked from inside the AlarmClient — find the proper
accessor (likely a property exposing the inner provider).
3. AlarmRecord field-by-field translation into the proto event
uses MxAccessAlarmEventSink.EnqueueTransition (existing
plumbing). The AlarmRecord field names (ar_OrigTime,
AlarmName, AckOperatorFullName, AckComment, etc.) are
pinned in the discovery dump preserved in
AlarmClientDiscoveryTests.
Tests: 127 pass (4 new ComposeFullReference cases + 1 Skip-gated
discovery probe). Transition-kind enum mapping is dev-rig-validated
rather than unit-tested because the AVEVA assembly is Private=false
on the reference and isn't copied to the test bin directory.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovers the surface of aaAlarmManagedClient.dll and stages the worker
csproj reference so subsequent PRs can wire native MxAccess alarm
subscription. Replaces the speculative "operator decision needed
between path 1 and path 2" framing in MxAccessAlarmEventSink with the
validated architecture.
Key findings from the discovery probe:
1. aaAlarmManagedClient.dll is x86 + .NET Framework (mixed-mode
C++/CLI; PE Machine = i386, NativeEntryPoint flag set). The
"x64-only" framing in the prior follow-up was wrong — confused
by the file path under Wonderware\Historian\x64\.
The assembly is bitness- and runtime-compatible with the
worker (net48 x86), so it loads in the existing process. No
sub-process needed.
2. AlarmClient is the public class. Its model mirrors MxAccess:
RegisterConsumer takes a Windows hWnd and the AVEVA alarm
service WM_APP-pokes that hwnd when alarms change. The worker's
existing STA + WM_APP pump can drive both the data-change COM
subscriber and the alarm-client consumer.
3. AlarmAckByGUID(alarmGuid, ackComment, oprName, oprNode,
oprDomain, oprFullName) — the native ack carries the operator's
full identity atomically with the comment. Closes the v1
operator-comment fidelity gap completely.
This PR:
- Adds the aaAlarmManagedClient.dll reference to MxGateway.Worker.
csproj. Worker still builds clean.
- Adds AlarmClientDiscoveryTests as a Skip-gated reflection probe;
flip the Skip parameter to dump the public type surface for
reference. Captured the dump into MxAccessAlarmEventSink
documentation so it doesn't have to be re-run.
- Replaces MxAccessAlarmEventSink's "two paths forward" doc with
the actual wiring plan against AlarmClient's RegisterConsumer +
Subscribe + AlarmAckByGUID surface.
Subsequent PRs (gated on STA + WM_APP integration testing on the
dev rig):
- Wire RegisterConsumer + Subscribe at session-startup; route
WM_APP messages through GetStatistics + GetAlarmExtendedRec into
EnqueueTransition.
- Translate gateway-side AcknowledgeAlarm RPC to a worker command
that calls AlarmAckByGUID with the OPC UA operator's identity;
replaces the worker-pending diagnostic from PR A.3.
- Translate gateway-side QueryActiveAlarms to a worker command
that walks GetStatistics's reported handles via GetAlarmExtendedRec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR A.2 ship-pin discovery: the MXAccess COM Toolkit installed at
C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll
does not expose any alarm-event family. Reflection enumeration of
the assembly confirms ILMXProxyServerEvents and
ILMXProxyServerEvents2 only carry OnDataChange, OnWriteComplete,
OperationComplete, and OnBufferedDataChange — no IAlarmEventSink,
no Alarms collection, no OnAlarmTransition.
AVEVA's separate alarm-subscription managed assemblies
(aaAlarmManagedClient.dll under InTouch\ViewAppFramework\Content\MA\
and ArchestrAAlarmsAndEvents.SDK.Common.dll under
Wonderware\Historian\x64\) exist on this box but are x64-only —
incompatible with the worker's x86 bitness, which is the bitness
constraint the mxaccessgw architecture exists to isolate in the
first place.
This commit replaces the speculative "TBD pin during dev-rig
validation" comment in MxAccessAlarmEventSink with the actual
finding plus the two operator-facing paths forward:
1. Stay on the value-driven sub-attribute path (current production
behaviour). lmxopcua's AlarmConditionService already synthesizes
Part 9 transitions from the four MXAccess sub-attributes.
Operator-comment fidelity is the only v1 regression.
2. Add an x64 alarm-helper sub-process alongside the worker that
loads aaAlarmManagedClient and forwards transitions to the
worker over a small named-pipe IPC. Recovers full v1 fidelity
but adds operational complexity.
Until that decision resolves, the sink's Attach is a no-op, the
worker continues to function for data subscriptions, and
lmxopcua-side AlarmConditionService keeps the sub-attribute
synthesis active.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nineteenth (final) PR of the alarms-over-gateway epic. Pins the
public RPC handler contract added in PR A.3:
- AcknowledgeAlarm rejects empty session_id and empty
alarm_full_reference with InvalidArgument.
- AcknowledgeAlarm with valid input returns OK and a
worker-pending diagnostic so clients see a successful round-trip
even before A.2's worker dispatch lands.
- QueryActiveAlarms rejects empty session_id with InvalidArgument.
- QueryActiveAlarms with valid input streams zero snapshots until
PR A.2 wires the worker-side QueryActiveAlarmsCommand
(filter-prefix passthrough verified at the proto layer).
- OpenSession advertises both new RPC capability strings
(unary-acknowledge-alarm, server-stream-active-alarms) so client
capability negotiation lights up against the contract surface.
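The pinned contract can be pictured with a small sketch. This is an illustrative Python model only, not the gateway's handler code: the StatusCode enum, the AckReply type, the diagnostic strings, and the function names are assumptions made for the example. Only the capability strings and the accept/reject rules come from the list above.

```python
from dataclasses import dataclass
from enum import Enum

# Capability strings advertised by OpenSession (taken from the list above).
CAPABILITIES = ("unary-acknowledge-alarm", "server-stream-active-alarms")


class StatusCode(Enum):
    # Hypothetical stand-in for the gateway's protocol status codes.
    OK = "OK"
    INVALID_ARGUMENT = "InvalidArgument"


@dataclass(frozen=True)
class AckReply:
    status: StatusCode
    diagnostic: str = ""


def acknowledge_alarm(session_id: str, alarm_full_reference: str) -> AckReply:
    # Empty identifiers are rejected before anything else happens.
    if not session_id:
        return AckReply(StatusCode.INVALID_ARGUMENT, "session_id is required")
    if not alarm_full_reference:
        return AckReply(StatusCode.INVALID_ARGUMENT,
                        "alarm_full_reference is required")
    # Stub behaviour: report success with a worker-pending diagnostic.
    return AckReply(StatusCode.OK, "worker dispatch pending")


def query_active_alarms(session_id: str, alarm_filter_prefix: str = ""):
    # Raises eagerly (no generator) so the rejection is visible at call time.
    if not session_id:
        raise ValueError("InvalidArgument: session_id is required")
    # Stub behaviour: valid input streams zero snapshots.
    return iter(())
```

In this model a valid acknowledge call still completes a round-trip, which is the property the tests pin; session lookup is deliberately left out of the sketch.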
Closes Track A's gateway-side surface. The remaining worker
ConditionRefresh walk + integration parity-rig validation land
during dev-rig hardware validation alongside PR A.2's COM-side
alarm subscription pin.
Tests: 279 passed (was 273; 6 new). Per-handler integration tests
land alongside the dev-rig validation when the worker walks the
real MxAccess active-alarm collection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eighteenth PR of the alarms-over-gateway epic
(docs/plans/alarms-over-gateway.md). Lands the proto-build path that
the worker uses to create OnAlarmTransition events. The COM-side
subscription that registers an alarm event sink against the MXAccess
Toolkit will be pinned during dev-rig validation: the exact API differs
across AVEVA versions and needs hardware to verify.
Lands today (unit-testable, no hardware needed):
- MxAccessEventMapper.CreateOnAlarmTransition — mechanical proto
builder. Takes decoded alarm fields (full reference, source
object, alarm type, transition kind, severity, timestamps,
operator user/comment, category, description) and produces an
MxEvent with the OnAlarmTransition body populated. Mirrors the
pattern of CreateOnDataChange / CreateOnWriteComplete / etc.
- MxAccessAlarmEventSink — scaffolded class with documented
Attach / Detach + an internal EnqueueTransition entry point.
When dev-rig validation pins the MXAccess Toolkit alarm
subscription API, the only edit needed is to wire the COM
delegate inside Attach to call EnqueueTransition. The mapper
bridge is already done.
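The split between the mapper and the sink can be sketched as below. This is a simplified Python model under stated assumptions, not the worker's C# code: the field subset, the string-valued transition kind, the queue type, and the class and method names are all hypothetical. It only illustrates the shape described above, where the sink's enqueue entry point calls a mechanical builder and Attach is the single place a COM delegate would later be wired.

```python
from dataclasses import dataclass
from queue import SimpleQueue
from typing import Optional


@dataclass(frozen=True)
class OnAlarmTransition:
    # Subset of the decoded alarm fields, chosen for illustration.
    alarm_full_reference: str
    transition_kind: str
    severity: int
    operator_user: Optional[str] = None
    operator_comment: Optional[str] = None


@dataclass(frozen=True)
class MxEvent:
    family: str
    body: OnAlarmTransition


def create_on_alarm_transition(alarm_full_reference, transition_kind, severity,
                               operator_user=None, operator_comment=None):
    # Mechanical builder: copies decoded fields into the event body.
    body = OnAlarmTransition(alarm_full_reference, transition_kind, severity,
                             operator_user, operator_comment)
    return MxEvent(family="OnAlarmTransition", body=body)


class AlarmEventSink:
    def __init__(self, events: SimpleQueue):
        self._events = events
        self.attached = False

    def attach(self):
        # Placeholder: a real sink would register its alarm callback here
        # and have that callback call enqueue_transition.
        self.attached = True

    def detach(self):
        self.attached = False

    def enqueue_transition(self, **fields):
        # Internal entry point: build the event and hand it to the queue.
        self._events.put(create_on_alarm_transition(**fields))
```

Because the builder takes plain decoded fields, both the full-payload and the bare-bones cases can be exercised without any COM object present.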
Pending dev-rig validation:
- Pin the MXAccess Toolkit alarm event source COM API (likely one
of IAlarmEventSink, IAlarmEventSubscription, or a method on
LMXProxyServerClass — verify against the worker host's installed
version).
- Add cancellation/cleanup tests once the COM hook is wired.
- Integration test against the parity rig that fires a real Galaxy
alarm and asserts the gateway emits OnAlarmTransition.
Tests:
- 2 new mapper tests pin the full-payload Acknowledge case and
the bare-bones Raise case.
- Full Worker.Tests suite green: 123 passed (was 121; 2 new).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twelfth PR of the alarms-over-gateway epic
(docs/plans/alarms-over-gateway.md). Lands the public RPC handler
surface that PR A.1's proto introduced. The actual worker-side
ack call + active-alarm walk depend on PR A.2 (worker MxAccess
subscription); this PR ensures clients can call the RPCs and
receive a meaningful response without UNIMPLEMENTED at the gRPC
layer.
- AcknowledgeAlarm — validates session_id + alarm_full_reference,
resolves the session (NotFound on miss), returns a successful
reply with a structured DiagnosticMessage indicating worker
dispatch is pending PR A.2. Once A.2 ships, the body translates
the request into a WorkerCommand and forwards through
SessionManager.InvokeAsync.
- QueryActiveAlarms — validates session_id, returns an empty
stream. PR A.4 layers the actual ConditionRefresh implementation
once the worker's QueryActiveAlarmsCommand is available.
- OpenSessionReply.Capabilities advertises both new RPCs
(unary-acknowledge-alarm, server-stream-active-alarms) so
clients can negotiate against the contract surface.
OnAlarmTransition events flow through the existing StreamEvents
path automatically — EventStreamService and MxAccessGrpcMapper
forward whatever family the worker emits without filtering, so
no changes are needed there for A.3.
Tests: full 273-test suite still green. Per-handler unit tests
ship with PR A.4's expanded surface; A.3's stub handlers are
narrow enough that the existing parity-fixture tests cover the
contract round-trip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eleventh PR of the alarms-over-gateway epic
(docs/plans/alarms-over-gateway.md). Mirrors PR E.2's .NET surface
on the Rust async SDK. Depends on PR E.1 (regen, merged).
- GatewayClient::acknowledge_alarm — async unary call. Uses the
existing unary_request helper (call timeout) and routes failures
through Error mapping; non-OK protocol status promotes to
Error::ProtocolStatus via ensure_protocol_success.
- GatewayClient::query_active_alarms — async server-streaming call
returning a new ActiveAlarmStream type alias (parallel to
EventStream). Errors are pre-mapped from tonic::Status; dropping
the stream cancels the call cooperatively.
- GATEWAY_PROTOCOL_VERSION bumped 2 → 3 to match the .NET contract.
- FakeGateway test impl extends to satisfy the new trait methods so
client_behavior.rs builds. Two new integration tests cover the
new SDK methods.
Tests:
- 12 unit + 10 client_behavior + 4 proto_fixtures = 26 tests, all
pass under cargo test (Rust 1.x via existing toolchain).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tenth PR of the alarms-over-gateway epic
(docs/plans/alarms-over-gateway.md). Mirrors PR E.2's .NET surface
on the Java SDK. Depends on PR E.1 (regen, merged).
- MxGatewayClient.acknowledgeAlarm — blocking unary call, validates
protocol status via the existing MxGatewayErrors helper. Wraps
RuntimeException through MxGatewayErrors.fromGrpc for typed
failure mapping.
- MxGatewayClient.acknowledgeAlarmAsync — CompletableFuture variant
using the future stub.
- MxGatewayClient.queryActiveAlarms — async server-streaming RPC
observed via a new MxGatewayActiveAlarmsSubscription handle
(parallel to MxGatewayEventSubscription; the existing
subscription class is hard-typed to MxEvent so a parallel type
was simpler than retrofitting generics).
- MxGatewayClientVersion bumps GATEWAY_PROTOCOL_VERSION 2 → 3 to
match the .NET contract; CLI version-string assertions updated
to match.
Java SDK build green via Gradle 9.4.1 (mxgateway-client + mxgateway-cli).
17 tasks, all tests passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ninth PR of the alarms-over-gateway epic
(docs/plans/alarms-over-gateway.md). Mirrors PR E.2's .NET surface
on the Go SDK. Depends on PR E.1 (regen, merged).
- Client.AcknowledgeAlarm — context-aware unary call routed through
the existing callContext helper (default 30s timeout). Failures
wrap into *GatewayError; protocol-status non-OK promotes to typed
protocol errors via EnsureProtocolSuccess.
- Client.QueryActiveAlarms — context-streaming wrapper around the
generated MxAccessGateway_QueryActiveAlarmsClient. Caller drives
the stream via Recv(); cancelling ctx releases it.
- types.go re-exports the four new generated types
(AcknowledgeAlarmRequest/Reply, QueryActiveAlarmsRequest,
ActiveAlarmSnapshot) plus the AlarmTransitionKind /
AlarmConditionState enums and the
QueryActiveAlarmsClient stream alias.
- version.go bumps GatewayProtocolVersion 1 → 3 to match the .NET
contract; the const was previously stale and the bump fixes the
pre-existing TestOpenSessionFixtureProtocolVersions failure that
was masked because the fixture had not been regenerated until A.1.
Tests:
- 4 new tests in alarms_test.go — request shape + auth metadata,
nil-request rejection, Unauthenticated mapping, snapshot
streaming over bufconn, filter-prefix passthrough.
- All Go test suites green: cmd/mxgw-go + mxgateway.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seventh PR of the alarms-over-gateway epic
(docs/plans/alarms-over-gateway.md). Depends on PR A.1 (proto, merged)
and E.1 (regen, merged).
Hand-written .NET SDK methods on top of the regenerated proto types:
- MxGatewayClient.AcknowledgeAlarmAsync — routes through the existing
safe-unary retry pipeline (Acks are idempotent at MxAccess), maps
Unauthenticated/PermissionDenied RpcExceptions to typed
MxGatewayAuthenticationException / MxGatewayAuthorizationException
via GrpcMxGatewayClientTransport.MapRpcException.
- MxGatewayClient.QueryActiveAlarmsAsync — server-streaming
IAsyncEnumerable<ActiveAlarmSnapshot> mirroring the StreamEvents
pattern.
- IMxGatewayClientTransport extended; GrpcMxGatewayClientTransport
implements both methods using the regenerated grpc client.
- FakeGatewayTransport extended with capture lists, exception queue,
and reply / snapshot enqueue helpers.
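Why idempotency matters for the retry path can be shown with a generic sketch. The code below is not the SDK's safe-unary pipeline; TransientError, call_with_retry, and the attempt and delay parameters are hypothetical names chosen for the example. It assumes only what the list above states: an acknowledge can be repeated safely, so a failed attempt may simply be reissued.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable transport failure."""


def call_with_retry(call, attempts=3, delay_s=0.0):
    # Reissuing the call is only safe because the operation is idempotent.
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return call()
        except TransientError as exc:
            last_error = exc
            # Wait between attempts, but not after the final one.
            if attempt + 1 < attempts:
                time.sleep(delay_s)
    raise last_error
```

Non-transient exceptions are not caught here, so in this sketch an authentication failure would surface immediately rather than being retried.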
CLI version-string assertions updated for the GatewayProtocolVersion
2 → 3 bump from A.1.
The CLI alarms verb (subscribe / acknowledge / query-active) is
deferred to a follow-up — keeping this PR focused on the SDK surface
that lmxopcua's GalaxyDriver consumes in PR B.2. The other-language
SDKs (E.3-E.6) layer the same shape on the regen.
Tests:
- 6 new MxGatewayClientAlarmsTests — request shape, cancellation
honor (linked-token via retry pipeline), Unauthenticated mapping,
streaming snapshot enumeration, filter prefix passthrough,
cancellation during enumeration.
- Full client test suite: 57 passed (was 51; 6 new).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure mechanical regen following PR A.1 (alarm-transition event family
+ AcknowledgeAlarm / QueryActiveAlarms public RPCs). Ran:
- clients/python/generate-proto.ps1 → mxaccess_gateway_pb2.py +
mxaccess_gateway_pb2_grpc.py.
- clients/go/generate-proto.ps1 → mxaccess_gateway.pb.go +
mxaccess_gateway_grpc.pb.go + galaxy_repository.pb.go (whitespace
diff from upstream protoc minor version).
The .NET binding regenerates on csproj rebuild via Grpc.Tools — its
artifact (Generated/MxaccessGateway*.cs) was already updated as part
of A.1's commit. Java + Rust regen happens at build time via the
gradle plugin / build.rs respectively, with no committed output to
update.
Smoke-imported the regenerated Python descriptors:
OnAlarmTransitionEvent.DESCRIPTOR.fields → alarm_full_reference,
alarm_type_name, category, current_value, description, ...
AcknowledgeAlarmRequest.DESCRIPTOR.fields → alarm_full_reference,
client_correlation_id, comment, operator_user, session_id
ActiveAlarmSnapshot.DESCRIPTOR.fields → alarm_full_reference,
alarm_type_name, category, current_state, current_value, ...
PRs E.2 - E.6 layer hand-written SDK methods on top of the regenerated
types — those land per-language as separate PRs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First PR of the alarms-over-gateway epic
(docs/plans/alarms-over-gateway.md in lmxopcua). Pure contract-surface
change — no functional wiring yet. Worker-side subscription (A.2),
gateway-side dispatch + ack handler (A.3), and ConditionRefresh
(A.4) follow.
mxaccess_gateway.proto:
- Extend MxEventFamily with MX_EVENT_FAMILY_ON_ALARM_TRANSITION = 5.
- Extend MxEvent.body oneof with OnAlarmTransitionEvent on_alarm_transition = 24.
- Add OnAlarmTransitionEvent message carrying the full MxAccess alarm
payload (full reference, source object, alarm-type-name, transition
kind, raw severity, original raise timestamp, transition timestamp,
operator user/comment, category, description, current/limit value).
Mapping to OPC UA 0-1000 severity ladder happens server-side in
lmxopcua's MxAccessSeverityMapper (B.1) — gateway preserves the
native MxAccess scale.
- Add AlarmTransitionKind enum (Raise / Acknowledge / Clear / Retrigger).
- Add ActiveAlarmSnapshot + AlarmConditionState for the
ConditionRefresh stream.
- Add public RPCs AcknowledgeAlarm (unary) and QueryActiveAlarms
(server-streaming) on MxAccessGateway service.
- Add AcknowledgeAlarmRequest/Reply + QueryActiveAlarmsRequest.
GatewayContractInfo.GatewayProtocolVersion bumps 2 → 3. Fixture
manifests (proto-inputs, behavior, parity, golden OpenSessionReply)
and protoset descriptor regenerated.
Tests: round-trip serialization for the new messages with
all-fields-populated and empty-optional-fields cases; oneof
last-write-wins guard between OnDataChange and OnAlarmTransition;
descriptor service-method enumeration includes the two new RPCs.
All 273 existing tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Resolve 14 conflicts from popping local stash on top of origin's
eed1e88 + 8d3352f doc-comment additions (11 mechanical, plus
version.rs, DashboardAuthenticatorTests.cs, DashboardGalaxyProjector.cs)
- Fix 4 test files that used AGENTS.md as the repo-root sentinel
(now use CLAUDE.md, since AGENTS.md was removed in 4731ab5)
- Redirect 10 doc citations from AGENTS.md to the matching gateway.md
sections (Value Model, Status Model, Security, STA Worker Thread
Model, gRPC Layer rule, cancellation rule)
Verified: solution build clean, x86 worker build clean, 266/266
gateway tests passing, 121/121 worker tests passing.
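The repo-root sentinel lookup those test files rely on follows a common pattern, sketched here in Python. This is not the repository's test helper: find_repo_root and its parameters are hypothetical, and only the choice of CLAUDE.md as the sentinel comes from the notes above.

```python
from pathlib import Path


def find_repo_root(start: Path, sentinel: str = "CLAUDE.md") -> Path:
    """Walk upward from start until a directory contains the sentinel file."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        if (candidate / sentinel).is_file():
            return candidate
    raise FileNotFoundError(f"no {sentinel} found at or above {start}")
```

A lookup like this fails loudly when the sentinel file is removed, which is why deleting AGENTS.md required touching the tests that searched for it.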
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The operational rules from AGENTS.md (parity contract, source-update
verification matrix, MXAccess/Galaxy analysis sources, dashboard
constraints, gateway-doesn't-reattach-orphans) are now in CLAUDE.md.
Architecture details remain in gateway.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Rename 16 kebab-case docs to PascalCase per StyleGuide
- Move per-language client design docs from docs/ to clients/<lang>/
alongside their READMEs
- Add ## Related Documentation sections to 15 docs that lacked one
- Fix sentence-case violations in H3 headings (StyleGuide rule)
- Update cross-references in gateway.md, client READMEs, scripts,
and generate-proto.ps1 helpers to follow the new paths
- Add CLAUDE.md with build/test commands, the source-update
verification matrix, the parity-first contract, and pointers
to MXAccess and Galaxy Repository analysis sources
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>