f4423dfb6d
Extended AlarmClientWmProbeTests to call AlarmClient.GetProviders after RegisterConsumer.

Run 2026-05-01: GetProviders -> rc=0 count=0 list=[]

Zero alarm providers visible to the consumer. This explains every preceding probe run: no providers means no alarm events, regardless of subscription expression or value writes upstream. Even with a System Platform script flipping TestMachine_001.TestAlarm001 every 10s during the run, GetStatistics reported no transitions, no positions[] entries, no field changes after t=0.85s.

Possible causes (dev-rig configuration, not code):

1. No $Alarm extension on the test bool. Flipping the value writes a value but doesn't fire an alarm.
2. AVEVA alarm-manager service (aaAlarmMgr or equivalent) not running on this rig.
3. Process security context. Providers registered under a service account aren't visible to a consumer running under a normal user account.

A.2 implementation is blocked on this until at least one provider is visible. Once a provider exists, the polling-vs-callback question is answerable in one probe run; without a provider both paths return the same "nothing happening" answer.

Probe changes:

- Added in-process MxAccess Write attempt (TriggerWriteValue). It hit TargetParameterCountException, so the Write signature is not (handle, item, value); reflection diag added but not resolved. Now disabled in favor of external trigger.
- Added GetProviders enumeration after RegisterConsumer.
- Removed firePrint/clearPrint markers; probe is observe-only.
- Added ArchestrA.MxAccess reference to the test project.

Also updated docs/AlarmClientDiscovery.md with the alarm-provider-visibility section explaining what's blocked and why.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
263 lines
12 KiB
Markdown
# aaAlarmManagedClient discovery: public surface, 2026-05-01

Result of running
`MxGateway.Worker.Tests.AlarmClientDiscoveryTests.DumpAlarmClientPublicSurface`
against the deployed AVEVA assembly:

- File:
  `C:\Program Files (x86)\ArchestrA\Framework\Bin\ViewAppFramework\Content\MA\aaAlarmManagedClient.dll`
- Assembly identity: `aaAlarmManagedClient, Version=1.0.7368.41290,
  Culture=neutral, PublicKeyToken=7ebd82b507d9e10c`

## Public types

- `aaAlarmManagedClient.AlarmClient` (class)
- `aaAlarmManagedClient.PriorityData` (class)

That's the entire exported surface: two types, no interfaces, no
delegates.

## `AlarmClient` events

**None.** The class has no public events at all. The reflection probe's
`GetEvents(BindingFlags.Public | BindingFlags.Instance | BindingFlags.Static)`
call returned an empty list.

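
The reflection check above can be reproduced with a few lines. This is a minimal sketch, not the probe's actual code: `EventSurfaceProbe`, `NoEvents`, and `WithEvent` are hypothetical names, and the two stand-in classes exist only so the sketch runs without the AVEVA assembly installed.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class EventSurfaceProbe
{
    // Returns the names of every public event (instance or static)
    // visible on the given type, sorted for stable output.
    public static string[] PublicEventNames(Type type) =>
        type.GetEvents(BindingFlags.Public | BindingFlags.Instance | BindingFlags.Static)
            .Select(e => e.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
}

// Hypothetical stand-in: a class with methods but no events,
// which is the shape the probe reported for AlarmClient.
public sealed class NoEvents
{
    public int Poll() => 0;
}

// Hypothetical stand-in: a class that does expose a .NET event.
public sealed class WithEvent
{
    public event EventHandler Changed = delegate { };
    public void Raise() => Changed(this, EventArgs.Empty);
}
```

Against the real assembly, the same helper would presumably be fed the type obtained via `Assembly.LoadFrom(path).GetType("aaAlarmManagedClient.AlarmClient")`; an empty array corresponds to the "no public events" result recorded above.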
## `AlarmClient` methods (relevant subset)

- **Lifecycle:**
  `RegisterConsumer(int hWnd, string szProductName, string
  szApplicationName, string szVersion, bool bRetainHiddenAlarms) → int`,
  `DeregisterConsumer() → int`,
  `InitializeConsumer(string szApplicationName) → int`,
  `UninitializeConsumer() → int`,
  `Dispose()`.
- **Subscription:**
  `Subscribe(string szSubscription, short wFromPri, short wToPri,
  eQueryType QueryType, eSortFlags SortFlags, eAlarmFilterState
  FilterMask, eAlarmFilterState FilterSpecification) → int`.
- **Change enumeration (pull on poke):**
  `GetStatistics(out int lPercentQuery, out int lTotalAlarms, out int
  lActiveAlarms, out int lSuppressedAlarms, out int lSuppressedFilters,
  out int lNewAlarms, out int lChangesCount, out int[] ChangeCodes,
  out int[] ChangePos, out int[] hAlarm) → int`.
- **Record fetch:**
  `GetAlarmExtendedRec(int lIndex, out AlarmRecord almRec) → int`,
  `GetAlarmExtendedRec2(...)`,
  `GetHighPriAlarm(out AlarmRecord almRec) → int`.
- **Selection model** (used by ack-selected-* family):
  `DeselectAll`, `SelectAlaramEntry(short select, int from, int to)`,
  `SelectByGUID(Guid)`, `SelectAlarmCount(int from, int to)`.
- **Acknowledge:**
  `AlarmAckByGUID(Guid alarmGuid, string ackComment, string ackOprName,
  string ackOprNode, string ackOprDomain, string ackOprFullName) → int`
  is the per-alarm full-fidelity native ack.
  `AlarmAckSelected(string ackComment, string ackOprName, string
  ackOprNode, string ackOprDomain, string ackOprFullName) → int`
  acks whatever the selection model currently has selected.
  Several `AckSelected*Group/Tag/Priority/All/Visible*Alarms_Ex(...)`
  variants exist for bulk ack scoped to a group / tag / priority range.
- **Suppress / shelve:** `SupressSelected*` and `ShelveSelected*`
  families plus `DoAlarmShelveAction(...)`. Out of scope for the v1
  alarm path.
- **Snapshot/filter** (`SF*` prefix): `SFSetSortA / SFSetFilterA /
  SFCreateSnapshot / SFGetListCount / SFDeleteSnapshot / SFRefreshAlarm /
  SFGetStatistics`. Snapshot-style query API, distinct from the
  consumer-subscription path. Not currently used.

## What this means

The architecture comment on
`src/MxGateway.Worker/MxAccess/AlarmClientConsumer.cs` (PR A.5) is
**wrong against this deployed assembly**:

> "The AVEVA alarm-manager surface (`IAlarmMgrDataProvider`) exposes
> the events we need as plain .NET events — no Windows message pump
> required."

There is no managed event surface. `AlarmClient.RegisterConsumer`
takes an `hWnd` because **WM_APP messaging is the actual notification
mechanism**: AVEVA's alarm provider WM_APP-pokes the registered window,
and the consumer is expected to call `GetStatistics` on each poke to
pull the `ChangeCodes` / `ChangePos` / `hAlarm` arrays, then
`GetAlarmExtendedRec(pos, …)` per index to fetch each changed record.
This was the reading from the static surface alone; the live runtime
probe below contradicts it.

As a result, `AlarmClientConsumer.AlarmRecordReceived` has no
production callers: `RaiseAlarmRecordReceived` is `internal` for tests
and is never invoked at runtime. Until A.2 lands a working notification
path, `MX_EVENT_FAMILY_ON_ALARM_TRANSITION` cannot carry events.

## Live runtime probe, 2026-05-01

`MxGateway.Worker.Tests.AlarmClientWmProbeTests.ProbeAlarmClientWmMessages`
is a Skip-gated runtime probe that creates a real message-only
window, calls `AlarmClient.RegisterConsumer(hWnd, …)` +
`Subscribe(@"\Galaxy!", …)`, and pumps for 20s while logging every
window message that arrives. Run results below; they turned the
"WM_APP pump" design assumption upside down.

**`RegisterConsumer` and `Subscribe` both returned 0 (success).** The
calls are valid against the deployed assembly; no parameter pinning
needed.

**A registered-message-class WM (ID `0xC275` in this OS session)
fired every ~1s after `Subscribe` completed.** Constant
`wParam = 0x00001100`, constant `lParam = 0x079E46D8` (looks like a
stable pointer into AVEVA-internal state) for all 20 hits. The
constant payload across hits with no Galaxy alarm being fired
suggests this is a **heartbeat/keepalive**, not a per-change
notification.

**Critically: this WM is delivered to AVEVA's own internal window
(`hwnd=0x18032E`), NOT to the consumer's `hWnd` we passed in.** The
consumer window's `WndProc` received only the standard creation
sequence (`WM_GETMINMAXINFO`, `WM_NCCREATE`, `WM_NCCALCSIZE`,
`WM_CREATE`) and the destruction sequence (`WM_NCDESTROY`,
`WM_DESTROY`, `WM_NCCALCSIZE`), with nothing in between. AVEVA's
notification path runs entirely against AVEVA's internal window;
it never forwards to the user-supplied hWnd.

The message ID itself is dynamic (a `RegisterWindowMessage`
allocation in the >= 0xC000 range), so it cannot be hard-coded:
each consumer process must call `RegisterWindowMessage` with the
correct *string* and use whatever ID the OS returns.

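
To make the ID-range reasoning concrete, here is a small sketch that classifies a window message ID by the ranges Win32 documents (system, `WM_USER`, `WM_APP`, registered string, reserved). `WindowMessages`, `WindowMessageRange`, and `Classify` are hypothetical names introduced for illustration. The `RegisterWindowMessage` P/Invoke declaration matches the user32 export, but it is Windows-only, and the string AVEVA registers is not known from the probe, so nothing here calls it.

```csharp
using System;
using System.Runtime.InteropServices;

public enum WindowMessageRange
{
    SystemDefined,      // 0x0000 .. 0x03FF
    PrivateWindowClass, // WM_USER 0x0400 .. 0x7FFF
    Application,        // WM_APP  0x8000 .. 0xBFFF
    RegisteredString,   // 0xC000 .. 0xFFFF, allocated by RegisterWindowMessage
    Reserved            // above 0xFFFF
}

public static class WindowMessages
{
    // Windows-only. The message string AVEVA uses is unknown, so any
    // argument passed here would be a guess.
    [DllImport("user32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    public static extern uint RegisterWindowMessage(string lpString);

    // Pure range check; runs on any platform.
    public static WindowMessageRange Classify(uint msg)
    {
        if (msg < 0x0400) return WindowMessageRange.SystemDefined;
        if (msg < 0x8000) return WindowMessageRange.PrivateWindowClass;
        if (msg < 0xC000) return WindowMessageRange.Application;
        if (msg <= 0xFFFF) return WindowMessageRange.RegisteredString;
        return WindowMessageRange.Reserved;
    }
}
```

`Classify(0xC275)` falls in the registered-string range rather than the `WM_APP` range, which is why the observed ID can differ between OS sessions and has to be looked up by name at runtime.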
## What this means for A.2

The "WM_APP pump on the user hWnd" design (what the original plan
banner described and what the previous version of this doc
recommended) does not match how AVEVA actually delivers
notifications. The hWnd parameter to `RegisterConsumer` does not
appear to receive any of AVEVA's alarm traffic; it's likely used
only as a registration identity (and perhaps as a parent for modal
dialogs).

Two viable A.2 designs given the probe data:

1. **Polling.** Just call `GetStatistics` on a timer (e.g. every
   500ms in the worker's STA) and react to the change set it
   reports. No window plumbing needed. Trade-off: latency floor =
   poll period; modest CPU floor because the call is cheap. Matches
   the heartbeat-style WM 0xC275 semantics, which suggest AVEVA
   itself runs a poll loop internally.
2. **Hook AVEVA's internal window.** Discover AVEVA's own window
   (`hwnd=0x18032E` in the probe), `SetWindowsHookEx` or
   `SetWindowSubclass` on it, and intercept WM 0xC275 on AVEVA's
   thread. Higher fidelity, near-zero latency, but invasive,
   fragile across AVEVA upgrades, and requires running on the same
   process / thread as the AVEVA window. Probably a non-starter
   without further AVEVA documentation.

**Recommendation:** the polling path (option 1) is cheaper to
implement, more robust against AVEVA-internal change, and
acceptable for a typical alarm cadence. The worker's existing STA
already provides a thread-affinitized timer surface. The unanswered
question is whether `GetStatistics` can be safely called outside
AVEVA's own message-pump thread, which is confirmable by extending
the probe to fire `GetStatistics` on its own thread and checking the
result.

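
One poll iteration of the recommended design might look like the sketch below. It is an illustration under stated assumptions, not the A.2 implementation: `GetStatisticsCall` and `AlarmPoller` are hypothetical names, the delegate carries only the three change arrays from the real `GetStatistics` signature, and treating `ChangeCodes` / `ChangePos` / `hAlarm` as index-aligned is an assumption the probe has not yet verified.

```csharp
using System;

// Hypothetical adapter shape: the production code would wrap
// AlarmClient.GetStatistics and forward only the change arrays.
public delegate int GetStatisticsCall(
    out int[] changeCodes, out int[] changePos, out int[] handles);

public sealed class AlarmPoller
{
    private readonly GetStatisticsCall _getStatistics;
    private readonly Action<int, int, int> _onChange; // (code, position, handle)

    public AlarmPoller(GetStatisticsCall getStatistics, Action<int, int, int> onChange)
    {
        _getStatistics = getStatistics ?? throw new ArgumentNullException(nameof(getStatistics));
        _onChange = onChange ?? throw new ArgumentNullException(nameof(onChange));
    }

    // Runs one poll. Returns the number of changes dispatched,
    // or -1 when the underlying call reports a non-zero return code.
    public int PollOnce()
    {
        int rc = _getStatistics(out int[] codes, out int[] positions, out int[] handles);
        if (rc != 0) return -1;

        codes = codes ?? Array.Empty<int>();
        positions = positions ?? Array.Empty<int>();
        handles = handles ?? Array.Empty<int>();

        // Assumption: the three arrays are index-aligned. Only entries
        // present in all three are dispatched.
        int n = Math.Min(codes.Length, Math.Min(positions.Length, handles.Length));
        for (int i = 0; i < n; i++)
        {
            _onChange(codes[i], positions[i], handles[i]);
        }
        return n;
    }
}
```

`PollOnce` is deliberately timer-agnostic. A `System.Threading.Timer` would run its callback on a thread-pool thread, so until the cross-thread question above is answered, the safer choice is to call `PollOnce` from the worker's STA loop, the same thread that did `RegisterConsumer` and `Subscribe`.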
## Alarm-provider visibility: third probe run, 2026-05-01

Extended the probe to call `AlarmClient.GetProviders` after
`RegisterConsumer`. Result on this rig:

```
GetProviders -> rc=0 count=0 list=[]
```

**Zero alarm providers visible to the consumer process.** This
explains every preceding probe run: no providers means no alarm
events, regardless of how many times any value (including a
bool with an `$Alarm` extension) flips. `Subscribe(@"\Galaxy!")`
returns 0 (success) but matches nothing because the alarm-manager
chain that provides the matching feed doesn't expose any provider
to this consumer.

A System Platform script flipping `TestMachine_001.TestAlarm001`
every 10s during this probe run produced no observable
`GetStatistics` transitions, no `positions[]` / `handles[]`
entries (the `ChangePos` / `hAlarm` out-arrays), and no change in
any field. That confirms the silence is not about subscription
scope or the message pump but about provider absence.

### Possible causes

1. **No `$Alarm` extension on the test bool.** If
   `TestMachine_001.TestAlarm001` is a regular UDA without a
   `BoolAlarm` extension wired to it, flipping the value just
   writes a new value and no alarm fires.
2. **Alarm manager service not running.** AVEVA's `aaAlarmMgr`
   (or the equivalent on this rig's Platform version) needs to
   be running for providers to register.
3. **Process security context.** A consumer running under a
   normal user account may not see providers that registered
   under `LocalSystem` / a Platform service identity. The
   gateway-worker installation runs under a service account
   that may have access where `dotnet test` doesn't.

### Implications for A.2 implementation

The A.2 PR's value is unmeasurable until at least one alarm
provider is visible. The choice between polling-via-`GetStatistics`
and the callback path can only be decided by observing what
populates first when a real alarm fires. Without a provider,
both paths return the same "nothing happening" answer.

Until that's resolved, A.2 implementation work is blocked on a
dev-rig configuration issue, not on architectural choice or code
structure.

## GetStatistics polling: second probe run, 2026-05-01

Extended the probe to call `GetStatistics` every ~2s alongside the
WM logger. Key findings:

- **`GetStatistics` is safely callable from the same thread that
  did `RegisterConsumer` + `Subscribe`.** Every poll returned rc=0
  with no exceptions over 9 polls / 20s window.
- **The deployed Galaxy currently has zero active alarms.** Every
  poll reported `total=0 active=0 suppressed=0 newAlarms=0`. The
  `positions[]` and `handles[]` arrays were empty.
- **`changes=1 codes=[7]` was constant across all polls**, matching
  the constant 1 Hz WM 0xC275 cadence. Code 7 is consistent with a
  "heartbeat / subscription healthy" sentinel: same semantics as
  the WM but reported through the pull-side API.
- `percent=100` (query-complete percentage) was constant, so the
  subscription is steady-state.

This confirms the polling design (option 1 under "What this means
for A.2") is mechanically viable. The remaining open question is
whether `GetStatistics` populates `positions[] / handles[]` with
real entries when an alarm transition actually fires. Proving that
requires firing an alarm.

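
If code 7 does turn out to be a heartbeat, a poller would want to tell "heartbeat only" samples apart from samples that carry real changes. The sketch below shows one way to express that check. `ChangeSetFilter` and `AssumedHeartbeatCode` are hypothetical names, and the meaning of code 7 is inferred from the probe output above, not taken from AVEVA documentation.

```csharp
using System;

public static class ChangeSetFilter
{
    // Assumption from the probe data: code 7 behaves like a
    // "subscription healthy" sentinel. Not a documented constant.
    public const int AssumedHeartbeatCode = 7;

    // True when the sample carries no per-alarm information:
    // no positions, no handles, and every change code (if any)
    // is the assumed heartbeat code.
    public static bool IsHeartbeatOnly(int[] changeCodes, int[] positions, int[] handles)
    {
        if (changeCodes == null) throw new ArgumentNullException(nameof(changeCodes));
        if (positions == null) throw new ArgumentNullException(nameof(positions));
        if (handles == null) throw new ArgumentNullException(nameof(handles));

        if (positions.Length != 0 || handles.Length != 0) return false;
        foreach (int code in changeCodes)
        {
            if (code != AssumedHeartbeatCode) return false;
        }
        return true;
    }
}
```

Every sample from the second probe run (`codes=[7]`, empty `positions[]` and `handles[]`) would be classified as heartbeat-only. Any sample with a populated position or handle array, or with a code other than 7, is passed through for inspection, which keeps the filter conservative while the code's meaning is unconfirmed.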
## Open follow-up probes

Each can be added to `AlarmClientWmProbeTests` as a separate
Skip-gated test:

1. **Fire a real Galaxy alarm during the pump window.** The cleanest
   programmatic trigger is an MxAccess write that flips a
   `$Alarm`-extended boolean to true (alarm in) and back to false
   (alarm out). Pinning the exact tag reference is pending; it needs
   either a documented test-fixture tag or an interactive selection
   in System Platform IDE. Once the trigger fires, this resolves
   whether AVEVA's pulled change set arrives via `GetStatistics`
   `positions[] / handles[]` (per-change polling works) or only via
   the AVEVA-internal window (callback path needed).
2. **Hook AVEVA's internal window** to log what WMs it actually
   processes. Only relevant if probe 1 shows `GetStatistics` does
   NOT report per-change activity.
3. **Decompile `aaAlarmManagedClient.dll`'s IL** for the
   `RegisterConsumer` method to find what `RegisterWindowMessage`
   string is used and whether there's a callback-registration
   surface on `WNAL_Register` that the managed client wraps. The
   `alarmlst.dll` strings (`WNAL_CallBack`, "Invalid callbacks" error)
   suggest the underlying C API takes callbacks, but the managed
   wrapper exposes none of them.

PR A.5's `Subscribe` / `AcknowledgeByGuid` / `SnapshotActiveAlarms`
are correct: they're pull-style and don't depend on the
notification mechanism.