Remove the AlarmClientDiscovery probe log
Delete docs/AlarmClientDiscovery.md, an archival AVEVA alarm-consumer investigation log whose durable findings now live in the alarm worker/monitor code. Drop the now-dangling links from Grpc.md and GatewayConfiguration.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1,828 +0,0 @@
# aaAlarmManagedClient discovery: public surface, 2026-05-01

Result of running
`MxGateway.Worker.Tests.AlarmClientDiscoveryTests.DumpAlarmClientPublicSurface`
against the deployed AVEVA assembly:

- File:
  `C:\Program Files (x86)\ArchestrA\Framework\Bin\ViewAppFramework\Content\MA\aaAlarmManagedClient.dll`
- Assembly identity: `aaAlarmManagedClient, Version=1.0.7368.41290,
  Culture=neutral, PublicKeyToken=7ebd82b507d9e10c`

## Public types

- `aaAlarmManagedClient.AlarmClient` (class)
- `aaAlarmManagedClient.PriorityData` (class)

That's the entire exported surface: two types, no interfaces, no
delegates.

## `AlarmClient` events

**None.** The class has no public events at all. The reflection probe's
`GetEvents(BindingFlags.Public | Instance | Static)` returned an empty
list.

## `AlarmClient` methods (relevant subset)

- **Lifecycle:**
  `RegisterConsumer(int hWnd, string szProductName, string
  szApplicationName, string szVersion, bool bRetainHiddenAlarms) → int`,
  `DeregisterConsumer() → int`,
  `InitializeConsumer(string szApplicationName) → int`,
  `UninitializeConsumer() → int`,
  `Dispose()`.
- **Subscription:**
  `Subscribe(string szSubscription, short wFromPri, short wToPri,
  eQueryType QueryType, eSortFlags SortFlags, eAlarmFilterState
  FilterMask, eAlarmFilterState FilterSpecification) → int`.
- **Change enumeration (pull on poke):**
  `GetStatistics(out int lPercentQuery, out int lTotalAlarms, out int
  lActiveAlarms, out int lSuppressedAlarms, out int lSuppressedFilters,
  out int lNewAlarms, out int lChangesCount, out int[] ChangeCodes,
  out int[] ChangePos, out int[] hAlarm) → int`.
- **Record fetch:**
  `GetAlarmExtendedRec(int lIndex, out AlarmRecord almRec) → int`,
  `GetAlarmExtendedRec2(...)`,
  `GetHighPriAlarm(out AlarmRecord almRec) → int`.
- **Selection model** (used by ack-selected-* family):
  `DeselectAll`, `SelectAlaramEntry(short select, int from, int to)`,
  `SelectByGUID(Guid)`, `SelectAlarmCount(int from, int to)`.
- **Acknowledge:**
  `AlarmAckByGUID(Guid alarmGuid, string ackComment, string ackOprName,
  string ackOprNode, string ackOprDomain, string ackOprFullName) → int`
  is the per-alarm full-fidelity native ack.
  `AlarmAckSelected(string ackComment, string ackOprName, string
  ackOprNode, string ackOprDomain, string ackOprFullName) → int`
  acks whatever the selection model currently has selected.
  Several `AckSelected*Group/Tag/Priority/All/Visible*Alarms_Ex(...)`
  variants exist for bulk ack scoped to a group / tag / priority range.
- **Suppress / shelve:** `SupressSelected*` and `ShelveSelected*`
  families plus `DoAlarmShelveAction(...)`. Out of scope for the v1
  alarm path.
- **Snapshot/filter** (`SF*` prefix): `SFSetSortA / SFSetFilterA /
  SFCreateSnapshot / SFGetListCount / SFDeleteSnapshot / SFRefreshAlarm /
  SFGetStatistics`. Snapshot-style query API, distinct from the
  consumer-subscription path. Not currently used.

## What this means

The architecture comment on
`src/MxGateway.Worker/MxAccess/AlarmClientConsumer.cs` (PR A.5) is
**wrong against this deployed assembly**:

> "The AVEVA alarm-manager surface (`IAlarmMgrDataProvider`) exposes
> the events we need as plain .NET events — no Windows message pump
> required."

There is no managed event surface. `AlarmClient.RegisterConsumer`
takes an `hWnd` because **WM_APP messaging is the actual notification
mechanism**: AVEVA's alarm provider WM_APP-pokes the registered window,
and the consumer is expected to call `GetStatistics` on each poke to
pull `ChangeCodes` / `ChangePos` / `hAlarm` arrays, then
`GetAlarmExtendedRec(pos, …)` per index to fetch each changed record.

`AlarmClientConsumer.AlarmRecordReceived` has no production callers as
a result: `RaiseAlarmRecordReceived` is `internal` for tests and
never gets invoked at runtime. Until A.2 lands a WM_APP pump,
`MX_EVENT_FAMILY_ON_ALARM_TRANSITION` cannot carry events.

## Live runtime probe, 2026-05-01

`MxGateway.Worker.Tests.AlarmClientWmProbeTests.ProbeAlarmClientWmMessages`
is a Skip-gated runtime probe that creates a real message-only
window, calls `AlarmClient.RegisterConsumer(hWnd, …)` +
`Subscribe(@"\Galaxy!", …)`, and pumps for 20s while logging every
window message that arrives. Run results are below; they turned the
"WM_APP pump" design assumption upside down.

**`RegisterConsumer` and `Subscribe` both returned 0 (success).** The
calls are valid against the deployed assembly; no parameter pinning
needed.

**A registered-message-class WM (ID `0xC275` in this OS session)
fired every ~1s after `Subscribe` completed.** Constant
`wParam = 0x00001100`, constant `lParam = 0x079E46D8` (looks like a
stable pointer into AVEVA-internal state) for all 20 hits. The
constant payload across hits with no Galaxy alarm being fired
suggests this is a **heartbeat/keepalive**, not a per-change
notification.

**Critically: this WM is delivered to AVEVA's own internal window
(`hwnd=0x18032E`), NOT to the consumer's `hWnd` we passed in.** The
consumer window's `WndProc` received only the standard creation
sequence (`WM_GETMINMAXINFO`, `WM_NCCREATE`, `WM_NCCALCSIZE`,
`WM_CREATE`) and the destruction sequence (`WM_NCDESTROY`,
`WM_DESTROY`, `WM_NCCALCSIZE`), with nothing in between. AVEVA's
notification path runs entirely against AVEVA's internal window;
it never forwards to the user-supplied hWnd.

The message ID itself is dynamic (a `RegisterWindowMessage`
allocation in the >= 0xC000 range), so it cannot be hard-coded:
each consumer process must call `RegisterWindowMessage` with the
correct *string* and use whatever ID the OS returns.
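
To make the "registered-message-class" observation concrete, the sketch below classifies a window-message ID by the documented Win32 ranges. The function name `classify_message_id` is hypothetical and exists only for illustration; it does not call into Win32 and says nothing about which string AVEVA registers.

```python
# Window-message ID ranges as documented for Win32.
WM_USER = 0x0400         # start of the window-class-private range
WM_APP = 0x8000          # start of the application-private range
REGISTERED_MIN = 0xC000  # start of the RegisterWindowMessage range
REGISTERED_MAX = 0xFFFF


def classify_message_id(msg: int) -> str:
    """Return which documented Win32 range a window-message ID falls in.

    Hypothetical helper for illustration only.
    """
    if msg < 0 or msg > REGISTERED_MAX:
        raise ValueError(f"not a 16-bit message ID: {msg:#x}")
    if msg >= REGISTERED_MIN:
        return "registered"  # allocated per session, cannot be hard-coded
    if msg >= WM_APP:
        return "app"         # WM_APP..0xBFFF
    if msg >= WM_USER:
        return "user"        # WM_USER..0x7FFF
    return "system"          # below WM_USER, reserved by the OS


# 0xC275 is the ID observed in this probe session.
print(classify_message_id(0xC275))
```

An ID in the "registered" range is only meaningful within the session that allocated it, which is why the probe's `0xC275` should be treated as an observation rather than a constant.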

## What this means for A.2

The "WM_APP pump on the user hWnd" design (what the original plan
banner described and what the previous version of this doc
recommended) does not match how AVEVA actually delivers
notifications. The hWnd parameter to `RegisterConsumer` does not
appear to receive any of AVEVA's alarm traffic; it's likely used
only as a registration identity (and perhaps as a parent for modal
dialogs).

Two viable A.2 designs given the probe data:

1. **Polling.** Just call `GetStatistics` on a timer (e.g. every
   500ms in the worker's STA) and react to the change set it
   reports. No window plumbing needed. Trade-off: latency floor =
   poll period; modest CPU floor because the call is cheap. Matches
   the heartbeat-style WM 0xC275 semantics: AVEVA itself runs a
   poll loop internally.
2. **Hook AVEVA's internal window.** Discover AVEVA's own window
   (`hwnd=0x18032E` in the probe), `SetWindowsHookEx` or
   `SetWindowSubclass` on it, and intercept WM 0xC275 on AVEVA's
   thread. Higher fidelity, near-zero latency, but invasive,
   fragile across AVEVA upgrades, and requires running on the same
   process / thread as the AVEVA window. Probably a non-starter
   without further AVEVA documentation.

**Recommendation:** the polling path (option 1) is cheaper to
implement, more robust against AVEVA-internal change, and
acceptable for a typical alarm cadence. The worker's existing STA
already provides a thread-affinitized timer surface. The unanswered
question is whether `GetStatistics` can be safely called outside
AVEVA's own message-pump thread, which is confirmable by extending the
probe to fire `GetStatistics` on its own thread and check the
result.

## Alarm-provider visibility: third probe run, 2026-05-01

Extended the probe to call `AlarmClient.GetProviders` after
`RegisterConsumer`. Result on this rig:

```
GetProviders -> rc=0 count=0 list=[]
```

**Zero alarm providers visible to the consumer process.** This
explains every preceding probe run: no providers means no alarm
events, regardless of how many times any value (including a
bool with an `$Alarm` extension) flips. `Subscribe(@"\Galaxy!")`
returns 0 (success) but matches nothing because the alarm-manager
chain that provides the matching feed doesn't expose any provider
to this consumer.

A System Platform script flipping `TestMachine_001.TestAlarm001`
every 10s during this probe run produced no observable
`GetStatistics` transitions, no `positions[]` / `handles[]`
entries, and no change in any field. This confirms the silence is not
about subscription-scope / message-pump but about provider
absence.

### Possible causes

1. **No `$Alarm` extension on the test bool.** If
   `TestMachine_001.TestAlarm001` is a regular UDA without a
   `BoolAlarm` extension wired to it, flipping the value just
   writes a new value and no alarm fires.
2. **Alarm manager service not running.** AVEVA's `aaAlarmMgr`
   (or the equivalent on this rig's Platform version) needs to
   be running for providers to register.
3. **Process security context.** A consumer running under a
   normal user account may not see providers that registered
   under `LocalSystem` / a Platform service identity. The
   gateway-worker installation runs under a service account
   that may have access where `dotnet test` doesn't.

## InitializeConsumer required: fourth probe run, 2026-05-01

Adding `InitializeConsumer("AlarmProbe.Tests")` before
`RegisterConsumer` made `\Galaxy!` appear in `GetProviders`
(count=1, status 0 → 100 within 500ms). So #2 and #3 above are
NOT the cause: the consumer can see the alarm provider once it
calls Initialize. That's a missing API-call ordering, not a
permission or service issue.

```
InitializeConsumer -> 0
RegisterConsumer -> 0
GetProviders [after Register] -> rc=0 count=0 list=[]
Subscribe('\Galaxy!') -> 0
GetProviders [after Subscribe] -> rc=0 count=1 list=[ 0 \Galaxy!]
GetProviders [poll #1] -> rc=0 count=1 list=[100 \Galaxy!]
```

Despite the provider being visible at "100% query complete" for
the entire 60s window, `GetStatistics` continued to report
`total=0 active=0 codes=[7]`: no alarm transitions reached the
consumer even with a System Platform script flipping the test
boolean every 10s during the run.

That isolates the remaining unknown to whether the test bool's
alarm extension is actually generating MxAccess alarm-provider
events when its value flips. The probe has confirmed every link
in the consumer chain works (Initialize → Register → Subscribe →
provider visible at 100%); what's missing is alarm traffic from
the producer side. ObjectViewer or another live consumer running
alongside the script is the next discriminator: does it visibly
see the alarm fire?

API-ordering finding: `InitializeConsumer` MUST precede
`RegisterConsumer` (or at least, must be called before
`GetProviders` returns anything). PR A.5's `AlarmClientConsumer`
omits `InitializeConsumer` entirely. That's a bug fix to apply
even before A.2 lands, since without it the provider chain never
becomes visible.
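
One way to keep the ordering finding from regressing is a thin guard that refuses out-of-order calls. The sketch below is illustrative only: `OrderedAlarmConsumer` and `FakeClient` are hypothetical names, the fake stands in for the real client so the example runs anywhere, and the guard is deliberately stricter than what the probe proved (the probe only showed that providers stay invisible without Initialize).

```python
class OrderedAlarmConsumer:
    """Hypothetical guard enforcing Initialize -> Register -> Subscribe."""

    def __init__(self, client):
        self._client = client  # any object exposing the three calls
        self._stage = 0        # 0 = new, 1 = initialized, 2 = registered

    def initialize(self, app_name):
        rc = self._client.InitializeConsumer(app_name)
        if rc == 0:
            self._stage = max(self._stage, 1)
        return rc

    def register(self, hwnd, product, app_name, version, retain_hidden):
        if self._stage < 1:
            raise RuntimeError(
                "InitializeConsumer must precede RegisterConsumer")
        rc = self._client.RegisterConsumer(
            hwnd, product, app_name, version, retain_hidden)
        if rc == 0:
            self._stage = 2
        return rc

    def subscribe(self, *args):
        if self._stage < 2:
            raise RuntimeError("RegisterConsumer must precede Subscribe")
        return self._client.Subscribe(*args)


class FakeClient:
    """Stand-in for the real client; every call reports success (rc=0)."""

    def InitializeConsumer(self, app_name):
        return 0

    def RegisterConsumer(self, *args):
        return 0

    def Subscribe(self, *args):
        return 0


consumer = OrderedAlarmConsumer(FakeClient())
consumer.initialize("AlarmProbe.Tests")
consumer.register(0, "Product", "AlarmProbe.Tests", "1.0", False)
print(consumer.subscribe(r"\Galaxy!"))
```

A guard like this turns a silent "no providers" symptom into an immediate exception at the call site, which is easier to diagnose than an empty `GetProviders` list.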

## Subscribe-parameter sweep: fifth probe run, 2026-05-01

Even with `InitializeConsumer` + provider visible at status 100,
no alarm transitions arrived during a 60s window with the user's
script flipping the test bool every 10s. Tried:

- `qtSummary` and `qtHistory` (the only `eQueryType` values).
- Priority 1..999 and 0..32767.
- `eAlarmFilterState.asNone` and `asAlarmActiveNow` for both
  `FilterMask` and `FilterSpecification`.

`eAlarmFilterState` is single-state-valued (asNone=0,
asAlarmActiveNow=1, asAlarmAcked=2, asShelved=3), not flag bits.
None of these knobs surfaced any alarm activity.

User confirmation 2026-05-01: the test bool does have a
`BoolAlarm` extension on it; in `aaObjectViewer` the
`$Alarm.InAlarm` sub-attribute flips true/false in lockstep with
the script's writes. So the alarm extension is **evaluating**
its condition, just not visibly producing transitions on the
`aaAlarmManagedClient` consumer stream.

## Multi-channel + multi-subscription probe: sixth run, 2026-05-01

Extended the probe to try every consumer-side approach in
parallel:

- **Subscription expressions** (sequential): `\Galaxy!`,
  `\Galaxy!*`, `\\Galaxy!`, `\Galaxy!TestArea`, `\\.\Galaxy!`.
  All Subscribe calls returned rc=0; the last one
  (`\\.\Galaxy!`) is reflected in `GetProviders` (count=1).
- **Read channels** polled at 500ms cadence: `GetStatistics`,
  `GetHighPriAlarm`, `SFCreateSnapshot` + `SFGetStatistics`.
- **Filter+sort**: priority 0..32767, `qtSummary`,
  state=`asAlarmActiveNow`, sort=`sfReturnNewestFirst`.
- **AlarmRecord init** (worked around `Not a valid Win32
  FileTime` exception): all DateTime fields pre-set to FILETIME
  epoch (1601-01-01 UTC) before the call, since
  `default(DateTime)` is outside FILETIME range and trips the
  interop marshaler.

Result of the 60s run with `TestMachine_001.TestAlarm001` being
flipped every 10s:

```
Subscribe('\Galaxy!') -> 0
Subscribe('\Galaxy!*') -> 0
Subscribe('\\Galaxy!') -> 0
Subscribe('\Galaxy!TestArea') -> 0
Subscribe('\\.\Galaxy!') -> 0
GetProviders [after Subscribe-multi] -> count=1 list=[ 0 \\.\Galaxy!]
GetStatistics #1: total=0 active=0 changes=1 codes=[7] positions=[] handles=[]
GetHighPriAlarm #1: rc=0 { }
SF channel #1: SFCreate=0 numAlarms=0 SFStats=0 unackRet=0 unackAlm=0 ackAlm=0 others=0 events=0 idxNewest=-1
```

**No further "(changed)" entries for the entire 60s window.**
Every read API returned the same empty result on every poll.

User confirms the alarm IS firing: `aaObjectViewer` sees
`$Alarm.InAlarm` flip in lockstep with the script. Historian
records exist (per user; needs verification by querying the
historian directly).

## Conclusion of consumer-side probing

`aaAlarmManagedClient.AlarmClient` is **not** the receive
surface AVEVA's alarm pipeline routes to in this Galaxy
configuration. The consumer chain is verified end-to-end:

- `InitializeConsumer` + `RegisterConsumer` + `Subscribe` all
  succeed (rc=0).
- `GetProviders` finds `\Galaxy!` once Initialize is called.
- All read APIs (`GetStatistics`, `GetHighPriAlarm`,
  `SFCreateSnapshot`/`SFGetStatistics`) return empty even with
  every documented filter combination.
- The consumer's hWnd receives zero AVEVA messages between
  `WM_CREATE` and `WM_DESTROY`; AVEVA's traffic goes to its own
  internal hwnd.

The next investigation directions are not consumer-side:

1. **Inspect `aaObjectViewer`'s alarm SDK** to see what library
   it uses to read alarms. If different from
   `aaAlarmManagedClient`, switch the worker over.
2. **Query the historian directly** (`aahEventStorage` /
   `aahEventSvc`) to confirm alarms are recorded, and use the
   same path for v2 alarm capture.
3. **Inspect AVEVA's alarm-routing config** for this Galaxy in
   System Platform IDE: area assignments, alarm provider
   bindings, "publish alarm events to" settings on the platform.

For A.2 implementation: the `aaAlarmManagedClient` path the
gateway-worker is currently architected around may be a
dead-end on customer Galaxies configured this way. If the
alarms truly only flow through the historian event-storage path,
A.2 needs to consume from `aahEventStorage` instead, a
fundamental architecture pivot.

## BREAKTHROUGH: seventh probe run, 2026-05-01

Two changes finally produced a signal:

1. **Subscription scope:** `\\<MachineName>\Galaxy!<TopArea>` is the
   canonical AlarmClient subscription format (per ArchestrA Alarm
   Client docs at `archestra6.rssing.com/chan-12008125/article13.html`):
   `\\Node\Provider!Area!Filter`, where Node is the *machine* name,
   Provider is **literally `Galaxy`**, and Area is a hosted area
   object. For this rig (`\\DESKTOP-6JL3KKO\Galaxy!DEV`) the DEV
   area (the platform's primary area) is the right scope. Earlier
   `\Galaxy!`, `\Galaxy!TestArea`, `\\.\Galaxy!`, etc., all returned
   rc=0 but matched no traffic; they were not the canonical form.
2. **`InitializeConsumer` before `RegisterConsumer`:** already
   discovered earlier; bug-fix for PR A.5's `AlarmClientConsumer`.
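
Because every non-canonical expression still returned rc=0, it may help to build the subscription string in one place rather than by hand. A minimal sketch, assuming only the `\\Node\Provider!Area` shape described above; `build_subscription` is a hypothetical helper and the optional `!Filter` segment is not modelled:

```python
def build_subscription(node: str, area: str, provider: str = "Galaxy") -> str:
    r"""Build a \\Node\Provider!Area subscription expression.

    Hypothetical helper; rejects parts that would change the shape
    of the expression (empty strings, backslashes, or '!').
    """
    for name, part in (("node", node), ("area", area), ("provider", provider)):
        if not part:
            raise ValueError(f"{name} must not be empty")
        if "\\" in part or "!" in part:
            raise ValueError(f"{name} must not contain '\\' or '!': {part!r}")
    # Two leading backslashes, then node, one backslash, provider!area.
    return "\\\\" + node + "\\" + provider + "!" + area


# The scope that produced traffic on this rig.
print(build_subscription("DESKTOP-6JL3KKO", "DEV"))
```

Validating the parts up front matters here precisely because `Subscribe` does not report a malformed scope as an error.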

With both in place, `GetHighPriAlarm` returned a record on every
poll for 60s straight (117/117 calls), but threw
`ArgumentOutOfRangeException: Not a valid Win32 FileTime` instead
of returning successfully. The AlarmRecord struct contains five
DateTime fields (`ar_Time`, `ar_OrigTime`, `ar_AckTime`,
`ar_RtnTime`, `ar_SubTime`) and AVEVA writes sentinel/invalid
FILETIME values for unset ones (e.g., `ar_AckTime` for an
unacknowledged alarm). The .NET interop that AVEVA ships
(`aaAlarmManagedClient.dll`) auto-converts FILETIME→DateTime and
rejects out-of-range values.
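
The failure mode can be illustrated with a tolerant conversion that maps out-of-range values to "unset" instead of throwing. This is a sketch in Python, not the wrapper's actual logic: `filetime_to_utc` is a hypothetical name, the exact sentinel AVEVA writes was not captured, and the upper bound is assumed to mirror the range .NET accepts for FILETIME conversion.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# Assumed upper bound: the last tick of year 9999, expressed as FILETIME.
MAX_FILETIME = 2_650_467_743_999_999_999


def filetime_to_utc(ticks: int) -> Optional[datetime]:
    """Convert FILETIME ticks to a UTC datetime, or None if out of range.

    Returning None models an "unset" timestamp (for example an ack time
    on an unacknowledged alarm) instead of raising.
    """
    if ticks < 0 or ticks > MAX_FILETIME:
        return None
    # Python datetimes have microsecond resolution, so drop the last digit.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


print(filetime_to_utc(0))    # the FILETIME epoch itself
print(filetime_to_utc(-1))   # out of range, treated as unset
```

The same idea explains the earlier workaround of pre-setting DateTime fields to the FILETIME epoch: a value before 1601 has no FILETIME representation at all.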

`GetStatistics` continues to report `total=0 active=0` even with
GetHighPriAlarm returning records; those two API surfaces have
genuinely different views in AVEVA's data model.

So: **alarms flow through `aaAlarmManagedClient.AlarmClient` once
the subscription expression is canonical**. The blocking issue is
extracting the payload past the .NET interop's DateTime
auto-marshaling.

## Remaining work to capture alarm payloads

Define a custom COM interop that uses `long` (FILETIME-as-int64)
instead of `DateTime` for the timestamp fields. Approach options:

1. **Patch the AVEVA-shipped `aaAlarmManagedClient.dll`:** ildasm
   the assembly, replace `DateTime` with `long` on AlarmRecord's
   timestamp fields, ilasm back. Brittle across AVEVA upgrades.
2. **Write our own `[ComImport]` interface:** declare
   `IRawAlarmConsumer` ourselves with safe-blittable types,
   discover the underlying COM IID (via reflection on
   `AlarmClient`'s `[Guid]` attribute), and
   `(IRawAlarmConsumer) alarmClient` cast. Cleaner; requires the IID.
3. **Use `IDispatch` late binding:** dispatch-Invoke bypasses
   strong-typed marshaling. Verbose but doesn't need IIDs.

For PR A.2's worker integration, option 2 is the least
disruptive. Once the interop is custom, `AlarmClient.Subscribe` +
`GetHighPriAlarm` + `GetAlarmExtendedRec` form a viable
polling-style alarm consumer.

**REVISED 2026-05-01: option 2 not directly applicable.**
Reflection on `aaAlarmManagedClient.AlarmClient` shows it
implements only `IDisposable` (no `[ComImport]` interface, no
class GUID). It has a single field `CwwAlarmConsumer*
m_almUnmanaged`, meaning `AlarmClient` is a **C++/CLI managed
wrapper around a native C++ class**, NOT a COM-interop class.
The DateTime conversion happens inside the AVEVA wrapper's IL,
not at a .NET-to-COM marshaling boundary. There is no separate
COM interface IID we can QI to, so the `[ComImport]` approach
recommended above has nothing to bind against.

Revised approach options:

A. **Switch to `wnwrapConsumer.dll`:** a separate standalone
COM library AVEVA ships at
`C:\Program Files (x86)\Common Files\ArchestrA\wnwrapConsumer.dll`
exposing `WNWRAPCONSUMERLib.wwAlarmConsumerClass` with
`SetXmlAlarmQuery` / `GetXmlCurrentAlarms`. XML-string output
bypasses FILETIME marshaling entirely.
B. **Patch `aaAlarmManagedClient.dll` IL:** wrap the unsafe
`DateTime.FromFileTime` calls with a safe variant. Direct
fix but modifies a vendor binary.
C. **Reflect into `m_almUnmanaged` and call native vtable:**
get the IntPtr, walk the MSVC C++ vtable, call
`__thiscall` methods via `Marshal.GetDelegateForFunctionPointer`.
Doable but requires reverse-engineering the C++ class layout.

Option A is the best fit: real COM-based, self-contained in
our code, conventional production-grade approach (the WIN-911
consumer pattern referenced in AVEVA support forums uses it).

The polling-vs-WM_APP-callback question from earlier is now
moot: `GetStatistics`'s `positions[]/handles[]` arrays remained
empty even when alarms were demonstrably present. The active
read API for current alarms is `GetHighPriAlarm`, not
`GetStatistics`'s change array.

### Implications for A.2 implementation

The A.2 PR's value is unmeasurable until at least one alarm
provider is visible. The choice between polling-via-`GetStatistics`
and the callback path can only be decided by observing what
populates first when a real alarm fires. Without a provider,
both paths return the same "nothing happening" answer.

Until that's resolved, A.2 implementation work is genuinely
blocked on a dev-rig configuration issue, not on architectural
choice or code structure.

## GetStatistics polling: second probe run, 2026-05-01

Extended the probe to call `GetStatistics` every ~2s alongside the
WM logger. Key findings:

- **`GetStatistics` is safely callable from the same thread that
  did `RegisterConsumer` + `Subscribe`.** Every poll returned rc=0
  with no exceptions over 9 polls / 20s window.
- **The deployed Galaxy currently has zero active alarms.** Every
  poll reported `total=0 active=0 suppressed=0 newAlarms=0`. The
  `positions[]` and `handles[]` arrays were empty.
- **`changes=1 codes=[7]` was constant across all polls**, matching
  the constant 1 Hz WM 0xC275 cadence. Code 7 is consistent with a
  "heartbeat / subscription healthy" sentinel, with the same
  semantics as the WM but reported through the pull-side API.
- `percent=100` (query-complete percentage) was constant; the
  subscription is steady-state.

This confirms the polling design (option 1 in the previous section)
is mechanically viable. The remaining open question is whether
`GetStatistics` populates `positions[] / handles[]` with real
entries when an alarm transition actually fires; proving that
requires firing an alarm.

## Open follow-up probes

Each can be added to `AlarmClientWmProbeTests` as a separate
Skip-gated test:

1. **Fire a real Galaxy alarm during the pump window.** The cleanest
   programmatic trigger is an MxAccess write that flips a
   `$Alarm`-extended boolean to true (alarm in) and back to false
   (alarm out). Pinning the exact tag reference is pending; it needs
   either a documented test-fixture tag or an interactive selection
   in System Platform IDE. Once the trigger fires, this resolves
   whether AVEVA's pulled change set arrives via `GetStatistics`
   `positions[] / handles[]` (per-change polling works) or only via
   the AVEVA-internal window (callback path needed).
2. **Hook AVEVA's internal window** to log what WMs it actually
   processes; only relevant if probe 1 shows `GetStatistics` does
   NOT report per-change activity.
3. **Decompile `aaAlarmManagedClient.dll`'s IL** for the
   `RegisterConsumer` method to find what `RegisterWindowMessage`
   string is used and whether there's a callback-registration
   surface on `WNAL_Register` that the managed client wraps. The
   alarmlst.dll strings (`WNAL_CallBack`, "Invalid callbacks" error)
   suggest the underlying C API takes callbacks, but the managed
   wrapper exposes none of them.

PR A.5's `Subscribe` / `AcknowledgeByGuid` / `SnapshotActiveAlarms`
are correct: they're pull-style and don't depend on the
notification mechanism.

## Option A: captured, 2026-05-01

`wnwrapConsumer.dll`
(`C:\Program Files (x86)\Common Files\ArchestrA\wnwrapConsumer.dll`)
hosts the standalone COM class
`WNWRAPCONSUMERLib.wwAlarmConsumerClass`. Type library imports
cleanly via `tlbimp` (output stored under
`mxaccessgw/lib/Interop.WNWRAPCONSUMERLib.dll`). The COM class is
registered in
`HKLM:\SOFTWARE\WOW6432Node\Classes\CLSID\{7AB52E5F-36B2-4A30-AE46-952A746F667C}`
with `ThreadingModel=Apartment`; `new wwAlarmConsumerClass()`
succeeds via `CoCreateInstance`.

The probe `MxGateway.Worker.Tests/WnWrapConsumerProbeTests.cs`
(Skip-gated, archival) drove the captured run. Lifecycle:

1. `new wwAlarmConsumerClass()`: instantiated.
2. `InitializeConsumer("MxGatewayProbe.WnWrap")` -> 0.
3. `RegisterConsumer(hWnd: 0, productName, applicationName,
   version)` -> 0. **Note:** wnwrap's `RegisterConsumer` is
   4-arg (no `bRetainHiddenAlarms`); `aaAlarmManagedClient`'s
   is 5-arg. Different surface.
4. `Subscribe(@"\\<machine>\Galaxy!DEV", priLow=1, priHigh=999,
   qtSummary, sfReturnNewestFirst, asAlarmActiveNow,
   asAlarmActiveNow)` -> 0. Same canonical scope that worked
   for `aaAlarmManagedClient`.
5. `SetXmlAlarmQuery(...)` was called too but the round-trip
   `GetXmlAlarmQuery` returned a mangled echo (NODE became
   `DESKTOP-6JL3KKO\Galaxy!DEV`, PROVIDER became `Galaxy!DEV`,
   ALARM_STATE shortened to `All`, DISPLAY_MODE truncated to
   `Sum`). The XML-query path looks broken in this build; rely
   on `Subscribe` for the filter and skip `SetXmlAlarmQuery` in
   production. Confirming "Subscribe alone is sufficient" is
   one follow-up probe (call `Subscribe` and read XML, no
   `SetXmlAlarmQuery`); it was out of scope for the breakthrough
   run but is easy to verify.

### Captured XML (60 polls over 30s, 500ms cadence)

`GetXmlCurrentAlarms2(maxAlmCnt: 100, out vartCurrentXmlAlarms)`
returned BSTR XML cleanly on every call: 60/60 ok, zero throws.
`GetXmlCurrentAlarms` (the v1 method) returned identical content
on the same cadence; either method is viable.

Empty state:

```xml
<?xml version="1.0"?><ALARM_RECORDS COUNT="0"></ALARM_RECORDS>
```

With alarm active (`UNACK_ALM`, value=true after the flip
script set the bool true):

```xml
<?xml version="1.0"?>
<ALARM_RECORDS COUNT="1">
  <ALARM>
    <GUID>BCC4705395424D65BDAABCDEA6A32A73</GUID>
    <DATE>2026/5/1</DATE>
    <TIME>13:26:14.709</TIME>
    <GMTOFFSET>240</GMTOFFSET>
    <DSTADJUST>0</DSTADJUST>
    <PROVIDER_NODE>DESKTOP-6JL3KKO</PROVIDER_NODE>
    <PROVIDER_NAME>Galaxy</PROVIDER_NAME>
    <GROUP>TestArea</GROUP>
    <TAGNAME>TestMachine_001.TestAlarm001</TAGNAME>
    <TYPE>DSC</TYPE>
    <VALUE>true</VALUE>
    <LIMIT>true</LIMIT>
    <PRIORITY>500</PRIORITY>
    <STATE>UNACK_ALM</STATE>
    <OPERATOR_NODE></OPERATOR_NODE>
    <OPERATOR_NAME></OPERATOR_NAME>
    <ALARM_COMMENT>Test alarm #1</ALARM_COMMENT>
  </ALARM>
</ALARM_RECORDS>
```

After the script set the bool false (`UNACK_RTN`, value=false):

```xml
<?xml version="1.0"?>
<ALARM_RECORDS COUNT="1">
  <ALARM>
    <GUID>BCC4705395424D65BDAABCDEA6A32A73</GUID>
    <DATE>2026/5/1</DATE>
    <TIME>13:26:24.710</TIME>
    ...
    <VALUE>false</VALUE>
    <STATE>UNACK_RTN</STATE>
    ...
  </ALARM>
</ALARM_RECORDS>
```
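
Parsing these payloads needs nothing beyond a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` to turn one payload into a dictionary keyed by `GUID`; `parse_alarm_records` is a hypothetical name, and the sample string is an abbreviated copy of the captured record above.

```python
import xml.etree.ElementTree as ET


def parse_alarm_records(xml_text: str) -> dict:
    """Parse an ALARM_RECORDS payload into {GUID: {field: text}}.

    Hypothetical helper; empty elements become empty strings.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "ALARM_RECORDS":
        raise ValueError(f"unexpected root element: {root.tag}")
    records = {}
    for alarm in root.findall("ALARM"):
        fields = {child.tag: (child.text or "") for child in alarm}
        records[fields["GUID"]] = fields
    return records


# Abbreviated copy of the captured UNACK_ALM record.
SAMPLE = (
    '<?xml version="1.0"?>'
    '<ALARM_RECORDS COUNT="1"><ALARM>'
    "<GUID>BCC4705395424D65BDAABCDEA6A32A73</GUID>"
    "<TAGNAME>TestMachine_001.TestAlarm001</TAGNAME>"
    "<PRIORITY>500</PRIORITY>"
    "<STATE>UNACK_ALM</STATE>"
    "<OPERATOR_NODE></OPERATOR_NODE>"
    "</ALARM></ALARM_RECORDS>"
)

for guid, fields in parse_alarm_records(SAMPLE).items():
    print(guid, fields["TAGNAME"], fields["STATE"])
```

Keying by `GUID` is a design choice that anticipates comparing one poll against the next; a plain list would work equally well for one-off inspection.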

The 10s cadence between transitions matches the System Platform
script's flip frequency exactly. **GUID is stable across the
in→out cycle** (`BCC4705…` carried through both states), so the
XML stream represents the alarm record's lifecycle, not separate
event records: this is "current alarms snapshot," not
"transition stream." For an OPC UA `AlarmConditionService`
adapter this is fine: condition-state changes per-snapshot is
the supported model.
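
Since the feed is a snapshot rather than a transition stream, transitions have to be derived by comparing consecutive polls. A minimal sketch of that comparison, assuming each snapshot has already been parsed into a dictionary keyed by `GUID`; `diff_snapshots` is a hypothetical name:

```python
def diff_snapshots(previous: dict, current: dict):
    """Compare two {GUID: fields} snapshots.

    Returns (added, changed, removed) lists of GUIDs. A record counts
    as changed when any of its fields differ between the two polls.
    """
    added = [g for g in current if g not in previous]
    removed = [g for g in previous if g not in current]
    changed = [
        g for g in current
        if g in previous and current[g] != previous[g]
    ]
    return added, changed, removed


GUID = "BCC4705395424D65BDAABCDEA6A32A73"
poll_1 = {}                                               # empty state
poll_2 = {GUID: {"STATE": "UNACK_ALM", "VALUE": "true"}}  # alarm in
poll_3 = {GUID: {"STATE": "UNACK_RTN", "VALUE": "false"}} # alarm out

print(diff_snapshots(poll_1, poll_2))
print(diff_snapshots(poll_2, poll_3))
```

One consequence of this model is that a transition which appears and reverts entirely between two polls is not observable, so the poll period bounds what the consumer can report.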

`STATE` enum values observed: `UNACK_RTN` (the alarm has
returned to normal but is unacknowledged, i.e., visible in the
"current alarms" list because operator hasn't acked it yet) and
`UNACK_ALM` (the alarm is currently active and unacknowledged).
The other states from `eAlmState` (`ACK_RTN`, `ACK_ALM`) would
appear when an ack is performed; `wwAlarmConsumerClass.AlarmAckByGUID`
is the method to call.
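
The four state names decompose into two independent booleans. The sketch below spells that out; `state_to_flags` is a hypothetical name, and only `UNACK_ALM` and `UNACK_RTN` were actually observed in the capture, so the two `ACK_*` rows follow from the naming rather than from probe output.

```python
# (in_alarm, acked) for each STATE value.
STATE_FLAGS = {
    "UNACK_ALM": (True, False),   # observed: active, not acknowledged
    "UNACK_RTN": (False, False),  # observed: returned, not acknowledged
    "ACK_ALM": (True, True),      # not observed in the capture
    "ACK_RTN": (False, True),     # not observed in the capture
}


def state_to_flags(state: str):
    """Map a STATE string to (in_alarm, acked); reject unknown states."""
    try:
        return STATE_FLAGS[state]
    except KeyError:
        raise ValueError(f"unknown alarm STATE: {state!r}") from None


print(state_to_flags("UNACK_ALM"))
print(state_to_flags("UNACK_RTN"))
```

Rejecting unknown strings rather than defaulting keeps a future, unlisted state from being silently reported as a normal condition.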

### `GetStatistics` AV: unrelated quirk

Every `GetStatistics` call threw `AccessViolationException` in
the probe. Cause: the wnwrap interop signature uses `IntPtr` for
the three array out-parameters (`pChangeCode`, `pChangePos`,
`phAlarm`); passing `IntPtr.Zero` is wrong: the COM impl is
writing into the buffer pointer without null-checking.
Pre-allocate three int-arrays and pass pinned pointers (or use
`Marshal.AllocCoTaskMem`) to fix. Not required for the
production path: the XML methods give us everything we need.
|
||||
|
||||
### Implications for PR A.2 worker integration
|
||||
|
||||
Replacing `aaAlarmManagedClient.AlarmClient` with
|
||||
`WNWRAPCONSUMERLib.wwAlarmConsumerClass` in the worker's
|
||||
alarm-consumer surface unblocks A.2 fully. Outline:
|
||||
|
||||
1. **Reference path:** drop `aaAlarmManagedClient.dll` reference
|
||||
from `MxGateway.Worker.csproj`; add `Interop.WNWRAPCONSUMERLib.dll`
|
||||
reference from `mxaccessgw/lib/`. (Or commit the interop dll
|
||||
in-tree under `lib/` and reference relatively.)
|
||||
2. **`AlarmClientConsumer` → `WnWrapAlarmConsumer`:** rewrite
|
||||
the consumer wrapper to:
|
||||
- `new wwAlarmConsumerClass()` on the worker's STA thread.
|
||||
- `InitializeConsumer(applicationName)` then
|
||||
`RegisterConsumer(hWnd: 0, …)`.
|
||||
- `Subscribe(@"\\<node>\Galaxy!<area>", …)` per configured
|
||||
area. The `<node>` and `<area>` are configurable (default
|
||||
`Environment.MachineName` + the platform's primary area).
|
||||
- Poll `GetXmlCurrentAlarms2(maxAlmCnt, out xml)` on a
|
||||
timer (500ms-1s cadence is comfortable). Parse XML
|
||||
payload; diff against the previous snapshot (keyed by
|
||||
`GUID`); emit `MX_EVENT_FAMILY_ON_ALARM_TRANSITION`
|
||||
events for added/changed/removed records.
|
||||
- `AlarmAckByGUID(VBGUID, comment, oprName, node, domain,
|
||||
fullName)` for client-driven acknowledgements (matches
|
||||
PR A.5's `AlarmAckCommand` payload).
|
||||
- Lifecycle teardown: `DeregisterConsumer` +
|
||||
`UninitializeConsumer` + `Marshal.FinalReleaseComObject`.
|
||||
3. **Conversion layer:** map XML record fields to
|
||||
`MxAlarmConditionRecord` proto:
|
||||
- `GUID` → `condition_id` (canonicalize the no-dashes hex
|
||||
to a UUID string).
|
||||
- `STATE` enum → `in_alarm` + `acked` booleans
|
||||
(`UNACK_ALM` → in_alarm=true, acked=false;
|
||||
`UNACK_RTN` → in_alarm=false, acked=false;
|
||||
`ACK_ALM` → in_alarm=true, acked=true;
|
||||
`ACK_RTN` → in_alarm=false, acked=true).
|
||||
- `DATE + TIME + GMTOFFSET + DSTADJUST` → reassemble UTC
|
||||
timestamp; matches the worker's existing `Timestamp`
|
||||
wire format.
|
||||
- `PRIORITY` → severity (already 1-1000-ish range).
|
||||
- `TAGNAME` → reference; `PROVIDER_NAME` + `GROUP` for
|
||||
scope metadata.
|
||||
4. **PR A.5 fix carry-over:** `InitializeConsumer` MUST be
|
||||
called before `RegisterConsumer` (rediscovered with
|
||||
`aaAlarmManagedClient`, also true here). The existing
|
||||
`AlarmClientConsumer` skips Initialize entirely; the new
|
||||
`WnWrapAlarmConsumer` includes it from day one.
|
||||
5. **Test reuse:** PR A.5's snapshot/ack contract tests can
|
||||
stay — they don't touch the underlying COM API. Add a new
|
||||
integration test against the wnwrap surface (live-AVEVA-only,
|
||||
Skip-gated like the probe).
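
The conversion rules in step 3 can be sketched as below. This is a minimal, language-neutral illustration in Python, not the worker's C# code; the names `STATE_FLAGS`, `state_to_flags`, and `canonicalize_guid` are hypothetical. Only the four-state mapping and the "no-dashes hex to UUID string" rule come from the notes above.

```python
import uuid

# STATE -> (in_alarm, acked), exactly as listed in the conversion-layer notes.
STATE_FLAGS = {
    "UNACK_ALM": (True, False),
    "UNACK_RTN": (False, False),
    "ACK_ALM": (True, True),
    "ACK_RTN": (False, True),
}


def state_to_flags(state: str) -> tuple[bool, bool]:
    """Return (in_alarm, acked) for a STATE value, rejecting unknown states."""
    try:
        return STATE_FLAGS[state]
    except KeyError:
        raise ValueError(f"unknown alarm STATE: {state!r}") from None


def canonicalize_guid(raw: str) -> str:
    """Turn a 32-digit no-dashes hex GUID into a dashed, lowercase UUID string."""
    return str(uuid.UUID(hex=raw.strip()))
```

Rejecting unknown `STATE` values instead of defaulting keeps a new AVEVA state from being silently reported as "not in alarm". Timestamp reassembly is left out because the `DATE` and `TIME` field formats are not recorded here.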
|
||||
|
||||
### Settled API-ordering and surface knowledge
|
||||
|
||||
- `InitializeConsumer` first, then `RegisterConsumer` — both
|
||||
on `aaAlarmManagedClient.AlarmClient` and
|
||||
`wwAlarmConsumerClass`.
|
||||
- `RegisterConsumer` arity differs:
|
||||
`aaAlarmManagedClient.AlarmClient.RegisterConsumer(hWnd,
|
||||
product, app, version, bRetainHiddenAlarms)` — 5 args;
|
||||
`wwAlarmConsumerClass.RegisterConsumer(hWnd, product, app,
|
||||
version)` — 4 args. The wnwrap class has no
|
||||
`bRetainHiddenAlarms` parameter at all.
|
||||
- Subscription expression format: `\\<machine>\Galaxy!<area>`
|
||||
(literal `Galaxy` provider) for both libraries.
|
||||
- Native ack: `AlarmAckByGUID(VBGUID guid, comment, oprName,
|
||||
node, domain, fullName)` on the v2 surface; ID 5-arg
|
||||
variant on the legacy `IwwAlarmConsumer` interface.
|
||||
|
||||
These findings retire the open follow-up probes from the
|
||||
"polling-vs-pump" debate above — `wwAlarmConsumerClass` plus
|
||||
poll-on-timer is the implementation.
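
The poll-and-diff step can be illustrated with a small sketch. It assumes each parsed snapshot is a dictionary keyed by the record's `GUID`, with each record being a plain dictionary of fields; the function name `diff_snapshots` and the record shape are hypothetical, and the real consumer is C#, not Python.

```python
def diff_snapshots(previous: dict, current: dict):
    """Compare two GUID-keyed snapshots.

    Returns (added, changed, removed) lists of records. A record counts as
    changed when its GUID is in both snapshots but its fields differ.
    """
    added = [rec for guid, rec in current.items() if guid not in previous]
    changed = [
        rec
        for guid, rec in current.items()
        if guid in previous and previous[guid] != rec
    ]
    removed = [rec for guid, rec in previous.items() if guid not in current]
    return added, changed, removed
```

Each non-empty list would map to transition events on the wire; an unchanged record produces nothing, which is what keeps a steady-state poll quiet.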
|
||||
|
||||
## Live smoke-test discoveries — 2026-05-01
|
||||
|
||||
The Skip-gated `AlarmsLiveSmokeTests.Alarms_full_pipeline_round_trip`
|
||||
ran the full
|
||||
`WnWrapAlarmConsumer` + `AlarmDispatcher` + `MxAccessAlarmEventSink`
|
||||
pipeline against the dev rig with the flip script running. End-to-end
|
||||
verified: 6 real transitions captured on the 10s cadence, ack-by-name
|
||||
returned rc=0, pipeline stayed healthy through 5 more transitions
|
||||
afterwards. Five production-relevant quirks surfaced; the first four
|
||||
are handled in the current code, and the fifth still needs a production fix:
|
||||
|
||||
### 1. `SetXmlAlarmQuery` is mandatory for reads despite the mangled echo
|
||||
|
||||
Without `SetXmlAlarmQuery`, the first `GetXmlCurrentAlarms2` call
|
||||
fails with `E_FAIL` (HRESULT `0x80004005`). The discovery doc above
|
||||
flagged the round-trip echo as mangled and recommended skipping the
|
||||
call — that recommendation is **wrong**. The echo *is* mangled (AVEVA
|
||||
parses NODE/PROVIDER/ALARM_STATE/DISPLAY_MODE incorrectly), but the
|
||||
call itself is required as some kind of subscription enabler. Even
|
||||
the Subscribe call setting the actual filter doesn't avoid the need
|
||||
for `SetXmlAlarmQuery`.
|
||||
|
||||
`WnWrapAlarmConsumer.ComposeXmlAlarmQuery(subscription)` decomposes
|
||||
the canonical `\\<machine>\Galaxy!<area>` form into the XML's
|
||||
NODE/PROVIDER/GROUP fields. Mangled or not, the call enables reads.
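
The decomposition itself is simple string splitting. The sketch below is a hypothetical Python illustration of that step only (the real `ComposeXmlAlarmQuery` is C#, and the surrounding XML layout is not reproduced here because it is not recorded in these notes); the function name `decompose_subscription` is an assumption.

```python
def decompose_subscription(expr: str) -> dict:
    r"""Split a `\\<machine>\Galaxy!<area>` expression into its three parts."""
    if not expr.startswith("\\\\"):
        raise ValueError(f"subscription must start with two backslashes: {expr!r}")
    # After the leading "\\", the node runs up to the next backslash.
    node, slash, rest = expr[2:].partition("\\")
    # The provider and the group (area) are separated by "!".
    provider, bang, group = rest.partition("!")
    if not (slash and bang and node and provider and group):
        raise ValueError(f"malformed subscription expression: {expr!r}")
    return {"NODE": node, "PROVIDER": provider, "GROUP": group}
```

For example, `decompose_subscription(r"\\RIG01\Galaxy!TestArea")` yields `RIG01`, `Galaxy`, and `TestArea` for the three fields; `RIG01` is a made-up machine name.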
|
||||
|
||||
### 2. Two consumers required: read-side vs. ack-side
|
||||
|
||||
`SetXmlAlarmQuery` enables reads but **breaks `AlarmAckByName` on
|
||||
the same consumer instance**. With `SetXmlAlarmQuery` applied, `AlarmAckByName`
|
||||
returns -55 even with valid name+provider+group+operator. Without
|
||||
`SetXmlAlarmQuery`, `AlarmAckByName` succeeds with rc=0.
|
||||
|
||||
The production consumer therefore provisions **two** wnwrap COM
|
||||
instances:
|
||||
- Primary consumer (`client`): runs full lifecycle including
|
||||
`SetXmlAlarmQuery` for `GetXmlCurrentAlarms2` polls.
|
||||
- Ack-only consumer (`ackClient`): runs Initialize → Register →
|
||||
Subscribe via the v1-prefixed methods, **no SetXmlAlarmQuery**.
|
||||
All `AcknowledgeByName` calls dispatch through this instance.
|
||||
|
||||
Both consumers subscribe to the same expression. Disposal cleans up
|
||||
both via a shared `ReleaseConsumerCom` helper.
|
||||
|
||||
### 3. `AlarmAckByName` v2 8-arg vs. v1 6-arg
|
||||
|
||||
`wwAlarmConsumerClass` exposes two `AlarmAckByName` overloads:
|
||||
- `IwwAlarmConsumer2` v2: 8 args (`name, provider, group, comment,
|
||||
oprName, node, domainName, oprFullName`).
|
||||
- `IwwAlarmConsumer` v1: 6 args (no domain, no full-name).
|
||||
|
||||
The v2 8-arg method returns -55 on this AVEVA build regardless of
|
||||
operator-identity inputs — looks like a stub. The v1 6-arg method
|
||||
works. Production `WnWrapAlarmConsumer.AcknowledgeByName` calls the
|
||||
6-arg overload and discards the proto's `domain` + `full_name` fields.
|
||||
The proto contract keeps the 8 fields for forward compatibility if
|
||||
AVEVA fixes the v2 method later.
|
||||
|
||||
### 4. `AlarmAckByGUID` is not implemented
|
||||
|
||||
The v2 `AlarmAckByGUID(VBGUID, …)` throws `NotImplementedException`
|
||||
(COM `E_NOTIMPL`) on `wwAlarmConsumerClass` against this AVEVA
|
||||
build. The reference→GUID lookup that we initially planned to wire
|
||||
through `AlarmAckByGUID` is therefore not viable on wnwrap; all acks
|
||||
must go through `AlarmAckByName`.
|
||||
|
||||
The proto `AcknowledgeAlarmCommand` (GUID-based) and the worker's
|
||||
`MxAccessCommandExecutor.ExecuteAcknowledgeAlarm` switch arm remain
|
||||
in the codebase for the forward-compat shape, but the gateway-side
|
||||
`WorkerAlarmRpcDispatcher.AcknowledgeAsync` now always routes through
|
||||
`AcknowledgeAlarmByName` when the public RPC supplies a recognizable
|
||||
`Provider!Group.Tag` reference.
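
A reference classifier along these lines could look like the sketch below. It is a hypothetical Python illustration, not the dispatcher's code: the function name `classify_reference` is made up, and splitting the group from the tag at the first `.` is an assumption inferred from references such as `Galaxy!TestArea.TestMachine_001.TestAlarm001`.

```python
import uuid


def classify_reference(ref: str):
    """Classify an alarm reference as a GUID, a by-name triple, or invalid."""
    try:
        # uuid.UUID also accepts braces and the no-dashes form; a stricter
        # "canonical only" check would be needed to reject those.
        return ("guid", str(uuid.UUID(ref)))
    except ValueError:
        pass
    provider, bang, rest = ref.partition("!")
    # Assumption: the group is everything up to the first dot after "!".
    group, dot, tag = rest.partition(".")
    if bang and dot and provider and group and tag:
        return ("by_name", (provider, group, tag))
    return ("invalid", None)
```

On this AVEVA build only the `by_name` result leads to a working native call, since `AlarmAckByGUID` is not implemented.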
|
||||
|
||||
**Command/reply payload reuse.** `MxCommand.payload` has a dedicated
|
||||
`acknowledge_alarm_by_name_command` field, but `MxCommandReply.payload`
|
||||
intentionally has **no** by-name-specific case. The by-name ack carries
|
||||
no outcome detail beyond the native return code, so the worker's
|
||||
`ExecuteAcknowledgeAlarmByName` sets the same `acknowledge_alarm`
|
||||
(`AcknowledgeAlarmReplyPayload`) reply case used by the GUID arm, with
|
||||
`native_status` = the `AlarmAckByName` return code (also echoed into the
|
||||
top-level `MxCommandReply.hresult`). Reply consumers must dispatch on
|
||||
`MxCommandReply.kind` (`MX_COMMAND_KIND_ACKNOWLEDGE_ALARM` vs.
|
||||
`MX_COMMAND_KIND_ACKNOWLEDGE_ALARM_BY_NAME`), not on the payload oneof
|
||||
case, to distinguish the two acks. `WorkerAlarmRpcDispatcher` reads only
|
||||
the top-level `hresult`/`protocol_status`, so it handles both arms
|
||||
without unpacking the payload.
|
||||
|
||||
**Worker `native_status` → public `AcknowledgeAlarmReply` mapping.** The
|
||||
worker carries the ack outcome as a single `int32`
|
||||
(`AcknowledgeAlarmReplyPayload.native_status`, the `AlarmAckByName` /
|
||||
`AlarmAckByGUID` return code; `0` = success), also mirrored into the
|
||||
worker `MxCommandReply.hresult`. The public `AcknowledgeAlarmReply` has
|
||||
two outcome-shaped fields, but only one is populated:
|
||||
|
||||
- `AcknowledgeAlarmReply.hresult` — `WorkerAlarmRpcDispatcher` copies the
|
||||
worker's `MxCommandReply.hresult` (the native return code) into this
|
||||
field. **This is the authoritative ack-outcome field**; `0` means the
|
||||
ack succeeded. It is absent only when the worker reply omitted the
|
||||
value, which is a protocol violation surfaced in `protocol_status`.
|
||||
- `AcknowledgeAlarmReply.status` (`MxStatusProxy`) — the worker by-name /
|
||||
by-GUID ack path produces only the `int32` return code, never a
|
||||
populated `MXSTATUS_PROXY` struct, so `WorkerAlarmRpcDispatcher` leaves
|
||||
this field **unset on every reply**. It is reserved for a future
|
||||
structured view of the ack outcome. Clients must not depend on it.
|
||||
|
||||
Client authors should therefore branch on `protocol_status` first (for
|
||||
transport/session-level failures) and then on `hresult` (`0` = ack
|
||||
accepted by MXAccess) — never on `status`.
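
The branching order can be sketched as follows. This is a hypothetical Python illustration: `protocol_ok` stands in for whatever success check the client applies to `protocol_status`, whose exact shape is not described here, and `ack_outcome` is a made-up name.

```python
from typing import Optional


def ack_outcome(protocol_ok: bool, hresult: Optional[int]) -> str:
    """Interpret an acknowledge reply: protocol_status first, then hresult."""
    if not protocol_ok:
        # Transport/session-level failure: hresult is not meaningful.
        return "protocol_failure"
    if hresult is None:
        # A missing hresult is a protocol violation per the notes above.
        return "protocol_failure"
    # hresult carries the native return code; 0 means the ack succeeded.
    return "acked" if hresult == 0 else f"native_failure({hresult})"
```

The `status` field never appears in the decision, matching the guidance that clients must not depend on it.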
|
||||
|
||||
### 5. STA / threading — production fix needed
|
||||
|
||||
The wnwrap COM is `ThreadingModel=Apartment`. The consumer's
|
||||
internal `Timer` fires on threadpool threads and would block forever
|
||||
on cross-apartment marshaling unless the host STA pumps Win32
|
||||
messages. The smoke test sidesteps this by setting
|
||||
`pollIntervalMilliseconds=0` (Timer disabled) and driving `PollOnce`
|
||||
manually from the test's STA. Production hosting will route polls
|
||||
through the worker's `StaRuntime` in a follow-up — the consumer's
|
||||
`PollOnce` is `public` and idempotent so the wire-up is mechanical.
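
The general shape of "route every poll through one owning thread" can be sketched as below. This is a language-neutral Python analogy, not the worker's `StaRuntime`: the class name `SingleThreadRunner` is hypothetical, and the sketch does not pump Win32 messages, which a real STA host must also do.

```python
import queue
import threading
from concurrent.futures import Future


class SingleThreadRunner:
    """Runs every submitted callable on one dedicated thread, in order."""

    def __init__(self) -> None:
        self._work: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while True:
            item = self._work.get()
            if item is None:  # shutdown sentinel
                return
            fn, fut = item
            try:
                fut.set_result(fn())
            except Exception as exc:  # hand the failure back to the caller
                fut.set_exception(exc)

    def submit(self, fn) -> Future:
        """Queue fn for the owning thread and return a Future for its result."""
        fut: Future = Future()
        self._work.put((fn, fut))
        return fut

    def shutdown(self) -> None:
        self._work.put(None)
        self._thread.join()
```

A timer callback would then call `runner.submit(consumer.poll_once)` rather than touching the consumer directly, so the apartment-bound object is only ever used from the thread that created it.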
|
||||
|
||||
### Capture summary
|
||||
|
||||
```
|
||||
Transition: kind=Clear ref='Galaxy!TestArea.TestMachine_001.TestAlarm001' …
|
||||
Transition: kind=Raise ref='Galaxy!TestArea.TestMachine_001.TestAlarm001' …
|
||||
SnapshotActiveAlarms count=1
|
||||
active: ref='Galaxy!TestArea.TestMachine_001.TestAlarm001' state=Active
|
||||
AcknowledgeByName(real identity) -> rc=0
|
||||
Post-ack transition: kind=Clear …
|
||||
+1: kind=Raise … (10s after ack)
|
||||
+2: kind=Clear … (20s)
|
||||
+3: kind=Raise … (30s)
|
||||
+4: kind=Clear … (40s)
|
||||
```
|
||||
|
||||
10s cadence held throughout; full proto fields populated correctly;
|
||||
ack registered server-side without errors.
|
||||
@@ -184,9 +184,7 @@ behavior.
|
||||
| `MxGateway:Alarms:ReconcileIntervalSeconds` | `30` | How often the monitor reconciles its in-process alarm cache against the worker's authoritative active-alarm snapshot, catching transitions the live poll-and-diff feed missed. Floored at 5 seconds. |
|
||||
|
||||
The alarm monitor is independent of client sessions: `AcknowledgeAlarm` and
|
||||
`StreamAlarms` are session-less RPCs served by the monitor. See
|
||||
[Alarm Client Discovery](./AlarmClientDiscovery.md) for the AVEVA consumer
|
||||
surface the monitor's worker session drives.
|
||||
`StreamAlarms` are session-less RPCs served by the monitor.
|
||||
|
||||
## Related Documentation
|
||||
|
||||
|
||||
+1
-1
@@ -88,7 +88,7 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim
|
||||
|
||||
### `AcknowledgeAlarm`
|
||||
|
||||
`AcknowledgeAlarm` is a unary, **session-less** RPC that acknowledges a single alarm. The handler validates `alarm_full_reference` inline (it does not run through `MxAccessGrpcRequestValidator`) and delegates to `IGatewayAlarmService.AcknowledgeAsync`. The always-on `GatewayAlarmMonitor` routes the ack over its own gateway-managed worker session — clients no longer open a session to acknowledge an alarm. A reference that parses as a canonical GUID forwards to `AcknowledgeAlarmCommand`; a `Provider!Group.Tag` reference forwards to `AcknowledgeAlarmByNameCommand`. The alarm contract and the central monitor are documented in [Alarm Client Discovery](./AlarmClientDiscovery.md).
|
||||
`AcknowledgeAlarm` is a unary, **session-less** RPC that acknowledges a single alarm. The handler validates `alarm_full_reference` inline (it does not run through `MxAccessGrpcRequestValidator`) and delegates to `IGatewayAlarmService.AcknowledgeAsync`. The always-on `GatewayAlarmMonitor` routes the ack over its own gateway-managed worker session — clients no longer open a session to acknowledge an alarm. A reference that parses as a canonical GUID forwards to `AcknowledgeAlarmCommand`; a `Provider!Group.Tag` reference forwards to `AcknowledgeAlarmByNameCommand`.
|
||||
|
||||
### `StreamAlarms`
|
||||
|
||||
|
||||