Files
mxaccessgw/docs/audit/fragments/09-alarms.md
T

405 lines
29 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Cluster 09 — Alarms
Audited doc: `docs/AlarmClientDiscovery.md`
Code base verified against:
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmDispatcher.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmCommandHandler.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmRecordTransitionMapper.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAlarmStateKind.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAlarmTransitionEvent.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IMxAccessAlarmConsumer.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IAlarmCommandHandler.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs` (alarm arms)
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs`
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/IGatewayAlarmService.cs`
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmsServiceCollectionExtensions.cs`
- `src/ZB.MOM.WW.MxGateway.Server/Configuration/AlarmsOptions.cs`
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`
---
DOC / LINES / 71-74 (comment about `AlarmClientConsumer.cs`)
CLAIM / The file `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmClientConsumer.cs` exists in the repo.
CLAIM_TYPE / path
VERDICT / wrong
EVIDENCE / `find /Users/dohertj2/Desktop/mxaccessgateway/src -name "AlarmClientConsumer*"` returns nothing. The file no longer exists; `WnWrapAlarmConsumer.cs` comments confirm it was replaced: `WnWrapAlarmConsumer.cs:18-19`.
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Update references to the obsolete `AlarmClientConsumer.cs` throughout the doc to note that the file was retired and replaced by `WnWrapAlarmConsumer.cs`.
---
DOC / LINES / 71-74
CLAIM / The architecture comment on `AlarmClientConsumer.cs` (PR A.5) describing `IAlarmMgrDataProvider` managed events is wrong against the deployed assembly — there is no managed event surface.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / The source file it critiques (`AlarmClientConsumer.cs`) no longer exists in the repo. The critique is historically accurate but refers to a file that was removed during the wnwrap migration. No live code contains `IAlarmMgrDataProvider`. `WnWrapAlarmConsumer.cs:1-575`.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / Note that the critique is a historical record; the offending file has been removed. The section remains valid as probe context but should clarify the current state.
---
DOC / LINES / 87-88
CLAIM / `AlarmClientConsumer.AlarmRecordReceived` has no production callers; `RaiseAlarmRecordReceived` is `internal` for tests and never invoked at runtime.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / Neither `AlarmRecordReceived` nor `RaiseAlarmRecordReceived` appear anywhere in the current source tree (`grep -rn "AlarmRecordReceived\|RaiseAlarmRecordReceived" src` — zero results outside tests or binaries). The entire `AlarmClientConsumer` class was removed; the observation is a dead historical probe note.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / Flag as historical only; the code path no longer exists.
---
DOC / LINES / 492
CLAIM / "PR A.5's `Subscribe` / `AcknowledgeByGuid` / `SnapshotActiveAlarms` are correct — they're pull-style and don't depend on the notification mechanism."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / The method names in the current interface are `IMxAccessAlarmConsumer.AcknowledgeByGuid` and `SnapshotActiveAlarms` (`IMxAccessAlarmConsumer.cs:64,104`), so the names are accurate. However, this statement refers to PR A.5's `AlarmClientConsumer`, which no longer exists. The claim implicitly endorses `AlarmClientConsumer` code that has been replaced by `WnWrapAlarmConsumer`. The successor also exposes `AcknowledgeByGuid` but routes it through `AlarmAckByGUID` on `wwAlarmConsumerClass`.
CODE_AREA / alarm.ack
SEVERITY / low
PROPOSED_FIX / Note that PR A.5 was superseded; the current production path is `WnWrapAlarmConsumer`.
---
DOC / LINES / 604-605
CLAIM / After an alarm return-to-normal (`UNACK_RTN`), `wwAlarmConsumerClass.AlarmAckByGUID` is "the method to call" for acknowledgement.
CLAIM_TYPE / behavior-rule
VERDICT / wrong
EVIDENCE / The doc itself contradicts this eleven sections later ("Section 4. `AlarmAckByGUID` is not implemented", lines 750-756): `AlarmAckByGUID(VBGUID, …)` throws `NotImplementedException` (COM `E_NOTIMPL`) on `wwAlarmConsumerClass`. The doc at line 604 presents it as the correct ack method before the discovery in the live-smoke section, creating a contradiction within the document that integrators reading top-to-bottom will encounter.
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / Add a forward-reference warning at line 604 ("Note: see 'Live smoke-test discoveries — section 4' below; AlarmAckByGUID is E_NOTIMPL on wnwrap and must not be called directly; use AlarmAckByName via the ack-only consumer.") or reorder the section.
---
DOC / LINES / 750-756
CLAIM / `AlarmAckByGUID(VBGUID, …)` throws `NotImplementedException` (`E_NOTIMPL`) on `wwAlarmConsumerClass`, so all acks must go through `AlarmAckByName`.
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.cs:215-239` provides `AcknowledgeByGuid` which calls `com.AlarmAckByGUID` directly (the COM interop). The method is present in the consumer and called from `AlarmCommandHandler.Acknowledge` (`AlarmCommandHandler.cs:141-158`) and `AlarmDispatcher.Acknowledge` (`AlarmDispatcher.cs:87-103`). The code path is plumbed through and compiles. Whether it still throws `E_NOTIMPL` at runtime on the deployed AVEVA build is a runtime-only observable — the doc's claim was empirically confirmed 2026-05-01.
CODE_AREA / alarm.ack
SEVERITY / medium
PROPOSED_FIX / Flag: the code now calls `AlarmAckByGUID` without a try/catch for `E_NOTIMPL`; document that the GUID path will surface a `COMException` at runtime on affected AVEVA builds and that the gateway routes canonical `Provider!Group.Tag` references through `AcknowledgeAlarmByName` to avoid this.
---
DOC / LINES / 758-762
CLAIM / "The proto `AcknowledgeAlarmCommand` (GUID-based) and `MxAccessCommandExecutor.ExecuteAcknowledgeAlarm` switch arm remain in the codebase for forward-compat, but the gateway-side `WorkerAlarmRpcDispatcher.AcknowledgeAsync` now always routes through `AcknowledgeAlarmByName` when the public RPC supplies a recognizable `Provider!Group.Tag` reference."
CLAIM_TYPE / cross-ref
VERDICT / wrong
EVIDENCE / (a) `WorkerAlarmRpcDispatcher` does not exist in the source tree. The class that routes acknowledge requests is `GatewayAlarmMonitor.AcknowledgeAsync` + `BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:437,516`). (b) The gateway does NOT always route through `AcknowledgeAlarmByName`: `BuildAcknowledgeCommand` first tries `Guid.TryParse`; if the `alarm_full_reference` is a canonical GUID it still dispatches `MxCommandKind.AcknowledgeAlarm` (the GUID path) (`GatewayAlarmMonitor.cs:528-543`). Only when the reference is not a GUID does it fall through to `AcknowledgeAlarmByName` (`GatewayAlarmMonitor.cs:545-563`).
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / (1) Replace `WorkerAlarmRpcDispatcher` with the actual class name `GatewayAlarmMonitor`. (2) Correct the routing description: GUID-shaped references still go through `AcknowledgeAlarmCommand` (GUID path); `Provider!Group.Tag` references go through `AcknowledgeAlarmByNameCommand`. The claim that it "always routes through `AcknowledgeAlarmByName`" is false.
---
DOC / LINES / 636-639 (A.2 outline step 2)
CLAIM / Production `WnWrapAlarmConsumer` polls `GetXmlCurrentAlarms2(maxAlmCnt, out xml)` on a timer (500ms1s cadence).
CLAIM_TYPE / behavior-rule
VERDICT / wrong
EVIDENCE / `WnWrapAlarmConsumer.cs:38-43` explicitly states "the consumer owns no internal timer." `PollOnce()` is driven externally by `StaRuntime.InvokeAsync` (`WnWrapAlarmConsumer.cs:39`, `AlarmCommandHandler.cs:29-33`). The 500ms1s timer cadence mentioned in the doc was a design proposal; the implementation delegates all poll scheduling to the caller (STA). The doc's step 2 reads as if the consumer self-schedules.
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Correct to: "Poll `GetXmlCurrentAlarms2` via `PollOnce()` called externally by the worker's STA through `StaRuntime.InvokeAsync`; the consumer owns no internal timer."
---
DOC / LINES / 641-643 (A.2 outline step 2)
CLAIM / "`AlarmAckByGUID(VBGUID, comment, oprName, node, domain, fullName)` for client-driven acknowledgements (matches PR A.5's `AlarmAckCommand` payload)."
CLAIM_TYPE / rpc/proto
VERDICT / wrong
EVIDENCE / The proto message is named `AcknowledgeAlarmCommand` (not `AlarmAckCommand`): `mxaccess_gateway.proto:337`. The consumer also exposes `AcknowledgeByGuid` (not `AlarmAckByGUID`) as its interface method (`IMxAccessAlarmConsumer.cs:64`). The doc uses the COM method name where it should use the proto/interface name, and uses the wrong proto message name.
CODE_AREA / alarm.ack
SEVERITY / medium
PROPOSED_FIX / Replace "PR A.5's `AlarmAckCommand` payload" with "the proto's `AcknowledgeAlarmCommand` message" (`mxaccess_gateway.proto:337`).
---
DOC / LINES / 644-647 (A.2 outline step 3)
CLAIM / STATE mapping: `UNACK_ALM``in_alarm=true, acked=false`; `UNACK_RTN``in_alarm=false, acked=false`; `ACK_ALM``in_alarm=true, acked=true`; `ACK_RTN``in_alarm=false, acked=true`.
CLAIM_TYPE / term
VERDICT / wrong
EVIDENCE / The production proto uses `AlarmConditionState` (Active / ActiveAcked / Inactive), not boolean `in_alarm`/`acked` fields. `AlarmDispatcher.MapConditionState` (`AlarmDispatcher.cs:221-234`): `UnackAlm→Active`, `AckAlm→ActiveAcked`, `UnackRtn→Inactive`, `AckRtn→Inactive`. Both Rtn states collapse to `Inactive` — the `acked` distinction on a cleared alarm is not surfaced. The doc's proposed boolean decomposition was a design proposal that was not adopted; the final proto shape uses the enum.
CODE_AREA / alarm.state
SEVERITY / high
PROPOSED_FIX / Replace the boolean mapping table with the actual `AlarmConditionState` enum mapping used in `AlarmDispatcher.MapConditionState`. Document that `UnackRtn` and `AckRtn` both map to `Inactive` (ack-vs-unack on a cleared alarm is not exposed through the proto).
---
DOC / LINES / 648-649 (A.2 outline step 3)
CLAIM / "`GUID``condition_id` (canonicalize the no-dashes hex to a UUID string)."
CLAIM_TYPE / term
VERDICT / wrong
EVIDENCE / The production code stores the GUID as `MxAlarmSnapshotRecord.AlarmGuid` (a `System.Guid`) and the proto carries it inside `OnAlarmTransitionEvent` only implicitly (there is no `condition_id` field in the proto). The `alarm_full_reference` field is used as the stable identifier for condition correlation, not a `condition_id`. `mxaccess_gateway.proto:720-723`, `OnAlarmTransitionEvent.alarm_full_reference`. The field name `condition_id` does not exist in the proto.
CODE_AREA / alarm.state
SEVERITY / medium
PROPOSED_FIX / Replace `condition_id` with the actual stable identifier: `alarm_full_reference` (`OnAlarmTransitionEvent.alarm_full_reference`). The GUID is used internally by `WnWrapAlarmConsumer` as a snapshot key but is not exposed as a proto field.
---
DOC / LINES / 651-654 (A.2 outline step 3 — timestamp)
CLAIM / "`DATE + TIME + GMTOFFSET + DSTADJUST` → reassemble UTC timestamp; matches the worker's existing `Timestamp` wire format."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `AlarmRecordTransitionMapper.ParseTransitionTimestampUtc` (`AlarmRecordTransitionMapper.cs:116-188`) parses all four fields and computes UTC. The proto uses `google.protobuf.Timestamp` (`mxaccess_gateway.proto:747`). Wire-up matches.
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 656-657 (A.2 outline step 3)
CLAIM / "`PRIORITY` → severity (already 1-1000-ish range)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.ParseSnapshotXml` reads `PRIORITY` as `int` (`WnWrapAlarmConsumer.cs:433`), stored as `MxAlarmSnapshotRecord.Priority`. `AlarmDispatcher.OnTransition` passes it as `severity: record.Priority` (`AlarmDispatcher.cs:187`). `OnAlarmTransitionEvent.severity` is `int32` in the proto (`mxaccess_gateway.proto`). The 1-1000 range is consistent with AVEVA's alarm priority range.
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 658-659 (A.2 outline step 3)
CLAIM / "`TAGNAME` → reference; `PROVIDER_NAME` + `GROUP` for scope metadata."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `AlarmDispatcher.OnTransition` calls `AlarmRecordTransitionMapper.ComposeFullReference(record.ProviderName, record.Group, record.TagName)` and passes the result as `alarmFullReference` (`AlarmDispatcher.cs:180-183`). `ComposeFullReference` formats it as `Provider!Group.TagName` (`AlarmRecordTransitionMapper.cs:90-102`). `TAGNAME` alone is passed as `sourceObjectReference` (`AlarmDispatcher.cs:184`).
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 672-676 (A.2 outline step 5)
CLAIM / "PR A.5's snapshot/ack contract tests can stay — they don't touch the underlying COM API."
CLAIM_TYPE / cross-ref
VERDICT / stale
EVIDENCE / PR A.5's `AlarmClientConsumer` was retired; there is no class by that name. The test files for alarm command handling now cover `AlarmCommandHandler`, `AlarmDispatcher`, and `WnWrapAlarmConsumerXmlTests` — none named as "PR A.5 tests." The statement implies a test corpus that doesn't exist under the described label.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / Remove or update the PR label; reference actual test files: `AlarmCommandHandlerTests.cs`, `AlarmDispatcherTests.cs`, `WnWrapAlarmConsumerXmlTests.cs`.
---
DOC / LINES / 673-675 (settled API ordering section)
CLAIM / "`InitializeConsumer` first, then `RegisterConsumer` — both on `aaAlarmManagedClient.AlarmClient` and `wwAlarmConsumerClass`."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `IwwAlarmConsumer_InitializeConsumer` before `IwwAlarmConsumer_RegisterConsumer` (`WnWrapAlarmConsumer.cs:117-137`). Same ordering for `ackClient` (`WnWrapAlarmConsumer.cs:188-208`).
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 676-682 (settled API section)
CLAIM / "`aaAlarmManagedClient.AlarmClient.RegisterConsumer` is 5-arg (includes `bRetainHiddenAlarms`); `wwAlarmConsumerClass.RegisterConsumer` is 4-arg (no `bRetainHiddenAlarms`)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `IwwAlarmConsumer_RegisterConsumer` with 4 args: `hWnd, szProductName, szApplicationName, szVersion` (`WnWrapAlarmConsumer.cs:128-132`). Consistent with the doc.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 683-685 (settled API section)
CLAIM / "Subscription expression format: `\\<machine>\Galaxy!<area>` (literal `Galaxy` provider) for both libraries."
CLAIM_TYPE / path
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.ComposeXmlAlarmQuery` parses this format and treats `Galaxy` as the provider (`WnWrapAlarmConsumer.cs:489-530`). `IMxAccessAlarmConsumer.Subscribe` doc comment confirms: "Subscription string follows AVEVA's canonical format: `\\<node>\Galaxy!<area>`. The literal 'Galaxy' is the provider name (regardless of the configured Galaxy database name)." (`IMxAccessAlarmConsumer.cs:44-46`). `AlarmsOptions.cs:16-17` also confirms.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 684-685 (settled API section)
CLAIM / "Native ack: `AlarmAckByGUID(VBGUID guid, comment, oprName, node, domain, fullName)` on the v2 surface."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.AcknowledgeByGuid` calls `com.AlarmAckByGUID` with exactly those args (`WnWrapAlarmConsumer.cs:232-238`).
CODE_AREA / alarm.ack
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 695-699 (live smoke quirk 1)
CLAIM / "Without `SetXmlAlarmQuery`, the first `GetXmlCurrentAlarms2` call fails with `E_FAIL` (HRESULT `0x80004005`)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `SetXmlAlarmQuery` and wraps it with a `COMException` guard that would surface as `InvalidOperationException` with the E_FAIL message (`WnWrapAlarmConsumer.cs:156-182`). The call is mandatory per production code structure.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 719-733 (live smoke quirk 2)
CLAIM / "Two consumers required: read-side consumer (with `SetXmlAlarmQuery`) and ack-only consumer (without `SetXmlAlarmQuery`). All `AcknowledgeByName` calls dispatch through the ack-only instance."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` provisions `ackClient = new wwAlarmConsumerClass()` with full lifecycle but no `SetXmlAlarmQuery` (`WnWrapAlarmConsumer.cs:184-210`). `AcknowledgeByName` uses `ackClient` (`WnWrapAlarmConsumer.cs:256-278`). `AcknowledgeByGuid` uses `client` (read-side) (`WnWrapAlarmConsumer.cs:224-238`).
CODE_AREA / alarm.ack
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 736-748 (live smoke quirk 3)
CLAIM / "The v2 8-arg `AlarmAckByName` returns -55 on this AVEVA build. The v1 6-arg `AlarmAckByName` works. Production `WnWrapAlarmConsumer.AcknowledgeByName` calls the 6-arg overload. Operator domain and full-name fields are accepted by the proto but not propagated to AVEVA (discarded)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.AcknowledgeByName` calls `com.AlarmAckByName` (6-arg) and explicitly discards `ackOperatorDomain` and `ackOperatorFullName` with `_ = ...` (`WnWrapAlarmConsumer.cs:268-278`). The proto `AcknowledgeAlarmByNameCommand` retains `operator_domain` and `operator_full_name` fields (`mxaccess_gateway.proto:359-373`).
CODE_AREA / alarm.ack
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 750-756 (live smoke quirk 4)
CLAIM / "`AlarmAckByGUID` is not implemented on `wwAlarmConsumerClass`; it throws `NotImplementedException` / `E_NOTIMPL`. The reference→GUID lookup is not viable; all acks must go through `AlarmAckByName`."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / The production code at `WnWrapAlarmConsumer.AcknowledgeByGuid` still calls `com.AlarmAckByGUID` directly without a guard for `E_NOTIMPL` (`WnWrapAlarmConsumer.cs:215-239`). The gateway's `BuildAcknowledgeCommand` still dispatches `MxCommandKind.AcknowledgeAlarm` (GUID path) when `alarm_full_reference` parses as a GUID (`GatewayAlarmMonitor.cs:528-543`). The doc says all acks must go through `AcknowledgeByName`, but the code still routes GUID-shaped references through `AlarmAckByGUID`. The `E_NOTIMPL` runtime behavior is unguarded.
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / Either (a) add a `COMException`/`NotImplementedException` guard around `AlarmAckByGUID` in `WnWrapAlarmConsumer.AcknowledgeByGuid` that falls back to `AcknowledgeByName`, or (b) make the gateway never dispatch the GUID arm. Document whichever approach is taken. The current state silently sends a doomed IPC command.
---
DOC / LINES / 761-762
CLAIM / "`WorkerAlarmRpcDispatcher.AcknowledgeAsync` now always routes through `AcknowledgeAlarmByName` when the public RPC supplies a recognizable `Provider!Group.Tag` reference."
CLAIM_TYPE / cross-ref
VERDICT / wrong
EVIDENCE / (a) No class named `WorkerAlarmRpcDispatcher` exists in the source tree. The gateway-side routing is in `GatewayAlarmMonitor.BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:516`). (b) The routing is conditional: GUID-shaped `alarm_full_reference``AcknowledgeAlarmCommand` (GUID path); `Provider!Group.Tag``AcknowledgeAlarmByNameCommand`. The claim that the routing "always" goes through `AcknowledgeAlarmByName` is incorrect.
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / Replace the entire sentence. The correct description: "The gateway's `GatewayAlarmMonitor.BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:516`) dispatches `MxCommandKind.AcknowledgeAlarm` for GUID-shaped references and `MxCommandKind.AcknowledgeAlarmByName` for `Provider!Group.Tag` references."
---
DOC / LINES / 765-773 (STA quirk 5)
CLAIM / "The consumer's internal `Timer` fires on threadpool threads and would block on cross-apartment marshaling unless the host STA pumps Win32 messages. The smoke test sidesteps this by setting `pollIntervalMilliseconds=0` (Timer disabled) and driving `PollOnce` manually."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / The production `WnWrapAlarmConsumer` has no internal `Timer` at all — the design was revised so `PollOnce()` is always external (`WnWrapAlarmConsumer.cs:38-43`: "the consumer owns no internal timer"). There is no `pollIntervalMilliseconds` constructor parameter (`WnWrapAlarmConsumer.cs:69-87`). The constructor takes only `wwAlarmConsumerClass client` and `int maxAlarmsPerFetch`. The smoke test mention of `pollIntervalMilliseconds=0` refers to a superseded design.
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Update to reflect the final design: `WnWrapAlarmConsumer` has no internal timer; `PollOnce()` is always called externally by the STA. Remove the `pollIntervalMilliseconds=0` test-workaround reference.
---
DOC / LINES / 599-601 (XML STATE enum section)
CLAIM / "`STATE` enum values observed: `UNACK_RTN` (alarm returned to normal, unacknowledged) and `UNACK_ALM` (alarm active and unacknowledged). Other states (`ACK_RTN`, `ACK_ALM`) would appear when an ack is performed."
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / `MxAlarmStateKind.cs:1-17` defines all four values. `AlarmRecordTransitionMapper.ParseStateKind` handles all four (`AlarmRecordTransitionMapper.cs:27-38`).
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 628-630 (reference format in smoke capture)
CLAIM / Reference format in the capture: `ref='Galaxy!TestArea.TestMachine_001.TestAlarm001'` — the `alarm_full_reference` is composed as `ProviderName!Group.TagName`.
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / `AlarmRecordTransitionMapper.ComposeFullReference` formats as `{provider}!{group}.{name}` (`AlarmRecordTransitionMapper.cs:90-102`). The example matches this pattern exactly.
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / (entire doc — RPC names)
CLAIM / The document mentions IPC commands `SubscribeAlarms`, `AcknowledgeByGuid`, `SnapshotActiveAlarms`, `QueryActiveAlarms` but never names the public gRPC RPCs — `AcknowledgeAlarm`, `StreamAlarms`, `QueryActiveAlarms` — or the config keys governing the always-on monitor (`MxGateway:Alarms:Enabled`, `MxGateway:Alarms:SubscriptionExpression`, `MxGateway:Alarms:DefaultArea`, `MxGateway:Alarms:ReconcileIntervalSeconds`).
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `mxaccess_gateway.proto:22-37` (RPCs); `AlarmsOptions.cs:21-47` (config keys); `GatewayAlarmMonitor.cs:17-51` (always-on broker). None documented in `AlarmClientDiscovery.md`.
CODE_AREA / alarm.subscribe
SEVERITY / high
PROPOSED_FIX / The doc is a probe/research log, not an operator/integrator guide. However, the gap means no other document covers these public-surface items. Add a section or cross-reference to the public alarm API: RPCs `AcknowledgeAlarm`, `StreamAlarms`, `QueryActiveAlarms`; config keys `MxGateway:Alarms:Enabled`, `MxGateway:Alarms:SubscriptionExpression`, `MxGateway:Alarms:DefaultArea`, `MxGateway:Alarms:ReconcileIntervalSeconds`.
---
DOC / LINES / (entire doc — always-on broker architecture)
CLAIM / (gap) The doc describes a model where individual client sessions subscribe to alarms. The production architecture uses a gateway-owned always-on `GatewayAlarmMonitor` that holds one dedicated worker session and fans the alarm feed to all clients. No client opens its own alarm subscription; `StreamAlarms` is session-less.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `GatewayAlarmMonitor.cs:1-697`; `IGatewayAlarmService.cs:27-63`; `AlarmsOptions.cs:1-48`. `AlarmClientDiscovery.md` describes the worker alarm consumer (IPC layer) but never describes the gateway-level brokering architecture that wraps it.
CODE_AREA / alarm.subscribe
SEVERITY / high
PROPOSED_FIX / Add a section describing `GatewayAlarmMonitor` as the always-on broker: one gateway-owned session, periodic reconcile loop (`ReconcileIntervalSeconds`), `StreamAsync` fan-out to per-client `Channel<AlarmFeedMessage>`, subscriber capacity (2048 messages), fail-open restart-backoff (5s).
---
DOC / LINES / (entire doc — AlarmFeedMessage / snapshot_complete protocol)
CLAIM / (gap) The doc does not document the `AlarmFeedMessage` stream protocol: initial burst of `active_alarm` messages, then `snapshot_complete` sentinel, then `transition` messages for live changes.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `mxaccess_gateway.proto:857-868` (message definition); `GatewayAlarmMonitor.StreamAsync:386-434`. This is the key integrator-facing streaming contract.
CODE_AREA / alarm.subscribe
SEVERITY / high
PROPOSED_FIX / Document the `StreamAlarms` protocol: `AlarmFeedMessage` union with `active_alarm`, `snapshot_complete`, and `transition` fields; the invariant that the snapshot precedes the sentinel which precedes live transitions.
---
DOC / LINES / (entire doc — reconcile mechanism)
CLAIM / (gap) The periodic reconcile loop (`ReconcileIntervalSeconds`, default 30s, floor 5s) that snapshots the worker's active-alarm set and broadcasts synthetic raise/clear transitions for missed alarms is not documented.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `GatewayAlarmMonitor.ReconcileLoopAsync:235-260`; `GatewayAlarmMonitor.ApplyReconcile:315-354`; `AlarmsOptions.ReconcileIntervalSeconds:47`.
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Document the reconcile pass: cadence, purpose (catches missed poll-and-diff transitions), synthetic transition kind (`Raise`/`Clear`), and that it does not emit `Acknowledge` transitions.
---
DOC / LINES / (entire doc — subscriber backpressure / drop behavior)
CLAIM / (gap) A subscriber that cannot keep up with the alarm feed is dropped with an error ("Alarm feed subscriber fell behind and was dropped; reconnect to re-snapshot"). The queue capacity is 2048. This behavior is not documented.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `GatewayAlarmMonitor.Broadcast:358-375`; `SubscriberQueueCapacity = 2048` (`GatewayAlarmMonitor.cs:21`).
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Document the backpressure model: bounded 2048-message channel per subscriber; slow subscribers are completed with error and must reconnect; reconnect re-snapshots the active set.
---
DOC / LINES / (entire doc — `alarm_full_reference` parse format for ack)
CLAIM / (gap) The doc does not document the `alarm_full_reference` parse contract for `AcknowledgeAlarm`: a canonical GUID string triggers the GUID path; `Provider!Group.Tag` (first `!` splits provider, first `.` splits group from tag) triggers the by-name path; anything else is rejected.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `GatewayAlarmMonitor.BuildAcknowledgeCommand` and `TryParseAlarmReference` (`GatewayAlarmMonitor.cs:516-610`). Error message: "alarm_full_reference must be a canonical GUID or 'Provider!Group.Tag' format."
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / Document the `AcknowledgeAlarm.alarm_full_reference` field's two accepted formats and how the gateway routes each.
---
DOC / LINES / (entire doc — `AlarmConditionState` on snapshot)
CLAIM / (gap) The `ActiveAlarmSnapshot.current_state` field uses `AlarmConditionState` (Active / ActiveAcked / Inactive) — the distinction between `UnackRtn` and `AckRtn` is lost in the snapshot (both collapse to Inactive). This is not documented.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `AlarmDispatcher.MapConditionState` (`AlarmDispatcher.cs:221-234`): both `UnackRtn` and `AckRtn` map to `AlarmConditionState.Inactive`.
CODE_AREA / alarm.state
SEVERITY / medium
PROPOSED_FIX / Document the state collapse rule: the `ActiveAlarmSnapshot.current_state` field does not distinguish between acknowledged-cleared and unacknowledged-cleared alarms; both surface as `Inactive`. Consumers that need this distinction must track the transition stream.
---
DOC / LINES / (entire doc — transition kind table)
CLAIM / (gap) The `AlarmTransitionKind` enum has a `Retrigger` value (`ALARM_TRANSITION_KIND_RETRIGGER = 4`), but the doc only describes Raise / Acknowledge / Clear.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `mxaccess_gateway.proto:777`; `AlarmRecordTransitionMapper.MapTransition` does not produce `Retrigger` — it is defined in the proto but unused by the current mapping logic (`AlarmRecordTransitionMapper.cs:54-78`).
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / Note that `AlarmTransitionKind.Retrigger` exists in the proto but is not emitted by the current worker (the `*Rtn→*Alm` re-trigger case maps to `Raise`). Flag as reserved for future use or remove from the proto if unused.