405 lines
29 KiB
Markdown
405 lines
29 KiB
Markdown
# Cluster 09 — Alarms
|
||
|
||
Audited doc: `docs/AlarmClientDiscovery.md`
|
||
Code base verified against:
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmDispatcher.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmCommandHandler.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmRecordTransitionMapper.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAlarmStateKind.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAlarmTransitionEvent.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IMxAccessAlarmConsumer.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IAlarmCommandHandler.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs` (alarm arms)
|
||
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/IGatewayAlarmService.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmsServiceCollectionExtensions.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Server/Configuration/AlarmsOptions.cs`
|
||
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`
|
||
|
||
---
|
||
|
||
DOC / LINES / 71-74 (comment about `AlarmClientConsumer.cs`)
|
||
CLAIM / The file `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmClientConsumer.cs` exists in the repo.
|
||
CLAIM_TYPE / path
|
||
VERDICT / wrong
|
||
EVIDENCE / `find /Users/dohertj2/Desktop/mxaccessgateway/src -name "AlarmClientConsumer*"` returns nothing. The file no longer exists; `WnWrapAlarmConsumer.cs` comments confirm it was replaced: `WnWrapAlarmConsumer.cs:18-19`.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Update references to the obsolete `AlarmClientConsumer.cs` throughout the doc to note that the file was retired and replaced by `WnWrapAlarmConsumer.cs`.
|
||
|
||
---
|
||
|
||
DOC / LINES / 71-74
|
||
CLAIM / The architecture comment on `AlarmClientConsumer.cs` (PR A.5) describing `IAlarmMgrDataProvider` managed events is wrong against the deployed assembly — there is no managed event surface.
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / stale
|
||
EVIDENCE / The source file it critiques (`AlarmClientConsumer.cs`) no longer exists in the repo. The critique is historically accurate but refers to a file that was removed during the wnwrap migration. No live code contains `IAlarmMgrDataProvider`. `WnWrapAlarmConsumer.cs:1-575`.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / low
|
||
PROPOSED_FIX / Note that the critique is a historical record; the offending file has been removed. The section remains valid as probe context but should clarify the current state.
|
||
|
||
---
|
||
|
||
DOC / LINES / 87-88
|
||
CLAIM / `AlarmClientConsumer.AlarmRecordReceived` has no production callers; `RaiseAlarmRecordReceived` is `internal` for tests and never invoked at runtime.
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / stale
|
||
EVIDENCE / Neither `AlarmRecordReceived` nor `RaiseAlarmRecordReceived` appear anywhere in the current source tree (`grep -rn "AlarmRecordReceived\|RaiseAlarmRecordReceived" src` — zero results outside tests or binaries). The entire `AlarmClientConsumer` class was removed; the observation is a dead historical probe note.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / low
|
||
PROPOSED_FIX / Flag as historical only; the code path no longer exists.
|
||
|
||
---
|
||
|
||
DOC / LINES / 492
|
||
CLAIM / "PR A.5's `Subscribe` / `AcknowledgeByGuid` / `SnapshotActiveAlarms` are correct — they're pull-style and don't depend on the notification mechanism."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / stale
|
||
EVIDENCE / The method names in the current interface are `IMxAccessAlarmConsumer.AcknowledgeByGuid` and `SnapshotActiveAlarms` (`IMxAccessAlarmConsumer.cs:64,104`), so the names are accurate. However, this statement refers to PR A.5's `AlarmClientConsumer`, which no longer exists. The claim implicitly endorses `AlarmClientConsumer` code that has been replaced by `WnWrapAlarmConsumer`. The successor also exposes `AcknowledgeByGuid` but routes it through `AlarmAckByGUID` on `wwAlarmConsumerClass`.
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / low
|
||
PROPOSED_FIX / Note that PR A.5 was superseded; the current production path is `WnWrapAlarmConsumer`.
|
||
|
||
---
|
||
|
||
DOC / LINES / 604-605
|
||
CLAIM / After an alarm return-to-normal (`UNACK_RTN`), `wwAlarmConsumerClass.AlarmAckByGUID` is "the method to call" for acknowledgement.
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / wrong
|
||
EVIDENCE / The doc itself contradicts this eleven sections later ("Section 4. `AlarmAckByGUID` is not implemented", lines 750-756): `AlarmAckByGUID(VBGUID, …)` throws `NotImplementedException` (COM `E_NOTIMPL`) on `wwAlarmConsumerClass`. The doc at line 604 presents it as the correct ack method before the discovery in the live-smoke section, creating a contradiction within the document that integrators reading top-to-bottom will encounter.
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / high
|
||
PROPOSED_FIX / Add a forward-reference warning at line 604 ("Note: see 'Live smoke-test discoveries — section 4' below; AlarmAckByGUID is E_NOTIMPL on wnwrap and must not be called directly; use AlarmAckByName via the ack-only consumer.") or reorder the section.
|
||
|
||
---
|
||
|
||
DOC / LINES / 750-756
|
||
CLAIM / `AlarmAckByGUID(VBGUID, …)` throws `NotImplementedException` (`E_NOTIMPL`) on `wwAlarmConsumerClass`, so all acks must go through `AlarmAckByName`.
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.cs:215-239` provides `AcknowledgeByGuid` which calls `com.AlarmAckByGUID` directly (the COM interop). The method is present in the consumer and called from `AlarmCommandHandler.Acknowledge` (`AlarmCommandHandler.cs:141-158`) and `AlarmDispatcher.Acknowledge` (`AlarmDispatcher.cs:87-103`). The code path is plumbed through and compiles. Whether it still throws `E_NOTIMPL` at runtime on the deployed AVEVA build is a runtime-only observable — the doc's claim was empirically confirmed 2026-05-01.
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Flag: the code now calls `AlarmAckByGUID` without a try/catch for `E_NOTIMPL`; document that the GUID path will surface a `COMException` at runtime on affected AVEVA builds and that the gateway routes canonical `Provider!Group.Tag` references through `AcknowledgeAlarmByName` to avoid this.
|
||
|
||
---
|
||
|
||
DOC / LINES / 758-762
|
||
CLAIM / "The proto `AcknowledgeAlarmCommand` (GUID-based) and `MxAccessCommandExecutor.ExecuteAcknowledgeAlarm` switch arm remain in the codebase for forward-compat, but the gateway-side `WorkerAlarmRpcDispatcher.AcknowledgeAsync` now always routes through `AcknowledgeAlarmByName` when the public RPC supplies a recognizable `Provider!Group.Tag` reference."
|
||
CLAIM_TYPE / cross-ref
|
||
VERDICT / wrong
|
||
EVIDENCE / (a) `WorkerAlarmRpcDispatcher` does not exist in the source tree. The class that routes acknowledge requests is `GatewayAlarmMonitor.AcknowledgeAsync` + `BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:437,516`). (b) The gateway does NOT always route through `AcknowledgeAlarmByName`: `BuildAcknowledgeCommand` first tries `Guid.TryParse`; if the `alarm_full_reference` is a canonical GUID it still dispatches `MxCommandKind.AcknowledgeAlarm` (the GUID path) (`GatewayAlarmMonitor.cs:528-543`). Only when the reference is not a GUID does it fall through to `AcknowledgeAlarmByName` (`GatewayAlarmMonitor.cs:545-563`).
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / high
|
||
PROPOSED_FIX / (1) Replace `WorkerAlarmRpcDispatcher` with the actual class name `GatewayAlarmMonitor`. (2) Correct the routing description: GUID-shaped references still go through `AcknowledgeAlarmCommand` (GUID path); `Provider!Group.Tag` references go through `AcknowledgeAlarmByNameCommand`. The claim that it "always routes through `AcknowledgeAlarmByName`" is false.
|
||
|
||
---
|
||
|
||
DOC / LINES / 636-639 (A.2 outline step 2)
|
||
CLAIM / Production `WnWrapAlarmConsumer` polls `GetXmlCurrentAlarms2(maxAlmCnt, out xml)` on a timer (500ms–1s cadence).
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / wrong
|
||
EVIDENCE / `WnWrapAlarmConsumer.cs:38-43` explicitly states "the consumer owns no internal timer." `PollOnce()` is driven externally by `StaRuntime.InvokeAsync` (`WnWrapAlarmConsumer.cs:39`, `AlarmCommandHandler.cs:29-33`). The 500ms–1s timer cadence mentioned in the doc was a design proposal; the implementation delegates all poll scheduling to the caller (STA). The doc's step 2 reads as if the consumer self-schedules.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Correct to: "Poll `GetXmlCurrentAlarms2` via `PollOnce()` called externally by the worker's STA through `StaRuntime.InvokeAsync`; the consumer owns no internal timer."
|
||
|
||
---
|
||
|
||
DOC / LINES / 641-643 (A.2 outline step 2)
|
||
CLAIM / "`AlarmAckByGUID(VBGUID, comment, oprName, node, domain, fullName)` for client-driven acknowledgements (matches PR A.5's `AlarmAckCommand` payload)."
|
||
CLAIM_TYPE / rpc/proto
|
||
VERDICT / wrong
|
||
EVIDENCE / The proto message is named `AcknowledgeAlarmCommand` (not `AlarmAckCommand`): `mxaccess_gateway.proto:337`. The consumer also exposes `AcknowledgeByGuid` (not `AlarmAckByGUID`) as its interface method (`IMxAccessAlarmConsumer.cs:64`). The doc uses the COM method name where it should use the proto/interface name, and uses the wrong proto message name.
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Replace "PR A.5's `AlarmAckCommand` payload" with "the proto's `AcknowledgeAlarmCommand` message" (`mxaccess_gateway.proto:337`).
|
||
|
||
---
|
||
|
||
DOC / LINES / 644-647 (A.2 outline step 3)
|
||
CLAIM / STATE mapping: `UNACK_ALM` → `in_alarm=true, acked=false`; `UNACK_RTN` → `in_alarm=false, acked=false`; `ACK_ALM` → `in_alarm=true, acked=true`; `ACK_RTN` → `in_alarm=false, acked=true`.
|
||
CLAIM_TYPE / term
|
||
VERDICT / wrong
|
||
EVIDENCE / The production proto uses `AlarmConditionState` (Active / ActiveAcked / Inactive), not boolean `in_alarm`/`acked` fields. `AlarmDispatcher.MapConditionState` (`AlarmDispatcher.cs:221-234`): `UnackAlm→Active`, `AckAlm→ActiveAcked`, `UnackRtn→Inactive`, `AckRtn→Inactive`. Both Rtn states collapse to `Inactive` — the `acked` distinction on a cleared alarm is not surfaced. The doc's proposed boolean decomposition was a design proposal that was not adopted; the final proto shape uses the enum.
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / high
|
||
PROPOSED_FIX / Replace the boolean mapping table with the actual `AlarmConditionState` enum mapping used in `AlarmDispatcher.MapConditionState`. Document that `UnackRtn` and `AckRtn` both map to `Inactive` (ack-vs-unack on a cleared alarm is not exposed through the proto).
|
||
|
||
---
|
||
|
||
DOC / LINES / 648-649 (A.2 outline step 3)
|
||
CLAIM / "`GUID` → `condition_id` (canonicalize the no-dashes hex to a UUID string)."
|
||
CLAIM_TYPE / term
|
||
VERDICT / wrong
|
||
EVIDENCE / The production code stores the GUID as `MxAlarmSnapshotRecord.AlarmGuid` (a `System.Guid`) and the proto carries it inside `OnAlarmTransitionEvent` only implicitly (there is no `condition_id` field in the proto). The `alarm_full_reference` field is used as the stable identifier for condition correlation, not a `condition_id`. `mxaccess_gateway.proto:720-723`, `OnAlarmTransitionEvent.alarm_full_reference`. The field name `condition_id` does not exist in the proto.
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Replace `condition_id` with the actual stable identifier: `alarm_full_reference` (`OnAlarmTransitionEvent.alarm_full_reference`). The GUID is used internally by `WnWrapAlarmConsumer` as a snapshot key but is not exposed as a proto field.
|
||
|
||
---
|
||
|
||
DOC / LINES / 651-654 (A.2 outline step 3 — timestamp)
|
||
CLAIM / "`DATE + TIME + GMTOFFSET + DSTADJUST` → reassemble UTC timestamp; matches the worker's existing `Timestamp` wire format."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `AlarmRecordTransitionMapper.ParseTransitionTimestampUtc` (`AlarmRecordTransitionMapper.cs:116-188`) parses all four fields and computes UTC. The proto uses `google.protobuf.Timestamp` (`mxaccess_gateway.proto:747`). Wire-up matches.
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 656-657 (A.2 outline step 3)
|
||
CLAIM / "`PRIORITY` → severity (already 1-1000-ish range)."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.ParseSnapshotXml` reads `PRIORITY` as `int` (`WnWrapAlarmConsumer.cs:433`), stored as `MxAlarmSnapshotRecord.Priority`. `AlarmDispatcher.OnTransition` passes it as `severity: record.Priority` (`AlarmDispatcher.cs:187`). `OnAlarmTransitionEvent.severity` is `int32` in the proto (`mxaccess_gateway.proto`). The 1-1000 range is consistent with AVEVA's alarm priority range.
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 658-659 (A.2 outline step 3)
|
||
CLAIM / "`TAGNAME` → reference; `PROVIDER_NAME` + `GROUP` for scope metadata."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `AlarmDispatcher.OnTransition` calls `AlarmRecordTransitionMapper.ComposeFullReference(record.ProviderName, record.Group, record.TagName)` and passes the result as `alarmFullReference` (`AlarmDispatcher.cs:180-183`). `ComposeFullReference` formats it as `Provider!Group.TagName` (`AlarmRecordTransitionMapper.cs:90-102`). `TAGNAME` alone is passed as `sourceObjectReference` (`AlarmDispatcher.cs:184`).
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 672-676 (A.2 outline step 5)
|
||
CLAIM / "PR A.5's snapshot/ack contract tests can stay — they don't touch the underlying COM API."
|
||
CLAIM_TYPE / cross-ref
|
||
VERDICT / stale
|
||
EVIDENCE / PR A.5's `AlarmClientConsumer` was retired; there is no class by that name. The test files for alarm command handling now cover `AlarmCommandHandler`, `AlarmDispatcher`, and `WnWrapAlarmConsumerXmlTests` — none named as "PR A.5 tests." The statement implies a test corpus that doesn't exist under the described label.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / low
|
||
PROPOSED_FIX / Remove or update the PR label; reference actual test files: `AlarmCommandHandlerTests.cs`, `AlarmDispatcherTests.cs`, `WnWrapAlarmConsumerXmlTests.cs`.
|
||
|
||
---
|
||
|
||
DOC / LINES / 673-675 (settled API ordering section)
|
||
CLAIM / "`InitializeConsumer` first, then `RegisterConsumer` — both on `aaAlarmManagedClient.AlarmClient` and `wwAlarmConsumerClass`."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `IwwAlarmConsumer_InitializeConsumer` before `IwwAlarmConsumer_RegisterConsumer` (`WnWrapAlarmConsumer.cs:117-137`). Same ordering for `ackClient` (`WnWrapAlarmConsumer.cs:188-208`).
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 676-682 (settled API section)
|
||
CLAIM / "`aaAlarmManagedClient.AlarmClient.RegisterConsumer` is 5-arg (includes `bRetainHiddenAlarms`); `wwAlarmConsumerClass.RegisterConsumer` is 4-arg (no `bRetainHiddenAlarms`)."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `IwwAlarmConsumer_RegisterConsumer` with 4 args: `hWnd, szProductName, szApplicationName, szVersion` (`WnWrapAlarmConsumer.cs:128-132`). Consistent with the doc.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 683-685 (settled API section)
|
||
CLAIM / "Subscription expression format: `\\<machine>\Galaxy!<area>` (literal `Galaxy` provider) for both libraries."
|
||
CLAIM_TYPE / path
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.ComposeXmlAlarmQuery` parses this format and treats `Galaxy` as the provider (`WnWrapAlarmConsumer.cs:489-530`). `IMxAccessAlarmConsumer.Subscribe` doc comment confirms: "Subscription string follows AVEVA's canonical format: `\\<node>\Galaxy!<area>`. The literal 'Galaxy' is the provider name (regardless of the configured Galaxy database name)." (`IMxAccessAlarmConsumer.cs:44-46`). `AlarmsOptions.cs:16-17` also confirms.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 684-685 (settled API section)
|
||
CLAIM / "Native ack: `AlarmAckByGUID(VBGUID guid, comment, oprName, node, domain, fullName)` on the v2 surface."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.AcknowledgeByGuid` calls `com.AlarmAckByGUID` with exactly those args (`WnWrapAlarmConsumer.cs:232-238`).
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 695-699 (live smoke quirk 1)
|
||
CLAIM / "Without `SetXmlAlarmQuery`, the first `GetXmlCurrentAlarms2` call fails with `E_FAIL` (HRESULT `0x80004005`)."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `SetXmlAlarmQuery` and wraps it with a `COMException` guard that would surface as `InvalidOperationException` with the E_FAIL message (`WnWrapAlarmConsumer.cs:156-182`). The call is mandatory per production code structure.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 719-733 (live smoke quirk 2)
|
||
CLAIM / "Two consumers required: read-side consumer (with `SetXmlAlarmQuery`) and ack-only consumer (without `SetXmlAlarmQuery`). All `AcknowledgeByName` calls dispatch through the ack-only instance."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` provisions `ackClient = new wwAlarmConsumerClass()` with full lifecycle but no `SetXmlAlarmQuery` (`WnWrapAlarmConsumer.cs:184-210`). `AcknowledgeByName` uses `ackClient` (`WnWrapAlarmConsumer.cs:256-278`). `AcknowledgeByGuid` uses `client` (read-side) (`WnWrapAlarmConsumer.cs:224-238`).
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 736-748 (live smoke quirk 3)
|
||
CLAIM / "The v2 8-arg `AlarmAckByName` returns -55 on this AVEVA build. The v1 6-arg `AlarmAckByName` works. Production `WnWrapAlarmConsumer.AcknowledgeByName` calls the 6-arg overload. Operator domain and full-name fields are accepted by the proto but not propagated to AVEVA (discarded)."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / accurate
|
||
EVIDENCE / `WnWrapAlarmConsumer.AcknowledgeByName` calls `com.AlarmAckByName` (6-arg) and explicitly discards `ackOperatorDomain` and `ackOperatorFullName` with `_ = ...` (`WnWrapAlarmConsumer.cs:268-278`). The proto `AcknowledgeAlarmByNameCommand` retains `operator_domain` and `operator_full_name` fields (`mxaccess_gateway.proto:359-373`).
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 750-756 (live smoke quirk 4)
|
||
CLAIM / "`AlarmAckByGUID` is not implemented on `wwAlarmConsumerClass`; it throws `NotImplementedException` / `E_NOTIMPL`. The reference→GUID lookup is not viable; all acks must go through `AlarmAckByName`."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / stale
|
||
EVIDENCE / The production code at `WnWrapAlarmConsumer.AcknowledgeByGuid` still calls `com.AlarmAckByGUID` directly without a guard for `E_NOTIMPL` (`WnWrapAlarmConsumer.cs:215-239`). The gateway's `BuildAcknowledgeCommand` still dispatches `MxCommandKind.AcknowledgeAlarm` (GUID path) when `alarm_full_reference` parses as a GUID (`GatewayAlarmMonitor.cs:528-543`). The doc says all acks must go through `AcknowledgeByName`, but the code still routes GUID-shaped references through `AlarmAckByGUID`. The `E_NOTIMPL` runtime behavior is unguarded.
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / high
|
||
PROPOSED_FIX / Either (a) add a `COMException`/`NotImplementedException` guard around `AlarmAckByGUID` in `WnWrapAlarmConsumer.AcknowledgeByGuid` that falls back to `AcknowledgeByName`, or (b) make the gateway never dispatch the GUID arm. Document whichever approach is taken. The current state silently sends a doomed IPC command.
|
||
|
||
---
|
||
|
||
DOC / LINES / 761-762
|
||
CLAIM / "`WorkerAlarmRpcDispatcher.AcknowledgeAsync` now always routes through `AcknowledgeAlarmByName` when the public RPC supplies a recognizable `Provider!Group.Tag` reference."
|
||
CLAIM_TYPE / cross-ref
|
||
VERDICT / wrong
|
||
EVIDENCE / (a) No class named `WorkerAlarmRpcDispatcher` exists in the source tree. The gateway-side routing is in `GatewayAlarmMonitor.BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:516`). (b) The routing is conditional: GUID-shaped `alarm_full_reference` → `AcknowledgeAlarmCommand` (GUID path); `Provider!Group.Tag` → `AcknowledgeAlarmByNameCommand`. The claim that the routing "always" goes through `AcknowledgeAlarmByName` is incorrect.
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / high
|
||
PROPOSED_FIX / Replace the entire sentence. The correct description: "The gateway's `GatewayAlarmMonitor.BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:516`) dispatches `MxCommandKind.AcknowledgeAlarm` for GUID-shaped references and `MxCommandKind.AcknowledgeAlarmByName` for `Provider!Group.Tag` references."
|
||
|
||
---
|
||
|
||
DOC / LINES / 765-773 (STA quirk 5)
|
||
CLAIM / "The consumer's internal `Timer` fires on threadpool threads and would block on cross-apartment marshaling unless the host STA pumps Win32 messages. The smoke test sidesteps this by setting `pollIntervalMilliseconds=0` (Timer disabled) and driving `PollOnce` manually."
|
||
CLAIM_TYPE / behavior-rule
|
||
VERDICT / stale
|
||
EVIDENCE / The production `WnWrapAlarmConsumer` has no internal `Timer` at all — the design was revised so `PollOnce()` is always external (`WnWrapAlarmConsumer.cs:38-43`: "the consumer owns no internal timer"). There is no `pollIntervalMilliseconds` constructor parameter (`WnWrapAlarmConsumer.cs:69-87`). The constructor takes only `wwAlarmConsumerClass client` and `int maxAlarmsPerFetch`. The smoke test mention of `pollIntervalMilliseconds=0` refers to a superseded design.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Update to reflect the final design: `WnWrapAlarmConsumer` has no internal timer; `PollOnce()` is always called externally by the STA. Remove the `pollIntervalMilliseconds=0` test-workaround reference.
|
||
|
||
---
|
||
|
||
DOC / LINES / 599-601 (XML STATE enum section)
|
||
CLAIM / "`STATE` enum values observed: `UNACK_RTN` (alarm returned to normal, unacknowledged) and `UNACK_ALM` (alarm active and unacknowledged). Other states (`ACK_RTN`, `ACK_ALM`) would appear when an ack is performed."
|
||
CLAIM_TYPE / term
|
||
VERDICT / accurate
|
||
EVIDENCE / `MxAlarmStateKind.cs:1-17` defines all four values. `AlarmRecordTransitionMapper.ParseStateKind` handles all four (`AlarmRecordTransitionMapper.cs:27-38`).
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / 628-630 (reference format in smoke capture)
|
||
CLAIM / Reference format in the capture: `ref='Galaxy!TestArea.TestMachine_001.TestAlarm001'` — the `alarm_full_reference` is composed as `ProviderName!Group.TagName`.
|
||
CLAIM_TYPE / term
|
||
VERDICT / accurate
|
||
EVIDENCE / `AlarmRecordTransitionMapper.ComposeFullReference` formats as `{provider}!{group}.{name}` (`AlarmRecordTransitionMapper.cs:90-102`). The example matches this pattern exactly.
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / low
|
||
PROPOSED_FIX / flag only
|
||
|
||
---
|
||
|
||
DOC / LINES / (entire doc — RPC names)
|
||
CLAIM / The document mentions IPC commands `SubscribeAlarms`, `AcknowledgeByGuid`, `SnapshotActiveAlarms`, `QueryActiveAlarms` but never names the public gRPC RPCs — `AcknowledgeAlarm`, `StreamAlarms`, `QueryActiveAlarms` — or the config keys governing the always-on monitor (`MxGateway:Alarms:Enabled`, `MxGateway:Alarms:SubscriptionExpression`, `MxGateway:Alarms:DefaultArea`, `MxGateway:Alarms:ReconcileIntervalSeconds`).
|
||
CLAIM_TYPE / gap
|
||
VERDICT / gap
|
||
EVIDENCE / `mxaccess_gateway.proto:22-37` (RPCs); `AlarmsOptions.cs:21-47` (config keys); `GatewayAlarmMonitor.cs:17-51` (always-on broker). None documented in `AlarmClientDiscovery.md`.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / high
|
||
PROPOSED_FIX / The doc is a probe/research log, not an operator/integrator guide. However, the gap means no other document covers these public-surface items. Add a section or cross-reference to the public alarm API: RPCs `AcknowledgeAlarm`, `StreamAlarms`, `QueryActiveAlarms`; config keys `MxGateway:Alarms:Enabled`, `MxGateway:Alarms:SubscriptionExpression`, `MxGateway:Alarms:DefaultArea`, `MxGateway:Alarms:ReconcileIntervalSeconds`.
|
||
|
||
---
|
||
|
||
DOC / LINES / (entire doc — always-on broker architecture)
|
||
CLAIM / (gap) The doc describes a model where individual client sessions subscribe to alarms. The production architecture uses a gateway-owned always-on `GatewayAlarmMonitor` that holds one dedicated worker session and fans the alarm feed to all clients. No client opens its own alarm subscription; `StreamAlarms` is session-less.
|
||
CLAIM_TYPE / gap
|
||
VERDICT / gap
|
||
EVIDENCE / `GatewayAlarmMonitor.cs:1-697`; `IGatewayAlarmService.cs:27-63`; `AlarmsOptions.cs:1-48`. `AlarmClientDiscovery.md` describes the worker alarm consumer (IPC layer) but never describes the gateway-level brokering architecture that wraps it.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / high
|
||
PROPOSED_FIX / Add a section describing `GatewayAlarmMonitor` as the always-on broker: one gateway-owned session, periodic reconcile loop (`ReconcileIntervalSeconds`), `StreamAsync` fan-out to per-client `Channel<AlarmFeedMessage>`, subscriber capacity (2048 messages), fail-open restart-backoff (5s).
|
||
|
||
---
|
||
|
||
DOC / LINES / (entire doc — AlarmFeedMessage / snapshot_complete protocol)
|
||
CLAIM / (gap) The doc does not document the `AlarmFeedMessage` stream protocol: initial burst of `active_alarm` messages, then `snapshot_complete` sentinel, then `transition` messages for live changes.
|
||
CLAIM_TYPE / gap
|
||
VERDICT / gap
|
||
EVIDENCE / `mxaccess_gateway.proto:857-868` (message definition); `GatewayAlarmMonitor.StreamAsync:386-434`. This is the key integrator-facing streaming contract.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / high
|
||
PROPOSED_FIX / Document the `StreamAlarms` protocol: `AlarmFeedMessage` union with `active_alarm`, `snapshot_complete`, and `transition` fields; the invariant that the snapshot precedes the sentinel which precedes live transitions.
|
||
|
||
---
|
||
|
||
DOC / LINES / (entire doc — reconcile mechanism)
|
||
CLAIM / (gap) The periodic reconcile loop (`ReconcileIntervalSeconds`, default 30s, floor 5s) that snapshots the worker's active-alarm set and broadcasts synthetic raise/clear transitions for missed alarms is not documented.
|
||
CLAIM_TYPE / gap
|
||
VERDICT / gap
|
||
EVIDENCE / `GatewayAlarmMonitor.ReconcileLoopAsync:235-260`; `GatewayAlarmMonitor.ApplyReconcile:315-354`; `AlarmsOptions.ReconcileIntervalSeconds:47`.
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Document the reconcile pass: cadence, purpose (catches missed poll-and-diff transitions), synthetic transition kind (`Raise`/`Clear`), and that it does not emit `Acknowledge` transitions.
|
||
|
||
---
|
||
|
||
DOC / LINES / (entire doc — subscriber backpressure / drop behavior)
|
||
CLAIM / (gap) A subscriber that cannot keep up with the alarm feed is dropped with an error ("Alarm feed subscriber fell behind and was dropped; reconnect to re-snapshot"). The queue capacity is 2048. This behavior is not documented.
|
||
CLAIM_TYPE / gap
|
||
VERDICT / gap
|
||
EVIDENCE / `GatewayAlarmMonitor.Broadcast:358-375`; `SubscriberQueueCapacity = 2048` (`GatewayAlarmMonitor.cs:21`).
|
||
CODE_AREA / alarm.subscribe
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Document the backpressure model: bounded 2048-message channel per subscriber; slow subscribers are completed with error and must reconnect; reconnect re-snapshots the active set.
|
||
|
||
---
|
||
|
||
DOC / LINES / (entire doc — `alarm_full_reference` parse format for ack)
|
||
CLAIM / (gap) The doc does not document the `alarm_full_reference` parse contract for `AcknowledgeAlarm`: a canonical GUID string triggers the GUID path; `Provider!Group.Tag` (first `!` splits provider, first `.` splits group from tag) triggers the by-name path; anything else is rejected.
|
||
CLAIM_TYPE / gap
|
||
VERDICT / gap
|
||
EVIDENCE / `GatewayAlarmMonitor.BuildAcknowledgeCommand` and `TryParseAlarmReference` (`GatewayAlarmMonitor.cs:516-610`). Error message: "alarm_full_reference must be a canonical GUID or 'Provider!Group.Tag' format."
|
||
CODE_AREA / alarm.ack
|
||
SEVERITY / high
|
||
PROPOSED_FIX / Document the `AcknowledgeAlarm.alarm_full_reference` field's two accepted formats and how the gateway routes each.
|
||
|
||
---
|
||
|
||
DOC / LINES / (entire doc — `AlarmConditionState` on snapshot)
|
||
CLAIM / (gap) The `ActiveAlarmSnapshot.current_state` field uses `AlarmConditionState` (Active / ActiveAcked / Inactive) — the distinction between `UnackRtn` and `AckRtn` is lost in the snapshot (both collapse to Inactive). This is not documented.
|
||
CLAIM_TYPE / gap
|
||
VERDICT / gap
|
||
EVIDENCE / `AlarmDispatcher.MapConditionState` (`AlarmDispatcher.cs:221-234`): both `UnackRtn` and `AckRtn` map to `AlarmConditionState.Inactive`.
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / medium
|
||
PROPOSED_FIX / Document the state collapse rule: the `ActiveAlarmSnapshot.current_state` field does not distinguish between acknowledged-cleared and unacknowledged-cleared alarms; both surface as `Inactive`. Consumers that need this distinction must track the transition stream.
|
||
|
||
---
|
||
|
||
DOC / LINES / (entire doc — transition kind table)
|
||
CLAIM / (gap) The `AlarmTransitionKind` enum has a `Retrigger` value (`ALARM_TRANSITION_KIND_RETRIGGER = 4`), but the doc only describes Raise / Acknowledge / Clear.
|
||
CLAIM_TYPE / gap
|
||
VERDICT / gap
|
||
EVIDENCE / `mxaccess_gateway.proto:777`; `AlarmRecordTransitionMapper.MapTransition` does not produce `Retrigger` — it is defined in the proto but unused by the current mapping logic (`AlarmRecordTransitionMapper.cs:54-78`).
|
||
CODE_AREA / alarm.state
|
||
SEVERITY / low
|
||
PROPOSED_FIX / Note that `AlarmTransitionKind.Retrigger` exists in the proto but is not emitted by the current worker (the `*Rtn→*Alm` re-trigger case maps to `Raise`). Flag as reserved for future use or remove from the proto if unused.
|