docs(alarms): document alarmmgr->subtag fallback (providers, failover, config, contract, parity)
@@ -790,3 +790,127 @@ Post-ack transition: kind=Clear …
10s cadence held throughout; full proto fields populated correctly;
ack registered server-side without errors.

## Subtag-monitoring fallback provider

When the wnwrap alarm-manager source fails, the gateway worker switches to
`SubtagAlarmConsumer`, a synthetic alarm source that advises each alarm
attribute's subtags via the existing MXAccess `AddItem`/`Advise` pipeline and
derives alarm transitions from the resulting value-change stream. This is a
non-parity, degraded-mode source; every transition and snapshot it produces
carries `degraded = true`.

### Watch-list discovery

`GatewayAlarmMonitor` resolves the subtag watch-list at subscribe time by
calling `IAlarmWatchListResolver.GetAlarmAttributesAsync`. The resolver merges:

1. Galaxy Repository SQL (`GetAlarmAttributesAsync`): objects that have alarm
   extensions in the configured area.
2. Config overrides: `IncludeAttributes` adds explicit entries;
   `ExcludeAttributes` removes Repository-derived ones. The config list takes
   effect even when `UseGalaxyRepository` is `false`.

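The merge described above can be modeled in a few lines. This is a minimal
Python sketch, not the resolver's actual code: `WatchListConfig` and
`resolve_watch_list` are hypothetical names, attributes are plain strings, and
it assumes an explicit include wins over an exclude, which the text does not
state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchListConfig:
    # Field names mirror the config keys named in the text; types are assumed.
    use_galaxy_repository: bool = True
    include_attributes: frozenset[str] = frozenset()
    exclude_attributes: frozenset[str] = frozenset()


def resolve_watch_list(repository_attrs: list[str], config: WatchListConfig) -> list[str]:
    """Merge Repository-derived alarm attributes with config overrides."""
    # Ignore the Repository result entirely when the SQL source is disabled.
    merged = set(repository_attrs) if config.use_galaxy_repository else set()
    # ExcludeAttributes removes Repository-derived entries.
    merged -= config.exclude_attributes
    # IncludeAttributes adds explicit entries, even with the Repository off.
    # Assumption: an include is applied last, so it wins over an exclude.
    merged |= config.include_attributes
    return sorted(merged)
```

Sorting the result is only for a stable, comparable output in the sketch; the
text does not say the real watch-list is ordered.
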
The resolved list is a set of `AlarmSubtagTarget` messages sent to the worker
inside `SubscribeAlarmsCommand.watch_list`. Each target carries the composed
MXAccess item addresses for the `.active`, `.acked`, ack-comment, and priority
subtags. The gateway re-runs discovery on its reconcile cadence and pushes an
updated watch-list when the model changes.

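To make the shape of a target concrete, here is a hedged Python sketch of
composing the item addresses for one alarm reference. Only
`alarm_full_reference` and `ack_comment_subtag` are field names taken from this
document; the other field names, the `compose_target` helper, and the
`priority` default suffix are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmSubtagTarget:
    alarm_full_reference: str
    active_subtag: str        # hypothetical field name
    acked_subtag: str         # hypothetical field name
    ack_comment_subtag: str   # empty when no writable ack-comment subtag is configured
    priority_subtag: str      # hypothetical field name


def compose_target(reference: str, ack_comment: str = "", priority: str = "priority") -> AlarmSubtagTarget:
    """Compose item addresses as '<reference>.<suffix>' for each configured subtag."""

    def item(suffix: str) -> str:
        # An empty suffix means "not configured" and yields an empty address.
        return f"{reference}.{suffix}" if suffix else ""

    return AlarmSubtagTarget(
        alarm_full_reference=reference,
        active_subtag=item("active"),
        acked_subtag=item("acked"),
        ack_comment_subtag=item(ack_comment),
        priority_subtag=item(priority),
    )
```
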
### Subtag advise and `LmxSubtagAlarmSource`

`LmxSubtagAlarmSource` (implements `ISubtagAlarmSource`) owns a separate
`LMXProxyServerClass` instance on the worker STA; it does not share the
session's main MXAccess object. For each watch-list target it calls
`AddItem`/`Advise` on the configured subtag addresses. When a subtag value
changes, it raises `ValueChanged` on the STA and `SubtagAlarmConsumer`
forwards it to `SubtagAlarmStateMachine`.

`PollOnce()` on the subtag consumer is a no-op because the path is
event-driven through `Advise`, not poll-driven.

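The event-driven flow can be pictured with a small Python model. It is only a
sketch of the pattern (source raises, consumer forwards, polling does nothing);
the class and method names are stand-ins, and no MXAccess or STA behavior is
modeled.

```python
from typing import Callable

ValueHandler = Callable[[str, object], None]


class SubtagSource:
    """Stand-in for the advise-driven source; raises value changes to handlers."""

    def __init__(self) -> None:
        self._handlers: list[ValueHandler] = []

    def subscribe(self, handler: ValueHandler) -> None:
        self._handlers.append(handler)

    def raise_value_changed(self, address: str, value: object) -> None:
        for handler in self._handlers:
            handler(address, value)


class SubtagConsumer:
    """Forwards every value change to the state machine callback."""

    def __init__(self, source: SubtagSource, state_machine: ValueHandler) -> None:
        self._state_machine = state_machine
        source.subscribe(self._on_value_changed)

    def _on_value_changed(self, address: str, value: object) -> None:
        self._state_machine(address, value)

    def poll_once(self) -> None:
        # Intentionally a no-op: all work happens in _on_value_changed.
        return None
```
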
### Synthesis rules

`SubtagAlarmStateMachine` tracks `(active, acked)` per watch-list entry and
emits `MxAlarmTransitionEvent` records on change:

| Subtag change | Emitted transition | Notes |
|---|---|---|
| `active` false → true | Raise (`UNACK_ALM`) | `original_raise_timestamp` = first observed active time for this episode |
| `acked` false → true, while `active` | Acknowledge (`ACK_ALM`) | `AckedDuringEpisode` latch set |
| `active` true → false | Clear | `ACK_RTN` if `AckedDuringEpisode` is set, else `UNACK_RTN` |
| `acked` true → false, while `active` | (none) | Latch is NOT cleared; the episode retains its acknowledged status at clear |

The `AckedDuringEpisode` latch addresses out-of-order subtag delivery:
MXAccess does not guarantee the relative order of the `acked = false` and
`active = false` updates, so `acked` can already read `false` when the clear
is observed. The latch ensures a clear always emits `ACK_RTN` when the alarm
was acknowledged at any point during the active episode.

`SnapshotActive()` returns one `MxAlarmSnapshotRecord` per currently-active
alarm. State mapping:

- `active && !acked` → `UNACK_ALM`
- `active && acked` → `ACK_ALM`
- `!active` → not included in the snapshot

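The synthesis rules and the snapshot mapping above can be sketched together as
a small state machine. This is an illustrative Python model under stated
assumptions, not the `SubtagAlarmStateMachine` source: it returns state names
as strings instead of `MxAlarmTransitionEvent` records, uses `UNACK_RTN` as the
unacknowledged-clear name by analogy with `ACK_RTN`, and resets the latch on
each raise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Episode:
    active: bool = False
    acked: bool = False
    acked_during_episode: bool = False   # the AckedDuringEpisode latch
    raised_at: Optional[float] = None    # first observed active time


class SubtagStateMachine:
    def __init__(self) -> None:
        self._episodes: dict[str, Episode] = {}

    def on_active(self, ref: str, value: bool, ts: float) -> Optional[str]:
        ep = self._episodes.setdefault(ref, Episode())
        if value == ep.active:
            return None  # no change, nothing to emit
        ep.active = value
        if value:
            # Raise: start a new episode (assumption: the latch resets here).
            ep.raised_at = ts
            ep.acked_during_episode = False
            return "UNACK_ALM"
        # Clear: the latch, not the current acked value, picks the state.
        state = "ACK_RTN" if ep.acked_during_episode else "UNACK_RTN"
        ep.raised_at = None
        return state

    def on_acked(self, ref: str, value: bool) -> Optional[str]:
        ep = self._episodes.setdefault(ref, Episode())
        if value == ep.acked:
            return None
        ep.acked = value
        if value and ep.active:
            ep.acked_during_episode = True
            return "ACK_ALM"
        # acked true -> false emits nothing and leaves the latch untouched.
        return None

    def snapshot_active(self) -> dict[str, str]:
        # Inactive alarms are omitted; active ones map by current acked value.
        return {
            ref: ("ACK_ALM" if ep.acked else "UNACK_ALM")
            for ref, ep in self._episodes.items()
            if ep.active
        }
```

Feeding it raise, ack, `acked` reset, then clear for one reference yields
`UNACK_ALM`, `ACK_ALM`, nothing, and `ACK_RTN`, which is the out-of-order case
the latch exists for.
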
### Synthetic GUID

The alarmmgr provider supplies a native GUID per alarm record; the subtag
provider has none. `SubtagAlarmConsumer` derives a deterministic GUID by
hashing `alarm_full_reference` (via `SyntheticAlarmGuid.ForReference`). The
same reference always produces the same GUID within a session, so GUID-based
ack routing resolves correctly. The synthetic GUID does not correspond to any
AVEVA-internal GUID, so it cannot be used to correlate with native alarm
records, including across gateway restarts.

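One common way to get a deterministic GUID from a string is a name-based
(version 5) UUID. The Python sketch below shows that technique only as an
illustration; the hash that `SyntheticAlarmGuid.ForReference` actually uses is
not described here, and the namespace value is a made-up placeholder.

```python
import uuid

# Hypothetical namespace for the sketch; not the gateway's real seed.
_SUBTAG_ALARM_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "urn:example:subtag-alarm")


def synthetic_alarm_guid(alarm_full_reference: str) -> uuid.UUID:
    """Derive a stable GUID from the alarm reference (name-based UUID, SHA-1)."""
    return uuid.uuid5(_SUBTAG_ALARM_NAMESPACE, alarm_full_reference)
```

The same input always maps to the same value and different references map to
different values, which is the property GUID-based ack routing relies on.
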
### Acknowledge in subtag mode

`AlarmDispatcher` routes ack calls by active provider mode:

- **Alarm-manager mode:** `AlarmAckByName` on `wwAlarmConsumerClass` (unchanged).
- **Subtag mode:** `SubtagAlarmConsumer.AcknowledgeByName` resolves the
  watch-list entry's `ack_comment_subtag` and issues a `Write(comment)` on
  the STA via `LmxSubtagAlarmSource`. The write performs the ack in AVEVA.

If the alarm has no writable ack-comment subtag (`AckComment` config key is
empty, or the entry's `ack_comment_subtag` field is empty), the ack call
returns a failure code that the gateway surfaces as `FailedPrecondition`.
`AcknowledgeByGuid` maps the synthetic GUID back to its reference via an
internal dictionary, then calls the same write path.

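The subtag-mode ack path can be sketched as follows. This is a simplified
Python model under assumptions: `SubtagAcker`, `register`, and the `write`
callback are hypothetical, an exception stands in for the failure code the
gateway surfaces as `FailedPrecondition`, and the behavior for an unknown GUID
(a plain `KeyError` here) is not specified by the text.

```python
import uuid
from typing import Callable


class FailedPrecondition(Exception):
    """Stand-in for the failure the gateway surfaces as FailedPrecondition."""


class SubtagAcker:
    def __init__(self, write: Callable[[str, str], None]) -> None:
        self._write = write                          # write(item_address, comment)
        self._ack_subtag_by_ref: dict[str, str] = {}
        self._ref_by_guid: dict[uuid.UUID, str] = {}

    def register(self, reference: str, ack_comment_subtag: str, guid: uuid.UUID) -> None:
        self._ack_subtag_by_ref[reference] = ack_comment_subtag
        self._ref_by_guid[guid] = reference

    def acknowledge_by_name(self, reference: str, comment: str) -> None:
        subtag = self._ack_subtag_by_ref.get(reference, "")
        if not subtag:
            # No writable ack-comment subtag: the ack cannot be performed.
            raise FailedPrecondition(f"no ack-comment subtag for {reference}")
        # Writing the comment is what performs the acknowledge.
        self._write(subtag, comment)

    def acknowledge_by_guid(self, guid: uuid.UUID, comment: str) -> None:
        # Map the synthetic GUID back to its reference, then reuse the write path.
        self.acknowledge_by_name(self._ref_by_guid[guid], comment)
```
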
### Fidelity limitations

The following fields are not available or have lower quality in subtag mode:

| Field | Subtag-mode behavior |
|---|---|
| `alarm_guid` | Synthetic deterministic GUID from `alarm_full_reference`; not an AVEVA-native GUID |
| `original_raise_timestamp` | First observed `active = true` time; no AVEVA-native raise time |
| `transition_timestamp` | `OnDataChange` source timestamp from MXAccess |
| `severity` | From priority subtag if advised; 0 otherwise |
| `category` / `description` | Not populated (no subtag for these) |
| `current_value` / `limit_value` | Not populated unless corresponding subtags are in the watch-list |
| `alarm_type_name` | Not populated |
| `operator_user` / `operator_comment` | Not populated on synthesized raise/clear transitions |
| `retrigger` transition | Not synthesized (no re-alarm counter subtag is observed) |

Every transition and snapshot record carries `degraded = true` and
`source_provider = ALARM_PROVIDER_MODE_SUBTAG`. Clients that require full
fidelity must wait for failback to the alarm manager.

### Provider mode reflection

When `FailoverAlarmConsumer` switches between providers, it raises
`ProviderModeChanged`. `AlarmDispatcher` enqueues an
`OnAlarmProviderModeChangedEvent` (carried as an `MxEvent`), which the
gateway receives and reflects into:

- `AlarmFeedMessage.provider_status` emitted to every `StreamAlarms`
  subscriber.
- The `/hubs/alarms` SignalR hub for the dashboard.
- Metrics: `mxgateway.alarms.provider_mode` gauge and
  `mxgateway.alarms.provider_switches` counter.

On every switch `GatewayAlarmMonitor` also forces a reconcile
(`QueryActiveAlarms`) against the now-active provider so the gateway cache
reflects the post-switch state without a spurious raise/clear storm.

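A compact way to picture the switch handling is a listener that records the
new mode, bumps a counter, and replaces its cache from the now-active
provider. The Python sketch below is illustrative only; the class names, the
`query_active_alarms` callback, and the wholesale cache replacement are
assumptions, and the SignalR hub and stream fan-out are left out.

```python
from enum import Enum
from typing import Callable, Optional


class ProviderMode(Enum):
    ALARM_MANAGER = "alarm_manager"
    SUBTAG = "subtag"


class FailoverConsumer:
    def __init__(self) -> None:
        self.mode = ProviderMode.ALARM_MANAGER
        self._listeners: list[Callable[[ProviderMode], None]] = []

    def on_provider_mode_changed(self, listener: Callable[[ProviderMode], None]) -> None:
        self._listeners.append(listener)

    def switch_to(self, mode: ProviderMode) -> None:
        if mode is self.mode:
            return  # not a switch, so no event is raised
        self.mode = mode
        for listener in self._listeners:
            listener(mode)


class GatewayMonitor:
    def __init__(self, query_active_alarms: Callable[[ProviderMode], dict[str, str]]) -> None:
        self._query_active_alarms = query_active_alarms
        self.provider_status: Optional[ProviderMode] = None
        self.provider_switches = 0          # models the switches counter
        self.cache: dict[str, str] = {}     # reference -> alarm state

    def handle_mode_changed(self, mode: ProviderMode) -> None:
        self.provider_status = mode
        self.provider_switches += 1
        # Forced reconcile: rebuild the cache from the active provider's
        # snapshot rather than replaying individual raise/clear events.
        self.cache = dict(self._query_active_alarms(mode))
```
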