docs(alarms): document alarmmgr->subtag fallback (providers, failover, config, contract, parity)
@@ -94,6 +94,73 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim
`StreamAlarms` is a server-streaming, **session-less** RPC that attaches to the gateway's central alarm feed. The handler delegates to `IGatewayAlarmService.StreamAsync`. The stream opens with one `AlarmFeedMessage` per currently active alarm, each carrying an `active_alarm` payload (the ConditionRefresh snapshot), then a single `snapshot_complete`, then a `transition` for every subsequent raise / acknowledge / clear. It is served by the always-on `GatewayAlarmMonitor`, which owns a single gateway-managed worker session and fans out to every attached client, so clients no longer open a session of their own. `alarm_filter_prefix`, when set, scopes the stream to a sub-tree.

#### Provider status on the alarm feed

`AlarmFeedMessage` has a fourth `payload` case, `provider_status`, carrying
an `AlarmProviderStatus` message:

```protobuf
message AlarmProviderStatus {
  AlarmProviderMode mode = 1;
  bool degraded = 2;   // true whenever mode == SUBTAG
  string reason = 3;   // human-readable switch reason
  google.protobuf.Timestamp since = 4;
}
```

The gateway emits `provider_status` once when a client first subscribes
(immediately after the initial snapshot and before the first live transition)
and again on every failover or failback. A late-joining client therefore
always learns the current provider mode without waiting for the next switch.

`AlarmProviderMode` is an enum with three values:

| Value | Meaning |
|-------|---------|
| `ALARM_PROVIDER_MODE_UNSPECIFIED` (0) | Default / unset |
| `ALARM_PROVIDER_MODE_ALARMMGR` (1) | Native wnwrap alarm-manager source |
| `ALARM_PROVIDER_MODE_SUBTAG` (2) | Subtag-monitoring fallback (degraded) |

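The message ordering described in this section (snapshot, `snapshot_complete`, an initial `provider_status`, then live transitions and further `provider_status` updates on each switch) can be sketched from the client's side. This is a minimal illustration only, not the gateway's client API: `FeedMessage`, `FeedState`, and `consume` are hypothetical stand-ins for the generated protobuf classes and a real gRPC stream iterator, and the string `kind` field stands in for the `payload` oneof case.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Iterable, List, Optional


class AlarmProviderMode(IntEnum):
    # Mirrors the enum values documented in the table above.
    UNSPECIFIED = 0
    ALARMMGR = 1
    SUBTAG = 2


@dataclass
class FeedMessage:
    """Hypothetical stand-in for AlarmFeedMessage; `kind` names the payload case."""
    kind: str  # "active_alarm" | "snapshot_complete" | "provider_status" | "transition"
    alarm: Optional[str] = None               # set for active_alarm / transition
    mode: Optional[AlarmProviderMode] = None  # set for provider_status


@dataclass
class FeedState:
    """Client-side view built up while reading the stream."""
    active: List[str] = field(default_factory=list)
    snapshot_done: bool = False
    mode: AlarmProviderMode = AlarmProviderMode.UNSPECIFIED
    transitions: List[str] = field(default_factory=list)


def consume(messages: Iterable[FeedMessage]) -> FeedState:
    """Fold a sequence of feed messages into a FeedState."""
    state = FeedState()
    for msg in messages:
        if msg.kind == "active_alarm":
            # Part of the initial snapshot.
            state.active.append(msg.alarm)
        elif msg.kind == "snapshot_complete":
            state.snapshot_done = True
        elif msg.kind == "provider_status":
            # Sent once after the snapshot, then on every failover or failback.
            state.mode = msg.mode
        elif msg.kind == "transition":
            state.transitions.append(msg.alarm)
    return state
```

Because the latest `provider_status` simply overwrites `state.mode`, a client that joins late still ends up with the current mode as soon as the initial status message arrives.
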
#### Degraded and source-provider fields on transitions and snapshots

`OnAlarmTransitionEvent` and `ActiveAlarmSnapshot` both carry two new fields:

- `bool degraded` (field 14): `true` when the record came from the subtag
  fallback, not the native alarmmgr.
- `AlarmProviderMode source_provider` (field 15): which provider produced
  this record (`ALARMMGR` or `SUBTAG`).

Both fields are proto3 defaults (`false` / `UNSPECIFIED`) in alarmmgr mode,
so existing clients that do not read them continue to function without change.
Clients that care about provenance (for example, an OPC UA server that
applies different quality flags to degraded alarms) should inspect `degraded`
before consuming the transition.

Subtag-mode records do not have parity with alarmmgr records: they carry
synthetic GUIDs, best-effort timestamps, and reduced field coverage. See
`docs/AlarmClientDiscovery.md` for the full fidelity table.

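As a rough illustration of the provenance check described above, the following Python sketch maps a transition to a quality label. The `Transition` class is a hypothetical stand-in for the generated `OnAlarmTransitionEvent` type, and the `"good"` / `"uncertain"` labels are invented for the example; they are not OPC UA status codes or anything the gateway defines. The sketch also assumes that a `source_provider` of `UNSPECIFIED` should be treated like the native source, which follows from the proto3-default behavior stated above.

```python
from dataclasses import dataclass
from enum import IntEnum


class AlarmProviderMode(IntEnum):
    UNSPECIFIED = 0
    ALARMMGR = 1
    SUBTAG = 2


@dataclass
class Transition:
    """Hypothetical stand-in for OnAlarmTransitionEvent (only the fields used here)."""
    alarm: str
    degraded: bool = False  # field 14; proto3 default in alarmmgr mode
    source_provider: AlarmProviderMode = AlarmProviderMode.UNSPECIFIED  # field 15


def quality_for(t: Transition) -> str:
    """Return an illustrative quality label for a transition record."""
    # Either signal marks the record as coming from the subtag fallback.
    if t.degraded or t.source_provider == AlarmProviderMode.SUBTAG:
        return "uncertain"
    # UNSPECIFIED (older producers, alarmmgr mode) and ALARMMGR are treated alike.
    return "good"
```

Checking both fields is deliberately conservative: a record flagged by either one is handled as degraded.
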
#### Provider-mode-changed event

The worker emits `OnAlarmProviderModeChangedEvent` (family
`MX_EVENT_FAMILY_ON_ALARM_PROVIDER_MODE_CHANGED`) on each switch between
providers:

```protobuf
message OnAlarmProviderModeChangedEvent {
  AlarmProviderMode mode = 1;
  string reason = 2;
  int32 hresult = 3;   // COM HRESULT that triggered failover; 0 on failback
  google.protobuf.Timestamp at = 4;
}
```

This event arrives on the `StreamEvents` stream of the alarm monitor's
internal gateway session (not on client sessions). `GatewayAlarmMonitor`
consumes it and reflects the new mode into the `StreamAlarms` feed's
`provider_status`, the dashboard hub, and metrics. Client sessions do not
receive this event directly.

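The step where the monitor reflects a mode change into `provider_status` might look roughly like the sketch below. It is written in Python for brevity and is not the gateway's implementation: `ModeChangedEvent`, `ProviderStatus`, `to_provider_status`, and `format_hresult` are hypothetical names, and copying `at` into `since` is an assumption. The `degraded` rule comes from the `AlarmProviderStatus` field comment (`true` whenever the mode is `SUBTAG`).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import IntEnum


class AlarmProviderMode(IntEnum):
    UNSPECIFIED = 0
    ALARMMGR = 1
    SUBTAG = 2


@dataclass(frozen=True)
class ModeChangedEvent:
    """Hypothetical stand-in for OnAlarmProviderModeChangedEvent."""
    mode: AlarmProviderMode
    reason: str
    hresult: int  # signed int32; 0 on failback
    at: datetime


@dataclass(frozen=True)
class ProviderStatus:
    """Hypothetical stand-in for AlarmProviderStatus."""
    mode: AlarmProviderMode
    degraded: bool
    reason: str
    since: datetime


def to_provider_status(event: ModeChangedEvent) -> ProviderStatus:
    """Derive the feed-level status from a worker mode-change event."""
    return ProviderStatus(
        mode=event.mode,
        degraded=(event.mode == AlarmProviderMode.SUBTAG),
        reason=event.reason,
        since=event.at,  # assumption: the status dates from the switch time
    )


def format_hresult(hresult: int) -> str:
    """Render a signed int32 HRESULT in the usual unsigned hex form."""
    return f"0x{hresult & 0xFFFFFFFF:08X}"
```

Since `hresult` travels as a signed `int32`, failure codes arrive as negative numbers; masking to 32 bits before formatting gives the familiar form for logs and dashboards (for example, `-2147467259` renders as `0x80004005`).
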
## Validation Rules

`MxAccessGrpcRequestValidator` rejects requests with `StatusCode.InvalidArgument` before any session work happens. The rules are intentionally narrow: anything that requires session state (for example, "session does not exist") is left for `ISessionManager` so the validator can stay synchronous and side-effect free.