Document the session-less StreamAlarms feed and alarm config

Update the gateway docs for the central alarm monitor reversal:
Grpc.md replaces QueryActiveAlarms with the session-less StreamAlarms
RPC and notes AcknowledgeAlarm no longer needs a session;
Authorization.md maps StreamAlarmsRequest to events:read;
GatewayConfiguration.md adds the MxGateway:Alarms options block; and
GatewayDashboardDesign.md points the Alarms page at the central
monitor cache instead of a per-session subscription.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-21 16:59:48 -04:00
parent 6777d49030
commit de58872435
4 changed files with 51 additions and 27 deletions
+3 -3
View File
@@ -103,7 +103,7 @@ public string ResolveRequiredScope(object request)
StreamEventsRequest => GatewayScopes.EventsRead, StreamEventsRequest => GatewayScopes.EventsRead,
MxCommandRequest commandRequest => ResolveCommandScope(commandRequest.Command?.Kind ?? MxCommandKind.Unspecified), MxCommandRequest commandRequest => ResolveCommandScope(commandRequest.Command?.Kind ?? MxCommandKind.Unspecified),
AcknowledgeAlarmRequest => GatewayScopes.InvokeWrite, AcknowledgeAlarmRequest => GatewayScopes.InvokeWrite,
QueryActiveAlarmsRequest => GatewayScopes.EventsRead, StreamAlarmsRequest => GatewayScopes.EventsRead,
TestConnectionRequest or TestConnectionRequest or
GetLastDeployTimeRequest or GetLastDeployTimeRequest or
DiscoverHierarchyRequest or DiscoverHierarchyRequest or
@@ -113,7 +113,7 @@ public string ResolveRequiredScope(object request)
} }
``` ```
The `_ => GatewayScopes.Admin` fallback is intentional: any future request type that the resolver does not recognize fails closed, requiring the strongest scope until the resolver is updated. `AcknowledgeAlarm` is treated as a write — it mutates alarm state, mirroring `MxCommandKind.Write*` — and `QueryActiveAlarms` shares the alarm/event surface with `StreamEvents` and `MxCommandKind.DrainEvents`, so it carries `events:read`. The `_ => GatewayScopes.Admin` fallback is intentional: any future request type that the resolver does not recognize fails closed, requiring the strongest scope until the resolver is updated. `AcknowledgeAlarm` is treated as a write — it mutates alarm state, mirroring `MxCommandKind.Write*` — and `StreamAlarms` shares the alarm/event surface with `StreamEvents` and `MxCommandKind.DrainEvents`, so it carries `events:read`. Both alarm RPCs are session-less: the scope check is the only authorization gate, since there is no per-session ownership to enforce.
`MxCommandRequest` is special because it multiplexes many MxAccess operations through a single RPC. The resolver inspects the embedded `MxCommandKind` so each operation gets its own scope: `MxCommandRequest` is special because it multiplexes many MxAccess operations through a single RPC. The resolver inspects the embedded `MxCommandKind` so each operation gets its own scope:
@@ -205,7 +205,7 @@ blocking constraint; secured values and raw credentials are never logged.
|----------|-------|--------------| |----------|-------|--------------|
| `SessionOpen` | `session:open` | `OpenSessionRequest` | | `SessionOpen` | `session:open` | `OpenSessionRequest` |
| `SessionClose` | `session:close` | `CloseSessionRequest` | | `SessionClose` | `session:close` | `CloseSessionRequest` |
| `EventsRead` | `events:read` | `StreamEventsRequest`, `QueryActiveAlarmsRequest`, `MxCommandKind.DrainEvents` | | `EventsRead` | `events:read` | `StreamEventsRequest`, `StreamAlarmsRequest`, `MxCommandKind.DrainEvents` |
| `InvokeRead` | `invoke:read` | `MxCommandRequest` for read-style command kinds (`Register`, `AddItem`, `Advise`, `ReadBulk`, and any kind not otherwise mapped) | | `InvokeRead` | `invoke:read` | `MxCommandRequest` for read-style command kinds (`Register`, `AddItem`, `Advise`, `ReadBulk`, and any kind not otherwise mapped) |
| `InvokeWrite` | `invoke:write` | `AcknowledgeAlarmRequest`, `MxCommandKind.Write`, `MxCommandKind.Write2`, `MxCommandKind.WriteBulk`, `MxCommandKind.Write2Bulk` | | `InvokeWrite` | `invoke:write` | `AcknowledgeAlarmRequest`, `MxCommandKind.Write`, `MxCommandKind.Write2`, `MxCommandKind.WriteBulk`, `MxCommandKind.Write2Bulk` |
| `InvokeSecure` | `invoke:secure` | `MxCommandKind.WriteSecured`, `MxCommandKind.WriteSecured2`, `MxCommandKind.WriteSecuredBulk`, `MxCommandKind.WriteSecured2Bulk`, `MxCommandKind.AuthenticateUser` | | `InvokeSecure` | `invoke:secure` | `MxCommandKind.WriteSecured`, `MxCommandKind.WriteSecured2`, `MxCommandKind.WriteSecuredBulk`, `MxCommandKind.WriteSecured2Bulk`, `MxCommandKind.AuthenticateUser` |
+20
View File
@@ -61,6 +61,12 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
"ConnectionString": "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;", "ConnectionString": "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
"CommandTimeoutSeconds": 60, "CommandTimeoutSeconds": 60,
"DashboardRefreshIntervalSeconds": 30 "DashboardRefreshIntervalSeconds": 30
},
"Alarms": {
"Enabled": false,
"SubscriptionExpression": "",
"DefaultArea": "",
"ReconcileIntervalSeconds": 30
} }
} }
} }
@@ -168,6 +174,20 @@ at startup.
See [Galaxy Repository Browse](./GalaxyRepository.md) for the RPC surface and See [Galaxy Repository Browse](./GalaxyRepository.md) for the RPC surface and
behavior. behavior.
## Alarm Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Alarms:Enabled` | `false` | Gates the gateway's always-on central alarm monitor. When `true`, the gateway opens one gateway-owned worker session dedicated to alarms, caches the active-alarm set, and fans it out to every client through the `StreamAlarms` RPC — no client opens its own session to see alarms. |
| `MxGateway:Alarms:SubscriptionExpression` | _(empty)_ | AVEVA alarm-subscription expression the monitor subscribes on startup, in canonical `\\<machine>\Galaxy!<area>` form. The literal `Galaxy` provider is correct regardless of the Galaxy database name. When empty and `Enabled` is `true`, the gateway falls back to `\\<MachineName>\Galaxy!<DefaultArea>` if `DefaultArea` is set. |
| `MxGateway:Alarms:DefaultArea` | _(empty)_ | Area name used to compose a default subscription when `SubscriptionExpression` is empty. If both are empty while `Enabled` is `true`, the monitor faults with a configuration diagnostic. |
| `MxGateway:Alarms:ReconcileIntervalSeconds` | `30` | How often the monitor reconciles its in-process alarm cache against the worker's authoritative active-alarm snapshot, catching transitions the live poll-and-diff feed missed. Floored at 5 seconds. |
The alarm monitor is independent of client sessions: `AcknowledgeAlarm` and
`StreamAlarms` are session-less RPCs served by the monitor. See
[Alarm Client Discovery](./AlarmClientDiscovery.md) for the AVEVA consumer
surface the monitor's worker session drives.
## Related Documentation ## Related Documentation
- [Gateway Process Detailed Design](./GatewayProcessDesign.md) - [Gateway Process Detailed Design](./GatewayProcessDesign.md)
+22 -18
View File
@@ -274,28 +274,32 @@ diagnostic session/worker views.
### Alarms page ### Alarms page
`/dashboard/alarms` lists the alarms the dashboard session's worker currently `/dashboard/alarms` lists the alarms the gateway's central alarm monitor
reports as Active or ActiveAcked, refreshed every three seconds. It defaults to currently holds as Active or ActiveAcked, refreshed every three seconds. It
showing unacknowledged `Active` alarms; filters add acknowledged alarms and defaults to showing unacknowledged `Active` alarms; filters add acknowledged
narrow by area, severity range, and a reference/source/description text search. alarms and narrow by area, severity range, and a reference/source/description
Cleared alarms are not retained — the gateway holds no alarm-history store, so text search. Cleared alarms are not retained — the gateway holds no
the page reflects only the live active set. The page is read-only; it does not alarm-history store, so the page reflects only the live active set. The page is
acknowledge alarms. If `MxGateway:Alarms:Enabled` is false the session is never read-only; it does not acknowledge alarms. If `MxGateway:Alarms:Enabled` is
subscribed to an alarm provider, and the page says so instead of showing an false the central monitor never starts, and the page says so instead of showing
empty list with no explanation. an empty list with no explanation.
### Live data source ### Live data source
Both the Browse subscription panel and the Alarms page read live MXAccess data Both the Browse subscription panel and the Alarms page read live MXAccess data
through `IDashboardLiveDataService` (`DashboardLiveDataService`). It owns one through `IDashboardLiveDataService` (`DashboardLiveDataService`). For tag data
shared gateway session for the whole dashboard, opened lazily on first use via it owns one shared gateway session for the whole dashboard, opened lazily on
`ISessionManager` and re-opened transparently when it faults or its lease first use via `ISessionManager` and re-opened transparently when it faults or
expires. One session means one worker process backs every dashboard circuit; its lease expires. One session means one worker process backs every dashboard
all access is serialised so the worker sees one in-flight command at a time. circuit; all access is serialised so the worker sees one in-flight command at a
Tag reads go through `GatewaySession.SubscribeBulkAsync` / `ReadBulkAsync`; time. Tag reads go through `GatewaySession.SubscribeBulkAsync` / `ReadBulkAsync`.
alarm queries go through `IAlarmRpcDispatcher`. Alarm subscription is the
gateway's existing auto-subscribe-on-open hook, so the dashboard session is The Alarms page does **not** use the dashboard session: alarm data comes from
alarm-subscribed only when `MxGateway:Alarms:Enabled` is set. the gateway's always-on central monitor. `QueryAlarmsAsync` reads
`IGatewayAlarmService.CurrentAlarms` — the monitor's in-process cache — so the
dashboard sees the same active-alarm set as every `StreamAlarms` client, with
no per-dashboard alarm subscription. When `MxGateway:Alarms:Enabled` is false
the monitor never starts and the cache stays empty.
### API keys page ### API keys page
+6 -6
View File
@@ -29,7 +29,7 @@ A second gRPC service, `GalaxyRepositoryGrpcService`, is mapped alongside it. It
## RPC Handlers ## RPC Handlers
`MxAccessGatewayService` derives from the generated `MxAccessGateway.MxAccessGatewayBase` and implements every RPC declared in `mxaccess_gateway.proto` — six in total: `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, and `QueryActiveAlarms`. The proto contract itself is documented in [Contracts](./Contracts.md); this section covers only what the server-side handler does on top of that contract. `MxAccessGatewayService` derives from the generated `MxAccessGateway.MxAccessGatewayBase` and implements every RPC declared in `mxaccess_gateway.proto` — six in total: `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, and `StreamAlarms`. The proto contract itself is documented in [Contracts](./Contracts.md); this section covers only what the server-side handler does on top of that contract.
Public gRPC send and receive message sizes are configured from Public gRPC send and receive message sizes are configured from
`MxGateway:Protocol:MaxGrpcMessageBytes` (default 16 MiB). Official clients use `MxGateway:Protocol:MaxGrpcMessageBytes` (default 16 MiB). Official clients use
@@ -88,11 +88,11 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim
### `AcknowledgeAlarm` ### `AcknowledgeAlarm`
`AcknowledgeAlarm` is a unary RPC that acknowledges a single alarm. The handler validates `session_id` and `alarm_full_reference` inline (it does not run through `MxAccessGrpcRequestValidator`, because the alarm surface routes through `IAlarmRpcDispatcher` rather than the generic `Invoke` path), resolves the session, then delegates to the registered `IAlarmRpcDispatcher`. The production `WorkerAlarmRpcDispatcher` routes the ack over the worker IPC by GUID (`AcknowledgeAlarmCommand`) when the reference parses as a canonical GUID, or by `Provider!Group.Tag` reference (`AcknowledgeAlarmByNameCommand`) otherwise. The handler-level RPC behaviour and the alarm contract itself are documented in [Alarm Client Discovery](./AlarmClientDiscovery.md). `AcknowledgeAlarm` is a unary, **session-less** RPC that acknowledges a single alarm. The handler validates `alarm_full_reference` inline (it does not run through `MxAccessGrpcRequestValidator`) and delegates to `IGatewayAlarmService.AcknowledgeAsync`. The always-on `GatewayAlarmMonitor` routes the ack over its own gateway-managed worker session — clients no longer open a session to acknowledge an alarm. A reference that parses as a canonical GUID forwards to `AcknowledgeAlarmCommand`; a `Provider!Group.Tag` reference forwards to `AcknowledgeAlarmByNameCommand`. The alarm contract and the central monitor are documented in [Alarm Client Discovery](./AlarmClientDiscovery.md).
### `QueryActiveAlarms` ### `StreamAlarms`
`QueryActiveAlarms` is a server-streaming RPC that returns an `ActiveAlarmSnapshot` per currently active alarm. The handler validates `session_id` inline, resolves the session, and delegates to `IAlarmRpcDispatcher`; `WorkerAlarmRpcDispatcher` issues a `QueryActiveAlarmsCommand` over the worker IPC and streams each snapshot from the worker reply. `StreamAlarms` is a server-streaming, **session-less** RPC that attaches to the gateway's central alarm feed. The handler delegates to `IGatewayAlarmService.StreamAsync`. The stream opens with one `AlarmFeedMessage` carrying an `active_alarm` per currently-active alarm (the ConditionRefresh snapshot), then a single `snapshot_complete`, then a `transition` for every subsequent raise / acknowledge / clear. It is served by the always-on `GatewayAlarmMonitor`, which owns a single gateway-managed worker session and fans out to every attached client — clients no longer open a session of their own. `alarm_filter_prefix`, when set, scopes the stream to a sub-tree.
## Validation Rules ## Validation Rules
@@ -104,8 +104,8 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim
| `CloseSession` | `session_id` must be non-empty. | `InvalidArgument` | | `CloseSession` | `session_id` must be non-empty. | `InvalidArgument` |
| `StreamEvents` | `session_id` must be non-empty. | `InvalidArgument` | | `StreamEvents` | `session_id` must be non-empty. | `InvalidArgument` |
| `Invoke` | `session_id` non-empty, `command` present, `kind` not `Unspecified`, payload oneof must match `kind`. | `InvalidArgument` | | `Invoke` | `session_id` non-empty, `command` present, `kind` not `Unspecified`, payload oneof must match `kind`. | `InvalidArgument` |
| `AcknowledgeAlarm` | `session_id` and `alarm_full_reference` must be non-empty. Validated inline in the handler, not by `MxAccessGrpcRequestValidator`. | `InvalidArgument` | | `AcknowledgeAlarm` | `alarm_full_reference` must be non-empty. Validated inline in the handler, not by `MxAccessGrpcRequestValidator`. | `InvalidArgument` |
| `QueryActiveAlarms` | `session_id` must be non-empty. Validated inline in the handler, not by `MxAccessGrpcRequestValidator`. | `InvalidArgument` | | `StreamAlarms` | No required fields — `alarm_filter_prefix` is optional. | — |
The payload-vs-kind check matters because the `MxCommand.payload` oneof is non-discriminated on the wire — a misaligned client could send `kind = Write` with a `Register` payload and silently confuse the worker. The validator turns that into a clear client error: The payload-vs-kind check matters because the `MxCommand.payload` oneof is non-discriminated on the wire — a misaligned client could send `kind = Write` with a `Register` payload and silently confuse the worker. The validator turns that into a clear client error: