From 49ae6e7b6fe66c68af8aef829dcb32a16e68a62b Mon Sep 17 00:00:00 2001 From: Joseph Doherty Date: Thu, 30 Apr 2026 15:20:57 -0400 Subject: [PATCH] =?UTF-8?q?docs:=20alarms-over-gateway=20=E2=80=94=20add?= =?UTF-8?q?=20Track=20E=20client=20surface=20refresh?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cover both client surfaces that become user-visible when the alarm path lights up: - mxaccessgw client SDKs in 5 languages (.NET, Python, Go, Java, Rust). E.1 regens proto across all of them; E.2-E.6 add per-language alarms helpers (subscribe / acknowledge / query-active) plus matching CLI verbs. - lmxopcua OPC UA-facing clients (Client.CLI, Client.UI). E.7 extends AlarmEventArgs with the new optional fields, surfaces them in the CLI's --verbose / --json output and the UI's Show-details toggle, and updates ClientRequirements + Client.{CLI,UI}.md. Sequencing: E.1 first (mechanical regen), then E.2-E.7 in parallel. E.2 (.NET) is on the critical path because lmxopcua consumes it; the other-language SDKs can ship asynchronously without gating D.1. 12 PRs grew to 19 total: 4 in A, 5 in B, 2 in C, 7 in E, 1 in D. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/plans/alarms-over-gateway.md | 377 +++++++++++++++++++++++++++--- 1 file changed, 342 insertions(+), 35 deletions(-) diff --git a/docs/plans/alarms-over-gateway.md b/docs/plans/alarms-over-gateway.md index 85ed848..4d62a0d 100644 --- a/docs/plans/alarms-over-gateway.md +++ b/docs/plans/alarms-over-gateway.md @@ -534,6 +534,252 @@ completes that slot. Two PRs in the sidecar + one consumer-side PR C.2's lmxopcua-side consumer is **PR B.4 in Track B**, which depends on C.2 being deployed. 
+## Track E — client surface refresh + +Two surfaces become user-visible when the alarm path lights up: the +**mxaccessgw client SDKs** (5 languages, each with its own CLI) that +consume the new `OnAlarmTransition` event family + `AcknowledgeAlarm` +/ `QueryActiveAlarms` RPCs directly, and the **lmxopcua OPC UA-facing +clients** (Client.CLI, Client.UI) that consume the richer Part 9 +condition payload through the OPC UA server. Both need updates so the +new fields actually reach end-users; without Track E, the data +arrives at the gateway / OPC UA server but the off-the-shelf clients +display the same five columns they did under v2-pre-this-epic. + +Track E is split per-language so each PR stays small and reviewable. +PRs E.2 through E.6 are independent — they share only the proto +regen from E.1 — and can land in parallel by whoever owns each +language binding. + +### PR E.1 — regenerate proto across all client SDKs + +**Depends on:** A.1 merged (proto change live). + +**Files** (`c:\Users\dohertj2\Desktop\mxaccessgw\clients\`): + +1. **.NET** — codegen runs on csproj rebuild via `Grpc.Tools`; just + rebuild `MxGateway.Client.csproj` after pulling A.1. +2. **Python** — run `clients\python\generate-proto.ps1`; commit the + regenerated `_pb2.py` + `_pb2_grpc.py` files under + `clients\python\src\`. +3. **Go** — run `clients\go\generate-proto.ps1`; commit the + regenerated `*.pb.go` + `*_grpc.pb.go` files under + `clients\go\mxgateway\`. +4. **Java** — Gradle's `protobuf-gradle-plugin` regenerates on + `gradle build`; verify the new types appear in the build + output. Commit any pinned generated source under + `clients\java\mxgateway-client\src\main\java\` if that's the + convention (check `JavaClientDesign.md`). +5. **Rust** — `build.rs` runs `tonic-build` on the proto; just + `cargo build`. Generated code lives under + `clients\rust\target\` (gitignored) — nothing to commit; + verify the new types compile. + +No hand-written code in this PR. 
Pure regen + commit of generated +artifacts. Per-language pre-existing proto-regen tests in each +client's test suite must stay green. + +### PR E.2 — .NET client SDK + CLI + +**Depends on:** E.1, A.3 (gateway alarm dispatch + ack RPC live). + +**Files** (`clients\dotnet\MxGateway.Client\` + `MxGateway.Client.Cli\`): + +1. `MxGatewayClient.cs` — new public methods: + ```csharp + IAsyncEnumerable<AlarmTransition> SubscribeAlarmsAsync( + MxGatewaySession session, + AlarmFilter? filter = null, + CancellationToken ct = default); + Task<MxStatus> AcknowledgeAlarmAsync( + MxGatewaySession session, + string alarmFullReference, + string comment, + string userPrincipal, + CancellationToken ct = default); + IAsyncEnumerable<ActiveAlarmSnapshot> QueryActiveAlarmsAsync( + MxGatewaySession session, + string? filterPrefix = null, + CancellationToken ct = default); + ``` + Existing `MxGatewayClientRetryPolicy` covers the new operations + without bespoke retry config. +2. `MxGateway.Client.Cli` — add `alarms` verb with subcommands: + `subscribe` (streams transitions until cancelled), + `acknowledge --ref <ref> --comment "<text>"`, + `query-active [--prefix <prefix>]`. Output formatting mirrors + the existing `events stream` verb (default human-readable + + `--json` flag for machine output). +3. AuthZ — `MxGatewayClientOptions` validates that the new scopes + `invoke:alarm-ack` / `invoke:alarm-query` exist on the API key + when those operations are invoked; the pre-flight check fails fast + with a clear error rather than letting the gateway return + `PERMISSION_DENIED` mid-stream. + +**Tests** (`clients\dotnet\MxGateway.Client.Tests\`): + +- `FakeGatewayTransport` extended to emit `OnAlarmTransition` + events; assert `SubscribeAlarmsAsync` yields each with the right + payload shape. +- Ack: assert request shape, retry policy, and error wrapping + (Unauthenticated → `MxGatewayAuthenticationException`, + PermissionDenied → `MxGatewayAuthorizationException`, + ResourceExhausted → `MxGatewayException` with the right + message). 
+- CLI verb tests in `MxGatewayClientCliTests.cs` — argument + parsing, JSON output shape, exit codes. + +### PR E.3 — Python client SDK + CLI + +**Depends on:** E.1. + +**Files** (`clients\python\src\mxgateway\` + the existing CLI entry +point — verify the exact name during PR; `PythonClientDesign.md` +documents it): + +1. New module `alarms.py` exposing async helpers: + ```python + async def subscribe_alarms(session, *, filter=None) -> AsyncIterator[AlarmTransition]: ... + async def acknowledge_alarm(session, *, alarm_ref, comment, user) -> MxStatus: ... + async def query_active_alarms(session, *, prefix=None) -> AsyncIterator[ActiveAlarmSnapshot]: ... + ``` +2. CLI: add `alarms subscribe / acknowledge / query-active` verbs. + Use the same JSON output schema as E.2's CLI so cross-language + tooling can parse either. +3. Type stubs (`*.pyi`) updated for the new types. + +**Tests** (`clients\python\tests\`): + +- pytest-asyncio fixtures using a stub gRPC server; assert each + helper's request/response shape. +- CLI smoke via `subprocess` + captured stdout JSON comparison. + +### PR E.4 — Go client SDK + CLI + +**Depends on:** E.1. + +**Files** (`clients\go\mxgateway\` + `clients\go\cmd\`): + +1. New `alarms.go` exposing: + ```go + func (c *Client) SubscribeAlarms(ctx context.Context, opts ...SubscribeOption) (<-chan AlarmTransition, error) + func (c *Client) AcknowledgeAlarm(ctx context.Context, ref, comment, user string) (MxStatus, error) + func (c *Client) QueryActiveAlarms(ctx context.Context, prefix string) ([]ActiveAlarmSnapshot, error) + ``` +2. CLI: add `alarms` subcommand under `clients\go\cmd\mxgateway-cli\` + (verify the binary name in `GoClientDesign.md`). Same verb shape + as E.2 / E.3. +3. Errors wrapped via `errors.Is` against named sentinels + (`ErrAuthFailed`, `ErrPermissionDenied`, etc.) so callers can + programmatically distinguish failure modes. 
+ +**Tests:** standard Go table-driven tests against a stub gRPC server +under `clients\go\internal\testserver\`. + +### PR E.5 — Java client SDK + CLI + +**Depends on:** E.1. + +**Files** (`clients\java\mxgateway-client\src\main\java\` + +`clients\java\mxgateway-cli\`): + +1. New methods on the existing client class (verify in + `JavaClientDesign.md`): + ```java + Flowable<AlarmTransition> subscribeAlarms(Session s, AlarmFilter filter); + Single<MxStatus> acknowledgeAlarm(Session s, String alarmRef, String comment, String user); + Flowable<ActiveAlarmSnapshot> queryActiveAlarms(Session s, String prefix); + ``` + (RxJava idiom matching the existing data-change subscription + API; if the existing API uses `CompletableFuture` instead, follow + that convention — verify during PR.) +2. CLI: same `alarms subscribe / acknowledge / query-active` + verbs. + +**Tests:** JUnit 5 + a stub gRPC server. CLI tested via +`ProcessBuilder` exec + JSON output comparison. + +### PR E.6 — Rust client SDK + +**Depends on:** E.1. + +**Files** (`clients\rust\crates\mxgateway-client\src\` + +likely a `mxgateway-cli` crate — verify in `RustClientDesign.md`): + +1. New methods on the client struct: + ```rust + pub fn subscribe_alarms(&self, filter: Option<AlarmFilter>) -> impl Stream<Item = Result<AlarmTransition, Error>>; + pub async fn acknowledge_alarm(&self, alarm_ref: &str, comment: &str, user: &str) -> Result<MxStatus, Error>; + pub fn query_active_alarms(&self, prefix: Option<&str>) -> impl Stream<Item = Result<ActiveAlarmSnapshot, Error>>; + ``` +2. CLI: same verb shape. +3. `thiserror`-based error enum extended with `AlarmAckPermissionDenied` + etc. variants if the existing pattern uses one. + +**Tests:** `tokio::test` against a stub gRPC server built on the +`tonic-build`-generated server trait. CLI tested via `assert_cmd`. + +### PR E.7 — lmxopcua OPC UA-facing client refresh + +**Depends on:** B.2 + B.3 (server-side payload final on the OPC UA +wire). Independent of E.2-E.6 — different consumer surface (OPC UA +Part 9, not gateway gRPC). + +**Files** (`c:\Users\dohertj2\Desktop\lmxopcua\src\`): + +1. 
`Core.Abstractions\AlarmEventArgs.cs` *(extend, not new)* — add + optional fields the new path surfaces: + - `OperatorComment` (nullable string — populated by the native + ack path; null on sub-attribute fallback path) + - `OriginalRaiseTimestampUtc` (nullable; null on fallback path) + - `AlarmCategory` (nullable string) + - `AlarmTypeName` (already exists per v1 docs — leave alone) +2. `Server\OpcUa\DriverNodeManager.cs` — populate the corresponding + OPC UA Part 9 condition fields when the new payload is non-null: + `Comment` (from OperatorComment), `Time` (from OriginalRaiseTimestampUtc + when present, else event arrival time), `ConditionClassName` (from + AlarmCategory if mapping is defined). +3. `Client.Shared\Models\AlarmEventArgs.cs` — mirror the new fields + on the client-side DTO. +4. `Client.CLI\Commands\AlarmsCommand.cs` — add columns under a new + `--verbose` flag, plus full payload under `--json`. Default output + stays five-column compatible. +5. `Client.UI\ViewModels\AlarmEventViewModel.cs` — bind the new + fields. Add columns to `Views\AlarmsView.axaml` (collapsible + under a "Show details" toggle so the default view stays compact). + Surface `OperatorComment` in `AckAlarmWindow.axaml` as a + prepopulated default when re-acknowledging an already-acked + alarm. +6. `docs\Client.CLI.md` — add the new `--verbose` and `--json` + flag examples to the alarms section. +7. `docs\Client.UI.md` — add a screenshot or description of the + "Show details" expansion behavior. +8. `docs\reqs\ClientRequirements.md` — line 116 + 153 reference + the alarm subscription contract; extend the field list to cover + the new payload. +9. `docs\AlarmTracking.md` (new in B.5) — wire in client-side + examples. + +**Tests:** + +- `Client.Shared.Tests` — DTO round-trip through the alarm event + pump with all fields populated and all-null cases. +- `Client.CLI.Tests` — `--verbose` column ordering, `--json` + schema validation, default output stays five-column. 
+- `Client.UI.Tests` — `AlarmEventViewModel` bindings exposed, + collapsible-detail toggle behavior. + +### Sequencing within Track E: + +E.1 first (mechanical). E.2-E.7 can land in parallel. E.7 has its own +dependency chain inside lmxopcua (B.2 + B.3) and doesn't gate any +other E PR. The .NET client (E.2) is the only language SDK +**lmxopcua** consumes today; if the gateway repo's release schedule +prefers landing E.2 first and shipping E.3-E.6 in a follow-up release, +that's a valid sequence — the customer-facing constraint is "at +least one language SDK ships at the same time as A.4 lights up the +gateway dispatch." + ## Track D — deployment refresh The dev box at `DESKTOP-6JL3KKO` runs three live services from @@ -674,38 +920,51 @@ duration (parity-rig validation gate; see "Test gates" above). ## Sequencing matrix ``` -Track A (mxaccessgw) Track B (lmxopcua) Track C (sidecar) -───────────────────────── ───────────────────────── ───────────────────────── -A.1 proto (waits) C.1 AahClientManagedAlarmEventWriter - │ │ no cross-repo dep - ├──────────────────────────► B.1 EventPump branch │ -A.2 worker subscription │ uses proto types only │ - │ │ unit-testable │ - │ C.2 Program.cs wires writer -A.3 gateway dispatch + ack RPC ──►B.2 GalaxyDriver : IAlarmSource │ - │ │ │ - │ ──►B.3 DriverNodeManager routing │ - │ │ -A.4 ConditionRefresh │ │ - │ │ - B.4 SidecarAlarmHistorianWriter - (depends on C.2 deployed) - - ▼ - Track D (deployment) - ───────────────────────── - D.1 Refresh C:\publish + restart services - (depends on A.4 + B.4 + C.2 merged) - ▼ - ──►B.5 docs + memory + completion banner +Track A (mxaccessgw) Track B (lmxopcua) Track C (sidecar) Track E (clients) +───────────────────────── ───────────────────────── ───────────────────── ────────────────────────── +A.1 proto (waits) C.1 AahClientManagedWriter E.1 proto regen ×5 langs + │ │ │ (mechanical, after A.1) + ├──────────────────────────► B.1 EventPump branch │ │ +A.2 worker subscription │ uses proto types 
only │ │ + │ │ unit-testable │ │ + │ C.2 Program.cs wires │ +A.3 gateway dispatch + ack RPC ──►B.2 GalaxyDriver : IAlarmSource │ ──►E.2 .NET SDK + CLI + │ │ │ ──►E.3 Python SDK + CLI + │ ──►B.3 DriverNodeManager routing │ ──►E.4 Go SDK + CLI + │ │ ──►E.5 Java SDK + CLI + │ │ ──►E.6 Rust SDK +A.4 ConditionRefresh │ │ │ + │ │ │ + B.4 SidecarAlarmHistorianWriter │ + (depends on C.2 deployed) │ │ + │ │ + (B.2 + B.3 done) ────────────────────────────────────────────► E.7 lmxopcua client refresh + │ │ + ▼ │ + Track D (deployment) │ + ───────────────────────── │ + D.1 Refresh C:\publish + restart services │ + (depends on A.4 + B.4 + C.2 + E.2 merged) │ + ▼ │ + ──►B.5 docs + memory + completion banner ◄─────────(E.7 done)──┘ ``` -A.1 + B.1 + C.1 can all land in parallel — none have cross-repo runtime -dependencies. B.1's tests use proto types without needing a running -gateway. C.1 is purely sidecar-internal. The gateway-side dispatch (A.3) -gates B.2; the sidecar-side wiring (C.2) gates B.4. D.1 (deployment -refresh) gates B.5 (docs) — the docs sweep records the as-deployed -state, so the deploy must be live first. +A.1 + B.1 + C.1 + E.1 can all land in parallel — none have cross-repo +runtime dependencies. B.1's tests use proto types without needing a +running gateway. C.1 is purely sidecar-internal. E.1 is mechanical +codegen. + +The gateway-side dispatch (A.3) gates B.2 and E.2-E.6. The +sidecar-side wiring (C.2) gates B.4. E.7 gates on B.2 + B.3 only — +it's the OPC UA client surface, not the gateway client surface. + +D.1 (deployment refresh) requires E.2 to also be merged because the +deployed `MxGateway.Client.dll` consumed by GalaxyDriver needs the new +methods. E.3-E.6 (other-language SDKs) don't gate D.1 — they ship on +their own release cadence. + +B.5 (docs sweep) gates on D.1 + E.7 both merged — it's the final +"snapshot the as-shipped state" pass. ## Test gates @@ -830,8 +1089,56 @@ needed); land B.4 last and only after end-of-epic gate is green. 
- `docs\v2\dev-environment.md` (D.1 — document the refresh workflow) - `docs\plans\artifacts\d1-rollout-YYYY-MM-DD.md` *(new — D.1 captured smoke run)* -Total: ~10 source files added/modified in mxaccessgw; ~14 in lmxopcua -proper; ~3 in the historian sidecar; ~2 deployment scripts; ~12 test -files across all repos. Should land in 4-6 weeks of focused work given -the parity-rig dependency for end-to-end validation, plus a short -final-week ops slot for D.1. +**mxaccessgw — client SDKs (Track E):** + +- `clients\proto\` — no source change; downstream codegen consumes A.1 +- **.NET (E.2)**: + - `clients\dotnet\MxGateway.Client\MxGatewayClient.cs` + - `clients\dotnet\MxGateway.Client\Alarms\` *(new namespace)* + - `clients\dotnet\MxGateway.Client.Cli\Verbs\AlarmsVerb.cs` *(new)* + - `clients\dotnet\MxGateway.Client.Tests\AlarmsTests.cs` *(new)* +- **Python (E.3)**: + - `clients\python\src\mxgateway\alarms.py` *(new)* + - `clients\python\src\mxgateway\cli\alarms.py` *(new — verify CLI module path)* + - `clients\python\tests\test_alarms.py` *(new)* +- **Go (E.4)**: + - `clients\go\mxgateway\alarms.go` *(new)* + - `clients\go\cmd\mxgateway-cli\alarms.go` *(new — verify dir name)* + - `clients\go\internal\testserver\alarms_test.go` *(new)* +- **Java (E.5)**: + - `clients\java\mxgateway-client\src\main\java\…\AlarmsApi.java` *(new)* + - `clients\java\mxgateway-cli\src\main\java\…\AlarmsCommand.java` *(new)* + - `clients\java\mxgateway-client\src\test\java\…\AlarmsApiTest.java` *(new)* +- **Rust (E.6)**: + - `clients\rust\crates\mxgateway-client\src\alarms.rs` *(new)* + - `clients\rust\crates\mxgateway-cli\src\alarms.rs` *(new — verify crate name)* + - `clients\rust\tests\alarms.rs` *(new)* + +**lmxopcua — OPC UA client refresh (Track E.7):** + +- `src\ZB.MOM.WW.OtOpcUa.Core.Abstractions\AlarmEventArgs.cs` (extend) +- `src\ZB.MOM.WW.OtOpcUa.Server\OpcUa\DriverNodeManager.cs` (Part 9 field population) +- `src\ZB.MOM.WW.OtOpcUa.Client.Shared\Models\AlarmEventArgs.cs` (DTO 
mirror) +- `src\ZB.MOM.WW.OtOpcUa.Client.CLI\Commands\AlarmsCommand.cs` (verbose / json flags) +- `src\ZB.MOM.WW.OtOpcUa.Client.UI\ViewModels\AlarmEventViewModel.cs` +- `src\ZB.MOM.WW.OtOpcUa.Client.UI\ViewModels\AlarmsViewModel.cs` +- `src\ZB.MOM.WW.OtOpcUa.Client.UI\Views\AlarmsView.axaml` (+ `.cs`) +- `src\ZB.MOM.WW.OtOpcUa.Client.UI\Views\AckAlarmWindow.axaml` (+ `.cs`) +- `docs\Client.CLI.md` (alarms section examples) +- `docs\Client.UI.md` (Show-details toggle description) +- `docs\reqs\ClientRequirements.md` (extend AlarmEventArgs contract) +- `docs\AlarmTracking.md` (B.5 — cross-link client examples) +- `tests\ZB.MOM.WW.OtOpcUa.Client.Shared.Tests\` (DTO round-trip) +- `tests\ZB.MOM.WW.OtOpcUa.Client.CLI.Tests\` (flag behaviour) +- `tests\ZB.MOM.WW.OtOpcUa.Client.UI.Tests\` (view-model bindings) + +Total: ~10 source files added/modified in mxaccessgw server/worker +side; ~14 in lmxopcua server/driver side; ~3 in the historian sidecar; +~2 deployment scripts; ~30 across the five gateway-client SDK +languages; ~12 in lmxopcua client surfaces; ~25 test files across +all repos. The gateway-client multi-language work is parallelizable +across maintainers, so wall-clock effort lands in 4-6 weeks of +coordinated work given the parity-rig dependency for end-to-end +validation. If only the .NET SDK ships at first (E.2 only) and +E.3-E.6 follow asynchronously, lmxopcua's critical path stays +unchanged.