Compare commits

83 Commits

Author SHA1 Message Date
Joseph Doherty 9bdb899774 fix(clients): inline Go gosec directive and strip IPv6 brackets in Python authority split 2026-06-01 07:57:22 -04:00
Joseph Doherty e5c704de69 feat(gateway): add machine FQDN to self-signed cert SANs
Best-effort resolve the host FQDN via Dns.GetHostEntry and add it as a
DNS SAN when it differs (OrdinalIgnoreCase) from the short machine name
and "localhost". SocketException / ArgumentException are caught and
silently skipped so cert generation remains robust when DNS is absent.
2026-06-01 07:52:48 -04:00
Joseph Doherty 4e520f9c0c fix(gateway): delete temp cert file on persist failure
Wrap the WriteAllBytes/Move/HardenPermissions sequence in a try/catch so
that any failure best-effort deletes the hardened .tmp file (which may
already hold PFX/private-key bytes) before rethrowing.  Add a test that
induces a persist failure by pointing SelfSignedCertPath inside a
regular file and asserts no .tmp is left on disk.
2026-06-01 07:45:15 -04:00
Joseph Doherty 2eb81379e4 docs: TLS auto-cert and lenient client trust 2026-06-01 07:43:13 -04:00
Joseph Doherty ddd5721082 fix(gateway): harden self-signed cert persistence and config validation 2026-06-01 07:37:27 -04:00
Joseph Doherty 3775f6bf3b feat(gateway): supply generated cert as Kestrel HTTPS default 2026-06-01 07:30:26 -04:00
Joseph Doherty cdfad420bb fix(client-rust): apply TLS guard to GalaxyClient and add CLI strict flag
Extract the TLS-without-CA guard into a shared `build_tls_config` helper
in options.rs so both GatewayClient and GalaxyClient use identical logic.
GalaxyClient previously had no guard, so TLS-without-CA produced a cryptic
tonic handshake failure; it now returns the same actionable InvalidEndpoint
error. The guard message notes that a server-name override affects SNI but
does not pin trust. Add --require-certificate-validation to ConnectionArgs
in the CLI binary. Add a mirror test for GalaxyClient in tests/tls.rs.
2026-06-01 07:28:16 -04:00
Joseph Doherty 330e665f6b fix(gateway): correct ECDSA key usage and dispose CertificateRequest
Drop KeyEncipherment from the self-signed cert's key-usage extension — it
is semantically wrong for ECDSA (RSA key-transport only); DigitalSignature
alone is correct for TLS 1.3 / ECDHE server certs.  CertificateRequest is
unchanged (not IDisposable in .NET 10).  Test now also asserts MachineName,
127.0.0.1 and IPv6 loopback are present in the SAN extension.
2026-06-01 07:27:15 -04:00
Joseph Doherty 5e01ad9c22 fix(client-dotnet): apply lenient TLS to GalaxyRepositoryClient and enforce hostname on CA-pin
Mirror MxGatewayClient's three-branch handler structure in GalaxyRepositoryClient
(CA-pin / lenient accept-all / OS trust) so the Galaxy endpoint works against the
gateway's self-signed cert under the default lenient posture. Expose an internal
CreateHttpHandlerForTests seam for unit testing. Add RemoteCertificateNameMismatch
rejection at the top of both CA-pinned callbacks so a pinned-CA connection truly
verifies the host. Strengthen existing lenient test to invoke the callback and assert
it returns true; add mirrored Galaxy-client handler tests.
2026-06-01 07:24:07 -04:00
Joseph Doherty 77a9108673 feat(gateway): persist/reuse self-signed cert with hardened permissions 2026-06-01 07:23:33 -04:00
Joseph Doherty 192607ab8c fix(gateway): detect Certificate:Thumbprint and cover more KestrelTlsInspector cases 2026-06-01 07:22:24 -04:00
Joseph Doherty ba82afe669 fix(client-java): keep Temurin 21 toolchain, auto-provision instead of bumping to 26 2026-06-01 07:20:04 -04:00
Joseph Doherty fe7d1ce1ec feat(gateway): validate MxGateway:Tls options 2026-06-01 07:19:22 -04:00
Joseph Doherty b8a6695612 feat(gateway): generate self-signed ECDSA cert with SANs 2026-06-01 07:18:39 -04:00
Joseph Doherty 6f9188bc8d test(client-python): update TLS default-channel test for TOFU behavior 2026-06-01 07:17:36 -04:00
Joseph Doherty a276f46f81 feat(client-java): accept gateway cert by default over TLS 2026-06-01 07:13:45 -04:00
Joseph Doherty 572b268d81 feat(client-rust): accept gateway cert by default over TLS (or documented pin-only fallback) 2026-06-01 07:11:09 -04:00
Joseph Doherty 4c093a64fa feat(client-python): accept gateway cert by default via TOFU pre-fetch 2026-06-01 07:10:55 -04:00
Joseph Doherty f47bbaea95 feat(client-dotnet): accept gateway cert by default over TLS 2026-06-01 07:08:55 -04:00
Joseph Doherty c463b49f46 feat(client-go): accept gateway cert by default over TLS 2026-06-01 07:08:47 -04:00
Joseph Doherty 87f86503ef feat(gateway): add MxGateway:Tls options block 2026-06-01 07:08:19 -04:00
Joseph Doherty e912ef960c feat(gateway): detect HTTPS endpoints missing a certificate 2026-06-01 07:08:12 -04:00
Joseph Doherty c4e7ddea70 docs: implementation plan for gateway TLS auto-cert and lenient client trust 2026-06-01 07:01:58 -04:00
Joseph Doherty 6bfa4fe884 docs: design for gateway TLS auto-cert and lenient client trust 2026-06-01 06:54:23 -04:00
Joseph Doherty b4a7bac4c0 scripts: add pack-clients.ps1 to pack/publish all 5 client packages 2026-05-28 17:12:08 -04:00
Joseph Doherty 6df373ae4c client/go: release docs and tag-go-module.ps1 helper 2026-05-28 17:07:25 -04:00
Joseph Doherty fe44e3c18a client/java: maven-publish wiring for Gitea Maven feed 2026-05-28 17:07:11 -04:00
Joseph Doherty 523f944f3e client/rust: Cargo metadata + Gitea alternative-registry config 2026-05-28 17:06:47 -04:00
Joseph Doherty c33f1e6047 client/python: PyPI metadata + Gitea feed install instructions 2026-05-28 17:06:01 -04:00
Joseph Doherty 92cc4688e6 client/go: avoid holding mutex across BrowseChildren RPC in Expand 2026-05-28 15:33:48 -04:00
Joseph Doherty a155554038 grpc: reuse GalaxyBrowseProjector.ResolveParentId from handler 2026-05-28 15:32:48 -04:00
Joseph Doherty 68f905a344 client/java: avoid holding monitor across BrowseChildren RPC in expand 2026-05-28 15:32:36 -04:00
Joseph Doherty 5abc222c72 galaxy: add by-name and by-path indexes to GalaxyHierarchyIndex 2026-05-28 15:31:56 -04:00
Joseph Doherty da3aa7b0b2 client/go: paginate DiscoverHierarchy across multi-page galaxies 2026-05-28 15:31:16 -04:00
Joseph Doherty f0ec068430 galaxy: add cycle guard to HasMatchingDescendant 2026-05-28 15:30:08 -04:00
Joseph Doherty 1a1d14a9fd client/python: add public browse_children_raw for API parity 2026-05-28 15:29:08 -04:00
Joseph Doherty b2448510ac client/java: add browseChildrenRejectsRepeatedPageToken test for parity 2026-05-28 15:17:52 -04:00
Joseph Doherty 75610e3f55 client/go: wrap browseChildren duplicate-page-token error in GatewayError 2026-05-28 15:17:10 -04:00
Joseph Doherty 5032166106 client/dotnet: assert failed expand leaves node unexpanded 2026-05-28 15:16:07 -04:00
Joseph Doherty 76a042d663 grpc: make page_token error strings RPC-name-agnostic 2026-05-28 15:15:40 -04:00
Joseph Doherty 4a19854eb9 docs: per-client High-level walker example using LazyBrowseNode
Add a "High-level walker" subsection under each client's "Browsing
lazily" section showing idiomatic use of LazyBrowseNode (browse +
expand, idempotency note, redeploy refresh pattern).
2026-05-28 14:34:19 -04:00
Joseph Doherty a4467e23ef client/python: make LazyBrowseNode.expand concurrency-safe 2026-05-28 14:32:35 -04:00
Joseph Doherty eacfeff9fb client/dotnet: make LazyBrowseNode.ExpandAsync thread-safe 2026-05-28 14:28:36 -04:00
Joseph Doherty b4bc2df015 client/java: LazyBrowseNode walker for lazy hierarchy browse 2026-05-28 14:29:15 -04:00
Joseph Doherty fd2a0ac4c7 client/go: LazyBrowseNode walker for lazy hierarchy browse 2026-05-28 14:26:41 -04:00
Joseph Doherty 555e4be51f client/rust: LazyBrowseNode walker for lazy hierarchy browse 2026-05-28 14:26:05 -04:00
Joseph Doherty 1d8c0d83c4 client/python: LazyBrowseNode walker for lazy hierarchy browse 2026-05-28 14:24:23 -04:00
Joseph Doherty 6600f2a7bd client/dotnet: LazyBrowseNode walker for lazy hierarchy browse 2026-05-28 14:24:17 -04:00
Joseph Doherty 803a207ad2 client/java: regenerate protos for BrowseChildren
Regen'd from galaxy_repository.proto after BrowseChildren RPC was added.
GalaxyRepositoryGrpc and GalaxyRepositoryOuterClass now include the
BrowseChildrenRequest/BrowseChildrenReply types and stub methods.
2026-05-28 14:21:56 -04:00
Joseph Doherty 97e583e96b docs: implementation plan for per-language LazyBrowseNode walker
9 tasks: Java toolchain install (Homebrew), 5 parallel per-language
walker implementations, README updates, final verification. Java
walker is gated on toolchain bootstrap success; other languages
proceed independently if Java fails.
2026-05-28 14:17:52 -04:00
Joseph Doherty eaf479349d docs: design for client-side LazyBrowseNode walker + per-language tests
Adds one high-level walker per client (.NET/Python/Rust/Go/Java) plus
six unit tests each against existing fake transports. One-shot idempotent
Expand semantics; pagination hidden inside the helper. Includes Java
toolchain bootstrap (Homebrew Temurin + Gradle) so the Java client can
build locally on the macOS dev host.
2026-05-28 14:12:03 -04:00
Joseph Doherty 83a4d41fce docs: align design doc test-plan with InvalidArgument error mapping 2026-05-28 13:30:19 -04:00
Joseph Doherty 0d6193cdc4 docs: note BrowseChildren in gateway overview and client READMEs 2026-05-28 13:25:46 -04:00
Joseph Doherty 8cd3e1c20e client/go: regenerate protos for BrowseChildren 2026-05-28 13:22:06 -04:00
Joseph Doherty 5c28458624 client/rust: regenerate protos for BrowseChildren 2026-05-28 13:19:54 -04:00
Joseph Doherty 0b389f5a97 docs: document BrowseChildren RPC and lazy browse architecture 2026-05-28 13:19:08 -04:00
Joseph Doherty 108c4bb118 client/python: regenerate protos for BrowseChildren 2026-05-28 13:18:25 -04:00
Joseph Doherty cf54a278e1 docs: record lazy-browse stays wire-only; align error mapping 2026-05-28 13:18:23 -04:00
Joseph Doherty 81b2aacfe2 client/dotnet: live smoke for BrowseChildren 2026-05-28 13:17:29 -04:00
Joseph Doherty 5932fe2fd3 dashboard: surface lazy-load errors via BrowseLoadState.Error 2026-05-28 13:15:26 -04:00
Joseph Doherty 310dfab8b4 dashboard: lazy-load BrowsePage via DashboardBrowseService 2026-05-28 13:10:10 -04:00
Joseph Doherty ba157b4b4f grpc: implement BrowseChildren handler + metadata:read scope 2026-05-28 13:08:45 -04:00
Joseph Doherty 87e22dd529 galaxy: add GalaxyBrowseProjector for direct-children projection 2026-05-28 12:58:07 -04:00
Joseph Doherty d9eaf4b056 galaxy: add ChildrenByParent index for level-at-a-time browse 2026-05-28 12:51:48 -04:00
Joseph Doherty 2c5c5e5c7e contracts: add BrowseChildren RPC for lazy hierarchy browse 2026-05-28 12:47:02 -04:00
Joseph Doherty b3ebf583ad docs: implementation plan for lazy-browse BrowseChildren RPC
12-task bite-sized plan executing the approved design.
Includes native task persistence file.
2026-05-28 12:41:11 -04:00
Joseph Doherty edb812d859 docs: design for lazy-browse BrowseChildren RPC
OPC UA-style level-at-a-time browse across gRPC, dashboard, and the
shared cache projector. Server still loads the full Galaxy hierarchy;
laziness is wire-side and UI-side only.
2026-05-28 12:34:37 -04:00
Joseph Doherty 795eee72e3 client/dotnet: backfill XML doc comments to satisfy analyzers
Adds missing <summary>/<param> docs across the .NET client library and its
test suite so CommentChecker reports zero issues. TreatWarningsAsErrors
requires the analyzer surface clean before publishing the NuGet package.
2026-05-27 14:30:53 -04:00
Joseph Doherty 615b487a77 docs+ui: backfill XML doc comments and finish dashboard layout pass
Adds missing <summary>/<param> XML docs across 99 server, worker, and test
files so CommentChecker reports zero issues (TreatWarningsAsErrors needs the
analyzer clean). Bundles in WIP dashboard work: NavSection extraction,
MainLayout/site.css/js styling alignment, and DashboardOptions/Auth tweaks.
2026-05-27 14:20:10 -04:00
Joseph Doherty 382861c602 build: add NonWindows.slnx for macOS/Linux dev hosts
The Worker + Worker.Tests projects pull in the Windows-only ArchestrA.MxAccess
COM stack and can't be built off Windows. Add a sibling .slnx that lists only
the cross-platform projects (Contracts, Server, IntegrationTests, Tests) so
non-Windows hosts can restore + build the rest of the solution with:

    dotnet build src/ZB.MOM.WW.MxGateway.NonWindows.slnx

The canonical solution on Windows remains ZB.MOM.WW.MxGateway.slnx.
2026-05-26 01:18:29 -04:00
Joseph Doherty ba2b936609 ui: align dashboard styling with ScadaLink master conventions
- Rename DashboardLayout.razor -> MainLayout.razor; dashboard.css -> site.css
- Sidebar 218 -> 220px; add hamburger + Bootstrap collapse for <lg viewports
- Rename .metric-* KPI classes to .agg-* (matches shared theme tokens)
- Rebuild ApiKeysPage create form as card + h6 subsections + bottom Save/Cancel
2026-05-26 01:12:54 -04:00
Joseph Doherty 7fc1955287 Dashboard: handle GET /logout (was 405) by signing out + redirecting to /login
Browsers that navigate directly to /logout via the address bar issued a GET
against a POST-only route and got 405 Method Not Allowed. Logout is
self-destructive, so the GET path can skip antiforgery; the existing POST
form (used by the layout's Sign out button) is unchanged and still
antiforgery-protected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 23:40:39 -04:00
Joseph Doherty 54480dde61 Add review-process + glauth design docs, bench scripts; ignore install/
Picks up the missing glauth.md referenced by CLAUDE.md, captures the
review workflow alongside the bench-read-bulk and review-readme helper
scripts, and excludes the local install/ deployment tree from source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 23:26:21 -04:00
Joseph Doherty 581b541801 code-reviews: regenerate after batch 2 resolutions
All 41 findings from the 42b0037 re-review are now Resolved across
8 modules. Open count = 0 for every module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:36 -04:00
Joseph Doherty d3cb311aae Resolve Client.Java-032..036: shared subscription base, batch tokenizer
Client.Java-032  README CLI examples for stream-alarms and
                 acknowledge-alarm now use the correct picocli flags
                 (--filter-prefix and --reference); two regression
                 tests parse each documented invocation.
Client.Java-033  StreamAlarmsCommand publishes an
                 AtomicReference<MxGatewayAlarmFeedSubscription> and
                 mirrors MxEventStream's overflow branch: a failed
                 queue.offer cancels the subscription, queues an
                 IllegalStateException, then queues the END sentinel
                 — preserving the fail-fast contract.
Client.Java-034  BatchCommand routes through a new
                 MxGatewayCli.tokenizeBatchLine POSIX-style shell
                 tokenizer that respects double-quoted, single-quoted,
                 and backslash-escaped arguments.
Client.Java-035  Added streamAlarmsForwardsRequestAndStreamsAlarmFeedMessages
                 to MxGatewayClientSessionTests; asserts request shape,
                 message ordering, and cancellation propagation.
Client.Java-036  Extracted MxGatewayStreamSubscription<TRequest,TResponse>
                 abstract base; the four subscription classes
                 (MxGatewayEventSubscription, MxGatewayAlarmFeedSubscription,
                 MxGatewayActiveAlarmsSubscription, DeployEventSubscription)
                 collapse to ~10-line subclasses. A new contract test
                 runs identical lifecycle / cancellation assertions
                 across all four subclasses.

All resolved at 2026-05-24; gradle build + gradle test BUILD SUCCESSFUL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:27 -04:00
Joseph Doherty 186d03e5cc Resolve IntegrationTests-025: stopBoundary for repo-root walker
ResolveRepositoryRoot accepts an optional stopBoundary parameter that
caps the upward walk; production callers pass null and behavior is
unchanged. The two repository-marker tests now seal their walkers
inside their own temp directories, so a redirected TMP or a co-located
C:\src checkout no longer leaks ambient marker-bearing ancestors into
the assertion.

Regression test ResolveRepositoryRoot_StopBoundary_IsolatesWalkerFromAmbientAncestorMarkers
constructs an outer ancestor that carries src/ + .git, confirms the
walker leaks into it without the boundary, then asserts the same call
throws with the boundary supplied.

Resolved at 2026-05-24; IntegrationTestEnvironmentTests 5/5 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:09 -04:00
Joseph Doherty 6bae5ea3a3 Resolve Tests-027..031: flake root cause + coverage gaps
Tests-027  GatewayMetrics exposes its internal Meter; the
           StreamEvents_WhenEventIsWritten_RecordsSendDuration listener
           now filters by ReferenceEquals(instrument.Meter, metrics.Meter)
           instead of Meter.Name, so parallel tests with their own
           GatewayMetrics no longer cross-contaminate the families list.
Tests-028  FakeWorkerClient.Kill now captures LastKillReason;
           SessionManager.KillWorkerAsync tests pin the reason
           propagation end-to-end and cover the blank/null guard. The
           DashboardSessionAdminService kill test pins the literal
           dashboard-admin-kill reason.
Tests-029  Added CloseSessionAsync_BlankSessionId_ReturnsFailure to mirror
           the existing KillWorkerAsync blank-id coverage.
Tests-030  DeleteAsync_WhenStoreRefuses_ReportsFriendlyError renamed and
           extended to assert the dashboard-delete-key audit row with
           Details = not-found-or-active. Added
           DeleteAsync_BlankKeyId_ReturnsFailure.
Tests-031  DashboardSnapshotPublisher reconnect test now measures the
           gap from the first throw inside the fake (firstThrowAt) to
           secondSubscribeAt, isolating Task.Delay from StartAsync /
           scheduling overhead.

All resolved at 2026-05-24; 512/512 gateway tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:28:54 -04:00
Joseph Doherty 430187c28b code-reviews: regenerate after batch 1 resolutions
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:39 -04:00
Joseph Doherty f5b50c4484 Resolve Client.Python-022..026: TLS-by-default, batch CLI, README
Client.Python-022  README CLI examples for stream-alarms and
                   acknowledge-alarm now use the correct flags;
                   regression test parses every documented line through
                   Click.
Client.Python-023  Re-applied Client.Python-013 — _use_plaintext drops
                   the silent localhost / 127.0.0.1 auto-downgrade
                   branch; --plaintext and --tls are mutually exclusive
                   and TLS is the default.
Client.Python-024  batch dispatch routes through main.main(...,
                   standalone_mode=False) under a redirected stdout
                   instead of click.testing.CliRunner; recursive batch
                   lines are rejected outright.
Client.Python-025  Added behavioural tests for the five bulk SDK methods,
                   stream_alarms, and the new CLI subcommands.
Client.Python-026  _bench_read_bulk hoists 'import time' to module scope
                   and logs cleanup failures instead of swallowing them.

All resolved at 2026-05-24; python -m pytest is 65/65 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:27 -04:00
Joseph Doherty 4a0f88b17d Resolve Client.Rust-022..029: MalformedReply, correlation ids, clippy
Client.Rust-022  Restored Error::MalformedReply for register / add_item /
                 add_item2 and the bulk-subscribe / read-bulk / write-bulk
                 dispatch arms so malformed-but-OK replies fail loudly
                 instead of returning Vec::new().
Client.Rust-023  Restored next_correlation_id and routed every CLI close /
                 stream-alarms / acknowledge-alarm / bench-read-bulk call
                 through it so each call carries a unique opaque token.
Client.Rust-024  Added round-trip tests for read_bulk / write_bulk /
                 write2_bulk / write_secured_bulk / write_secured2_bulk
                 plus stream_alarms and percentile_summary unit tests.
Client.Rust-025  RustClientDesign.md re-synced — new bulk SDK, alarms
                 surface, Error variants, CLI command list, and the
                 Windows stack workaround.
Client.Rust-026  Session::read_bulk now borrows a tag slice; bench-read-
                 bulk binds tags once outside the warm-up / steady-state
                 loops.
Client.Rust-027  .cargo/config.toml selector tightened to
                 cfg(all(windows, target_env = "msvc")) and comment
                 rewritten to match reality (release + debug ship the
                 8 MB reservation).
Client.Rust-028  run_batch removed the empty-line break; stdin EOF is
                 the only terminator.
Client.Rust-029  Re-applied Client.Rust-001 / 002 / 012 — added the
                 missing doc comments, renamed BulkReplyKind variants,
                 and replaced the clone-on-copy with a deref under lock
                 so cargo clippy -D warnings is clean.

All resolved at 2026-05-24; cargo fmt + check + clippy + test all green
(55 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:15 -04:00
Joseph Doherty 82996aa8e6 Resolve Client.Go-022..027: bulk flags, bench cancel, batch loop
Client.Go-022  Re-applied Client.Go-015 shape — runWriteBulkVariant drops
               the unused secured param and gates -current-user-id /
               -verifier-user-id / -user-id behind the secured-only
               variants.
Client.Go-023  Re-applied Client.Go-018 shape — bench warm-up and steady-
               state loops respect ctx.Err().
Client.Go-024  Added SDK-level tests for WriteBulk / Write2Bulk /
               WriteSecuredBulk / WriteSecured2Bulk / ReadBulk and
               StreamAlarms via the existing bufconn fake gateway pattern.
Client.Go-025  Five bulk SDK methods short-circuit on empty input without
               an RPC round-trip and document the behavior.
Client.Go-026  runBatch widens scanner.Buffer to 16 MiB and emits an
               error-with-sentinel if a longer line still arrives, rather
               than aborting the session silently.
Client.Go-027  runBatch treats blank lines as skip-and-continue; only EOF
               ends the session.

All resolved at 2026-05-24; gofmt + go vet + go build + go test ./... all
green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:58 -04:00
Joseph Doherty 712cb06442 Resolve Client.Dotnet-018..021: README + bench-read-bulk hardening
Client.Dotnet-018  README CLI examples for stream-alarms / acknowledge-alarm
                   replaced with parser-correct flags; new theory test
                   parses each documented README example through the CLI.
Client.Dotnet-019  BenchReadBulkAsync routes through new
                   RequireRegisterServerHandle helper that fails loudly when
                   the OK register reply has no typed payload.
Client.Dotnet-020  Bench steady-state catch is now
                   catch (Exception ex) when (ex is not OperationCanceledException)
                   so user-driven cancellation exits promptly.
Client.Dotnet-021  --timeout-ms now flows through ParseTimeoutMs which
                   rejects negatives with a clear error in both read-bulk
                   and bench-read-bulk.

All resolved at 2026-05-24; 67/67 .NET client tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:45 -04:00
Joseph Doherty 4d77279e7e Resolve Server-044..050: KillWorker accounting + admin service hardening
Server-044  KillWorkerAsync catch path now calls _metrics.SessionRemoved
            so the open-session gauge does not leak when KillWorker throws.
Server-045  KillWorkerAsync routes through a new
            GatewaySession.KillWorkerWithCloseGateAsync that takes the
            per-session close lock, so concurrent kills count SessionsClosed
            exactly once.
Server-046  CloseSessionCoreAsync's SessionCloseStartedException branch and
            ShutdownAsync's kill fallback both increment SessionsClosed (not
            just the gauge), so the counter and gauge stay consistent.
Server-047  ApiKeysPage.ConfirmPendingAsync holds PendingAction across the
            awaited action and clears it in finally, matching the sessions
            pages.
Server-048  Closed: the 044/045 regression tests cover the previously-
            untested kill paths.
Server-049  IDashboardSessionAdminService + DashboardSessionAdminService
            now carry XML docs that pin the Admin gate, missing-session
            return-Fail semantics, and the dashboard-admin-kill reason.
Server-050  CloseSessionAsync and KillWorkerAsync catch unexpected
            exceptions after the SessionManagerException catches and return
            a friendly Fail; OperationCanceledException tied to the caller
            token still propagates.

All resolved at 2026-05-24; 503/503 gateway tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:34 -04:00
268 changed files with 24869 additions and 974 deletions
+1
@@ -45,6 +45,7 @@ build/
out/
tmp/
temp/
install/
# .NET
**/bin/
+140
@@ -0,0 +1,140 @@
# Code Review Process
This document describes how to perform a comprehensive, per-module code review of
the `mxaccessgw` codebase and how to track findings to resolution.
A **module** is one buildable project under `src/` (e.g. `src/ZB.MOM.WW.MxGateway.Worker`)
or one language client under `clients/` (e.g. `clients/rust`). Each module has
its own folder under `code-reviews/` containing a single `findings.md`.
## 1. Before you start
1. Pick the module to review. Its folder is `code-reviews/<Module>/`:
- For a `src/` project, `<Module>` is the project name with the `ZB.MOM.WW.MxGateway.`
prefix stripped — `src/ZB.MOM.WW.MxGateway.Server` is reviewed in `code-reviews/Server/`.
- For a language client, `<Module>` is `Client.<Lang>`, so `clients/rust` is
reviewed in `code-reviews/Client.Rust/`.
2. Identify the design context for the module:
- `gateway.md` — top-level architecture, command/event surface, IPC envelope,
STA thread model, fault handling.
- The relevant component design docs under `docs/` (e.g.
`docs/MxAccessWorkerInstanceDesign.md`, `docs/GatewayProcessDesign.md`,
`docs/Sessions.md`, `docs/Authentication.md`, `docs/GalaxyRepository.md`).
- `docs/DesignDecisions.md` for the v1 design choices.
- The **Repository-Specific Conventions** and **Process / Platform Notes** in
`CLAUDE.md`.
3. Record the exact commit being reviewed: `git rev-parse --short HEAD`. Every
review is a snapshot — a finding only means something relative to a known
commit.
4. Open `code-reviews/<Module>/findings.md` and fill in the header table
(reviewer, date, commit SHA, status).
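The folder-naming rule in step 1 can be sketched as a small helper. This is only an illustration: `review_folder` is a hypothetical name, not part of the repository, and capitalising the client language (`rust` to `Rust`, `dotnet` to `Dotnet`) is an assumption inferred from finding IDs such as `Client.Dotnet-018`.

```python
from pathlib import PurePosixPath

# Prefix stripped from src/ project names, as described in step 1.
PROJECT_PREFIX = "ZB.MOM.WW.MxGateway."


def review_folder(module_path: str) -> str:
    """Map a module path to its code-reviews folder (hypothetical helper)."""
    parts = PurePosixPath(module_path.replace("\\", "/")).parts
    if len(parts) != 2:
        raise ValueError(f"expected '<root>/<name>', got {module_path!r}")
    root, name = parts
    if root == "src":
        if not name.startswith(PROJECT_PREFIX):
            raise ValueError(f"unexpected project name: {name!r}")
        module = name[len(PROJECT_PREFIX):]
    elif root == "clients":
        # Assumption: the language segment is simply capitalised.
        module = "Client." + name.capitalize()
    else:
        raise ValueError(f"not a reviewable module root: {root!r}")
    return f"code-reviews/{module}/"
```

For example, `review_folder("clients/rust")` yields the `code-reviews/Client.Rust/` folder named in the text.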
## 2. Review checklist
Work through **every** category below for the module. A comprehensive review
means the checklist is completed even where it produces no findings — record
"No issues found" for a category rather than leaving it ambiguous.
1. **Correctness & logic bugs** — off-by-one, null handling, incorrect
conditionals, misuse of APIs, broken edge cases.
2. **mxaccessgw conventions** — the rules in `CLAUDE.md` and the style guides
under `docs/style-guides/`: the gateway never instantiates MXAccess COM
directly; all MXAccess COM calls run on the worker's dedicated STA thread and
the STA loop pumps Windows messages; IPC uses one bidirectional named pipe per
worker carrying length-prefixed `WorkerEnvelope` protobuf frames; MXAccess
parity is the contract (don't "fix" surprising MXAccess behaviour, never
synthesize events); one worker and one event subscriber per session; the
gateway terminates orphan workers on startup and does not reattach; C# style
(file-scoped namespaces, `sealed` by default, `Async` suffix, MXAccess-aligned
names); no Blazor UI component libraries; no logging of secrets or full tag
values; generated code is never hand-edited.
3. **Concurrency & thread safety** — shared mutable state, STA affinity, race
conditions, correct use of `async`/`await`, locking, disposal races.
4. **Error handling & resilience** — exception paths, worker crash / reconnect
handling, fail-fast event backpressure, transient vs permanent error
classification, graceful degradation, correct gRPC status codes.
5. **Security** — authentication/authorization checks, API-key scope enforcement,
input validation, SQL injection in the Galaxy Repository RPCs, secret
handling, the dashboard anonymous-localhost bypass, logging of sensitive data.
6. **Performance & resource management**: `IDisposable` disposal, pipe / stream
/ COM lifetimes, buffering and back-pressure, unnecessary allocations on hot
paths, N+1 queries.
7. **Design-document adherence** — does the code match `gateway.md`, the relevant
`docs/` component designs, `docs/DesignDecisions.md`, and `CLAUDE.md`? Flag
both code that drifts from the design and design docs that are now stale.
8. **Code organization & conventions** — namespace hierarchy, project layout, the
Options pattern, separation of concerns, additive-only contract evolution.
9. **Testing coverage** — are the module's behaviours covered by tests
(`src/ZB.MOM.WW.MxGateway.Tests`, `src/ZB.MOM.WW.MxGateway.Worker.Tests`,
`src/ZB.MOM.WW.MxGateway.IntegrationTests`)? Note untested critical paths and missing
edge-case tests.
10. **Documentation & comments** — XML doc accuracy, misleading or stale comments,
undocumented non-obvious behaviour.
## 3. Recording findings
Add one entry per finding to the `## Findings` section of the module's
`findings.md`, using the entry format in
[`_template/findings.md`](code-reviews/_template/findings.md).
- **Finding ID** — `<Module>-NNN`, numbered sequentially within the module and
never reused (e.g. `Worker-001`). IDs are permanent even after resolution.
- **Severity:**
- **Critical** — data loss, security breach, crash/deadlock, or outage.
- **High** — incorrect behaviour with significant impact; no safe workaround.
- **Medium** — incorrect or risky behaviour with limited impact or a workaround.
- **Low** — minor issues, style, maintainability, documentation.
- **Category** — one of the 10 checklist categories above.
- **Location** — `file:line` (clickable), or a list of locations.
- **Description** — what is wrong and why it matters.
- **Recommendation** — concrete suggested fix.
After recording findings, update the module header table (status, open-finding
count) and regenerate the base README (step 5).
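As a rough sketch of the entry rules above, the check below validates a finding's ID and severity. `validate_finding` is a hypothetical helper, and reading `NNN` as "three or more digits" is an assumption; the template file remains the authority on the entry format.

```python
import re

# The four severity levels listed in this section.
SEVERITIES = ("Critical", "High", "Medium", "Low")

# Assumption: <Module> is letters and dots, NNN is at least three digits.
FINDING_ID = re.compile(r"^(?P<module>[A-Za-z][A-Za-z.]*)-(?P<number>\d{3,})$")


def validate_finding(finding_id: str, severity: str) -> list[str]:
    """Return a list of problems; an empty list means the entry looks valid."""
    problems = []
    if not FINDING_ID.match(finding_id):
        problems.append(f"bad finding ID: {finding_id!r}")
    if severity not in SEVERITIES:
        problems.append(f"unknown severity: {severity!r}")
    return problems
```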
## 4. Marking an item resolved
Findings are **never deleted** — they are an audit trail. To close one, change
its **Status** and complete the **Resolution** field:
- `Open` — newly recorded, not yet addressed.
- `In Progress` — a fix is actively being worked on.
- `Resolved` — fixed. The Resolution field must state the fixing commit SHA, the
date, and a one-line description of the fix.
- `Won't Fix` — intentionally not fixed. The Resolution field must justify why.
- `Deferred` — valid but postponed. The Resolution field must say what it is
waiting on (e.g. a tracked issue or a later milestone).
`Resolved`, `Won't Fix`, and `Deferred` findings are all considered **closed**.
`Open` and `In Progress` are **pending** and appear in the base README's Pending
Findings table.
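The status rules can be expressed compactly. The following is a minimal sketch, not code from the repository: `is_pending` and `resolution_missing` are hypothetical names, and the resolution check only tests for a non-empty field rather than verifying the SHA, date, or justification.

```python
# Status groups as defined in this section.
CLOSED = {"Resolved", "Won't Fix", "Deferred"}
PENDING = {"Open", "In Progress"}


def is_pending(status: str) -> bool:
    """True for statuses that belong in the Pending Findings table."""
    if status in PENDING:
        return True
    if status in CLOSED:
        return False
    raise ValueError(f"unrecognised status: {status!r}")


def resolution_missing(status: str, resolution: str) -> bool:
    """True when a closed finding has an empty Resolution field."""
    return not is_pending(status) and not resolution.strip()
```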
## 5. Updating the base README
`code-reviews/README.md` holds the single cross-module view (the Module Status
table and the Pending / Closed Findings tables). It is **generated** from the
per-module `findings.md` files — do not edit it by hand.
After any review or status change, regenerate it:
```
python code-reviews/regen-readme.py
```
`regen-readme.py --check` exits non-zero if `README.md` is stale, if a module
header's `Open findings` count disagrees with its finding statuses, or if a
finding carries an unrecognised Status value. The PowerShell wrapper
`scripts/check-code-reviews-readme.ps1` runs that check and is the intended hook
for CI or a pre-commit step.
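The three failure conditions can be pictured with a minimal sketch like the one below. It is not the actual `regen-readme.py` logic: the function names, the `modules` mapping shape, and the reading of `Open findings` as "every pending finding" are all assumptions.

```python
PENDING = {"Open", "In Progress"}
CLOSED = {"Resolved", "Won't Fix", "Deferred"}


def check(readme_on_disk, regenerated, modules):
    """Return a list of problems; an empty list means the check passes.

    `modules` maps a module name to (header_open_count, [finding statuses]).
    """
    problems = []
    if readme_on_disk != regenerated:
        problems.append("README.md is stale; regenerate it")
    for name, (header_count, statuses) in modules.items():
        for status in statuses:
            if status not in PENDING | CLOSED:
                problems.append(f"{name}: unrecognised Status value {status!r}")
        # Assumption: the header count covers every pending finding.
        pending = sum(1 for status in statuses if status in PENDING)
        if pending != header_count:
            problems.append(f"{name}: header says {header_count} open, statuses give {pending}")
    return problems


def exit_code(problems):
    """Map the problem list to a process exit status (non-zero on failure)."""
    return 1 if problems else 0
```

Collecting every problem before exiting lets a CI log show all inconsistencies in one run rather than one per attempt.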
> The repo's installed `python` is the real interpreter; the bare `python3`
> alias resolves to the Windows Store stub and fails. Use `python`.
The per-module `findings.md` files are the source of truth; `README.md` is the
aggregated index and must always agree with them — which the script guarantees.
## 6. Re-reviewing a module
Re-reviews append to the same `findings.md`. Update the header to the new commit
and date, continue the finding numbering from the last used ID, and leave prior
findings (including closed ones) in place as history.
@@ -0,0 +1,21 @@
<Project>
<PropertyGroup>
<!-- Shared package metadata for clients/dotnet/. Individual projects opt in via <IsPackable>true</IsPackable>. -->
<Authors>Joseph Doherty</Authors>
<Company>ZB MOM WW</Company>
<Copyright>Copyright (c) ZB MOM WW. All rights reserved.</Copyright>
<Product>MxAccessGateway Client</Product>
<RepositoryUrl>https://gitea.dohertylan.com/dohertj2/mxaccessgw</RepositoryUrl>
<RepositoryType>git</RepositoryType>
<PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/mxaccessgw</PackageProjectUrl>
<PackageTags>mxaccess;mxgateway;grpc;client;archestra</PackageTags>
<PackageRequireLicenseAcceptance>false</PackageRequireLicenseAcceptance>
<!-- Versioning: bump per release. Symbols ship as snupkg. -->
<Version>0.1.0</Version>
<IncludeSymbols>true</IncludeSymbols>
<SymbolPackageFormat>snupkg</SymbolPackageFormat>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<!-- Default: do NOT pack. Each project opts in. -->
<IsPackable>false</IsPackable>
</PropertyGroup>
</Project>
@@ -107,6 +107,7 @@ public sealed class MxGatewayClientOptions
public required string ApiKey { get; init; }
public bool UseTls { get; init; }
public string? CaCertificatePath { get; init; }
public bool RequireCertificateValidation { get; init; }
public string? ServerNameOverride { get; init; }
public TimeSpan ConnectTimeout { get; init; } = TimeSpan.FromSeconds(10);
public TimeSpan DefaultCallTimeout { get; init; } = TimeSpan.FromSeconds(30);
@@ -124,6 +125,24 @@ or subscription changes because those calls can partially succeed in MXAccess.
API key may be loaded from `MXGATEWAY_API_KEY` by the CLI, not implicitly by the
library constructor unless a helper explicitly says it does that.
### TLS trust posture
The gateway can serve a self-signed certificate that it generates itself (it has
no PKI). To make that usable, TLS is **lenient by default**: when `UseTls` is set,
`CaCertificatePath` is empty, and `RequireCertificateValidation` is left `false`,
`CreateHttpHandler` installs a `RemoteCertificateValidationCallback` that returns
`true`, so the gateway's self-signed certificate is accepted without verification.
Traffic is still encrypted, but the server's identity is not authenticated.
To verify the gateway instead:
- set `CaCertificatePath` to pin a CA — validated via a `CustomRootTrust`
`X509Chain` against that root, and the callback additionally rejects a
hostname/SAN mismatch (`RemoteCertificateNameMismatch`); or
- set `RequireCertificateValidation` to `true` to keep the default OS/system-trust
verification on a connection with no pinned CA.
Pinning a CA always wins over the lenient default.
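The precedence described above can be summarised as a small decision function. This is a language-neutral sketch written in Python, not the .NET client's code: the `TrustMode` enum, the function name, and the treatment of `UseTls = false` as a plaintext connection are assumptions for illustration.

```python
from enum import Enum
from typing import Optional


class TrustMode(Enum):
    PLAINTEXT = "plaintext"        # assumption: no TLS when UseTls is false
    PINNED_CA = "pinned-ca"        # custom root trust plus hostname/SAN check
    SYSTEM_TRUST = "system-trust"  # default OS/system-trust verification
    LENIENT = "lenient"            # accept whatever certificate is presented


def resolve_trust_mode(
    use_tls: bool,
    ca_certificate_path: Optional[str],
    require_certificate_validation: bool,
) -> TrustMode:
    """Mirror the documented precedence: pinned CA, then strict flag, then lenient."""
    if not use_tls:
        return TrustMode.PLAINTEXT
    if ca_certificate_path:
        return TrustMode.PINNED_CA
    if require_certificate_validation:
        return TrustMode.SYSTEM_TRUST
    return TrustMode.LENIENT
```

Checking the pinned CA first is what makes pinning win even when `RequireCertificateValidation` is also set.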
## Auth Interceptor
Use a gRPC call credentials/interceptor layer to attach:
@@ -134,8 +134,8 @@ dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- advise --s
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- write --session-id <id> --server-handle 1 --item-handle 1 --type int32 --value 123 --json
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- write2 --session-id <id> --server-handle 1 --item-handle 1 --type int32 --value 123 --timestamp 2026-01-01T00:00:00Z --json
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- stream-events --session-id <id> --max-events 1 --json
-dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- stream-alarms --session-id <id> --max-messages 1 --json
-dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- acknowledge-alarm --session-id <id> --alarm-reference "\\Galaxy\Area001.Pump001.PumpFault" --json
+dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- stream-alarms --filter-prefix Area001 --max-events 1 --json
+dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- acknowledge-alarm --reference "\\Galaxy\Area001.Pump001.PumpFault" --comment "ack from cli" --operator operator1 --json
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item Area001.Pump001.Speed --json
```
@@ -196,6 +196,54 @@ dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- galaxy-las
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- galaxy-discover --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY
```
### Browsing lazily
For UI trees or OPC UA bridges, use `BrowseChildrenAsync` to walk one level at a
time instead of paging the full hierarchy. Pass an empty request for root objects;
subsequent calls supply `ParentGobjectId`, `ParentTagName`, or
`ParentContainedPath`. `ChildHasChildren[i]` is parallel to `Children[i]` and
tells you whether that child has children of its own, i.e. whether to draw an
expand triangle. The filter fields match those of `DiscoverHierarchy`. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.
```csharp
BrowseChildrenReply roots = await repository.BrowseChildrenAsync(
new BrowseChildrenRequest());
for (int i = 0; i < roots.Children.Count; i++)
{
GalaxyObject child = roots.Children[i];
bool hasChildren = roots.ChildHasChildren[i];
Console.WriteLine($"{child.TagName} expand={hasChildren}");
}
```
#### High-level walker
For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:
```csharp
await using GalaxyRepositoryClient repository = GalaxyRepositoryClient.Create(
new MxGatewayClientOptions { Endpoint = new Uri("http://localhost:5000"), ApiKey = apiKey });
IReadOnlyList<LazyBrowseNode> roots = await repository.BrowseAsync();
foreach (LazyBrowseNode root in roots)
{
if (root.HasChildrenHint)
{
await root.ExpandAsync();
}
foreach (LazyBrowseNode child in root.Children)
{
Console.WriteLine($"{child.Object.TagName} ({(child.HasChildrenHint ? "has children" : "leaf")})");
}
}
```
`ExpandAsync` is idempotent and safe under concurrent callers: calling it again
on an already-expanded node issues no further RPCs, and concurrent calls share a
single fetch of the children. To refresh after a Galaxy redeploy, call
`BrowseAsync` again from the root.
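One common way to get this "expand once, even under concurrency" behaviour is double-checked locking around the fetch. The sketch below shows the idea in Python with `asyncio`; it is not the `LazyBrowseNode` implementation, and the class name, the `fetch_children` callback, and the retry-after-failure behaviour are assumptions.

```python
import asyncio


class LazyNode:
    """Hypothetical lazily expanded tree node."""

    def __init__(self, fetch_children):
        self._fetch_children = fetch_children  # async callable returning a list
        self._lock = asyncio.Lock()
        self.children = []
        self.is_expanded = False

    async def expand(self):
        if self.is_expanded:
            return  # fast path: already expanded, no fetch
        async with self._lock:
            if self.is_expanded:
                return  # another caller finished while we waited for the lock
            # If the fetch raises, the node stays unexpanded with no children.
            self.children = await self._fetch_children()
            self.is_expanded = True
```

Callers that arrive while a fetch is in flight wait on the lock and then return through the second check, so only the first caller triggers the fetch.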
### Watching deploy events
`WatchDeployEventsAsync` opens the `WatchDeployEvents` server-streaming RPC. The
@@ -239,6 +287,17 @@ Use TLS options for a secured gateway:
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- smoke --endpoint https://ZB.MOM.WW.MxGateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name ZB.MOM.WW.MxGateway.example.local --api-key-env MXGATEWAY_API_KEY --item Area001.Pump001.Speed --json
```
### TLS trust
The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`UseTls` / `--tls`) with
no pinned CA accepts whatever certificate the gateway presents. To verify
instead, pin a CA with `CaCertificatePath` / `--ca-file` (this path also enforces
the certificate hostname/SAN match), or set `RequireCertificateValidation` to
force OS/system-trust verification without pinning. Use `ServerNameOverride` /
`--server-name` when the dialed host differs from the certificate SAN. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
## Integration Checks
Run live checks only when a gateway and MXAccess-backed worker are available:
@@ -251,6 +310,29 @@ $env:MXGATEWAY_TEST_ITEM = 'Area001.Pump001.Speed'
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- smoke --endpoint $env:MXGATEWAY_ENDPOINT --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
```
## Installing as a NuGet Package
The client publishes to the internal Gitea NuGet feed at
`https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json`.
Add the feed once:
````bash
dotnet nuget add source https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json \
--name dohertj2-gitea \
--username <gitea-username> \
--password <gitea-token-or-password> \
--store-password-in-clear-text
````
Then add the package to your project:
````bash
dotnet add package ZB.MOM.WW.MxGateway.Client --version 0.1.0
````
The `ZB.MOM.WW.MxGateway.Contracts` package is pulled in transitively.
## Related Documentation
- [Client Packaging](../../docs/ClientPackaging.md)
@@ -487,7 +487,7 @@ public static class MxGatewayClientCli
ReadBulkCommand command = new()
{
ServerHandle = arguments.GetInt32("server-handle"),
-TimeoutMs = (uint)arguments.GetInt32("timeout-ms", 0),
+TimeoutMs = ParseTimeoutMs(arguments, defaultValue: 0),
};
command.TagAddresses.Add(ParseStringList(arguments.GetRequired("items")));
@@ -692,6 +692,49 @@ public static class MxGatewayClientCli
}
}
/// <summary>
/// Parses the optional <c>--timeout-ms</c> argument as a non-negative
/// unsigned millisecond count. Mirrors the SDK-side <c>(uint)Math.Min</c>
/// guard on <c>MxGatewaySession.ReadBulkAsync</c>: a negative value
/// (e.g. <c>-1</c>, an easy copy-paste mistake for "unbounded") is
/// rejected loudly rather than silently wrapped to <c>~49.7 days</c>,
/// which would park one worker thread per pending tag for hours.
/// Resolves Client.Dotnet-021.
/// </summary>
private static uint ParseTimeoutMs(CliArguments arguments, int defaultValue)
{
int raw = arguments.GetInt32("timeout-ms", defaultValue);
if (raw < 0)
{
throw new ArgumentException(
"--timeout-ms must be a non-negative integer (use 0 for the gateway default).");
}
return (uint)raw;
}
/// <summary>
/// Extracts the <c>ServerHandle</c> from a Register reply, throwing a
/// descriptive <see cref="MxGatewayException"/> when the typed
/// <c>Register</c> payload is absent on an otherwise-successful reply.
/// The typed sub-message is the contract for the Register command, so
/// its absence must not silently fall through to
/// <c>ReturnValue.Int32Value</c> (which would be <c>0</c> for an empty
/// reply, driving the rest of the bench against an invalid handle).
/// Resolves Client.Dotnet-019.
/// </summary>
private static int RequireRegisterServerHandle(MxCommandReply reply, string sessionId)
{
if (reply.Register is null)
{
throw new MxGatewayException(
$"Gateway reply for Register on session '{sessionId}' (correlation '{reply.CorrelationId}') "
+ "succeeded but is missing the typed 'register' payload required to read ServerHandle.");
}
return reply.Register.ServerHandle;
}
/// <summary>
/// Cross-language stress benchmark for ReadBulk. Opens its own session,
/// subscribes to N tags so the worker's MxAccessValueCache populates from
@@ -712,7 +755,7 @@ public static class MxGatewayClientCli
int tagStart = arguments.GetInt32("tag-start", 1);
string tagPrefix = arguments.GetOptional("tag-prefix") ?? "TestMachine_";
string tagAttribute = arguments.GetOptional("tag-attribute") ?? "TestChangingInt";
-uint timeoutMs = (uint)arguments.GetInt32("timeout-ms", 1500);
+uint timeoutMs = ParseTimeoutMs(arguments, defaultValue: 1500);
string clientName = arguments.GetOptional("client-name") ?? "mxgw-dotnet-bench";
string[] tags = new string[bulkSize];
@@ -742,7 +785,7 @@ public static class MxGatewayClientCli
}),
cancellationToken)
.ConfigureAwait(false);
-int serverHandle = registerReply.Register?.ServerHandle ?? registerReply.ReturnValue.Int32Value;
+int serverHandle = RequireRegisterServerHandle(registerReply, sessionId);
SubscribeBulkCommand subscribe = new() { ServerHandle = serverHandle };
subscribe.TagAddresses.Add(tags);
@@ -801,8 +844,13 @@ public static class MxGatewayClientCli
.ConfigureAwait(false);
sw.Stop();
}
-catch
+catch (Exception ex) when (ex is not OperationCanceledException)
{
// Client.Dotnet-020: never swallow OperationCanceledException
// here. A bare `catch` would let Ctrl+C / parent CTS /
// wall-clock timeouts keep spinning until --duration-seconds
// elapsed, burning CPU and skewing the p99/max latency numbers
// with hundreds of immediate-OCE iterations.
sw.Stop();
failedCalls++;
latencyMillis.Add(sw.Elapsed.TotalMilliseconds);
@@ -0,0 +1,34 @@
using Grpc.Core;
using Grpc.Net.Client;
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
namespace ZB.MOM.WW.MxGateway.Client.Tests;
/// <summary>
/// Live smoke tests for the BrowseChildren RPC. Skipped by default; set
/// MXGATEWAY_API_KEY and MXGATEWAY_ENDPOINT to run against a real gateway.
/// </summary>
public sealed class BrowseChildrenSmokeTests
{
/// <summary>
/// Verifies that BrowseChildren returns a non-zero cache sequence and
/// a consistent children/child-has-children count from a live gateway.
/// </summary>
[Fact(Skip = "Set MXGATEWAY_API_KEY and MXGATEWAY_ENDPOINT to enable.")]
public async Task BrowseChildren_LiveGateway_ReturnsRootsWithCacheSequence()
{
string? apiKey = Environment.GetEnvironmentVariable("MXGATEWAY_API_KEY");
string endpoint = Environment.GetEnvironmentVariable("MXGATEWAY_ENDPOINT") ?? "http://localhost:5120";
Assert.False(string.IsNullOrEmpty(apiKey), "MXGATEWAY_API_KEY must be set.");
using GrpcChannel channel = GrpcChannel.ForAddress(endpoint);
GalaxyRepository.GalaxyRepositoryClient client = new(channel);
Metadata headers = new() { { "authorization", $"Bearer {apiKey}" } };
BrowseChildrenReply reply = await client.BrowseChildrenAsync(new BrowseChildrenRequest(), headers);
Assert.True(reply.CacheSequence > 0UL);
Assert.Equal(reply.Children.Count, reply.ChildHasChildren.Count);
}
}
@@ -48,6 +48,7 @@ internal sealed class FakeGalaxyRepositoryTransport(MxGatewayClientOptions optio
/// </summary>
public DiscoverHierarchyReply DiscoverHierarchyReply { get; set; } = new();
/// <summary>Gets the queue of discover hierarchy replies; dequeued in FIFO order.</summary>
public Queue<DiscoverHierarchyReply> DiscoverHierarchyReplies { get; } = new();
/// <summary>
@@ -122,6 +123,39 @@ internal sealed class FakeGalaxyRepositoryTransport(MxGatewayClientOptions optio
: DiscoverHierarchyReply);
}
/// <summary>Records BrowseChildren RPC calls made by the client.</summary>
public List<(BrowseChildrenRequest Request, CallOptions CallOptions)> BrowseChildrenCalls { get; } = [];
/// <summary>Default reply returned from BrowseChildren when the queue is empty.</summary>
public BrowseChildrenReply BrowseChildrenReply { get; set; } = new();
/// <summary>Queue of replies returned from BrowseChildren; dequeued in FIFO order.</summary>
public Queue<BrowseChildrenReply> BrowseChildrenReplies { get; } = new();
/// <summary>Queue of exceptions to throw from BrowseChildren; dequeued in FIFO order.</summary>
public Queue<Exception> BrowseChildrenExceptions { get; } = new();
/// <summary>
/// Records the request and either throws a queued exception or returns the configured reply.
/// </summary>
/// <param name="request">The BrowseChildrenRequest to process.</param>
/// <param name="callOptions">Call options specifying RPC behavior.</param>
public Task<BrowseChildrenReply> BrowseChildrenAsync(
BrowseChildrenRequest request,
CallOptions callOptions)
{
BrowseChildrenCalls.Add((request, callOptions));
if (BrowseChildrenExceptions.TryDequeue(out Exception? exception))
{
return Task.FromException<BrowseChildrenReply>(exception);
}
return Task.FromResult(
BrowseChildrenReplies.TryDequeue(out BrowseChildrenReply? reply)
? reply
: BrowseChildrenReply);
}
/// <summary>
/// Gets the list of WatchDeployEvents RPC calls made by the client.
/// </summary>
@@ -196,6 +196,8 @@ internal sealed class FakeGatewayTransport(MxGatewayClientOptions options) : IMx
/// <summary>
/// Records the acknowledge call and returns the next enqueued reply (or default).
/// </summary>
/// <param name="request">The acknowledge alarm request.</param>
/// <param name="callOptions">Call options specifying RPC behavior.</param>
public Task<AcknowledgeAlarmReply> AcknowledgeAlarmAsync(
AcknowledgeAlarmRequest request,
CallOptions callOptions)
@@ -219,6 +221,8 @@ internal sealed class FakeGatewayTransport(MxGatewayClientOptions options) : IMx
/// <summary>
/// Records the query call and yields each enqueued snapshot.
/// </summary>
/// <param name="request">The query active alarms request.</param>
/// <param name="callOptions">Call options specifying RPC behavior.</param>
public async IAsyncEnumerable<ActiveAlarmSnapshot> QueryActiveAlarmsAsync(
QueryActiveAlarmsRequest request,
CallOptions callOptions)
@@ -234,12 +238,14 @@ internal sealed class FakeGatewayTransport(MxGatewayClientOptions options) : IMx
}
/// <summary>Enqueues an acknowledge reply.</summary>
/// <param name="reply">The acknowledge reply to enqueue.</param>
public void AddAcknowledgeReply(AcknowledgeAlarmReply reply)
{
_acknowledgeReplies.Enqueue(reply);
}
/// <summary>Enqueues a snapshot to be yielded from QueryActiveAlarmsAsync.</summary>
/// <param name="snapshot">The snapshot to enqueue.</param>
public void AddActiveAlarmSnapshot(ActiveAlarmSnapshot snapshot)
{
_activeAlarmSnapshots.Add(snapshot);
@@ -248,6 +254,8 @@ internal sealed class FakeGatewayTransport(MxGatewayClientOptions options) : IMx
/// <summary>
/// Records the stream-alarms call and yields each enqueued feed message.
/// </summary>
/// <param name="request">The stream alarms request.</param>
/// <param name="callOptions">Call options specifying RPC behavior.</param>
public async IAsyncEnumerable<AlarmFeedMessage> StreamAlarmsAsync(
StreamAlarmsRequest request,
CallOptions callOptions)
@@ -263,6 +271,7 @@ internal sealed class FakeGatewayTransport(MxGatewayClientOptions options) : IMx
}
/// <summary>Enqueues an alarm feed message to be yielded from StreamAlarmsAsync.</summary>
/// <param name="message">The alarm feed message to enqueue.</param>
public void AddAlarmFeedMessage(AlarmFeedMessage message)
{
_alarmFeedMessages.Add(message);
@@ -181,6 +181,9 @@ public sealed class GalaxyRepositoryClientTests
Assert.Contains("repeated page token", exception.Message, StringComparison.Ordinal);
}
/// <summary>
/// Verifies that DiscoverHierarchyAsync maps typed filter options correctly to the request.
/// </summary>
[Fact]
public async Task DiscoverHierarchyAsync_WithOptions_MapsTypedFilters()
{
@@ -212,6 +215,9 @@ public sealed class GalaxyRepositoryClientTests
Assert.True(request.HistorizedOnly);
}
/// <summary>
/// Verifies that TestConnectionAsync retries on transient gRPC failures.
/// </summary>
[Fact]
public async Task TestConnectionAsync_RetriesOnTransientGrpcFailure()
{
@@ -0,0 +1,221 @@
using Grpc.Core;
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
namespace ZB.MOM.WW.MxGateway.Client.Tests;
/// <summary>
/// Tests for the <see cref="LazyBrowseNode"/> walker over the BrowseChildren RPC.
/// </summary>
public sealed class LazyBrowseNodeTests
{
/// <summary>
/// Verifies that calling BrowseAsync with no parent returns the root nodes
/// from the first BrowseChildren reply and surfaces the per-child has-children hint.
/// </summary>
[Fact]
public async Task Browse_NoParent_ReturnsRoots()
{
FakeGalaxyRepositoryTransport transport = CreateTransport();
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(1, "Plant", isArea: true), BuildObject(2, "Other")],
childHasChildren: [true, false],
cacheSequence: 1));
await using GalaxyRepositoryClient client = CreateClient(transport);
IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
Assert.Equal(2, roots.Count);
Assert.Equal("Plant", roots[0].Object.TagName);
Assert.True(roots[0].HasChildrenHint);
Assert.False(roots[0].IsExpanded);
Assert.Equal("Other", roots[1].Object.TagName);
Assert.False(roots[1].HasChildrenHint);
Assert.False(roots[1].IsExpanded);
}
/// <summary>
/// Verifies that ExpandAsync populates Children and marks the node expanded after one RPC.
/// </summary>
[Fact]
public async Task Expand_PopulatesChildrenAndMarksExpanded()
{
FakeGalaxyRepositoryTransport transport = CreateTransport();
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(1, "Plant", isArea: true)],
childHasChildren: [true],
cacheSequence: 1));
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(10, "Line1")],
childHasChildren: [false],
cacheSequence: 1));
await using GalaxyRepositoryClient client = CreateClient(transport);
IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
await roots[0].ExpandAsync();
Assert.True(roots[0].IsExpanded);
Assert.Single(roots[0].Children);
Assert.Equal("Line1", roots[0].Children[0].Object.TagName);
Assert.Equal(2, transport.BrowseChildrenCalls.Count);
}
/// <summary>
/// Verifies that a second ExpandAsync call is a no-op and issues no additional RPC.
/// </summary>
[Fact]
public async Task Expand_CalledTwice_NoSecondRpc()
{
FakeGalaxyRepositoryTransport transport = CreateTransport();
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(1, "Plant", isArea: true)],
childHasChildren: [true],
cacheSequence: 1));
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(10, "Line1")],
childHasChildren: [false],
cacheSequence: 1));
await using GalaxyRepositoryClient client = CreateClient(transport);
IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
await roots[0].ExpandAsync();
await roots[0].ExpandAsync();
Assert.Equal(2, transport.BrowseChildrenCalls.Count);
}
/// <summary>
/// Verifies that an RPC failure (NotFound) during expand is wrapped in MxGatewayException.
/// </summary>
[Fact]
public async Task Expand_UnknownParent_ThrowsMxGatewayException()
{
FakeGalaxyRepositoryTransport transport = CreateTransport();
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(1, "Plant", isArea: true)],
childHasChildren: [true],
cacheSequence: 1));
await using GalaxyRepositoryClient client = CreateClient(transport);
IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
// Queue the failure for the upcoming ExpandAsync call so it consumes
// the exception on its first RPC rather than the BrowseAsync above.
transport.BrowseChildrenExceptions.Enqueue(
new MxGatewayException(
"Parent not found",
new RpcException(new Status(StatusCode.NotFound, "Parent not found"))));
await Assert.ThrowsAsync<MxGatewayException>(async () => await roots[0].ExpandAsync());
Assert.False(roots[0].IsExpanded);
Assert.Empty(roots[0].Children);
}
/// <summary>
/// Verifies that ExpandAsync drains multi-page sibling replies and forwards the page token.
/// </summary>
[Fact]
public async Task Expand_MultiPageSiblings_GathersAllPages()
{
FakeGalaxyRepositoryTransport transport = CreateTransport();
// Roots
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(7, "Plant", isArea: true)],
childHasChildren: [true],
cacheSequence: 1));
// First child page (2 children) with a next token
BrowseChildrenReply childPage1 = BuildReply(
children: [BuildObject(70, "ChildA"), BuildObject(71, "ChildB")],
childHasChildren: [false, false],
cacheSequence: 1);
childPage1.NextPageToken = "7:abc:2";
transport.BrowseChildrenReplies.Enqueue(childPage1);
// Second child page (1 child) with no next token
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(72, "ChildC")],
childHasChildren: [false],
cacheSequence: 1));
await using GalaxyRepositoryClient client = CreateClient(transport);
IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
await roots[0].ExpandAsync();
Assert.Equal(3, roots[0].Children.Count);
Assert.Equal(3, transport.BrowseChildrenCalls.Count);
Assert.Equal("7:abc:2", transport.BrowseChildrenCalls[2].Request.PageToken);
}
/// <summary>
/// Verifies that ten concurrent ExpandAsync calls issue exactly one RPC, not ten.
/// </summary>
[Fact]
public async Task Expand_CalledConcurrently_OnlyFiresOneRpc()
{
FakeGalaxyRepositoryTransport transport = CreateTransport();
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(1, "Plant", isArea: true)],
childHasChildren: [true],
cacheSequence: 7));
transport.BrowseChildrenReplies.Enqueue(BuildReply(
children: [BuildObject(2, "Mixer_001")],
childHasChildren: [false],
cacheSequence: 7));
await using GalaxyRepositoryClient client = CreateClient(transport);
IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
// Fire ten concurrent expands of the same node.
Task[] tasks = Enumerable.Range(0, 10)
.Select(_ => roots[0].ExpandAsync())
.ToArray();
await Task.WhenAll(tasks);
Assert.True(roots[0].IsExpanded);
Assert.Single(roots[0].Children);
// 1 roots fetch + exactly 1 expand fetch = 2 total
Assert.Equal(2, transport.BrowseChildrenCalls.Count);
}
/// <summary>
/// Verifies that BrowseChildrenOptions filter fields are forwarded to the BrowseChildren request.
/// </summary>
[Fact]
public async Task Browse_WithFilter_ForwardsToRequest()
{
FakeGalaxyRepositoryTransport transport = CreateTransport();
await using GalaxyRepositoryClient client = CreateClient(transport);
await client.BrowseAsync(new BrowseChildrenOptions
{
TagNameGlob = "Mixer*",
AlarmBearingOnly = true,
});
BrowseChildrenRequest request = Assert.Single(transport.BrowseChildrenCalls).Request;
Assert.Equal("Mixer*", request.TagNameGlob);
Assert.True(request.AlarmBearingOnly);
}
private static GalaxyObject BuildObject(int id, string tag, bool isArea = false)
=> new() { GobjectId = id, TagName = tag, BrowseName = tag, IsArea = isArea };
private static BrowseChildrenReply BuildReply(
IReadOnlyList<GalaxyObject> children,
IReadOnlyList<bool> childHasChildren,
ulong cacheSequence)
{
BrowseChildrenReply reply = new() { TotalChildCount = children.Count, CacheSequence = cacheSequence };
reply.Children.AddRange(children);
reply.ChildHasChildren.AddRange(childHasChildren);
return reply;
}
private static GalaxyRepositoryClient CreateClient(FakeGalaxyRepositoryTransport transport)
=> new(transport.Options, transport);
private static FakeGalaxyRepositoryTransport CreateTransport()
=> new(new MxGatewayClientOptions
{
Endpoint = new Uri("http://localhost:5000"),
ApiKey = "test-api-key",
});
}
@@ -11,6 +11,7 @@ namespace ZB.MOM.WW.MxGateway.Client.Tests;
/// </summary>
public sealed class MxGatewayClientAlarmsTests
{
/// <summary>AcknowledgeAlarmAsync records request and returns reply.</summary>
[Fact]
public async Task AcknowledgeAlarmAsync_RecordsRequestShapeAndReturnsReply()
{
@@ -46,6 +47,7 @@ public sealed class MxGatewayClientAlarmsTests
Assert.Equal("Bearer test-api-key", call.CallOptions.Headers?.GetValue("authorization"));
}
/// <summary>AcknowledgeAlarmAsync honors cancellation.</summary>
[Fact]
public async Task AcknowledgeAlarmAsync_HonorsCancellation()
{
@@ -69,6 +71,7 @@ public sealed class MxGatewayClientAlarmsTests
cancellation.Token));
}
/// <summary>AcknowledgeAlarmAsync maps unauthenticated RPC exception to typed exception.</summary>
[Fact]
public async Task AcknowledgeAlarmAsync_MapsUnauthenticated_RpcException_ToTypedException()
{
@@ -93,6 +96,7 @@ public sealed class MxGatewayClientAlarmsTests
Assert.Equal(StatusCode.Unauthenticated, ex.StatusCode);
}
/// <summary>QueryActiveAlarmsAsync streams enqueued snapshots.</summary>
[Fact]
public async Task QueryActiveAlarmsAsync_StreamsEnqueuedSnapshots()
{
@@ -117,6 +121,7 @@ public sealed class MxGatewayClientAlarmsTests
Assert.Single(transport.QueryActiveAlarmsCalls);
}
/// <summary>QueryActiveAlarmsAsync passes filter prefix.</summary>
[Fact]
public async Task QueryActiveAlarmsAsync_PassesFilterPrefix()
{
@@ -136,6 +141,7 @@ public sealed class MxGatewayClientAlarmsTests
Assert.Equal("Tank01.", call.Request.AlarmFilterPrefix);
}
/// <summary>QueryActiveAlarmsAsync honors cancellation during enumeration.</summary>
[Fact]
public async Task QueryActiveAlarmsAsync_HonorsCancellationDuringEnumeration()
{
@@ -509,6 +509,356 @@ public sealed class MxGatewayClientCliTests
Assert.Contains("gateway-protocol=", text);
}
/// <summary>
/// Client.Dotnet-018: the README CLI examples for the alarm subcommands at
/// `clients/dotnet/README.md` must drive cleanly through the production
/// CLI argument parser. The previous text used non-existent flags
/// (`--session-id`, `--max-messages`, `--alarm-reference`) that would
/// fail with "Unknown command" / "Missing required option --reference".
/// Each documented example is extracted from the README, parsed via the
/// production <see cref="MxGatewayClientCli.RunAsync"/>, and asserted
/// against exit code 0.
/// </summary>
/// <param name="command">The alarm subcommand to validate (e.g. "stream-alarms", "acknowledge-alarm").</param>
[Theory]
[InlineData("stream-alarms")]
[InlineData("acknowledge-alarm")]
public async Task RunAsync_ReadmeExamples_ForAlarmCommands_ParseSuccessfully(string command)
{
string readme = LocateClientReadme();
string[] commandLine = ExtractReadmeCommandLine(readme, command);
// The documented examples do not include --api-key (the README assumes
// the env var path documented elsewhere). Inject an API key via the
// standard env var so CreateOptions succeeds and the parser fully
// exercises the documented flag shape.
string? previousKey = Environment.GetEnvironmentVariable("MXGATEWAY_API_KEY");
Environment.SetEnvironmentVariable("MXGATEWAY_API_KEY", "test-api-key");
try
{
using var output = new StringWriter();
using var error = new StringWriter();
FakeCliClient fakeClient = new();
fakeClient.AlarmFeedMessages.Add(new AlarmFeedMessage
{
ActiveAlarm = new ActiveAlarmSnapshot { AlarmFullReference = "fixture" },
});
fakeClient.AcknowledgeAlarmReplies.Enqueue(new AcknowledgeAlarmReply
{
CorrelationId = "ack-fixture",
ProtocolStatus = new ProtocolStatus { Code = ProtocolStatusCode.Ok },
});
int exitCode = await MxGatewayClientCli.RunAsync(
commandLine,
output,
error,
_ => fakeClient);
Assert.True(
exitCode == 0,
$"README example for '{command}' exited {exitCode}; stderr=<<{error}>>");
Assert.DoesNotContain("Unknown command", error.ToString());
Assert.DoesNotContain("Missing required option", error.ToString());
}
finally
{
Environment.SetEnvironmentVariable("MXGATEWAY_API_KEY", previousKey);
}
}
/// <summary>
/// Client.Dotnet-019: `BenchReadBulkAsync` previously fell back to
/// <c>reply.ReturnValue.Int32Value</c> when the register reply had no
/// typed <c>Register</c> payload, silently driving the rest of the bench
/// against a zero server handle. The fix must fail loudly with a
/// descriptive <see cref="MxGatewayException"/>.
/// </summary>
[Fact]
public async Task RunAsync_BenchReadBulk_WhenRegisterReplyMissingTypedPayload_FailsLoudly()
{
using var output = new StringWriter();
using var error = new StringWriter();
FakeCliClient fakeClient = new();
// Successful protocol + MX status but no typed `Register` payload.
// Before the Client.Dotnet-019 fix this silently became serverHandle=0
// and the bench proceeded through SubscribeBulk / warmup / steady-state
// against an invalid handle, producing a misleading zero-result summary.
fakeClient.InvokeReplies.Enqueue(new MxCommandReply
{
SessionId = "session-fixture",
Kind = MxCommandKind.Register,
ProtocolStatus = new ProtocolStatus { Code = ProtocolStatusCode.Ok },
});
int exitCode = await MxGatewayClientCli.RunAsync(
[
"bench-read-bulk",
"--endpoint",
"http://localhost:5000",
"--api-key",
"test-api-key",
"--duration-seconds",
"1",
"--warmup-seconds",
"0",
"--bulk-size",
"1",
],
output,
error,
_ => fakeClient);
Assert.Equal(1, exitCode);
// Descriptive message that names the missing typed payload.
string err = error.ToString();
Assert.Contains("Register", err);
// The bench must not produce any aggregate stats JSON.
Assert.DoesNotContain("bench-read-bulk", output.ToString());
}
/// <summary>
/// Client.Dotnet-020: the steady-state loop in `BenchReadBulkAsync` had a
/// bare `catch { failedCalls++; continue; }` that swallowed
/// <see cref="OperationCanceledException"/>, so token-driven cancellation
/// kept spinning until <c>--duration-seconds</c> elapsed. After the fix
/// the bench must exit promptly when the supplied token cancels.
/// </summary>
[Fact]
public async Task RunAsync_BenchReadBulk_WhenSteadyStateLoopReceivesCancellation_ExitsPromptly()
{
using var output = new StringWriter();
using var error = new StringWriter();
int invokeCount = 0;
FakeCliClient fakeClient = new()
{
InvokeHandler = (request, ct) =>
{
int n = Interlocked.Increment(ref invokeCount);
// Reply 1 = Register (success with typed payload).
if (request.Command.Kind == MxCommandKind.Register)
{
return Task.FromResult(new MxCommandReply
{
SessionId = "session-fixture",
Kind = MxCommandKind.Register,
ProtocolStatus = new ProtocolStatus { Code = ProtocolStatusCode.Ok },
Register = new RegisterReply { ServerHandle = 1 },
});
}
// Reply 2 = SubscribeBulk (success).
if (request.Command.Kind == MxCommandKind.SubscribeBulk)
{
var subscribeReply = new MxCommandReply
{
SessionId = "session-fixture",
Kind = MxCommandKind.SubscribeBulk,
ProtocolStatus = new ProtocolStatus { Code = ProtocolStatusCode.Ok },
SubscribeBulk = new BulkSubscribeReply(),
};
return Task.FromResult(subscribeReply);
}
// n counts every invocation (Register = 1, SubscribeBulk = 2), so
// n <= 3 lets only the first ReadBulk succeed and the steady-state
// loop start iterating. Every later ReadBulk simulates cancellation.
if (request.Command.Kind == MxCommandKind.ReadBulk && n <= 3)
{
return Task.FromResult(new MxCommandReply
{
SessionId = "session-fixture",
Kind = MxCommandKind.ReadBulk,
ProtocolStatus = new ProtocolStatus { Code = ProtocolStatusCode.Ok },
ReadBulk = new BulkReadReply(),
});
}
// From here on every ReadBulk throws OCE — the steady-state
// loop must exit promptly rather than spinning until
// --duration-seconds elapses.
throw new OperationCanceledException();
},
};
var sw = System.Diagnostics.Stopwatch.StartNew();
await Assert.ThrowsAsync<OperationCanceledException>(async () =>
await MxGatewayClientCli.RunAsync(
[
"bench-read-bulk",
"--endpoint",
"http://localhost:5000",
"--api-key",
"test-api-key",
"--duration-seconds",
"30",
"--warmup-seconds",
"0",
"--bulk-size",
"1",
],
output,
error,
_ => fakeClient));
sw.Stop();
// Without the fix the loop swallows OCE and continues until the 30 s
// steady-state deadline expires. With the fix it exits as soon as OCE
// surfaces. Generous 10 s ceiling to keep the test stable under load.
Assert.True(
sw.Elapsed < TimeSpan.FromSeconds(10),
$"Bench did not exit promptly on cancellation; took {sw.Elapsed}.");
}
/// <summary>
/// Client.Dotnet-021: both `ReadBulkAsync` and `BenchReadBulkAsync` cast
/// the user-supplied <c>--timeout-ms</c> to <see cref="uint"/> without
/// bounds checking, so a negative value (e.g. <c>-1</c>) silently wraps
/// to ~49.7 days. The fix must reject negatives with a clear error.
/// </summary>
/// <param name="command">The bulk-read subcommand to validate (e.g. "read-bulk", "bench-read-bulk").</param>
[Theory]
[InlineData("read-bulk")]
[InlineData("bench-read-bulk")]
public async Task RunAsync_TimeoutMs_NegativeValue_RejectsWithClearError(string command)
{
using var output = new StringWriter();
using var error = new StringWriter();
FakeCliClient fakeClient = new();
string[] args = command is "read-bulk"
? [
"read-bulk",
"--endpoint",
"http://localhost:5000",
"--api-key",
"test-api-key",
"--session-id",
"session-fixture",
"--server-handle",
"1",
"--items",
"Area001.Pump001.Speed",
"--timeout-ms",
"-1",
]
: [
"bench-read-bulk",
"--endpoint",
"http://localhost:5000",
"--api-key",
"test-api-key",
"--duration-seconds",
"1",
"--warmup-seconds",
"0",
"--bulk-size",
"1",
"--timeout-ms",
"-1",
];
int exitCode = await MxGatewayClientCli.RunAsync(
args,
output,
error,
_ => fakeClient);
Assert.NotEqual(0, exitCode);
string err = error.ToString();
Assert.Contains("timeout-ms", err);
Assert.Contains("non-negative", err);
}
/// <summary>
/// Locates the .NET client README by walking up from the test assembly's
/// base directory until <c>clients/dotnet/README.md</c> is found. Keeps
/// the regression test independent of the current working directory.
/// </summary>
private static string LocateClientReadme()
{
string? directory = AppContext.BaseDirectory;
while (!string.IsNullOrEmpty(directory))
{
string candidate = Path.Combine(directory, "clients", "dotnet", "README.md");
if (File.Exists(candidate))
{
return candidate;
}
directory = Path.GetDirectoryName(directory);
}
throw new FileNotFoundException("clients/dotnet/README.md not found above test assembly base directory.");
}
/// <summary>
/// Extracts the documented CLI invocation for the requested subcommand
/// from the README, returning only the arguments after the
/// <c>mxgw-dotnet</c>-equivalent prefix so they can be passed straight
/// to <see cref="MxGatewayClientCli.RunAsync"/>.
/// </summary>
private static string[] ExtractReadmeCommandLine(string readmePath, string command)
{
string[] lines = File.ReadAllLines(readmePath);
// Look for the documented `dotnet run ... -- <command> ...` line.
foreach (string line in lines)
{
int dashes = line.IndexOf("-- " + command, StringComparison.Ordinal);
if (dashes < 0)
{
continue;
}
string after = line[(dashes + 3)..].Trim();
// Tokenize by whitespace, respecting "..." quoted segments.
return TokenizeCommandLine(after);
}
throw new InvalidOperationException(
$"README at '{readmePath}' has no documented example for subcommand '{command}'.");
}
/// <summary>
/// Splits a single command-line string into argv tokens, honouring
/// double-quoted segments so paths with embedded spaces survive intact.
/// </summary>
private static string[] TokenizeCommandLine(string input)
{
var tokens = new List<string>();
var current = new System.Text.StringBuilder();
bool inQuotes = false;
foreach (char ch in input)
{
if (ch == '"')
{
inQuotes = !inQuotes;
continue;
}
if (!inQuotes && char.IsWhiteSpace(ch))
{
if (current.Length > 0)
{
tokens.Add(current.ToString());
current.Clear();
}
continue;
}
current.Append(ch);
}
if (current.Length > 0)
{
tokens.Add(current.ToString());
}
return tokens.ToArray();
}
/// <summary>Fake CLI client for testing.</summary>
private sealed class FakeCliClient : IMxGatewayCliClient
{
@@ -527,6 +877,9 @@ public sealed class MxGatewayClientCliTests
/// <summary>Exception to throw on invoke, if any.</summary>
public Exception? InvokeFailure { get; init; }
/// <summary>Optional per-call handler that overrides queue-based behaviour.</summary>
public Func<MxCommandRequest, CancellationToken, Task<MxCommandReply>>? InvokeHandler { get; init; }
/// <inheritdoc />
public ValueTask DisposeAsync()
{
@@ -572,6 +925,11 @@ public sealed class MxGatewayClientCliTests
throw InvokeFailure;
}
if (InvokeHandler is not null)
{
return InvokeHandler(request, cancellationToken);
}
return Task.FromResult(InvokeReplies.Dequeue());
}
@@ -632,6 +990,7 @@ public sealed class MxGatewayClientCliTests
/// <summary>Galaxy discover hierarchy reply to return.</summary>
public DiscoverHierarchyReply GalaxyDiscoverHierarchyReply { get; set; } = new();
/// <summary>Queue of galaxy discover hierarchy replies to return.</summary>
public Queue<DiscoverHierarchyReply> GalaxyDiscoverHierarchyReplies { get; } = new();
/// <summary>List of received galaxy test connection requests.</summary>
@@ -0,0 +1,85 @@
using System.Net.Http;
using System.Net.Security;
using ZB.MOM.WW.MxGateway.Client;
namespace ZB.MOM.WW.MxGateway.Client.Tests;
public sealed class MxGatewayClientTlsHandlerTests
{
/// <summary>
/// Verifies that when TLS is used with no pinned CA and RequireCertificateValidation is false (default),
/// the handler installs an accept-all callback so the gateway's self-signed cert is trusted.
/// The callback must return true regardless of chain errors.
/// </summary>
[Fact]
public void Handler_SkipsVerification_WhenTlsAndNoCaPinned()
{
MxGatewayClientOptions options = new()
{
Endpoint = new Uri("https://localhost:5120"),
ApiKey = "k",
UseTls = true,
};
using SocketsHttpHandler handler = MxGatewayClient.CreateHttpHandlerForTests(options);
Assert.NotNull(handler.SslOptions.RemoteCertificateValidationCallback);
Assert.True(handler.SslOptions.RemoteCertificateValidationCallback!(null!, null!, null, SslPolicyErrors.RemoteCertificateChainErrors));
}
/// <summary>
/// Verifies that when RequireCertificateValidation is true, the callback is left null
/// so the OS trust store performs validation.
/// </summary>
[Fact]
public void Handler_KeepsDefaultVerification_WhenRequireCertificateValidation()
{
MxGatewayClientOptions options = new()
{
Endpoint = new Uri("https://localhost:5120"),
ApiKey = "k",
UseTls = true,
RequireCertificateValidation = true,
};
using SocketsHttpHandler handler = MxGatewayClient.CreateHttpHandlerForTests(options);
Assert.Null(handler.SslOptions.RemoteCertificateValidationCallback);
}
}
public sealed class GalaxyRepositoryClientTlsHandlerTests
{
/// <summary>
/// Verifies that when TLS is used with no pinned CA and RequireCertificateValidation is false (default),
/// the Galaxy client handler installs an accept-all callback so the gateway's self-signed cert is trusted.
/// The callback must return true regardless of chain errors.
/// </summary>
[Fact]
public void Handler_SkipsVerification_WhenTlsAndNoCaPinned()
{
MxGatewayClientOptions options = new()
{
Endpoint = new Uri("https://localhost:5120"),
ApiKey = "k",
UseTls = true,
};
using SocketsHttpHandler handler = GalaxyRepositoryClient.CreateHttpHandlerForTests(options);
Assert.NotNull(handler.SslOptions.RemoteCertificateValidationCallback);
Assert.True(handler.SslOptions.RemoteCertificateValidationCallback!(null!, null!, null, SslPolicyErrors.RemoteCertificateChainErrors));
}
/// <summary>
/// Verifies that when RequireCertificateValidation is true, the Galaxy client callback is left null
/// so the OS trust store performs validation.
/// </summary>
[Fact]
public void Handler_KeepsDefaultVerification_WhenRequireCertificateValidation()
{
MxGatewayClientOptions options = new()
{
Endpoint = new Uri("https://localhost:5120"),
ApiKey = "k",
UseTls = true,
RequireCertificateValidation = true,
};
using SocketsHttpHandler handler = GalaxyRepositoryClient.CreateHttpHandlerForTests(options);
Assert.Null(handler.SslOptions.RemoteCertificateValidationCallback);
}
}
@@ -0,0 +1,26 @@
namespace ZB.MOM.WW.MxGateway.Client;
/// <summary>
/// Filters and shape options for <see cref="GalaxyRepositoryClient.BrowseAsync(BrowseChildrenOptions, System.Threading.CancellationToken)"/>.
/// Mirror of <see cref="DiscoverHierarchyOptions"/> for the lazy-browse path.
/// </summary>
public sealed class BrowseChildrenOptions
{
/// <summary>Restrict to children whose Galaxy category is in this set.</summary>
public IReadOnlyList<int> CategoryIds { get; init; } = [];
/// <summary>Restrict to children whose template chain contains any of these tokens.</summary>
public IReadOnlyList<string> TemplateChainContains { get; init; } = [];
/// <summary>Optional glob-style filter on <c>tag_name</c>.</summary>
public string? TagNameGlob { get; init; }
/// <summary>Whether to populate each <c>GalaxyObject.Attributes</c>. Null leaves the server default.</summary>
public bool? IncludeAttributes { get; init; }
/// <summary>Restrict to children that bear at least one alarm attribute.</summary>
public bool AlarmBearingOnly { get; init; }
/// <summary>Restrict to children that have at least one historized attribute.</summary>
public bool HistorizedOnly { get; init; }
}
@@ -19,6 +19,7 @@ namespace ZB.MOM.WW.MxGateway.Client;
public sealed class GalaxyRepositoryClient : IAsyncDisposable
{
private const int DiscoverHierarchyPageSize = 5000;
private const int BrowseChildrenPageSize = 500;
private readonly GrpcChannel? _channel;
private readonly IGalaxyRepositoryClientTransport _transport;
@@ -182,6 +183,10 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
return await DiscoverHierarchyAsync(new DiscoverHierarchyOptions(), cancellationToken).ConfigureAwait(false);
}
/// <summary>Discovers the Galaxy object hierarchy.</summary>
/// <param name="options">Hierarchy discovery filter options.</param>
/// <param name="cancellationToken">Token to observe for cancellation.</param>
/// <returns>The collection of Galaxy objects in the hierarchy.</returns>
public async Task<IReadOnlyList<GalaxyObject>> DiscoverHierarchyAsync(
DiscoverHierarchyOptions options,
CancellationToken cancellationToken = default)
@@ -274,6 +279,89 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
cancellationToken);
}
/// <summary>Returns root-level browse nodes (objects with no parent).</summary>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>The list of root <see cref="LazyBrowseNode"/> instances.</returns>
public Task<IReadOnlyList<LazyBrowseNode>> BrowseAsync(CancellationToken cancellationToken = default)
=> BrowseAsync(null, cancellationToken);
/// <summary>Returns root-level browse nodes filtered by the given options.</summary>
/// <param name="options">Browse filter options. Null applies no filter.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>The list of root <see cref="LazyBrowseNode"/> instances.</returns>
public async Task<IReadOnlyList<LazyBrowseNode>> BrowseAsync(
BrowseChildrenOptions? options,
CancellationToken cancellationToken = default)
{
BrowseChildrenOptions effective = options ?? new BrowseChildrenOptions();
List<LazyBrowseNode> roots = [];
string pageToken = string.Empty;
HashSet<string> seenPageTokens = new(StringComparer.Ordinal);
do
{
BrowseChildrenRequest request = BuildBrowseChildrenRequest(effective);
request.PageToken = pageToken;
BrowseChildrenReply reply = await BrowseChildrenRawAsync(request, cancellationToken).ConfigureAwait(false);
for (int i = 0; i < reply.Children.Count; i++)
{
bool hint = i < reply.ChildHasChildren.Count && reply.ChildHasChildren[i];
roots.Add(new LazyBrowseNode(this, reply.Children[i], hint, effective));
}
pageToken = reply.NextPageToken;
if (!string.IsNullOrWhiteSpace(pageToken) && !seenPageTokens.Add(pageToken))
{
throw new MxGatewayException(
$"Galaxy BrowseChildren returned a repeated page token '{pageToken}'.");
}
}
while (!string.IsNullOrWhiteSpace(pageToken));
return roots;
}
/// <summary>Issues a raw BrowseChildren RPC without result wrapping.</summary>
/// <param name="request">The browse-children request.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>The raw server reply.</returns>
public Task<BrowseChildrenReply> BrowseChildrenRawAsync(
BrowseChildrenRequest request,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(request);
ThrowIfDisposed();
return ExecuteSafeUnaryAsync(
token => _transport.BrowseChildrenAsync(request, CreateCallOptions(token)),
cancellationToken);
}
internal static BrowseChildrenRequest BuildBrowseChildrenRequest(BrowseChildrenOptions options)
{
ArgumentNullException.ThrowIfNull(options);
BrowseChildrenRequest request = new()
{
PageSize = BrowseChildrenPageSize,
AlarmBearingOnly = options.AlarmBearingOnly,
HistorizedOnly = options.HistorizedOnly,
};
request.CategoryIds.Add(options.CategoryIds);
request.TemplateChainContains.Add(options.TemplateChainContains);
if (!string.IsNullOrWhiteSpace(options.TagNameGlob))
{
request.TagNameGlob = options.TagNameGlob;
}
if (options.IncludeAttributes.HasValue)
{
request.IncludeAttributes = options.IncludeAttributes.Value;
}
return request;
}
/// <summary>
/// Subscribes to Galaxy deploy events. The server emits a bootstrap event with the
/// current state on subscribe so callers can prime their cache, then emits one event
@@ -402,7 +490,10 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
.ConfigureAwait(false);
}
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options)
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options) =>
CreateHttpHandlerForTests(options);
internal static SocketsHttpHandler CreateHttpHandlerForTests(MxGatewayClientOptions options)
{
SocketsHttpHandler handler = new()
{
@@ -422,6 +513,11 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
X509Certificate2 trustedRoot = X509CertificateLoader.LoadCertificateFromFile(options.CaCertificatePath);
handler.SslOptions.RemoteCertificateValidationCallback = (_, certificate, chain, errors) =>
{
if ((errors & System.Net.Security.SslPolicyErrors.RemoteCertificateNameMismatch) != 0)
{
return false;
}
if (certificate is null)
{
return false;
@@ -437,6 +533,10 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
return customChain.Build(certificateToValidate);
};
}
else if (!options.RequireCertificateValidation)
{
handler.SslOptions.RemoteCertificateValidationCallback = (_, _, _, _) => true;
}
}
return handler;
@@ -74,6 +74,23 @@ internal sealed class GrpcGalaxyRepositoryClientTransport(
}
}
/// <inheritdoc />
public async Task<BrowseChildrenReply> BrowseChildrenAsync(
BrowseChildrenRequest request,
CallOptions callOptions)
{
try
{
return await RawClient.BrowseChildrenAsync(request, callOptions)
.ResponseAsync
.ConfigureAwait(false);
}
catch (RpcException exception)
{
throw MapRpcException(exception, callOptions.CancellationToken);
}
}
/// <inheritdoc />
public async IAsyncEnumerable<DeployEvent> WatchDeployEventsAsync(
WatchDeployEventsRequest request,
@@ -33,6 +33,13 @@ internal interface IGalaxyRepositoryClientTransport
DiscoverHierarchyRequest request,
CallOptions callOptions);
/// <summary>Returns direct children of a parent in the Galaxy hierarchy.</summary>
/// <param name="request">The browse children request.</param>
/// <param name="callOptions">gRPC call options (timeout, cancellation, etc.).</param>
Task<BrowseChildrenReply> BrowseChildrenAsync(
BrowseChildrenRequest request,
CallOptions callOptions);
/// <summary>Watches for deployment events from the Galaxy Repository server.</summary>
/// <param name="request">The watch deploy events request.</param>
/// <param name="callOptions">gRPC call options (timeout, cancellation, etc.).</param>
@@ -0,0 +1,101 @@
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
namespace ZB.MOM.WW.MxGateway.Client;
/// <summary>
/// One node in a lazy-loaded Galaxy browse tree. Holds the underlying
/// <see cref="GalaxyObject"/> and exposes <see cref="ExpandAsync"/> to fetch
/// its direct children on demand. Expansion is one-shot: a second call is a
/// no-op. Pagination of large sibling sets is handled internally.
/// </summary>
public sealed class LazyBrowseNode
{
private readonly GalaxyRepositoryClient _client;
private readonly BrowseChildrenOptions _options;
private readonly List<LazyBrowseNode> _children = [];
private readonly SemaphoreSlim _expandLock = new(1, 1);
private bool _isExpanded;
internal LazyBrowseNode(
GalaxyRepositoryClient client,
GalaxyObject @object,
bool hasChildrenHint,
BrowseChildrenOptions options)
{
_client = client;
Object = @object;
HasChildrenHint = hasChildrenHint;
_options = options;
}
/// <summary>The underlying Galaxy object for this node.</summary>
public GalaxyObject Object { get; }
/// <summary>True when the server reports this node has at least one matching descendant.</summary>
public bool HasChildrenHint { get; }
/// <summary>Direct children loaded by <see cref="ExpandAsync"/>; empty until then.</summary>
public IReadOnlyList<LazyBrowseNode> Children => _children;
/// <summary>True after the first <see cref="ExpandAsync"/> call completes.</summary>
public bool IsExpanded => _isExpanded;
/// <summary>
/// Fetches direct children from the gateway and populates <see cref="Children"/>.
/// Idempotent: subsequent calls are no-ops.
/// </summary>
/// <remarks>
/// Thread-safe: concurrent callers see exactly one fetch; subsequent callers
/// (after the first completes) return immediately.
/// </remarks>
/// <param name="cancellationToken">Token to observe for cancellation.</param>
public async Task ExpandAsync(CancellationToken cancellationToken = default)
{
if (_isExpanded)
{
return;
}
await _expandLock.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
if (_isExpanded)
{
return;
}
string pageToken = string.Empty;
HashSet<string> seenPageTokens = new(StringComparer.Ordinal);
do
{
BrowseChildrenRequest request = GalaxyRepositoryClient.BuildBrowseChildrenRequest(_options);
request.ParentGobjectId = Object.GobjectId;
request.PageToken = pageToken;
BrowseChildrenReply reply = await _client
.BrowseChildrenRawAsync(request, cancellationToken)
.ConfigureAwait(false);
for (int i = 0; i < reply.Children.Count; i++)
{
bool hint = i < reply.ChildHasChildren.Count && reply.ChildHasChildren[i];
_children.Add(new LazyBrowseNode(_client, reply.Children[i], hint, _options));
}
pageToken = reply.NextPageToken;
if (!string.IsNullOrWhiteSpace(pageToken) && !seenPageTokens.Add(pageToken))
{
throw new MxGatewayException(
$"Galaxy BrowseChildren returned a repeated page token '{pageToken}'.");
}
}
while (!string.IsNullOrWhiteSpace(pageToken));
_isExpanded = true;
}
finally
{
_expandLock.Release();
}
}
}
@@ -315,7 +315,10 @@ public sealed class MxGatewayClient : IAsyncDisposable
.ConfigureAwait(false);
}
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options)
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options) =>
CreateHttpHandlerForTests(options);
internal static SocketsHttpHandler CreateHttpHandlerForTests(MxGatewayClientOptions options)
{
SocketsHttpHandler handler = new()
{
@@ -335,6 +338,11 @@ public sealed class MxGatewayClient : IAsyncDisposable
X509Certificate2 trustedRoot = X509CertificateLoader.LoadCertificateFromFile(options.CaCertificatePath);
handler.SslOptions.RemoteCertificateValidationCallback = (_, certificate, chain, errors) =>
{
if ((errors & System.Net.Security.SslPolicyErrors.RemoteCertificateNameMismatch) != 0)
{
return false;
}
if (certificate is null)
{
return false;
@@ -350,6 +358,10 @@ public sealed class MxGatewayClient : IAsyncDisposable
return customChain.Build(certificateToValidate);
};
}
else if (!options.RequireCertificateValidation)
{
handler.SslOptions.RemoteCertificateValidationCallback = (_, _, _, _) => true;
}
}
return handler;
@@ -7,9 +7,11 @@ namespace ZB.MOM.WW.MxGateway.Client;
/// </summary>
public static class MxGatewayClientContractInfo
{
/// <inheritdoc cref="GatewayContractInfo.GatewayProtocolVersion"/>
public const uint GatewayProtocolVersion =
GatewayContractInfo.GatewayProtocolVersion;
/// <inheritdoc cref="GatewayContractInfo.WorkerProtocolVersion"/>
public const uint WorkerProtocolVersion =
GatewayContractInfo.WorkerProtocolVersion;
}
@@ -27,6 +27,14 @@ public sealed class MxGatewayClientOptions
/// </summary>
public string? CaCertificatePath { get; init; }
/// <summary>
/// When true, TLS connections without a pinned <see cref="CaCertificatePath"/>
/// use the OS trust store. When false (default), the gateway certificate is
/// accepted without verification — appropriate for this internal tool's
/// auto-generated self-signed certificate. Pinning a CA always verifies.
/// </summary>
public bool RequireCertificateValidation { get; init; }
/// <summary>
/// Gets the server name override for SNI during TLS handshake.
/// </summary>
@@ -47,6 +55,9 @@ public sealed class MxGatewayClientOptions
/// </summary>
public TimeSpan? StreamTimeout { get; init; }
/// <summary>
/// Gets the maximum size in bytes for gRPC messages.
/// </summary>
public int MaxGrpcMessageBytes { get; init; } = 16 * 1024 * 1024;
/// <summary>
@@ -16,4 +16,21 @@
<Nullable>enable</Nullable>
</PropertyGroup>
<PropertyGroup>
<IsPackable>true</IsPackable>
<PackageId>ZB.MOM.WW.MxGateway.Client</PackageId>
<Description>.NET 10 gRPC client for the MxAccessGateway service. Provides typed wrappers, retry, and a lazy-browse walker over the Galaxy Repository hierarchy.</Description>
<PackageReadmeFile>README.md</PackageReadmeFile>
</PropertyGroup>
<ItemGroup>
<None Include="..\README.md" Pack="true" PackagePath="\" />
</ItemGroup>
<ItemGroup>
<AssemblyAttribute Include="System.Runtime.CompilerServices.InternalsVisibleTo">
<_Parameter1>ZB.MOM.WW.MxGateway.Client.Tests</_Parameter1>
</AssemblyAttribute>
</ItemGroup>
</Project>
@@ -104,6 +104,23 @@ Support:
- `credentials.NewClientTLSFromFile`,
- custom `tls.Config` for advanced callers.
### Trust posture
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). To make that usable, TLS is **lenient by default**: when `Plaintext` is
`false` and no `CACertFile`/`TLSConfig`/`TransportCredentials` is supplied,
`buildCredentials` dials with `tls.Config{InsecureSkipVerify: true}` (carrying
`ServerNameOverride` as the SNI when set), so the gateway's self-signed
certificate is accepted without verification.
To verify the gateway instead:
- set `CACertFile` to pin a CA (full verification against that root), or
- set `RequireCertificateValidation: true` to verify against the OS/system trust
roots without pinning.
Pinning a CA always wins over the lenient default.
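The precedence described above (pinned CA first, then strict system-root validation, then the lenient default) can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the client's actual `buildCredentials`: the `trustOptions` struct, `trustMode`, and `buildTLSConfig` are hypothetical names that mirror only the option fields mentioned in this section.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// trustOptions is a hypothetical stand-in for the subset of client
// options that influence trust; the real Options struct may differ.
type trustOptions struct {
	CACertFile                   string
	RequireCertificateValidation bool
	ServerNameOverride           string
}

// trustMode names the posture selected by the options. A pinned CA
// takes precedence over both other modes.
func trustMode(o trustOptions) string {
	switch {
	case o.CACertFile != "":
		return "pinned-ca"
	case o.RequireCertificateValidation:
		return "system-roots"
	default:
		return "lenient"
	}
}

// buildTLSConfig turns the selected posture into a *tls.Config.
func buildTLSConfig(o trustOptions) (*tls.Config, error) {
	cfg := &tls.Config{ServerName: o.ServerNameOverride}
	switch trustMode(o) {
	case "pinned-ca":
		pem, err := os.ReadFile(o.CACertFile)
		if err != nil {
			return nil, fmt.Errorf("read CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pem) {
			return nil, errors.New("no PEM certificates found in CA file")
		}
		cfg.RootCAs = pool // verify only against the pinned root
	case "system-roots":
		// A nil RootCAs makes crypto/tls use the host's root CA set.
	default:
		// Lenient default: accept the gateway's self-signed certificate.
		cfg.InsecureSkipVerify = true
	}
	return cfg, nil
}

func main() {
	for _, o := range []trustOptions{
		{},
		{RequireCertificateValidation: true},
		{CACertFile: "ca.pem", RequireCertificateValidation: true},
	} {
		fmt.Println(trustMode(o))
	}
}
```

Keeping the mode decision separate from the config construction makes the precedence rule easy to unit-test without touching the filesystem.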
## Streaming
`Events(ctx)` should return a receive channel of:
@@ -75,6 +75,14 @@ client, err := mxgateway.Dial(ctx, mxgateway.Options{
})
```
The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`Plaintext: false`) with
no `CACertFile`/`TLSConfig` accepts whatever certificate the gateway presents
(`InsecureSkipVerify`, with `ServerNameOverride` as the SNI when set). To verify
instead, set `CACertFile` to pin a CA, or set `RequireCertificateValidation:
true` to verify against the OS/system trust roots without pinning. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
`Client.OpenSession` returns a `Session` with helpers for `Register`,
`AddItem`, `AddItem2`, `Advise`, `Write`, `Events`, and `Close`. Prefer
`SubscribeEvents` or `SubscribeEventsAfter` for long-running streams because the
@@ -121,6 +129,68 @@ reports `present=false` (no deploy recorded). `DiscoverHierarchy` returns
the generated `*GalaxyObject` slice with each object's dynamic attributes
populated for direct contract access.
### Browsing lazily
For UI trees or OPC UA bridges, use `BrowseChildren` to walk one level at a
time instead of loading the full hierarchy. Pass an empty request for root
objects; subsequent calls set `ParentGobjectId`, `ParentTagName`, or
`ParentContainedPath`. Filter fields match `DiscoverHierarchy`. Each response
pairs `Children` with `ChildHasChildren` so you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.
```go
import pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated/galaxy_repository/v1"
reply, err := galaxy.BrowseChildren(ctx, &pb.BrowseChildrenRequest{})
if err != nil {
return err
}
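hints := reply.GetChildHasChildren()
for i, child := range reply.GetChildren() {
	// Guard the index in case the hint slice is shorter than Children.
	expand := i < len(hints) && hints[i]
	fmt.Printf("%s expand=%v\n", child.GetTagName(), expand)
}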
```
#### High-level walker
For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:
```go
galaxy, err := mxgateway.DialGalaxy(ctx, mxgateway.Options{
Endpoint: "localhost:5000",
APIKey: os.Getenv("MXGATEWAY_API_KEY"),
Plaintext: true,
})
if err != nil {
log.Fatal(err)
}
defer galaxy.Close()
roots, err := galaxy.Browse(ctx, nil)
if err != nil {
log.Fatal(err)
}
for _, root := range roots {
if root.HasChildrenHint() {
if err := root.Expand(ctx); err != nil {
log.Fatal(err)
}
}
for _, child := range root.Children() {
kind := "leaf"
if child.HasChildrenHint() {
kind = "has children"
}
fmt.Printf("%s (%s)\n", child.Object().GetTagName(), kind)
}
}
```
`Expand` is idempotent: calling it twice fetches the children only once, and
it is safe to call from concurrent goroutines. To refresh after a Galaxy
redeploy, call `Browse` again from the root.
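One way to get one-shot, concurrency-safe expansion with internal pagination is a mutex-guarded flag plus a repeated-token check. The sketch below is only an illustration of that pattern: `node`, `page`, `fetchFunc`, and `runDemo` are hypothetical names, and the real `LazyBrowseNode` may be implemented differently.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// page is a hypothetical stand-in for one BrowseChildren reply.
type page struct {
	children []string
	next     string // empty when there are no more pages
}

// fetchFunc is a hypothetical page fetcher keyed by page token.
type fetchFunc func(ctx context.Context, token string) (page, error)

type node struct {
	mu       sync.Mutex
	expanded bool
	children []string
	fetch    fetchFunc
}

// Expand loads all child pages once. Later calls return immediately.
// Children are buffered locally so a failed fetch leaves the node
// unexpanded and a later call can retry from scratch.
func (n *node) Expand(ctx context.Context) error {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.expanded {
		return nil
	}
	var loaded []string
	seen := map[string]bool{}
	token := ""
	for {
		p, err := n.fetch(ctx, token)
		if err != nil {
			return err
		}
		loaded = append(loaded, p.children...)
		token = p.next
		if token == "" {
			break
		}
		if seen[token] {
			return fmt.Errorf("repeated page token %q", token)
		}
		seen[token] = true
	}
	n.children = loaded
	n.expanded = true
	return nil
}

// runDemo expands one node from four goroutines and reports how many
// children were loaded and how many page fetches were issued.
func runDemo() (children int, calls int) {
	n := &node{fetch: func(_ context.Context, token string) (page, error) {
		calls++ // only ever runs while n.mu is held
		if token == "" {
			return page{children: []string{"Pump001"}, next: "p2"}, nil
		}
		return page{children: []string{"Pump002"}}, nil
	}}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = n.Expand(context.Background())
		}()
	}
	wg.Wait()
	return len(n.children), calls
}

func main() {
	children, calls := runDemo()
	fmt.Println(children, calls)
}
```

A mutex with an explicit flag is used here instead of `sync.Once` so that an error does not permanently mark the node as expanded.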
### Watching deploy events
`WatchDeployEvents` opens a server-streaming subscription. The server emits a
@@ -213,6 +283,38 @@ $env:MXGATEWAY_TEST_ITEM = 'Area001.Tag.Value'
go run ./cmd/mxgw-go smoke -endpoint $env:MXGATEWAY_ENDPOINT -plaintext -api-key-env MXGATEWAY_API_KEY -item $env:MXGATEWAY_TEST_ITEM -json
```
## Installing the Go client
The module is resolved directly from the git repo — no package registry:
````bash
go get gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go@v0.1.0
````
Then import:
````go
import "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/mxgateway"
````
If your build environment cannot reach `gitea.dohertylan.com` directly,
configure `GOPROXY` to point at an internal proxy that fronts the Gitea
repo, or set `GOPRIVATE` (optionally with `GONOSUMDB`) to bypass the
checksum database for the internal module path.
## Releasing a new version
Go modules in monorepo subdirectories use prefixed tags. To tag a release
from this repo:
````bash
pwsh scripts/tag-go-module.ps1 -Version v0.1.1 -Push
````
The script validates semver, refuses to tag with uncommitted tracked
changes, creates an annotated tag `clients/go/v0.1.1`, and (with `-Push`)
pushes it to origin.
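The prefixed-tag convention can be illustrated with a short helper that joins the module subdirectory and the version. This is a sketch only: `moduleTag` and the regular expression are hypothetical and are not taken from `scripts/tag-go-module.ps1`, whose exact validation rules are not shown here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// semverRe is an assumed, simplified check for vMAJOR.MINOR.PATCH with
// an optional pre-release suffix; the real script may be stricter.
var semverRe = regexp.MustCompile(
	`^v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z.-]+)?$`)

// moduleTag builds the git tag for a module that lives in subdir of the
// repository. A module at the repository root uses the bare version.
func moduleTag(subdir, version string) (string, error) {
	if !semverRe.MatchString(version) {
		return "", fmt.Errorf("version %q is not of the form vMAJOR.MINOR.PATCH", version)
	}
	subdir = strings.Trim(subdir, "/")
	if subdir == "" {
		return version, nil
	}
	return subdir + "/" + version, nil
}

func main() {
	tag, err := moduleTag("clients/go", "v0.1.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tag)
}
```

The Go toolchain resolves `clients/go@v0.1.1` by looking for the tag with the subdirectory prefix, which is why a plain `v0.1.1` tag would not publish this module.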
## Related Documentation
- [Client Packaging](../../docs/ClientPackaging.md)
@@ -396,25 +396,31 @@ func runReadBulk(ctx context.Context, args []string, stdout, stderr io.Writer) e
}
func runWriteBulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-bulk", false, false)
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-bulk", false)
}
func runWrite2Bulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
return runWriteBulkVariant(ctx, args, stdout, stderr, "write2-bulk", true, false)
return runWriteBulkVariant(ctx, args, stdout, stderr, "write2-bulk", true)
}
func runWriteSecuredBulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-secured-bulk", false, true)
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-secured-bulk", false)
}
func runWriteSecured2Bulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-secured2-bulk", true, true)
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-secured2-bulk", true)
}
// runWriteBulkVariant shares the flag-parsing + entry-build skeleton across
// the four bulk-write families. withTimestamp adds a --timestamp-value flag;
// secured switches from --user-id to --current-user-id / --verifier-user-id.
func runWriteBulkVariant(ctx context.Context, args []string, stdout, stderr io.Writer, command string, withTimestamp bool, secured bool) error {
// the four bulk-write families. The variant is derived from command alone;
// withTimestamp adds a --timestamp-value flag. To keep wrong-variant flags
// from silently no-op'ing, secured-only flags (-current-user-id /
// -verifier-user-id) are only registered for the secured variants, and
// -user-id only for the non-secured Write/Write2 variants — a wrong-variant
// flag then surfaces as a clean "flag provided but not defined" error.
func runWriteBulkVariant(ctx context.Context, args []string, stdout, stderr io.Writer, command string, withTimestamp bool) error {
secured := command == "write-secured-bulk" || command == "write-secured2-bulk"
flags := flag.NewFlagSet(command, flag.ContinueOnError)
flags.SetOutput(stderr)
common := bindCommonFlags(flags)
@@ -424,9 +430,17 @@ func runWriteBulkVariant(ctx context.Context, args []string, stdout, stderr io.W
itemHandles := flags.String("item-handles", "", "comma-separated item handles")
valueType := flags.String("type", "string", "value type: bool, int32, int64, float, double, string")
values := flags.String("values", "", "comma-separated values (one per item handle)")
userID := flags.Int("user-id", 0, "MXAccess user id (Write/Write2 variants)")
currentUserID := flags.Int("current-user-id", 0, "MXAccess current user id (Secured variants)")
verifierUserID := flags.Int("verifier-user-id", 0, "MXAccess verifier user id (Secured variants)")
var (
userID *int
currentUserID *int
verifierUserID *int
)
if secured {
currentUserID = flags.Int("current-user-id", 0, "MXAccess current user id (Secured variants)")
verifierUserID = flags.Int("verifier-user-id", 0, "MXAccess verifier user id (Secured variants)")
} else {
userID = flags.Int("user-id", 0, "MXAccess user id (Write/Write2 variants)")
}
timestampValue := flags.String("timestamp-value", "", "RFC 3339 timestamp shared across all entries (Write2/WriteSecured2 variants)")
if err := flags.Parse(args); err != nil {
@@ -514,7 +528,6 @@ func runWriteBulkVariant(ctx context.Context, args []string, stdout, stderr io.W
default:
return fmt.Errorf("unsupported bulk write command %q", command)
}
_ = secured // currently only used for routing above; reserved for future per-variant validation
return writeWriteBulkOutput(stdout, *jsonOutput, command, options, results, err)
}
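The flag gating in this hunk can be tried in isolation. The sketch below is a minimal stand-in, not the CLI's actual code: parseVariant is a hypothetical helper, and only the flag names are taken from the diff. It registers the user-id flags per variant on a flag.FlagSet, so a wrong-variant flag is rejected by flag.Parse itself.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseVariant is a hypothetical helper: it registers the user-id flags that
// belong to the given variant and then parses args. Flag names mirror the diff.
func parseVariant(command string, secured bool, args []string) error {
	flags := flag.NewFlagSet(command, flag.ContinueOnError)
	flags.SetOutput(io.Discard) // keep usage text out of the demo output
	if secured {
		flags.Int("current-user-id", 0, "current user id (secured variants)")
		flags.Int("verifier-user-id", 0, "verifier user id (secured variants)")
	} else {
		flags.Int("user-id", 0, "user id (non-secured variants)")
	}
	return flags.Parse(args)
}

func main() {
	// Matching flag for the variant: parses cleanly.
	fmt.Println(parseVariant("write-bulk", false, []string{"-user-id", "5"}))
	// Wrong-variant flags: rejected by the flag package as undefined.
	fmt.Println(parseVariant("write-bulk", false, []string{"-current-user-id", "5"}))
	fmt.Println(parseVariant("write-secured-bulk", true, []string{"-user-id", "5"}))
}
```

Because the unwanted flags are never registered, no extra validation code is needed; the standard library produces the error.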
@@ -598,10 +611,12 @@ func runBenchReadBulk(ctx context.Context, args []string, stdout, stderr io.Writ
}()
// Warm-up: drive identical calls so any first-call JIT / connection-pool
// setup is amortised before the measurement window opens.
// setup is amortised before the measurement window opens. The ctx.Err()
// guard short-circuits on Ctrl+C / parent-cancel instead of spinning
// failing ReadBulk calls until the wall-clock deadline elapses.
warmupDeadline := time.Now().Add(time.Duration(*warmupSeconds) * time.Second)
timeout := time.Duration(*timeoutMs) * time.Millisecond
for time.Now().Before(warmupDeadline) {
for time.Now().Before(warmupDeadline) && ctx.Err() == nil {
_, _ = session.ReadBulk(ctx, serverHandle, tags, timeout)
}
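The guarded loop shape used for the warm-up and steady-state windows can be sketched on its own. This is an illustration under assumptions: spinUntil and demo are hypothetical names, and a short sleep stands in for the ReadBulk call.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// spinUntil calls work repeatedly until the wall-clock deadline passes or ctx
// is cancelled, and returns the number of iterations performed.
func spinUntil(ctx context.Context, d time.Duration, work func()) int {
	deadline := time.Now().Add(d)
	n := 0
	for time.Now().Before(deadline) && ctx.Err() == nil {
		work()
		n++
	}
	return n
}

// demo cancels the context after about 50ms while the loop has a 5s deadline,
// and reports how many iterations ran and how long the loop took.
func demo() (int, time.Duration) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	go func() {
		time.Sleep(50 * time.Millisecond)
		cancel()
	}()
	start := time.Now()
	n := spinUntil(ctx, 5*time.Second, func() { time.Sleep(time.Millisecond) })
	return n, time.Since(start)
}

func main() {
	n, elapsed := demo()
	fmt.Printf("stopped after %d iterations in %s\n", n, elapsed.Round(time.Millisecond))
}
```

Without the ctx.Err() check, the same loop would keep calling work until the full five seconds had elapsed, even though every call made with the cancelled context would fail immediately.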
@@ -613,7 +628,7 @@ func runBenchReadBulk(ctx context.Context, args []string, stdout, stderr io.Writ
steadyStart := time.Now()
steadyDeadline := steadyStart.Add(time.Duration(*durationSeconds) * time.Second)
for time.Now().Before(steadyDeadline) {
for time.Now().Before(steadyDeadline) && ctx.Err() == nil {
callStart := time.Now()
results, err := session.ReadBulk(ctx, serverHandle, tags, timeout)
elapsed := time.Since(callStart)
@@ -1191,18 +1206,28 @@ const batchEOR = "__MXGW_BATCH_EOR__"
// runBatch reads one command line at a time from in, dispatches each via the
// normal runWithIO routing, and writes a batchEOR sentinel to stdout after
// every result. Errors are serialised as JSON to stdout (not stderr) so the
// harness can parse them without interleaving stderr. The loop never terminates
// on command error; only stdin EOF (or an empty line) ends the session.
// harness can parse them without interleaving stderr. Blank lines are
// skipped; only stdin EOF ends the session.
//
// The scanner buffer is widened to 16 MiB so a single long command line
// (e.g. a bulk-write with several thousand handles) does not trip the
// default 64 KiB bufio.Scanner token-too-long error and abort the session.
// If a line still exceeds the cap, the scanner error is emitted as a final
// error-with-sentinel and returned, which ends the session.
func runBatch(ctx context.Context, in io.Reader, stdout, stderr io.Writer) error {
bw := bufio.NewWriter(stdout)
scanner := bufio.NewScanner(in)
for scanner.Scan() {
line := scanner.Text()
if line == "" {
scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)
for {
if !scanner.Scan() {
break
}
line := scanner.Text()
args := strings.Fields(line)
if len(args) == 0 {
// Skip blank / whitespace-only lines; do NOT terminate. The
// session ends only on stdin EOF so a stray blank line in a
// PowerShell here-string does not silently drop later commands.
continue
}
if err := runWithIO(ctx, args, bw, stderr); err != nil {
@@ -1217,7 +1242,21 @@ func runBatch(ctx context.Context, in io.Reader, stdout, stderr io.Writer) error
_, _ = fmt.Fprintln(bw, batchEOR)
_ = bw.Flush()
}
return scanner.Err()
if err := scanner.Err(); err != nil {
// Emit the scanner failure as a final error-with-sentinel so the
// harness sees the failure framed, then return the error so the
// process exit reflects it. This handles bufio.ErrTooLong for any
// pathological line above the 16 MiB cap.
errPayload := map[string]string{
"error": err.Error(),
"type": "error",
}
_ = writeJSON(bw, errPayload)
_, _ = fmt.Fprintln(bw, batchEOR)
_ = bw.Flush()
return err
}
return nil
}
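The effect of widening the scanner buffer can be reproduced with the standard library alone. The sketch below uses hypothetical helpers (countLines, tooLong, sampleInput) and is not part of the CLI. It scans the same three-line input with a 64 KiB cap and with a 16 MiB cap.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// countLines scans input with the given maximum token size and returns the
// number of lines read plus any scanner error.
func countLines(input string, maxToken int) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Buffer(make([]byte, 0, 1024), maxToken)
	n := 0
	for scanner.Scan() {
		n++
	}
	return n, scanner.Err()
}

// tooLong reports whether err is bufio.ErrTooLong.
func tooLong(err error) bool { return errors.Is(err, bufio.ErrTooLong) }

// sampleInput returns a short line, one line of longLen bytes, and another
// short line.
func sampleInput(longLen int) string {
	return "version\n" + strings.Repeat("x", longLen) + "\nversion\n"
}

func main() {
	in := sampleInput(100 * 1024)

	// 64 KiB cap: scanning stops at the long line and the last line is lost.
	n, err := countLines(in, 64*1024)
	fmt.Println(n, tooLong(err))

	// 16 MiB cap: all three lines are read.
	n, err = countLines(in, 16*1024*1024)
	fmt.Println(n, err)
}
```

Once Scan has returned false with bufio.ErrTooLong, the scanner does not resume, which is why runBatch reports that error once and returns it.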
func dialGalaxyForCommand(ctx context.Context, common *commonOptions) (*mxgateway.GalaxyClient, commonOptions, error) {
+210
@@ -2,9 +2,15 @@ package main
import (
"bytes"
"context"
"encoding/json"
"net"
"strings"
"testing"
"time"
pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"
"google.golang.org/grpc"
)
func TestRunVersionJSON(t *testing.T) {
@@ -84,3 +90,207 @@ func TestParseValueBuildsTypedValue(t *testing.T) {
t.Fatalf("int32 value = %d, want 123", got)
}
}
// TestRunWriteBulkVariantGatesSecuredFlags pins the Client.Go-022 fix:
// secured-only flags must be unavailable on non-secured variants, and
// vice-versa, so a wrong-variant flag fails with a clean "flag provided
// but not defined" error instead of silently no-op'ing.
func TestRunWriteBulkVariantGatesSecuredFlags(t *testing.T) {
cases := []struct {
name string
args []string
}{
{
name: "write-bulk-rejects-current-user-id",
args: []string{"write-bulk", "-current-user-id", "5", "-item-handles", "1", "-values", "1"},
},
{
name: "write-bulk-rejects-verifier-user-id",
args: []string{"write-bulk", "-verifier-user-id", "5", "-item-handles", "1", "-values", "1"},
},
{
name: "write2-bulk-rejects-current-user-id",
args: []string{"write2-bulk", "-current-user-id", "5", "-item-handles", "1", "-values", "1"},
},
{
name: "write-secured-bulk-rejects-user-id",
args: []string{"write-secured-bulk", "-user-id", "5", "-item-handles", "1", "-values", "1"},
},
{
name: "write-secured2-bulk-rejects-user-id",
args: []string{"write-secured2-bulk", "-user-id", "5", "-item-handles", "1", "-values", "1"},
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
var stdout, stderr bytes.Buffer
err := runWithIO(t.Context(), tc.args, &stdout, &stderr)
if err == nil {
t.Fatalf("runWithIO(%v) returned no error", tc.args)
}
if !strings.Contains(err.Error(), "flag provided but not defined") {
t.Fatalf("runWithIO(%v) error = %v; want 'flag provided but not defined'", tc.args, err)
}
})
}
}
// TestRunBenchReadBulkRespectsContextCancellation pins the Client.Go-023
// fix: the warm-up and steady-state wall-clock loops must honour ctx.Err()
// so an external cancel (Ctrl+C, parent-cancel from a cross-language bench
// driver) short-circuits the bench instead of spinning failing ReadBulk
// calls until the wall-clock deadline elapses.
func TestRunBenchReadBulkRespectsContextCancellation(t *testing.T) {
listener, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatalf("listen: %v", err)
}
server := grpc.NewServer()
fake := &benchFakeGateway{}
pb.RegisterMxAccessGatewayServer(server, fake)
go func() {
_ = server.Serve(listener)
}()
defer server.Stop()
defer listener.Close()
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// Long warm-up + duration, so if the ctx.Err() guard were missing the
// loops would run for ~10s. With the guard, the cancel below short-
// circuits both loops within ~one ReadBulk iteration.
args := []string{
"bench-read-bulk",
"-endpoint", listener.Addr().String(),
"-plaintext",
"-api-key", "test",
"-warmup-seconds", "5",
"-duration-seconds", "5",
"-bulk-size", "1",
"-timeout-ms", "100",
}
// Cancel after a brief delay — far less than warmup+duration (10s).
go func() {
time.Sleep(150 * time.Millisecond)
cancel()
}()
var stdout, stderr bytes.Buffer
start := time.Now()
err = runWithIO(ctx, args, &stdout, &stderr)
elapsed := time.Since(start)
// With the ctx.Err() guard, the loops exit well before the wall-clock
// deadlines (warmup=5s + duration=5s = 10s). Allow generous slack for
// CI noise but assert clearly less than the un-guarded worst case.
if elapsed > 4*time.Second {
t.Fatalf("bench-read-bulk took %s after ctx cancel; want <4s (ctx.Err() guard missing?). err=%v stderr=%s", elapsed, err, stderr.String())
}
}
// benchFakeGateway is a minimal MxAccessGatewayServer that satisfies the
// bench-read-bulk session-setup sequence (OpenSession + Invoke for Register
// / SubscribeBulk / ReadBulk / UnsubscribeBulk / CloseSession).
type benchFakeGateway struct {
pb.UnimplementedMxAccessGatewayServer
}
func (g *benchFakeGateway) OpenSession(_ context.Context, _ *pb.OpenSessionRequest) (*pb.OpenSessionReply, error) {
return &pb.OpenSessionReply{
SessionId: "bench-session",
ProtocolStatus: &pb.ProtocolStatus{Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK},
}, nil
}
func (g *benchFakeGateway) CloseSession(_ context.Context, req *pb.CloseSessionRequest) (*pb.CloseSessionReply, error) {
return &pb.CloseSessionReply{
SessionId: req.GetSessionId(),
ProtocolStatus: &pb.ProtocolStatus{Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK},
}, nil
}
func (g *benchFakeGateway) Invoke(_ context.Context, req *pb.MxCommandRequest) (*pb.MxCommandReply, error) {
kind := req.GetCommand().GetKind()
reply := &pb.MxCommandReply{
SessionId: req.GetSessionId(),
Kind: kind,
ProtocolStatus: &pb.ProtocolStatus{Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK},
}
switch kind {
case pb.MxCommandKind_MX_COMMAND_KIND_REGISTER:
reply.Payload = &pb.MxCommandReply_Register{Register: &pb.RegisterReply{ServerHandle: 1}}
case pb.MxCommandKind_MX_COMMAND_KIND_SUBSCRIBE_BULK:
reply.Payload = &pb.MxCommandReply_SubscribeBulk{SubscribeBulk: &pb.BulkSubscribeReply{
Results: []*pb.SubscribeResult{{ServerHandle: 1, ItemHandle: 1, WasSuccessful: true}},
}}
case pb.MxCommandKind_MX_COMMAND_KIND_READ_BULK:
reply.Payload = &pb.MxCommandReply_ReadBulk{ReadBulk: &pb.BulkReadReply{
Results: []*pb.BulkReadResult{{ItemHandle: 1, WasSuccessful: true, WasCached: true}},
}}
case pb.MxCommandKind_MX_COMMAND_KIND_UNSUBSCRIBE_BULK:
reply.Payload = &pb.MxCommandReply_UnsubscribeBulk{UnsubscribeBulk: &pb.BulkSubscribeReply{}}
}
return reply, nil
}
// TestRunBenchReadBulkRejectsNonPositiveBulkSize pins the Client.Go-023-adjacent
// positivity checks so they cannot drift while resolving the cancellation finding.
func TestRunBenchReadBulkRejectsNonPositiveBulkSize(t *testing.T) {
var stdout, stderr bytes.Buffer
err := runWithIO(t.Context(), []string{"bench-read-bulk", "-bulk-size", "0"}, &stdout, &stderr)
if err == nil || !strings.Contains(err.Error(), "bulk-size must be positive") {
t.Fatalf("bench-read-bulk -bulk-size 0 error = %v", err)
}
}
// TestRunBatchSkipsBlankLinesAndContinuesUntilEOF pins the Client.Go-027 fix:
// a blank line in the middle of a batch session must NOT terminate the loop —
// only stdin EOF ends the session.
func TestRunBatchSkipsBlankLinesAndContinuesUntilEOF(t *testing.T) {
var stdout, stderr bytes.Buffer
// version -> blank -> version (a stray blank line in the middle of a
// programmatic session).
in := strings.NewReader("version --json\n\nversion --json\n")
if err := runBatch(t.Context(), in, &stdout, &stderr); err != nil {
t.Fatalf("runBatch() error = %v; stderr = %s", err, stderr.String())
}
out := stdout.String()
// Both version commands must have produced a result before the EOR sentinel.
if count := strings.Count(out, batchEOR); count != 2 {
t.Fatalf("EOR sentinel count = %d, want 2 (one per command, blank line skipped); out = %q", count, out)
}
}
// TestRunBatchHandlesLongCommandLine pins the Client.Go-026 fix: a command
// line longer than the default bufio.Scanner token size (64 KiB) must not
// abort the batch session.
func TestRunBatchHandlesLongCommandLine(t *testing.T) {
var stdout, stderr bytes.Buffer
// Build a single command line larger than 64 KiB. The command itself is
// invalid (no real session) but runBatch must still emit an EOR sentinel
// and continue to the next command rather than dropping the line on the
// floor with a bufio.ErrTooLong from the outer return.
huge := strings.Repeat("tag-with-a-reasonably-long-name-and-suffix,", 2000) + "trailing"
line := "subscribe-bulk -session-id none -items " + huge
if len(line) <= 64*1024 {
t.Fatalf("test setup error: long line length = %d, want > 64KiB", len(line))
}
in := strings.NewReader(line + "\nversion --json\n")
if err := runBatch(t.Context(), in, &stdout, &stderr); err != nil {
t.Fatalf("runBatch() error = %v; stderr = %s", err, stderr.String())
}
out := stdout.String()
// Both commands must produce an EOR sentinel — the long line should be a
// per-command error (still emitted with EOR), then the version command
// should run normally.
if count := strings.Count(out, batchEOR); count != 2 {
t.Fatalf("EOR sentinel count = %d, want 2 (one per command, even when first is too long); out length = %d", count, len(out))
}
}
@@ -824,6 +824,260 @@ func (x *GalaxyAttribute) GetIsAlarm() bool {
return false
}
type BrowseChildrenRequest struct {
state protoimpl.MessageState `protogen:"open.v1"`
// Parent selector. Empty oneof returns root objects (parent_gobject_id == 0).
//
// Types that are valid to be assigned to Parent:
//
// *BrowseChildrenRequest_ParentGobjectId
// *BrowseChildrenRequest_ParentTagName
// *BrowseChildrenRequest_ParentContainedPath
Parent isBrowseChildrenRequest_Parent `protobuf_oneof:"parent"`
// Maximum number of direct children to return. Server default 500; cap 5000.
PageSize int32 `protobuf:"varint,4,opt,name=page_size,json=pageSize,proto3" json:"page_size,omitempty"`
// Opaque token returned by a previous BrowseChildren response. Bound to the
// cache sequence, parent selector, and the filter set; a mismatch returns
// InvalidArgument.
PageToken string `protobuf:"bytes,5,opt,name=page_token,json=pageToken,proto3" json:"page_token,omitempty"`
// --- Filter parity with DiscoverHierarchy. AND-combined. ---
CategoryIds []int32 `protobuf:"varint,6,rep,packed,name=category_ids,json=categoryIds,proto3" json:"category_ids,omitempty"`
TemplateChainContains []string `protobuf:"bytes,7,rep,name=template_chain_contains,json=templateChainContains,proto3" json:"template_chain_contains,omitempty"`
TagNameGlob string `protobuf:"bytes,8,opt,name=tag_name_glob,json=tagNameGlob,proto3" json:"tag_name_glob,omitempty"`
IncludeAttributes *bool `protobuf:"varint,9,opt,name=include_attributes,json=includeAttributes,proto3,oneof" json:"include_attributes,omitempty"`
AlarmBearingOnly bool `protobuf:"varint,10,opt,name=alarm_bearing_only,json=alarmBearingOnly,proto3" json:"alarm_bearing_only,omitempty"`
HistorizedOnly bool `protobuf:"varint,11,opt,name=historized_only,json=historizedOnly,proto3" json:"historized_only,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *BrowseChildrenRequest) Reset() {
*x = BrowseChildrenRequest{}
mi := &file_galaxy_repository_proto_msgTypes[10]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *BrowseChildrenRequest) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*BrowseChildrenRequest) ProtoMessage() {}
func (x *BrowseChildrenRequest) ProtoReflect() protoreflect.Message {
mi := &file_galaxy_repository_proto_msgTypes[10]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use BrowseChildrenRequest.ProtoReflect.Descriptor instead.
func (*BrowseChildrenRequest) Descriptor() ([]byte, []int) {
return file_galaxy_repository_proto_rawDescGZIP(), []int{10}
}
func (x *BrowseChildrenRequest) GetParent() isBrowseChildrenRequest_Parent {
if x != nil {
return x.Parent
}
return nil
}
func (x *BrowseChildrenRequest) GetParentGobjectId() int32 {
if x != nil {
if x, ok := x.Parent.(*BrowseChildrenRequest_ParentGobjectId); ok {
return x.ParentGobjectId
}
}
return 0
}
func (x *BrowseChildrenRequest) GetParentTagName() string {
if x != nil {
if x, ok := x.Parent.(*BrowseChildrenRequest_ParentTagName); ok {
return x.ParentTagName
}
}
return ""
}
func (x *BrowseChildrenRequest) GetParentContainedPath() string {
if x != nil {
if x, ok := x.Parent.(*BrowseChildrenRequest_ParentContainedPath); ok {
return x.ParentContainedPath
}
}
return ""
}
func (x *BrowseChildrenRequest) GetPageSize() int32 {
if x != nil {
return x.PageSize
}
return 0
}
func (x *BrowseChildrenRequest) GetPageToken() string {
if x != nil {
return x.PageToken
}
return ""
}
func (x *BrowseChildrenRequest) GetCategoryIds() []int32 {
if x != nil {
return x.CategoryIds
}
return nil
}
func (x *BrowseChildrenRequest) GetTemplateChainContains() []string {
if x != nil {
return x.TemplateChainContains
}
return nil
}
func (x *BrowseChildrenRequest) GetTagNameGlob() string {
if x != nil {
return x.TagNameGlob
}
return ""
}
func (x *BrowseChildrenRequest) GetIncludeAttributes() bool {
if x != nil && x.IncludeAttributes != nil {
return *x.IncludeAttributes
}
return false
}
func (x *BrowseChildrenRequest) GetAlarmBearingOnly() bool {
if x != nil {
return x.AlarmBearingOnly
}
return false
}
func (x *BrowseChildrenRequest) GetHistorizedOnly() bool {
if x != nil {
return x.HistorizedOnly
}
return false
}
type isBrowseChildrenRequest_Parent interface {
isBrowseChildrenRequest_Parent()
}
type BrowseChildrenRequest_ParentGobjectId struct {
ParentGobjectId int32 `protobuf:"varint,1,opt,name=parent_gobject_id,json=parentGobjectId,proto3,oneof"`
}
type BrowseChildrenRequest_ParentTagName struct {
ParentTagName string `protobuf:"bytes,2,opt,name=parent_tag_name,json=parentTagName,proto3,oneof"`
}
type BrowseChildrenRequest_ParentContainedPath struct {
ParentContainedPath string `protobuf:"bytes,3,opt,name=parent_contained_path,json=parentContainedPath,proto3,oneof"`
}
func (*BrowseChildrenRequest_ParentGobjectId) isBrowseChildrenRequest_Parent() {}
func (*BrowseChildrenRequest_ParentTagName) isBrowseChildrenRequest_Parent() {}
func (*BrowseChildrenRequest_ParentContainedPath) isBrowseChildrenRequest_Parent() {}
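The oneof above follows the usual generated-Go shape: an unexported marker interface plus one wrapper type per alternative. The sketch below is a simplified stand-in with hypothetical local types. It uses value wrappers, whereas the generated code uses pointer wrappers. It shows how a caller or server can tell the three selectors apart from the unset (root) case.

```go
package main

import "fmt"

// isParent is a stand-in for the generated isBrowseChildrenRequest_Parent
// marker interface.
type isParent interface{ isParent() }

// One wrapper type per oneof alternative (simplified to value types here).
type parentGobjectID struct{ ID int32 }
type parentTagName struct{ Name string }
type parentContainedPath struct{ Path string }

func (parentGobjectID) isParent()     {}
func (parentTagName) isParent()       {}
func (parentContainedPath) isParent() {}

// describeParent reports which selector is set; a nil selector stands for the
// "empty oneof returns root objects" case described in the message comment.
func describeParent(p isParent) string {
	switch v := p.(type) {
	case nil:
		return "root objects"
	case parentGobjectID:
		return fmt.Sprintf("gobject id %d", v.ID)
	case parentTagName:
		return fmt.Sprintf("tag name %q", v.Name)
	case parentContainedPath:
		return fmt.Sprintf("contained path %q", v.Path)
	default:
		return "unknown selector"
	}
}

func main() {
	fmt.Println(describeParent(nil))
	fmt.Println(describeParent(parentGobjectID{ID: 7}))
	fmt.Println(describeParent(parentTagName{Name: "Pump_001"}))
}
```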
type BrowseChildrenReply struct {
state protoimpl.MessageState `protogen:"open.v1"`
// Direct children matching the filter, sorted areas-first then by
// case-insensitive display name (same order as the dashboard tree).
Children []*GalaxyObject `protobuf:"bytes,1,rep,name=children,proto3" json:"children,omitempty"`
// Non-empty when another page of siblings is available.
NextPageToken string `protobuf:"bytes,2,opt,name=next_page_token,json=nextPageToken,proto3" json:"next_page_token,omitempty"`
// Total matching direct children of the parent (post-filter).
TotalChildCount int32 `protobuf:"varint,3,opt,name=total_child_count,json=totalChildCount,proto3" json:"total_child_count,omitempty"`
// Parallel array, indexed with `children`. True when the child has at least
// one matching descendant under the same filter set. Lets a UI choose
// whether to draw an expand triangle without an extra round trip.
ChildHasChildren []bool `protobuf:"varint,4,rep,packed,name=child_has_children,json=childHasChildren,proto3" json:"child_has_children,omitempty"`
// Cache sequence this reply was projected from. Clients may pass it back as
// part of the page_token contract. Mismatch on the next page -> InvalidArgument.
CacheSequence uint64 `protobuf:"varint,5,opt,name=cache_sequence,json=cacheSequence,proto3" json:"cache_sequence,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *BrowseChildrenReply) Reset() {
*x = BrowseChildrenReply{}
mi := &file_galaxy_repository_proto_msgTypes[11]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *BrowseChildrenReply) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*BrowseChildrenReply) ProtoMessage() {}
func (x *BrowseChildrenReply) ProtoReflect() protoreflect.Message {
mi := &file_galaxy_repository_proto_msgTypes[11]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use BrowseChildrenReply.ProtoReflect.Descriptor instead.
func (*BrowseChildrenReply) Descriptor() ([]byte, []int) {
return file_galaxy_repository_proto_rawDescGZIP(), []int{11}
}
func (x *BrowseChildrenReply) GetChildren() []*GalaxyObject {
if x != nil {
return x.Children
}
return nil
}
func (x *BrowseChildrenReply) GetNextPageToken() string {
if x != nil {
return x.NextPageToken
}
return ""
}
func (x *BrowseChildrenReply) GetTotalChildCount() int32 {
if x != nil {
return x.TotalChildCount
}
return 0
}
func (x *BrowseChildrenReply) GetChildHasChildren() []bool {
if x != nil {
return x.ChildHasChildren
}
return nil
}
func (x *BrowseChildrenReply) GetCacheSequence() uint64 {
if x != nil {
return x.CacheSequence
}
return 0
}
var File_galaxy_repository_proto protoreflect.FileDescriptor
const file_galaxy_repository_proto_rawDesc = "" +
@@ -897,12 +1151,35 @@ const file_galaxy_repository_proto_rawDesc = "" +
"\x17security_classification\x18\t \x01(\x05R\x16securityClassification\x12#\n" +
"\ris_historized\x18\n" +
" \x01(\bR\fisHistorized\x12\x19\n" +
"\bis_alarm\x18\v \x01(\bR\aisAlarm2\xcc\x03\n" +
"\bis_alarm\x18\v \x01(\bR\aisAlarm\"\x8c\x04\n" +
"\x15BrowseChildrenRequest\x12,\n" +
"\x11parent_gobject_id\x18\x01 \x01(\x05H\x00R\x0fparentGobjectId\x12(\n" +
"\x0fparent_tag_name\x18\x02 \x01(\tH\x00R\rparentTagName\x124\n" +
"\x15parent_contained_path\x18\x03 \x01(\tH\x00R\x13parentContainedPath\x12\x1b\n" +
"\tpage_size\x18\x04 \x01(\x05R\bpageSize\x12\x1d\n" +
"\n" +
"page_token\x18\x05 \x01(\tR\tpageToken\x12!\n" +
"\fcategory_ids\x18\x06 \x03(\x05R\vcategoryIds\x126\n" +
"\x17template_chain_contains\x18\a \x03(\tR\x15templateChainContains\x12\"\n" +
"\rtag_name_glob\x18\b \x01(\tR\vtagNameGlob\x122\n" +
"\x12include_attributes\x18\t \x01(\bH\x01R\x11includeAttributes\x88\x01\x01\x12,\n" +
"\x12alarm_bearing_only\x18\n" +
" \x01(\bR\x10alarmBearingOnly\x12'\n" +
"\x0fhistorized_only\x18\v \x01(\bR\x0ehistorizedOnlyB\b\n" +
"\x06parentB\x15\n" +
"\x13_include_attributes\"\xfe\x01\n" +
"\x13BrowseChildrenReply\x12>\n" +
"\bchildren\x18\x01 \x03(\v2\".galaxy_repository.v1.GalaxyObjectR\bchildren\x12&\n" +
"\x0fnext_page_token\x18\x02 \x01(\tR\rnextPageToken\x12*\n" +
"\x11total_child_count\x18\x03 \x01(\x05R\x0ftotalChildCount\x12,\n" +
"\x12child_has_children\x18\x04 \x03(\bR\x10childHasChildren\x12%\n" +
"\x0ecache_sequence\x18\x05 \x01(\x04R\rcacheSequence2\xb6\x04\n" +
"\x10GalaxyRepository\x12h\n" +
"\x0eTestConnection\x12+.galaxy_repository.v1.TestConnectionRequest\x1a).galaxy_repository.v1.TestConnectionReply\x12q\n" +
"\x11GetLastDeployTime\x12..galaxy_repository.v1.GetLastDeployTimeRequest\x1a,.galaxy_repository.v1.GetLastDeployTimeReply\x12q\n" +
"\x11DiscoverHierarchy\x12..galaxy_repository.v1.DiscoverHierarchyRequest\x1a,.galaxy_repository.v1.DiscoverHierarchyReply\x12h\n" +
"\x11WatchDeployEvents\x12..galaxy_repository.v1.WatchDeployEventsRequest\x1a!.galaxy_repository.v1.DeployEvent0\x01B-\xaa\x02*ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxyb\x06proto3"
"\x11WatchDeployEvents\x12..galaxy_repository.v1.WatchDeployEventsRequest\x1a!.galaxy_repository.v1.DeployEvent0\x01\x12h\n" +
"\x0eBrowseChildren\x12+.galaxy_repository.v1.BrowseChildrenRequest\x1a).galaxy_repository.v1.BrowseChildrenReplyB-\xaa\x02*ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxyb\x06proto3"
var (
file_galaxy_repository_proto_rawDescOnce sync.Once
@@ -916,7 +1193,7 @@ func file_galaxy_repository_proto_rawDescGZIP() []byte {
return file_galaxy_repository_proto_rawDescData
}
var file_galaxy_repository_proto_msgTypes = make([]protoimpl.MessageInfo, 10)
var file_galaxy_repository_proto_msgTypes = make([]protoimpl.MessageInfo, 12)
var file_galaxy_repository_proto_goTypes = []any{
(*TestConnectionRequest)(nil), // 0: galaxy_repository.v1.TestConnectionRequest
(*TestConnectionReply)(nil), // 1: galaxy_repository.v1.TestConnectionReply
@@ -928,30 +1205,35 @@ var file_galaxy_repository_proto_goTypes = []any{
(*DeployEvent)(nil), // 7: galaxy_repository.v1.DeployEvent
(*GalaxyObject)(nil), // 8: galaxy_repository.v1.GalaxyObject
(*GalaxyAttribute)(nil), // 9: galaxy_repository.v1.GalaxyAttribute
(*timestamppb.Timestamp)(nil), // 10: google.protobuf.Timestamp
(*wrapperspb.Int32Value)(nil), // 11: google.protobuf.Int32Value
(*BrowseChildrenRequest)(nil), // 10: galaxy_repository.v1.BrowseChildrenRequest
(*BrowseChildrenReply)(nil), // 11: galaxy_repository.v1.BrowseChildrenReply
(*timestamppb.Timestamp)(nil), // 12: google.protobuf.Timestamp
(*wrapperspb.Int32Value)(nil), // 13: google.protobuf.Int32Value
}
var file_galaxy_repository_proto_depIdxs = []int32{
10, // 0: galaxy_repository.v1.GetLastDeployTimeReply.time_of_last_deploy:type_name -> google.protobuf.Timestamp
11, // 1: galaxy_repository.v1.DiscoverHierarchyRequest.max_depth:type_name -> google.protobuf.Int32Value
12, // 0: galaxy_repository.v1.GetLastDeployTimeReply.time_of_last_deploy:type_name -> google.protobuf.Timestamp
13, // 1: galaxy_repository.v1.DiscoverHierarchyRequest.max_depth:type_name -> google.protobuf.Int32Value
8, // 2: galaxy_repository.v1.DiscoverHierarchyReply.objects:type_name -> galaxy_repository.v1.GalaxyObject
10, // 3: galaxy_repository.v1.WatchDeployEventsRequest.last_seen_deploy_time:type_name -> google.protobuf.Timestamp
10, // 4: galaxy_repository.v1.DeployEvent.observed_at:type_name -> google.protobuf.Timestamp
10, // 5: galaxy_repository.v1.DeployEvent.time_of_last_deploy:type_name -> google.protobuf.Timestamp
12, // 3: galaxy_repository.v1.WatchDeployEventsRequest.last_seen_deploy_time:type_name -> google.protobuf.Timestamp
12, // 4: galaxy_repository.v1.DeployEvent.observed_at:type_name -> google.protobuf.Timestamp
12, // 5: galaxy_repository.v1.DeployEvent.time_of_last_deploy:type_name -> google.protobuf.Timestamp
9, // 6: galaxy_repository.v1.GalaxyObject.attributes:type_name -> galaxy_repository.v1.GalaxyAttribute
0, // 7: galaxy_repository.v1.GalaxyRepository.TestConnection:input_type -> galaxy_repository.v1.TestConnectionRequest
2, // 8: galaxy_repository.v1.GalaxyRepository.GetLastDeployTime:input_type -> galaxy_repository.v1.GetLastDeployTimeRequest
4, // 9: galaxy_repository.v1.GalaxyRepository.DiscoverHierarchy:input_type -> galaxy_repository.v1.DiscoverHierarchyRequest
6, // 10: galaxy_repository.v1.GalaxyRepository.WatchDeployEvents:input_type -> galaxy_repository.v1.WatchDeployEventsRequest
1, // 11: galaxy_repository.v1.GalaxyRepository.TestConnection:output_type -> galaxy_repository.v1.TestConnectionReply
3, // 12: galaxy_repository.v1.GalaxyRepository.GetLastDeployTime:output_type -> galaxy_repository.v1.GetLastDeployTimeReply
5, // 13: galaxy_repository.v1.GalaxyRepository.DiscoverHierarchy:output_type -> galaxy_repository.v1.DiscoverHierarchyReply
7, // 14: galaxy_repository.v1.GalaxyRepository.WatchDeployEvents:output_type -> galaxy_repository.v1.DeployEvent
11, // [11:15] is the sub-list for method output_type
7, // [7:11] is the sub-list for method input_type
7, // [7:7] is the sub-list for extension type_name
7, // [7:7] is the sub-list for extension extendee
0, // [0:7] is the sub-list for field type_name
8, // 7: galaxy_repository.v1.BrowseChildrenReply.children:type_name -> galaxy_repository.v1.GalaxyObject
0, // 8: galaxy_repository.v1.GalaxyRepository.TestConnection:input_type -> galaxy_repository.v1.TestConnectionRequest
2, // 9: galaxy_repository.v1.GalaxyRepository.GetLastDeployTime:input_type -> galaxy_repository.v1.GetLastDeployTimeRequest
4, // 10: galaxy_repository.v1.GalaxyRepository.DiscoverHierarchy:input_type -> galaxy_repository.v1.DiscoverHierarchyRequest
6, // 11: galaxy_repository.v1.GalaxyRepository.WatchDeployEvents:input_type -> galaxy_repository.v1.WatchDeployEventsRequest
10, // 12: galaxy_repository.v1.GalaxyRepository.BrowseChildren:input_type -> galaxy_repository.v1.BrowseChildrenRequest
1, // 13: galaxy_repository.v1.GalaxyRepository.TestConnection:output_type -> galaxy_repository.v1.TestConnectionReply
3, // 14: galaxy_repository.v1.GalaxyRepository.GetLastDeployTime:output_type -> galaxy_repository.v1.GetLastDeployTimeReply
5, // 15: galaxy_repository.v1.GalaxyRepository.DiscoverHierarchy:output_type -> galaxy_repository.v1.DiscoverHierarchyReply
7, // 16: galaxy_repository.v1.GalaxyRepository.WatchDeployEvents:output_type -> galaxy_repository.v1.DeployEvent
11, // 17: galaxy_repository.v1.GalaxyRepository.BrowseChildren:output_type -> galaxy_repository.v1.BrowseChildrenReply
13, // [13:18] is the sub-list for method output_type
8, // [8:13] is the sub-list for method input_type
8, // [8:8] is the sub-list for extension type_name
8, // [8:8] is the sub-list for extension extendee
0, // [0:8] is the sub-list for field type_name
}
func init() { file_galaxy_repository_proto_init() }
@@ -964,13 +1246,18 @@ func file_galaxy_repository_proto_init() {
(*DiscoverHierarchyRequest_RootTagName)(nil),
(*DiscoverHierarchyRequest_RootContainedPath)(nil),
}
file_galaxy_repository_proto_msgTypes[10].OneofWrappers = []any{
(*BrowseChildrenRequest_ParentGobjectId)(nil),
(*BrowseChildrenRequest_ParentTagName)(nil),
(*BrowseChildrenRequest_ParentContainedPath)(nil),
}
type x struct{}
out := protoimpl.TypeBuilder{
File: protoimpl.DescBuilder{
GoPackagePath: reflect.TypeOf(x{}).PkgPath(),
RawDescriptor: unsafe.Slice(unsafe.StringData(file_galaxy_repository_proto_rawDesc), len(file_galaxy_repository_proto_rawDesc)),
NumEnums: 0,
NumMessages: 10,
NumMessages: 12,
NumExtensions: 0,
NumServices: 1,
},
@@ -1,6 +1,6 @@
// Code generated by protoc-gen-go-grpc. DO NOT EDIT.
// versions:
// - protoc-gen-go-grpc v1.6.1
// - protoc-gen-go-grpc v1.6.2
// - protoc v7.34.1
// source: galaxy_repository.proto
@@ -23,6 +23,7 @@ const (
GalaxyRepository_GetLastDeployTime_FullMethodName = "/galaxy_repository.v1.GalaxyRepository/GetLastDeployTime"
GalaxyRepository_DiscoverHierarchy_FullMethodName = "/galaxy_repository.v1.GalaxyRepository/DiscoverHierarchy"
GalaxyRepository_WatchDeployEvents_FullMethodName = "/galaxy_repository.v1.GalaxyRepository/WatchDeployEvents"
GalaxyRepository_BrowseChildren_FullMethodName = "/galaxy_repository.v1.GalaxyRepository/BrowseChildren"
)
// GalaxyRepositoryClient is the client API for GalaxyRepository service.
@@ -44,6 +45,11 @@ type GalaxyRepositoryClient interface {
// increasing per server start; gaps indicate the per-subscriber buffer dropped
// older events because the client was too slow.
WatchDeployEvents(ctx context.Context, in *WatchDeployEventsRequest, opts ...grpc.CallOption) (grpc.ServerStreamingClient[DeployEvent], error)
// Returns the direct children of a parent object (or the root objects when
// `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
// one level at a time instead of paging the full hierarchy. Filters mirror
// DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
BrowseChildren(ctx context.Context, in *BrowseChildrenRequest, opts ...grpc.CallOption) (*BrowseChildrenReply, error)
}
type galaxyRepositoryClient struct {
@@ -103,6 +109,16 @@ func (c *galaxyRepositoryClient) WatchDeployEvents(ctx context.Context, in *Watc
// This type alias is provided for backwards compatibility with existing code that references the prior non-generic stream type by name.
type GalaxyRepository_WatchDeployEventsClient = grpc.ServerStreamingClient[DeployEvent]
func (c *galaxyRepositoryClient) BrowseChildren(ctx context.Context, in *BrowseChildrenRequest, opts ...grpc.CallOption) (*BrowseChildrenReply, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(BrowseChildrenReply)
err := c.cc.Invoke(ctx, GalaxyRepository_BrowseChildren_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
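A caller would typically page through BrowseChildren by passing each reply's next_page_token back as page_token until it comes back empty. The sketch below shows that loop under assumptions: browseRequest, browseReply, browseFunc, and fakeBrowse are hypothetical stand-ins so it runs without a server, and the fake's token encoding (a decimal offset) is purely illustrative, since the real token is opaque.

```go
package main

import (
	"fmt"
	"strconv"
)

// Stand-ins for the generated BrowseChildrenRequest / BrowseChildrenReply.
type browseRequest struct {
	PageSize  int32
	PageToken string
}

type browseReply struct {
	Children      []string
	NextPageToken string
}

// browseFunc abstracts the RPC so the loop can be exercised locally.
type browseFunc func(req browseRequest) (browseReply, error)

// collectChildren follows NextPageToken until the server returns an empty one.
func collectChildren(browse browseFunc, pageSize int32) ([]string, error) {
	var all []string
	token := ""
	for {
		reply, err := browse(browseRequest{PageSize: pageSize, PageToken: token})
		if err != nil {
			return nil, err
		}
		all = append(all, reply.Children...)
		if reply.NextPageToken == "" {
			return all, nil
		}
		token = reply.NextPageToken
	}
}

// fakeBrowse serves names in pages. The token is the decimal start offset,
// an illustrative encoding only.
func fakeBrowse(names []string) browseFunc {
	return func(req browseRequest) (browseReply, error) {
		start := 0
		if req.PageToken != "" {
			n, err := strconv.Atoi(req.PageToken)
			if err != nil || n < 0 || n > len(names) {
				return browseReply{}, fmt.Errorf("invalid page token %q", req.PageToken)
			}
			start = n
		}
		size := int(req.PageSize)
		if size <= 0 {
			size = 500 // mirrors the documented server default
		}
		end := start + size
		if end > len(names) {
			end = len(names)
		}
		reply := browseReply{Children: names[start:end]}
		if end < len(names) {
			reply.NextPageToken = strconv.Itoa(end)
		}
		return reply, nil
	}
}

func main() {
	all, err := collectChildren(fakeBrowse([]string{"a", "b", "c", "d", "e"}), 2)
	fmt.Println(all, err)
}
```

With the real client, the filter fields and parent selector would need to stay identical across pages, since the message comments state that a token mismatch returns InvalidArgument.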
// GalaxyRepositoryServer is the server API for GalaxyRepository service.
// All implementations must embed UnimplementedGalaxyRepositoryServer
// for forward compatibility.
@@ -122,6 +138,11 @@ type GalaxyRepositoryServer interface {
// increasing per server start; gaps indicate the per-subscriber buffer dropped
// older events because the client was too slow.
WatchDeployEvents(*WatchDeployEventsRequest, grpc.ServerStreamingServer[DeployEvent]) error
// Returns the direct children of a parent object (or the root objects when
// `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
// one level at a time instead of paging the full hierarchy. Filters mirror
// DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
BrowseChildren(context.Context, *BrowseChildrenRequest) (*BrowseChildrenReply, error)
mustEmbedUnimplementedGalaxyRepositoryServer()
}
@@ -144,6 +165,9 @@ func (UnimplementedGalaxyRepositoryServer) DiscoverHierarchy(context.Context, *D
func (UnimplementedGalaxyRepositoryServer) WatchDeployEvents(*WatchDeployEventsRequest, grpc.ServerStreamingServer[DeployEvent]) error {
return status.Error(codes.Unimplemented, "method WatchDeployEvents not implemented")
}
func (UnimplementedGalaxyRepositoryServer) BrowseChildren(context.Context, *BrowseChildrenRequest) (*BrowseChildrenReply, error) {
return nil, status.Error(codes.Unimplemented, "method BrowseChildren not implemented")
}
func (UnimplementedGalaxyRepositoryServer) mustEmbedUnimplementedGalaxyRepositoryServer() {}
func (UnimplementedGalaxyRepositoryServer) testEmbeddedByValue() {}
@@ -230,6 +254,24 @@ func _GalaxyRepository_WatchDeployEvents_Handler(srv interface{}, stream grpc.Se
// This type alias is provided for backwards compatibility with existing code that references the prior non-generic stream type by name.
type GalaxyRepository_WatchDeployEventsServer = grpc.ServerStreamingServer[DeployEvent]
func _GalaxyRepository_BrowseChildren_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(BrowseChildrenRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(GalaxyRepositoryServer).BrowseChildren(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: GalaxyRepository_BrowseChildren_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(GalaxyRepositoryServer).BrowseChildren(ctx, req.(*BrowseChildrenRequest))
}
return interceptor(ctx, in, info, handler)
}
// GalaxyRepository_ServiceDesc is the grpc.ServiceDesc for GalaxyRepository service.
// It's only intended for direct use with grpc.RegisterService,
// and not to be introspected or modified (even as a copy)
@@ -249,6 +291,10 @@ var GalaxyRepository_ServiceDesc = grpc.ServiceDesc{
MethodName: "DiscoverHierarchy",
Handler: _GalaxyRepository_DiscoverHierarchy_Handler,
},
{
MethodName: "BrowseChildren",
Handler: _GalaxyRepository_BrowseChildren_Handler,
},
},
Streams: []grpc.StreamDesc{
{
@@ -725,9 +725,10 @@ func (SessionState) EnumDescriptor() ([]byte, []int) {
return file_mxaccess_gateway_proto_rawDescGZIP(), []int{8}
}
// Public request shape for QueryActiveAlarms. session_id is currently unused
// (the snapshot is session-less) but reserved so a future per-session view
// can be added without a wire break.
// Public request shape for QueryActiveAlarms.
// Clients may leave `session_id` empty; the gateway currently ignores it and
// serves the session-less central-monitor cache. A future version may use it
// to scope the snapshot to one session.
type QueryActiveAlarmsRequest struct {
state protoimpl.MessageState `protogen:"open.v1"`
SessionId string `protobuf:"bytes,1,opt,name=session_id,json=sessionId,proto3" json:"session_id,omitempty"`
@@ -1,6 +1,6 @@
// Code generated by protoc-gen-go-grpc. DO NOT EDIT.
// versions:
// - protoc-gen-go-grpc v1.6.1
// - protoc-gen-go-grpc v1.6.2
// - protoc v7.34.1
// source: mxaccess_gateway.proto
@@ -50,6 +50,9 @@ type MxAccessGatewayClient interface {
// reconnect to seed Part 9 client state, or to reconcile alarms that may
// have been missed during a transport blip. Streamed so callers can
// begin processing without buffering the full set.
// `QueryActiveAlarmsRequest.alarm_filter_prefix` optionally narrows the
// snapshot to alarms whose `alarm_full_reference` starts with the given
// prefix; an empty prefix returns the full set.
QueryActiveAlarms(ctx context.Context, in *QueryActiveAlarmsRequest, opts ...grpc.CallOption) (grpc.ServerStreamingClient[ActiveAlarmSnapshot], error)
}
@@ -180,6 +183,9 @@ type MxAccessGatewayServer interface {
// reconnect to seed Part 9 client state, or to reconcile alarms that may
// have been missed during a transport blip. Streamed so callers can
// begin processing without buffering the full set.
// `QueryActiveAlarmsRequest.alarm_filter_prefix` optionally narrows the
// snapshot to alarms whose `alarm_full_reference` starts with the given
// prefix; an empty prefix returns the full set.
QueryActiveAlarms(*QueryActiveAlarmsRequest, grpc.ServerStreamingServer[ActiveAlarmSnapshot]) error
mustEmbedUnimplementedMxAccessGatewayServer()
}
+75
@@ -168,6 +168,66 @@ func TestQueryActiveAlarmsPassesFilterPrefix(t *testing.T) {
}
}
func TestStreamAlarmsPassesFilterPrefixAndReceivesFeedMessages(t *testing.T) {
fake := &fakeGatewayWithAlarms{
feedMessages: []*pb.AlarmFeedMessage{
{
Payload: &pb.AlarmFeedMessage_ActiveAlarm{
ActiveAlarm: &pb.ActiveAlarmSnapshot{
AlarmFullReference: "Tank01.Level.HiHi",
CurrentState: pb.AlarmConditionState_ALARM_CONDITION_STATE_ACTIVE,
},
},
},
{
Payload: &pb.AlarmFeedMessage_SnapshotComplete{
SnapshotComplete: true,
},
},
},
}
client, cleanup := newBufconnClientWithAlarms(t, fake)
defer cleanup()
stream, err := client.StreamAlarms(context.Background(), &pb.StreamAlarmsRequest{
AlarmFilterPrefix: "Tank01.",
})
if err != nil {
t.Fatalf("StreamAlarms() error = %v", err)
}
var received []*pb.AlarmFeedMessage
for {
msg, err := stream.Recv()
if errors.Is(err, io.EOF) {
break
}
if err != nil {
t.Fatalf("stream.Recv() error = %v", err)
}
received = append(received, msg)
}
if len(received) != 2 {
t.Fatalf("received count = %d, want 2", len(received))
}
if got := fake.streamRequest.GetAlarmFilterPrefix(); got != "Tank01." {
t.Fatalf("captured filter prefix = %q", got)
}
if got := fake.streamAuth; got != "Bearer test-api-key" {
t.Fatalf("stream authorization metadata = %q", got)
}
}
func TestStreamAlarmsRejectsNilRequest(t *testing.T) {
fake := &fakeGatewayWithAlarms{}
client, cleanup := newBufconnClientWithAlarms(t, fake)
defer cleanup()
if _, err := client.StreamAlarms(context.Background(), nil); err == nil {
t.Fatal("StreamAlarms(nil) returned no error")
}
}
type fakeGatewayWithAlarms struct {
pb.UnimplementedMxAccessGatewayServer
@@ -178,6 +238,10 @@ type fakeGatewayWithAlarms struct {
queryRequest *pb.QueryActiveAlarmsRequest
activeSnapshots []*pb.ActiveAlarmSnapshot
streamRequest *pb.StreamAlarmsRequest
feedMessages []*pb.AlarmFeedMessage
streamAuth string
}
func (s *fakeGatewayWithAlarms) AcknowledgeAlarm(ctx context.Context, req *pb.AcknowledgeAlarmRequest) (*pb.AcknowledgeAlarmReply, error) {
@@ -207,6 +271,17 @@ func (s *fakeGatewayWithAlarms) QueryActiveAlarms(req *pb.QueryActiveAlarmsReque
return nil
}
func (s *fakeGatewayWithAlarms) StreamAlarms(req *pb.StreamAlarmsRequest, stream grpc.ServerStreamingServer[pb.AlarmFeedMessage]) error {
s.streamRequest = req
s.streamAuth = authorizationFromContext(stream.Context())
for _, msg := range s.feedMessages {
if err := stream.Send(msg); err != nil {
return err
}
}
return nil
}
func newBufconnClientWithAlarms(t *testing.T, fake *fakeGatewayWithAlarms) (*Client, func()) {
t.Helper()
listener := bufconn.Listen(bufSize)
+16 -4
@@ -222,10 +222,22 @@ func resolveTransportCredentials(opts Options) (credentials.TransportCredentials
return credentials.NewTLS(cfg), nil
}
return credentials.NewTLS(&tls.Config{
MinVersion: tls.VersionTLS12,
ServerName: opts.ServerNameOverride,
}), nil
return credentials.NewTLS(tlsConfigForOptions(opts)), nil
}
// tlsConfigForOptions returns the *tls.Config for the no-CA, no-custom-config TLS path.
// It returns nil when the caller should use a different credentials path (CA file or custom TLSConfig).
// Exposed as an internal helper so unit tests can assert the InsecureSkipVerify posture.
func tlsConfigForOptions(opts Options) *tls.Config {
// CA file and custom TLSConfig take their own paths in resolveTransportCredentials.
if opts.CACertFile != "" || opts.TLSConfig != nil {
return nil
}
return &tls.Config{
MinVersion: tls.VersionTLS12,
ServerName: opts.ServerNameOverride,
InsecureSkipVerify: !opts.RequireCertificateValidation, //nolint:gosec // internal tool; self-signed gateway cert expected; opt-in strict via RequireCertificateValidation
}
}
// OpenSessionOptions describes fields used to create an OpenSessionRequest.
+200
@@ -230,6 +230,206 @@ func TestSubscribeBulkBuildsOneBulkCommandAndReturnsResults(t *testing.T) {
}
}
func TestWriteBulkBuildsOneBulkCommandAndReturnsPerEntryResults(t *testing.T) {
fake := &fakeGatewayServer{
invokeReply: &pb.MxCommandReply{
SessionId: "session-1",
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE_BULK,
ProtocolStatus: &pb.ProtocolStatus{
Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK,
},
Payload: &pb.MxCommandReply_WriteBulk{
WriteBulk: &pb.BulkWriteReply{
Results: []*pb.BulkWriteResult{
{ItemHandle: 10, WasSuccessful: true},
{ItemHandle: 11, WasSuccessful: true},
},
},
},
},
}
client, cleanup := newBufconnClient(t, fake)
defer cleanup()
session := NewSessionForID(client, "session-1")
entries := []*WriteBulkEntry{
{ItemHandle: 10, Value: Int32Value(7), UserId: 100},
{ItemHandle: 11, Value: Int32Value(8), UserId: 100},
}
results, err := session.WriteBulk(context.Background(), 12, entries)
if err != nil {
t.Fatalf("WriteBulk() error = %v", err)
}
if len(results) != 2 {
t.Fatalf("results len = %d, want 2", len(results))
}
req := fake.invokeRequest
if req.GetCommand().GetKind() != pb.MxCommandKind_MX_COMMAND_KIND_WRITE_BULK {
t.Fatalf("command kind = %s", req.GetCommand().GetKind())
}
if got := req.GetCommand().GetWriteBulk().GetEntries(); len(got) != 2 {
t.Fatalf("entry count = %d, want 2", len(got))
}
}
func TestWriteBulkRejectsNilEntries(t *testing.T) {
fake := &fakeGatewayServer{}
client, cleanup := newBufconnClient(t, fake)
defer cleanup()
session := NewSessionForID(client, "session-1")
if _, err := session.WriteBulk(context.Background(), 12, nil); err == nil {
t.Fatal("WriteBulk(nil) returned no error")
}
if _, err := session.Write2Bulk(context.Background(), 12, nil); err == nil {
t.Fatal("Write2Bulk(nil) returned no error")
}
if _, err := session.WriteSecuredBulk(context.Background(), 12, nil); err == nil {
t.Fatal("WriteSecuredBulk(nil) returned no error")
}
if _, err := session.WriteSecured2Bulk(context.Background(), 12, nil); err == nil {
t.Fatal("WriteSecured2Bulk(nil) returned no error")
}
if _, err := session.ReadBulk(context.Background(), 12, nil, 0); err == nil {
t.Fatal("ReadBulk(nil) returned no error")
}
}
func TestBulkMethodsShortCircuitOnEmptySliceWithoutRoundTrip(t *testing.T) {
fake := &fakeGatewayServer{
invokeReply: &pb.MxCommandReply{
ProtocolStatus: &pb.ProtocolStatus{
Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK,
},
},
}
client, cleanup := newBufconnClient(t, fake)
defer cleanup()
session := NewSessionForID(client, "session-1")
results, err := session.WriteBulk(context.Background(), 12, []*WriteBulkEntry{})
if err != nil {
t.Fatalf("WriteBulk(empty) error = %v", err)
}
if len(results) != 0 {
t.Fatalf("WriteBulk(empty) results len = %d, want 0", len(results))
}
if fake.invokeRequest != nil {
t.Fatal("WriteBulk(empty) sent a round trip; expected short-circuit")
}
results2, err := session.Write2Bulk(context.Background(), 12, []*Write2BulkEntry{})
if err != nil {
t.Fatalf("Write2Bulk(empty) error = %v", err)
}
if len(results2) != 0 {
t.Fatalf("Write2Bulk(empty) results len = %d, want 0", len(results2))
}
if fake.invokeRequest != nil {
t.Fatal("Write2Bulk(empty) sent a round trip; expected short-circuit")
}
results3, err := session.WriteSecuredBulk(context.Background(), 12, []*WriteSecuredBulkEntry{})
if err != nil {
t.Fatalf("WriteSecuredBulk(empty) error = %v", err)
}
if len(results3) != 0 {
t.Fatalf("WriteSecuredBulk(empty) results len = %d, want 0", len(results3))
}
if fake.invokeRequest != nil {
t.Fatal("WriteSecuredBulk(empty) sent a round trip; expected short-circuit")
}
results4, err := session.WriteSecured2Bulk(context.Background(), 12, []*WriteSecured2BulkEntry{})
if err != nil {
t.Fatalf("WriteSecured2Bulk(empty) error = %v", err)
}
if len(results4) != 0 {
t.Fatalf("WriteSecured2Bulk(empty) results len = %d, want 0", len(results4))
}
if fake.invokeRequest != nil {
t.Fatal("WriteSecured2Bulk(empty) sent a round trip; expected short-circuit")
}
readResults, err := session.ReadBulk(context.Background(), 12, []string{}, 0)
if err != nil {
t.Fatalf("ReadBulk(empty) error = %v", err)
}
if len(readResults) != 0 {
t.Fatalf("ReadBulk(empty) results len = %d, want 0", len(readResults))
}
if fake.invokeRequest != nil {
t.Fatal("ReadBulk(empty) sent a round trip; expected short-circuit")
}
}
func TestReadBulkForwardsTimeoutAndUnpacksCachedFlag(t *testing.T) {
fake := &fakeGatewayServer{
invokeReply: &pb.MxCommandReply{
SessionId: "session-1",
Kind: pb.MxCommandKind_MX_COMMAND_KIND_READ_BULK,
ProtocolStatus: &pb.ProtocolStatus{
Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK,
},
Payload: &pb.MxCommandReply_ReadBulk{
ReadBulk: &pb.BulkReadReply{
Results: []*pb.BulkReadResult{
{TagAddress: "Tank01.Level", WasSuccessful: true, WasCached: true},
{TagAddress: "Tank02.Level", WasSuccessful: true, WasCached: false},
},
},
},
},
}
client, cleanup := newBufconnClient(t, fake)
defer cleanup()
session := NewSessionForID(client, "session-1")
results, err := session.ReadBulk(context.Background(), 12, []string{"Tank01.Level", "Tank02.Level"}, 250*time.Millisecond)
if err != nil {
t.Fatalf("ReadBulk() error = %v", err)
}
if len(results) != 2 {
t.Fatalf("results len = %d, want 2", len(results))
}
if !results[0].GetWasCached() || results[1].GetWasCached() {
t.Fatalf("WasCached flags = [%v %v], want [true false]", results[0].GetWasCached(), results[1].GetWasCached())
}
req := fake.invokeRequest
if req.GetCommand().GetKind() != pb.MxCommandKind_MX_COMMAND_KIND_READ_BULK {
t.Fatalf("command kind = %s", req.GetCommand().GetKind())
}
if got := req.GetCommand().GetReadBulk().GetTimeoutMs(); got != 250 {
t.Fatalf("timeout ms = %d, want 250", got)
}
}
func TestReadBulkSaturatesTimeoutAboveMaxUint32(t *testing.T) {
fake := &fakeGatewayServer{
invokeReply: &pb.MxCommandReply{
SessionId: "session-1",
Kind: pb.MxCommandKind_MX_COMMAND_KIND_READ_BULK,
ProtocolStatus: &pb.ProtocolStatus{
Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK,
},
},
}
client, cleanup := newBufconnClient(t, fake)
defer cleanup()
session := NewSessionForID(client, "session-1")
// 100 days in milliseconds exceeds MaxUint32 (~49.7 days).
hugeTimeout := 100 * 24 * time.Hour
_, err := session.ReadBulk(context.Background(), 12, []string{"Tank01.Level"}, hugeTimeout)
if err != nil {
t.Fatalf("ReadBulk() error = %v", err)
}
got := fake.invokeRequest.GetCommand().GetReadBulk().GetTimeoutMs()
if got != ^uint32(0) {
t.Fatalf("timeout ms = %d, want %d (MaxUint32)", got, ^uint32(0))
}
}
func TestInvokeReturnsTypedMxAccessErrorWithRawReply(t *testing.T) {
hresult := int32(-2147467259)
fake := &fakeGatewayServer{
+59
@@ -0,0 +1,59 @@
package mxgateway
import (
"crypto/tls"
"testing"
)
// tlsConfigForOptions is the internal helper under test.
// It builds the *tls.Config for the no-CA TLS path of resolveTransportCredentials.
// We exercise it directly to avoid needing a real dial target.
func TestTLSInsecureSkipVerify_DefaultTrue(t *testing.T) {
cfg := tlsConfigForOptions(Options{
Endpoint: "localhost:5120",
})
if cfg == nil {
t.Fatal("expected non-nil tls.Config")
}
if !cfg.InsecureSkipVerify {
t.Error("InsecureSkipVerify should be true by default when no CA is pinned")
}
}
func TestTLSInsecureSkipVerify_FalseWhenRequireCertificateValidation(t *testing.T) {
cfg := tlsConfigForOptions(Options{
Endpoint: "localhost:5120",
RequireCertificateValidation: true,
})
if cfg == nil {
t.Fatal("expected non-nil tls.Config")
}
if cfg.InsecureSkipVerify {
t.Error("InsecureSkipVerify should be false when RequireCertificateValidation is true")
}
}
func TestTLSInsecureSkipVerify_FalseWhenCACertFileSet(t *testing.T) {
// When a CA file is pinned, resolveTransportCredentials takes the CA-verification path instead.
// tlsConfigForOptions should return nil, so no skip-verify config is ever produced for this case.
cfg := tlsConfigForOptions(Options{
Endpoint: "localhost:5120",
CACertFile: "/some/ca.pem",
})
if cfg != nil {
t.Error("expected nil tls.Config when CACertFile is set (CA path taken)")
}
}
func TestTLSInsecureSkipVerify_FalseWhenCustomTLSConfig(t *testing.T) {
// When TLSConfig is supplied explicitly, our default skip-verify must not overwrite it.
custom := &tls.Config{MinVersion: tls.VersionTLS13}
cfg := tlsConfigForOptions(Options{
Endpoint: "localhost:5120",
TLSConfig: custom,
})
if cfg != nil {
t.Error("expected nil tls.Config when TLSConfig is already set (custom config path taken)")
}
}
+241 -8
@@ -3,7 +3,9 @@ package mxgateway
import (
"context"
"errors"
"fmt"
"io"
"sync"
"time"
pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"
@@ -13,6 +15,14 @@ import (
"google.golang.org/protobuf/types/known/timestamppb"
)
// browseChildrenPageSize is the per-request page size used by the lazy walker.
const browseChildrenPageSize = 500
// discoverHierarchyPageSize is the per-request page size used by DiscoverHierarchy.
// Mirrors the .NET client constant so large galaxies are not silently truncated
// by the server's default page cap.
const discoverHierarchyPageSize = 5000
// RawGalaxyRepositoryClient is the generated gRPC client interface for the
// Galaxy Repository service exposed for callers that need direct contract
// access.
@@ -40,6 +50,10 @@ type (
WatchDeployEventsRequest = pb.WatchDeployEventsRequest
// DeployEvent is one Galaxy Repository deploy event.
DeployEvent = pb.DeployEvent
// BrowseChildrenRequest is the request for BrowseChildren.
BrowseChildrenRequest = pb.BrowseChildrenRequest
// BrowseChildrenReply is the reply for BrowseChildren.
BrowseChildrenReply = pb.BrowseChildrenReply
)
// RawDeployEventStream is the generated WatchDeployEvents client stream.
@@ -146,16 +160,35 @@ func (c *GalaxyClient) GetLastDeployTime(ctx context.Context) (time.Time, bool,
// DiscoverHierarchy returns the deployed Galaxy object hierarchy with each
// object's dynamic attributes. The objects are returned in the order supplied
// by the server.
// by the server. The call pages over the server's NextPageToken until the
// server signals it has no more results, matching the .NET client.
func (c *GalaxyClient) DiscoverHierarchy(ctx context.Context) ([]*GalaxyObject, error) {
callCtx, cancel := c.callContext(ctx)
defer cancel()
reply, err := c.raw.DiscoverHierarchy(callCtx, &pb.DiscoverHierarchyRequest{})
if err != nil {
return nil, &GatewayError{Op: "galaxy discover hierarchy", Err: err}
var objects []*GalaxyObject
pageToken := ""
seen := map[string]struct{}{}
for {
callCtx, cancel := c.callContext(ctx)
reply, err := c.raw.DiscoverHierarchy(callCtx, &pb.DiscoverHierarchyRequest{
PageSize: discoverHierarchyPageSize,
PageToken: pageToken,
})
cancel()
if err != nil {
return nil, &GatewayError{Op: "galaxy discover hierarchy", Err: err}
}
objects = append(objects, reply.GetObjects()...)
pageToken = reply.GetNextPageToken()
if pageToken == "" {
return objects, nil
}
if _, dup := seen[pageToken]; dup {
return nil, &GatewayError{
Op: "galaxy discover hierarchy",
Err: fmt.Errorf("repeated page token %q", pageToken),
}
}
seen[pageToken] = struct{}{}
}
return reply.GetObjects(), nil
}
// WatchDeployEventsRaw starts the generated WatchDeployEvents stream for callers
@@ -238,6 +271,206 @@ func (c *GalaxyClient) Close() error {
return c.conn.Close()
}
// LazyBrowseNode is one node in a lazy Galaxy hierarchy walk produced by
// (*GalaxyClient).Browse. Children are not fetched until Expand is called.
// The node is safe for concurrent use; concurrent Expand calls coalesce onto
// a single in-flight RPC and do not block snapshot accessors.
type LazyBrowseNode struct {
client *GalaxyClient
object *pb.GalaxyObject
hasChildrenHint bool
options BrowseChildrenOptions
// expandLock gates inspection and mutation of expand-coordination state
// (expanding, expandDone, expandErr). It is held only briefly; the BrowseChildren
// RPC itself runs outside this lock so concurrent readers and waiters are not blocked.
expandLock sync.Mutex
expanding bool
expandDone chan struct{}
expandErr error
// mu protects the children snapshot and isExpanded flag for concurrent
// Children() / IsExpanded() readers.
mu sync.RWMutex
children []*LazyBrowseNode
isExpanded bool
}
// Object returns the underlying GalaxyObject describing this node.
func (n *LazyBrowseNode) Object() *pb.GalaxyObject { return n.object }
// HasChildrenHint reports the server-supplied hint on whether this node has
// matching descendants under the current filter set.
func (n *LazyBrowseNode) HasChildrenHint() bool { return n.hasChildrenHint }
// Children returns a snapshot copy of the currently-loaded child nodes. Returns
// an empty slice when Expand has not yet been called.
func (n *LazyBrowseNode) Children() []*LazyBrowseNode {
n.mu.RLock()
defer n.mu.RUnlock()
out := make([]*LazyBrowseNode, len(n.children))
copy(out, n.children)
return out
}
// IsExpanded reports whether Expand has completed successfully on this node.
func (n *LazyBrowseNode) IsExpanded() bool {
n.mu.RLock()
defer n.mu.RUnlock()
return n.isExpanded
}
// Expand fetches this node's direct children via BrowseChildren when they have
// not yet been loaded. Subsequent calls after a successful Expand are a no-op
// and do not issue another RPC.
//
// Expand is safe to call concurrently from multiple goroutines: callers that
// arrive while an expansion is in flight wait on the active RPC and share its
// result instead of issuing a second RPC. The RPC itself runs without holding
// the snapshot mutex, so concurrent Children() and IsExpanded() callers are
// not blocked for the duration of the network round trip.
//
// Failure semantics: a failed expansion surfaces the same error to every
// in-flight waiter, but the node is left in its pre-call state (isExpanded =
// false, no in-flight expansion). The next Expand call therefore retries with
// a fresh RPC; failures are not sticky.
func (n *LazyBrowseNode) Expand(ctx context.Context) error {
// Fast path: already expanded.
n.mu.RLock()
if n.isExpanded {
n.mu.RUnlock()
return nil
}
n.mu.RUnlock()
// Either start a new expansion or wait on an existing one.
n.expandLock.Lock()
n.mu.RLock()
alreadyExpanded := n.isExpanded
n.mu.RUnlock()
if alreadyExpanded {
n.expandLock.Unlock()
return nil
}
if n.expanding {
done := n.expandDone
n.expandLock.Unlock()
select {
case <-done:
n.expandLock.Lock()
err := n.expandErr
n.expandLock.Unlock()
return err
case <-ctx.Done():
return ctx.Err()
}
}
n.expanding = true
n.expandDone = make(chan struct{})
done := n.expandDone
n.expandLock.Unlock()
// Issue the RPC outside any lock so concurrent readers/waiters are not blocked.
parentID := n.object.GetGobjectId()
children, err := n.client.browseChildrenInner(ctx, &parentID, n.options)
if err == nil {
n.mu.Lock()
n.children = children
n.isExpanded = true
n.mu.Unlock()
}
// Publish result to waiters and clear the in-flight marker so a failed
// expansion can be retried by the next Expand call.
n.expandLock.Lock()
n.expandErr = err
n.expanding = false
close(done)
n.expandLock.Unlock()
return err
}
// Browse returns the root nodes of the Galaxy hierarchy. The returned nodes
// have only their server-supplied hints populated; call Expand on each node to
// fetch its direct children. When opts is nil the server defaults apply.
func (c *GalaxyClient) Browse(ctx context.Context, opts *BrowseChildrenOptions) ([]*LazyBrowseNode, error) {
effective := BrowseChildrenOptions{}
if opts != nil {
effective = *opts
}
return c.browseChildrenInner(ctx, nil, effective)
}
// BrowseChildrenRaw issues a single BrowseChildren RPC and returns the raw
// reply for callers that need direct page-token control. Transport-level
// failures are wrapped in *GatewayError to match the rest of the client.
func (c *GalaxyClient) BrowseChildrenRaw(ctx context.Context, req *pb.BrowseChildrenRequest) (*pb.BrowseChildrenReply, error) {
callCtx, cancel := c.callContext(ctx)
defer cancel()
reply, err := c.raw.BrowseChildren(callCtx, req)
if err != nil {
return nil, &GatewayError{Op: "galaxy browse children", Err: err}
}
return reply, nil
}
func (c *GalaxyClient) browseChildrenInner(
ctx context.Context,
parentGobjectID *int32,
opts BrowseChildrenOptions,
) ([]*LazyBrowseNode, error) {
var nodes []*LazyBrowseNode
pageToken := ""
seen := map[string]struct{}{}
for {
req := &pb.BrowseChildrenRequest{
PageSize: browseChildrenPageSize,
PageToken: pageToken,
CategoryIds: opts.CategoryIds,
TemplateChainContains: opts.TemplateChainContains,
TagNameGlob: opts.TagNameGlob,
AlarmBearingOnly: opts.AlarmBearingOnly,
HistorizedOnly: opts.HistorizedOnly,
}
if parentGobjectID != nil {
req.Parent = &pb.BrowseChildrenRequest_ParentGobjectId{ParentGobjectId: *parentGobjectID}
}
if opts.IncludeAttributes != nil {
req.IncludeAttributes = opts.IncludeAttributes
}
reply, err := c.BrowseChildrenRaw(ctx, req)
if err != nil {
return nil, err
}
for i, child := range reply.GetChildren() {
hasChildren := reply.GetChildHasChildren()
hint := i < len(hasChildren) && hasChildren[i]
nodes = append(nodes, &LazyBrowseNode{
client: c,
object: child,
hasChildrenHint: hint,
options: opts,
})
}
pageToken = reply.GetNextPageToken()
if pageToken == "" {
return nodes, nil
}
if _, dup := seen[pageToken]; dup {
return nil, &GatewayError{
Op: "galaxy browse children",
Err: fmt.Errorf("repeated page token %q", pageToken),
}
}
seen[pageToken] = struct{}{}
}
}
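// Illustrative sketch (not part of the client): the loop above reads the
// has-children hint from a slice that runs parallel to the children list and
// treats a missing entry as false. hintAt is a hypothetical helper isolating
// that bounds-safe lookup.

```go
package main

import "fmt"

// hintAt returns hints[i] when the index is in range and false otherwise, so a
// reply with a shorter hint slice never causes an out-of-range panic.
func hintAt(hints []bool, i int) bool {
	return i >= 0 && i < len(hints) && hints[i]
}

func main() {
	hints := []bool{true}
	fmt.Println(hintAt(hints, 0), hintAt(hints, 1)) // prints "true false"
}
```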
func (c *GalaxyClient) callContext(ctx context.Context) (context.Context, context.CancelFunc) {
timeout := c.opts.CallTimeout
if timeout == 0 {
+446 -9
@@ -4,11 +4,14 @@ import (
"context"
"errors"
"net"
"sync"
"testing"
"time"
pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"
"google.golang.org/grpc"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
"google.golang.org/grpc/test/bufconn"
"google.golang.org/protobuf/types/known/timestamppb"
)
@@ -144,6 +147,47 @@ func TestGalaxyDiscoverHierarchyReturnsObjects(t *testing.T) {
}
}
func TestGalaxyDiscoverHierarchyPaginatesAcrossMultiplePages(t *testing.T) {
page1 := &pb.DiscoverHierarchyReply{
Objects: []*pb.GalaxyObject{
{GobjectId: 1, TagName: "A"},
{GobjectId: 2, TagName: "B"},
},
NextPageToken: "page-2",
TotalObjectCount: 3,
}
page2 := &pb.DiscoverHierarchyReply{
Objects: []*pb.GalaxyObject{
{GobjectId: 3, TagName: "C"},
},
TotalObjectCount: 3,
}
fake := &fakeGalaxyServer{
discoverHierarchyReplies: []*pb.DiscoverHierarchyReply{page1, page2},
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
objs, err := client.DiscoverHierarchy(context.Background())
if err != nil {
t.Fatalf("DiscoverHierarchy: %v", err)
}
if got, want := len(objs), 3; got != want {
t.Fatalf("len(objs) = %d, want %d", got, want)
}
if len(fake.discoverHierarchyCalls) != 2 {
t.Fatalf("expected 2 RPC calls, got %d", len(fake.discoverHierarchyCalls))
}
if fake.discoverHierarchyCalls[0].GetPageSize() != discoverHierarchyPageSize {
t.Fatalf("first call PageSize = %d, want %d",
fake.discoverHierarchyCalls[0].GetPageSize(), discoverHierarchyPageSize)
}
if fake.discoverHierarchyCalls[1].GetPageToken() != "page-2" {
t.Fatalf("second call page token = %q, want %q",
fake.discoverHierarchyCalls[1].GetPageToken(), "page-2")
}
}
func TestGalaxyDialReturnsGatewayErrorOnRpcFailure(t *testing.T) {
fake := &fakeGalaxyServer{failTest: true}
client, cleanup := newGalaxyBufconnClient(t, fake)
@@ -370,15 +414,20 @@ func newGalaxyBufconnClient(t *testing.T, fake *fakeGalaxyServer) (*GalaxyClient
type fakeGalaxyServer struct {
pb.UnimplementedGalaxyRepositoryServer
testReply *pb.TestConnectionReply
testAuth string
failTest bool
deployReply *pb.GetLastDeployTimeReply
discoverReply *pb.DiscoverHierarchyReply
watchEvents []*pb.DeployEvent
watchRequest *pb.WatchDeployEventsRequest
watchSendInterval time.Duration
watchHoldOpen bool
testReply *pb.TestConnectionReply
testAuth string
failTest bool
deployReply *pb.GetLastDeployTimeReply
discoverReply *pb.DiscoverHierarchyReply
discoverHierarchyCalls []*pb.DiscoverHierarchyRequest
discoverHierarchyReplies []*pb.DiscoverHierarchyReply
watchEvents []*pb.DeployEvent
watchRequest *pb.WatchDeployEventsRequest
watchSendInterval time.Duration
watchHoldOpen bool
browseChildrenCalls []*pb.BrowseChildrenRequest
browseChildrenReplies []*pb.BrowseChildrenReply
browseChildrenError error
}
func (s *fakeGalaxyServer) TestConnection(ctx context.Context, req *pb.TestConnectionRequest) (*pb.TestConnectionReply, error) {
@@ -400,6 +449,12 @@ func (s *fakeGalaxyServer) GetLastDeployTime(ctx context.Context, req *pb.GetLas
}
func (s *fakeGalaxyServer) DiscoverHierarchy(ctx context.Context, req *pb.DiscoverHierarchyRequest) (*pb.DiscoverHierarchyReply, error) {
s.discoverHierarchyCalls = append(s.discoverHierarchyCalls, req)
if len(s.discoverHierarchyReplies) > 0 {
reply := s.discoverHierarchyReplies[0]
s.discoverHierarchyReplies = s.discoverHierarchyReplies[1:]
return reply, nil
}
if s.discoverReply != nil {
return s.discoverReply, nil
}
@@ -425,3 +480,385 @@ func (s *fakeGalaxyServer) WatchDeployEvents(req *pb.WatchDeployEventsRequest, s
}
return nil
}
func (s *fakeGalaxyServer) BrowseChildren(ctx context.Context, req *pb.BrowseChildrenRequest) (*pb.BrowseChildrenReply, error) {
s.browseChildrenCalls = append(s.browseChildrenCalls, req)
if s.browseChildrenError != nil {
err := s.browseChildrenError
s.browseChildrenError = nil
return nil, err
}
if len(s.browseChildrenReplies) == 0 {
return &pb.BrowseChildrenReply{}, nil
}
reply := s.browseChildrenReplies[0]
s.browseChildrenReplies = s.browseChildrenReplies[1:]
return reply, nil
}
func obj(id int32, tag string, isArea bool) *pb.GalaxyObject {
return &pb.GalaxyObject{
GobjectId: id,
TagName: tag,
BrowseName: tag,
IsArea: isArea,
}
}
func buildBrowseReply(children []*pb.GalaxyObject, hasChildren []bool, seq uint64) *pb.BrowseChildrenReply {
return &pb.BrowseChildrenReply{
TotalChildCount: int32(len(children)),
CacheSequence: seq,
Children: children,
ChildHasChildren: hasChildren,
}
}
func TestGalaxyBrowseNoParentReturnsRoots(t *testing.T) {
fake := &fakeGalaxyServer{
browseChildrenReplies: []*pb.BrowseChildrenReply{
buildBrowseReply(
[]*pb.GalaxyObject{obj(1, "Plant", true), obj(99, "Other", false)},
[]bool{true, false},
7,
),
},
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
roots, err := client.Browse(context.Background(), nil)
if err != nil {
t.Fatalf("Browse: %v", err)
}
if got, want := len(roots), 2; got != want {
t.Fatalf("len(roots) = %d, want %d", got, want)
}
if roots[0].Object().GetTagName() != "Plant" {
t.Fatalf("roots[0].TagName = %q", roots[0].Object().GetTagName())
}
if !roots[0].HasChildrenHint() {
t.Fatal("roots[0].HasChildrenHint = false, want true")
}
if roots[0].IsExpanded() {
t.Fatal("roots[0].IsExpanded = true, want false")
}
if roots[1].HasChildrenHint() {
t.Fatal("roots[1].HasChildrenHint = true, want false")
}
if len(fake.browseChildrenCalls) != 1 {
t.Fatalf("BrowseChildren calls = %d, want 1", len(fake.browseChildrenCalls))
}
if fake.browseChildrenCalls[0].GetParent() != nil {
t.Fatalf("root browse should not set Parent oneof, got %T", fake.browseChildrenCalls[0].GetParent())
}
}
func TestGalaxyBrowseExpandPopulatesChildrenAndMarksExpanded(t *testing.T) {
fake := &fakeGalaxyServer{
browseChildrenReplies: []*pb.BrowseChildrenReply{
buildBrowseReply(
[]*pb.GalaxyObject{obj(1, "Plant", true)},
[]bool{true},
1,
),
buildBrowseReply(
[]*pb.GalaxyObject{obj(10, "Area1", true), obj(11, "Tank1", false)},
[]bool{true, false},
1,
),
},
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
roots, err := client.Browse(context.Background(), nil)
if err != nil {
t.Fatalf("Browse: %v", err)
}
if len(roots) != 1 {
t.Fatalf("len(roots) = %d, want 1", len(roots))
}
plant := roots[0]
if plant.IsExpanded() {
t.Fatal("plant.IsExpanded = true before Expand, want false")
}
if err := plant.Expand(context.Background()); err != nil {
t.Fatalf("Expand: %v", err)
}
if !plant.IsExpanded() {
t.Fatal("plant.IsExpanded = false after Expand, want true")
}
children := plant.Children()
if len(children) != 2 {
t.Fatalf("len(children) = %d, want 2", len(children))
}
if children[0].Object().GetTagName() != "Area1" {
t.Fatalf("children[0].TagName = %q, want Area1", children[0].Object().GetTagName())
}
if !children[0].HasChildrenHint() {
t.Fatal("children[0].HasChildrenHint = false, want true")
}
if children[1].HasChildrenHint() {
t.Fatal("children[1].HasChildrenHint = true, want false")
}
if len(fake.browseChildrenCalls) != 2 {
t.Fatalf("BrowseChildren calls = %d, want 2", len(fake.browseChildrenCalls))
}
parent := fake.browseChildrenCalls[1].GetParent()
parentGobj, ok := parent.(*pb.BrowseChildrenRequest_ParentGobjectId)
if !ok {
t.Fatalf("Parent oneof = %T, want *BrowseChildrenRequest_ParentGobjectId", parent)
}
if parentGobj.ParentGobjectId != 1 {
t.Fatalf("ParentGobjectId = %d, want 1", parentGobj.ParentGobjectId)
}
}
func TestGalaxyBrowseExpandIdempotentNoSecondRpc(t *testing.T) {
fake := &fakeGalaxyServer{
browseChildrenReplies: []*pb.BrowseChildrenReply{
buildBrowseReply(
[]*pb.GalaxyObject{obj(1, "Plant", true)},
[]bool{true},
1,
),
buildBrowseReply(
[]*pb.GalaxyObject{obj(10, "Area1", true)},
[]bool{false},
1,
),
},
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
roots, err := client.Browse(context.Background(), nil)
if err != nil {
t.Fatalf("Browse: %v", err)
}
plant := roots[0]
if err := plant.Expand(context.Background()); err != nil {
t.Fatalf("Expand #1: %v", err)
}
callsAfterFirst := len(fake.browseChildrenCalls)
if callsAfterFirst != 2 {
t.Fatalf("BrowseChildren calls after first Expand = %d, want 2", callsAfterFirst)
}
if err := plant.Expand(context.Background()); err != nil {
t.Fatalf("Expand #2: %v", err)
}
if got := len(fake.browseChildrenCalls); got != callsAfterFirst {
t.Fatalf("BrowseChildren calls after second Expand = %d, want %d (no extra RPC)", got, callsAfterFirst)
}
}
func TestGalaxyBrowseExpandUnknownParentReturnsNotFoundError(t *testing.T) {
fake := &fakeGalaxyServer{
browseChildrenReplies: []*pb.BrowseChildrenReply{
buildBrowseReply(
[]*pb.GalaxyObject{obj(1, "Plant", true)},
[]bool{true},
1,
),
},
}
// The fake returns browseChildrenError ahead of any queued reply, so leave it
// unset while the root Browse consumes its reply and arm it just before Expand.
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
roots, err := client.Browse(context.Background(), nil)
if err != nil {
t.Fatalf("Browse: %v", err)
}
if len(roots) != 1 {
t.Fatalf("len(roots) = %d, want 1", len(roots))
}
fake.browseChildrenError = status.Error(codes.NotFound, "parent not found")
err = roots[0].Expand(context.Background())
if err == nil {
t.Fatal("Expand: error = nil, want NotFound")
}
if status.Code(err) != codes.NotFound {
t.Fatalf("status.Code = %s, want NotFound", status.Code(err))
}
if roots[0].IsExpanded() {
t.Fatal("roots[0].IsExpanded = true after failed Expand, want false")
}
}
func TestGalaxyBrowseExpandMultiPageGathersAllPages(t *testing.T) {
firstPage := buildBrowseReply(
[]*pb.GalaxyObject{obj(1, "Plant", true)},
[]bool{true},
7,
)
pageA := buildBrowseReply(
[]*pb.GalaxyObject{obj(10, "Child1", false), obj(11, "Child2", false)},
[]bool{false, false},
7,
)
pageA.NextPageToken = "7:abc:2"
pageB := buildBrowseReply(
[]*pb.GalaxyObject{obj(12, "Child3", false)},
[]bool{false},
7,
)
fake := &fakeGalaxyServer{
browseChildrenReplies: []*pb.BrowseChildrenReply{firstPage, pageA, pageB},
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
roots, err := client.Browse(context.Background(), nil)
if err != nil {
t.Fatalf("Browse: %v", err)
}
if err := roots[0].Expand(context.Background()); err != nil {
t.Fatalf("Expand: %v", err)
}
children := roots[0].Children()
if len(children) != 3 {
t.Fatalf("len(children) = %d, want 3", len(children))
}
if len(fake.browseChildrenCalls) != 3 {
t.Fatalf("BrowseChildren calls = %d, want 3", len(fake.browseChildrenCalls))
}
if got := fake.browseChildrenCalls[2].GetPageToken(); got != "7:abc:2" {
t.Fatalf("third call PageToken = %q, want %q", got, "7:abc:2")
}
}
func TestGalaxyBrowseWithFilterForwardsToRequest(t *testing.T) {
fake := &fakeGalaxyServer{
browseChildrenReplies: []*pb.BrowseChildrenReply{
buildBrowseReply(nil, nil, 1),
},
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
include := true
opts := &BrowseChildrenOptions{
CategoryIds: []int32{7, 9},
TemplateChainContains: []string{"$AppObject"},
TagNameGlob: "Tank*",
IncludeAttributes: &include,
AlarmBearingOnly: true,
HistorizedOnly: true,
}
if _, err := client.Browse(context.Background(), opts); err != nil {
t.Fatalf("Browse: %v", err)
}
if len(fake.browseChildrenCalls) != 1 {
t.Fatalf("BrowseChildren calls = %d, want 1", len(fake.browseChildrenCalls))
}
got := fake.browseChildrenCalls[0]
if want := []int32{7, 9}; len(got.GetCategoryIds()) != 2 || got.GetCategoryIds()[0] != want[0] || got.GetCategoryIds()[1] != want[1] {
t.Fatalf("CategoryIds = %v, want %v", got.GetCategoryIds(), want)
}
if want := []string{"$AppObject"}; len(got.GetTemplateChainContains()) != 1 || got.GetTemplateChainContains()[0] != want[0] {
t.Fatalf("TemplateChainContains = %v, want %v", got.GetTemplateChainContains(), want)
}
if got.GetTagNameGlob() != "Tank*" {
t.Fatalf("TagNameGlob = %q, want %q", got.GetTagNameGlob(), "Tank*")
}
if !got.GetIncludeAttributes() {
t.Fatal("IncludeAttributes = false, want true")
}
if !got.GetAlarmBearingOnly() {
t.Fatal("AlarmBearingOnly = false, want true")
}
if !got.GetHistorizedOnly() {
t.Fatal("HistorizedOnly = false, want true")
}
}
func TestGalaxyBrowseExpandConcurrentCallersOnlyFireOneRpc(t *testing.T) {
fake := &fakeGalaxyServer{
browseChildrenReplies: []*pb.BrowseChildrenReply{
// roots
buildBrowseReply([]*pb.GalaxyObject{obj(1, "Plant", true)}, []bool{true}, 7),
// one expand: one child
buildBrowseReply([]*pb.GalaxyObject{obj(2, "Mixer", false)}, []bool{false}, 7),
},
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
ctx := context.Background()
roots, err := client.Browse(ctx, nil)
if err != nil {
t.Fatalf("Browse: %v", err)
}
var wg sync.WaitGroup
errs := make(chan error, 10)
for i := 0; i < 10; i++ {
wg.Add(1)
go func() {
defer wg.Done()
errs <- roots[0].Expand(ctx)
}()
}
wg.Wait()
close(errs)
for err := range errs {
if err != nil {
t.Fatalf("concurrent Expand: %v", err)
}
}
if !roots[0].IsExpanded() {
t.Fatal("IsExpanded() = false after 10 concurrent expands")
}
if got, want := len(roots[0].Children()), 1; got != want {
t.Fatalf("len(children) = %d, want %d", got, want)
}
// 1 roots fetch + exactly 1 expand fetch.
if got, want := len(fake.browseChildrenCalls), 2; got != want {
t.Fatalf("RPC count = %d, want %d", got, want)
}
}
func TestGalaxyBrowseChildrenRejectsRepeatedPageToken(t *testing.T) {
// Build a reply that carries a non-empty NextPageToken so browseChildrenInner
// will request a second page. Queue the same reply twice so the second response
// returns the same page token, triggering the duplicate-token guard.
page := buildBrowseReply(
[]*pb.GalaxyObject{obj(1, "Plant", true)},
[]bool{true},
1,
)
page.NextPageToken = "1:abc:1"
fake := &fakeGalaxyServer{
browseChildrenReplies: []*pb.BrowseChildrenReply{page, page},
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()
_, err := client.Browse(context.Background(), nil)
if err == nil {
t.Fatal("Browse: error = nil, want repeated-page-token error")
}
var gwErr *GatewayError
if !errors.As(err, &gwErr) {
t.Fatalf("error type = %T, want *GatewayError; err = %v", err, err)
}
}
+26
@@ -34,6 +34,32 @@ type Options struct {
TransportCredentials credentials.TransportCredentials
// DialOptions are appended to the gRPC dial options after the defaults.
DialOptions []grpc.DialOption
// RequireCertificateValidation forces TLS certificate verification even when
// no CACertFile is pinned. Default false: the gateway's self-signed cert is
// accepted without verification (internal-tool posture).
RequireCertificateValidation bool
}
// BrowseChildrenOptions configures lazy Galaxy hierarchy walks performed by
// (*GalaxyClient).Browse and (*LazyBrowseNode).Expand. All fields are optional;
// the zero value matches the dashboard default (no filters, all attributes per
// the server default).
type BrowseChildrenOptions struct {
// CategoryIds restricts results to the listed Galaxy category ids when set.
CategoryIds []int32
// TemplateChainContains restricts results to objects whose template chain
// contains any of the listed template tag names.
TemplateChainContains []string
// TagNameGlob restricts results to objects whose tag name matches the glob
// pattern when non-empty.
TagNameGlob string
// IncludeAttributes overrides the server default for attribute inclusion when
// non-nil. The pointer form mirrors the proto's optional field.
IncludeAttributes *bool
// AlarmBearingOnly limits results to alarm-bearing objects when true.
AlarmBearingOnly bool
// HistorizedOnly limits results to historized objects when true.
HistorizedOnly bool
}
// RedactedAPIKey returns a display-safe representation of the configured API
+31
@@ -392,6 +392,9 @@ func (s *Session) UnsubscribeBulk(ctx context.Context, serverHandle int32, itemH
// Per-entry failures appear as BulkWriteResult entries with WasSuccessful=false; the call
// never returns an error for per-entry MXAccess failures (it returns an error only for
// protocol-level failures or transport errors).
//
// A non-nil but empty entries slice is treated as a no-op and returns an empty result
// without a wire round-trip; pass nil to surface a clear "entries are required" error.
func (s *Session) WriteBulk(ctx context.Context, serverHandle int32, entries []*WriteBulkEntry) ([]*BulkWriteResult, error) {
if entries == nil {
return nil, errors.New("mxgateway: write bulk entries are required")
@@ -399,6 +402,9 @@ func (s *Session) WriteBulk(ctx context.Context, serverHandle int32, entries []*
if err := ensureBulkSize("write bulk entries", len(entries)); err != nil {
return nil, err
}
if len(entries) == 0 {
return []*BulkWriteResult{}, nil
}
reply, err := s.invokeCommand(ctx, &pb.MxCommand{
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE_BULK,
Payload: &pb.MxCommand_WriteBulk{
@@ -415,6 +421,9 @@ func (s *Session) WriteBulk(ctx context.Context, serverHandle int32, entries []*
}
// Write2Bulk invokes MXAccess Write2 (timestamped) for each entry inside one gateway command.
//
// A non-nil but empty entries slice is treated as a no-op and returns an empty result
// without a wire round-trip; pass nil to surface a clear "entries are required" error.
func (s *Session) Write2Bulk(ctx context.Context, serverHandle int32, entries []*Write2BulkEntry) ([]*BulkWriteResult, error) {
if entries == nil {
return nil, errors.New("mxgateway: write2 bulk entries are required")
@@ -422,6 +431,9 @@ func (s *Session) Write2Bulk(ctx context.Context, serverHandle int32, entries []
if err := ensureBulkSize("write2 bulk entries", len(entries)); err != nil {
return nil, err
}
if len(entries) == 0 {
return []*BulkWriteResult{}, nil
}
reply, err := s.invokeCommand(ctx, &pb.MxCommand{
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE2_BULK,
Payload: &pb.MxCommand_Write2Bulk{
@@ -439,6 +451,9 @@ func (s *Session) Write2Bulk(ctx context.Context, serverHandle int32, entries []
// WriteSecuredBulk invokes MXAccess WriteSecured for each entry. Credential-sensitive
// values must not be logged by callers; mirrors the single-item WriteSecured contract.
//
// A non-nil but empty entries slice is treated as a no-op and returns an empty result
// without a wire round-trip; pass nil to surface a clear "entries are required" error.
func (s *Session) WriteSecuredBulk(ctx context.Context, serverHandle int32, entries []*WriteSecuredBulkEntry) ([]*BulkWriteResult, error) {
if entries == nil {
return nil, errors.New("mxgateway: write-secured bulk entries are required")
@@ -446,6 +461,9 @@ func (s *Session) WriteSecuredBulk(ctx context.Context, serverHandle int32, entr
if err := ensureBulkSize("write-secured bulk entries", len(entries)); err != nil {
return nil, err
}
if len(entries) == 0 {
return []*BulkWriteResult{}, nil
}
reply, err := s.invokeCommand(ctx, &pb.MxCommand{
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE_SECURED_BULK,
Payload: &pb.MxCommand_WriteSecuredBulk{
@@ -462,6 +480,9 @@ func (s *Session) WriteSecuredBulk(ctx context.Context, serverHandle int32, entr
}
// WriteSecured2Bulk invokes MXAccess WriteSecured2 (timestamped) for each entry.
//
// A non-nil but empty entries slice is treated as a no-op and returns an empty result
// without a wire round-trip; pass nil to surface a clear "entries are required" error.
func (s *Session) WriteSecured2Bulk(ctx context.Context, serverHandle int32, entries []*WriteSecured2BulkEntry) ([]*BulkWriteResult, error) {
if entries == nil {
return nil, errors.New("mxgateway: write-secured2 bulk entries are required")
@@ -469,6 +490,9 @@ func (s *Session) WriteSecured2Bulk(ctx context.Context, serverHandle int32, ent
if err := ensureBulkSize("write-secured2 bulk entries", len(entries)); err != nil {
return nil, err
}
if len(entries) == 0 {
return []*BulkWriteResult{}, nil
}
reply, err := s.invokeCommand(ctx, &pb.MxCommand{
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE_SECURED2_BULK,
Payload: &pb.MxCommand_WriteSecured2Bulk{
@@ -492,6 +516,10 @@ func (s *Session) WriteSecured2Bulk(ctx context.Context, serverHandle int32, ent
// otherwise. timeout bounds the wait per tag in the snapshot case; pass zero to use the
// worker default. Per-tag failures (timeout, invalid tag) appear as BulkReadResult entries
// with WasSuccessful=false; the call never returns an error for per-tag MXAccess failures.
//
// A non-nil but empty tagAddresses slice is treated as a no-op and returns an empty
// result without a wire round-trip; pass nil to surface a clear "tag addresses are
// required" error.
func (s *Session) ReadBulk(ctx context.Context, serverHandle int32, tagAddresses []string, timeout time.Duration) ([]*BulkReadResult, error) {
if tagAddresses == nil {
return nil, errors.New("mxgateway: tag addresses are required")
@@ -499,6 +527,9 @@ func (s *Session) ReadBulk(ctx context.Context, serverHandle int32, tagAddresses
if err := ensureBulkSize("tag addresses", len(tagAddresses)); err != nil {
return nil, err
}
if len(tagAddresses) == 0 {
return []*BulkReadResult{}, nil
}
var timeoutMs uint32
if timeout > 0 {
ms := timeout.Milliseconds()
+17
@@ -112,6 +112,23 @@ Support:
- custom CA certificate file,
- server name override for test environments.
### Trust posture
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). To make that usable, TLS is **lenient by default**: when the channel is not
plaintext and no `caCertificatePath` is set, the client builds
`GrpcSslContexts.forClient().trustManager(InsecureTrustManagerFactory.INSTANCE)`
(grpc-netty-shaded), so the gateway's self-signed certificate is accepted without
verification.
To verify the gateway instead:
- set `caCertificatePath` to pin a CA (full verification against that root), or
- set `requireCertificateValidation` to `true` to verify against the JVM trust
store without pinning.
Pinning a CA always wins over the lenient default.
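
The precedence described above can be sketched as a small decision function. This is an illustration only: `TrustPosture`, `Mode`, and `resolve` are hypothetical names that do not come from the client, which builds an SSL context rather than returning an enum. Treating an empty path the same as an unset one is also an assumption.

```java
// Hypothetical sketch of the trust-posture precedence; not the client's real code.
final class TrustPosture {
    enum Mode { PLAINTEXT, PINNED_CA, SYSTEM_TRUST, LENIENT }

    private TrustPosture() {}

    /**
     * Resolves the verification mode from the three options discussed above.
     * On a TLS channel, a pinned CA takes precedence over the other settings.
     */
    static Mode resolve(boolean plaintext, String caCertificatePath,
                        boolean requireCertificateValidation) {
        if (plaintext) {
            return Mode.PLAINTEXT;      // no TLS, so there is nothing to verify
        }
        if (caCertificatePath != null && !caCertificatePath.isEmpty()) {
            return Mode.PINNED_CA;      // full verification against that root
        }
        if (requireCertificateValidation) {
            return Mode.SYSTEM_TRUST;   // verify against the JVM trust store
        }
        return Mode.LENIENT;            // accept the gateway's self-signed cert
    }
}
```

Keeping the pinned-CA check ahead of the `requireCertificateValidation` check is what makes pinning win regardless of the other flag.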
## Streaming
Support both:
+96 -2
@@ -57,6 +57,16 @@ try (MxGatewayClient client = MxGatewayClient.connect(options);
}
```
The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`plaintext(false)`) with
no `caCertificatePath` accepts whatever certificate the gateway presents (via
grpc-netty-shaded's `InsecureTrustManagerFactory`). To verify instead, set
`caCertificatePath` to pin a CA, or set `requireCertificateValidation(true)` to
verify against the JVM trust store without pinning. Use `serverNameOverride` /
`--server-name-override` when the dialed host differs from the certificate SAN.
See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
Use `rawBlockingStub`, `rawFutureStub`, `rawAsyncStub`, `openSessionRaw`,
`closeSessionRaw`, `invoke`, and raw session helper methods when tests need the
underlying protobuf messages. `MxGatewayCommandException` and
@@ -116,6 +126,59 @@ gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-deploy-time --endpoint localh
gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-discover --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --json"
```
### Browsing lazily
For UI trees or OPC UA bridges, use `browseChildren` to walk one level at a
time instead of loading the full hierarchy with `discoverHierarchy`. Pass a
default request for root objects; subsequent calls set `parentGobjectId`,
`parentTagName`, or `parentContainedPath`. Filter fields match
`DiscoverHierarchy`. Each response pairs `getChildrenList()` with
`getChildHasChildrenList()` so you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics. This snippet documents the API as it appears once
the Java client is regenerated on the Windows host.
```java
BrowseChildrenReply reply = galaxy.browseChildren(
BrowseChildrenRequest.newBuilder().build());
List<GalaxyObject> children = reply.getChildrenList();
List<Boolean> hasChildren = reply.getChildHasChildrenList();
for (int i = 0; i < children.size(); i++) {
System.out.printf("%s expand=%b%n", children.get(i).getTagName(), hasChildren.get(i));
}
```
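
A single reply may not carry every sibling of a large level. The Go client tests in this change show a reply carrying a next-page token that the following request echoes back, plus a guard that rejects a repeated token. The loop below is a minimal sketch of that pattern, assuming a hypothetical `Page` holder and `fetch` function in place of the generated request and reply types; it is not the Java client's implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: Page and fetch stand in for the reply type and the RPC call.
final class PagedBrowse {
    /** One page of child names plus the token for the next page ("" when done). */
    static final class Page {
        final List<String> children;
        final String nextPageToken;

        Page(List<String> children, String nextPageToken) {
            this.children = children;
            this.nextPageToken = nextPageToken;
        }
    }

    private PagedBrowse() {}

    /**
     * Collects every page, starting from an empty token, and fails fast when the
     * server returns a token that was already seen (which would otherwise loop).
     */
    static List<String> collectAll(Function<String, Page> fetch) {
        List<String> all = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String token = "";
        while (true) {
            Page page = fetch.apply(token);
            all.addAll(page.children);
            String next = page.nextPageToken;
            if (next == null || next.isEmpty()) {
                return all; // last page reached
            }
            if (!seen.add(next)) {
                throw new IllegalStateException("repeated page token: " + next);
            }
            token = next;
        }
    }
}
```

Tracking every token seen, rather than only the previous one, also catches longer cycles at the cost of a small set per walk.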
#### High-level walker
For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:
```java
MxGatewayClientOptions options = MxGatewayClientOptions.builder()
.endpoint("localhost:5000")
.apiKey(System.getenv("MXGATEWAY_API_KEY"))
.plaintext(true)
.build();
try (GalaxyRepositoryClient galaxy = GalaxyRepositoryClient.connect(options)) {
List<LazyBrowseNode> roots = galaxy.browse();
for (LazyBrowseNode root : roots) {
if (root.hasChildrenHint()) {
root.expand();
}
for (LazyBrowseNode child : root.getChildren()) {
String kind = child.hasChildrenHint() ? "has children" : "leaf";
System.out.println(child.getObject().getTagName() + " (" + kind + ")");
}
}
}
```
`expand` is idempotent: calling it twice fires only one RPC, and it is safe
under concurrent callers. To refresh after a Galaxy redeploy, call `browse`
again from the root.
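
One way to get this expand-once behaviour is double-checked locking around the fetch. The class below is a sketch under that assumption: `ExpandOnceNode` and its `fetchChildren` supplier are hypothetical stand-ins, and the real `LazyBrowseNode` may be implemented differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of an expand-once node; not the client's LazyBrowseNode.
final class ExpandOnceNode {
    private final Supplier<List<String>> fetchChildren; // stands in for the RPC
    private final Object lock = new Object();
    private volatile List<String> children; // null until a fetch succeeds

    ExpandOnceNode(Supplier<List<String>> fetchChildren) {
        this.fetchChildren = fetchChildren;
    }

    boolean isExpanded() {
        return children != null;
    }

    /** Fetches children on the first successful call; later calls do nothing. */
    void expand() {
        if (children != null) {
            return; // fast path: already expanded
        }
        synchronized (lock) {
            if (children == null) {
                // If the fetch throws, children stays null so a retry is possible.
                children = Collections.unmodifiableList(
                        new ArrayList<>(fetchChildren.get()));
            }
        }
    }

    List<String> getChildren() {
        List<String> snapshot = children;
        return snapshot == null ? Collections.<String>emptyList() : snapshot;
    }
}
```

Concurrent callers block on the lock while the first fetch runs and then see the populated list, so only one fetch is issued. Holding a lock across a network call is a simplification that keeps the sketch short.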
### Watching deploy events
`GalaxyRepository.WatchDeployEvents` is a server-streaming RPC: the gateway
@@ -179,8 +242,8 @@ gradle :zb-mom-ww-mxgateway-cli:run --args="add-item --endpoint localhost:5000 -
gradle :zb-mom-ww-mxgateway-cli:run --args="advise --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --server-handle 1 --item-handle 1 --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="write --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --server-handle 1 --item-handle 1 --type int32 --value 123 --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="stream-events --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --limit 1 --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="stream-alarms --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --limit 1 --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="acknowledge-alarm --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --alarm-reference \"\\Galaxy\Area001.Pump001.PumpFault\" --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="stream-alarms --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --filter-prefix Galaxy --limit 1 --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="acknowledge-alarm --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --reference \"\\Galaxy\Area001.Pump001.PumpFault\" --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="smoke --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --item TestObject.TestInt --json"
```
@@ -229,6 +292,37 @@ $env:MXGATEWAY_TEST_ITEM = 'TestObject.TestInt'
gradle :zb-mom-ww-mxgateway-cli:run --args="smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json"
```
## Installing from the Gitea Maven repository
The client publishes to the internal Gitea Maven repository at
`https://gitea.dohertylan.com/api/packages/dohertj2/maven`.
In your consumer project's `build.gradle`:
````groovy
repositories {
maven {
url 'https://gitea.dohertylan.com/api/packages/dohertj2/maven'
credentials {
username = System.getenv('GITEA_USERNAME')
password = System.getenv('GITEA_TOKEN')
}
}
}
dependencies {
implementation 'com.zb.mom.ww.mxgateway:zb-mom-ww-mxgateway-client:0.1.0'
}
````
To publish a new version from this repo:
````bash
export GITEA_USERNAME=dohertj2
export GITEA_TOKEN=<your-gitea-token>
gradle :zb-mom-ww-mxgateway-client:publish
````
## Related Documentation
- [Client Packaging](../../docs/ClientPackaging.md)
+40
@@ -37,4 +37,44 @@ subprojects {
testRuntimeOnly 'org.junit.platform:junit-platform-launcher'
}
}
pluginManager.withPlugin('maven-publish') {
publishing {
publications {
maven(MavenPublication) {
from components.java
pom {
url = 'https://gitea.dohertylan.com/dohertj2/mxaccessgw'
description = 'MxAccessGateway Java client'
scm {
url = 'https://gitea.dohertylan.com/dohertj2/mxaccessgw'
connection = 'scm:git:https://gitea.dohertylan.com/dohertj2/mxaccessgw.git'
}
developers {
developer {
id = 'dohertj2'
name = 'Joseph Doherty'
}
}
licenses {
license {
name = 'Proprietary'
distribution = 'repo'
}
}
}
}
}
repositories {
maven {
name = 'GiteaPackages'
url = 'https://gitea.dohertylan.com/api/packages/dohertj2/maven'
credentials {
username = System.getenv('GITEA_USERNAME') ?: ''
password = System.getenv('GITEA_TOKEN') ?: ''
}
}
}
}
}
}
+4
@@ -9,6 +9,10 @@ pluginManagement {
}
}
plugins {
id 'org.gradle.toolchains.foojay-resolver-convention' version '1.0.0'
}
dependencyResolutionManagement {
repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
repositories {
@@ -142,6 +142,37 @@ public final class GalaxyRepositoryGrpc {
return getWatchDeployEventsMethod;
}
private static volatile io.grpc.MethodDescriptor<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest,
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> getBrowseChildrenMethod;
@io.grpc.stub.annotations.RpcMethod(
fullMethodName = SERVICE_NAME + '/' + "BrowseChildren",
requestType = galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest.class,
responseType = galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply.class,
methodType = io.grpc.MethodDescriptor.MethodType.UNARY)
public static io.grpc.MethodDescriptor<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest,
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> getBrowseChildrenMethod() {
io.grpc.MethodDescriptor<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest, galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> getBrowseChildrenMethod;
if ((getBrowseChildrenMethod = GalaxyRepositoryGrpc.getBrowseChildrenMethod) == null) {
synchronized (GalaxyRepositoryGrpc.class) {
if ((getBrowseChildrenMethod = GalaxyRepositoryGrpc.getBrowseChildrenMethod) == null) {
GalaxyRepositoryGrpc.getBrowseChildrenMethod = getBrowseChildrenMethod =
io.grpc.MethodDescriptor.<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest, galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply>newBuilder()
.setType(io.grpc.MethodDescriptor.MethodType.UNARY)
.setFullMethodName(generateFullMethodName(SERVICE_NAME, "BrowseChildren"))
.setSampledToLocalTracing(true)
.setRequestMarshaller(io.grpc.protobuf.ProtoUtils.marshaller(
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest.getDefaultInstance()))
.setResponseMarshaller(io.grpc.protobuf.ProtoUtils.marshaller(
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply.getDefaultInstance()))
.setSchemaDescriptor(new GalaxyRepositoryMethodDescriptorSupplier("BrowseChildren"))
.build();
}
}
}
return getBrowseChildrenMethod;
}
/**
* Creates a new async stub that supports all call types for the service
*/
@@ -246,6 +277,19 @@ public final class GalaxyRepositoryGrpc {
io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent> responseObserver) {
io.grpc.stub.ServerCalls.asyncUnimplementedUnaryCall(getWatchDeployEventsMethod(), responseObserver);
}
/**
* <pre>
* Returns the direct children of a parent object (or the root objects when
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
* one level at a time instead of paging the full hierarchy. Filters mirror
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
* </pre>
*/
default void browseChildren(galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request,
io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> responseObserver) {
io.grpc.stub.ServerCalls.asyncUnimplementedUnaryCall(getBrowseChildrenMethod(), responseObserver);
}
}
/**
@@ -326,6 +370,20 @@ public final class GalaxyRepositoryGrpc {
io.grpc.stub.ClientCalls.asyncServerStreamingCall(
getChannel().newCall(getWatchDeployEventsMethod(), getCallOptions()), request, responseObserver);
}
/**
* <pre>
* Returns the direct children of a parent object (or the root objects when
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
* one level at a time instead of paging the full hierarchy. Filters mirror
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
* </pre>
*/
public void browseChildren(galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request,
io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> responseObserver) {
io.grpc.stub.ClientCalls.asyncUnaryCall(
getChannel().newCall(getBrowseChildrenMethod(), getCallOptions()), request, responseObserver);
}
}
/**
@@ -387,6 +445,19 @@ public final class GalaxyRepositoryGrpc {
return io.grpc.stub.ClientCalls.blockingV2ServerStreamingCall(
getChannel(), getWatchDeployEventsMethod(), getCallOptions(), request);
}
/**
* <pre>
* Returns the direct children of a parent object (or the root objects when
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
* one level at a time instead of paging the full hierarchy. Filters mirror
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
* </pre>
*/
public galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply browseChildren(galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request) throws io.grpc.StatusException {
return io.grpc.stub.ClientCalls.blockingV2UnaryCall(
getChannel(), getBrowseChildrenMethod(), getCallOptions(), request);
}
}
/**
@@ -447,6 +518,19 @@ public final class GalaxyRepositoryGrpc {
return io.grpc.stub.ClientCalls.blockingServerStreamingCall(
getChannel(), getWatchDeployEventsMethod(), getCallOptions(), request);
}
/**
* <pre>
* Returns the direct children of a parent object (or the root objects when
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
* one level at a time instead of paging the full hierarchy. Filters mirror
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
* </pre>
*/
public galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply browseChildren(galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request) {
return io.grpc.stub.ClientCalls.blockingUnaryCall(
getChannel(), getBrowseChildrenMethod(), getCallOptions(), request);
}
}
/**
@@ -494,12 +578,27 @@ public final class GalaxyRepositoryGrpc {
return io.grpc.stub.ClientCalls.futureUnaryCall(
getChannel().newCall(getDiscoverHierarchyMethod(), getCallOptions()), request);
}
/**
* <pre>
* Returns the direct children of a parent object (or the root objects when
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
* one level at a time instead of paging the full hierarchy. Filters mirror
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
* </pre>
*/
public com.google.common.util.concurrent.ListenableFuture<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> browseChildren(
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request) {
return io.grpc.stub.ClientCalls.futureUnaryCall(
getChannel().newCall(getBrowseChildrenMethod(), getCallOptions()), request);
}
}
private static final int METHODID_TEST_CONNECTION = 0;
private static final int METHODID_GET_LAST_DEPLOY_TIME = 1;
private static final int METHODID_DISCOVER_HIERARCHY = 2;
private static final int METHODID_WATCH_DEPLOY_EVENTS = 3;
private static final int METHODID_BROWSE_CHILDREN = 4;
private static final class MethodHandlers<Req, Resp> implements
io.grpc.stub.ServerCalls.UnaryMethod<Req, Resp>,
@@ -534,6 +633,10 @@ public final class GalaxyRepositoryGrpc {
serviceImpl.watchDeployEvents((galaxy_repository.v1.GalaxyRepositoryOuterClass.WatchDeployEventsRequest) request,
(io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent>) responseObserver);
break;
case METHODID_BROWSE_CHILDREN:
serviceImpl.browseChildren((galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest) request,
(io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply>) responseObserver);
break;
default:
throw new AssertionError();
}
@@ -580,6 +683,13 @@ public final class GalaxyRepositoryGrpc {
galaxy_repository.v1.GalaxyRepositoryOuterClass.WatchDeployEventsRequest,
galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent>(
service, METHODID_WATCH_DEPLOY_EVENTS)))
.addMethod(
getBrowseChildrenMethod(),
io.grpc.stub.ServerCalls.asyncUnaryCall(
new MethodHandlers<
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest,
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply>(
service, METHODID_BROWSE_CHILDREN)))
.build();
}
@@ -632,6 +742,7 @@ public final class GalaxyRepositoryGrpc {
.addMethod(getGetLastDeployTimeMethod())
.addMethod(getDiscoverHierarchyMethod())
.addMethod(getWatchDeployEventsMethod())
.addMethod(getBrowseChildrenMethod())
.build();
}
}
@@ -33,6 +33,7 @@ import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;
import mxaccess_gateway.v1.MxaccessGateway.AcknowledgeAlarmReply;
import mxaccess_gateway.v1.MxaccessGateway.AcknowledgeAlarmRequest;
import mxaccess_gateway.v1.MxaccessGateway.ActiveAlarmSnapshot;
@@ -119,7 +120,7 @@ public final class MxGatewayCli implements Callable<Integer> {
return 0;
}
private static CommandLine commandLine(MxGatewayCliClientFactory clientFactory) {
static CommandLine commandLine(MxGatewayCliClientFactory clientFactory) {
CommandLine commandLine = new CommandLine(new MxGatewayCli(clientFactory));
commandLine.addSubcommand("version", new VersionCommand());
commandLine.addSubcommand("open-session", new OpenSessionCommand(clientFactory));
@@ -154,6 +155,120 @@ public final class MxGatewayCli implements Callable<Integer> {
/** Sentinel queued by {@code stream-alarms} to mark a clean end of the alarm feed. */
private static final Object ALARM_FEED_END = new Object();
/**
* Tokenises a single batch-mode stdin line into the argv that the inner
* {@link CommandLine} should execute. Honours single-quoted, double-quoted,
* and backslash-escaped runs so values that contain spaces (e.g.
* {@code --comment "needs verification"}) survive intact. The old
* implementation used {@code split("\\s+")}, which shredded any quoted
* argument mid-string (Client.Java-034).
*
* <p>Rules (a small POSIX-like shell tokenizer; no variable expansion,
* command substitution, globbing, or backtick handling):
*
* <ul>
* <li>Outside quotes, runs of whitespace separate tokens.</li>
* <li>{@code "..."} groups a sequence into one token; the surrounding
* quotes are removed. Inside double quotes a backslash escapes
* {@code \\}, {@code "}, and a literal newline; other characters
* are taken literally (so {@code \n} is the two characters
* backslash-n).</li>
* <li>{@code '...'} groups a sequence into one token; the surrounding
* quotes are removed. Inside single quotes nothing is escaped:
* the run is literal until the matching single quote.</li>
* <li>Outside quotes, backslash escapes the next character (including
* whitespace, so {@code needs\ verification} is one token).</li>
* <li>An unterminated quote or a trailing backslash throws
* {@link IllegalArgumentException} so the batch loop surfaces it
* as a JSON error instead of silently emitting wrong args.</li>
* </ul>
*
* <p>Empty input (or input that contains only whitespace) returns an
* empty array so callers can skip the line.
*/
static String[] tokenizeBatchLine(String line) {
List<String> tokens = new ArrayList<>();
StringBuilder current = new StringBuilder();
boolean inToken = false;
// 0 = outside, 1 = inside single quotes, 2 = inside double quotes
int quoteMode = 0;
int length = line.length();
for (int i = 0; i < length; i++) {
char c = line.charAt(i);
if (quoteMode == 1) {
if (c == '\'') {
quoteMode = 0;
} else {
current.append(c);
}
continue;
}
if (quoteMode == 2) {
if (c == '\\') {
if (i + 1 >= length) {
throw new IllegalArgumentException(
"batch tokenizer: trailing backslash inside double-quoted string");
}
char next = line.charAt(i + 1);
if (next == '\\' || next == '"' || next == '\n') {
current.append(next);
i++;
} else {
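// POSIX treats a backslash inside double quotes as literal
// unless it precedes \, ", $, `, or newline. This tokenizer
// does no expansion, so only \, ", and newline are escapable;
// any other backslash is kept as-is.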
current.append(c);
}
continue;
}
if (c == '"') {
quoteMode = 0;
continue;
}
current.append(c);
continue;
}
// Outside any quotes.
if (c == '\'') {
quoteMode = 1;
inToken = true;
continue;
}
if (c == '"') {
quoteMode = 2;
inToken = true;
continue;
}
if (c == '\\') {
if (i + 1 >= length) {
throw new IllegalArgumentException(
"batch tokenizer: trailing backslash outside quotes");
}
current.append(line.charAt(i + 1));
i++;
inToken = true;
continue;
}
if (Character.isWhitespace(c)) {
if (inToken) {
tokens.add(current.toString());
current.setLength(0);
inToken = false;
}
continue;
}
current.append(c);
inToken = true;
}
if (quoteMode != 0) {
throw new IllegalArgumentException(
"batch tokenizer: unterminated " + (quoteMode == 1 ? "single" : "double") + " quote");
}
if (inToken) {
tokens.add(current.toString());
}
return tokens.toArray(new String[0]);
}
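For context on the rules documented above, here is a minimal standalone sketch of the whitespace-only split that the tokenizer replaces. The class name `NaiveSplitDemo` and its helper are hypothetical and not part of this change; the sketch only shows how a quoted value ends up as two tokens that still carry their quote characters.

```java
import java.util.Arrays;

final class NaiveSplitDemo {
    // The pre-fix approach: split on runs of whitespace and ignore quoting.
    static String[] naiveSplit(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String line = "acknowledge-alarm --comment \"needs verification\"";
        // Yields four tokens; the quoted comment is cut at its embedded space.
        System.out.println(Arrays.toString(naiveSplit(line)));
    }
}
```

A quote-aware tokenizer such as `tokenizeBatchLine` is expected to return three tokens for the same line, with the quotes removed from the last one.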
/**
* Reads one CLI invocation per stdin line, executes each via a fresh
* {@link CommandLine}, and writes {@value #BATCH_EOR} to stdout after
@@ -183,8 +298,8 @@ public final class MxGatewayCli implements Callable<Integer> {
if (line.isEmpty()) {
break;
}
String[] args = line.trim().split("\\s+");
if (args.length == 0 || (args.length == 1 && args[0].isEmpty())) {
String[] args = tokenizeBatchLine(line);
if (args.length == 0) {
continue;
}
StringWriter cmdOut = new StringWriter();
@@ -1079,11 +1194,29 @@ public final class MxGatewayCli implements Callable<Integer> {
StreamAlarmsRequest request = StreamAlarmsRequest.newBuilder()
.setAlarmFilterPrefix(filterPrefix)
.build();
// Client.Java-033: fail fast on overflow. A bare
// queue.offer(value) silently drops messages past capacity,
// which violates the JavaStyleGuide "do not drop events"
// contract and lets the CLI exit 0 on a truncated feed.
// Mirrors MxEventStream's overflow branch: detect a failed
// offer, cancel the subscription, drain the buffer, then
// queue an explicit overflow exception followed by the END
// sentinel so the drain loop surfaces a non-zero exit.
AtomicReference<MxGatewayAlarmFeedSubscription> subscriptionRef = new AtomicReference<>();
MxGatewayAlarmFeedSubscription subscription =
client.streamAlarms(request, new StreamObserver<>() {
@Override
public void onNext(AlarmFeedMessage value) {
queue.offer(value);
if (!queue.offer(value)) {
MxGatewayAlarmFeedSubscription sub = subscriptionRef.get();
if (sub != null) {
sub.cancel();
}
queue.clear();
queue.offer(new IllegalStateException(
"stream-alarms queue overflowed (capacity 1024); consumer too slow"));
queue.offer(ALARM_FEED_END);
}
}
@Override
@@ -1096,6 +1229,7 @@ public final class MxGatewayCli implements Callable<Integer> {
queue.offer(ALARM_FEED_END);
}
});
subscriptionRef.set(subscription);
try {
int count = 0;
while (true) {
@@ -225,6 +225,89 @@ final class MxGatewayCliTests {
assertTrue(run.errors().contains("--reference"), run.errors());
}
@Test
void readmeDocumentedStreamAlarmsExampleParsesCleanly() {
// Client.Java-032 regression: the README's stream-alarms example
// (clients/java/README.md:182) must round-trip through picocli's
// parser without a parse error. Before the fix, the example used
// a non-existent --session-id option and picocli failed at parse
// time. This test pins the exact tokens documented in the README.
String[] args = {
"stream-alarms",
"--endpoint",
"localhost:5000",
"--api-key-env",
"MXGATEWAY_API_KEY",
"--plaintext",
"--filter-prefix",
"Galaxy",
"--limit",
"1",
"--json"
};
assertReadmeExampleParses(args);
}
@Test
void readmeDocumentedAcknowledgeAlarmExampleParsesCleanly() {
// Client.Java-032 regression: the README's acknowledge-alarm
// example (clients/java/README.md:183) must parse without error.
// Before the fix it used --session-id (no such option) and
// --alarm-reference (the real option is --reference), so picocli
// rejected the invocation immediately.
String[] args = {
"acknowledge-alarm",
"--endpoint",
"localhost:5000",
"--api-key-env",
"MXGATEWAY_API_KEY",
"--plaintext",
"--reference",
"\\Galaxy\\Area001.Pump001.PumpFault",
"--json"
};
assertReadmeExampleParses(args);
}
/**
* Parses the given args through the production picocli {@link CommandLine}
* and asserts no parser error, no unknown option, and no missing required
* option. Does not execute the command body; only the option / subcommand
* parser is exercised, so no network call is made.
*/
private static void assertReadmeExampleParses(String[] args) {
picocli.CommandLine commandLine = MxGatewayCli.commandLine(new FakeClientFactory());
try {
commandLine.parseArgs(args);
} catch (picocli.CommandLine.ParameterException ex) {
throw new AssertionError(
"documented README invocation failed picocli parse: "
+ String.join(" ", args)
+ " -> "
+ ex.getMessage(),
ex);
}
}
@Test
void streamAlarmsCommandFailsFastOnQueueOverflow() {
// Client.Java-033 regression: the CLI's stream-alarms bounded queue
// used queue.offer(value), which silently dropped messages past
// capacity (1024). After the fix the CLI must surface the overflow
// as a non-zero exit (mirroring MxEventStream's fail-fast contract).
//
// The OverflowingFakeClient floods the gRPC observer with 2000
// messages synchronously, which exceeds the bounded 1024-element
// queue. The fix detects the failed offer, cancels the subscription,
// queues an overflow exception, and the drain loop surfaces it.
OverflowingFakeClientFactory factory = new OverflowingFakeClientFactory();
CliRun run = execute(factory, "stream-alarms", "--filter-prefix", "Flood");
assertFalse(run.exitCode() == 0,
"expected non-zero exit when the alarm queue overflows; got exit=" + run.exitCode()
+ " out=\n" + run.output() + "\nerr=\n" + run.errors());
}
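The fail-fast behaviour this test pins down can be illustrated in isolation. The sketch below is a simplified stand-in, not the CLI's code: `FailFastQueueDemo`, `fill`, and `END` are hypothetical names, and the loop stops at the first rejected offer, whereas the CLI cancels the subscription instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class FailFastQueueDemo {
    // Sentinel marking the end of the feed (hypothetical, mirrors the idea of ALARM_FEED_END).
    static final Object END = new Object();

    // Offers each message; on the first rejected offer, replaces the backlog
    // with an overflow exception followed by END and stops producing.
    // Assumes capacity >= 2 so both markers fit after the clear.
    static List<Object> fill(int capacity, int messages) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < messages; i++) {
            if (!queue.offer(i)) {
                queue.clear();
                queue.offer(new IllegalStateException(
                        "queue overflowed (capacity " + capacity + ")"));
                queue.offer(END);
                break;
            }
        }
        return new ArrayList<>(queue);
    }

    public static void main(String[] args) {
        // Six messages into a four-slot queue: the consumer sees only the
        // overflow exception and the END sentinel.
        System.out.println(fill(4, 6).size());
    }
}
```

A consumer that drains this queue meets the exception before any payload, so it can map the overflow to a non-zero exit rather than report a truncated feed as success.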
@Test
void batchCommandExecutesVersionAndEmitsEorMarker() {
CliRun run = executeBatch(new FakeClientFactory(), "version --json\n");
@@ -235,6 +318,68 @@ final class MxGatewayCliTests {
assertTrue(out.contains(MxGatewayCli.BATCH_EOR), out);
}
@Test
void batchCommandTokenisesDoubleQuotedArgumentWithEmbeddedSpaces() {
// Client.Java-034 regression: a real shell-style tokenizer must not
// shred `"needs verification"` into two arguments. Drives
// acknowledge-alarm through batch and asserts the captured --comment
// is the un-quoted string with the embedded space preserved.
FakeClientFactory factory = new FakeClientFactory();
String line = "acknowledge-alarm --reference Tank01.Level.HiHi --comment \"needs verification\" --operator op1\n";
CliRun run = executeBatch(factory, line);
assertEquals(0, run.exitCode());
assertEquals("needs verification", factory.client.lastAcknowledgeAlarmRequest.getComment());
assertEquals("op1", factory.client.lastAcknowledgeAlarmRequest.getOperatorUser());
assertEquals(
"Tank01.Level.HiHi", factory.client.lastAcknowledgeAlarmRequest.getAlarmFullReference());
}
@Test
void batchCommandTokenisesSingleQuotedArgumentWithEmbeddedSpaces() {
FakeClientFactory factory = new FakeClientFactory();
String line =
"acknowledge-alarm --reference Tank01.Level.HiHi --comment 'needs verification' --operator op1\n";
CliRun run = executeBatch(factory, line);
assertEquals(0, run.exitCode());
assertEquals("needs verification", factory.client.lastAcknowledgeAlarmRequest.getComment());
}
@Test
void batchCommandTokenisesBackslashEscapedSpaceOutsideQuotes() {
FakeClientFactory factory = new FakeClientFactory();
String line =
"acknowledge-alarm --reference Tank01.Level.HiHi --comment needs\\ verification\n";
CliRun run = executeBatch(factory, line);
assertEquals(0, run.exitCode());
assertEquals("needs verification", factory.client.lastAcknowledgeAlarmRequest.getComment());
}
@Test
void batchCommandPreservesEmptyQuotedArgument() {
FakeClientFactory factory = new FakeClientFactory();
String line = "acknowledge-alarm --reference Tank01.Level.HiHi --comment \"\"\n";
CliRun run = executeBatch(factory, line);
assertEquals(0, run.exitCode());
assertEquals("", factory.client.lastAcknowledgeAlarmRequest.getComment());
}
@Test
void batchCommandSupportsBackslashEscapedQuoteInsideDoubleQuotes() {
// `--comment "with \"inner\" quote"` should round-trip the inner
// double-quote into the comment string.
FakeClientFactory factory = new FakeClientFactory();
String line =
"acknowledge-alarm --reference Tank01.Level.HiHi --comment \"with \\\"inner\\\" quote\"\n";
CliRun run = executeBatch(factory, line);
assertEquals(0, run.exitCode());
assertEquals("with \"inner\" quote", factory.client.lastAcknowledgeAlarmRequest.getComment());
}
@Test
void batchCommandEmitsEorAfterFailedCommandAndContinues() {
// An unknown subcommand causes a picocli parse error (non-zero exit).
@@ -290,6 +435,77 @@ final class MxGatewayCliTests {
}
}
/**
* Factory whose fake client floods the {@code streamAlarms} observer with
* 2000 messages synchronously, exceeding the CLI's bounded 1024-element
* queue. Used by the Client.Java-033 fail-fast overflow regression.
*/
private static final class OverflowingFakeClientFactory implements MxGatewayCli.MxGatewayCliClientFactory {
@Override
public MxGatewayCli.MxGatewayCliClient connect(MxGatewayCli.CommonOptions options) {
return new OverflowingFakeClient(options.spec.commandLine().getOut());
}
}
private static final class OverflowingFakeClient implements MxGatewayCli.MxGatewayCliClient {
private final PrintWriter out;
OverflowingFakeClient(PrintWriter out) {
this.out = out;
}
@Override
public PrintWriter out() {
return out;
}
@Override
public OpenSessionReply openSession(OpenSessionRequest request) {
return OpenSessionReply.newBuilder().setSessionId("flood-session").setProtocolStatus(ok()).build();
}
@Override
public CloseSessionReply closeSession(CloseSessionRequest request) {
return CloseSessionReply.newBuilder()
.setSessionId(request.getSessionId())
.setFinalState(SessionState.SESSION_STATE_CLOSED)
.setProtocolStatus(ok())
.build();
}
@Override
public MxGatewayCli.MxGatewayCliSession session(String sessionId) {
throw new UnsupportedOperationException();
}
@Override
public AcknowledgeAlarmReply acknowledgeAlarm(AcknowledgeAlarmRequest request) {
throw new UnsupportedOperationException();
}
@Override
public MxGatewayAlarmFeedSubscription streamAlarms(
StreamAlarmsRequest request, StreamObserver<AlarmFeedMessage> observer) {
// Synchronously push 2000 messages to overflow the CLI's bounded
// 1024-element queue. The CLI must surface the overflow rather
// than silently dropping the trailing ~976 messages.
for (int i = 0; i < 2000; i++) {
observer.onNext(AlarmFeedMessage.newBuilder()
.setActiveAlarm(ActiveAlarmSnapshot.newBuilder()
.setAlarmFullReference("Flood." + i)
.setCurrentState(AlarmConditionState.ALARM_CONDITION_STATE_ACTIVE)
.setSeverity(700))
.build());
}
observer.onCompleted();
return new MxGatewayAlarmFeedSubscription();
}
@Override
public void close() {
}
}
private static final class FakeClient implements MxGatewayCli.MxGatewayCliClient {
private final PrintWriter out;
private final FakeSession session = new FakeSession();
@@ -1,6 +1,7 @@
plugins {
id 'java-library'
id 'com.google.protobuf'
id 'maven-publish'
}
dependencies {
@@ -30,6 +31,11 @@ sourceSets {
}
}
java {
withSourcesJar()
withJavadocJar()
}
protobuf {
protoc {
artifact = "com.google.protobuf:protoc:${protobufVersion}"
@@ -0,0 +1,105 @@
package com.zb.mom.ww.mxgateway.client;
import java.util.Collections;
import java.util.List;
/**
* Filters and shape options for {@link GalaxyRepositoryClient#browse(BrowseChildrenOptions)}.
* Mirror of the existing DiscoverHierarchy options for the lazy-browse path.
*
* <p>All filter fields are AND-combined server-side. Empty / unset fields disable
* that filter. The {@code includeAttributes} tri-state uses {@code null} to mean
* "let the server use its default"; non-{@code null} forwards the explicit flag.
*/
public final class BrowseChildrenOptions {
private final List<Integer> categoryIds;
private final List<String> templateChainContains;
private final String tagNameGlob;
private final Boolean includeAttributes;
private final boolean alarmBearingOnly;
private final boolean historizedOnly;
private BrowseChildrenOptions(Builder b) {
this.categoryIds = List.copyOf(b.categoryIds);
this.templateChainContains = List.copyOf(b.templateChainContains);
this.tagNameGlob = b.tagNameGlob;
this.includeAttributes = b.includeAttributes;
this.alarmBearingOnly = b.alarmBearingOnly;
this.historizedOnly = b.historizedOnly;
}
/** @return immutable list of category IDs to include; empty disables this filter. */
public List<Integer> getCategoryIds() { return categoryIds; }
/** @return immutable list of template names that must appear in each child's template chain. */
public List<String> getTemplateChainContains() { return templateChainContains; }
/** @return SQL-LIKE-style glob applied to {@code tag_name}; empty disables. */
public String getTagNameGlob() { return tagNameGlob; }
/** @return tri-state override for {@code include_attributes}; {@code null} keeps the server default. */
public Boolean getIncludeAttributes() { return includeAttributes; }
/** @return restrict to alarm-bearing objects. */
public boolean isAlarmBearingOnly() { return alarmBearingOnly; }
/** @return restrict to objects with at least one historized attribute. */
public boolean isHistorizedOnly() { return historizedOnly; }
/** @return a fresh builder. */
public static Builder builder() { return new Builder(); }
/** @return options with every filter disabled and {@code includeAttributes} unset. */
public static BrowseChildrenOptions empty() { return builder().build(); }
/** Fluent builder for {@link BrowseChildrenOptions}. */
public static final class Builder {
private List<Integer> categoryIds = Collections.emptyList();
private List<String> templateChainContains = Collections.emptyList();
private String tagNameGlob = "";
private Boolean includeAttributes = null;
private boolean alarmBearingOnly = false;
private boolean historizedOnly = false;
/** Sets the category-id filter. */
public Builder categoryIds(List<Integer> v) {
this.categoryIds = v == null ? Collections.emptyList() : v;
return this;
}
/** Sets the template-chain-contains filter. */
public Builder templateChainContains(List<String> v) {
this.templateChainContains = v == null ? Collections.emptyList() : v;
return this;
}
/** Sets the tag-name glob. */
public Builder tagNameGlob(String v) {
this.tagNameGlob = v == null ? "" : v;
return this;
}
/** Sets the tri-state {@code includeAttributes} override; {@code null} keeps the server default. */
public Builder includeAttributes(Boolean v) {
this.includeAttributes = v;
return this;
}
/** Toggles the alarm-bearing-only filter. */
public Builder alarmBearingOnly(boolean v) {
this.alarmBearingOnly = v;
return this;
}
/** Toggles the historized-only filter. */
public Builder historizedOnly(boolean v) {
this.historizedOnly = v;
return this;
}
/** Builds the immutable options. */
public BrowseChildrenOptions build() {
return new BrowseChildrenOptions(this);
}
}
}
@@ -2,64 +2,19 @@ package com.zb.mom.ww.mxgateway.client;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.WatchDeployEventsRequest;
import io.grpc.stub.ClientCallStreamObserver;
import io.grpc.stub.ClientResponseObserver;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
/**
* Cancellable handle returned by the async {@code watchDeployEvents} variant.
* Mirrors {@link MxGatewayEventSubscription} but for the Galaxy Repository
* deploy-event stream.
*
* <p>All lifecycle / cancellation behaviour is inherited from
* {@link MxGatewayStreamSubscription} (Client.Java-036).
*/
public final class DeployEventSubscription implements AutoCloseable {
private final AtomicReference<ClientCallStreamObserver<WatchDeployEventsRequest>> requestStream =
new AtomicReference<>();
private final AtomicBoolean cancelled = new AtomicBoolean();
ClientResponseObserver<WatchDeployEventsRequest, DeployEvent> wrap(StreamObserver<DeployEvent> observer) {
return new ClientResponseObserver<>() {
@Override
public void beforeStart(ClientCallStreamObserver<WatchDeployEventsRequest> stream) {
requestStream.set(stream);
if (cancelled.get()) {
stream.cancel("client cancelled deploy event stream", null);
}
}
@Override
public void onNext(DeployEvent value) {
observer.onNext(value);
}
@Override
public void onError(Throwable error) {
observer.onError(error);
}
@Override
public void onCompleted() {
observer.onCompleted();
}
};
}
/**
* Cancels the underlying gRPC call. Safe to invoke before the call has
* started; cancellation is recorded and applied as soon as the stream
* attaches.
*/
public void cancel() {
cancelled.set(true);
ClientCallStreamObserver<WatchDeployEventsRequest> stream = requestStream.get();
if (stream != null) {
stream.cancel("client cancelled deploy event stream", null);
}
}
@Override
public void close() {
cancel();
public final class DeployEventSubscription
extends MxGatewayStreamSubscription<WatchDeployEventsRequest, DeployEvent> {
public DeployEventSubscription() {
super("client cancelled deploy event stream");
}
}
@@ -4,6 +4,8 @@ import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.MoreExecutors;
import galaxy_repository.v1.GalaxyRepositoryGrpc;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DiscoverHierarchyReply;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DiscoverHierarchyRequest;
@@ -37,6 +39,7 @@ import javax.net.ssl.SSLException;
*/
public final class GalaxyRepositoryClient implements AutoCloseable {
private static final int DISCOVER_HIERARCHY_PAGE_SIZE = 5000;
private static final int BROWSE_CHILDREN_PAGE_SIZE = 500;
private final ManagedChannel ownedChannel;
private final MxGatewayClientOptions options;
@@ -213,6 +216,98 @@ public final class GalaxyRepositoryClient implements AutoCloseable {
return discoverHierarchyPageAsync("", new java.util.ArrayList<>(), new java.util.HashSet<>());
}
/**
* Lazy-browse entry point: fetches the root layer of the Galaxy hierarchy.
* Each returned {@link LazyBrowseNode} can be expanded on demand via
* {@link LazyBrowseNode#expand()} to load its direct children.
*
* @return the root nodes (no parent selector) with default options
* @throws MxGatewayException on transport or protocol failure
*/
public List<LazyBrowseNode> browse() {
return browse(null);
}
/**
* Lazy-browse entry point with caller-supplied filters / shape.
*
* @param options filter and shape options; {@code null} means {@link BrowseChildrenOptions#empty()}
* @return the root nodes matching the options
* @throws MxGatewayException on transport or protocol failure
*/
public List<LazyBrowseNode> browse(BrowseChildrenOptions options) {
BrowseChildrenOptions effective = options == null ? BrowseChildrenOptions.empty() : options;
return browseChildrenInner(null, effective);
}
/**
* Issues a single {@code BrowseChildren} RPC and returns the raw reply.
* Callers wanting full control over pagination can drive the loop themselves.
*
* @param request the request to send
* @return the reply
* @throws MxGatewayException on transport or protocol failure
*/
public BrowseChildrenReply browseChildrenRaw(BrowseChildrenRequest request) {
try {
return rawBlockingStub().browseChildren(request);
} catch (RuntimeException error) {
if (error instanceof MxGatewayException) {
throw error;
}
throw MxGatewayErrors.fromGrpc("galaxy browse children", error);
}
}
/**
* Drives the BrowseChildren paging loop for a single parent (or roots when
* {@code parentGobjectId} is {@code null}). Detects repeated page tokens to
* avoid infinite loops on a buggy server.
*/
List<LazyBrowseNode> browseChildrenInner(Integer parentGobjectId, BrowseChildrenOptions options) {
java.util.ArrayList<LazyBrowseNode> nodes = new java.util.ArrayList<>();
java.util.HashSet<String> seenPageTokens = new java.util.HashSet<>();
String pageToken = "";
while (true) {
BrowseChildrenRequest.Builder builder = BrowseChildrenRequest.newBuilder()
.setPageSize(BROWSE_CHILDREN_PAGE_SIZE)
.setPageToken(pageToken)
.setAlarmBearingOnly(options.isAlarmBearingOnly())
.setHistorizedOnly(options.isHistorizedOnly());
if (parentGobjectId != null) {
builder.setParentGobjectId(parentGobjectId.intValue());
}
if (!options.getCategoryIds().isEmpty()) {
builder.addAllCategoryIds(options.getCategoryIds());
}
if (!options.getTemplateChainContains().isEmpty()) {
builder.addAllTemplateChainContains(options.getTemplateChainContains());
}
if (!options.getTagNameGlob().isEmpty()) {
builder.setTagNameGlob(options.getTagNameGlob());
}
if (options.getIncludeAttributes() != null) {
builder.setIncludeAttributes(options.getIncludeAttributes());
}
BrowseChildrenReply reply = browseChildrenRaw(builder.build());
for (int i = 0; i < reply.getChildrenCount(); i++) {
boolean hint = i < reply.getChildHasChildrenCount() && reply.getChildHasChildren(i);
nodes.add(new LazyBrowseNode(this, reply.getChildren(i), hint, options));
}
pageToken = reply.getNextPageToken();
if (pageToken == null || pageToken.isEmpty()) {
return nodes;
}
if (!seenPageTokens.add(pageToken)) {
throw new MxGatewayException(
"galaxy browse children returned repeated page token: " + pageToken);
}
}
}
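The paging loop above follows a common shape that can be sketched without gRPC. In the sketch below, `PagingLoopDemo`, `Page`, and `fetchAll` are hypothetical names, and the page fetcher is a plain function standing in for the `BrowseChildren` RPC; it is an illustration under those assumptions, not the client's implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

final class PagingLoopDemo {
    // Hypothetical page shape: items plus the token of the next page ("" when done).
    record Page(List<String> items, String nextToken) {}

    // Follows next-page tokens until the server returns an empty one, and
    // fails if a token repeats, which would otherwise loop forever.
    static List<String> fetchAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        Set<String> seenTokens = new HashSet<>();
        String token = "";
        while (true) {
            Page page = fetchPage.apply(token);
            all.addAll(page.items());
            token = page.nextToken();
            if (token == null || token.isEmpty()) {
                return all;
            }
            if (!seenTokens.add(token)) {
                throw new IllegalStateException("repeated page token: " + token);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Page> pages = Map.of(
                "", new Page(List.of("a", "b"), "p2"),
                "p2", new Page(List.of("c"), ""));
        System.out.println(fetchAll(pages::get));
    }
}
```

Tracking seen tokens in a set costs one entry per page and turns a misbehaving server into an immediate error instead of an unbounded loop.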
/**
* Subscribes to {@code WatchDeployEvents} via the async stub and consumes
* results through a blocking iterator. Closing the returned stream cancels
@@ -0,0 +1,150 @@
package com.zb.mom.ww.mxgateway.client;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.GalaxyObject;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.locks.ReentrantReadWriteLock;
/**
* One node in a lazy-loaded Galaxy browse tree. Holds the underlying
* {@link GalaxyObject} and exposes {@link #expand()} to fetch its direct
* children on demand. Expansion is one-shot: once it succeeds, further calls are no-ops.
* Pagination of large sibling sets is handled internally by the client.
*/
public final class LazyBrowseNode {
private final GalaxyRepositoryClient client;
private final GalaxyObject object;
private final boolean hasChildrenHint;
private final BrowseChildrenOptions options;
// expandLock gates the start of a new expand AND the publish of the in-flight
// future. Readers (getChildren / isExpanded) use a separate read-write lock so
// they never block on the gRPC call.
private final Object expandLock = new Object();
private CompletableFuture<Void> inFlight;
private final ReentrantReadWriteLock readWriteLock = new ReentrantReadWriteLock();
private List<LazyBrowseNode> children = Collections.emptyList();
private boolean isExpanded;
LazyBrowseNode(
GalaxyRepositoryClient client,
GalaxyObject object,
boolean hasChildrenHint,
BrowseChildrenOptions options) {
this.client = client;
this.object = object;
this.hasChildrenHint = hasChildrenHint;
this.options = options;
}
/** @return the underlying Galaxy object proto for this node. */
public GalaxyObject getObject() {
return object;
}
/** @return {@code true} when the server reports this node has at least one matching descendant. */
public boolean hasChildrenHint() {
return hasChildrenHint;
}
/** @return a snapshot of direct children loaded by {@link #expand()}; empty until then. */
public List<LazyBrowseNode> getChildren() {
readWriteLock.readLock().lock();
try {
return List.copyOf(children);
} finally {
readWriteLock.readLock().unlock();
}
}
/** @return {@code true} once an {@link #expand()} call has completed successfully. */
public boolean isExpanded() {
readWriteLock.readLock().lock();
try {
return isExpanded;
} finally {
readWriteLock.readLock().unlock();
}
}
/**
* Fetches direct children from the gateway and populates {@link #getChildren()}.
* Idempotent: subsequent calls are no-ops and do not issue a second RPC.
*
* <p>Concurrent callers coalesce onto a single in-flight RPC: the first caller
* (the "leader") issues the gRPC call, while any other thread that calls
* {@code expand()} during that window blocks on the leader's future and sees
* the same result (or the same exception). On failure the in-flight slot is
* cleared so a subsequent call can retry.
*
* <p>Readers ({@link #getChildren()} / {@link #isExpanded()}) take a separate
* read lock and are never blocked for the duration of the RPC.
*
* @throws MxGatewayException on transport or protocol failure
*/
public void expand() {
if (isExpanded()) {
return;
}
CompletableFuture<Void> future;
boolean iAmTheLeader;
synchronized (expandLock) {
if (isExpanded()) {
return;
}
if (inFlight != null) {
future = inFlight;
iAmTheLeader = false;
} else {
future = new CompletableFuture<>();
inFlight = future;
iAmTheLeader = true;
}
}
if (iAmTheLeader) {
try {
List<LazyBrowseNode> loaded =
client.browseChildrenInner(object.getGobjectId(), options);
readWriteLock.writeLock().lock();
try {
this.children = loaded;
this.isExpanded = true;
} finally {
readWriteLock.writeLock().unlock();
}
synchronized (expandLock) {
inFlight = null;
}
future.complete(null);
} catch (RuntimeException ex) {
synchronized (expandLock) {
inFlight = null;
}
future.completeExceptionally(ex);
throw ex;
}
} else {
try {
future.get();
} catch (InterruptedException ie) {
Thread.currentThread().interrupt();
throw new MxGatewayException("Interrupted waiting for browse-children expand.", ie);
} catch (ExecutionException ee) {
Throwable cause = ee.getCause();
if (cause instanceof MxGatewayException me) {
throw me;
}
if (cause instanceof RuntimeException re) {
throw re;
}
throw new MxGatewayException("BrowseChildren expand failed.", cause);
}
}
}
}
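The leader/follower coalescing that `expand()` describes can be sketched as a small generic helper. `CoalescingLoader` is a hypothetical class written for illustration; it uses volatile fields rather than a read-write lock and `join()` rather than `get()`, so it is a simplified variant of the idea, not the code in this change.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class CoalescingLoader<T> {
    private final Object lock = new Object();
    private final Supplier<T> loader;
    private CompletableFuture<T> inFlight; // guarded by lock
    private volatile T value;              // written before loaded is set
    private volatile boolean loaded;

    CoalescingLoader(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        if (loaded) {
            return value;
        }
        CompletableFuture<T> future;
        boolean leader;
        synchronized (lock) {
            if (loaded) {
                return value;
            }
            leader = inFlight == null;
            if (leader) {
                inFlight = new CompletableFuture<>();
            }
            future = inFlight;
        }
        if (!leader) {
            // Followers wait for the leader; a failure surfaces as CompletionException.
            return future.join();
        }
        try {
            T result = loader.get(); // the expensive call runs outside the lock
            value = result;
            loaded = true;
            future.complete(result);
            return result;
        } catch (RuntimeException ex) {
            future.completeExceptionally(ex);
            throw ex;
        } finally {
            // Clear the slot so a later call can retry after a failure.
            synchronized (lock) {
                inFlight = null;
            }
        }
    }
}
```

Running the load outside the lock keeps concurrent readers from blocking for the duration of the call, and clearing the in-flight slot in `finally` is what allows a retry after an error.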
@@ -1,10 +1,6 @@
package com.zb.mom.ww.mxgateway.client;
import io.grpc.stub.ClientCallStreamObserver;
import io.grpc.stub.ClientResponseObserver;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import mxaccess_gateway.v1.MxaccessGateway.ActiveAlarmSnapshot;
import mxaccess_gateway.v1.MxaccessGateway.QueryActiveAlarmsRequest;
@@ -15,53 +11,13 @@ import mxaccess_gateway.v1.MxaccessGateway.QueryActiveAlarmsRequest;
* {@link #cancel()} entry point that aborts the underlying gRPC call. The
* subscription also implements {@link AutoCloseable} so it can participate in
* try-with-resources blocks.
*
* <p>All lifecycle / cancellation behaviour is inherited from
* {@link MxGatewayStreamSubscription} (Client.Java-036).
*/
public final class MxGatewayActiveAlarmsSubscription implements AutoCloseable {
private final AtomicReference<ClientCallStreamObserver<QueryActiveAlarmsRequest>> requestStream = new AtomicReference<>();
private final AtomicBoolean cancelled = new AtomicBoolean();
ClientResponseObserver<QueryActiveAlarmsRequest, ActiveAlarmSnapshot> wrap(StreamObserver<ActiveAlarmSnapshot> observer) {
return new ClientResponseObserver<>() {
@Override
public void beforeStart(ClientCallStreamObserver<QueryActiveAlarmsRequest> stream) {
requestStream.set(stream);
if (cancelled.get()) {
stream.cancel("client cancelled active-alarms query", null);
}
}
@Override
public void onNext(ActiveAlarmSnapshot value) {
observer.onNext(value);
}
@Override
public void onError(Throwable error) {
observer.onError(error);
}
@Override
public void onCompleted() {
observer.onCompleted();
}
};
}
/**
* Cancels the underlying gRPC call. Safe to invoke before the call has
* started; cancellation is recorded and applied as soon as the stream
* attaches.
*/
public void cancel() {
cancelled.set(true);
ClientCallStreamObserver<QueryActiveAlarmsRequest> stream = requestStream.get();
if (stream != null) {
stream.cancel("client cancelled active-alarms query", null);
}
}
@Override
public void close() {
cancel();
public final class MxGatewayActiveAlarmsSubscription
extends MxGatewayStreamSubscription<QueryActiveAlarmsRequest, ActiveAlarmSnapshot> {
public MxGatewayActiveAlarmsSubscription() {
super("client cancelled active-alarms query");
}
}
@@ -1,10 +1,6 @@
package com.zb.mom.ww.mxgateway.client;
import io.grpc.stub.ClientCallStreamObserver;
import io.grpc.stub.ClientResponseObserver;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import mxaccess_gateway.v1.MxaccessGateway.AlarmFeedMessage;
import mxaccess_gateway.v1.MxaccessGateway.StreamAlarmsRequest;
@@ -15,53 +11,13 @@ import mxaccess_gateway.v1.MxaccessGateway.StreamAlarmsRequest;
* {@link #cancel()} entry point that aborts the underlying gRPC call. The
* subscription also implements {@link AutoCloseable} so it can participate in
* try-with-resources blocks.
*
* <p>All lifecycle / cancellation behaviour is inherited from
* {@link MxGatewayStreamSubscription} (Client.Java-036).
*/
public final class MxGatewayAlarmFeedSubscription implements AutoCloseable {
private final AtomicReference<ClientCallStreamObserver<StreamAlarmsRequest>> requestStream = new AtomicReference<>();
private final AtomicBoolean cancelled = new AtomicBoolean();
ClientResponseObserver<StreamAlarmsRequest, AlarmFeedMessage> wrap(StreamObserver<AlarmFeedMessage> observer) {
return new ClientResponseObserver<>() {
@Override
public void beforeStart(ClientCallStreamObserver<StreamAlarmsRequest> stream) {
requestStream.set(stream);
if (cancelled.get()) {
stream.cancel("client cancelled alarm feed", null);
}
}
@Override
public void onNext(AlarmFeedMessage value) {
observer.onNext(value);
}
@Override
public void onError(Throwable error) {
observer.onError(error);
}
@Override
public void onCompleted() {
observer.onCompleted();
}
};
}
/**
* Cancels the underlying gRPC call. Safe to invoke before the call has
* started; cancellation is recorded and applied as soon as the stream
* attaches.
*/
public void cancel() {
cancelled.set(true);
ClientCallStreamObserver<StreamAlarmsRequest> stream = requestStream.get();
if (stream != null) {
stream.cancel("client cancelled alarm feed", null);
}
}
@Override
public void close() {
cancel();
public final class MxGatewayAlarmFeedSubscription
extends MxGatewayStreamSubscription<StreamAlarmsRequest, AlarmFeedMessage> {
public MxGatewayAlarmFeedSubscription() {
super("client cancelled alarm feed");
}
}
@@ -384,6 +384,15 @@ public final class MxGatewayClient implements AutoCloseable {
} catch (SSLException error) {
throw new MxGatewayException("failed to configure gateway TLS", error);
}
} else if (!options.requireCertificateValidation()) {
try {
builder.sslContext(GrpcSslContexts.forClient()
.trustManager(io.grpc.netty.shaded.io.netty.handler.ssl.util
.InsecureTrustManagerFactory.INSTANCE)
.build());
} catch (SSLException error) {
throw new MxGatewayException("failed to configure lenient gateway TLS", error);
}
} else {
builder.useTransportSecurity();
}
@@ -393,6 +402,19 @@ public final class MxGatewayClient implements AutoCloseable {
return builder.build();
}
/**
* Package-visible test seam: creates a raw {@link ManagedChannel} from the
* given options without attaching auth interceptors. Used by TLS fixture
* tests to verify channel construction behaviour without a full
* {@link MxGatewayClient} wrapper.
*
* @param options the client options
* @return a new {@link ManagedChannel}
*/
static ManagedChannel createChannelForTests(MxGatewayClientOptions options) {
return createChannel(options);
}
private <T extends io.grpc.stub.AbstractStub<T>> T withDeadline(T stub) {
if (options.callTimeout().isNegative()) {
return stub;
@@ -20,6 +20,7 @@ public final class MxGatewayClientOptions {
private final String apiKey;
private final boolean plaintext;
private final Path caCertificatePath;
private final boolean requireCertificateValidation;
private final String serverNameOverride;
private final Duration connectTimeout;
private final Duration callTimeout;
@@ -31,6 +32,7 @@ public final class MxGatewayClientOptions {
apiKey = builder.apiKey == null ? "" : builder.apiKey;
plaintext = builder.plaintext;
caCertificatePath = builder.caCertificatePath;
requireCertificateValidation = builder.requireCertificateValidation;
serverNameOverride = builder.serverNameOverride == null ? "" : builder.serverNameOverride;
connectTimeout = builder.connectTimeout == null ? DEFAULT_CONNECT_TIMEOUT : builder.connectTimeout;
callTimeout = builder.callTimeout == null ? DEFAULT_CALL_TIMEOUT : builder.callTimeout;
@@ -95,6 +97,18 @@ public final class MxGatewayClientOptions {
return caCertificatePath;
}
/**
* Returns whether TLS certificate verification is required even when no CA is pinned.
* When {@code false} (default), the gateway's self-signed certificate is accepted
* without verification. When {@code true}, the OS trust store is used.
* Pinning a CA via {@link #caCertificatePath()} always verifies regardless of this flag.
*
* @return {@code true} if strict certificate verification is required
*/
public boolean requireCertificateValidation() {
return requireCertificateValidation;
}
/**
* Returns the TLS server-name override, or an empty string when none was supplied.
*
@@ -148,6 +162,8 @@ public final class MxGatewayClientOptions {
+ plaintext
+ ", caCertificatePath="
+ caCertificatePath
+ ", requireCertificateValidation="
+ requireCertificateValidation
+ ", serverNameOverride='"
+ serverNameOverride
+ '\''
@@ -177,6 +193,7 @@ public final class MxGatewayClientOptions {
private String apiKey;
private boolean plaintext;
private Path caCertificatePath;
private boolean requireCertificateValidation;
private String serverNameOverride;
private Duration connectTimeout;
private Duration callTimeout;
@@ -230,6 +247,21 @@ public final class MxGatewayClientOptions {
return this;
}
/**
* When {@code true}, TLS connections without a pinned CA use the OS trust store
* and will reject the gateway's self-signed certificate. When {@code false}
* (default), the gateway certificate is accepted without verification, which is
* appropriate for this internal tool's auto-generated self-signed certificate.
* Pinning a CA via {@link #caCertificatePath(Path)} always verifies.
*
* @param value {@code true} to require certificate validation, {@code false} to accept any cert
* @return this builder
*/
public Builder requireCertificateValidation(boolean value) {
requireCertificateValidation = value;
return this;
}
/**
* Overrides the TLS server name used during the handshake.
*
@@ -1,10 +1,6 @@
package com.zb.mom.ww.mxgateway.client;
import io.grpc.stub.ClientCallStreamObserver;
import io.grpc.stub.ClientResponseObserver;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicBoolean;
import mxaccess_gateway.v1.MxaccessGateway.MxEvent;
import mxaccess_gateway.v1.MxaccessGateway.StreamEventsRequest;
@@ -15,53 +11,13 @@ import mxaccess_gateway.v1.MxaccessGateway.StreamEventsRequest;
* {@link #cancel()} entry point that aborts the underlying gRPC call. The
* subscription also implements {@link AutoCloseable} so it can participate in
* try-with-resources blocks.
*
* <p>All lifecycle / cancellation behaviour is inherited from
* {@link MxGatewayStreamSubscription} (Client.Java-036).
*/
public final class MxGatewayEventSubscription implements AutoCloseable {
private final AtomicReference<ClientCallStreamObserver<StreamEventsRequest>> requestStream = new AtomicReference<>();
private final AtomicBoolean cancelled = new AtomicBoolean();
ClientResponseObserver<StreamEventsRequest, MxEvent> wrap(StreamObserver<MxEvent> observer) {
return new ClientResponseObserver<>() {
@Override
public void beforeStart(ClientCallStreamObserver<StreamEventsRequest> stream) {
requestStream.set(stream);
if (cancelled.get()) {
stream.cancel("client cancelled event stream", null);
}
}
@Override
public void onNext(MxEvent value) {
observer.onNext(value);
}
@Override
public void onError(Throwable error) {
observer.onError(error);
}
@Override
public void onCompleted() {
observer.onCompleted();
}
};
}
/**
* Cancels the underlying gRPC call. Safe to invoke before the call has
* started; cancellation is recorded and applied as soon as the stream
* attaches.
*/
public void cancel() {
cancelled.set(true);
ClientCallStreamObserver<StreamEventsRequest> stream = requestStream.get();
if (stream != null) {
stream.cancel("client cancelled event stream", null);
}
}
@Override
public void close() {
cancel();
public final class MxGatewayEventSubscription
extends MxGatewayStreamSubscription<StreamEventsRequest, MxEvent> {
public MxGatewayEventSubscription() {
super("client cancelled event stream");
}
}
@@ -0,0 +1,89 @@
package com.zb.mom.ww.mxgateway.client;
import io.grpc.stub.ClientCallStreamObserver;
import io.grpc.stub.ClientResponseObserver;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
/**
* Shared base for the cancellable subscription handles returned by the
* async-style server-streaming RPCs ({@code streamEvents}, {@code streamAlarms},
* {@code queryActiveAlarms}, {@code watchDeployEvents}).
*
* <p>All four subscription classes share the same lifecycle and cancellation
* contract:
*
* <ul>
* <li>{@link #wrap(StreamObserver)} returns a {@link ClientResponseObserver}
* that captures the underlying {@link ClientCallStreamObserver} in
* {@code beforeStart}. If {@link #cancel()} was called before the gRPC
* call attached, the stream is cancelled eagerly inside
* {@code beforeStart} (the Client.Java-014 close-before-beforeStart
* fix).</li>
* <li>{@link #cancel()} is idempotent. It records the cancellation flag and
* forwards {@code cancel(message, cause)} to the underlying stream when
* one is attached; otherwise the flag is checked in {@code beforeStart}
* once the stream attaches.</li>
* <li>{@link #close()} delegates to {@link #cancel()} so the handle can be
* used with try-with-resources.</li>
* </ul>
*
* <p>Subclasses supply only the cancel-message string used by {@code cancel()}.
* Refactor introduced for Client.Java-036: the four prior subscription
* classes were structural near-clones (~60 lines each).
*/
abstract class MxGatewayStreamSubscription<TRequest, TResponse> implements AutoCloseable {
private final AtomicReference<ClientCallStreamObserver<TRequest>> requestStream = new AtomicReference<>();
private final AtomicBoolean cancelled = new AtomicBoolean();
private final String cancelMessage;
MxGatewayStreamSubscription(String cancelMessage) {
this.cancelMessage = cancelMessage;
}
final ClientResponseObserver<TRequest, TResponse> wrap(StreamObserver<TResponse> observer) {
return new ClientResponseObserver<>() {
@Override
public void beforeStart(ClientCallStreamObserver<TRequest> stream) {
requestStream.set(stream);
if (cancelled.get()) {
stream.cancel(cancelMessage, null);
}
}
@Override
public void onNext(TResponse value) {
observer.onNext(value);
}
@Override
public void onError(Throwable error) {
observer.onError(error);
}
@Override
public void onCompleted() {
observer.onCompleted();
}
};
}
/**
* Cancels the underlying gRPC call. Safe to invoke before the call has
* started; cancellation is recorded and applied as soon as the stream
* attaches.
*/
public final void cancel() {
cancelled.set(true);
ClientCallStreamObserver<TRequest> stream = requestStream.get();
if (stream != null) {
stream.cancel(cancelMessage, null);
}
}
@Override
public final void close() {
cancel();
}
}
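The cancel-before-attach contract described in the Javadoc above can be illustrated without gRPC. The following is a minimal sketch, not the project's code: `FakeCall` and `CancelBeforeAttachSketch` are hypothetical names, and `attach` stands in for `beforeStart`. It shows why both paths (flag check on attach, direct cancel when a call is already attached) are needed so that a cancel is never lost regardless of ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a gRPC call handle; only cancel matters here. */
interface FakeCall {
    void cancel(String message);
}

/** Sketch of the cancel-before-attach contract, independent of gRPC. */
final class CancelBeforeAttachSketch {
    private final AtomicReference<FakeCall> call = new AtomicReference<>();
    private final AtomicBoolean cancelled = new AtomicBoolean();
    private final String cancelMessage;

    CancelBeforeAttachSketch(String cancelMessage) {
        this.cancelMessage = cancelMessage;
    }

    /** Plays the role of beforeStart: publish the call, then honour an earlier cancel. */
    void attach(FakeCall attached) {
        call.set(attached);
        if (cancelled.get()) {
            attached.cancel(cancelMessage);
        }
    }

    /** Records the flag first, then cancels the call if one is already attached. */
    void cancel() {
        cancelled.set(true);
        FakeCall attached = call.get();
        if (attached != null) {
            attached.cancel(cancelMessage);
        }
    }

    /** Runs one ordering and returns the cancel messages the fake call observed. */
    static List<String> run(boolean cancelBeforeAttach) {
        List<String> seen = new ArrayList<>();
        CancelBeforeAttachSketch handle =
                new CancelBeforeAttachSketch("client cancelled demo stream");
        if (cancelBeforeAttach) {
            handle.cancel();
            handle.attach(seen::add);
        } else {
            handle.attach(seen::add);
            handle.cancel();
        }
        return seen;
    }
}
```

In both orderings the fake call observes exactly one cancel. If `attach` and `cancel` race on different threads, both paths may fire, so in this sketch the call could see the message twice; the sketch does not try to deduplicate that.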
@@ -8,6 +8,8 @@ import static org.junit.jupiter.api.Assertions.assertTrue;
import com.google.protobuf.Timestamp;
import galaxy_repository.v1.GalaxyRepositoryGrpc;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DiscoverHierarchyReply;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DiscoverHierarchyRequest;
@@ -24,6 +26,7 @@ import io.grpc.Server;
import io.grpc.ServerCall;
import io.grpc.ServerCallHandler;
import io.grpc.ServerInterceptor;
import io.grpc.Status;
import io.grpc.inprocess.InProcessChannelBuilder;
import io.grpc.inprocess.InProcessServerBuilder;
import io.grpc.stub.ClientCallStreamObserver;
@@ -31,11 +34,20 @@ import io.grpc.stub.ClientResponseObserver;
import io.grpc.stub.StreamObserver;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Queue;
import java.util.UUID;
import java.util.ArrayList;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import org.junit.jupiter.api.Test;
@@ -196,6 +208,27 @@ final class GalaxyRepositoryClientTests {
}
}
@Test
void browseChildrenRejectsRepeatedPageToken() throws Exception {
// Queue the same BrowseChildrenReply twice with a non-empty NextPageToken.
// The client will request a second page and detect that the token repeats.
BrowseChildrenService service = new BrowseChildrenService();
BrowseChildrenReply repeatedReply = browseReply(
List.of(obj(1, "Plant", true)),
List.of(true),
1L,
"1:abc:1");
service.replies.add(repeatedReply);
service.replies.add(repeatedReply);
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
GalaxyRepositoryClient client = g.client("")) {
MxGatewayException error = assertThrows(MxGatewayException.class, client::browse);
assertTrue(error.getMessage().contains("repeated page token"));
}
}
@Test
void watchDeployEventsReceivesEventsInOrder() throws Exception {
DeployEvent first = DeployEvent.newBuilder()
@@ -306,6 +339,294 @@ final class GalaxyRepositoryClientTests {
}
}
@Test
void browseNoParentReturnsRoots() throws Exception {
BrowseChildrenService service = new BrowseChildrenService();
service.replies.add(browseReply(
List.of(obj(1, "Plant", true), obj(2, "Other", false)),
List.of(true, false),
1L,
""));
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
GalaxyRepositoryClient client = g.client("")) {
List<LazyBrowseNode> roots = client.browse();
assertEquals(2, roots.size());
assertEquals("Plant", roots.get(0).getObject().getTagName());
assertTrue(roots.get(0).hasChildrenHint());
assertFalse(roots.get(0).isExpanded());
assertEquals("Other", roots.get(1).getObject().getTagName());
assertFalse(roots.get(1).hasChildrenHint());
assertFalse(roots.get(1).isExpanded());
assertEquals(1, service.calls.size());
assertFalse(service.calls.get(0).hasParentGobjectId());
}
}
@Test
void browseExpandPopulatesChildrenAndMarksExpanded() throws Exception {
BrowseChildrenService service = new BrowseChildrenService();
service.replies.add(browseReply(
List.of(obj(1, "Plant", true)),
List.of(true),
1L,
""));
service.replies.add(browseReply(
List.of(obj(10, "Line1", false)),
List.of(false),
1L,
""));
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
GalaxyRepositoryClient client = g.client("")) {
List<LazyBrowseNode> roots = client.browse();
roots.get(0).expand();
assertTrue(roots.get(0).isExpanded());
assertEquals(1, roots.get(0).getChildren().size());
assertEquals("Line1", roots.get(0).getChildren().get(0).getObject().getTagName());
assertEquals(2, service.calls.size());
assertTrue(service.calls.get(1).hasParentGobjectId());
assertEquals(1, service.calls.get(1).getParentGobjectId());
}
}
@Test
void browseExpandIdempotentNoSecondRpc() throws Exception {
BrowseChildrenService service = new BrowseChildrenService();
service.replies.add(browseReply(
List.of(obj(1, "Plant", true)),
List.of(true),
1L,
""));
service.replies.add(browseReply(
List.of(obj(10, "Line1", false)),
List.of(false),
1L,
""));
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
GalaxyRepositoryClient client = g.client("")) {
List<LazyBrowseNode> roots = client.browse();
roots.get(0).expand();
roots.get(0).expand();
assertEquals(2, service.calls.size());
assertEquals(1, roots.get(0).getChildren().size());
}
}
@Test
void browseExpandUnknownParentThrowsGalaxyNotFound() throws Exception {
BrowseChildrenService service = new BrowseChildrenService();
service.replies.add(browseReply(
List.of(obj(1, "Plant", true)),
List.of(true),
1L,
""));
service.errors.add(Status.NOT_FOUND.withDescription("Parent not found").asRuntimeException());
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
GalaxyRepositoryClient client = g.client("")) {
List<LazyBrowseNode> roots = client.browse();
MxGatewayException error = assertThrows(MxGatewayException.class, () -> roots.get(0).expand());
assertTrue(
error.getMessage().toLowerCase().contains("not found"),
"expected message to mention 'not found', got: " + error.getMessage());
}
}
@Test
void browseExpandMultiPageGathersAllPages() throws Exception {
BrowseChildrenService service = new BrowseChildrenService();
// Roots
service.replies.add(browseReply(
List.of(obj(7, "Plant", true)),
List.of(true),
1L,
""));
// First child page with a next token
service.replies.add(browseReply(
List.of(obj(70, "ChildA", false), obj(71, "ChildB", false)),
List.of(false, false),
1L,
"7:abc:2"));
// Second child page closes the loop
service.replies.add(browseReply(
List.of(obj(72, "ChildC", false)),
List.of(false),
1L,
""));
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
GalaxyRepositoryClient client = g.client("")) {
List<LazyBrowseNode> roots = client.browse();
roots.get(0).expand();
assertEquals(3, roots.get(0).getChildren().size());
assertEquals(3, service.calls.size());
assertEquals("7:abc:2", service.calls.get(2).getPageToken());
}
}
@Test
void browseExpandConcurrentCallersOnlyFireOneRpc() throws Exception {
// Verifies that concurrent expand() calls coalesce onto a single in-flight
// BrowseChildren RPC and that readers (isExpanded/getChildren) are not
// blocked for the full RPC duration.
BrowseChildrenReply rootsReply = browseReply(
List.of(obj(1, "Plant", true)),
List.of(true),
7L,
"");
BrowseChildrenReply childrenReply = browseReply(
List.of(obj(2, "Mixer_001", false)),
List.of(false),
7L,
"");
// Gate the child fetch behind a latch so multiple expanders can pile up.
CountDownLatch release = new CountDownLatch(1);
AtomicInteger childCalls = new AtomicInteger();
BrowseChildrenService service = new BrowseChildrenService() {
@Override
public void browseChildren(
BrowseChildrenRequest request, StreamObserver<BrowseChildrenReply> responseObserver) {
calls.add(request);
BrowseChildrenReply reply;
if (!request.hasParentGobjectId()) {
reply = rootsReply;
} else {
// Block the leader until the followers have arrived.
try {
assertTrue(release.await(5, TimeUnit.SECONDS), "release latch never tripped");
} catch (InterruptedException ie) {
Thread.currentThread().interrupt();
responseObserver.onError(Status.CANCELLED.asRuntimeException());
return;
}
childCalls.incrementAndGet();
reply = childrenReply;
}
responseObserver.onNext(reply);
responseObserver.onCompleted();
}
};
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
GalaxyRepositoryClient client = g.client("")) {
List<LazyBrowseNode> roots = client.browse();
LazyBrowseNode root = roots.get(0);
int parallelism = 10;
ExecutorService pool = Executors.newFixedThreadPool(parallelism);
try {
CountDownLatch ready = new CountDownLatch(parallelism);
List<Future<Void>> futures = new ArrayList<>();
for (int i = 0; i < parallelism; i++) {
futures.add(pool.submit(() -> {
ready.countDown();
root.expand();
return null;
}));
}
// Wait for all callers to be in flight, then release the leader.
assertTrue(ready.await(5, TimeUnit.SECONDS), "expander threads did not start");
// Readers must not be blocked by an in-flight expand; this should not deadlock
// and should return the pre-expand state.
assertFalse(root.isExpanded());
assertEquals(0, root.getChildren().size());
release.countDown();
for (Future<Void> f : futures) {
f.get(10, TimeUnit.SECONDS);
}
} finally {
pool.shutdownNow();
}
assertTrue(root.isExpanded());
assertEquals(1, root.getChildren().size());
// Exactly one expand RPC was issued even though many callers raced.
assertEquals(1, childCalls.get());
// 1 roots fetch + exactly 1 expand fetch.
assertEquals(2, service.calls.size());
}
}
@Test
void browseWithFilterForwardsToRequest() throws Exception {
BrowseChildrenService service = new BrowseChildrenService();
// Default reply is empty; only the request shape matters here.
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
GalaxyRepositoryClient client = g.client("")) {
client.browse(BrowseChildrenOptions.builder()
.tagNameGlob("Mixer*")
.alarmBearingOnly(true)
.build());
}
assertEquals(1, service.calls.size());
BrowseChildrenRequest request = service.calls.get(0);
assertEquals("Mixer*", request.getTagNameGlob());
assertTrue(request.getAlarmBearingOnly());
}
private static GalaxyObject obj(int id, String tag, boolean isArea) {
return GalaxyObject.newBuilder()
.setGobjectId(id)
.setTagName(tag)
.setBrowseName(tag)
.setIsArea(isArea)
.build();
}
private static BrowseChildrenReply browseReply(
List<GalaxyObject> children,
List<Boolean> childHasChildren,
long cacheSequence,
String nextPageToken) {
BrowseChildrenReply.Builder b = BrowseChildrenReply.newBuilder()
.setTotalChildCount(children.size())
.setCacheSequence(cacheSequence)
.setNextPageToken(nextPageToken);
b.addAllChildren(children);
b.addAllChildHasChildren(childHasChildren);
return b.build();
}
private static class BrowseChildrenService extends TestService {
final List<BrowseChildrenRequest> calls =
Collections.synchronizedList(new CopyOnWriteArrayList<>());
final Queue<BrowseChildrenReply> replies = new ArrayDeque<>();
final Queue<Throwable> errors = new ArrayDeque<>();
@Override
public void browseChildren(
BrowseChildrenRequest request, StreamObserver<BrowseChildrenReply> responseObserver) {
calls.add(request);
BrowseChildrenReply reply;
Throwable err;
synchronized (this) {
// Prefer queued replies first; once they're exhausted, fall through to any
// queued error. This matches the .NET fake's ordering used by parity tests.
reply = replies.poll();
err = reply == null ? errors.poll() : null;
}
if (err != null) {
responseObserver.onError(err);
return;
}
if (reply == null) {
reply = BrowseChildrenReply.getDefaultInstance();
}
responseObserver.onNext(reply);
responseObserver.onCompleted();
}
}
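The `browseChildrenRejectsRepeatedPageToken` and `browseExpandMultiPageGathersAllPages` tests above imply a paging loop that follows `NextPageToken` until it is empty and fails when a token repeats. The client's actual loop is not shown in this diff, so the following is only a sketch under assumptions: `PageTokenGuardSketch`, `Page`, and `fetchAll` are hypothetical names, and tracking all previously seen tokens in a set is one possible way to detect repetition.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class PageTokenGuardSketch {
    /** Hypothetical page shape: items plus the next token ("" when there are no more pages). */
    static final class Page {
        final List<String> items;
        final String nextPageToken;

        Page(List<String> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    /**
     * Fetches pages until an empty token is returned. Throws if the server
     * hands back a token that was already seen, which would otherwise loop forever.
     */
    static List<String> fetchAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        Set<String> seenTokens = new HashSet<>();
        String token = "";
        while (true) {
            Page page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextPageToken;
            if (token.isEmpty()) {
                return all;
            }
            // Set.add returns false when the token was already present.
            if (!seenTokens.add(token)) {
                throw new IllegalStateException("repeated page token: " + token);
            }
        }
    }
}
```

With a fake server that always answers with the same non-empty token, the second response trips the guard, which mirrors the scenario the repeated-token test sets up by queuing the same reply twice.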
private abstract static class TestService extends GalaxyRepositoryGrpc.GalaxyRepositoryImplBase {
@Override
public void testConnection(
@@ -27,7 +27,10 @@ import mxaccess_gateway.v1.MxAccessGatewayGrpc;
import mxaccess_gateway.v1.MxaccessGateway.ActiveAlarmSnapshot;
import mxaccess_gateway.v1.MxaccessGateway.AddItemReply;
import mxaccess_gateway.v1.MxaccessGateway.AlarmConditionState;
import mxaccess_gateway.v1.MxaccessGateway.AlarmFeedMessage;
import mxaccess_gateway.v1.MxaccessGateway.AlarmTransitionKind;
import mxaccess_gateway.v1.MxaccessGateway.BulkSubscribeReply;
import mxaccess_gateway.v1.MxaccessGateway.OnAlarmTransitionEvent;
import mxaccess_gateway.v1.MxaccessGateway.CloseSessionReply;
import mxaccess_gateway.v1.MxaccessGateway.CloseSessionRequest;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandKind;
@@ -41,6 +44,7 @@ import mxaccess_gateway.v1.MxaccessGateway.ProtocolStatusCode;
import mxaccess_gateway.v1.MxaccessGateway.QueryActiveAlarmsRequest;
import mxaccess_gateway.v1.MxaccessGateway.RegisterReply;
import mxaccess_gateway.v1.MxaccessGateway.SessionState;
import mxaccess_gateway.v1.MxaccessGateway.StreamAlarmsRequest;
import mxaccess_gateway.v1.MxaccessGateway.StreamEventsRequest;
import mxaccess_gateway.v1.MxaccessGateway.SubscribeResult;
import org.junit.jupiter.api.Test;
@@ -268,6 +272,100 @@ final class MxGatewayClientSessionTests {
}
}
@Test
void streamAlarmsForwardsRequestAndStreamsAlarmFeedMessages() throws Exception {
AtomicReference<StreamAlarmsRequest> streamRequest = new AtomicReference<>();
CountDownLatch serverCancelled = new CountDownLatch(1);
TestGatewayService service = new TestGatewayService() {
@Override
public void streamAlarms(
StreamAlarmsRequest request, StreamObserver<AlarmFeedMessage> responseObserver) {
streamRequest.set(request);
ServerCallStreamObserver<AlarmFeedMessage> server =
(ServerCallStreamObserver<AlarmFeedMessage>) responseObserver;
server.setOnCancelHandler(serverCancelled::countDown);
// Active-alarm snapshot, snapshot-complete sentinel, then a
// transition: this mirrors the shape of a real alarm feed open.
server.onNext(AlarmFeedMessage.newBuilder()
.setActiveAlarm(ActiveAlarmSnapshot.newBuilder()
.setAlarmFullReference("Tank01.Level.HiHi")
.setCurrentState(AlarmConditionState.ALARM_CONDITION_STATE_ACTIVE)
.setSeverity(700))
.build());
server.onNext(AlarmFeedMessage.newBuilder().setSnapshotComplete(true).build());
server.onNext(AlarmFeedMessage.newBuilder()
.setTransition(OnAlarmTransitionEvent.newBuilder()
.setAlarmFullReference("Tank01.Level.HiHi")
.setTransitionKind(AlarmTransitionKind.ALARM_TRANSITION_KIND_ACKNOWLEDGE)
.setSeverity(700))
.build());
// Note: we deliberately do NOT call onCompleted() so the call
// remains open for the cancellation assertion below.
}
};
try (InProcessGateway gateway = InProcessGateway.start(service, new AtomicReference<>());
MxGatewayClient client = gateway.client("", Duration.ofSeconds(5))) {
java.util.List<AlarmFeedMessage> received = new java.util.ArrayList<>();
AtomicReference<Throwable> errorRef = new AtomicReference<>();
CountDownLatch threeReceived = new CountDownLatch(3);
StreamAlarmsRequest request = StreamAlarmsRequest.newBuilder()
.setAlarmFilterPrefix("Tank01")
.build();
MxGatewayAlarmFeedSubscription subscription = client.streamAlarms(
request,
new StreamObserver<>() {
@Override
public void onNext(AlarmFeedMessage value) {
received.add(value);
threeReceived.countDown();
}
@Override
public void onError(Throwable t) {
errorRef.set(t);
}
@Override
public void onCompleted() {
}
});
assertTrue(threeReceived.await(5, TimeUnit.SECONDS),
"expected three alarm feed messages within 5s");
// The request shape (filter prefix in particular) must reach the
// server. This proves MxGatewayClient.streamAlarms calls the production
// subscription.wrap(observer) glue and not a CLI override.
assertNotNull(streamRequest.get());
assertEquals("Tank01", streamRequest.get().getAlarmFilterPrefix());
// Order and payload-case must be preserved (the wrapping observer
// is just a pass-through).
assertEquals(3, received.size());
assertEquals(AlarmFeedMessage.PayloadCase.ACTIVE_ALARM, received.get(0).getPayloadCase());
assertEquals(
"Tank01.Level.HiHi",
received.get(0).getActiveAlarm().getAlarmFullReference());
assertEquals(AlarmFeedMessage.PayloadCase.SNAPSHOT_COMPLETE, received.get(1).getPayloadCase());
assertEquals(AlarmFeedMessage.PayloadCase.TRANSITION, received.get(2).getPayloadCase());
assertEquals(
AlarmTransitionKind.ALARM_TRANSITION_KIND_ACKNOWLEDGE,
received.get(2).getTransition().getTransitionKind());
// No error is expected before cancellation. This proves the wrapping
// observer forwarded only data, not a synthetic error.
assertNull(errorRef.get(), "no error expected before cancellation");
// Cancellation must propagate to the underlying gRPC call.
subscription.cancel();
assertTrue(serverCancelled.await(5, TimeUnit.SECONDS),
"server should observe RPC cancellation after subscription.cancel()");
}
}
@Test
void commandFailureKeepsRawReply() throws Exception {
TestGatewayService service = new TestGatewayService() {
@@ -0,0 +1,198 @@
package com.zb.mom.ww.mxgateway.client;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
import io.grpc.ManagedChannel;
import io.grpc.Server;
import io.grpc.StatusRuntimeException;
import io.grpc.netty.shaded.io.grpc.netty.GrpcSslContexts;
import io.grpc.netty.shaded.io.grpc.netty.NettyServerBuilder;
import io.grpc.stub.StreamObserver;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.security.KeyStore;
import java.security.PrivateKey;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.util.Base64;
import java.util.concurrent.TimeUnit;
import javax.net.ssl.SSLException;
import mxaccess_gateway.v1.MxAccessGatewayGrpc;
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionReply;
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionRequest;
import mxaccess_gateway.v1.MxaccessGateway.ProtocolStatus;
import mxaccess_gateway.v1.MxaccessGateway.ProtocolStatusCode;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
/**
* Verifies that the Java client connects to a Netty TLS server with a
* self-signed certificate when no CA is pinned (lenient default), and that
* setting {@code requireCertificateValidation(true)} causes a TLS failure.
*
* <p>A self-signed certificate is generated using {@code keytool} (always
* available in the JDK) to avoid dependencies on internal JDK APIs or
* BouncyCastle, and so the test works on all JDK versions used by the project.
*/
final class MxGatewayClientTlsTests {
private Server server;
private int port;
private File certPemFile;
private File keyPemFile;
private File keystoreFile;
@BeforeEach
void startTlsServer() throws Exception {
keystoreFile = File.createTempFile("gw-test-ks", ".p12");
certPemFile = File.createTempFile("gw-test-cert", ".pem");
keyPemFile = File.createTempFile("gw-test-key", ".pem");
// keytool refuses to write to a pre-existing (even empty) file; delete it first.
keystoreFile.delete();
// Use keytool to generate a self-signed PKCS12 keystore.
String keytool = ProcessHandle.current().info().command()
.map(cmd -> cmd.replace("java", "keytool"))
.orElse("keytool");
// Fall back to just "keytool" on PATH if the resolved path doesn't exist.
if (!new File(keytool).exists()) {
keytool = "keytool";
}
Process p = new ProcessBuilder(
keytool,
"-genkeypair",
"-alias", "server",
"-keyalg", "RSA",
"-keysize", "2048",
"-sigalg", "SHA256withRSA",
"-validity", "1",
"-dname", "CN=localhost",
"-storetype", "PKCS12",
"-storepass", "changeit",
"-keypass", "changeit",
"-keystore", keystoreFile.getAbsolutePath())
.redirectErrorStream(true)
.start();
int exit = p.waitFor();
if (exit != 0) {
String out = new String(p.getInputStream().readAllBytes());
throw new IllegalStateException("keytool failed (exit " + exit + "): " + out);
}
// Export cert and private key from the PKCS12 keystore to PEM files.
KeyStore ks = KeyStore.getInstance("PKCS12");
try (var is = Files.newInputStream(keystoreFile.toPath())) {
ks.load(is, "changeit".toCharArray());
}
X509Certificate cert = (X509Certificate) ks.getCertificate("server");
PrivateKey privateKey = (PrivateKey) ks.getKey("server", "changeit".toCharArray());
try (FileOutputStream out = new FileOutputStream(certPemFile)) {
out.write("-----BEGIN CERTIFICATE-----\n".getBytes());
out.write(Base64.getMimeEncoder(64, new byte[]{'\n'}).encode(cert.getEncoded()));
out.write("\n-----END CERTIFICATE-----\n".getBytes());
}
try (FileOutputStream out = new FileOutputStream(keyPemFile)) {
out.write("-----BEGIN PRIVATE KEY-----\n".getBytes());
out.write(Base64.getMimeEncoder(64, new byte[]{'\n'}).encode(privateKey.getEncoded()));
out.write("\n-----END PRIVATE KEY-----\n".getBytes());
}
server = NettyServerBuilder
.forAddress(new InetSocketAddress("127.0.0.1", 0))
.sslContext(GrpcSslContexts.forServer(certPemFile, keyPemFile).build())
.addService(new MinimalGatewayService())
.build()
.start();
port = server.getPort();
}
@AfterEach
void stopTlsServer() throws InterruptedException {
if (server != null) {
server.shutdown();
server.awaitTermination(5, TimeUnit.SECONDS);
}
if (certPemFile != null) {
certPemFile.delete();
}
if (keyPemFile != null) {
keyPemFile.delete();
}
if (keystoreFile != null) {
keystoreFile.delete();
}
}
@Test
void connectsToSelfSignedServer_WhenRequireCertificateValidationIsFalse() throws SSLException {
// Default options: requireCertificateValidation defaults to false.
MxGatewayClientOptions options = MxGatewayClientOptions.builder()
.endpoint("127.0.0.1:" + port)
.apiKey("test-key")
.connectTimeout(Duration.ofSeconds(5))
.callTimeout(Duration.ofSeconds(5))
.build();
ManagedChannel channel = MxGatewayClient.createChannelForTests(options);
try {
MxAccessGatewayGrpc.MxAccessGatewayBlockingStub stub =
MxAccessGatewayGrpc.newBlockingStub(channel);
OpenSessionReply reply = stub.openSession(
OpenSessionRequest.newBuilder()
.setClientSessionName("tls-test")
.build());
assertTrue(reply.getProtocolStatus().getCode()
== ProtocolStatusCode.PROTOCOL_STATUS_CODE_OK);
} finally {
channel.shutdownNow();
}
}
@Test
void failsToConnect_WhenRequireCertificateValidationIsTrue() throws SSLException {
MxGatewayClientOptions options = MxGatewayClientOptions.builder()
.endpoint("127.0.0.1:" + port)
.apiKey("test-key")
.requireCertificateValidation(true)
.connectTimeout(Duration.ofSeconds(5))
.callTimeout(Duration.ofSeconds(5))
.build();
ManagedChannel channel = MxGatewayClient.createChannelForTests(options);
try {
MxAccessGatewayGrpc.MxAccessGatewayBlockingStub stub =
MxAccessGatewayGrpc.newBlockingStub(channel);
assertThrows(StatusRuntimeException.class, () ->
stub.openSession(OpenSessionRequest.newBuilder()
.setClientSessionName("tls-strict-test")
.build()));
} finally {
channel.shutdownNow();
}
}
/** Minimal gateway stub that succeeds any OpenSession call. */
private static final class MinimalGatewayService
extends MxAccessGatewayGrpc.MxAccessGatewayImplBase {
@Override
public void openSession(
OpenSessionRequest request,
StreamObserver<OpenSessionReply> responseObserver) {
responseObserver.onNext(OpenSessionReply.newBuilder()
.setSessionId("tls-test-session")
.setProtocolStatus(ProtocolStatus.newBuilder()
.setCode(ProtocolStatusCode.PROTOCOL_STATUS_CODE_OK)
.build())
.build());
responseObserver.onCompleted();
}
}
}
@@ -0,0 +1,275 @@
package com.zb.mom.ww.mxgateway.client;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertTrue;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.WatchDeployEventsRequest;
import io.grpc.stub.ClientCallStreamObserver;
import io.grpc.stub.ClientResponseObserver;
import io.grpc.stub.StreamObserver;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import mxaccess_gateway.v1.MxaccessGateway.ActiveAlarmSnapshot;
import mxaccess_gateway.v1.MxaccessGateway.AlarmFeedMessage;
import mxaccess_gateway.v1.MxaccessGateway.MxEvent;
import mxaccess_gateway.v1.MxaccessGateway.QueryActiveAlarmsRequest;
import mxaccess_gateway.v1.MxaccessGateway.StreamAlarmsRequest;
import mxaccess_gateway.v1.MxaccessGateway.StreamEventsRequest;
import org.junit.jupiter.api.Test;
/**
* Lifecycle / cancellation contract tests applied uniformly to each of the
* four subscription classes that extend {@link MxGatewayStreamSubscription}.
*
* <p>Locks in the Client.Java-036 refactor: every subclass must exhibit the
* same behaviour for (a) cancel-before-beforeStart eagerly cancelling the
* stream once it attaches, (b) cancel-after-beforeStart forwarding directly
* to the stream, (c) the cancel message matching the subclass's documented
* value, (d) {@code close()} delegating to {@code cancel()}, and (e) the
* wrapping observer forwarding {@code onNext}/{@code onError}/{@code onCompleted}
* to the caller's observer.
*/
final class MxGatewayStreamSubscriptionContractTests {
@Test
void cancelBeforeBeforeStartCancelsStreamWhenItAttaches_eventSubscription() {
runCancelBeforeBeforeStartTest(new MxGatewayEventSubscription(), "client cancelled event stream");
}
@Test
void cancelBeforeBeforeStartCancelsStreamWhenItAttaches_alarmFeedSubscription() {
runCancelBeforeBeforeStartTest(
new MxGatewayAlarmFeedSubscription(), "client cancelled alarm feed");
}
@Test
void cancelBeforeBeforeStartCancelsStreamWhenItAttaches_activeAlarmsSubscription() {
runCancelBeforeBeforeStartTest(
new MxGatewayActiveAlarmsSubscription(), "client cancelled active-alarms query");
}
@Test
void cancelBeforeBeforeStartCancelsStreamWhenItAttaches_deployEventSubscription() {
runCancelBeforeBeforeStartTest(
new DeployEventSubscription(), "client cancelled deploy event stream");
}
@Test
void cancelAfterBeforeStartForwardsToStream_eventSubscription() {
runCancelAfterBeforeStartTest(new MxGatewayEventSubscription(), "client cancelled event stream");
}
@Test
void cancelAfterBeforeStartForwardsToStream_alarmFeedSubscription() {
runCancelAfterBeforeStartTest(
new MxGatewayAlarmFeedSubscription(), "client cancelled alarm feed");
}
@Test
void cancelAfterBeforeStartForwardsToStream_activeAlarmsSubscription() {
runCancelAfterBeforeStartTest(
new MxGatewayActiveAlarmsSubscription(), "client cancelled active-alarms query");
}
@Test
void cancelAfterBeforeStartForwardsToStream_deployEventSubscription() {
runCancelAfterBeforeStartTest(
new DeployEventSubscription(), "client cancelled deploy event stream");
}
@Test
void closeDelegatesToCancel_eventSubscription() {
runCloseDelegatesToCancelTest(new MxGatewayEventSubscription());
}
@Test
void closeDelegatesToCancel_alarmFeedSubscription() {
runCloseDelegatesToCancelTest(new MxGatewayAlarmFeedSubscription());
}
@Test
void closeDelegatesToCancel_activeAlarmsSubscription() {
runCloseDelegatesToCancelTest(new MxGatewayActiveAlarmsSubscription());
}
@Test
void closeDelegatesToCancel_deployEventSubscription() {
runCloseDelegatesToCancelTest(new DeployEventSubscription());
}
@Test
void wrappedObserverForwardsOnNextOnErrorOnCompleted_eventSubscription() {
MxEvent event = MxEvent.newBuilder().setWorkerSequence(7L).build();
runForwardingTest(new MxGatewayEventSubscription(), event);
}
@Test
void wrappedObserverForwardsOnNextOnErrorOnCompleted_alarmFeedSubscription() {
AlarmFeedMessage msg = AlarmFeedMessage.newBuilder().setSnapshotComplete(true).build();
runForwardingTest(new MxGatewayAlarmFeedSubscription(), msg);
}
@Test
void wrappedObserverForwardsOnNextOnErrorOnCompleted_activeAlarmsSubscription() {
ActiveAlarmSnapshot snap = ActiveAlarmSnapshot.newBuilder()
.setAlarmFullReference("ref")
.setSeverity(500)
.build();
runForwardingTest(new MxGatewayActiveAlarmsSubscription(), snap);
}
@Test
void wrappedObserverForwardsOnNextOnErrorOnCompleted_deployEventSubscription() {
DeployEvent ev = DeployEvent.newBuilder().setSequence(1L).build();
runForwardingTest(new DeployEventSubscription(), ev);
}
private static <Req, Resp> void runCancelBeforeBeforeStartTest(
MxGatewayStreamSubscription<Req, Resp> subscription, String expectedMessage) {
ClientResponseObserver<Req, Resp> wrapped = subscription.wrap(new NoopObserver<>());
RecordingClientCallStreamObserver<Req> stream = new RecordingClientCallStreamObserver<>();
subscription.cancel();
wrapped.beforeStart(stream);
assertTrue(stream.cancelled, "stream should have been cancelled by beforeStart after prior cancel()");
assertEquals(expectedMessage, stream.cancelMessage);
}
private static <Req, Resp> void runCancelAfterBeforeStartTest(
MxGatewayStreamSubscription<Req, Resp> subscription, String expectedMessage) {
ClientResponseObserver<Req, Resp> wrapped = subscription.wrap(new NoopObserver<>());
RecordingClientCallStreamObserver<Req> stream = new RecordingClientCallStreamObserver<>();
wrapped.beforeStart(stream);
assertFalse(stream.cancelled, "stream should not be cancelled before cancel() is called");
subscription.cancel();
assertTrue(stream.cancelled, "stream should have been cancelled by direct cancel()");
assertEquals(expectedMessage, stream.cancelMessage);
}
private static <Req, Resp> void runCloseDelegatesToCancelTest(
MxGatewayStreamSubscription<Req, Resp> subscription) {
ClientResponseObserver<Req, Resp> wrapped = subscription.wrap(new NoopObserver<>());
RecordingClientCallStreamObserver<Req> stream = new RecordingClientCallStreamObserver<>();
wrapped.beforeStart(stream);
subscription.close();
assertTrue(stream.cancelled, "close() should delegate to cancel()");
}
private static <Req, Resp> void runForwardingTest(
MxGatewayStreamSubscription<Req, Resp> subscription, Resp value) {
List<Resp> received = new ArrayList<>();
AtomicReference<Throwable> errorRef = new AtomicReference<>();
AtomicReference<Boolean> completed = new AtomicReference<>(false);
StreamObserver<Resp> caller = new StreamObserver<>() {
@Override
public void onNext(Resp v) {
received.add(v);
}
@Override
public void onError(Throwable t) {
errorRef.set(t);
}
@Override
public void onCompleted() {
completed.set(true);
}
};
ClientResponseObserver<Req, Resp> wrapped = subscription.wrap(caller);
RecordingClientCallStreamObserver<Req> stream = new RecordingClientCallStreamObserver<>();
wrapped.beforeStart(stream);
wrapped.onNext(value);
IllegalStateException boom = new IllegalStateException("boom");
wrapped.onError(boom);
wrapped.onCompleted();
assertEquals(1, received.size());
assertEquals(value, received.get(0));
assertNotNull(errorRef.get());
assertEquals(boom, errorRef.get());
assertTrue(completed.get());
}
private static final class NoopObserver<T> implements StreamObserver<T> {
@Override
public void onNext(T value) {
}
@Override
public void onError(Throwable t) {
}
@Override
public void onCompleted() {
}
}
private static final class RecordingClientCallStreamObserver<T> extends ClientCallStreamObserver<T> {
boolean cancelled;
String cancelMessage;
@Override
public boolean isReady() {
return true;
}
@Override
public void setOnReadyHandler(Runnable onReadyHandler) {
}
@Override
public void disableAutoInboundFlowControl() {
}
@Override
public void request(int count) {
}
@Override
public void setMessageCompression(boolean enable) {
}
@Override
public void cancel(String message, Throwable cause) {
cancelled = true;
cancelMessage = message;
}
@Override
public void onNext(T value) {
}
@Override
public void onError(Throwable error) {
}
@Override
public void onCompleted() {
}
}
// Compile-time guarantee that the parameter types still match the
// generic bounds. This catches a regression where a subclass changes its
// request/response types out from under the shared base.
@SuppressWarnings("unused")
private static void typeBoundsCheck() {
MxGatewayStreamSubscription<StreamEventsRequest, MxEvent> a = new MxGatewayEventSubscription();
MxGatewayStreamSubscription<StreamAlarmsRequest, AlarmFeedMessage> b = new MxGatewayAlarmFeedSubscription();
MxGatewayStreamSubscription<QueryActiveAlarmsRequest, ActiveAlarmSnapshot> c =
new MxGatewayActiveAlarmsSubscription();
MxGatewayStreamSubscription<WatchDeployEventsRequest, DeployEvent> d = new DeployEventSubscription();
}
}
@@ -112,6 +112,28 @@ Support:
- TLS channel with default roots,
- custom root certificate file.
### Trust posture (trust-on-first-use)
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). grpc-python exposes no per-channel skip-verify hook, so the client cannot
"accept any certificate" the way the other clients do. Instead, when the channel
is not plaintext and neither `ca_file` nor `require_certificate_validation` is
set, the TLS default is **trust-on-first-use**: the client fetches the server's
presented certificate once via `ssl.get_server_certificate` (an unverified
probe), pins it as the channel's only trust root, and — because the generated
certificate always carries a `localhost` SAN — defaults
`grpc.ssl_target_name_override` to `localhost` when no `server_name_override` was
supplied (tolerating dial-by-IP or a hostname mismatch). A failed probe is
surfaced as a transport error naming the endpoint.
To verify the gateway instead:
- set `ca_file` to verify against a specific CA, or
- set `require_certificate_validation=True` to verify against the system trust
roots.
Both bypass the TOFU path.
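The selection order described above can be sketched as a small decision function. This is a minimal illustration, not the client's actual implementation: `TlsChoice`, `choose_tls`, and `probe_certificate` are hypothetical names, and the precedence of `ca_file` over `require_certificate_validation` when both are set is an assumption. Only `ssl.get_server_certificate` is taken from the text.

```python
import ssl
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TlsChoice:
    """Hypothetical result type: how the channel should establish trust."""
    mode: str                           # "plaintext", "ca_file", "system_roots" or "tofu"
    root_pem: Optional[bytes]           # explicit trust roots; None means library defaults
    target_name_override: Optional[str]


def probe_certificate(host: str, port: int) -> bytes:
    """Fetch the server's presented certificate without verifying it."""
    return ssl.get_server_certificate((host, port)).encode("ascii")


def choose_tls(
    host: str,
    port: int,
    *,
    plaintext: bool = False,
    ca_file: Optional[str] = None,
    require_certificate_validation: bool = False,
    server_name_override: Optional[str] = None,
    probe: Callable[[str, int], bytes] = probe_certificate,
) -> TlsChoice:
    if plaintext:
        return TlsChoice("plaintext", None, None)
    if ca_file is not None:
        # Assumption: an explicit CA wins if both strict options are set.
        with open(ca_file, "rb") as handle:
            return TlsChoice("ca_file", handle.read(), server_name_override)
    if require_certificate_validation:
        return TlsChoice("system_roots", None, server_name_override)
    try:
        pinned = probe(host, port)
    except OSError as error:
        # Surface a failed probe as a transport-style error naming the endpoint.
        raise ConnectionError(f"TLS probe of {host}:{port} failed: {error}") from error
    # The generated certificate carries a localhost SAN, so default to it.
    return TlsChoice("tofu", pinned, server_name_override or "localhost")
```

In a real channel, `root_pem` would presumably be passed as `root_certificates` to `grpc.ssl_channel_credentials`, and `target_name_override` as the `grpc.ssl_target_name_override` channel option. Injecting `probe` keeps the decision logic testable without a live gateway.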
## Streaming
Expose `stream_events` as an async iterator. Canceling the task should cancel
@@ -138,6 +138,49 @@ The methods return native Python types (`bool`, `datetime | None`, and a
into the hierarchy without learning the underlying stub class. The
service requires the `metadata:read` scope on the API key.
### Browsing lazily
For UI trees or OPC UA bridges, use `browse_children` to walk one level at a
time instead of loading the full hierarchy with `discover_hierarchy`. Pass an
empty request for root objects; subsequent calls set `parent_gobject_id`,
`parent_tag_name`, or `parent_contained_path`. Filter fields match
`DiscoverHierarchy`. Each response pairs `children` with `child_has_children` so
you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.
```python
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb2
reply = await galaxy.browse_children(galaxy_pb2.BrowseChildrenRequest())
for child, has_children in zip(reply.children, reply.child_has_children):
print(child.tag_name, "expand=" + str(has_children))
```
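When one level holds more children than a single page, the reply's `next_page_token` has to be fed back as `page_token` on the next request. The sketch below shows that loop against a stand-in fetch function rather than the real stub: `Page`, `collect_children`, and `fake_fetch` are hypothetical names, and the repeated-token guard and the tolerance for a short `child_has_children` list mirror what a defensive caller might want, not a documented requirement.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Page:
    """Stand-in for a BrowseChildren reply (hypothetical, for illustration)."""
    children: list[str]
    child_has_children: list[bool]
    next_page_token: str = ""


async def collect_children(fetch_page) -> list[tuple[str, bool]]:
    """Walk every page of one level and pair each child with its expand hint."""
    results: list[tuple[str, bool]] = []
    seen_tokens: set[str] = set()
    token = ""
    while True:
        page = await fetch_page(token)
        for index, child in enumerate(page.children):
            # Treat a missing hint as "no children" instead of raising.
            hint = index < len(page.child_has_children) and page.child_has_children[index]
            results.append((child, bool(hint)))
        token = page.next_page_token
        if not token:
            return results
        if token in seen_tokens:
            raise RuntimeError(f"repeated page token {token!r}")
        seen_tokens.add(token)


PAGES = {
    "": Page(["Area001", "Area002"], [True, False], "p2"),
    "p2": Page(["Pump001"], [], ""),
}


async def fake_fetch(token: str) -> Page:
    return PAGES[token]


rows = asyncio.run(collect_children(fake_fetch))
# rows == [("Area001", True), ("Area002", False), ("Pump001", False)]
```

Against the real client, `fetch_page` would build a `BrowseChildrenRequest` with the token and await the RPC. The guard on repeated tokens prevents an endless loop if a server ever returns the same token twice.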
#### High-level walker
For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:
```python
async with await GalaxyRepositoryClient.connect(
endpoint="localhost:5000",
api_key="<gateway-api-key>",
plaintext=True,
) as galaxy:
roots = await galaxy.browse()
for root in roots:
if root.has_children_hint:
await root.expand()
for child in root.children:
kind = "has children" if child.has_children_hint else "leaf"
print(f"{child.object.tag_name} ({kind})")
```
`expand` is idempotent: calling it twice issues only one RPC, and it is safe
under concurrent callers. To refresh after a Galaxy redeploy, call
`browse` again from the root.
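One way to get "one RPC even under concurrent callers" is a double-checked flag around an `asyncio.Lock`. The sketch below is a simplified stand-in for illustration: `ExpandOnce` and its `load_children` callback are hypothetical and replace the real node and its RPC.

```python
import asyncio


class ExpandOnce:
    """Simplified stand-in for a lazily expanded node (hypothetical)."""

    def __init__(self, load_children):
        self._load_children = load_children
        self.children: list[str] = []
        self.is_expanded = False
        self._lock = asyncio.Lock()

    async def expand(self) -> None:
        if self.is_expanded:          # fast path: already loaded
            return
        async with self._lock:
            if self.is_expanded:      # re-check: another caller finished first
                return
            self.children.extend(await self._load_children())
            self.is_expanded = True


async def demo() -> tuple[int, list[str]]:
    calls = 0

    async def load() -> list[str]:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0)        # yield so the other callers queue on the lock
        return ["child-a", "child-b"]

    node = ExpandOnce(load)
    await asyncio.gather(*(node.expand() for _ in range(5)))
    return calls, node.children


calls, children = asyncio.run(demo())
# calls == 1 and children == ["child-a", "child-b"]
```

The second check inside the lock is what prevents the queued callers from repeating the load once the first one completes.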
### Watching deploy events
`GalaxyRepositoryClient.watch_deploy_events` opens a server-streaming
@@ -187,6 +230,17 @@ The client supports plaintext channels for local development, TLS with system
roots, TLS with a custom `ca_file`, and an optional test server name override.
API keys are redacted from option repr output and CLI error output.
The gateway can auto-generate its own self-signed certificate (it has no PKI).
grpc-python has no per-channel skip-verify, so the lenient TLS default is
**trust-on-first-use**: with no `ca_file` and `require_certificate_validation`
left `False`, the client fetches the gateway's presented certificate once
(unverified) and pins it for the channel, defaulting the SNI/target-name override
to `localhost` (the generated certificate always carries a `localhost` SAN) when
none was supplied. To verify instead, pass `ca_file` to verify against a specific
CA, or set `require_certificate_validation=True` to verify against the system
trust roots. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
## CLI
The CLI emits deterministic JSON for automation:
@@ -198,8 +252,8 @@ mxgw-py register --session-id <id> --client-name python-client --json
mxgw-py add-item --session-id <id> --server-handle 1 --item Object.Attribute --json
mxgw-py advise --session-id <id> --server-handle 1 --item-handle 2 --json
mxgw-py stream-events --session-id <id> --max-events 1 --json
mxgw-py stream-alarms --session-id <id> --max-messages 1 --json
mxgw-py acknowledge-alarm --session-id <id> --alarm-reference "\\Galaxy\Area001.Pump001.PumpFault" --json
mxgw-py stream-alarms --max-messages 1 --json
mxgw-py acknowledge-alarm --reference "\\Galaxy\Area001.Pump001.PumpFault" --json
mxgw-py write --session-id <id> --server-handle 1 --item-handle 2 --type int32 --value 123 --json
```
@@ -225,6 +279,19 @@ $env:MXGATEWAY_TEST_ITEM = 'Object.Attribute'
mxgw-py smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
```
## Installing from the Gitea PyPI Feed
The client publishes to the internal Gitea PyPI feed:
````bash
pip install \
--index-url https://gitea.dohertylan.com/api/packages/dohertj2/pypi/simple/ \
zb-mom-ww-mxaccess-gateway-client
````
If you need authentication (private feed), use `--extra-index-url` and either
a `~/.netrc` entry or `PIP_INDEX_URL=https://<user>:<token>@gitea.dohertylan.com/...`.
## Related Documentation
- [Client Packaging](../../docs/ClientPackaging.md)
@@ -13,12 +13,35 @@ dependencies = [
"grpcio>=1.80,<2",
"protobuf>=6.33,<7",
]
authors = [
{ name = "Joseph Doherty" },
]
license = { text = "Proprietary" }
keywords = ["mxaccess", "mxgateway", "grpc", "client", "archestra"]
classifiers = [
"Development Status :: 3 - Alpha",
"License :: Other/Proprietary License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Topic :: System :: Distributed Computing",
"Topic :: Software Development :: Libraries :: Python Modules",
"Intended Audience :: Developers",
"Operating System :: OS Independent",
]
[project.urls]
Homepage = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
Repository = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
Issues = "https://gitea.dohertylan.com/dohertj2/mxaccessgw/issues"
[project.optional-dependencies]
dev = [
"grpcio-tools>=1.80,<2",
"pytest>=9,<10",
"pytest-asyncio>=1.3,<2",
"build>=1.2,<2",
"twine>=5,<6",
]
[project.scripts]
@@ -21,9 +21,10 @@ from .auth import merge_metadata
from .errors import MxGatewayError, map_rpc_error
from .generated import galaxy_repository_pb2 as galaxy_pb
from .generated import galaxy_repository_pb2_grpc as galaxy_pb_grpc
from .options import ClientOptions, create_channel
from .options import BrowseChildrenOptions, ClientOptions, create_channel
_DISCOVER_HIERARCHY_PAGE_SIZE = 5000
_BROWSE_CHILDREN_PAGE_SIZE = 500
class GalaxyRepositoryClient:
@@ -139,6 +140,89 @@ class GalaxyRepositoryClient:
)
seen_page_tokens.add(page_token)
async def browse_children_raw(
self, request: galaxy_pb.BrowseChildrenRequest
) -> galaxy_pb.BrowseChildrenReply:
"""Issue one BrowseChildren RPC and return the raw reply.
Lower-level escape hatch for callers that need direct page-token control
or do not want LazyBrowseNode wrapping. Most callers should use
:py:meth:`browse` and :py:meth:`LazyBrowseNode.expand` instead.
"""
return await self._unary(
"browse children",
self.raw_stub.BrowseChildren,
request,
)
async def browse(
self,
options: BrowseChildrenOptions | None = None,
) -> list["LazyBrowseNode"]:
"""Return the root browse nodes for lazy hierarchy traversal.
Each returned ``LazyBrowseNode`` wraps a Galaxy object whose direct
children can be loaded on demand by ``await node.expand()``.
"""
effective = options or BrowseChildrenOptions()
return [
node
async for node in self._iter_browse_children(
parent_gobject_id=None,
options=effective,
)
]
async def _iter_browse_children(
self,
*,
parent_gobject_id: int | None,
options: BrowseChildrenOptions,
) -> AsyncIterator["LazyBrowseNode"]:
page_token = ""
seen_page_tokens: set[str] = set()
while True:
request = galaxy_pb.BrowseChildrenRequest(
page_size=_BROWSE_CHILDREN_PAGE_SIZE,
page_token=page_token,
alarm_bearing_only=options.alarm_bearing_only,
historized_only=options.historized_only,
)
if parent_gobject_id is not None:
request.parent_gobject_id = parent_gobject_id
if options.category_ids:
request.category_ids.extend(options.category_ids)
if options.template_chain_contains:
request.template_chain_contains.extend(options.template_chain_contains)
if options.tag_name_glob:
request.tag_name_glob = options.tag_name_glob
if options.include_attributes is not None:
request.include_attributes = options.include_attributes
reply = await self._unary(
"browse children",
self.raw_stub.BrowseChildren,
request,
)
for index, obj in enumerate(reply.children):
hint = (
index < len(reply.child_has_children)
and bool(reply.child_has_children[index])
)
yield LazyBrowseNode(self, obj, hint, options)
page_token = reply.next_page_token
if not page_token:
return
if page_token in seen_page_tokens:
raise MxGatewayError(
f"galaxy browse children returned repeated page token {page_token!r}"
)
seen_page_tokens.add(page_token)
def watch_deploy_events(
self,
last_seen_deploy_time: datetime | None = None,
@@ -202,6 +286,67 @@ class GalaxyRepositoryClient:
raise map_rpc_error(operation, error) from error
class LazyBrowseNode:
"""One node in a lazy-loaded Galaxy browse tree.
Calling ``expand`` once fetches direct children (paginating as needed)
and populates ``children``. Subsequent calls are no-ops so callers can
drive UI expand toggles without de-duping.
"""
def __init__(
self,
client: "GalaxyRepositoryClient",
obj: galaxy_pb.GalaxyObject,
has_children_hint: bool,
options: BrowseChildrenOptions,
) -> None:
"""Initialize a node bound to its owning client and filter set."""
self._client = client
self._object = obj
self._has_children_hint = has_children_hint
self._options = options
self._children: list[LazyBrowseNode] = []
self._is_expanded = False
self._expand_lock = asyncio.Lock()
@property
def object(self) -> galaxy_pb.GalaxyObject:
"""Return the underlying ``GalaxyObject`` proto for this node."""
return self._object
@property
def has_children_hint(self) -> bool:
"""Return the server hint about whether this node has children."""
return self._has_children_hint
@property
def children(self) -> list["LazyBrowseNode"]:
"""Return a copy of the loaded child nodes (empty until expanded)."""
return list(self._children)
@property
def is_expanded(self) -> bool:
"""Return whether ``expand`` has already populated ``children``."""
return self._is_expanded
async def expand(self) -> None:
"""Fetch direct children of this node; no-op on subsequent calls."""
if self._is_expanded:
return
async with self._expand_lock:
if self._is_expanded:
return
new_children: list[LazyBrowseNode] = []
async for child in self._client._iter_browse_children(
parent_gobject_id=self._object.gobject_id,
options=self._options,
):
new_children.append(child)
self._children.extend(new_children)
self._is_expanded = True
async def _canceling_iterator(call: Any) -> AsyncIterator[galaxy_pb.DeployEvent]:
try:
async for event in call:
@@ -26,7 +26,7 @@ from google.protobuf import timestamp_pb2 as google_dot_protobuf_dot_timestamp__
from google.protobuf import wrappers_pb2 as google_dot_protobuf_dot_wrappers__pb2
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x17galaxy_repository.proto\x12\x14galaxy_repository.v1\x1a\x1fgoogle/protobuf/timestamp.proto\x1a\x1egoogle/protobuf/wrappers.proto\"\x17\n\x15TestConnectionRequest\"!\n\x13TestConnectionReply\x12\n\n\x02ok\x18\x01 \x01(\x08\"\x1a\n\x18GetLastDeployTimeRequest\"b\n\x16GetLastDeployTimeReply\x12\x0f\n\x07present\x18\x01 \x01(\x08\x12\x37\n\x13time_of_last_deploy\x18\x02 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\"\x87\x03\n\x18\x44iscoverHierarchyRequest\x12\x11\n\tpage_size\x18\x01 \x01(\x05\x12\x12\n\npage_token\x18\x02 \x01(\t\x12\x19\n\x0froot_gobject_id\x18\x03 \x01(\x05H\x00\x12\x17\n\rroot_tag_name\x18\x04 \x01(\tH\x00\x12\x1d\n\x13root_contained_path\x18\x05 \x01(\tH\x00\x12.\n\tmax_depth\x18\x06 \x01(\x0b\x32\x1b.google.protobuf.Int32Value\x12\x14\n\x0c\x63\x61tegory_ids\x18\x07 \x03(\x05\x12\x1f\n\x17template_chain_contains\x18\x08 \x03(\t\x12\x15\n\rtag_name_glob\x18\t \x01(\t\x12\x1f\n\x12include_attributes\x18\n \x01(\x08H\x01\x88\x01\x01\x12\x1a\n\x12\x61larm_bearing_only\x18\x0b \x01(\x08\x12\x17\n\x0fhistorized_only\x18\x0c \x01(\x08\x42\x06\n\x04rootB\x15\n\x13_include_attributes\"\x82\x01\n\x16\x44iscoverHierarchyReply\x12\x33\n\x07objects\x18\x01 \x03(\x0b\x32\".galaxy_repository.v1.GalaxyObject\x12\x17\n\x0fnext_page_token\x18\x02 \x01(\t\x12\x1a\n\x12total_object_count\x18\x03 \x01(\x05\"U\n\x18WatchDeployEventsRequest\x12\x39\n\x15last_seen_deploy_time\x18\x01 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\"\xdd\x01\n\x0b\x44\x65ployEvent\x12\x10\n\x08sequence\x18\x01 \x01(\x04\x12/\n\x0bobserved_at\x18\x02 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\x12\x37\n\x13time_of_last_deploy\x18\x03 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\x12#\n\x1btime_of_last_deploy_present\x18\x04 \x01(\x08\x12\x14\n\x0cobject_count\x18\x05 \x01(\x05\x12\x17\n\x0f\x61ttribute_count\x18\x06 \x01(\x05\"\x93\x02\n\x0cGalaxyObject\x12\x12\n\ngobject_id\x18\x01 \x01(\x05\x12\x10\n\x08tag_name\x18\x02 
\x01(\t\x12\x16\n\x0e\x63ontained_name\x18\x03 \x01(\t\x12\x13\n\x0b\x62rowse_name\x18\x04 \x01(\t\x12\x19\n\x11parent_gobject_id\x18\x05 \x01(\x05\x12\x0f\n\x07is_area\x18\x06 \x01(\x08\x12\x13\n\x0b\x63\x61tegory_id\x18\x07 \x01(\x05\x12\x1c\n\x14hosted_by_gobject_id\x18\x08 \x01(\x05\x12\x16\n\x0etemplate_chain\x18\t \x03(\t\x12\x39\n\nattributes\x18\n \x03(\x0b\x32%.galaxy_repository.v1.GalaxyAttribute\"\xa8\x02\n\x0fGalaxyAttribute\x12\x16\n\x0e\x61ttribute_name\x18\x01 \x01(\t\x12\x1a\n\x12\x66ull_tag_reference\x18\x02 \x01(\t\x12\x14\n\x0cmx_data_type\x18\x03 \x01(\x05\x12\x16\n\x0e\x64\x61ta_type_name\x18\x04 \x01(\t\x12\x10\n\x08is_array\x18\x05 \x01(\x08\x12\x17\n\x0f\x61rray_dimension\x18\x06 \x01(\x05\x12\x1f\n\x17\x61rray_dimension_present\x18\x07 \x01(\x08\x12\x1d\n\x15mx_attribute_category\x18\x08 \x01(\x05\x12\x1f\n\x17security_classification\x18\t \x01(\x05\x12\x15\n\ris_historized\x18\n \x01(\x08\x12\x10\n\x08is_alarm\x18\x0b \x01(\x08\x32\xcc\x03\n\x10GalaxyRepository\x12h\n\x0eTestConnection\x12+.galaxy_repository.v1.TestConnectionRequest\x1a).galaxy_repository.v1.TestConnectionReply\x12q\n\x11GetLastDeployTime\x12..galaxy_repository.v1.GetLastDeployTimeRequest\x1a,.galaxy_repository.v1.GetLastDeployTimeReply\x12q\n\x11\x44iscoverHierarchy\x12..galaxy_repository.v1.DiscoverHierarchyRequest\x1a,.galaxy_repository.v1.DiscoverHierarchyReply\x12h\n\x11WatchDeployEvents\x12..galaxy_repository.v1.WatchDeployEventsRequest\x1a!.galaxy_repository.v1.DeployEvent0\x01\x42-\xaa\x02*ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxyb\x06proto3')
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x17galaxy_repository.proto\x12\x14galaxy_repository.v1\x1a\x1fgoogle/protobuf/timestamp.proto\x1a\x1egoogle/protobuf/wrappers.proto\"\x17\n\x15TestConnectionRequest\"!\n\x13TestConnectionReply\x12\n\n\x02ok\x18\x01 \x01(\x08\"\x1a\n\x18GetLastDeployTimeRequest\"b\n\x16GetLastDeployTimeReply\x12\x0f\n\x07present\x18\x01 \x01(\x08\x12\x37\n\x13time_of_last_deploy\x18\x02 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\"\x87\x03\n\x18\x44iscoverHierarchyRequest\x12\x11\n\tpage_size\x18\x01 \x01(\x05\x12\x12\n\npage_token\x18\x02 \x01(\t\x12\x19\n\x0froot_gobject_id\x18\x03 \x01(\x05H\x00\x12\x17\n\rroot_tag_name\x18\x04 \x01(\tH\x00\x12\x1d\n\x13root_contained_path\x18\x05 \x01(\tH\x00\x12.\n\tmax_depth\x18\x06 \x01(\x0b\x32\x1b.google.protobuf.Int32Value\x12\x14\n\x0c\x63\x61tegory_ids\x18\x07 \x03(\x05\x12\x1f\n\x17template_chain_contains\x18\x08 \x03(\t\x12\x15\n\rtag_name_glob\x18\t \x01(\t\x12\x1f\n\x12include_attributes\x18\n \x01(\x08H\x01\x88\x01\x01\x12\x1a\n\x12\x61larm_bearing_only\x18\x0b \x01(\x08\x12\x17\n\x0fhistorized_only\x18\x0c \x01(\x08\x42\x06\n\x04rootB\x15\n\x13_include_attributes\"\x82\x01\n\x16\x44iscoverHierarchyReply\x12\x33\n\x07objects\x18\x01 \x03(\x0b\x32\".galaxy_repository.v1.GalaxyObject\x12\x17\n\x0fnext_page_token\x18\x02 \x01(\t\x12\x1a\n\x12total_object_count\x18\x03 \x01(\x05\"U\n\x18WatchDeployEventsRequest\x12\x39\n\x15last_seen_deploy_time\x18\x01 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\"\xdd\x01\n\x0b\x44\x65ployEvent\x12\x10\n\x08sequence\x18\x01 \x01(\x04\x12/\n\x0bobserved_at\x18\x02 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\x12\x37\n\x13time_of_last_deploy\x18\x03 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\x12#\n\x1btime_of_last_deploy_present\x18\x04 \x01(\x08\x12\x14\n\x0cobject_count\x18\x05 \x01(\x05\x12\x17\n\x0f\x61ttribute_count\x18\x06 \x01(\x05\"\x93\x02\n\x0cGalaxyObject\x12\x12\n\ngobject_id\x18\x01 \x01(\x05\x12\x10\n\x08tag_name\x18\x02 
\x01(\t\x12\x16\n\x0e\x63ontained_name\x18\x03 \x01(\t\x12\x13\n\x0b\x62rowse_name\x18\x04 \x01(\t\x12\x19\n\x11parent_gobject_id\x18\x05 \x01(\x05\x12\x0f\n\x07is_area\x18\x06 \x01(\x08\x12\x13\n\x0b\x63\x61tegory_id\x18\x07 \x01(\x05\x12\x1c\n\x14hosted_by_gobject_id\x18\x08 \x01(\x05\x12\x16\n\x0etemplate_chain\x18\t \x03(\t\x12\x39\n\nattributes\x18\n \x03(\x0b\x32%.galaxy_repository.v1.GalaxyAttribute\"\xa8\x02\n\x0fGalaxyAttribute\x12\x16\n\x0e\x61ttribute_name\x18\x01 \x01(\t\x12\x1a\n\x12\x66ull_tag_reference\x18\x02 \x01(\t\x12\x14\n\x0cmx_data_type\x18\x03 \x01(\x05\x12\x16\n\x0e\x64\x61ta_type_name\x18\x04 \x01(\t\x12\x10\n\x08is_array\x18\x05 \x01(\x08\x12\x17\n\x0f\x61rray_dimension\x18\x06 \x01(\x05\x12\x1f\n\x17\x61rray_dimension_present\x18\x07 \x01(\x08\x12\x1d\n\x15mx_attribute_category\x18\x08 \x01(\x05\x12\x1f\n\x17security_classification\x18\t \x01(\x05\x12\x15\n\ris_historized\x18\n \x01(\x08\x12\x10\n\x08is_alarm\x18\x0b \x01(\x08\"\xdc\x02\n\x15\x42rowseChildrenRequest\x12\x1b\n\x11parent_gobject_id\x18\x01 \x01(\x05H\x00\x12\x19\n\x0fparent_tag_name\x18\x02 \x01(\tH\x00\x12\x1f\n\x15parent_contained_path\x18\x03 \x01(\tH\x00\x12\x11\n\tpage_size\x18\x04 \x01(\x05\x12\x12\n\npage_token\x18\x05 \x01(\t\x12\x14\n\x0c\x63\x61tegory_ids\x18\x06 \x03(\x05\x12\x1f\n\x17template_chain_contains\x18\x07 \x03(\t\x12\x15\n\rtag_name_glob\x18\x08 \x01(\t\x12\x1f\n\x12include_attributes\x18\t \x01(\x08H\x01\x88\x01\x01\x12\x1a\n\x12\x61larm_bearing_only\x18\n \x01(\x08\x12\x17\n\x0fhistorized_only\x18\x0b \x01(\x08\x42\x08\n\x06parentB\x15\n\x13_include_attributes\"\xb3\x01\n\x13\x42rowseChildrenReply\x12\x34\n\x08\x63hildren\x18\x01 \x03(\x0b\x32\".galaxy_repository.v1.GalaxyObject\x12\x17\n\x0fnext_page_token\x18\x02 \x01(\t\x12\x19\n\x11total_child_count\x18\x03 \x01(\x05\x12\x1a\n\x12\x63hild_has_children\x18\x04 \x03(\x08\x12\x16\n\x0e\x63\x61\x63he_sequence\x18\x05 
\x01(\x04\x32\xb6\x04\n\x10GalaxyRepository\x12h\n\x0eTestConnection\x12+.galaxy_repository.v1.TestConnectionRequest\x1a).galaxy_repository.v1.TestConnectionReply\x12q\n\x11GetLastDeployTime\x12..galaxy_repository.v1.GetLastDeployTimeRequest\x1a,.galaxy_repository.v1.GetLastDeployTimeReply\x12q\n\x11\x44iscoverHierarchy\x12..galaxy_repository.v1.DiscoverHierarchyRequest\x1a,.galaxy_repository.v1.DiscoverHierarchyReply\x12h\n\x11WatchDeployEvents\x12..galaxy_repository.v1.WatchDeployEventsRequest\x1a!.galaxy_repository.v1.DeployEvent0\x01\x12h\n\x0e\x42rowseChildren\x12+.galaxy_repository.v1.BrowseChildrenRequest\x1a).galaxy_repository.v1.BrowseChildrenReplyB-\xaa\x02*ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxyb\x06proto3')
_globals = globals()
_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, _globals)
@@ -54,6 +54,10 @@ if not _descriptor._USE_C_DESCRIPTORS:
_globals['_GALAXYOBJECT']._serialized_end=1416
_globals['_GALAXYATTRIBUTE']._serialized_start=1419
_globals['_GALAXYATTRIBUTE']._serialized_end=1715
_globals['_GALAXYREPOSITORY']._serialized_start=1718
_globals['_GALAXYREPOSITORY']._serialized_end=2178
_globals['_BROWSECHILDRENREQUEST']._serialized_start=1718
_globals['_BROWSECHILDRENREQUEST']._serialized_end=2066
_globals['_BROWSECHILDRENREPLY']._serialized_start=2069
_globals['_BROWSECHILDRENREPLY']._serialized_end=2248
_globals['_GALAXYREPOSITORY']._serialized_start=2251
_globals['_GALAXYREPOSITORY']._serialized_end=2817
# @@protoc_insertion_point(module_scope)
@@ -65,6 +65,11 @@ class GalaxyRepositoryStub(object):
request_serializer=galaxy__repository__pb2.WatchDeployEventsRequest.SerializeToString,
response_deserializer=galaxy__repository__pb2.DeployEvent.FromString,
_registered_method=True)
self.BrowseChildren = channel.unary_unary(
'/galaxy_repository.v1.GalaxyRepository/BrowseChildren',
request_serializer=galaxy__repository__pb2.BrowseChildrenRequest.SerializeToString,
response_deserializer=galaxy__repository__pb2.BrowseChildrenReply.FromString,
_registered_method=True)
class GalaxyRepositoryServicer(object):
@@ -111,6 +116,16 @@ class GalaxyRepositoryServicer(object):
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def BrowseChildren(self, request, context):
"""Returns the direct children of a parent object (or the root objects when
`parent` is unset). Designed for OPC UA-style lazy expand: clients walk
one level at a time instead of paging the full hierarchy. Filters mirror
DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
"""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def add_GalaxyRepositoryServicer_to_server(servicer, server):
rpc_method_handlers = {
@@ -134,6 +149,11 @@ def add_GalaxyRepositoryServicer_to_server(servicer, server):
request_deserializer=galaxy__repository__pb2.WatchDeployEventsRequest.FromString,
response_serializer=galaxy__repository__pb2.DeployEvent.SerializeToString,
),
'BrowseChildren': grpc.unary_unary_rpc_method_handler(
servicer.BrowseChildren,
request_deserializer=galaxy__repository__pb2.BrowseChildrenRequest.FromString,
response_serializer=galaxy__repository__pb2.BrowseChildrenReply.SerializeToString,
),
}
generic_handler = grpc.method_handlers_generic_handler(
'galaxy_repository.v1.GalaxyRepository', rpc_method_handlers)
@@ -263,3 +283,30 @@ class GalaxyRepository(object):
timeout,
metadata,
_registered_method=True)
@staticmethod
def BrowseChildren(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(
request,
target,
'/galaxy_repository.v1.GalaxyRepository/BrowseChildren',
galaxy__repository__pb2.BrowseChildrenRequest.SerializeToString,
galaxy__repository__pb2.BrowseChildrenReply.FromString,
options,
channel_credentials,
insecure,
call_credentials,
compression,
wait_for_ready,
timeout,
metadata,
_registered_method=True)
@@ -135,6 +135,9 @@ class MxAccessGatewayServicer(object):
reconnect to seed Part 9 client state, or to reconcile alarms that may
have been missed during a transport blip. Streamed so callers can
begin processing without buffering the full set.
`QueryActiveAlarmsRequest.alarm_filter_prefix` optionally narrows the
snapshot to alarms whose `alarm_full_reference` starts with the given
prefix; an empty prefix returns the full set.
"""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
@@ -2,12 +2,15 @@
from __future__ import annotations
from dataclasses import dataclass
import ssl
from collections.abc import Sequence
from dataclasses import dataclass, field
from pathlib import Path
import grpc
from .auth import REDACTED, ApiKey
from .errors import MxGatewayTransportError
@dataclass(frozen=True)
@@ -18,6 +21,7 @@ class ClientOptions:
api_key: str | ApiKey | None = None
plaintext: bool = False
ca_file: str | None = None
require_certificate_validation: bool = False
server_name_override: str | None = None
call_timeout: float | None = 30.0
stream_timeout: float | None = None
@@ -44,6 +48,7 @@ class ClientOptions:
f"{type(self).__name__}(endpoint={self.endpoint!r}, "
f"api_key={api_key!r}, plaintext={self.plaintext!r}, "
f"ca_file={self.ca_file!r}, "
f"require_certificate_validation={self.require_certificate_validation!r}, "
f"server_name_override={self.server_name_override!r}, "
f"call_timeout={self.call_timeout!r}, "
f"stream_timeout={self.stream_timeout!r}, "
@@ -51,8 +56,51 @@ class ClientOptions:
)
@dataclass(frozen=True)
class BrowseChildrenOptions:
"""Filters and shape options for ``GalaxyRepositoryClient.browse``.
Mirrors the AND-combined filter set on ``BrowseChildrenRequest`` so a
single instance can be re-used across an entire lazy browse session
(the filter set is part of the page-token contract).
"""
category_ids: Sequence[int] = field(default_factory=tuple)
template_chain_contains: Sequence[str] = field(default_factory=tuple)
tag_name_glob: str | None = None
include_attributes: bool | None = None
alarm_bearing_only: bool = False
historized_only: bool = False
def _split_authority(endpoint: str) -> tuple[str, int]:
"""Split a gRPC target (optionally scheme-prefixed) into (host, port).
Handles bracketed IPv6 literals (e.g. ``[::1]:5120`` or bare ``[::1]``),
returning the host without brackets so it is safe to pass to
``ssl.get_server_certificate``.
"""
target = endpoint.split("://", 1)[-1]
if target.startswith("["):
# Bracketed IPv6: "[::1]:5120" or "[::1]"
bracket_end = target.find("]")
host = target[1:bracket_end] # strip surrounding brackets
remainder = target[bracket_end + 1 :] # ":5120" or ""
port_str = remainder.lstrip(":")
return (host, int(port_str) if port_str else 443)
host, _, port = target.rpartition(":")
return (host or "localhost", int(port) if port else 443)
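The authority split above can be exercised in isolation. The sketch below is a standalone variant, not the committed helper: `split_authority` and `default_port` are hypothetical names, and the port-less branch is an assumption added here. It is included because `str.rpartition(":")` on a target with no colon leaves the whole string in the last slot, so the `int(port)` call in the helper above would raise `ValueError` for a bare hostname.

```python
def split_authority(endpoint: str, default_port: int = 443) -> tuple[str, int]:
    """Split an optionally scheme-prefixed gRPC target into (host, port).

    Hypothetical standalone variant of the helper in the diff; the
    port-less branch is an assumption, not part of the commit.
    """
    target = endpoint.split("://", 1)[-1]
    if target.startswith("["):
        # Bracketed IPv6 literal: "[::1]:5120" or bare "[::1]".
        bracket_end = target.find("]")
        if bracket_end == -1:
            raise ValueError(f"unterminated IPv6 literal in {endpoint!r}")
        host = target[1:bracket_end]
        port_str = target[bracket_end + 1:].lstrip(":")
        return host, int(port_str) if port_str else default_port
    if ":" not in target:
        # Bare hostname with no port: fall back to the default port.
        return target or "localhost", default_port
    host, _, port_str = target.rpartition(":")
    return host or "localhost", int(port_str) if port_str else default_port
```

Returning the IPv6 host without brackets matches the intent stated in the docstring above, since the result is passed on as a `(host, port)` address tuple.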
def create_channel(options: ClientOptions) -> grpc.aio.Channel:
"""Create a plaintext or TLS `grpc.aio` channel from client options."""
"""Create a plaintext or TLS `grpc.aio` channel from client options.
The TLS default is lenient: grpc-python has no per-channel skip-verify, so
the server's presented certificate is fetched once (unverified) and pinned
as the channel's only trust root (trust-on-first-use). Set
`require_certificate_validation=True` to force system-trust verification, or
pass `ca_file` to verify against a specific CA; both bypass the TOFU path.
"""
channel_options: list[tuple[str, str | int]] = [
("grpc.max_receive_message_length", options.max_grpc_message_bytes),
@@ -64,11 +112,28 @@ def create_channel(options: ClientOptions) -> grpc.aio.Channel:
if options.plaintext:
return grpc.aio.insecure_channel(options.endpoint, options=channel_options)
root_certificates = None
if options.ca_file:
root_certificates = Path(options.ca_file).read_bytes()
credentials = grpc.ssl_channel_credentials(root_certificates=root_certificates)
elif options.require_certificate_validation:
credentials = grpc.ssl_channel_credentials()
else:
# Lenient default: grpc-python has no per-channel skip-verify, so fetch the
# server's certificate (unverified) and pin it for this channel (TOFU).
host, port = _split_authority(options.endpoint)
try:
presented = ssl.get_server_certificate((host, port))
except OSError as error:
raise MxGatewayTransportError(
f"failed to fetch TLS certificate from {options.endpoint}: {error}"
) from error
credentials = grpc.ssl_channel_credentials(root_certificates=presented.encode("ascii"))
# The gateway self-signed cert always carries a "localhost" SAN, so default
# the SNI/target-name override to it when none was supplied, tolerating
# dial-by-IP or hostname mismatch.
if not options.server_name_override:
channel_options.append(("grpc.ssl_target_name_override", "localhost"))
credentials = grpc.ssl_channel_credentials(root_certificates=root_certificates)
return grpc.aio.secure_channel(
options.endpoint,
credentials,
@@ -3,15 +3,18 @@
from __future__ import annotations
import asyncio
import contextlib
import io
import json
import logging
import os
import sys
import time
from collections.abc import Awaitable, Callable
from datetime import datetime, timezone
from typing import Any
import click
from click.testing import CliRunner
from google.protobuf.json_format import MessageToDict
from zb_mom_ww_mxgateway import __version__
@@ -22,6 +25,8 @@ from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
from zb_mom_ww_mxgateway.options import ClientOptions
from zb_mom_ww_mxgateway.values import MxValueInput, to_mx_value
logger = logging.getLogger(__name__)
MAX_AGGREGATE_EVENTS = 10_000
_BATCH_EOR = "__MXGW_BATCH_EOR__"
@@ -56,9 +61,10 @@ def batch() -> None:
Errors do NOT terminate the loop. Each command's output (including any error JSON) is
written to stdout followed by a line containing exactly ``__MXGW_BATCH_EOR__``, then
stdout is flushed. Error output is formatted as ``{"error": "...", "type": "..."}``.
"""
runner = CliRunner()
Recursive ``batch`` lines are rejected (Client.Python-024): re-entering the batch
dispatcher would silently spawn a nested loop reading from the same exhausted stdin.
"""
for raw_line in sys.stdin:
line = raw_line.rstrip("\n").rstrip("\r")
@@ -68,44 +74,77 @@ def batch() -> None:
args = line.split()
try:
result = runner.invoke(main, args, catch_exceptions=True)
except Exception as exc: # noqa: BLE001 — be safe; never let batch loop die
_batch_write_error(exc.__class__.__name__, str(exc))
# Reject a recursive `batch` line outright: the nested invocation would
# read from the already-exhausted stdin (or, depending on harness, the
# same stream the outer batch is consuming line-by-line) and silently
# exit. Surface it as an explicit error block so callers can audit the
# mis-routed line.
if args and args[0] == "batch":
_batch_write_error(
"RecursiveBatchError",
"nested 'batch' invocation is not allowed inside batch mode",
)
_batch_flush_eor()
continue
if result.exit_code == 0:
# Normal success — write captured output as-is.
sys.stdout.write(result.output)
_dispatch_batch_line(args)
def _dispatch_batch_line(args: list[str]) -> None:
"""Run a single batch line through the Click parser directly (no CliRunner).
Captures the subcommand's stdout via :func:`contextlib.redirect_stdout` and
synthesises the standard ``{"error": ..., "type": ...}`` shape on failure.
Click exceptions (`ClickException`, `UsageError`) are caught and rendered;
`SystemExit(0)` from a Click command is treated as a clean exit, while a
non-zero `SystemExit` is rendered as a CLI error. All other exceptions are
captured and rendered as `{"error": str(exc), "type": exc.__class__.__name__}`
so the loop never dies.
"""
buffer = io.StringIO()
exit_code = 0
exc: BaseException | None = None
try:
with contextlib.redirect_stdout(buffer):
try:
# `standalone_mode=False` makes Click raise instead of calling
# `sys.exit`; we still need to handle SystemExit because some
# commands explicitly raise it (or `click.UsageError` converts
# to a SystemExit under some entry-point paths).
main.main(args=args, standalone_mode=False, prog_name="mxgw-py")
except click.exceptions.Exit as click_exit:
exit_code = click_exit.exit_code
except click.ClickException as click_exc:
exit_code = click_exc.exit_code
exc = click_exc
click.echo(f"Error: {click_exc.format_message()}", err=False)
except SystemExit as sys_exit:
code = sys_exit.code
exit_code = int(code) if isinstance(code, int) else (0 if code is None else 1)
except Exception as captured: # noqa: BLE001 — never let batch loop die
exc = captured
exit_code = 1
output = buffer.getvalue()
if exit_code == 0 and exc is None:
sys.stdout.write(output)
else:
if output.lstrip().startswith("{"):
# Inner command already emitted JSON (e.g. a structured error) —
# relay verbatim.
sys.stdout.write(output)
if output and not output.endswith("\n"):
sys.stdout.write("\n")
elif exc is not None:
_batch_write_error(type(exc).__name__, str(exc))
else:
# Something went wrong. If the command already emitted a JSON object
# (e.g. the output starts with '{'), trust that and relay it verbatim.
# Otherwise synthesise the standard {"error": ..., "type": ...} shape.
output = result.output or ""
exc = result.exception
msg = output.strip()
if msg.startswith("Error: "):
msg = msg[len("Error: "):]
_batch_write_error("CliError", msg)
if output.lstrip().startswith("{"):
# Already JSON — relay verbatim (may or may not end with newline).
sys.stdout.write(output)
if not output.endswith("\n"):
sys.stdout.write("\n")
elif exc is not None and not isinstance(exc, SystemExit):
_batch_write_error(type(exc).__name__, str(exc))
else:
# Click's default error format is "Error: <message>\n"; extract the
# message so the harness gets clean JSON.
msg = output.strip()
if msg.startswith("Error: "):
msg = msg[len("Error: "):]
exc_type = (
type(exc).__name__
if exc is not None and not isinstance(exc, SystemExit)
else "CliError"
)
_batch_write_error(exc_type, msg)
_batch_flush_eor()
_batch_flush_eor()
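The capture-and-relay pattern used by the new dispatcher can be sketched without Click. This is a minimal illustration under assumptions: `run_batch_line` and `command` are hypothetical names, and the real dispatcher additionally distinguishes Click exits and `SystemExit` codes, which this sketch leaves out.

```python
import contextlib
import io
import json

BATCH_EOR = "__MXGW_BATCH_EOR__"


def run_batch_line(command, args):
    """Run one command and return the text block a batch loop would emit.

    `command` is any callable that prints to stdout; it stands in for the
    Click entry point used in the diff.
    """
    buffer = io.StringIO()
    error = None
    try:
        with contextlib.redirect_stdout(buffer):
            command(args)
    except Exception as exc:  # keep the loop alive on any command failure
        error = exc
    output = buffer.getvalue()
    if error is not None and not output.lstrip().startswith("{"):
        # No structured output was produced: synthesise the error shape.
        output = json.dumps({"error": str(error), "type": type(error).__name__}) + "\n"
    elif output and not output.endswith("\n"):
        output += "\n"
    # Every block ends with the end-of-record marker on its own line.
    return output + BATCH_EOR + "\n"
```

Because the marker is appended on every path, a harness reading the stream can split on it without needing to know whether a given line succeeded.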
def _batch_write_error(exc_type: str, message: str) -> None:
@@ -673,7 +712,6 @@ async def _write_secured2_bulk(**kwargs: Any) -> dict[str, Any]:
async def _bench_read_bulk(**kwargs: Any) -> dict[str, Any]:
"""ReadBulk stress benchmark — matches the .NET / Go / Rust / Java schema."""
import time
bulk_size = int(kwargs["bulk_size"])
if bulk_size < 1:
@@ -730,12 +768,12 @@ async def _bench_read_bulk(**kwargs: Any) -> dict[str, Any]:
if item_handles:
try:
await session.unsubscribe_bulk(server_handle, item_handles)
except Exception:
pass
except Exception as exc: # noqa: BLE001 — bench is best-effort
logger.warning("bench-read-bulk: unsubscribe_bulk cleanup failed: %s", exc)
try:
await session.close()
except Exception:
pass
except Exception as exc: # noqa: BLE001 — bench is best-effort
logger.warning("bench-read-bulk: session.close cleanup failed: %s", exc)
return {
"language": "python",
@@ -899,11 +937,21 @@ def _session(client: GatewayClient, session_id: str):
def _use_plaintext(kwargs: dict[str, Any]) -> bool:
if kwargs.get("use_tls"):
return False
if kwargs.get("plaintext"):
return True
return kwargs["endpoint"].startswith("localhost:") or kwargs["endpoint"].startswith("127.0.0.1:")
"""Resolve the plaintext / TLS contract from the CLI flags.
TLS is the default. ``--plaintext`` is the only way to opt in to an
unencrypted channel; ``--tls`` is accepted as a redundant explicit
affirmation. Combining the two is a usage error (regression-guarded by
Client.Python-023: the previous silent ``localhost:`` /
``127.0.0.1:`` auto-plaintext branch leaked the API-key bearer over a
plaintext channel when a user ran the gateway behind TLS on loopback).
"""
plaintext = bool(kwargs.get("plaintext"))
use_tls = bool(kwargs.get("use_tls"))
if plaintext and use_tls:
raise click.UsageError("--plaintext and --tls are mutually exclusive")
return plaintext
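The flag contract described in the docstring above reduces to a small truth table. A minimal sketch, assuming a hypothetical `resolve_plaintext` helper and using `ValueError` in place of `click.UsageError` so that it runs with the standard library only:

```python
def resolve_plaintext(plaintext: bool = False, use_tls: bool = False) -> bool:
    """Return True only when plaintext was explicitly requested.

    Hypothetical stand-in for the CLI helper; the endpoint is deliberately
    not consulted, so a loopback address never downgrades the channel.
    """
    if plaintext and use_tls:
        # The real CLI raises click.UsageError here.
        raise ValueError("--plaintext and --tls are mutually exclusive")
    return plaintext
```

Keeping the endpoint out of the signature makes the earlier loopback auto-downgrade impossible to reintroduce by accident.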
def _api_key_from_env(name: str | None) -> str | None:
+186 -23
@@ -72,27 +72,83 @@ def test_create_channel_uses_plaintext_channel(monkeypatch: pytest.MonkeyPatch)
]
def test_create_channel_uses_tls_channel(monkeypatch: pytest.MonkeyPatch) -> None:
calls: list[tuple[str, object, object]] = []
def test_create_channel_uses_tls_channel_tofu_default(monkeypatch: pytest.MonkeyPatch) -> None:
"""Default TLS (no ca_file, no require_certificate_validation) uses TOFU:
fetches the server cert unverified, pins it as root_certificates, and adds
grpc.ssl_target_name_override = "localhost" automatically.
"""
_DUMMY_PEM = "-----BEGIN CERTIFICATE-----\nZmFrZQ==\n-----END CERTIFICATE-----\n"
get_cert_calls: list[tuple[str, int]] = []
def fake_credentials(*, root_certificates: object) -> str:
assert root_certificates is None
def fake_get_server_certificate(addr: tuple[str, int]) -> str:
get_cert_calls.append(addr)
return _DUMMY_PEM
cred_calls: list[object] = []
def fake_credentials(*, root_certificates: object = None) -> str:
cred_calls.append(root_certificates)
return "creds"
channel_calls: list[tuple[str, object, object]] = []
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
calls.append((endpoint, credentials, options))
channel_calls.append((endpoint, credentials, options))
return "tls-channel"
monkeypatch.setattr(
options_module.grpc,
"ssl_channel_credentials",
fake_credentials,
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
channel = create_channel(
ClientOptions(endpoint="gateway.example:5001"),
)
assert channel == "tls-channel"
# TOFU: should have fetched the cert from the server (host, port)
assert get_cert_calls == [("gateway.example", 5001)]
# Pinned the fetched PEM bytes as root_certificates
assert cred_calls == [_DUMMY_PEM.encode("ascii")]
# Auto-injected localhost override (no server_name_override supplied)
assert channel_calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
("grpc.ssl_target_name_override", "localhost"),
],
),
]
def test_create_channel_uses_tls_channel_tofu_respects_server_name_override(
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""When server_name_override is set, TOFU still runs but does NOT add the
auto-localhost override (the explicit override is already in channel_options).
"""
_DUMMY_PEM = "-----BEGIN CERTIFICATE-----\nZmFrZQ==\n-----END CERTIFICATE-----\n"
monkeypatch.setattr(
options_module.grpc.aio,
"secure_channel",
fake_secure_channel,
options_module.ssl,
"get_server_certificate",
lambda addr: _DUMMY_PEM,
)
cred_calls: list[object] = []
def fake_credentials(*, root_certificates: object = None) -> str:
cred_calls.append(root_certificates)
return "creds"
channel_calls: list[tuple[str, object, object]] = []
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
channel_calls.append((endpoint, credentials, options))
return "tls-channel"
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
channel = create_channel(
ClientOptions(
@@ -102,14 +158,121 @@ def test_create_channel_uses_tls_channel(monkeypatch: pytest.MonkeyPatch) -> Non
)
assert channel == "tls-channel"
assert calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
("grpc.ssl_target_name_override", "gateway.test"),
],
),
]
assert cred_calls == [_DUMMY_PEM.encode("ascii")]
assert channel_calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
# Explicit override from ClientOptions — not the auto-localhost one
("grpc.ssl_target_name_override", "gateway.test"),
],
),
]
def test_create_channel_uses_tls_channel_require_cert_validation(
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""require_certificate_validation=True uses system trust (no TOFU, no root_certificates)."""
get_cert_called = False
def fake_get_server_certificate(addr: object) -> str: # pragma: no cover
nonlocal get_cert_called
get_cert_called = True
return "SHOULD_NOT_BE_CALLED"
cred_calls: list[object] = []
def fake_credentials(**kwargs: object) -> str:
cred_calls.append(kwargs)
return "creds"
channel_calls: list[tuple[str, object, object]] = []
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
channel_calls.append((endpoint, credentials, options))
return "tls-channel"
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
channel = create_channel(
ClientOptions(
endpoint="gateway.example:5001",
require_certificate_validation=True,
),
)
assert channel == "tls-channel"
# Must NOT call TOFU prefetch
assert not get_cert_called
# ssl_channel_credentials() called with NO keyword args (system trust)
assert cred_calls == [{}]
assert channel_calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
],
),
]
def test_create_channel_uses_tls_channel_ca_file(
monkeypatch: pytest.MonkeyPatch,
tmp_path: pytest.TempPathFactory,
) -> None:
"""ca_file path: reads the PEM file, passes bytes as root_certificates, skips TOFU."""
ca_pem = b"-----BEGIN CERTIFICATE-----\nY2FkYXRh\n-----END CERTIFICATE-----\n"
ca_file = tmp_path / "ca.pem"
ca_file.write_bytes(ca_pem)
get_cert_called = False
def fake_get_server_certificate(addr: object) -> str: # pragma: no cover
nonlocal get_cert_called
get_cert_called = True
return "SHOULD_NOT_BE_CALLED"
cred_calls: list[object] = []
def fake_credentials(*, root_certificates: object = None) -> str:
cred_calls.append(root_certificates)
return "creds"
channel_calls: list[tuple[str, object, object]] = []
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
channel_calls.append((endpoint, credentials, options))
return "tls-channel"
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
channel = create_channel(
ClientOptions(
endpoint="gateway.example:5001",
ca_file=str(ca_file),
),
)
assert channel == "tls-channel"
assert not get_cert_called
assert cred_calls == [ca_pem]
assert channel_calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
],
),
]
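The four channel tests above pin down a precedence order among the trust paths. The sketch below restates that order as pure logic, assuming hypothetical names (`TrustMode`, `select_trust_mode`, `needs_localhost_override`); it does not open a channel and is not the client's implementation. The plaintext case is omitted because it returns before any credentials are built.

```python
from enum import Enum


class TrustMode(Enum):
    CA_FILE = "ca_file"  # verify against the supplied CA bundle
    SYSTEM = "system"    # verify against the system trust store
    TOFU = "tofu"        # fetch the presented certificate and pin it


def select_trust_mode(ca_file, require_certificate_validation):
    """Mirror the if / elif / else order in create_channel."""
    if ca_file:
        return TrustMode.CA_FILE
    if require_certificate_validation:
        return TrustMode.SYSTEM
    return TrustMode.TOFU


def needs_localhost_override(mode, server_name_override):
    """The automatic "localhost" target-name override applies to TOFU only."""
    return mode is TrustMode.TOFU and not server_name_override
```

Note that `ca_file` wins when both `ca_file` and `require_certificate_validation` are set, which follows from the branch order in the diff.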
+276
@@ -6,12 +6,16 @@ import asyncio
from datetime import datetime, timezone
from typing import Any
import grpc
import pytest
from google.protobuf.timestamp_pb2 import Timestamp
from zb_mom_ww_mxgateway import ClientOptions, DeployEvent, GalaxyRepositoryClient, WatchDeployEventsRequest
from zb_mom_ww_mxgateway.errors import MxGatewayError
from zb_mom_ww_mxgateway.galaxy import LazyBrowseNode
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2_grpc as galaxy_pb_grpc
from zb_mom_ww_mxgateway.options import BrowseChildrenOptions
def test_galaxy_messages_import() -> None:
@@ -268,15 +272,281 @@ async def test_close_marks_channel_closed_when_no_real_channel() -> None:
await client.close()
def _obj(gid: int, tag: str, is_area: bool = False) -> galaxy_pb.GalaxyObject:
return galaxy_pb.GalaxyObject(
gobject_id=gid, tag_name=tag, browse_name=tag, is_area=is_area,
)
def _build_browse_reply(
children: list[galaxy_pb.GalaxyObject],
child_has_children: list[bool],
cache_sequence: int,
next_page_token: str = "",
) -> galaxy_pb.BrowseChildrenReply:
reply = galaxy_pb.BrowseChildrenReply(
total_child_count=len(children),
cache_sequence=cache_sequence,
next_page_token=next_page_token,
)
reply.children.extend(children)
reply.child_has_children.extend(child_has_children)
return reply
def _fake_aio_rpc_error(code: grpc.StatusCode, details: str) -> grpc.aio.AioRpcError:
return grpc.aio.AioRpcError(
code=code,
initial_metadata=grpc.aio.Metadata(),
trailing_metadata=grpc.aio.Metadata(),
details=details,
)
@pytest.mark.asyncio
async def test_browse_no_parent_returns_roots() -> None:
stub = FakeGalaxyStub()
stub.browse_children.replies = [
_build_browse_reply(
children=[_obj(1, "Area_A", is_area=True), _obj(2, "Area_B", is_area=True)],
child_has_children=[True, False],
cache_sequence=7,
),
]
client = await GalaxyRepositoryClient.connect(
ClientOptions(endpoint="fake", plaintext=True),
stub=stub,
)
roots = await client.browse()
assert len(roots) == 2
assert all(isinstance(node, LazyBrowseNode) for node in roots)
assert roots[0].object.tag_name == "Area_A"
assert roots[0].has_children_hint is True
assert roots[1].has_children_hint is False
assert roots[0].is_expanded is False
request = stub.browse_children.requests[0]
assert request.WhichOneof("parent") is None
assert request.page_size == 500
assert request.page_token == ""
@pytest.mark.asyncio
async def test_browse_expand_populates_children_and_marks_expanded() -> None:
stub = FakeGalaxyStub()
stub.browse_children.replies = [
_build_browse_reply(
children=[_obj(1, "Area_A", is_area=True)],
child_has_children=[True],
cache_sequence=1,
),
_build_browse_reply(
children=[_obj(11, "Child_A"), _obj(12, "Child_B")],
child_has_children=[False, False],
cache_sequence=1,
),
]
client = await GalaxyRepositoryClient.connect(
ClientOptions(endpoint="fake", plaintext=True),
stub=stub,
)
roots = await client.browse()
await roots[0].expand()
assert roots[0].is_expanded is True
assert [n.object.tag_name for n in roots[0].children] == ["Child_A", "Child_B"]
assert len(stub.browse_children.requests) == 2
expand_request = stub.browse_children.requests[1]
assert expand_request.WhichOneof("parent") == "parent_gobject_id"
assert expand_request.parent_gobject_id == 1
@pytest.mark.asyncio
async def test_browse_expand_idempotent_no_second_rpc() -> None:
stub = FakeGalaxyStub()
stub.browse_children.replies = [
_build_browse_reply(
children=[_obj(1, "Area_A", is_area=True)],
child_has_children=[True],
cache_sequence=1,
),
_build_browse_reply(
children=[_obj(11, "Child_A")],
child_has_children=[False],
cache_sequence=1,
),
]
client = await GalaxyRepositoryClient.connect(
ClientOptions(endpoint="fake", plaintext=True),
stub=stub,
)
roots = await client.browse()
await roots[0].expand()
await roots[0].expand()
assert len(stub.browse_children.requests) == 2
assert len(roots[0].children) == 1
@pytest.mark.asyncio
async def test_browse_expand_concurrent_callers_only_fire_one_rpc() -> None:
stub = FakeGalaxyStub()
stub.browse_children.replies = [
_build_browse_reply([_obj(1, "Plant", is_area=True)], [True], 7),
_build_browse_reply([_obj(2, "Mixer_001")], [False], 7),
]
client = await GalaxyRepositoryClient.connect(
ClientOptions(endpoint="fake", plaintext=True),
stub=stub,
)
roots = await client.browse()
# Ten concurrent expand calls on the same node should issue exactly one RPC.
await asyncio.gather(*(roots[0].expand() for _ in range(10)))
assert roots[0].is_expanded
assert len(roots[0].children) == 1
# 1 roots fetch + exactly 1 expand fetch = 2 total
assert len(stub.browse_children.requests) == 2
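The single-RPC guarantee asserted above is commonly achieved with a lock and a re-check of the expanded flag. The sketch below shows that pattern under assumptions: `LazyNode` and `fetch_children` are hypothetical, and the actual `LazyBrowseNode` may be implemented differently.

```python
import asyncio


class LazyNode:
    """Hypothetical lazily expanded tree node (not the library's class)."""

    def __init__(self, fetch_children):
        self._fetch_children = fetch_children  # async callable returning a list
        self._lock = asyncio.Lock()
        self.children = []
        self.is_expanded = False

    async def expand(self):
        if self.is_expanded:
            return  # fast path: already populated, no lock needed
        async with self._lock:
            if self.is_expanded:
                return  # another caller finished while this one waited
            self.children = await self._fetch_children()
            self.is_expanded = True
```

The second check inside the lock is what keeps callers that queued behind the first one from issuing their own fetch. If the fetch raises, the flag stays unset, so a later call can retry.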
@pytest.mark.asyncio
async def test_browse_expand_unknown_parent_raises_mxgateway_error() -> None:
stub = FakeGalaxyStub()
stub.browse_children.replies = [
_build_browse_reply(
children=[_obj(99, "Stale_Parent", is_area=True)],
child_has_children=[True],
cache_sequence=1,
),
]
stub.browse_children.exceptions = [
None,
_fake_aio_rpc_error(grpc.StatusCode.NOT_FOUND, "parent not found"),
]
client = await GalaxyRepositoryClient.connect(
ClientOptions(endpoint="fake", plaintext=True),
stub=stub,
)
roots = await client.browse()
with pytest.raises(MxGatewayError):
await roots[0].expand()
@pytest.mark.asyncio
async def test_browse_expand_multi_page_gathers_all_pages() -> None:
stub = FakeGalaxyStub()
stub.browse_children.replies = [
_build_browse_reply(
children=[_obj(7, "Area_Big", is_area=True)],
child_has_children=[True],
cache_sequence=2,
),
_build_browse_reply(
children=[_obj(71, "Child_1"), _obj(72, "Child_2")],
child_has_children=[False, False],
cache_sequence=2,
next_page_token="7:abc:2",
),
_build_browse_reply(
children=[_obj(73, "Child_3")],
child_has_children=[False],
cache_sequence=2,
),
]
client = await GalaxyRepositoryClient.connect(
ClientOptions(endpoint="fake", plaintext=True),
stub=stub,
)
roots = await client.browse()
await roots[0].expand()
assert [n.object.tag_name for n in roots[0].children] == ["Child_1", "Child_2", "Child_3"]
assert len(stub.browse_children.requests) == 3
assert stub.browse_children.requests[2].page_token == "7:abc:2"
assert stub.browse_children.requests[2].parent_gobject_id == 7
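The multi-page behaviour checked above amounts to following `next_page_token` until it comes back empty. A minimal sketch of that loop, assuming a hypothetical `fetch_page(token)` coroutine that returns an `(items, next_token)` pair:

```python
import asyncio


async def gather_all_pages(fetch_page):
    """Collect items from every page, starting with an empty token."""
    items = []
    token = ""
    while True:
        page_items, next_token = await fetch_page(token)
        items.extend(page_items)
        if not next_token:
            return items  # an empty token marks the last page
        token = next_token


def gather_all_pages_sync(fetch_page):
    """Convenience wrapper for callers that are not already in an event loop."""
    return asyncio.run(gather_all_pages(fetch_page))
```

In the client under test the parent identifier and filter set are resent with each page, as the final two assertions above indicate; the sketch leaves those out for brevity.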
@pytest.mark.asyncio
async def test_browse_with_filter_forwards_to_request() -> None:
stub = FakeGalaxyStub()
stub.browse_children.replies = [
_build_browse_reply(
children=[_obj(1, "Area_A", is_area=True)],
child_has_children=[False],
cache_sequence=3,
),
]
client = await GalaxyRepositoryClient.connect(
ClientOptions(endpoint="fake", plaintext=True),
stub=stub,
)
options = BrowseChildrenOptions(
category_ids=(4, 5),
template_chain_contains=("$DelmiaReceiver",),
tag_name_glob="Area_*",
include_attributes=True,
alarm_bearing_only=True,
historized_only=True,
)
await client.browse(options)
request = stub.browse_children.requests[0]
assert list(request.category_ids) == [4, 5]
assert list(request.template_chain_contains) == ["$DelmiaReceiver"]
assert request.tag_name_glob == "Area_*"
assert request.HasField("include_attributes")
assert request.include_attributes is True
assert request.alarm_bearing_only is True
assert request.historized_only is True
@pytest.mark.asyncio
async def test_browse_children_raw_returns_reply_unwrapped() -> None:
"""browse_children_raw forwards the request to the stub and returns the raw reply."""
stub = FakeGalaxyStub()
expected = _build_browse_reply(
children=[_obj(1, "Plant", is_area=True)],
child_has_children=[True],
cache_sequence=42,
)
stub.browse_children.replies = [expected]
async with await GalaxyRepositoryClient.connect(
endpoint="fake",
plaintext=True,
stub=stub,
) as client:
request = galaxy_pb.BrowseChildrenRequest(
page_size=10,
tag_name_glob="Plant*",
)
reply = await client.browse_children_raw(request)
assert reply.cache_sequence == 42
assert len(reply.children) == 1
assert reply.children[0].tag_name == "Plant"
assert len(stub.browse_children.requests) == 1
assert stub.browse_children.requests[0].tag_name_glob == "Plant*"
class FakeGalaxyStub:
def __init__(self) -> None:
self.test_connection = FakeUnary([galaxy_pb.TestConnectionReply(ok=False)])
self.get_last_deploy_time = FakeUnary([galaxy_pb.GetLastDeployTimeReply(present=False)])
self.discover_hierarchy = FakeUnary([galaxy_pb.DiscoverHierarchyReply()])
self.browse_children = FakeUnary([galaxy_pb.BrowseChildrenReply()])
self.watch_deploy_events = FakeStream([])
self.TestConnection = self.test_connection
self.GetLastDeployTime = self.get_last_deploy_time
self.DiscoverHierarchy = self.discover_hierarchy
self.BrowseChildren = self.browse_children
@property
def WatchDeployEvents(self) -> "FakeStream": # noqa: N802 — gRPC naming
@@ -287,6 +557,8 @@ class FakeUnary:
def __init__(self, replies: list[Any]) -> None:
self.replies = replies
self.requests: list[Any] = []
# None entries mean "no exception on this call"; aligns with the replies queue index-by-index.
self.exceptions: list[BaseException | None] = []
self.metadata: tuple[tuple[str, str], ...] | None = None
async def __call__(
@@ -298,6 +570,10 @@ class FakeUnary:
) -> Any:
self.requests.append(request)
self.metadata = metadata
if self.exceptions:
exc = self.exceptions.pop(0)
if exc is not None:
raise exc
return self.replies.pop(0)
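The exception queue added to the fake above can be tried out in a reduced form. In this sketch `FakeUnaryCall` and `drive` are hypothetical names, and the metadata handling of the real fake is omitted.

```python
import asyncio


class FakeUnaryCall:
    """Reduced fake: queued replies plus an optional queue of exceptions."""

    def __init__(self, replies, exceptions=None):
        self.replies = list(replies)
        self.exceptions = list(exceptions or [])  # None means "do not raise"
        self.requests = []

    async def __call__(self, request):
        self.requests.append(request)
        if self.exceptions:
            exc = self.exceptions.pop(0)
            if exc is not None:
                raise exc
        return self.replies.pop(0)


def drive(fake, requests):
    """Call the fake once per request; record replies or exception names."""
    async def _run():
        outcomes = []
        for request in requests:
            try:
                outcomes.append(await fake(request))
            except Exception as exc:
                outcomes.append(type(exc).__name__)
        return outcomes
    return asyncio.run(_run())
```

One detail worth keeping in mind when writing tests against this shape: a call that raises does not consume a reply, so the next successful call returns the next unread reply.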
@@ -0,0 +1,789 @@
"""Regression tests for Client.Python-022..026.
Each test corresponds to a finding from the latest re-review. Tests are
TDD-first: they failed against the pre-fix source and pass against the
fixed source.
"""
from __future__ import annotations
import json
import re
import time as _time_module_ref
from pathlib import Path
from typing import Any
import pytest
from click.testing import CliRunner
from zb_mom_ww_mxgateway import ClientOptions, GatewayClient
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
from zb_mom_ww_mxgateway_cli import commands as cli_commands
from zb_mom_ww_mxgateway_cli.commands import _use_plaintext, main
_BATCH_EOR = "__MXGW_BATCH_EOR__"
# ---------------------------------------------------------------------------
# Client.Python-022 — README CLI examples must parse against the implementation.
# ---------------------------------------------------------------------------
def _readme_path() -> Path:
    return Path(__file__).resolve().parent.parent / "README.md"


def _extract_mxgw_py_examples() -> list[list[str]]:
    """Return the README's ``mxgw-py ...`` example lines as click arg lists.

    Replaces angle-bracket placeholders (``<id>``) with safe stub values and
    leaves real flag names untouched. The returned arg lists drop the
    ``mxgw-py`` prefix.
    """
    text = _readme_path().read_text(encoding="utf-8")
    args: list[list[str]] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line.startswith("mxgw-py "):
            continue
        # Strip the leading "mxgw-py " token.
        body = line[len("mxgw-py ") :]
        # Replace common placeholders so click does not error on the placeholder.
        body = body.replace("<id>", "session-1")
        # Backtick-quoted hostnames in the TLS example are not represented
        # in CLI; safe to leave as-is.
        tokens = _split_cli_tokens(body)
        # Every example is kept, including the TLS multi-flag one. The
        # examples this test targets are the stream-alarms / acknowledge-alarm
        # ones added in commit 8738735.
        args.append(tokens)
    return args


def _split_cli_tokens(body: str) -> list[str]:
    """Split a CLI body into argv tokens, honouring double-quoted strings."""
    tokens: list[str] = []
    pattern = re.compile(r'"([^"]*)"|(\S+)')
    for match in pattern.finditer(body):
        quoted, plain = match.group(1), match.group(2)
        tokens.append(quoted if quoted is not None else plain)
    return tokens


def test_readme_alarm_examples_parse_against_cli() -> None:
    """README `stream-alarms` / `acknowledge-alarm` examples must parse without
    triggering Click's ``no such option`` error.

    Drives every README ``mxgw-py`` example through Click's ``--help`` style
    parser by re-invoking the documented argv with a trailing ``--help`` flag so
    only the parser runs (no RPC is attempted). If a documented flag does not
    exist on the subcommand, Click prints ``no such option: --<flag>`` and
    exits 2; that is the regression we want to catch.
    """
    runner = CliRunner()
    examples = _extract_mxgw_py_examples()
    assert any(
        "stream-alarms" in args for args in examples
    ), "README must include a stream-alarms example."
    assert any(
        "acknowledge-alarm" in args for args in examples
    ), "README must include an acknowledge-alarm example."
    for argv in examples:
        # Append --help so only the parser runs. Click parses every option
        # before --help takes effect, so an unknown option still errors out
        # first.
        result = runner.invoke(main, [*argv, "--help"])
        # Either help text printed (exit 0) or some other parser issue (exit 2);
        # we only want to assert NO "no such option" error.
        assert "no such option" not in result.output.lower(), (
            f"README example failed Click parsing: argv={argv!r}\n"
            f"output={result.output!r}"
        )
# ---------------------------------------------------------------------------
# Client.Python-023 — REGRESSION of Client.Python-013. _use_plaintext must
# not silently auto-downgrade on localhost / 127.0.0.1.
# ---------------------------------------------------------------------------
def test_use_plaintext_does_not_auto_downgrade_for_localhost_endpoint() -> None:
    """A bare ``localhost:...`` endpoint with no flags must default to TLS."""
    assert _use_plaintext({
        "endpoint": "localhost:5001",
        "plaintext": False,
        "use_tls": False,
    }) is False


def test_use_plaintext_does_not_auto_downgrade_for_loopback_ipv4_endpoint() -> None:
    """A bare ``127.0.0.1:...`` endpoint with no flags must default to TLS."""
    assert _use_plaintext({
        "endpoint": "127.0.0.1:5001",
        "plaintext": False,
        "use_tls": False,
    }) is False


def test_use_plaintext_requires_explicit_plaintext_flag() -> None:
    """``--plaintext`` is the only way to opt in."""
    assert _use_plaintext({
        "endpoint": "localhost:5001",
        "plaintext": True,
        "use_tls": False,
    }) is True


def test_use_plaintext_tls_flag_explicitly_disables_plaintext() -> None:
    """``--tls`` is accepted as an explicit affirmation of the default."""
    assert _use_plaintext({
        "endpoint": "localhost:5001",
        "plaintext": False,
        "use_tls": True,
    }) is False


def test_use_plaintext_rejects_plaintext_and_tls_combined() -> None:
    """``--plaintext`` and ``--tls`` together must be rejected as ambiguous."""
    import click as _click

    with pytest.raises(_click.UsageError):
        _use_plaintext({
            "endpoint": "localhost:5001",
            "plaintext": True,
            "use_tls": True,
        })


def test_cli_localhost_endpoint_with_no_flags_uses_tls_channel(monkeypatch) -> None:
    """End-to-end CLI: against ``localhost:...`` with no flags, the resolved
    ``ClientOptions.plaintext`` flowing into ``GatewayClient.connect`` must be
    ``False`` (TLS), so the API key bearer cannot leak over plaintext.
    """
    captured: dict[str, Any] = {}

    class _FakeStub:
        def __init__(self) -> None:
            pass

        async def OpenSession(self, request: Any, *, metadata: tuple[Any, ...]) -> Any:
            captured["metadata"] = metadata
            return pb.OpenSessionReply(
                session_id="session-1",
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
            )

    real_connect = GatewayClient.connect

    @classmethod
    async def _spy_connect(cls, options: ClientOptions, **kwargs: Any) -> GatewayClient:
        captured["options"] = options
        return await real_connect(options, stub=_FakeStub())

    monkeypatch.setattr(GatewayClient, "connect", _spy_connect)
    runner = CliRunner()
    result = runner.invoke(
        main,
        [
            "open-session",
            "--endpoint",
            "localhost:5000",
            "--api-key",
            "mxgw_test_secret",
            "--json",
        ],
    )
    assert result.exit_code == 0, result.output
    assert "options" in captured
    assert captured["options"].plaintext is False, (
        "localhost endpoint without --plaintext must NOT auto-downgrade to plaintext"
    )
# ---------------------------------------------------------------------------
# Client.Python-024 — `batch` must not use CliRunner from production code,
# and a recursive `batch` line must not silently re-enter.
# ---------------------------------------------------------------------------
def test_batch_command_does_not_use_clirunner_in_production() -> None:
    """`commands.py` must not import or instantiate the test-only CliRunner helper.

    Docstring references explaining what the module deliberately avoids are
    permitted; what is forbidden is an actual ``import`` of ``click.testing``
    or an actual ``CliRunner()`` instantiation in executable code.
    """
    source = Path(cli_commands.__file__).read_text(encoding="utf-8")
    assert "from click.testing" not in source, (
        "click.testing is a test-only helper and must not be used by production code"
    )
    assert "import click.testing" not in source, (
        "click.testing is a test-only helper and must not be used by production code"
    )
    # `CliRunner()` (instantiation) must not appear in production code.
    assert "CliRunner(" not in source, (
        "CliRunner() must not be instantiated in production code"
    )


def test_batch_recursive_batch_line_is_bounded() -> None:
    """A `batch` line nested inside `batch` stdin must not be silently spawned.

    The pre-fix implementation re-invoked the test runner with empty stdin,
    so `batch` inside `batch` exited cleanly with no error. The fix either
    rejects the nested invocation or surfaces it as an error block so the
    behaviour is auditable.
    """
    runner = CliRunner()
    result = runner.invoke(
        main,
        ["batch"],
        input="batch\nversion --json\n",
    )
    # Outer batch must still exit 0 and process both lines.
    assert result.exit_code == 0
    assert result.output.count(_BATCH_EOR) == 2
    blocks = [block for block in result.output.split(_BATCH_EOR + "\n") if block]
    # The first block — the recursive `batch` line — must surface an error
    # JSON. (Either an explicit rejection, or some non-empty error block —
    # NOT a silently empty block.)
    first_block = blocks[0].strip()
    assert first_block, "recursive batch line must not be silently swallowed"
    payload = json.loads(first_block.splitlines()[-1])
    assert "error" in payload, (
        f"recursive batch line should surface an error: got {payload!r}"
    )
# ---------------------------------------------------------------------------
# Client.Python-025 — Behavioural tests for new bulk SDK methods,
# stream_alarms, and the new CLI subcommands.
# ---------------------------------------------------------------------------
class _AlarmFakeStream:
    def __init__(self, messages: list[pb.AlarmFeedMessage]) -> None:
        self._messages = list(messages)
        self.cancelled = False

    def __aiter__(self) -> "_AlarmFakeStream":
        return self

    async def __anext__(self) -> pb.AlarmFeedMessage:
        if not self._messages:
            raise StopAsyncIteration
        return self._messages.pop(0)

    def cancel(self) -> None:
        self.cancelled = True


class _BulkFakeUnary:
    def __init__(self, replies: list[Any]) -> None:
        self.replies = replies
        self.requests: list[Any] = []
        self.metadata: tuple[tuple[str, str], ...] | None = None

    async def __call__(self, request: Any, *, metadata: tuple[tuple[str, str], ...]) -> Any:
        self.requests.append(request)
        self.metadata = metadata
        return self.replies.pop(0)


class _BulkFakeStub:
    def __init__(self) -> None:
        self.open_session = _BulkFakeUnary(
            [
                pb.OpenSessionReply(
                    session_id="session-1",
                    protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                ),
            ],
        )
        self.invoke = _BulkFakeUnary([])
        self.OpenSession = self.open_session
        self.Invoke = self.invoke
        self.stream_alarms_metadata: tuple[tuple[str, str], ...] | None = None
        self._alarm_stream = _AlarmFakeStream([])

    def set_invoke_replies(self, replies: list[Any]) -> None:
        self.invoke.replies = replies

    def set_alarm_stream(self, stream: _AlarmFakeStream) -> None:
        self._alarm_stream = stream

    def StreamAlarms(self, request: Any, *, metadata: tuple[tuple[str, str], ...]) -> Any:
        self.stream_alarms_request = request
        self.stream_alarms_metadata = metadata
        return self._alarm_stream
@pytest.mark.asyncio
async def test_session_read_bulk_sends_expected_request_shape() -> None:
    stub = _BulkFakeStub()
    stub.set_invoke_replies(
        [
            pb.MxCommandReply(
                session_id="session-1",
                kind=pb.MX_COMMAND_KIND_READ_BULK,
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                read_bulk=pb.BulkReadReply(
                    results=[
                        pb.BulkReadResult(
                            tag_address="Tank01.Level",
                            was_successful=True,
                        ),
                    ],
                ),
            ),
        ],
    )
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    session = await client.open_session()
    results = await session.read_bulk(12, ["Tank01.Level"], timeout_ms=1500)
    assert len(results) == 1
    assert results[0].tag_address == "Tank01.Level"
    request = stub.invoke.requests[0]
    assert request.command.kind == pb.MX_COMMAND_KIND_READ_BULK
    assert request.command.read_bulk.server_handle == 12
    assert list(request.command.read_bulk.tag_addresses) == ["Tank01.Level"]
    assert request.command.read_bulk.timeout_ms == 1500


@pytest.mark.asyncio
async def test_session_write_bulk_sends_expected_request_shape() -> None:
    stub = _BulkFakeStub()
    stub.set_invoke_replies(
        [
            pb.MxCommandReply(
                session_id="session-1",
                kind=pb.MX_COMMAND_KIND_WRITE_BULK,
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                write_bulk=pb.BulkWriteReply(
                    results=[pb.BulkWriteResult(was_successful=True)],
                ),
            ),
        ],
    )
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    session = await client.open_session()
    from zb_mom_ww_mxgateway.values import to_mx_value

    entries = [
        pb.WriteBulkEntry(item_handle=34, user_id=99, value=to_mx_value(123)),
    ]
    results = await session.write_bulk(12, entries)
    assert results[0].was_successful is True
    cmd = stub.invoke.requests[0].command
    assert cmd.kind == pb.MX_COMMAND_KIND_WRITE_BULK
    assert cmd.write_bulk.server_handle == 12
    assert cmd.write_bulk.entries[0].item_handle == 34
    assert cmd.write_bulk.entries[0].user_id == 99
@pytest.mark.asyncio
async def test_session_write2_bulk_sends_expected_request_shape() -> None:
    stub = _BulkFakeStub()
    stub.set_invoke_replies(
        [
            pb.MxCommandReply(
                session_id="session-1",
                kind=pb.MX_COMMAND_KIND_WRITE2_BULK,
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                write2_bulk=pb.BulkWriteReply(
                    results=[pb.BulkWriteResult(was_successful=True)],
                ),
            ),
        ],
    )
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    session = await client.open_session()
    from zb_mom_ww_mxgateway.values import to_mx_value

    entries = [
        pb.Write2BulkEntry(
            item_handle=34,
            user_id=99,
            value=to_mx_value(123),
            timestamp_value=to_mx_value(1.5),
        ),
    ]
    results = await session.write2_bulk(12, entries)
    assert results[0].was_successful is True
    cmd = stub.invoke.requests[0].command
    assert cmd.kind == pb.MX_COMMAND_KIND_WRITE2_BULK
    assert cmd.write2_bulk.server_handle == 12
    assert cmd.write2_bulk.entries[0].item_handle == 34


@pytest.mark.asyncio
async def test_session_write_secured_bulk_sends_expected_request_shape() -> None:
    stub = _BulkFakeStub()
    stub.set_invoke_replies(
        [
            pb.MxCommandReply(
                session_id="session-1",
                kind=pb.MX_COMMAND_KIND_WRITE_SECURED_BULK,
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                write_secured_bulk=pb.BulkWriteReply(
                    results=[pb.BulkWriteResult(was_successful=True)],
                ),
            ),
        ],
    )
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    session = await client.open_session()
    from zb_mom_ww_mxgateway.values import to_mx_value

    entries = [
        pb.WriteSecuredBulkEntry(
            item_handle=34,
            current_user_id=42,
            verifier_user_id=43,
            value=to_mx_value("secret"),
        ),
    ]
    results = await session.write_secured_bulk(12, entries)
    assert results[0].was_successful is True
    cmd = stub.invoke.requests[0].command
    assert cmd.kind == pb.MX_COMMAND_KIND_WRITE_SECURED_BULK
    assert cmd.write_secured_bulk.server_handle == 12
    assert cmd.write_secured_bulk.entries[0].current_user_id == 42
    assert cmd.write_secured_bulk.entries[0].verifier_user_id == 43
@pytest.mark.asyncio
async def test_session_write_secured2_bulk_sends_expected_request_shape() -> None:
    stub = _BulkFakeStub()
    stub.set_invoke_replies(
        [
            pb.MxCommandReply(
                session_id="session-1",
                kind=pb.MX_COMMAND_KIND_WRITE_SECURED2_BULK,
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                write_secured2_bulk=pb.BulkWriteReply(
                    results=[pb.BulkWriteResult(was_successful=True)],
                ),
            ),
        ],
    )
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    session = await client.open_session()
    from zb_mom_ww_mxgateway.values import to_mx_value

    entries = [
        pb.WriteSecured2BulkEntry(
            item_handle=34,
            current_user_id=42,
            verifier_user_id=43,
            value=to_mx_value("secret"),
            timestamp_value=to_mx_value(1.5),
        ),
    ]
    results = await session.write_secured2_bulk(12, entries)
    assert results[0].was_successful is True
    cmd = stub.invoke.requests[0].command
    assert cmd.kind == pb.MX_COMMAND_KIND_WRITE_SECURED2_BULK
    assert cmd.write_secured2_bulk.entries[0].current_user_id == 42


@pytest.mark.asyncio
async def test_stream_alarms_yields_feed_messages_and_cancels_on_close() -> None:
    transitions = [
        pb.AlarmFeedMessage(
            transition=pb.OnAlarmTransitionEvent(
                alarm_full_reference="Tank01.Level.HiHi",
                transition_kind=pb.ALARM_TRANSITION_KIND_RAISE,
            ),
        ),
    ]
    stream = _AlarmFakeStream(transitions)
    stub = _BulkFakeStub()
    stub.set_alarm_stream(stream)
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    iterator = client.stream_alarms(pb.StreamAlarmsRequest(alarm_filter_prefix="Tank01."))
    first = await anext(iterator)
    await iterator.aclose()
    assert first.transition.alarm_full_reference == "Tank01.Level.HiHi"
    assert stream.cancelled
    assert stub.stream_alarms_metadata == (("authorization", "Bearer mxgw_test_secret"),)
    assert stub.stream_alarms_request.alarm_filter_prefix == "Tank01."
# ---- CLI happy-path coverage for the new subcommands ----
def _install_fake_connect(monkeypatch, stub: Any) -> dict[str, Any]:
    """Patch `GatewayClient.connect` so the CLI uses the supplied fake stub."""
    captured: dict[str, Any] = {}
    real_connect = GatewayClient.connect

    @classmethod
    async def _spy_connect(cls, options: ClientOptions, **kwargs: Any) -> GatewayClient:
        captured["options"] = options
        return await real_connect(options, stub=stub)

    monkeypatch.setattr(GatewayClient, "connect", _spy_connect)
    return captured


def test_cli_read_bulk_happy_path(monkeypatch) -> None:
    stub = _BulkFakeStub()
    stub.set_invoke_replies(
        [
            pb.MxCommandReply(
                session_id="session-1",
                kind=pb.MX_COMMAND_KIND_READ_BULK,
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                read_bulk=pb.BulkReadReply(
                    results=[
                        pb.BulkReadResult(
                            tag_address="Tank01.Level",
                            was_successful=True,
                        ),
                    ],
                ),
            ),
        ],
    )
    _install_fake_connect(monkeypatch, stub)
    runner = CliRunner()
    result = runner.invoke(
        main,
        [
            "read-bulk",
            "--endpoint",
            "localhost:5000",
            "--plaintext",
            "--session-id",
            "session-1",
            "--server-handle",
            "12",
            "--items",
            "Tank01.Level",
            "--timeout-ms",
            "1500",
            "--json",
        ],
    )
    assert result.exit_code == 0, result.output
    payload = json.loads(result.output)
    assert payload["results"][0]["tagAddress"] == "Tank01.Level"


def test_cli_write_bulk_happy_path(monkeypatch) -> None:
    stub = _BulkFakeStub()
    stub.set_invoke_replies(
        [
            pb.MxCommandReply(
                session_id="session-1",
                kind=pb.MX_COMMAND_KIND_WRITE_BULK,
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                write_bulk=pb.BulkWriteReply(
                    results=[pb.BulkWriteResult(was_successful=True)],
                ),
            ),
        ],
    )
    _install_fake_connect(monkeypatch, stub)
    runner = CliRunner()
    result = runner.invoke(
        main,
        [
            "write-bulk",
            "--endpoint",
            "localhost:5000",
            "--plaintext",
            "--session-id",
            "session-1",
            "--server-handle",
            "12",
            "--item-handles",
            "34",
            "--values",
            "123",
            "--type",
            "int32",
            "--json",
        ],
    )
    assert result.exit_code == 0, result.output
    payload = json.loads(result.output)
    assert payload["results"][0]["wasSuccessful"] is True
    cmd = stub.invoke.requests[0].command
    assert cmd.kind == pb.MX_COMMAND_KIND_WRITE_BULK
def test_cli_stream_alarms_happy_path(monkeypatch) -> None:
    transitions = [
        pb.AlarmFeedMessage(
            transition=pb.OnAlarmTransitionEvent(
                alarm_full_reference="Tank01.Level.HiHi",
                transition_kind=pb.ALARM_TRANSITION_KIND_RAISE,
            ),
        ),
    ]
    stream = _AlarmFakeStream(transitions)
    stub = _BulkFakeStub()
    stub.set_alarm_stream(stream)
    _install_fake_connect(monkeypatch, stub)
    runner = CliRunner()
    result = runner.invoke(
        main,
        [
            "stream-alarms",
            "--endpoint",
            "localhost:5000",
            "--plaintext",
            "--max-messages",
            "1",
            "--timeout",
            "5.0",
            "--filter-prefix",
            "Tank01.",
            "--json",
        ],
    )
    assert result.exit_code == 0, result.output
    payload = json.loads(result.output)
    assert payload["messages"][0]["transition"]["alarmFullReference"] == "Tank01.Level.HiHi"


def test_cli_acknowledge_alarm_happy_path(monkeypatch) -> None:
    stub = _BulkFakeStub()
    stub.acknowledge_alarm = _BulkFakeUnary(
        [
            pb.AcknowledgeAlarmReply(
                correlation_id="corr-1",
                protocol_status=pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK),
                status=pb.MxStatusProxy(success=1, category=pb.MX_STATUS_CATEGORY_OK),
            ),
        ],
    )
    stub.AcknowledgeAlarm = stub.acknowledge_alarm
    _install_fake_connect(monkeypatch, stub)
    runner = CliRunner()
    result = runner.invoke(
        main,
        [
            "acknowledge-alarm",
            "--endpoint",
            "localhost:5000",
            "--plaintext",
            "--reference",
            "Tank01.Level.HiHi",
            "--comment",
            "investigating",
            "--operator",
            "alice",
            "--json",
        ],
    )
    assert result.exit_code == 0, result.output
    captured_request = stub.acknowledge_alarm.requests[0]
    assert captured_request.alarm_full_reference == "Tank01.Level.HiHi"
    assert captured_request.comment == "investigating"
    assert captured_request.operator_user == "alice"
# ---------------------------------------------------------------------------
# Client.Python-026 — `import time` at module scope; tighter cleanup excepts.
# ---------------------------------------------------------------------------
def test_commands_module_imports_time_at_module_scope() -> None:
    """`time` must be imported at module scope, not inside `_bench_read_bulk`.

    `inspect.getsource(_bench_read_bulk)` must not contain a function-local
    ``import time`` statement.
    """
    import inspect

    source = inspect.getsource(cli_commands._bench_read_bulk)
    # The function body must NOT contain a function-local `import time` line.
    for line in source.splitlines():
        stripped = line.strip()
        assert stripped != "import time", (
            f"_bench_read_bulk must not have function-local `import time`: {line!r}"
        )
    # And the module-level `time` attribute must be present.
    assert hasattr(cli_commands, "time"), (
        "`time` must be imported at module scope on commands.py"
    )
    assert cli_commands.time is _time_module_ref


def test_commands_module_bench_read_bulk_does_not_use_bare_except_pass() -> None:
    """The two `except Exception: pass` cleanup blocks in `_bench_read_bulk`
    must be removed in favour of either logging or a narrower exception class.
    """
    import inspect

    source = inspect.getsource(cli_commands._bench_read_bulk)
    # Reject the bare `except Exception:` followed by `pass` pattern in
    # `_bench_read_bulk`. We tolerate `except Exception as <name>:` because the
    # fix logs the exception.
    pattern = re.compile(r"except\s+Exception\s*:\s*\n\s*pass\b")
    assert not pattern.search(source), (
        "_bench_read_bulk cleanup blocks must log or narrow the except clause"
    )
+165
@@ -0,0 +1,165 @@
"""TLS behaviour tests for ``create_channel``.
These spin up a real loopback ``grpc.aio`` server with a freshly generated
self-signed certificate (carrying a ``localhost`` SAN, mirroring the gateway's
auto-generated cert) and assert the lenient TOFU default lets a client connect
without any CA configured.
Marked ``tls`` and skipped unless ``MXGATEWAY_RUN_TLS_TESTS=1`` because loopback
TLS handshakes can be timing-flaky on shared CI runners. This mirrors how the
suite gates anything that depends on real sockets rather than fakes.
"""
from __future__ import annotations
import os
import shutil
import socket
import ssl
import subprocess
import tempfile
from collections.abc import AsyncIterator
from pathlib import Path
import grpc
import pytest
import pytest_asyncio
from zb_mom_ww_mxgateway import ClientOptions
from zb_mom_ww_mxgateway.errors import MxGatewayTransportError
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2_grpc as pb_grpc
from zb_mom_ww_mxgateway.options import create_channel
pytestmark = pytest.mark.tls
_RUN_TLS_TESTS = os.environ.get("MXGATEWAY_RUN_TLS_TESTS") == "1"
_OPENSSL = shutil.which("openssl")
requires_tls = pytest.mark.skipif(
not _RUN_TLS_TESTS,
reason="set MXGATEWAY_RUN_TLS_TESTS=1 to run loopback TLS tests",
)
requires_openssl = pytest.mark.skipif(
_OPENSSL is None,
reason="openssl CLI is required to generate a self-signed test certificate",
)
def _generate_self_signed_cert(directory: Path) -> tuple[Path, Path]:
    """Generate a self-signed cert/key pair with a ``localhost`` SAN."""
    key_path = directory / "server.key"
    cert_path = directory / "server.crt"
    subprocess.run(
        [
            str(_OPENSSL),
            "req",
            "-x509",
            "-newkey",
            "rsa:2048",
            "-nodes",
            "-keyout",
            str(key_path),
            "-out",
            str(cert_path),
            "-days",
            "1",
            "-subj",
            "/CN=mxgateway-test",
            "-addext",
            "subjectAltName=DNS:localhost,IP:127.0.0.1",
        ],
        check=True,
        capture_output=True,
    )
    return cert_path, key_path


def _free_port() -> int:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return int(sock.getsockname()[1])


class _StaticGatewayServicer(pb_grpc.MxAccessGatewayServicer):
    """Minimal servicer answering ``OpenSession`` with a fixed session id."""

    async def OpenSession(  # noqa: N802 - generated gRPC method name
        self, request: pb.OpenSessionRequest, context: object
    ) -> pb.OpenSessionReply:
        return pb.OpenSessionReply(session_id="tls-session-1")


@pytest_asyncio.fixture
async def tls_server() -> AsyncIterator[int]:
    with tempfile.TemporaryDirectory() as tmp:
        cert_path, key_path = _generate_self_signed_cert(Path(tmp))
        credentials = grpc.ssl_server_credentials(
            [(key_path.read_bytes(), cert_path.read_bytes())]
        )
        server = grpc.aio.server()
        pb_grpc.add_MxAccessGatewayServicer_to_server(_StaticGatewayServicer(), server)
        port = _free_port()
        server.add_secure_port(f"127.0.0.1:{port}", credentials)
        await server.start()
        try:
            yield port
        finally:
            await server.stop(grace=None)
@requires_tls
@requires_openssl
@pytest.mark.asyncio
async def test_default_tls_connects_via_tofu(tls_server: int) -> None:
    """Default TLS options (no CA) connect by pinning the presented cert."""
    options = ClientOptions(
        endpoint=f"127.0.0.1:{tls_server}",
        api_key="mxgw_test_secret",
    )
    channel = create_channel(options)
    try:
        stub = pb_grpc.MxAccessGatewayStub(channel)
        reply = await stub.OpenSession(pb.OpenSessionRequest(), timeout=10)
        assert reply.session_id == "tls-session-1"
    finally:
        await channel.close()


def test_split_authority_parses_host_and_port() -> None:
    from zb_mom_ww_mxgateway.options import _split_authority

    assert _split_authority("https://10.0.0.5:5120") == ("10.0.0.5", 5120)
    assert _split_authority("localhost:5120") == ("localhost", 5120)
    assert _split_authority(":5120") == ("localhost", 5120)


def test_split_authority_strips_ipv6_brackets() -> None:
    from zb_mom_ww_mxgateway.options import _split_authority

    # Bracketed IPv6 with port — brackets must be removed for ssl.get_server_certificate
    assert _split_authority("[::1]:5120") == ("::1", 5120)
    # Bare bracketed IPv6 (no port) — default port 443
    assert _split_authority("[::1]") == ("::1", 443)
    # Scheme-prefixed bracketed IPv6
    assert _split_authority("grpc://[::1]:5120") == ("::1", 5120)


def test_tofu_connect_failure_raises_transport_error() -> None:
    """A failed cert pre-fetch surfaces the client's transport error type."""
    options = ClientOptions(endpoint=f"127.0.0.1:{_free_port()}")
    with pytest.raises(MxGatewayTransportError) as excinfo:
        create_channel(options)
    assert options.endpoint in str(excinfo.value)


def test_require_certificate_validation_uses_system_trust() -> None:
    """``require_certificate_validation`` must not attempt a TOFU pre-fetch."""
    # Pointing at a closed port: with system-trust the channel is created lazily
    # (no eager pre-fetch), so create_channel must succeed without connecting.
    options = ClientOptions(
        endpoint=f"127.0.0.1:{_free_port()}",
        require_certificate_validation=True,
    )
    channel = create_channel(options)
    assert isinstance(channel, grpc.aio.Channel)
+20 -7
@@ -1,9 +1,22 @@
[target.'cfg(windows)']
# Bump the default 1 MB Windows stack to 8 MB. clap-derive builds a large
# Command enum in this CLI (one variant per subcommand, each carrying flag
# args); in debug builds the enum is materialized on the stack without
# MSVC-only: bump the default 1 MB Windows stack to 8 MB. clap-derive builds
# a large Command enum in this CLI (one variant per subcommand, each carrying
# flag args); in debug builds the enum is materialized on the stack without
# optimization and overflows the default Windows main-thread stack before
# even reaching our code. Release builds are unaffected but the e2e matrix
# drives the CLI through `cargo run` (debug), so the link-arg ships with
# every dev-time invocation.
# even reaching our code.
#
# The /STACK: link-arg goes into the PE header's IMAGE_OPTIONAL_HEADER.
# SizeOfStackReserve at link time and applies to both debug and release
# builds — release artifacts ship with the same 8 MB stack reservation. At
# runtime the optimizer elides the enum from the stack frame, so release
# builds would not overflow without this setting; it is kept on for them so
# both build flavours produce binaries with identical stack metadata.
#
# `/STACK:` is an MSVC-linker (`link.exe` / `lld-link`) directive. The
# `target_env = "msvc"` selector below scopes the rustflag to the MSVC
# toolchain so `x86_64-pc-windows-gnu` (mingw) builds, which route link
# args through the GNU linker and reject `/STACK:`, are unaffected.
[target.'cfg(all(windows, target_env = "msvc"))']
rustflags = ["-C", "link-arg=/STACK:8388608"]
[registries.dohertj2-gitea]
index = "sparse+https://gitea.dohertylan.com/api/packages/dohertj2/cargo/"
+14 -2
@@ -2,7 +2,16 @@
name = "zb-mom-ww-mxgateway-client"
version = "0.1.0"
edition = "2021"
publish = false
authors = ["Joseph Doherty"]
description = "Async Rust client for the MxAccessGateway gRPC service, including a lazy-browse walker over the Galaxy Repository hierarchy."
license = "Proprietary"
repository = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
homepage = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
documentation = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
readme = "README.md"
keywords = ["mxaccess", "mxgateway", "grpc", "client", "archestra"]
categories = ["api-bindings", "asynchronous"]
publish = ["dohertj2-gitea"]
build = "build.rs"
[workspace]
@@ -12,7 +21,10 @@ resolver = "2"
[workspace.package]
edition = "2021"
version = "0.1.0"
publish = false
authors = ["Joseph Doherty"]
license = "Proprietary"
repository = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
publish = ["dohertj2-gitea"]
[workspace.dependencies]
clap = { version = "4.5.53", features = ["derive"] }
+81
@@ -76,6 +76,19 @@ types.
cargo run -p mxgw-cli -- smoke --endpoint https://mxgateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt --json
```
### TLS trust (pin-only)
The gateway can auto-generate its own self-signed certificate (it has no PKI).
Unlike the other clients, the Rust client is **not** lenient: tonic 0.13.1
exposes no public hook to inject a custom certificate verifier, so TLS in the
Rust client is pin-only. A TLS connection requires either `--ca-file` /
`ClientOptions::with_ca_file(...)` to pin a CA (export the gateway's self-signed
certificate and pin it), or `--require-certificate-validation` /
`with_require_certificate_validation(true)` to verify against the system trust
roots. TLS with neither set fails `connect` with a clear, actionable error rather
than accepting the certificate. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
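
The pin-only rule above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the crate's implementation: `TlsSettings`, `TrustSource`, and `resolve_trust` are hypothetical names, and letting a pinned CA take precedence when both options are set is an assumption of this sketch.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for the client's TLS settings (not the crate's type).
#[derive(Debug, Default)]
struct TlsSettings {
    ca_file: Option<PathBuf>,
    require_certificate_validation: bool,
}

/// Which trust source a TLS connection would use.
#[derive(Debug, PartialEq)]
enum TrustSource {
    PinnedCa(PathBuf),
    SystemRoots,
}

/// Pin-only rule: TLS needs a pinned CA or explicit system-trust validation.
/// Assumption: a pinned CA wins when both are configured.
fn resolve_trust(settings: &TlsSettings) -> Result<TrustSource, String> {
    if let Some(ca) = &settings.ca_file {
        return Ok(TrustSource::PinnedCa(ca.clone()));
    }
    if settings.require_certificate_validation {
        return Ok(TrustSource::SystemRoots);
    }
    // Neither option set: fail with an actionable message instead of
    // silently accepting whatever certificate the server presents.
    Err("TLS requires --ca-file or --require-certificate-validation".to_string())
}

fn main() {
    let pinned = TlsSettings {
        ca_file: Some(PathBuf::from("mxgateway-ca.pem")),
        require_certificate_validation: false,
    };
    println!("{:?}", resolve_trust(&pinned));
    println!("{:?}", resolve_trust(&TlsSettings::default()));
}
```

Failing closed when neither option is set avoids an opaque handshake error later in the connection.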
## Library Surface
`ClientOptions` configures endpoint, API key, plaintext or TLS transport,
@@ -138,6 +151,50 @@ cargo run -p mxgw-cli -- galaxy last-deploy-time --endpoint http://localhost:500
cargo run -p mxgw-cli -- galaxy discover-hierarchy --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --json
```
### Browsing lazily
For UI trees or OPC UA bridges, use `browse_children` to walk one level at a
time instead of paging the full hierarchy. Pass a default request for root
objects; subsequent calls set `parent_gobject_id`, `parent_tag_name`, or
`parent_contained_path`. Filter fields match `discover_hierarchy`. Each response
pairs `children` with `child_has_children` so you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.
```rust
use zb_mom_ww_mxgateway_client::generated::galaxy_repository::v1::BrowseChildrenRequest;

let reply = galaxy.browse_children(BrowseChildrenRequest::default()).await?.into_inner();
for (child, has_children) in reply.children.iter().zip(reply.child_has_children.iter()) {
    println!("{} expand={}", child.tag_name, has_children);
}
```
#### High-level walker
For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:
```rust
let mut client = GalaxyClient::connect(
    ClientOptions::new("http://localhost:5000").with_api_key(ApiKey::new(api_key)),
).await?;

let roots = client.browse(None).await?;
for root in &roots {
    if root.has_children_hint() {
        root.expand().await?;
    }
    for child in root.children().await {
        let kind = if child.has_children_hint() { "has children" } else { "leaf" };
        println!("{} ({kind})", child.object().tag_name);
    }
}
```
`expand` is idempotent: calling it twice fires only one RPC, and it is safe
under concurrent callers. To refresh after a Galaxy redeploy, call `browse`
again from the root.
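
The idempotence guarantee can be illustrated with a synchronous, standard-library-only sketch. It makes no claim about how `LazyBrowseNode` is implemented: `Node`, `fetch_children`, and the call counter are hypothetical, and the real walker is async and talks to the gateway.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Hypothetical node that caches its children after the first expansion.
struct Node {
    children: OnceLock<Vec<String>>,
    rpc_calls: AtomicUsize,
}

impl Node {
    fn new() -> Self {
        Node { children: OnceLock::new(), rpc_calls: AtomicUsize::new(0) }
    }

    /// Stand-in for the remote call; counts how often it runs.
    fn fetch_children(&self) -> Vec<String> {
        self.rpc_calls.fetch_add(1, Ordering::SeqCst);
        vec!["Tank01".to_string(), "Tank02".to_string()]
    }

    /// Idempotent expand: the first caller fetches, later callers reuse the
    /// cached result. `OnceLock` runs the initializer at most once, even when
    /// several threads race.
    fn expand(&self) -> &[String] {
        self.children.get_or_init(|| self.fetch_children())
    }
}

fn main() {
    let node = Node::new();
    // Four concurrent callers still produce a single fetch.
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                node.expand();
            });
        }
    });
    println!("fetches: {}", node.rpc_calls.load(Ordering::SeqCst));
}
```

An async client would need an async-aware once-cell instead, so that waiting callers yield rather than block.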
### Watching deploy events
`watch_deploy_events` opens the `WatchDeployEvents` server stream. The
@@ -192,3 +249,27 @@ cargo run -p mxgw-cli -- smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --
- [Client Proto Generation](../../docs/ClientProtoGeneration.md)
- [Rust Client Detailed Design](./RustClientDesign.md)
- [Rust Style Guide](../../docs/style-guides/RustStyleGuide.md)
## Installing from the Gitea Cargo registry
The crate publishes to the internal Gitea Cargo registry. Register the
registry once in your global `~/.cargo/config.toml`:
```toml
[registries.dohertj2-gitea]
index = "sparse+https://gitea.dohertylan.com/api/packages/dohertj2/cargo/"
```
Authentication: cargo reads credentials from `~/.cargo/credentials.toml`:
```toml
[registries.dohertj2-gitea]
token = "Bearer <your-gitea-token>"
```
Then add the dependency:
```toml
[dependencies]
zb-mom-ww-mxgateway-client = { version = "0.1.0", registry = "dohertj2-gitea" }
```
+134 -12
@@ -56,7 +56,23 @@ Expected dependencies:
- `clap`
- `serde`
- `serde_json`
- `tracing`
## Windows Build Notes
`clients/rust/.cargo/config.toml` carries an MSVC-scoped rustflag that bumps
the default 1 MB Windows main-thread stack to 8 MB
(`-C link-arg=/STACK:8388608`, under `cfg(all(windows, target_env = "msvc"))`).
The setting is required because clap-derive materialises a large `Command`
enum (one variant per CLI subcommand, each carrying its flag args) on the
main task's stack in debug builds, before any user code runs; the default 1
MB stack overflows during enum construction. The `/STACK:` link-arg writes
into the PE header's `IMAGE_OPTIONAL_HEADER.SizeOfStackReserve` at link
time, so both debug and release artifacts ship with the same 8 MB stack
reservation. Release builds would not overflow without it (the optimizer
elides the enum from the stack frame), but the setting is kept on for
release too so both build flavours produce binaries with identical stack
metadata. The MSVC-only selector keeps `x86_64-pc-windows-gnu` (mingw)
builds unaffected, since the GNU linker rejects `/STACK:`.
## Library API
@@ -94,18 +110,65 @@ impl Session {
pub async fn add_item(&self, server_handle: i32, item: &str) -> Result<i32, Error>;
pub async fn add_item2(&self, server_handle: i32, item: &str, context: &str) -> Result<i32, Error>;
pub async fn advise(&self, server_handle: i32, item_handle: i32) -> Result<(), Error>;
pub async fn un_advise(&self, server_handle: i32, item_handle: i32) -> Result<(), Error>;
pub async fn remove_item(&self, server_handle: i32, item_handle: i32) -> Result<(), Error>;
pub async fn add_item_bulk(&self, server_handle: i32, tag_addresses: Vec<String>) -> Result<Vec<SubscribeResult>, Error>;
pub async fn advise_item_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
pub async fn remove_item_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
pub async fn un_advise_item_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
pub async fn subscribe_bulk(&self, server_handle: i32, tag_addresses: Vec<String>) -> Result<Vec<SubscribeResult>, Error>;
pub async fn unsubscribe_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
pub async fn read_bulk<S: AsRef<str>>(&self, server_handle: i32, tag_addresses: &[S], timeout_ms: u32) -> Result<Vec<BulkReadResult>, Error>;
pub async fn write(&self, server_handle: i32, item_handle: i32, value: MxValue, user_id: i32) -> Result<(), Error>;
pub async fn write2(&self, server_handle: i32, item_handle: i32, value: MxValue, timestamp_value: MxValue, user_id: i32) -> Result<(), Error>;
pub async fn write_bulk(&self, server_handle: i32, entries: Vec<WriteBulkEntry>) -> Result<Vec<BulkWriteResult>, Error>;
pub async fn write2_bulk(&self, server_handle: i32, entries: Vec<Write2BulkEntry>) -> Result<Vec<BulkWriteResult>, Error>;
pub async fn write_secured_bulk(&self, server_handle: i32, entries: Vec<WriteSecuredBulkEntry>) -> Result<Vec<BulkWriteResult>, Error>;
pub async fn write_secured2_bulk(&self, server_handle: i32, entries: Vec<WriteSecured2BulkEntry>) -> Result<Vec<BulkWriteResult>, Error>;
pub async fn events(&self) -> Result<impl Stream<Item = Result<MxEvent, Error>>, Error>;
pub async fn close(&self) -> Result<(), Error>;
}
```
The per-entry credentials and timestamps (`user_id`, `timestamp_value`,
`current_user_id`, `verifier_user_id`) live on the `WriteBulkEntry` /
`Write2BulkEntry` / `WriteSecuredBulkEntry` / `WriteSecured2BulkEntry`
structs rather than as trailing positional arguments on the bulk-write
helpers, matching the protobuf shapes in `mxaccess_gateway.proto`.
`read_bulk` is generic over `AsRef<str>` so callers can pass `&[String]` or
`&[&str]` without cloning at the call site (the cross-language bench-read-bulk
hot loop relies on this).
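The effect of the `AsRef<str>` bound can be shown with a small stand-alone sketch. `collect_tag_addresses` is a hypothetical helper, not part of the client; it only mirrors the parameter shape of `read_bulk` and assumes the owned copies are made once, at the request boundary.

```rust
/// Hypothetical helper mirroring the `read_bulk` parameter shape: the slice
/// element only needs to be viewable as `&str`.
pub fn collect_tag_addresses<S: AsRef<str>>(tag_addresses: &[S]) -> Vec<String> {
    tag_addresses
        .iter()
        // The single owned copy happens here, not at the call site.
        .map(|tag| tag.as_ref().to_owned())
        .collect()
}
```

Both `collect_tag_addresses(&owned_vec_of_strings)` and `collect_tag_addresses(&["A.B", "C.D"])` compile against this signature, and the caller keeps ownership of its tags between calls.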
The `session::next_correlation_id` helper is `pub` and re-exported at the
crate root (`zb_mom_ww_mxgateway_client::next_correlation_id`); raw-RPC
consumers like the `mxgw` CLI's `Ping`, `CloseSession`, `StreamAlarms`,
`AcknowledgeAlarm`, and `BenchReadBulk` paths call it so every request
carries a unique correlation id that gateway logs can tell apart from
concurrent CLI smokes. The textual format is intentionally not part of the
public contract.
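A minimal sketch of how a unique-per-request id could be produced is shown below. The function name, the process-id component, and the overall format are assumptions made for illustration; the real `next_correlation_id` format is deliberately unspecified.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide sequence so concurrent callers never share a value.
static NEXT_SEQUENCE: AtomicU64 = AtomicU64::new(0);

/// Hypothetical sketch; the textual format here is an assumption and must
/// not be parsed by callers.
pub fn sketch_correlation_id(prefix: &str) -> String {
    let sequence = NEXT_SEQUENCE.fetch_add(1, Ordering::Relaxed);
    // The process id separates ids from CLI runs that execute concurrently.
    format!("{prefix}-{}-{sequence}", std::process::id())
}
```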
## Alarms
`GatewayClient` exposes the gateway's session-less central alarm surface:
```rust
pub type AlarmFeedStream = Pin<Box<dyn Stream<Item = Result<AlarmFeedMessage, Error>> + Send + 'static>>;
impl GatewayClient {
pub async fn stream_alarms(&self, request: StreamAlarmsRequest) -> Result<AlarmFeedStream, Error>;
pub async fn acknowledge_alarm(&self, request: AcknowledgeAlarmRequest) -> Result<AcknowledgeAlarmReply, Error>;
}
```
`stream_alarms` opens with one `active_alarm` per currently-active alarm
(the ConditionRefresh snapshot), then a single `snapshot_complete`, then a
`transition` for every subsequent raise / acknowledge / clear. The feed is
served by the gateway's always-on alarm monitor — no worker session is
opened — so any number of clients may attach. Dropping the stream cancels
the gRPC call cooperatively. `acknowledge_alarm` is idempotent at the
MxAccess layer; the returned `AcknowledgeAlarmReply` carries the native
MxStatus from the worker.
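The ordering described above (snapshot entries, one completion marker, then transitions) suggests a simple consumer-side fold. The sketch below uses hypothetical, simplified types rather than the generated `AlarmFeedMessage`, and assumes that an acknowledge leaves an alarm in the active set until it clears.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified mirror of the alarm feed's message arms.
pub enum FeedMessage {
    ActiveAlarm(String),
    SnapshotComplete,
    Transition { reference: String, kind: TransitionKind },
}

pub enum TransitionKind {
    Raise,
    Acknowledge,
    Clear,
}

/// Client-side view rebuilt from the feed.
#[derive(Default)]
pub struct AlarmView {
    pub active: BTreeSet<String>,
    pub snapshot_complete: bool,
}

impl AlarmView {
    pub fn apply(&mut self, message: FeedMessage) {
        match message {
            // Snapshot entries arrive before `SnapshotComplete`.
            FeedMessage::ActiveAlarm(reference) => {
                self.active.insert(reference);
            }
            FeedMessage::SnapshotComplete => self.snapshot_complete = true,
            FeedMessage::Transition { reference, kind } => match kind {
                TransitionKind::Raise => {
                    self.active.insert(reference);
                }
                // Assumption: acknowledging does not clear the alarm.
                TransitionKind::Acknowledge => {}
                TransitionKind::Clear => {
                    self.active.remove(&reference);
                }
            },
        }
    }
}
```

A consumer would typically hold off rendering until `snapshot_complete` is set, since the active set is only partially populated before that point.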
## Authentication
Use a `tonic` interceptor or request extension layer to add:
@@ -126,6 +189,25 @@ Support:
- custom CA file,
- domain override.
### Trust posture (pin-only)
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). Rust is the **exception** to the lenient-by-default posture the other
clients use: tonic 0.13.1 exposes no public hook to inject a custom certificate
verifier, so the Rust client cannot accept an arbitrary certificate. TLS over the
Rust client is therefore **pin-only** — it requires either:
- `ClientOptions::with_ca_file(...)` to pin a CA (the supported path for the
gateway's self-signed certificate; export the certificate and pin it), or
- `ClientOptions::with_require_certificate_validation(true)` to verify against the
system trust roots.
With TLS enabled (`with_plaintext(false)`), no pinned CA, and certificate
validation not required, `GatewayClient::connect` rejects the connection with a
clear, actionable error pointing at `with_ca_file` /
`require_certificate_validation` rather than silently accepting the certificate.
The CLI exposes `--ca-file` and `--require-certificate-validation`.
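The decision table behind the guard can be sketched without any TLS library. `TlsSettings`, `TlsDecision`, and `decide_tls` are hypothetical names, the error text is illustrative, and giving a pinned CA precedence when both options are set is an assumption rather than documented behaviour.

```rust
use std::path::PathBuf;

/// Hypothetical subset of the client options relevant to TLS.
pub struct TlsSettings {
    pub plaintext: bool,
    pub ca_file: Option<PathBuf>,
    pub require_certificate_validation: bool,
}

#[derive(Debug, PartialEq)]
pub enum TlsDecision {
    Plaintext,
    PinnedCa(PathBuf),
    SystemRoots,
}

/// Returns the trust source to use, or an actionable error when TLS is
/// requested without any way to verify the server certificate.
pub fn decide_tls(settings: &TlsSettings) -> Result<TlsDecision, String> {
    if settings.plaintext {
        return Ok(TlsDecision::Plaintext);
    }
    // Assumption: a pinned CA wins when both options are supplied.
    if let Some(ca_file) = &settings.ca_file {
        return Ok(TlsDecision::PinnedCa(ca_file.clone()));
    }
    if settings.require_certificate_validation {
        return Ok(TlsDecision::SystemRoots);
    }
    Err("TLS needs a pinned CA (with_ca_file) or require_certificate_validation".to_owned())
}
```

Failing before any connection attempt is the point of the guard: the caller sees which option to set instead of an opaque handshake failure.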
## Streaming
Expose event streams as a `Stream<Item = Result<MxEvent, Error>>`. Dropping the
@@ -140,19 +222,31 @@ Use `thiserror`:
```rust
pub enum Error {
InvalidEndpoint { endpoint: String, detail: String },
InvalidArgument { name: String, detail: String },
Transport(tonic::transport::Error),
Status(tonic::Status),
Authentication(String),
Authorization(String),
Session(SessionError),
Worker(WorkerError),
Command(CommandError),
MxAccess(MxAccessError),
Timeout,
Cancelled,
Authentication { message: String, status: Box<tonic::Status> },
Authorization { message: String, status: Box<tonic::Status> },
Timeout { message: String, status: Box<tonic::Status> },
Cancelled { message: String, status: Box<tonic::Status> },
Unavailable { message: String, status: Box<tonic::Status> },
Status(Box<tonic::Status>),
Command(Box<CommandError>),
ProtocolStatus { operation: &'static str, code: ProtocolStatusCode, message: String },
MalformedReply { detail: String },
}
```
- `Unavailable` classifies `Code::Unavailable` / `Code::ResourceExhausted`
so callers can distinguish transient failures from permanent ones.
- `MalformedReply` surfaces the rare case where the gateway returned a
protocol-level `Ok` envelope but the typed payload arm was missing or did
not match the command kind (e.g. an `AddItemReply` body on a `WriteBulk`
reply). This is distinct from `ProtocolStatus` because the protocol-level
envelope itself succeeded; the corruption is in the payload shape.
- `InvalidEndpoint` is raised before any RPC dispatch when the endpoint URL
or CA file fails to parse / load.
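One way a caller might use the transient/permanent split is a bounded retry loop. The sketch below uses a hypothetical `SketchError` with only two variants and a synchronous `call_with_retry` helper; neither exists in the client, and treating only `Unavailable` as retryable is an assumption drawn from the description above.

```rust
/// Hypothetical two-variant subset of the client's error enum.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    Unavailable { message: String },
    Authentication { message: String },
}

impl SketchError {
    /// Assumption: only `Unavailable` is worth retrying.
    pub fn is_transient(&self) -> bool {
        matches!(self, SketchError::Unavailable { .. })
    }
}

/// Run `operation` up to `max_attempts` times, retrying transient errors.
/// The closure receives the 1-based attempt number.
pub fn call_with_retry<T>(
    max_attempts: u32,
    mut operation: impl FnMut(u32) -> Result<T, SketchError>,
) -> Result<T, SketchError> {
    let mut attempt = 1;
    loop {
        match operation(attempt) {
            Ok(value) => return Ok(value),
            Err(error) if error.is_transient() && attempt < max_attempts => {
                // A real caller would sleep with backoff before retrying.
                attempt += 1;
            }
            // Permanent errors and exhausted budgets surface unchanged.
            Err(error) => return Err(error),
        }
    }
}
```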
Preserve raw command replies in `CommandError` where applicable.
## Test CLI
@@ -165,11 +259,39 @@ Commands:
```text
mxgw version
mxgw smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt
mxgw ping
mxgw open-session
mxgw close-session --session-id <id>
mxgw register --session-id <id>
mxgw add-item --session-id <id> --server-handle <h> --item <tag>
mxgw advise --session-id <id> --server-handle <h> --item-handle <h>
mxgw subscribe-bulk --session-id <id> --server-handle <h> --items <csv>
mxgw unsubscribe-bulk --session-id <id> --server-handle <h> --item-handles <csv>
mxgw read-bulk --session-id <id> --server-handle <h> --items <csv> [--timeout-ms <ms>]
mxgw write --session-id <id> --server-handle 1 --item-handle 1 --value-type int32 --value 123
mxgw write2 --session-id <id> --server-handle 1 --item-handle 1 --value-type int32 --value 123 --timestamp <iso>
mxgw write-bulk --session-id <id> --server-handle <h> --item-handles <csv> --value-type <t> --values <csv>
mxgw write2-bulk ...
mxgw write-secured-bulk ...
mxgw write-secured2-bulk ...
mxgw stream-events --session-id <id> --json
mxgw write --session-id <id> --server-handle 1 --item-handle 1 --type int32 --value 123
mxgw stream-alarms [--filter-prefix <prefix>] [--max-events <n>]
mxgw acknowledge-alarm --reference <full-ref> [--comment <txt>] [--operator <user>]
mxgw bench-read-bulk [--duration-seconds <n>] [--warmup-seconds <n>] [--bulk-size <n>]
mxgw smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt
mxgw batch
mxgw galaxy {test-connection,last-deploy-time,discover-hierarchy,watch}
```
`batch` reads commands from stdin, one per line, and dispatches each through
the normal subcommand path. The loop terminates only on stdin EOF. Blank
lines log an empty-EOR-bracketed result and continue, so accidental empty
lines from the PowerShell e2e harness do not silently end the session.
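The loop contract can be sketched over generic readers and writers so it is testable without a terminal. `run_batch_sketch` and its `dispatch` closure are hypothetical; the real CLI dispatches through its clap subcommands and serialises errors with `serde_json`, whereas this sketch escapes only quotes and backslashes by hand.

```rust
use std::io::{self, BufRead, Write};

pub const BATCH_EOR: &str = "__MXGW_BATCH_EOR__";

/// Read one command per line, dispatch it, and bracket every result
/// (including blank lines) with the end-of-result marker.
pub fn run_batch_sketch<R: BufRead, W: Write>(
    input: R,
    mut output: W,
    mut dispatch: impl FnMut(&[&str]) -> Result<String, String>,
) -> io::Result<()> {
    // `lines()` ends only when the reader reports EOF.
    for line in input.lines() {
        let line = line?;
        let parts: Vec<&str> = line.split_ascii_whitespace().collect();
        if !parts.is_empty() {
            match dispatch(&parts) {
                Ok(result) => writeln!(output, "{result}")?,
                Err(message) => {
                    // Simplified escaping; a real implementation would use
                    // a JSON serializer.
                    let escaped = message.replace('\\', "\\\\").replace('"', "\\\"");
                    writeln!(output, "{{\"error\":\"{escaped}\",\"type\":\"error\"}}")?;
                }
            }
        }
        // Blank and whitespace-only lines still emit the marker.
        writeln!(output, "{BATCH_EOR}")?;
        output.flush()?;
    }
    Ok(())
}
```

Writing errors to the same stream as results keeps the harness parsing a single channel, and the marker after every line lets it pair each input with exactly one bracketed result.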
`bench-read-bulk` opens its own session, subscribes to `--bulk-size` tags so
the worker's value cache populates from OnDataChange events, hammers
`read_bulk` in a tight loop for `--duration-seconds`, and emits the
cross-language JSON shape that `scripts/bench-read-bulk.ps1` collates.
JSON output should use `serde_json`.
## Unit Tests
+1 -1
@@ -2,7 +2,7 @@
name = "mxgw-cli"
version.workspace = true
edition.workspace = true
publish.workspace = true
publish = false
[[bin]]
name = "mxgw"
+71 -18
@@ -26,8 +26,8 @@ use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::{
WriteBulkEntry, WriteSecured2BulkEntry, WriteSecuredBulkEntry,
};
use zb_mom_ww_mxgateway_client::{
ApiKey, ClientOptions, Error, GalaxyClient, GatewayClient, MxValue, MxValueProjection,
CLIENT_VERSION, GATEWAY_PROTOCOL_VERSION, WORKER_PROTOCOL_VERSION,
next_correlation_id, ApiKey, ClientOptions, Error, GalaxyClient, GatewayClient, MxValue,
MxValueProjection, CLIENT_VERSION, GATEWAY_PROTOCOL_VERSION, WORKER_PROTOCOL_VERSION,
};
const MAX_AGGREGATE_EVENTS: usize = 10_000;
@@ -359,8 +359,9 @@ enum Command {
/// write `__MXGW_BATCH_EOR__` to stdout after every result. Errors are
/// written as `{"error":"…","type":"error"}` JSON to stdout (not stderr)
/// so the harness can parse them without interleaving stderr. The loop
/// never terminates on command error; only stdin EOF (or an empty line)
/// ends the session.
/// never terminates on command error or accidental blank lines; only
/// stdin EOF ends the session — empty lines log an empty-EOR-bracketed
/// result and continue, matching the other four language CLIs.
Batch,
#[command(subcommand)]
Galaxy(GalaxyCommand),
@@ -425,6 +426,11 @@ struct ConnectionArgs {
ca_file: Option<PathBuf>,
#[arg(long)]
server_name_override: Option<String>,
/// Verify the server certificate against the system trust roots even
/// without a pinned CA. The Rust client's default is to require a CA
/// file (see `--ca-file`); set this flag to use system roots instead.
#[arg(long)]
require_certificate_validation: bool,
#[arg(long, default_value_t = 10)]
connect_timeout_seconds: u64,
#[arg(long, default_value_t = 30)]
@@ -452,6 +458,9 @@ impl ConnectionArgs {
if let Some(server_name_override) = &self.server_name_override {
options = options.with_server_name_override(server_name_override);
}
if self.require_certificate_validation {
options = options.with_require_certificate_validation(true);
}
options
}
@@ -503,7 +512,7 @@ async fn dispatch(command: Command) -> Result<(), Error> {
let client = connect(connection).await?;
let reply = client
.invoke(MxCommandRequest {
client_correlation_id: "rust-cli-ping".to_owned(),
client_correlation_id: next_correlation_id("cli-ping"),
command: Some(MxCommand {
kind: MxCommandKind::Ping as i32,
payload: Some(zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_command::Payload::Ping(
@@ -550,7 +559,7 @@ async fn dispatch(command: Command) -> Result<(), Error> {
let reply = client
.close_session_raw(CloseSessionRequest {
session_id,
client_correlation_id: "rust-cli-close-session".to_owned(),
client_correlation_id: next_correlation_id("cli-close-session"),
})
.await?;
if json {
@@ -624,7 +633,7 @@ async fn dispatch(command: Command) -> Result<(), Error> {
json,
} => {
let session = session_for(connection, session_id).await?;
let results = session.read_bulk(server_handle, items, timeout_ms).await?;
let results = session.read_bulk(server_handle, &items, timeout_ms).await?;
print_read_bulk_results("read-bulk", &results, json);
}
Command::WriteBulk {
@@ -832,7 +841,7 @@ async fn dispatch(command: Command) -> Result<(), Error> {
let client = connect(connection).await?;
let mut stream = client
.stream_alarms(StreamAlarmsRequest {
client_correlation_id: "rust-cli-stream-alarms".to_owned(),
client_correlation_id: next_correlation_id("cli-stream-alarms"),
alarm_filter_prefix: filter_prefix.unwrap_or_default(),
})
.await?;
@@ -869,7 +878,7 @@ async fn dispatch(command: Command) -> Result<(), Error> {
let client = connect(connection).await?;
let reply = client
.acknowledge_alarm(AcknowledgeAlarmRequest {
client_correlation_id: "rust-cli-acknowledge-alarm".to_owned(),
client_correlation_id: next_correlation_id("cli-acknowledge-alarm"),
alarm_full_reference: reference,
comment,
operator_user: operator,
@@ -1113,8 +1122,15 @@ const BATCH_EOR: &str = "__MXGW_BATCH_EOR__";
/// each through the normal [`dispatch`] path, and write [`BATCH_EOR`] to
/// stdout after every result. Errors are serialised as JSON to stdout so
/// the harness can parse them without interleaving stderr. The loop never
/// terminates on command error; only stdin EOF or an empty line ends the
/// session.
/// terminates on command error or accidental blank lines; only stdin EOF
/// ends the session — empty lines log an empty-EOR-bracketed result and
/// continue.
///
/// `std::io::Stdin::lock().lines()` is a blocking iterator and the dispatch
/// future is spawned on a separate tokio task so the runtime's main worker
/// stays free. When the runtime is multi-threaded the blocking read keeps
/// one worker parked on `ReadFile`; that is acceptable here because no other
/// future on the main task needs to run while we wait for the next command.
async fn run_batch() -> Result<(), Error> {
let stdin = io::stdin();
let stdout = io::stdout();
@@ -1125,12 +1141,11 @@ async fn run_batch() -> Result<(), Error> {
detail: e.to_string(),
})?;
if line.is_empty() {
break;
}
let parts: Vec<&str> = line.split_ascii_whitespace().collect();
if parts.is_empty() {
// Empty / whitespace-only line: log an empty-EOR-bracketed
// result and continue so accidental blank lines from the
// PowerShell e2e harness do not silently end the session.
println!("{BATCH_EOR}");
stdout.lock().flush().ok();
continue;
@@ -1388,6 +1403,7 @@ async fn run_bench_read_bulk(
let bench_outcome = async {
let server_handle = session.register(&client_name).await?;
let subscribe_results = session.subscribe_bulk(server_handle, tags.clone()).await?;
let tags_ref: &[String] = &tags;
let item_handles: Vec<i32> = subscribe_results
.iter()
.filter(|r| r.was_successful)
@@ -1401,7 +1417,7 @@ async fn run_bench_read_bulk(
let warmup_deadline = Instant::now() + Duration::from_secs(warmup_seconds);
while Instant::now() < warmup_deadline {
let _ = session
.read_bulk(server_handle, tags.clone(), timeout_ms_param)
.read_bulk(server_handle, tags_ref, timeout_ms_param)
.await;
}
@@ -1419,7 +1435,7 @@ async fn run_bench_read_bulk(
while Instant::now() < steady_deadline {
let call_start = Instant::now();
let result = session
.read_bulk(server_handle, tags.clone(), timeout_ms_param)
.read_bulk(server_handle, tags_ref, timeout_ms_param)
.await;
let elapsed = call_start.elapsed();
latencies_ms.push(elapsed.as_secs_f64() * 1000.0);
@@ -1473,7 +1489,7 @@ async fn run_bench_read_bulk(
let close_result = client
.close_session_raw(CloseSessionRequest {
session_id: session_id.clone(),
client_correlation_id: "rust-cli-bench-read-bulk-close".to_owned(),
client_correlation_id: next_correlation_id("cli-bench-read-bulk-close"),
})
.await;
@@ -2100,6 +2116,43 @@ mod tests {
assert_eq!(super::BATCH_EOR, "__MXGW_BATCH_EOR__");
}
#[test]
fn bench_percentile_summary_matches_hand_built_sample() {
// Hand-built sample with 5 values: 1, 2, 3, 4, 5.
let sample: Vec<f64> = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let summary = super::percentile_summary(&sample);
assert_eq!(summary.max, 5.0);
// Mean = 15/5 = 3.0
assert!((summary.mean - 3.0).abs() < f64::EPSILON);
// p50: rank = 0.5 * 4 = 2 -> sorted[2] = 3.0
assert!((summary.p50 - 3.0).abs() < f64::EPSILON);
// p95: rank = 0.95 * 4 = 3.8 -> 4.0 + 0.8 * (5.0 - 4.0) = 4.8
assert!((summary.p95 - 4.8).abs() < f64::EPSILON);
// p99: rank = 0.99 * 4 = 3.96 -> 4.0 + 0.96 * 1.0 = 4.96
assert!((summary.p99 - 4.96).abs() < f64::EPSILON);
}
#[test]
fn bench_percentile_summary_handles_empty_sample() {
let summary = super::percentile_summary(&[]);
assert_eq!(summary.p50, 0.0);
assert_eq!(summary.p95, 0.0);
assert_eq!(summary.p99, 0.0);
assert_eq!(summary.max, 0.0);
assert_eq!(summary.mean, 0.0);
}
#[test]
fn bench_percentile_summary_handles_single_value_sample() {
let summary = super::percentile_summary(&[42.0]);
assert_eq!(summary.p50, 42.0);
assert_eq!(summary.p95, 42.0);
assert_eq!(summary.p99, 42.0);
assert_eq!(summary.max, 42.0);
assert_eq!(summary.mean, 42.0);
}
#[test]
fn rfc3339_parser_round_trips_z_and_offset_inputs() {
// 2026-04-28T15:30:00Z = 1_777_995_000 (sanity-checked once below)
+3 -16
@@ -6,10 +6,8 @@
//! code should prefer [`GatewayClient::open_session`] and the [`Session`]
//! handle it returns, rather than the `*_raw` methods.
use std::fs;
use tonic::codegen::InterceptedService;
use tonic::transport::{Certificate, Channel, ClientTlsConfig};
use tonic::transport::Channel;
use tonic::Request;
use crate::auth::AuthInterceptor;
@@ -21,7 +19,7 @@ use crate::generated::mxaccess_gateway::v1::{
OpenSessionReply, OpenSessionRequest, QueryActiveAlarmsRequest, StreamAlarmsRequest,
StreamEventsRequest,
};
use crate::options::ClientOptions;
use crate::options::{build_tls_config, ClientOptions};
use crate::session::Session;
/// Generated gateway client wrapped in the auth interceptor that
@@ -78,18 +76,7 @@ impl GatewayClient {
})?;
endpoint = endpoint.connect_timeout(options.connect_timeout());
if !options.plaintext() {
let mut tls = ClientTlsConfig::new();
if let Some(server_name) = options.server_name_override() {
tls = tls.domain_name(server_name.to_owned());
}
if let Some(ca_file) = options.ca_file() {
let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
endpoint: options.endpoint().to_owned(),
detail: format!("failed to read CA file {}: {source}", ca_file.display()),
})?;
tls = tls.ca_certificate(Certificate::from_pem(certificate));
}
if let Some(tls) = build_tls_config(&options)? {
endpoint = endpoint.tls_config(tls)?;
}
+25
@@ -106,6 +106,27 @@ pub enum Error {
/// Detail message from the server.
message: String,
},
/// Gateway returned an `Ok` protocol status but the reply lacked the
/// expected typed payload (or carried the wrong payload arm). Distinct
/// from [`Error::ProtocolStatus`] because the protocol-level envelope
/// itself succeeded — the corruption is in the payload shape.
#[error("gateway returned a malformed reply: {detail}")]
MalformedReply {
/// Human-readable description of what was missing or mismatched.
detail: String,
},
/// Server returned `Unavailable` or `ResourceExhausted` — classify
/// transient failures separately from the catch-all [`Error::Status`].
#[error("gateway unavailable: {message}")]
Unavailable {
/// Redacted server-supplied detail message.
message: String,
/// Original `tonic::Status`.
#[source]
status: Box<tonic::Status>,
},
}
/// Wrapper around an [`MxCommandReply`] whose `protocol_status` reported a
@@ -174,6 +195,10 @@ impl From<tonic::Status> for Error {
message,
status: Box::new(status),
},
Code::Unavailable | Code::ResourceExhausted => Self::Unavailable {
message,
status: Box::new(status),
},
_ => Self::Status(Box::new(status)),
}
}
+540 -21
@@ -5,23 +5,143 @@
//! read-only RPCs as Rust async methods. Generated Galaxy proto types are
//! re-exported through [`crate::generated::galaxy_repository::v1`].
use std::fs;
use std::collections::HashSet;
use std::sync::Arc;
use prost_types::Timestamp;
use tokio::sync::Mutex as AsyncMutex;
use tonic::codegen::InterceptedService;
use tonic::transport::{Certificate, Channel, ClientTlsConfig};
use tonic::transport::Channel;
use tonic::Request;
use crate::auth::AuthInterceptor;
use crate::error::Error;
use crate::generated::galaxy_repository::v1::galaxy_repository_client::GalaxyRepositoryClient;
use crate::generated::galaxy_repository::v1::{
DeployEvent, DiscoverHierarchyRequest, GalaxyObject, GetLastDeployTimeRequest,
TestConnectionRequest, WatchDeployEventsRequest,
browse_children_request, BrowseChildrenReply, BrowseChildrenRequest, DeployEvent,
DiscoverHierarchyRequest, GalaxyObject, GetLastDeployTimeRequest, TestConnectionRequest,
WatchDeployEventsRequest,
};
use crate::options::ClientOptions;
use crate::options::{build_tls_config, ClientOptions};
const DISCOVER_HIERARCHY_PAGE_SIZE: i32 = 5000;
const BROWSE_CHILDREN_PAGE_SIZE: i32 = 500;
/// Optional filter set forwarded to `GalaxyRepository.BrowseChildren`.
///
/// Mirrors the request-level filters on the wire: combined with AND so a child
/// only appears when it satisfies every populated criterion. Construct via
/// [`BrowseChildrenOptions::default`] and tweak the fields you care about.
#[derive(Debug, Clone, Default)]
pub struct BrowseChildrenOptions {
/// Restrict to objects whose `category_id` matches one of the supplied
/// Galaxy category identifiers. Empty means "no restriction".
pub category_ids: Vec<i32>,
/// Restrict to objects whose template chain contains every supplied
/// template name (case-sensitive substring match on each entry).
pub template_chain_contains: Vec<String>,
/// Restrict to objects whose tag name matches the supplied glob (SQL
/// `LIKE`-style on the server). `None` means "no glob filter".
pub tag_name_glob: Option<String>,
/// Optional tri-state hint for whether to populate `GalaxyObject.attributes`
/// on returned children. `None` falls back to the server default.
pub include_attributes: Option<bool>,
/// When `true`, only return children that own at least one alarm-bearing
/// attribute (matches `DiscoverHierarchy` semantics).
pub alarm_bearing_only: bool,
/// When `true`, only return children that own at least one historized
/// attribute (matches `DiscoverHierarchy` semantics).
pub historized_only: bool,
}
/// Lazy hierarchy node used by the walker built on top of `BrowseChildren`.
///
/// A node owns its [`GalaxyObject`], a hint as to whether the server believes
/// it has at least one matching descendant under the active filter set, and an
/// internal `expanded` flag protected by an async mutex. Calling [`expand`]
/// the first time issues a paged `BrowseChildren` RPC; subsequent calls are
/// no-ops so callers can poll without re-hitting the server.
///
/// `LazyBrowseNode` is cheap to clone — clones share state through an
/// internal `Arc`, so expanding one clone makes the children visible to every
/// other clone.
///
/// [`expand`]: LazyBrowseNode::expand
pub struct LazyBrowseNode {
inner: Arc<LazyBrowseNodeInner>,
}
impl Clone for LazyBrowseNode {
fn clone(&self) -> Self {
Self {
inner: Arc::clone(&self.inner),
}
}
}
struct LazyBrowseNodeInner {
client: GalaxyClient,
object: GalaxyObject,
has_children_hint: bool,
options: BrowseChildrenOptions,
state: AsyncMutex<LazyBrowseNodeState>,
}
struct LazyBrowseNodeState {
children: Vec<LazyBrowseNode>,
is_expanded: bool,
}
impl LazyBrowseNode {
/// Borrow the [`GalaxyObject`] returned by the server for this node.
pub fn object(&self) -> &GalaxyObject {
&self.inner.object
}
/// Server-supplied hint: `true` when the child likely has at least one
/// further matching descendant. Useful to decide whether a UI should draw
/// an expand triangle without issuing the RPC up front.
pub fn has_children_hint(&self) -> bool {
self.inner.has_children_hint
}
/// Snapshot of the currently-known children. Empty until [`expand`] has
/// run at least once.
///
/// [`expand`]: LazyBrowseNode::expand
pub async fn children(&self) -> Vec<LazyBrowseNode> {
self.inner.state.lock().await.children.clone()
}
/// Returns `true` once [`expand`] has populated this node's children.
///
/// [`expand`]: LazyBrowseNode::expand
pub async fn is_expanded(&self) -> bool {
self.inner.state.lock().await.is_expanded
}
/// Populate this node's children by issuing a paged `BrowseChildren` RPC.
/// Subsequent calls are no-ops — the cached children stay in place and no
/// additional RPC is issued.
pub async fn expand(&self) -> Result<(), Error> {
let mut state = self.inner.state.lock().await;
if state.is_expanded {
return Ok(());
}
let mut client = self.inner.client.clone();
let new_children = client
.browse_children_inner(
Some(self.inner.object.gobject_id),
self.inner.options.clone(),
)
.await?;
state.children = new_children;
state.is_expanded = true;
Ok(())
}
}
/// Convenience alias for the generated Galaxy client wrapped in the
/// authentication interceptor.
@@ -62,18 +182,7 @@ impl GalaxyClient {
})?;
endpoint = endpoint.connect_timeout(options.connect_timeout());
if !options.plaintext() {
let mut tls = ClientTlsConfig::new();
if let Some(server_name) = options.server_name_override() {
tls = tls.domain_name(server_name.to_owned());
}
if let Some(ca_file) = options.ca_file() {
let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
endpoint: options.endpoint().to_owned(),
detail: format!("failed to read CA file {}: {source}", ca_file.display()),
})?;
tls = tls.ca_certificate(Certificate::from_pem(certificate));
}
if let Some(tls) = build_tls_config(&options)? {
endpoint = endpoint.tls_config(tls)?;
}
@@ -172,6 +281,99 @@ impl GalaxyClient {
}
}
/// Browse the top-level (root) objects of the hierarchy as
/// [`LazyBrowseNode`] instances. Pass [`BrowseChildrenOptions`] to
/// restrict the result set; the same filter is reused when callers expand
/// any returned node.
pub async fn browse(
&mut self,
options: Option<BrowseChildrenOptions>,
) -> Result<Vec<LazyBrowseNode>, Error> {
let effective = options.unwrap_or_default();
self.browse_children_inner(None, effective).await
}
/// Issue a single `BrowseChildren` RPC and return the raw reply. Callers
/// that want to drive paging themselves (or inspect the cache sequence)
/// use this; high-level walking goes through [`browse`] and
/// [`LazyBrowseNode::expand`].
///
/// [`browse`]: GalaxyClient::browse
pub async fn browse_children_raw(
&mut self,
request: BrowseChildrenRequest,
) -> Result<BrowseChildrenReply, Error> {
let response = self
.inner
.browse_children(self.unary_request(request))
.await?;
Ok(response.into_inner())
}
pub(crate) async fn browse_children_inner(
&mut self,
parent_gobject_id: Option<i32>,
options: BrowseChildrenOptions,
) -> Result<Vec<LazyBrowseNode>, Error> {
let mut nodes = Vec::new();
let mut page_token = String::new();
let mut seen_page_tokens: HashSet<String> = HashSet::new();
loop {
let parent = parent_gobject_id.map(browse_children_request::Parent::ParentGobjectId);
let request = BrowseChildrenRequest {
page_size: BROWSE_CHILDREN_PAGE_SIZE,
page_token: page_token.clone(),
category_ids: options.category_ids.clone(),
template_chain_contains: options.template_chain_contains.clone(),
tag_name_glob: options.tag_name_glob.clone().unwrap_or_default(),
include_attributes: options.include_attributes,
alarm_bearing_only: options.alarm_bearing_only,
historized_only: options.historized_only,
parent,
};
let reply = self.browse_children_raw(request).await?;
let hints = reply.child_has_children;
for (index, object) in reply.children.into_iter().enumerate() {
let hint = hints.get(index).copied().unwrap_or(false);
nodes.push(self.make_lazy_node(object, hint, options.clone()));
}
page_token = reply.next_page_token;
if page_token.is_empty() {
return Ok(nodes);
}
if !seen_page_tokens.insert(page_token.clone()) {
return Err(Error::InvalidArgument {
name: "page_token".to_owned(),
detail: format!(
"galaxy browse children returned repeated page token `{page_token}`"
),
});
}
}
}
fn make_lazy_node(
&self,
object: GalaxyObject,
has_children_hint: bool,
options: BrowseChildrenOptions,
) -> LazyBrowseNode {
LazyBrowseNode {
inner: Arc::new(LazyBrowseNodeInner {
client: self.clone(),
object,
has_children_hint,
options,
state: AsyncMutex::new(LazyBrowseNodeState {
children: Vec::new(),
is_expanded: false,
}),
}),
}
}
/// Subscribe to the server-streamed deploy-event feed.
///
/// The server emits a bootstrap event describing the current cache state
@@ -234,9 +436,10 @@ mod tests {
GalaxyRepository, GalaxyRepositoryServer,
};
use crate::generated::galaxy_repository::v1::{
DeployEvent, DiscoverHierarchyReply, DiscoverHierarchyRequest, GalaxyAttribute,
GalaxyObject, GetLastDeployTimeReply, GetLastDeployTimeRequest, TestConnectionReply,
TestConnectionRequest, WatchDeployEventsRequest,
BrowseChildrenReply, BrowseChildrenRequest, DeployEvent, DiscoverHierarchyReply,
DiscoverHierarchyRequest, GalaxyAttribute, GalaxyObject, GetLastDeployTimeReply,
GetLastDeployTimeRequest, TestConnectionReply, TestConnectionRequest,
WatchDeployEventsRequest,
};
type DeployEventTx = mpsc::Sender<Result<DeployEvent, Status>>;
@@ -249,6 +452,9 @@ mod tests {
objects: Mutex<Vec<GalaxyObject>>,
discover_requests: Mutex<Vec<DiscoverHierarchyRequest>>,
discover_replies: Mutex<std::collections::VecDeque<DiscoverHierarchyReply>>,
browse_children_calls: Mutex<Vec<BrowseChildrenRequest>>,
browse_children_replies: Mutex<std::collections::VecDeque<BrowseChildrenReply>>,
browse_children_errors: Mutex<Vec<Status>>,
watch_requests: Mutex<Vec<WatchDeployEventsRequest>>,
watch_events: Mutex<Vec<DeployEvent>>,
watch_senders: Mutex<Vec<DeployEventTx>>,
@@ -279,7 +485,7 @@ mod tests {
_request: Request<GetLastDeployTimeRequest>,
) -> Result<Response<GetLastDeployTimeReply>, Status> {
let present = *self.state.present.lock().unwrap();
let time = self.state.last_deploy.lock().unwrap().clone();
let time = *self.state.last_deploy.lock().unwrap();
Ok(Response::new(GetLastDeployTimeReply {
present,
time_of_last_deploy: time,
@@ -306,6 +512,28 @@ mod tests {
}))
}
async fn browse_children(
&self,
request: Request<BrowseChildrenRequest>,
) -> Result<Response<BrowseChildrenReply>, Status> {
self.state
.browse_children_calls
.lock()
.unwrap()
.push(request.into_inner());
if let Some(error) = self.state.browse_children_errors.lock().unwrap().pop() {
return Err(error);
}
let reply = self
.state
.browse_children_replies
.lock()
.unwrap()
.pop_front()
.unwrap_or_default();
Ok(Response::new(reply))
}
type WatchDeployEventsStream =
Pin<Box<dyn tokio_stream::Stream<Item = Result<DeployEvent, Status>> + Send + 'static>>;
@@ -695,4 +923,295 @@ mod tests {
"drop signal channel closed unexpectedly"
);
}
fn browse_obj(gid: i32, tag: &str, is_area: bool) -> GalaxyObject {
GalaxyObject {
gobject_id: gid,
tag_name: tag.to_owned(),
contained_name: String::new(),
browse_name: tag.to_owned(),
parent_gobject_id: 0,
is_area,
category_id: 0,
hosted_by_gobject_id: 0,
template_chain: Vec::new(),
attributes: Vec::new(),
}
}
fn build_browse_reply(
children: Vec<GalaxyObject>,
child_has_children: Vec<bool>,
cache_sequence: u64,
) -> BrowseChildrenReply {
BrowseChildrenReply {
total_child_count: children.len() as i32,
cache_sequence,
children,
child_has_children,
next_page_token: String::new(),
}
}
#[tokio::test]
async fn browse_no_parent_returns_roots() {
let state = Arc::new(FakeState::default());
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(
vec![browse_obj(1, "Area_A", true), browse_obj(2, "Area_B", true)],
vec![true, false],
7,
));
let endpoint = spawn_fake(state.clone()).await;
let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let roots = client.browse(None).await.unwrap();
assert_eq!(roots.len(), 2);
assert_eq!(roots[0].object().tag_name, "Area_A");
assert!(roots[0].has_children_hint());
assert_eq!(roots[1].object().tag_name, "Area_B");
assert!(!roots[1].has_children_hint());
let calls = state.browse_children_calls.lock().unwrap();
assert_eq!(calls.len(), 1);
assert!(
calls[0].parent.is_none(),
"root browse must send an empty parent oneof, got {:?}",
calls[0].parent
);
}
#[tokio::test]
async fn browse_expand_populates_children_and_marks_expanded() {
let state = Arc::new(FakeState::default());
// First call: roots.
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(
vec![browse_obj(10, "Area_A", true)],
vec![true],
1,
));
// Second call: children of gobject 10.
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(
vec![browse_obj(11, "Receiver_1", false)],
vec![false],
1,
));
let endpoint = spawn_fake(state.clone()).await;
let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let roots = client.browse(None).await.unwrap();
let root = roots.into_iter().next().expect("at least one root");
assert!(!root.is_expanded().await);
root.expand().await.unwrap();
assert!(root.is_expanded().await);
let children = root.children().await;
assert_eq!(children.len(), 1);
assert_eq!(children[0].object().tag_name, "Receiver_1");
let calls = state.browse_children_calls.lock().unwrap();
assert_eq!(calls.len(), 2);
let expand_call = &calls[1];
match expand_call.parent.as_ref().expect("expand sends parent") {
browse_children_request::Parent::ParentGobjectId(id) => assert_eq!(*id, 10),
other => panic!("expected ParentGobjectId variant, got {other:?}"),
}
}
#[tokio::test]
async fn browse_expand_idempotent_no_second_rpc() {
let state = Arc::new(FakeState::default());
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(
vec![browse_obj(20, "Area_X", true)],
vec![true],
1,
));
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(
vec![browse_obj(21, "Leaf", false)],
vec![false],
1,
));
let endpoint = spawn_fake(state.clone()).await;
let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let roots = client.browse(None).await.unwrap();
let root = roots.into_iter().next().unwrap();
root.expand().await.unwrap();
let after_first = state.browse_children_calls.lock().unwrap().len();
// Calling expand a second time must NOT issue a new RPC.
root.expand().await.unwrap();
let after_second = state.browse_children_calls.lock().unwrap().len();
assert_eq!(
after_first, after_second,
"expand should be idempotent — no extra RPC the second time"
);
assert_eq!(root.children().await.len(), 1);
}
#[tokio::test]
async fn browse_expand_unknown_parent_returns_not_found_error() {
let state = Arc::new(FakeState::default());
// Root browse succeeds.
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(
vec![browse_obj(99, "GhostArea", true)],
vec![true],
1,
));
let endpoint = spawn_fake(state.clone()).await;
let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let roots = client.browse(None).await.unwrap();
let root = roots.into_iter().next().unwrap();
// Seed the NotFound only AFTER the root call so the FakeGalaxy's
// error stack doesn't intercept the initial browse.
state
.browse_children_errors
.lock()
.unwrap()
.push(Status::not_found("parent gobject 99 not present in cache"));
let error = root.expand().await.unwrap_err();
match &error {
Error::Status(status) => {
assert_eq!(status.code(), tonic::Code::NotFound);
}
other => panic!("expected Error::Status(NotFound), got {other:?}"),
}
// Failed expand must NOT mark the node as expanded — caller can retry.
assert!(!root.is_expanded().await);
assert!(root.children().await.is_empty());
}
#[tokio::test]
async fn browse_expand_multi_page_gathers_all_pages() {
let state = Arc::new(FakeState::default());
// First reply: roots.
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(
vec![browse_obj(30, "Plant", true)],
vec![true],
5,
));
// Second reply: page 1 of children, with a next_page_token.
let mut page_one = build_browse_reply(
vec![
browse_obj(31, "Child_A", false),
browse_obj(32, "Child_B", false),
],
vec![false, false],
5,
);
page_one.next_page_token = "cursor-2".to_owned();
page_one.total_child_count = 3;
state
.browse_children_replies
.lock()
.unwrap()
.push_back(page_one);
// Third reply: page 2 of children, with no next page.
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(
vec![browse_obj(33, "Child_C", false)],
vec![false],
5,
));
let endpoint = spawn_fake(state.clone()).await;
let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let roots = client.browse(None).await.unwrap();
let root = roots.into_iter().next().unwrap();
root.expand().await.unwrap();
let children = root.children().await;
assert_eq!(children.len(), 3);
assert_eq!(children[0].object().tag_name, "Child_A");
assert_eq!(children[1].object().tag_name, "Child_B");
assert_eq!(children[2].object().tag_name, "Child_C");
let calls = state.browse_children_calls.lock().unwrap();
// 1 root call + 2 paged expand calls = 3 total.
assert_eq!(calls.len(), 3);
assert_eq!(calls[1].page_token, "");
assert_eq!(calls[2].page_token, "cursor-2");
}
#[tokio::test]
async fn browse_with_filter_forwards_to_request() {
let state = Arc::new(FakeState::default());
state
.browse_children_replies
.lock()
.unwrap()
.push_back(build_browse_reply(Vec::new(), Vec::new(), 1));
let endpoint = spawn_fake(state.clone()).await;
let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let options = BrowseChildrenOptions {
category_ids: vec![3, 5],
template_chain_contains: vec!["$DelmiaReceiver".to_owned()],
tag_name_glob: Some("Recv_*".to_owned()),
include_attributes: Some(true),
alarm_bearing_only: true,
historized_only: false,
};
let _ = client.browse(Some(options)).await.unwrap();
let calls = state.browse_children_calls.lock().unwrap();
assert_eq!(calls.len(), 1);
let req = &calls[0];
assert_eq!(req.category_ids, vec![3, 5]);
assert_eq!(req.template_chain_contains, vec!["$DelmiaReceiver"]);
assert_eq!(req.tag_name_glob, "Recv_*");
assert_eq!(req.include_attributes, Some(true));
assert!(req.alarm_bearing_only);
assert!(!req.historized_only);
}
}
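Note on the paging test above: `browse_expand_multi_page_gathers_all_pages` expects the first expand request to carry an empty `page_token` and the second to carry `cursor-2`, stopping once a reply comes back with an empty `next_page_token`. The diff does not show the client-side loop itself, so the following is only a minimal sketch of that contract. `Page`, `gather_all_pages`, and the `fetch_page` callback are hypothetical names introduced for illustration and are not part of the crate.

```rust
/// Hypothetical stand-in for one page of a `BrowseChildrenReply`.
#[derive(Debug, Clone, Default)]
struct Page {
    children: Vec<String>,
    /// An empty string means "no further pages", as in the test fixture.
    next_page_token: String,
}

/// Gather every page by re-issuing the request with the previous reply's
/// token until the server returns an empty `next_page_token`.
///
/// `fetch_page` is an assumed callback that performs one RPC for the given
/// token. Any error aborts the loop and discards the partial result, so a
/// caller never stores a half-populated child list.
fn gather_all_pages<F, E>(mut fetch_page: F) -> Result<Vec<String>, E>
where
    F: FnMut(&str) -> Result<Page, E>,
{
    let mut children = Vec::new();
    // The first request carries an empty token.
    let mut token = String::new();
    loop {
        let page = fetch_page(&token)?;
        children.extend(page.children);
        if page.next_page_token.is_empty() {
            return Ok(children);
        }
        token = page.next_page_token;
    }
}
```

Returning early on the first error is what would let a failed expand leave the node unexpanded and retryable, which is the behaviour `browse_expand_unknown_parent_returns_not_found_error` asserts.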
+1 -1
@@ -32,7 +32,7 @@ pub use galaxy::{DeployEventStream, GalaxyClient};
#[doc(inline)]
pub use options::ClientOptions;
#[doc(inline)]
pub use session::Session;
pub use session::{next_correlation_id, Session};
#[doc(inline)]
pub use value::{MxArrayProjection, MxArrayValue, MxStatus, MxValue, MxValueProjection};
#[doc(inline)]
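Note on the new `next_correlation_id` re-export: the helper's doc comment in the session module describes a process-wide atomic counter whose value is appended to a caller-supplied label. The sketch below reproduces that idea in isolation to show why concurrent callers get distinct ids. `SEQUENCE`, `next_id`, `ids_from_threads`, and the `demo-` prefix are illustrative names, not the crate's; the crate's own format is documented as not part of its public contract.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Process-wide counter (illustrative; mirrors the idea, not the crate's item).
static SEQUENCE: AtomicU64 = AtomicU64::new(1);

/// Build an id from a label plus the next counter value.
/// `fetch_add` is an atomic read-modify-write, so no two calls observe the
/// same value even with `Ordering::Relaxed`.
fn next_id(label: &str) -> String {
    let n = SEQUENCE.fetch_add(1, Ordering::Relaxed);
    format!("demo-{label}-{n}")
}

/// Generate ids from several threads at once and return them all.
fn ids_from_threads(threads: usize, per_thread: usize) -> Vec<String> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                (0..per_thread)
                    .map(|_| next_id("read"))
                    .collect::<Vec<String>>()
            })
        })
        .collect();
    handles
        .into_iter()
        .flat_map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

`Relaxed` is enough here because only the uniqueness of each counter value matters, not its ordering relative to other memory operations.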
+96
@@ -3,10 +3,14 @@
//! chain of `with_*` setters; the `Debug` impl redacts the API key.
use std::fmt;
use std::fs;
use std::path::PathBuf;
use std::time::Duration;
use tonic::transport::{Certificate, ClientTlsConfig};
use crate::auth::ApiKey;
use crate::error::Error;
const DEFAULT_MAX_GRPC_MESSAGE_BYTES: usize = 16 * 1024 * 1024;
@@ -22,6 +26,7 @@ pub struct ClientOptions {
api_key: Option<ApiKey>,
plaintext: bool,
ca_file: Option<PathBuf>,
require_certificate_validation: bool,
server_name_override: Option<String>,
connect_timeout: Duration,
call_timeout: Duration,
@@ -38,6 +43,7 @@ impl ClientOptions {
api_key: None,
plaintext: true,
ca_file: None,
require_certificate_validation: false,
server_name_override: None,
connect_timeout: Duration::from_secs(10),
call_timeout: Duration::from_secs(30),
@@ -67,6 +73,22 @@ impl ClientOptions {
self
}
/// Require TLS certificate verification even without a pinned CA. Default
/// false: the gateway's self-signed certificate is accepted (internal-tool
/// posture). Setting a CA file always verifies.
///
/// Note for Rust: tonic 0.13's `ClientTlsConfig` exposes no hook for a
/// custom rustls verifier, so the Rust client cannot accept an arbitrary
/// self-signed certificate the way the other clients do. With the default
/// (false) and no pinned CA, [`crate::client::GatewayClient::connect`]
/// rejects the TLS connection and asks for a CA file. Either pin a CA via
/// [`ClientOptions::with_ca_file`] (the supported lenient path on Rust) or
/// set this `true` to verify against the system trust roots.
pub fn with_require_certificate_validation(mut self, require: bool) -> Self {
self.require_certificate_validation = require;
self
}
/// Override the SNI/server name used during the TLS handshake. Useful
/// when the dial-target host name does not match the certificate.
pub fn with_server_name_override(mut self, server_name_override: impl Into<String>) -> Self {
@@ -95,6 +117,7 @@ impl ClientOptions {
self
}
/// Maximum encoded/decoded gRPC message size in bytes (default 16 MiB).
pub fn with_max_grpc_message_bytes(mut self, max_grpc_message_bytes: usize) -> Self {
self.max_grpc_message_bytes = max_grpc_message_bytes;
self
@@ -120,6 +143,12 @@ impl ClientOptions {
self.ca_file.as_ref()
}
/// Whether TLS certificate verification is required even without a pinned
/// CA. See [`ClientOptions::with_require_certificate_validation`].
pub fn require_certificate_validation(&self) -> bool {
self.require_certificate_validation
}
/// Optional SNI / server-name override for TLS handshakes.
pub fn server_name_override(&self) -> Option<&str> {
self.server_name_override.as_deref()
@@ -140,11 +169,74 @@ impl ClientOptions {
self.stream_timeout
}
/// Configured maximum encoded/decoded gRPC message size in bytes.
pub fn max_grpc_message_bytes(&self) -> usize {
self.max_grpc_message_bytes
}
}
/// Build the [`ClientTlsConfig`] for a non-plaintext connection described by
/// `options`, applying the lenient-default guard that is the **Rust
/// pin-only exception**.
///
/// Returns `Ok(None)` when `options.plaintext()` is `true` (no TLS needed).
/// Returns `Ok(Some(tls))` when a valid TLS config can be assembled.
/// Returns `Err(Error::InvalidEndpoint)` when TLS is requested but no pinned
/// CA was provided and `require_certificate_validation` is `false`.
///
/// # Why this guard exists
///
/// `tonic` 0.13's `ClientTlsConfig` builds its rustls verifier inside a
/// crate-private connector and exposes no hook for a custom
/// `ServerCertVerifier`. The Rust client therefore cannot accept an arbitrary
/// self-signed certificate the way the other language clients do. Rather than
/// silently falling back to system-root verification (which always fails
/// against a self-signed gateway certificate), we reject the configuration
/// early with an actionable error.
pub(crate) fn build_tls_config(options: &ClientOptions) -> Result<Option<ClientTlsConfig>, Error> {
if options.plaintext() {
return Ok(None);
}
let mut tls = ClientTlsConfig::new();
if let Some(server_name) = options.server_name_override() {
tls = tls.domain_name(server_name.to_owned());
}
if let Some(ca_file) = options.ca_file() {
let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
endpoint: options.endpoint().to_owned(),
detail: format!("failed to read CA file {}: {source}", ca_file.display()),
})?;
tls = tls.ca_certificate(Certificate::from_pem(certificate));
} else if !options.require_certificate_validation() {
// Lenient-default fallback (Rust pin-only exception): tonic
// 0.13's `ClientTlsConfig` builds its rustls verifier inside a
// crate-private connector and exposes no hook for a custom
// `ServerCertVerifier`, so — unlike the other clients — the
// Rust client cannot accept an arbitrary self-signed cert. Pin
// the gateway's CA instead, or opt into strict verification
// against the system trust roots. We reject here rather than
// silently verifying against system roots (which would fail a
// self-signed gateway with a confusing handshake error).
//
// Note: a server-name override affects SNI (the hostname sent
// in the TLS ClientHello) but does NOT pin trust. Overriding
// the server name alone does not bypass certificate validation.
return Err(Error::InvalidEndpoint {
endpoint: options.endpoint().to_owned(),
detail: "TLS requested without a pinned CA. The Rust client cannot accept an \
arbitrary self-signed certificate (tonic 0.13 exposes no custom \
rustls verifier). Pin the gateway certificate with \
ClientOptions::with_ca_file, or call \
ClientOptions::with_require_certificate_validation(true) to verify \
against the system trust roots. Note: a server-name override \
affects SNI but does not pin trust."
.to_owned(),
});
}
Ok(Some(tls))
}
impl Default for ClientOptions {
fn default() -> Self {
Self::new("http://127.0.0.1:5000")
@@ -159,6 +251,10 @@ impl fmt::Debug for ClientOptions {
.field("api_key", &self.api_key.as_ref().map(|_| "<redacted>"))
.field("plaintext", &self.plaintext)
.field("ca_file", &self.ca_file)
.field(
"require_certificate_validation",
&self.require_certificate_validation,
)
.field("server_name_override", &self.server_name_override)
.field("connect_timeout", &self.connect_timeout)
.field("call_timeout", &self.call_timeout)
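Note on `build_tls_config`: the guard reduces to a four-way decision on `plaintext`, `ca_file`, and `require_certificate_validation`. The sketch below models only that decision table, without tonic, so the order of the checks is easy to see. `TlsOptions`, `TlsPlan`, and `plan_tls` are hypothetical names introduced here; the real function returns `Option<ClientTlsConfig>` and an `Error::InvalidEndpoint`, and the "system roots" branch is labelled according to the doc comments in the diff rather than verified against tonic.

```rust
use std::path::PathBuf;

/// Illustrative subset of the options the guard inspects.
#[derive(Debug, Clone, Default)]
struct TlsOptions {
    plaintext: bool,
    ca_file: Option<PathBuf>,
    require_certificate_validation: bool,
}

/// Outcome of the decision, in place of a real TLS config.
#[derive(Debug, PartialEq, Eq)]
enum TlsPlan {
    /// No TLS at all (`Ok(None)` in the diff).
    Plaintext,
    /// Verify against the pinned CA file.
    PinnedCa(PathBuf),
    /// Strict mode without a pinned CA (system trust roots, per the diff's docs).
    SystemRoots,
}

/// Apply the checks in the same order as the diff: plaintext first, then a
/// pinned CA, then the strict flag; anything else is rejected early.
fn plan_tls(options: &TlsOptions) -> Result<TlsPlan, String> {
    if options.plaintext {
        return Ok(TlsPlan::Plaintext);
    }
    if let Some(ca_file) = &options.ca_file {
        return Ok(TlsPlan::PinnedCa(ca_file.clone()));
    }
    if options.require_certificate_validation {
        return Ok(TlsPlan::SystemRoots);
    }
    Err("TLS requested without a pinned CA".to_owned())
}
```

A pinned CA takes precedence over the strict flag, and a server-name override does not appear in the table at all, which matches the diff's remark that it affects SNI but does not pin trust.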
+104 -61
@@ -8,6 +8,8 @@
//! Bulk commands enforce a 1000-item cap before contacting the worker, in
//! line with the gateway's documented `MAX_BULK_ITEMS`.
use std::sync::atomic::{AtomicU64, Ordering};
use crate::client::{EventStream, GatewayClient};
use crate::error::{ensure_protocol_success, Error};
use crate::generated::mxaccess_gateway::v1::mx_command::Payload;
@@ -26,6 +28,28 @@ use crate::value::MxValue;
const MAX_BULK_ITEMS: usize = 1_000;
/// Process-wide monotonic sequence used by [`next_correlation_id`].
static CORRELATION_SEQUENCE: AtomicU64 = AtomicU64::new(1);
/// Build a per-call correlation id that embeds the supplied `label`.
///
/// The returned token is opaque and guaranteed to be unique within the
/// current process: every call increments a process-wide atomic counter,
/// so concurrent CLI smokes and library callers on the same machine produce
/// distinct ids that gateway logs can tell apart. The token carries no
/// embedded secret beyond `label`.
///
/// The exact textual format (currently `rust-client-{label}-{N}`) is *not*
/// part of the public contract — callers must not parse it. The crate root
/// re-exports this helper as
/// [`zb_mom_ww_mxgateway_client::next_correlation_id`] so out-of-tree
/// consumers can build correlation ids without referencing the `session`
/// module path.
pub fn next_correlation_id(label: &str) -> String {
let sequence = CORRELATION_SEQUENCE.fetch_add(1, Ordering::Relaxed);
format!("rust-client-{label}-{sequence}")
}
/// Handle to an opened gateway session.
///
/// `Session` carries the gateway-issued session id and a cloned
@@ -79,7 +103,7 @@ impl Session {
.client
.close_session_raw(CloseSessionRequest {
session_id: self.id.clone(),
client_correlation_id: "rust-client-close-session".to_owned(),
client_correlation_id: next_correlation_id("close-session"),
})
.await?;
ensure_protocol_success("close session", reply.protocol_status.as_ref())?;
@@ -102,7 +126,7 @@ impl Session {
)
.await?;
Ok(register_server_handle(&reply))
register_server_handle(&reply)
}
/// Run MXAccess `AddItem` against `server_handle` and return the
@@ -123,7 +147,7 @@ impl Session {
)
.await?;
Ok(add_item_handle(&reply))
add_item_handle(&reply)
}
/// Run MXAccess `AddItem2` (item with a caller-supplied context string)
@@ -149,7 +173,7 @@ impl Session {
)
.await?;
Ok(add_item2_handle(&reply))
add_item2_handle(&reply)
}
/// Run MXAccess `RemoveItem` for the given handle pair.
@@ -229,7 +253,7 @@ impl Session {
)
.await?;
Ok(bulk_results(reply, BulkReplyKind::AddItemBulk))
bulk_results(reply, BulkReplyKind::AddItem)
}
/// Bulk variant of [`Session::advise`].
@@ -253,7 +277,7 @@ impl Session {
)
.await?;
Ok(bulk_results(reply, BulkReplyKind::AdviseItemBulk))
bulk_results(reply, BulkReplyKind::AdviseItem)
}
/// Bulk variant of [`Session::remove_item`].
@@ -277,7 +301,7 @@ impl Session {
)
.await?;
Ok(bulk_results(reply, BulkReplyKind::RemoveItemBulk))
bulk_results(reply, BulkReplyKind::RemoveItem)
}
/// Bulk variant of [`Session::un_advise`].
@@ -301,7 +325,7 @@ impl Session {
)
.await?;
Ok(bulk_results(reply, BulkReplyKind::UnAdviseItemBulk))
bulk_results(reply, BulkReplyKind::UnAdviseItem)
}
/// Bulk `Subscribe` (atomic add-and-advise) for a list of tag addresses.
@@ -325,7 +349,7 @@ impl Session {
)
.await?;
Ok(bulk_results(reply, BulkReplyKind::SubscribeBulk))
bulk_results(reply, BulkReplyKind::Subscribe)
}
/// Bulk `Unsubscribe` (atomic un-advise-and-remove) for a list of
@@ -350,7 +374,7 @@ impl Session {
)
.await?;
Ok(bulk_results(reply, BulkReplyKind::UnsubscribeBulk))
bulk_results(reply, BulkReplyKind::Unsubscribe)
}
/// Bulk `Read` — snapshot the current value for each requested tag.
@@ -366,10 +390,10 @@ impl Session {
/// # Errors
///
/// Same conditions as [`Session::add_item_bulk`].
pub async fn read_bulk(
pub async fn read_bulk<S: AsRef<str>>(
&self,
server_handle: i32,
tag_addresses: Vec<String>,
tag_addresses: &[S],
timeout_ms: u32,
) -> Result<Vec<BulkReadResult>, Error> {
ensure_bulk_size("tag_addresses", tag_addresses.len())?;
@@ -378,16 +402,21 @@ impl Session {
MxCommandKind::ReadBulk,
Payload::ReadBulk(ReadBulkCommand {
server_handle,
tag_addresses,
tag_addresses: tag_addresses
.iter()
.map(|tag| tag.as_ref().to_owned())
.collect(),
timeout_ms,
}),
)
.await?;
Ok(match reply.payload {
Some(mx_command_reply::Payload::ReadBulk(reply)) => reply.results,
_ => Vec::new(),
})
match reply.payload {
Some(mx_command_reply::Payload::ReadBulk(reply)) => Ok(reply.results),
_ => Err(Error::MalformedReply {
detail: "read_bulk reply did not carry a ReadBulk payload".to_owned(),
}),
}
}
/// Bulk `Write` (sequential MXAccess Write per entry, on the worker's STA).
@@ -416,7 +445,7 @@ impl Session {
)
.await?;
Ok(bulk_write_results(reply, BulkWriteReplyKind::Write))
bulk_write_results(reply, BulkWriteReplyKind::Write)
}
/// Bulk `Write2` (timestamped) — see [`Session::write_bulk`].
@@ -440,7 +469,7 @@ impl Session {
)
.await?;
Ok(bulk_write_results(reply, BulkWriteReplyKind::Write2))
bulk_write_results(reply, BulkWriteReplyKind::Write2)
}
/// Bulk `WriteSecured` — credential-sensitive values follow the same
@@ -465,7 +494,7 @@ impl Session {
)
.await?;
Ok(bulk_write_results(reply, BulkWriteReplyKind::WriteSecured))
bulk_write_results(reply, BulkWriteReplyKind::WriteSecured)
}
/// Bulk `WriteSecured2` (timestamped) — see [`Session::write_secured_bulk`].
@@ -489,7 +518,7 @@ impl Session {
)
.await?;
Ok(bulk_write_results(reply, BulkWriteReplyKind::WriteSecured2))
bulk_write_results(reply, BulkWriteReplyKind::WriteSecured2)
}
/// Run MXAccess `Write` (single-value, no caller-supplied timestamp).
@@ -608,7 +637,7 @@ impl Session {
fn command_request(&self, kind: MxCommandKind, payload: Payload) -> MxCommandRequest {
MxCommandRequest {
session_id: self.id.clone(),
client_correlation_id: format!("rust-client-{}", kind.as_str_name()),
client_correlation_id: next_correlation_id(kind.as_str_name()),
command: Some(MxCommand {
kind: kind as i32,
payload: Some(payload),
@@ -628,71 +657,80 @@ fn ensure_bulk_size(name: &'static str, len: usize) -> Result<(), Error> {
}
}
fn register_server_handle(reply: &MxCommandReply) -> i32 {
fn register_server_handle(reply: &MxCommandReply) -> Result<i32, Error> {
match reply.payload.as_ref() {
Some(mx_command_reply::Payload::Register(register)) => register.server_handle,
Some(mx_command_reply::Payload::Register(register)) => Ok(register.server_handle),
_ => reply
.return_value
.as_ref()
.and_then(int32_reply_value)
.unwrap_or_default(),
.ok_or_else(|| Error::MalformedReply {
detail: "register reply lacked a server_handle payload or int32 return_value"
.to_owned(),
}),
}
}
fn add_item_handle(reply: &MxCommandReply) -> i32 {
fn add_item_handle(reply: &MxCommandReply) -> Result<i32, Error> {
match reply.payload.as_ref() {
Some(mx_command_reply::Payload::AddItem(add_item)) => add_item.item_handle,
Some(mx_command_reply::Payload::AddItem(add_item)) => Ok(add_item.item_handle),
_ => reply
.return_value
.as_ref()
.and_then(int32_reply_value)
.unwrap_or_default(),
.ok_or_else(|| Error::MalformedReply {
detail: "add_item reply lacked an item_handle payload or int32 return_value"
.to_owned(),
}),
}
}
fn add_item2_handle(reply: &MxCommandReply) -> i32 {
fn add_item2_handle(reply: &MxCommandReply) -> Result<i32, Error> {
match reply.payload.as_ref() {
Some(mx_command_reply::Payload::AddItem2(add_item)) => add_item.item_handle,
Some(mx_command_reply::Payload::AddItem2(add_item)) => Ok(add_item.item_handle),
_ => reply
.return_value
.as_ref()
.and_then(int32_reply_value)
.unwrap_or_default(),
.ok_or_else(|| Error::MalformedReply {
detail: "add_item2 reply lacked an item_handle payload or int32 return_value"
.to_owned(),
}),
}
}
enum BulkReplyKind {
AddItemBulk,
AdviseItemBulk,
RemoveItemBulk,
UnAdviseItemBulk,
SubscribeBulk,
UnsubscribeBulk,
AddItem,
AdviseItem,
RemoveItem,
UnAdviseItem,
Subscribe,
Unsubscribe,
}
fn bulk_results(reply: MxCommandReply, kind: BulkReplyKind) -> Vec<SubscribeResult> {
fn bulk_results(reply: MxCommandReply, kind: BulkReplyKind) -> Result<Vec<SubscribeResult>, Error> {
match (reply.payload, kind) {
(Some(mx_command_reply::Payload::AddItemBulk(reply)), BulkReplyKind::AddItemBulk) => {
reply.results
(Some(mx_command_reply::Payload::AddItemBulk(reply)), BulkReplyKind::AddItem) => {
Ok(reply.results)
}
(Some(mx_command_reply::Payload::AdviseItemBulk(reply)), BulkReplyKind::AdviseItemBulk) => {
reply.results
(Some(mx_command_reply::Payload::AdviseItemBulk(reply)), BulkReplyKind::AdviseItem) => {
Ok(reply.results)
}
(Some(mx_command_reply::Payload::RemoveItemBulk(reply)), BulkReplyKind::RemoveItemBulk) => {
reply.results
(Some(mx_command_reply::Payload::RemoveItemBulk(reply)), BulkReplyKind::RemoveItem) => {
Ok(reply.results)
}
(
Some(mx_command_reply::Payload::UnAdviseItemBulk(reply)),
BulkReplyKind::UnAdviseItemBulk,
) => reply.results,
(Some(mx_command_reply::Payload::SubscribeBulk(reply)), BulkReplyKind::SubscribeBulk) => {
reply.results
(Some(mx_command_reply::Payload::UnAdviseItemBulk(reply)), BulkReplyKind::UnAdviseItem) => {
Ok(reply.results)
}
(
Some(mx_command_reply::Payload::UnsubscribeBulk(reply)),
BulkReplyKind::UnsubscribeBulk,
) => reply.results,
_ => Vec::new(),
(Some(mx_command_reply::Payload::SubscribeBulk(reply)), BulkReplyKind::Subscribe) => {
Ok(reply.results)
}
(Some(mx_command_reply::Payload::UnsubscribeBulk(reply)), BulkReplyKind::Unsubscribe) => {
Ok(reply.results)
}
_ => Err(Error::MalformedReply {
detail: "bulk subscribe reply did not carry the expected payload arm".to_owned(),
}),
}
}
@@ -703,23 +741,28 @@ enum BulkWriteReplyKind {
WriteSecured2,
}
fn bulk_write_results(reply: MxCommandReply, kind: BulkWriteReplyKind) -> Vec<BulkWriteResult> {
fn bulk_write_results(
reply: MxCommandReply,
kind: BulkWriteReplyKind,
) -> Result<Vec<BulkWriteResult>, Error> {
match (reply.payload, kind) {
(Some(mx_command_reply::Payload::WriteBulk(reply)), BulkWriteReplyKind::Write) => {
reply.results
Ok(reply.results)
}
(Some(mx_command_reply::Payload::Write2Bulk(reply)), BulkWriteReplyKind::Write2) => {
reply.results
Ok(reply.results)
}
(
Some(mx_command_reply::Payload::WriteSecuredBulk(reply)),
BulkWriteReplyKind::WriteSecured,
) => reply.results,
) => Ok(reply.results),
(
Some(mx_command_reply::Payload::WriteSecured2Bulk(reply)),
BulkWriteReplyKind::WriteSecured2,
) => reply.results,
_ => Vec::new(),
) => Ok(reply.results),
_ => Err(Error::MalformedReply {
detail: "bulk write reply did not carry the expected payload arm".to_owned(),
}),
}
}
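Note on the `MalformedReply` change in this file: the handle extractors and bulk helpers previously fell back to `unwrap_or_default()` or `Vec::new()`, so a reply with no usable payload looked like handle `0` or an empty result list. The sketch below shows the replacement pattern on a cut-down model. `Reply`, `Payload`, `ClientError`, and `server_handle` are simplified stand-ins introduced for illustration, not the generated protobuf types.

```rust
/// Simplified stand-in for the reply's payload oneof.
#[derive(Debug, PartialEq)]
enum Payload {
    Register { server_handle: i32 },
    AddItem { item_handle: i32 },
}

/// Simplified stand-in for `MxCommandReply`.
#[derive(Debug, Default)]
struct Reply {
    payload: Option<Payload>,
    /// Legacy int32 return value used as a fallback.
    return_value: Option<i32>,
}

/// Simplified stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
enum ClientError {
    MalformedReply { detail: String },
}

/// Prefer the typed payload arm, fall back to the int32 return value, and
/// surface an error instead of silently defaulting to 0.
fn server_handle(reply: &Reply) -> Result<i32, ClientError> {
    match reply.payload.as_ref() {
        Some(Payload::Register { server_handle }) => Ok(*server_handle),
        _ => reply.return_value.ok_or_else(|| ClientError::MalformedReply {
            detail: "register reply lacked a server_handle payload or int32 return_value"
                .to_owned(),
        }),
    }
}
```

As in the diff, a mismatched payload arm still falls through to the `return_value` fallback; only a reply with neither source of a handle becomes an error.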
+569 -11
@@ -20,16 +20,19 @@ use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_access_gatew
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_command_reply;
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_value::Kind;
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::{
AcknowledgeAlarmReply, AcknowledgeAlarmRequest, ActiveAlarmSnapshot, AddItemReply,
AlarmFeedMessage, BulkSubscribeReply, CloseSessionReply, CloseSessionRequest, MxCommandKind,
MxCommandReply, MxDataType, MxEvent, MxEventFamily, MxStatusCategory, MxStatusProxy,
MxStatusSource, MxValue, OpenSessionReply, OpenSessionRequest, ProtocolStatus,
ProtocolStatusCode, QueryActiveAlarmsRequest, SessionState, StreamAlarmsRequest,
StreamEventsRequest, SubscribeResult,
alarm_feed_message, AcknowledgeAlarmReply, AcknowledgeAlarmRequest, ActiveAlarmSnapshot,
AddItem2Reply, AddItemReply, AlarmConditionState, AlarmFeedMessage, AlarmTransitionKind,
BulkReadReply, BulkReadResult, BulkSubscribeReply, BulkWriteReply, BulkWriteResult,
CloseSessionReply, CloseSessionRequest, MxCommandKind, MxCommandReply, MxDataType, MxEvent,
MxEventFamily, MxStatusCategory, MxStatusProxy, MxStatusSource, MxValue,
OnAlarmTransitionEvent, OpenSessionReply, OpenSessionRequest, ProtocolStatus,
ProtocolStatusCode, QueryActiveAlarmsRequest, RegisterReply, SessionState, StreamAlarmsRequest,
StreamEventsRequest, SubscribeResult, Write2BulkEntry, WriteBulkEntry, WriteSecured2BulkEntry,
WriteSecuredBulkEntry,
};
use zb_mom_ww_mxgateway_client::{
ApiKey, ClientOptions, CommandError, Error, GatewayClient, MxStatus, MxValue as ClientMxValue,
MxValueProjection,
next_correlation_id, ApiKey, ClientOptions, CommandError, Error, GatewayClient, MxStatus,
MxValue as ClientMxValue, MxValueProjection,
};
#[tokio::test]
@@ -272,11 +275,414 @@ fn command_error_display_keeps_raw_reply_accessible() {
assert!(error.to_string().contains("MxaccessFailure"));
}
// ---- Client.Rust-022 / 024 regression coverage ---------------------------
#[tokio::test]
async fn register_returns_malformed_reply_when_ok_reply_has_no_payload() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkWithoutPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session.register("client").await.unwrap_err();
assert!(
matches!(error, Error::MalformedReply { .. }),
"expected MalformedReply, got {error:?}"
);
}
#[tokio::test]
async fn add_item_returns_malformed_reply_when_ok_reply_has_no_payload() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkWithoutPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session.add_item(12, "Plant.Area.Tag").await.unwrap_err();
assert!(
matches!(error, Error::MalformedReply { .. }),
"expected MalformedReply, got {error:?}"
);
}
#[tokio::test]
async fn add_item2_returns_malformed_reply_when_ok_reply_has_no_payload() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkWithoutPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session
.add_item2(12, "Plant.Area.Tag", "ctx")
.await
.unwrap_err();
assert!(
matches!(error, Error::MalformedReply { .. }),
"expected MalformedReply, got {error:?}"
);
}
#[tokio::test]
async fn subscribe_bulk_returns_malformed_reply_on_mismatched_payload_arm() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkWithMismatchedPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session
.subscribe_bulk(12, vec!["Area001.Pump001.Speed".to_owned()])
.await
.unwrap_err();
assert!(
matches!(error, Error::MalformedReply { .. }),
"expected MalformedReply, got {error:?}"
);
}
#[tokio::test]
async fn read_bulk_returns_malformed_reply_on_mismatched_payload_arm() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkWithMismatchedPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session
.read_bulk(12, &["Area001.Pump001.Speed".to_owned()], 1000)
.await
.unwrap_err();
assert!(
matches!(error, Error::MalformedReply { .. }),
"expected MalformedReply, got {error:?}"
);
}
#[tokio::test]
async fn write_bulk_returns_malformed_reply_on_mismatched_payload_arm() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkWithMismatchedPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session
.write_bulk(
12,
vec![WriteBulkEntry {
item_handle: 34,
value: Some(ClientMxValue::int32(1).into_proto()),
user_id: 0,
}],
)
.await
.unwrap_err();
assert!(
matches!(error, Error::MalformedReply { .. }),
"expected MalformedReply, got {error:?}"
);
}
#[tokio::test]
async fn unary_invoke_maps_status_unavailable_to_error_unavailable() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await =
Some(InvokeOverride::Unavailable("gateway restarting".to_owned()));
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session.add_item(12, "Plant.Area.Tag").await.unwrap_err();
assert!(
matches!(error, Error::Unavailable { .. }),
"expected Unavailable, got {error:?}"
);
}
#[tokio::test]
async fn read_bulk_round_trips_through_the_fake_gateway() {
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let results = session
.read_bulk(12, &["Area001.Pump001.Speed".to_owned()], 1000)
.await
.unwrap();
assert_eq!(results.len(), 1);
assert!(results[0].was_successful);
assert!(results[0].was_cached);
}
#[tokio::test]
async fn write_bulk_round_trips_through_the_fake_gateway() {
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let results = session
.write_bulk(
12,
vec![WriteBulkEntry {
item_handle: 34,
value: Some(ClientMxValue::int32(1).into_proto()),
user_id: 0,
}],
)
.await
.unwrap();
assert_eq!(results.len(), 1);
assert!(results[0].was_successful);
let last_command = state.last_command_kind.lock().await;
assert_eq!(*last_command, Some(MxCommandKind::WriteBulk as i32));
}
#[tokio::test]
async fn write2_bulk_round_trips_through_the_fake_gateway() {
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let results = session
.write2_bulk(
12,
vec![Write2BulkEntry {
item_handle: 34,
value: Some(ClientMxValue::int32(1).into_proto()),
timestamp_value: Some(ClientMxValue::string("2026-05-24T00:00:00Z").into_proto()),
user_id: 0,
}],
)
.await
.unwrap();
assert_eq!(results.len(), 1);
assert!(results[0].was_successful);
let last_command = state.last_command_kind.lock().await;
assert_eq!(*last_command, Some(MxCommandKind::Write2Bulk as i32));
}
#[tokio::test]
async fn write_secured_bulk_round_trips_through_the_fake_gateway() {
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let results = session
.write_secured_bulk(
12,
vec![WriteSecuredBulkEntry {
item_handle: 34,
value: Some(ClientMxValue::int32(1).into_proto()),
current_user_id: 0,
verifier_user_id: 0,
}],
)
.await
.unwrap();
assert_eq!(results.len(), 1);
assert!(results[0].was_successful);
let last_command = state.last_command_kind.lock().await;
assert_eq!(*last_command, Some(MxCommandKind::WriteSecuredBulk as i32));
}
#[tokio::test]
async fn write_secured2_bulk_round_trips_through_the_fake_gateway() {
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let results = session
.write_secured2_bulk(
12,
vec![WriteSecured2BulkEntry {
item_handle: 34,
value: Some(ClientMxValue::int32(1).into_proto()),
timestamp_value: Some(ClientMxValue::string("2026-05-24T00:00:00Z").into_proto()),
current_user_id: 0,
verifier_user_id: 0,
}],
)
.await
.unwrap();
assert_eq!(results.len(), 1);
assert!(results[0].was_successful);
let last_command = state.last_command_kind.lock().await;
assert_eq!(*last_command, Some(MxCommandKind::WriteSecured2Bulk as i32));
}
#[tokio::test]
async fn stream_alarms_emits_snapshot_then_complete_then_transition_in_order() {
let state = Arc::new(FakeState::default());
*state.stream_alarms_script.lock().await = Some(vec![
AlarmFeedMessage {
payload: Some(alarm_feed_message::Payload::ActiveAlarm(
ActiveAlarmSnapshot {
alarm_full_reference: "Tank01.Level.HiHi".to_owned(),
current_state: AlarmConditionState::Active as i32,
..ActiveAlarmSnapshot::default()
},
)),
},
AlarmFeedMessage {
payload: Some(alarm_feed_message::Payload::SnapshotComplete(true)),
},
AlarmFeedMessage {
payload: Some(alarm_feed_message::Payload::Transition(
OnAlarmTransitionEvent {
alarm_full_reference: "Tank01.Level.HiHi".to_owned(),
transition_kind: AlarmTransitionKind::Raise as i32,
..OnAlarmTransitionEvent::default()
},
)),
},
]);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let mut stream = client
.stream_alarms(StreamAlarmsRequest {
client_correlation_id: next_correlation_id("test-stream-alarms"),
alarm_filter_prefix: String::new(),
})
.await
.unwrap();
let first = stream.next().await.unwrap().unwrap();
let second = stream.next().await.unwrap().unwrap();
let third = stream.next().await.unwrap().unwrap();
assert!(matches!(
first.payload,
Some(alarm_feed_message::Payload::ActiveAlarm(_))
));
assert!(matches!(
second.payload,
Some(alarm_feed_message::Payload::SnapshotComplete(true))
));
assert!(matches!(
third.payload,
Some(alarm_feed_message::Payload::Transition(_))
));
}
#[tokio::test]
async fn cli_subcommands_propagate_unique_correlation_ids_from_next_correlation_id() {
// The CLI's `stream-alarms` and `acknowledge-alarm` paths used to
// hard-code their correlation ids (Client.Rust-023). Verify the
// resolution end-to-end through `next_correlation_id`: every call
// observed at the fake gateway has a unique id that embeds the
// `cli-...` label, so concurrent CLI smoke runs can be told apart.
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let first_corr = next_correlation_id("cli-stream-alarms");
let _ = client
.stream_alarms(StreamAlarmsRequest {
client_correlation_id: first_corr.clone(),
alarm_filter_prefix: String::new(),
})
.await
.unwrap();
assert_eq!(
*state.last_correlation_id.lock().await,
Some(first_corr.clone())
);
let second_corr = next_correlation_id("cli-stream-alarms");
assert_ne!(first_corr, second_corr);
assert!(second_corr.contains("cli-stream-alarms"));
let third_corr = next_correlation_id("cli-acknowledge-alarm");
let _ = client
.acknowledge_alarm(AcknowledgeAlarmRequest {
client_correlation_id: third_corr.clone(),
alarm_full_reference: "Tank01.Level.HiHi".to_owned(),
comment: String::new(),
operator_user: String::new(),
})
.await
.unwrap();
assert_eq!(*state.last_correlation_id.lock().await, Some(third_corr));
}
#[derive(Default)]
struct FakeState {
authorization: Mutex<Option<String>>,
last_command_kind: Mutex<Option<i32>>,
last_correlation_id: Mutex<Option<String>>,
stream_dropped: Arc<AtomicBool>,
/// Optional per-test override that pins the fake's `Invoke` handler to
/// a specific reply shape (or `Err(Status)`). The default of `None`
/// keeps the existing happy-path dispatcher.
invoke_override: Mutex<Option<InvokeOverride>>,
/// Optional per-test script for the fake's `StreamAlarms` handler: when
/// set, the handler replays these messages in order (e.g. active-alarm
/// snapshot -> snapshot_complete -> transition) and then ends the stream.
stream_alarms_script: Mutex<Option<Vec<AlarmFeedMessage>>>,
}
/// Per-test override for the fake's `Invoke` handler.
#[allow(dead_code)]
enum InvokeOverride {
/// Reply with `protocol_status = Ok` and no `payload` set.
OkWithoutPayload,
/// Reply with `protocol_status = Ok` and a deliberately wrong payload
/// arm: always an `AddItemReply` body, whatever command the caller
/// actually invoked (e.g. a bulk command). The variant carries no data;
/// the mismatch comes from the fixed reply shape.
OkWithMismatchedPayload,
/// Fail the unary call with `Status::unavailable(...)` so the client's
/// `Code::Unavailable` -> `Error::Unavailable` mapping is exercised.
Unavailable(String),
}
#[derive(Clone)]
@@ -331,6 +737,35 @@ impl MxAccessGateway for FakeGateway {
.map(|command| command.kind)
.unwrap_or_default();
*self.state.last_command_kind.lock().await = Some(kind);
*self.state.last_correlation_id.lock().await = Some(request.client_correlation_id.clone());
// Honour any per-test override before falling through to the
// happy-path dispatcher.
if let Some(override_) = self.state.invoke_override.lock().await.take() {
return match override_ {
InvokeOverride::OkWithoutPayload => Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("command ok")),
payload: None,
..MxCommandReply::default()
})),
InvokeOverride::OkWithMismatchedPayload => Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("command ok")),
// Deliberately the wrong payload arm — `AddItemReply`
// for whatever command was actually invoked.
payload: Some(mx_command_reply::Payload::AddItem(AddItemReply {
item_handle: 99,
})),
..MxCommandReply::default()
})),
InvokeOverride::Unavailable(message) => Err(Status::unavailable(message)),
};
}
if kind == MxCommandKind::Write as i32 {
return Ok(Response::new(mxaccess_failure_reply()));
@@ -357,6 +792,92 @@ impl MxAccessGateway for FakeGateway {
}));
}
if kind == MxCommandKind::Register as i32 {
return Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("command ok")),
payload: Some(mx_command_reply::Payload::Register(RegisterReply {
server_handle: 12,
})),
..MxCommandReply::default()
}));
}
if kind == MxCommandKind::AddItem2 as i32 {
return Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("command ok")),
payload: Some(mx_command_reply::Payload::AddItem2(AddItem2Reply {
item_handle: 56,
})),
..MxCommandReply::default()
}));
}
if kind == MxCommandKind::ReadBulk as i32 {
return Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("command ok")),
payload: Some(mx_command_reply::Payload::ReadBulk(BulkReadReply {
results: vec![BulkReadResult {
server_handle: 12,
tag_address: "Area001.Pump001.Speed".to_owned(),
item_handle: 34,
was_successful: true,
was_cached: true,
..BulkReadResult::default()
}],
})),
..MxCommandReply::default()
}));
}
if kind == MxCommandKind::WriteBulk as i32 {
return Ok(Response::new(write_bulk_reply_for(
request.session_id,
kind,
mx_command_reply::Payload::WriteBulk(BulkWriteReply {
results: vec![bulk_write_result_ok(12, 34)],
}),
)));
}
if kind == MxCommandKind::Write2Bulk as i32 {
return Ok(Response::new(write_bulk_reply_for(
request.session_id,
kind,
mx_command_reply::Payload::Write2Bulk(BulkWriteReply {
results: vec![bulk_write_result_ok(12, 34)],
}),
)));
}
if kind == MxCommandKind::WriteSecuredBulk as i32 {
return Ok(Response::new(write_bulk_reply_for(
request.session_id,
kind,
mx_command_reply::Payload::WriteSecuredBulk(BulkWriteReply {
results: vec![bulk_write_result_ok(12, 34)],
}),
)));
}
if kind == MxCommandKind::WriteSecured2Bulk as i32 {
return Ok(Response::new(write_bulk_reply_for(
request.session_id,
kind,
mx_command_reply::Payload::WriteSecured2Bulk(BulkWriteReply {
results: vec![bulk_write_result_ok(12, 34)],
}),
)));
}
Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
@@ -387,8 +908,10 @@ impl MxAccessGateway for FakeGateway {
async fn acknowledge_alarm(
&self,
_request: Request<AcknowledgeAlarmRequest>,
request: Request<AcknowledgeAlarmRequest>,
) -> Result<Response<AcknowledgeAlarmReply>, Status> {
*self.state.last_correlation_id.lock().await =
Some(request.into_inner().client_correlation_id);
Ok(Response::new(AcknowledgeAlarmReply {
correlation_id: "corr-1".to_owned(),
protocol_status: Some(ok_status("ack ok")),
@@ -407,9 +930,18 @@ impl MxAccessGateway for FakeGateway {
async fn stream_alarms(
&self,
_request: Request<StreamAlarmsRequest>,
request: Request<StreamAlarmsRequest>,
) -> Result<Response<Self::StreamAlarmsStream>, Status> {
let (_sender, receiver) = mpsc::channel::<Result<AlarmFeedMessage, Status>>(1);
*self.state.last_correlation_id.lock().await =
Some(request.into_inner().client_correlation_id);
let script = self.state.stream_alarms_script.lock().await.take();
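// tokio's bounded channel panics on a zero capacity, so an empty script
// still gets a one-slot buffer.
let (sender, receiver) = mpsc::channel::<Result<AlarmFeedMessage, Status>>(
script.as_ref().map_or(1, |messages| messages.len().max(1)),
);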
if let Some(messages) = script {
for message in messages {
sender.send(Ok(message)).await.unwrap();
}
}
let stream = ReceiverStream::new(receiver);
Ok(Response::new(Box::pin(stream)))
}
@@ -469,6 +1001,32 @@ async fn spawn_fake_gateway(state: Arc<FakeState>) -> String {
format!("http://{address}")
}
fn write_bulk_reply_for(
session_id: String,
kind: i32,
payload: mx_command_reply::Payload,
) -> MxCommandReply {
MxCommandReply {
session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("command ok")),
payload: Some(payload),
..MxCommandReply::default()
}
}
fn bulk_write_result_ok(server_handle: i32, item_handle: i32) -> BulkWriteResult {
BulkWriteResult {
server_handle,
item_handle,
was_successful: true,
hresult: Some(0),
statuses: Vec::new(),
error_message: String::new(),
}
}
fn ok_status(message: &str) -> ProtocolStatus {
ProtocolStatus {
code: ProtocolStatusCode::Ok as i32,
+137
@@ -0,0 +1,137 @@
//! TLS posture coverage for the Rust client.
//!
//! tonic 0.13.1's `ClientTlsConfig` exposes no hook for a custom rustls
//! `ServerCertVerifier` (the verifier is built internally inside the
//! crate-private `TlsConnector`), so the Rust client cannot implement the
//! "accept any server certificate" lenient default the other clients use.
//! Rust is therefore the documented **pin-only exception**: TLS without a
//! pinned CA is rejected up front with a clear, actionable error, and
//! supplying a CA file is the supported path. These tests pin that contract.
use std::time::Duration;
use zb_mom_ww_mxgateway_client::{ClientOptions, Error, GalaxyClient, GatewayClient};
/// Drive `connect` to its error without requiring `GatewayClient: Debug`
/// (the success arm is dropped explicitly so `unwrap_err` is unnecessary).
async fn connect_err(options: ClientOptions) -> Error {
match GatewayClient::connect(options).await {
Ok(_client) => panic!("connect unexpectedly succeeded against a dead TLS address"),
Err(error) => error,
}
}
#[tokio::test]
async fn tls_without_ca_is_rejected_with_actionable_error_by_default() {
let options = ClientOptions::new("https://127.0.0.1:1")
.with_plaintext(false)
.with_connect_timeout(Duration::from_millis(200));
let error = connect_err(options).await;
let Error::InvalidEndpoint { detail, .. } = error else {
panic!("expected InvalidEndpoint, got {error:?}");
};
// The message must point the caller at the supported remedy (pin a CA)
// and name the opt-in escape hatch.
assert!(
detail.contains("ca_file") || detail.contains("CA"),
"error should instruct the user to pass a CA file: {detail}"
);
assert!(
detail.contains("require_certificate_validation"),
"error should mention the require_certificate_validation opt-in: {detail}"
);
}
#[tokio::test]
async fn tls_with_require_certificate_validation_does_not_short_circuit() {
// With strict verification opted in, the no-CA guard must not fire; the
// connect attempt instead proceeds to the transport (and fails to reach
// the dead address) rather than returning the "CA required" guard error.
let options = ClientOptions::new("https://127.0.0.1:1")
.with_plaintext(false)
.with_require_certificate_validation(true)
.with_connect_timeout(Duration::from_millis(200));
let error = connect_err(options).await;
assert!(
!matches!(&error, Error::InvalidEndpoint { detail, .. }
if detail.contains("require_certificate_validation")),
"strict verification must bypass the no-CA guard, got {error:?}"
);
}
#[tokio::test]
async fn tls_with_ca_file_is_permitted_and_proceeds_past_the_guard() {
// Pinning a CA is the supported TLS path: the no-CA guard must not fire.
// We hand it a readable PEM file; construction proceeds past the guard
// and only fails later at the transport (dead address / handshake).
let ca_path = std::env::temp_dir().join("mxgw-rust-tls-ca-fixture.pem");
std::fs::write(&ca_path, SELF_SIGNED_CA_PEM).unwrap();
let options = ClientOptions::new("https://127.0.0.1:1")
.with_plaintext(false)
.with_ca_file(&ca_path)
.with_connect_timeout(Duration::from_millis(200));
let error = connect_err(options).await;
let _ = std::fs::remove_file(&ca_path);
assert!(
!matches!(&error, Error::InvalidEndpoint { detail, .. }
if detail.contains("require_certificate_validation")),
"pinning a CA must bypass the no-CA guard, got {error:?}"
);
}
/// Drive `GalaxyClient::connect` to its error (mirrors `connect_err` above).
async fn galaxy_connect_err(options: ClientOptions) -> Error {
match GalaxyClient::connect(options).await {
Ok(_client) => {
panic!("GalaxyClient::connect unexpectedly succeeded against a dead TLS address")
}
Err(error) => error,
}
}
#[tokio::test]
async fn galaxy_tls_without_ca_is_rejected_with_actionable_error_by_default() {
// GalaxyClient::connect must apply the same TLS guard as GatewayClient —
// TLS without a pinned CA (and without require_certificate_validation)
// returns a clear, actionable InvalidEndpoint error.
let options = ClientOptions::new("https://127.0.0.1:1")
.with_plaintext(false)
.with_connect_timeout(Duration::from_millis(200));
let error = galaxy_connect_err(options).await;
let Error::InvalidEndpoint { detail, .. } = error else {
panic!("expected InvalidEndpoint, got {error:?}");
};
assert!(
detail.contains("ca_file") || detail.contains("CA"),
"error should instruct the user to pass a CA file: {detail}"
);
assert!(
detail.contains("require_certificate_validation"),
"error should mention the require_certificate_validation opt-in: {detail}"
);
}
/// A throwaway self-signed CA certificate (PEM). Only needs to parse as a
/// PEM trust root so the CA-pinning path is exercised past the guard.
const SELF_SIGNED_CA_PEM: &str = "-----BEGIN CERTIFICATE-----
MIIBhTCCASugAwIBAgIQIRi6zePL6mKjOipn+dNuaTAKBggqhkjOPQQDAjASMRAw
DgYDVQQKEwdBY21lIENvMB4XDTE3MTAyMDE5NDMwNloXDTE4MTAyMDE5NDMwNlow
EjEQMA4GA1UEChMHQWNtZSBDbzBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABD0d
7VNhbWvZLWPuj/RtHFjvtJBEwOkhbN/BnnE8rnZR8+sbwnc/KhCk3FhnpHZnQz7B
5aETbbIgmuvewdjvSBSjYzBhMA4GA1UdDwEB/wQEAwICpDATBgNVHSUEDDAKBggr
BgEFBQcDATAPBgNVHRMBAf8EBTADAQH/MCkGA1UdEQQiMCCCDmxvY2FsaG9zdDo1
NDUzgg4xMjcuMC4wLjE6NTQ1MzAKBggqhkjOPQQDAgNIADBFAiEA2zpJEPQyz6/l
Wf86aX6PepsntZv2GYlA5UpabfT2EZICICpJ5h/iI+i341gBmLiAFQOyTDT+/wQc
6MF9+Yw1Yy0t
-----END CERTIFICATE-----
";
+13 -5
@@ -7,7 +7,7 @@
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Status | Re-reviewed |
| Open findings | 4 |
| Open findings | 0 |
## Checklist coverage
@@ -390,7 +390,7 @@ Re-review pass at `42b0037`. Diff against `d692232` consists of four commits:
| Severity | Medium |
| Category | Documentation & comments |
| Location | `clients/dotnet/README.md:137-138` |
| Status | Open |
| Status | Resolved |
**Description:** The README example block for the two new alarm CLI subcommands shipped in commit `11cc671` shows:
@@ -412,6 +412,8 @@ mxgw-dotnet acknowledge-alarm --reference "\\Galaxy\Area001.Pump001.PumpFault" -
A quick sanity check would be to drive each example through the test harness's `MxGatewayClientCli.RunAsync` shape and confirm exit 0 — copy/paste safety on the documented examples is the only realistic safeguard.
**Resolution:** 2026-05-24 — Confirmed against source: the README `dotnet run -- stream-alarms` example used `--session-id <id> --max-messages 1`, neither of which the CLI accepts (the production parser routes through `--filter-prefix` / `--max-events`), and `acknowledge-alarm` used `--session-id <id> --alarm-reference <ref>` against a session-less central alarm monitor that actually consumes `--reference`. Replaced both README example lines with the parser-correct shape: `stream-alarms --filter-prefix Area001 --max-events 1 --json` and `acknowledge-alarm --reference "\\Galaxy\Area001.Pump001.PumpFault" --comment "ack from cli" --operator operator1 --json`. Regression test `MxGatewayClientCliTests.RunAsync_ReadmeExamples_ForAlarmCommands_ParseSuccessfully` (xUnit `[Theory]` over `stream-alarms` and `acknowledge-alarm`) locates `clients/dotnet/README.md`, extracts the documented CLI command line for each subcommand, tokenizes it (preserving quoted segments), and drives it through the production `MxGatewayClientCli.RunAsync` against a `FakeCliClient`; the test asserts exit code 0 and that stderr contains neither "Unknown command" nor "Missing required option". Verified red against the original README text (both theory rows failed with exit code 1) and green after the README update, so any future documentation drift between README examples and the actual parser shape is caught at test time.
### Client.Dotnet-019
| Field | Value |
@@ -419,7 +421,7 @@ A quick sanity check would be to drive each example through the test harness's `
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:745` |
| Status | Open |
| Status | Resolved |
**Description:** Client.Dotnet-005 / 010 documented (and recorded as resolved) the silent register-handle fallback pattern `reply.Register?.ServerHandle ?? reply.ReturnValue.Int32Value`, where a successful protocol+MX-status reply missing its typed `register` oneof case falls through to `ReturnValue.Int32Value` and silently yields `0` for the handle. The new `BenchReadBulkAsync` handler introduced in commit `b3ae200` reinstates exactly that pattern at line 745:
@@ -431,6 +433,8 @@ The bench then drives the rest of the run — `SubscribeBulk`, warmup `ReadBulk`
**Recommendation:** Replace the fallback with an explicit null-check on `registerReply.Register` that throws `MxGatewayException` with the missing-payload context (kind = `Register`, session id, correlation id) — the same shape Client.Dotnet-005 prescribes. If the upstream SDK helpers in `MxGatewaySession` are restored to throw, route the bench through `MxGatewaySession.RegisterAsync` instead so the CLI inherits the SDK's protection.
**Resolution:** 2026-05-24 — Confirmed against source: `BenchReadBulkAsync` at line 745 carried the silent fallback `registerReply.Register?.ServerHandle ?? registerReply.ReturnValue.Int32Value`, so a successful protocol+MX-status reply missing its typed `register` payload would yield ServerHandle=0 and the bench would drive the rest of SubscribeBulk / warmup / steady-state ReadBulk / UnsubscribeBulk against an invalid handle. Added a private `RequireRegisterServerHandle(MxCommandReply reply, string sessionId)` helper that throws a descriptive `MxGatewayException` naming the missing typed payload, the session id, and the correlation id, then replaced the fallback site with `int serverHandle = RequireRegisterServerHandle(registerReply, sessionId);`. The bench's outer `RunCoreAsync` catch then surfaces the throw as exit code 1 plus the descriptive message on stderr, so the failure mode is loud rather than a wall of zero-result stats. Regression test `MxGatewayClientCliTests.RunAsync_BenchReadBulk_WhenRegisterReplyMissingTypedPayload_FailsLoudly` enqueues an `Ok` MX command reply with no typed `Register` payload, runs `bench-read-bulk` against the fake, and asserts exit code 1, that stderr names the `Register` payload, and that no `bench-read-bulk` stats JSON was emitted on stdout. Verified red against the original fallback (an NRE bubbled up later in the run rather than the descriptive throw) and green after the helper landed.
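The difference between the silent fallback and the strict helper can be sketched in a few lines. This is a Rust analogue of the idea, not the C# code from the fix: `CommandReply`, `server_handle_with_fallback`, and `require_register_server_handle` are hypothetical names, and the reply shape is reduced to the three fields the finding discusses.

```rust
/// Minimal stand-in for a command reply (hypothetical shape for this sketch).
#[derive(Debug, Default)]
pub struct CommandReply {
    pub correlation_id: String,
    /// Typed `register` payload; `None` models a reply that omitted it.
    pub register_server_handle: Option<i32>,
    /// Untyped fallback value, which is 0 when nothing was set.
    pub return_value: i32,
}

/// Silent fallback: a missing payload quietly becomes whatever
/// `return_value` holds, which is 0 for a default reply.
pub fn server_handle_with_fallback(reply: &CommandReply) -> i32 {
    reply.register_server_handle.unwrap_or(reply.return_value)
}

/// Strict variant: a missing payload is an error naming the payload,
/// the session id, and the correlation id.
pub fn require_register_server_handle(
    reply: &CommandReply,
    session_id: &str,
) -> Result<i32, String> {
    reply.register_server_handle.ok_or_else(|| {
        format!(
            "Register reply is missing its typed payload (session {}, correlation {})",
            session_id, reply.correlation_id
        )
    })
}
```

With a default reply the fallback returns handle `0`, while the strict variant returns an error the caller can surface on stderr before any further commands run against an invalid handle.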
### Client.Dotnet-020
| Field | Value |
@@ -438,7 +442,7 @@ The bench then drives the rest of the run — `SubscribeBulk`, warmup `ReadBulk`
| Severity | Low |
| Category | Error handling & resilience |
| Location | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:792-810`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:774-780` |
| Status | Open |
| Status | Resolved |
**Description:** `BenchReadBulkAsync`'s steady-state `while (DateTime.UtcNow < steadyDeadline)` loop wraps each `client.InvokeAsync(...)` in a bare `catch`:
@@ -466,6 +470,8 @@ The warmup loop above (lines 774-780) has no catch at all, so a warmup-time OCE
**Recommendation:** Replace the bare `catch` with `catch (Exception) when (!cancellationToken.IsCancellationRequested)`, or split into `catch (OperationCanceledException) { throw; } catch (Exception) { failedCalls++; ... continue; }`. The first form is the smallest diff and matches the pattern used elsewhere in the CLI. Add a regression test that runs `bench-read-bulk` with a `--duration-seconds 10` budget against a fake that throws on every `InvokeAsync`, cancels the supplied token after 100 ms, and asserts the run exits in well under 10 s. The wider precedent — Client.Dotnet-016's `BenchStreamEventsAsync` cancellation hardening — should already cover the shape of the test fixture.
**Resolution:** 2026-05-24 — Confirmed against source: the steady-state loop wrapped `client.InvokeAsync` in a bare `catch { sw.Stop(); failedCalls++; latencyMillis.Add(...); continue; }` with no type filter, so an `OperationCanceledException` thrown from a cancelled token (Ctrl+C, parent CTS, or the wall-clock budget) was swallowed and the loop spun until `DateTime.UtcNow >= steadyDeadline`. Replaced the bare clause with `catch (Exception ex) when (ex is not OperationCanceledException)`, so OCE now propagates out of the bench and unwinds through the outer `RunCoreAsync` (which intentionally lets OCE escape). Inline comment names the finding and the reason the filter exists. Regression test `MxGatewayClientCliTests.RunAsync_BenchReadBulk_WhenSteadyStateLoopReceivesCancellation_ExitsPromptly` configures the fake CLI client so Register and SubscribeBulk succeed, the first three ReadBulk calls succeed (the loop enters the steady-state body), then every subsequent ReadBulk throws `OperationCanceledException`; the test runs `bench-read-bulk --duration-seconds 30` and asserts the call throws OCE and wall-clock elapsed time stays under 10 s. Verified red against the original bare `catch` (the test ran for the full 30 s without throwing) and green after the filter landed (the bench exited promptly with OCE).
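The shape of the filter, tallying ordinary failures while letting cancellation escape, can be illustrated with a small Rust analogue. It is a sketch under assumptions rather than the C# change itself: `CallError`, `BenchStats`, and `run_steady_state` are hypothetical names, and a fixed iteration count stands in for the wall-clock deadline.

```rust
/// Outcome of one failed call in the loop (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    /// The caller asked to stop; must never be counted as an ordinary failure.
    Cancelled,
    /// Any other failure; the bench records it and keeps going.
    Failed(String),
}

/// Summary returned when the loop runs to completion.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct BenchStats {
    pub successful_calls: u32,
    pub failed_calls: u32,
}

/// Run `call` up to `iterations` times. Ordinary failures are tallied,
/// but `Cancelled` aborts the loop immediately and is returned to the caller.
pub fn run_steady_state<F>(iterations: u32, mut call: F) -> Result<BenchStats, CallError>
where
    F: FnMut(u32) -> Result<(), CallError>,
{
    let mut stats = BenchStats::default();
    for index in 0..iterations {
        match call(index) {
            Ok(()) => stats.successful_calls += 1,
            // Equivalent in spirit to excluding the cancellation
            // exception from the catch filter.
            Err(CallError::Cancelled) => return Err(CallError::Cancelled),
            Err(CallError::Failed(_)) => stats.failed_calls += 1,
        }
    }
    Ok(stats)
}
```

A bare catch corresponds to folding `Cancelled` into the `Failed` arm, which is what kept the original loop spinning until its deadline.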
### Client.Dotnet-021
| Field | Value |
@@ -473,7 +479,7 @@ The warmup loop above (lines 774-780) has no catch at all, so a warmup-time OCE
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:487`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:715` |
| Status | Open |
| Status | Resolved |
**Description:** Both new bulk-read CLI handlers cast a signed `--timeout-ms` argument to `uint` without bounds checking:
@@ -499,3 +505,5 @@ uint timeoutMs = (uint)timeoutMsRaw;
```
A single shared helper (e.g. `ParseTimeoutMs(CliArguments, string, int)`) on `MxGatewayClientCli` would cover both call sites and remove the duplication.
**Resolution:** 2026-05-24 — Confirmed against source: both `ReadBulkAsync` (line 490) and `BenchReadBulkAsync` (line 715) cast `arguments.GetInt32("timeout-ms", ...)` straight to `uint`, so `--timeout-ms -1` silently wrapped to `0xFFFFFFFF` (~49.7 days). Added a single shared private helper `ParseTimeoutMs(CliArguments arguments, int defaultValue)` on `MxGatewayClientCli` that reads the int32, rejects negatives with a clear `ArgumentException` ("--timeout-ms must be a non-negative integer (use 0 for the gateway default)."), and returns the safe `(uint)`. Both call sites now route through the helper. Regression test `MxGatewayClientCliTests.RunAsync_TimeoutMs_NegativeValue_RejectsWithClearError` (xUnit `[Theory]` over `read-bulk` and `bench-read-bulk`) drives the CLI with `--timeout-ms -1` and asserts the exit code is non-zero, that stderr contains "timeout-ms", and that the "non-negative" guard text is present. Verified red against the original `(uint)arguments.GetInt32(...)` casts (the bench proceeded past the timeout parse and tripped a downstream "Queue empty" error rather than the descriptive guard message) and green after the helper landed.
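The wraparound and the guard are easy to reproduce outside the CLI. The following is a minimal Rust sketch of the same idea, not the C# helper: `parse_timeout_ms` and `NegativeTimeout` are hypothetical names, and the message text is borrowed from the resolution above.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Error returned when the timeout is negative (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub struct NegativeTimeout(pub i32);

impl fmt::Display for NegativeTimeout {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(
            f,
            "--timeout-ms must be a non-negative integer (use 0 for the gateway default), got {}",
            self.0
        )
    }
}

/// Convert a signed CLI value to the unsigned wire type, rejecting
/// negatives instead of letting an `as` cast wrap them.
pub fn parse_timeout_ms(raw: i32) -> Result<u32, NegativeTimeout> {
    u32::try_from(raw).map_err(|_| NegativeTimeout(raw))
}
```

For comparison, the unchecked cast `(-1i32) as u32` yields `u32::MAX`, i.e. 4,294,967,295 ms or roughly 49.7 days, which is the value the original call sites sent to the gateway.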
+13 -13
@@ -7,7 +7,7 @@
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Status | Re-reviewed |
| Open findings | 6 |
| Open findings | 0 |
## Checklist coverage
@@ -472,7 +472,7 @@ Each is a few lines and routes through the existing `runWithIO` entry point, so
| Severity | Medium |
| Category | Code organization & conventions |
| Location | `clients/go/cmd/mxgw-go/main.go:398-412,417-519` |
| Status | Open |
| Status | Resolved |
**Description:** Commit `8aaab82` ("Go client: port bulk read/write SDK methods + CLI subcommands") re-introduces every symptom that Client.Go-015 documented and was marked Resolved against an earlier commit:
@@ -484,7 +484,7 @@ Because the surrounding test file (`main_test.go`) lost the regression tests pro
**Recommendation:** Re-apply the Client.Go-015 fix on this re-added code. Drop the `secured` parameter and the `_ = secured` line (the `command` switch is the only routing key); derive the variant locally from `command`; register `-current-user-id` / `-verifier-user-id` only inside the secured branches and `-user-id` only inside Write/Write2 — so a wrong-variant flag fails with a clean `flag provided but not defined` usage error. Re-add the `TestRunWriteBulkVariantGatesSecuredFlags` table-test from the Client.Go-021 resolution so a future regression is caught by CI.
**Resolution:** Open.
**Resolution:** 2026-05-24 — Re-applied the Client.Go-015 fix. Dropped the unused `secured` parameter from `runWriteBulkVariant` and the misleading `_ = secured` line; the variant is now derived locally from `command` and gates flag registration. `-current-user-id` / `-verifier-user-id` are only registered for the secured variants and `-user-id` only for Write/Write2, so a wrong-variant flag now fails with a clean `flag provided but not defined` usage error. The four `runWrite*Bulk` wrappers were updated to match the new signature. Regression test `TestRunWriteBulkVariantGatesSecuredFlags` in `cmd/mxgw-go/main_test.go` (table-driven across all five wrong-variant flag/command pairs) was re-added; it failed pre-fix on every case ("session-id, item-handles, and values are required" reached because the flag was silently accepted), and passes post-fix with the expected `flag provided but not defined`.
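Gating flag registration on the command variant can be sketched as a lookup that rejects anything the variant did not register. This is a simplified Rust analogue, not the Go `flag` code: `WriteVariant`, `check_flag`, and the flag tables are hypothetical, and the real subcommands likely register more flags than the ones listed here.

```rust
/// The four bulk-write commands named in the finding.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteVariant {
    WriteBulk,
    Write2Bulk,
    WriteSecuredBulk,
    WriteSecured2Bulk,
}

/// Flags every variant accepts (assumed set for this sketch).
const SHARED_FLAGS: &[&str] = &["session-id", "item-handles", "values"];
/// Registered only for Write / Write2.
const PLAIN_FLAGS: &[&str] = &["user-id"];
/// Registered only for the secured variants.
const SECURED_FLAGS: &[&str] = &["current-user-id", "verifier-user-id"];

/// Flags a variant registers in addition to the shared ones.
fn variant_flags(variant: WriteVariant) -> &'static [&'static str] {
    match variant {
        WriteVariant::WriteBulk | WriteVariant::Write2Bulk => PLAIN_FLAGS,
        WriteVariant::WriteSecuredBulk | WriteVariant::WriteSecured2Bulk => SECURED_FLAGS,
    }
}

/// Accept a flag only if the variant registered it; otherwise fail with
/// the same wording Go's `flag` package uses for unknown flags.
pub fn check_flag(variant: WriteVariant, flag: &str) -> Result<(), String> {
    if SHARED_FLAGS.contains(&flag) || variant_flags(variant).contains(&flag) {
        Ok(())
    } else {
        Err(format!("flag provided but not defined: -{}", flag))
    }
}
```

Because the variant is the only routing key, there is no separate `secured` boolean that could drift out of sync with the command name.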
### Client.Go-023
@@ -493,7 +493,7 @@ Because the surrounding test file (`main_test.go`) lost the regression tests pro
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `clients/go/cmd/mxgw-go/main.go:604-606,616-632` |
| Status | Open |
| Status | Resolved |
**Description:** `runBenchReadBulk`'s warm-up and steady-state loops are wall-clock-only again:
@@ -515,7 +515,7 @@ Neither loop checks `ctx.Done()` / `ctx.Err()`. This is exactly the shape Client
**Recommendation:** Re-apply the Client.Go-018 fix: change both loop conditions to `for time.Now().Before(warmupDeadline) && ctx.Err() == nil` (and the same on `steadyDeadline`). The cross-language bench JSON shape is unchanged — the truncated window is just reported faithfully via `durationMs` / `totalCalls`. Optionally add the `signal.NotifyContext` pattern used by `runStreamAlarms` and `runGalaxyWatch` so direct Ctrl+C on the bench also short-circuits cleanly.
**Resolution:** Open.
**Resolution:** 2026-05-24 — Re-applied the Client.Go-018 fix. Both the warm-up and steady-state loops in `runBenchReadBulk` now carry an `&& ctx.Err() == nil` guard alongside the wall-clock check, so a cancelled parent context breaks the loops instead of spinning failing `ReadBulk` calls until the deadline elapses. The cross-language bench JSON shape (`durationMs` / `totalCalls`) is unchanged — the truncated window is just reported faithfully. Regression test `TestRunBenchReadBulkRespectsContextCancellation` in `cmd/mxgw-go/main_test.go` spins up a localhost TCP gRPC fake (`benchFakeGateway`) that answers OpenSession + Invoke for the register/subscribe/read/unsubscribe sequence, runs the bench with `-warmup-seconds 5 -duration-seconds 5`, cancels the ctx after 150ms, and asserts the bench returns in under 4s. Pre-fix the test ran for the full 10s (warmup+duration); post-fix it returns within ~250ms.
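A loop that honours both the wall-clock deadline and a cancellation signal might look like the sketch below. It is a Rust analogue of the Go guard, with an `AtomicBool` standing in for `ctx.Err() != nil`; `run_until_deadline_or_cancel` is a hypothetical name, not a function in any of the clients.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::{Duration, Instant};

/// Run `call` until the wall-clock `window` elapses or `cancelled` is set,
/// whichever comes first, and report how many calls were made.
pub fn run_until_deadline_or_cancel<F>(
    window: Duration,
    cancelled: &AtomicBool,
    mut call: F,
) -> u64
where
    F: FnMut(),
{
    let deadline = Instant::now() + window;
    let mut total_calls = 0;
    // Both conditions are checked on every iteration, mirroring
    // `time.Now().Before(deadline) && ctx.Err() == nil`.
    while Instant::now() < deadline && !cancelled.load(Ordering::SeqCst) {
        call();
        total_calls += 1;
    }
    total_calls
}
```

The returned call count reflects the truncated window, which matches the resolution's point that a cancelled run is reported faithfully rather than padded out to the configured duration.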
### Client.Go-024
@@ -524,7 +524,7 @@ Neither loop checks `ctx.Done()` / `ctx.Err()`. This is exactly the shape Client
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/go/mxgateway/session.go:395-525`, `clients/go/mxgateway/alarms.go:65-76` |
| Status | Open |
| Status | Resolved |
**Description:** The five new bulk SDK methods on `Session` and the new `Client.StreamAlarms` method have **no unit tests** in `clients/go/mxgateway/`:
@@ -545,7 +545,7 @@ Neither loop checks `ctx.Done()` / `ctx.Err()`. This is exactly the shape Client
These plus `nil` / empty-slice rejection tests for each bulk method close out the new public surface.
**Resolution:** Open.
**Resolution:** 2026-05-24 — Added SDK-level tests using the existing `newBufconnClient` / `newBufconnClientWithAlarms` fake-gateway pattern. In `clients/go/mxgateway/client_session_test.go`: `TestWriteBulkBuildsOneBulkCommandAndReturnsPerEntryResults` confirms the protobuf payload carries `MX_COMMAND_KIND_WRITE_BULK` with all entries and returns per-entry results; `TestWriteBulkRejectsNilEntries` pins the nil guard on all five new bulk methods (WriteBulk/Write2Bulk/WriteSecuredBulk/WriteSecured2Bulk/ReadBulk); `TestReadBulkForwardsTimeoutAndUnpacksCachedFlag` pins normal `timeoutMs` arithmetic and `WasCached` propagation; `TestReadBulkSaturatesTimeoutAboveMaxUint32` pins the `> MaxUint32 ms` clamp at `session.go:504-509`. In `clients/go/mxgateway/alarms_test.go`: `TestStreamAlarmsPassesFilterPrefixAndReceivesFeedMessages` asserts request flows and stream Recv returns each fake `AlarmFeedMessage` (active-alarm snapshot, snapshot-complete sentinel) with auth metadata attached; `TestStreamAlarmsRejectsNilRequest` pins the nil guard. The `fakeGatewayWithAlarms` was extended with a `StreamAlarms` method.
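The clamp that `TestReadBulkSaturatesTimeoutAboveMaxUint32` pins can be illustrated with a short Rust sketch. This is an analogue of the behaviour, not the Go code at `session.go:504-509`; `timeout_to_wire_ms` is a hypothetical name, and the assumption is that the wire field is an unsigned 32-bit millisecond count.

```rust
use std::convert::TryFrom;
use std::time::Duration;

/// Convert a timeout to whole milliseconds for a `u32` wire field,
/// clamping to `u32::MAX` instead of wrapping when the duration is too large.
pub fn timeout_to_wire_ms(timeout: Duration) -> u32 {
    // `as_millis` returns a `u128`, so the conversion can fail for large
    // durations; the failure case saturates rather than truncating.
    u32::try_from(timeout.as_millis()).unwrap_or(u32::MAX)
}
```

Saturating keeps an oversized timeout "very long" on the wire, whereas a truncating cast could turn it into an arbitrarily short one.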
### Client.Go-025
@@ -554,7 +554,7 @@ These plus `nil` / empty-slice rejection tests for each bulk method close out th
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/go/mxgateway/session.go:395-485,495-525` |
| Status | Open |
| Status | Resolved |
**Description:** The five new bulk methods (`WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`, `ReadBulk`) each guard with `if entries == nil { return error }` and an upper-bound `ensureBulkSize` check, but accept a non-nil empty slice (e.g. `[]*WriteBulkEntry{}` or `[]string{}`). The call then sends an `MX_COMMAND_KIND_WRITE_BULK` (or peer) command with zero entries across the gRPC wire to the gateway, which forwards to the worker for a no-op round trip. This is the same shape Client.Go-015 / Client.Go-021 were written against (the CLI now also accepts `mxgw-go write-bulk -item-handles , -values ,` which `parseInt32List` returns as empty without error). The pre-existing bulk methods (`AddItemBulk`, `AdviseItemBulk`, etc. at `session.go:253-343`) carry the identical pattern, so this is a long-standing convention — but it's still a real cost on the hot path. The Java / .NET / Rust / Python clients should be checked for parity if this is fixed.
@@ -565,7 +565,7 @@ These plus `nil` / empty-slice rejection tests for each bulk method close out th
Option 1 is cheaper for callers (one less round trip and one clearer error message) and removes the empty-list footgun for cross-language drivers that may pass empty arrays from PowerShell `,` splits.
**Resolution:** Open.
**Resolution:** 2026-05-24 — Audited the four other clients for parity: .NET (`MxGatewaySession.cs:520`), Rust (`session.rs:408` via `ensure_bulk_size`), Python (`session.py:350`), and Java (`MxGatewaySession.java:451`) all accept empty slices and make the round-trip with zero entries. To preserve cross-language behaviour (no error on empty input) while removing the wasteful round trip on the Go hot path, all five new bulk methods (`WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`, `ReadBulk`) now short-circuit on `len(entries) == 0` and return an empty result slice without invoking the command. The nil guard is preserved (returns the "...are required" error) and the SDK doc comments now document the empty-slice no-op shape explicitly. Regression test `TestBulkMethodsShortCircuitOnEmptySliceWithoutRoundTrip` in `client_session_test.go` invokes each of the five methods with an empty slice and asserts (a) no error, (b) zero-length result, and (c) `fake.invokeRequest == nil` (no gRPC round trip). Pre-fix the test failed on the first assertion ("WriteBulk(empty) sent a round trip; expected short-circuit"); post-fix it passes.
### Client.Go-026
@@ -574,7 +574,7 @@ Option 1 is cheaper for callers (one less round trip and one clearer error messa
| Severity | Low |
| Category | Error handling & resilience |
| Location | `clients/go/cmd/mxgw-go/main.go:1196-1222` |
| Status | Open |
| Status | Resolved |
**Description:** `runBatch` reads command lines with a default `bufio.Scanner`:
@@ -595,7 +595,7 @@ A second weakness: `strings.Fields(line)` splits on whitespace and does no quote
**Recommendation:** Call `scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)` immediately after `bufio.NewScanner` so a long bulk-args line doesn't abort the session. If `runBatch` is intended to support free-text flag values (the `acknowledge-alarm -comment` shape is the obvious case), swap `strings.Fields` for a quote-aware tokeniser (`mvdan.cc/sh/v3/syntax` or a small inline state machine matching the .NET/Rust harness shape). Otherwise add a one-line comment to `runBatch`'s doc-comment that batch-mode arguments must not contain whitespace.
**Resolution:** Open.
**Resolution:** 2026-05-24 — `runBatch` now sets `scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)` immediately after `bufio.NewScanner`, lifting the per-line cap from 64 KiB to 16 MiB so a long bulk-args line (several thousand handles) no longer aborts the session. If a single line still exceeds the 16 MiB cap, the resulting `scanner.Err()` is now framed as a final error-with-sentinel (JSON payload + `batchEOR`) and returned, so the harness never sees an unframed bufio failure. Regression test `TestRunBatchHandlesLongCommandLine` in `cmd/mxgw-go/main_test.go` feeds an ~88 KiB `subscribe-bulk` line (above the old 64 KiB default) followed by `version --json` and asserts two EOR sentinels are emitted — pre-fix the test failed with "bufio.Scanner: token too long" returned from `runBatch`; post-fix both commands run and the session completes cleanly. Quote-aware tokenisation is out of scope for this finding (the recommendation accepts either fix); the `strings.Fields` shape is unchanged.
### Client.Go-027
@@ -604,7 +604,7 @@ A second weakness: `strings.Fields(line)` splits on whitespace and does no quote
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/go/cmd/mxgw-go/main.go:1195-1206` |
| Status | Open |
| Status | Resolved |
**Description:** `runBatch`'s doc-comment says the loop "never terminates on command error; only stdin EOF (or an empty line) ends the session", and the implementation matches:
@@ -624,4 +624,4 @@ The two cases the empty-line check seems to cover — (a) operator pressing Ente
**Recommendation:** Change `if line == "" { break }` to `if line == "" { continue }` (alongside the existing `len(args) == 0` continue, which is then redundant — keep one, drop the other for clarity). Update the `runBatch` doc-comment to read "only stdin EOF ends the session" and drop the "or an empty line" clause. If the interactive ergonomic is genuinely wanted, gate it on `isatty(stdin)` so the batch-from-pipe case isn't affected.
**Resolution:** Open.
**Resolution:** 2026-05-24 — `runBatch` no longer treats a blank line as end-of-session. The `if line == "" { break }` early-exit was removed; blank or whitespace-only lines now fall through the existing `if len(args) == 0 { continue }` guard (kept as the single blank-line skip rule for clarity), so only stdin EOF ends the session. The doc-comment was updated to read "Blank lines are skipped; only stdin EOF ends the session." Regression test `TestRunBatchSkipsBlankLinesAndContinuesUntilEOF` in `cmd/mxgw-go/main_test.go` feeds `version --json\n\nversion --json\n` (a stray blank line between two commands) and asserts two EOR sentinels are emitted — pre-fix the test failed with "EOR sentinel count = 1, want 2" because the blank line broke the loop and the second command never ran; post-fix both commands run.
+16 -6
@@ -7,7 +7,7 @@
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Status | Re-reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -551,7 +551,7 @@ Client.Java-001..031 are unchanged.
| Severity | High |
| Category | Documentation & comments |
| Location | `clients/java/README.md:182-183` |
| Status | Open |
| Status | Resolved |
**Description:** Commit `8738735` ("clients: document StreamAlarms + AcknowledgeAlarm in each README") added two new gradle invocations to the CLI Usage block:
@@ -569,6 +569,8 @@ A user copying either invocation from the README hits a picocli parse error imme
**Recommendation:** Drop the `--session-id <id>` token from both documented invocations, and change `--alarm-reference` to `--reference` in the `acknowledge-alarm` line. Optionally also add `--filter-prefix` to the `stream-alarms` example so readers see the scoping option, and align README option names with the actual CLI by either renaming the CLI option `--reference` to `--alarm-reference` (matches the proto `alarm_full_reference` field semantically) or leaving it as is and only fixing the README. Add a small `MxGatewayCliTests` parse-only assertion for both subcommands that exercises every option flag to prevent the same drift the next time the CLI surface or README is touched.
**Resolution:** 2026-05-24 — Confirmed root cause against `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1174-1182,1248-1258`: `StreamAlarmsCommand` exposes only `--filter-prefix` / `--limit` and `AcknowledgeAlarmCommand` exposes `--reference` / `--comment` / `--operator` — neither has a `--session-id` option and `acknowledge-alarm` has no `--alarm-reference` option, so both documented invocations failed picocli parse at the first unknown option. Fixed `clients/java/README.md:182-183` by dropping the `--session-id <id>` token from both lines, replacing it with `--filter-prefix Galaxy` on the `stream-alarms` example so readers see the actual scoping flag, and changing `--alarm-reference` to `--reference` on the `acknowledge-alarm` example. Added `MxGatewayCli.commandLine(...)` to package-private visibility (was `private`) so the test can drive the production picocli `CommandLine` directly without executing the command body. Regression tests in `MxGatewayCliTests`: `readmeDocumentedStreamAlarmsExampleParsesCleanly` and `readmeDocumentedAcknowledgeAlarmExampleParsesCleanly` pin the exact token list documented in the README and assert `commandLine.parseArgs(...)` returns without throwing a `picocli.CommandLine.ParameterException`. TDD red phase: before the README fix the previously-documented tokens (`--session-id <id>` + `--alarm-reference ...`) would have thrown `Unknown option: '--session-id'` / `Unknown option: '--alarm-reference'` at parse time; the new tests pass against the corrected README and would fail the next time someone drifts the documented surface from the actual CLI options.
### Client.Java-033
| Field | Value |
@@ -576,7 +578,7 @@ A user copying either invocation from the README hits a picocli parse error imme
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1078-1098` |
| Status | Open |
| Status | Resolved |
**Description:** `StreamAlarmsCommand.call()` allocates a bounded `ArrayBlockingQueue<Object>(1024)` and the gRPC observer publishes each `AlarmFeedMessage` via `queue.offer(value)`:
@@ -594,6 +596,8 @@ The library-side `MxEventStream` (Client.Java-002 resolution) and `DeployEventSt
**Recommendation:** Either (a) wrap the gRPC observer in the existing `MxEventStream`-style adaptor that calls `subscription.cancel()` and queues an exception on `queue.offer` returning `false`, then surface that exception from the drain loop — mirroring `MxEventStream.observer().onNext`'s overflow branch; or (b) reuse the library-side fail-fast plumbing by promoting `MxEventStream` (or extracting its terminal-state base) into a public `MxAlarmFeedStream` and have `MxGatewayClient.streamAlarms` return that instead of a bare subscription handle. Option (b) lines up with Client.Java-036 (deduplicate the subscription class family). Add a CLI regression test that overflows the bounded queue and asserts a non-zero exit / overflow exception, mirroring `MxGatewayMediumFindingsTests.eventStreamOverflowExceptionSurvivesASubsequentClose`.
**Resolution:** 2026-05-24 — Confirmed root cause at `MxGatewayCli.java` `StreamAlarmsCommand.call()`: the observer's `onNext` did `queue.offer(value)` and ignored the boolean return, so a 1024-element queue would silently drop messages past capacity. The same silent-drop affected the `onCompleted` branch (which `offer`s `ALARM_FEED_END`) once the queue was full, deadlocking the consumer since the drain loop never sees END. Took option (a) — minimal change that matches `MxEventStream`'s overflow branch. The fix: detect a failed `offer` inside `onNext`, call `subscription.cancel()` (via an `AtomicReference<MxGatewayAlarmFeedSubscription>` published immediately after `client.streamAlarms` returns), `queue.clear()`, then `queue.offer(IllegalStateException("stream-alarms queue overflowed (capacity 1024); consumer too slow"))` followed by `queue.offer(ALARM_FEED_END)`. The existing drain-loop `Throwable`-branch then surfaces the overflow as a thrown `IllegalStateException` from `call()`, which picocli reports as a non-zero CLI exit. Option (b) (promoting `MxEventStream` to a public alarm-feed stream) was considered and rejected for this change — it would change the public SDK surface; Client.Java-036's refactor handles deduplication at the subscription layer instead. Regression test: `MxGatewayCliTests.streamAlarmsCommandFailsFastOnQueueOverflow` — drives an `OverflowingFakeClient` whose `streamAlarms` synchronously pushes 2000 messages to the observer (exceeding the 1024 buffer), then asserts `run.exitCode() != 0`. TDD red phase confirmed deterministically: before the fix the test deadlocked (the buggy `offer` silently dropped both the overflowing alarms AND the `ALARM_FEED_END` sentinel that arrived after the queue filled, so the drain loop's `queue.take()` blocked forever); the background gradle run had to be killed with `TaskStop`. After the fix the same test exits in <1 second with the overflow exception propagating through picocli.
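The fail-fast shape described in this resolution can be illustrated with a minimal sketch, written in Python rather than Java for brevity. It is not the CLI's code: `OverflowGuardedFeed`, `END`, and the `cancel` callback are hypothetical stand-ins for the bounded queue, the `ALARM_FEED_END` sentinel, and `subscription.cancel()`, and `RuntimeError` stands in for `IllegalStateException`. It assumes a single producer and a capacity of at least 2.

```python
import queue

END = object()  # hypothetical stand-in for the ALARM_FEED_END sentinel


class OverflowGuardedFeed:
    """Bounded feed that fails fast instead of silently dropping messages."""

    def __init__(self, capacity, cancel):
        if capacity < 2:
            raise ValueError("capacity must leave room for the error and END")
        self._capacity = capacity
        self._queue = queue.Queue(maxsize=capacity)
        self._cancel = cancel  # callback standing in for subscription.cancel()
        self._overflowed = False

    def on_next(self, message):
        if self._overflowed:
            return  # stream already cancelled; ignore late deliveries
        try:
            self._queue.put_nowait(message)
        except queue.Full:
            self._fail_overflow()

    def on_completed(self):
        if self._overflowed:
            return
        try:
            self._queue.put_nowait(END)
        except queue.Full:
            # A dropped END would leave the consumer blocked forever.
            self._fail_overflow()

    def _fail_overflow(self):
        self._overflowed = True
        self._cancel()
        # Drop buffered messages so the error and the sentinel always fit.
        while True:
            try:
                self._queue.get_nowait()
            except queue.Empty:
                break
        self._queue.put_nowait(
            RuntimeError(f"queue overflowed (capacity {self._capacity})")
        )
        self._queue.put_nowait(END)

    def drain(self):
        """Yield messages until END; re-raise a queued overflow error."""
        while True:
            item = self._queue.get()
            if item is END:
                return
            if isinstance(item, Exception):
                raise item
            yield item
```

With this shape a producer that outruns the consumer causes `drain()` to raise rather than block, which is the property the `streamAlarmsCommandFailsFastOnQueueOverflow` regression test checks for on the Java side.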
### Client.Java-034
| Field | Value |
@@ -601,7 +605,7 @@ The library-side `MxEventStream` (Client.Java-002 resolution) and `DeployEventSt
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:182-198` |
| Status | Open |
| Status | Resolved |
**Description:** `BatchCommand.call()` reads one CLI invocation per stdin line and tokenises with:
@@ -622,6 +626,8 @@ The current `MxGatewayCliTests` test set (`batchCommandExecutesVersionAndEmitsEo
**Recommendation:** Replace `line.trim().split("\\s+")` with a real shell-style tokeniser that honours single and double quotes and backslash escapes — `picocli.CommandLine.ArgumentParser` doesn't ship one, but Apache Commons Exec's `CommandLine.translateCommandline(String)`, JDK 21's `java.util.spi.ToolProvider` argument parsing, or a small hand-written state machine all work. Cross-check the .NET / Go / Rust / Python `batch` implementations in the same change so all five clients use the same tokenisation; document the contract in the protocol comment in `MxGatewayCli.java` and in `scripts/run-client-e2e-tests.ps1`. Add a CLI test that feeds `acknowledge-alarm --comment "with spaces"` through `batch` and asserts the `--comment` value reaches the gateway as `"with spaces"`.
**Resolution:** 2026-05-24 — Confirmed root cause: `BatchCommand.call()` at the per-line loop used `line.trim().split("\\s+")` which has no quote handling. Replaced with a new package-private `MxGatewayCli.tokenizeBatchLine(String)` static helper — a hand-rolled POSIX-style shell tokenizer (no new dependency added) that honours: (a) double-quoted runs `"..."` with `\\`, `\"`, and `\n` escapes inside; (b) single-quoted runs `'...'` taken literally with no escapes (POSIX rule); (c) backslash escapes for any single character outside quotes (so `needs\ verification` is one token); (d) whitespace runs outside quotes separate tokens; (e) explicit `IllegalArgumentException` on unterminated quote or trailing backslash so the batch loop surfaces it as a JSON error instead of emitting wrong args. The `BatchCommand` per-line tokenisation now calls `tokenizeBatchLine(line)` and treats an empty-array result as a blank line (skip). Behaviour for whitespace-only input is unchanged. The cross-client `batch` audit (.NET / Go / Rust / Python) is out of scope for this Java-focused finding and tracked separately. Regression tests in `MxGatewayCliTests`: (a) `batchCommandTokenisesDoubleQuotedArgumentWithEmbeddedSpaces`: `--comment "needs verification"` round-trips intact; (b) `batchCommandTokenisesSingleQuotedArgumentWithEmbeddedSpaces`: single-quoted variant; (c) `batchCommandTokenisesBackslashEscapedSpaceOutsideQuotes`: `needs\ verification` outside quotes; (d) `batchCommandPreservesEmptyQuotedArgument`: `""` parses to an empty-string argument; (e) `batchCommandSupportsBackslashEscapedQuoteInsideDoubleQuotes`: `\"inner\"` survives the inner quotes. TDD red phase confirmed: all five tests failed against the original `split("\\s+")` implementation; after the fix all five pass.
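The tokenisation rules (a) to (e) listed in this resolution can be sketched as a small state machine. The version below is written in Python rather than Java for brevity and is not the `tokenizeBatchLine` source: the function name is hypothetical, `ValueError` stands in for `IllegalArgumentException`, reading `\n` inside double quotes as a newline is an assumption, and so is keeping any other backslash sequence inside double quotes verbatim.

```python
def tokenize_batch_line(line):
    """Split a batch line into arguments, honouring quotes and backslashes."""
    tokens = []
    current = []
    in_token = False  # True once a token has started, even if still empty ("")
    quote = None      # None, "'" or '"'
    i = 0
    n = len(line)
    while i < n:
        ch = line[i]
        if quote == "'":
            # Rule (b): single quotes are literal, no escapes.
            if ch == "'":
                quote = None
            else:
                current.append(ch)
        elif quote == '"':
            # Rule (a): double quotes allow a small set of escapes.
            if ch == '"':
                quote = None
            elif ch == "\\":
                i += 1
                if i >= n:
                    raise ValueError("trailing backslash")
                esc = line[i]
                if esc in ("\\", '"'):
                    current.append(esc)
                elif esc == "n":
                    current.append("\n")  # assumption: \n means newline
                else:
                    current.append("\\" + esc)  # assumption: keep verbatim
            else:
                current.append(ch)
        elif ch in ("'", '"'):
            quote = ch
            in_token = True
        elif ch == "\\":
            # Rule (c): a backslash escapes any single character.
            i += 1
            if i >= n:
                raise ValueError("trailing backslash")
            current.append(line[i])
            in_token = True
        elif ch.isspace():
            # Rule (d): unquoted whitespace ends the current token.
            if in_token:
                tokens.append("".join(current))
                current = []
                in_token = False
        else:
            current.append(ch)
            in_token = True
        i += 1
    if quote is not None:
        raise ValueError("unterminated quote")  # rule (e)
    if in_token:
        tokens.append("".join(current))
    return tokens
```

Tracking `in_token` separately from the character buffer is what lets `""` survive as an empty-string argument, and a whitespace-only line yields an empty list that the caller can treat as a blank line.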
### Client.Java-035
| Field | Value |
@@ -629,7 +635,7 @@ The current `MxGatewayCliTests` test set (`batchCommandExecutesVersionAndEmitsEo
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/java/zb-mom-ww-mxgateway-client/src/test/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientSessionTests.java` |
| Status | Open |
| Status | Resolved |
**Description:** Commit `8a0c59d` added `MxGatewayClient.streamAlarms(StreamAlarmsRequest, StreamObserver<AlarmFeedMessage>)` and a new public `MxGatewayAlarmFeedSubscription` class. No library-side test exercises either: a grep for `streamAlarms` across `zb-mom-ww-mxgateway-client/src/test/...` returns zero matches. The CLI tests (`MxGatewayCliTests.streamAlarmsCommand*`) exercise the path end-to-end, but they route through a `FakeClient.streamAlarms` override that bypasses the production `subscription.wrap(observer)` glue and the `withStreamDeadline(rawAsyncStub()).streamAlarms(...)` call. A regression to either — forgetting `.wrap(observer)`, dropping the deadline interceptor, misnaming the request — would compile and pass the CLI tests but break against a real gateway.
@@ -637,6 +643,8 @@ This is the same coverage gap pattern as Client.Java-030 (no fixture test for `Q
**Recommendation:** Add `streamAlarmsForwardsRequestAndStreamsAlarmFeedMessages` to `MxGatewayClientSessionTests` (in-process gRPC via the existing `InProcessGateway` + `TestGatewayService` fixture): override `TestGatewayService.streamAlarms` to capture the inbound `StreamAlarmsRequest` and emit one `active_alarm` snapshot, one `snapshot_complete`, and one `transition`, then complete. Call `MxGatewayClient.streamAlarms`, drain the observer via a `CountDownLatch`, and assert (a) the server observed the `alarm_filter_prefix`, (b) all three messages arrived in order with the expected payload-case, and (c) `MxGatewayAlarmFeedSubscription.cancel()` aborts the call (latch via `ServerCallStreamObserver.setOnCancelHandler`, mirroring the Client.Java-015 cancellation regression). Optionally also cover the cancel-before-beforeStart race that `MxGatewayAlarmFeedSubscription.wrap` handles, mirroring `mxEventStreamCloseBeforeBeforeStartCancelsStream`.
**Resolution:** 2026-05-24 — Confirmed the coverage gap: a grep across `zb-mom-ww-mxgateway-client/src/test/...` for `streamAlarms` returned zero matches; the CLI-only test routed through `FakeClient.streamAlarms` which bypassed both the production `subscription.wrap(observer)` and the `withStreamDeadline(rawAsyncStub()).streamAlarms(...)` gRPC call. Added `streamAlarmsForwardsRequestAndStreamsAlarmFeedMessages` to `MxGatewayClientSessionTests` in the same shape as `queryActiveAlarmsForwardsRequestAndStreamsSnapshots` (Client.Java-030 resolved this way). The test overrides `TestGatewayService.streamAlarms` to capture the inbound `StreamAlarmsRequest`, register a `serverCancelled` latch via `((ServerCallStreamObserver<AlarmFeedMessage>) responseObserver).setOnCancelHandler(...)`, then emit three messages: an `active_alarm` snapshot, a `snapshot_complete` sentinel, and a `transition`. It deliberately does NOT call `onCompleted()` so the call remains open for the cancellation assertion. The test then calls `MxGatewayClient.streamAlarms` against the in-process gateway, drains the wrapped observer via a `threeReceived` `CountDownLatch`, and asserts (a) the server observed `alarm_filter_prefix=Tank01`, (b) all three messages arrived in order with the expected payload-case (`ACTIVE_ALARM`, `SNAPSHOT_COMPLETE`, `TRANSITION`) and payload values (`Tank01.Level.HiHi`, transition kind `ACKNOWLEDGE`), and (c) `subscription.cancel()` causes the server's on-cancel handler to fire within 5 s (proves cancellation propagates through the production `subscription.wrap(observer)` glue, not just the CLI fake). TDD red phase: temporarily replaced the production `MxGatewayClient.streamAlarms` body with `withStreamDeadline(rawAsyncStub()).streamAlarms(request, observer);` (dropping the `subscription.wrap(observer)` indirection); the test failed at the `serverCancelled.await` assertion because cancellation was no longer wired to the underlying gRPC call. Restoring the production glue turned the build green.
### Client.Java-036
| Field | Value |
@@ -644,7 +652,7 @@ This is the same coverage gap pattern as Client.Java-030 (no fixture test for `Q
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayAlarmFeedSubscription.java`, `MxGatewayEventSubscription.java`, `MxGatewayActiveAlarmsSubscription.java`, `DeployEventSubscription.java` |
| Status | Open |
| Status | Resolved |
**Description:** `MxGatewayAlarmFeedSubscription` is a structural near-copy of `MxGatewayEventSubscription` — same `AtomicReference<ClientCallStreamObserver<…>>` + `AtomicBoolean cancelled` field shape, the same `wrap(observer)` returning a `ClientResponseObserver` that stores `requestStream` in `beforeStart`, the same close-before-beforeStart race handling that Client.Java-014 originally fixed for `MxEventStream`, and the same `cancel()`+`close()` idempotency contract. The four subscription classes (`MxGatewayEventSubscription`, `MxGatewayActiveAlarmsSubscription`, `MxGatewayAlarmFeedSubscription`, `DeployEventSubscription`) are now ~60-line near-clones differing only in the request/response generic parameters and the `cancel` message string.
@@ -652,4 +660,6 @@ This is the same maintenance-hazard pattern Client.Java-009 / Client.Java-016 id
**Recommendation:** Extract a package-private abstract base, e.g. `MxGatewayStreamSubscription<TRequest>`, holding the `AtomicReference` / `AtomicBoolean` pair, the `cancel()` / `close()` implementation, and a `ClientResponseObserver` factory parameterised by the cancel-message string and the response observer. Have all four subscription classes extend it. Behaviour-only refactor — no public API change, existing tests cover the contract.
**Resolution:** 2026-05-24 — Extracted a package-private abstract base `MxGatewayStreamSubscription<TRequest, TResponse> implements AutoCloseable` (new file `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayStreamSubscription.java`). It holds the shared `AtomicReference<ClientCallStreamObserver<TRequest>>` and `AtomicBoolean cancelled` pair, the `wrap(StreamObserver<TResponse>)` factory that returns a `ClientResponseObserver` with the Client.Java-014 close-before-beforeStart fix baked in, the `cancel()` / `close()` implementation, and an immutable `cancelMessage` injected by the subclass constructor. The four prior 60-line near-clones (`MxGatewayEventSubscription`, `MxGatewayAlarmFeedSubscription`, `MxGatewayActiveAlarmsSubscription`, `DeployEventSubscription`) collapse to ~10-line subclasses that only declare their `<Request, Response>` type parameters and supply the cancel-message string to `super(...)`. Public API surface is preserved: each subclass remains a `public final class` with a public no-arg constructor (the constructor was implicit on the original classes; I made it explicit `public` on the subclasses so the existing CLI `FakeClient.streamAlarms` in a different package can still `new MxGatewayAlarmFeedSubscription()`). The `wrap(...)` method is `final` and package-private on the base — same accessibility the four subclasses had before — so production callers in `MxGatewayClient`/`GalaxyRepositoryClient` see no change. 
New test file `MxGatewayStreamSubscriptionContractTests` exercises the lifecycle/cancellation contract identically across all four subclasses (16 tests, four per scenario): (a) cancel-before-beforeStart eagerly cancels the stream once it attaches with the subclass-specific message, (b) cancel-after-beforeStart forwards directly to the stream, (c) `close()` delegates to `cancel()`, (d) the wrapped observer forwards `onNext`/`onError`/`onCompleted` verbatim, and a compile-time `typeBoundsCheck` helper that asserts each subclass still binds its `<Req, Resp>` parameters to the right proto types. TDD red phase confirmed: temporarily breaking one subclass's `super(...)` message to `"BROKEN MESSAGE"` made the contract test for that subclass fail with `expected: <client cancelled alarm feed> but was: <BROKEN MESSAGE>`; restoring the correct value turned all 16 contract tests green. Future fixes to the shared lifecycle now live in one place — the next Client.Java-014/021-style race fix cannot drift across the four classes.
+117 -6
@@ -7,7 +7,7 @@
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Status | Re-reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -835,7 +835,7 @@ parity fix.
| Severity | High |
| Category | Documentation & comments |
| Location | `clients/python/README.md:201-202`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:389-420` |
| Status | Open |
| Status | Resolved |
**Description:** The README CLI examples added by commit `8738735` for the
new alarm subcommands cite flags the CLI does not accept:
@@ -868,6 +868,19 @@ rename the CLI option to `--alarm-reference` and add a test that copy-pastes
the README examples through `CliRunner` to assert they parse. Option (1) is
the smaller change.
**Resolution:** 2026-05-24 — Fixed the README examples to match the
implementation (option 1, smaller change). `clients/python/README.md:201-202`
now reads `mxgw-py stream-alarms --max-messages 1 --json` and
`mxgw-py acknowledge-alarm --reference "\\Galaxy\Area001.Pump001.PumpFault" --json`;
`--session-id` is dropped from both lines (the alarm feed is gateway-served,
session-less) and `--alarm-reference` is renamed to the real `--reference` flag.
Regression test
`tests/test_review_findings_022_to_026.py::test_readme_alarm_examples_parse_against_cli`
extracts every `mxgw-py …` line from the README, appends `--help` so only the
parser runs, and asserts that no example produces a `no such option` Click error.
Failed before the fix (the original `stream-alarms --session-id <id> …` line
emitted `Error: No such option: --session-id`), passes after.
### Client.Python-023
| Field | Value |
@@ -875,7 +888,7 @@ the smaller change.
| Severity | Medium |
| Category | Security |
| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:901-906` |
| Status | Open |
| Status | Resolved |
**Description:** Client.Python-013 (severity Medium, Security) was marked
**Resolved** on 2026-05-20 with the explicit claim that the silent
@@ -919,6 +932,31 @@ is marked Resolved with a 2026-05-20 commit reference, do **not** silently
re-resolve this finding — keep it Open with a fresh ID so the regression
audit trail is preserved.
**Resolution:** 2026-05-24 — Re-applied the Client.Python-013 fix on the
renamed CLI module. Dropped the `endpoint.startswith("localhost:") or
endpoint.startswith("127.0.0.1:")` auto-plaintext branch from
`_use_plaintext` in `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py`.
TLS is now the default and `--plaintext` is the only way to opt in to
plaintext; `--tls` is accepted as a redundant affirmation and the two
flags combined raise `click.UsageError`. Regression tests live in
`tests/test_review_findings_022_to_026.py`:
`test_use_plaintext_does_not_auto_downgrade_for_localhost_endpoint` and
`test_use_plaintext_does_not_auto_downgrade_for_loopback_ipv4_endpoint`
exercise the bare-endpoint path,
`test_use_plaintext_requires_explicit_plaintext_flag` and
`test_use_plaintext_tls_flag_explicitly_disables_plaintext` pin the explicit
opt-in / opt-out contract,
`test_use_plaintext_rejects_plaintext_and_tls_combined` asserts mutual
exclusivity, and
`test_cli_localhost_endpoint_with_no_flags_uses_tls_channel` is an
end-to-end CliRunner test that intercepts `GatewayClient.connect` and
asserts the resolved `ClientOptions.plaintext` is `False` for a
`localhost:5000` endpoint without `--plaintext`. All five tests failed
against the pre-fix source and pass against the fix. **Behaviour change for
callers:** scripts that previously relied on
`mxgw-py … --endpoint localhost:5000 …` selecting plaintext silently must
now add an explicit `--plaintext` flag (or set up TLS on the gateway).
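The resulting contract can be sketched in a few lines. This is an
illustration of the behaviour described above, not the body of
`_use_plaintext`: the function name and signature are hypothetical, and
`ValueError` stands in for `click.UsageError` so the sketch has no
third-party dependency.

```python
def use_plaintext(endpoint, plaintext=False, tls=False):
    """Return True only when the caller explicitly opted in to plaintext.

    The endpoint is accepted but deliberately ignored: a localhost or
    loopback address must never downgrade the channel on its own.
    """
    if plaintext and tls:
        # Stand-in for click.UsageError in the real CLI.
        raise ValueError("--plaintext and --tls are mutually exclusive")
    return plaintext
```

Ignoring `endpoint` entirely is the point of the sketch: once no branch
inspects the address, there is no path by which `localhost:5000` can select
plaintext without the flag.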
### Client.Python-024
| Field | Value |
@@ -926,7 +964,7 @@ audit trail is preserved.
| Severity | Medium |
| Category | Code organization & conventions |
| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:13,48-119` |
| Status | Open |
| Status | Resolved |
**Description:** The new `batch` subcommand (commit `71d2c39`) implements
the cross-language batch protocol by importing `click.testing.CliRunner`
@@ -965,6 +1003,33 @@ batch loop can interleave inner-command output with the
a regression test that drives `batch` with `batch\n` on stdin and asserts
recursive invocation is either rejected or correctly bounded.
**Resolution:** 2026-05-24 — Removed the `from click.testing import CliRunner`
import and the `CliRunner()` instantiation from
`clients/python/src/zb_mom_ww_mxgateway_cli/commands.py`. The `batch`
command body now dispatches each stdin line through a new
`_dispatch_batch_line` helper that calls `main.main(args=…,
standalone_mode=False, prog_name="mxgw-py")` directly and captures the
subcommand's stdout via `contextlib.redirect_stdout(io.StringIO())`. Click
exit conditions (`click.exceptions.Exit`, `click.ClickException`,
`SystemExit`) are caught and rendered as
`{"error": …, "type": …}` JSON; arbitrary exceptions are caught with a
broad `except Exception` so the batch loop never dies. A nested `batch`
line is rejected outright with a `RecursiveBatchError` JSON record before
the dispatcher runs, eliminating the silent-recursive-spawn footgun the
original `CliRunner.invoke(main, ["batch"], …)` path enabled. Regression
tests:
`tests/test_review_findings_022_to_026.py::test_batch_command_does_not_use_clirunner_in_production`
asserts the production module no longer imports `from click.testing` or
calls `CliRunner(`; and
`test_batch_recursive_batch_line_is_bounded` drives a `batch\nversion --json\n`
stdin payload and asserts the recursive `batch` line emits an error JSON
record rather than silently exiting. The pre-existing batch tests
(`test_batch_runs_version_command_and_writes_eor`,
`test_batch_terminates_on_empty_line`,
`test_batch_continues_after_error_line`) still pass against the new
implementation, confirming the wire-level contract (one EOR per line,
clean JSON error blocks) is preserved.
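A minimal sketch of the dispatch shape described above, under stated
assumptions: `run_command` is a hypothetical callable standing in for the
direct `main.main(args=…, standalone_mode=False)` call, `BATCH_EOR` is a
placeholder sentinel value, whitespace splitting is assumed for
tokenisation, and treating a zero exit code as success is an assumption.
It is not the `_dispatch_batch_line` source.

```python
import contextlib
import io
import json

BATCH_EOR = "<<EOR>>"  # placeholder; the real sentinel value is not shown here


def dispatch_batch_line(args, run_command):
    """Run one batch line and return the text to emit before the sentinel."""
    if args and args[0] == "batch":
        # Reject nested batch sessions before any dispatch happens.
        return json.dumps(
            {"error": "nested batch is not allowed", "type": "RecursiveBatchError"}
        )
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            run_command(args)
    except SystemExit as exc:
        if exc.code not in (0, None):
            return json.dumps({"error": f"exit code {exc.code}", "type": "SystemExit"})
    except Exception as exc:  # broad on purpose: the batch loop must not die
        return json.dumps({"error": str(exc), "type": type(exc).__name__})
    return buffer.getvalue()


def run_batch(lines, run_command, write):
    """Process stdin-style lines; an empty line ends the session."""
    for line in lines:
        if not line.strip():
            break
        result = dispatch_batch_line(line.split(), run_command)
        write(result.rstrip("\n"))
        write(BATCH_EOR)


def demo_command(args):
    """Hypothetical subcommand runner used only to exercise the sketch."""
    if args[0] == "version":
        print('{"version": "0.0.0"}')
    else:
        raise ValueError(f"unknown command: {args[0]}")
```

Feeding `["batch\n", "version --json\n"]` through `run_batch` with
`demo_command` emits an error record for the nested `batch` line, then the
version payload, each followed by one sentinel, which is the one-EOR-per-line
contract the pre-existing batch tests check.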
### Client.Python-025
| Field | Value |
@@ -972,7 +1037,7 @@ recursive invocation is either rejected or correctly bounded.
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/python/tests/test_cli.py`, `clients/python/src/zb_mom_ww_mxgateway/{client.py,session.py}`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py` |
| Status | Open |
| Status | Resolved |
**Description:** Commits `6add4b4` and `828e3e6` added five new SDK
methods (`Session.read_bulk`, `Session.write_bulk`, `Session.write2_bulk`,
@@ -1020,6 +1085,32 @@ applied to the renamed bench). At minimum, add a request-shape test for
`write_secured_bulk` since the secured family is the highest-risk
parity surface.
**Resolution:** 2026-05-24 — Added behavioural test coverage for the five
new bulk SDK methods, `stream_alarms`, and the new CLI subcommand bodies
in `tests/test_review_findings_022_to_026.py`. Request-shape tests
(`test_session_read_bulk_sends_expected_request_shape`,
`test_session_write_bulk_sends_expected_request_shape`,
`test_session_write2_bulk_sends_expected_request_shape`,
`test_session_write_secured_bulk_sends_expected_request_shape`,
`test_session_write_secured2_bulk_sends_expected_request_shape`) drive
each `Session.*_bulk` method against a fake `Invoke` stub and assert
the captured `MxCommand`'s `kind`, sub-message, `server_handle`, and
per-entry fields (including `current_user_id` / `verifier_user_id`
on the secured family — the highest-risk parity surface the finding
calls out). `test_stream_alarms_yields_feed_messages_and_cancels_on_close`
covers the `GatewayClient.stream_alarms` happy path including the
`_canceling_alarm_feed_iterator` cancel-on-close contract and the
authorization metadata header. CLI happy-path tests
(`test_cli_read_bulk_happy_path`, `test_cli_write_bulk_happy_path`,
`test_cli_stream_alarms_happy_path`, `test_cli_acknowledge_alarm_happy_path`)
each drive their subcommand through `CliRunner` against a fake stub
injected via a monkeypatched `GatewayClient.connect` and assert the
emitted JSON shape and that the captured RPC request carries the
expected fields. The four CLI happy-path tests passed even before any
production fix (the implementations were correct; the finding is a
coverage gap), but they now exist as regression guards against future
drift. No source change — pure coverage finding.
### Client.Python-026
| Field | Value |
@@ -1027,7 +1118,7 @@ parity surface.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:674-738` |
| Status | Open |
| Status | Resolved |
**Description:** Two minor quality issues in the new `_bench_read_bulk`
body (commit `6add4b4`):
@@ -1060,3 +1151,23 @@ module-level `logger = logging.getLogger(__name__)`. No behavioural
change in the happy path; failure path becomes diagnosable. No new test
required for the import hoist; the logger change is exercised by the
existing bench smoke test once `caplog` is added to the test signature.
**Resolution:** 2026-05-24 — Hoisted `import time` to the module-level
import block in `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py`
alongside the existing standard-library imports; the function-local
`import time` line at the top of `_bench_read_bulk` is gone. Added a
module-level `logger = logging.getLogger(__name__)` and rewrote the two
`finally` cleanup blocks to bind the swallowed exception and log it at
`WARNING` level — `unsubscribe_bulk` failures now emit
`"bench-read-bulk: unsubscribe_bulk cleanup failed: %s"` and the
`session.close()` failure path emits the equivalent — so a future
regression in the cleanup path is diagnosable at the next benchmark run
rather than silently corrupting subscription bookkeeping. Regression
tests in `tests/test_review_findings_022_to_026.py`:
`test_commands_module_imports_time_at_module_scope` uses
`inspect.getsource(_bench_read_bulk)` to assert no function-local
`import time` line, and asserts the module exposes `time` at module
scope; `test_commands_module_bench_read_bulk_does_not_use_bare_except_pass`
greps the function source for the `except Exception:\n pass` pattern
and rejects it. Both tests failed against the pre-fix source and pass
against the fix.
+25 -9
@@ -7,7 +7,7 @@
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Status | Re-reviewed |
| Open findings | 8 |
| Open findings | 0 |
## Checklist coverage
@@ -481,7 +481,7 @@ The CLI integration in Client.Rust-014 works either way; this is solely about de
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `clients/rust/src/session.rs:369-391,403-420,427-444,452-469,476-493,631-696,706-724` |
| Status | Open |
| Status | Resolved |
**Description:** Commit `3251069` re-introduced the bulk read/write SDK methods (`read_bulk`, `write_bulk`, `write2_bulk`, `write_secured_bulk`, `write_secured2_bulk`) on `Session`. Each method falls back to `Vec::new()` when an OK reply does not carry the expected typed payload arm:
@@ -496,6 +496,8 @@ and `bulk_write_results` does the same for the four write families. A caller of
**Recommendation:** Re-apply the Client.Rust-005/006 resolutions on top of the new methods: add `Error::MalformedReply { detail: String }` back to `error.rs`, change `register_server_handle` / `add_item_handle` / `add_item2_handle` to return `Result<i32, Error>` (yielding `MalformedReply` when the reply lacks an extractable handle), change `bulk_results` and `bulk_write_results` to return `Result<Vec<_>, Error>` (yielding `MalformedReply` on a mismatched / absent payload arm), and route the same fix through the new `read_bulk` inline branch. Re-add the six `…_returns_malformed_reply_…` tests from the Client.Rust-016 resolution to lock the contract in.
**Resolution:** 2026-05-24 — Re-added `Error::MalformedReply { detail: String }` to `clients/rust/src/error.rs` (alongside re-adding `Error::Unavailable` for Client.Rust-010's resolution). Changed `register_server_handle` / `add_item_handle` / `add_item2_handle` in `clients/rust/src/session.rs` to return `Result<i32, Error>` and yield `MalformedReply` when the reply lacks both a typed payload and an int32 `return_value`. Changed `bulk_results` and `bulk_write_results` to return `Result<Vec<_>, Error>` and yield `MalformedReply` on a mismatched or absent payload arm. Rewrote `read_bulk`'s inline branch the same way. Added six regression tests in `clients/rust/tests/client_behavior.rs` (`register_returns_malformed_reply_when_ok_reply_has_no_payload`, `add_item_returns_malformed_reply_when_ok_reply_has_no_payload`, `add_item2_returns_malformed_reply_when_ok_reply_has_no_payload`, `subscribe_bulk_returns_malformed_reply_on_mismatched_payload_arm`, `read_bulk_returns_malformed_reply_on_mismatched_payload_arm`, `write_bulk_returns_malformed_reply_on_mismatched_payload_arm`), driven by a new `InvokeOverride` enum on `FakeState` that pins the fake gateway's `Invoke` handler to `OkWithoutPayload`, `OkWithMismatchedPayload`, or `Unavailable(...)` per test.
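The contract this finding restores can be illustrated with a minimal sketch. The `Payload` and `Reply` types below are hypothetical stand-ins for the generated reply messages, and `read_bulk_results` is an illustrative helper rather than the crate's actual function; only the `Error::MalformedReply { detail: String }` shape comes from the resolution text.

```rust
use std::fmt;

// Hypothetical stand-ins for the generated reply types (assumption).
#[derive(Debug, Clone, PartialEq)]
enum Payload {
    ReadBulk(Vec<String>),
    WriteBulk(Vec<i32>),
}

#[derive(Debug)]
struct Reply {
    payload: Option<Payload>,
}

#[derive(Debug, PartialEq)]
enum Error {
    MalformedReply { detail: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::MalformedReply { detail } => write!(f, "malformed reply: {detail}"),
        }
    }
}

/// Extract read-bulk results. A missing or mismatched payload arm becomes
/// an error instead of being papered over with an empty vector.
fn read_bulk_results(reply: Reply) -> Result<Vec<String>, Error> {
    match reply.payload {
        Some(Payload::ReadBulk(results)) => Ok(results),
        Some(other) => Err(Error::MalformedReply {
            detail: format!("expected ReadBulk payload, got {other:?}"),
        }),
        None => Err(Error::MalformedReply {
            detail: "OK reply carried no payload".to_string(),
        }),
    }
}

fn main() {
    let ok = Reply { payload: Some(Payload::ReadBulk(vec!["42".to_string()])) };
    println!("{:?}", read_bulk_results(ok));

    let mismatched = Reply { payload: Some(Payload::WriteBulk(vec![0])) };
    println!("{:?}", read_bulk_results(mismatched));

    let empty = Reply { payload: None };
    println!("{:?}", read_bulk_results(empty));
}
```

With this shape a caller can tell "the gateway returned zero results" apart from "the gateway returned something the client could not interpret", which a `Vec::new()` fallback hides.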
### Client.Rust-023
| Field | Value |
@@ -503,7 +505,7 @@ and `bulk_write_results` does the same for the four write families. A caller of
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | `clients/rust/crates/mxgw-cli/src/main.rs:835,872,1476` |
| Status | Open |
| Status | Resolved |
**Description:** Three CLI subcommands added since `d692232` hard-code their `client_correlation_id`:
@@ -517,6 +519,8 @@ Every invocation of these subcommands on the same machine — and every iteratio
**Recommendation:** Replace the three string literals with calls to the existing `next_correlation_id` helper (already `pub` and intended for raw-RPC consumers): `zb_mom_ww_mxgateway_client::session::next_correlation_id("cli-stream-alarms")`, `next_correlation_id("cli-acknowledge-alarm")`, `next_correlation_id("cli-bench-read-bulk-close")`. While here, restore the same fix at lines 506 and 553 so the CLI surface as a whole shares the library's correlation-id discipline.
**Resolution:** 2026-05-24 — Restored `pub fn next_correlation_id(label: &str) -> String` in `clients/rust/src/session.rs` (the `d692232` rename had reverted Client.Rust-011/014's resolution along with the rest) and re-exported it at the crate root in `clients/rust/src/lib.rs` (`pub use session::{next_correlation_id, Session};`) so the in-tree `mxgw` CLI calls the short `zb_mom_ww_mxgateway_client::next_correlation_id(...)` path. Replaced all five hard-coded correlation-id literals in `clients/rust/crates/mxgw-cli/src/main.rs` (`rust-cli-ping`, `rust-cli-close-session`, `rust-cli-stream-alarms`, `rust-cli-acknowledge-alarm`, `rust-cli-bench-read-bulk-close`) with `next_correlation_id("cli-...")`. `Session::command_request` and `Session::close` now use the same helper. Added a regression test `cli_subcommands_propagate_unique_correlation_ids_from_next_correlation_id` in `clients/rust/tests/client_behavior.rs` that records the correlation id observed at the fake gateway and asserts both that successive `next_correlation_id` calls produce distinct values and that the label is propagated through `stream_alarms` / `acknowledge_alarm`.
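As a rough sketch of what a helper with the signature `next_correlation_id(label: &str) -> String` might do, the version below combines the label with the process id and an atomic sequence number. The uniqueness source is an assumption; the body of the real helper in `session.rs` is not shown in this finding and may differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide sequence counter (assumption: the real helper may use a
// different uniqueness source, such as a random suffix).
static NEXT_SEQ: AtomicU64 = AtomicU64::new(1);

/// Hypothetical sketch: label + process id + sequence number, so that
/// repeated calls and separate CLI invocations yield different ids.
pub fn next_correlation_id(label: &str) -> String {
    let seq = NEXT_SEQ.fetch_add(1, Ordering::Relaxed);
    let pid = std::process::id();
    format!("{label}-{pid}-{seq}")
}

fn main() {
    let first = next_correlation_id("cli-stream-alarms");
    let second = next_correlation_id("cli-stream-alarms");
    // Successive calls with the same label never collide within a process.
    assert_ne!(first, second);
    // The label stays visible as a prefix for log correlation.
    assert!(first.starts_with("cli-stream-alarms-"));
    println!("{first}\n{second}");
}
```

Keeping the label as a prefix preserves the property the regression test checks: the id observed at the gateway still identifies which subcommand issued the call.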
### Client.Rust-024
| Field | Value |
@@ -524,7 +528,7 @@ Every invocation of these subcommands on the same machine — and every iteratio
| Severity | Medium |
| Category | Testing coverage |
| Location | `clients/rust/tests/client_behavior.rs:405-415`; `clients/rust/src/session.rs:369-493`; `clients/rust/src/client.rs:265-291`; `clients/rust/crates/mxgw-cli/src/main.rs:1310-1505` |
| Status | Open |
| Status | Resolved |
**Description:** The diff under review adds substantial SDK and CLI surface with no positive-path coverage:
@@ -534,6 +538,8 @@ Every invocation of these subcommands on the same machine — and every iteratio
**Recommendation:** Extend the fake-gateway `Invoke` dispatcher in `tests/client_behavior.rs` with the five new bulk reply arms, add round-trip tests for each (`write_bulk` / `write2_bulk` / `write_secured_bulk` / `write_secured2_bulk` / `read_bulk`), and re-add the six malformed-reply tests from the Client.Rust-016 resolution. Replace the trivial `stream_alarms` stub with one that emits a synthetic `ActiveAlarm` → `SnapshotComplete` → `Transition` sequence and assert the client surfaces them in order. For the bench, factor the percentile / accounting helpers out of `run_bench_read_bulk` into a small struct (matching the previous `BenchReadBulkStats`) and add unit tests asserting `latencyMs.{p50,p95,p99,max,mean}` are computed correctly from a hand-built sample.
**Resolution:** 2026-05-24 — Extended the fake gateway's `Invoke` dispatcher to handle `Register`, `AddItem2`, `ReadBulk`, `WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, and `WriteSecured2Bulk`, with shared `write_bulk_reply_for` / `bulk_write_result_ok` helpers. Added round-trip integration tests in `clients/rust/tests/client_behavior.rs` for all five bulk SDK methods: `read_bulk_round_trips_through_the_fake_gateway`, `write_bulk_round_trips_through_the_fake_gateway`, `write2_bulk_round_trips_through_the_fake_gateway`, `write_secured_bulk_round_trips_through_the_fake_gateway`, `write_secured2_bulk_round_trips_through_the_fake_gateway`. Replaced the trivial `stream_alarms` stub with a per-test script (`FakeState::stream_alarms_script`) and added `stream_alarms_emits_snapshot_then_complete_then_transition_in_order`, which feeds an `ActiveAlarm` → `SnapshotComplete` → `Transition` sequence and asserts the client surfaces all three payload arms in order. Added three CLI-side unit tests against the existing `percentile_summary` helper in `clients/rust/crates/mxgw-cli/src/main.rs`: `bench_percentile_summary_matches_hand_built_sample` (asserts p50/p95/p99/max/mean on the sample `[1,2,3,4,5]`), `bench_percentile_summary_handles_empty_sample`, and `bench_percentile_summary_handles_single_value_sample`. The malformed-reply suite from Client.Rust-022 plus the unary `Error::Unavailable` test together cover the remaining surface.
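To make the hand-built-sample tests concrete, here is a sketch of a percentile summary over a latency sample. It assumes the nearest-rank definition of a percentile; the real `percentile_summary` helper is not shown in this finding and may interpolate differently, so the `Summary` struct, the function names, and the resulting values are illustrative only.

```rust
// Illustrative summary shape; field names mirror the latencyMs keys.
#[derive(Debug, PartialEq)]
struct Summary {
    p50: u64,
    p95: u64,
    p99: u64,
    max: u64,
    mean: f64,
}

/// Nearest-rank percentile on a sorted, non-empty slice:
/// rank = ceil(percent / 100 * n), computed in integer arithmetic.
fn nearest_rank(sorted: &[u64], percent: usize) -> u64 {
    let n = sorted.len();
    let rank = ((percent * n + 99) / 100).clamp(1, n);
    sorted[rank - 1]
}

/// Returns `None` for an empty sample rather than dividing by zero.
fn summarize(samples: &[u64]) -> Option<Summary> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let sum: u64 = sorted.iter().sum();
    Some(Summary {
        p50: nearest_rank(&sorted, 50),
        p95: nearest_rank(&sorted, 95),
        p99: nearest_rank(&sorted, 99),
        max: sorted[sorted.len() - 1],
        mean: sum as f64 / sorted.len() as f64,
    })
}

fn main() {
    // Under nearest-rank, [1,2,3,4,5] gives p50 = 3 and p95 = p99 = max = 5.
    let summary = summarize(&[5, 1, 4, 2, 3]).expect("non-empty sample");
    assert_eq!(summary.p50, 3);
    assert_eq!(summary.p95, 5);
    assert_eq!(summary.mean, 3.0);
    println!("{summary:?}");
}
```

The empty and single-value cases are worth testing separately because they are where an off-by-one in the rank calculation or a division by zero would surface.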
### Client.Rust-025
| Field | Value |
@@ -541,7 +547,7 @@ Every invocation of these subcommands on the same machine — and every iteratio
| Severity | Low |
| Category | Design-document adherence |
| Location | `clients/rust/RustClientDesign.md:92-106,142-153,164-171` |
| Status | Open |
| Status | Resolved |
**Description:** CLAUDE.md mandates that "When public APIs, contracts, configuration, build steps, security behavior, event shapes, value conversion, status mapping, or lifecycle rules change, the affected docs ... must change in the same commit." The diff under review adds the following public surface, none of which is reflected in `RustClientDesign.md`:
@@ -553,6 +559,8 @@ Every invocation of these subcommands on the same machine — and every iteratio
**Recommendation:** Bring `RustClientDesign.md` back in sync with the actual implementation. Restore the `Session` API block to enumerate the five new bulk read/write methods alongside the existing six bulk-subscribe helpers (cross-reference Client.Rust-019's recommendation for the correct signatures — no trailing positional `user_id`/`timestamp`/`current_user_id`/`verifier_user_id`, those live on the per-entry structs). Add `stream_alarms` / `acknowledge_alarm` / `AlarmFeedStream` to a new "Alarms" section adjacent to the existing event-stream section. Expand the CLI command list to enumerate every subcommand the binary exposes today. Expand the `Error` enum sketch to match `clients/rust/src/error.rs`. Add a short "Windows build notes" subsection documenting the `.cargo/config.toml` stack workaround and why clap-derive's large `Command` enum needs it.
**Resolution:** 2026-05-24 — Brought `clients/rust/RustClientDesign.md` back in sync with the implementation. Removed the orphan `tracing` dependency line and added a new "Windows Build Notes" section explaining the `clients/rust/.cargo/config.toml` `/STACK:8388608` MSVC link-arg (why clap-derive's large `Command` enum needs it, that it writes into `IMAGE_OPTIONAL_HEADER.SizeOfStackReserve` for both debug and release builds, and that the MSVC-only target selector keeps mingw unaffected). Extended the `Session` API block to enumerate the five new bulk read/write helpers — `read_bulk<S: AsRef<str>>(&[S], u32)`, `write_bulk(Vec<WriteBulkEntry>)`, `write2_bulk(Vec<Write2BulkEntry>)`, `write_secured_bulk(Vec<WriteSecuredBulkEntry>)`, `write_secured2_bulk(Vec<WriteSecured2BulkEntry>)` — plus `add_item2`, `un_advise`, `remove_item`, and `write2`. Added a paragraph noting that per-entry credentials/timestamps live on the entry structs and that `read_bulk`'s borrow-and-`AsRef<str>` shape is what the cross-language bench-read-bulk hot loop relies on. Added a new "Alarms" section with `GatewayClient::stream_alarms` / `acknowledge_alarm` / the `AlarmFeedStream` type alias and the always-on (no-session) snapshot → complete → transition contract. Expanded the `Error` enum block to match `clients/rust/src/error.rs` (every public variant including `MalformedReply`, `Unavailable`, `InvalidEndpoint`, `InvalidArgument`). Expanded the CLI command list to enumerate every subcommand the binary exposes today, with notes on `batch`'s EOF-only termination and `bench-read-bulk`'s cross-language JSON shape.
### Client.Rust-026
| Field | Value |
@@ -560,7 +568,7 @@ Every invocation of these subcommands on the same machine — and every iteratio
| Severity | Low |
| Category | Performance & resource management |
| Location | `clients/rust/crates/mxgw-cli/src/main.rs:1402-1406,1419-1423` |
| Status | Open |
| Status | Resolved |
**Description:** `run_bench_read_bulk` clones the `tags: Vec<String>` on every iteration of both the warmup loop and the steady-state measurement loop:
@@ -586,6 +594,8 @@ Each clone allocates one fresh `Vec<String>` plus `bulk_size` heap-allocated `St
**Recommendation:** Change `Session::read_bulk` to take `tag_addresses: &[String]` or generic over `AsRef<str>` (as Client.Rust-019's recommendation noted is what `read_bulk` was at `a020350`) so the bench can pass `&tags` once and avoid the per-call clone. Alternatively keep the `Vec<String>` signature but borrow the underlying buffer for the hot loop — e.g. build the bench's payload once outside the steady-state window and re-use it. The current `read_bulk(..., Vec<String>, ...)` shape forces the clone at the call site.
**Resolution:** 2026-05-24 — Changed `Session::read_bulk` to `read_bulk<S: AsRef<str>>(&self, server_handle, tag_addresses: &[S], timeout_ms)` so the bench loop can borrow the tag list once instead of cloning it per iteration. `run_bench_read_bulk` now binds `let tags_ref: &[String] = &tags;` outside the warm-up / steady-state loops and passes `tags_ref` into both, eliminating the per-call `Vec<String>` + `String` clone-allocations that were previously charged into the per-call latency reported by the cross-language `latencyMs` JSON contract. The CLI `read-bulk` subcommand was updated to call `session.read_bulk(server_handle, &items, timeout_ms)`. Clippy's `clone_on_copy` / `redundant_clone` lints are clean at HEAD so no extra regression test is needed beyond `read_bulk_round_trips_through_the_fake_gateway` from Client.Rust-024.
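The effect of the `&[S]` with `S: AsRef<str>` signature can be sketched in isolation. `build_read_bulk_request` below is a hypothetical stand-in for whatever `Session::read_bulk` does with its tag list, and the tag names are made up; the point is only that the caller lends the list instead of handing over a fresh `Vec<String>` per call.

```rust
/// Hypothetical stand-in for the request-building step behind `read_bulk`.
/// It borrows the tags and copies each one once, into the outgoing request.
fn build_read_bulk_request<S: AsRef<str>>(tag_addresses: &[S]) -> Vec<String> {
    tag_addresses
        .iter()
        .map(|tag| tag.as_ref().to_owned())
        .collect()
}

fn main() {
    // Illustrative tag names, not taken from the source.
    let tags: Vec<String> = vec!["Tank1.Level".to_string(), "Tank1.Temp".to_string()];

    // Bind the borrow once, outside the hot loop.
    let tags_ref: &[String] = &tags;
    for _ in 0..3 {
        // No `tags.clone()` at the call site: the caller only lends the slice.
        let request = build_read_bulk_request(tags_ref);
        assert_eq!(request.len(), 2);
    }

    // The same signature accepts string literals without allocation by the caller.
    let request = build_read_bulk_request(&["Tank2.Level"]);
    assert_eq!(request, vec!["Tank2.Level".to_string()]);

    // `tags` is still owned by the caller after the loop.
    println!("{} tags reused", tags.len());
}
```

A by-value `Vec<String>` parameter would force either a clone per iteration or giving up ownership after the first call, which is the cost the finding describes.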
### Client.Rust-027
| Field | Value |
@@ -593,7 +603,7 @@ Each clone allocates one fresh `Vec<String>` plus `bulk_size` heap-allocated `St
| Severity | Low |
| Category | Documentation & comments |
| Location | `clients/rust/.cargo/config.toml:1-9` |
| Status | Open |
| Status | Resolved |
**Description:** The new build-config file added by `71d2c39` carries this leading comment:
@@ -616,6 +626,8 @@ Two issues with the documentation vs the configuration:
**Recommendation:** Either tighten the `[target.…]` selector to only the MSVC-linker tier (`cfg(all(windows, target_env = "msvc"))`) and re-word the comment to make the release-build behaviour explicit ("the stack reservation goes into the PE header for both debug and release builds; release is unaffected at runtime because the optimizer elides the enum from the stack frame"), or add the `+gnu` variant as a parallel block using `-Wl,--stack,8388608`. As a complementary fix, see Client.Rust-024's recommendation — boxing the largest clap-derive variants would let the stack workaround be retired entirely.
**Resolution:** 2026-05-24 — Tightened the target selector in `clients/rust/.cargo/config.toml` from `cfg(windows)` to `cfg(all(windows, target_env = "msvc"))` so `x86_64-pc-windows-gnu` (mingw) builds, which route link args through the GNU linker and reject `/STACK:`, are no longer affected. Rewrote the leading comment to make the release-build behaviour explicit: the `/STACK:` link-arg writes into `IMAGE_OPTIONAL_HEADER.SizeOfStackReserve` at link time and applies to both debug and release builds (so release artifacts ship with the same 8 MB stack reservation); release builds do not need it at runtime because the optimizer elides the enum from the stack frame, but the setting is kept on so both flavours produce binaries with identical stack metadata. The doc note in `RustClientDesign.md` now mirrors this.
### Client.Rust-028
| Field | Value |
@@ -623,7 +635,7 @@ Two issues with the documentation vs the configuration:
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | `clients/rust/crates/mxgw-cli/src/main.rs:1126-1166` |
| Status | Open |
| Status | Resolved |
**Description:** `run_batch` reads commands from stdin with the blocking `std::io::Stdin::lock().lines()` iterator while the surrounding function is `async fn` and the runtime is `#[tokio::main]` (multi-threaded by default). Each `for line in stdin.lock().lines()` iteration pins one of tokio's worker threads on a blocking syscall (`ReadFile` on Windows), then spawns the dispatch as a separate `tokio::task::spawn(dispatch(cli.command))` — which itself runs on another worker thread — and awaits its `JoinHandle`. The pattern works for a single-threaded harness driver but has two latent problems:
@@ -632,6 +644,8 @@ Two issues with the documentation vs the configuration:
**Recommendation:** Replace the empty-line break with an EOF-only terminator (`if line.trim().is_empty() { println!("{BATCH_EOR}"); stdout.lock().flush().ok(); continue; }`) so accidental blank lines log an empty-EOR-bracketed result instead of ending the session. Optionally wrap the stdin reader in `tokio::task::spawn_blocking` so the runtime can move it onto a dedicated blocking-pool thread; the current shape works for today's harness contract but is brittle if the dispatch future ever needs to share the runtime with stdin reads.
**Resolution:** 2026-05-24 — Removed the `if line.is_empty() { break; }` empty-line sentinel in `run_batch` so the only terminator is now stdin EOF (the implicit end of the `for line in stdin.lock().lines()` iterator). Blank or whitespace-only lines now fall into the existing `parts.is_empty()` branch which logs `__MXGW_BATCH_EOR__` and continues, matching the other four language CLIs and shielding the PowerShell e2e harness from accidental `Write-Output ""` calls between commands. Added a comment block on `run_batch` explaining the tokio-runtime / blocking-stdin trade-off: dispatch is already spawned on a fresh tokio task so the blocking iterator parks at most one worker thread on `ReadFile`, and that is acceptable because no other future on the main task needs to run while we wait for the next command. The CLI doc comment on the `Batch` clap variant was updated to mirror the new semantics. No unit test added — `run_batch` reads from real stdin which can't be driven from a `#[tokio::test]` without a subprocess harness, and the existing `parses_batch_command` / `batch_eor_marker_is_stable` tests already cover the parser and the EOR sentinel.
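The EOF-only termination rule can be sketched as a small synchronous loop. This is not the real `run_batch`: the `dispatch` closure is a hypothetical stand-in for the CLI dispatcher, and the reader and writer are generic so the loop can be driven from an in-memory buffer, which the finding notes the real function cannot do without a subprocess harness. Only the `__MXGW_BATCH_EOR__` marker and the blank-line behaviour come from the resolution text.

```rust
use std::io::{self, BufRead, Write};

const BATCH_EOR: &str = "__MXGW_BATCH_EOR__";

/// Sketch of an EOF-terminated batch loop. Returns the number of EOR
/// records written. `dispatch` is a hypothetical command dispatcher.
fn run_batch<R: BufRead, W: Write>(
    input: R,
    mut output: W,
    mut dispatch: impl FnMut(&[&str]) -> String,
) -> io::Result<usize> {
    let mut records = 0;
    // The iterator ends only at EOF; a blank line does not break the loop.
    for line in input.lines() {
        let line = line?;
        let parts: Vec<&str> = line.split_whitespace().collect();
        if !parts.is_empty() {
            writeln!(output, "{}", dispatch(&parts))?;
        }
        // Blank and whitespace-only lines still get an EOR marker.
        writeln!(output, "{BATCH_EOR}")?;
        output.flush()?;
        records += 1;
    }
    Ok(records)
}

fn main() -> io::Result<()> {
    // An in-memory script with a blank line between two commands.
    let script = io::Cursor::new("version\n\nping\n");
    let stdout = io::stdout();
    let records = run_batch(script, stdout.lock(), |parts| format!("ran {}", parts[0]))?;
    assert_eq!(records, 3);
    Ok(())
}
```

Because every input line produces exactly one EOR marker, a harness can keep reading until it has seen as many markers as lines it sent, regardless of stray blank lines.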
### Client.Rust-029
| Field | Value |
@@ -639,7 +653,7 @@ Two issues with the documentation vs the configuration:
| Severity | High |
| Category | mxaccessgw conventions |
| Location | `clients/rust/src/options.rs:98,143`; `clients/rust/src/galaxy.rs:282`; `clients/rust/src/session.rs:664-671` |
| Status | Open |
| Status | Resolved |
**Description:** `cargo clippy --workspace --all-targets -- -D warnings` fails at HEAD `42b0037` with three errors that the prior d692232 reviewer noted as "out of scope for Client.Rust-021" but did not open as a tracked finding:
@@ -671,3 +685,5 @@ All three were resolved at `a020350` (Client.Rust-001, Client.Rust-002, Client.R
The third error (`BulkReplyKind` enum-variant-names) is also touched by the diff under review: commit `3251069` added a sibling enum `BulkWriteReplyKind` at `session.rs:699-704` whose variants (`Write`, `Write2`, `WriteSecured`, `WriteSecured2`) do not share a suffix and so do not trip the lint — but the pattern is now duplicated. A fix should rename both enums consistently or apply the same scoped `#[allow(clippy::enum_variant_names)]` reason-annotated allow to both.
**Recommendation:** Re-apply Client.Rust-001 (add doc comments on `with_max_grpc_message_bytes` / `max_grpc_message_bytes` in `options.rs`), Client.Rust-002 (drop the `Bulk` suffix from `BulkReplyKind`'s variants so they become `AddItem` / `AdviseItem` / …, or add a narrowly-scoped `#[allow(clippy::enum_variant_names)]` with a reason comment), and Client.Rust-012 (replace `last_deploy.lock().unwrap().clone()` with `*last_deploy.lock().unwrap()` in `galaxy.rs:282`). Verify with `cargo clippy --workspace --all-targets -- -D warnings`. Consider adding a pre-commit / CI gate so the next reviewer never has to discover the regression by running clippy.
**Resolution:** 2026-05-24 — Re-applied all three resolutions. `clients/rust/src/options.rs` now has `///` doc comments on `with_max_grpc_message_bytes` and `max_grpc_message_bytes`. `clients/rust/src/galaxy.rs:282` uses `*self.state.last_deploy.lock().unwrap()` instead of `.clone()`. `clients/rust/src/session.rs`'s `BulkReplyKind` variants are renamed to `AddItem` / `AdviseItem` / `RemoveItem` / `UnAdviseItem` / `Subscribe` / `Unsubscribe` (no shared `Bulk` suffix), with the call sites in `add_item_bulk` / `advise_item_bulk` / `remove_item_bulk` / `un_advise_item_bulk` / `subscribe_bulk` / `unsubscribe_bulk` updated accordingly. The sibling `BulkWriteReplyKind` already had non-suffix-sharing variants (`Write` / `Write2` / `WriteSecured` / `WriteSecured2`) and required no rename. `cargo clippy --workspace --all-targets -- -D warnings` is clean at HEAD.
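The `galaxy.rs` change is an instance of clippy's `clone_on_copy` lint. The sketch below assumes the guarded value is a `Copy` type and uses `Option<u64>` as a placeholder; the real type of `last_deploy` is not shown in this finding, and the `State` struct here is hypothetical.

```rust
use std::sync::Mutex;

// Hypothetical state holder; `Option<u64>` is a placeholder `Copy` type.
struct State {
    last_deploy: Mutex<Option<u64>>,
}

impl State {
    fn last_deploy(&self) -> Option<u64> {
        // Dereferencing the guard copies the value out. Calling `.clone()`
        // on a `Copy` value here is what `clippy::clone_on_copy` flags.
        *self.last_deploy.lock().unwrap()
    }
}

fn main() {
    let state = State { last_deploy: Mutex::new(Some(7)) };
    // The value can be read repeatedly; the lock is released after each read.
    assert_eq!(state.last_deploy(), Some(7));
    assert_eq!(state.last_deploy(), Some(7));
    println!("{:?}", state.last_deploy());
}
```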
+4 -2
@@ -7,7 +7,7 @@
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Status | Re-reviewed |
| Open findings | 1 |
| Open findings | 0 |
## Checklist coverage
@@ -494,7 +494,7 @@ The Write parity test (IntegrationTests-012's resolution) added exactly this ass
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/ZB.MOM.WW.MxGateway.IntegrationTests/IntegrationTestEnvironmentTests.cs:57-84` (`ResolveRepositoryRoot_NoMarkers_ThrowsInvalidOperationExceptionNamingStartAndMarkers`) |
| Status | Open |
| Status | Resolved |
**Description:** The new regression test for IntegrationTests-022 builds an "isolated" start directory under `Path.GetTempPath()` (e.g. `C:\Users\<user>\AppData\Local\Temp\<random>\nested` on Windows) and calls `ResolveRepositoryRoot(isolatedStart)`, asserting an `InvalidOperationException` is thrown. The walker walks every parent — `<random>`, `Temp`, `Local`, `AppData`, `<user>`, `Users`, `C:\` — and stops only when it either finds a repository root marker or runs out of parents. The test silently assumes none of those ancestor directories satisfies `IsRepositoryRoot` (a `src/` subdirectory next to `.git` / `*.sln` / `*.slnx`). The assumption is environment-dependent:
@@ -504,3 +504,5 @@ The Write parity test (IntegrationTests-012's resolution) added exactly this ass
The current dev box layout (`C:\Users\dohertj2\Desktop\mxaccessgw`) is safe because Temp is at `C:\Users\dohertj2\AppData\Local\Temp` and the walker exits at `C:\` without ever encountering `src/`. The fragility is invisible on this machine and only surfaces if the test ever runs in CI / on a contributor box with a less hermetic file-system layout.
**Recommendation:** Isolate the walker from any ambient ancestor by either (a) constructing an `isolatedRoot` directly under a drive root and pointing the walker at a chain entirely under it (e.g. create `<isolatedRoot>\level1\level2\level3` and start the walk at `level3`, then assert the throw — the walker stops at the drive root regardless of what is on it), (b) refactoring `ResolveRepositoryRoot` to accept an injectable `stopBoundary` parameter for tests and pass `isolatedRoot` as the boundary, or (c) replacing the `Assert.Throws` shape with an explicit upward-walk check that the test owns. Option (a) is the smallest change: prepend a sentinel — e.g. create a dummy `<isolatedRoot>\sentinel-no-markers` and assert nothing about Temp ancestors — and pass the test only when the walker reaches that sentinel without finding a marker. The current shape is acceptable on the documented dev box but should not be the sole regression coverage for IntegrationTests-022.
**Resolution:** Resolved 2026-05-24 — Took option (b) (inject a stop-boundary) because option (a) does not actually solve the leak: a sentinel chain under `Path.GetTempPath()` still leaves the walker free to ascend past it into Temp / AppData / Users / C:\, so any ambient ancestor with `src/` + `.git`/`.sln`/`.slnx` still wins. Added an optional `stopBoundary` parameter to `IntegrationTestEnvironment.ResolveRepositoryRoot(string startDirectory, string? stopBoundary = null)`. When supplied, the walker checks the boundary for markers and then stops, refusing to ascend past it; production callers (the `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` resolution path) continue to pass `null` so the walk to drive-root behavior is unchanged. Updated both existing tests (`ResolveRepositoryRoot_AcceptsGitWorktreeFile` and `ResolveRepositoryRoot_NoMarkers_ThrowsInvalidOperationExceptionNamingStartAndMarkers`) to pass their owned temp directory as the boundary, sealing the walker inside a chain the test fully controls. Added a new regression test `ResolveRepositoryRoot_StopBoundary_IsolatesWalkerFromAmbientAncestorMarkers` that deliberately constructs an outer marker-bearing ancestor (`outerRoot/src` + `outerRoot/.git`), an inner boundary, and an isolated start beneath the boundary; first asserts that without the boundary the walker leaks up to `outerRoot` (the precise IntegrationTests-025 failure mode), then asserts that *with* the boundary the same call throws — proving the boundary is the load-bearing isolation. TDD red/green confirmed: the new regression test fails against the pre-fix walker (`Assert.Throws() Failure: No exception was thrown`) and passes once the boundary handling is restored. 
Re-ran the full `IntegrationTestEnvironmentTests` slice with `TMP` / `TEMP` redirected under a deliberately constructed `<temp>\fake-repo-ancestor` directory carrying `src/` and a `.git` file — the original flake repro from the finding — and confirmed all 5 tests pass (the same redirection produced `Assert.Throws() Failure` on the pre-fix code). Build: 0 warnings / 0 errors.
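The stop-boundary idea is independent of language, so the sketch below shows it in Rust rather than the C# of the real `ResolveRepositoryRoot`. The marker check is injected as a closure to keep the example free of file-system access; `resolve_root`, its parameters, and the paths are all hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Walk upward from `start` until `is_root` accepts a directory. When
/// `stop_boundary` is given, the boundary itself is still checked for
/// markers, but the walk never ascends past it.
fn resolve_root(
    start: &Path,
    stop_boundary: Option<&Path>,
    is_root: impl Fn(&Path) -> bool,
) -> Result<PathBuf, String> {
    let mut current = Some(start);
    while let Some(dir) = current {
        if is_root(dir) {
            return Ok(dir.to_path_buf());
        }
        if stop_boundary == Some(dir) {
            break;
        }
        current = dir.parent();
    }
    Err(format!(
        "no repository root found walking up from {}",
        start.display()
    ))
}

fn main() {
    // An ambient ancestor that happens to look like a repository root.
    let outer = Path::new("/outer");
    let boundary = Path::new("/outer/boundary");
    let start = Path::new("/outer/boundary/nested");
    let is_root = |dir: &Path| dir == outer;

    // Without a boundary the walk leaks up to the ambient ancestor.
    assert_eq!(resolve_root(start, None, &is_root), Ok(PathBuf::from("/outer")));
    // With the boundary the same walk fails, as the isolated test expects.
    assert!(resolve_root(start, Some(boundary), &is_root).is_err());
    println!("boundary isolates the walk");
}
```

The two assertions in `main` mirror the two halves of the new regression test: first show the leak without the boundary, then show that the boundary is what prevents it.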
+50 -51
@@ -10,15 +10,15 @@ Each module's `findings.md` is the source of truth; this file is generated from
| Module | Reviewer | Date | Commit | Status | Open | Total |
|---|---|---|---|---|---|---|
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 4 | 21 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 6 | 27 |
| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 5 | 36 |
| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 5 | 26 |
| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 8 | 29 |
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 21 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 27 |
| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 36 |
| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 26 |
| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 29 |
| [Contracts](Contracts/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 17 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 1 | 25 |
| [Server](Server/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 7 | 50 |
| [Tests](Tests/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 5 | 31 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 25 |
| [Server](Server/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 50 |
| [Tests](Tests/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 31 |
| [Worker](Worker/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 25 |
| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 30 |
@@ -26,49 +26,7 @@ Each module's `findings.md` is the source of truth; this file is generated from
Findings with status `Open` or `In Progress`, ordered by severity.
| ID | Severity | Category | Location | Description |
|---|---|---|---|---|
| Client.Java-032 | High | Documentation & comments | `clients/java/README.md:182-183` | Commit `8738735` ("clients: document StreamAlarms + AcknowledgeAlarm in each README") added two new gradle invocations to the CLI Usage block: ``` gradle :zb-mom-ww-mxgateway-cli:run --args="stream-alarms --endpoint localhost:5000 --api-ke… |
| Client.Python-022 | High | Documentation & comments | `clients/python/README.md:201-202`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:389-420` | The README CLI examples added by commit `8738735` for the new alarm subcommands cite flags the CLI does not accept: ``` mxgw-py stream-alarms --session-id <id> --max-messages 1 --json mxgw-py acknowledge-alarm --session-id <id> --alarm-ref… |
| Client.Rust-029 | High | mxaccessgw conventions | `clients/rust/src/options.rs:98,143`; `clients/rust/src/galaxy.rs:282`; `clients/rust/src/session.rs:664-671` | `cargo clippy --workspace --all-targets -- -D warnings` fails at HEAD `42b0037` with three errors that the prior d692232 reviewer noted as "out of scope for Client.Rust-021" but did not open as a tracked finding: ``` error: missing documen… |
| Client.Dotnet-018 | Medium | Documentation & comments | `clients/dotnet/README.md:137-138` | The README example block for the two new alarm CLI subcommands shipped in commit `11cc671` shows: ``` mxgw-dotnet stream-alarms --session-id <id> --max-messages 1 --json mxgw-dotnet acknowledge-alarm --session-id <id> --alarm-reference "\\… |
| Client.Go-022 | Medium | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:398-412,417-519` | Commit `8aaab82` ("Go client: port bulk read/write SDK methods + CLI subcommands") re-introduces every symptom that Client.Go-015 documented and was marked Resolved against an earlier commit: - `runWriteBulkVariant(ctx, args, stdout, stder… |
| Client.Go-023 | Medium | Concurrency & thread safety | `clients/go/cmd/mxgw-go/main.go:604-606,616-632` | `runBenchReadBulk`'s warm-up and steady-state loops are wall-clock-only again: ```go warmupDeadline := time.Now().Add(time.Duration(*warmupSeconds) * time.Second) timeout := time.Duration(*timeoutMs) * time.Millisecond for time.Now().Befor… |
| Client.Java-033 | Medium | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1078-1098` | `StreamAlarmsCommand.call()` allocates a bounded `ArrayBlockingQueue<Object>(1024)` and the gRPC observer publishes each `AlarmFeedMessage` via `queue.offer(value)`: ``` BlockingQueue<Object> queue = new ArrayBlockingQueue<>(1024); … @Over… |
| Client.Java-034 | Medium | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:182-198` | `BatchCommand.call()` reads one CLI invocation per stdin line and tokenises with: ``` String[] args = line.trim().split("\\s+"); … int exitCode = cmd.execute(args); ``` `split("\\s+")` does no shell-quoting parsing — it just splits on whit… |
| Client.Python-023 | Medium | Security | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:901-906` | Client.Python-013 (severity Medium, Security) was marked |
| Client.Python-024 | Medium | Code organization & conventions | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:13,48-119` | The new `batch` subcommand (commit `71d2c39`) implements the cross-language batch protocol by importing `click.testing.CliRunner` into production code and calling `runner.invoke(main, args, catch_exceptions=True)` in a `for raw_line in sys… |
| Client.Rust-022 | Medium | Correctness & logic bugs | `clients/rust/src/session.rs:369-391,403-420,427-444,452-469,476-493,631-696,706-724` | Commit `3251069` re-introduced the bulk read/write SDK methods (`read_bulk`, `write_bulk`, `write2_bulk`, `write_secured_bulk`, `write_secured2_bulk`) on `Session`. Each method falls back to `Vec::new()` when an OK reply does not carry the… |
| Client.Rust-024 | Medium | Testing coverage | `clients/rust/tests/client_behavior.rs:405-415`; `clients/rust/src/session.rs:369-493`; `clients/rust/src/client.rs:265-291`; `clients/rust/crates/mxgw-cli/src/main.rs:1310-1505` | The diff under review adds substantial SDK and CLI surface with no positive-path coverage: 1. **`GatewayClient::stream_alarms`** (client.rs:280-291) has no test. The fake gateway's `stream_alarms` impl in `tests/client_behavior.rs:408-415`… |
| Server-044 | Medium | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-254` | `KillWorkerAsync` is the mirror of `CloseSessionCoreAsync` for the new admin-only Kill flow, but its catch path leaks the `mxgateway.sessions.open` gauge — the exact bug that Server-006 closed for `OpenSessionAsync`. The happy path increme… |
| Tests-027 | Medium | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:199-240`, `src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:8,73,246-251` | The review brief explicitly flagged `MxAccessGatewayServiceTests.StreamEvents_WhenEventIsWritten_RecordsSendDuration` as a known flake that "passed solo on rerun". The root cause is the `MeterListener` subscribes by `instrument.Meter.Name… |
| Client.Dotnet-019 | Low | Correctness & logic bugs | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:745` | Client.Dotnet-005 / 010 documented (and recorded as resolved) the silent register-handle fallback pattern `reply.Register?.ServerHandle ?? reply.ReturnValue.Int32Value`, where a successful protocol+MX-status reply missing its typed `regist… |
| Client.Dotnet-020 | Low | Error handling & resilience | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:792-810`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:774-780` | `BenchReadBulkAsync`'s steady-state `while (DateTime.UtcNow < steadyDeadline)` loop wraps each `client.InvokeAsync(...)` in a bare `catch`: ```csharp try { reply = await client.InvokeAsync( CreateCommandRequest(sessionId, readBulkMxCommand… |
| Client.Dotnet-021 | Low | Correctness & logic bugs | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:487`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:715` | Both new bulk-read CLI handlers cast a signed `--timeout-ms` argument to `uint` without bounds checking: ```csharp // ReadBulkAsync (line 487) TimeoutMs = (uint)arguments.GetInt32("timeout-ms", 0), // BenchReadBulkAsync (line 715) uint tim… |
| Client.Go-024 | Low | Testing coverage | `clients/go/mxgateway/session.go:395-525`, `clients/go/mxgateway/alarms.go:65-76` | The five new bulk SDK methods on `Session` and the new `Client.StreamAlarms` method have **no unit tests** in `clients/go/mxgateway/`: - `Session.WriteBulk` (`session.go:395`) - `Session.Write2Bulk` (`session.go:418`) - `Session.WriteSecur… |
| Client.Go-025 | Low | Correctness & logic bugs | `clients/go/mxgateway/session.go:395-485,495-525` | The five new bulk methods (`WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`, `ReadBulk`) each guard with `if entries == nil { return error }` and an upper-bound `ensureBulkSize` check, but accept a non-nil empty slice (e.… |
| Client.Go-026 | Low | Error handling & resilience | `clients/go/cmd/mxgw-go/main.go:1196-1222` | `runBatch` reads command lines with a default `bufio.Scanner`: ```go scanner := bufio.NewScanner(in) for scanner.Scan() { ... } return scanner.Err() ``` The default `bufio.Scanner` token size is 64 KiB (`bufio.MaxScanTokenSize`). One long… |
| Client.Go-027 | Low | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:1195-1206` | `runBatch`'s doc-comment says the loop "never terminates on command error; only stdin EOF (or an empty line) ends the session", and the implementation matches: ```go for scanner.Scan() { line := scanner.Text() if line == "" { break } ... }… |
| Client.Java-035 | Low | Testing coverage | `clients/java/zb-mom-ww-mxgateway-client/src/test/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientSessionTests.java` | Commit `8a0c59d` added `MxGatewayClient.streamAlarms(StreamAlarmsRequest, StreamObserver<AlarmFeedMessage>)` and a new public `MxGatewayAlarmFeedSubscription` class. No library-side test exercises either: a grep for `streamAlarms` across `… |
| Client.Java-036 | Low | Code organization & conventions | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayAlarmFeedSubscription.java`, `MxGatewayEventSubscription.java`, `MxGatewayActiveAlarmsSubscription.java`, `DeployEventSubscription.java` | `MxGatewayAlarmFeedSubscription` is a structural near-copy of `MxGatewayEventSubscription` — same `AtomicReference<ClientCallStreamObserver<…>>` + `AtomicBoolean cancelled` field shape, the same `wrap(observer)` returning a `ClientResponse… |
| Client.Python-025 | Low | Testing coverage | `clients/python/tests/test_cli.py`, `clients/python/src/zb_mom_ww_mxgateway/{client.py,session.py}`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py` | Commits `6add4b4` and `828e3e6` added five new SDK methods (`Session.read_bulk`, `Session.write_bulk`, `Session.write2_bulk`, `Session.write_secured_bulk`, `Session.write_secured2_bulk`), `GatewayClient.stream_alarms`, the helper `_canceli… |
| Client.Python-026 | Low | Correctness & logic bugs | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:674-738` | Two minor quality issues in the new `_bench_read_bulk` body (commit `6add4b4`): 1. `import time` is done inside the function body (line 676) rather than at module top. `PythonStyleGuide.md` does not state this explicitly, but every other h… |
| Client.Rust-023 | Low | mxaccessgw conventions | `clients/rust/crates/mxgw-cli/src/main.rs:835,872,1476` | Three CLI subcommands added since `d692232` hard-code their `client_correlation_id`: ```rust client_correlation_id: "rust-cli-stream-alarms".to_owned(), // line 835 client_correlation_id: "rust-cli-acknowledge-alarm".to_owned(), // line 87… |
| Client.Rust-025 | Low | Design-document adherence | `clients/rust/RustClientDesign.md:92-106,142-153,164-171` | CLAUDE.md mandates that "When public APIs, contracts, configuration, build steps, security behavior, event shapes, value conversion, status mapping, or lifecycle rules change, the affected docs ... must change in the same commit." The diff… |
| Client.Rust-026 | Low | Performance & resource management | `clients/rust/crates/mxgw-cli/src/main.rs:1402-1406,1419-1423` | `run_bench_read_bulk` clones the `tags: Vec<String>` on every iteration of both the warmup loop and the steady-state measurement loop: ```rust while Instant::now() < warmup_deadline { let _ = session .read_bulk(server_handle, tags.clone(),… |
| Client.Rust-027 | Low | Documentation & comments | `clients/rust/.cargo/config.toml:1-9` | The new build-config file added by `71d2c39` carries this leading comment: ``` [target.'cfg(windows)'] # Bump the default 1 MB Windows stack to 8 MB. clap-derive builds a large # Command enum in this CLI (one variant per subcommand, each c… |
| Client.Rust-028 | Low | mxaccessgw conventions | `clients/rust/crates/mxgw-cli/src/main.rs:1126-1166` | `run_batch` reads commands from stdin with the blocking `std::io::Stdin::lock().lines()` iterator while the surrounding function is `async fn` and the runtime is `#[tokio::main]` (multi-threaded by default). Each `for line in stdin.lock().… |
| IntegrationTests-025 | Low | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.IntegrationTests/IntegrationTestEnvironmentTests.cs:57-84` (`ResolveRepositoryRoot_NoMarkers_ThrowsInvalidOperationExceptionNamingStartAndMarkers`) | The new regression test for IntegrationTests-022 builds an "isolated" start directory under `Path.GetTempPath()` (e.g. `C:\Users\<user>\AppData\Local\Temp\<random>\nested` on Windows) and calls `ResolveRepositoryRoot(isolatedStart)`, asser… |
| Server-045 | Low | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:225,242-245`, `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:837-841` | `KillWorkerAsync` reads `session.State` once into a local `bool wasClosed` (line 225) before calling `session.KillWorker(reason)`. The read is unsynchronized — `State` is a getter that takes `_syncRoot` internally so the read itself is saf… |
| Server-046 | Low | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:286-307` | `ShutdownAsync` was updated to fall back to `KillWorker` when `CloseSessionCoreAsync` throws (lines 294-305) — a useful resilience improvement on its own. But the fallback's bookkeeping is wrong: `session.KillWorker(GatewayShutdownReason)`… |
| Server-047 | Low | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/ApiKeysPage.razor:324-334`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/SessionsPage.razor:171-195`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/SessionDetailsPage.razor:231-255` | The shared `ConfirmDialog.razor` (added in `0e56b5b` / `24cc5fd`) is wired by three pages, but the pages handle `PendingAction` cleanup inconsistently: - `ApiKeysPage.ConfirmPendingAsync` captures the action, sets `PendingAction = null` sy… |
| Server-048 | Low | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:463-498` | The two new `KillWorkerAsync_*` tests cover the happy path (`KillWorkerAsync_KillsWorkerAndRemovesSession`) and the missing-session error (`KillWorkerAsync_WhenSessionMissing_ThrowsSessionNotFound`). Three behaviorally distinct cases are m… |
| Server-049 | Low | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/IDashboardSessionAdminService.cs:5-18`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:8-25` | `IDashboardSessionAdminService` declares three members — `CanManage`, `CloseSessionAsync`, `KillWorkerAsync` — none of which carry XML documentation. `DashboardSessionAdminService.CanManage` and the two operation methods are also undocumen… |
| Server-050 | Low | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:42-75,92-125` | `CloseSessionAsync` and `KillWorkerAsync` catch only `SessionManagerException` (the `SessionNotFound` filter, then a general `SessionManagerException` catch). Anything else propagates raw to Blazor's error boundary. The propagation paths e… |
| Tests-028 | Low | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:466-496,802-807`, `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-253` | The new `KillWorkerAsync_KillsWorkerAndRemovesSession` (line 466) and `KillWorkerAsync_WhenSessionMissing_ThrowsSessionNotFound` (line 486) pin the new kill-path entry, but they do not pin the `reason` argument propagating through the chai… |
| Tests-029 | Low | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSessionAdminServiceTests.cs:61-106,139-222`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:77-125` | The new `DashboardSessionAdminServiceTests` covers the happy path and the viewer-denial path for both `CloseSessionAsync` and `KillWorkerAsync`, plus `CloseSessionAsync_WhenSessionMissing_ReportsFriendlyError` for the close-side `SessionNo… |
| Tests-030 | Low | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardApiKeyManagementServiceTests.cs:115-163`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardApiKeyManagementService.cs:146-177` | The three new `DeleteAsync_*` fixtures cover unauthorised user, success path with audit, and store-refuses-with-friendly-error. They do not exercise two production behaviours: (1) `DeleteAsync_WhenStoreRefuses_ReportsFriendlyError` (line 1… |
| Tests-031 | Low | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSnapshotPublisherTests.cs:22-61` | `ExecuteAsync_WhenSnapshotServiceThrowsOnce_ReconnectsAfterDelay` records `startedAt = DateTimeOffset.UtcNow` *before* calling `publisher.StartAsync(...)`, then asserts `secondSubscribeAt - startedAt >= reconnectDelay - 10ms` (line 59). Th… |
_No pending findings._
## Closed findings
@@ -79,12 +37,15 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-001 | Critical | Resolved | Security | `src/MxGateway.Server/GatewayApplication.cs:147-149`, `src/MxGateway.Server/Dashboard/DashboardEndpointRouteBuilderExtensions.cs:55-58`, `src/MxGateway.Server/Dashboard/Components/Routes.razor:1-15` |
| Client.Go-001 | High | Resolved | Correctness & logic bugs | `clients/go/mxgateway/errors.go:88-93`, `clients/go/mxgateway/errors.go:117-128` |
| Client.Java-013 | High | Resolved | Testing coverage | `clients/java/mxgateway-cli/src/test/java/com/dohertylan/mxgateway/cli/MxGatewayCliTests.java:212-304`, `clients/java/mxgateway-cli/src/main/java/com/dohertylan/mxgateway/cli/MxGatewayCli.java:1214-1244` |
| Client.Java-032 | High | Resolved | Documentation & comments | `clients/java/README.md:182-183` |
| Client.Python-018 | High | Resolved | Code organization & conventions | `clients/python/pyproject.toml:11` |
| Client.Python-022 | High | Resolved | Documentation & comments | `clients/python/README.md:201-202`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:389-420` |
| Client.Rust-001 | High | Resolved | mxaccessgw conventions | `clients/rust/src/options.rs:98,143` |
| Client.Rust-002 | High | Resolved | mxaccessgw conventions | `clients/rust/src/session.rs:522` |
| Client.Rust-003 | High | Resolved | Correctness & logic bugs | `clients/rust/crates/mxgw-cli/src/main.rs:1051` |
| Client.Rust-012 | High | Resolved | mxaccessgw conventions | `clients/rust/src/galaxy.rs:282` |
| Client.Rust-013 | High | Resolved | mxaccessgw conventions | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:414-424` (origin); `clients/rust/src/generated.rs:11-31` (suppression site) |
| Client.Rust-029 | High | Resolved | mxaccessgw conventions | `clients/rust/src/options.rs:98,143`; `clients/rust/src/galaxy.rs:282`; `clients/rust/src/session.rs:664-671` |
| IntegrationTests-001 | High | Resolved | Design-document adherence | `src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs:7`, `src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs` |
| IntegrationTests-002 | High | Resolved | Design-document adherence | `src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:13`, `src/MxGateway.Server/Configuration/LdapOptions.cs:27` |
| Server-003 | High | Resolved | Security | `src/MxGateway.Server/Dashboard/DashboardAuthorizationHandler.cs:39,54-59`, `src/MxGateway.Server/Dashboard/DashboardAuthenticator.cs:236-258` |
@@ -99,8 +60,11 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Dotnet-001 | Medium | Resolved | Error handling & resilience | `clients/dotnet/MxGateway.Client/GrpcMxGatewayClientTransport.cs:190-199`, `clients/dotnet/MxGateway.Client/GrpcGalaxyRepositoryClientTransport.cs:131-140` |
| Client.Dotnet-002 | Medium | Resolved | Testing coverage | `clients/dotnet/MxGateway.Client.Tests/FakeGatewayTransport.cs:145-148`, `clients/dotnet/MxGateway.Client.Tests/MxGatewayClientSessionTests.cs:236-256` |
| Client.Dotnet-003 | Medium | Resolved | Concurrency & thread safety | `clients/dotnet/MxGateway.Client/MxGatewaySession.cs:659-663`, `clients/dotnet/MxGateway.Client/MxGatewayClient.cs:230-240` |
| Client.Dotnet-018 | Medium | Resolved | Documentation & comments | `clients/dotnet/README.md:137-138` |
| Client.Go-002 | Medium | Resolved | Error handling & resilience | `clients/go/mxgateway/session.go:440-516` |
| Client.Go-003 | Medium | Resolved | Correctness & logic bugs | `clients/go/cmd/mxgw-go/main.go:517-532` |
| Client.Go-022 | Medium | Resolved | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:398-412,417-519` |
| Client.Go-023 | Medium | Resolved | Concurrency & thread safety | `clients/go/cmd/mxgw-go/main.go:604-606,616-632` |
| Client.Java-001 | Medium | Resolved | Security | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewaySecrets.java:30-32` |
| Client.Java-002 | Medium | Resolved | Concurrency & thread safety | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxEventStream.java:31,66-92` |
| Client.Java-003 | Medium | Resolved | mxaccessgw conventions | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:119-140` |
@@ -111,15 +75,21 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Java-021 | Medium | Resolved | Concurrency & thread safety | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/DeployEventStream.java:96-135` |
| Client.Java-027 | Medium | Resolved | Documentation & comments | `clients/java/README.md:36,107-175,185,205,220`, `clients/java/JavaClientDesign.md:195-211` |
| Client.Java-028 | Medium | Resolved | Documentation & comments | `clients/java/JavaClientDesign.md:23-27` |
| Client.Java-033 | Medium | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1078-1098` |
| Client.Java-034 | Medium | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:182-198` |
| Client.Python-003 | Medium | Resolved | Error handling & resilience | `clients/python/src/mxgateway/client.py:125-137,155-173` |
| Client.Python-005 | Medium | Resolved | Performance & resource management | `clients/python/src/mxgateway/galaxy.py:117-140` |
| Client.Python-009 | Medium | Resolved | Testing coverage | `clients/python/tests/` |
| Client.Python-013 | Medium | Resolved | Security | `clients/python/src/mxgateway_cli/commands.py:757-762` |
| Client.Python-023 | Medium | Resolved | Security | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:901-906` |
| Client.Python-024 | Medium | Resolved | Code organization & conventions | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:13,48-119` |
| Client.Rust-005 | Medium | Resolved | Correctness & logic bugs | `clients/rust/src/session.rs:489-520` |
| Client.Rust-006 | Medium | Resolved | Error handling & resilience | `clients/rust/src/session.rs:531-555` |
| Client.Rust-015 | Medium | Resolved | Error handling & resilience | `clients/rust/crates/mxgw-cli/src/main.rs:1053-1070` |
| Client.Rust-016 | Medium | Resolved | Testing coverage | `clients/rust/tests/client_behavior.rs`, `clients/rust/src/session.rs:489-519,654-768` |
| Client.Rust-018 | Medium | Resolved | Error handling & resilience | `clients/rust/crates/mxgw-cli/src/main.rs:1098-1170`; `scripts/bench-read-bulk.ps1:347-365`; siblings: `clients/go/cmd/mxgw-go/main.go:600-648`, `clients/python/src/mxgateway_cli/commands.py:614-662`, `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:685-770`, `clients/java/mxgateway-cli/src/main/java/com/dohertylan/mxgateway/cli/MxGatewayCli.java:855-940` |
| Client.Rust-022 | Medium | Resolved | Correctness & logic bugs | `clients/rust/src/session.rs:369-391,403-420,427-444,452-469,476-493,631-696,706-724` |
| Client.Rust-024 | Medium | Resolved | Testing coverage | `clients/rust/tests/client_behavior.rs:405-415`; `clients/rust/src/session.rs:369-493`; `clients/rust/src/client.rs:265-291`; `clients/rust/crates/mxgw-cli/src/main.rs:1310-1505` |
| Contracts-002 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:384-385`, `:95` |
| Contracts-009 | Medium | Resolved | Design-document adherence | `docs/Contracts.md:13-24` |
| IntegrationTests-003 | Medium | Resolved | Correctness & logic bugs | `src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:89-97` |
@@ -142,6 +112,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-032 | Medium | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerClient.cs:510-569` (gateway-side `_events` channel); `src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerClientOptions.cs:45-53` (`EventChannelFullModeTimeout`) |
| Server-033 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs:265-323` (`TryRestoreFromDiskAsync`), `:84-99` (`_firstLoad` / `WaitForFirstLoadAsync`); `src/MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs:141-163` (`WaitForCacheBootstrap`) |
| Server-038 | Medium | Resolved | Security | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/EventsHub.cs:23-44` |
| Server-044 | Medium | Resolved | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-254` |
| Tests-003 | Medium | Resolved | Performance & resource management | `src/MxGateway.Tests/Security/Authentication/SqliteAuthStoreTests.cs:170-176`, `src/MxGateway.Tests/Security/Authentication/ApiKeyAdminCliRunnerTests.cs:252-258` |
| Tests-004 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs` |
| Tests-005 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:239-261`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs` |
@@ -150,6 +121,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Tests-016 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Galaxy/GalaxyHierarchyCacheTests.cs:29-41,115-124` |
| Tests-020 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceConstraintTests.cs:275-347`, `src/MxGateway.Server/Grpc/MxAccessGatewayService.cs:803-829` |
| Tests-026 | Medium | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs`, `src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs:123-126` |
| Tests-027 | Medium | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:199-240`, `src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:8,73,246-251` |
| Worker-004 | Medium | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:565-588` |
| Worker-005 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs:205-258` (production alarm poll loop) |
| Worker-006 | Medium | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:117-124`, `src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs:386-491` |
@@ -180,6 +152,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Dotnet-015 | Low | Resolved | Correctness & logic bugs | `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:221-236`, `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:596-1065` |
| Client.Dotnet-016 | Low | Resolved | Concurrency & thread safety | `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:922-976` |
| Client.Dotnet-017 | Low | Resolved | Error handling & resilience | `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:1190-1262` |
| Client.Dotnet-019 | Low | Resolved | Correctness & logic bugs | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:745` |
| Client.Dotnet-020 | Low | Resolved | Error handling & resilience | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:792-810`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:774-780` |
| Client.Dotnet-021 | Low | Resolved | Correctness & logic bugs | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:487`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:715` |
| Client.Go-004 | Low | Resolved | mxaccessgw conventions | `clients/go/mxgateway/alarms_test.go:153-154`, `clients/go/mxgateway/galaxy_test.go:58-59` |
| Client.Go-005 | Low | Resolved | Design-document adherence | `clients/go/mxgateway/client.go:64,68`, `clients/go/mxgateway/galaxy.go:83,87` |
| Client.Go-006 | Low | Resolved | Error handling & resilience | `clients/go/mxgateway/errors.go:9-130` |
@@ -198,6 +173,10 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Go-019 | Low | Resolved | Documentation & comments | `clients/go/cmd/mxgw-go/main.go:710-716`, `clients/go/cmd/mxgw-go/main.go:1204,1213` |
| Client.Go-020 | Low | Resolved | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:753-802`, `clients/go/cmd/mxgw-go/main.go:1199-1275` |
| Client.Go-021 | Low | Resolved | Testing coverage | `clients/go/cmd/mxgw-go/main_test.go`, `clients/go/cmd/mxgw-go/main.go:363-520,522-655` |
| Client.Go-024 | Low | Resolved | Testing coverage | `clients/go/mxgateway/session.go:395-525`, `clients/go/mxgateway/alarms.go:65-76` |
| Client.Go-025 | Low | Resolved | Correctness & logic bugs | `clients/go/mxgateway/session.go:395-485,495-525` |
| Client.Go-026 | Low | Resolved | Error handling & resilience | `clients/go/cmd/mxgw-go/main.go:1196-1222` |
| Client.Go-027 | Low | Resolved | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:1195-1206` |
| Client.Java-006 | Low | Resolved | Performance & resource management | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:323-328`, `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/GalaxyRepositoryClient.java:279-284` |
| Client.Java-007 | Low | Resolved | Testing coverage | `clients/java/mxgateway-client/src/test/java/com/dohertylan/mxgateway/client/` |
| Client.Java-008 | Low | Resolved | Error handling & resilience | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:298-304` |
@@ -218,6 +197,8 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Java-029 | Low | Resolved | Documentation & comments | `clients/java/README.md:208-209` |
| Client.Java-030 | Low | Resolved | Testing coverage | `clients/java/zb-mom-ww-mxgateway-client/src/test/java/com/zb/mom/ww/mxgateway/client/` |
| Client.Java-031 | Low | Resolved | mxaccessgw conventions | `clients/java/README.md:13,17,26` |
| Client.Java-035 | Low | Resolved | Testing coverage | `clients/java/zb-mom-ww-mxgateway-client/src/test/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientSessionTests.java` |
| Client.Java-036 | Low | Resolved | Code organization & conventions | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayAlarmFeedSubscription.java`, `MxGatewayEventSubscription.java`, `MxGatewayActiveAlarmsSubscription.java`, `DeployEventSubscription.java` |
| Client.Python-001 | Low | Resolved | Documentation & comments | `clients/python/pyproject.toml:8,25`, `clients/python/src/mxgateway_cli/commands.py:25` |
| Client.Python-002 | Low | Resolved | Code organization & conventions | `clients/python/src/mxgateway/__init__.py:27` |
| Client.Python-004 | Low | Resolved | Correctness & logic bugs | `clients/python/src/mxgateway_cli/commands.py:386,402-404` |
@@ -234,6 +215,8 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Python-019 | Low | Resolved | Code organization & conventions | `clients/python/pyproject.toml:60-61`, `clients/python/src/mxgateway_cli/` |
| Client.Python-020 | Low | Resolved | Testing coverage | `clients/python/tests/`, `scripts/` |
| Client.Python-021 | Low | Resolved | Documentation & comments | `clients/python/src/mxgateway_cli/commands.py`, `clients/python/README.md:235-258` |
| Client.Python-025 | Low | Resolved | Testing coverage | `clients/python/tests/test_cli.py`, `clients/python/src/zb_mom_ww_mxgateway/{client.py,session.py}`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py` |
| Client.Python-026 | Low | Resolved | Correctness & logic bugs | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:674-738` |
| Client.Rust-004 | Low | Resolved | Documentation & comments | `clients/rust/src/version.rs:7` |
| Client.Rust-007 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md:14-55` |
| Client.Rust-008 | Low | Resolved | Performance & resource management | `clients/rust/src/value.rs:161-261` |
@@ -245,6 +228,11 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Rust-019 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md:96-100` |
| Client.Rust-020 | Low | Resolved | Documentation & comments | `clients/rust/src/session.rs:31-46`; `clients/rust/src/lib.rs:14-39` |
| Client.Rust-021 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md:14-33` |
| Client.Rust-023 | Low | Resolved | mxaccessgw conventions | `clients/rust/crates/mxgw-cli/src/main.rs:835,872,1476` |
| Client.Rust-025 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md:92-106,142-153,164-171` |
| Client.Rust-026 | Low | Resolved | Performance & resource management | `clients/rust/crates/mxgw-cli/src/main.rs:1402-1406,1419-1423` |
| Client.Rust-027 | Low | Resolved | Documentation & comments | `clients/rust/.cargo/config.toml:1-9` |
| Client.Rust-028 | Low | Resolved | mxaccessgw conventions | `clients/rust/crates/mxgw-cli/src/main.rs:1126-1166` |
| Contracts-001 | Low | Resolved | Design-document adherence | `docs/Grpc.md:13` (and `:3`, `:32`, `:39`) |
| Contracts-003 | Low | Won't Fix | Code organization & conventions | `src/MxGateway.Contracts/MxGateway.Contracts.csproj:10` |
| Contracts-004 | Low | Resolved | Documentation & comments | `src/MxGateway.Contracts/GatewayContractInfo.cs:3-6` |
@@ -274,6 +262,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| IntegrationTests-022 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.IntegrationTests/IntegrationTestEnvironment.cs:103-138` (`ResolveRepositoryRoot` / `IsRepositoryRoot`) |
| IntegrationTests-023 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:14-29` |
| IntegrationTests-024 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs` (`NullDashboardEventBroadcaster` private class at end of file) |
| IntegrationTests-025 | Low | Resolved | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.IntegrationTests/IntegrationTestEnvironmentTests.cs:57-84` (`ResolveRepositoryRoot_NoMarkers_ThrowsInvalidOperationExceptionNamingStartAndMarkers`) |
| Server-007 | Low | Resolved | Performance & resource management | `src/MxGateway.Server/Galaxy/GalaxyHierarchyProjector.cs:55-70` |
| Server-008 | Low | Resolved | Performance & resource management | `src/MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs:111-134,160-189` |
| Server-009 | Low | Resolved | Error handling & resilience | `src/MxGateway.Server/Security/Authentication/AuthSqliteConnectionFactory.cs:15-32` |
@@ -302,6 +291,12 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-041 | Low | Resolved | Design-document adherence | `src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs:123-126`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/IDashboardEventBroadcaster.cs:6-10` |
| Server-042 | Low | Resolved | Performance & resource management | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/DashboardSnapshotPublisher.cs:18-41` |
| Server-043 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/HubTokenService.cs:1`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:24` |
| Server-045 | Low | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:225,242-245`, `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:837-841` |
| Server-046 | Low | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:286-307` |
| Server-047 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/ApiKeysPage.razor:324-334`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/SessionsPage.razor:171-195`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/SessionDetailsPage.razor:231-255` |
| Server-048 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:463-498` |
| Server-049 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/IDashboardSessionAdminService.cs:5-18`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:8-25` |
| Server-050 | Low | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:42-75,92-125` |
| Tests-007 | Low | Resolved | Code organization & conventions | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:682`, `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:324`, `src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:460`, `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs:233` |
| Tests-008 | Low | Resolved | mxaccessgw conventions | `src/MxGateway.Tests/Gateway/Sessions/WorkerAlarmRpcDispatcherTests.cs:1-9`, `src/MxGateway.Tests/Gateway/Sessions/NotWiredAlarmRpcDispatcherTests.cs:1-3`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerAlarmAutoSubscribeTests.cs:1` |
| Tests-009 | Low | Resolved | Documentation & comments | `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:36-37,99,365` |
@@ -318,6 +313,10 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Tests-023 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Tests/Gateway/Sessions/SessionWorkerClientFactoryFakeWorkerTests.cs:334-374` |
| Tests-024 | Low | Resolved | Testing coverage | `src/MxGateway.Server/Grpc/MxAccessGatewayService.cs:713-730,784-801,859-876`, `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceConstraintTests.cs` |
| Tests-025 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:285-289`, `src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:417-421` |
| Tests-028 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:466-496,802-807`, `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-253` |
| Tests-029 | Low | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSessionAdminServiceTests.cs:61-106,139-222`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:77-125` |
| Tests-030 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardApiKeyManagementServiceTests.cs:115-163`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardApiKeyManagementService.cs:146-177` |
| Tests-031 | Low | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSnapshotPublisherTests.cs:22-61` |
| Worker-009 | Low | Resolved | Performance & resource management | `src/MxGateway.Worker/Ipc/WorkerFrameReader.cs:31,49`, `src/MxGateway.Worker/Ipc/WorkerFrameWriter.cs:57-58` |
| Worker-010 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Conversion/VariantConverter.cs:204-226` |
| Worker-011 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeClient.cs:169-171` |
+22 -8
@@ -7,7 +7,7 @@
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Status | Re-reviewed |
| Open findings | 7 |
| Open findings | 0 |
## Checklist coverage
@@ -816,7 +816,7 @@ Add a regression test that advises N items without an active `StreamEvents` cons
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-254` |
| Status | Open |
| Status | Resolved |
**Description:** `KillWorkerAsync` is the mirror of `CloseSessionCoreAsync` for the new admin-only Kill flow, but its catch path leaks the `mxgateway.sessions.open` gauge, the exact bug that Server-006 closed for `OpenSessionAsync`. The happy path calls `_metrics.SessionClosed()` once after `session.KillWorker(reason)` returns (line 244), which decrements `_openSessions`. The catch path, however, records `_metrics.Fault(...)`, calls `session.MarkFaulted(...)`, and then awaits `RemoveSessionAsync(session)`, but never calls `_metrics.SessionClosed()` (nor `SessionRemoved()`), so a kill that throws from `session.KillWorker` leaves the open-session gauge permanently incremented. `RemoveSessionAsync` only calls `_metrics.RemoveSessionEvents(...)` and `ReleaseSessionSlot()`; neither touches `_openSessions`. Server-006's fix pattern (track whether the open-counter was recorded, and decrement on the failing path) was applied to `OpenSessionAsync` but not propagated to this new write path.
@@ -824,6 +824,8 @@ In practice the trigger is narrow — `GatewaySession.KillWorker` calls `_worker
**Recommendation:** Mirror Server-006's fix: track whether the session was counted as opened (it always is in `KillWorkerAsync`, because `GetRequiredSession` only succeeds for sessions in the registry, all of which had `SessionOpened()` called), and decrement on the failing path. Concretely, add `_metrics.SessionClosed()` (or `_metrics.SessionRemoved()` if the kill is being treated as an unclean removal) inside the catch block before `RemoveSessionAsync(session)`. The cleanest form is to record `SessionClosed()` once at the top of the method (under a flag), then only re-record if the happy path actually transitions; or to add `_metrics.SessionClosed()` in the catch right after `MarkFaulted`. Add a `SessionManagerTests.KillWorkerAsync_WhenSessionKillThrows_DecrementsOpenSessionGauge` regression test that sets `FakeWorkerClient.KillThrows = true` to exercise the catch.
**Resolution:** 2026-05-24 — Confirmed against source: `KillWorkerAsync`'s catch block called `MarkFaulted`, `Fault`, and `RemoveSessionAsync` but never decremented the open-session gauge, mirroring exactly the Server-006 leak on the open path. The catch path now calls `_metrics.SessionRemoved()` after `MarkFaulted`, so the gauge is restored when `session.KillWorker` (via the new `KillWorkerWithCloseGateAsync` helper) throws. Combined with the Server-045 fix (the kill path now routes through a new `GatewaySession.KillWorkerWithCloseGateAsync` that takes the per-session `_closeLock`), every session reaching `KillWorkerAsync` had `SessionOpened()` recorded and the catch correctly decrements it. Regression test in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs`: `KillWorkerAsync_WhenSessionKillThrows_DecrementsOpenSessionGauge` (uses a new `FakeWorkerClient.KillException` flag to force `_workerClient.Kill` to throw and asserts the open-session gauge returns to 0 after the kill faults). Confirmed to fail before the fix and pass after.
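The accounting rule behind this fix can be illustrated with a minimal sketch. `SketchMetrics` and `KillPathSketch` are hypothetical stand-ins, not the gateway's `GatewayMetrics` or `SessionManager`; the zero-floor guard on `SessionRemoved` and the rethrow in the catch are assumptions made for illustration only.

```csharp
using System;

// Hypothetical stand-in for the gateway's metrics type; only the members
// this sketch needs are modelled.
public sealed class SketchMetrics
{
    public int OpenSessions { get; private set; }
    public int SessionsClosed { get; private set; }

    public void SessionOpened() => OpenSessions++;

    // Clean close: gauge down, closed counter up.
    public void SessionClosed()
    {
        if (OpenSessions > 0) OpenSessions--;
        SessionsClosed++;
    }

    // Unclean removal: gauge down only (the zero floor is an assumption).
    public void SessionRemoved()
    {
        if (OpenSessions > 0) OpenSessions--;
    }
}

public static class KillPathSketch
{
    // `kill` stands in for the worker kill, which may throw.
    public static void Kill(SketchMetrics metrics, Action kill)
    {
        try
        {
            kill();
            metrics.SessionClosed();
        }
        catch (Exception)
        {
            // Without this call the open-session gauge would stay incremented.
            metrics.SessionRemoved();
            throw; // rethrowing is an assumption of this sketch
        }
    }
}
```

Under these assumptions, a session that was counted once by `SessionOpened()` leaves the gauge at zero whether the kill succeeds or throws.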
### Server-045
| Field | Value |
@@ -831,12 +833,14 @@ In practice the trigger is narrow — `GatewaySession.KillWorker` calls `_worker
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:225,242-245`, `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:837-841` |
| Status | Open |
| Status | Resolved |
**Description:** `KillWorkerAsync` reads `session.State` once into a local `bool wasClosed` (line 225) before calling `session.KillWorker(reason)`. The read is unsynchronized — `State` is a getter that takes `_syncRoot` internally so the read itself is safe, but there is no lock spanning "read state, call KillWorker, conditionally record metric." Two concurrent `KillWorkerAsync` calls on the same session (e.g. one operator clicking Kill on the Sessions page and another clicking Kill on the Session Details page within the same render tick) can both observe `wasClosed = false`, then both call `session.KillWorker(...)` (the second is effectively a no-op because `TransitionTo` refuses to overwrite `Closed`), and both call `_metrics.SessionClosed()` at line 244. The `_openSessions` gauge is bounded at 0 by `GatewayMetrics.SessionClosed`'s `if (_openSessions > 0)` guard, but the `_sessionsClosed` counter (and the `mxgateway.sessions.closed` counter exported by the meter) is double-incremented; `_metrics.Fault` is not used here, so the only mitigation is the SessionsRegistry race — the second call's `GetRequiredSession` could miss if the first already removed the session via `RemoveSessionAsync`, but only if the second arrives after the first's removal completes. The window is small but exists, and the same race exists for "Kill from one tab while the lease-expired sweep is closing the session." `CloseSessionCoreAsync` has the same shape, so this isn't a regression specifically from the kill change — but the new path widens the surface where the issue can fire.
**Recommendation:** Either (a) gate `KillWorkerAsync` on a per-session lock — extending the `_closeLock` pattern that `GatewaySession.CloseAsync` already uses, or introducing a new `_killLock` and accepting that close + kill don't serialize against each other — or (b) accept the metric double-count as harmless and document it on `KillWorkerAsync`'s XML doc. Option (a) is the more defensible long-term fix; option (b) is acceptable for v1 if the metric is purely informational. Adding a test that issues concurrent kills against the same session id and asserts `_sessionsClosed == 1` would pin the chosen behavior either way.
**Resolution:** 2026-05-24 — Took recommended option (a). Added `GatewaySession.KillWorkerWithCloseGateAsync(reason, ct)` that acquires the per-session `_closeLock`, reads `_state` under `_syncRoot`, calls `_workerClient.Kill(reason)`, then `TransitionTo(Closed)`, and returns the wasClosed observation. `SessionManager.KillWorkerAsync` now invokes that helper instead of reading `State` and calling `KillWorker` separately. Concurrent kill (and concurrent close+kill) callers now serialize on `_closeLock`, so the first caller observes `wasClosed=false` and the second observes `wasClosed=true`, eliminating the double-increment of `mxgateway.sessions.closed`. Regression test in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs`: `KillWorkerAsync_ConcurrentCallsOnSameSession_CountClosedExactlyOnce` (issues two `KillWorkerAsync` calls on the same session id concurrently, accepts `SessionNotFound` on whichever loses the race after `RemoveSessionAsync`, and asserts `SessionsClosed == 1` and `OpenSessions == 0`).
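A minimal sketch of the close-gate idea follows. `GatedSessionSketch` is a hypothetical class, not `GatewaySession`; the field names `_closeLock` and `_syncRoot` are borrowed from the finding, and the worker kill is reduced to a counter increment.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a session with a per-session close gate.
public sealed class GatedSessionSketch : IDisposable
{
    private readonly SemaphoreSlim _closeLock = new(1, 1);
    private readonly object _syncRoot = new();
    private bool _closed;

    public int KillCalls { get; private set; }

    // Returns true when the session was already closed before this call.
    public async Task<bool> KillWithCloseGateAsync(CancellationToken ct = default)
    {
        await _closeLock.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            bool wasClosed;
            lock (_syncRoot) { wasClosed = _closed; }
            KillCalls++; // stands in for the worker kill; serialized by _closeLock
            lock (_syncRoot) { _closed = true; }
            return wasClosed;
        }
        finally
        {
            _closeLock.Release();
        }
    }

    public void Dispose() => _closeLock.Dispose();
}

public static class ConcurrentKillSketch
{
    // Runs two kills concurrently and returns how many observed wasClosed == false,
    // i.e. how many callers would be entitled to record the close metric.
    public static async Task<int> CountClosingCallsAsync()
    {
        using var session = new GatedSessionSketch();
        bool[] results = await Task.WhenAll(
            Task.Run(() => session.KillWithCloseGateAsync()),
            Task.Run(() => session.KillWithCloseGateAsync()));

        int closing = 0;
        foreach (bool wasClosed in results)
        {
            if (!wasClosed) closing++;
        }
        return closing;
    }
}
```

Because the state read, the kill, and the transition all happen while the semaphore is held, only the first caller can observe `wasClosed == false`, which is the property the concurrent regression test pins.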
### Server-046
| Field | Value |
@@ -844,12 +848,14 @@ In practice the trigger is narrow — `GatewaySession.KillWorker` calls `_worker
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:286-307` |
| Status | Open |
| Status | Resolved |
**Description:** `ShutdownAsync` was updated to fall back to `KillWorker` when `CloseSessionCoreAsync` throws (lines 294-305) — a useful resilience improvement on its own. But the fallback's bookkeeping is wrong: `session.KillWorker(GatewayShutdownReason)` is called and `RemoveSessionAsync(session)` is awaited, but `_metrics.SessionClosed()` is never invoked, so for every session whose graceful close throws, the `mxgateway.sessions.open` gauge stays incremented after shutdown completes. Worse, `CloseSessionCoreAsync`'s `SessionCloseStartedException` catch (line 330) already records `_metrics.SessionRemoved()` (line 334-336) before re-throwing — so for that specific exception type, the gauge is decremented inside the inner catch, then the outer fallback runs and does not double-decrement (good), but `_metrics.SessionClosed()` is never called, so the `_sessionsClosed` counter under-counts by one. For any other exception (the more common case), neither inner catch records anything, so both `_sessionsClosed` and `_openSessions` end up wrong: gauge is left high, counter is left low.
**Recommendation:** Inside the `ShutdownAsync` fallback (after the `KillWorker` call but before/inside the `RemoveSessionAsync`), call `_metrics.SessionClosed()` unless the inner catch already recorded the close. The simplest shape is to propagate a `wasClosed` flag out of `CloseSessionCoreAsync` (or replace the fallback's manual choreography with a single call into `KillWorkerAsync(...)`, which has the right metric path once Server-044 is fixed). The latter is the cleanest — `ShutdownAsync` becomes "try graceful, fall back to `KillWorkerAsync`," and there's exactly one accounting path for each session. Add a `SessionManagerTests.ShutdownAsync_WhenCloseThrows_StillDecrementsOpenSessionGauge` test using a session whose `CloseAsync` throws (e.g. a `BlockingShutdownWorkerClient` configured to throw on `ShutdownAsync`).
**Resolution:** 2026-05-24 — Two coordinated changes: (1) `CloseSessionCoreAsync`'s `SessionCloseStartedException` catch now calls `_metrics.SessionClosed()` (decrements the open-session gauge AND increments the closed counter) instead of `_metrics.SessionRemoved()` (gauge only). A close that ran far enough to attempt the worker shutdown but failed is still a closed session for accounting purposes — the session is removed from the registry and disposed below, so the counter must reflect that. (2) `ShutdownAsync`'s outer fallback now routes the kill through `KillWorkerAsync` (which has the correct metric path post-Server-044) rather than manually calling `session.KillWorker` + `RemoveSessionAsync`. In practice the inner catch already removes the session so the outer fallback is defensive — but routing both paths through the same accounting eliminates the inconsistency the finding called out. The pre-existing `CloseSessionAsync_WhenWorkerShutdownFails_RemovesSessionAndReleasesSlot` test was updated to assert the new (correct) `SessionsClosed == 1` value, with a comment back-referencing Server-046. New regression test in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs`: `ShutdownAsync_WhenSessionCloseThrows_StillDecrementsOpenSessionGaugeAndIncrementsClosedCounter` (uses a `FakeWorkerClient.ShutdownException` to force the graceful close to throw, then asserts both the open-session gauge drops to 0 and the closed counter increments to 1). Confirmed to fail before the fix and pass after.
### Server-047
| Field | Value |
@@ -857,7 +863,7 @@ In practice the trigger is narrow — `GatewaySession.KillWorker` calls `_worker
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/ApiKeysPage.razor:324-334`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/SessionsPage.razor:171-195`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/SessionDetailsPage.razor:231-255` |
| Status | Open |
| Status | Resolved |
**Description:** The shared `ConfirmDialog.razor` (added in `0e56b5b` / `24cc5fd`) is wired by three pages, but the pages handle `PendingAction` cleanup inconsistently:
@@ -868,6 +874,8 @@ The user-visible difference: rotating/revoking/deleting a key vs closing/killing
**Recommendation:** Align `ApiKeysPage.ConfirmPendingAsync` with the sessions pages: hold `PendingAction`, set `IsBusy = true`, run the action, then clear `PendingAction` in the `finally`. The current ApiKeysPage shape was inherited from before the dialog existed (when the confirmation was a `confirm()` JS call); the dialog component change can flatten the difference now. As a smaller alternative, document the divergence on the component's XML doc — but the shared component should ideally be used consistently.
**Resolution:** 2026-05-24 — Took the recommended alignment. `ApiKeysPage.ConfirmPendingAsync` (`src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/ApiKeysPage.razor`) now holds `PendingAction` for the duration of the awaited action (so the shared `ConfirmDialog` renders its `IsBusy` in-flight state on the dialog itself, matching the sessions pages) and clears it in `finally` regardless of outcome. The action is captured up front so a clear in `finally` works even when the action throws. `RunManagementActionAsync` continues to drive `IsBusy = true` inside its own `try/finally`, so the dialog now correctly disables Confirm/Cancel while the awaited service call runs. Pure UX-consistency change; no new automated test (no bUnit harness in the test project — same precedent as Server-010).
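The hold-then-clear shape described above might look roughly like the following sketch. `ConfirmFlowSketch` is a hypothetical plain C# class rather than the Razor page, and it folds the `IsBusy` handling (which the resolution attributes to `RunManagementActionAsync`) into one method for brevity.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a page that backs a shared confirm dialog.
public sealed class ConfirmFlowSketch
{
    public Func<Task>? PendingAction { get; set; }
    public bool IsBusy { get; private set; }

    public async Task ConfirmPendingAsync()
    {
        // Capture up front so the finally block can clear the property
        // even when the action throws.
        Func<Task>? action = PendingAction;
        if (action is null) return;

        IsBusy = true;
        try
        {
            // PendingAction stays set here, so a dialog bound to it
            // remains visible in its busy state while the action runs.
            await action();
        }
        finally
        {
            IsBusy = false;
            PendingAction = null;
        }
    }
}
```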
### Server-048
| Field | Value |
@@ -875,7 +883,7 @@ The user-visible difference: rotating/revoking/deleting a key vs closing/killing
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:463-498` |
| Status | Open |
| Status | Resolved |
**Description:** The two new `KillWorkerAsync_*` tests cover the happy path (`KillWorkerAsync_KillsWorkerAndRemovesSession`) and the missing-session error (`KillWorkerAsync_WhenSessionMissing_ThrowsSessionNotFound`). Three behaviorally distinct cases are missing:
@@ -885,6 +893,8 @@ The user-visible difference: rotating/revoking/deleting a key vs closing/killing
**Recommendation:** Add the three tests above. The fakes in `MxGateway.Tests/TestSupport/` already cover most of the moving parts; `FakeWorkerClient` needs a single `ThrowOnKill` flag (or the existing `KillThrowing` if any).
**Resolution:** 2026-05-24 — Closed by the regression tests added for Server-044 and Server-045 per the prompt's direction: case (1) is covered by `KillWorkerAsync_WhenSessionKillThrows_DecrementsOpenSessionGauge` (uses the new `FakeWorkerClient.KillException` flag); case (3) is covered by `KillWorkerAsync_ConcurrentCallsOnSameSession_CountClosedExactlyOnce`. Case (2) (`wasClosed=true` short-circuit) is implicitly exercised by the concurrent test — once the kill path serializes on the per-session close lock (Server-045 fix), the second kill that wins the registry race observes `wasClosed=true` and skips the counter increment, which is what the test pins (`SessionsClosed == 1`, not 2). The dedicated `KillWorkerAsync_WhenSessionAlreadyClosed_DoesNotReincrementClosedCounter` test was drafted but removed: closing a session disposes it (Server-016's `_closeLock.Dispose()`), so re-issuing a kill against a previously-closed-and-disposed session always fails on the disposed semaphore, which is realistic for production but not a useful unit-test shape. No new test file; the regression coverage already lives in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs`.
### Server-049
| Field | Value |
@@ -892,12 +902,14 @@ The user-visible difference: rotating/revoking/deleting a key vs closing/killing
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/IDashboardSessionAdminService.cs:5-18`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:8-25` |
| Status | Open |
| Status | Resolved |
**Description:** `IDashboardSessionAdminService` declares three members — `CanManage`, `CloseSessionAsync`, `KillWorkerAsync` — none of which carry XML documentation. `DashboardSessionAdminService.CanManage` and the two operation methods are also undocumented (only the constructor parameters are named). The C# style guide requires public-surface XML docs and CLAUDE.md mandates that "docs change with the code." The peer `IDashboardApiKeyManagementService` is also undocumented, so this isn't unique — but the new interface is a fresh public surface being landed in `c5e7479`, and the contract subtleties (CanManage returns false for non-Admin; missing-session paths surface as `Succeeded = false` not as a thrown exception; `KillReason` is fixed at `"dashboard-admin-kill"` and that value reaches the audit log) are exactly what XML docs are for.
**Recommendation:** Add `<summary>` blocks to `IDashboardSessionAdminService.CanManage` (states the Admin-role gate), `CloseSessionAsync` and `KillWorkerAsync` (state that missing sessions return `DashboardSessionAdminResult.Fail(...)` rather than throwing, and that the audit log captures actor + remote IP). Add `<param>` and `<returns>` for the request/response shape. The same sweep can pick up the longstanding gap on `IDashboardApiKeyManagementService` if the team wants — but the new file is the load-bearing one.
**Resolution:** 2026-05-24 — Added `<summary>` + `<remarks>` blocks to every member of `IDashboardSessionAdminService` (`src/ZB.MOM.WW.MxGateway.Server/Dashboard/IDashboardSessionAdminService.cs`): an interface-level `<remarks>` describing the Admin-role gate, audit log shape, and `DashboardSessionAdminResult.Fail` semantics; per-member docs on `CanManage`, `CloseSessionAsync`, and `KillWorkerAsync` calling out the missing-session-returns-Fail contract and the `dashboard-admin-kill` reason constant that reaches the worker-kill audit log and `mxgateway.workers.killed` counter tag. `DashboardSessionAdminService` (`src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs`) picked up a class-level `<summary>` + `<remarks>` describing the per-page audit-log seam, plus `<inheritdoc />` on each public method. Pure documentation change; no test (the behavioral contracts the docs describe are already exercised by the existing `DashboardSessionAdminServiceTests` cases).
### Server-050
| Field | Value |
@@ -905,7 +917,7 @@ The user-visible difference: rotating/revoking/deleting a key vs closing/killing
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:42-75,92-125` |
| Status | Open |
| Status | Resolved |
**Description:** `CloseSessionAsync` and `KillWorkerAsync` catch only `SessionManagerException` (the `SessionNotFound` filter, then a general `SessionManagerException` catch). Anything else propagates raw to Blazor's error boundary. The propagation paths exist:
@@ -915,3 +927,5 @@ The user-visible difference: rotating/revoking/deleting a key vs closing/killing
Today neither call site has a Blazor error boundary, so an unhandled exception lands as a generic Blazor circuit error page. The friendlier-error contract that Server-044's commit message advertises ("audit-logs, friendly errors") is incomplete: only `SessionManagerException` gets a friendly error.
**Recommendation:** Add a general `catch (Exception exception)` after the `SessionManagerException` catch in both `CloseSessionAsync` and `KillWorkerAsync`, log a warning (matching the SessionManagerException pattern), and return `DashboardSessionAdminResult.Fail($"{operation} failed unexpectedly. See the gateway log for details.")`. This makes the result type truly the only output the page sees. Add a regression test using a `ThrowingSessionManager` that throws e.g. `InvalidOperationException` from `KillWorkerAsync` and asserts the service returns a failing result rather than propagating.
**Resolution:** 2026-05-24 — Added the recommended general `catch (Exception)` arms to both `DashboardSessionAdminService.CloseSessionAsync` and `KillWorkerAsync` (`src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs`), placed after the `SessionManagerException` catches and behind a `catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) throw;` so caller cancellation still propagates cleanly. The new catches log a warning with actor + session id and return `DashboardSessionAdminResult.Fail("{Operation} failed unexpectedly for session {SessionId}. See the gateway log for details.")`, mirroring the SessionManagerException pattern. Regression tests in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSessionAdminServiceTests.cs`: `CloseSessionAsync_WhenManagerThrowsUnexpected_ReturnsFriendlyFail` (the `ISessionManager` throws `InvalidOperationException("unexpected")`) and `KillWorkerAsync_WhenManagerThrowsUnexpected_ReturnsFriendlyFail` (throws `IOException("pipe broken")`); both assert the service returns a failing result with a non-blank message rather than propagating. The fake's new `CloseThrowsUnexpected` / `KillThrowsUnexpected` properties hold the configured exception. Confirmed to fail before the fix (raw exception propagated) and pass after.
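The catch ordering described above can be sketched as follows. `FriendlyFailSketch` and `AdminResultSketch` are hypothetical names, not the service's real types; logging and the `SessionManagerException` arms are omitted, and the failure message simply follows the wording quoted in the resolution.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical result type standing in for DashboardSessionAdminResult.
public sealed record AdminResultSketch(bool Succeeded, string? Message);

public static class FriendlyFailSketch
{
    public static async Task<AdminResultSketch> RunAsync(
        string operation,
        string sessionId,
        Func<CancellationToken, Task> action,
        CancellationToken cancellationToken)
    {
        try
        {
            await action(cancellationToken);
            return new AdminResultSketch(true, null);
        }
        catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
        {
            // Caller-initiated cancellation still propagates.
            throw;
        }
        catch (Exception)
        {
            // Everything else becomes a failing result instead of an unhandled exception.
            return new AdminResultSketch(
                false,
                $"{operation} failed unexpectedly for session {sessionId}. See the gateway log for details.");
        }
    }
}
```

The filtered `OperationCanceledException` arm has to come before the general arm; otherwise a caller's cancellation would be reported as an unexpected failure.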
+18 -6
@@ -7,7 +7,7 @@
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Status | Re-reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -488,12 +488,14 @@ The cancellation tests for `WorkerClient` in `WorkerClientTests` *do* exercise t
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:199-240`, `src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:8,73,246-251` |
| Status | Open |
| Status | Resolved |
**Description:** The review brief explicitly flagged `MxAccessGatewayServiceTests.StreamEvents_WhenEventIsWritten_RecordsSendDuration` as a known flake that "passed solo on rerun". The root cause is the `MeterListener` subscribes by `instrument.Meter.Name == GatewayMetrics.MeterName` (a *process-shared* constant `"MxGateway.Server"`), not by the specific `GatewayMetrics` instance constructed in the test. Tests-012 made the xUnit parallelism policy explicit (`parallelizeTestCollections: true`, `maxParallelThreads: -1`), and every other test that builds its own `GatewayMetrics()` and exercises `MxAccessGatewayService.StreamEvents` or `EventStreamService.StreamEventsAsync` (e.g. the new `StreamEventsAsync_*` family added by Tests-026 and Server-041, plus the pre-existing `StreamEventsAsync_YieldsEventsInWorkerOrder` etc.) routes through `GatewayMetrics.RecordEventStreamSend` → the same histogram name `mxgateway.events.stream_send.duration`. When two such tests run concurrently in the same xUnit process, the `MeterListener` in this test sees measurements from *both* meters and `families.Count` grows to >1, breaking `Assert.Equal([MxEventFamily.OnDataChange.ToString()], families)`. Solo reruns pass because no other producer is alive. This is exactly the cross-test mutable-state pattern Tests-012 set the guardrail comment against.
**Recommendation:** Either (a) filter the `MeterListener` callback by the specific `Meter` instance — capture `metrics._meter` (or expose `GatewayMetrics.Meter`) and compare with `ReferenceEquals(instrument.Meter, expectedMeter)` instead of comparing `Meter.Name`; or (b) place this test in a single-threaded `[Collection("GatewayMetrics-Listener")]` so no other `RecordEventStreamSend` producer runs concurrently. Option (a) is preferred because it removes the cross-talk vector permanently and lets the test stay parallelisable.
**Resolution:** 2026-05-24 — Applied option (a). Added an `internal Meter Meter => _meter;` accessor on `GatewayMetrics` (visible to the Tests project via the existing `InternalsVisibleTo`) and changed both the `InstrumentPublished` filter and the `SetMeasurementEventCallback<double>` filter in `StreamEvents_WhenEventIsWritten_RecordsSendDuration` from `instrument.Meter.Name == GatewayMetrics.MeterName` to `ReferenceEquals(instrument.Meter, metrics.Meter)`. Added a companion regression `StreamEvents_RecordSendDurationListener_IgnoresMeasurementsFromOtherMetersWithSameName` that constructs a second `GatewayMetrics`, records an `OnWriteComplete` measurement on it before the test-under-test publishes, and asserts the listener captures only the test-under-test's `OnDataChange` family. Confirmed the regression catches the original `Meter.Name`-only filter (got `["OnWriteComplete", "OnDataChange"]` for `["OnDataChange"]`) by temporarily reverting the filter shape; restored ReferenceEquals after. Suite green 3/3 (512/512); the two Tests-027 tests pass 5/5 solo. The cross-talk vector is permanently closed without giving up parallelism.
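To illustrate why filtering by meter instance closes the cross-talk, here is a minimal, self-contained sketch using `System.Diagnostics.Metrics`. The meter name `Sketch.Meter` and instrument name `sketch.duration` are made up for the example; the real test filters against the `GatewayMetrics.Meter` accessor instead.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class MeterFilterSketch
{
    // Records on two meters that share a name and returns only the
    // measurements that came from the expected meter instance.
    public static List<double> Capture()
    {
        using var expected = new Meter("Sketch.Meter");
        using var other = new Meter("Sketch.Meter"); // same name, different instance

        Histogram<double> expectedHistogram = expected.CreateHistogram<double>("sketch.duration");
        Histogram<double> otherHistogram = other.CreateHistogram<double>("sketch.duration");

        var captured = new List<double>();
        using var listener = new MeterListener();

        // Subscribe by instance identity, not by Meter.Name.
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (ReferenceEquals(instrument.Meter, expected))
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
        {
            if (ReferenceEquals(instrument.Meter, expected))
            {
                captured.Add(value);
            }
        });
        listener.Start();

        otherHistogram.Record(1.0);    // ignored: different Meter instance
        expectedHistogram.Record(2.0); // captured
        return captured;
    }
}
```

A name-based filter (`instrument.Meter.Name == "Sketch.Meter"`) would have captured both measurements, which is the failure mode the finding describes.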
### Tests-028
| Field | Value |
@@ -501,12 +503,14 @@ The cancellation tests for `WorkerClient` in `WorkerClientTests` *do* exercise t
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:466-496,802-807`, `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-253` |
| Status | Open |
| Status | Resolved |
**Description:** The new `KillWorkerAsync_KillsWorkerAndRemovesSession` (line 466) and `KillWorkerAsync_WhenSessionMissing_ThrowsSessionNotFound` (line 486) pin the new kill-path entry, but they do not pin the `reason` argument propagating through the chain. `SessionManager.KillWorkerAsync(sessionId, reason, ct)` validates `reason` with `ArgumentException.ThrowIfNullOrWhiteSpace(reason)` (line 221), calls `session.KillWorker(reason)` (line 229), and logs `reason={Reason}` (line 251); but the `FakeWorkerClient.Kill(string reason)` discards the argument (line 803-807) and the assertion is only `Assert.Equal(1, workerClient.KillCount)`. A regression that (a) hard-coded an internal `"unspecified"` reason between `SessionManager` and `GatewaySession`, (b) swapped to a different overload that dropped the reason, or (c) deleted the `ThrowIfNullOrWhiteSpace` guard would all pass the current tests. The dashboard caller (`DashboardSessionAdminService.KillWorkerAsync`) passes a hard-coded `"dashboard-admin-kill"` reason and the only test that observes it (`KillWorkerAsync_AdminKillsWorker`) asserts `!string.IsNullOrWhiteSpace(LastKillReason)` rather than pinning the value — so the same-class drift is also untested.
**Recommendation:** (1) Capture `LastKillReason` on `FakeWorkerClient.Kill` and assert `KillWorkerAsync_KillsWorkerAndRemovesSession` propagates the test-supplied `"test-kill"` string end-to-end. (2) Add `KillWorkerAsync_WithBlankReason_ThrowsArgumentException` (parameterised over `null`, `""`, `" "`) to pin the `ArgumentException.ThrowIfNullOrWhiteSpace` guard. (3) Tighten `DashboardSessionAdminServiceTests.KillWorkerAsync_AdminKillsWorker` to `Assert.Equal("dashboard-admin-kill", sessionManager.LastKillReason)` so a future reason-string change is a deliberate test update.
**Resolution:** 2026-05-24 — Added `LastKillReason` to `FakeWorkerClient` in `SessionManagerTests.cs` and set it inside `Kill(string reason)`. Tightened `KillWorkerAsync_KillsWorkerAndRemovesSession` to assert `workerClient.LastKillReason == "test-kill"`, pinning the end-to-end propagation from `SessionManager.KillWorkerAsync` → `session.KillWorker(reason)` → `IWorkerClient.Kill(reason)`. Added `KillWorkerAsync_WithBlankReason_ThrowsArgumentException` as a `[Theory]` over `""`, `" "`, `"\t"` plus a separate `KillWorkerAsync_WithNullReason_ThrowsArgumentNullException` fact (xUnit `InlineData` cannot carry `null` for a non-nullable string, and `ArgumentException.ThrowIfNullOrWhiteSpace` throws `ArgumentNullException` for `null`). Both new tests confirm `KillCount == 0` and the session remains registered, proving the guard fires before any lookup or worker call. Tightened `DashboardSessionAdminServiceTests.KillWorkerAsync_AdminKillsWorker` to `Assert.Equal("dashboard-admin-kill", sessionManager.LastKillReason)`. All affected tests pass; suite green.
### Tests-029
| Field | Value |
@@ -514,12 +518,16 @@ The cancellation tests for `WorkerClient` in `WorkerClientTests` *do* exercise t
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSessionAdminServiceTests.cs:61-106,139-222`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:77-125` |
| Status | Open |
| Status | Resolved |
**Description:** The new `DashboardSessionAdminServiceTests` covers the happy path and the viewer-denial path for both `CloseSessionAsync` and `KillWorkerAsync`, plus `CloseSessionAsync_WhenSessionMissing_ReportsFriendlyError` for the close-side `SessionNotFound` catch — but the kill-side error branches are not tested. The product code's `KillWorkerAsync` (lines 111-114) has the same `SessionNotFound` catch returning `"Session {id} was not found."` and (lines 115-124) a generic `SessionManagerException` catch returning `"Kill failed: {message}"`; neither is exercised. The fake's `KillWorkerAsync` (lines 200-209) only succeeds — there is no `KillThrowsNotFound` / `KillThrowsGeneric` configuration option matching the existing `CloseThrowsNotFound`. Symmetrically, `CloseSessionAsync` has the same `IsNullOrWhiteSpace(sessionId)` guard (line 37-40) but no blank-id test even though `KillWorkerAsync_BlankSessionId_ReturnsFailure` exists for the parallel kill guard — a guard-removal regression on close would slip through.
**Recommendation:** Mirror the existing close-side fixtures onto the kill side: add `KillThrowsNotFound` / `KillThrowsGeneric` init-flags to the `FakeSessionManager`, then `KillWorkerAsync_WhenSessionMissing_ReportsFriendlyError`, `KillWorkerAsync_WhenSessionManagerThrows_ReportsKillFailedMessage`, and `CloseSessionAsync_BlankSessionId_ReturnsFailure`. These are mechanical copies of the existing patterns and bring close/kill coverage into symmetry.
**Re-triage note:** The Server batch already added `CloseSessionAsync_WhenManagerThrowsUnexpected_ReturnsFriendlyFail` and `KillWorkerAsync_WhenManagerThrowsUnexpected_ReturnsFriendlyFail` (the Server-050 regressions visible at HEAD lines 125-162 of the test file), so the kill-side `SessionManagerException` general-catch branch and the close-side parallel are both covered there in a generic-exception shape. The only remaining asymmetry was the blank-session-id guard, per the prompt scope.
**Resolution:** 2026-05-24 — Added `CloseSessionAsync_BlankSessionId_ReturnsFailure` to `DashboardSessionAdminServiceTests`. The new test invokes `service.CloseSessionAsync(adminUser, " ", ct)` and asserts `Succeeded == false` and `sessionManager.CloseCount == 0`, pinning the `string.IsNullOrWhiteSpace(sessionId)` guard at `DashboardSessionAdminService.cs:52-55`. This brings close/kill blank-id coverage into symmetry with the existing `KillWorkerAsync_BlankSessionId_ReturnsFailure`. The `KillThrowsNotFound` / `KillThrowsGeneric` extensions from the original recommendation are not needed because the unexpected-throw branches are already covered by the Server-050 regressions noted above. All tests pass; suite green.
### Tests-030
| Field | Value |
@@ -527,12 +535,14 @@ The cancellation tests for `WorkerClient` in `WorkerClientTests` *do* exercise t
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardApiKeyManagementServiceTests.cs:115-163`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardApiKeyManagementService.cs:146-177` |
| Status | Open |
| Status | Resolved |
**Description:** The three new `DeleteAsync_*` fixtures cover unauthorised user, success path with audit, and store-refuses-with-friendly-error. They do not exercise two production behaviours: (1) `DeleteAsync_WhenStoreRefuses_ReportsFriendlyError` (line 151-163) does not construct or inject a `FakeApiKeyAuditStore`, so it never observes that the product code still emits an audit entry with `EventType = "dashboard-delete-key"` and `Details = "not-found-or-active"` on the failure branch (`AppendAuditAsync` runs unconditionally at line 167-172). A regression that placed the `AppendAuditAsync` call inside the `if (deleted)` branch would silently drop the audit trail for refused deletes — a real audit-completeness gap. (2) There is no `DeleteAsync_BlankKeyId_ReturnsFailure` or `DeleteAsync_InvalidKeyId_ReturnsFailure` test, even though `ValidateKeyId(keyId)` (line 156-160) guards on the same conditions as Create/Revoke/Rotate. The `Revoke`/`Rotate` paths have equivalent fixtures (the file's earlier tests cover them); only Delete is missing them.
**Recommendation:** (1) Add a `FakeApiKeyAuditStore` to `DeleteAsync_WhenStoreRefuses_ReportsFriendlyError` and assert it contains exactly one entry with `EventType == "dashboard-delete-key"` and `Details == "not-found-or-active"`. (2) Add `DeleteAsync_BlankKeyId_ReturnsFailure` (parameterised over `null`, `""`, `" "`) and `DeleteAsync_InvalidKeyId_ReturnsFailure` (a keyId with characters the `ValidateKeyId` rules reject) to pin the validation branch end-to-end.
**Resolution:** 2026-05-24 — Renamed `DeleteAsync_WhenStoreRefuses_ReportsFriendlyError` to `DeleteAsync_WhenStoreRefuses_ReportsFriendlyErrorAndAudits` and extended it to inject a `FakeApiKeyAuditStore`; the test now asserts the single audit entry has `EventType == "dashboard-delete-key"`, `KeyId == "operator01"`, and `Details == "not-found-or-active"`. This pins the unconditional-audit invariant at `DashboardApiKeyManagementService.cs:167-172` — a regression moving the `AppendAuditAsync` call inside `if (deleted)` would now fail the test. Added `DeleteAsync_BlankKeyId_ReturnsFailure` as a `[Theory]` over `""`, `" "`, `"\t"` that asserts `Succeeded == false`, `adminStore.DeleteCount == 0`, AND `auditStore.Entries` is empty — pinning that the `ValidateKeyId` guard at line 156-160 fires before any store or audit work. All tests pass; suite green.
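The invariant pinned by this resolution can be illustrated with a small language-neutral sketch, written here in Python rather than the C#/xUnit of the real fixtures. Everything in it is hypothetical and only mirrors the shape described above: `delete_key`, `FakeAdminStore`, `FakeAuditStore`, and the `"deleted"` success detail are illustrative names, not the gateway's API. The two event strings are the ones quoted in the finding.

```python
from dataclasses import dataclass, field


@dataclass
class AuditEntry:
    event_type: str
    key_id: str
    details: str


@dataclass
class FakeAuditStore:
    """Hypothetical fake that records every appended audit entry."""
    entries: list = field(default_factory=list)

    def append(self, entry):
        self.entries.append(entry)


@dataclass
class FakeAdminStore:
    """Hypothetical fake whose delete can be configured to refuse."""
    refuse: bool = False
    delete_count: int = 0

    def delete(self, key_id):
        self.delete_count += 1
        return not self.refuse


def delete_key(key_id, admin_store, audit_store):
    # Validation guard: runs before any store or audit work.
    if not key_id or not key_id.strip():
        return False
    deleted = admin_store.delete(key_id)
    # Audit is appended unconditionally, on success and on refusal alike.
    # "deleted" is a placeholder; the source only names the failure detail.
    details = "deleted" if deleted else "not-found-or-active"
    audit_store.append(AuditEntry("dashboard-delete-key", key_id, details))
    return deleted


# Refused delete: the call fails but exactly one audit entry is still written.
refusing_store, refusal_audit = FakeAdminStore(refuse=True), FakeAuditStore()
refused_result = delete_key("operator01", refusing_store, refusal_audit)

# Blank key id: the guard returns before touching either store.
blank_store, blank_audit = FakeAdminStore(), FakeAuditStore()
blank_result = delete_key("  ", blank_store, blank_audit)
```

If the `audit_store.append` call were moved inside an `if deleted:` branch, the refused-delete case would leave `refusal_audit.entries` empty, which is the regression the tightened test is meant to catch.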
### Tests-031
| Field | Value |
@@ -540,8 +550,10 @@ The cancellation tests for `WorkerClient` in `WorkerClientTests` *do* exercise t
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSnapshotPublisherTests.cs:22-61` |
| Status | Open |
| Status | Resolved |
**Description:** `ExecuteAsync_WhenSnapshotServiceThrowsOnce_ReconnectsAfterDelay` records `startedAt = DateTimeOffset.UtcNow` *before* calling `publisher.StartAsync(...)`, then asserts `secondSubscribeAt - startedAt >= reconnectDelay - 10ms` (line 59). The measured gap is *not* the reconnect delay in isolation — it is `(StartAsync scheduling) + (first WatchSnapshotsAsync setup + Task.Yield) + (throw) + reconnect delay + (second WatchSnapshotsAsync setup)`. On a slow/contended CI agent the first three terms easily dominate (favouring the assertion); but on a fast machine, Windows `Task.Delay(50ms)` rounds up to the next ~15.6 ms tick boundary and may return at ~46-50 ms relative to schedule, while the first three terms can be sub-millisecond — so the gap measurement can land within 1-2 ms of the lower bound, and the 10 ms slack may not absorb a single missed quantum. This is a latent flake of the same flavour as Tests-006 (heartbeat timing) but on a wall-clock dependency the test cannot inject around because `DashboardSnapshotPublisher` uses `Task.Delay(_reconnectDelay)` directly. Tests-006 / Tests-017 moved heartbeat tests onto `ManualTimeProvider`; this test cannot do that without a product change to use a `TimeProvider`-aware delay.
**Recommendation:** (a) The cheap fix: have `ThrowOnceThenYieldSnapshotService` record `_firstThrowAt = DateTimeOffset.UtcNow` immediately before the `throw`, and change the assertion to `secondSubscribeAt - firstThrowAt >= reconnectDelay - 10ms` — the gap then measures only the reconnect delay, eliminating the variable scheduling baseline. (b) The deeper fix: extend `DashboardSnapshotPublisher` to accept an `ITimeProvider`-style delay seam (or a virtual `DelayAsync` hook) so a `ManualTimeProvider` could advance time deterministically. (a) is preferred for now; (b) belongs as a follow-up if more reconnect-loop tests are added.
**Resolution:** 2026-05-24 — Applied option (a). Added `FirstThrowAt` to `ThrowOnceThenYieldSnapshotService` and set it via `FirstThrowAt = DateTimeOffset.UtcNow;` immediately before the first-call `throw`. Removed the pre-`StartAsync` `startedAt` baseline; the assertion now reads `gap = secondSubscribeAt - firstThrowAt` (both timestamps captured inside the fake), and the 10 ms slack absorbs the Windows `Task.Delay` quantum without the variable `StartAsync` / scheduling overhead in the baseline. This is the same flake-isolation pattern Tests-006 / Tests-017 used (measuring only the production delay, not test-side setup). Suite green; the test passes deterministically across repeated runs.
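The measurement change in option (a) is easier to see in isolation. The following is a minimal Python `asyncio` sketch of the same idea, not the C# test itself: `ThrowOnceThenYieldService`, `reconnect_loop`, and `measure_gap` are hypothetical stand-ins, and the 50 ms delay and 10 ms slack are taken from the finding above. Both timestamps are captured inside the fake, so start-up and scheduling overhead never enter the measured gap.

```python
import asyncio
import time

RECONNECT_DELAY = 0.05  # seconds; the 50 ms reconnect delay discussed above
SLACK = 0.010           # 10 ms tolerance for timer granularity


class ThrowOnceThenYieldService:
    """Hypothetical fake: the first subscribe fails, the second succeeds."""

    def __init__(self):
        self.calls = 0
        self.first_throw_at = None
        self.second_subscribe_at = None
        self.resubscribed = asyncio.Event()

    async def watch(self):
        self.calls += 1
        if self.calls == 1:
            # Timestamp taken immediately before the failure is raised.
            self.first_throw_at = time.monotonic()
            raise ConnectionError("simulated snapshot failure")
        if self.second_subscribe_at is None:
            self.second_subscribe_at = time.monotonic()
            self.resubscribed.set()
        await asyncio.sleep(3600)  # stay subscribed until cancelled


async def reconnect_loop(service, reconnect_delay):
    """Stand-in for the publisher: resubscribe after a fixed delay."""
    while True:
        try:
            await service.watch()
        except ConnectionError:
            await asyncio.sleep(reconnect_delay)


async def measure_gap():
    service = ThrowOnceThenYieldService()
    task = asyncio.create_task(reconnect_loop(service, RECONNECT_DELAY))
    try:
        await asyncio.wait_for(service.resubscribed.wait(), timeout=5)
    finally:
        task.cancel()
        await asyncio.gather(task, return_exceptions=True)
    # The gap spans only the failure handling and the reconnect delay.
    return service.second_subscribe_at - service.first_throw_at


gap = asyncio.run(measure_gap())
```

A lower-bound assertion such as `gap >= RECONNECT_DELAY - SLACK` then checks the delay alone. The sketch uses a monotonic clock; the C# test uses `DateTimeOffset.UtcNow`, so this is an analogy rather than a transcription.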
+13
@@ -51,6 +51,19 @@ The shared inputs are:
The commands in the matrix use `MXGATEWAY_API_KEY` through each CLI's
`api-key-env` flag. They must not embed bearer tokens or raw API keys.
### TLS variant
The matrix runs over plaintext (`h2c`) by default. A TLS variant exists but stays
a manual/opt-in run, consistent with the gate above, because it needs the gateway
started with an HTTPS endpoint (an `https://` `MXGATEWAY_ENDPOINT`) and each CLI
switched to its TLS flag (`--tls` / `-tls` / `--plaintext=false` /
`plaintext=False`). Most clients are lenient by default and accept the gateway's
auto-generated self-signed certificate without extra trust setup. The exceptions
are the Rust CLI, which is pin-only and needs `--ca-file` or
`--require-certificate-validation`, and the Python client, which reaches the same
result through trust-on-first-use rather than by skipping verification. See
[Gateway Configuration — Automatic self-signed certificate](./GatewayConfiguration.md#automatic-self-signed-certificate)
and each client README for the per-client TLS flags.
## JSON Comparison
Every command in the matrix requests JSON output. A runner can compare the
+49
@@ -362,6 +362,55 @@ Dashboard access should require API-key-backed dashboard authentication with
is enabled by default through `Dashboard:AllowAnonymousLocalhost`; the bypass is
limited to loopback requests.
## Lazy Browse Is Wire-Only
Decision: the gateway continues to pull the full Galaxy hierarchy on each
deploy. `BrowseChildren` and the lazy dashboard render only avoid sending and
DOM-materializing the full tree — they do not push laziness into SQL or cache
loading.
Rationale: snapshot persistence and the dashboard summary both depend on a
fully-materialized cache. Lazy SQL would increase per-click latency on a
deployment-heavy box, multiply per-session SQL connections, and complicate the
cold-start path. Wire-side laziness solves the actual pain (oversized gRPC
replies and a heavy DOM) without disturbing the materialization model.
## TLS Auto-Certificate and Lenient Client Trust
Decision: when a Kestrel `https://` endpoint is configured without a certificate
of its own (and no `Kestrel:Certificates:Default` is set), the gateway generates
and persists a self-signed certificate rather than failing to start. Clients
connecting over TLS without a pinned CA accept whatever certificate the server
presents by default; pinning a CA restores full verification.
Rationale: `mxaccessgw` is an internal tool with no PKI to issue or distribute
certificates. Previously, an `https` endpoint with no certificate failed at
startup with Kestrel's opaque "no server certificate was specified" error, which
pushed operators toward plaintext (`h2c`), exposing the API key and
request payloads on the wire. Auto-generating a long-lived, persisted, reused
certificate lets TLS "just work" with zero certificate management, while the
lenient client default means clients connect to that self-signed certificate
without a manual trust step. Both choices are deliberate, not oversights:
strict-by-default would force PKI work this tool does not warrant. Plaintext-only
deployments are untouched — no certificate or key material is written for them —
and an operator who supplies a real certificate transparently overrides the
generated one.
Two clients diverge from "accept any certificate" because their gRPC stacks lack
a per-channel skip-verify hook:
- Python uses trust-on-first-use: it fetches the server's presented certificate
over a separate unverified probe and pins it for the channel, and defaults the
SNI/target-name override to `localhost` (the generated certificate always
carries a `localhost` SAN).
- Rust is pin-only: tonic exposes no public hook to inject a custom certificate
verifier, so TLS over Rust requires either a pinned CA or an explicit opt-in to
system-trust verification; otherwise connecting returns a clear, actionable
error.
See [Gateway Configuration — Automatic self-signed certificate](./GatewayConfiguration.md#automatic-self-signed-certificate)
and the per-client READMEs for the as-built behavior.
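The Python trust-on-first-use flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the client's actual code: `split_authority`, `fetch_presented_certificate`, `open_tofu_channel`, and the default port of 443 are hypothetical. The library calls themselves (`ssl.get_server_certificate`, `grpc.ssl_channel_credentials`, `grpc.secure_channel`, and the `grpc.ssl_target_name_override` channel option) are real; `grpcio` is a third-party dependency and is imported lazily so the helpers load without it.

```python
import ssl


def split_authority(authority, default_port=443):
    """Split 'host:port' into (host, port), stripping IPv6 brackets.

    Hypothetical helper; bare IPv6 literals must be bracketed.
    """
    if authority.startswith("["):
        host, _, rest = authority[1:].partition("]")
        port = int(rest[1:]) if rest.startswith(":") else default_port
        return host, port
    host, sep, port = authority.rpartition(":")
    if not sep:
        return authority, default_port
    return host, int(port)


def fetch_presented_certificate(authority):
    """Unverified probe: return the server's certificate as PEM bytes."""
    host, port = split_authority(authority)
    # With no ca_certs argument, the certificate is fetched without validation.
    return ssl.get_server_certificate((host, port)).encode("ascii")


def open_tofu_channel(authority, server_name="localhost"):
    """Pin the presented certificate and open a gRPC channel against it."""
    import grpc  # third-party (grpcio)

    pinned = fetch_presented_certificate(authority)
    credentials = grpc.ssl_channel_credentials(root_certificates=pinned)
    # The override makes name checking use a SAN the generated cert carries.
    options = (("grpc.ssl_target_name_override", server_name),)
    return grpc.secure_channel(authority, credentials, options=options)
```

Trust-on-first-use gives no protection against an attacker who is already in the path during the probe; it only ensures the channel talks to the same certificate the probe saw. Pinning a CA, as noted above, restores full verification.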
## Later Revisit Items
These are explicit post-v1 revisit items, not open blockers:
