fix(client/java): correct browseChildrenRaw README; CLI --require-certificate-validation (Client.Java-037,038)

This commit is contained in:
Joseph Doherty
2026-06-15 02:56:15 -04:00
parent cebe67e9bd
commit 75a39f5a8c
4 changed files with 151 additions and 8 deletions
+11 -6
View File
@@ -128,18 +128,20 @@ gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-discover --endpoint localhost
### Browsing lazily
For UI trees or OPC UA bridges, use `browseChildren` to walk one level at a
For UI trees or OPC UA bridges, use `browseChildrenRaw` to walk one level at a
time instead of loading the full hierarchy with `discoverHierarchy`. Pass a
default request for root objects; subsequent calls set `parentGobjectId`,
`parentTagName`, or `parentContainedPath`. Filter fields match
`DiscoverHierarchy`. Each response pairs `getChildrenList()` with
`getChildHasChildrenList()` so you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics. This snippet documents the API as it appears once
the Java client is regenerated on the Windows host.
request and filter semantics. For most callers the high-level
`browse()`/`LazyBrowseNode` walker below is the preferred surface;
`browseChildrenRaw` exposes the single underlying RPC when you need direct
control of paging.
```java
BrowseChildrenReply reply = galaxy.browseChildren(
BrowseChildrenReply reply = galaxy.browseChildrenRaw(
BrowseChildrenRequest.newBuilder().build());
List<GalaxyObject> children = reply.getChildrenList();
@@ -248,8 +250,11 @@ gradle :zb-mom-ww-mxgateway-cli:run --args="smoke --endpoint localhost:5000 --ap
```
The CLI accepts `--api-key`, `--api-key-env`, `--plaintext`, `--ca-file`,
`--server-name-override`, `--timeout`, and `--json` on gateway commands. JSON
output redacts API keys.
`--server-name-override`, `--require-certificate-validation`, `--timeout`, and
`--json` on gateway commands. JSON output redacts API keys. TLS is lenient by
default (the certificate is not verified unless you pin a CA with `--ca-file`);
pass `--require-certificate-validation` to verify the server certificate against
the JVM trust store without pinning.
Use TLS options for a secured gateway:
@@ -1366,6 +1366,13 @@ public final class MxGatewayCli implements Callable<Integer> {
@Option(names = "--server-name-override", description = "TLS server name override.")
String serverNameOverride = "";
@Option(
names = "--require-certificate-validation",
description =
"Verify the server certificate against the JVM trust store "
+ "(disables the lenient default; ignored with --plaintext or --ca-file pinning).")
boolean requireCertificateValidation;
@Option(names = "--timeout", defaultValue = "30s", description = "Per-call timeout.")
String timeout;
@@ -1388,6 +1395,7 @@ public final class MxGatewayCli implements Callable<Integer> {
.plaintext(plaintext)
.caCertificatePath(caFile)
.serverNameOverride(serverNameOverride)
.requireCertificateValidation(requireCertificateValidation)
.callTimeout(resolvedTimeout)
.build();
}
@@ -1400,6 +1408,7 @@ public final class MxGatewayCli implements Callable<Integer> {
values.put("plaintext", plaintext);
values.put("caFile", caFile == null ? "" : caFile.toString());
values.put("serverNameOverride", serverNameOverride);
values.put("requireCertificateValidation", requireCertificateValidation);
values.put("timeout", timeout);
return values;
}
@@ -5,6 +5,7 @@ import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;
import com.zb.mom.ww.mxgateway.client.MxGatewayAlarmFeedSubscription;
import com.zb.mom.ww.mxgateway.client.MxGatewayClientOptions;
import io.grpc.stub.StreamObserver;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
@@ -289,6 +290,51 @@ final class MxGatewayCliTests {
}
}
@Test
void requireCertificateValidationFlagPropagatesThroughToClientOptions() {
// Client.Java-038 regression — the --require-certificate-validation
// CLI flag must reach MxGatewayClientOptions.requireCertificateValidation
// via CommonOptions.toClientOptions(), so CLI users can opt into strict
// JVM-trust verification without pinning a CA.
CapturingClientFactory factory = new CapturingClientFactory();
CliRun run = execute(
factory,
"acknowledge-alarm",
"--endpoint",
"localhost:5000",
"--api-key-env",
"MXGATEWAY_API_KEY",
"--require-certificate-validation",
"--reference",
"Tank01.Level.HiHi");
assertEquals(0, run.exitCode(), "errors:\n" + run.errors());
assertTrue(
factory.capturedClientOptions.requireCertificateValidation(),
"--require-certificate-validation did not propagate into MxGatewayClientOptions");
}
@Test
void requireCertificateValidationDefaultsToLenientWhenFlagAbsent() {
// Without the flag, the lenient-by-default trust posture must be
// preserved (requireCertificateValidation == false).
CapturingClientFactory factory = new CapturingClientFactory();
CliRun run = execute(
factory,
"acknowledge-alarm",
"--endpoint",
"localhost:5000",
"--api-key-env",
"MXGATEWAY_API_KEY",
"--reference",
"Tank01.Level.HiHi");
assertEquals(0, run.exitCode(), "errors:\n" + run.errors());
assertFalse(
factory.capturedClientOptions.requireCertificateValidation(),
"requireCertificateValidation should default to false (lenient)");
}
@Test
void streamAlarmsCommandFailsFastOnQueueOverflow() {
// Client.Java-033 regression — the CLI's stream-alarms bounded queue
@@ -435,6 +481,23 @@ final class MxGatewayCliTests {
}
}
/**
* Factory that records the {@link MxGatewayClientOptions} produced by
* {@link MxGatewayCli.CommonOptions#toClientOptions()} so a test can assert
* how CLI flags map onto the library option surface. Wraps the standard
* {@link FakeClient} so the command body still completes. Used by the
* Client.Java-038 option-flow regression.
*/
private static final class CapturingClientFactory implements MxGatewayCli.MxGatewayCliClientFactory {
private MxGatewayClientOptions capturedClientOptions;
@Override
public MxGatewayCli.MxGatewayCliClient connect(MxGatewayCli.CommonOptions options) {
capturedClientOptions = options.toClientOptions();
return new FakeClient(options.spec.commandLine().getOut());
}
}
/**
* Factory whose fake client floods the {@code streamAlarms} observer with
* 2000 messages synchronously, exceeding the CLI's bounded 1024-element
+68 -2
View File
@@ -4,8 +4,8 @@
|---|---|
| Module | `clients/java` |
| Reviewer | Claude Code |
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -77,6 +77,35 @@ Client.Java-001..031 are unchanged.
| 9 | Testing coverage | Issue found: the new `MxGatewayClient.streamAlarms` SDK method has no library-side test in `zb-mom-ww-mxgateway-client/src/test/...` — only the CLI test exercises the path via a `FakeClient.streamAlarms` override that bypasses the production `subscription.wrap(observer)` glue (Client.Java-035). |
| 10 | Documentation & comments | Issue found: README (`clients/java/README.md:182-183`) documents the new `stream-alarms` and `acknowledge-alarm` commands with `--session-id <id>` (neither command has that option) and `acknowledge-alarm --alarm-reference …` (actual flag is `--reference`) — every documented invocation fails at picocli parse time (Client.Java-032). |
### 2026-06-15 re-review (commit 410acc9)
Re-review pass at `410acc9`. Diff against `42b0037` is eleven commits touching
`clients/java`: `d3cb311` (Client.Java-032..036 fixes — shared subscription
base + batch tokenizer), `0d6193c`/`803a207`/`b4bc2df`/`4a19854`/`b244851`/
`68f905a` (the `BrowseChildren` lazy-browse SDK surface: `GalaxyRepositoryClient.browse()`,
`browse(BrowseChildrenOptions)`, `browseChildrenRaw`, `browseChildrenInner`,
plus the `LazyBrowseNode` walker and `BrowseChildrenOptions`), `a276f46`/
`ba82afe`/`2eb8137` (lenient-by-default TLS: new `requireCertificateValidation`
option, `InsecureTrustManagerFactory` fallback, foojay toolchain resolver), and
`fe44e3c` (maven-publish wiring for the Gitea Maven feed). Generated
protobuf/gRPC Java is excluded. `gradle test` could not be run here — this macOS
host has no Java runtime (the module builds on the Windows host per project
memory); findings below are from source inspection. Prior findings
Client.Java-001..036 are unchanged.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found in this diff. `LazyBrowseNode.expand()` leader/coalesce logic is correct (single in-flight future, slot cleared on failure for retry); `browseChildrenInner` pagination handles the empty/null next-page token and guards against repeated page tokens; the `child_has_children` parallel array is bounds-checked (`i < getChildHasChildrenCount()`), defaulting absent hints to false. |
| 2 | mxaccessgw conventions | No issues found. No MXAccess COM, no synthesized events, generated code untouched. The lenient-TLS default is a documented repo-wide design decision (`docs/DesignDecisions.md` "TLS Auto-Certificate and Lenient Client Trust"), not a Java-specific deviation. |
| 3 | Concurrency & thread safety | No issues found. `LazyBrowseNode` does not hold the `expandLock` monitor across the BrowseChildren RPC (fixed in `68f905a`); readers use a separate `ReentrantReadWriteLock` so `getChildren()`/`isExpanded()` never block on the in-flight RPC; `BrowseChildrenOptions` is immutable. The shared `MxGatewayStreamSubscription` base (Client.Java-036) is covered. |
| 4 | Error handling & resilience | No issues found. `browseChildrenRaw` normalises non-`MxGatewayException` gRPC errors via `MxGatewayErrors.fromGrpc`; the non-leader `expand()` path rethrows the leader's `MxGatewayException`/`RuntimeException` and restores the interrupt flag on `InterruptedException`. |
| 5 | Security | No issues found. maven-publish credentials come from `GITEA_USERNAME`/`GITEA_TOKEN` env vars with empty-string fallback — no committed secrets. The lenient-TLS `InsecureTrustManagerFactory` default is the documented, intentional design for this PKI-less internal tool; strict verification is reachable via `caCertificatePath` (pin) or `requireCertificateValidation(true)`, both tested in `MxGatewayClientTlsTests`. |
| 6 | Performance & resource management | No issues found. |
| 7 | Design-document adherence | No issues found. The browse surface matches `docs/GalaxyRepository.md#browsechildren` (cache-served lazy expand, `has_children` hint, repeated-page-token → error); the TLS posture matches `docs/GatewayConfiguration.md` and `JavaClientDesign.md`. |
| 8 | Code organization & conventions | Issue found: the new `requireCertificateValidation` library option is not exposed or propagated by the CLI `CommonOptions.toClientOptions()`, so CLI users cannot opt into JVM-trust-store verification — same additive-surface gap pattern as the resolved Client.Java-025 (Client.Java-038). |
| 9 | Testing coverage | No issues found. The browse surface has thorough library tests in `GalaxyRepositoryClientTests` (roots, expand-populates, idempotent-single-RPC, unknown-parent not-found, multi-page gather, concurrent-callers-one-RPC, filter forwarding, repeated-page-token rejection); TLS lenient/strict paths are covered by `MxGatewayClientTlsTests` against a real in-process TLS server. |
| 10 | Documentation & comments | Issue found: the README "Browsing lazily" first code snippet calls `galaxy.browseChildren(BrowseChildrenRequest…)`, but no such method exists on `GalaxyRepositoryClient` — the raw single-RPC method is `browseChildrenRaw(BrowseChildrenRequest)`; the documented snippet does not compile (Client.Java-037). |
## Findings
### Client.Java-001
@@ -662,4 +691,41 @@ This is the same maintenance-hazard pattern Client.Java-009 / Client.Java-016 id
**Resolution:** 2026-05-24 — Extracted a package-private abstract base `MxGatewayStreamSubscription<TRequest, TResponse> implements AutoCloseable` (new file `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayStreamSubscription.java`). It holds the shared `AtomicReference<ClientCallStreamObserver<TRequest>>` and `AtomicBoolean cancelled` pair, the `wrap(StreamObserver<TResponse>)` factory that returns a `ClientResponseObserver` with the Client.Java-014 close-before-beforeStart fix baked in, the `cancel()` / `close()` implementation, and an immutable `cancelMessage` injected by the subclass constructor. The four prior 60-line near-clones (`MxGatewayEventSubscription`, `MxGatewayAlarmFeedSubscription`, `MxGatewayActiveAlarmsSubscription`, `DeployEventSubscription`) collapse to ~10-line subclasses that only declare their `<Request, Response>` type parameters and supply the cancel-message string to `super(...)`. Public API surface is preserved: each subclass remains a `public final class` with a public no-arg constructor (the constructor was implicit on the original classes; I made it explicit `public` on the subclasses so the existing CLI `FakeClient.streamAlarms` in a different package can still `new MxGatewayAlarmFeedSubscription()`). The `wrap(...)` method is `final` and package-private on the base — same accessibility the four subclasses had before — so production callers in `MxGatewayClient`/`GalaxyRepositoryClient` see no change. New test file `MxGatewayStreamSubscriptionContractTests` exercises the lifecycle/cancellation contract identically across all four subclasses (16 tests, four per scenario): (a) cancel-before-beforeStart eagerly cancels the stream once it attaches with the subclass-specific message, (b) cancel-after-beforeStart forwards directly to the stream, (c) `close()` delegates to `cancel()`, (d) the wrapped observer forwards `onNext`/`onError`/`onCompleted` verbatim, and a compile-time `typeBoundsCheck` helper that asserts each subclass still binds its `<Req, Resp>` parameters to the right proto types. TDD red phase confirmed: temporarily breaking one subclass's `super(...)` message to `"BROKEN MESSAGE"` made the contract test for that subclass fail with `expected: <client cancelled alarm feed> but was: <BROKEN MESSAGE>`; restoring the correct value turned all 16 contract tests green. Future fixes to the shared lifecycle now live in one place — the next Client.Java-014/021-style race fix cannot drift across the four classes.
### Client.Java-037
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Documentation & comments |
| Location | `clients/java/README.md:138-149` |
| Status | Resolved |
**Description:** The "Browsing lazily" section's first (low-level) code snippet documents a `browseChildren` method that does not exist on the public client surface:
```java
BrowseChildrenReply reply = galaxy.browseChildren(
BrowseChildrenRequest.newBuilder().build());
```
`GalaxyRepositoryClient` exposes only `browse()`, `browse(BrowseChildrenOptions)`, and the raw single-RPC method `browseChildrenRaw(BrowseChildrenRequest)` (verified at `GalaxyRepositoryClient.java:227,238,251`). There is no `browseChildren(BrowseChildrenRequest)`, so the documented snippet fails to compile — a user copy-pasting the primary low-level example hits a missing-symbol error immediately. The README hedges the snippet with "This snippet documents the API as it appears once the Java client is regenerated on the Windows host," but the discrepancy is not a regeneration artifact: the hand-written wrapper method is named `browseChildrenRaw`, not `browseChildren`. The adjacent "High-level walker" snippet (`galaxy.browse()`, `root.expand()`, `root.getChildren()`, `child.hasChildrenHint()`, `child.getObject().getTagName()`) is correct against the actual API; only the low-level snippet is wrong.
**Recommendation:** Change `galaxy.browseChildren(` to `galaxy.browseChildrenRaw(` in the low-level snippet so it matches the real method name, or replace the low-level example with the `browse()`/`LazyBrowseNode` walker that the SDK actually intends as the primary surface. Drop the "as it appears once regenerated" caveat once the snippet compiles against the current source. Consider an `installDist`-based or compile-checked doc snippet test to prevent README API drift, mirroring the parse-only assertions added for Client.Java-032.
**Resolution:** 2026-06-15 — Confirmed against source: `GalaxyRepositoryClient` (`zb-mom-ww-mxgateway-client/.../GalaxyRepositoryClient.java:227,238,251`) exposes only `browse()`, `browse(BrowseChildrenOptions)`, and the raw single-RPC `browseChildrenRaw(BrowseChildrenRequest)` — there is no `browseChildren(BrowseChildrenRequest)`, so the documented snippet did not compile. Fixed the README "Browsing lazily" low-level snippet at `clients/java/README.md` by renaming `galaxy.browseChildren(` to `galaxy.browseChildrenRaw(`; the surrounding accessors (`BrowseChildrenReply`/`BrowseChildrenRequest` types, `getChildrenList()`, `getChildHasChildrenList()`, `getTagName()`) are all valid proto accessors and were left unchanged. Replaced the misleading "as it appears once the Java client is regenerated on the Windows host" caveat (the discrepancy was a hand-written wrapper name, not a codegen artifact) with prose steering callers to the high-level `browse()`/`LazyBrowseNode` walker as the preferred surface and `browseChildrenRaw` as the direct-paging escape hatch. Documentation-only change; no test added (no compile-checked doc-snippet harness exists yet — left as the noted future enhancement).
### Client.Java-038
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1347-1393` |
| Status | Resolved |
**Description:** Commit `a276f46` added `requireCertificateValidation` to `MxGatewayClientOptions` as a first-class TLS-trust toggle (lenient-by-default; set `true` to verify against the JVM trust store without pinning a CA). The CLI `CommonOptions` exposes `--plaintext`, `--ca-file`, and `--server-name-override` and propagates them through `toClientOptions()`, but it neither declares a `--require-certificate-validation` option nor sets `builder.requireCertificateValidation(...)`. CLI users therefore have no way to request strict verification short of supplying a pinned CA via `--ca-file`; the lenient `InsecureTrustManagerFactory` default is forced on every non-pinned TLS CLI connection. This is the same additive-surface gap pattern as the resolved Client.Java-025 (`shutdownTimeout` not propagated to the CLI). `docs/CrossLanguageSmokeMatrix.md` documents `--require-certificate-validation` for the Rust CLI's pin-only stack but not Java, so this is not a direct README contradiction; it is a library-vs-CLI surface inconsistency. Severity is Low because the secure-by-pinning path (`--ca-file`) remains available and the lenient default is the documented intended behaviour for this internal tool.
**Recommendation:** Add a `--require-certificate-validation` boolean option to `CommonOptions` (default unset/false to preserve the lenient default) and propagate it into `toClientOptions()` via `builder.requireCertificateValidation(value)`. Include the resolved value in `redactedJsonMap()` so `--json` output reflects the effective trust posture. Add a CLI parse-only assertion exercising the flag to keep the CLI surface tracking the library surface.
**Resolution:** 2026-06-15 — Confirmed against source: `MxGatewayClientOptions` (`zb-mom-ww-mxgateway-client/.../MxGatewayClientOptions.java:108,260`) exposes `requireCertificateValidation()` and a `Builder.requireCertificateValidation(boolean)`, but the CLI `CommonOptions` in `MxGatewayCli.java` declared no flag and `toClientOptions()` never set it, forcing the lenient default on every non-pinned TLS CLI connection. Added a bare-boolean `@Option(names = "--require-certificate-validation")` field to `CommonOptions` (defaults to `false`, preserving the lenient default; mirrors the existing `--plaintext` flag-style option), propagated it through `toClientOptions()` via `.requireCertificateValidation(requireCertificateValidation)`, and added it to `redactedJsonMap()` so `--json` output reflects the effective trust posture. Documented the new flag and the lenient-by-default trust posture in `clients/java/README.md`. Note: the Client.Java-025 precedent (`shutdownTimeout`) was applied to the pre-rename `mxgateway-cli` module and is not present in this renamed `zb-mom-ww-mxgateway-cli` `toClientOptions()`; I mirrored the live `--ca-file`/`--server-name-override` TLS-option plumbing pattern instead, which is the correct precedent here. Regression tests in `MxGatewayCliTests`: `requireCertificateValidationFlagPropagatesThroughToClientOptions` (drives `acknowledge-alarm --require-certificate-validation` through a new `CapturingClientFactory` that records `options.toClientOptions()` and asserts `MxGatewayClientOptions.requireCertificateValidation()` is `true`) and `requireCertificateValidationDefaultsToLenientWhenFlagAbsent` (asserts the flag defaults to `false`). The capturing factory exercises the real `toClientOptions()` propagation, stronger than a parse-only check.