code-reviews: re-review Client.Java at 42b0037
Append 5 new findings (Client.Java-032..036): README flags for new alarm subcommands do not exist; StreamAlarmsCommand bounded queue silently drops alarms instead of fail-fast; BatchCommand whitespace tokenisation shreds quoted args; no library-side stream_alarms test; MxGatewayAlarmFeedSubscription is the fourth duplicate subscription class. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -5,9 +5,9 @@
|
||||
| Module | `clients/java` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-24 |
|
||||
| Commit reviewed | `d692232` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 0 |
|
||||
| Commit reviewed | `42b0037` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 5 |
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
@@ -52,6 +52,31 @@ documentation updates lag — see the doc-side findings below.
|
||||
| 9 | Testing coverage | Cross-reference to Client.Java-030; the bulk-command gaps tracked under Client.Java-026 remain. |
|
||||
| 10 | Documentation & comments | Issues found: Client.Java-027 (Gradle task names in README and JavaClientDesign still reference the old `:mxgateway-client:` and `:mxgateway-cli:` paths — every command in the README breaks if copy-pasted); Client.Java-029 (`README.md:209` cites `zb-mom-ww-mxgateway-cli/build/install/mxgateway-cli` but the actual install path contains a doubled directory `zb-mom-ww-mxgateway-cli/build/install/zb-mom-ww-mxgateway-cli/`). |
|
||||
|
||||
### 2026-05-24 re-review (commit 42b0037)
|
||||
|
||||
Re-review pass at `42b0037`. Diff against `d692232` is five commits touching
|
||||
`clients/java`: `10bd0c0` (Client.Java-027..031 fixes), `71d2c39` (port
|
||||
`batch` subcommand), `f90bff0` (port bulk read/write SDK + CLI subcommands),
|
||||
`8a0c59d` (port `stream-alarms` + `acknowledge-alarm` to the Java CLI plus a
|
||||
new `MxGatewayClient.streamAlarms` SDK method returning
|
||||
`MxGatewayAlarmFeedSubscription`), and `8738735` (README updates for the
|
||||
alarm surface). `gradle build` from `clients/java` is green; new findings
|
||||
focus on the alarm surface and the `batch` subcommand. Prior findings
|
||||
Client.Java-001..031 are unchanged.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | Issues found: `BatchCommand` tokenises stdin with `line.trim().split("\\s+")`, so any argument with embedded whitespace (e.g. `--comment "needs verification"`) is shredded mid-string (Client.Java-034); `StreamAlarmsCommand` uses `queue.offer(value)` on a bounded queue with no overflow detection, silently dropping alarm messages when the gateway pushes faster than the consumer drains (Client.Java-033). |
|
||||
| 2 | mxaccessgw conventions | Cross-reference to Client.Java-033 — silently-dropping events violates `JavaStyleGuide.md` ("Do not reorder, coalesce, or drop events in client code") and the fail-fast event-backpressure contract. No new finding raised separately. |
|
||||
| 3 | Concurrency & thread safety | No issues found in this diff. `MxGatewayAlarmFeedSubscription` correctly mirrors `MxGatewayEventSubscription`'s `beforeStart`-vs-`cancel` race fix (Client.Java-014). |
|
||||
| 4 | Error handling & resilience | No issues found in this diff. |
|
||||
| 5 | Security | No issues found in this diff. |
|
||||
| 6 | Performance & resource management | No issues found in this diff. |
|
||||
| 7 | Design-document adherence | No issues found in this diff. The new `streamAlarms` SDK surface matches `gateway.md`'s always-on alarm-monitor design (no worker session, multi-subscriber). |
|
||||
| 8 | Code organization & conventions | Issue found: `MxGatewayAlarmFeedSubscription` is a structural copy of `MxGatewayEventSubscription` (same fields, same `beforeStart`/`cancel` shape) — the subscription-class family should share a common base or helper (Client.Java-036). |
|
||||
| 9 | Testing coverage | Issue found: the new `MxGatewayClient.streamAlarms` SDK method has no library-side test in `zb-mom-ww-mxgateway-client/src/test/...` — only the CLI test exercises the path via a `FakeClient.streamAlarms` override that bypasses the production `subscription.wrap(observer)` glue (Client.Java-035). |
|
||||
| 10 | Documentation & comments | Issue found: README (`clients/java/README.md:182-183`) documents the new `stream-alarms` and `acknowledge-alarm` commands with `--session-id <id>` (neither command has that option) and `acknowledge-alarm --alarm-reference …` (actual flag is `--reference`) — every documented invocation fails at picocli parse time (Client.Java-032). |
|
||||
|
||||
## Findings
|
||||
|
||||
### Client.Java-001
|
||||
@@ -518,3 +543,113 @@ documentation updates lag — see the doc-side findings below.
|
||||
**Recommendation:** Either (a) update the prose to the full prefixed names, or (b) clarify in a one-line note: "The subprojects are `zb-mom-ww-mxgateway-client` and `zb-mom-ww-mxgateway-cli`; this README refers to them by their short suffixes below for readability." (a) is more honest; (b) preserves readability at the cost of one extra concept.
|
||||
|
||||
**Resolution:** 2026-05-24 — Took option (a): updated the README layout-section prose in `clients/java/README.md:17,22,26` (the `mxgateway-client` generates… paragraph, the `mxgateway-client` exposes… paragraph, and the `mxgateway-cli` depends on `mxgateway-client`… paragraph) to use the full prefixed `zb-mom-ww-mxgateway-client` and `zb-mom-ww-mxgateway-cli` names, matching the layout block at lines 13–14 and `settings.gradle`. Reflows the final paragraph slightly because the prefixed names push past the existing 80-column wrap; content is unchanged otherwise. Doc-only change.
|
||||
|
||||
### Client.Java-032
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | High |
|
||||
| Category | Documentation & comments |
|
||||
| Location | `clients/java/README.md:182-183` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** Commit `8738735` ("clients: document StreamAlarms + AcknowledgeAlarm in each README") added two new gradle invocations to the CLI Usage block:
|
||||
|
||||
```
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="stream-alarms --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --limit 1 --json"
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="acknowledge-alarm --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --alarm-reference \"\\Galaxy\Area001.Pump001.PumpFault\" --json"
|
||||
```
|
||||
|
||||
Both are wrong against the actual CLI surface in `MxGatewayCli.java`:
|
||||
|
||||
- `StreamAlarmsCommand` (line 1060) has no `--session-id` option — the gateway's alarm feed is session-less ("Served by the gateway's always-on alarm monitor — no worker session is opened" per the SDK Javadoc at `MxGatewayClient.java:331`). picocli will reject the unknown option with a non-zero exit before the command body runs.
|
||||
- `AcknowledgeAlarmCommand` (line 1135) also has no `--session-id` option, AND the option is named `--reference` (line 1136), not `--alarm-reference`. Two unknown-option errors per copy-paste.
|
||||
|
||||
A user copying either invocation from the README hits a picocli parse error immediately. The other clients' README updates in the same commit (commit `8738735`) likely have the same issue but are out of scope for this Java review.
|
||||
|
||||
**Recommendation:** Drop the `--session-id <id>` token from both documented invocations, and change `--alarm-reference` to `--reference` in the `acknowledge-alarm` line. Optionally also add `--filter-prefix` to the `stream-alarms` example so readers see the scoping option, and align README option names with the actual CLI by either renaming the CLI option `--reference` → `--alarm-reference` (matches the proto `alarm_full_reference` field semantically) or leaving as is and only fixing the README. Add a small `MxGatewayCliTests` parse-only assertion for both subcommands that exercises every option flag to prevent the same drift the next time the CLI surface or README is touched.
|
||||
|
||||
### Client.Java-033
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1078-1098` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** `StreamAlarmsCommand.call()` allocates a bounded `ArrayBlockingQueue<Object>(1024)` and the gRPC observer publishes each `AlarmFeedMessage` via `queue.offer(value)`:
|
||||
|
||||
```
|
||||
BlockingQueue<Object> queue = new ArrayBlockingQueue<>(1024);
|
||||
…
|
||||
@Override public void onNext(AlarmFeedMessage value) { queue.offer(value); }
|
||||
```
|
||||
|
||||
`BlockingQueue.offer` returns `false` when the queue is full and the message is dropped silently — no exception, no log, no signal to the consumer that data was lost. Combined with gRPC's default auto-inbound flow control (the same regime that Client.Java-011 documented for `MxEventStream`), any consumer stall long enough to fill 1024 slots will silently lose alarms past the boundary. The active-alarm snapshot replay at session open is exactly the burst case that pushes a moderately busy alarm feed close to that limit, and the failure mode is invisible — the CLI prints a truncated feed and exits 0.
|
||||
|
||||
The library-side `MxEventStream` (Client.Java-002 resolution) and `DeployEventStream` (Client.Java-021 resolution) explicitly surface overflow as `MxGatewayException` so the consumer can resubscribe with a resume cursor. The new `stream-alarms` CLI path regresses that contract.
|
||||
|
||||
`JavaStyleGuide.md` ("Streaming" section, line 57) explicitly prohibits this pattern: "Do not reorder, coalesce, or drop events in client code." `CLAUDE.md` ("fail-fast event backpressure") makes overflow a deliberate cancel-and-surface signal, not a silent drop.
|
||||
|
||||
**Recommendation:** Either (a) wrap the gRPC observer in the existing `MxEventStream`-style adaptor that calls `subscription.cancel()` and queues an exception on `queue.offer` returning `false`, then surface that exception from the drain loop — mirroring `MxEventStream.observer().onNext`'s overflow branch; or (b) reuse the library-side fail-fast plumbing by promoting `MxEventStream` (or extracting its terminal-state base) into a public `MxAlarmFeedStream` and have `MxGatewayClient.streamAlarms` return that instead of a bare subscription handle. Option (b) lines up with Client.Java-036 (deduplicate the subscription class family). Add a CLI regression test that overflows the bounded queue and asserts a non-zero exit / overflow exception, mirroring `MxGatewayMediumFindingsTests.eventStreamOverflowExceptionSurvivesASubsequentClose`.
|
||||
|
||||
### Client.Java-034
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:182-198` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** `BatchCommand.call()` reads one CLI invocation per stdin line and tokenises with:
|
||||
|
||||
```
|
||||
String[] args = line.trim().split("\\s+");
|
||||
…
|
||||
int exitCode = cmd.execute(args);
|
||||
```
|
||||
|
||||
`split("\\s+")` does no shell-quoting parsing — it just splits on whitespace runs. Any argument with embedded whitespace gets shredded mid-string. Examples that break:
|
||||
|
||||
- `acknowledge-alarm --comment "needs verification" --operator op1` becomes `[acknowledge-alarm, --comment, "needs, verification", --operator, op1]` — picocli sees `"needs` as the comment value and `verification"` as an unknown positional argument.
|
||||
- `galaxy-watch --json --last-seen-deploy-time "2026-04-28 18:30:00Z"` (any value with a space) fails the same way.
|
||||
|
||||
The commit message for `71d2c39` says the protocol expects "one line of stdin = one full CLI invocation" without specifying whether quoted arguments are honoured, but the cross-language matrix (`scripts/run-client-e2e-tests.ps1` per the prior reviews) compares output across all five clients — if other clients (.NET, Go, Rust, Python) implement shell-like tokenisation, the Java CLI will diverge on any command-line value containing whitespace, and the e2e harness will fail or skip those cases inconsistently.
|
||||
|
||||
The current `MxGatewayCliTests` test set (`batchCommandExecutesVersionAndEmitsEorMarker`, `batchCommandEmitsEorAfterFailedCommandAndContinues`) only covers whitespace-free args, so the bug is invisible in the test suite.
|
||||
|
||||
**Recommendation:** Replace `line.trim().split("\\s+")` with a real shell-style tokeniser that honours single and double quotes and backslash escapes — `picocli.CommandLine.ArgumentParser` doesn't ship one, but Apache Commons Exec's `CommandLine.translateCommandline(String)`, JDK 21's `java.util.spi.ToolProvider` argument parsing, or a small hand-written state machine all work. Cross-check the .NET / Go / Rust / Python `batch` implementations in the same change so all five clients use the same tokenisation; document the contract in the protocol comment in `MxGatewayCli.java` and in `scripts/run-client-e2e-tests.ps1`. Add a CLI test that feeds `acknowledge-alarm --comment "with spaces"` through `batch` and asserts the `--comment` value reaches the gateway as `"with spaces"`.
|
||||
|
||||
### Client.Java-035
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Testing coverage |
|
||||
| Location | `clients/java/zb-mom-ww-mxgateway-client/src/test/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientSessionTests.java` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** Commit `8a0c59d` added `MxGatewayClient.streamAlarms(StreamAlarmsRequest, StreamObserver<AlarmFeedMessage>)` and a new public `MxGatewayAlarmFeedSubscription` class. No library-side test exercises either: a grep for `streamAlarms` across `zb-mom-ww-mxgateway-client/src/test/...` returns zero matches. The CLI tests (`MxGatewayCliTests.streamAlarmsCommand*`) exercise the path end-to-end, but they route through a `FakeClient.streamAlarms` override that bypasses the production `subscription.wrap(observer)` glue and the `withStreamDeadline(rawAsyncStub()).streamAlarms(...)` call. A regression to either — forgetting `.wrap(observer)`, dropping the deadline interceptor, misnaming the request — would compile and pass the CLI tests but break against a real gateway.
|
||||
|
||||
This is the same coverage gap pattern as Client.Java-030 (no fixture test for `QueryActiveAlarmsRequest`) which was resolved by adding `queryActiveAlarmsForwardsRequestAndStreamsSnapshots` to `MxGatewayClientSessionTests`. The new alarm-feed RPC deserves the same shape.
|
||||
|
||||
**Recommendation:** Add `streamAlarmsForwardsRequestAndStreamsAlarmFeedMessages` to `MxGatewayClientSessionTests` (in-process gRPC via the existing `InProcessGateway` + `TestGatewayService` fixture): override `TestGatewayService.streamAlarms` to capture the inbound `StreamAlarmsRequest` and emit one `active_alarm` snapshot, one `snapshot_complete`, and one `transition`, then complete. Call `MxGatewayClient.streamAlarms`, drain the observer via a `CountDownLatch`, and assert (a) the server observed the `alarm_filter_prefix`, (b) all three messages arrived in order with the expected payload-case, and (c) `MxGatewayAlarmFeedSubscription.cancel()` aborts the call (latch via `ServerCallStreamObserver.setOnCancelHandler`, mirroring the Client.Java-015 cancellation regression). Optionally also cover the cancel-before-beforeStart race that `MxGatewayAlarmFeedSubscription.wrap` handles, mirroring `mxEventStreamCloseBeforeBeforeStartCancelsStream`.
|
||||
|
||||
### Client.Java-036
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Code organization & conventions |
|
||||
| Location | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayAlarmFeedSubscription.java`, `MxGatewayEventSubscription.java`, `MxGatewayActiveAlarmsSubscription.java`, `DeployEventSubscription.java` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** `MxGatewayAlarmFeedSubscription` is a structural near-copy of `MxGatewayEventSubscription` — same `AtomicReference<ClientCallStreamObserver<…>>` + `AtomicBoolean cancelled` field shape, the same `wrap(observer)` returning a `ClientResponseObserver` that stores `requestStream` in `beforeStart`, the same close-before-beforeStart race handling that Client.Java-014 originally fixed for `MxEventStream`, and the same `cancel()`+`close()` idempotency contract. The four subscription classes (`MxGatewayEventSubscription`, `MxGatewayActiveAlarmsSubscription`, `MxGatewayAlarmFeedSubscription`, `DeployEventSubscription`) are now ~60-line near-clones differing only in the request/response generic parameters and the `cancel` message string.
|
||||
|
||||
This is the same maintenance-hazard pattern Client.Java-009 / Client.Java-016 identified for `createChannel` / `withDeadline` / `shutdown` and which was resolved by extracting `MxGatewayChannels`. A future fix to one subscription class (e.g. propagating Client.Java-014's race fix to a future fifth subscription type, or hardening idempotent `cancel`) will silently miss the others — exactly what happened with Client.Java-021 (the `MxEventStream` terminal-state fix that didn't reach `DeployEventStream` for months).
|
||||
|
||||
**Recommendation:** Extract a package-private abstract base, e.g. `MxGatewayStreamSubscription<TRequest>`, holding the `AtomicReference` / `AtomicBoolean` pair, the `cancel()` / `close()` implementation, and a `ClientResponseObserver` factory parameterised by the cancel-message string and the response observer. Have all four subscription classes extend it. Behaviour-only refactor — no public API change, existing tests cover the contract.
|
||||
|
||||
|
||||
|
||||
Reference in New Issue
Block a user