mxaccessgw/code-reviews/Client.Rust/findings.md
Joseph Doherty a0203503a7 Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules
Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
  GatewayGrpcScopeResolver so non-admin keys can use them; document
  the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
  CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
  in generated tonic code by reformatting the ReadBulkCommand proto
  comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
  make DisposeAsync race-safe against in-flight CloseAsync (-016);
  add constraint-enforcement test coverage for the bulk-plan path
  (-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
  can distinguish graceful shutdown from a real STA-affinity
  violation (-016); have the watchdog skip StaHung while
  CurrentCommandCorrelationId is non-empty so a legitimate slow
  ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
  11 GatewaySession bulk methods (-013); replace the real TCP probe
  in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
  (-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
  test and assert OnWriteComplete (-012); add live tests for
  Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
  abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
  CreateForTesting factory (-016); cover WorkerCancel and
  unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
  beforeStart() (-014); return a CancellingCompletableFuture that
  actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
  the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
  histograms with failed-call durations (-015); add coverage for
  the five MalformedReply paths, the bulk-write helpers, the
  Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
  command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
  WorkerAlarmRpcDispatcher missing-session handling; drop the
  duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
  XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
  subscriptionExpression / ExecutingCommand arms; preserve
  factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
  three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
  FakeWorkerProcess.WaitForExitAsync against a real TaskCompletion
  source; switch the heartbeat-expires test to ManualTimeProvider;
  add InvariantCulture to the remaining DateTimeOffset.Parse sites;
  document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
  IDisposable, class-level [Trait], single-source ZB default
  connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
  so absent env vars SKIP not pass; PascalCase rename of probe
  [Fact]s; deterministic deadline test; new frame-protocol error
  tests; ComputeTransitions diff-coverage; relocate dev-rig probes
  to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
  Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
  TreatWarningsAsErrors / analysers apply; document
  DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
  bulk-read handles in CLI; surface AcknowledgeAlarm transport
  faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
  runWriteBulkVariant; document the six new subcommands in
  writeUsage; drain galaxy-watch events on limit; switch io.EOF
  comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
  option; regex-based credential redaction; Long.toUnsignedString
  for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
  _percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
  _api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
  stop hard-coding correlation IDs; resync RustClientDesign.md
  with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 09:46:47 -04:00


Code Review — Client.Rust

Field Value
Module clients/rust
Reviewer Claude Code
Review date 2026-05-20
Commit reviewed 1cd51bb
Status Reviewed
Open findings 0

Checklist coverage

This re-review (1cd51bb) covers the changes added since 3cc53a8: the new bulk-write/read methods on Session, the read_bulk borrowed-slice signature, MalformedReply / Unavailable error variants, the projection-on-demand MxValue/MxArrayValue, the next_correlation_id rework, the new ReadBulk and bulk-write CLI subcommands, and the cross-language bench-read-bulk driver.

| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found: read_bulk is missing the OK-but-shapeless MalformedReply symmetry of the other bulk helpers, but the bigger issue is no test exercises any of the new MalformedReply paths (Client.Rust-016). |
| 2 | mxaccessgw conventions | Issue found: cargo clippy --workspace --all-targets -- -D warnings still fails: a fresh clippy::doc_lazy_continuation violation in ReadBulkCommand's generated doc comment trips the lint that the prior fixes did not anticipate (Client.Rust-013). CLI subcommands still emit hard-coded client_correlation_id strings on the raw paths (Client.Rust-014). |
| 3 | Concurrency & thread safety | No issues found: CORRELATION_SEQUENCE is AtomicU64 with Relaxed, which is correct for monotonic id generation; clients remain cheaply cloneable; streams are Send. |
| 4 | Error handling & resilience | Issue found: bench-read-bulk records every read_bulk failure into the latency histogram as if it succeeded, skewing p99/max upward (Client.Rust-015). The new Error::Unavailable mapping looks correct. |
| 5 | Security | No issues found: API keys still redacted in Debug/Display, status messages scrubbed, secret arguments unchanged. |
| 6 | Performance & resource management | No issues found in the changed code: read_bulk is honest about the unavoidable owned-Vec materialisation; projection-on-demand is now lazy. |
| 7 | Design-document adherence | Issue found: RustClientDesign.md was refreshed but never grew the new bulk-write/read methods, the Unavailable/MalformedReply error variants, or the bench-read-bulk CLI command on its current surface (Client.Rust-017). |
| 8 | Code organization & conventions | No new issues: BulkWriteReplyKind follows the renamed BulkReplyKind shape. |
| 9 | Testing coverage | Issue found: none of the new code paths (bulk-write helpers, read_bulk, MalformedReply, Error::Unavailable, the bench-read-bulk flow) are covered by client-side tests (Client.Rust-016). |
| 10 | Documentation & comments | No new issues beyond Client.Rust-017. |

Findings

Client.Rust-001

Field Value
Severity High
Category mxaccessgw conventions
Location clients/rust/src/options.rs:98,143
Status Resolved

Description: with_max_grpc_message_bytes and max_grpc_message_bytes have no /// doc comments. The crate sets #![warn(missing_docs)] and CLAUDE.md mandates that cargo clippy --workspace --all-targets -- -D warnings pass. Under -D warnings these become hard errors, so clippy fails to compile the crate — breaking the documented build/test workflow for the module.

Recommendation: Add doc comments to both methods, e.g. /// Maximum encoded/decoded gRPC message size in bytes (default 16 MiB).

Resolution: Resolved in 0d8a28d (2026-05-18): doc comments added to both methods.

Client.Rust-002

Field Value
Severity High
Category mxaccessgw conventions
Location clients/rust/src/session.rs:522
Status Resolved

Description: The BulkReplyKind enum's variants (AddItemBulk, AdviseItemBulk, RemoveItemBulk, UnAdviseItemBulk, SubscribeBulk, UnsubscribeBulk) all share the Bulk suffix, tripping clippy::enum_variant_names. Under -D warnings this is a compile error, so cargo clippy --workspace --all-targets -- -D warnings fails — a violation of the CLAUDE.md requirement that clippy pass cleanly.

Recommendation: Rename the variants to drop the common suffix (e.g. AddItem, AdviseItem, …) or add a narrowly-scoped #[allow(clippy::enum_variant_names)] with a reason comment.

Resolution: Resolved in 0d8a28d (2026-05-18): variants renamed to AddItem/AdviseItem/RemoveItem/UnAdviseItem/Subscribe/Unsubscribe, which no longer share a common suffix.

Client.Rust-003

Field Value
Severity High
Category Correctness & logic bugs
Location clients/rust/crates/mxgw-cli/src/main.rs:1051
Status Resolved

Description: The unit test version_json_output_has_protocol_versions asserts value["gatewayProtocolVersion"] == 2, but GATEWAY_PROTOCOL_VERSION is 3 (version.rs:10), matching the authoritative server constant GatewayContractInfo.GatewayProtocolVersion = 3. The test fails, so cargo test --workspace (the documented test step) does not pass — the test was not updated when the protocol version was bumped.

Recommendation: Update the assertion to 3, or better, assert against GATEWAY_PROTOCOL_VERSION so it cannot drift again.

Resolution: Resolved in 0d8a28d (2026-05-18): the test now asserts against the GATEWAY_PROTOCOL_VERSION / WORKER_PROTOCOL_VERSION constants, so it cannot drift again.

Client.Rust-004

Field Value
Severity Low
Category Documentation & comments
Location clients/rust/src/version.rs:7
Status Resolved

Description: CLIENT_VERSION is "0.1.0-dev" and its doc comment claims "Mirrors Cargo.toml", but Cargo.toml declares version = "0.1.0" (no -dev suffix). The comment is misleading and the value is not actually kept in sync with the manifest.

Recommendation: Either set CLIENT_VERSION from the build via env!("CARGO_PKG_VERSION"), or correct the constant to "0.1.0" and drop the "Mirrors Cargo.toml" claim.

Resolution: Resolved in 0d8a28d (2026-05-18): CLIENT_VERSION is now env!("CARGO_PKG_VERSION"), taken from Cargo.toml at compile time so the two cannot drift.

Client.Rust-005

Field Value
Severity Medium
Category Correctness & logic bugs
Location clients/rust/src/session.rs:489-520
Status Resolved

Description: register_server_handle, add_item_handle, and add_item2_handle fall through to reply.return_value … .unwrap_or_default(), returning 0 when the reply carries neither the expected typed payload nor an Int32 return_value. Because Session::invoke has already confirmed protocol_status == Ok, a malformed-but-OK reply silently yields handle 0, which the caller then uses as a real handle against the worker.

Recommendation: Return Err(Error::ProtocolStatus { … }) (or a dedicated Error::MalformedReply) when an OK reply lacks an extractable handle, instead of defaulting to 0.

Resolution: Resolved in 0d8a28d (2026-05-18): the three handle extractors now return Result<i32, Error> and yield the new Error::MalformedReply when an OK reply carries no usable handle.
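
To make the change in behaviour concrete, here is a minimal sketch of a handle extractor that fails instead of defaulting to 0. The `Reply` struct, its two fields, and the `detail` field on `MalformedReply` are hypothetical simplifications (the real reply is a generated prost message); only the function name, the `Result<i32, Error>` return type, and the `Error::MalformedReply` variant come from the finding.

```rust
/// Hypothetical error type; the real crate's `Error` has more variants.
#[derive(Debug, PartialEq)]
pub enum Error {
    /// An OK reply did not carry the data the command promises.
    MalformedReply { detail: String },
}

/// Hypothetical, simplified reply shape (assumption, not the generated type).
#[derive(Debug, Default)]
pub struct Reply {
    /// Handle from the expected typed payload, if that payload arm was sent.
    pub typed_handle: Option<i32>,
    /// Generic Int32 return value, if present.
    pub return_value: Option<i32>,
}

/// Prefers the typed payload, falls back to the generic return value, and
/// reports a malformed reply when neither is present.
pub fn add_item_handle(reply: &Reply) -> Result<i32, Error> {
    reply
        .typed_handle
        .or(reply.return_value)
        .ok_or_else(|| Error::MalformedReply {
            detail: "AddItem reply is OK but carries no handle".to_owned(),
        })
}
```

With this shape a caller can no longer mistake a shapeless reply for a valid handle 0; it has to handle the `Err` explicitly.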

Client.Rust-006

Field Value
Severity Medium
Category Error handling & resilience
Location clients/rust/src/session.rs:531-555
Status Resolved

Description: bulk_results returns Vec::new() for any (payload, kind) combination that does not match the expected arm — including an OK reply carrying the wrong or no payload. A caller of subscribe_bulk/add_item_bulk then sees an empty result vector and cannot distinguish "zero items processed" from "gateway returned a shapeless reply".

Recommendation: Treat a missing/mismatched bulk payload on an OK reply as an error rather than an empty vector, or document the empty-vec fallback explicitly and log it.

Resolution: Resolved in 0d8a28d (2026-05-18): bulk_results now returns Result<Vec<SubscribeResult>, Error> and yields Error::MalformedReply on a mismatched or absent bulk payload.
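
The distinction between "zero items processed" and "shapeless reply" can be sketched as follows. The `Payload` enum, the fields of `SubscribeResult`, the two-variant `BulkReplyKind`, and the string carried by `MalformedReply` are assumptions made for illustration; the real types are generated from the proto and have more arms.

```rust
#[derive(Debug, PartialEq)]
pub enum Error {
    /// Hypothetical payload: a human-readable reason.
    MalformedReply(String),
}

/// Hypothetical per-item result.
#[derive(Debug, Clone, PartialEq)]
pub struct SubscribeResult {
    pub item_handle: i32,
    pub ok: bool,
}

/// Two of the renamed variants from Client.Rust-002, enough for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BulkReplyKind {
    AddItem,
    Subscribe,
}

/// Hypothetical stand-in for the reply's payload arms.
pub enum Payload {
    AddItemBulk(Vec<SubscribeResult>),
    SubscribeBulk(Vec<SubscribeResult>),
}

/// Returns the per-item results only when the payload arm matches the
/// command that was sent; anything else is an error, not an empty vector.
pub fn bulk_results(
    payload: Option<Payload>,
    kind: BulkReplyKind,
) -> Result<Vec<SubscribeResult>, Error> {
    match (payload, kind) {
        (Some(Payload::AddItemBulk(results)), BulkReplyKind::AddItem) => Ok(results),
        (Some(Payload::SubscribeBulk(results)), BulkReplyKind::Subscribe) => Ok(results),
        (Some(_), other) => Err(Error::MalformedReply(format!(
            "payload arm does not match {other:?}"
        ))),
        (None, other) => Err(Error::MalformedReply(format!(
            "OK reply carries no payload for {other:?}"
        ))),
    }
}
```

An empty `Ok(vec![])` now means the gateway really processed zero items, while a missing or mismatched payload surfaces as `Err`.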

Client.Rust-007

Field Value
Severity Low
Category Design-document adherence
Location clients/rust/RustClientDesign.md:14-55
Status Resolved

Description: RustClientDesign.md is stale relative to the implemented code. It documents a nested crates/mxgateway-client/ layout (the real crate root is clients/rust/ with a flat src/), and lists tracing among "Expected dependencies", but tracing appears in no Cargo.toml. CLAUDE.md requires docs to change with the source.

Recommendation: Update RustClientDesign.md to the actual flat layout and remove tracing from the dependency list (or add tracing if structured logging is genuinely intended).

Resolution: Resolved in 0d8a28d (2026-05-18): the "Crate Layout" section now shows the actual flat layout (mxgateway-client as the workspace-root crate, mxgw-cli as a member) and the unused tracing entry was removed from the dependency list.

Client.Rust-008

Field Value
Severity Low
Category Performance & resource management
Location clients/rust/src/value.rs:161-261
Status Resolved

Description: MxValueProjection::from_proto and MxArrayProjection::from_proto deep-clone every element out of the wire message while MxValue/MxArrayValue also retain the original raw message. Every MxValue therefore holds two copies of its payload, wasteful for large string arrays or raw blobs arriving on the event stream.

Recommendation: Compute the projection lazily on demand, or have the projection borrow from raw, so array/raw payloads are not duplicated for every wrapped value.

Resolution: Resolved in 0d8a28d (2026-05-18): MxValue and MxArrayValue no longer cache a projection field — projection() computes the typed view on demand from raw. A value built only to be sent over the wire now holds a single copy of its payload and pays no projection cost.
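
A rough sketch of projection on demand, under stated assumptions: `RawValue` and `Projection` are hypothetical stand-ins for the wire message and the typed view, and this version borrows from `raw` (the second option in the recommendation). The finding does not say whether the real `projection()` returns an owned or a borrowed view.

```rust
/// Hypothetical, simplified wire value (the real one is a prost message).
#[derive(Debug, Clone, PartialEq)]
pub enum RawValue {
    Int32(i32),
    Text(String),
    Bytes(Vec<u8>),
}

/// Typed view that borrows from the raw value instead of cloning it.
#[derive(Debug, PartialEq)]
pub enum Projection<'a> {
    Int32(i32),
    Text(&'a str),
    Bytes(&'a [u8]),
}

/// Holds exactly one copy of the payload: the raw message.
pub struct MxValue {
    raw: RawValue,
}

impl MxValue {
    /// Wraps a raw value without computing any projection up front.
    pub fn from_raw(raw: RawValue) -> Self {
        Self { raw }
    }

    /// Returns the raw wire value.
    pub fn raw(&self) -> &RawValue {
        &self.raw
    }

    /// Computes the typed view each time it is asked for.
    pub fn projection(&self) -> Projection<'_> {
        match &self.raw {
            RawValue::Int32(v) => Projection::Int32(*v),
            RawValue::Text(s) => Projection::Text(s.as_str()),
            RawValue::Bytes(b) => Projection::Bytes(b.as_slice()),
        }
    }
}
```

A value that is only built to be sent over the wire never calls `projection()` and so pays nothing for it.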

Client.Rust-009

Field Value
Severity Low
Category Testing coverage
Location clients/rust/tests/client_behavior.rs, clients/rust/src/galaxy.rs
Status Resolved

Description: Several critical paths are untested: TLS channel setup (with_plaintext(false) / CA-file loading), mid-stream tonic::Status fault propagation through EventStream/DeployEventStream (tests only send Ok items), and the bulk-size cap (ensure_bulk_size rejecting >1000 items).

Recommendation: Add tests that (a) feed an Err(Status) into the event/deploy streams and assert it surfaces as the mapped Error, (b) assert add_item_bulk with 1001 items returns Error::InvalidArgument, and (c) exercise the CA-file/InvalidEndpoint error path.

Resolution: Resolved in 0d8a28d (2026-05-18): added add_item_bulk_rejects_input_above_the_thousand_item_cap, event_stream_surfaces_a_mid_stream_status_fault (the fake gateway now optionally emits a mid-stream Status::unavailable), and connect_with_unreadable_ca_file_reports_invalid_endpoint.
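
For reference, the bulk-size cap that the first new test exercises can be sketched like this. The constant name, the function signature, and the error payload are assumptions; the finding only states that `ensure_bulk_size` rejects more than 1000 items and that the caller sees `Error::InvalidArgument`.

```rust
/// Assumed name for the 1000-item cap mentioned in the finding.
pub const MAX_BULK_ITEMS: usize = 1000;

#[derive(Debug, PartialEq)]
pub enum Error {
    /// Hypothetical payload: a human-readable reason.
    InvalidArgument(String),
}

/// Rejects bulk requests above the cap before anything is sent.
pub fn ensure_bulk_size(len: usize) -> Result<(), Error> {
    if len > MAX_BULK_ITEMS {
        Err(Error::InvalidArgument(format!(
            "bulk request has {len} items; the limit is {MAX_BULK_ITEMS}"
        )))
    } else {
        Ok(())
    }
}
```

The boundary matters for the test: exactly 1000 items must pass and 1001 must fail.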

Client.Rust-010

Field Value
Severity Low
Category Error handling & resilience
Location clients/rust/src/client.rs:255-268, clients/rust/src/galaxy.rs:204-216
Status Resolved

Description: The client applies only a per-call deadline via Request::set_timeout and has no retry, reconnect, or transient-vs-permanent classification. A transient Unavailable (e.g. a gateway restart) maps to the catch-all Error::Status and is indistinguishable from a permanent failure. This is an acceptable v1 stance but is undocumented.

Recommendation: Either add a documented Error::Unavailable variant classifying Code::Unavailable/Code::ResourceExhausted, or explicitly document in the README that the client performs no retries and that transient failures arrive as Error::Status.

Resolution: Resolved in 0d8a28d (2026-05-18): added the Error::Unavailable variant; From<tonic::Status> maps Code::Unavailable and Code::ResourceExhausted to it, so callers can classify transient failures without unwrapping the raw status.
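
The classification can be illustrated with a dependency-free sketch. `Code` and `Status` below are hypothetical stand-ins for the tonic types, and `is_transient` is an illustrative helper that the finding does not mention; only the mapping of `Unavailable` and `ResourceExhausted` to `Error::Unavailable` is taken from the resolution.

```rust
/// Hypothetical subset of gRPC status codes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Code {
    Unavailable,
    ResourceExhausted,
    PermissionDenied,
    Internal,
}

/// Hypothetical stand-in for `tonic::Status`.
#[derive(Debug, Clone, PartialEq)]
pub struct Status {
    pub code: Code,
    pub message: String,
}

#[derive(Debug, PartialEq)]
pub enum Error {
    /// Transient failure: the gateway may come back.
    Unavailable(Status),
    /// Catch-all for every other status.
    Status(Status),
}

impl From<Status> for Error {
    fn from(status: Status) -> Self {
        let code = status.code;
        match code {
            Code::Unavailable | Code::ResourceExhausted => Error::Unavailable(status),
            _ => Error::Status(status),
        }
    }
}

impl Error {
    /// Illustrative helper (assumption): true when a retry might succeed.
    pub fn is_transient(&self) -> bool {
        matches!(self, Error::Unavailable(_))
    }
}
```

The client still performs no retries itself; the variant only lets a caller decide whether retrying is worthwhile.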

Client.Rust-011

Field Value
Severity Low
Category mxaccessgw conventions
Location clients/rust/src/session.rs:469
Status Resolved

Description: command_request hard-codes client_correlation_id as format!("rust-client-{}", kind.as_str_name()). Every invocation of the same command kind on a session uses an identical correlation id, so the id cannot correlate a specific request/reply pair in gateway logs or among concurrent in-flight calls. MXAccess parity diagnostics rely on correlation ids being unique per call.

Recommendation: Append a per-call unique suffix (monotonic counter or UUID) to the correlation id, or expose a way for the caller to supply one.

Resolution: Resolved in 0d8a28d (2026-05-18): correlation ids are built by next_correlation_id, which appends a process-wide atomic sequence number; Session::close uses it too.
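
A minimal sketch of such a generator, assuming a `&str` label parameter and a `rust-client-<label>-<n>` format (the exact string layout is not given in the finding). The `CORRELATION_SEQUENCE` name, the `AtomicU64` type, and the `Relaxed` ordering are the ones this review mentions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Process-wide sequence shared by every caller.
static CORRELATION_SEQUENCE: AtomicU64 = AtomicU64::new(1);

/// Builds a correlation id that is unique within the process.
/// `Relaxed` is sufficient: only the uniqueness of the counter value
/// matters, not its ordering relative to other memory operations.
pub fn next_correlation_id(label: &str) -> String {
    let sequence = CORRELATION_SEQUENCE.fetch_add(1, Ordering::Relaxed);
    format!("rust-client-{label}-{sequence}")
}
```

Two calls with the same label now yield different ids, so concurrent in-flight calls of the same command kind can be told apart in gateway logs.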

Client.Rust-012

Field Value
Severity High
Category mxaccessgw conventions
Location clients/rust/src/galaxy.rs:282
Status Resolved

Description: Found while verifying the fix for Client.Rust-001/002: cargo clippy --workspace --all-targets -- -D warnings reported a third violation the original review missed. The get_last_deploy_time test fake calls .clone() on a MutexGuard<Option<prost_types::Timestamp>>, and Option<Timestamp> is Copy (clippy::clone_on_copy). Under -D warnings this is a compile error, so clippy still did not pass after Client.Rust-001/002 alone.

Recommendation: Dereference instead of cloning: *self.state.last_deploy.lock().unwrap().

Resolution: Resolved in 0d8a28d (2026-05-18): replaced .clone() with a deref. cargo clippy --workspace --all-targets -- -D warnings now passes cleanly.
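
The fix pattern, sketched with hypothetical types: `Timestamp` stands in for `prost_types::Timestamp`, and `FakeState` with its accessor is an assumed shape for the test fake.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the protobuf timestamp; it is `Copy`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Timestamp {
    pub seconds: i64,
    pub nanos: i32,
}

/// Assumed shape of the fake's shared state.
pub struct FakeState {
    pub last_deploy: Mutex<Option<Timestamp>>,
}

impl FakeState {
    /// Copies the value out of the guard by dereferencing it.
    /// Calling `.clone()` here would trip `clippy::clone_on_copy`.
    pub fn last_deploy(&self) -> Option<Timestamp> {
        *self.last_deploy.lock().unwrap()
    }
}
```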

Client.Rust-013

Field Value
Severity High
Category mxaccessgw conventions
Location src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:414-424 (origin); clients/rust/src/generated.rs:11-31 (suppression site)
Status Resolved

Description: cargo clippy --workspace --all-targets -- -D warnings fails again on this commit, this time on a clippy::doc_lazy_continuation violation in generated code:

error: doc list item without indentation
   --> .../mxaccess_gateway.v1.rs:526:5
    |
526 | /// `timeout_ms == 0` uses the gateway-configured default (1000 ms).
    |     ^

The lint fires because the ReadBulkCommand proto comment (added with the bulk Read feature in commit 5e375f6) writes a bulleted list and then a trailing paragraph without the required blank line. prost-build forwards the proto comment verbatim into Rust doc comments, and the Rust client compiles those generated modules with crate-default lints. The crate already opts out of clippy::large_enum_variant in src/generated.rs for exactly this kind of generator-style problem, but doc_lazy_continuation is not on the allow-list, so the lint reaches -D warnings and breaks the documented cargo clippy --workspace --all-targets -- -D warnings invocation that CLAUDE.md mandates pass. The Rust client review was previously closed as clippy-clean (Client.Rust-001/002/012); this is the third clippy-clean regression caused by generated code in this module and warrants a more durable fix.

Recommendation: Add #![allow(clippy::doc_lazy_continuation)] to each generated submodule in clients/rust/src/generated.rs alongside clippy::large_enum_variant, so generated doc comments — which the client cannot edit — cannot break the -D warnings build. Independently, fix the upstream proto comment to insert a blank line before the trailing paragraph so the C# / Go / Python / Java generators do not carry the same flaky text.

Resolution: 2026-05-20 — Added #![allow(clippy::doc_lazy_continuation)] to each generated submodule in clients/rust/src/generated.rs next to the existing clippy::large_enum_variant allow, and reformatted the ReadBulkCommand proto comment in src/MxGateway.Contracts/Protos/mxaccess_gateway.proto to surround the bulleted list with blank lines so doc-comment generators in every language see a properly-terminated list. cargo clippy --workspace --all-targets -- -D warnings and cargo test --workspace now pass, and dotnet build src/MxGateway.Contracts/MxGateway.Contracts.csproj reports 0 warnings.
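
Both halves of the fix can be sketched in one place. The module names, the struct fields other than `timeout_ms`, and the bullet text are placeholders; in the real crate the struct body is generated by prost-build rather than written by hand.

```rust
/// Hypothetical stand-in for `src/generated.rs`.
pub mod generated {
    pub mod mxaccess_gateway_v1 {
        // The client cannot edit generated doc comments, so lints that only
        // concern their formatting are allowed for this submodule alone.
        #![allow(clippy::large_enum_variant)]
        #![allow(clippy::doc_lazy_continuation)]

        /// Placeholder summary line.
        ///
        /// - placeholder bullet one
        /// - placeholder bullet two
        ///
        /// `timeout_ms == 0` uses the gateway-configured default (1000 ms).
        #[derive(Debug, Default, Clone, PartialEq)]
        pub struct ReadBulkCommand {
            /// Assumed field.
            pub server_handle: i32,
            /// Assumed field.
            pub tag_addresses: Vec<String>,
            /// Per-call timeout in milliseconds; 0 selects the default.
            pub timeout_ms: u32,
        }
    }
}
```

The blank `///` line after the last bullet is what terminates the list; without it the trailing paragraph is parsed as a lazy continuation of the final bullet, which is what the lint reports.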

Client.Rust-014

Field Value
Severity Low
Category mxaccessgw conventions
Location clients/rust/crates/mxgw-cli/src/main.rs:450,497
Status Resolved

Description: Client.Rust-011 made Session build unique correlation ids per call, but the mxgw CLI's Ping and CloseSession subcommands still hard-code client_correlation_id: "rust-cli-ping".to_owned() and "rust-cli-close-session".to_owned(). Both go through client.invoke(…) / client.close_session_raw(…) rather than the Session helpers, so the library's id generator does not run. The CLI is the cross-language e2e driver — when the same machine runs concurrent CLI smokes, every ping/close-session request collides on the same correlation id in gateway logs, defeating the diagnostic value the library fix unlocked.

Recommendation: Either (a) expose session::next_correlation_id as a pub(crate) or library-level helper and have the CLI call it from Ping/CloseSession, or (b) replace these RPCs with the higher-level Session helpers (Session::close, and a thin Session::ping wrapper) so the CLI shares the library's correlation-id discipline by construction.

Resolution: 2026-05-20. Promoted session::next_correlation_id from a module-private helper to a pub library-level function (it already lived in the pub mod session) and updated the mxgw CLI's Ping and CloseSession subcommands to call mxgateway_client::session::next_correlation_id("cli-ping") / next_correlation_id("cli-close-session") instead of the hard-coded "rust-cli-ping" / "rust-cli-close-session" strings. Concurrent CLI smokes now produce a unique correlation id per call (driven by the same process-wide CORRELATION_SEQUENCE AtomicU64 the library uses), so gateway logs can tell concurrent requests apart again. cargo fmt, cargo build --workspace, cargo clippy --workspace --all-targets -- -D warnings, and cargo test --workspace all pass.

Client.Rust-015

Field Value
Severity Medium
Category Error handling & resilience
Location clients/rust/crates/mxgw-cli/src/main.rs:1053-1070
Status Resolved

Description: The new cross-language benchmark bench-read-bulk pushes the elapsed time of every read_bulk call into latencies_ms regardless of whether the call returned Ok or Err:

let outcome = session.read_bulk(server_handle, &tags, timeout_ms).await;
let elapsed_ms = call_start.elapsed().as_secs_f64() * 1000.0;
latencies_ms.push(elapsed_ms);
match outcome {
    Ok(results) => { successful_calls += 1;  }
    Err(_) => failed_calls += 1,
}

A failed read_bulk (transient Unavailable, deadline-exceeded mid-call, etc.) typically returns later than a successful one — it includes the full per-call timeout that the success path never waits for. The histogram therefore conflates "p99 cached-read latency" with "p99 of (cached-read + timed-out call)", and the JSON document the PowerShell driver collates publishes latencyMs.p99 / latencyMs.max that no longer represent successful-call latency. Worse, the failure category is silently dropped (Err(_) => failed_calls += 1) so a benchmark run that fails on every call still emits a coherent-looking JSON without ever surfacing why. This is misleading for a benchmark whose JSON shape is the cross-language comparison contract.

Recommendation: Only push elapsed time into latencies_ms on Ok, or split into two histograms (successLatencyMs and failureLatencyMs) and log the first failure's error string into the stats record so a partial-failure run is visible at the report layer.

Resolution: 2026-05-20 — Extracted the per-iteration accounting in bench-read-bulk into a BenchReadBulkStats helper with explicit record_success/record_failure methods. Successful read_bulk calls now flow into success_latencies_ms (driving the cross-language latencyMs.p99/max JSON contract), failures flow into a separate failure_latencies_ms histogram surfaced as failureLatencyMs, and the first failure's redacted error string is stashed as firstFailure so a partial-failure run is visible at the report layer instead of producing a coherent-looking JSON that hides every error. Added a unit test (bench_read_bulk_stats_keeps_failures_out_of_success_latency_histogram) that records two fast successes plus a deliberately slow failure and asserts the success histogram never sees the failure latency, plus a smaller smoke test for the zero-duration calls-per-second path.
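
The accounting split can be sketched as below. The struct name, the two histogram field names, and the `record_success` / `record_failure` method names come from the resolution; the method signatures, the `first_failure` field type, and the `success_max_ms` helper are assumptions, and error redaction is left out.

```rust
/// Per-run accounting that keeps failed calls out of the success histogram.
#[derive(Debug, Default)]
pub struct BenchReadBulkStats {
    pub success_latencies_ms: Vec<f64>,
    pub failure_latencies_ms: Vec<f64>,
    /// Error text of the first failed call, if any.
    pub first_failure: Option<String>,
}

impl BenchReadBulkStats {
    /// Records a successful call's latency.
    pub fn record_success(&mut self, elapsed_ms: f64) {
        self.success_latencies_ms.push(elapsed_ms);
    }

    /// Records a failed call separately and remembers only the first error.
    pub fn record_failure(&mut self, elapsed_ms: f64, error: &str) {
        self.failure_latencies_ms.push(elapsed_ms);
        if self.first_failure.is_none() {
            self.first_failure = Some(error.to_owned());
        }
    }

    /// Illustrative helper: maximum latency over successful calls only.
    pub fn success_max_ms(&self) -> Option<f64> {
        self.success_latencies_ms.iter().copied().reduce(f64::max)
    }
}
```

In the benchmark loop, the `Ok` arm would call `record_success(elapsed_ms)` and the `Err(e)` arm `record_failure(elapsed_ms, &e.to_string())`, so a slow timed-out call can no longer inflate the success p99 or max.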

Client.Rust-016

Field Value
Severity Medium
Category Testing coverage
Location clients/rust/tests/client_behavior.rs, clients/rust/src/session.rs:489-519,654-768
Status Resolved

Description: The fixes for Client.Rust-005 / 006 added five new Error::MalformedReply paths to session.rs (register_server_handle, add_item_handle, add_item2_handle, bulk_results, bulk_write_results) plus the inline branch in read_bulk. None of them are exercised by tests — every test in client_behavior.rs feeds the matching payload back to the client, so the malformed-reply branches are dead code from the test suite's perspective. The new bulk-write helpers (write_bulk, write2_bulk, write_secured_bulk, write_secured2_bulk) only have a single happy-path assertion via write_bulk, leaving the three other variants and every per-entry-failure shape untested. The bench-read-bulk flow has no test (the driver script is the only consumer). The Error::Unavailable variant from Client.Rust-010 is covered by event_stream_surfaces_a_mid_stream_status_fault, but the same variant on a unary Code::Unavailable is not.

Recommendation: Add three light tests against the existing FakeGateway:

  1. Have the fake reply to AddItem (or Register / AddItem2) with protocol_status = Ok and no payload, and assert the client surfaces Error::MalformedReply.
  2. Have the fake reply to WriteBulk with protocol_status = Ok and the wrong payload arm (e.g. an AddItemReply body), and assert Error::MalformedReply.
  3. Have the fake fail the unary Invoke with Status::unavailable(...) and assert Error::Unavailable.

Optionally add Write2Bulk / WriteSecuredBulk / WriteSecured2Bulk smoke assertions so all four bulk-write families have at least one round-trip test.

Resolution: 2026-05-20. Added ten new integration tests in clients/rust/tests/client_behavior.rs. Each new Error::MalformedReply site is exercised via a test-only InvokeOverride injected into FakeState that lets a single test pin the fake gateway's Invoke handler to one of three malformed shapes (OK reply with no payload, OK reply with the wrong payload arm for read_bulk, OK reply with the wrong payload arm for the other bulk / bulk-write families). The six malformed-reply tests are register_returns_malformed_reply_when_ok_reply_has_no_payload, add_item_returns_malformed_reply_when_ok_reply_has_no_payload, add_item2_returns_malformed_reply_when_ok_reply_has_no_payload, subscribe_bulk_returns_malformed_reply_on_mismatched_payload_arm, write_bulk_returns_malformed_reply_on_mismatched_payload_arm, and read_bulk_returns_malformed_reply_on_mismatched_payload_arm. The unary Error::Unavailable path is covered by unary_invoke_maps_status_unavailable_to_error_unavailable (the override returns Status::unavailable(...)). The remaining three bulk-write families gained round-trip smoke tests (write2_bulk_round_trips_through_the_fake_gateway, write_secured_bulk_round_trips_through_the_fake_gateway, write_secured2_bulk_round_trips_through_the_fake_gateway), which extend the fake gateway's dispatcher with happy-path replies for Write2Bulk / WriteSecuredBulk / WriteSecured2Bulk. The bench-read-bulk flow gets a BenchReadBulkStats unit test in crates/mxgw-cli/src/main.rs (see Client.Rust-015) that asserts the latency-tracking change keeps failed-call durations out of latencyMs.
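
The override mechanism can be pictured with a much smaller, synchronous sketch. Everything here is hypothetical apart from the names `InvokeOverride`, `MalformedReply`, and `Unavailable`: the real fake is an async tonic service, and the real override has three malformed shapes rather than the two outcomes shown.

```rust
#[derive(Debug, PartialEq)]
pub enum Error {
    MalformedReply,
    Unavailable,
}

/// Test-only switch that pins the fake's reply to a chosen failure shape.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum InvokeOverride {
    /// Reply with status OK but no payload.
    OkWithoutPayload,
    /// Fail the call with an "unavailable" status.
    Unavailable,
}

/// Hypothetical fake gateway; `None` means the normal happy path.
#[derive(Default)]
pub struct FakeGateway {
    pub invoke_override: Option<InvokeOverride>,
}

impl FakeGateway {
    /// Returns the handle payload, if any, or a transport-level failure.
    pub fn invoke_add_item(&self) -> Result<Option<i32>, Error> {
        match self.invoke_override {
            None => Ok(Some(42)),
            Some(InvokeOverride::OkWithoutPayload) => Ok(None),
            Some(InvokeOverride::Unavailable) => Err(Error::Unavailable),
        }
    }
}

/// Client-side helper under test: an OK reply without a handle is an error.
pub fn add_item(gateway: &FakeGateway) -> Result<i32, Error> {
    gateway.invoke_add_item()?.ok_or(Error::MalformedReply)
}
```

Because the override lives in the fake's state, each test selects its failure shape without touching the dispatcher used by the happy-path tests.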

Client.Rust-017

Field Value
Severity Low
Category Design-document adherence
Location clients/rust/RustClientDesign.md:79-99,156-163
Status Resolved

Description: CLAUDE.md requires docs to change with the source. RustClientDesign.md was refreshed to fix the layout/tracing drift (Client.Rust-007), but the Session API surface in the design (Library API block, lines 79-99) still lists only the original six bulk helpers (add_item_bulk, advise_item_bulk, remove_item_bulk, un_advise_item_bulk, subscribe_bulk, unsubscribe_bulk) and is missing the four new bulk-write helpers and read_bulk (write_bulk, write2_bulk, write_secured_bulk, write_secured2_bulk, read_bulk) that landed in commits 5e375f6 / f220908 / 61644e6. The Error Handling block (lines 130-146) still enumerates Transport, Status, Authentication, Authorization, Session, Worker, Command, MxAccess, Timeout, Cancelled, but not MalformedReply, Unavailable, or InvalidEndpoint, all of which are now public variants of the crate's Error enum. The Test CLI block (lines 158-163) lists version / smoke / stream-events / write but is missing every new subcommand (read-bulk, write-bulk, write2-bulk, write-secured-bulk, write-secured2-bulk, bench-read-bulk, galaxy watch).

Recommendation: Bring the design doc back in sync: extend the Session API code block to enumerate the bulk-write/read methods, expand the Error enum to match clients/rust/src/error.rs, and add the missing CLI subcommands. The README is already up to date, so this is design-doc-only churn.

Resolution: 2026-05-20. Brought clients/rust/RustClientDesign.md back in sync with the implementation. The Session block now lists the five new bulk helpers (write_bulk, write2_bulk, write_secured_bulk, write_secured2_bulk, read_bulk) alongside the original six and notes that session::next_correlation_id is pub for raw-RPC consumers (the CLI). The Error enum block now matches clients/rust/src/error.rs (InvalidEndpoint, InvalidArgument, Transport, Authentication, Authorization, Timeout, Cancelled, Unavailable, Status, Command, ProtocolStatus, MalformedReply), with a short paragraph explaining what Unavailable, MalformedReply, and InvalidEndpoint classify. The Test CLI block enumerates every subcommand the binary exposes today: version, ping, open-session, close-session, register, add-item, advise, subscribe-bulk, unsubscribe-bulk, read-bulk, write, write2, write-bulk, write2-bulk, write-secured-bulk, write-secured2-bulk, stream-events, bench-read-bulk, smoke, and the galaxy {test-connection,last-deploy-time,discover-hierarchy,watch} subtree.