Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules

Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
  GatewayGrpcScopeResolver so non-admin keys can use them; document
  the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
  CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
  in generated tonic code by reformatting the ReadBulkCommand proto
  comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
  make DisposeAsync race-safe against in-flight CloseAsync (-016);
  add constraint-enforcement test coverage for the bulk-plan path
  (-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
  can distinguish graceful shutdown from a real STA-affinity
  violation (-016); have the watchdog skip StaHung while
  CurrentCommandCorrelationId is non-empty so a legitimate slow
  ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
  11 GatewaySession bulk methods (-013); replace the real TCP probe
  in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
  (-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
  test and assert OnWriteComplete (-012); add live tests for
  Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
  abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
  CreateForTesting factory (-016); cover WorkerCancel and
  unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
  beforeStart() (-014); return a CancellingCompletableFuture that
  actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
  the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
  histograms with failed-call durations (-015); add coverage for
  the five MalformedReply paths, the bulk-write helpers, the
  Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
  command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
  WorkerAlarmRpcDispatcher missing-session handling; drop the
  duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
  XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
  subscriptionExpression / ExecutingCommand arms; preserve
  factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
  three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
  FakeWorkerProcess.WaitForExitAsync against a real
  TaskCompletionSource; switch the heartbeat-expires test to
  ManualTimeProvider; add InvariantCulture to the remaining
  DateTimeOffset.Parse sites;
  document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
  IDisposable, class-level [Trait], single-source ZB default
  connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
  so absent env vars SKIP not pass; PascalCase rename of probe
  [Fact]s; deterministic deadline test; new frame-protocol error
  tests; ComputeTransitions diff-coverage; relocate dev-rig probes
  to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
  Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
  TreatWarningsAsErrors / analysers apply; document
  DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
  bulk-read handles in CLI; surface AcknowledgeAlarm transport
  faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
  runWriteBulkVariant; document the six new subcommands in
  writeUsage; drain galaxy-watch events on limit; switch io.EOF
  comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
  option; regex-based credential redaction; Long.toUnsignedString
  for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
  _percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
  _api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
  stop hard-coding correlation IDs; resync RustClientDesign.md
  with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-20 09:46:47 -04:00
parent 1cd51bbda3
commit a0203503a7
122 changed files with 8723 additions and 757 deletions
+54 -12
@@ -93,11 +93,24 @@ impl Session {
pub async fn subscribe_bulk(&self, server_handle: i32, tag_addresses: Vec<String>) -> Result<Vec<SubscribeResult>, Error>;
pub async fn unsubscribe_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
pub async fn write(&self, server_handle: i32, item_handle: i32, value: MxValue, user_id: i32) -> Result<(), Error>;
pub async fn write_bulk(&self, server_handle: i32, entries: Vec<WriteBulkEntry>, user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
pub async fn write2_bulk(&self, server_handle: i32, entries: Vec<Write2BulkEntry>, timestamp: prost_types::Timestamp, user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
pub async fn write_secured_bulk(&self, server_handle: i32, entries: Vec<WriteSecuredBulkEntry>, current_user_id: i32, verifier_user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
pub async fn write_secured2_bulk(&self, server_handle: i32, entries: Vec<WriteSecured2BulkEntry>, timestamp: prost_types::Timestamp, current_user_id: i32, verifier_user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
pub async fn read_bulk(&self, server_handle: i32, tags: &[String], timeout_ms: u32) -> Result<Vec<ReadBulkResult>, Error>;
pub async fn events(&self) -> Result<impl Stream<Item = Result<MxEvent, Error>>, Error>;
pub async fn close(&self) -> Result<(), Error>;
}
```
The four bulk-write helpers (`write_bulk`, `write2_bulk`, `write_secured_bulk`,
`write_secured2_bulk`) and `read_bulk` mirror the worker's bulk command shapes
in `mxaccess_gateway.proto` and use the same correlation-id discipline as the
unary helpers. `session::next_correlation_id` is `pub` so that consumers
constructing raw `MxCommandRequest`/`CloseSessionRequest` payloads outside
the `Session` helpers (notably the `mxgw` test CLI's `ping` and
`close-session` subcommands) share the same id generation.
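As a rough illustration of the shared id generation described above, the sketch below reproduces the shape of `session::next_correlation_id` in standalone form. It assumes only the `rust-client-{label}-{N}` format and a process-wide atomic counter; everything else about the real module is omitted.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide sequence shared by every caller in the process.
static CORRELATION_SEQUENCE: AtomicU64 = AtomicU64::new(0);

/// Build a `rust-client-{label}-{N}` id, where `N` is the next value of
/// the shared counter. Sketch only; the real function lives in `session.rs`.
#[must_use]
pub fn next_correlation_id(label: &str) -> String {
    // `fetch_add` returns the previous value, so ids start at 0.
    let sequence = CORRELATION_SEQUENCE.fetch_add(1, Ordering::Relaxed);
    format!("rust-client-{label}-{sequence}")
}
```

Because the counter is process-wide, two calls with the same label (for example `cli-ping`) should yield distinct ids, which is what lets repeated commands be told apart in gateway logs.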
## Authentication
Use a `tonic` interceptor or request extension layer to add:
@@ -132,19 +145,29 @@ Use `thiserror`:
```rust
pub enum Error {
InvalidEndpoint { endpoint: String, detail: String },
InvalidArgument { name: String, detail: String },
Transport(tonic::transport::Error),
Status(tonic::Status),
Authentication(String),
Authorization(String),
Session(SessionError),
Worker(WorkerError),
Command(CommandError),
MxAccess(MxAccessError),
Timeout,
Cancelled,
Authentication { message: String, status: Box<tonic::Status> },
Authorization { message: String, status: Box<tonic::Status> },
Timeout { message: String, status: Box<tonic::Status> },
Cancelled { message: String, status: Box<tonic::Status> },
Unavailable { message: String, status: Box<tonic::Status> },
Status(Box<tonic::Status>),
Command(Box<CommandError>),
ProtocolStatus { operation: &'static str, code: ProtocolStatusCode, message: String },
MalformedReply { detail: String },
}
```
`Unavailable` classifies the transient `Code::Unavailable` /
`Code::ResourceExhausted` statuses so callers can decide whether to retry
without unwrapping the raw status. `MalformedReply` surfaces OK replies
whose payload does not carry the data the command contract requires (for
example, an `AddItem` reply missing the item handle, or a `WriteBulk` reply
carrying the wrong payload arm). `InvalidEndpoint` is returned when the
endpoint URL fails to parse or its TLS material cannot be loaded.
Preserve raw command replies in `CommandError` where applicable.
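One way a caller might act on this classification is sketched below. The `Error` enum here is a simplified stand-in (the real variants also carry a boxed `tonic::Status`), and `retry_delay` together with its backoff constants is a hypothetical helper, not part of the client API.

```rust
use std::time::Duration;

/// Simplified stand-in for the client's `Error`; the status payloads are
/// omitted so the sketch compiles on its own.
#[derive(Debug)]
pub enum Error {
    Unavailable { message: String },
    MalformedReply { detail: String },
}

/// Hypothetical helper: return a backoff delay for transient errors and
/// `None` for errors that should not be retried.
pub fn retry_delay(error: &Error, attempt: u32) -> Option<Duration> {
    match error {
        // Only `Unavailable` is treated as transient, and only for the
        // first five attempts (an assumed limit).
        Error::Unavailable { .. } if attempt < 5 => {
            // Exponential backoff: 100 ms, 200 ms, 400 ms, ...
            Some(Duration::from_millis(100 * 2u64.pow(attempt)))
        }
        // A malformed reply is a contract violation; retrying will not help.
        _ => None,
    }
}
```

The caller never inspects the raw gRPC status: matching on the variant is enough to decide whether another attempt is worthwhile.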
## Test CLI
@@ -153,13 +176,32 @@ Binary: `mxgw`.
Use `clap` derive.
Commands:
Commands (see `clients/rust/README.md` for full argument lists):
```text
mxgw version
mxgw smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt
mxgw ping
mxgw open-session
mxgw close-session --session-id <id>
mxgw register --session-id <id> --client-name <name>
mxgw add-item --session-id <id> --server-handle <h> --item <tag>
mxgw advise --session-id <id> --server-handle <h> --item-handle <h>
mxgw subscribe-bulk --session-id <id> --server-handle <h> --items <a,b,c>
mxgw unsubscribe-bulk --session-id <id> --server-handle <h> --item-handles <1,2,3>
mxgw read-bulk --session-id <id> --server-handle <h> --items <a,b,c> --timeout-ms 1500
mxgw write --session-id <id> --server-handle 1 --item-handle 1 --value-type int32 --value 123
mxgw write2 --session-id <id> --server-handle 1 --item-handle 1 --value-type int32 --value 123 --timestamp <rfc3339>
mxgw write-bulk --session-id <id> --server-handle <h> --item-handles <1,2> --value-type int32 --values <1,2>
mxgw write2-bulk --session-id <id> --server-handle <h> --item-handles <1,2> --value-type int32 --values <1,2> --timestamp <rfc3339>
mxgw write-secured-bulk --session-id <id> --server-handle <h> --item-handles <1,2> --value-type int32 --values <1,2>
mxgw write-secured2-bulk --session-id <id> --server-handle <h> --item-handles <1,2> --value-type int32 --values <1,2> --timestamp <rfc3339>
mxgw stream-events --session-id <id> --json
mxgw write --session-id <id> --server-handle 1 --item-handle 1 --type int32 --value 123
mxgw bench-read-bulk --duration-seconds 30 --bulk-size 6 --json
mxgw smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt
mxgw galaxy test-connection
mxgw galaxy last-deploy-time
mxgw galaxy discover-hierarchy
mxgw galaxy watch [--last-seen-deploy-time <rfc3339>] [--max-events N]
```
JSON output should use `serde_json`.
+199 -51
@@ -447,7 +447,9 @@ async fn run(cli: Cli) -> Result<(), Error> {
let client = connect(connection).await?;
let reply = client
.invoke(MxCommandRequest {
client_correlation_id: "rust-cli-ping".to_owned(),
client_correlation_id: mxgateway_client::session::next_correlation_id(
"cli-ping",
),
command: Some(MxCommand {
kind: MxCommandKind::Ping as i32,
payload: Some(mxgateway_client::generated::mxaccess_gateway::v1::mx_command::Payload::Ping(
@@ -494,7 +496,9 @@ async fn run(cli: Cli) -> Result<(), Error> {
let reply = client
.close_session_raw(CloseSessionRequest {
session_id,
client_correlation_id: "rust-cli-close-session".to_owned(),
client_correlation_id: mxgateway_client::session::next_correlation_id(
"cli-close-session",
),
})
.await?;
if json {
@@ -1034,19 +1038,13 @@ async fn run_bench_read_bulk(
.map(|r| r.item_handle)
.collect();
let warmup_deadline = std::time::Instant::now()
+ std::time::Duration::from_secs(warmup_seconds);
let warmup_deadline =
std::time::Instant::now() + std::time::Duration::from_secs(warmup_seconds);
while std::time::Instant::now() < warmup_deadline {
let _ = session
.read_bulk(server_handle, &tags, timeout_ms)
.await;
let _ = session.read_bulk(server_handle, &tags, timeout_ms).await;
}
let mut latencies_ms: Vec<f64> = Vec::with_capacity(65_536);
let mut total_read_results: u64 = 0;
let mut cached_read_results: u64 = 0;
let mut successful_calls: u64 = 0;
let mut failed_calls: u64 = 0;
let mut stats = BenchReadBulkStats::default();
let steady_start = std::time::Instant::now();
let steady_deadline = steady_start + std::time::Duration::from_secs(duration_seconds);
@@ -1054,18 +1052,9 @@ async fn run_bench_read_bulk(
let call_start = std::time::Instant::now();
let outcome = session.read_bulk(server_handle, &tags, timeout_ms).await;
let elapsed_ms = call_start.elapsed().as_secs_f64() * 1000.0;
latencies_ms.push(elapsed_ms);
match outcome {
Ok(results) => {
successful_calls += 1;
for r in &results {
total_read_results += 1;
if r.was_cached {
cached_read_results += 1;
}
}
}
Err(_) => failed_calls += 1,
Ok(results) => stats.record_success(elapsed_ms, &results),
Err(error) => stats.record_failure(elapsed_ms, &error),
}
}
let steady_elapsed = steady_start.elapsed();
@@ -1074,36 +1063,20 @@ async fn run_bench_read_bulk(
let _ = session.unsubscribe_bulk(server_handle, item_handles).await;
}
let total_calls = successful_calls + failed_calls;
let calls_per_second = if steady_elapsed.as_secs_f64() > 0.0 {
total_calls as f64 / steady_elapsed.as_secs_f64()
} else {
0.0
let context = BenchReadBulkContext {
endpoint: &endpoint,
client_name: &client_name,
bulk_size,
duration_seconds,
warmup_seconds,
steady_elapsed,
tags: &tags,
};
let summary = percentile_summary(&latencies_ms);
let stats = serde_json::json!({
"language": "rust",
"command": "bench-read-bulk",
"endpoint": endpoint,
"clientName": client_name,
"bulkSize": bulk_size,
"durationSeconds": duration_seconds,
"warmupSeconds": warmup_seconds,
"durationMs": steady_elapsed.as_millis() as u64,
"tags": tags,
"totalCalls": total_calls,
"successfulCalls": successful_calls,
"failedCalls": failed_calls,
"totalReadResults": total_read_results,
"cachedReadResults": cached_read_results,
"callsPerSecond": round_to(calls_per_second, 2),
"latencyMs": summary,
});
let json_stats = stats.to_json(&context);
if use_json {
println!("{}", stats);
println!("{}", json_stats);
} else {
println!("{calls_per_second}");
println!("{}", stats.calls_per_second(steady_elapsed));
}
Ok::<(), Error>(())
}
@@ -1113,6 +1086,102 @@ async fn run_bench_read_bulk(
bench_outcome
}
/// Per-iteration accounting for `bench-read-bulk`.
///
/// Only successful `read_bulk` calls contribute to the success-latency
/// histogram (`success_latencies_ms`). Failures are tracked separately in
/// `failure_latencies_ms` and the first failure's redacted error string is
/// stashed in `first_failure` so a partial-failure run is visible in the
/// emitted JSON. This keeps the cross-language `latencyMs.p99`/`max`
/// contract honest: it reports successful-call latency only and never
/// folds in a per-call timeout from a failed RPC.
#[derive(Default)]
struct BenchReadBulkStats {
success_latencies_ms: Vec<f64>,
failure_latencies_ms: Vec<f64>,
total_read_results: u64,
cached_read_results: u64,
successful_calls: u64,
failed_calls: u64,
first_failure: Option<String>,
}
impl BenchReadBulkStats {
fn record_success(
&mut self,
elapsed_ms: f64,
results: &[mxgateway_client::generated::mxaccess_gateway::v1::BulkReadResult],
) {
self.success_latencies_ms.push(elapsed_ms);
self.successful_calls += 1;
for result in results {
self.total_read_results += 1;
if result.was_cached {
self.cached_read_results += 1;
}
}
}
fn record_failure(&mut self, elapsed_ms: f64, error: &Error) {
self.failure_latencies_ms.push(elapsed_ms);
self.failed_calls += 1;
if self.first_failure.is_none() {
self.first_failure = Some(error.to_string());
}
}
fn total_calls(&self) -> u64 {
self.successful_calls + self.failed_calls
}
fn calls_per_second(&self, elapsed: std::time::Duration) -> f64 {
let seconds = elapsed.as_secs_f64();
if seconds > 0.0 {
self.total_calls() as f64 / seconds
} else {
0.0
}
}
fn to_json(&self, context: &BenchReadBulkContext<'_>) -> serde_json::Value {
let calls_per_second = self.calls_per_second(context.steady_elapsed);
let success_summary = percentile_summary(&self.success_latencies_ms);
let failure_summary = percentile_summary(&self.failure_latencies_ms);
serde_json::json!({
"language": "rust",
"command": "bench-read-bulk",
"endpoint": context.endpoint,
"clientName": context.client_name,
"bulkSize": context.bulk_size,
"durationSeconds": context.duration_seconds,
"warmupSeconds": context.warmup_seconds,
"durationMs": context.steady_elapsed.as_millis() as u64,
"tags": context.tags,
"totalCalls": self.total_calls(),
"successfulCalls": self.successful_calls,
"failedCalls": self.failed_calls,
"totalReadResults": self.total_read_results,
"cachedReadResults": self.cached_read_results,
"callsPerSecond": round_to(calls_per_second, 2),
"latencyMs": success_summary,
"failureLatencyMs": failure_summary,
"firstFailure": self.first_failure,
})
}
}
/// Static configuration for one `bench-read-bulk` run, packaged so the
/// JSON serialiser can quote it back without taking eight positional args.
struct BenchReadBulkContext<'a> {
endpoint: &'a str,
client_name: &'a str,
bulk_size: usize,
duration_seconds: u64,
warmup_seconds: u64,
steady_elapsed: std::time::Duration,
tags: &'a [String],
}
fn percentile_summary(sample: &[f64]) -> serde_json::Value {
if sample.is_empty() {
return serde_json::json!({ "p50": 0.0, "p95": 0.0, "p99": 0.0, "max": 0.0, "mean": 0.0 });
@@ -1294,7 +1363,13 @@ fn build_write_bulk_entries(
item_handles: &[i32],
value_type: CliValueType,
values: &[String],
) -> Result<Vec<(i32, mxgateway_client::generated::mxaccess_gateway::v1::MxValue)>, Error> {
) -> Result<
Vec<(
i32,
mxgateway_client::generated::mxaccess_gateway::v1::MxValue,
)>,
Error,
> {
if item_handles.len() != values.len() {
return Err(Error::InvalidArgument {
name: "values".to_owned(),
@@ -1660,4 +1735,77 @@ mod tests {
assert_eq!(frac.seconds, utc.seconds);
assert_eq!(frac.nanos, 250_000_000);
}
#[test]
fn bench_read_bulk_stats_keeps_failures_out_of_success_latency_histogram() {
use mxgateway_client::generated::mxaccess_gateway::v1::BulkReadResult;
use mxgateway_client::Error;
let mut stats = super::BenchReadBulkStats::default();
let cached = BulkReadResult {
was_cached: true,
was_successful: true,
..BulkReadResult::default()
};
let uncached = BulkReadResult {
was_cached: false,
was_successful: true,
..BulkReadResult::default()
};
// Two fast successes and one slow failure: the slow failure must
// not pollute the success p99/max histogram.
stats.record_success(1.5, std::slice::from_ref(&cached));
stats.record_success(2.0, std::slice::from_ref(&uncached));
let failure = Error::MalformedReply {
detail: "synthetic failure for the bench test".to_owned(),
};
stats.record_failure(1_500.0, &failure);
assert_eq!(stats.success_latencies_ms, vec![1.5, 2.0]);
assert_eq!(stats.failure_latencies_ms, vec![1_500.0]);
assert_eq!(stats.successful_calls, 2);
assert_eq!(stats.failed_calls, 1);
assert_eq!(stats.total_calls(), 3);
assert_eq!(stats.total_read_results, 2);
assert_eq!(stats.cached_read_results, 1);
assert!(stats
.first_failure
.as_deref()
.unwrap()
.contains("synthetic failure"));
let elapsed = std::time::Duration::from_secs(1);
let context = super::BenchReadBulkContext {
endpoint: "http://fake",
client_name: "client",
bulk_size: 2,
duration_seconds: 1,
warmup_seconds: 0,
steady_elapsed: elapsed,
tags: &[],
};
let payload = stats.to_json(&context);
// The success-latency histogram must never see the 1_500 ms failure.
assert_eq!(payload["latencyMs"]["max"].as_f64().unwrap(), 2.0);
assert!(payload["latencyMs"]["p99"].as_f64().unwrap() <= 2.0);
// The failure-latency histogram must own it instead.
assert_eq!(
payload["failureLatencyMs"]["max"].as_f64().unwrap(),
1_500.0
);
assert_eq!(payload["failedCalls"].as_u64().unwrap(), 1);
assert_eq!(payload["successfulCalls"].as_u64().unwrap(), 2);
assert!(payload["firstFailure"]
.as_str()
.unwrap()
.contains("synthetic failure"));
}
#[test]
fn bench_read_bulk_stats_calls_per_second_handles_zero_duration() {
let stats = super::BenchReadBulkStats::default();
assert_eq!(stats.calls_per_second(std::time::Duration::ZERO), 0.0);
}
}
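To see why the success/failure split in `BenchReadBulkStats` matters, consider a minimal nearest-rank percentile. `nearest_rank` is a hypothetical helper for illustration only; the CLI's actual `percentile_summary` may compute its percentiles differently.

```rust
/// Nearest-rank percentile over an unsorted sample.
/// Returns 0.0 for an empty sample.
pub fn nearest_rank(sample: &[f64], percentile: f64) -> f64 {
    if sample.is_empty() {
        return 0.0;
    }
    let mut sorted = sample.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // rank = ceil(p / 100 * n), clamped into 1..=n so p = 0 maps to the minimum.
    let rank = ((percentile / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}
```

With the sample from the unit test above, the two successes alone give a p99 of 2.0 ms, while folding in the single 1,500 ms failure would move the p99 to 1,500 ms.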
+3
@@ -14,6 +14,7 @@ pub mod mxaccess_gateway {
/// gateway to language clients.
pub mod v1 {
#![allow(clippy::large_enum_variant)]
#![allow(clippy::doc_lazy_continuation)]
tonic::include_proto!("mxaccess_gateway.v1");
}
@@ -25,6 +26,7 @@ pub mod mxaccess_worker {
/// the named-pipe transport between gateway and worker.
pub mod v1 {
#![allow(clippy::large_enum_variant)]
#![allow(clippy::doc_lazy_continuation)]
tonic::include_proto!("mxaccess_worker.v1");
}
@@ -36,6 +38,7 @@ pub mod galaxy_repository {
/// discovery and deploy-event watch RPCs.
pub mod v1 {
#![allow(clippy::large_enum_variant)]
#![allow(clippy::doc_lazy_continuation)]
tonic::include_proto!("galaxy_repository.v1");
}
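The scoping used here can be illustrated in isolation. In the sketch below, `ReadBulkCommand` is a hand-written stand-in for generated code (the real modules pull theirs in through `tonic::include_proto!`); the point is only that the inner `#![allow(...)]` attributes cover the `v1` module and nothing outside it.

```rust
pub mod generated {
    /// Stand-in for a module that would hold generated protobuf types.
    pub mod v1 {
        // Inner attributes must come before any item in the module and
        // apply to this module's contents only.
        #![allow(clippy::large_enum_variant)]
        #![allow(clippy::doc_lazy_continuation)]

        /// Hypothetical message type standing in for generated output.
        #[derive(Debug, Default, PartialEq)]
        pub struct ReadBulkCommand {
            pub timeout_ms: u32,
        }
    }
}
```

Hand-written code elsewhere in the crate stays subject to both lints, so the suppression does not hide regressions outside the generated submodules.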
+9 -3
@@ -33,7 +33,14 @@ static CORRELATION_SEQUENCE: AtomicU64 = AtomicU64::new(0);
/// Build a unique `client_correlation_id` for a request so concurrent or
/// repeated calls of the same command kind can be told apart in gateway logs.
fn next_correlation_id(label: &str) -> String {
///
/// Exposed so consumers that construct raw [`MxCommandRequest`] /
/// [`CloseSessionRequest`] payloads outside the `Session` helpers — notably
/// the `mxgw` test CLI — share the same correlation-id discipline as the
/// library. The returned id is `rust-client-{label}-{N}` where `N` comes
/// from a process-wide atomic sequence.
#[must_use]
pub fn next_correlation_id(label: &str) -> String {
let sequence = CORRELATION_SEQUENCE.fetch_add(1, Ordering::Relaxed);
format!("rust-client-{label}-{sequence}")
}
@@ -761,8 +768,7 @@ fn bulk_write_results(
BulkWriteReplyKind::WriteSecured2,
) => Ok(reply.results),
_ => Err(Error::MalformedReply {
detail: "bulk write reply did not carry the expected BulkWriteReply payload"
.to_owned(),
detail: "bulk write reply did not carry the expected BulkWriteReply payload".to_owned(),
}),
}
}
+394 -30
@@ -20,7 +20,8 @@ use mxgateway_client::generated::mxaccess_gateway::v1::{
CloseSessionReply, CloseSessionRequest, MxCommandKind, MxCommandReply, MxDataType, MxEvent,
MxEventFamily, MxStatusCategory, MxStatusProxy, MxStatusSource, MxValue, OpenSessionReply,
OpenSessionRequest, ProtocolStatus, ProtocolStatusCode, QueryActiveAlarmsRequest, SessionState,
StreamEventsRequest, SubscribeResult, WriteBulkEntry,
StreamEventsRequest, SubscribeResult, Write2BulkEntry, WriteBulkEntry, WriteSecured2BulkEntry,
WriteSecuredBulkEntry,
};
use mxgateway_client::{
ApiKey, ClientOptions, CommandError, Error, GatewayClient, MxStatus, MxValue as ClientMxValue,
@@ -160,7 +161,10 @@ async fn read_bulk_forwards_timeout_and_unpacks_cached_flag() {
let entry = &results[0];
assert!(entry.was_cached);
assert_eq!(entry.value.as_ref().and_then(|v| v.kind.as_ref()), Some(&Kind::Int32Value(99)));
assert_eq!(
entry.value.as_ref().and_then(|v| v.kind.as_ref()),
Some(&Kind::Int32Value(99))
);
assert_eq!(*state.last_read_bulk_timeout_ms.lock().await, Some(750));
}
@@ -393,6 +397,238 @@ async fn connect_with_unreadable_ca_file_reports_invalid_endpoint() {
);
}
#[tokio::test]
async fn register_returns_malformed_reply_when_ok_reply_has_no_payload() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkReplyNoPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session.register("client-name").await.unwrap_err();
assert!(
matches!(&error, Error::MalformedReply { detail } if detail.contains("Register")),
"expected MalformedReply for register, got {error:?}"
);
}
#[tokio::test]
async fn add_item_returns_malformed_reply_when_ok_reply_has_no_payload() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkReplyNoPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session.add_item(12, "Plant.Area.Tag").await.unwrap_err();
assert!(
matches!(&error, Error::MalformedReply { detail } if detail.contains("AddItem")),
"expected MalformedReply for add_item, got {error:?}"
);
}
#[tokio::test]
async fn add_item2_returns_malformed_reply_when_ok_reply_has_no_payload() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkReplyNoPayload);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session
.add_item2(12, "Plant.Area.Tag", "ctx")
.await
.unwrap_err();
assert!(
matches!(&error, Error::MalformedReply { detail } if detail.contains("AddItem2")),
"expected MalformedReply for add_item2, got {error:?}"
);
}
#[tokio::test]
async fn subscribe_bulk_returns_malformed_reply_on_mismatched_payload_arm() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkReplyWrongPayloadForBulk);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session
.subscribe_bulk(12, vec!["Tank01.Level".to_owned()])
.await
.unwrap_err();
assert!(
matches!(&error, Error::MalformedReply { detail } if detail.contains("bulk")),
"expected MalformedReply for subscribe_bulk, got {error:?}"
);
}
#[tokio::test]
async fn write_bulk_returns_malformed_reply_on_mismatched_payload_arm() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkReplyWrongPayloadForBulkWrite);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session
.write_bulk(
12,
vec![WriteBulkEntry {
item_handle: 901,
value: Some(int_value(11)),
user_id: 5,
}],
)
.await
.unwrap_err();
assert!(
matches!(&error, Error::MalformedReply { detail } if detail.contains("bulk write")),
"expected MalformedReply for write_bulk, got {error:?}"
);
}
#[tokio::test]
async fn read_bulk_returns_malformed_reply_on_mismatched_payload_arm() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::OkReplyWrongPayloadForReadBulk);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session
.read_bulk(12, &["Tank01.Level"], 500)
.await
.unwrap_err();
assert!(
matches!(&error, Error::MalformedReply { detail } if detail.contains("ReadBulk")),
"expected MalformedReply for read_bulk, got {error:?}"
);
}
#[tokio::test]
async fn unary_invoke_maps_status_unavailable_to_error_unavailable() {
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await =
Some(InvokeOverride::Unavailable("gateway restarting".to_owned()));
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let error = session.add_item(12, "Plant.Area.Tag").await.unwrap_err();
assert!(
matches!(&error, Error::Unavailable { .. }),
"expected Error::Unavailable for unary unavailable, got {error:?}"
);
}
#[tokio::test]
async fn write2_bulk_round_trips_through_the_fake_gateway() {
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let results = session
.write2_bulk(
12,
vec![Write2BulkEntry {
item_handle: 901,
value: Some(int_value(11)),
timestamp_value: Some(int_value(0)),
user_id: 5,
}],
)
.await
.unwrap();
assert_eq!(results.len(), 2);
assert!(results[0].was_successful);
assert!(!results[1].was_successful);
let last_command = state.last_command_kind.lock().await;
assert_eq!(*last_command, Some(MxCommandKind::Write2Bulk as i32));
}
#[tokio::test]
async fn write_secured_bulk_round_trips_through_the_fake_gateway() {
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let results = session
.write_secured_bulk(
12,
vec![WriteSecuredBulkEntry {
item_handle: 901,
current_user_id: 7,
verifier_user_id: 9,
value: Some(int_value(11)),
}],
)
.await
.unwrap();
assert_eq!(results.len(), 2);
assert!(results[0].was_successful);
let last_command = state.last_command_kind.lock().await;
assert_eq!(*last_command, Some(MxCommandKind::WriteSecuredBulk as i32));
}
#[tokio::test]
async fn write_secured2_bulk_round_trips_through_the_fake_gateway() {
let state = Arc::new(FakeState::default());
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
let results = session
.write_secured2_bulk(
12,
vec![WriteSecured2BulkEntry {
item_handle: 901,
current_user_id: 7,
verifier_user_id: 9,
value: Some(int_value(11)),
timestamp_value: Some(int_value(0)),
}],
)
.await
.unwrap();
assert_eq!(results.len(), 2);
assert!(results[0].was_successful);
let last_command = state.last_command_kind.lock().await;
assert_eq!(*last_command, Some(MxCommandKind::WriteSecured2Bulk as i32));
}
#[derive(Default)]
struct FakeState {
authorization: Mutex<Option<String>>,
@@ -400,6 +636,39 @@ struct FakeState {
last_read_bulk_timeout_ms: Mutex<Option<u32>>,
stream_dropped: Arc<AtomicBool>,
emit_stream_fault: AtomicBool,
/// Test-injected override for the next (and all subsequent) `Invoke`
/// calls. When `Some`, the fake gateway returns the override's response
/// instead of its default per-kind reply. Used by the malformed-reply
/// and unary-Unavailable tests; default `None` preserves existing
/// happy-path test behaviour.
invoke_override: Mutex<Option<InvokeOverride>>,
}
/// Test-injected override for the fake gateway's `Invoke` handler.
///
/// Each variant short-circuits the per-kind dispatch in `FakeGateway::invoke`
/// and reproduces one of the wire shapes the Rust client's error paths must
/// handle. The `OkReply*` variants model an OK envelope whose payload is
/// missing or wrong, the exact condition the new `Error::MalformedReply`
/// paths in `session.rs` are designed to catch.
#[derive(Clone)]
enum InvokeOverride {
/// Return `Status::unavailable(message)` from the unary Invoke RPC, so
/// the client maps it to `Error::Unavailable`.
Unavailable(String),
/// Return an OK `MxCommandReply` whose `payload` field is `None`. Used
/// to exercise `register_server_handle` / `add_item_handle` /
/// `add_item2_handle` falling through to the `MalformedReply` arm.
OkReplyNoPayload,
/// Return an OK reply whose payload arm does not match the bulk-read
/// command, so `read_bulk` falls through to its `MalformedReply` arm.
OkReplyWrongPayloadForReadBulk,
/// Return an OK reply whose payload arm does not match the requested
/// bulk command, so `bulk_results` falls through to `MalformedReply`.
OkReplyWrongPayloadForBulk,
/// Return an OK reply whose payload arm does not match the requested
/// bulk-write command, so `bulk_write_results` returns `MalformedReply`.
OkReplyWrongPayloadForBulkWrite,
}
#[derive(Clone)]
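The override mechanism can be reduced to a small standalone sketch. It assumes a synchronous `std::sync::Mutex` in place of the async mutex the fake gateway locks with `.lock().await`, and the `Override`, `Reply`, and `invoke` names are simplified stand-ins rather than the test file's real types.

```rust
use std::sync::Mutex;

/// Simplified stand-in for `InvokeOverride`.
#[derive(Clone, Debug, PartialEq)]
pub enum Override {
    Unavailable(String),
    OkReplyNoPayload,
}

/// Simplified stand-in for the fake gateway's reply.
#[derive(Debug, PartialEq)]
pub enum Reply {
    Default,
    NoPayload,
}

/// Shared test state; `None` keeps the default happy-path behaviour.
#[derive(Default)]
pub struct FakeState {
    pub invoke_override: Mutex<Option<Override>>,
}

/// Handler sketch: an installed override short-circuits the default reply
/// and stays in effect for every later call until a test replaces it.
pub fn invoke(state: &FakeState) -> Result<Reply, String> {
    // Clone the override out so the lock is released before replying.
    let injected = state.invoke_override.lock().unwrap().clone();
    match injected {
        Some(Override::Unavailable(message)) => Err(message),
        Some(Override::OkReplyNoPayload) => Ok(Reply::NoPayload),
        None => Ok(Reply::Default),
    }
}
```

Keeping the override in an `Option` behind the mutex means existing tests that never set it see no change in behaviour.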
@@ -453,6 +722,58 @@ impl MxAccessGateway for FakeGateway {
.unwrap_or_default();
*self.state.last_command_kind.lock().await = Some(kind);
if let Some(override_) = self.state.invoke_override.lock().await.clone() {
return match override_ {
InvokeOverride::Unavailable(message) => Err(Status::unavailable(message)),
InvokeOverride::OkReplyNoPayload => Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("command ok but payload omitted")),
payload: None,
..MxCommandReply::default()
})),
InvokeOverride::OkReplyWrongPayloadForReadBulk => {
Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("read-bulk wrong payload arm")),
// AddItem payload arm against a ReadBulk request:
// the client's `read_bulk` matcher must reject it.
payload: Some(mx_command_reply::Payload::AddItem(AddItemReply {
item_handle: 0,
})),
..MxCommandReply::default()
}))
}
InvokeOverride::OkReplyWrongPayloadForBulk => Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("bulk wrong payload arm")),
// AddItem payload arm against a SubscribeBulk request.
payload: Some(mx_command_reply::Payload::AddItem(AddItemReply {
item_handle: 0,
})),
..MxCommandReply::default()
})),
InvokeOverride::OkReplyWrongPayloadForBulkWrite => {
Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("bulk-write wrong payload arm")),
// AddItem payload arm against a WriteBulk request.
payload: Some(mx_command_reply::Payload::AddItem(AddItemReply {
item_handle: 0,
})),
..MxCommandReply::default()
}))
}
};
}
if kind == MxCommandKind::Write as i32 {
return Ok(Response::new(mxaccess_failure_reply()));
}
@@ -478,36 +799,41 @@ impl MxAccessGateway for FakeGateway {
}));
}
// All four bulk-write families return `BulkWriteReply` over the
// wire and only differ by which `payload` arm carries it. The
// round-trip tests below want one entry per family, so wire them
// all up to the same canned reply (one success + one failure) and
// pick the matching payload arm by kind.
if kind == MxCommandKind::WriteBulk as i32 {
// Echo one success and one failure so the test can assert the per-entry
// shape and verify the call did not throw on per-entry failure.
return Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
return Ok(Response::new(bulk_write_envelope(
request.session_id,
kind,
protocol_status: Some(ok_status("command ok")),
payload: Some(mx_command_reply::Payload::WriteBulk(BulkWriteReply {
results: vec![
BulkWriteResult {
server_handle: 12,
item_handle: 901,
was_successful: true,
hresult: None,
statuses: vec![],
error_message: String::new(),
},
BulkWriteResult {
server_handle: 12,
item_handle: 902,
was_successful: false,
hresult: None,
statuses: vec![],
error_message: "invalid handle".to_owned(),
},
],
})),
..MxCommandReply::default()
}));
mx_command_reply::Payload::WriteBulk(canned_bulk_write_reply()),
)));
}
if kind == MxCommandKind::Write2Bulk as i32 {
return Ok(Response::new(bulk_write_envelope(
request.session_id,
kind,
mx_command_reply::Payload::Write2Bulk(canned_bulk_write_reply()),
)));
}
if kind == MxCommandKind::WriteSecuredBulk as i32 {
return Ok(Response::new(bulk_write_envelope(
request.session_id,
kind,
mx_command_reply::Payload::WriteSecuredBulk(canned_bulk_write_reply()),
)));
}
if kind == MxCommandKind::WriteSecured2Bulk as i32 {
return Ok(Response::new(bulk_write_envelope(
request.session_id,
kind,
mx_command_reply::Payload::WriteSecured2Bulk(canned_bulk_write_reply()),
)));
}
if kind == MxCommandKind::ReadBulk as i32 {
@@ -699,6 +1025,44 @@ fn mxaccess_failure_reply() -> MxCommandReply {
}
}
fn canned_bulk_write_reply() -> BulkWriteReply {
BulkWriteReply {
results: vec![
BulkWriteResult {
server_handle: 12,
item_handle: 901,
was_successful: true,
hresult: None,
statuses: vec![],
error_message: String::new(),
},
BulkWriteResult {
server_handle: 12,
item_handle: 902,
was_successful: false,
hresult: None,
statuses: vec![],
error_message: "invalid handle".to_owned(),
},
],
}
}
fn bulk_write_envelope(
session_id: String,
kind: i32,
payload: mx_command_reply::Payload,
) -> MxCommandReply {
MxCommandReply {
session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("command ok")),
payload: Some(payload),
..MxCommandReply::default()
}
}
fn event(sequence: u64) -> MxEvent {
MxEvent {
family: MxEventFamily::OnDataChange as i32,