mxaccessgw/clients/rust/RustClientDesign.md
Joseph Doherty a0203503a7 Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules
Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
  GatewayGrpcScopeResolver so non-admin keys can use them; document
  the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
  CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
  in generated tonic code by reformatting the ReadBulkCommand proto
  comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
  make DisposeAsync race-safe against in-flight CloseAsync (-016);
  add constraint-enforcement test coverage for the bulk-plan path
  (-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
  can distinguish graceful shutdown from a real STA-affinity
  violation (-016); have the watchdog skip StaHung while
  CurrentCommandCorrelationId is non-empty so a legitimate slow
  ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
  11 GatewaySession bulk methods (-013); replace the real TCP probe
  in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
  (-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
  test and assert OnWriteComplete (-012); add live tests for
  Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
  abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
  CreateForTesting factory (-016); cover WorkerCancel and
  unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
  beforeStart() (-014); return a CancellingCompletableFuture that
  actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
  the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
  histograms with failed-call durations (-015); add coverage for
  the five MalformedReply paths, the bulk-write helpers, the
  Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
  command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
  WorkerAlarmRpcDispatcher missing-session handling; drop the
  duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
  XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
  subscriptionExpression / ExecutingCommand arms; preserve
  factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
  three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
  FakeWorkerProcess.WaitForExitAsync against a real TaskCompletion
  source; switch the heartbeat-expires test to ManualTimeProvider;
  add InvariantCulture to the remaining DateTimeOffset.Parse sites;
  document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
  IDisposable, class-level [Trait], single-source ZB default
  connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
  so absent env vars SKIP not pass; PascalCase rename of probe
  [Fact]s; deterministic deadline test; new frame-protocol error
  tests; ComputeTransitions diff-coverage; relocate dev-rig probes
  to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
  Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
  TreatWarningsAsErrors / analysers apply; document
  DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
  bulk-read handles in CLI; surface AcknowledgeAlarm transport
  faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
  runWriteBulkVariant; document the six new subcommands in
  writeUsage; drain galaxy-watch events on limit; switch io.EOF
  comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
  option; regex-based credential redaction; Long.toUnsignedString
  for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
  _percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
  _api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
  stop hard-coding correlation IDs; resync RustClientDesign.md
  with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 09:46:47 -04:00


Rust Client Detailed Design

Purpose

Provide an async Rust client crate for MXAccess Gateway, plus a test CLI and unit tests. The Rust client should use tonic and tokio.

Follow the Rust Style Guide for handwritten code and the Protobuf Style Guide for generated contract inputs.

Crate Layout

Actual layout — the mxgateway-client library crate is the workspace root, with the mxgw test CLI as a workspace member:

clients/rust/                 # `mxgateway-client` library crate (workspace root)
  Cargo.toml
  build.rs
  src/
    lib.rs
    client.rs
    session.rs
    galaxy.rs
    options.rs
    auth.rs
    value.rs
    version.rs
    error.rs
    generated.rs
  crates/
    mxgw-cli/                 # `mxgw` test CLI (workspace member)
      Cargo.toml
      src/main.rs
  tests/
    client_behavior.rs
    proto_fixtures.rs

Dependencies:

  • tonic
  • prost
  • prost-types
  • tokio
  • tokio-stream
  • thiserror
  • clap
  • serde
  • serde_json

Library API

Suggested API:

pub struct GatewayClient { /* tonic channel + generated client */ }

pub struct ClientOptions {
    pub endpoint: String,
    pub api_key: String,
    pub plaintext: bool,
    pub ca_file: Option<PathBuf>,
    pub server_name_override: Option<String>,
    pub connect_timeout: Duration,
    pub call_timeout: Duration,
}

impl GatewayClient {
    pub async fn connect(options: ClientOptions) -> Result<Self, Error>;
    pub async fn open_session(&self, options: OpenSessionOptions) -> Result<Session, Error>;
    pub async fn invoke(&self, request: MxCommandRequest) -> Result<MxCommandReply, Error>;
}

Session:

pub struct Session {
    pub id: String,
}

impl Session {
    pub async fn register(&self, client_name: &str) -> Result<i32, Error>;
    pub async fn add_item(&self, server_handle: i32, item: &str) -> Result<i32, Error>;
    pub async fn add_item2(&self, server_handle: i32, item: &str, context: &str) -> Result<i32, Error>;
    pub async fn advise(&self, server_handle: i32, item_handle: i32) -> Result<(), Error>;
    pub async fn add_item_bulk(&self, server_handle: i32, tag_addresses: Vec<String>) -> Result<Vec<SubscribeResult>, Error>;
    pub async fn advise_item_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
    pub async fn remove_item_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
    pub async fn un_advise_item_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
    pub async fn subscribe_bulk(&self, server_handle: i32, tag_addresses: Vec<String>) -> Result<Vec<SubscribeResult>, Error>;
    pub async fn unsubscribe_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
    pub async fn write(&self, server_handle: i32, item_handle: i32, value: MxValue, user_id: i32) -> Result<(), Error>;
    pub async fn write_bulk(&self, server_handle: i32, entries: Vec<WriteBulkEntry>, user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
    pub async fn write2_bulk(&self, server_handle: i32, entries: Vec<Write2BulkEntry>, timestamp: prost_types::Timestamp, user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
    pub async fn write_secured_bulk(&self, server_handle: i32, entries: Vec<WriteSecuredBulkEntry>, current_user_id: i32, verifier_user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
    pub async fn write_secured2_bulk(&self, server_handle: i32, entries: Vec<WriteSecured2BulkEntry>, timestamp: prost_types::Timestamp, current_user_id: i32, verifier_user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
    pub async fn read_bulk(&self, server_handle: i32, tags: &[String], timeout_ms: u32) -> Result<Vec<ReadBulkResult>, Error>;
    pub async fn events(&self) -> Result<impl Stream<Item = Result<MxEvent, Error>>, Error>;
    pub async fn close(&self) -> Result<(), Error>;
}

The four bulk-write helpers (write_bulk, write2_bulk, write_secured_bulk, write_secured2_bulk) and read_bulk mirror the worker's bulk command shapes in mxaccess_gateway.proto and use the same correlation-id discipline as the unary helpers. session::next_correlation_id is pub so that consumers constructing raw MxCommandRequest/CloseSessionRequest payloads outside the Session helpers (notably the mxgw test CLI's ping and close-session subcommands) share the same id generation.
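The shared id generation described above could be sketched as follows. This is a minimal illustration, assuming a process-wide atomic counter and a `rust-client-` prefix; the counter strategy, the prefix, and the `String` return type are assumptions, not the crate's confirmed implementation.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide counter (assumption: the real crate may derive uniqueness differently).
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

/// Returns a correlation id that is unique within this process.
/// The "rust-client-" prefix is hypothetical and used only for illustration.
pub fn next_correlation_id() -> String {
    // Relaxed ordering suffices: only the uniqueness of the counter value matters.
    let n = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    format!("rust-client-{}", n)
}
```

Because both the Session helpers and the CLI would call the same function, ids stay unique across raw and helper-built requests within one process.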

Authentication

Use a tonic interceptor or request extension layer to add:

authorization: Bearer <api key>

Use SecretString or equivalent if a dependency is acceptable. Always redact API keys in Debug output.
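A minimal std-only sketch of the redaction requirement, for the case where a secret-handling dependency is not acceptable. `ApiKey` and `OptionsSketch` are hypothetical names introduced here; the real ClientOptions stores the key as a String.

```rust
use std::fmt;

/// Hypothetical std-only stand-in for a SecretString-style wrapper.
pub struct ApiKey(String);

impl ApiKey {
    pub fn new(value: impl Into<String>) -> Self {
        ApiKey(value.into())
    }

    /// Builds the value of the `authorization` metadata entry.
    pub fn bearer_header(&self) -> String {
        format!("Bearer {}", self.0)
    }
}

// Manual Debug impl: the key itself is never written to the formatter.
impl fmt::Debug for ApiKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("ApiKey(<redacted>)")
    }
}

/// Trimmed-down options struct (hypothetical) showing that a derived Debug
/// on the containing type inherits the redaction.
#[derive(Debug)]
pub struct OptionsSketch {
    pub endpoint: String,
    pub api_key: ApiKey,
}
```

Wrapping the key in its own type means any struct that derives Debug picks up the redaction automatically, so no call site needs to remember to mask the field.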

TLS

Support:

  • plaintext channel for local development,
  • native or rustls TLS depending on project preference,
  • custom CA file,
  • domain override.

Streaming

Expose event streams as a Stream<Item = Result<MxEvent, Error>>. Dropping the stream should cancel the underlying gRPC stream.

Do not buffer unboundedly in the client. If a helper channel is used, make it bounded.
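The bounded-buffer rule can be illustrated with the standard library's synchronous channel. This is a std-only analogue, not the client's code: an async implementation would more likely use a bounded tokio channel (tokio::sync::mpsc::channel with an explicit capacity). `Event`, `Offer`, `event_channel`, and `offer` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical stand-in for a decoded gateway event.
#[derive(Debug, PartialEq)]
pub struct Event(pub u64);

/// Outcome of offering one event to the bounded buffer.
#[derive(Debug, PartialEq)]
pub enum Offer {
    Queued,
    BufferFull,
    ConsumerGone,
}

/// Creates a helper channel that holds at most `capacity` events.
pub fn event_channel(capacity: usize) -> (SyncSender<Event>, Receiver<Event>) {
    sync_channel(capacity)
}

/// Tries to enqueue an event without blocking and reports what happened.
pub fn offer(tx: &SyncSender<Event>, event: Event) -> Offer {
    match tx.try_send(event) {
        Ok(()) => Offer::Queued,
        // The buffer is at capacity: the producer must wait or apply a policy.
        Err(TrySendError::Full(_)) => Offer::BufferFull,
        // The receiver was dropped: the producer should stop reading the gRPC stream.
        Err(TrySendError::Disconnected(_)) => Offer::ConsumerGone,
    }
}
```

The `ConsumerGone` branch is the hook for the drop-cancels-stream rule: once the consumer side is dropped, the forwarding task sees the disconnect and can end the underlying call.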

Error Handling

Use thiserror:

pub enum Error {
    InvalidEndpoint { endpoint: String, detail: String },
    InvalidArgument { name: String, detail: String },
    Transport(tonic::transport::Error),
    Authentication { message: String, status: Box<tonic::Status> },
    Authorization { message: String, status: Box<tonic::Status> },
    Timeout { message: String, status: Box<tonic::Status> },
    Cancelled { message: String, status: Box<tonic::Status> },
    Unavailable { message: String, status: Box<tonic::Status> },
    Status(Box<tonic::Status>),
    Command(Box<CommandError>),
    ProtocolStatus { operation: &'static str, code: ProtocolStatusCode, message: String },
    MalformedReply { detail: String },
}

Unavailable classifies the transient Code::Unavailable / Code::ResourceExhausted statuses so callers can decide whether to retry without unwrapping the raw status. MalformedReply surfaces OK replies whose payload does not carry the data the command contract requires (for example, an AddItem reply missing the item handle, or a WriteBulk reply carrying the wrong payload arm). InvalidEndpoint is returned when the endpoint URL fails to parse or its TLS material cannot be loaded.

Preserve raw command replies in CommandError where applicable.
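The status classification could look roughly like the sketch below. It mirrors a subset of gRPC status codes in a local `Code` enum so the example has no dependencies. The Unavailable / ResourceExhausted grouping comes from the text above; the other mappings (Unauthenticated, PermissionDenied, DeadlineExceeded, Cancelled) are assumptions inferred from the variant names. `ErrorKind`, `classify`, and `is_retryable` are hypothetical helpers.

```rust
/// Local mirror of a subset of non-OK gRPC status codes (sketch only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code {
    Cancelled,
    DeadlineExceeded,
    NotFound,
    PermissionDenied,
    ResourceExhausted,
    Internal,
    Unavailable,
    Unauthenticated,
}

/// Which Error variant a status would be wrapped in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorKind {
    Authentication,
    Authorization,
    Timeout,
    Cancelled,
    Unavailable,
    Status,
}

/// Maps a status code to the error variant that carries it.
pub fn classify(code: Code) -> ErrorKind {
    match code {
        Code::Unauthenticated => ErrorKind::Authentication,
        Code::PermissionDenied => ErrorKind::Authorization,
        Code::DeadlineExceeded => ErrorKind::Timeout,
        Code::Cancelled => ErrorKind::Cancelled,
        // Both transient conditions share one variant so callers can retry uniformly.
        Code::Unavailable | Code::ResourceExhausted => ErrorKind::Unavailable,
        // Everything else falls through to the generic Status variant.
        Code::NotFound | Code::Internal => ErrorKind::Status,
    }
}

/// Retry hint: only the transient class is treated as retryable in this sketch.
pub fn is_retryable(kind: ErrorKind) -> bool {
    matches!(kind, ErrorKind::Unavailable)
}
```

Keeping the match exhaustive, with no wildcard arm, means a newly mirrored code forces an explicit classification decision at compile time.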

Test CLI

Binary: mxgw.

Use clap derive.

Commands (see clients/rust/README.md for full argument lists):

mxgw version
mxgw ping
mxgw open-session
mxgw close-session --session-id <id>
mxgw register --session-id <id> --client-name <name>
mxgw add-item --session-id <id> --server-handle <h> --item <tag>
mxgw advise --session-id <id> --server-handle <h> --item-handle <h>
mxgw subscribe-bulk --session-id <id> --server-handle <h> --items <a,b,c>
mxgw unsubscribe-bulk --session-id <id> --server-handle <h> --item-handles <1,2,3>
mxgw read-bulk --session-id <id> --server-handle <h> --items <a,b,c> --timeout-ms 1500
mxgw write --session-id <id> --server-handle 1 --item-handle 1 --value-type int32 --value 123
mxgw write2 --session-id <id> --server-handle 1 --item-handle 1 --value-type int32 --value 123 --timestamp <rfc3339>
mxgw write-bulk --session-id <id> --server-handle <h> --item-handles <1,2> --value-type int32 --values <1,2>
mxgw write2-bulk --session-id <id> --server-handle <h> --item-handles <1,2> --value-type int32 --values <1,2> --timestamp <rfc3339>
mxgw write-secured-bulk --session-id <id> --server-handle <h> --item-handles <1,2> --value-type int32 --values <1,2>
mxgw write-secured2-bulk --session-id <id> --server-handle <h> --item-handles <1,2> --value-type int32 --values <1,2> --timestamp <rfc3339>
mxgw stream-events --session-id <id> --json
mxgw bench-read-bulk --duration-seconds 30 --bulk-size 6 --json
mxgw smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt
mxgw galaxy test-connection
mxgw galaxy last-deploy-time
mxgw galaxy discover-hierarchy
mxgw galaxy watch [--last-seen-deploy-time <rfc3339>] [--max-events N]

JSON output should use serde_json.
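Several subcommands above take comma-separated lists such as --item-handles <1,2,3>. A minimal std-only sketch of that parsing follows; `parse_item_handles` is a hypothetical helper, and with clap derive the same effect is usually available through the value_delimiter argument attribute instead of hand-written code.

```rust
/// Parses a comma-separated list of item handles, e.g. "1,2,3".
/// Surrounding whitespace is trimmed; any non-integer entry rejects the whole list.
pub fn parse_item_handles(raw: &str) -> Result<Vec<i32>, String> {
    raw.split(',')
        .map(|part| {
            let part = part.trim();
            part.parse::<i32>()
                .map_err(|e| format!("invalid item handle {:?}: {}", part, e))
        })
        // Collecting into Result stops at the first error.
        .collect()
}
```

Rejecting the entire list on the first bad entry keeps a bulk command from running against a partially parsed set of handles.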

Unit Tests

Use a fake tonic server started on a local ephemeral port, or abstract the generated client behind a trait for unit tests.

Required tests:

  • generated client compiles from proto,
  • auth metadata injection,
  • TLS/plaintext endpoint construction,
  • value conversion,
  • command request construction,
  • error mapping from tonic::Status,
  • event stream order,
  • stream cancellation,
  • CLI parsing,
  • JSON redaction.

Integration Tests

Skip unless:

MXGATEWAY_INTEGRATION=1

Use tokio::test. Run a bounded smoke flow and ensure CloseSession is attempted explicitly at the end of each test. Document any Drop fallback, but do not rely on Drop for async close.
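The environment gate could be sketched as below. The variable name and the required value of 1 come from this section; `is_enabled` and `integration_enabled` are hypothetical helper names, and the gate is split so that the decision logic can be tested without mutating the process environment.

```rust
use std::env;

/// Environment variable that opts in to integration tests.
pub const INTEGRATION_ENV: &str = "MXGATEWAY_INTEGRATION";

/// Pure decision logic: only the exact value "1" enables the tests.
pub fn is_enabled(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Reads the real environment and applies the rule above.
pub fn integration_enabled() -> bool {
    is_enabled(env::var(INTEGRATION_ENV).ok().as_deref())
}
```

An integration test would check integration_enabled() first and return early when it is false. Note that the Rust test harness reports such an early return as a pass rather than a skip, so marking these tests #[ignore] and running them on request is an alternative worth considering.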