Client.Dotnet-001: MapRpcException typed only Unauthenticated and PermissionDenied; every other gRPC status collapsed to an untyped exception with the status code discarded. Added a nullable StatusCode to MxGatewayException, extracted the duplicated mappers into a shared RpcExceptionMapper that records the code for every status, and documented it. Client.Dotnet-002: the production retry branch (MxGatewayException wrapping RpcException) was never exercised. FakeGatewayTransport gained a MapTransportExceptions mode that runs thrown RpcExceptions through RpcExceptionMapper exactly as the production transport does. Client.Dotnet-003: MxGatewaySession.DisposeAsync disposed _closeLock while a concurrent CloseAsync could be parked in WaitAsync. DisposeAsync now drains in-flight CloseAsync callers before disposing the semaphore; the client's _disposed flag is accessed via Interlocked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
12 KiB
Code Review — Client.Dotnet
| Field | Value |
|---|---|
| Module | clients/dotnet |
| Reviewer | Claude Code |
| Review date | 2026-05-18 |
| Commit reviewed | 3cc53a8 |
| Status | Reviewed |
| Open findings | 5 |
Checklist coverage
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Minor: handle-selector fallback ?? reply.ReturnValue.Int32Value can mask a missing typed reply (Client.Dotnet-005); CLI redactor misses env-var keys (Client.Dotnet-008). |
| 2 | mxaccessgw conventions | Good — consumes the shared contracts project, no forked proto, authorization: Bearer metadata correct, parity preserved via split EnsureProtocolSuccess/EnsureMxAccessSuccess. |
| 3 | Concurrency & thread safety | Issue found: _disposed flags unsynchronized; MxGatewaySession.DisposeAsync can race a concurrent CloseAsync (Client.Dotnet-003). |
| 4 | Error handling & resilience | Issues found: gRPC-to-native mapping collapses non-auth statuses into one untyped exception (Client.Dotnet-001); shared retry/timeout budget (Client.Dotnet-004). |
| 5 | Security | Good — API key never logged by the library, CLI redacts keys, TLS custom-root validation correct. |
| 6 | Performance & resource management | No issues found — channels and streaming calls disposed correctly. |
| 7 | Design-document adherence | No issues found — matches ClientLibrariesDesign.md. |
| 8 | Code organization & conventions | Issue found: undocumented public members (Client.Dotnet-006). |
| 9 | Testing coverage | Issue found: the production retry path is never exercised (Client.Dotnet-002). |
| 10 | Documentation & comments | Issue found: doc misstates the unary timeout retry budget as per-call (Client.Dotnet-004, Client.Dotnet-007). |
Findings
Client.Dotnet-001
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | clients/dotnet/MxGateway.Client/GrpcMxGatewayClientTransport.cs:190-199, clients/dotnet/MxGateway.Client/GrpcGalaxyRepositoryClientTransport.cs:131-140 |
| Status | Resolved |
Description: MapRpcException only produces typed exceptions for Unauthenticated and PermissionDenied. Every other gRPC status — NotFound, InvalidArgument, ResourceExhausted, FailedPrecondition, Unavailable, Internal — collapses into the base MxGatewayException with no surfaced StatusCode. Callers cannot programmatically distinguish a transient outage from a permanent bad-argument error without reflecting into InnerException and downcasting to RpcException.
Recommendation: Carry the gRPC StatusCode on MxGatewayException (e.g. a StatusCode property) and/or add typed subclasses for at least NotFound, InvalidArgument, and Unavailable. Populate it from exception.StatusCode in MapRpcException.
Resolution: (2026-05-18) Confirmed against source: both transports had a duplicated private MapRpcException that only typed two statuses and discarded the gRPC code for the rest. Added a nullable StatusCode property (Grpc.Core.StatusCode?) to MxGatewayException plus constructors that carry it, threaded it through MxGatewayAuthenticationException/MxGatewayAuthorizationException, and extracted the two duplicated mappers into a single shared internal RpcExceptionMapper (RpcExceptionMapper.cs) that populates StatusCode from exception.StatusCode for every status. Callers can now distinguish transient from permanent failures without downcasting InnerException. Documented in clients/dotnet/README.md. Regression test: RpcExceptionMapperTests (8 cases incl. the [Theory] over NotFound/InvalidArgument/ResourceExhausted/FailedPrecondition/Unavailable/Internal).
Client.Dotnet-002
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | clients/dotnet/MxGateway.Client.Tests/FakeGatewayTransport.cs:145-148, clients/dotnet/MxGateway.Client.Tests/MxGatewayClientSessionTests.cs:236-256 |
| Status | Resolved |
Description: The retry predicate MxGatewayClientRetryPolicy.IsTransientGrpcFailure handles two shapes: a raw RpcException and an MxGatewayException { InnerException: RpcException }. In production the transport always maps RpcException → MxGatewayException before it reaches the retry pipeline, so only the wrapped-MxGatewayException branch ever runs in production. But FakeGatewayTransport throws the raw RpcException and never maps it, so every retry test exercises only the raw-RpcException branch — the branch that never occurs in production. The production retry behaviour is effectively untested.
Recommendation: Add a fake/transport mode that maps RpcException to MxGatewayException the way GrpcMxGatewayClientTransport does (or add tests that enqueue a pre-wrapped MxGatewayException), so the actually-used predicate branch is covered.
Resolution: (2026-05-18) Confirmed against source: FakeGatewayTransport threw queued exceptions verbatim, so the existing retry tests only ever hit the raw-RpcException predicate branch. Added a MapTransportExceptions flag to FakeGatewayTransport that, when set, runs thrown RpcExceptions through the same shared RpcExceptionMapper the production gRPC transport uses, producing the wrapped MxGatewayException shape. Added regression test MxGatewayClientSessionTests.InvokeAsync_RetriesSafeDiagnosticCommand_WhenTransportMapsRpcException, which exercises the previously-untested production predicate branch. Verified red: removing the MxGatewayException { InnerException: RpcException } case from IsTransientGrpcFailure fails the new test while the pre-existing raw-RpcException test still passes.
Client.Dotnet-003
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | clients/dotnet/MxGateway.Client/MxGatewaySession.cs:659-663, clients/dotnet/MxGateway.Client/MxGatewayClient.cs:230-240 |
| Status | Resolved |
Description: DisposeAsync calls CloseAsync() (no token) then unconditionally _closeLock.Dispose(). If another thread is concurrently awaiting CloseAsync(token) — legal, since the type exposes public async methods and no single-threaded contract — disposing the SemaphoreSlim while a WaitAsync is pending throws ObjectDisposedException into that caller. The _disposed flags in both clients are also plain unsynchronised bool reads/writes; ThrowIfDisposed racing DisposeAsync can observe a stale value.
Recommendation: Either document MxGatewaySession/MxGatewayClient as not thread-safe for concurrent dispose, or guard _disposed with Interlocked/volatile and avoid disposing _closeLock until all in-flight CloseAsync calls complete.
Resolution: (2026-05-18) Confirmed against source: MxGatewaySession.DisposeAsync disposed _closeLock unconditionally, racing concurrent CloseAsync callers; MxGatewayClient._disposed was a plain bool. Fixed MxGatewaySession by tracking in-flight CloseAsync callers with an _activeCloseCount guarded by a dedicated _disposeGate lock and a _closeLockDisposed flag: CloseAsync registers under the gate (and throws ObjectDisposedException if disposal already won) before awaiting _closeLock.WaitAsync, and DisposeAsync drains _activeCloseCount to zero before disposing the semaphore, so the close lock provably outlives every pending WaitAsync. Fixed MxGatewayClient by changing _disposed to an int accessed via Interlocked.Exchange/Volatile.Read. Regression test MxGatewayClientSessionTests.DisposeAsync_DoesNotRaceConcurrentCloseAsync runs 100 iterations with one close holding the lock and one parked behind it while DisposeAsync runs concurrently; verified red against the original DisposeAsync (fails with ObjectDisposedException), green after the fix.
Client.Dotnet-004
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | clients/dotnet/MxGateway.Client/MxGatewayClient.cs:283-294, clients/dotnet/MxGateway.Client/GalaxyRepositoryClient.cs:392-403 |
| Status | Open |
Description: ExecuteSafeUnaryAsync wraps the whole Polly retry pipeline in a single linked CTS cancelled after Options.DefaultCallTimeout, while CreateCallOptions also stamps each individual call with a DefaultCallTimeout gRPC deadline. The retry pipeline therefore shares one DefaultCallTimeout budget across the initial attempt plus all retries plus backoff delays. The README/XML docs describe DefaultCallTimeout as a per-call timeout, which misrepresents this. DeadlineExceeded is also classified as transient, so an attempt that exhausts the shared budget is retried only to immediately fail again.
Recommendation: Decide whether DefaultCallTimeout is per-attempt or per-operation and make code and docs consistent — e.g. a separate per-attempt deadline and a distinct overall-operation timeout. Reconsider retrying on DeadlineExceeded when the deadline was client-imposed.
Resolution: (open)
Client.Dotnet-005
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | clients/dotnet/MxGateway.Client/MxGatewaySession.cs:82,124,175 |
| Status | Open |
Description: RegisterAsync/AddItemAsync/AddItem2Async return reply.<Typed>?.ServerHandle ?? reply.ReturnValue.Int32Value. After EnsureMxAccessSuccess() passes, a missing typed payload silently falls back to ReturnValue.Int32Value, which for a reply carrying no return value is 0. A caller then uses 0 as a ServerHandle/ItemHandle, producing a confusing downstream invalid-handle failure rather than a clear "gateway reply missing payload" error.
Recommendation: If the typed sub-message is the contract for these commands, treat its absence on an otherwise-successful reply as an error (throw a descriptive MxGatewayException) rather than falling through to ReturnValue.Int32Value.
Resolution: (open)
Client.Dotnet-006
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | clients/dotnet/MxGateway.Client/MxGatewayClientOptions.cs:50, clients/dotnet/MxGateway.Client/MxGatewayClientContractInfo.cs:10-14 |
| Status | Open |
Description: MxGatewayClientOptions.MaxGrpcMessageBytes and the two consts in MxGatewayClientContractInfo are public members with no XML doc comments, inconsistent with every other public member in the assembly and with the repo's documented C# style emphasis on a documented public surface.
Recommendation: Add <summary> doc comments to MaxGrpcMessageBytes, GatewayProtocolVersion, and WorkerProtocolVersion.
Resolution: (open)
Client.Dotnet-007
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | clients/dotnet/MxGateway.Client/MxGatewayClient.cs:185-192 |
| Status | Open |
Description: The AcknowledgeAlarmAsync XML comment states the gateway authenticates against an invoke:alarm-ack scope, but CLAUDE.md documents the scope set without any invoke:alarm-ack sub-scope. The comment may describe an intended finer-grained scope that does not exist, misleading integrators about what API key they need.
Recommendation: Reconcile the comment with the actual server-side scope check, or update the scope documentation if sub-scopes were genuinely added; keep client doc and gateway auth model in sync.
Resolution: (open)
Client.Dotnet-008
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | clients/dotnet/MxGateway.Client.Cli/MxGatewayCliSecretRedactor.cs:9-17 |
| Status | Open |
Description: The CLI redactor only removes the API key string when it was supplied via --api-key; RunCoreAsync passes arguments.GetOptional("api-key") to Redact. When the key comes from an environment variable (--api-key-env, the documented default path), apiKey is null and no redaction occurs. If a gRPC/transport error message ever echoes the bearer token, it would be printed unredacted.
Recommendation: Resolve the effective API key (same logic as ResolveApiKey) before redacting, so the env-var-sourced key is also stripped from error output.
Resolution: (open)