File and fix Server-030 and Client.Dotnet-017 from e2e surfacing

Both findings surfaced when running the cross-language e2e matrix
(scripts/run-client-e2e-tests.ps1) against the redeployed gateway at
commit 84d36b7. Filed in code-reviews/Server/findings.md and
code-reviews/Client.Dotnet/findings.md and fixed in the same change.

Server-030 (Medium / Error handling): GatewaySession.GetReadyWorkerClient
gated on `_state == Ready && _workerClient.State == Ready` but only
formatted `_state` into the SessionManagerException message. Under load
the gateway-driven `_state` and the worker-driven `WorkerClient.State`
can diverge, producing a self-contradictory diagnostic ("Session ... is
not ready. Current state is Ready."). The Java e2e client hit this on
the 56th item after 55 successful add-items. Rewrote the message to
include both states ("Session state is X; worker state is Y"), added
an XML doc explaining the two-state contract and that this branch is
the fail-fast for a divergence race, and added regression test
SessionManagerTests.InvokeAsync_WhenWorkerNotReadyButSessionReady_DiagnosticIncludesBothStates
that pins both states appear in the message. The deeper race (should
the gateway briefly wait for worker-Ready before failing?) remains
open as a follow-up.

Client.Dotnet-017 (Low / Error handling): stream-events CLI threw
OperationCanceledException as an unhandled exception when the user's
--timeout expired before --max-events was reached. Exit code
-532462766, no aggregate JSON. The other client CLIs (Go, Rust, Python,
Java) exit 0 in this case. Wrapped the `await foreach` in
`catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)`
so the supplied token's cancellation (--timeout, Ctrl+C, or parent
CTS) becomes graceful completion; the aggregate `{ "events": [...] }`
JSON still runs after the catch. Added regression test
RunAsync_StreamEvents_WhenTimeoutFiresAfterEvents_EmitsCollectedEventsAndExitsZero
backed by a new FakeCliClient.StreamHangAfterEvents hook that yields
the configured events then parks on the cancellation token.

Side cleanup: the GatewayApplicationTests test added under Server-020
was asserting an invariant (`/dashboard/dashboard/X` doesn't exist)
that I broke by reverting Server-020 in 84d36b7. The doubled endpoint
shapes do exist now (MapGroup("/dashboard") prefixing an already
"/dashboard/X" @page directive) but they're harmless — no client
requests `/dashboard/dashboard/X`. Replaced the test with a positive
assertion (`/dashboard/X` routes ARE registered) and rewrote the XML
doc to record the actual contract.

Verified: dotnet test src/MxGateway.Tests passes 480/480, dotnet test
clients/dotnet/MxGateway.Client.Tests passes 77/77, gateway redeployed
at this commit and GET http://localhost:5130/dashboard returns 200.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-20 13:07:39 -04:00
parent 84d36b7638
commit b794c46bc7
8 changed files with 219 additions and 45 deletions
+4 -2
View File
@@ -10,14 +10,14 @@ Each module's `findings.md` is the source of truth; this file is generated from
| Module | Reviewer | Date | Commit | Status | Open | Total |
|---|---|---|---|---|---|---|
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 16 |
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 17 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 21 |
| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 26 |
| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 21 |
| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 20 |
| [Contracts](Contracts/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 15 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 21 |
| [Server](Server/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 29 |
| [Server](Server/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 30 |
| [Tests](Tests/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 24 |
| [Worker](Worker/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 25 |
| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-05-20 | `a020350` | Reviewed | 0 | 30 |
@@ -93,6 +93,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-015 | Medium | Resolved | Concurrency & thread safety | `src/MxGateway.Server/Sessions/GatewaySession.cs:8-15,266-308,720-775` |
| Server-016 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Server/Sessions/GatewaySession.cs:790-797`, `src/MxGateway.Server/Sessions/SessionManager.cs:237-258` |
| Server-021 | Medium | Resolved | Testing coverage | `src/MxGateway.Server/Grpc/MxAccessGatewayService.cs:266-664`, `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs` |
| Server-030 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Server/Sessions/GatewaySession.cs:952-980` |
| Tests-003 | Medium | Resolved | Performance & resource management | `src/MxGateway.Tests/Security/Authentication/SqliteAuthStoreTests.cs:170-176`, `src/MxGateway.Tests/Security/Authentication/ApiKeyAdminCliRunnerTests.cs:252-258` |
| Tests-004 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs` |
| Tests-005 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:239-261`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs` |
@@ -129,6 +130,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Dotnet-014 | Low | Resolved | Testing coverage | `clients/dotnet/MxGateway.Client.Tests/MxGatewayClientAlarmsTests.cs:76-98`, `clients/dotnet/MxGateway.Client.Tests/FakeGatewayTransport.cs:212-231` |
| Client.Dotnet-015 | Low | Resolved | Correctness & logic bugs | `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:221-236`, `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:596-1065` |
| Client.Dotnet-016 | Low | Resolved | Concurrency & thread safety | `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:922-976` |
| Client.Dotnet-017 | Low | Resolved | Error handling & resilience | `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:1190-1262` |
| Client.Go-004 | Low | Resolved | mxaccessgw conventions | `clients/go/mxgateway/alarms_test.go:153-154`, `clients/go/mxgateway/galaxy_test.go:58-59` |
| Client.Go-005 | Low | Resolved | Design-document adherence | `clients/go/mxgateway/client.go:64,68`, `clients/go/mxgateway/galaxy.go:83,87` |
| Client.Go-006 | Low | Resolved | Error handling & resilience | `clients/go/mxgateway/errors.go:9-130` |