Resolve Worker.Tests-003..007 code-review findings

Worker.Tests-003: removed the wall-clock `Elapsed < 2s` assertion from
InvokeAsync_WakesIdlePumpForQueuedCommand; the awaited completion against a
30s idle period already proves the wake event drove dispatch.

Worker.Tests-004: MxAccessStaSession.Dispose now joins the alarm poll task
after cancelling the CTS (consistent with ShutdownGracefullyAsync), and
Dispose_StopsAlarmPollLoop asserts deterministically instead of via Task.Delay.

Worker.Tests-005: undisposed MemoryStream instances across the frame-protocol
and pipe-session tests are now `using` declarations.

Worker.Tests-006: Dispose_StopsAlarmPollLoop now constructs MxAccessStaSession
with `using` so a failed assertion cannot leak the STA poll loop.

Worker.Tests-007: docs/WorkerFrameProtocol.md verification section corrected
to target MxGateway.Worker.Tests / MxGateway.Worker with -p:Platform=x86.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-18 21:45:01 -04:00
parent 5ade3f4f48
commit 18ce2922e2
7 changed files with 78 additions and 46 deletions
+11 -11
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-18 |
| Commit reviewed | `6c64030` |
| Status | Reviewed |
| Open findings | 13 |
| Open findings | 8 |
## Checklist coverage
@@ -63,13 +63,13 @@
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `src/MxGateway.Worker.Tests/Sta/StaRuntimeTests.cs:46-48` |
| Status | Open |
| Status | Resolved |
**Description:** `InvokeAsync_WakesIdlePumpForQueuedCommand` asserts `stopwatch.Elapsed < TimeSpan.FromSeconds(2)` — a wall-clock assertion that on a loaded CI agent can exceed 2s, producing a false failure. The test also does not actually prove the wake event (vs the 50 ms idle pump) caused the dispatch.
**Recommendation:** Remove the wall-clock assertion (the awaited result already proves the command ran), or raise the budget substantially with a comment that it is a coarse smoke check.
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — Removed the `Stopwatch` and the `stopwatch.Elapsed < TimeSpan.FromSeconds(2)` wall-clock assertion from `InvokeAsync_WakesIdlePumpForQueuedCommand`. The test already constructs the `StaRuntime` with a 30-second idle pump period, so the awaited `InvokeAsync` completing at all proves the command wake event — not the idle pump tick — drove the dispatch; no timing budget is needed. The XML-doc comment now states this explicitly. The now-unused `using System.Diagnostics;` was removed (`TreatWarningsAsErrors`).
### Worker.Tests-004
@@ -78,13 +78,13 @@
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs:281-329` |
| Status | Open |
| Status | Resolved |
**Description:** `StartAsync_WithAlarmCommandHandlerFactory_PollOnceCalledViaSta` and `Dispose_StopsAlarmPollLoop` use poll-until loops, and `Dispose_StopsAlarmPollLoop` additionally does `await Task.Delay(1000)` then asserts `PollCount` is unchanged. The 1s "no further polls" window is a timing race: a poll scheduled just before disposal could increment the counter afterward, and a slow agent could simply not run a poll in the window even without correct stop logic.
**Recommendation:** Make the poll loop deterministically observable — expose a "poll loop stopped" signal or have `Dispose` join the poll task — then assert on that rather than on elapsed-time silence.
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — `MxAccessStaSession.Dispose` now joins the alarm poll task (`pollTaskToJoin.Wait(TimeSpan.FromSeconds(5))`) after cancelling the poll CTS, instead of setting `alarmPollTask = null` and discarding it. Once `Dispose` returns, the poll loop has provably exited and no `PollOnce` call can still be in flight. `Dispose_StopsAlarmPollLoop` was rewritten to drop the `await Task.Delay(1000)` "no further polls" window: it now captures `PollCount` immediately after `Dispose()` returns and re-asserts equality after a bare `await Task.Yield()` — a deterministic frozen-count check rather than an elapsed-time race. The success-direction poll-until loop in `PollOnceCalledViaSta` was left as-is: waiting for an event to *occur* is sound; only waiting for an event to *not* occur is the race, and that pattern is now eliminated. Note: `ShutdownGracefullyAsync` already joined the poll task, so this change makes `Dispose` consistent with the graceful path.
### Worker.Tests-005
@@ -93,13 +93,13 @@
| Severity | Medium |
| Category | Performance & resource management |
| Location | `src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs:20-31,103-105`, `src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs:28-31` |
| Status | Open |
| Status | Resolved |
**Description:** `MemoryStream` instances are created and never disposed across the frame-protocol and pipe-session tests (`MemoryStream stream = new();` with no `using`). Disposal is cheap so impact is low, but it is inconsistent with the rest of the suite (which carefully `using`s `CancellationTokenSource`, `StaRuntime`, `PipePair`). `WorkerFrameWriter`/`WorkerFrameReader` are also constructed without disposal.
**Recommendation:** Wrap `MemoryStream` (and reader/writer if they are `IDisposable`) in `using` declarations for consistency.
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — All six `MemoryStream` test-body declarations in `WorkerFrameProtocolTests.cs` and the five `inbound`/`outbound` `MemoryStream` declarations in the `WorkerPipeSessionTests.cs` handshake tests were converted to `using` declarations, matching how the rest of the suite handles `CancellationTokenSource`/`StaRuntime`/`PipePair`. Re-triage of the parenthetical: `WorkerFrameWriter` and `WorkerFrameReader` are **not** `IDisposable` (`sealed class` with no `IDisposable` and no `Dispose` member — verified in `src/MxGateway.Worker/Ipc/`), so the finding's "reader/writer if they are `IDisposable`" suggestion does not apply and no change was made there. The shared `MemoryStream` instances inside the `WorkerPipeSessionTests` harness/helper classes (`ReadWrittenFrames` parameter, the `PipePair`/harness fields) are out of the cited line scope and were left untouched.
### Worker.Tests-006
@@ -108,13 +108,13 @@
| Severity | Medium |
| Category | Performance & resource management |
| Location | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs:282,305,315,323` |
| Status | Open |
| Status | Resolved |
**Description:** `Dispose_StopsAlarmPollLoop` constructs `MxAccessStaSession session` without `using` (unlike every sibling test) and relies on an explicit `session.Dispose()`. If an assertion between `StartAsync` and `Dispose()` throws, the session — its STA thread and poll loop — leaks for the rest of the run. The `StaRuntime` is `using`d so the thread is eventually reclaimed, but the alarm poll loop and handler are not.
**Recommendation:** Use `using MxAccessStaSession session = ...` and drop the manual `Dispose()`, or wrap the body in try/finally.
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — `Dispose_StopsAlarmPollLoop` now declares its `MxAccessStaSession` with a `using` declaration. The manual `session.Dispose()` is kept because the test's purpose is to observe poll behaviour across disposal — but `MxAccessStaSession.Dispose` is idempotent (guarded by the `disposed` field), so the explicit mid-test call and the `using`-scope call do not conflict. An assertion thrown anywhere in the body now still tears the session (STA poll loop + alarm handler) down. The cited line numbers in the finding were imprecise — they straddle `PollOnceCalledViaSta` and `Dispose_StopsAlarmPollLoop` — but the described root cause (one `MxAccessStaSession` constructed without `using`) was singular and is the one in `Dispose_StopsAlarmPollLoop`; the sibling tests `PollOnceCalledViaSta` and `RunAlarmPollLoop_WhenPollOnceThrows_RecordsFaultOnEventQueue` already used `using` and needed no change.
### Worker.Tests-007
@@ -123,13 +123,13 @@
| Severity | Medium |
| Category | Design-document adherence |
| Location | `docs/WorkerFrameProtocol.md:38-49` |
| Status | Open |
| Status | Resolved |
**Description:** `docs/WorkerFrameProtocol.md` instructs running `dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter WorkerFrameProtocolTests` and states the frame protocol "is part of `MxGateway.Server`". The frame protocol actually lives in `MxGateway.Worker.Ipc` and is tested by `src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs`. The doc's verification command points at the wrong project and build, so anyone following it after changing the worker frame protocol will not run the relevant tests.
**Recommendation:** Update `docs/WorkerFrameProtocol.md` to reference `src/MxGateway.Worker.Tests` and the x86 worker build (`-p:Platform=x86`).
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — Rewrote the `## Verification` section of `docs/WorkerFrameProtocol.md`. The test command now targets `src/MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter WorkerFrameProtocolTests`; the build command now targets `src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86`. The prose now states the frame protocol lives in `MxGateway.Worker.Ipc` (naming `WorkerFrameReader`/`WorkerFrameWriter`/`WorkerFrameProtocolOptions` and the `WorkerFrameProtocolTests.cs` test file) and notes the worker is an x86 process. Verified against the source: the frame-protocol types are confirmed under `src/MxGateway.Worker/Ipc/` and the tests under `src/MxGateway.Worker.Tests/Ipc/`, so the original doc was wrong on both project and component. Fenced code blocks were also relabelled `powershell` (the build/test commands are run from PowerShell on this Windows dev box).
### Worker.Tests-008