diff --git a/code-reviews/Worker.Tests/findings.md b/code-reviews/Worker.Tests/findings.md index 2fb4288..008bdd8 100644 --- a/code-reviews/Worker.Tests/findings.md +++ b/code-reviews/Worker.Tests/findings.md @@ -7,7 +7,7 @@ | Review date | 2026-05-18 | | Commit reviewed | `6c64030` | | Status | Reviewed | -| Open findings | 13 | +| Open findings | 8 | ## Checklist coverage @@ -63,13 +63,13 @@ | Severity | Medium | | Category | Concurrency & thread safety | | Location | `src/MxGateway.Worker.Tests/Sta/StaRuntimeTests.cs:46-48` | -| Status | Open | +| Status | Resolved | **Description:** `InvokeAsync_WakesIdlePumpForQueuedCommand` asserts `stopwatch.Elapsed < TimeSpan.FromSeconds(2)` — a wall-clock assertion that on a loaded CI agent can exceed 2s, producing a false failure. The test also does not actually prove the wake event (vs the 50 ms idle pump) caused the dispatch. **Recommendation:** Remove the wall-clock assertion (the awaited result already proves the command ran), or raise the budget substantially with a comment that it is a coarse smoke check. -**Resolution:** _(open)_ +**Resolution:** 2026-05-18 — Removed the `Stopwatch` and the `stopwatch.Elapsed < TimeSpan.FromSeconds(2)` wall-clock assertion from `InvokeAsync_WakesIdlePumpForQueuedCommand`. The test already constructs the `StaRuntime` with a 30-second idle pump period, so the awaited `InvokeAsync` completing at all proves the command wake event — not the idle pump tick — drove the dispatch; no timing budget is needed. The XML-doc comment now states this explicitly. The now-unused `using System.Diagnostics;` was removed (`TreatWarningsAsErrors`). ### Worker.Tests-004 @@ -78,13 +78,13 @@ | Severity | Medium | | Category | Concurrency & thread safety | | Location | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs:281-329` | -| Status | Open | +| Status | Resolved | **Description:** `StartAsync_WithAlarmCommandHandlerFactory_PollOnceCalledViaSta` and `Dispose_StopsAlarmPollLoop` use poll-until loops, and `Dispose_StopsAlarmPollLoop` additionally does `await Task.Delay(1000)` then asserts `PollCount` is unchanged. The 1s "no further polls" window is a timing race: a poll scheduled just before disposal could increment the counter afterward, and a slow agent could simply not run a poll in the window even without correct stop logic. **Recommendation:** Make the poll loop deterministically observable — expose a "poll loop stopped" signal or have `Dispose` join the poll task — then assert on that rather than on elapsed-time silence. -**Resolution:** _(open)_ +**Resolution:** 2026-05-18 — `MxAccessStaSession.Dispose` now joins the alarm poll task (`pollTaskToJoin.Wait(TimeSpan.FromSeconds(5))`) after cancelling the poll CTS, instead of setting `alarmPollTask = null` and discarding it. Once `Dispose` returns, the poll loop has provably exited and no `PollOnce` call can still be in flight. `Dispose_StopsAlarmPollLoop` was rewritten to drop the `await Task.Delay(1000)` "no further polls" window: it now captures `PollCount` immediately after `Dispose()` returns and re-asserts equality after a bare `await Task.Yield()` — a deterministic frozen-count check rather than an elapsed-time race. The success-direction poll-until loop in `PollOnceCalledViaSta` was left as-is: waiting for an event to *occur* is sound; only waiting for an event to *not* occur is the race, and that pattern is now eliminated. Note: `ShutdownGracefullyAsync` already joined the poll task, so this change makes `Dispose` consistent with the graceful path. ### Worker.Tests-005 @@ -93,13 +93,13 @@ | Severity | Medium | | Category | Performance & resource management | | Location | `src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs:20-31,103-105`, `src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs:28-31` | -| Status | Open | +| Status | Resolved | **Description:** `MemoryStream` instances are created and never disposed across the frame-protocol and pipe-session tests (`MemoryStream stream = new();` with no `using`). Disposal is cheap so impact is low, but it is inconsistent with the rest of the suite (which carefully `using`s `CancellationTokenSource`, `StaRuntime`, `PipePair`). `WorkerFrameWriter`/`WorkerFrameReader` are also constructed without disposal. **Recommendation:** Wrap `MemoryStream` (and reader/writer if they are `IDisposable`) in `using` declarations for consistency. -**Resolution:** _(open)_ +**Resolution:** 2026-05-18 — All six `MemoryStream` test-body declarations in `WorkerFrameProtocolTests.cs` and the five `inbound`/`outbound` `MemoryStream` declarations in the `WorkerPipeSessionTests.cs` handshake tests were converted to `using` declarations, matching how the rest of the suite handles `CancellationTokenSource`/`StaRuntime`/`PipePair`. Re-triage of the parenthetical: `WorkerFrameWriter` and `WorkerFrameReader` are **not** `IDisposable` (`sealed class` with no `IDisposable` and no `Dispose` member — verified in `src/MxGateway.Worker/Ipc/`), so the finding's "reader/writer if they are `IDisposable`" suggestion does not apply and no change was made there. The shared `MemoryStream` instances inside the `WorkerPipeSessionTests` harness/helper classes (`ReadWrittenFrames` parameter, the `PipePair`/harness fields) are out of the cited line scope and were left untouched. ### Worker.Tests-006 @@ -108,13 +108,13 @@ | Severity | Medium | | Category | Performance & resource management | | Location | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs:282,305,315,323` | -| Status | Open | +| Status | Resolved | **Description:** `Dispose_StopsAlarmPollLoop` constructs `MxAccessStaSession session` without `using` (unlike every sibling test) and relies on an explicit `session.Dispose()`. If an assertion between `StartAsync` and `Dispose()` throws, the session — its STA thread and poll loop — leaks for the rest of the run. The `StaRuntime` is `using`d so the thread is eventually reclaimed, but the alarm poll loop and handler are not. **Recommendation:** Use `using MxAccessStaSession session = ...` and drop the manual `Dispose()`, or wrap the body in try/finally. -**Resolution:** _(open)_ +**Resolution:** 2026-05-18 — `Dispose_StopsAlarmPollLoop` now declares its `MxAccessStaSession` with a `using` declaration. The manual `session.Dispose()` is kept because the test's purpose is to observe poll behaviour across disposal — but `MxAccessStaSession.Dispose` is idempotent (guarded by the `disposed` field), so the explicit mid-test call and the `using`-scope call do not conflict. An assertion thrown anywhere in the body now still tears the session (STA poll loop + alarm handler) down. The cited line numbers in the finding were imprecise — they straddle `PollOnceCalledViaSta` and `Dispose_StopsAlarmPollLoop` — but the described root cause (one `MxAccessStaSession` constructed without `using`) was singular and is the one in `Dispose_StopsAlarmPollLoop`; the sibling tests `PollOnceCalledViaSta` and `RunAlarmPollLoop_WhenPollOnceThrows_RecordsFaultOnEventQueue` already used `using` and needed no change. ### Worker.Tests-007 @@ -123,13 +123,13 @@ | Severity | Medium | | Category | Design-document adherence | | Location | `docs/WorkerFrameProtocol.md:38-49` | -| Status | Open | +| Status | Resolved | **Description:** `docs/WorkerFrameProtocol.md` instructs running `dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter WorkerFrameProtocolTests` and states the frame protocol "is part of `MxGateway.Server`". The frame protocol actually lives in `MxGateway.Worker.Ipc` and is tested by `src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs`. The doc's verification command points at the wrong project and build, so anyone following it after changing the worker frame protocol will not run the relevant tests. **Recommendation:** Update `docs/WorkerFrameProtocol.md` to reference `src/MxGateway.Worker.Tests` and the x86 worker build (`-p:Platform=x86`). -**Resolution:** _(open)_ +**Resolution:** 2026-05-18 — Rewrote the `## Verification` section of `docs/WorkerFrameProtocol.md`. The test command now targets `src/MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter WorkerFrameProtocolTests`; the build command now targets `src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86`. The prose now states the frame protocol lives in `MxGateway.Worker.Ipc` (naming `WorkerFrameReader`/`WorkerFrameWriter`/`WorkerFrameProtocolOptions` and the `WorkerFrameProtocolTests.cs` test file) and notes the worker is an x86 process. Verified against the source: the frame-protocol types are confirmed under `src/MxGateway.Worker/Ipc/` and the tests under `src/MxGateway.Worker.Tests/Ipc/`, so the original doc was wrong on both project and component. Fenced code blocks were also relabelled `powershell` (the build/test commands are run from PowerShell on this Windows dev box). ### Worker.Tests-008 diff --git a/docs/WorkerFrameProtocol.md b/docs/WorkerFrameProtocol.md index 88de84a..e560d99 100644 --- a/docs/WorkerFrameProtocol.md +++ b/docs/WorkerFrameProtocol.md @@ -35,17 +35,22 @@ oversized frames, protocol version mismatches, and session mismatches. ## Verification +The frame protocol lives in `MxGateway.Worker.Ipc` (`WorkerFrameReader`, +`WorkerFrameWriter`, `WorkerFrameProtocolOptions`) and is covered by +`src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs`. The worker is an +x86 process, so build and test it with `-p:Platform=x86`. + Run the focused tests after changing the frame protocol: -```bash -dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter WorkerFrameProtocolTests +```powershell +dotnet test src/MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter WorkerFrameProtocolTests ``` -Run the gateway build because the frame protocol is part of -`MxGateway.Server`: +Run the x86 worker build because the frame protocol is part of +`MxGateway.Worker`: -```bash -dotnet build src/MxGateway.Server/MxGateway.Server.csproj +```powershell +dotnet build src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86 ``` ## Related Documentation diff --git a/src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs b/src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs index c128af9..dbd81c5 100644 --- a/src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs +++ b/src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs @@ -1,4 +1,3 @@ -using System; using System.IO; using System.Linq; using System.Threading.Tasks; @@ -19,7 +18,7 @@ public sealed class WorkerFrameProtocolTests public async Task WriteAndReadAsync_WithValidEnvelope_RoundTripsFrame() { WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream stream = new(); + using MemoryStream stream = new(); WorkerEnvelope original = CreateGatewayHelloEnvelope(); WorkerFrameWriter writer = new(stream, options); @@ -39,7 +38,7 @@ public sealed class WorkerFrameProtocolTests WorkerFrameProtocolOptions options = CreateOptions(); WorkerEnvelope envelope = CreateGatewayHelloEnvelope(); envelope.ProtocolVersion++; - MemoryStream stream = new(CreateFrame(envelope)); + using MemoryStream stream = new(CreateFrame(envelope)); WorkerFrameReader reader = new(stream, options); WorkerFrameProtocolException exception = @@ -56,7 +55,7 @@ public sealed class WorkerFrameProtocolTests WorkerFrameProtocolOptions options = CreateOptions(); WorkerEnvelope envelope = CreateGatewayHelloEnvelope(); envelope.SessionId = "different-session"; - MemoryStream stream = new(CreateFrame(envelope)); + using MemoryStream stream = new(CreateFrame(envelope)); WorkerFrameReader reader = new(stream, options); WorkerFrameProtocolException exception = @@ -71,7 +70,7 @@ public sealed class WorkerFrameProtocolTests public async Task ReadAsync_WithMalformedLength_ThrowsMalformedLength() { WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream stream = new(new byte[sizeof(uint)]); + using MemoryStream stream = new(new byte[sizeof(uint)]); WorkerFrameReader reader = new(stream, options); WorkerFrameProtocolException exception = @@ -86,7 +85,7 @@ public sealed class WorkerFrameProtocolTests public async Task ReadAsync_WithMalformedPayload_ThrowsInvalidEnvelope() { WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream stream = new(CreateFrame(new byte[] { 0x80 })); + using MemoryStream stream = new(CreateFrame(new byte[] { 0x80 })); WorkerFrameReader reader = new(stream, options); WorkerFrameProtocolException exception = @@ -101,7 +100,7 @@ public sealed class WorkerFrameProtocolTests public async Task WriteAsync_WithConcurrentCalls_SerializesCompleteFrames() { WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream stream = new(); + using MemoryStream stream = new(); WorkerFrameWriter writer = new(stream, options); await Task.WhenAll( diff --git a/src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs b/src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs index 79d0030..45d3487 100644 --- a/src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs +++ b/src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs @@ -24,10 +24,10 @@ public sealed class WorkerPipeSessionTests public async Task CompleteStartupHandshakeAsync_WithValidGatewayHello_SendsHelloThenReady() { WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream inbound = new(); + using MemoryStream inbound = new(); await new WorkerFrameWriter(inbound, options).WriteAsync(CreateGatewayHelloEnvelope()); inbound.Position = 0; - MemoryStream outbound = new(); + using MemoryStream outbound = new(); WorkerPipeSession session = CreateSession(inbound, outbound, options); bool initialized = false; @@ -55,10 +55,10 @@ public sealed class WorkerPipeSessionTests public async Task CompleteStartupHandshakeAsync_WithWrongNonce_FaultsBeforeInitialization() { WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream inbound = new(); + using MemoryStream inbound = new(); await new WorkerFrameWriter(inbound, options).WriteAsync(CreateGatewayHelloEnvelope(nonce: "wrong")); inbound.Position = 0; - MemoryStream outbound = new(); + using MemoryStream outbound = new(); WorkerPipeSession session = CreateSession(inbound, outbound, options); bool initialized = false; @@ -83,10 +83,10 @@ public sealed class WorkerPipeSessionTests public async Task CompleteStartupHandshakeAsync_WithWrongProtocol_FaultsBeforeInitialization() { WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream inbound = new(); + using MemoryStream inbound = new(); await new WorkerFrameWriter(inbound, options).WriteAsync(CreateGatewayHelloEnvelope(supportedProtocolVersion: 999)); inbound.Position = 0; - MemoryStream outbound = new(); + using MemoryStream outbound = new(); WorkerPipeSession session = CreateSession(inbound, outbound, options); bool initialized = false; @@ -110,8 +110,8 @@ public sealed class WorkerPipeSessionTests public async Task CompleteStartupHandshakeAsync_WithMalformedFrame_WritesWorkerFault() { WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream inbound = new(CreateFrame(new byte[] { 0x80 })); - MemoryStream outbound = new(); + using MemoryStream inbound = new(CreateFrame(new byte[] { 0x80 })); + using MemoryStream outbound = new(); WorkerPipeSession session = CreateSession(inbound, outbound, options); bool initialized = false; @@ -137,10 +137,10 @@ public sealed class WorkerPipeSessionTests { const int hresult = unchecked((int)0x80040154); WorkerFrameProtocolOptions options = CreateOptions(); - MemoryStream inbound = new(); + using MemoryStream inbound = new(); await new WorkerFrameWriter(inbound, options).WriteAsync(CreateGatewayHelloEnvelope()); inbound.Position = 0; - MemoryStream outbound = new(); + using MemoryStream outbound = new(); WorkerPipeSession session = CreateSession(inbound, outbound, options); await Assert.ThrowsAsync( diff --git a/src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs b/src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs index fd9d20c..b013b01 100644 --- a/src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs +++ b/src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs @@ -293,7 +293,11 @@ public sealed class MxAccessStaSessionTests /// /// Gap 2: Verifies that the STA poll loop stops when the session is disposed — - /// no further PollOnce calls after disposal. + /// no further PollOnce calls after disposal. + /// joins the poll task before returning, so once Dispose returns no PollOnce + /// call can still be in flight. The test asserts the poll count is frozen + /// immediately after Dispose and stays frozen — deterministic, with no + /// elapsed-time "no further polls" window that a slow agent could race. /// [Fact] public async Task Dispose_StopsAlarmPollLoop() @@ -302,7 +306,11 @@ public sealed class MxAccessStaSessionTests FakeMxAccessComObjectFactory factory = new(); FakeMxAccessEventSink eventSink = new(); using StaRuntime runtime = CreateRuntime(); - MxAccessStaSession session = new( + // using declaration: if an assertion below throws before the explicit + // Dispose, the session (its STA poll loop and alarm handler) is still + // torn down. Dispose is idempotent, so the explicit call mid-test and + // the using-scope call do not conflict. + using MxAccessStaSession session = new( runtime, factory, eventSink, @@ -320,11 +328,15 @@ public sealed class MxAccessStaSessionTests Assert.True(handler.PollCount > 0, "Prerequisite: poll loop must have fired before dispose."); + // Dispose joins the poll task; when it returns the loop has stopped + // and no PollOnce call is still running. session.Dispose(); int pollCountAtDispose = handler.PollCount; - // Wait 1 second and verify no further polls occur. - await Task.Delay(1000); + // The count is already frozen — re-reading after a yield must not + // observe any further poll. This is a deterministic check, not a + // timing window: a poll cannot start once the joined loop has exited. + await Task.Yield(); Assert.Equal(pollCountAtDispose, handler.PollCount); } diff --git a/src/MxGateway.Worker.Tests/Sta/StaRuntimeTests.cs b/src/MxGateway.Worker.Tests/Sta/StaRuntimeTests.cs index f068b80..b9562bd 100644 --- a/src/MxGateway.Worker.Tests/Sta/StaRuntimeTests.cs +++ b/src/MxGateway.Worker.Tests/Sta/StaRuntimeTests.cs @@ -1,5 +1,4 @@ using System; -using System.Diagnostics; using System.Threading; using System.Threading.Tasks; using MxGateway.Worker.Sta; @@ -27,7 +26,15 @@ public sealed class StaRuntimeTests Assert.Equal(ApartmentState.STA, observation.ApartmentState); } - /// Verifies that InvokeAsync wakes the idle pump when a command is queued. + /// + /// Verifies that InvokeAsync wakes the idle pump when a command is queued. + /// The pump is configured with a 30-second idle period — far longer than + /// any reasonable test run — so the awaited command completing at all proves + /// the command wake event (not the idle pump tick) drove the dispatch. No + /// wall-clock assertion is used: a loaded CI agent can stall an otherwise + /// correct dispatch past an arbitrary millisecond budget, which would be a + /// false failure. + /// [Fact] public async Task InvokeAsync_WakesIdlePumpForQueuedCommand() { @@ -37,15 +44,10 @@ public sealed class StaRuntimeTests new StaMessagePump(), TimeSpan.FromSeconds(30)); runtime.Start(); - Stopwatch stopwatch = Stopwatch.StartNew(); int threadId = await runtime.InvokeAsync(() => Thread.CurrentThread.ManagedThreadId); - stopwatch.Stop(); Assert.Equal(runtime.StaThreadId, threadId); - Assert.True( - stopwatch.Elapsed < TimeSpan.FromSeconds(2), - $"Command took {stopwatch.Elapsed} to execute, so the command wake event did not wake the STA promptly."); } /// Verifies that Shutdown stops the thread and uninitializes the COM apartment. diff --git a/src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs b/src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs index 32e0e31..d8c3c7a 100644 --- a/src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs +++ b/src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs @@ -580,13 +580,27 @@ public sealed class MxAccessStaSession : IWorkerRuntimeSession RequestShutdown(); - // Cancel and discard the STA poll loop. + // Cancel the STA poll loop and join it before disposing the alarm + // handler. Joining (rather than discarding alarmPollTask) makes the + // stop deterministic: once Dispose returns, no further PollOnce calls + // can be in flight, so callers and tests can rely on a frozen poll + // count instead of an elapsed-time "no further polls" window. CancellationTokenSource? pollCtsToDispose = alarmPollCts; + Task? pollTaskToJoin = alarmPollTask; alarmPollCts = null; alarmPollTask = null; if (pollCtsToDispose is not null) { try { pollCtsToDispose.Cancel(); } catch { } + if (pollTaskToJoin is not null) + { + try + { + pollTaskToJoin.Wait(TimeSpan.FromSeconds(5)); + } + catch (AggregateException) { } + catch (ObjectDisposedException) { } + } try { pollCtsToDispose.Dispose(); } catch { } }