Tests-025 (Conventions): Extracted the previously-duplicated NullDashboardEventBroadcaster into TestSupport/NullDashboardEventBroadcaster.cs (singleton Instance, private ctor). The two nested copies in EventStreamServiceTests and GatewayEndToEndFakeWorkerSmokeTests were removed; both files now use the shared type via 'using ZB.MOM.WW.MxGateway.Tests.TestSupport;'. The Server-041 regression test's ThrowingDashboardEventBroadcaster is intentionally left nested: single-file usage doesn't warrant promotion to TestSupport. The third copy in IntegrationTests/WorkerLiveMxAccessSmokeTests was handled by IntegrationTests-024 in its own commit.

Tests-026 (Testing coverage): Added a new RecordingDashboardEventBroadcaster test double in TestSupport: a thread-safe (ConcurrentQueue<DashboardEventCapture>) recorder. New fixture StreamEventsAsync_PublishesEachEventToDashboardBroadcaster in EventStreamServiceTests pushes two events through the fake session and asserts the broadcaster received both with the correct sessionId and WorkerSequence. TDD red→green confirmed: the deliberately-wrong "Expected 3, Actual 2" red phase proved the recording fake was actually invoked by the production code path.

Verification: 486/486 server tests passing (485 previous + 1 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review — Tests
| Field | Value |
|---|---|
| Module | src/ZB.MOM.WW.MxGateway.Tests |
| Reviewer | Claude Code |
| Review date | 2026-05-24 |
| Commit reviewed | d692232 |
| Status | Reviewed |
| Open findings | 0 |
Checklist coverage
This pass (commit a020350) re-reviews the module after the Tests-013–019 batch was resolved alongside Server-017, Server-021, and Contracts-010.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found: Tests-023 (the companion FakeWorkerProcess.WaitForExitAsync in SessionWorkerClientFactoryFakeWorkerTests.cs still uses the Tests-015 cheating pattern — HasExited = true; ExitCode = 0; regardless of whether the worker actually exited — and is a latent regression vector if any future exit assertion is added to that file). Tests-015 was only applied to the smoke-test copy. |
| 2 | mxaccessgw conventions | No new issues. Style/convention drift previously filed (Tests-008) remains resolved at a020350. |
| 3 | Concurrency & thread safety | No new issues. The remaining wall-clock dependencies (InvokeAsync_WhenSessionReady_RefreshesLease uses UtcNow at both ends of a ~1 hour delta, dwarfing clock resolution; CloseExpiredLeasesAsync_* reads UtcNow once and uses it consistently for both sides) are intrinsic to the production paths and not flake sources. The Tests-017 fix is in place at WorkerClientTests.cs:354. |
| 4 | Error handling & resilience | No new issues. Tests-013 closed the bulk-method coverage gap end-to-end (per-entry failure surfaces, protocol-status failures, and cancellation propagation are all exercised). Pipe-disconnect / worker-fault / kill paths all covered. |
| 5 | Security | No new issues. Adversarial-input safety (Tests-002), anonymous-localhost negatives (Tests-010), interceptor-service composition (Tests-004), constraint partial-denial merging (Server-021 — PredicateConstraintEnforcer + MxAccessGatewayServiceConstraintTests), and unmapped-RPC fail-closed (Server-017) all covered. |
| 6 | Performance & resource management | No new issues. Tests-014 (await using WebApplication) is applied to all seven GatewayApplication.Build(...) sites. Tests-003 (TempDatabaseDirectory) cleanup is in place. |
| 7 | Design-document adherence | Tests match docs/GatewayTesting.md; the new "Galaxy Filter Safety" subsection added under Tests-019 names GalaxyFilterInputSafetyTests. No drift found. |
| 8 | Code organization & conventions | Issue found: Tests-021 (ManualTimeProvider is duplicated as a private sealed class in four test files — WorkerClientTests, FakeWorkerHarnessTests, SessionManagerTests, GalaxyHierarchyCacheTests — and should follow the Tests-007 TestSupport/ consolidation pattern). |
| 9 | Testing coverage | Issues found: Tests-020 (MxAccessGatewayServiceConstraintTests covers only 2 of 4 WriteBulkConstraintPlan switch arms — Write2Bulk/WriteSecured2Bulk GetPayload/SetPayload would silently break with no failing test), Tests-022 (the eleven SessionManagerBulkTests.*_PropagatesCancellation tests pre-cancel the token, so the fake's first-line ThrowIfCancellationRequested handles it before InvokeBulkInternalAsync even runs — they do not exercise mid-flight cancellation), Tests-024 (BulkConstraintPlan.MergeDeniedInto silently drops or under-fills if the worker reply count diverges from the allowed-count — no test pins this protocol-mismatch edge case). |
| 10 | Documentation & comments | No new issues. Tests-019's docs/GatewayTesting.md addition is in place; new test files (SessionManagerBulkTests, MxAccessGatewayServiceConstraintTests, PredicateConstraintEnforcer) all have orienting class-level summaries. |
2026-05-24 review (commit d692232)
Re-review pass at d692232 scoped to the test-side fixture churn from
the dashboard refactor wave: the rename touched every namespace declaration
and using; the dashboard auth refactor rewrote three dashboard test files
(DashboardApiKeyAuthorizationTests, DashboardAuthorizationHandlerTests,
DashboardAuthenticatorTests); GatewayApplicationTests was updated for
root-mounted routes and the new ViewerPolicy; DashboardCookieOptionsTests
expects root-relative login/logout; a new DashboardHubsRegistrationTests
pins the three hub /negotiate endpoints and the DI shape; and the
EventStreamService ctor expansion drove inline NullDashboardEventBroadcaster
fakes in two test files.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found in the a020350..d692232 diff. |
| 2 | mxaccessgw conventions | No issues found — namespaces updated cleanly, the fixture-helper consolidation pattern (TestSupport/) is intact. |
| 3 | Concurrency & thread safety | No issues found in this diff. |
| 4 | Error handling & resilience | No issues found in this diff. |
| 5 | Security | No issues found — DashboardAuthorizationHandlerTests covers both Viewer and Admin role paths and the loopback bypass. |
| 6 | Performance & resource management | No issues found in this diff. |
| 7 | Design-document adherence | No issues found in this diff. |
| 8 | Code organization & conventions | Issue found: Tests-025 (duplicate NullDashboardEventBroadcaster private classes in EventStreamServiceTests and GatewayEndToEndFakeWorkerSmokeTests; follow Tests-007 / Tests-021 consolidation pattern). |
| 9 | Testing coverage | Issue found: Tests-026 (no test proves EventStreamService actually calls IDashboardEventBroadcaster.Publish for each event; the only consumers in tests are Null fakes). |
| 10 | Documentation & comments | No issues found in this diff. |
Findings
Tests-001
| Field | Value |
|---|---|
| Severity | High |
| Category | Testing coverage |
| Location | src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:483-489 |
| Status | Resolved |
Description: FakeSessionManager.TryGetSession unconditionally returns true and synthesizes a session for any id. As a result, Invoke_WhenSessionMissing_ThrowsNotFound (line 52) only passes because InvokeException is pre-seeded — it does not verify that the gateway service maps a genuinely missing session to NotFound. No test exercises the real gateway path where TryGetSession returns false (for StreamEvents, CloseSession, alarm RPCs). A regression dropping the missing-session check would not be caught.
Recommendation: Make FakeSessionManager.TryGetSession return false for unknown ids (return only seeded sessions), then assert NotFound/InvalidArgument is produced by the service's own lookup logic rather than an injected exception.
Resolution: Resolved 2026-05-18: confirmed root cause — added ResolveOnlySeededSessions/SeedSession to FakeSessionManager so TryGetSession returns false for unseeded ids, rewrote Invoke_WhenSessionMissing_ThrowsNotFound to drop the injected InvokeException and exercise the service's own ResolveSession lookup (asserts InvokeCount == 0), and added Invoke_WhenSessionSeeded_ResolvesAndInvokes, AcknowledgeAlarm_WhenSessionMissing_ThrowsNotFound, and QueryActiveAlarms_WhenSessionMissing_ThrowsNotFound.
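The seeded-lookup behaviour described in this resolution can be sketched as follows. This is a minimal illustration, not the project's actual fake: only the names ResolveOnlySeededSessions, SeedSession, and TryGetSession come from the finding, while the FakeSession record, the string session id, and the method signatures are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Demo: with ResolveOnlySeededSessions on, only seeded ids resolve.
var sessions = new FakeSessionManager { ResolveOnlySeededSessions = true };
sessions.SeedSession("session-1");

Console.WriteLine(sessions.TryGetSession("session-1", out _)); // seeded id resolves
Console.WriteLine(sessions.TryGetSession("missing", out _));   // unseeded id misses

// Simplified stand-in for the real session type (assumption).
public sealed record FakeSession(string Id);

// Sketch of a fake whose lookup can be restricted to explicitly seeded sessions.
public sealed class FakeSessionManager
{
    private readonly Dictionary<string, FakeSession> _seeded = new(StringComparer.Ordinal);

    // When false, the fake keeps the old permissive behaviour and synthesizes a session.
    public bool ResolveOnlySeededSessions { get; set; }

    public void SeedSession(string sessionId) => _seeded[sessionId] = new FakeSession(sessionId);

    public bool TryGetSession(string sessionId, out FakeSession? session)
    {
        if (_seeded.TryGetValue(sessionId, out session))
        {
            return true;
        }

        if (ResolveOnlySeededSessions)
        {
            // Unknown id: report a genuine miss so the caller's own NotFound mapping runs.
            session = null;
            return false;
        }

        session = new FakeSession(sessionId);
        return true;
    }
}
```

Keeping the permissive mode behind a flag lets existing tests stay unchanged while the missing-session tests opt in to the strict lookup.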
Tests-002
| Field | Value |
|---|---|
| Severity | High |
| Category | Security |
| Location | src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:198-210 |
| Status | Resolved |
Description: The Galaxy Repository RPCs browse a SQL Server database (ZB). Every test injects a StubGalaxyHierarchyCache, so actual SQL query construction, parameterization, and filter/glob translation are never exercised. No test demonstrates that TagNameGlob, RootTagName, AlarmFilterPrefix, etc. are passed as parameters rather than concatenated into SQL. SQL-injection resistance of the Galaxy layer has zero coverage.
Recommendation: Add tests for the GalaxyRepository query-building layer (against SQLite or an in-memory abstraction, or by asserting parameter objects), covering glob/prefix inputs containing ', %, _, and ;. At minimum add a unit test over the SQL LIKE-pattern escaping helper.
Re-triage note: The finding's premise is partly misframed. GalaxyRepository issues only four constant SQL statements (HierarchySql, AttributesSql, SELECT 1, SELECT time_of_last_deploy FROM galaxy) — no DiscoverHierarchyRequest field is ever concatenated into SQL, so there is no dynamic SQL-injection surface and no LIKE-escaping helper to test. AlarmFilterPrefix belongs to the worker alarm path, not the Galaxy SQL layer. All filters (TagNameGlob, RootTagName, template-chain, category, contained-path) are applied in memory by GalaxyHierarchyProjector/GalaxyGlobMatcher against the cached snapshot. The genuine, testable concern — that adversarial filter strings are treated as opaque literals (no wildcard behaviour, no ReDoS, no exceptions) — remains valid and was previously uncovered. Severity left at High: an unsafe in-memory filter would still be a real security gap.
Resolution: Resolved 2026-05-18: added src/MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs (10 test methods, mostly [Theory] over adversarial inputs ', ' OR '1'='1, '; DROP TABLE gobject;--, %, _, 100%_off, [abc], Pump'001) covering GalaxyGlobMatcher literal-treatment / LIKE-wildcard / pathological-input (ReDoS) behaviour and GalaxyHierarchyProjector + DiscoverHierarchy RPC handling of adversarial TagNameGlob, RootTagName, and TemplateChainContains. No product bug found — the in-memory filter layer treats all metacharacters as literals; the passing tests resolve the coverage gap.
Tests-003
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Performance & resource management |
| Location | src/MxGateway.Tests/Security/Authentication/SqliteAuthStoreTests.cs:170-176, src/MxGateway.Tests/Security/Authentication/ApiKeyAdminCliRunnerTests.cs:252-258 |
| Status | Resolved |
Description: CreateTempDatabasePath creates a fresh directory under %TEMP%\mxgateway-auth-tests\<guid> (and ...-cli-tests) for every test but nothing ever deletes it. WorkerProcessLauncherTests.TestDirectory correctly implements IDisposable and cleans up; these two do not. SQLite connection pooling can also keep the .db handle open after the test. Over many CI runs this leaks temp files and open handles.
Recommendation: Wrap the temp directory in an IDisposable/IAsyncDisposable helper (as WorkerProcessLauncherTests does) and call SqliteConnection.ClearAllPools() before deletion, or use Microsoft.Data.Sqlite in-memory mode where a real file is not needed.
Resolution: Resolved 2026-05-18: confirmed root cause — both CreateTempDatabasePath helpers created %TEMP% directories with no cleanup, and Microsoft.Data.Sqlite pools connections by default so the .db handle outlives the test. Added a shared TempDatabaseDirectory (src/MxGateway.Tests/Security/Authentication/TempDatabaseDirectory.cs) IDisposable helper that calls SqliteConnection.ClearAllPools() and recursively deletes its directory. SqliteAuthStoreTests and ApiKeyAdminCliRunnerTests now implement IDisposable, track every directory created via CreateTempDatabasePath, and dispose them after each test. All affected tests still pass.
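A disposable temp-directory helper of the kind described above might look roughly like this. It is a sketch under stated assumptions: the constructor shape, the CreateDatabasePath method, and the injected releasePooledHandles callback are hypothetical. In the real helper the callback's role is played by SqliteConnection.ClearAllPools() from Microsoft.Data.Sqlite; a no-op delegate is used here only to keep the example dependency-free.

```csharp
using System;
using System.IO;

string databasePath;

// The callback stands in for SqliteConnection.ClearAllPools() (assumption: injected for the sketch).
using (var directory = new TempDatabaseDirectory("mxgateway-auth-tests", releasePooledHandles: () => { }))
{
    databasePath = directory.CreateDatabasePath();
    File.WriteAllText(databasePath, "placeholder");
    Console.WriteLine(File.Exists(databasePath)); // file exists while the helper is alive
}

Console.WriteLine(Directory.Exists(Path.GetDirectoryName(databasePath))); // directory is gone after Dispose

// Sketch of an IDisposable temp directory that releases handles before deleting itself.
public sealed class TempDatabaseDirectory : IDisposable
{
    private readonly Action _releasePooledHandles;

    public TempDatabaseDirectory(string prefix, Action releasePooledHandles)
    {
        _releasePooledHandles = releasePooledHandles;
        DirectoryPath = Path.Combine(Path.GetTempPath(), prefix, Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(DirectoryPath);
    }

    public string DirectoryPath { get; }

    // Hypothetical file name; the real helper may name the database differently.
    public string CreateDatabasePath() => Path.Combine(DirectoryPath, "auth.db");

    public void Dispose()
    {
        // Release pooled connections first so the .db handle is closed before deletion.
        _releasePooledHandles();

        if (Directory.Exists(DirectoryPath))
        {
            Directory.Delete(DirectoryPath, recursive: true);
        }
    }
}
```

The ordering inside Dispose is the point of the sketch: on Windows a recursive delete fails while a pooled connection still holds the file open.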
Tests-004
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs |
| Status | Resolved |
Description: The authorization interceptor and MxAccessGatewayService are each tested in isolation, but no test composes the interceptor in front of the real service to confirm scope enforcement gates real RPCs end-to-end. A wiring mistake — interceptor not registered, or a new RPC added without a scope mapping in GatewayGrpcScopeResolver — would pass every existing test. GatewayGrpcScopeResolverTests also only checks an enumerated allow-list; it never asserts an unmapped request type fails closed.
Recommendation: Add an end-to-end test that runs OpenSession/Invoke through the interceptor+service composition with insufficient scope and asserts PermissionDenied; add a GatewayGrpcScopeResolver test asserting an unknown/unmapped request type throws or denies rather than returning a permissive default.
Resolution: Resolved 2026-05-18: confirmed the coverage gap. Added three interceptor+service composition tests to GatewayGrpcAuthorizationInterceptorTests that run the real GatewayGrpcAuthorizationInterceptor continuation into a real MxAccessGatewayService: InterceptorComposedWithService_OpenSessionMissingScope_DeniesBeforeServiceRuns (asserts PermissionDenied and OpenSessionCount == 0), InterceptorComposedWithService_OpenSessionWithScope_RunsServiceWithIdentity (service runs and observes the interceptor-pushed identity), and InterceptorComposedWithService_InvokeWriteCommandWithReadScope_DeniesBeforeServiceRuns (a Write command with only invoke:read is denied). Added two GatewayGrpcScopeResolverTests: ResolveRequiredScope_UnmappedRequestType_FailsClosedToAdminScope confirms an unmapped request type resolves to the most-restrictive Admin scope (the resolver's _ => GatewayScopes.Admin default already fails closed — no product bug), and ResolveRequiredScope_UnknownInvokeCommandKind_ReturnsInvokeReadScope confirms an unknown command kind does not silently grant write/admin access.
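The fail-closed default mentioned in this resolution can be illustrated with a small switch expression. This is only a sketch: the request record types and the "session:open" and "invoke:write" scope strings are hypothetical stand-ins, and the real GatewayGrpcScopeResolver maps many more request types. What it shows is the shape of the `_ => Admin` arm, where anything unmapped lands on the most restrictive scope.

```csharp
using System;

Console.WriteLine(ScopeResolver.ResolveRequiredScope(new OpenSessionRequest()));
Console.WriteLine(ScopeResolver.ResolveRequiredScope(new InvokeRequest(IsWrite: false)));
Console.WriteLine(ScopeResolver.ResolveRequiredScope(new object())); // unmapped type falls through to admin

// Hypothetical request types standing in for the generated gRPC messages.
public sealed record OpenSessionRequest;
public sealed record InvokeRequest(bool IsWrite);

public static class ScopeResolver
{
    public const string Admin = "admin";

    public static string ResolveRequiredScope(object request) => request switch
    {
        OpenSessionRequest => "session:open",            // hypothetical scope name
        InvokeRequest { IsWrite: true } => "invoke:write", // hypothetical scope name
        InvokeRequest => "invoke:read",
        _ => Admin, // fail closed: an unmapped request needs the most restrictive scope
    };
}
```

A test that passes an arbitrary object and expects the admin scope pins the default arm, so a later refactor cannot quietly replace it with a permissive fallback.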
Tests-005
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | src/MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:239-261, src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs |
| Status | Resolved |
Description: Worker-crash handling is only tested as a clean terminal exception from ReadEventsAsync or a pre-set ShutdownException. There is no test for a worker that faults mid-command — an InvokeAsync in flight when the pipe/worker dies — which is a core fault-handling path of the two-process design. WorkerClientTests covers pipe-disconnect faulting the read loop, but not the interaction where a pending InvokeAsync task observes the fault and surfaces a meaningful error code.
Recommendation: Add a WorkerClient/SessionManager test that disposes the worker pipe (or emits a WorkerFault) while an InvokeAsync is pending, and assert the invoke task fails with a WorkerClientException/SessionManagerException carrying the worker-faulted error code.
Resolution: Resolved 2026-05-18: confirmed the coverage gap and confirmed the product path already handles it correctly (WorkerClient.ReadLoopAsync → SetFaulted → CompletePendingCommands(fault) fails every pending command with the fault exception). Added two WorkerClientTests: InvokeAsync_WhenPipeDisconnectsMidCommand_FailsPendingInvokeWithPipeDisconnected (worker reads the command then disposes its pipe side; the pending invoke task fails with WorkerClientErrorCode.PipeDisconnected) and InvokeAsync_WhenWorkerFaultsMidCommand_FailsPendingInvokeWithWorkerFaulted (worker emits a WorkerFault envelope while the invoke is pending; the task fails with WorkerClientErrorCode.WorkerFaulted). Both also assert the client transitions to Faulted. No product change needed.
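The mechanism credited here, where a fault completes every pending command with the fault exception, can be sketched with TaskCompletionSource. The PendingCommandTable class, its method names, and the use of InvalidOperationException are assumptions for illustration; the real WorkerClient surfaces a WorkerClientException with an error code and does considerably more.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var table = new PendingCommandTable();
Task<string> pendingInvoke = table.Register(correlationId: 1);

// Simulate the read loop observing a worker fault while the invoke is still in flight.
table.SetFaulted(new InvalidOperationException("worker faulted"));

try
{
    await pendingInvoke;
}
catch (InvalidOperationException ex)
{
    Console.WriteLine($"pending invoke failed: {ex.Message}");
}

// Sketch of a pending-command table that fails all in-flight commands on fault.
public sealed class PendingCommandTable
{
    private readonly ConcurrentDictionary<long, TaskCompletionSource<string>> _pending = new();

    public Task<string> Register(long correlationId)
    {
        // RunContinuationsAsynchronously keeps awaiting callers off the faulting thread.
        var completion = new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pending[correlationId] = completion;
        return completion.Task;
    }

    public void SetFaulted(Exception fault)
    {
        foreach (long correlationId in _pending.Keys)
        {
            if (_pending.TryRemove(correlationId, out var completion))
            {
                completion.TrySetException(fault);
            }
        }
    }
}
```

A mid-command test then only needs to start an invoke, trigger the fault, and assert on the exception carried by the pending task.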
Tests-006
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:76, src/MxGateway.Tests/Gateway/Workers/FakeWorkerHarnessTests.cs:122 |
| Status | Resolved |
Description: Several tests rely on fixed Task.Delay values: WorkerClientTests.InvokeAsync_WithLateReply… waits a hard-coded 50 ms after writing a late reply before issuing the second command, and the heartbeat tests use a 20 ms delay to make timestamps strictly increase. On a slow CI agent the 50 ms delay can be insufficient, and DateTimeOffset.UtcNow resolution can make the 20 ms heartbeat-advance assertion flaky.
Recommendation: Replace fixed delays with the existing WaitUntilAsync condition polling, and inject a controllable TimeProvider for heartbeat-timestamp comparisons instead of relying on wall-clock advance.
Re-triage note: The brief flagged ReadLoop_WhenClientFaults_KillsOwnedWorkerProcess as "a real WorkerClient fault→kill bug". On inspection it is not a product bug — it is a test race. WorkerClient.SetFaulted publishes the Faulted state under lock before calling KillOwnedProcess, so the old test's WaitUntilAsync(() => client.State == Faulted) could return between those two statements and observe process.KillCount == 0. The kill itself always runs synchronously inside SetFaulted, and ShutdownAsync/DisposeAsync re-issue an idempotent kill, so no real consumer relies on "state==Faulted implies process dead". The fix is therefore a test-quality fix (correctly Medium / Concurrency), not a product fix.
Resolution: Resolved 2026-05-18: (1) Made ReadLoop_WhenClientFaults_KillsOwnedWorkerProcess deterministic — it now awaits FakeWorkerProcess.WaitForExitAsync (the TaskCompletionSource completed inside Kill()), which completes exactly when the kill runs, eliminating the state-polling race; verified by running it five times in isolation (5/5 pass). (2) Removed the fixed 50 ms Task.Delay from InvokeAsync_WithLateReply_IgnoresLateReplyAndKeepsClientReady — the stale reply and the second reply are now sent in pipe (FIFO) order, so the read loop discards the stale reply before the second reply with no timing window. (3) Replaced the 20 ms Task.Delay heartbeat-advance hacks in WorkerClientTests.ReadLoop_WhenHeartbeatArrives_UpdatesLastHeartbeatAndWorkerProcess and FakeWorkerHarnessTests.SendHeartbeatAsync_UpdatesClientHeartbeatState with an injected ManualTimeProvider advanced by a fixed TimeSpan; both tests now assert the exact post-advance timestamp instead of > against wall-clock drift.
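For readers unfamiliar with the injected-clock approach in item (3), a manual time provider can be as small as the sketch below. It derives from the .NET 8+ System.TimeProvider base class; the Advance method and the constructor signature are assumptions, and the project's own ManualTimeProvider may differ.

```csharp
using System;

var start = new DateTimeOffset(2026, 1, 1, 0, 0, 0, TimeSpan.Zero);
var time = new ManualTimeProvider(start);

time.Advance(TimeSpan.FromSeconds(5));

// The comparison can be exact because only Advance moves this clock.
Console.WriteLine(time.GetUtcNow() == start + TimeSpan.FromSeconds(5));

// Sketch of a test clock that moves only when the test tells it to.
public sealed class ManualTimeProvider : TimeProvider
{
    private readonly object _gate = new();
    private DateTimeOffset _utcNow;

    public ManualTimeProvider(DateTimeOffset start) => _utcNow = start;

    public override DateTimeOffset GetUtcNow()
    {
        lock (_gate)
        {
            return _utcNow;
        }
    }

    public void Advance(TimeSpan delta)
    {
        lock (_gate)
        {
            _utcNow += delta;
        }
    }
}
```

Because the clock never advances on its own, a heartbeat test can assert equality against the post-advance timestamp instead of a greater-than comparison that depends on wall-clock resolution.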
Tests-007
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:682, src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:324, src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:460, src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs:233 |
| Status | Resolved |
Description: A near-identical TestServerCallContext implementation is copy-pasted into at least four test files (and AllowAllConstraintEnforcer / TestServerStreamWriter / RecordingStreamWriter into several). Duplication risks the copies drifting and bloats each file.
Recommendation: Extract a shared TestServerCallContext, RecordingServerStreamWriter<T>, and AllowAllConstraintEnforcer into a common test-support folder/namespace.
Resolution: Resolved 2026-05-18: confirmed five duplicated copies (the brief's four plus a fifth in Galaxy/GalaxyFilterInputSafetyTests.cs). Added a shared MxGateway.Tests.TestSupport namespace under src/MxGateway.Tests/TestSupport/: TestServerCallContext.cs (single class with an optional Metadata? requestHeaders constructor parameter that subsumes both the no-arg and headers-bearing variants), RecordingServerStreamWriter.cs (thread-safe writer with Messages and WaitForFirstMessageAsync, replacing TestServerStreamWriter/RecordingStreamWriter/RecordingServerStreamWriter), and AllowAllConstraintEnforcer.cs. Deleted all five TestServerCallContext copies, both AllowAllConstraintEnforcer copies, and the three stream-writer copies; updated the five test files to using MxGateway.Tests.TestSupport; and renamed .Items call sites to .Messages. Removed the now-unused Grpc.Core using from GatewayEndToEndFakeWorkerSmokeTests.cs. Build clean (0 warnings) and suite green.
Tests-008
| Field | Value |
|---|---|
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | src/MxGateway.Tests/Gateway/Sessions/WorkerAlarmRpcDispatcherTests.cs:1-9, src/MxGateway.Tests/Gateway/Sessions/NotWiredAlarmRpcDispatcherTests.cs:1-3, src/MxGateway.Tests/Gateway/Sessions/SessionManagerAlarmAutoSubscribeTests.cs:1 |
| Status | Resolved |
Description: The alarm test files diverge from the project's C# style and the rest of the suite: snake_case test method names instead of the PascalCase Method_Condition_Result pattern; redundant explicit using System;/System.Threading; imports despite implicit global usings; and explicit-type new instead of target-typed new() used elsewhere. There is also a typo in fixture data ("wnwrap subscribe failed").
Recommendation: Rename the alarm tests to the house Method_Condition_Result convention, drop redundant System.* usings, align new usage, and fix the wnwrap typo.
Re-triage note: Two of the finding's claims are incorrect. (1) "wnwrap subscribe failed" is not a typo — WnWrap is the real name of the worker's WnWrapAlarmConsumer MXAccess component (src/MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs); the fixture string deliberately references it, so it was left unchanged. (2) SessionManagerAlarmAutoSubscribeTests.cs already uses PascalCase Method_Condition_Result names and target-typed new(), and its lone using System.Runtime.CompilerServices; is required for [EnumeratorCancellation] (not a global using) — it is not redundant. That file needed no change. The genuine style drift was confined to WorkerAlarmRpcDispatcherTests.cs and NotWiredAlarmRpcDispatcherTests.cs.
Resolution: Resolved 2026-05-18: renamed all ten WorkerAlarmRpcDispatcherTests methods and both NotWiredAlarmRpcDispatcherTests methods from snake_case to the house Method_Condition_Result PascalCase convention; dropped the redundant System/System.Collections.Generic/System.Linq/System.Threading/System.Threading.Tasks usings from WorkerAlarmRpcDispatcherTests.cs and System.Threading/System.Threading.Tasks from NotWiredAlarmRpcDispatcherTests.cs (all are implicit global usings), keeping the required System.Runtime.CompilerServices; converted explicit-type new SessionRegistry()/new WorkerAlarmRpcDispatcher(...)/new FakeAlarmWorkerClient/new List<...>()/new GatewaySession(...) to target-typed new(); and replaced the fully-qualified System.StringComparison with StringComparison. See the re-triage note for the two claims not actioned. Suite green.
Tests-009
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:36-37,99,365 |
| Status | Resolved |
Description: Several XML <summary> comments are copy-paste mismatches: the comment above OpenSessionAsync_SetsInitialDefaultLease describes correlation-ID generation; the comment above GatewaySessionSubscribeBulkAsync_ForwardsOneBulkCommand… describes lease refresh; the comment above CloseExpiredLeasesAsync_DoesNotCloseActiveEventSubscriber describes shutdown closing all sessions. Misleading test docs hinder triage.
Recommendation: Correct the <summary> text to match each test's actual behavior, or remove the redundant comments since the test names already describe the behavior.
Resolution: Resolved 2026-05-18: confirmed three copy-paste <summary> mismatches. Each mislabelled comment was the summary belonging to the following test, left attached to the wrong method (so the test below each had no summary). Corrected all three: OpenSessionAsync_SetsInitialDefaultLease now describes setting the initial lease expiry; the comment above InvokeAsync_WhenSessionReady_RefreshesLease (the finding mis-cited the method name as GatewaySessionSubscribeBulkAsync_…) now describes lease refresh on invoke; and CloseExpiredLeasesAsync_DoesNotCloseActiveEventSubscriber now describes the expired-lease sweep leaving an active-event-subscriber session open. No behavior change.
Tests-010
| Field | Value |
|---|---|
| Severity | Low |
| Category | Security |
| Location | src/MxGateway.Tests/Gateway/Dashboard/DashboardAuthorizationHandlerTests.cs:26-36 |
| Status | Resolved |
Description: The anonymous-localhost bypass is tested only for the success case (allowAnonymousLocalhost: true + loopback succeeds) and the remote-unauthenticated denial. There is no test for the security-critical negatives: anonymous + loopback when AllowAnonymousLocalhost is false must be denied, and anonymous + non-loopback when the flag is true must still be denied (the bypass is scoped strictly to loopback). Those are the misconfiguration cases that would expose the dashboard.
Recommendation: Add tests: anonymous + loopback + allowAnonymousLocalhost: false → not succeeded; anonymous + non-loopback + allowAnonymousLocalhost: true → not succeeded.
Resolution: Resolved 2026-05-18: confirmed the coverage gap and confirmed DashboardAuthorizationHandler already gates the bypass correctly on AllowAnonymousLocalhost && IsLoopbackRequest() (no product bug). Added two DashboardAuthorizationHandlerTests: HandleAsync_AnonymousLocalhostDisallowed_DoesNotSucceed (anonymous + loopback + allowAnonymousLocalhost: false → not succeeded) and HandleAsync_AnonymousLocalhostAllowedFromRemoteAddress_DoesNotSucceed (anonymous + non-loopback + allowAnonymousLocalhost: true → not succeeded, proving the bypass stays scoped to loopback). Both pass.
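The gating condition quoted above (AllowAnonymousLocalhost && IsLoopbackRequest()) reduces to a two-input truth table, sketched below. The AnonymousBypass helper and its signature are hypothetical; the real handler reads the remote address from the HTTP context. IPAddress.IsLoopback is the standard System.Net check.

```csharp
using System;
using System.Net;

var remote = IPAddress.Parse("203.0.113.10"); // documentation-range address standing in for a remote client

Console.WriteLine(AnonymousBypass.IsAllowed(allowAnonymousLocalhost: true, IPAddress.Loopback));  // flag on, loopback
Console.WriteLine(AnonymousBypass.IsAllowed(allowAnonymousLocalhost: false, IPAddress.Loopback)); // flag off, loopback
Console.WriteLine(AnonymousBypass.IsAllowed(allowAnonymousLocalhost: true, remote));              // flag on, remote

// Hypothetical helper mirroring the handler's gating condition.
public static class AnonymousBypass
{
    public static bool IsAllowed(bool allowAnonymousLocalhost, IPAddress? remoteAddress) =>
        allowAnonymousLocalhost
        && remoteAddress is not null
        && IPAddress.IsLoopback(remoteAddress);
}
```

Only the first combination grants the bypass; the two added tests pin the second and third rows, which are the misconfiguration cases the finding calls out.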
Tests-011
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:233-301 |
| Status | Resolved |
Description: GatewayEndToEndFakeWorkerSmokeTests correctly stores and awaits launcher.WorkerTask, but SessionWorkerClientFactoryFakeWorkerTests uses _ = RunWorkerAsync(...) with no stored task (lines 152, 184, 220). An unhandled exception in the scripted worker becomes an unobserved task exception that can surface as a process-level failure in an unrelated later test rather than failing the owning test.
Recommendation: Store the worker task and either await it during disposal or attach a continuation that fails the test on fault, mirroring GatewayEndToEndFakeWorkerSmokeTests.
Resolution: Resolved 2026-05-18: confirmed all three scripted launchers in SessionWorkerClientFactoryFakeWorkerTests discarded the worker task. Added an IWorkerTaskLauncher interface (each launcher now stores its scripted task in a WorkerTask property and exposes ObserveWorkerTaskAsync); the test class now implements IAsyncDisposable, tracks every launcher it creates via a Track helper, and in DisposeAsync awaits each WorkerTask (within TestTimeout) so a scripted-worker fault fails the owning test instead of leaking as an unobserved TaskScheduler.UnobservedTaskException. OperationCanceledException and IOException — the expected outcomes of the worker client tearing the pipe down — are swallowed; anything else rethrows. NeverReadyWorkerProcessLauncher (which parks on an infinite Task.Delay) was given its own CancellationTokenSource so disposal can cancel and observe the parked task. Suite green.
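The disposal-time observation described above can be sketched as a small tracker. The WorkerTaskObserver class is hypothetical (the resolution describes a Track helper and a DisposeAsync on the test class itself), and for brevity the sketch stops at the first genuine fault rather than aggregating several.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

var observer = new WorkerTaskObserver(TimeSpan.FromSeconds(5));
observer.Track(Task.FromException(new IOException("pipe closed")));            // expected teardown outcome
observer.Track(Task.FromException(new InvalidOperationException("script bug"))); // genuine scripted-worker fault

try
{
    await observer.DisposeAsync();
}
catch (InvalidOperationException ex)
{
    Console.WriteLine($"surfaced in owning test: {ex.Message}");
}

// Sketch: await every tracked worker task on disposal so faults fail the owning test.
public sealed class WorkerTaskObserver : IAsyncDisposable
{
    private readonly List<Task> _workerTasks = new();
    private readonly TimeSpan _timeout;

    public WorkerTaskObserver(TimeSpan timeout) => _timeout = timeout;

    public void Track(Task workerTask) => _workerTasks.Add(workerTask);

    public async ValueTask DisposeAsync()
    {
        foreach (Task workerTask in _workerTasks)
        {
            try
            {
                // WaitAsync throws TimeoutException if the worker never finishes.
                await workerTask.WaitAsync(_timeout);
            }
            catch (OperationCanceledException)
            {
                // Expected when the client cancels the scripted worker.
            }
            catch (IOException)
            {
                // Expected when the client tears the pipe down.
            }
        }
    }
}
```

Anything other than cancellation or an I/O teardown error propagates out of DisposeAsync, which is what ties the failure to the test that started the worker.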
Tests-012
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | src/MxGateway.Tests/Gateway/Workers/Fakes/FakeWorkerHarness.cs:62, src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:472 |
| Status | Resolved |
Description: Pipe names are uniquified per test with a GUID (good), but xUnit runs test classes in parallel by default and there is no xunit.runner.json or collection configuration. Tests that build a full WebApplication bind ephemeral ports (--urls=http://127.0.0.1:0, fine) but spin up DI containers and hosted services concurrently. Currently safe, but a future test binding a fixed port would silently collide.
Recommendation: Add an xunit.runner.json or a collection grouping the WebApplication-building tests, and keep the :0 ephemeral-port convention explicit so future tests do not introduce a fixed-port collision.
Resolution: Resolved 2026-05-18: added src/MxGateway.Tests/xunit.runner.json making the parallelism policy explicit (parallelizeTestCollections: true, maxParallelThreads: -1, parallelizeAssembly: false, longRunningTestSeconds: 30) and wired it into MxGateway.Tests.csproj as <None Update="xunit.runner.json" CopyToOutputDirectory="PreserveNewest" /> so the runner picks it up (confirmed present in bin/Debug/net10.0/). Added a comment at the only WebApplication-building call site (GatewayApplicationTests.cs, --urls=http://127.0.0.1:0) documenting that the ephemeral-port (:0) convention is mandatory because test collections run in parallel. No fixed-port binding exists today; this is a preventative guardrail as the finding recommends.
Tests-013
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | src/MxGateway.Server/Sessions/GatewaySession.cs:449-679, src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs |
| Status | Resolved |
Description: GatewaySession exposes eleven bulk methods (AddItemBulkAsync, AdviseItemBulkAsync, RemoveItemBulkAsync, UnAdviseItemBulkAsync, SubscribeBulkAsync, UnsubscribeBulkAsync, WriteBulkAsync, Write2BulkAsync, WriteSecuredBulkAsync, WriteSecured2BulkAsync, ReadBulkAsync) but only three (SubscribeBulkAsync, WriteBulkAsync, ReadBulkAsync) are exercised in SessionManagerTests. A grep across src/MxGateway.Tests for the other eight method names returns zero matches. The recent commit eaa7093 ("register the five new bulk subcommands in IsKnownGatewayCommand") explicitly added bulk surface to the gateway, and 1cd51bb added stress benchmarks for it, but the gateway-side tests do not pin the command-kind, payload-shape, or WriteSecured*Bulk credential-redaction behaviour for any of the new bulk variants. A future regression in WriteSecuredBulkAsync body construction would not be caught by the gateway unit suite.
Recommendation: Mirror the existing SubscribeBulkAsync / WriteBulkAsync / ReadBulkAsync test pattern for the eight missing methods: each test should OpenSessionAsync, invoke the bulk API, assert the worker received exactly one WorkerCommand of the matching MxCommandKind, and (for the secured variants) confirm the credential payload survives the round-trip without being log-redacted from the over-the-wire command shape.
Resolution: Resolved 2026-05-20: added src/MxGateway.Tests/Gateway/Sessions/SessionManagerBulkTests.cs with per-method coverage for all eleven bulk entry points. Each method now has a round-trip test that pins (a) the exact MxCommandKind sent to the worker, (b) the payload shape (server handle, item handles / tag addresses / entries, timeout for ReadBulk), and (c) per-entry failure surfacing where the reply contains a mix of WasSuccessful = true/false results with an ErrorMessage. Each method also has a *_PropagatesCancellation test that pre-cancels the token and asserts OperationCanceledException flows out. The secured variants additionally pin that CurrentUserId / VerifierUserId survive the over-the-wire command shape unchanged (the gateway's redaction rules apply only to logs, not to the command body the worker receives). New tests use a local FakeBulkWorkerClient keyed by MxCommand.Kind-specific replies; no production-code change. All 54 SessionManager/GalaxyHierarchyCache tests pass with dotnet test --filter "FullyQualifiedName~SessionManager|FullyQualifiedName~GalaxyHierarchyCache".
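The pre-cancelled-token pattern used by the *_PropagatesCancellation tests can be sketched as below. The FakeBulkWorker class and its CommandsSent counter are hypothetical, not the FakeBulkWorkerClient from the suite. The sketch also makes one property of the pattern visible: a fake that checks the token on its first line rejects the call before any command is recorded, so the test verifies propagation of the exception rather than cancellation of work already in flight.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

using var cancellation = new CancellationTokenSource();
cancellation.Cancel(); // pre-cancel before the call is made

var worker = new FakeBulkWorker();

try
{
    await worker.InvokeBulkAsync(cancellation.Token);
}
catch (OperationCanceledException)
{
    Console.WriteLine($"cancelled; commands sent: {worker.CommandsSent}");
}

// Hypothetical fake that honours cancellation before doing any work.
public sealed class FakeBulkWorker
{
    public int CommandsSent { get; private set; }

    public async Task InvokeBulkAsync(CancellationToken cancellationToken)
    {
        // With a pre-cancelled token this line throws, so nothing below runs.
        cancellationToken.ThrowIfCancellationRequested();

        CommandsSent++;
        await Task.Delay(TimeSpan.FromMilliseconds(10), cancellationToken);
    }
}
```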
Tests-014
| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | src/MxGateway.Tests/Gateway/GatewayApplicationTests.cs:18,33,44,62,81,105, src/MxGateway.Tests/Gateway/Dashboard/DashboardCookieOptionsTests.cs:17 |
| Status | Resolved |
Description: Seven [Fact] methods build a real WebApplication via GatewayApplication.Build([]) and never dispose it. WebApplication is IAsyncDisposable; constructing one stands up a full DI container, an OpenTelemetry meter (GatewayMetrics), Kestrel server objects, hosted services, and logging providers. Because the suite runs test collections in parallel (per the new xunit.runner.json from Tests-012), every undisposed instance keeps its meter/loggers/hosted services alive until the test process exits, doubling up live Meter instances each time and silently extending the memory/handle footprint of an xunit run. Only the two tests that actually call app.StartAsync() (GatewayApplicationTests.StartAsync_InvalidGatewayConfiguration_FailsStartup and SqliteAuthStoreTests.StartAsync_NewerSchemaVersion_BlocksStartup) currently use await using.
Recommendation: Promote each WebApplication app = GatewayApplication.Build(...) to await using WebApplication app = ... and make the containing test method async Task. The endpoint-listing assertions do not need await, but the await using will ensure the DI container, meter, and hosted services are torn down per-test.
Resolution: 2026-05-20 — Promoted all seven WebApplication-building tests (six in GatewayApplicationTests plus the one in DashboardCookieOptionsTests) to async Task with await using WebApplication app = GatewayApplication.Build(...), so the DI container, GatewayMetrics meter, hosted services, and Kestrel objects are torn down per-test rather than leaking until process exit. StartAsync_InvalidGatewayConfiguration_FailsStartup, which already used await using, was unchanged. Full suite green.
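The difference between the old and new test shapes can be sketched without ASP.NET Core. `FakeApp` and `DisposalDemo` below are hypothetical stand-ins (not types from this repository); `FakeApp` only counts live instances so the leak is observable.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for WebApplication: it only tracks how many instances are alive.
public sealed class FakeApp : IAsyncDisposable
{
    public static int LiveInstances;

    public FakeApp() => LiveInstances++;

    public ValueTask DisposeAsync()
    {
        LiveInstances--;
        return ValueTask.CompletedTask;
    }
}

public static class DisposalDemo
{
    // Old shape: the instance is built and never disposed, so it stays counted as live.
    public static void LeakyTest()
    {
        FakeApp app = new();
        _ = app;
    }

    // New shape: async Task plus await using disposes the instance when the method ends.
    public static async Task DisposingTest()
    {
        await using FakeApp app = new();
        await Task.Yield();
    }
}
```

After `LeakyTest` the counter stays at one; after `DisposingTest` it returns to its previous value, which is the per-test teardown the resolution describes.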
Tests-015
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:374-379,87 |
| Status | Resolved |
Description: The nested FakeWorkerProcess.WaitForExitAsync implementation unconditionally sets HasExited = true and ExitCode ??= 0 when called, regardless of whether the scripted worker actually completed the shutdown handshake. The smoke-test assertion Assert.True(launcher.Process.HasExited) therefore cannot distinguish "the scripted worker received WorkerShutdown, sent WorkerShutdownAck, and called MarkExited(0)" from "the gateway code path simply awaited WaitForExitAsync somewhere during teardown". The scripted worker happens to call MarkExited(0) after receiving the shutdown frame, but a regression that bypassed the shutdown-ack path entirely would still pass this assertion. The companion launcher in SessionWorkerClientFactoryFakeWorkerTests.FakeWorkerProcess.WaitForExitAsync (lines 351-356) has the same shape — fine there because no exit assertion is made — but the smoke test relies on this signal.
Recommendation: Make WaitForExitAsync await an internal TaskCompletionSource that is only completed by Kill() or MarkExited() (the same pattern WorkerClientTests.FakeWorkerProcess already uses for _exited), so HasExited reflects actual exit and the smoke test's assertion is meaningful.
Resolution: 2026-05-20 — Rewrote the smoke-test FakeWorkerProcess to back WaitForExitAsync with a TaskCompletionSource _exited that is only completed inside MarkExited (called by the scripted worker after sending WorkerShutdownAck) or Kill (which calls MarkExited(-1)), removing the "set HasExited = true and return immediately" cheat. The smoke test now also asserts Assert.Equal(0, launcher.Process.ExitCode) — MarkExited(0) is reachable only via the shutdown-ack branch, so a regression that bypassed the ack path would produce a non-zero (or null) exit code and fail the assertion deterministically. WorkerClient.ShutdownAsync calls WaitForProcessExitAsync, which now genuinely awaits the scripted worker's ack.
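A minimal sketch of the TaskCompletionSource-backed shape described above. This is a reduction under stated assumptions: the real FakeWorkerProcess implements the gateway's process abstraction, and its exact member signatures (for example the return type of WaitForExitAsync) are not shown in this finding, so the ones below are illustrative.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reduction of the test fake; signatures are assumptions.
public sealed class FakeWorkerProcess
{
    private readonly TaskCompletionSource _exited =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    public bool HasExited { get; private set; }

    public int? ExitCode { get; private set; }

    // Completes only after MarkExited or Kill has run; it never fabricates an exit.
    public Task WaitForExitAsync(CancellationToken cancellationToken = default) =>
        _exited.Task.WaitAsync(cancellationToken);

    // Called by the scripted worker after it has sent its shutdown ack.
    public void MarkExited(int exitCode)
    {
        if (_exited.Task.IsCompleted)
        {
            return; // first exit wins; this sketch is not hardened for concurrent callers
        }

        HasExited = true;
        ExitCode = exitCode;
        _exited.TrySetResult();
    }

    public void Kill() => MarkExited(-1);
}
```

Because only `MarkExited(0)` can produce a zero exit code, an assertion on `ExitCode == 0` distinguishes the shutdown-ack path from a kill or from a wait that merely returned.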
Tests-016
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | src/MxGateway.Tests/Galaxy/GalaxyHierarchyCacheTests.cs:29-41,115-124 |
| Status | Resolved |
Description: RefreshAsync_WhenSqlIsUnreachable_MarksUnavailableAndDoesNotPublish is in the unit-test project but exercises a real GalaxyHierarchyCache/GalaxyRepository against a hard-coded TCP socket 127.0.0.1:65500 with a one-second connect timeout. Per docs/GatewayTesting.md, live Galaxy coverage belongs in MxGateway.IntegrationTests and is gated by MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1; this test is neither gated nor uses a stub repository. On most boxes the connect fails closed (the test passes), but the outcome depends on OS-level "connection refused" vs "no route to host" behaviour and is sensitive to environments where 127.0.0.1:65500 happens to be bound — a real flakiness source. It also breaks the gateway-without-MXAccess invariant in spirit (the gateway code path under test does I/O the unit project should not need).
Recommendation: Either (a) replace the real repository with an in-test fake that throws a SqlException/TimeoutException from GetHierarchyAsync, exercising GalaxyHierarchyCache.RefreshAsync's exception path directly; or (b) move the test to MxGateway.IntegrationTests and gate it behind a "no-live-DB-required" variant of the live-Galaxy attribute. (a) is preferred because the production path being tested is the cache's reaction to a repository exception, not socket behaviour.
Resolution: Resolved 2026-05-20: applied option (a). Introduced src/MxGateway.Server/Galaxy/IGalaxyRepository.cs with the four methods the cache consumes (TestConnectionAsync, GetLastDeployTimeAsync, GetHierarchyAsync, GetAttributesAsync); made GalaxyRepository implement it; changed GalaxyHierarchyCache's constructor to depend on IGalaxyRepository rather than the concrete type; and registered the interface against the existing concrete singleton in GalaxyRepositoryServiceCollectionExtensions.AddGalaxyRepository. Rewrote the test as RefreshAsync_WhenRepositoryThrows_MarksUnavailableAndDoesNotPublish using a local ThrowingGalaxyRepository : IGalaxyRepository that throws an InvalidOperationException from GetLastDeployTimeAsync (the first call the cache makes against the repository). The test now exercises the cache's exception branch directly — no TCP I/O — and additionally asserts that GetHierarchyAsync/GetAttributesAsync are NOT invoked once the deploy-time probe has failed. Current_BeforeAnyRefresh_ReturnsEmpty was migrated to the same fake. The unreachable CreateCache helper that built a real GalaxyRepository against 127.0.0.1:65500 was removed. The Galaxy SQL surface itself stays covered by MxGateway.IntegrationTests.Galaxy.GalaxyRepositoryLiveTests (gated by MXGATEWAY_RUN_LIVE_GALAXY_REPOSITORY_TESTS=1).
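The option (a) approach can be sketched as follows. The interface below is trimmed to two methods with assumed signatures, and `MiniHierarchyCache` is a hypothetical stand-in for GalaxyHierarchyCache that only models the "probe deploy time first, then load" ordering described in the resolution.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Assumed, trimmed shape: the real IGalaxyRepository has four methods.
public interface IGalaxyRepository
{
    Task<DateTimeOffset> GetLastDeployTimeAsync(CancellationToken cancellationToken);
    Task<string[]> GetHierarchyAsync(CancellationToken cancellationToken);
}

// Fake that fails on the first call the cache makes and counts any later call.
public sealed class ThrowingGalaxyRepository : IGalaxyRepository
{
    public int HierarchyCalls { get; private set; }

    public Task<DateTimeOffset> GetLastDeployTimeAsync(CancellationToken cancellationToken) =>
        Task.FromException<DateTimeOffset>(new InvalidOperationException("simulated outage"));

    public Task<string[]> GetHierarchyAsync(CancellationToken cancellationToken)
    {
        HierarchyCalls++;
        return Task.FromResult(Array.Empty<string>());
    }
}

// Hypothetical stand-in for the cache under test.
public sealed class MiniHierarchyCache
{
    private readonly IGalaxyRepository _repository;

    public MiniHierarchyCache(IGalaxyRepository repository) => _repository = repository;

    public bool IsAvailable { get; private set; } = true;

    public string[] Current { get; private set; } = Array.Empty<string>();

    public async Task RefreshAsync(CancellationToken cancellationToken = default)
    {
        try
        {
            await _repository.GetLastDeployTimeAsync(cancellationToken);
            Current = await _repository.GetHierarchyAsync(cancellationToken);
            IsAvailable = true;
        }
        catch (Exception)
        {
            // Repository failure: mark unavailable and publish nothing.
            IsAvailable = false;
        }
    }
}
```

The test then needs no socket: it refreshes once and asserts the cache is unavailable, `Current` is still empty, and `HierarchyCalls` is zero.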
Tests-017
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:346-364 |
| Status | Resolved |
Description: HeartbeatMonitor_WhenHeartbeatExpires_FaultsClient configures HeartbeatGrace = 80 ms and HeartbeatCheckInterval = 20 ms, then asserts the client faults within the 5-second TestTimeout. The test compares against the real wall clock — the heartbeat monitor reads TimeProvider.System for the grace check. After Tests-006 migrated the other heartbeat tests to an injected ManualTimeProvider for determinism, this one is now the only WorkerClientTests heartbeat case that still rides the wall clock. The 5-second outer bound makes a false failure unlikely, but the test cannot fail fast when the heartbeat-monitor logic regresses — it just waits the full 5 seconds.
Recommendation: Inject the same ManualTimeProvider used by ReadLoop_WhenHeartbeatArrives_UpdatesLastHeartbeatAndWorkerProcess, then clock.Advance(TimeSpan.FromSeconds(2)) past the grace and assert the fault deterministically. The HeartbeatCheckInterval (20 ms) timer fire can stay on the real clock; what needs to be deterministic is the grace comparison.
Resolution: 2026-05-20 — HeartbeatMonitor_WhenHeartbeatExpires_FaultsClient now constructs a ManualTimeProvider seeded at "2026-05-20T12:00:00Z", passes it to CreateClient via the existing timeProvider parameter, and calls clock.Advance(TimeSpan.FromSeconds(2)) after the handshake. WorkerClient.MarkReady records _lastHeartbeatAt from the manual clock, so the next 20 ms HeartbeatCheckInterval tick observes now - lastHeartbeat = 2s > 80ms grace and faults deterministically. The check-interval timer stays on the real clock as the finding recommended; only the grace comparison is deterministic.
Tests-018
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | src/MxGateway.Tests/Galaxy/GalaxyHierarchyCacheTests.cs:32, src/MxGateway.Tests/Gateway/Dashboard/DashboardSnapshotServiceTests.cs:45,51,57,105,134,163,167,202-209,284,317,523, src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:40 |
| Status | Resolved |
Description: Several tests parse ISO-8601 literals with DateTimeOffset.Parse("2026-04-26T10:00:00Z") without an explicit CultureInfo.InvariantCulture. Directory.Build.props enables TreatWarningsAsErrors, but CA1305 (specify IFormatProvider) is not currently raised because the tests don't trigger it; nevertheless, DateTimeOffset.Parse without a culture takes CurrentCulture, and on a locale whose DateTimeFormatInfo rejects the Z suffix or uses non-Gregorian calendar conventions, these parses can throw at test time. WorkerClientTests.cs:327 and FakeWorkerHarnessTests.cs:121 already added System.Globalization.CultureInfo.InvariantCulture in the Tests-006 fix; the other ~15 call sites did not get the same treatment.
Recommendation: Add CultureInfo.InvariantCulture to every DateTimeOffset.Parse(...) call in MxGateway.Tests, or replace with DateTimeOffset.ParseExact against the literal "O" round-trip format. A single-line using System.Globalization; per file keeps the call sites concise.
Resolution: 2026-05-20 — Added CultureInfo.InvariantCulture to every DateTimeOffset.Parse site in MxGateway.Tests that lacked it: 16 call sites in DashboardSnapshotServiceTests.cs (a new using System.Globalization; was added so the call sites stay concise) and one in SessionManagerTests.cs (using the fully-qualified System.Globalization.CultureInfo.InvariantCulture to match the in-file style of the existing ManualTimeProvider parse sites). GalaxyHierarchyCacheTests.cs:36 was already correct from the Tests-016 rewrite. A final grep confirms every DateTimeOffset.Parse/DateTime.Parse call in src/MxGateway.Tests now passes CultureInfo.InvariantCulture.
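For reference, the culture-pinned call shape looks like this. `TestTimestamps` is a hypothetical helper introduced only for illustration; the resolution above edits the call sites directly rather than adding a helper.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: pins the parse to the invariant culture so the
// result does not depend on the machine's CurrentCulture.
public static class TestTimestamps
{
    public static DateTimeOffset Parse(string iso8601) =>
        DateTimeOffset.Parse(iso8601, CultureInfo.InvariantCulture);
}
```

A call such as `TestTimestamps.Parse("2026-04-26T10:00:00Z")` then denotes the same instant regardless of the locale the test host runs under.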
Tests-019
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | docs/GatewayTesting.md, code-reviews/Tests/findings.md (Tests-002 re-triage) |
| Status | Resolved |
Description: The Tests-002 re-triage (2026-05-18) confirmed there is no SQL-injection surface in GalaxyRepository because filters are applied in memory by GalaxyHierarchyProjector/GalaxyGlobMatcher against the cached snapshot, and added 10 adversarial-input tests in src/MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs. That explanation lives only in the findings file; docs/GatewayTesting.md does not mention GalaxyFilterInputSafetyTests, the in-memory filter model, or the adversarial-input matrix. A future reader of the test docs will not know which tests pin the literal-filter behaviour or why the Galaxy SQL layer is not unit-tested for parameterisation. Per CLAUDE.md ("Update docs in the same change as the source. When public APIs, contracts, configuration, build steps, security behavior, event shapes, value conversion, status mapping, or lifecycle rules change, the affected docs must change in the same commit"), the Galaxy security-behaviour decision warrants a paragraph in GatewayTesting.md.
Recommendation: Add a short subsection to docs/GatewayTesting.md (probably under "Focused Commands" or a new "Galaxy Filter Safety" section) that names GalaxyFilterInputSafetyTests, explains that Galaxy filtering happens in memory against the cached hierarchy (so the SQL surface is constant), and lists the adversarial-input invariants the suite pins (%, _, ', ;, [abc] are literals; the glob regex has a 100 ms timeout against pathological input).
Resolution: 2026-05-20 — Added a "Galaxy Filter Safety" section to docs/GatewayTesting.md (immediately after "Live Galaxy Repository", before "Live LDAP") that names GalaxyFilterInputSafetyTests, re-frames the Tests-002 finding (the Galaxy SQL surface is constant — HierarchySql, AttributesSql, SELECT 1, SELECT time_of_last_deploy FROM galaxy), explains that all filters are applied in memory by GalaxyHierarchyProjector / GalaxyGlobMatcher, lists the adversarial-input matrix (', ' OR '1'='1, '; DROP TABLE gobject;--, %, _, 100%_off, [abc], Pump'001), and enumerates the invariants the suite pins (SQL metacharacters are opaque literals, only */? are glob wildcards, the matcher has a 100 ms regex timeout against pathological input, the projector returns zero matches / NotFound rather than the whole hierarchy, and the DiscoverHierarchy RPC end-to-end returns zero matches for adversarial globs).
Tests-020
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceConstraintTests.cs:275-347, src/MxGateway.Server/Grpc/MxAccessGatewayService.cs:803-829 |
| Status | Resolved |
Description: Server-021 added MxAccessGatewayServiceConstraintTests to exercise BulkConstraintPlan.MergeDeniedInto / CreateDeniedReply against a non-allow-all enforcer. The WriteBulkConstraintPlan has a four-arm GetPayload/SetPayload switch covering WriteBulk, Write2Bulk, WriteSecuredBulk, and WriteSecured2Bulk, but the new fixtures only cover two of those four arms — Invoke_WriteBulk_WithDeniedHandle_DropsEntryFromWorkerCallAndMergesDenialIntoReply (the WriteBulk arm) and Invoke_WriteSecuredBulk_WhenAllHandlesDenied_ShortCircuitsWithDeniedOnlyReply (the WriteSecuredBulk arm). The other two arms (Write2Bulk and WriteSecured2Bulk) have no fixture. Neither do the parallel RemoveItemBulk / UnAdviseItemBulk / UnsubscribeBulk cases: the SubscribeBulkConstraintPlan SetPayload switch (service code lines 742-753) covers only three kinds (AddItemBulk, AdviseItemBulk, SubscribeBulk), all of which the constraint tests exercise, but the unsubscribe-shaped bulk routes are also dispatched into denial paths through FilterHandleBulkAsync and have no constraint-test coverage. A regression that wires a new bulk kind to the wrong reply slot, or drops a case arm during refactor, would compile clean and pass every existing test. The comment in Invoke_WriteSecuredBulk_WhenAllHandlesDenied_… ("The merge logic is shared, so a full denial here is enough to prove the secured-bulk routing") concedes the gap explicitly — but the _routing_ (the per-kind SetPayload switch) is exactly what is not shared and not exercised for Write2Bulk / WriteSecured2Bulk.
Recommendation: Add two short fixtures: Invoke_Write2Bulk_WhenAllHandlesDenied_ShortCircuitsWithDeniedOnlyReply and Invoke_WriteSecured2Bulk_WhenAllHandlesDenied_ShortCircuitsWithDeniedOnlyReply, mirroring the existing WriteSecuredBulk denial test but asserting reply.Write2Bulk / reply.WriteSecured2Bulk is populated (proving the SetPayload arm fires). The all-denied path is enough; the merge-with-allowed path is genuinely shared. Optionally also add denied-tag tests for RemoveItemBulk / UnsubscribeBulk to cover the handle-input variants of the SubscribeBulkConstraintPlan switch.
Resolution: 2026-05-20 — Added Invoke_Write2Bulk_WhenAllHandlesDenied_ShortCircuitsWithDeniedOnlyReply and Invoke_WriteSecured2Bulk_WhenAllHandlesDenied_ShortCircuitsWithDeniedOnlyReply to MxAccessGatewayServiceConstraintTests, plus matching CreateWrite2BulkRequest/CreateWriteSecured2BulkRequest helpers. Each new fixture asserts the worker is never called (InvokeCount == 0), reply.Kind matches the requested kind, the matching reply.{Write2Bulk,WriteSecured2Bulk}.Results slot is populated with denied entries, and the three sibling reply slots remain empty — pinning that the SetPayload switch fired for the correct arm and not for one of the other three Write*Bulk kinds. This closes the Write2Bulk/WriteSecured2Bulk arms of the four-arm GetPayload/SetPayload switch in WriteBulkConstraintPlan (MxAccessGatewayService.cs:803-829).
Tests-021
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | src/MxGateway.Tests/Galaxy/GalaxyHierarchyCacheTests.cs:159-171, src/MxGateway.Tests/Gateway/Workers/FakeWorkerHarnessTests.cs:226-236, src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:620-630, src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:766-… |
| Status | Resolved |
Description: Tests-006 / Tests-017 / Tests-018 introduced an injectable ManualTimeProvider to make heartbeat-timestamp / lease / cache tests deterministic. The class is now duplicated as a private sealed class ManualTimeProvider(DateTimeOffset start...) : TimeProvider in four test files (GalaxyHierarchyCacheTests.cs, FakeWorkerHarnessTests.cs, WorkerClientTests.cs, SessionManagerTests.cs). Each copy has the same three-line implementation (_now field, GetUtcNow() override, Advance(TimeSpan) method). One copy (GalaxyHierarchyCacheTests.cs:159) accepts a default DateTimeOffset and seeds with UtcNow; the other three require an explicit start — a small but real semantic divergence. Tests-007 consolidated the same kind of duplication for TestServerCallContext / RecordingServerStreamWriter / AllowAllConstraintEnforcer into src/MxGateway.Tests/TestSupport/; this is the same drift pattern.
Recommendation: Add src/MxGateway.Tests/TestSupport/ManualTimeProvider.cs with a single implementation (default-arg DateTimeOffset start = default resolving to a deterministic seed like DateTimeOffset.UnixEpoch or UtcNow, plus the Advance helper) and delete the four nested copies in favour of using MxGateway.Tests.TestSupport;. Same pattern as the Tests-007 resolution.
Resolution: 2026-05-20 — Added src/MxGateway.Tests/TestSupport/ManualTimeProvider.cs with the unified signature ManualTimeProvider(DateTimeOffset start = default) (a default start seeds from DateTimeOffset.UtcNow for the GalaxyHierarchyCacheTests call site that previously relied on that behaviour) plus the Advance(TimeSpan) helper. Deleted the four duplicated private sealed class ManualTimeProvider definitions from GalaxyHierarchyCacheTests.cs, FakeWorkerHarnessTests.cs, WorkerClientTests.cs, and SessionManagerTests.cs; each file now imports MxGateway.Tests.TestSupport. The SessionManagerTests copy previously lacked Advance — folding it onto the shared type does not regress because that file never called Advance. Same consolidation pattern as Tests-007.
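A sketch of what such a shared type might look like, reconstructed from the description above (a `_now` field, a `GetUtcNow()` override, an `Advance(TimeSpan)` helper, and a default start that seeds from `UtcNow`). The actual file contents are not reproduced in this finding, so treat the details as assumptions.

```csharp
using System;

// Sketch of a manually advanced clock built on the .NET 8 TimeProvider base class.
// Not synchronized: intended for tests that advance the clock from a single thread.
public sealed class ManualTimeProvider : TimeProvider
{
    private DateTimeOffset _now;

    // A default start falls back to the wall clock, mirroring the behaviour
    // the GalaxyHierarchyCacheTests call site is described as relying on.
    public ManualTimeProvider(DateTimeOffset start = default) =>
        _now = start == default ? DateTimeOffset.UtcNow : start;

    public override DateTimeOffset GetUtcNow() => _now;

    public void Advance(TimeSpan delta) => _now += delta;
}
```

In the Tests-017 style of use, a test records a timestamp from the clock, calls `Advance(TimeSpan.FromSeconds(2))`, and the elapsed time seen by the code under test is exactly two seconds, well past an 80 ms grace, with no dependence on the wall clock.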
Tests-022
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | src/MxGateway.Tests/Gateway/Sessions/SessionManagerBulkTests.cs:52-61,90-99,126-135,163-172,202-211,238-247,282-294,339-360,413-434,484-506,553-567,663-688 |
| Status | Resolved |
Description: Tests-013 added eleven *_PropagatesCancellation tests that pre-cancel the token (cts.CancelAsync() before calling session.*BulkAsync(..., cts.Token)) and assert OperationCanceledException. The fakes' FakeBulkWorkerClient.InvokeAsync calls cancellationToken.ThrowIfCancellationRequested() as the first statement — so the exception is thrown synchronously inside the fake before any of GatewaySession.InvokeBulkInternalAsync → InvokeAsync → bulk-result projection runs. This verifies that the token reaches the worker client (a regression that swapped in CancellationToken.None between layers would fail the test), but it does not exercise mid-flight cancellation: a token that becomes cancelled while the worker is await-suspended waiting on a reply. Mid-flight cancellation is the more interesting path (it's what a real client closing its stream looks like) and is not pinned for any of the eleven bulk methods.
The WorkerClient cancellation tests in WorkerClientTests do exercise the mid-flight path (there the cancellation surfaces through a real pipe disconnection, in the manner of Task.FromCanceled); only the gateway-side bulk tests are shallow.
Recommendation: For at least one representative bulk method (e.g. WriteSecuredBulkAsync — the highest-value gateway path), replace the pre-cancellation pattern with a fake whose InvokeAsync returns a TaskCompletionSource-backed task that never completes until cancelled, then cts.CancelAsync() after session.WriteSecuredBulkAsync(...) has been awaited far enough to register a continuation. Assert the resulting OperationCanceledException's CancellationToken matches cts.Token. The existing pre-cancel pattern is a reasonable cheap-coverage default for the other ten methods.
Resolution: 2026-05-20 — Added WriteSecuredBulkAsync_WhenCancelledMidFlight_ThrowsOperationCanceledForRequestToken to SessionManagerBulkTests backed by a new MidFlightBulkWorkerClient fake whose InvokeAsync registers a cancellation continuation on the caller's token, signals InvokeStarted, and parks on a TaskCompletionSource<WorkerCommandReply> that completes only when the token fires (or shutdown / kill / dispose tears it down). The test awaits InvokeStarted.Task, asserts the write task is still incomplete (proving the cancellation lands on an in-flight await rather than the synchronous fast-path), then calls cts.CancelAsync() and asserts the resulting OperationCanceledException.CancellationToken == cts.Token and InvokeCount == 1. The other ten *_PropagatesCancellation tests remain on the cheaper pre-cancel pattern per the finding's recommendation.
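The parking behaviour of such a fake can be sketched as below. `MidFlightWorkerClient` is a hypothetical reduction: it uses a string command and string reply in place of the real WorkerCommand / WorkerCommandReply types, and it omits the shutdown, kill, and dispose paths mentioned above.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reduction of a fake that stays in flight until the caller's token fires.
public sealed class MidFlightWorkerClient
{
    // Signals that InvokeAsync has been entered and is now parked.
    public TaskCompletionSource InvokeStarted { get; } =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    public int InvokeCount { get; private set; }

    public async Task<string> InvokeAsync(string command, CancellationToken cancellationToken)
    {
        InvokeCount++;

        var reply = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // The reply is never produced; only cancellation of the caller's token completes it.
        using CancellationTokenRegistration registration =
            cancellationToken.Register(() => reply.TrySetCanceled(cancellationToken));

        InvokeStarted.TrySetResult();
        return await reply.Task;
    }
}
```

A test awaits `InvokeStarted.Task`, checks that the call is still incomplete, cancels, and then asserts that the resulting OperationCanceledException carries the request token.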
Tests-023
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | src/MxGateway.Tests/Gateway/Sessions/SessionWorkerClientFactoryFakeWorkerTests.cs:334-374 |
| Status | Resolved |
Description: Tests-015 corrected the smoke-test FakeWorkerProcess.WaitForExitAsync (in GatewayEndToEndFakeWorkerSmokeTests.cs) so it now awaits a TaskCompletionSource only completed by Kill/MarkExited, removing the "set HasExited = true and return immediately" cheat. The companion FakeWorkerProcess in SessionWorkerClientFactoryFakeWorkerTests.cs:351-356 was not updated and still has the same cheat: WaitForExitAsync unconditionally sets HasExited = true; ExitCode = 0; return ValueTask.CompletedTask;. The original Tests-015 finding noted this companion was "fine there because no exit assertion is made"; the file at a020350 does not yet assert HasExited or ExitCode, so this is not a current bug — but it is a latent regression vector: a future test in the same file that asserts Assert.True(launcher.Process.HasExited) after triggering shutdown would pass spuriously, exactly the failure mode Tests-015 just closed in the smoke-test copy. Keeping two near-identical fakes with diverging semantics in the same project is brittle.
Recommendation: Apply the same TaskCompletionSource _exited pattern to SessionWorkerClientFactoryFakeWorkerTests.FakeWorkerProcess: WaitForExitAsync awaits _exited.Task, Kill calls MarkExited(-1), and add a MarkExited(int) helper that completes the TCS. The scripted launchers in this file already call Kill() through the disposal path Tests-011 added, so the change is mechanical and preserves all current behaviour.
Resolution: 2026-05-20 — Brought the companion FakeWorkerProcess in SessionWorkerClientFactoryFakeWorkerTests.cs into parity with the Tests-015 smoke-test fake. WaitForExitAsync now awaits a TaskCompletionSource _exited (wrapped in WaitAsync(cancellationToken) for cooperative cancel) instead of unconditionally setting HasExited = true; ExitCode = 0. Kill(bool) increments KillCount and delegates to a new MarkExited(int exitCode) helper that sets HasExited, ExitCode, and completes the TCS. KillCount is still observable and pre-existing tests that assert KillCount > 0 continue to pass. The latent regression vector — that a future Assert.True(launcher.Process.HasExited) in this file would pass spuriously — is closed.
Tests-024
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | src/MxGateway.Server/Grpc/MxAccessGatewayService.cs:713-730,784-801,859-876, src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceConstraintTests.cs |
| Status | Resolved |
Description: Every BulkConstraintPlan.MergeDeniedInto implementation builds its merged reply by walking OriginalCount indices and dequeueing from the worker's allowedResults queue at each non-denied slot. TryDequeue silently returns false when the queue is empty, so if the worker returns fewer allowed results than the gateway forwarded (because of a protocol mismatch, a worker bug truncating the bulk reply, or a future change to per-entry result reporting), the merged reply will be shorter than OriginalCount — the gap is not filled with a synthetic failure result. Conversely, if the worker returns more allowed results than requested, the extras are silently dropped. Neither case is covered by MxAccessGatewayServiceConstraintTests: every fixture's sessionManager.InvokeReply returns exactly the same count as the number of allowed entries forwarded. A regression in worker bulk-reply construction or a contract drift could produce a silently-truncated public reply (clients observing fewer results than entries submitted, with no error) and no gateway-side test would fail.
Recommendation: Add two fixtures to MxAccessGatewayServiceConstraintTests: Invoke_WriteBulk_WhenWorkerReturnsFewerResultsThanAllowed_ProducesPartialReplyOrSyntheticFailure (worker reply has N-1 results for N allowed entries; assert either the merged reply has OriginalCount entries with a synthetic-failure tail, or — if the gateway's current policy is "truncate" — pin that behaviour explicitly and document the expectation in a comment), and Invoke_WriteBulk_WhenWorkerReturnsExtraResults_IgnoresExtras (worker returns N+2 for N allowed; assert merged reply has exactly OriginalCount). Whichever current behaviour is correct should be made explicit by the test — the goal is preventing a silent change.
Resolution: 2026-05-20 — Pinned the current BulkConstraintPlan.MergeDeniedInto behaviour for worker reply-count divergence. Added two fixtures to MxAccessGatewayServiceConstraintTests: Invoke_WriteBulk_WhenWorkerReturnsFewerResultsThanAllowed_MergedReplyIsTruncated (gateway forwards 2 allowed handles, worker returns 1 result; merged reply has 2 entries total — the worker result at the first non-denied slot and the denied entry at its original index — and the trailing under-supplied slot is silently dropped via Queue.TryDequeue returning false) and Invoke_WriteBulk_WhenWorkerReturnsExtraResults_IgnoresExtras (gateway forwards 2 allowed handles, worker returns 4; merged reply has exactly OriginalCount == 3 entries; the two extras are bounded out by the for index < OriginalCount loop). The fixtures explicitly pin "truncate / discard extras" as the current contract — a future change to synthesise failure tails or surface extras must update the test, preventing a silent behavioural change.
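The truncate / discard-extras contract can be illustrated with a reduced model of the merge loop. `BulkMerge.Merge` is hypothetical: strings stand in for per-entry results, and which index is denied in the real fixtures is an assumption here.

```csharp
using System.Collections.Generic;

public static class BulkMerge
{
    // Hypothetical reduction of MergeDeniedInto: walk the original indices,
    // emit a denial at denied slots, otherwise dequeue the next worker result.
    public static List<string> Merge(
        int originalCount,
        ISet<int> deniedIndices,
        IEnumerable<string> workerResults)
    {
        var allowed = new Queue<string>(workerResults);
        var merged = new List<string>(originalCount);

        for (int index = 0; index < originalCount; index++)
        {
            if (deniedIndices.Contains(index))
            {
                merged.Add($"denied[{index}]");
            }
            else if (allowed.TryDequeue(out var result))
            {
                merged.Add(result);
            }
            // Empty queue: the slot is skipped, so the merged reply comes out short.
        }

        // Anything still left in the queue is an extra and is discarded.
        return merged;
    }
}
```

With three original entries, one of them denied, a worker reply of one result yields two merged entries, and a worker reply of four results yields exactly three, matching the counts the two fixtures pin.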
Tests-025
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:285-289, src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:417-421 |
| Status | Resolved |
Description: Commit d692232 widened the EventStreamService constructor with an IDashboardEventBroadcaster parameter. Two test files now carry an identical private sealed class NullDashboardEventBroadcaster : IDashboardEventBroadcaster with a singleton Instance field and a no-op Publish. This mirrors the duplication pattern Tests-007 and Tests-021 already consolidated for TestServerCallContext / RecordingServerStreamWriter / AllowAllConstraintEnforcer / ManualTimeProvider into src/MxGateway.Tests/TestSupport/; the same pattern should apply here.
Recommendation: Extract NullDashboardEventBroadcaster to src/ZB.MOM.WW.MxGateway.Tests/TestSupport/NullDashboardEventBroadcaster.cs (or a single DashboardTestDoubles.cs), delete both nested copies, and update the using directives in both files so they import the shared type from TestSupport.
Resolution: 2026-05-24 — Extracted the shared no-op broadcaster to src/ZB.MOM.WW.MxGateway.Tests/TestSupport/NullDashboardEventBroadcaster.cs (single public sealed class with the singleton Instance field and a private constructor — matches the Tests-007 / Tests-021 consolidation pattern). Deleted both nested duplicates (EventStreamServiceTests.cs:319-323 and GatewayEndToEndFakeWorkerSmokeTests.cs:417-421); the latter already imported ZB.MOM.WW.MxGateway.Tests.TestSupport so NullDashboardEventBroadcaster.Instance resolves against the shared type. EventStreamServiceTests.cs gained a using ZB.MOM.WW.MxGateway.Tests.TestSupport;. The integration-tests copy in src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs was left alone (different module, per scope). Server-041's ThrowingDashboardEventBroadcaster remains nested in EventStreamServiceTests (single-file usage, no consolidation needed). Build clean (0 warnings), full Tests suite green (486/486).
Tests-026
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs, src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs:123-126 |
| Status | Resolved |
Description: The new IDashboardEventBroadcaster is wired into EventStreamService at line 123 (commit d692232) and the broadcaster's Publish is the only path that mirrors per-session events into the dashboard EventsHub. The unit tests inject NullDashboardEventBroadcaster.Instance, so the broadcaster invocation is never observed — a regression that silently dropped the Publish call (e.g. an if accidentally added around it, or removing the broadcaster ctor parameter) would not be caught by any test in this module. The hub-registration tests (DashboardHubsRegistrationTests) verify the endpoints exist but not the producer-side hook.
Recommendation: Add a fixture to EventStreamServiceTests named e.g. StreamEventsAsync_PublishesEachEventToDashboardBroadcaster: inject a recording fake that captures (sessionId, mxEvent) calls, push two events through the fake session, and assert the broadcaster received both with the correct session id and matching WorkerSequence. This pins the broadcast hook and proves the dashboard event mirror is not a no-op.
Resolution: 2026-05-24 — Added src/ZB.MOM.WW.MxGateway.Tests/TestSupport/RecordingDashboardEventBroadcaster.cs — a thread-safe IDashboardEventBroadcaster test double backed by a ConcurrentQueue<DashboardEventCapture> that captures every (sessionId, mxEvent) invocation. Added StreamEventsAsync_PublishesEachEventToDashboardBroadcaster to EventStreamServiceTests: pushes two events (WorkerSequence 7 / OnDataChange and 8 / OnWriteComplete) through the fake session, drains the stream, and asserts the recording broadcaster captured exactly two publishes with the matching SessionId, WorkerSequence, and Family for each. TDD red/green confirmed: a deliberate "expected 3 captures" failed (Expected: 3, Actual: 2) before flipping to the correct count; the green run passes deterministically. The fixture would have caught a regression that drops or wraps dashboardEventBroadcaster.Publish at EventStreamService.cs:133. Build clean (0 warnings); full Tests suite green (486/486).
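A sketch of a recording double along the lines described above. The `MxEvent` record, the `DashboardEventCapture` members, and the `Publish(sessionId, mxEvent)` signature are assumptions made so the example is self-contained; the real interface and event types live in the server project and may differ.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Assumed shapes, for illustration only.
public sealed record MxEvent(long WorkerSequence, string Family);

public sealed record DashboardEventCapture(string SessionId, MxEvent Event);

public interface IDashboardEventBroadcaster
{
    void Publish(string sessionId, MxEvent mxEvent);
}

// Thread-safe recorder: every Publish call is captured in arrival order.
public sealed class RecordingDashboardEventBroadcaster : IDashboardEventBroadcaster
{
    private readonly ConcurrentQueue<DashboardEventCapture> _captures = new();

    // Snapshot of the captures so far, safe to read while publishers are running.
    public IReadOnlyList<DashboardEventCapture> Captures => _captures.ToArray();

    public void Publish(string sessionId, MxEvent mxEvent) =>
        _captures.Enqueue(new DashboardEventCapture(sessionId, mxEvent));
}
```

Swapping this in for the no-op broadcaster lets a fixture assert on the count, session id, and sequence of each publish, which is what turns a silently dropped `Publish` call into a test failure.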