Files
mxaccessgw/code-reviews/Tests/findings.md
T
Joseph Doherty b381bfcaf1 Resolve Tests-001 and Tests-002 code-review findings
Tests-001: FakeSessionManager.TryGetSession unconditionally synthesized a
session, so Invoke_WhenSessionMissing_ThrowsNotFound did not actually
verify the missing-session path. Added ResolveOnlySeededSessions/SeedSession
to the fake, rewrote the missing-session test, and added seeded-resolution
and alarm-RPC missing-session coverage.

Tests-002: re-triaged. GalaxyRepository issues only constant SQL; filters
are applied in-memory by GalaxyHierarchyProjector/GalaxyGlobMatcher. Kept
as a valid coverage gap and added GalaxyFilterInputSafetyTests exercising
filter/glob input safety directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:46:02 -04:00

15 KiB

Code Review — Tests

Field Value
Module src/MxGateway.Tests
Reviewer Claude Code
Review date 2026-05-18
Commit reviewed 6c64030
Status Reviewed
Open findings 10

Checklist coverage

# Category Result
1 Correctness & logic bugs Issue found: Tests-001 (FakeSessionManager.TryGetSession always returns true), Tests-011 (unobserved worker task).
2 mxaccessgw conventions FakeWorkerHarness used per docs; no real secrets; minor style drift in three alarm-test files (Tests-008).
3 Concurrency & thread safety Issues found: Tests-006 (Task.Delay-based timing), Tests-012 (no parallelism guard for WebApplication tests).
4 Error handling & resilience Strong — timeouts, faults, overflow, kill paths, protocol violations all exercised. No issues found.
5 Security Issues found: Tests-002 (no SQL-injection coverage of Galaxy RPCs), Tests-010 (anonymous-localhost negative cases untested).
6 Performance & resource management Issue found: Tests-003 (temp DB/worker directories never cleaned up).
7 Design-document adherence Tests match docs/GatewayTesting.md; no drift found. No issues found.
8 Code organization & conventions Issue found: Tests-007 (TestServerCallContext copy-pasted into 4+ files).
9 Testing coverage Issues found: Tests-001, Tests-004 (no end-to-end interceptor+service test), Tests-005 (no worker-crash-mid-command coverage), Tests-002.
10 Documentation & comments Issue found: Tests-009 (stale/mismatched XML <summary> comments).

Findings

Tests-001

Field Value
Severity High
Category Testing coverage
Location src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:483-489
Status Resolved

Description: FakeSessionManager.TryGetSession unconditionally returns true and synthesizes a session for any id. As a result, Invoke_WhenSessionMissing_ThrowsNotFound (line 52) only passes because InvokeException is pre-seeded — it does not verify that the gateway service maps a genuinely missing session to NotFound. No test exercises the real gateway path where TryGetSession returns false (for StreamEvents, CloseSession, alarm RPCs). A regression dropping the missing-session check would not be caught.

Recommendation: Make FakeSessionManager.TryGetSession return false for unknown ids (return only seeded sessions), then assert NotFound/InvalidArgument is produced by the service's own lookup logic rather than an injected exception.

Resolution: Resolved 2026-05-18: confirmed root cause — added ResolveOnlySeededSessions/SeedSession to FakeSessionManager so TryGetSession returns false for unseeded ids, rewrote Invoke_WhenSessionMissing_ThrowsNotFound to drop the injected InvokeException and exercise the service's own ResolveSession lookup (asserts InvokeCount == 0), and added Invoke_WhenSessionSeeded_ResolvesAndInvokes, AcknowledgeAlarm_WhenSessionMissing_ThrowsNotFound, and QueryActiveAlarms_WhenSessionMissing_ThrowsNotFound.

Tests-002

Field Value
Severity High
Category Security
Location src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:198-210
Status Resolved

Description: The Galaxy Repository RPCs browse a SQL Server database (ZB). Every test injects a StubGalaxyHierarchyCache, so actual SQL query construction, parameterization, and filter/glob translation are never exercised. No test demonstrates that TagNameGlob, RootTagName, AlarmFilterPrefix, etc. are passed as parameters rather than concatenated into SQL. SQL-injection resistance of the Galaxy layer has zero coverage.

Recommendation: Add tests for the GalaxyRepository query-building layer (against SQLite or an in-memory abstraction, or by asserting parameter objects), covering glob/prefix inputs containing ', %, _, and ;. At minimum add a unit test over the SQL LIKE-pattern escaping helper.

Re-triage note: The finding's premise is partly misframed. GalaxyRepository issues only four constant SQL statements (HierarchySql, AttributesSql, SELECT 1, SELECT time_of_last_deploy FROM galaxy) — no DiscoverHierarchyRequest field is ever concatenated into SQL, so there is no dynamic SQL-injection surface and no LIKE-escaping helper to test. AlarmFilterPrefix belongs to the worker alarm path, not the Galaxy SQL layer. All filters (TagNameGlob, RootTagName, template-chain, category, contained-path) are applied in memory by GalaxyHierarchyProjector/GalaxyGlobMatcher against the cached snapshot. The genuine, testable concern — that adversarial filter strings are treated as opaque literals (no wildcard behaviour, no ReDoS, no exceptions) — remains valid and was previously uncovered. Severity left at High: an unsafe in-memory filter would still be a real security gap.

Resolution: Resolved 2026-05-18: added src/MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs (10 test methods, mostly [Theory] over adversarial inputs ', ' OR '1'='1, '; DROP TABLE gobject;--, %, _, 100%_off, [abc], Pump'001) covering GalaxyGlobMatcher literal-treatment / LIKE-wildcard / pathological-input (ReDoS) behaviour and GalaxyHierarchyProjector + DiscoverHierarchy RPC handling of adversarial TagNameGlob, RootTagName, and TemplateChainContains. No product bug found — the in-memory filter layer treats all metacharacters as literals; the passing tests resolve the coverage gap.

Tests-003

Field Value
Severity Medium
Category Performance & resource management
Location src/MxGateway.Tests/Security/Authentication/SqliteAuthStoreTests.cs:170-176, src/MxGateway.Tests/Security/Authentication/ApiKeyAdminCliRunnerTests.cs:252-258
Status Open

Description: CreateTempDatabasePath creates a fresh directory under %TEMP%\mxgateway-auth-tests\<guid> (and ...-cli-tests) for every test but nothing ever deletes it. WorkerProcessLauncherTests.TestDirectory correctly implements IDisposable and cleans up; these two do not. SQLite connection pooling can also keep the .db handle open after the test. Over many CI runs this leaks temp files and open handles.

Recommendation: Wrap the temp directory in an IDisposable/IAsyncDisposable helper (as WorkerProcessLauncherTests does) and call SqliteConnection.ClearAllPools() before deletion, or use Microsoft.Data.Sqlite in-memory mode where a real file is not needed.

Resolution: (open)

Tests-004

Field Value
Severity Medium
Category Testing coverage
Location src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs
Status Open

Description: The authorization interceptor and MxAccessGatewayService are each tested in isolation, but no test composes the interceptor in front of the real service to confirm scope enforcement gates real RPCs end-to-end. A wiring mistake — interceptor not registered, or a new RPC added without a scope mapping in GatewayGrpcScopeResolver — would pass every existing test. GatewayGrpcScopeResolverTests also only checks an enumerated allow-list; it never asserts an unmapped request type fails closed.

Recommendation: Add an end-to-end test that runs OpenSession/Invoke through the interceptor+service composition with insufficient scope and asserts PermissionDenied; add a GatewayGrpcScopeResolver test asserting an unknown/unmapped request type throws or denies rather than returning a permissive default.

Resolution: (open)

Tests-005

Field Value
Severity Medium
Category Testing coverage
Location src/MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:239-261, src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs
Status Open

Description: Worker-crash handling is only tested as a clean terminal exception from ReadEventsAsync or a pre-set ShutdownException. There is no test for a worker that faults mid-command — an InvokeAsync in flight when the pipe/worker dies — which is a core fault-handling path of the two-process design. WorkerClientTests covers pipe-disconnect faulting the read loop, but not the interaction where a pending InvokeAsync task observes the fault and surfaces a meaningful error code.

Recommendation: Add a WorkerClient/SessionManager test that disposes the worker pipe (or emits a WorkerFault) while an InvokeAsync is pending, and assert the invoke task fails with a WorkerClientException/SessionManagerException carrying the worker-faulted error code.

Resolution: (open)

Tests-006

Field Value
Severity Medium
Category Concurrency & thread safety
Location src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:76, src/MxGateway.Tests/Gateway/Workers/FakeWorkerHarnessTests.cs:122
Status Open

Description: Several tests rely on fixed Task.Delay values: WorkerClientTests.InvokeAsync_WithLateReply… waits a hard-coded 50 ms after writing a late reply before issuing the second command, and the heartbeat tests use a 20 ms delay to make timestamps strictly increase. On a slow CI agent the 50 ms delay can be insufficient, and DateTimeOffset.UtcNow resolution can make the 20 ms heartbeat-advance assertion flaky.

Recommendation: Replace fixed delays with the existing WaitUntilAsync condition polling, and inject a controllable TimeProvider for heartbeat-timestamp comparisons instead of relying on wall-clock advance.

Resolution: (open)

Tests-007

Field Value
Severity Low
Category Code organization & conventions
Location src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:682, src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:324, src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:460, src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs:233
Status Open

Description: A near-identical TestServerCallContext implementation is copy-pasted into at least four test files (and AllowAllConstraintEnforcer / TestServerStreamWriter / RecordingStreamWriter into several). Duplication risks the copies drifting and bloats each file.

Recommendation: Extract a shared TestServerCallContext, RecordingServerStreamWriter<T>, and AllowAllConstraintEnforcer into a common test-support folder/namespace.

Resolution: (open)

Tests-008

Field Value
Severity Low
Category mxaccessgw conventions
Location src/MxGateway.Tests/Gateway/Sessions/WorkerAlarmRpcDispatcherTests.cs:1-9, src/MxGateway.Tests/Gateway/Sessions/NotWiredAlarmRpcDispatcherTests.cs:1-3, src/MxGateway.Tests/Gateway/Sessions/SessionManagerAlarmAutoSubscribeTests.cs:1
Status Open

Description: The alarm test files diverge from the project's C# style and the rest of the suite: snake_case test method names instead of the PascalCase Method_Condition_Result pattern; redundant explicit using System;/System.Threading; imports despite implicit global usings; and explicit-type new instead of target-typed new() used elsewhere. There is also a typo in fixture data ("wnwrap subscribe failed").

Recommendation: Rename the alarm tests to the house Method_Condition_Result convention, drop redundant System.* usings, align new usage, and fix the wnwrap typo.

Resolution: (open)

Tests-009

Field Value
Severity Low
Category Documentation & comments
Location src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:36-37,99,365
Status Open

Description: Several XML <summary> comments are copy-paste mismatches: the comment above OpenSessionAsync_SetsInitialDefaultLease describes correlation-ID generation; the comment above GatewaySessionSubscribeBulkAsync_ForwardsOneBulkCommand… describes lease refresh; the comment above CloseExpiredLeasesAsync_DoesNotCloseActiveEventSubscriber describes shutdown closing all sessions. Misleading test docs hinder triage.

Recommendation: Correct the <summary> text to match each test's actual behavior, or remove the redundant comments since the test names already describe the behavior.

Resolution: (open)

Tests-010

Field Value
Severity Low
Category Security
Location src/MxGateway.Tests/Gateway/Dashboard/DashboardAuthorizationHandlerTests.cs:26-36
Status Open

Description: The anonymous-localhost bypass is tested only for the success case (allowAnonymousLocalhost: true + loopback succeeds) and the remote-unauthenticated denial. There is no test for the security-critical negatives: anonymous + loopback when AllowAnonymousLocalhost is false must be denied, and anonymous + non-loopback when the flag is true must still be denied (the bypass is scoped strictly to loopback). Those are the misconfiguration cases that would expose the dashboard.

Recommendation: Add tests: anonymous + loopback + allowAnonymousLocalhost: false → not succeeded; anonymous + non-loopback + allowAnonymousLocalhost: true → not succeeded.

Resolution: (open)

Tests-011

Field Value
Severity Low
Category Correctness & logic bugs
Location src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:233-301
Status Open

Description: GatewayEndToEndFakeWorkerSmokeTests correctly stores and awaits launcher.WorkerTask, but SessionWorkerClientFactoryFakeWorkerTests uses _ = RunWorkerAsync(...) with no stored task (lines 152, 184, 220). An unhandled exception in the scripted worker becomes an unobserved TaskException that can surface as a process-level failure in an unrelated later test rather than failing the owning test.

Recommendation: Store the worker task and either await it during disposal or attach a continuation that fails the test on fault, mirroring GatewayEndToEndFakeWorkerSmokeTests.

Resolution: (open)

Tests-012

Field Value
Severity Low
Category Concurrency & thread safety
Location src/MxGateway.Tests/Gateway/Workers/Fakes/FakeWorkerHarness.cs:62, src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:472
Status Open

Description: Pipe names are uniquified per test with a GUID (good), but xUnit runs test classes in parallel by default and there is no xunit.runner.json or collection configuration. Tests that build a full WebApplication bind ephemeral ports (--urls=http://127.0.0.1:0, fine) but spin up DI containers and hosted services concurrently. Currently safe, but a future test binding a fixed port would silently collide.

Recommendation: Add an xunit.runner.json or a collection grouping the WebApplication-building tests, and keep the :0 ephemeral-port convention explicit so future tests do not introduce a fixed-port collision.

Resolution: (open)