b381bfcaf1
Tests-001: FakeSessionManager.TryGetSession unconditionally synthesized a session, so Invoke_WhenSessionMissing_ThrowsNotFound did not actually verify the missing-session path. Added ResolveOnlySeededSessions/SeedSession to the fake, rewrote the missing-session test, and added seeded-resolution and alarm-RPC missing-session coverage. Tests-002: re-triaged. GalaxyRepository issues only constant SQL; filters are applied in-memory by GalaxyHierarchyProjector/GalaxyGlobMatcher. Kept as a valid coverage gap and added GalaxyFilterInputSafetyTests exercising filter/glob input safety directly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
210 lines
15 KiB
Markdown
210 lines
15 KiB
Markdown
# Code Review — Tests
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Module | `src/MxGateway.Tests` |
|
|
| Reviewer | Claude Code |
|
|
| Review date | 2026-05-18 |
|
|
| Commit reviewed | `6c64030` |
|
|
| Status | Reviewed |
|
|
| Open findings | 10 |
|
|
|
|
## Checklist coverage
|
|
|
|
| # | Category | Result |
|
|
|---|---|---|
|
|
| 1 | Correctness & logic bugs | Issue found: Tests-001 (`FakeSessionManager.TryGetSession` always returns true), Tests-011 (unobserved worker task). |
|
|
| 2 | mxaccessgw conventions | FakeWorkerHarness used per docs; no real secrets; minor style drift in three alarm-test files (Tests-008). |
|
|
| 3 | Concurrency & thread safety | Issues found: Tests-006 (`Task.Delay`-based timing), Tests-012 (no parallelism guard for `WebApplication` tests). |
|
|
| 4 | Error handling & resilience | Strong — timeouts, faults, overflow, kill paths, protocol violations all exercised. No issues found. |
|
|
| 5 | Security | Issues found: Tests-002 (no SQL-injection coverage of Galaxy RPCs), Tests-010 (anonymous-localhost negative cases untested). |
|
|
| 6 | Performance & resource management | Issue found: Tests-003 (temp DB/worker directories never cleaned up). |
|
|
| 7 | Design-document adherence | Tests match `docs/GatewayTesting.md`; no drift found. No issues found. |
|
|
| 8 | Code organization & conventions | Issue found: Tests-007 (`TestServerCallContext` copy-pasted into 4+ files). |
|
|
| 9 | Testing coverage | Issues found: Tests-001, Tests-004 (no end-to-end interceptor+service test), Tests-005 (no worker-crash-mid-command coverage), Tests-002. |
|
|
| 10 | Documentation & comments | Issue found: Tests-009 (stale/mismatched XML `<summary>` comments). |
|
|
|
|
## Findings
|
|
|
|
### Tests-001
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | High |
|
|
| Category | Testing coverage |
|
|
| Location | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:483-489` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** `FakeSessionManager.TryGetSession` unconditionally returns `true` and synthesizes a session for any id. As a result, `Invoke_WhenSessionMissing_ThrowsNotFound` (line 52) only passes because `InvokeException` is pre-seeded — it does not verify that the gateway service maps a genuinely missing session to `NotFound`. No test exercises the real gateway path where `TryGetSession` returns `false` (for `StreamEvents`, `CloseSession`, alarm RPCs). A regression dropping the missing-session check would not be caught.
|
|
|
|
**Recommendation:** Make `FakeSessionManager.TryGetSession` return `false` for unknown ids (return only seeded sessions), then assert `NotFound`/`InvalidArgument` is produced by the service's own lookup logic rather than an injected exception.
|
|
|
|
**Resolution:** Resolved 2026-05-18: confirmed root cause — added `ResolveOnlySeededSessions`/`SeedSession` to `FakeSessionManager` so `TryGetSession` returns `false` for unseeded ids, rewrote `Invoke_WhenSessionMissing_ThrowsNotFound` to drop the injected `InvokeException` and exercise the service's own `ResolveSession` lookup (asserts `InvokeCount == 0`), and added `Invoke_WhenSessionSeeded_ResolvesAndInvokes`, `AcknowledgeAlarm_WhenSessionMissing_ThrowsNotFound`, and `QueryActiveAlarms_WhenSessionMissing_ThrowsNotFound`.
|
|
|
|
### Tests-002
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | High |
|
|
| Category | Security |
|
|
| Location | `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:198-210` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** The Galaxy Repository RPCs browse a SQL Server database (`ZB`). Every test injects a `StubGalaxyHierarchyCache`, so actual SQL query construction, parameterization, and filter/glob translation are never exercised. No test demonstrates that `TagNameGlob`, `RootTagName`, `AlarmFilterPrefix`, etc. are passed as parameters rather than concatenated into SQL. SQL-injection resistance of the Galaxy layer has zero coverage.
|
|
|
|
**Recommendation:** Add tests for the `GalaxyRepository` query-building layer (against SQLite or an in-memory abstraction, or by asserting parameter objects), covering glob/prefix inputs containing `'`, `%`, `_`, and `;`. At minimum add a unit test over the SQL `LIKE`-pattern escaping helper.
|
|
|
|
**Re-triage note:** The finding's premise is partly misframed. `GalaxyRepository` issues only four *constant* SQL statements (`HierarchySql`, `AttributesSql`, `SELECT 1`, `SELECT time_of_last_deploy FROM galaxy`) — no `DiscoverHierarchyRequest` field is ever concatenated into SQL, so there is no dynamic SQL-injection surface and no `LIKE`-escaping helper to test. `AlarmFilterPrefix` belongs to the worker alarm path, not the Galaxy SQL layer. All filters (`TagNameGlob`, `RootTagName`, template-chain, category, contained-path) are applied **in memory** by `GalaxyHierarchyProjector`/`GalaxyGlobMatcher` against the cached snapshot. The genuine, testable concern — that adversarial filter strings are treated as opaque literals (no wildcard behaviour, no ReDoS, no exceptions) — remains valid and was previously uncovered. Severity left at High: an unsafe in-memory filter would still be a real security gap.
|
|
|
|
**Resolution:** Resolved 2026-05-18: added `src/MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs` (10 test methods, mostly `[Theory]` over adversarial inputs `'`, `' OR '1'='1`, `'; DROP TABLE gobject;--`, `%`, `_`, `100%_off`, `[abc]`, `Pump'001`) covering `GalaxyGlobMatcher` literal-treatment / `LIKE`-wildcard / pathological-input (ReDoS) behaviour and `GalaxyHierarchyProjector` + `DiscoverHierarchy` RPC handling of adversarial `TagNameGlob`, `RootTagName`, and `TemplateChainContains`. No product bug found — the in-memory filter layer treats all metacharacters as literals; the passing tests resolve the coverage gap.
|
|
|
|
### Tests-003
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Performance & resource management |
|
|
| Location | `src/MxGateway.Tests/Security/Authentication/SqliteAuthStoreTests.cs:170-176`, `src/MxGateway.Tests/Security/Authentication/ApiKeyAdminCliRunnerTests.cs:252-258` |
|
|
| Status | Open |
|
|
|
|
**Description:** `CreateTempDatabasePath` creates a fresh directory under `%TEMP%\mxgateway-auth-tests\<guid>` (and `...-cli-tests`) for every test but nothing ever deletes it. `WorkerProcessLauncherTests.TestDirectory` correctly implements `IDisposable` and cleans up; these two do not. SQLite connection pooling can also keep the `.db` handle open after the test. Over many CI runs this leaks temp files and open handles.
|
|
|
|
**Recommendation:** Wrap the temp directory in an `IDisposable`/`IAsyncDisposable` helper (as `WorkerProcessLauncherTests` does) and call `SqliteConnection.ClearAllPools()` before deletion, or use `Microsoft.Data.Sqlite` in-memory mode where a real file is not needed.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-004
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Testing coverage |
|
|
| Location | `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs` |
|
|
| Status | Open |
|
|
|
|
**Description:** The authorization interceptor and `MxAccessGatewayService` are each tested in isolation, but no test composes the interceptor in front of the real service to confirm scope enforcement gates real RPCs end-to-end. A wiring mistake — interceptor not registered, or a new RPC added without a scope mapping in `GatewayGrpcScopeResolver` — would pass every existing test. `GatewayGrpcScopeResolverTests` also only checks an enumerated allow-list; it never asserts an unmapped request type fails closed.
|
|
|
|
**Recommendation:** Add an end-to-end test that runs `OpenSession`/`Invoke` through the interceptor+service composition with insufficient scope and asserts `PermissionDenied`; add a `GatewayGrpcScopeResolver` test asserting an unknown/unmapped request type throws or denies rather than returning a permissive default.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-005
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Testing coverage |
|
|
| Location | `src/MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:239-261`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs` |
|
|
| Status | Open |
|
|
|
|
**Description:** Worker-crash handling is only tested as a clean terminal exception from `ReadEventsAsync` or a pre-set `ShutdownException`. There is no test for a worker that faults mid-command — an `InvokeAsync` in flight when the pipe/worker dies — which is a core fault-handling path of the two-process design. `WorkerClientTests` covers pipe-disconnect faulting the read loop, but not the interaction where a pending `InvokeAsync` task observes the fault and surfaces a meaningful error code.
|
|
|
|
**Recommendation:** Add a `WorkerClient`/`SessionManager` test that disposes the worker pipe (or emits a `WorkerFault`) while an `InvokeAsync` is pending, and assert the invoke task fails with a `WorkerClientException`/`SessionManagerException` carrying the worker-faulted error code.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-006
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Concurrency & thread safety |
|
|
| Location | `src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:76`, `src/MxGateway.Tests/Gateway/Workers/FakeWorkerHarnessTests.cs:122` |
|
|
| Status | Open |
|
|
|
|
**Description:** Several tests rely on fixed `Task.Delay` values: `WorkerClientTests.InvokeAsync_WithLateReply…` waits a hard-coded 50 ms after writing a late reply before issuing the second command, and the heartbeat tests use a 20 ms delay to make timestamps strictly increase. On a slow CI agent the 50 ms delay can be insufficient, and `DateTimeOffset.UtcNow` resolution can make the 20 ms heartbeat-advance assertion flaky.
|
|
|
|
**Recommendation:** Replace fixed delays with the existing `WaitUntilAsync` condition polling, and inject a controllable `TimeProvider` for heartbeat-timestamp comparisons instead of relying on wall-clock advance.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-007
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Code organization & conventions |
|
|
| Location | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:682`, `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:324`, `src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:460`, `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs:233` |
|
|
| Status | Open |
|
|
|
|
**Description:** A near-identical `TestServerCallContext` implementation is copy-pasted into at least four test files (and `AllowAllConstraintEnforcer` / `TestServerStreamWriter` / `RecordingStreamWriter` into several). Duplication risks the copies drifting and bloats each file.
|
|
|
|
**Recommendation:** Extract a shared `TestServerCallContext`, `RecordingServerStreamWriter<T>`, and `AllowAllConstraintEnforcer` into a common test-support folder/namespace.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-008
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | mxaccessgw conventions |
|
|
| Location | `src/MxGateway.Tests/Gateway/Sessions/WorkerAlarmRpcDispatcherTests.cs:1-9`, `src/MxGateway.Tests/Gateway/Sessions/NotWiredAlarmRpcDispatcherTests.cs:1-3`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerAlarmAutoSubscribeTests.cs:1` |
|
|
| Status | Open |
|
|
|
|
**Description:** The alarm test files diverge from the project's C# style and the rest of the suite: snake_case test method names instead of the PascalCase `Method_Condition_Result` pattern; redundant explicit `using System;`/`System.Threading;` imports despite implicit global usings; and explicit-type `new` instead of target-typed `new()` used elsewhere. There is also a typo in fixture data (`"wnwrap subscribe failed"`).
|
|
|
|
**Recommendation:** Rename the alarm tests to the house `Method_Condition_Result` convention, drop redundant `System.*` usings, align `new` usage, and fix the `wnwrap` typo.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-009
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Documentation & comments |
|
|
| Location | `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:36-37,99,365` |
|
|
| Status | Open |
|
|
|
|
**Description:** Several XML `<summary>` comments are copy-paste mismatches: the comment above `OpenSessionAsync_SetsInitialDefaultLease` describes correlation-ID generation; the comment above `GatewaySessionSubscribeBulkAsync_ForwardsOneBulkCommand…` describes lease refresh; the comment above `CloseExpiredLeasesAsync_DoesNotCloseActiveEventSubscriber` describes shutdown closing all sessions. Misleading test docs hinder triage.
|
|
|
|
**Recommendation:** Correct the `<summary>` text to match each test's actual behavior, or remove the redundant comments since the test names already describe the behavior.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-010
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Security |
|
|
| Location | `src/MxGateway.Tests/Gateway/Dashboard/DashboardAuthorizationHandlerTests.cs:26-36` |
|
|
| Status | Open |
|
|
|
|
**Description:** The anonymous-localhost bypass is tested only for the success case (`allowAnonymousLocalhost: true` + loopback succeeds) and the remote-unauthenticated denial. There is no test for the security-critical negatives: anonymous + loopback when `AllowAnonymousLocalhost` is `false` must be denied, and anonymous + non-loopback when the flag is `true` must still be denied (the bypass is scoped strictly to loopback). Those are the misconfiguration cases that would expose the dashboard.
|
|
|
|
**Recommendation:** Add tests: anonymous + loopback + `allowAnonymousLocalhost: false` → not succeeded; anonymous + non-loopback + `allowAnonymousLocalhost: true` → not succeeded.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-011
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Correctness & logic bugs |
|
|
| Location | `src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:233-301` |
|
|
| Status | Open |
|
|
|
|
**Description:** `GatewayEndToEndFakeWorkerSmokeTests` correctly stores and awaits `launcher.WorkerTask`, but `SessionWorkerClientFactoryFakeWorkerTests` uses `_ = RunWorkerAsync(...)` with no stored task (lines 152, 184, 220). An unhandled exception in the scripted worker becomes an unobserved `TaskException` that can surface as a process-level failure in an unrelated later test rather than failing the owning test.
|
|
|
|
**Recommendation:** Store the worker task and either await it during disposal or attach a continuation that fails the test on fault, mirroring `GatewayEndToEndFakeWorkerSmokeTests`.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Tests-012
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Concurrency & thread safety |
|
|
| Location | `src/MxGateway.Tests/Gateway/Workers/Fakes/FakeWorkerHarness.cs:62`, `src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:472` |
|
|
| Status | Open |
|
|
|
|
**Description:** Pipe names are uniquified per test with a GUID (good), but xUnit runs test classes in parallel by default and there is no `xunit.runner.json` or collection configuration. Tests that build a full `WebApplication` bind ephemeral ports (`--urls=http://127.0.0.1:0`, fine) but spin up DI containers and hosted services concurrently. Currently safe, but a future test binding a fixed port would silently collide.
|
|
|
|
**Recommendation:** Add an `xunit.runner.json` or a collection grouping the `WebApplication`-building tests, and keep the `:0` ephemeral-port convention explicit so future tests do not introduce a fixed-port collision.
|
|
|
|
**Resolution:** _(open)_
|