a0203503a7
Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
GatewayGrpcScopeResolver so non-admin keys can use them; document
the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
in generated tonic code by reformatting the ReadBulkCommand proto
comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
make DisposeAsync race-safe against in-flight CloseAsync (-016);
add constraint-enforcement test coverage for the bulk-plan path
(-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
can distinguish graceful shutdown from a real STA-affinity
violation (-016); have the watchdog skip StaHung while
CurrentCommandCorrelationId is non-empty so a legitimate slow
ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
11 GatewaySession bulk methods (-013); replace the real TCP probe
in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
(-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
test and assert OnWriteComplete (-012); add live tests for
Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
CreateForTesting factory (-016); cover WorkerCancel and
unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
beforeStart() (-014); return a CancellingCompletableFuture that
actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
histograms with failed-call durations (-015); add coverage for
the five MalformedReply paths, the bulk-write helpers, the
Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
WorkerAlarmRpcDispatcher missing-session handling; drop the
duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
subscriptionExpression / ExecutingCommand arms; preserve
factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
FakeWorkerProcess.WaitForExitAsync against a real
TaskCompletionSource; switch the heartbeat-expires test to ManualTimeProvider;
add InvariantCulture to the remaining DateTimeOffset.Parse sites;
document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
IDisposable, class-level [Trait], single-source ZB default
connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
so absent env vars SKIP not pass; PascalCase rename of probe
[Fact]s; deterministic deadline test; new frame-protocol error
tests; ComputeTransitions diff-coverage; relocate dev-rig probes
to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
TreatWarningsAsErrors / analysers apply; document
DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
bulk-read handles in CLI; surface AcknowledgeAlarm transport
faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
runWriteBulkVariant; document the six new subcommands in
writeUsage; drain galaxy-watch events on limit; switch io.EOF
comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
option; regex-based credential redaction; Long.toUnsignedString
for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
_percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
_api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
stop hard-coding correlation IDs; resync RustClientDesign.md
with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
319 lines
41 KiB
Markdown
# Code Review — Tests

| Field | Value |
|---|---|
| Module | `src/MxGateway.Tests` |
| Reviewer | Claude Code |
| Review date | 2026-05-20 |
| Commit reviewed | `1cd51bb` |
| Status | Reviewed |
| Open findings | 0 |

## Checklist coverage

| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found: Tests-015 (`FakeWorkerProcess.WaitForExitAsync` mutates `HasExited`, weakening the smoke test assertion). |
| 2 | mxaccessgw conventions | No new issues. Style/convention drift previously filed has been resolved. |
| 3 | Concurrency & thread safety | Issue found: Tests-017 (`HeartbeatMonitor_WhenHeartbeatExpires_FaultsClient` still on real wall-clock). |
| 4 | Error handling & resilience | Strong: timeouts, faults, overflow, kill paths, and protocol violations are all exercised. No new issues found. |
| 5 | Security | No new issues. `Galaxy` adversarial-input safety (Tests-002), dashboard anonymous-localhost negatives (Tests-010), and interceptor composition (Tests-004) all resolved in the prior pass. |
| 6 | Performance & resource management | Issue found: Tests-014 (`WebApplication` instances built by `GatewayApplicationTests` and `DashboardCookieOptionsTests` are never disposed). |
| 7 | Design-document adherence | Tests match `docs/GatewayTesting.md`; no drift or issues found. |
| 8 | Code organization & conventions | Issue found: Tests-018 (`DateTimeOffset.Parse` calls without `CultureInfo.InvariantCulture`). |
| 9 | Testing coverage | Issues found: Tests-013 (eight new `GatewaySession.*BulkAsync` methods untested), Tests-016 (a Galaxy cache unit test performs a real network connect attempt). |
| 10 | Documentation & comments | Issue found: Tests-019 (the `Re-triage note` paragraphs added to Tests-002/006/008 live only inside `findings.md`; `docs/GatewayTesting.md` is not updated to describe the in-memory Galaxy filter safety tests added under that finding). |

## Findings
|
|
|
|
### Tests-001

| Field | Value |
|---|---|
| Severity | High |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:483-489` |
| Status | Resolved |

**Description:** `FakeSessionManager.TryGetSession` unconditionally returns `true` and synthesizes a session for any id. As a result, `Invoke_WhenSessionMissing_ThrowsNotFound` (line 52) only passes because `InvokeException` is pre-seeded; it does not verify that the gateway service maps a genuinely missing session to `NotFound`. No test exercises the real gateway path where `TryGetSession` returns `false` (for `StreamEvents`, `CloseSession`, alarm RPCs). A regression dropping the missing-session check would not be caught.

**Recommendation:** Make `FakeSessionManager.TryGetSession` return `false` for unknown ids (return only seeded sessions), then assert `NotFound`/`InvalidArgument` is produced by the service's own lookup logic rather than an injected exception.

**Resolution:** Resolved 2026-05-18: confirmed root cause. Added `ResolveOnlySeededSessions`/`SeedSession` to `FakeSessionManager` so `TryGetSession` returns `false` for unseeded ids, rewrote `Invoke_WhenSessionMissing_ThrowsNotFound` to drop the injected `InvokeException` and exercise the service's own `ResolveSession` lookup (asserts `InvokeCount == 0`), and added `Invoke_WhenSessionSeeded_ResolvesAndInvokes`, `AcknowledgeAlarm_WhenSessionMissing_ThrowsNotFound`, and `QueryActiveAlarms_WhenSessionMissing_ThrowsNotFound`.

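
The seeded-only fake described in this resolution can be sketched as follows. This is a minimal illustration, not the project's `FakeSessionManager`: the `FakeSession` record, the string session ids, and the `TryGetSession` signature are assumptions made so the example is self-contained. Only the `ResolveOnlySeededSessions` and `SeedSession` names come from the finding.

```csharp
using System;
using System.Collections.Generic;

// Usage: once the flag is set, an unseeded id must not resolve.
var sessions = new FakeSessionManager { ResolveOnlySeededSessions = true };
sessions.SeedSession("session-1");

Console.WriteLine(sessions.TryGetSession("session-1", out _)); // True
Console.WriteLine(sessions.TryGetSession("missing", out _));   // False

// Hypothetical stand-in for the real session type.
public sealed record FakeSession(string Id);

public sealed class FakeSessionManager
{
    private readonly Dictionary<string, FakeSession> _seeded = new();

    // When false, the fake keeps the old behaviour and synthesizes a session for any id.
    public bool ResolveOnlySeededSessions { get; set; }

    public void SeedSession(string id) => _seeded[id] = new FakeSession(id);

    public bool TryGetSession(string id, out FakeSession? session)
    {
        if (_seeded.TryGetValue(id, out session))
        {
            return true;
        }

        if (ResolveOnlySeededSessions)
        {
            session = null;
            return false;
        }

        session = new FakeSession(id);
        return true;
    }
}
```

With the flag set, a missing-session test can assert `NotFound` from the service's own lookup rather than from an injected exception.
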
### Tests-002

| Field | Value |
|---|---|
| Severity | High |
| Category | Security |
| Location | `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:198-210` |
| Status | Resolved |

**Description:** The Galaxy Repository RPCs browse a SQL Server database (`ZB`). Every test injects a `StubGalaxyHierarchyCache`, so actual SQL query construction, parameterization, and filter/glob translation are never exercised. No test demonstrates that `TagNameGlob`, `RootTagName`, `AlarmFilterPrefix`, etc. are passed as parameters rather than concatenated into SQL. SQL-injection resistance of the Galaxy layer has zero coverage.

**Recommendation:** Add tests for the `GalaxyRepository` query-building layer (against SQLite or an in-memory abstraction, or by asserting parameter objects), covering glob/prefix inputs containing `'`, `%`, `_`, and `;`. At minimum add a unit test over the SQL `LIKE`-pattern escaping helper.

**Re-triage note:** The finding's premise is partly misframed. `GalaxyRepository` issues only four *constant* SQL statements (`HierarchySql`, `AttributesSql`, `SELECT 1`, `SELECT time_of_last_deploy FROM galaxy`); no `DiscoverHierarchyRequest` field is ever concatenated into SQL, so there is no dynamic SQL-injection surface and no `LIKE`-escaping helper to test. `AlarmFilterPrefix` belongs to the worker alarm path, not the Galaxy SQL layer. All filters (`TagNameGlob`, `RootTagName`, template-chain, category, contained-path) are applied **in memory** by `GalaxyHierarchyProjector`/`GalaxyGlobMatcher` against the cached snapshot. The genuine, testable concern, that adversarial filter strings are treated as opaque literals (no wildcard behaviour, no ReDoS, no exceptions), remains valid and was previously uncovered. Severity left at High: an unsafe in-memory filter would still be a real security gap.

**Resolution:** Resolved 2026-05-18: added `src/MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs` (10 test methods, mostly `[Theory]` over adversarial inputs `'`, `' OR '1'='1`, `'; DROP TABLE gobject;--`, `%`, `_`, `100%_off`, `[abc]`, `Pump'001`) covering `GalaxyGlobMatcher` literal-treatment / `LIKE`-wildcard / pathological-input (ReDoS) behaviour and `GalaxyHierarchyProjector` + `DiscoverHierarchy` RPC handling of adversarial `TagNameGlob`, `RootTagName`, and `TemplateChainContains`. No product bug found: the in-memory filter layer treats all metacharacters as literals; the passing tests resolve the coverage gap.

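
As an illustration of the literal-treatment property these tests pin, the sketch below builds a matcher in which every metacharacter is escaped. `LiteralGlob` is hypothetical and is not the project's `GalaxyGlobMatcher`; treating `*` as the only wildcard and using a 250 ms match timeout are assumptions chosen for the example.

```csharp
using System;
using System.Text.RegularExpressions;

string[] adversarial =
{
    "'", "' OR '1'='1", "'; DROP TABLE gobject;--", "%", "_", "100%_off", "[abc]", "Pump'001",
};

foreach (var input in adversarial)
{
    // Each adversarial string should match itself and nothing else.
    var matchesSelf = LiteralGlob.IsMatch(input, input);
    var matchesOther = LiteralGlob.IsMatch(input, "Pump001");
    Console.WriteLine($"{input}: self={matchesSelf}, other={matchesOther}");
}

Console.WriteLine(LiteralGlob.IsMatch("Pump*", "Pump001")); // True: '*' is the only wildcard here

// Hypothetical matcher used only for this illustration.
public static class LiteralGlob
{
    public static bool IsMatch(string glob, string candidate)
    {
        // Regex.Escape neutralizes every regex metacharacter; only the escaped '*' is re-enabled.
        var pattern = "\\A" + Regex.Escape(glob).Replace("\\*", ".*") + "\\z";

        // The match timeout bounds the cost of pathological inputs.
        return Regex.IsMatch(
            candidate,
            pattern,
            RegexOptions.CultureInvariant,
            TimeSpan.FromMilliseconds(250));
    }
}
```

A `[Theory]` over the same input list would assert `self == true` and `other == false` for every entry.
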
### Tests-003

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Performance & resource management |
| Location | `src/MxGateway.Tests/Security/Authentication/SqliteAuthStoreTests.cs:170-176`, `src/MxGateway.Tests/Security/Authentication/ApiKeyAdminCliRunnerTests.cs:252-258` |
| Status | Resolved |

**Description:** `CreateTempDatabasePath` creates a fresh directory under `%TEMP%\mxgateway-auth-tests\<guid>` (and `...-cli-tests`) for every test but nothing ever deletes it. `WorkerProcessLauncherTests.TestDirectory` correctly implements `IDisposable` and cleans up; these two do not. SQLite connection pooling can also keep the `.db` handle open after the test. Over many CI runs this leaks temp files and open handles.

**Recommendation:** Wrap the temp directory in an `IDisposable`/`IAsyncDisposable` helper (as `WorkerProcessLauncherTests` does) and call `SqliteConnection.ClearAllPools()` before deletion, or use `Microsoft.Data.Sqlite` in-memory mode where a real file is not needed.

**Resolution:** Resolved 2026-05-18: confirmed root cause: both `CreateTempDatabasePath` helpers created `%TEMP%` directories with no cleanup, and `Microsoft.Data.Sqlite` pools connections by default so the `.db` handle outlives the test. Added a shared `TempDatabaseDirectory` (`src/MxGateway.Tests/Security/Authentication/TempDatabaseDirectory.cs`) `IDisposable` helper that calls `SqliteConnection.ClearAllPools()` and recursively deletes its directory. `SqliteAuthStoreTests` and `ApiKeyAdminCliRunnerTests` now implement `IDisposable`, track every directory created via `CreateTempDatabasePath`, and dispose them after each test. All affected tests still pass.

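
A helper of the kind described above might look like the following sketch. It assumes the `Microsoft.Data.Sqlite` package (6.0 or later, where pooling and `ClearAllPools` are available) is referenced; the constructor parameter, the `DirectoryPath`/`DatabasePath` members, and the `auth.db` file name are illustrative and may differ from the real `TempDatabaseDirectory`.

```csharp
using System;
using System.IO;
using Microsoft.Data.Sqlite;

string directory;
using (var temp = new TempDatabaseDirectory("mxgateway-auth-tests"))
{
    directory = temp.DirectoryPath;

    // Opening a connection creates the .db file and leaves a pooled handle behind.
    using var connection = new SqliteConnection($"Data Source={temp.DatabasePath}");
    connection.Open();
}

Console.WriteLine(Directory.Exists(directory)); // False

// Illustrative helper; member names are assumptions, not the project's API.
public sealed class TempDatabaseDirectory : IDisposable
{
    public TempDatabaseDirectory(string prefix)
    {
        DirectoryPath = Path.Combine(Path.GetTempPath(), prefix, Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(DirectoryPath);
    }

    public string DirectoryPath { get; }

    public string DatabasePath => Path.Combine(DirectoryPath, "auth.db");

    public void Dispose()
    {
        // Pooled connections keep the .db file handle open; release them before deleting.
        SqliteConnection.ClearAllPools();

        if (Directory.Exists(DirectoryPath))
        {
            Directory.Delete(DirectoryPath, recursive: true);
        }
    }
}
```

Clearing the pools first matters mostly on Windows, where an open handle makes the recursive delete fail.
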
### Tests-004

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs` |
| Status | Resolved |

**Description:** The authorization interceptor and `MxAccessGatewayService` are each tested in isolation, but no test composes the interceptor in front of the real service to confirm scope enforcement gates real RPCs end-to-end. A wiring mistake (interceptor not registered, or a new RPC added without a scope mapping in `GatewayGrpcScopeResolver`) would pass every existing test. `GatewayGrpcScopeResolverTests` also only checks an enumerated allow-list; it never asserts an unmapped request type fails closed.

**Recommendation:** Add an end-to-end test that runs `OpenSession`/`Invoke` through the interceptor+service composition with insufficient scope and asserts `PermissionDenied`; add a `GatewayGrpcScopeResolver` test asserting an unknown/unmapped request type throws or denies rather than returning a permissive default.

**Resolution:** Resolved 2026-05-18: confirmed the coverage gap. Added three interceptor+service composition tests to `GatewayGrpcAuthorizationInterceptorTests` that run the real `GatewayGrpcAuthorizationInterceptor` continuation into a real `MxAccessGatewayService`: `InterceptorComposedWithService_OpenSessionMissingScope_DeniesBeforeServiceRuns` (asserts `PermissionDenied` and `OpenSessionCount == 0`), `InterceptorComposedWithService_OpenSessionWithScope_RunsServiceWithIdentity` (service runs and observes the interceptor-pushed identity), and `InterceptorComposedWithService_InvokeWriteCommandWithReadScope_DeniesBeforeServiceRuns` (a `Write` command with only `invoke:read` is denied). Added two `GatewayGrpcScopeResolverTests`: `ResolveRequiredScope_UnmappedRequestType_FailsClosedToAdminScope` confirms an unmapped request type resolves to the most-restrictive `Admin` scope (the resolver's `_ => GatewayScopes.Admin` default already fails closed, so there is no product bug), and `ResolveRequiredScope_UnknownInvokeCommandKind_ReturnsInvokeReadScope` confirms an unknown command kind does not silently grant write/admin access.

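
The fail-closed default mentioned in this resolution follows a common pattern, sketched below under stated assumptions. The request records, the `ScopeResolver` class, and the scope strings other than `invoke:read` are hypothetical; the real `GatewayGrpcScopeResolver` maps generated gRPC request types and returns `GatewayScopes` values.

```csharp
using System;

Console.WriteLine(ScopeResolver.ResolveRequiredScope(new OpenSessionRequest())); // session:open
Console.WriteLine(ScopeResolver.ResolveRequiredScope(new ReadRequest()));        // invoke:read
Console.WriteLine(ScopeResolver.ResolveRequiredScope(new UnmappedRequest()));    // admin

// Hypothetical request types standing in for generated gRPC messages.
public sealed record OpenSessionRequest;
public sealed record ReadRequest;
public sealed record UnmappedRequest;

public static class ScopeResolver
{
    public static string ResolveRequiredScope(object request) => request switch
    {
        OpenSessionRequest => "session:open", // assumed scope name
        ReadRequest => "invoke:read",

        // Fail closed: anything not explicitly mapped requires the most restrictive scope.
        _ => "admin",
    };
}
```

A resolver test in this style asserts that a request type with no explicit arm resolves to the admin scope, so a newly added RPC is denied to ordinary keys until it is mapped.
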
### Tests-005

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:239-261`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs` |
| Status | Resolved |

**Description:** Worker-crash handling is only tested as a clean terminal exception from `ReadEventsAsync` or a pre-set `ShutdownException`. There is no test for a worker that faults mid-command (an `InvokeAsync` in flight when the pipe/worker dies), which is a core fault-handling path of the two-process design. `WorkerClientTests` covers pipe-disconnect faulting the read loop, but not the interaction where a pending `InvokeAsync` task observes the fault and surfaces a meaningful error code.

**Recommendation:** Add a `WorkerClient`/`SessionManager` test that disposes the worker pipe (or emits a `WorkerFault`) while an `InvokeAsync` is pending, and assert the invoke task fails with a `WorkerClientException`/`SessionManagerException` carrying the worker-faulted error code.

**Resolution:** Resolved 2026-05-18: confirmed the coverage gap and confirmed the product path already handles it correctly (`WorkerClient.ReadLoopAsync` → `SetFaulted` → `CompletePendingCommands(fault)` fails every pending command with the fault exception). Added two `WorkerClientTests`: `InvokeAsync_WhenPipeDisconnectsMidCommand_FailsPendingInvokeWithPipeDisconnected` (worker reads the command then disposes its pipe side; the pending invoke task fails with `WorkerClientErrorCode.PipeDisconnected`) and `InvokeAsync_WhenWorkerFaultsMidCommand_FailsPendingInvokeWithWorkerFaulted` (worker emits a `WorkerFault` envelope while the invoke is pending; the task fails with `WorkerClientErrorCode.WorkerFaulted`). Both also assert the client transitions to `Faulted`. No product change needed.

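
The "fail every pending command with the fault exception" step can be pictured with the sketch below. `PendingCommands`, its `Register` method, the `long` correlation ids, and the `string` reply type are assumptions for illustration; only the `CompletePendingCommands(fault)` name is taken from the resolution, and the real `WorkerClient` is more involved.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var pendingCommands = new PendingCommands();
Task<string> pendingInvoke = pendingCommands.Register(correlationId: 1);

// Simulate the read loop observing a fault while the invoke is still in flight.
pendingCommands.CompletePendingCommands(new InvalidOperationException("worker faulted"));

try
{
    await pendingInvoke;
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message); // worker faulted
}

// Hypothetical pending-command table keyed by correlation id.
public sealed class PendingCommands
{
    private readonly ConcurrentDictionary<long, TaskCompletionSource<string>> _pending = new();

    public Task<string> Register(long correlationId)
    {
        var completion = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        _pending[correlationId] = completion;
        return completion.Task;
    }

    public void CompletePendingCommands(Exception fault)
    {
        foreach (var correlationId in _pending.Keys)
        {
            // TryRemove guards against a reply completing the same command concurrently.
            if (_pending.TryRemove(correlationId, out var completion))
            {
                completion.TrySetException(fault);
            }
        }
    }
}
```

A mid-command fault test then only needs to start an invoke, trigger the fault, and assert on the exception the pending task surfaces.
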
### Tests-006

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:76`, `src/MxGateway.Tests/Gateway/Workers/FakeWorkerHarnessTests.cs:122` |
| Status | Resolved |

**Description:** Several tests rely on fixed `Task.Delay` values: `WorkerClientTests.InvokeAsync_WithLateReply…` waits a hard-coded 50 ms after writing a late reply before issuing the second command, and the heartbeat tests use a 20 ms delay to make timestamps strictly increase. On a slow CI agent the 50 ms delay can be insufficient, and `DateTimeOffset.UtcNow` resolution can make the 20 ms heartbeat-advance assertion flaky.

**Recommendation:** Replace fixed delays with the existing `WaitUntilAsync` condition polling, and inject a controllable `TimeProvider` for heartbeat-timestamp comparisons instead of relying on wall-clock advance.

**Re-triage note:** The brief flagged `ReadLoop_WhenClientFaults_KillsOwnedWorkerProcess` as "a real `WorkerClient` fault→kill bug". On inspection it is **not a product bug**; it is a test race. `WorkerClient.SetFaulted` publishes the `Faulted` state under lock *before* calling `KillOwnedProcess`, so the old test's `WaitUntilAsync(() => client.State == Faulted)` could return between those two statements and observe `process.KillCount == 0`. The kill itself always runs synchronously inside `SetFaulted`, and `ShutdownAsync`/`DisposeAsync` re-issue an idempotent kill, so no real consumer relies on "state==Faulted implies process dead". The fix is therefore a test-quality fix (correctly Medium / Concurrency), not a product fix.

**Resolution:** Resolved 2026-05-18: (1) Made `ReadLoop_WhenClientFaults_KillsOwnedWorkerProcess` deterministic: it now `await`s `FakeWorkerProcess.WaitForExitAsync` (the `TaskCompletionSource` completed inside `Kill()`), which completes exactly when the kill runs, eliminating the state-polling race; verified by running it five times in isolation (5/5 pass). (2) Removed the fixed 50 ms `Task.Delay` from `InvokeAsync_WithLateReply_IgnoresLateReplyAndKeepsClientReady`: the stale reply and the second reply are now sent in pipe (FIFO) order, so the read loop discards the stale reply before the second reply with no timing window. (3) Replaced the 20 ms `Task.Delay` heartbeat-advance hacks in `WorkerClientTests.ReadLoop_WhenHeartbeatArrives_UpdatesLastHeartbeatAndWorkerProcess` and `FakeWorkerHarnessTests.SendHeartbeatAsync_UpdatesClientHeartbeatState` with an injected `ManualTimeProvider` advanced by a fixed `TimeSpan`; both tests now assert the exact post-advance timestamp instead of `>` against wall-clock drift.

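
A manually advanced clock of the kind used in item (3) can be sketched on top of the .NET 8 `TimeProvider` base class. The constructor and `Advance` method below are assumptions about shape, not the project's `ManualTimeProvider`, and the sketch is deliberately not thread-safe.

```csharp
using System;

var time = new ManualTimeProvider(new DateTimeOffset(2026, 1, 1, 0, 0, 0, TimeSpan.Zero));
var firstHeartbeat = time.GetUtcNow();

// Advance by an exact amount instead of sleeping and hoping the wall clock moved.
time.Advance(TimeSpan.FromSeconds(5));

Console.WriteLine(time.GetUtcNow() - firstHeartbeat); // 00:00:05

// Illustrative clock; only GetUtcNow is overridden, timers are out of scope here.
public sealed class ManualTimeProvider : TimeProvider
{
    private DateTimeOffset _utcNow;

    public ManualTimeProvider(DateTimeOffset start) => _utcNow = start;

    public override DateTimeOffset GetUtcNow() => _utcNow;

    public void Advance(TimeSpan delta) => _utcNow += delta;
}
```

Because the clock only moves when the test says so, the assertion can compare against the exact expected timestamp rather than using `>`.
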
### Tests-007

| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:682`, `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:324`, `src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:460`, `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs:233` |
| Status | Resolved |

**Description:** A near-identical `TestServerCallContext` implementation is copy-pasted into at least four test files (and `AllowAllConstraintEnforcer` / `TestServerStreamWriter` / `RecordingStreamWriter` into several). Duplication risks the copies drifting and bloats each file.

**Recommendation:** Extract a shared `TestServerCallContext`, `RecordingServerStreamWriter<T>`, and `AllowAllConstraintEnforcer` into a common test-support folder/namespace.

**Resolution:** Resolved 2026-05-18: confirmed five duplicated copies (the brief's four plus a fifth in `Galaxy/GalaxyFilterInputSafetyTests.cs`). Added a shared `MxGateway.Tests.TestSupport` namespace under `src/MxGateway.Tests/TestSupport/`: `TestServerCallContext.cs` (single class with an optional `Metadata? requestHeaders` constructor parameter that subsumes both the no-arg and headers-bearing variants), `RecordingServerStreamWriter.cs` (thread-safe writer with `Messages` and `WaitForFirstMessageAsync`, replacing `TestServerStreamWriter`/`RecordingStreamWriter`/`RecordingServerStreamWriter`), and `AllowAllConstraintEnforcer.cs`. Deleted all five `TestServerCallContext` copies, both `AllowAllConstraintEnforcer` copies, and the three stream-writer copies; updated the five test files to `using MxGateway.Tests.TestSupport;` and renamed `.Items` call sites to `.Messages`. Removed the now-unused `Grpc.Core` using from `GatewayEndToEndFakeWorkerSmokeTests.cs`. Build clean (0 warnings) and suite green.

### Tests-008

| Field | Value |
|---|---|
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | `src/MxGateway.Tests/Gateway/Sessions/WorkerAlarmRpcDispatcherTests.cs:1-9`, `src/MxGateway.Tests/Gateway/Sessions/NotWiredAlarmRpcDispatcherTests.cs:1-3`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerAlarmAutoSubscribeTests.cs:1` |
| Status | Resolved |

**Description:** The alarm test files diverge from the project's C# style and the rest of the suite: snake_case test method names instead of the PascalCase `Method_Condition_Result` pattern; redundant explicit `using System;`/`System.Threading;` imports despite implicit global usings; and explicit-type `new` instead of target-typed `new()` used elsewhere. There is also a typo in fixture data (`"wnwrap subscribe failed"`).

**Recommendation:** Rename the alarm tests to the house `Method_Condition_Result` convention, drop redundant `System.*` usings, align `new` usage, and fix the `wnwrap` typo.

**Re-triage note:** Two of the finding's claims are incorrect. (1) `"wnwrap subscribe failed"` is **not a typo**: `WnWrap` is the real name of the worker's `WnWrapAlarmConsumer` MXAccess component (`src/MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs`); the fixture string deliberately references it, so it was left unchanged. (2) `SessionManagerAlarmAutoSubscribeTests.cs` already uses PascalCase `Method_Condition_Result` names and target-typed `new()`, and its lone `using System.Runtime.CompilerServices;` is **required** for `[EnumeratorCancellation]` (not a global using), so it is not redundant. That file needed no change. The genuine style drift was confined to `WorkerAlarmRpcDispatcherTests.cs` and `NotWiredAlarmRpcDispatcherTests.cs`.

**Resolution:** Resolved 2026-05-18: renamed all ten `WorkerAlarmRpcDispatcherTests` methods and both `NotWiredAlarmRpcDispatcherTests` methods from snake_case to the house `Method_Condition_Result` PascalCase convention; dropped the redundant `System`/`System.Collections.Generic`/`System.Linq`/`System.Threading`/`System.Threading.Tasks` usings from `WorkerAlarmRpcDispatcherTests.cs` and `System.Threading`/`System.Threading.Tasks` from `NotWiredAlarmRpcDispatcherTests.cs` (all are implicit global usings), keeping the required `System.Runtime.CompilerServices`; converted explicit-type `new SessionRegistry()`/`new WorkerAlarmRpcDispatcher(...)`/`new FakeAlarmWorkerClient`/`new List<...>()`/`new GatewaySession(...)` to target-typed `new()`; and replaced the fully-qualified `System.StringComparison` with `StringComparison`. See the re-triage note for the two claims not actioned. Suite green.

### Tests-009

| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:36-37,99,365` |
| Status | Resolved |

**Description:** Several XML `<summary>` comments are copy-paste mismatches: the comment above `OpenSessionAsync_SetsInitialDefaultLease` describes correlation-ID generation; the comment above `GatewaySessionSubscribeBulkAsync_ForwardsOneBulkCommand…` describes lease refresh; the comment above `CloseExpiredLeasesAsync_DoesNotCloseActiveEventSubscriber` describes shutdown closing all sessions. Misleading test docs hinder triage.

**Recommendation:** Correct the `<summary>` text to match each test's actual behavior, or remove the redundant comments since the test names already describe the behavior.

**Resolution:** Resolved 2026-05-18: confirmed three copy-paste `<summary>` mismatches. The mislabelled comments were the summaries of the *following* tests left attached to the wrong method (the test below each then had no summary). Corrected all three: `OpenSessionAsync_SetsInitialDefaultLease` now describes setting the initial lease expiry; the comment above `InvokeAsync_WhenSessionReady_RefreshesLease` (the finding mis-cited the method name as `GatewaySessionSubscribeBulkAsync_…`) now describes lease refresh on invoke; and `CloseExpiredLeasesAsync_DoesNotCloseActiveEventSubscriber` now describes the expired-lease sweep leaving an active-event-subscriber session open. No behavior change.

### Tests-010

| Field | Value |
|---|---|
| Severity | Low |
| Category | Security |
| Location | `src/MxGateway.Tests/Gateway/Dashboard/DashboardAuthorizationHandlerTests.cs:26-36` |
| Status | Resolved |

**Description:** The anonymous-localhost bypass is tested only for the success case (`allowAnonymousLocalhost: true` + loopback succeeds) and the remote-unauthenticated denial. There is no test for the security-critical negatives: anonymous + loopback when `AllowAnonymousLocalhost` is `false` must be denied, and anonymous + non-loopback when the flag is `true` must still be denied (the bypass is scoped strictly to loopback). Those are the misconfiguration cases that would expose the dashboard.

**Recommendation:** Add tests: anonymous + loopback + `allowAnonymousLocalhost: false` → not succeeded; anonymous + non-loopback + `allowAnonymousLocalhost: true` → not succeeded.

**Resolution:** Resolved 2026-05-18: confirmed the coverage gap and confirmed `DashboardAuthorizationHandler` already gates the bypass correctly on `AllowAnonymousLocalhost && IsLoopbackRequest()` (no product bug). Added two `DashboardAuthorizationHandlerTests`: `HandleAsync_AnonymousLocalhostDisallowed_DoesNotSucceed` (anonymous + loopback + `allowAnonymousLocalhost: false` → not succeeded) and `HandleAsync_AnonymousLocalhostAllowedFromRemoteAddress_DoesNotSucceed` (anonymous + non-loopback + `allowAnonymousLocalhost: true` → not succeeded, proving the bypass stays scoped to loopback). Both pass.

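
The gate is a two-input conjunction, so its truth table is small enough to spell out. The sketch below is a stand-alone approximation: `AllowsAnonymous` is a hypothetical helper, and using `IPAddress.IsLoopback` on the remote address is an assumption about what `IsLoopbackRequest()` checks.

```csharp
using System;
using System.Net;

var remote = IPAddress.Parse("10.0.0.5");

Console.WriteLine(AllowsAnonymous(true, IPAddress.Loopback));      // True:  flag on, loopback
Console.WriteLine(AllowsAnonymous(true, IPAddress.IPv6Loopback));  // True:  flag on, IPv6 loopback
Console.WriteLine(AllowsAnonymous(false, IPAddress.Loopback));     // False: flag off, loopback
Console.WriteLine(AllowsAnonymous(true, remote));                  // False: flag on, remote
Console.WriteLine(AllowsAnonymous(false, remote));                 // False: flag off, remote

// Hypothetical helper mirroring the `AllowAnonymousLocalhost && IsLoopbackRequest()` condition.
static bool AllowsAnonymous(bool allowAnonymousLocalhost, IPAddress remoteAddress) =>
    allowAnonymousLocalhost && IPAddress.IsLoopback(remoteAddress);
```

The two negative tests added in this finding correspond to the third and fourth rows.
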
### Tests-011

| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:233-301` |
| Status | Resolved |

**Description:** `GatewayEndToEndFakeWorkerSmokeTests` correctly stores and awaits `launcher.WorkerTask`, but `SessionWorkerClientFactoryFakeWorkerTests` uses `_ = RunWorkerAsync(...)` with no stored task (lines 152, 184, 220). An unhandled exception in the scripted worker becomes an unobserved task exception (`TaskScheduler.UnobservedTaskException`) that can surface as a process-level failure in an unrelated later test rather than failing the owning test.

**Recommendation:** Store the worker task and either await it during disposal or attach a continuation that fails the test on fault, mirroring `GatewayEndToEndFakeWorkerSmokeTests`.

**Resolution:** Resolved 2026-05-18: confirmed all three scripted launchers in `SessionWorkerClientFactoryFakeWorkerTests` discarded the worker task. Added an `IWorkerTaskLauncher` interface (each launcher now stores its scripted task in a `WorkerTask` property and exposes `ObserveWorkerTaskAsync`); the test class now implements `IAsyncDisposable`, tracks every launcher it creates via a `Track` helper, and in `DisposeAsync` awaits each `WorkerTask` (within `TestTimeout`) so a scripted-worker fault fails the owning test instead of leaking as an unobserved task exception (`TaskScheduler.UnobservedTaskException`). `OperationCanceledException` and `IOException` (the expected outcomes of the worker client tearing the pipe down) are swallowed; anything else rethrows. `NeverReadyWorkerProcessLauncher` (which parks on an infinite `Task.Delay`) was given its own `CancellationTokenSource` so disposal can cancel and observe the parked task. Suite green.

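
The observe-on-dispose behaviour described above can be sketched as a small helper. The signature shown for `ObserveWorkerTaskAsync` is an assumption (the finding names the method but not its shape), and the five-second timeout stands in for the suite's `TestTimeout`.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

var timeout = TimeSpan.FromSeconds(5);

// Expected teardown: the pipe closing under the scripted worker is swallowed.
await ObserveWorkerTaskAsync(Task.FromException(new IOException("pipe closed")), timeout);
Console.WriteLine("expected teardown swallowed");

// A genuine script bug is rethrown, so it fails the test that owns the worker.
try
{
    await ObserveWorkerTaskAsync(
        Task.FromException(new InvalidOperationException("script bug")), timeout);
}
catch (InvalidOperationException ex)
{
    Console.WriteLine($"fault surfaced: {ex.Message}");
}

// Hypothetical helper; awaiting the stored task is what marks its exception as observed.
static async Task ObserveWorkerTaskAsync(Task workerTask, TimeSpan timeout)
{
    try
    {
        // WaitAsync throws TimeoutException if the worker never finishes.
        await workerTask.WaitAsync(timeout);
    }
    catch (OperationCanceledException)
    {
        // Expected when the worker client cancels the scripted worker.
    }
    catch (IOException)
    {
        // Expected when the worker client tears the pipe down.
    }
}
```

Calling a helper like this from `DisposeAsync` for every tracked launcher keeps a scripted-worker failure attached to the test that started it.
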
### Tests-012

| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/MxGateway.Tests/Gateway/Workers/Fakes/FakeWorkerHarness.cs:62`, `src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:472` |
| Status | Resolved |

**Description:** Pipe names are uniquified per test with a GUID (good), but xUnit runs test classes in parallel by default and there is no `xunit.runner.json` or collection configuration. Tests that build a full `WebApplication` bind ephemeral ports (`--urls=http://127.0.0.1:0`, fine) but spin up DI containers and hosted services concurrently. Currently safe, but a future test binding a fixed port would silently collide.

**Recommendation:** Add an `xunit.runner.json` or a collection grouping the `WebApplication`-building tests, and keep the `:0` ephemeral-port convention explicit so future tests do not introduce a fixed-port collision.

**Resolution:** Resolved 2026-05-18: added `src/MxGateway.Tests/xunit.runner.json` making the parallelism policy explicit (`parallelizeTestCollections: true`, `maxParallelThreads: -1`, `parallelizeAssembly: false`, `longRunningTestSeconds: 30`) and wired it into `MxGateway.Tests.csproj` as `<None Update="xunit.runner.json" CopyToOutputDirectory="PreserveNewest" />` so the runner picks it up (confirmed present in `bin/Debug/net10.0/`). Added a comment at the only `WebApplication`-building call site (`GatewayApplicationTests.cs`, `--urls=http://127.0.0.1:0`) documenting that the ephemeral-port (`:0`) convention is mandatory because test collections run in parallel. No fixed-port binding exists today; this is a preventative guardrail as the finding recommends.

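
For reference, the settings listed in this resolution correspond to an `xunit.runner.json` along these lines. The four values are taken from the resolution text; the file in the repository may contain additional keys or a different key order.

```json
{
  "parallelizeTestCollections": true,
  "maxParallelThreads": -1,
  "parallelizeAssembly": false,
  "longRunningTestSeconds": 30
}
```

In xUnit's configuration, a `maxParallelThreads` value of `-1` requests unlimited threads, which is why the ephemeral-port (`:0`) convention matters for any test that binds a listener.
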
### Tests-013
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Testing coverage |
|
|
| Location | `src/MxGateway.Server/Sessions/GatewaySession.cs:449-679`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** `GatewaySession` exposes eleven bulk methods (`AddItemBulkAsync`, `AdviseItemBulkAsync`, `RemoveItemBulkAsync`, `UnAdviseItemBulkAsync`, `SubscribeBulkAsync`, `UnsubscribeBulkAsync`, `WriteBulkAsync`, `Write2BulkAsync`, `WriteSecuredBulkAsync`, `WriteSecured2BulkAsync`, `ReadBulkAsync`) but only three (`SubscribeBulkAsync`, `WriteBulkAsync`, `ReadBulkAsync`) are exercised in `SessionManagerTests`. A grep across `src/MxGateway.Tests` for the other eight method names returns zero matches. The recent commit `eaa7093` ("register the five new bulk subcommands in `IsKnownGatewayCommand`") explicitly added bulk surface to the gateway, and `1cd51bb` added stress benchmarks for it, but the gateway-side tests do not pin the command-kind, payload-shape, or `WriteSecured*Bulk` credential-redaction behaviour for any of the new bulk variants. A future regression in `WriteSecuredBulkAsync` body construction would not be caught by the gateway unit suite.
**Recommendation:** Mirror the existing `SubscribeBulkAsync` / `WriteBulkAsync` / `ReadBulkAsync` test pattern for the eight missing methods: each test should call `OpenSessionAsync`, invoke the bulk API, assert that the worker received exactly one `WorkerCommand` of the matching `MxCommandKind`, and (for the secured variants) confirm that the credential payload reaches the worker intact, i.e. that log redaction is not applied to the over-the-wire command shape.
**Resolution:** Resolved 2026-05-20: added `src/MxGateway.Tests/Gateway/Sessions/SessionManagerBulkTests.cs` with per-method coverage for all eleven bulk entry points. Each method now has a round-trip test that pins (a) the exact `MxCommandKind` sent to the worker, (b) the payload shape (server handle, item handles / tag addresses / entries, timeout for `ReadBulk`), and (c) per-entry failure surfacing where the reply contains a mix of `WasSuccessful = true`/`false` results with an `ErrorMessage`. Each method also has a `*_PropagatesCancellation` test that pre-cancels the token and asserts `OperationCanceledException` flows out. The secured variants additionally pin that `CurrentUserId` / `VerifierUserId` survive the over-the-wire command shape unchanged (the gateway's redaction rules apply only to logs, not to the command body the worker receives). New tests use a local `FakeBulkWorkerClient` keyed by `MxCommand.Kind`-specific replies; no production-code change. All 54 SessionManager/GalaxyHierarchyCache tests pass with `dotnet test --filter "FullyQualifiedName~SessionManager|FullyQualifiedName~GalaxyHierarchyCache"`.
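
A minimal sketch of the round-trip and pre-cancelled-token shapes described above. `RecordingWorkerClient`, `Command`, `CommandKind`, and the `ReadBulkAsync` stand-in are hypothetical; the real `FakeBulkWorkerClient`, `MxCommandKind`, and `GatewaySession` are not shown in this finding and may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins for MxCommandKind and the worker command body.
public enum CommandKind { AddItemBulk, WriteSecuredBulk, ReadBulk }

public sealed record Command(CommandKind Kind, int ServerHandle, IReadOnlyList<string> Tags);

public sealed class RecordingWorkerClient
{
    // Every command sent to the fake worker, so a test can pin kind and payload.
    public List<Command> Received { get; } = new();

    public Task<int> SendAsync(Command command, CancellationToken ct)
    {
        // A pre-cancelled token surfaces as OperationCanceledException
        // before anything is recorded.
        ct.ThrowIfCancellationRequested();
        Received.Add(command);
        return Task.FromResult(command.Tags.Count);
    }
}

public static class BulkRoundTripSketch
{
    // Stand-in for a bulk entry point such as ReadBulkAsync.
    public static Task<int> ReadBulkAsync(
        RecordingWorkerClient worker, int serverHandle,
        IReadOnlyList<string> tags, CancellationToken ct)
        => worker.SendAsync(new Command(CommandKind.ReadBulk, serverHandle, tags), ct);

    // Round trip: exactly one command, of the expected kind, with the expected handle.
    public static async Task<bool> RoundTripPinsKindAndPayloadAsync()
    {
        var worker = new RecordingWorkerClient();
        int count = await ReadBulkAsync(
            worker, 7, new[] { "Tank.Level", "Tank.Temp" }, CancellationToken.None);
        return count == 2
            && worker.Received.Count == 1
            && worker.Received[0].Kind == CommandKind.ReadBulk
            && worker.Received[0].ServerHandle == 7;
    }

    // Cancellation: the exception flows out and the worker sees nothing.
    public static async Task<bool> PreCancelledTokenPropagatesAsync()
    {
        var worker = new RecordingWorkerClient();
        using var cts = new CancellationTokenSource();
        cts.Cancel();
        try
        {
            await ReadBulkAsync(worker, 7, new[] { "Tank.Level" }, cts.Token);
            return false;
        }
        catch (OperationCanceledException)
        {
            return worker.Received.Count == 0;
        }
    }
}
```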
### Tests-014
| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | `src/MxGateway.Tests/Gateway/GatewayApplicationTests.cs:18,33,44,62,81,105`, `src/MxGateway.Tests/Gateway/Dashboard/DashboardCookieOptionsTests.cs:17` |
| Status | Resolved |
**Description:** Seven `[Fact]` methods build a real `WebApplication` via `GatewayApplication.Build([])` and never dispose it. `WebApplication` is `IAsyncDisposable`; constructing one stands up a full DI container, an OpenTelemetry meter (`GatewayMetrics`), Kestrel server objects, hosted services, and logging providers. Because the suite runs test collections in parallel (per the new `xunit.runner.json` from Tests-012), every undisposed instance keeps its meter, loggers, and hosted services alive until the test process exits, so each such test adds another live `Meter` instance and silently extends the memory/handle footprint of an `xunit` run. Only the two tests that actually call `app.StartAsync()` (`GatewayApplicationTests.StartAsync_InvalidGatewayConfiguration_FailsStartup` and `SqliteAuthStoreTests.StartAsync_NewerSchemaVersion_BlocksStartup`) currently use `await using`.
**Recommendation:** Promote each `WebApplication app = GatewayApplication.Build(...)` to `await using WebApplication app = ...` and make the containing test method `async Task`. The endpoint-listing assertions do not need `await`, but the `await using` will ensure the DI container, meter, and hosted services are torn down per-test.
**Resolution:** 2026-05-20 — Promoted all seven `WebApplication`-building tests (six in `GatewayApplicationTests` plus the one in `DashboardCookieOptionsTests`) to `async Task` with `await using WebApplication app = GatewayApplication.Build(...)`, so the DI container, `GatewayMetrics` meter, hosted services, and Kestrel objects are torn down per-test rather than leaking until process exit. The previously already-`await using` `StartAsync_InvalidGatewayConfiguration_FailsStartup` was unchanged. Full suite green.
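
The difference between the leaky and the fixed test shape can be illustrated with a small sketch. `FakeHost` is a hypothetical stand-in for `WebApplication` that only counts live instances; it is not the gateway's code.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for WebApplication: tracks instances not yet disposed.
public sealed class FakeHost : IAsyncDisposable
{
    public static int LiveInstances;

    public FakeHost() => LiveInstances++;

    public ValueTask DisposeAsync()
    {
        LiveInstances--;
        return ValueTask.CompletedTask;
    }
}

public static class DisposalSketch
{
    // Leaky shape: the host is built, inspected, and abandoned.
    public static void BuildWithoutDisposing()
    {
        var host = new FakeHost();
        _ = host; // synchronous assertions would go here
    }

    // Fixed shape: 'await using' runs DisposeAsync when the scope ends,
    // even though the assertions themselves never await anything.
    public static async Task BuildWithAwaitUsingAsync()
    {
        await using var host = new FakeHost();
        _ = host; // synchronous assertions would go here
    }
}
```

In this sketch the first method leaves `FakeHost.LiveInstances` incremented, while the second returns it to its previous value.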
### Tests-015
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:374-379,87` |
| Status | Resolved |
**Description:** The nested `FakeWorkerProcess.WaitForExitAsync` implementation unconditionally sets `HasExited = true` and `ExitCode ??= 0` when called, regardless of whether the scripted worker actually completed the shutdown handshake. The smoke-test assertion `Assert.True(launcher.Process.HasExited)` therefore cannot distinguish "the scripted worker received `WorkerShutdown`, sent `WorkerShutdownAck`, and called `MarkExited(0)`" from "the gateway code path simply awaited `WaitForExitAsync` somewhere during teardown". The scripted worker happens to call `MarkExited(0)` after receiving the shutdown frame, but a regression that bypassed the shutdown-ack path entirely would still pass this assertion. The companion launcher in `SessionWorkerClientFactoryFakeWorkerTests.FakeWorkerProcess.WaitForExitAsync` (lines 351-356) has the same shape — fine there because no exit assertion is made — but the smoke test relies on this signal.
**Recommendation:** Make `WaitForExitAsync` await an internal `TaskCompletionSource` that is only completed by `Kill()` or `MarkExited()` (the same pattern `WorkerClientTests.FakeWorkerProcess` already uses for `_exited`), so `HasExited` reflects actual exit and the smoke test's assertion is meaningful.
**Resolution:** 2026-05-20 — Rewrote the smoke-test `FakeWorkerProcess` to back `WaitForExitAsync` with a `TaskCompletionSource _exited` that is only completed inside `MarkExited` (called by the scripted worker after sending `WorkerShutdownAck`) or `Kill` (which calls `MarkExited(-1)`), removing the "set `HasExited = true` and return immediately" cheat. The smoke test now also asserts `Assert.Equal(0, launcher.Process.ExitCode)` — `MarkExited(0)` is reachable only via the shutdown-ack branch, so a regression that bypassed the ack path would produce a non-zero (or null) exit code and fail the assertion deterministically. `WorkerClient.ShutdownAsync` calls `WaitForProcessExitAsync`, which now genuinely awaits the scripted worker's ack.
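
A sketch of the `TaskCompletionSource`-backed pattern, assuming a simplified process surface. The member names follow the finding; the locking and the `WaitAsync` call are illustrative choices and may not match the actual test double.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Simplified fake process: exit state changes only through MarkExited/Kill.
public sealed class FakeWorkerProcess
{
    private readonly object _gate = new();
    private readonly TaskCompletionSource _exited =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    public bool HasExited { get; private set; }
    public int? ExitCode { get; private set; }

    // First caller wins; later calls cannot overwrite the exit code.
    public void MarkExited(int exitCode)
    {
        lock (_gate)
        {
            if (HasExited) return;
            ExitCode = exitCode;
            HasExited = true;
        }
        _exited.TrySetResult();
    }

    public void Kill() => MarkExited(-1);

    // Waiting never flips HasExited; it only observes the completion source.
    public Task WaitForExitAsync(CancellationToken ct = default)
        => _exited.Task.WaitAsync(ct);
}
```

With this shape, awaiting `WaitForExitAsync` cannot by itself make `HasExited` true, so an assertion on `ExitCode == 0` only passes when something actually called `MarkExited(0)`.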
### Tests-016
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Galaxy/GalaxyHierarchyCacheTests.cs:29-41,115-124` |
| Status | Resolved |
**Description:** `RefreshAsync_WhenSqlIsUnreachable_MarksUnavailableAndDoesNotPublish` is in the unit-test project but exercises a real `GalaxyHierarchyCache`/`GalaxyRepository` against a hard-coded TCP socket `127.0.0.1:65500` with a one-second connect timeout. Per `docs/GatewayTesting.md`, live Galaxy coverage belongs in `MxGateway.IntegrationTests` and is gated by `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1`; this test is neither gated nor uses a stub repository. On most boxes the connect fails closed (the test passes), but the outcome depends on OS-level "connection refused" vs "no route to host" behaviour and is sensitive to environments where 127.0.0.1:65500 happens to be bound — a real flakiness source. It also breaks the gateway-without-MXAccess invariant in spirit (the gateway code path under test does I/O the unit project should not need).
**Recommendation:** Either (a) replace the real repository with an in-test fake that throws a `SqlException`/`TimeoutException` from `GetHierarchyAsync`, exercising `GalaxyHierarchyCache.RefreshAsync`'s exception path directly; or (b) move the test to `MxGateway.IntegrationTests` and gate it behind a "no-live-DB-required" variant of the live-Galaxy attribute. (a) is preferred because the production path being tested is the cache's reaction to a repository exception, not socket behaviour.
**Resolution:** Resolved 2026-05-20: applied option (a). Introduced `src/MxGateway.Server/Galaxy/IGalaxyRepository.cs` with the four methods the cache consumes (`TestConnectionAsync`, `GetLastDeployTimeAsync`, `GetHierarchyAsync`, `GetAttributesAsync`); made `GalaxyRepository` implement it; changed `GalaxyHierarchyCache`'s constructor to depend on `IGalaxyRepository` rather than the concrete type; and registered the interface against the existing concrete singleton in `GalaxyRepositoryServiceCollectionExtensions.AddGalaxyRepository`. Rewrote the test as `RefreshAsync_WhenRepositoryThrows_MarksUnavailableAndDoesNotPublish` using a local `ThrowingGalaxyRepository : IGalaxyRepository` that throws an `InvalidOperationException` from `GetLastDeployTimeAsync` (the first call the cache makes against the repository). The test now exercises the cache's exception branch directly — no TCP I/O — and additionally asserts that `GetHierarchyAsync`/`GetAttributesAsync` are NOT invoked once the deploy-time probe has failed. `Current_BeforeAnyRefresh_ReturnsEmpty` was migrated to the same fake. The unreachable `CreateCache` helper that built a real `GalaxyRepository` against `127.0.0.1:65500` was removed. The Galaxy SQL surface itself stays covered by `MxGateway.IntegrationTests.Galaxy.GalaxyRepositoryLiveTests` (gated by `MXGATEWAY_RUN_LIVE_GALAXY_REPOSITORY_TESTS=1`).
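
The fake-repository approach can be sketched as follows. `IRepository`, `ThrowingRepository`, and `HierarchyCache` are hypothetical, trimmed-down stand-ins (the real `IGalaxyRepository` has four methods, and the real cache's internals are not shown in this finding); the point is only that the exception branch is reachable without any socket.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical two-method slice of the repository interface.
public interface IRepository
{
    Task<DateTimeOffset> GetLastDeployTimeAsync(CancellationToken ct);
    Task<string[]> GetHierarchyAsync(CancellationToken ct);
}

// Fails on the first call the cache makes and counts any later calls.
public sealed class ThrowingRepository : IRepository
{
    public int HierarchyCalls { get; private set; }

    public Task<DateTimeOffset> GetLastDeployTimeAsync(CancellationToken ct)
        => Task.FromException<DateTimeOffset>(
            new InvalidOperationException("simulated repository outage"));

    public Task<string[]> GetHierarchyAsync(CancellationToken ct)
    {
        HierarchyCalls++;
        return Task.FromResult(Array.Empty<string>());
    }
}

// Assumed cache shape: probe deploy time first, publish only on success.
public sealed class HierarchyCache
{
    private readonly IRepository _repository;

    public HierarchyCache(IRepository repository) => _repository = repository;

    public bool IsAvailable { get; private set; } = true;
    public string[] Current { get; private set; } = Array.Empty<string>();

    public async Task RefreshAsync(CancellationToken ct = default)
    {
        try
        {
            await _repository.GetLastDeployTimeAsync(ct);
            Current = await _repository.GetHierarchyAsync(ct);
            IsAvailable = true;
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Keep the previous snapshot and flag the outage instead of publishing.
            IsAvailable = false;
        }
    }
}
```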
### Tests-017
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/MxGateway.Tests/Gateway/Workers/WorkerClientTests.cs:346-364` |
| Status | Resolved |
**Description:** `HeartbeatMonitor_WhenHeartbeatExpires_FaultsClient` configures `HeartbeatGrace = 80 ms` and `HeartbeatCheckInterval = 20 ms`, then asserts the client faults within the 5-second `TestTimeout`. The test compares against the real wall clock — the heartbeat monitor reads `TimeProvider.System` for the grace check. After Tests-006 migrated the other heartbeat tests to an injected `ManualTimeProvider` for determinism, this one is now the only `WorkerClientTests` heartbeat case that still rides the wall clock. The 5-second outer bound makes a false failure unlikely, but the test cannot fail fast when the heartbeat-monitor logic regresses — it just waits the full 5 seconds.
**Recommendation:** Inject the same `ManualTimeProvider` used by `ReadLoop_WhenHeartbeatArrives_UpdatesLastHeartbeatAndWorkerProcess`, then `clock.Advance(TimeSpan.FromSeconds(2))` past the grace and assert the fault deterministically. The `HeartbeatCheckInterval` (20 ms) timer fire can stay on the real clock; what needs to be deterministic is the grace comparison.
**Resolution:** 2026-05-20 — `HeartbeatMonitor_WhenHeartbeatExpires_FaultsClient` now constructs a `ManualTimeProvider` seeded at `"2026-05-20T12:00:00Z"`, passes it to `CreateClient` via the existing `timeProvider` parameter, and calls `clock.Advance(TimeSpan.FromSeconds(2))` after the handshake. `WorkerClient.MarkReady` records `_lastHeartbeatAt` from the manual clock, so the next 20 ms `HeartbeatCheckInterval` tick observes `now - lastHeartbeat = 2s > 80ms grace` and faults deterministically. The check-interval timer stays on the real clock as the finding recommended; only the grace comparison is deterministic.
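
A sketch of why an injected clock makes the grace comparison deterministic. `ManualClock` and `HeartbeatGraceCheck` are hypothetical; the project's own `ManualTimeProvider` and the monitor inside `WorkerClient` are not shown in this finding, and only the `TimeProvider` base class (.NET 8 and later) is a real API.

```csharp
using System;

// Minimal manual clock built on the BCL TimeProvider abstraction.
public sealed class ManualClock : TimeProvider
{
    private DateTimeOffset _now;

    public ManualClock(DateTimeOffset start) => _now = start;

    public override DateTimeOffset GetUtcNow() => _now;

    public void Advance(TimeSpan delta) => _now += delta;
}

// Assumed shape of the comparison a heartbeat monitor performs on each tick.
public sealed class HeartbeatGraceCheck
{
    private readonly TimeProvider _time;
    private readonly TimeSpan _grace;
    private DateTimeOffset _lastHeartbeatAt;

    public HeartbeatGraceCheck(TimeProvider time, TimeSpan grace)
    {
        _time = time;
        _grace = grace;
        _lastHeartbeatAt = time.GetUtcNow();
    }

    public void RecordHeartbeat() => _lastHeartbeatAt = _time.GetUtcNow();

    // Expired once more than the grace period has elapsed on the injected clock.
    public bool IsExpired() => _time.GetUtcNow() - _lastHeartbeatAt > _grace;
}
```

Advancing the manual clock by two seconds against an 80 ms grace makes `IsExpired()` true on the very next check, regardless of how long the test actually ran.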
### Tests-018
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/MxGateway.Tests/Galaxy/GalaxyHierarchyCacheTests.cs:32`, `src/MxGateway.Tests/Gateway/Dashboard/DashboardSnapshotServiceTests.cs:45,51,57,105,134,163,167,202-209,284,317,523`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:40` |
| Status | Resolved |
**Description:** Several tests parse ISO-8601 literals with `DateTimeOffset.Parse("2026-04-26T10:00:00Z")` without an explicit `CultureInfo.InvariantCulture`. `Directory.Build.props` enables `TreatWarningsAsErrors`, but CA1305 (specify `IFormatProvider`) is not currently raised for these call sites, so the build does not flag them. Nevertheless, `DateTimeOffset.Parse` without a culture uses `CurrentCulture`, and on a locale whose `DateTimeFormatInfo` rejects the `Z` suffix or uses non-Gregorian calendar conventions, these parses can throw at test time. `WorkerClientTests.cs:327` and `FakeWorkerHarnessTests.cs:121` already pass `System.Globalization.CultureInfo.InvariantCulture` as of the Tests-006 fix; the other ~15 call sites did not get the same treatment.
**Recommendation:** Add `CultureInfo.InvariantCulture` to every `DateTimeOffset.Parse(...)` call in `MxGateway.Tests`, or replace with `DateTimeOffset.ParseExact` against the literal `"O"` round-trip format. A single-line `using System.Globalization;` per file keeps the call sites concise.
**Resolution:** 2026-05-20 — Added `CultureInfo.InvariantCulture` to every `DateTimeOffset.Parse` site in `MxGateway.Tests` that lacked it: 16 call sites in `DashboardSnapshotServiceTests.cs` (a new `using System.Globalization;` was added so the call sites stay concise) and one in `SessionManagerTests.cs` (using the fully-qualified `System.Globalization.CultureInfo.InvariantCulture` to match the in-file style of the existing `ManualTimeProvider` parse sites). `GalaxyHierarchyCacheTests.cs:36` was already correct from the Tests-016 rewrite. A final grep confirms every `DateTimeOffset.Parse`/`DateTime.Parse` call in `src/MxGateway.Tests` now passes `CultureInfo.InvariantCulture`.
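
For reference, the culture-independent call shape looks like this. The `InvariantParseSketch` helper is hypothetical and exists only for illustration; the tests call `DateTimeOffset.Parse` inline.

```csharp
using System;
using System.Globalization;

public static class InvariantParseSketch
{
    // Passing InvariantCulture means the result does not depend on the
    // machine's CurrentCulture or its calendar conventions.
    public static DateTimeOffset ParseUtc(string iso) =>
        DateTimeOffset.Parse(iso, CultureInfo.InvariantCulture);
}
```

For example, `InvariantParseSketch.ParseUtc("2026-04-26T10:00:00Z")` denotes the same instant as `new DateTimeOffset(2026, 4, 26, 10, 0, 0, TimeSpan.Zero)`.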
### Tests-019
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `docs/GatewayTesting.md`, `code-reviews/Tests/findings.md` (Tests-002 re-triage) |
| Status | Resolved |
**Description:** The Tests-002 re-triage (2026-05-18) confirmed there is no SQL-injection surface in `GalaxyRepository` because filters are applied in memory by `GalaxyHierarchyProjector`/`GalaxyGlobMatcher` against the cached snapshot, and added 10 adversarial-input tests in `src/MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs`. That explanation lives only in the findings file; `docs/GatewayTesting.md` does not mention `GalaxyFilterInputSafetyTests`, the in-memory filter model, or the adversarial-input matrix. A future reader of the test docs will not know which tests pin the literal-filter behaviour or why the Galaxy SQL layer is not unit-tested for parameterisation. Per `CLAUDE.md` ("Update docs in the same change as the source. When public APIs, contracts, configuration, build steps, security behavior, event shapes, value conversion, status mapping, or lifecycle rules change, the affected docs must change in the same commit"), the Galaxy security-behaviour decision warrants a paragraph in `GatewayTesting.md`.
**Recommendation:** Add a short subsection to `docs/GatewayTesting.md` (probably under "Focused Commands" or a new "Galaxy Filter Safety" section) that names `GalaxyFilterInputSafetyTests`, explains that Galaxy filtering happens in memory against the cached hierarchy (so the SQL surface is constant), and lists the adversarial-input invariants the suite pins (`%`, `_`, `'`, `;`, `[abc]` are literals; the glob regex has a 100 ms timeout against pathological input).
**Resolution:** 2026-05-20 — Added a "Galaxy Filter Safety" section to `docs/GatewayTesting.md` (immediately after "Live Galaxy Repository", before "Live LDAP") that names `GalaxyFilterInputSafetyTests`, re-frames the Tests-002 finding (the Galaxy SQL surface is constant — `HierarchySql`, `AttributesSql`, `SELECT 1`, `SELECT time_of_last_deploy FROM galaxy`), explains that all filters are applied in memory by `GalaxyHierarchyProjector` / `GalaxyGlobMatcher`, lists the adversarial-input matrix (`'`, `' OR '1'='1`, `'; DROP TABLE gobject;--`, `%`, `_`, `100%_off`, `[abc]`, `Pump'001`), and enumerates the invariants the suite pins (SQL metacharacters are opaque literals, only `*`/`?` are glob wildcards, the matcher has a 100 ms regex timeout against pathological input, the projector returns zero matches / `NotFound` rather than the whole hierarchy, and the `DiscoverHierarchy` RPC end-to-end returns zero matches for adversarial globs).
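
The literal-filter invariants can be illustrated with a small in-memory glob matcher. This is a sketch under assumptions, not the actual `GalaxyGlobMatcher` (which this finding does not show): it escapes the whole pattern, re-enables only `*` and `?`, and applies a 100 ms regex timeout.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical glob matcher; class and member names are for illustration only.
public static class GlobSketch
{
    private static readonly TimeSpan MatchTimeout = TimeSpan.FromMilliseconds(100);

    public static bool IsMatch(string glob, string candidate)
    {
        // Escape everything first so ', ;, %, _, [ and ] stay literal,
        // then turn only the escaped glob wildcards back into regex wildcards.
        string body = Regex.Escape(glob).Replace("\\*", ".*").Replace("\\?", ".");
        string pattern = "\\A" + body + "\\z";
        try
        {
            return Regex.IsMatch(
                candidate, pattern,
                RegexOptions.CultureInvariant | RegexOptions.Singleline,
                MatchTimeout);
        }
        catch (RegexMatchTimeoutException)
        {
            // Pathological input is treated as "no match" rather than hanging.
            return false;
        }
    }
}
```

Because matching happens against strings already held in memory, an input such as `' OR '1'='1` is just an unusual name that matches nothing; no SQL text is ever built from it.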