Files
mxaccessgw/code-reviews/IntegrationTests/findings.md
T
Joseph Doherty bc55396334 Resolve IntegrationTests-001 and IntegrationTests-002 code-review findings
IntegrationTests-001: documented the live Galaxy Repository test suite and
its MXGATEWAY_RUN_LIVE_GALAXY_TESTS / MXGATEWAY_LIVE_GALAXY_CONN gating in
docs/GatewayTesting.md.

IntegrationTests-002: documented the live LDAP test suite in
docs/GatewayTesting.md and added a concrete "Provisioning the GwAdmin group"
step to glauth.md so DashboardLdapLiveTests' GwAdmin-membership assumption
is reproducible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:46:09 -04:00

12 KiB

Code Review — IntegrationTests

Field Value
Module src/MxGateway.IntegrationTests
Reviewer Claude Code
Review date 2026-05-18
Commit reviewed 6c64030
Status Reviewed
Open findings 8

Checklist coverage

# Category Result
1 Correctness & logic bugs Issues found: IntegrationTests-003 (asserts only on first event), IntegrationTests-010 (WaitForFirstMessageAsync ignores cancellation).
2 mxaccessgw conventions Live tests correctly gated and skip (not fail) when prerequisites are absent; LiveGalaxyRepositoryFactAttribute undocumented in the opt-in matrix.
3 Concurrency & thread safety Issue found: IntegrationTests-007 (no [Collection]/parallelism guard for shared MXAccess/ZB/GLAuth).
4 Error handling & resilience Issue found: IntegrationTests-004 (cleanup WaitAsync can mask the original failure).
5 Security No production secrets; only documented dev GLAuth creds and a localhost ZB connection string, all env-overridable. No issues found.
6 Performance & resource management Worker process disposed transitively via session disposal; no leaked pipes/COM/processes. No issues found.
7 Design-document adherence Issues found: IntegrationTests-001 (Galaxy live suite absent from the opt-in matrix), IntegrationTests-002 (GwAdmin LDAP prerequisite undocumented).
8 Code organization & conventions Issue found: IntegrationTests-008 (three near-identical fact attributes).
9 Testing coverage Issues found: IntegrationTests-005 (thin MXAccess parity coverage), IntegrationTests-006 (thin LDAP failure-path coverage).
10 Documentation & comments Issue found: IntegrationTests-009 (TestServerCallContext mislabelled "Mock").

Findings

IntegrationTests-001

Field Value
Severity High
Category Design-document adherence
Location src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs:7, src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs
Status Resolved

Description: The Galaxy Repository live test suite and its gating env var MXGATEWAY_RUN_LIVE_GALAXY_TESTS (plus connection-string override MXGATEWAY_LIVE_GALAXY_CONN) are completely absent from docs/GatewayTesting.md. CLAUDE.md mandates updating docs in the same change as the source. The opt-in matrix documents only the MXAccess and LDAP env vars, so an operator running the documented matrix has no way to know these tests exist or how to enable them.

Recommendation: Add a "Live Galaxy Repository" section to docs/GatewayTesting.md documenting MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1, MXGATEWAY_LIVE_GALAXY_CONN, the ZB database prerequisite, and the covered RPCs, mirroring the existing "Live MXAccess Smoke" section.

Resolution: Resolved 2026-05-18: Added a "Live Galaxy Repository" section to docs/GatewayTesting.md documenting MXGATEWAY_RUN_LIVE_GALAXY_TESTS, MXGATEWAY_LIVE_GALAXY_CONN, the deployed-ZB prerequisite, and the covered GalaxyRepository RPCs.

IntegrationTests-002

Field Value
Severity High
Category Design-document adherence
Location src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:13, src/MxGateway.Server/Configuration/LdapOptions.cs:27
Status Resolved

Description: DashboardLdapLiveTests builds the authenticator with new GatewayOptions(), so it relies on LdapOptions.RequiredGroup defaulting to GwAdmin and asserts the admin user is a member of a GwAdmin LDAP group. glauth.md does not list GwAdmin as a provisioned group — it lists admin only in the five role groups and describes GwAdmin as a group to add "when reuse isn't enough." If GLAuth has only the documented baseline groups, AuthenticateAsync_AdminInGwAdminGroup_Succeeds fails (not skips) on any box where the env var is set. This is an undocumented hard prerequisite beyond "LDAP is up."

Recommendation: Either document the required GwAdmin GLAuth provisioning step in glauth.md and GatewayTesting.md, or have the test set RequiredGroup to a baseline group glauth.md guarantees admin belongs to (e.g. WriteOperate).

Resolution: Resolved 2026-05-18: Took the documentation fix — promoted the glauth.md "Adding a gw-specific group" section into a concrete "Provisioning the GwAdmin group" step that grants GwAdmin to admin, cross-referenced it from the groups/verification sections, and added a "Live LDAP" section to docs/GatewayTesting.md calling out GwAdmin as a hard prerequisite. Alternative considered: weaken the test to a baseline group (WriteOperate) — rejected because GwAdmin is the real default LdapOptions.RequiredGroup and the test should exercise it.

IntegrationTests-003

Field Value
Severity Medium
Category Correctness & logic bugs
Location src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:89-97
Status Open

Description: The test asserts only on the first MxEvent recorded by RecordingServerStreamWriter. A live MXAccess provider can deliver an initial state/quality event whose family or handles differ from the expected OnDataChange (e.g. a registration-state or bad-quality bootstrap event). Because WaitForFirstMessageAsync returns whatever arrives first, a genuine ordering/family defect could fail spuriously or leave later wrong events unverified.

Recommendation: Filter for the first event with Family == OnDataChange (with a bounded retry/poll) or assert the full recorded sequence, so the test verifies the event the worker is supposed to emit.

Resolution: (open)

IntegrationTests-004

Field Value
Severity Medium
Category Error handling & resilience
Location src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:108-111
Status Open

Description: In the finally block, after CloseSessionAsync, the test does await streamTask.WaitAsync(StreamShutdownTimeout). If closing the session does not promptly complete the stream (or StreamEvents itself faults), this throws TimeoutException from inside finally, which replaces/masks any original assertion failure from the try block. The diagnostic value of the real failure is lost.

Recommendation: Wrap the streamTask.WaitAsync (and ideally WaitForProcessesAsync) in a try/catch that logs the cleanup exception via output.WriteLine instead of letting it propagate.

Resolution: (open)

IntegrationTests-005

Field Value
Severity Medium
Category Testing coverage
Location src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs
Status Open

Description: The only live MXAccess test covers the Register→AddItem→Advise→one-OnDataChange→Close happy path. CLAUDE.md stresses that MXAccess parity is the contract and calls out non-obvious behaviors (WriteSecured ordering, OperationComplete semantics, invalid-handle exceptions). None of Write, WriteSecured, Unadvise, RemoveItem, Unregister, OperationComplete, an invalid-handle command, or a worker-fault path is exercised against live COM — exactly the paths fake-worker tests cannot validate.

Recommendation: Add live coverage for at least a Write round-trip and an invalid-handle command, plus a worker-fault/abnormal-exit scenario, even if behind additional opt-in env vars.

Resolution: (open)

IntegrationTests-006

Field Value
Severity Medium
Category Testing coverage
Location src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs
Status Open

Description: LDAP live coverage is two cases: admin succeeds, readonly is denied for missing group. There is no coverage of a wrong password for a valid user, an unknown username, or the LDAP-server-unreachable path — all of which DashboardAuthenticator has distinct branches for (the LdapException catch, the candidate is null branch). The negative test only proves group-membership denial, not credential rejection.

Recommendation: Add a live test for admin with a wrong password asserting Succeeded == false and that the password is not leaked into FailureMessage, and a test for an unknown username.

Resolution: (open)

IntegrationTests-007

Field Value
Severity Low
Category Concurrency & thread safety
Location src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:20, src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs:5, src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:9
Status Open

Description: The live test classes contend for genuinely shared singletons — one MXAccess COM provider, one ZB SQL database, one GLAuth instance with a 3-fail/10-minute per-IP lockout. No [Collection] annotation or DisableTestParallelization is declared, so xUnit's default cross-class parallelism could run the Galaxy tests concurrently or interleave an LDAP failure burst that trips the GLAuth lockout.

Recommendation: Place the live test classes in a shared [Collection], or set [assembly: CollectionBehavior(DisableTestParallelization = true)] for this opt-in project, so live external resources are accessed serially.

Resolution: (open)

IntegrationTests-008

Field Value
Severity Low
Category Code organization & conventions
Location src/MxGateway.IntegrationTests/LiveLdapFactAttribute.cs, src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs, src/MxGateway.IntegrationTests/LiveMxAccessFactAttribute.cs
Status Open

Description: Three near-identical fact attributes each re-implement the same "compare env var to 1 with StringComparison.Ordinal, set Skip otherwise" pattern. LiveMxAccessFactAttribute delegates to IntegrationTestEnvironment while the other two inline the logic, so the project has two divergent styles for the same concern.

Recommendation: Extract a shared helper (e.g. IntegrationTestEnvironment.IsEnabled(string variableName)) and have all three attributes call it.

Resolution: (open)

IntegrationTests-009

Field Value
Severity Low
Category Documentation & comments
Location src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:372-375
Status Open

Description: TestServerCallContext is XML-documented as a "Mock server call context," but it is a hand-written stub/fake with no mocking framework and no verification behavior. Per the style guides (accurate naming; explain why not what), calling it a mock misleads readers who may expect verifiable interactions.

Recommendation: Reword the summary to "test stub" / "minimal ServerCallContext implementation for in-process gRPC calls."

Resolution: (open)

IntegrationTests-010

Field Value
Severity Low
Category Correctness & logic bugs
Location src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:366-369
Status Open

Description: WaitForFirstMessageAsync accepts only a timeout and never observes a CancellationToken. There is no per-test cancellation propagation, so if the gateway/worker hangs without writing an event the test relies solely on the 15s WaitAsync timeout and gives no contextual diagnostics. Combined with IntegrationTests-004, a hung live worker produces a bare TimeoutException.

Recommendation: Accept a CancellationToken (linked to TestServerCallContext's token), pass it to firstMessage.Task.WaitAsync(timeout, token), and on timeout emit the recorded Messages count via output.WriteLine before throwing.

Resolution: (open)