# Gateway Testing

Gateway tests run without installed MXAccess by using fake workers, fake transports, and in-process gRPC service fakes. Live MXAccess verification belongs in opt-in integration tests because it depends on installed COM components and provider state.

## Fake Worker Harness

`FakeWorkerHarness` in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/Fakes/` provides an in-process worker side for named-pipe IPC tests. It uses the same `WorkerFrameReader`, `WorkerFrameWriter`, and `WorkerEnvelope` contract as the gateway so tests exercise real frame validation and worker-client state changes.

Use the harness when a gateway or session test needs worker behavior without starting `ZB.MOM.WW.MxGateway.Worker.exe` or loading MXAccess COM. The harness scripts:

- `WorkerHello` and `WorkerReady` startup,
- command replies with matching correlation ids,
- ordered `WorkerEvent` frames,
- `WorkerHeartbeat` frames,
- `WorkerFault` frames,
- shutdown acknowledgements,
- malformed protobuf payloads and oversized frame headers,
- slow or hung workers by withholding a reply.

Session-level tests can connect the harness to the pipe created by `SessionWorkerClientFactory` with `ConnectToGatewayPipeAsync`. Lower-level `WorkerClient` tests can use `CreateConnectedPairAsync` to create both pipe ends inside the test.

`GatewayEndToEndFakeWorkerSmokeTests` composes the real gRPC service, `SessionManager`, `SessionWorkerClientFactory`, `WorkerClient`, and `EventStreamService` with a scripted fake worker launcher. The smoke test covers `OpenSession`, `Register`, `AddItem`, `Advise`, one streamed `OnDataChange` event, and `CloseSession` without loading MXAccess COM.

## Live MXAccess Smoke

`WorkerLiveMxAccessSmokeTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/` composes the real gRPC service, `SessionManager`, `SessionWorkerClientFactory`, `WorkerClient`, `WorkerProcessLauncher`, and `ZB.MOM.WW.MxGateway.Worker.exe`.
It is skipped unless `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` is set because it creates the installed MXAccess COM object and depends on live provider state.

The live smoke opens a gateway session, launches the x86 worker, runs `Register`, `AddItem`, and `Advise`, waits a bounded time for the first `OnDataChange` event (skipping any earlier bootstrap/registration-state event), and closes the session in a `finally` block so the worker gets a graceful shutdown request even when a command or event assertion fails. Cleanup failures in that `finally` block are logged rather than thrown, so a real assertion failure is never masked by a shutdown timeout.

`WorkerLiveMxAccessSmokeTests` additionally covers five MXAccess parity paths the fake-worker tests cannot validate:

- a `Write` round-trip against an advised item, asserting both that the reply is `Ok` / `MxCommandKind.Write` *and* that the worker emits a matching `OnWriteComplete` event for the targeted (server, item) handle pair (the same round-trip proof used by `scripts/run-client-e2e-tests.ps1`),
- an `AddItem` against an invalid server handle, asserting the MXAccess failure surfaces in the command reply without faulting the gateway transport,
- the `UnAdvise` → `RemoveItem` → `Unregister` teardown chain, asserting each step replies `Ok` with the matching `MxCommandKind`, that no further `OnDataChange` events arrive for the un-advised pair, and that a second `RemoveItem` against the freed handle relays a non-`Ok` MXAccess failure,
- a `WriteSecured` round-trip after `AuthenticateUser`, asserting the reply carries `MxCommandKind.WriteSecured` and the credential password never appears in the diagnostic message (parity for both the secured-write ordering rule and the "do not log secrets" contract), and
- an abnormal worker exit (the worker process is killed mid-session) where the gateway must transition the session to `SessionState.Faulted` with a non-empty fault description carrying a known worker-client classification (pipe disconnected / worker faulted / end-of-stream / heartbeat expired).

All six tests are gated by the same `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` opt-in variable.

Build the worker before running the smoke:

```bash
dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86
```

Run the smoke explicitly:

```powershell
$env:MXGATEWAY_RUN_LIVE_MXACCESS_TESTS = "1"
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~WorkerLiveMxAccessSmokeTests
```

Optional live smoke variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` | First existing `ZB.MOM.WW.MxGateway.Worker.exe` under `src/ZB.MOM.WW.MxGateway.Worker/bin/...` | Worker executable path. Set this when running against a packaged worker or a non-default build output. |
| `MXGATEWAY_LIVE_MXACCESS_ITEM` | `TestChildObject.TestInt` | MXAccess item reference used by `AddItem`. |
| `MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME` | `ZB.MOM.WW.MxGateway.IntegrationTests` | Client name passed to `Register`. |
| `MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS` | `15` | Maximum wait for the first `OnDataChange` (also used for the `OnWriteComplete` round-trip and the abnormal-exit fault transition). |
| `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_USER` | `admin` | ArchestrA user name passed to `AuthenticateUser` before the `WriteSecured` parity step. |
| `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_PASSWORD` | `admin123` | Password paired with the user above. Never logged; the test asserts the value does not appear in the WriteSecured diagnostic message. |

The test output includes session id, worker process id, command status, HRESULT/status diagnostics, event sequence and handles, close status, and worker stdout/stderr lines emitted during the run.
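The opt-in gate and the defaults in the table above can be pictured with a small Python sketch. It is illustrative only: the real tests are .NET, and `LiveSmokeSettings` and `load_live_smoke_settings` are hypothetical names, not types from the test project. It also assumes that only the exact value `1` enables the run and that an unset worker path means "search the build output"; the variable names and default values are the ones documented above.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class LiveSmokeSettings:
    """Hypothetical container for the documented live smoke variables."""
    enabled: bool
    worker_exe: Optional[str]  # None: assume the caller searches the build output
    item: str
    client_name: str
    event_timeout_seconds: int
    write_secured_user: str
    # Excluded from repr() so the password cannot leak through logging.
    write_secured_password: str = field(repr=False)


def load_live_smoke_settings(env: Mapping[str, str] = os.environ) -> LiveSmokeSettings:
    """Resolve the opt-in flag and optional variables, applying the documented defaults."""
    return LiveSmokeSettings(
        # Assumption: only the exact value "1" opts in.
        enabled=env.get("MXGATEWAY_RUN_LIVE_MXACCESS_TESTS") == "1",
        worker_exe=env.get("MXGATEWAY_LIVE_MXACCESS_WORKER_EXE"),
        item=env.get("MXGATEWAY_LIVE_MXACCESS_ITEM", "TestChildObject.TestInt"),
        client_name=env.get(
            "MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME",
            "ZB.MOM.WW.MxGateway.IntegrationTests",
        ),
        event_timeout_seconds=int(
            env.get("MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS", "15")
        ),
        write_secured_user=env.get("MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_USER", "admin"),
        write_secured_password=env.get(
            "MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_PASSWORD", "admin123"
        ),
    )
```

Passing a plain dictionary instead of `os.environ` makes the resolution easy to check without touching the process environment.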
## Dev-rig Probes

`src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/` partitions runtime probes from the regular Worker.Tests regression suite. The folder has its own `ZB.MOM.WW.MxGateway.Worker.Tests.Probes` namespace so a discovery filter (e.g. `dotnet test --filter FullyQualifiedName~ZB.MOM.WW.MxGateway.Worker.Tests.Probes`) can target or exclude them without enumerating individual class names. The probes are `[Fact(Skip = "...")]` by default and exist to characterize live AVEVA behavior on the dev rig, not to gate CI. Flip `Skip = null` on the dev box with installed MXAccess and a running Galaxy provider when running them:

- `AlarmsLiveSmokeTests`: end-to-end smoke for the alarms-over-gateway pipeline (`WnWrapAlarmConsumer` + `AlarmDispatcher` + `MxAccessAlarmEventSink`) against `\\\Galaxy!DEV` with the dev rig's 10-second flip script writing `TestMachine_001.TestAlarm001`.
- `AlarmClientWmProbeTests`: registers as an `AlarmClient` consumer on a real hidden message-only window and logs every Win32 message that arrives during a fixed pump window. Used to identify the `WM_APP` / `RegisterWindowMessage` IDs alarm callbacks use.
- `WnWrapConsumerProbeTests`: instantiates AVEVA's standalone `wnwrapConsumer` COM class, subscribes to the dev rig's `\\\Galaxy!DEV` provider, and polls `GetXmlCurrentAlarms2`. The XML payload bypasses the `FILETIME→DateTime` auto-marshaling that crashes `aaAlarmManagedClient.AlarmClient.GetHighPriAlarm` on this rig.

The probes share the Worker.Tests project (so they can use its `net48`/`x86` configuration and the installed `ArchestrA.MxAccess` / `aaAlarmManagedClient` references), but they are not part of the regression contract: a Worker.Tests run with `Skip` left in place reports them as skipped.

## Live Galaxy Repository

`GalaxyRepositoryLiveTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/` exercises `GalaxyRepository` directly against the `ZB` Galaxy Repository SQL database.
It is skipped unless `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1` is set because it depends on a reachable SQL Server instance and deployed Galaxy state; fake-worker tests cannot cover the SQL browse RPCs.

The suite covers `TestConnectionAsync`, `GetLastDeployTimeAsync`, `GetHierarchyAsync`, and `GetAttributesAsync`. `GetHierarchyAsync` and `GetAttributesAsync` assert a non-empty result, so the connected `ZB` database must contain a deployed Galaxy, not just an empty schema.

Run the Galaxy live tests explicitly:

```powershell
$env:MXGATEWAY_RUN_LIVE_GALAXY_TESTS = "1"
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~GalaxyRepositoryLiveTests
```

Optional live Galaxy variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `MXGATEWAY_LIVE_GALAXY_CONN` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | Galaxy Repository connection string. Set this when the `ZB` database is on a non-default instance or needs SQL authentication. |

The default connection string targets `ZB` on `localhost` with Windows authentication, which matches the Galaxy Repository conventions in CLAUDE.md.

## Galaxy Filter Safety

`GalaxyFilterInputSafetyTests` in `src/ZB.MOM.WW.MxGateway.Tests/Galaxy/` covers adversarial input handling for the Galaxy Repository browse filter layer. It runs in the unit-test project (no live SQL needed) and complements the live SQL coverage in `GalaxyRepositoryLiveTests`.

The test class re-frames the original "Galaxy SQL injection" concern (Tests-002 in `code-reviews/Tests/findings.md`). `GalaxyRepository` issues only four *constant* SQL statements (`HierarchySql`, `AttributesSql`, `SELECT 1`, `SELECT time_of_last_deploy FROM galaxy`). No `DiscoverHierarchyRequest` field is ever concatenated into a SQL string, so there is no dynamic SQL surface and no `LIKE`-escaping helper to test.
All filters (`TagNameGlob`, `RootTagName`, template-chain, category, contained-path) are applied **in memory** by `GalaxyHierarchyProjector` / `GalaxyGlobMatcher` against the cached snapshot.

The adversarial-input matrix (`'`, `' OR '1'='1`, `'; DROP TABLE gobject;--`, `%`, `_`, `100%_off`, `[abc]`, `Pump'001`) pins the following invariants:

- SQL metacharacters (`'`, `;`) and `LIKE`-wildcards (`%`, `_`) are treated as opaque literals by `GalaxyGlobMatcher`: they never act as wildcards and never spuriously match unrelated text.
- Only `*` and `?` are glob wildcards.
- `GalaxyGlobMatcher` applies a 100 ms regex timeout so a pathological glob (e.g. 5 000 `a` characters plus a literal `!`) completes promptly rather than catastrophically backtracking.
- `GalaxyHierarchyProjector` returns zero matches (rather than the whole hierarchy) for an adversarial `TagNameGlob` or `TemplateChainContains`, and surfaces `NotFound` for an adversarial `RootTagName`.
- The `DiscoverHierarchy` RPC end-to-end returns zero matches for adversarial `TagNameGlob` rather than faulting.

These invariants are the real security surface of the Galaxy browse path; the SQL-injection framing does not apply to a constant-query layer.

## Live LDAP

`DashboardLdapLiveTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/` exercises `DashboardAuthenticator` against the live GLAuth directory. It is skipped unless `MXGATEWAY_RUN_LIVE_LDAP_TESTS=1` is set because it binds against the GLAuth service described in `glauth.md`.

The suite builds the authenticator with a default `GatewayOptions`, so `LdapOptions.RequiredGroup` keeps its `GwAdmin` default. `GwAdmin` is the gateway-specific dashboard-admin role and is **not** part of the five baseline GLAuth role groups; it must be provisioned before the LDAP live tests pass. `AuthenticateAsync_AdminInGwAdminGroup_Succeeds` fails (rather than skips) when GLAuth has only the baseline groups, so this is a hard prerequisite beyond "LDAP is up."
See the "Adding a gw-specific group" section of `glauth.md` for the provisioning step that adds `GwAdmin` and grants it to `admin`.

The suite covers both the success path and the `DashboardAuthenticator` failure branches: `admin` in `GwAdmin` succeeds; `readonly` is denied for missing group; `admin` with a wrong password is rejected by the candidate bind without leaking the password into `FailureMessage`; an unknown username yields no candidate; and an unreachable LDAP server is absorbed into a failed result rather than throwing.

Run the LDAP live tests explicitly:

```powershell
$env:MXGATEWAY_RUN_LIVE_LDAP_TESTS = "1"
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~DashboardLdapLiveTests
```

## Client E2E Scripts

`scripts/discover-testmachine-tags.ps1` queries the ZB Galaxy Repository for the deployed runtime references used by the live client e2e scripts. It reads `TestMachine_001` through `TestMachine_020` and the expected attributes:

- `ProtectedValue`
- `TestChangingInt`
- `TestBoolArray`
- `TestIntArray`
- `TestDateTimeArray`
- `TestStringArray`

The discovery output includes the exact `fullTagReference`, data type, array dimension, and security classification. The array attributes are expected to be dimension 50. `ProtectedValue` has security classification 2 and requires secured write semantics; the current client CLI e2e runner subscribes to it but does not attempt a normal `Write`.

Run discovery directly when validating the Galaxy Repository inputs:

```powershell
powershell -ExecutionPolicy Bypass -File scripts/discover-testmachine-tags.ps1 -Json
```

`scripts/run-client-e2e-tests.ps1` drives the .NET, Go, Rust, Python, and Java client CLIs through a live gateway session. The gateway and worker are assumed to be already running at `-Endpoint`; the script does not start or stop them.
For each client it runs these phases, then closes the session in a `finally` path and writes a JSON report under `artifacts/e2e/`:

1. **Session + register**: opens one session and registers.
2. **Bulk**: verifies `SubscribeBulk` / `UnsubscribeBulk` on a bounded tag subset (skip with `-SkipBulk`).
3. **Add-item / advise**: adds and advises every discovered test tag. The loop has no `StreamEvents` consumer attached, so advised tags accumulate MXAccess change events in the worker event channel (`MxGateway:Events:QueueCapacity`); left undrained, the channel overflows under `FailFast` backpressure and faults the worker. Every `-DrainEveryTags` advised tags (default 15) the loop connects a short-lived `StreamEvents` drain so the gateway pumps that channel empty. `-DrainEveryTags 0` disables the drain.
4. **Stream**: asserts a bounded event stream delivers at least one event (skip with `-SkipStream`).
5. **Parity**: asserts MXAccess error paths are rejected rather than silently succeeding: an invalid item handle and an unknown session id (skip with `-SkipParity`).
6. **Auth rejection**: asserts `open-session` is rejected when the API key is missing, and (when `-RejectScopeApiKeyEnv` names an insufficient-scope key) when the key lacks the required scope. Skip with `-SkipAuth`.
7. **Write round-trip**: *opt-in (`-VerifyWrite`).* Runs right after `register`: adds and advises a configurable writable attribute (`-WriteAttribute`, default `TestChangingInt`), writes a per-client sentinel value, then streams events and asserts an `OnWriteComplete` event for that item is observed, which proves the write round-tripped through the gateway, worker, and MXAccess provider. The written value being echoed back in an `OnDataChange` is recorded best-effort (`echoObserved`): a provider-driven attribute such as `TestChangingInt` accepts the write but immediately overwrites it, so no data-change carries the value back.
   The Rust `stream-events` CLI emits full per-event JSON (`family`, `itemHandle`, `value`) so all five clients apply the same checks. It is opt-in because it mutates live tag state. The phase fails fast if the write command is rejected, e.g. against a gateway whose worker predates write support (`MxAccessCommandExecutor` returning `InvalidRequest` for `Write`/`Write2`/`WriteSecured`/`WriteSecured2`).
8. **Alarm feed + acknowledge**: *opt-in (`-VerifyAlarms`).* Runs after the stream phase. Exercises the two session-less alarm subcommands against the gateway's central alarm monitor: `stream-alarms` reads a bounded slice of the feed (`-AlarmStreamMax`, default 1; the feed's first message always arrives immediately, whereas later ones depend on live transitions) and asserts at least one `AlarmFeedMessage`; `acknowledge-alarm` acknowledges `-AlarmReference` (default `Galaxy!TestArea.TestMachine_001.TestAlarm001`) and asserts the RPC round-trips. The native ack outcome is not asserted because it depends on whether that alarm is currently active. It is opt-in because it depends on the gateway's central alarm monitor being enabled (`MxGateway:Alarms:Enabled`) and a live alarm provider.

Each client CLI is driven through one long-lived `batch` process. Every CLI exposes a `batch` subcommand: a process that reads one command line from stdin, runs it through the normal subcommand dispatch, writes the JSON result, then a line containing exactly `__MXGW_BATCH_EOR__`. The harness launches one such process per client and pipes the ~250 operations of the flow through it, so the process cold-start (and, for the JVM, the runtime cold-start) is paid once per client instead of once per operation. A command that fails inside the batch process writes its `{"error":...}` envelope and the loop continues; the harness treats that envelope as the operation failure (used by the parity and auth phases).
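The batch framing described above can be sketched in a few lines of Python. This is a minimal illustration of the stdin/stdout protocol, not the code of any of the five CLIs or of the PowerShell harness: `run_batch`, `read_results`, and `demo_dispatch` are hypothetical names, the `sessionId` result is invented for the demo, skipping blank input lines is an assumption, and so is the content of the `error` value. Only the end-of-record line and the `{"error":...}` envelope shape come from the description.

```python
import io
import json
from typing import Callable, Iterator, List, TextIO

# End-of-record marker written after every result, as documented above.
EOR = "__MXGW_BATCH_EOR__"


def run_batch(stdin: TextIO, stdout: TextIO, dispatch: Callable[[str], dict]) -> None:
    """CLI side: read one command per line, write its JSON result, then the EOR line."""
    for line in stdin:
        command = line.strip()
        if not command:
            continue  # assumption: blank lines are ignored
        try:
            result = dispatch(command)
        except Exception as exc:
            # A failing command reports an error envelope; the loop keeps running.
            result = {"error": str(exc)}
        stdout.write(json.dumps(result) + "\n")
        stdout.write(EOR + "\n")
        stdout.flush()


def read_results(stdout: TextIO) -> Iterator[dict]:
    """Harness side: gather lines up to each EOR line and parse them as one JSON document."""
    buffer: List[str] = []
    for line in stdout:
        if line.rstrip("\r\n") == EOR:
            yield json.loads("".join(buffer))
            buffer = []
        else:
            buffer.append(line)


def demo_dispatch(command: str) -> dict:
    """Hypothetical stand-in for the normal subcommand dispatch."""
    if command == "open-session":
        return {"sessionId": "demo"}
    raise ValueError(f"unknown command: {command}")
```

Feeding `run_batch` two lines, `open-session` and an unknown command, yields one success object and one error envelope from `read_results`, which is the behavior the parity and auth phases rely on.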
Before the per-client phases run, the script builds the .NET CLI (`dotnet build`) and installs the Java CLI (`gradle :mxgateway-cli:installDist`) once, so the `batch` process launches straight from the compiled exe / the installed launcher. The Go, Rust, and Python batch processes are launched via `go run` / `cargo run` / `python -m`, which compile-or-start once when that single per-client process starts.

Build the gateway and worker, start the gateway, and provide a valid API key before running the client e2e script:

```powershell
$env:MXGATEWAY_API_KEY = ""
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1
```

Useful runner options:

```powershell
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Clients dotnet,python -MachineStart 1 -MachineEnd 2
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -BulkTagCount 10
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -SkipStream
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -SkipBulk
# Write round-trip (opt-in): point at a writable scalar attribute and its
# value type.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -VerifyWrite -WriteAttribute TestChangingInt -WriteType int32
# Alarm feed + acknowledge (opt-in): needs MxGateway:Alarms:Enabled on the gateway.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -VerifyAlarms -AlarmReference "Galaxy!TestArea.TestMachine_001.TestAlarm001"
# Auth rejection: also assert an insufficient-scope key is denied.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -RejectScopeApiKeyEnv MXGATEWAY_READONLY_API_KEY
# Run all five clients concurrently as isolated child processes.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Parallel
# Validate the flow offline (prints commands, contacts no gateway).
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -DryRun
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Endpoint localhost:5000 -ApiKeyEnv MXGATEWAY_API_KEY
```

When `-VerifyWrite` is enabled, the write round-trip fails loudly if the write command is rejected, if `-WriteAttribute` does not name a writable scalar attribute, or if no `OnWriteComplete` event is observed for the written item within `-WriteEchoMaxEvents` (default 200) streamed events. Raise `-WriteEchoMaxEvents` if the gateway's per-session event backlog is large enough to push `OnWriteComplete` past that bound.

## Focused Commands

Run the cross-language smoke matrix tests after changing the documented client smoke command list:

```bash
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests
```

Run the parity fixture matrix tests after changing the integration parity scenario list:

```bash
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests
```

Run the fake worker tests after changing gateway worker IPC, session startup, or event streaming behavior:

```bash
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~FakeWorkerHarnessTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~SessionWorkerClientFactoryFakeWorkerTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~GatewayEndToEndFakeWorkerSmokeTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~WorkerClientTests
dotnet test src/ZB.MOM.WW.MxGateway.Worker.Tests/ZB.MOM.WW.MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter FullyQualifiedName~WorkerPipeSessionTests
```

Run the gateway test project after shared gateway test
infrastructure changes:

```bash
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj
```

## Related Documentation

- [Cross-Language Smoke Matrix](./CrossLanguageSmokeMatrix.md)
- [Parity Fixture Matrix](./ParityFixtureMatrix.md)
- [Gateway Process Design](./GatewayProcessDesign.md)
- [Worker Frame Protocol](./WorkerFrameProtocol.md)
- [MXAccess Worker Instance Detailed Design](./MxAccessWorkerInstanceDesign.md)