Gateway Testing
Gateway tests run without installed MXAccess by using fake workers, fake transports, and in-process gRPC service fakes. Live MXAccess verification belongs in opt-in integration tests because it depends on installed COM components and provider state.
Fake Worker Harness
FakeWorkerHarness in src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/Fakes/ provides an
in-process worker side for named-pipe IPC tests. It uses the same
WorkerFrameReader, WorkerFrameWriter, and WorkerEnvelope contract as the
gateway so tests exercise real frame validation and worker-client state changes.
Use the harness when a gateway or session test needs worker behavior without
starting ZB.MOM.WW.MxGateway.Worker.exe or loading MXAccess COM. The harness scripts:
- WorkerHello and WorkerReady startup,
- command replies with matching correlation ids,
- ordered WorkerEvent frames,
- WorkerHeartbeat frames,
- WorkerFault frames,
- shutdown acknowledgements,
- malformed protobuf payloads and oversized frame headers,
- slow or hung workers by withholding a reply.
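The malformed-payload and oversized-header cases in the list above are easiest to picture with a length-prefixed frame reader. The sketch below is not the WorkerFrameReader implementation. It assumes a hypothetical 4-byte little-endian length prefix and a hypothetical 1 MiB cap, purely to illustrate why an oversized header should be rejected before any payload is read.

```python
import io
import struct

MAX_FRAME_BYTES = 1024 * 1024  # hypothetical cap, not the gateway's real limit


class FrameError(Exception):
    """Raised when a frame header or payload is invalid."""


def read_frame(stream):
    """Read one frame; assumes a 4-byte little-endian length prefix."""
    header = stream.read(4)
    if len(header) != 4:
        raise FrameError("truncated header")
    (length,) = struct.unpack("<I", header)
    if length > MAX_FRAME_BYTES:
        # Reject before allocating or reading the payload.
        raise FrameError(f"frame of {length} bytes exceeds {MAX_FRAME_BYTES}")
    payload = stream.read(length)
    if len(payload) != length:
        raise FrameError("truncated payload")
    return payload


def write_frame(payload):
    """Encode a payload with the same assumed length prefix."""
    return struct.pack("<I", len(payload)) + payload


if __name__ == "__main__":
    # A scripted "fake worker" writes one good frame into an in-memory pipe.
    pipe = io.BytesIO(write_frame(b"hello"))
    print(read_frame(pipe))
```

A fake worker built this way can script the oversized case by writing only a header whose length exceeds the cap, which is enough to drive the reader into its rejection path.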
Session-level tests can connect the harness to the pipe created by
SessionWorkerClientFactory with ConnectToGatewayPipeAsync. Lower-level
WorkerClient tests can use CreateConnectedPairAsync to create both pipe ends
inside the test.
GatewayEndToEndFakeWorkerSmokeTests composes the real gRPC service,
SessionManager, SessionWorkerClientFactory, WorkerClient, and
EventStreamService with a scripted fake worker launcher. The smoke test covers
OpenSession, Register, AddItem, Advise, one streamed OnDataChange
event, and CloseSession without loading MXAccess COM.
Live MXAccess Smoke
WorkerLiveMxAccessSmokeTests in src/ZB.MOM.WW.MxGateway.IntegrationTests/ composes the
real gRPC service, SessionManager, SessionWorkerClientFactory,
WorkerClient, WorkerProcessLauncher, and ZB.MOM.WW.MxGateway.Worker.exe. It is
skipped unless MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1 is set because it creates
the installed MXAccess COM object and depends on live provider state.
The live smoke opens a gateway session, launches the x86 worker, runs
Register, AddItem, and Advise, waits a bounded time for the first
OnDataChange event (skipping any earlier bootstrap/registration-state event),
and closes the session in a finally block so the worker gets a graceful
shutdown request even when a command or event assertion fails. Cleanup failures
in that finally block are logged rather than thrown, so a real assertion
failure is never masked by a shutdown timeout.
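The cleanup rule described above can be sketched in a few lines. This is an illustration of the pattern only, written in Python rather than the test project's own language; the helper name run_with_guarded_cleanup is hypothetical.

```python
import logging

logger = logging.getLogger("live_smoke_sketch")


def run_with_guarded_cleanup(body, cleanup):
    """Run body, then cleanup; cleanup errors are logged, never raised."""
    try:
        body()
    finally:
        try:
            cleanup()
        except Exception:
            # Swallow the cleanup error so that an assertion failure raised
            # by body (if any) is the exception the caller sees.
            logger.exception("cleanup failed")
```

If body raises and cleanup also raises, the caller still sees the exception from body; the cleanup error only reaches the log.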
WorkerLiveMxAccessSmokeTests additionally covers five MXAccess parity paths the
fake-worker tests cannot validate:
- a Write round-trip against an advised item, asserting both that the reply is Ok/MxCommandKind.Write and that the worker emits a matching OnWriteComplete event for the targeted (server, item) handle pair (the same round-trip proof used by scripts/run-client-e2e-tests.ps1),
- an AddItem against an invalid server handle, asserting the MXAccess failure surfaces in the command reply without faulting the gateway transport,
- the UnAdvise → RemoveItem → Unregister teardown chain, asserting each step replies Ok with the matching MxCommandKind, that no further OnDataChange events arrive for the un-advised pair, and that a second RemoveItem against the freed handle relays a non-Ok MXAccess failure,
- a WriteSecured round-trip after AuthenticateUser, asserting the reply carries MxCommandKind.WriteSecured and the credential password never appears in the diagnostic message (parity for both the secured-write ordering rule and the "do not log secrets" contract), and
- an abnormal worker exit (the worker process is killed mid-session) where the gateway must transition the session to SessionState.Faulted with a non-empty fault description carrying a known worker-client classification (pipe disconnected / worker faulted / end-of-stream / heartbeat expired).
All six tests are gated by the same MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1
opt-in variable.
Build the worker before running the smoke:
dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86
Run the smoke explicitly:
$env:MXGATEWAY_RUN_LIVE_MXACCESS_TESTS = "1"
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~WorkerLiveMxAccessSmokeTests
Optional live smoke variables:
| Variable | Default | Description |
|---|---|---|
| MXGATEWAY_LIVE_MXACCESS_WORKER_EXE | First existing ZB.MOM.WW.MxGateway.Worker.exe under src/ZB.MOM.WW.MxGateway.Worker/bin/... | Worker executable path. Set this when running against a packaged worker or a non-default build output. |
| MXGATEWAY_LIVE_MXACCESS_ITEM | TestChildObject.TestInt | MXAccess item reference used by AddItem. |
| MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME | ZB.MOM.WW.MxGateway.IntegrationTests | Client name passed to Register. |
| MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS | 15 | Maximum wait for the first OnDataChange (also used for the OnWriteComplete round-trip and the abnormal-exit fault transition). |
| MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_USER | admin | ArchestrA user name passed to AuthenticateUser before the WriteSecured parity step. |
| MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_PASSWORD | admin123 | Password paired with the user above. Never logged; the test asserts the value does not appear in the WriteSecured diagnostic message. |
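As a rough illustration of how the opt-in gate and the optional variables combine, the following Python sketch resolves a few of the settings from an environment mapping. The helper names are hypothetical, and two behaviors are assumptions rather than documented facts: that only the exact value "1" opts in, and that an empty override falls back to the default.

```python
import os

# Defaults copied from the variable table; the worker path and the
# credentials are left out of this sketch on purpose.
DEFAULTS = {
    "MXGATEWAY_LIVE_MXACCESS_ITEM": "TestChildObject.TestInt",
    "MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME": "ZB.MOM.WW.MxGateway.IntegrationTests",
    "MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS": "15",
}


def live_tests_enabled(environ):
    """Assumption: only the exact value "1" opts in."""
    return environ.get("MXGATEWAY_RUN_LIVE_MXACCESS_TESTS") == "1"


def resolve_setting(name, environ):
    """Return the override when set and non-empty, else the listed default."""
    value = environ.get(name, "").strip()
    return value or DEFAULTS[name]


def event_timeout_seconds(environ):
    """Parse the event timeout as whole seconds."""
    name = "MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS"
    return int(resolve_setting(name, environ))


if __name__ == "__main__":
    if live_tests_enabled(os.environ):
        print("item:", resolve_setting("MXGATEWAY_LIVE_MXACCESS_ITEM", os.environ))
        print("timeout:", event_timeout_seconds(os.environ))
    else:
        print("live MXAccess tests are skipped")
```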
The test output includes session id, worker process id, command status, HRESULT/status diagnostics, event sequence and handles, close status, and worker stdout/stderr lines emitted during the run.
Dev-rig Probes
src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/ partitions runtime probes from the regular
Worker.Tests regression suite. The folder is its own
ZB.MOM.WW.MxGateway.Worker.Tests.Probes namespace so a discovery filter (e.g. dotnet test --filter FullyQualifiedName~ZB.MOM.WW.MxGateway.Worker.Tests.Probes) can target or
exclude them without enumerating individual class names. The probes are
[Fact(Skip = "...")] by default and exist to characterize live AVEVA
behavior on the dev rig, not to gate CI — flip Skip = null on the dev box
with installed MXAccess + a running Galaxy provider when running them:
- AlarmsLiveSmokeTests: end-to-end smoke for the alarms-over-gateway pipeline (WnWrapAlarmConsumer + AlarmDispatcher + MxAccessAlarmEventSink) against \\<machine>\Galaxy!DEV with the dev rig's 10-second flip script writing TestMachine_001.TestAlarm001.
- AlarmClientWmProbeTests: registers as an AlarmClient consumer on a real hidden message-only window and logs every Win32 message that arrives during a fixed pump window. Used to identify the WM_APP / RegisterWindowMessage IDs alarm callbacks use.
- WnWrapConsumerProbeTests: instantiates AVEVA's standalone wnwrapConsumer COM class, subscribes to the dev rig's \\<machine>\Galaxy!DEV provider, and polls GetXmlCurrentAlarms2. The XML payload bypasses the FILETIME → DateTime auto-marshaling that crashes aaAlarmManagedClient.AlarmClient.GetHighPriAlarm on this rig.
The probes share the Worker.Tests project (so they can use its net48/x86
configuration and the installed ArchestrA.MxAccess / aaAlarmManagedClient
references), but they are not part of the regression contract — a Worker.Tests
run with Skip left in place passes them as skipped.
Live Galaxy Repository
GalaxyRepositoryLiveTests in src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/ exercises
GalaxyRepository directly against the ZB Galaxy Repository SQL database. It is
skipped unless MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1 is set because it depends on a
reachable SQL Server instance and deployed Galaxy state — fake-worker tests cannot
cover the SQL browse RPCs.
The suite covers TestConnectionAsync, GetLastDeployTimeAsync,
GetHierarchyAsync, and GetAttributesAsync. GetHierarchyAsync and
GetAttributesAsync assert a non-empty result, so the connected ZB database
must contain a deployed Galaxy, not just an empty schema.
Run the Galaxy live tests explicitly:
$env:MXGATEWAY_RUN_LIVE_GALAXY_TESTS = "1"
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~GalaxyRepositoryLiveTests
Optional live Galaxy variables:
| Variable | Default | Description |
|---|---|---|
| MXGATEWAY_LIVE_GALAXY_CONN | Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False; | Galaxy Repository connection string. Set this when the ZB database is on a non-default instance or needs SQL authentication. |
The default connection string targets ZB on localhost with Windows
authentication, which matches the Galaxy Repository conventions in CLAUDE.md.
Galaxy Filter Safety
GalaxyFilterInputSafetyTests in src/ZB.MOM.WW.MxGateway.Tests/Galaxy/ covers adversarial
input handling for the Galaxy Repository browse filter layer. It runs in the
unit-test project (no live SQL needed) and complements the live SQL coverage in
GalaxyRepositoryLiveTests.
The test class re-frames the original "Galaxy SQL injection" concern (Tests-002 in
code-reviews/Tests/findings.md). GalaxyRepository issues only four constant
SQL statements (HierarchySql, AttributesSql, SELECT 1,
SELECT time_of_last_deploy FROM galaxy) — no DiscoverHierarchyRequest field
is ever concatenated into a SQL string, so there is no dynamic SQL surface and no
LIKE-escaping helper to test. All filters (TagNameGlob, RootTagName,
template-chain, category, contained-path) are applied in memory by
GalaxyHierarchyProjector / GalaxyGlobMatcher against the cached snapshot.
The adversarial-input matrix (', ' OR '1'='1, '; DROP TABLE gobject;--,
%, _, 100%_off, [abc], Pump'001) pins the following invariants:
- SQL metacharacters (', ;) and LIKE-wildcards (%, _) are treated as opaque literals by GalaxyGlobMatcher: they never act as wildcards and never spuriously match unrelated text.
- Only * and ? are glob wildcards.
- GalaxyGlobMatcher applies a 100 ms regex timeout so a pathological glob (e.g. 5 000 a characters plus a literal !) completes promptly rather than catastrophically backtracking.
- GalaxyHierarchyProjector returns zero matches (rather than the whole hierarchy) for an adversarial TagNameGlob or TemplateChainContains, and surfaces NotFound for an adversarial RootTagName.
- The DiscoverHierarchy RPC end-to-end returns zero matches for adversarial TagNameGlob rather than faulting.
These invariants are the real security surface of the Galaxy browse path; the SQL-injection framing does not apply to a constant-query layer.
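The "only * and ? are wildcards" invariant can be illustrated with a small glob-to-regex translation. This Python sketch is not GalaxyGlobMatcher. It assumes case-sensitive, whole-string matching, and it has no equivalent of the 100 ms timeout because Python's re module does not offer one.

```python
import re


def glob_to_regex(glob):
    """Translate a glob where only * and ? are wildcards; all else is literal."""
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Escaping turns %, _, [, ], ' and ; into plain literals.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def glob_match(glob, text):
    """Whole-string match; case sensitivity here is an assumption."""
    return glob_to_regex(glob).fullmatch(text) is not None


if __name__ == "__main__":
    for pattern in ["Pump*", "%", "[abc]", "' OR '1'='1"]:
        print(repr(pattern), glob_match(pattern, "Pump001"))
```

Because every character other than * and ? is escaped, an input such as % or [abc] can only match a tag name that contains those exact characters.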
Live LDAP
DashboardLdapLiveTests in src/ZB.MOM.WW.MxGateway.IntegrationTests/ exercises
DashboardAuthenticator against the live GLAuth directory. It is skipped unless
MXGATEWAY_RUN_LIVE_LDAP_TESTS=1 is set because it binds against the GLAuth
service described in glauth.md.
The suite builds the authenticator with GatewayOptions.Dashboard.GroupToRole
set to { GwAdmin: Admin }. GwAdmin is the gateway-specific
dashboard-admin role and is not part of the five baseline GLAuth role
groups — it must be provisioned before the LDAP live tests pass.
AuthenticateAsync_AdminInGwAdminGroup_Succeeds fails (rather than skips)
when GLAuth has only the baseline groups, so this is a hard prerequisite
beyond "LDAP is up." See the "Adding a gw-specific group" section of
glauth.md for the provisioning step that adds GwAdmin and grants it to
admin.
DashboardAuthenticator delegates the LDAP bind and group search to the shared
ZB.MOM.WW.Auth.Ldap provider (LdapAuthService) and only maps the resulting
groups to dashboard roles via DashboardGroupRoleMapper; the bind/search
mechanics that decide each outcome live in that shared provider, not in
DashboardAuthenticator.
The suite covers both the success path and the failure outcomes: admin whose
LDAP groups resolve to the Admin role succeeds and emits the role claim;
readonly is denied because no group in their memberOf appears in
GroupToRole; admin with a wrong password fails authentication without leaking
the password into FailureMessage; an unknown username fails authentication; and
an unreachable LDAP server is absorbed into a failed result rather than throwing.
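The group-to-role step behind the first two outcomes can be sketched as a dictionary lookup. This is an illustration only, not DashboardGroupRoleMapper: the function names, the result shape, the exact-match comparison, and the ReadOnly group name are all assumptions.

```python
def map_groups_to_roles(groups, group_to_role):
    """Return the sorted, de-duplicated roles for groups that have a mapping."""
    return sorted({group_to_role[g] for g in groups if g in group_to_role})


def authorize(groups, group_to_role):
    """Deny when no group maps to a role, mirroring the readonly case."""
    roles = map_groups_to_roles(groups, group_to_role)
    return {"succeeded": bool(roles), "roles": roles}


if __name__ == "__main__":
    mapping = {"GwAdmin": "Admin"}  # the GroupToRole value used by the suite
    print(authorize(["GwAdmin"], mapping))   # admin-like user
    print(authorize(["ReadOnly"], mapping))  # hypothetical unmapped group
```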
Run the LDAP live tests explicitly:
$env:MXGATEWAY_RUN_LIVE_LDAP_TESTS = "1"
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~DashboardLdapLiveTests
Client E2E Scripts
scripts/discover-testmachine-tags.ps1 queries the ZB Galaxy Repository for the
deployed runtime references used by the live client e2e scripts. It reads
TestMachine_001 through TestMachine_020 and the expected attributes:
- ProtectedValue
- TestChangingInt
- TestBoolArray
- TestIntArray
- TestDateTimeArray
- TestStringArray
The discovery output includes the exact fullTagReference, data type, array
dimension, and security classification. The array attributes are expected to be
dimension 50. ProtectedValue has security classification 2 and requires
secured write semantics; the current client CLI e2e runner subscribes to it but
does not attempt a normal Write.
Run discovery directly when validating the Galaxy Repository inputs:
powershell -ExecutionPolicy Bypass -File scripts/discover-testmachine-tags.ps1 -Json
scripts/run-client-e2e-tests.ps1 drives the .NET, Go, Rust, Python, and Java
client CLIs through a live gateway session. The gateway and worker are assumed
to be already running at -Endpoint; the script does not start or stop them.
For each client it runs these phases, then closes the session in a finally
path and writes a JSON report under artifacts/e2e/:
- Session + register: opens one session and registers.
- Bulk: verifies SubscribeBulk / UnsubscribeBulk on a bounded tag subset (skip with -SkipBulk).
- Add-item / advise: adds and advises every discovered test tag. The loop has no StreamEvents consumer attached, so advised tags accumulate MXAccess change events in the worker event channel (MxGateway:Events:QueueCapacity); left unbounded it overflows under FailFast backpressure and faults the worker. Every -DrainEveryTags advised tags (default 15) the loop connects a short-lived StreamEvents drain so the gateway pumps that channel empty. -DrainEveryTags 0 disables the drain.
- Stream: asserts a bounded event stream delivers at least one event (skip with -SkipStream).
- Parity: asserts MXAccess error paths are rejected rather than silently succeeding: an invalid item handle and an unknown session id (skip with -SkipParity).
- Auth rejection: asserts open-session is rejected when the API key is missing, and (when -RejectScopeApiKeyEnv names an insufficient-scope key) when the key lacks the required scope. Skip with -SkipAuth.
- Write round-trip: opt-in (-VerifyWrite). Runs right after register: adds and advises a configurable writable attribute (-WriteAttribute, default TestChangingInt), writes a per-client sentinel value, then streams events and asserts an OnWriteComplete event for that item is observed, which is proof the write round-tripped through the gateway, worker, and MXAccess provider. The written value being echoed back in an OnDataChange is recorded best-effort (echoObserved): a provider-driven attribute such as TestChangingInt accepts the write but immediately overwrites it, so no data-change carries the value back. The Rust stream-events CLI emits full per-event JSON (family, itemHandle, value) so all five clients apply the same checks.

  It is opt-in because it mutates live tag state. The phase fails fast if the write command is rejected, e.g. against a gateway whose worker predates write support (MxAccessCommandExecutor returning InvalidRequest for Write/Write2/WriteSecured/WriteSecured2).
- Alarm feed + acknowledge: opt-in (-VerifyAlarms). Runs after the stream phase. Exercises the two session-less alarm subcommands against the gateway's central alarm monitor: stream-alarms reads a bounded slice of the feed (-AlarmStreamMax, default 1; the feed's first message always arrives immediately, whereas later ones depend on live transitions) and asserts at least one AlarmFeedMessage; acknowledge-alarm acknowledges -AlarmReference (default Galaxy!TestArea.TestMachine_001.TestAlarm001) and asserts the RPC round-trips. The native ack outcome is not asserted because it depends on whether that alarm is currently active.

  It is opt-in because it depends on the gateway's central alarm monitor being enabled (MxGateway:Alarms:Enabled) and a live alarm provider.
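The periodic drain in the add-item / advise phase reduces to a counter check inside the advise loop. The Python sketch below only illustrates that control flow; the advise and drain callables are hypothetical stand-ins for the CLI operations the script actually issues.

```python
def advise_with_periodic_drain(tags, advise, drain, drain_every_tags=15):
    """Advise each tag; drain after every N advised tags (0 disables)."""
    advised = 0
    for tag in tags:
        advise(tag)
        advised += 1
        if drain_every_tags > 0 and advised % drain_every_tags == 0:
            # Stand-in for the short-lived StreamEvents drain.
            drain()
    return advised


if __name__ == "__main__":
    log = []
    advise_with_periodic_drain(
        [f"tag{i}" for i in range(31)],
        advise=lambda tag: log.append(("advise", tag)),
        drain=lambda: log.append(("drain", None)),
    )
    print(sum(1 for kind, _ in log if kind == "drain"), "drains")
```

Draining on a fixed tag count keeps the number of undrained advised tags bounded by the chosen interval, so the interval can be tuned against the configured queue capacity.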
Each client CLI is driven through one long-lived batch process. Every CLI
exposes a batch subcommand: a process that reads one command line from stdin,
runs it through the normal subcommand dispatch, writes the JSON result, then a
line containing exactly __MXGW_BATCH_EOR__. The harness launches one such
process per client and pipes the ~250 operations of the flow through it, so the
process cold-start (and, for the JVM, the runtime cold-start) is paid once per
client instead of once per operation. A command that fails inside the batch process
writes its {"error":...} envelope and the loop continues; the harness treats
that envelope as the operation failure (used by the parity and auth phases).
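A minimal Python sketch of the batch exchange follows. It is not any client's actual implementation: the function names, the dispatch callable, the skipping of blank lines, and the contents of the error envelope beyond its error key are assumptions. Only the end-of-record marker comes from the text above.

```python
import io
import json

EOR = "__MXGW_BATCH_EOR__"


def run_batch(stdin, stdout, dispatch):
    """CLI side: one command per line, one JSON result, then the EOR line."""
    for line in stdin:
        command = line.strip()
        if not command:
            continue  # assumption: blank lines are ignored
        try:
            result = dispatch(command)
        except Exception as exc:
            # A failed command reports an error envelope; the loop continues.
            result = {"error": str(exc)}
        stdout.write(json.dumps(result) + "\n")
        stdout.write(EOR + "\n")
        stdout.flush()


def read_reply(stream):
    """Harness side: collect lines up to the EOR marker and parse the JSON."""
    lines = []
    for line in stream:
        if line.rstrip("\n") == EOR:
            return json.loads("".join(lines))
        lines.append(line)
    raise EOFError("batch process exited before end-of-record")


if __name__ == "__main__":
    out = io.StringIO()
    run_batch(io.StringIO("open-session\n"), out, lambda cmd: {"ok": cmd})
    out.seek(0)
    print(read_reply(out))
```

Framing each reply with a sentinel line lets the harness read multi-line JSON without knowing its length in advance.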
Before the per-client phases run, the script builds the .NET CLI
(dotnet build) and installs the Java CLI (gradle :mxgateway-cli:installDist)
once, so the batch process launches straight from the compiled exe / the
installed launcher. The Go, Rust, and Python batch processes are launched via
go run / cargo run / python -m, which compile-or-start once when that
single per-client process starts.
Build the gateway and worker, start the gateway, and provide a valid API key before running the client e2e script:
$env:MXGATEWAY_API_KEY = "<api-key>"
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1
Useful runner options:
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Clients dotnet,python -MachineStart 1 -MachineEnd 2
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -BulkTagCount 10
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -SkipStream
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -SkipBulk
# Write round-trip (opt-in): point at a writable scalar attribute and its
# value type.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -VerifyWrite -WriteAttribute TestChangingInt -WriteType int32
# Alarm feed + acknowledge (opt-in): needs MxGateway:Alarms:Enabled on the gateway.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -VerifyAlarms -AlarmReference "Galaxy!TestArea.TestMachine_001.TestAlarm001"
# Auth rejection: also assert an insufficient-scope key is denied.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -RejectScopeApiKeyEnv MXGATEWAY_READONLY_API_KEY
# Run all five clients concurrently as isolated child processes.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Parallel
# Validate the flow offline (prints commands, contacts no gateway).
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -DryRun
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Endpoint localhost:5000 -ApiKeyEnv MXGATEWAY_API_KEY
When -VerifyWrite is enabled, the write round-trip fails loudly if the write
command is rejected, if -WriteAttribute does not name a writable scalar
attribute, or if no OnWriteComplete event is observed for the written item
within -WriteEchoMaxEvents (default 200) streamed events. Raise
-WriteEchoMaxEvents if the gateway's per-session event backlog is large
enough to push OnWriteComplete past that bound.
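The bounded scan described above might look like the following Python sketch. The event field names (family, itemHandle, value) come from the per-event JSON mentioned earlier. The function name, the result shape, and the use of OnWriteComplete and OnDataChange as family values are assumptions.

```python
def scan_for_write_complete(events, item_handle, sentinel, max_events=200):
    """Scan at most max_events events for OnWriteComplete on item_handle."""
    write_complete = False
    echo_observed = False
    for index, event in enumerate(events):
        if index >= max_events:
            break
        if event.get("itemHandle") != item_handle:
            continue
        family = event.get("family")
        if family == "OnWriteComplete":
            write_complete = True
        elif family == "OnDataChange" and event.get("value") == sentinel:
            # Best-effort only: a provider-driven attribute may never echo.
            echo_observed = True
    return {"writeComplete": write_complete, "echoObserved": echo_observed}


if __name__ == "__main__":
    sample = [
        {"family": "OnDataChange", "itemHandle": 7, "value": 1},
        {"family": "OnWriteComplete", "itemHandle": 7},
    ]
    print(scan_for_write_complete(sample, item_handle=7, sentinel=42))
```

With a large backlog ahead of the write, OnWriteComplete can fall outside the window, which is the situation raising -WriteEchoMaxEvents addresses.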
Focused Commands
Run the cross-language smoke matrix tests after changing the documented client smoke command list:
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests
Run the parity fixture matrix tests after changing the integration parity scenario list:
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests
Run the fake worker tests after changing gateway worker IPC, session startup, or event streaming behavior:
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~FakeWorkerHarnessTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~SessionWorkerClientFactoryFakeWorkerTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~GatewayEndToEndFakeWorkerSmokeTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~WorkerClientTests
dotnet test src/ZB.MOM.WW.MxGateway.Worker.Tests/ZB.MOM.WW.MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter FullyQualifiedName~WorkerPipeSessionTests
Run the gateway test project after shared gateway test infrastructure changes:
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj