# Gateway Testing

Gateway tests run without installed MXAccess by using fake workers, fake
transports, and in-process gRPC service fakes. Live MXAccess verification belongs
in opt-in integration tests because it depends on installed COM components and
provider state.

## Fake Worker Harness

`FakeWorkerHarness` in `src/MxGateway.Tests/Gateway/Workers/Fakes/` provides an
in-process worker side for named-pipe IPC tests. It uses the same
`WorkerFrameReader`, `WorkerFrameWriter`, and `WorkerEnvelope` contract as the
gateway so tests exercise real frame validation and worker-client state changes.

Use the harness when a gateway or session test needs worker behavior without
starting `MxGateway.Worker.exe` or loading MXAccess COM. The harness scripts:

- `WorkerHello` and `WorkerReady` startup,
- command replies with matching correlation ids,
- ordered `WorkerEvent` frames,
- `WorkerHeartbeat` frames,
- `WorkerFault` frames,
- shutdown acknowledgements,
- malformed protobuf payloads and oversized frame headers,
- slow or hung workers by withholding a reply.

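The scripting idea can be illustrated outside the C# harness. The following is a minimal Python sketch, not the `FakeWorkerHarness` API: `ScriptedWorker`, `Command`, `Reply`, and `wait_for_reply` are hypothetical names. It shows only two of the scripted behaviors listed above: a reply that carries the command's correlation id, and a withheld reply that stands in for a hung worker.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    correlation_id: int
    name: str


@dataclass(frozen=True)
class Reply:
    correlation_id: int
    status: str


class ScriptedWorker:
    """Hypothetical stand-in for a scripted fake worker (not the real harness)."""

    def __init__(self, withhold=()):
        # Command names listed in `withhold` never get a reply, which
        # simulates a slow or hung worker.
        self._withhold = set(withhold)
        self.replies = queue.Queue()

    def handle(self, command: Command) -> None:
        if command.name in self._withhold:
            return
        # The reply carries the same correlation id as the command.
        self.replies.put(Reply(command.correlation_id, "Ok"))


def wait_for_reply(worker: ScriptedWorker, correlation_id: int, timeout: float):
    """Return the matching reply, or None if none arrives within the timeout."""
    try:
        reply = worker.replies.get(timeout=timeout)
    except queue.Empty:
        return None
    return reply if reply.correlation_id == correlation_id else None
```

A test written against such a fake can assert both the happy path and the timeout path without starting a worker process.
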
Session-level tests can connect the harness to the pipe created by
`SessionWorkerClientFactory` with `ConnectToGatewayPipeAsync`. Lower-level
`WorkerClient` tests can use `CreateConnectedPairAsync` to create both pipe ends
inside the test.

`GatewayEndToEndFakeWorkerSmokeTests` composes the real gRPC service,
`SessionManager`, `SessionWorkerClientFactory`, `WorkerClient`, and
`EventStreamService` with a scripted fake worker launcher. The smoke test covers
`OpenSession`, `Register`, `AddItem`, `Advise`, one streamed `OnDataChange`
event, and `CloseSession` without loading MXAccess COM.

## Live MXAccess Smoke

`WorkerLiveMxAccessSmokeTests` in `src/MxGateway.IntegrationTests/` composes the
real gRPC service, `SessionManager`, `SessionWorkerClientFactory`,
`WorkerClient`, `WorkerProcessLauncher`, and `MxGateway.Worker.exe`. It is
skipped unless `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` is set because it creates
the installed MXAccess COM object and depends on live provider state.

The live smoke opens a gateway session, launches the x86 worker, runs
`Register`, `AddItem`, and `Advise`, waits a bounded time for the first
`OnDataChange` event (skipping any earlier bootstrap/registration-state event),
and closes the session in a `finally` block so the worker gets a graceful
shutdown request even when a command or event assertion fails. Cleanup failures
in that `finally` block are logged rather than thrown, so a real assertion
failure is never masked by a shutdown timeout.

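The cleanup rule is easy to get wrong, so here is a minimal Python sketch of the pattern. It is an illustration under assumptions, not the test's C# code: `run_with_graceful_close`, `body`, and `close_session` are hypothetical names.

```python
import logging

logger = logging.getLogger("live_smoke")


def run_with_graceful_close(body, close_session):
    """Run `body`, then always attempt `close_session`.

    A failure inside `close_session` is logged and swallowed so that an
    exception raised by `body` (for example a failed assertion) is the one
    the caller sees.
    """
    try:
        return body()
    finally:
        try:
            close_session()
        except Exception:
            logger.exception("session cleanup failed; continuing")
```

If `body` raises an assertion error and `close_session` then times out, the caller still sees the assertion error; the timeout only appears in the log.
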
`WorkerLiveMxAccessSmokeTests` additionally covers two MXAccess parity paths the
fake-worker tests cannot validate:

- a `Write` round-trip against an advised item, and
- an `AddItem` against an invalid server handle, asserting the MXAccess failure
  surfaces in the command reply without faulting the gateway transport.

All three tests are gated by the same `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1`
opt-in variable.

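For readers more familiar with other test frameworks, the opt-in gate can be pictured with the Python sketch below. It assumes the gate is an exact match on the string `1`; the real C# tests may parse the variable differently, and `live_tests_enabled` and `LiveSmokeSketch` are hypothetical names.

```python
import os
import unittest

LIVE_FLAG = "MXGATEWAY_RUN_LIVE_MXACCESS_TESTS"


def live_tests_enabled(env=None) -> bool:
    """True only when the opt-in variable is exactly "1" (an assumption)."""
    env = os.environ if env is None else env
    return env.get(LIVE_FLAG) == "1"


@unittest.skipUnless(live_tests_enabled(), f"set {LIVE_FLAG}=1 to run")
class LiveSmokeSketch(unittest.TestCase):
    def test_placeholder(self):
        # A real live test would talk to the gateway here.
        self.assertTrue(True)
```
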
Build the worker before running the smoke:

```bash
dotnet build src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86
```

Run the smoke explicitly (PowerShell syntax):

```powershell
$env:MXGATEWAY_RUN_LIVE_MXACCESS_TESTS = "1"
dotnet test src/MxGateway.IntegrationTests/MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~WorkerLiveMxAccessSmokeTests
```

Optional live smoke variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` | First existing `MxGateway.Worker.exe` under `src/MxGateway.Worker/bin/...` | Worker executable path. Set this when running against a packaged worker or a non-default build output. |
| `MXGATEWAY_LIVE_MXACCESS_ITEM` | `TestChildObject.TestInt` | MXAccess item reference used by `AddItem`. |
| `MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME` | `MxGateway.IntegrationTests` | Client name passed to `Register`. |
| `MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS` | `15` | Maximum wait, in seconds, for the first `OnDataChange`. |

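The defaults in the table resolve as in the Python sketch below. It is illustrative only: `LiveSmokeSettings` and `load_settings` are hypothetical names, and the worker executable lookup is left out because its search path is elided above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LiveSmokeSettings:
    item: str
    client_name: str
    event_timeout_seconds: float


def load_settings(env=None) -> LiveSmokeSettings:
    """Resolve the optional variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return LiveSmokeSettings(
        item=env.get("MXGATEWAY_LIVE_MXACCESS_ITEM", "TestChildObject.TestInt"),
        client_name=env.get(
            "MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME", "MxGateway.IntegrationTests"
        ),
        event_timeout_seconds=float(
            env.get("MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS", "15")
        ),
    )
```
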
The test output includes session id, worker process id, command status,
HRESULT/status diagnostics, event sequence and handles, close status, and worker
stdout/stderr lines emitted during the run.

## Live Galaxy Repository

`GalaxyRepositoryLiveTests` in `src/MxGateway.IntegrationTests/Galaxy/` exercises
`GalaxyRepository` directly against the `ZB` Galaxy Repository SQL database. It is
skipped unless `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1` is set because it depends on a
reachable SQL Server instance and deployed Galaxy state; fake-worker tests cannot
cover the SQL browse RPCs.

The suite covers `TestConnectionAsync`, `GetLastDeployTimeAsync`,
`GetHierarchyAsync`, and `GetAttributesAsync`. `GetHierarchyAsync` and
`GetAttributesAsync` assert a non-empty result, so the connected `ZB` database
must contain a deployed Galaxy, not just an empty schema.

Run the Galaxy live tests explicitly (PowerShell syntax):

```powershell
$env:MXGATEWAY_RUN_LIVE_GALAXY_TESTS = "1"
dotnet test src/MxGateway.IntegrationTests/MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~GalaxyRepositoryLiveTests
```

Optional live Galaxy variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `MXGATEWAY_LIVE_GALAXY_CONN` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | Galaxy Repository connection string. Set this when the `ZB` database is on a non-default instance or needs SQL authentication. |

The default connection string targets `ZB` on `localhost` with Windows
authentication, which matches the Galaxy Repository conventions in CLAUDE.md.

## Live LDAP

`DashboardLdapLiveTests` in `src/MxGateway.IntegrationTests/` exercises
`DashboardAuthenticator` against the live GLAuth directory. It is skipped unless
`MXGATEWAY_RUN_LIVE_LDAP_TESTS=1` is set because it binds against the GLAuth
service described in `glauth.md`.

The suite builds the authenticator with a default `GatewayOptions`, so
`LdapOptions.RequiredGroup` keeps its `GwAdmin` default. `GwAdmin` is the
gateway-specific dashboard-admin role and is **not** part of the five baseline
GLAuth role groups, so it must be provisioned before the LDAP live tests pass.
`AuthenticateAsync_AdminInGwAdminGroup_Succeeds` fails (rather than skips) when
GLAuth has only the baseline groups, so this is a hard prerequisite beyond "LDAP
is up." See the "Adding a gw-specific group" section of `glauth.md` for the
provisioning step that adds `GwAdmin` and grants it to `admin`.

The suite covers both the success path and the `DashboardAuthenticator` failure
branches: `admin` in `GwAdmin` succeeds; `readonly` is denied for missing group;
`admin` with a wrong password is rejected by the candidate bind without leaking
the password into `FailureMessage`; an unknown username yields no candidate; and
an unreachable LDAP server is absorbed into a failed result rather than throwing.

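The shape of those branches can be sketched in Python as follows. This is a simplified model under assumptions, not `DashboardAuthenticator`: `AuthResult`, `authenticate`, and the injected `bind` callable are hypothetical, and the unknown-username branch is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AuthResult:
    succeeded: bool
    failure_message: Optional[str] = None


def authenticate(bind: Callable[[str, str], set], username: str, password: str,
                 required_group: str = "GwAdmin") -> AuthResult:
    """Hypothetical authenticator: `bind` returns the user's groups or raises."""
    try:
        groups = bind(username, password)
    except ConnectionError:
        # An unreachable directory becomes a failed result, not an exception.
        return AuthResult(False, "directory unavailable")
    except PermissionError:
        # The message names the user but never includes the password.
        return AuthResult(False, f"bind rejected for {username}")
    if required_group not in groups:
        return AuthResult(False, f"{username} is not in {required_group}")
    return AuthResult(True)
```

Returning a result object for every branch lets a caller render a failure message without wrapping each call in exception handling.
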
Run the LDAP live tests explicitly (PowerShell syntax):

```powershell
$env:MXGATEWAY_RUN_LIVE_LDAP_TESTS = "1"
dotnet test src/MxGateway.IntegrationTests/MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~DashboardLdapLiveTests
```

## Client E2E Scripts

`scripts/discover-testmachine-tags.ps1` queries the ZB Galaxy Repository for the
deployed runtime references used by the live client e2e scripts. It reads
`TestMachine_001` through `TestMachine_020` and the expected attributes:

- `ProtectedValue`
- `TestChangingInt`
- `TestBoolArray`
- `TestIntArray`
- `TestDateTimeArray`
- `TestStringArray`

The discovery output includes the exact `fullTagReference`, data type, array
dimension, and security classification. The array attributes are expected to be
dimension 50. `ProtectedValue` has security classification 2 and requires
secured write semantics; the current client CLI e2e runner subscribes to it but
does not attempt a normal `Write`.

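A consumer of the discovery output might check those expectations as in the Python sketch below. Only `fullTagReference` is a field name taken from this document; `arrayDimension` and `securityClassification` are assumed JSON key names, and detecting array attributes by the substring `Array` is likewise an assumption.

```python
EXPECTED_ARRAY_DIMENSION = 50
SECURED_WRITE_CLASSIFICATION = 2


def check_discovered_tag(tag: dict) -> list:
    """Return a list of problems for one discovery record (empty if none)."""
    problems = []
    name = tag["fullTagReference"]
    # Assumption: array attributes are recognizable by "Array" in the reference.
    if "Array" in name and tag.get("arrayDimension") != EXPECTED_ARRAY_DIMENSION:
        problems.append(f"{name}: expected dimension {EXPECTED_ARRAY_DIMENSION}")
    return problems


def allows_normal_write(tag: dict) -> bool:
    """Tags with security classification 2 need secured write semantics."""
    return tag.get("securityClassification") != SECURED_WRITE_CLASSIFICATION
```
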
Run discovery directly when validating the Galaxy Repository inputs:

```powershell
powershell -ExecutionPolicy Bypass -File scripts/discover-testmachine-tags.ps1 -Json
```

`scripts/run-client-e2e-tests.ps1` drives the .NET, Go, Rust, Python, and Java
client CLIs through a live gateway session. The gateway and worker are assumed
to be already running at `-Endpoint`; the script does not start or stop them.
For each client it runs these phases, then closes the session in a `finally`
path and writes a JSON report under `artifacts/e2e/`:

1. **Session + register**: opens one session and registers.
2. **Bulk**: verifies `SubscribeBulk` / `UnsubscribeBulk` on a bounded tag
   subset (skip with `-SkipBulk`).
3. **Add-item / advise**: adds and advises every discovered test tag.
4. **Stream**: asserts a bounded event stream delivers at least one event
   (skip with `-SkipStream`).
5. **Parity**: asserts MXAccess error paths are rejected rather than silently
   succeeding: an invalid item handle and an unknown session id (skip with
   `-SkipParity`).
6. **Auth rejection**: asserts `open-session` is rejected when the API key is
   missing, and (when `-RejectScopeApiKeyEnv` names an insufficient-scope key)
   when the key lacks the required scope. Skip with `-SkipAuth`.
7. **Write round-trip** (opt-in via `-VerifyWrite`): although listed last, this
   phase runs right after `register`, so only a small event backlog precedes
   the write. It adds and advises a configurable writable attribute
   (`-WriteAttribute`, default `TestChangingInt`), writes a per-client
   sentinel value, then streams events and asserts that an `OnWriteComplete`
   event for that item is observed. That event is proof the write round-tripped
   through the gateway, worker, and MXAccess provider. The written value being
   echoed back in an `OnDataChange` is recorded best-effort (`echoObserved`): a
   provider-driven attribute such as `TestChangingInt` accepts the write but
   immediately overwrites it, so no data-change carries the value back. The
   Rust `stream-events` CLI emits full per-event JSON (`family`, `itemHandle`,
   `value`) so all five clients apply the same checks.

   It is opt-in because it mutates live tag state. The phase fails fast if the
   write command is rejected, for example against a gateway whose worker
   predates write support (`MxAccessCommandExecutor` returning `InvalidRequest`
   for `Write`/`Write2`/`WriteSecured`/`WriteSecured2`).

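The write round-trip check in phase 7 can be sketched in Python as below. This is not the runner's PowerShell code: `scan_write_round_trip` is a hypothetical name, events are assumed to be dictionaries with the `family`, `itemHandle`, and `value` keys mentioned above, and the literal family strings `OnWriteComplete` and `OnDataChange` are assumptions about how the enum name is rendered.

```python
def scan_write_round_trip(events, item_handle, sentinel, max_events=200):
    """Scan at most `max_events` events for the written item.

    Returns (write_completed, echo_observed). Only `write_completed` should
    gate pass/fail; `echo_observed` is informational.
    """
    write_completed = False
    echo_observed = False
    for index, event in enumerate(events):
        if index >= max_events:
            break
        if event.get("itemHandle") != item_handle:
            continue
        family = event.get("family")
        if family == "OnWriteComplete":
            write_completed = True
        elif family == "OnDataChange" and event.get("value") == sentinel:
            echo_observed = True
    return write_completed, echo_observed
```

Keeping the echo as a separate, non-gating flag matches the best-effort treatment described above: a provider-driven attribute can complete the write and still never report the sentinel value.
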
Build the gateway and worker, start the gateway, and provide a valid API key
before running the client e2e script:

```powershell
$env:MXGATEWAY_API_KEY = "<api-key>"
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1
```

Useful runner options:

```powershell
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Clients dotnet,python -MachineStart 1 -MachineEnd 2
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -BulkTagCount 10
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -SkipStream
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -SkipBulk
# Write round-trip (opt-in): point at a writable scalar attribute and its
# value type.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -VerifyWrite -WriteAttribute TestChangingInt -WriteType int32
# Auth rejection: also assert an insufficient-scope key is denied.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -RejectScopeApiKeyEnv MXGATEWAY_READONLY_API_KEY
# Run all five clients concurrently as isolated child processes.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Parallel
# Validate the flow offline (prints commands, contacts no gateway).
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -DryRun
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Endpoint localhost:5000 -ApiKeyEnv MXGATEWAY_API_KEY
```

When `-VerifyWrite` is enabled, the write round-trip fails loudly if the write
command is rejected, if `-WriteAttribute` does not name a writable scalar
attribute, or if no `OnWriteComplete` event is observed for the written item
within `-WriteEchoMaxEvents` (default 200) streamed events. Raise
`-WriteEchoMaxEvents` if the gateway's per-session event backlog is large
enough to push `OnWriteComplete` past that bound.

## Focused Commands

Run the cross-language smoke matrix tests after changing the documented client
smoke command list:

```bash
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests
```

Run the parity fixture matrix tests after changing the integration parity
scenario list:

```bash
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests
```

Run the fake worker tests after changing gateway worker IPC, session startup, or
event streaming behavior:

```bash
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~FakeWorkerHarnessTests
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~SessionWorkerClientFactoryFakeWorkerTests
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~GatewayEndToEndFakeWorkerSmokeTests
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~WorkerClientTests
dotnet test src/MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter FullyQualifiedName~WorkerPipeSessionTests
```

Run the gateway test project after shared gateway test infrastructure changes:

```bash
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj
```

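The commands in this section share one shape, which a small helper can generate. The Python sketch below is a convenience illustration, not a script in the repository: `dotnet_test_command` and `FAKE_WORKER_SUITES` are hypothetical names, while the project path and class names are copied from the commands above.

```python
GATEWAY_TESTS = "src/MxGateway.Tests/MxGateway.Tests.csproj"


def dotnet_test_command(project: str, class_name: str = "") -> list:
    """Build the argv for `dotnet test`, optionally filtered to one test class."""
    argv = ["dotnet", "test", project]
    if class_name:
        # `~` is the "contains" operator in a dotnet test filter expression.
        argv += ["--filter", f"FullyQualifiedName~{class_name}"]
    return argv


FAKE_WORKER_SUITES = [
    "FakeWorkerHarnessTests",
    "SessionWorkerClientFactoryFakeWorkerTests",
    "GatewayEndToEndFakeWorkerSmokeTests",
    "WorkerClientTests",
]

if __name__ == "__main__":
    # Print one command line per fake-worker suite in the gateway test project.
    for suite in FAKE_WORKER_SUITES:
        print(" ".join(dotnet_test_command(GATEWAY_TESTS, suite)))
```
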
## Related Documentation

- [Cross-Language Smoke Matrix](./CrossLanguageSmokeMatrix.md)
- [Parity Fixture Matrix](./ParityFixtureMatrix.md)
- [Gateway Process Design](./GatewayProcessDesign.md)
- [Worker Frame Protocol](./WorkerFrameProtocol.md)
- [MXAccess Worker Instance Detailed Design](./MxAccessWorkerInstanceDesign.md)