End-to-end run on the live ZB galaxy with mxaccessgw on http://localhost:5120: 14 passed / 1 skipped / 0 failed in 18m53s. PR 7.2's matrix-gate condition met. Three resolution patches in this commit; the matrix doc records the new state.

1. Discoverer: defensive `[]` array-suffix strip
------------------------------------------------
The gw's GalaxyRepository.cs:173-175 appends `[]` to array-typed full_tag_reference values, but MxAccess COM IInstance.AddItem doesn't accept `[]`-suffixed addresses. GalaxyDiscoverer.StripArraySuffix removes the suffix client-side so SubscribeBulk / Read / Write paths see the canonical form. Tracked in mxaccessgw/requirements-array-suffix-fix.md; this workaround is removed when the gw fix lands.

2. WriteByClassification: pin status class, not exact code
----------------------------------------------------------
Legacy MxAccessGalaxyBackend.WriteValuesAsync flat-maps every failure to BadInternalError (0x80020000); mxgw's GatewayGalaxyDataWriter.TranslateReply uses MxStatusProxy.RawDetectedBy to distinguish gw-layer faults (BadCommunicationError, 0x80050000) from MxAccess HRESULT faults. Both yield Bad-status; the parity invariant is the status class (Good/Uncertain/Bad), not the exact code. Both write tests now use AssertStatusClassMatches; legacy mapping retires alongside GalaxyProxyDriver in PR 7.2.

3. BrowseAndReadParity Read scenario: drop CLR-type assertion
-------------------------------------------------------------
Legacy returns the raw VARIANT (e.g. byte[]) for an attribute that hasn't received its first value cycle from MxAccess yet, while mxgw returns the typed value (Single, Int32, etc.). Once a real value is written or scanned, both converge. Pinning CLR-type equality across the uninitialized window adds noise without a real parity invariant; the StatusCode-class assertion already covers the "did the read succeed" question. The test still pins StatusCode-class parity per scenario.

4. Galaxy.ParityMatrix.md: first-rig results captured
-----------------------------------------------------
Per-row status flipped from "n/a unverified" to actual green / yellow / deferred outcomes from this run. Four new accepted-deltas added (read-value CLR type, write-status code mapping, single-platform ScanState scope, gw `[]` suffix workaround), bringing the total to nine. Outstanding deltas section flipped to "none as of 2026-04-30."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Galaxy backend parity matrix
This document tracks the scenario × result matrix that the
Driver.Galaxy.ParityTests suite drives against both Galaxy backends —
the legacy out-of-process Galaxy.Host (.NET 4.8 x86 + MXAccess COM,
fronted by GalaxyProxyDriver) and the new in-process mxgateway
backend (GalaxyDriver, .NET 10 + gRPC against mxaccessgw).
Maintained alongside Phase 5 (PR 5.W). The Phase 7 default flip (PR 7.1) consumes this matrix as its go/no-go gate — every row must be either green or carry an explicit accepted-delta justification.
Reading the matrix
- Status: green — the scenario asserts strict parity and passes (or skips cleanly when the rig isn't up).
- Status: yellow — soft pin only (count or shape parity, not value parity) — acceptable when the underlying COM/gRPC stacks have known divergences in raw payloads but the surface presented to the DriverNodeManager is equivalent.
- Status: red — divergence detected. Row carries a fix or a follow-up task ID.
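The go/no-go rule above can be expressed as a small check. This is a minimal sketch, not code from the parity suite: `MatrixRow`, `gate_blockers`, and the sample rows are hypothetical names invented for illustration, and it assumes a row clears the gate when it is green or cites an accepted-delta number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MatrixRow:
    """One matrix row (hypothetical shape, for illustration only)."""
    scenario: str
    status: str                           # "green", "yellow", "red", or "deferred"
    accepted_delta: Optional[int] = None  # number of the accepted delta, if any


def gate_blockers(rows: List[MatrixRow]) -> List[MatrixRow]:
    """Return rows that are neither green nor justified by an accepted delta."""
    return [r for r in rows if r.status != "green" and r.accepted_delta is None]


rows = [
    MatrixRow("Same variable set", "green"),
    MatrixRow("Same StatusCode-class on a sampled read", "yellow", accepted_delta=6),
    MatrixRow("Hypothetical regressed scenario", "red"),
]

# The gate is open only when this list is empty.
print([r.scenario for r in gate_blockers(rows)])
# prints ['Hypothetical regressed scenario']
```

Under this reading a yellow row never blocks the flip by itself; it blocks only when nobody has written down why the soft pin is acceptable.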
Scenarios
Last verified end-to-end on the dev parity rig: 2026-04-30
(legacy OtOpcUaGalaxyHost mxaccess backend; mxaccessgw v1.x at
http://localhost:5120; sandbox OtOpcUaParityTest_001 deployed in
the ZB galaxy; 13 passed / 1 skipped / 0 failed in 19 minutes).
| PR | Test class | Scenario | Status | Notes |
|---|---|---|---|---|
| 5.2 | BrowseAndReadParityTests | Same variable set | green | symmetric set diff on full-reference set, after `[]` array-suffix workaround in GalaxyDiscoverer |
| 5.2 | BrowseAndReadParityTests | Same DataType / SecurityClass / IsHistorized | green | per-attribute meta triple parity |
| 5.2 | BrowseAndReadParityTests | Same StatusCode-class on a sampled read | yellow | pins status class (Bad/Uncertain/Good); CLR type intentionally not asserted; see "Accepted deltas" #6 |
| 5.3 | SubscribeAndEventRateParityTests | Subscribe returns a handle on each backend | green | symmetric Unsubscribe cleanup |
| 5.3 | SubscribeAndEventRateParityTests | Event rate within ±50% over 3s | yellow | both backends fed by the same upstream MXAccess subscriptions; tolerance absorbs scheduler jitter |
| 5.4 | WriteByClassificationParityTests | FreeAccess / Operate write status-class parity | yellow | pins status class only; legacy flat-maps every failure to BadInternalError, mxgw distinguishes (BadCommunicationError, BadDeviceFailure, etc.); see "Accepted deltas" #7 |
| 5.4 | WriteByClassificationParityTests | Configure / Tune routes via secured-write | yellow | same status-class pin |
| 5.5 | AlarmTransitionParityTests | Same alarm-condition source-node-id set | green | one-way invariant on sub-attribute refs (legacy populated → mxgw matches; legacy null → mxgw free to populate per AlarmRefBuilder) |
| 5.5 | AlarmTransitionParityTests | IsAlarm-marked variable count parity | green | soft pin: count must match, but doesn't have to be non-zero |
| 5.6 | HistoryReadParityTests | Same historized attribute set | green | what HistoryRouter consumes when routing to the Wonderware sidecar |
| 5.6 | HistoryReadParityTests | New mxgw GalaxyDriver does not implement IHistoryProvider | green | architectural pin from Phase 1 (PR 1.3) on the new path; legacy GalaxyProxyDriver keeps the interface for back-compat until PR 7.2; see "Accepted deltas" #5 |
| 5.7 | ReconnectParityTests | Reinitialize → both Healthy + reads succeed | green | recovery latency is not pinned (legacy: pipe + COM client; mxgw: re-Register gw session) |
| 5.7 | ReconnectParityTests | Health diverges only when one side recovers | yellow | soft pin until a toxiproxy-style fault injector lands |
| 5.8 | ScanStateProbeParityTests | Same per-platform host set | n/a (deferred) | dev rig is licensed for one $WinPlatform only; multi-platform parity deferred to a customer rig (PR 4.7's unit tests pin the state-decoder + member-tracking logic) |
| 5.8 | ScanStateProbeParityTests | Same HostState per overlapping platform | n/a (deferred) | same single-platform constraint |
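Several yellow rows pin only the status class or a rate tolerance. The sketch below illustrates both ideas under stated assumptions: the class is read from the two severity bits of an OPC UA StatusCode (bits 31-30), and "within ±50%" is interpreted here as "the smaller count is at least half the larger one". The function names are hypothetical and are not the suite's `AssertStatusClassMatches`; the exact tolerance formula the tests use is not given in this document.

```python
BAD_INTERNAL_ERROR = 0x80020000       # legacy flat-map target, per this document
BAD_COMMUNICATION_ERROR = 0x80050000  # mxgw gw-layer fault, per this document


def status_class(code: int) -> str:
    """Classify an OPC UA StatusCode by its two severity bits."""
    severity = (code >> 30) & 0b11
    return {0b00: "Good", 0b01: "Uncertain", 0b10: "Bad"}.get(severity, "Reserved")


def same_status_class(legacy_code: int, mxgw_code: int) -> bool:
    """Soft pin: exact codes may differ as long as the class matches."""
    return status_class(legacy_code) == status_class(mxgw_code)


def within_rate_tolerance(legacy_events: int, mxgw_events: int,
                          tolerance: float = 0.5) -> bool:
    """Assumed reading of the ±50% pin: smaller/larger >= 1 - tolerance."""
    larger = max(legacy_events, mxgw_events)
    if larger == 0:
        return True  # neither backend saw an event in the window
    return min(legacy_events, mxgw_events) / larger >= 1.0 - tolerance


print(same_status_class(BAD_INTERNAL_ERROR, BAD_COMMUNICATION_ERROR))  # prints True
print(within_rate_tolerance(30, 20))                                   # prints True
print(within_rate_tolerance(30, 10))                                   # prints False
```

Comparing classes rather than codes is what lets the two write paths disagree on `BadInternalError` versus `BadCommunicationError` and still count as equivalent.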
Accepted deltas
These are intentional differences between the two backends — the parity suite skips or tolerates them by design.
1. Transport-entry host name. The legacy backend's `IHostConnectivityProbe` surface includes a host entry named after the Galaxy.Host process identity; the mxgw backend uses the configured `MxAccess.ClientName`. The names differ, but both are correct for their respective sessions, so the parity test compares only the platform-host subset.
2. Reconnect latency cadence. Legacy reconnect roundtrips an OS named pipe + an MxAccess COM client + a Galaxy.Host process restart if the host died. The mxgw reconnect re-Registers the gateway session over an existing gRPC channel. Sub-second vs multi-second recoveries are both correct for their own paths; only the eventual `Healthy` convergence is pinned.
3. Read-value drift. A read sampled twice on a live Galaxy can return different values legitimately. We pin `StatusCode`-class parity (Bad/Uncertain/Good); value equality is not pinned.
4. Event-rate variance. Both backends consume the same upstream MXAccess publish events but route them through different deserializers (LMXProxyServer COM events vs gRPC `MxEvent` protos). Scheduler jitter on either side can shift counts within a 3s window; we pin a ±50% ratio, not strict equality.
5. `IHistoryProvider` on the new path only. Phase 1 (PR 1.3) lifted history off the per-driver path onto the server-owned `HistoryRouter` for the new in-process `GalaxyDriver`. The legacy `GalaxyProxyDriver` still surfaces `IHistoryProvider` for back-compat with the legacy server bootstrap path; it's an accepted delta retired in PR 7.2 alongside the rest of the legacy projects. The pin we want to enforce is "the new path doesn't regress to per-driver history."
6. Read value-CLR-type. Legacy returns the raw VARIANT (e.g. `Byte[]`) for an attribute that hasn't received its first value cycle from MxAccess yet, while mxgw returns the typed value (`Single`, `Int32`, etc.). Once a real value is written or scanned, both converge. Pinning CLR-type equality across the uninitialized window adds noise without a real parity invariant; the `StatusCode`-class assertion already covers the "did the read succeed" question.
7. Write-failure StatusCode mapping. Legacy `MxAccessGalaxyBackend.WriteValuesAsync` flat-maps every failure to `BadInternalError` (`0x80020000`); mxgw `GatewayGalaxyDataWriter.TranslateReply` uses `MxStatusProxy.RawDetectedBy` to distinguish gw-layer faults (`BadCommunicationError`, `0x80050000`) from MxAccess HRESULT faults (`BadDeviceFailure`, `BadNotConnected`, etc.). Both yield Bad-status; the parity invariant is the status class, not the exact code. Tighter mapping parity isn't worth investing in: the legacy mapping retires alongside `GalaxyProxyDriver` in PR 7.2.
8. Single-platform scope on the dev rig. Two `ScanStateProbeParityTests` scenarios are deferred to a customer rig with multiple deployed `$WinPlatform` instances; this dev box is licensed for one. PR 4.7's unit tests (`PerPlatformProbeWatcherTests`) pin the state-decoder + member-tracking logic at the seam level, so the runtime parity check becomes a customer-rig acceptance gate before that customer goes live, not a precondition for retiring the legacy projects on this dev box.
9. Workaround for the gw `[]` array-suffix bug. `mxaccessgw/src/MxGateway.Server/Galaxy/GalaxyRepository.cs:173-175` appends `[]` to the `full_tag_reference` of array-typed attributes, which `MxAccess COM IInstance.AddItem` doesn't accept. The lmxopcua discoverer (`GalaxyDiscoverer.StripArraySuffix`) defensively strips the suffix. Tracked in `mxaccessgw/requirements-array-suffix-fix.md`; the workaround is removed when that gw fix lands.
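To make the array-suffix workaround and the "Same variable set" comparison concrete, here is a minimal sketch. It is not the actual `GalaxyDiscoverer.StripArraySuffix` implementation: the function names and the sample tag references are hypothetical, and it assumes the suffix only ever appears as a trailing `[]` on references coming from the gateway side.

```python
from typing import Iterable, Set


def strip_array_suffix(full_ref: str) -> str:
    """Drop one trailing "[]" so the reference matches the canonical form."""
    return full_ref[:-2] if full_ref.endswith("[]") else full_ref


def reference_set_diff(legacy_refs: Iterable[str],
                       mxgw_refs: Iterable[str]) -> Set[str]:
    """Symmetric difference of the two full-reference sets after normalizing
    the gateway side. An empty result means both backends expose the same set."""
    legacy = set(legacy_refs)
    mxgw = {strip_array_suffix(ref) for ref in mxgw_refs}
    return legacy ^ mxgw


# Hypothetical references, for illustration only.
legacy = ["Tank_001.Level", "Tank_001.Samples"]
mxgw = ["Tank_001.Level", "Tank_001.Samples[]"]

print(reference_set_diff(legacy, mxgw))  # prints set()
```

A symmetric difference reports references missing on either side, so the check fails whether the new backend drops an attribute or invents one.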
Outstanding deltas
None as of 2026-04-30. Phase 7 (PR 7.1) flipped the default to
mxgw; PR 7.2 retires the legacy projects after the soak run + a
2-week production pilot.
Running the matrix
    # Both backends must be reachable for any row to run; rows skip
    # cleanly when their backend is unavailable.
    dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/
Environment overrides for the mxgw backend:
| Variable | Default | Purpose |
|---|---|---|
| OTOPCUA_PARITY_GW_ENDPOINT | http://localhost:5120 | mxaccessgw gRPC endpoint |
| OTOPCUA_PARITY_GW_API_KEY | parity-suite-key | API key handed to MxGatewayClient |
| OTOPCUA_PARITY_CLIENT_NAME | OtOpcUa-Parity | MxAccess.ClientName for the session |
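The override behaviour in the table amounts to "use the environment variable when set, otherwise the default". The sketch below shows that lookup with the variable names and defaults from the table; `resolve_parity_settings` is a hypothetical helper written for illustration, and how the suite itself reads these variables is not shown in this document.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names and defaults copied from the table above.
DEFAULTS: Dict[str, str] = {
    "OTOPCUA_PARITY_GW_ENDPOINT": "http://localhost:5120",
    "OTOPCUA_PARITY_GW_API_KEY": "parity-suite-key",
    "OTOPCUA_PARITY_CLIENT_NAME": "OtOpcUa-Parity",
}


def resolve_parity_settings(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return each setting from the environment, falling back to its default."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


# Example: point the suite at a gateway on another (hypothetical) host.
settings = resolve_parity_settings({"OTOPCUA_PARITY_GW_ENDPOINT": "http://gw-host:5120"})
print(settings["OTOPCUA_PARITY_GW_ENDPOINT"])  # prints http://gw-host:5120
print(settings["OTOPCUA_PARITY_CLIENT_NAME"])  # prints OtOpcUa-Parity
```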
The legacy backend reads ZB SQL on localhost:1433 and spawns
OtOpcUa.Driver.Galaxy.Host.exe from
src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Debug/net48/ — both
must exist for the legacy half to resolve.