PR 5.W — Galaxy.ParityMatrix.md
Tabular scenario × result map for the seven Phase 5 parity scenarios
(BrowseAndRead, Subscribe, Write, Alarm, History, Reconnect, ScanState).
Each row records the assertion strength (green strict, yellow soft) and
flags accepted-delta cases:

- Transport-entry host name divergence (legacy = Galaxy.Host process, mxgw = MxAccess.ClientName)
- Reconnect latency cadence — different paths, both correct for their own session shape
- Sampled-read value drift (we pin StatusCode + type, not value)
- Event-rate ±50% tolerance over a 3s window
- Per-driver IHistoryProvider absence (architectural pin from PR 1.3)

Phase 7 (PR 7.1) consumes this matrix as the default-flip gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/v2/Galaxy.ParityMatrix.md
@@ -0,0 +1,104 @@
# Galaxy backend parity matrix

This document tracks the scenario × result matrix that the
`Driver.Galaxy.ParityTests` suite drives against both Galaxy backends —
the legacy out-of-process **Galaxy.Host** (.NET 4.8 x86 + MXAccess COM,
fronted by `GalaxyProxyDriver`) and the new in-process **mxgateway**
backend (`GalaxyDriver`, .NET 10 + gRPC against `mxaccessgw`).

Maintained alongside Phase 5 (PR 5.W). The Phase 7 default flip
(PR 7.1) consumes this matrix as its go/no-go gate — every row must be
either green or carry an explicit *accepted-delta* justification.

## Reading the matrix

- **Status: green** — the scenario asserts strict parity and passes
  (or skips cleanly when the rig isn't up).
- **Status: yellow** — soft pin only (count or shape parity, not value
  parity) — acceptable when the underlying COM/gRPC stacks have known
  divergences in raw payloads but the surface presented to the
  DriverNodeManager is equivalent.
- **Status: red** — divergence detected. Row carries a fix or a
  follow-up task ID.

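As a rough illustration of how the three statuses interact with the go/no-go rule from the introduction, the Python sketch below evaluates a list of matrix rows. It is not part of the parity suite (which is a .NET test project); the `Row` type, its field names, and the `blocking_rows` / `gate_passes` helpers are hypothetical. It also assumes that a yellow row passes only when it carries a written justification and that a red row always blocks, which the text implies but does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Row:
    """One matrix row (hypothetical shape, for illustration only)."""
    scenario: str
    status: str                           # "green", "yellow", or "red"
    accepted_delta: Optional[str] = None  # justification text, if any


def blocking_rows(rows: List[Row]) -> List[Row]:
    """Return the rows that would block the default flip."""
    blockers = []
    for row in rows:
        if row.status == "green":
            continue
        # Assumption: yellow passes only with an explicit justification;
        # red always blocks, even if a follow-up task is attached.
        if row.status == "yellow" and row.accepted_delta:
            continue
        blockers.append(row)
    return blockers


def gate_passes(rows: List[Row]) -> bool:
    """True when no row blocks the flip."""
    return not blocking_rows(rows)
```

For example, a matrix containing a green row and a yellow row with a justification passes, while adding any red row, or a yellow row with no justification, makes `gate_passes` return `False`.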
## Scenarios

| PR | Test class | Scenario | Status | Notes |
|----|-----------|----------|--------|-------|
| 5.2 | `BrowseAndReadParityTests` | Same variable set | green | symmetric set diff on full-reference set |
| 5.2 | `BrowseAndReadParityTests` | Same DataType / SecurityClass / IsHistorized | green | per-attribute meta triple parity |
| 5.2 | `BrowseAndReadParityTests` | Same StatusCode + value-CLR-type on a sampled read | yellow | raw values legitimately drift between two reads on a live Galaxy; we pin StatusCode + type, not value equality |
| 5.3 | `SubscribeAndEventRateParityTests` | Subscribe returns a handle on each backend | green | symmetric Unsubscribe cleanup |
| 5.3 | `SubscribeAndEventRateParityTests` | Event rate within ±50% over 3s | yellow | both backends fed by the same upstream MXAccess subscriptions; tolerance absorbs scheduler jitter |
| 5.4 | `WriteByClassificationParityTests` | FreeAccess / Operate write StatusCode parity | green | both backends use plain Write |
| 5.4 | `WriteByClassificationParityTests` | Configure / Tune routes via secured-write | green | both backends pick up SecurityClassification from DiscoverAsync |
| 5.5 | `AlarmTransitionParityTests` | Same alarm-condition source-node-id set | green | + per-condition SourceName / InitialSeverity / InAlarmRef / DescAttrNameRef |
| 5.5 | `AlarmTransitionParityTests` | IsAlarm-marked variable count parity | green | soft pin — count must match, doesn't have to be non-zero |
| 5.6 | `HistoryReadParityTests` | Same historized attribute set | green | what HistoryRouter consumes when routing to the Wonderware sidecar |
| 5.6 | `HistoryReadParityTests` | Neither backend implements `IHistoryProvider` | green | architectural pin from Phase 1 (PR 1.3) |
| 5.7 | `ReconnectParityTests` | Reinitialize → both Healthy + reads succeed | green | recovery latency is *not* pinned (legacy: pipe + COM client; mxgw: re-Register gw session) |
| 5.7 | `ReconnectParityTests` | Health diverges only when one side recovers | yellow | soft pin until a toxiproxy-style fault injector lands |
| 5.8 | `ScanStateProbeParityTests` | Same per-platform host set | green | transport-entry names differ by design (legacy = Galaxy.Host process; mxgw = `MxAccess.ClientName`) and are excluded |
| 5.8 | `ScanStateProbeParityTests` | Same `HostState` per overlapping platform | green | drives Discover, waits 1.5s for the probe-watcher push, then snapshots both |

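The "same variable set" row describes a symmetric set diff on the full-reference set. A minimal Python sketch of that comparison follows, assuming each backend's discovery result can be reduced to a collection of full-reference strings. The `symmetric_diff` helper and the sample references are invented for illustration and are not taken from `BrowseAndReadParityTests`.

```python
from typing import Iterable, List, Tuple


def symmetric_diff(
    legacy_refs: Iterable[str], mxgw_refs: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (only_in_legacy, only_in_mxgw), each sorted for stable output.

    Strict parity holds when both lists are empty.
    """
    legacy, mxgw = set(legacy_refs), set(mxgw_refs)
    return sorted(legacy - mxgw), sorted(mxgw - legacy)


# Hypothetical references, for illustration only.
only_legacy, only_mxgw = symmetric_diff(
    ["Tank1.Level", "Tank1.Temp"],
    ["Tank1.Level", "Pump1.Speed"],
)
print(only_legacy)  # ['Tank1.Temp']
print(only_mxgw)    # ['Pump1.Speed']
```

Reporting both directions separately, instead of a single merged difference, makes it clear which backend is missing a variable when the row fails.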
## Accepted deltas

These are intentional differences between the two backends — the parity
suite skips or tolerates them by design.

1. **Transport-entry host name.** The legacy backend's
   `IHostConnectivityProbe` surface includes a host entry named after
   the Galaxy.Host process identity; the mxgw backend uses the
   configured `MxAccess.ClientName`. The names differ, but both are
   correct for their respective sessions — the parity test compares
   only the platform-host subset.

2. **Reconnect latency cadence.** Legacy reconnect roundtrips an OS
   named pipe + an MxAccess COM client + a Galaxy.Host process restart
   if the host died. The mxgw reconnect re-Registers the gateway session
   over an existing gRPC channel. Sub-second vs multi-second recoveries
   are both correct for their own paths; only the eventual `Healthy`
   convergence is pinned.

3. **Read-value drift.** A read sampled twice on a live Galaxy can
   return different values legitimately. We pin `StatusCode` and
   value-CLR-type equality, not value equality. Driving an explicit
   write-then-read pin requires the parity rig to own a writable
   sandbox attribute — out of scope for the current suite.

4. **Event-rate variance.** Both backends consume the same upstream
   MXAccess publish events but route them through different deserializers
   (LMXProxyServer COM events vs gRPC `MxEvent` protos). Scheduler
   jitter on either side can shift counts within a 3s window; we pin a
   ±50% ratio, not strict equality.

5. **Per-driver `IHistoryProvider` is gone.** Phase 1 (PR 1.3) lifted
   history off the per-driver path onto the server-owned
   `HistoryRouter`. Both Galaxy backends correctly *do not* surface
   `IHistoryProvider` — the absence is itself a parity assertion.

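The soft pins behind deltas 1, 3, and 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the suite's actual assertions: all three helper names are hypothetical, a read result is modeled as a `(status_code, value)` pair, and the ±50% tolerance is interpreted as the mxgw/legacy count ratio lying within 0.5 of 1.0. The source does not say exactly how the ratio is computed or how the transport entry is identified.

```python
from typing import Any, Iterable, Set, Tuple


def platform_hosts(hosts: Iterable[str], transport_entry: str) -> Set[str]:
    """Delta 1: drop the backend's own transport entry before comparing."""
    return {h for h in hosts if h != transport_entry}


def same_status_and_type(legacy: Tuple[int, Any], mxgw: Tuple[int, Any]) -> bool:
    """Delta 3: pin status code and value type, never the value itself."""
    legacy_status, legacy_value = legacy
    mxgw_status, mxgw_value = mxgw
    return legacy_status == mxgw_status and type(legacy_value) is type(mxgw_value)


def rate_within_tolerance(
    legacy_count: int, mxgw_count: int, tolerance: float = 0.5
) -> bool:
    """Delta 4: compare event counts from the same window as a ratio.

    Assumption: two silent backends agree; one silent backend does not.
    """
    if legacy_count == 0 or mxgw_count == 0:
        return legacy_count == mxgw_count
    return abs(mxgw_count / legacy_count - 1.0) <= tolerance
```

For instance, `same_status_and_type((0, 41.7), (0, 42.3))` holds even though the values differ, while `same_status_and_type((0, 42), (0, 42.0))` does not, because an integer and a float are different types. Likewise, counts of 100 and 140 are within tolerance, and counts of 100 and 160 are not.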
## Outstanding deltas

None as of PR 5.W. Phase 7 (PR 7.1) flips the default to `mxgw` once
this matrix is fully green on the dev parity rig.

## Running the matrix

```bash
# Both backends must be reachable for any row to run; rows skip
# cleanly when their backend is unavailable.
dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/
```

Environment overrides for the mxgw backend:

| Variable | Default | Purpose |
|----------|---------|---------|
| `OTOPCUA_PARITY_GW_ENDPOINT` | `http://localhost:5120` | mxaccessgw gRPC endpoint |
| `OTOPCUA_PARITY_GW_API_KEY` | `parity-suite-key` | API key handed to `MxGatewayClient` |
| `OTOPCUA_PARITY_CLIENT_NAME` | `OtOpcUa-Parity` | `MxAccess.ClientName` for the session |

The legacy backend reads ZB SQL on `localhost:1433` and spawns
`OtOpcUa.Driver.Galaxy.Host.exe` from
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Debug/net48/` — both
must exist for the legacy half to resolve.