Audit (three parallel agent passes) found 43 markdown files carrying stale references to the deleted Galaxy.Host/Proxy/Shared projects after the v2-mxgw merge. This commit lands the prioritized fixes.

Track 1: high-traffic in-place rewrites (3 files, ~454 lines deleted)
- README.md (202 → 91 lines): drops .NET 4.8 / x86 / TopShelf install text; leads with the multi-driver .NET 10 server identity and points at scripts/install/Install-Services.ps1 and the parity rig.
- docs/v2/driver-specs.md §1 Galaxy (~289 → ~66 lines): replaces the Tier-C out-of-process spec with a Tier-A in-process description matching the current GalaxyDriver code, with the four-section GalaxyDriverOptions JSON shape pulled verbatim from Config/GalaxyDriverOptions.cs.
- docs/drivers/Galaxy.md (211 → 92 lines): full rewrite around the current Browse/Runtime/Health/Config sub-folders.

Track 2: historical banners (5 files)
- lmx_mxgw.md, lmx_mxgw_impl.md, lmx_backend.md, docs/v2/Galaxy.ParityMatrix.md, docs/v2/implementation/phase-2-galaxy-out-of-process.md each get a "✅ Completed 2026-04-30 — historical record" banner block. lmx_mxgw.md also fixes two dead links (`docs/Galaxy.Driver.md` and `docs/v2/Galaxy.Driver.md`) → `docs/drivers/Galaxy.md`.

Track 3: v1 archive sweep (10 git mv + 1 new index + 2 in-place scrubs)
- Moved 10 v1 docs under docs/v1/ preserving subpath structure: AlarmTracking, Configuration, DataTypeMapping, HistoricalDataAccess, Subscriptions (top-level); drivers/Galaxy-Repository, drivers/Galaxy-Test-Fixture; reqs/GalaxyRepositoryReqs, reqs/MxAccessClientReqs, reqs/ServiceHostReqs.
- New docs/v1/README.md is the shared archive banner + per-file table.
- docs/README.md repointed to the v1 paths and updated to reflect the v2 two-process deploy shape (Server + Admin + optional OtOpcUaWonderwareHistorian).
- docs/v2/Galaxy.ParityRig.md got a historical banner + four inline scrubs marking the OtOpcUaGalaxyHost service / Driver.Galaxy.Host EXE / Driver.Galaxy.ParityTests project as deleted-in-PR-7.2.

The repo's live-reading surface (README + CLAUDE.md + docs/v2/) now describes only the post-PR-7.2 architecture. v1 docs are preserved as a labelled archive under docs/v1/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
162 lines
9.6 KiB
Markdown
> **✅ Completed 2026-04-30 — historical record of the parity-rig validation gate for PR 7.2.**
>
> The matrix below was the go/no-go gate for retiring the legacy
> Galaxy.Host backend (PR 7.2). Final run on the dev rig 2026-04-30
> returned 14 passed / 1 skipped / 0 failed; PR 7.2 (commit `fe91d42`)
> deleted the legacy projects + service the next day. The "Running
> the matrix" section is preserved for historical reproducibility but
> the test projects it references (`Driver.Galaxy.ParityTests`) were
> deleted alongside the legacy backend; this matrix is no longer
> runnable. Current Galaxy testing flows through the gateway's own
> test suite (sibling mxaccessgw repo).

# Galaxy backend parity matrix

This document tracks the scenario × result matrix that the
`Driver.Galaxy.ParityTests` suite drives against both Galaxy backends:
the legacy out-of-process **Galaxy.Host** (.NET 4.8 x86 + MXAccess COM,
fronted by `GalaxyProxyDriver`) and the new in-process **mxgateway**
backend (`GalaxyDriver`, .NET 10 + gRPC against `mxaccessgw`).

Maintained alongside Phase 5 (PR 5.W). The Phase 7 default flip
(PR 7.1) consumes this matrix as its go/no-go gate: every row must be
either green or carry an explicit *accepted-delta* justification.

## Reading the matrix

- **Status: green.** The scenario asserts strict parity and passes
  (or skips cleanly when the rig isn't up).
- **Status: yellow.** Soft pin only (count or shape parity, not value
  parity). Acceptable when the underlying COM/gRPC stacks have known
  divergences in raw payloads but the surface presented to the
  DriverNodeManager is equivalent.
- **Status: red.** Divergence detected. The row carries a fix or a
  follow-up task ID.

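The gate rule stated in the introduction (every row green, or carrying an explicit accepted-delta justification) can be sketched as a small check. This is an illustrative Python sketch, not code from the repository: the `Row` type, its fields, `blocking_rows`, and `gate_passes` are hypothetical names, and the sample rows are a hand-picked subset of the matrix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    """One matrix row (hypothetical shape, for illustration only)."""
    scenario: str
    status: str                           # "green", "yellow", "red", or "deferred"
    accepted_delta: Optional[int] = None  # item number under "Accepted deltas"


def blocking_rows(rows):
    """Rows that are neither green nor justified by an accepted delta."""
    return [r for r in rows if r.status != "green" and r.accepted_delta is None]


def gate_passes(rows):
    """The gate is satisfied when no row blocks it."""
    return not blocking_rows(rows)


rows = [
    Row("Same variable set", "green"),
    Row("Same StatusCode-class on a sampled read", "yellow", accepted_delta=6),
    Row("Same per-platform host set", "deferred", accepted_delta=8),
]
print(gate_passes(rows))
```

Returning the blocking rows rather than a bare boolean keeps the failure report actionable: each blocker names the scenario that still needs a fix or a justification.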
## Scenarios

Last verified end-to-end on the dev parity rig: **2026-04-30**
(legacy `OtOpcUaGalaxyHost` mxaccess backend; mxaccessgw v1.x at
`http://localhost:5120`; sandbox `OtOpcUaParityTest_001` deployed in
the `ZB` galaxy; 13 passed / 1 skipped / 0 failed in 19 minutes).

| PR | Test class | Scenario | Status | Notes |
|----|-----------|----------|--------|-------|
| 5.2 | `BrowseAndReadParityTests` | Same variable set | green | symmetric set diff on full-reference set, after `[]` array-suffix workaround in `GalaxyDiscoverer` |
| 5.2 | `BrowseAndReadParityTests` | Same DataType / SecurityClass / IsHistorized | green | per-attribute meta triple parity |
| 5.2 | `BrowseAndReadParityTests` | Same StatusCode-class on a sampled read | yellow | pins status class (Bad/Uncertain/Good); CLR type intentionally not asserted; see "Accepted deltas" #6 |
| 5.3 | `SubscribeAndEventRateParityTests` | Subscribe returns a handle on each backend | green | symmetric Unsubscribe cleanup |
| 5.3 | `SubscribeAndEventRateParityTests` | Event rate within ±50% over 3s | yellow | both backends fed by the same upstream MXAccess subscriptions; tolerance absorbs scheduler jitter |
| 5.4 | `WriteByClassificationParityTests` | FreeAccess / Operate write status-class parity | yellow | pins status class only; legacy flat-maps every failure to BadInternalError, mxgw distinguishes (BadCommunicationError, BadDeviceFailure, etc.); see "Accepted deltas" #7 |
| 5.4 | `WriteByClassificationParityTests` | Configure / Tune routes via secured-write | yellow | same status-class pin |
| 5.5 | `AlarmTransitionParityTests` | Same alarm-condition source-node-id set | green | one-way invariant on sub-attribute refs (legacy populated → mxgw matches; legacy null → mxgw free to populate per AlarmRefBuilder) |
| 5.5 | `AlarmTransitionParityTests` | IsAlarm-marked variable count parity | green | soft pin: count must match, doesn't have to be non-zero |
| 5.6 | `HistoryReadParityTests` | Same historized attribute set | green | what HistoryRouter consumes when routing to the Wonderware sidecar |
| 5.6 | `HistoryReadParityTests` | New mxgw GalaxyDriver does not implement `IHistoryProvider` | green | architectural pin from Phase 1 (PR 1.3) on the *new* path; legacy `GalaxyProxyDriver` keeps the interface for back-compat until PR 7.2; see "Accepted deltas" #5 |
| 5.7 | `ReconnectParityTests` | Reinitialize → both Healthy + reads succeed | green | recovery latency is *not* pinned (legacy: pipe + COM client; mxgw: re-Register gw session) |
| 5.7 | `ReconnectParityTests` | Health diverges only when one side recovers | yellow | soft pin until a toxiproxy-style fault injector lands |
| 5.8 | `ScanStateProbeParityTests` | Same per-platform host set | n/a (deferred) | dev rig is licensed for one `$WinPlatform` only; multi-platform parity deferred to a customer rig (PR 4.7's unit tests pin the state-decoder + member-tracking logic); see "Accepted deltas" #8 |
| 5.8 | `ScanStateProbeParityTests` | Same `HostState` per overlapping platform | n/a (deferred) | same single-platform constraint |

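The first row's check, a symmetric set difference over full references after the `[]` array-suffix workaround, could look like the following Python sketch. The function names and sample references are hypothetical; the actual test and `GalaxyDiscoverer.StripArraySuffix` are C# and may differ in detail, and the sketch assumes a trailing `[]` is the only form the suffix takes.

```python
def strip_array_suffix(full_ref: str) -> str:
    """Drop a trailing "[]" if present (assumed to be the only suffix form)."""
    return full_ref[:-2] if full_ref.endswith("[]") else full_ref


def variable_set_diff(legacy_refs, mxgw_refs):
    """Return (only_in_legacy, only_in_mxgw); two empty sets mean parity."""
    legacy = {strip_array_suffix(r) for r in legacy_refs}
    mxgw = {strip_array_suffix(r) for r in mxgw_refs}
    return legacy - mxgw, mxgw - legacy


# Hypothetical sample references, not taken from the real Galaxy.
legacy = ["Tank_001.Level", "Tank_001.Samples"]
mxgw = ["Tank_001.Level", "Tank_001.Samples[]"]

only_legacy, only_mxgw = variable_set_diff(legacy, mxgw)
print(only_legacy, only_mxgw)  # both sets are empty
```

Reporting the two one-sided differences separately, instead of a single merged set, shows which backend is missing a given reference when the row fails.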
## Accepted deltas

These are intentional differences between the two backends; the parity
suite skips or tolerates them by design.

1. **Transport-entry host name.** The legacy backend's
   `IHostConnectivityProbe` surface includes a host entry named after
   the Galaxy.Host process identity; the mxgw backend uses the
   configured `MxAccess.ClientName`. The names differ, but both are
   correct for their respective sessions, so the parity test compares
   only the platform-host subset.

2. **Reconnect latency cadence.** Legacy reconnect roundtrips an OS
   named pipe + an MxAccess COM client + a Galaxy.Host process restart
   if the host died. The mxgw reconnect re-Registers the gateway session
   over an existing gRPC channel. Sub-second vs multi-second recoveries
   are both correct for their own paths; only the eventual `Healthy`
   convergence is pinned.

3. **Read-value drift.** A read sampled twice on a live Galaxy can
   return different values legitimately. We pin `StatusCode`-class
   parity (Bad/Uncertain/Good); value equality is not pinned.

4. **Event-rate variance.** Both backends consume the same upstream
   MXAccess publish events but route them through different deserializers
   (LMXProxyServer COM events vs gRPC `MxEvent` protos). Scheduler
   jitter on either side can shift counts within a 3s window; we pin a
   ±50% ratio, not strict equality.

5. **`IHistoryProvider` pinned on the new path only.** Phase 1 (PR 1.3)
   lifted history off the per-driver path onto the server-owned
   `HistoryRouter` for the *new* in-process `GalaxyDriver`. The legacy
   `GalaxyProxyDriver` still surfaces `IHistoryProvider` for back-compat
   with the legacy server bootstrap path; that is an accepted delta,
   retired in PR 7.2 alongside the rest of the legacy projects. The
   pin we want to enforce is "the new path doesn't regress to per-driver
   history."

6. **Read value-CLR-type.** Legacy returns the raw VARIANT (e.g.
   `Byte[]`) for an attribute that hasn't received its first value
   cycle from MxAccess yet, while mxgw returns the typed value
   (`Single`, `Int32`, etc.). Once a real value is written or scanned,
   both converge. Pinning CLR-type equality across the uninitialized
   window adds noise without a real parity invariant; the
   `StatusCode`-class assertion already covers the
   "did the read succeed" question.

7. **Write-failure StatusCode mapping.** Legacy
   `MxAccessGalaxyBackend.WriteValuesAsync` flat-maps every failure to
   `BadInternalError` (`0x80020000`); mxgw
   `GatewayGalaxyDataWriter.TranslateReply` uses
   `MxStatusProxy.RawDetectedBy` to distinguish gw-layer faults
   (`BadCommunicationError`, `0x80050000`) from MxAccess HRESULT
   faults (`BadDeviceFailure`, `BadNotConnected`, etc.). Both yield
   Bad-status; the parity invariant is the *status class*, not the
   exact code. Tighter mapping parity isn't worth investing in: the
   legacy mapping retires alongside `GalaxyProxyDriver` in PR 7.2.

8. **Single-platform scope on the dev rig.** Two
   `ScanStateProbeParityTests` scenarios are deferred to a customer
   rig with multiple deployed `$WinPlatform` instances; this dev box
   is licensed for one. PR 4.7's unit tests (`PerPlatformProbeWatcherTests`)
   pin the state-decoder + member-tracking logic at the seam level,
   so the runtime parity check becomes a customer-rig acceptance gate
   before that customer goes live, not a precondition for retiring
   the legacy projects on this dev box.

9. **Workaround for the gw `[]` array-suffix bug.**
   `mxaccessgw/src/MxGateway.Server/Galaxy/GalaxyRepository.cs:173-175`
   appends `[]` to the `full_tag_reference` of array-typed attributes,
   which `MxAccess COM IInstance.AddItem` doesn't accept. The lmxopcua
   discoverer (`GalaxyDiscoverer.StripArraySuffix`) defensively strips
   the suffix. Tracked in `mxaccessgw/requirements-array-suffix-fix.md`;
   the workaround is removed when that gw fix lands.

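Several of the deltas above come down to comparing a coarse property instead of an exact value. The Python sketch below shows two such comparisons: status-class parity (deltas 3, 6, and 7) and the ±50% event-rate tolerance (delta 4). It relies on the OPC UA convention that the top two bits of a StatusCode encode severity (00 Good, 01 Uncertain, 10 Bad). The function names are hypothetical, and reading "±50%" as a count ratio between 0.5 and 1.5 is an assumption about how the suite applies the tolerance.

```python
def status_class(code: int) -> str:
    """Map a 32-bit OPC UA StatusCode to its severity class (top two bits)."""
    severity = (code >> 30) & 0x3
    return {0: "Good", 1: "Uncertain", 2: "Bad"}.get(severity, "Reserved")


def same_status_class(legacy_code: int, mxgw_code: int) -> bool:
    """Parity on the class only; the exact code is allowed to differ."""
    return status_class(legacy_code) == status_class(mxgw_code)


def event_rate_within_tolerance(legacy_count: int, mxgw_count: int,
                                tolerance: float = 0.5) -> bool:
    """True when mxgw_count / legacy_count lies in [1 - tol, 1 + tol]."""
    if legacy_count == 0:
        # Assumption: a silent legacy side only matches a silent mxgw side.
        return mxgw_count == 0
    ratio = mxgw_count / legacy_count
    return 1 - tolerance <= ratio <= 1 + tolerance


BAD_INTERNAL_ERROR = 0x80020000       # legacy flat mapping (delta 7)
BAD_COMMUNICATION_ERROR = 0x80050000  # mxgw gateway-layer fault (delta 7)

print(same_status_class(BAD_INTERNAL_ERROR, BAD_COMMUNICATION_ERROR))  # True
print(event_rate_within_tolerance(30, 42))  # True: ratio 1.4
print(event_rate_within_tolerance(30, 50))  # False: ratio is about 1.67
```

Both checks deliberately discard detail: two different Bad codes compare equal, and event counts only need to land in the same band, which is what lets a yellow row stay stable across runs.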
## Outstanding deltas

None as of 2026-04-30. Phase 7 (PR 7.1) flipped the default to
`mxgw`; PR 7.2 (legacy project deletion) is unblocked: the matrix
gate is satisfied and no further soak/pilot precondition applies.

## Running the matrix

```bash
# Both backends must be reachable for any row to run; rows skip
# cleanly when their backend is unavailable.
dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/
```

Environment overrides for the mxgw backend:

| Variable | Default | Purpose |
|----------|---------|---------|
| `OTOPCUA_PARITY_GW_ENDPOINT` | `http://localhost:5120` | mxaccessgw gRPC endpoint |
| `OTOPCUA_PARITY_GW_API_KEY` | `parity-suite-key` | API key handed to `MxGatewayClient` |
| `OTOPCUA_PARITY_CLIENT_NAME` | `OtOpcUa-Parity` | `MxAccess.ClientName` for the session |

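As a sketch of how the overrides in the table resolve, the Python snippet below reads each variable and falls back to the documented default. The `parity_settings` helper and the example endpoint are hypothetical; the real suite is .NET and reads these variables itself, and how it treats an empty-string value is not documented here.

```python
import os

# Defaults copied from the table above.
DEFAULTS = {
    "OTOPCUA_PARITY_GW_ENDPOINT": "http://localhost:5120",
    "OTOPCUA_PARITY_GW_API_KEY": "parity-suite-key",
    "OTOPCUA_PARITY_CLIENT_NAME": "OtOpcUa-Parity",
}


def parity_settings(env=None):
    """Resolve each variable from env, falling back to its default when unset."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


# Override only the endpoint; the other two keep their defaults.
settings = parity_settings({"OTOPCUA_PARITY_GW_ENDPOINT": "http://gw.example:5120"})
print(settings["OTOPCUA_PARITY_GW_ENDPOINT"])  # http://gw.example:5120
print(settings["OTOPCUA_PARITY_CLIENT_NAME"])  # OtOpcUa-Parity
```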
The legacy backend reads ZB SQL on `localhost:1433` and spawns
`OtOpcUa.Driver.Galaxy.Host.exe` from
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Debug/net48/`; both
must exist for the legacy half to resolve.