# Galaxy parity rig — runbook

> ✅ **Completed 2026-04-30 — historical record.** This runbook is the
> recipe that produced the green parity matrix that gated PR 7.2
> (retire legacy Galaxy projects, merged at commit `ae7106d`). The
> matrix it produced is captured in
> [`Galaxy.ParityMatrix.md`](Galaxy.ParityMatrix.md), also marked
> historical. The test project this doc drove
> (`Driver.Galaxy.ParityTests`) was deleted in PR 7.2, along with
> `Driver.Galaxy.{Host,Proxy,Shared}` and the `OtOpcUaGalaxyHost`
> Windows service. **You cannot re-run this rig today.** Current
> Galaxy testing flows through the gateway's own test suite in the
> sibling `mxaccessgw` repo.
>
> The text below is preserved as-written so the migration trail (what
> was tested, against what shape, with what env vars) stays auditable.

Brings up both Galaxy backends side-by-side against a single live Galaxy so the parity matrix in `docs/v2/Galaxy.ParityMatrix.md` and the soak scenario in `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/SoakScenarioTests.cs` can run for real. Closing the parity matrix was the gate for PR 7.2 (retire legacy Galaxy projects).

## Conceptual layout

```
Galaxy ZB SQL ──┬── OtOpcUaGalaxyHost (NSSM service, net48 x86) [DELETED in PR 7.2]
                │     └── MxAccess COM, ClientName "OtOpcUa-Galaxy.Host"
                │     └── named pipe "OtOpcUaGalaxy"
                │           ▲
                │           │ pipe IPC
                │           │
                │     GalaxyProxyDriver ◄── parity test (legacy half)
                │
                └── mxaccessgw service
                      └── MxAccess COM, ClientName "OtOpcUa-Parity"
                      └── gRPC on http://localhost:5120
                            ▲
                            │ gRPC
                            │
                      GalaxyDriver (in-process) ◄── parity test (mxgw half)
```

Both halves talk to the **same Galaxy** through **two distinct MxAccess sessions** (different ClientNames so they don't evict each other).

## What was on the dev box at the time

Per `~/.claude/projects/.../memory/` *as of the rig run*:

- **AVEVA System Platform + Galaxy + MXAccess runtime** — `project_aveva_platform_installed.md`.
- **`OtOpcUaGalaxyHost`** Windows service running as `dohertj2`, NSSM-wrapped, binary at `C:\publish\OtOpcUaGalaxyHost\OtOpcUa.Driver.Galaxy.Host.exe`, shared secret at `.local/galaxy-host-secret.txt`, ZB SQL on `localhost:1433` — `project_galaxy_host_installed.md`. **(Service uninstalled and binary retired as part of PR 7.2; the host source project no longer exists in this repo.)**
- **Parity test project** (`Driver.Galaxy.ParityTests`) — committed and skip-clean at the time of the rig run. **Deleted in PR 7.2.**

## Setup steps (one-time)

### 1. Build + run mxaccessgw

The gateway source is at `c:\Users\dohertj2\Desktop\mxaccessgw\`. Build both halves — the worker has to be x86 net48 (MxAccess COM bitness), the server is .NET 10:

```powershell
cd C:\Users\dohertj2\Desktop\mxaccessgw
dotnet build src\MxGateway.Worker -c Release
# produces bin\x86\Release\net48\MxGateway.Worker.exe
dotnet build src\MxGateway.Server -c Release
# produces bin\Release\net10.0\MxGateway.Server.dll
```

Initialize the auth database and mint an API key. The CLI mode is gated by an `apikey` first-arg prefix:

```powershell
$env:MxGateway__ApiKeyPepper = "parity-rig-dev-pepper"  # any stable string for dev
$srv = "C:\Users\dohertj2\Desktop\mxaccessgw\src\MxGateway.Server\bin\Release\net10.0\MxGateway.Server.dll"

dotnet $srv apikey init-db
# → "init-db: initialized"

dotnet $srv apikey create-key `
  --key-id parity-rig `
  --display-name "OtOpcUa-Parity" `
  --scopes "session:open,session:close,invoke:read,invoke:write,invoke:secure,events:read,metadata:read"
# → "API key: mxgw_parity-rig_"   ← capture this; you can't list secrets later
```

Save that exact key string for `OTOPCUA_PARITY_GW_API_KEY` in step 2.
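Because the secret cannot be listed later, it can help to pull the key out of the captured command output rather than copy it by hand. The following is a minimal sketch, assuming the output contains a line of the `API key: <value>` shape shown above; `Get-ApiKeyFromOutput` is a hypothetical helper, not part of the gateway CLI.

```powershell
# Hypothetical helper (not part of mxaccessgw): returns the first token that
# follows "API key:" in the captured output lines, or nothing if no line matches.
function Get-ApiKeyFromOutput {
    param([string[]]$Output)
    foreach ($line in $Output) {
        if ($line -match 'API key:\s*(\S+)') {
            return $Matches[1]
        }
    }
}

# Usage sketch (assumes $out holds the captured create-key output from step 1):
# $env:OTOPCUA_PARITY_GW_API_KEY = Get-ApiKeyFromOutput $out
```

If the real output wraps the key in quotes or adds trailing text, the pattern would need adjusting; check the captured value before exporting it.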
Run the server with the same pepper plus three env-var overrides — the defaults don't quite match what gRPC + the parity test need:

```powershell
$env:MxGateway__ApiKeyPepper = "parity-rig-dev-pepper"  # MUST match the create-key invocation
$env:Kestrel__Endpoints__Http__Url = "http://localhost:5120"
$env:Kestrel__Endpoints__Http__Protocols = "Http2"  # gRPC needs h2c on plain HTTP
$env:MxGateway__Worker__ExecutablePath = `
  "C:\Users\dohertj2\Desktop\mxaccessgw\src\MxGateway.Worker\bin\x86\Release\net48\MxGateway.Worker.exe"
# appsettings.json's relative path is missing the \net48 segment; absolute path sidesteps that

dotnet $srv
# → "Now listening on: http://localhost:5120"
```

The worker spawns lazily on the first OpenSession RPC — there's no worker process visible in Task Manager until the first session. If the worker can't spawn, the server returns `Failed to open session session-…` with a `WorkerProcessLaunchException` in the server log.

NSSM-wrap it later if the rig becomes long-lived; for first-pass provisioning a console window is easier to inspect.

### 2. Set the parity env vars

In the test-runner shell:

```powershell
$env:OTOPCUA_PARITY_GW_ENDPOINT = "http://localhost:5120"
$env:OTOPCUA_PARITY_GW_API_KEY = "parity-suite-key"  # placeholder: use the key minted in step 1 (must match the gw config)
$env:OTOPCUA_PARITY_CLIENT_NAME = "OtOpcUa-Parity"
```

Elevation status doesn't matter — the legacy Galaxy.Host pipe ACL accepts elevated and non-elevated `dohertj2` shells alike (the Administrators deny ACE was removed 2026-04-24; see `project_galaxy_host_installed.md`).

### 3. Verify both halves resolve

```powershell
cd C:\Users\dohertj2\Desktop\lmxopcua
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
  --filter "FullyQualifiedName~HarnessShapeTests"
```

`Harness_records_a_skip_reason_for_each_unavailable_backend` is the two-line truth-teller:

- `LegacyDriver` and `MxGatewayDriver` both non-null → rig is up.
- One side null → read its `LegacySkipReason` / `MxGatewaySkipReason` and fix.
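Before running the harness check, a quick preflight can confirm that the three variables from step 2 are set and that the gateway port answers. This is a minimal sketch, not part of the repo: `Get-MissingParityVars` is a hypothetical helper, and the reachability probe reuses `Test-NetConnection` (Windows-only) against whatever host and port the endpoint URL names.

```powershell
# Hypothetical preflight helper (not part of the repo). Takes a hashtable of
# variable name -> value and returns the step 2 names that are missing or blank.
function Get-MissingParityVars {
    param([hashtable]$Environment)
    $required = 'OTOPCUA_PARITY_GW_ENDPOINT',
                'OTOPCUA_PARITY_GW_API_KEY',
                'OTOPCUA_PARITY_CLIENT_NAME'
    $required | Where-Object { [string]::IsNullOrWhiteSpace($Environment[$_]) }
}

# Snapshot the current process environment into a hashtable and check it.
$current = @{}
Get-ChildItem Env: | ForEach-Object { $current[$_.Name] = $_.Value }
$missing = @(Get-MissingParityVars -Environment $current)

if ($missing.Count -gt 0) {
    Write-Warning "Unset parity variables: $($missing -join ', ')"
} else {
    # Probe the host/port taken from the endpoint URL; returns $true or $false.
    $endpoint = [uri]$env:OTOPCUA_PARITY_GW_ENDPOINT
    Test-NetConnection $endpoint.Host -Port $endpoint.Port -InformationLevel Quiet
}
```

A passing preflight only shows the variables exist and the port is open; an API key mismatch would still surface later as a skip reason from the harness.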
## Running the matrix

Once both halves resolve:

```powershell
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
  --filter "Category=ParityE2E"
```

This runs all 17 scenario tests across the seven scenario classes (BrowseAndRead / Subscribe / Write / Alarm / History / Reconnect / ScanState). Each scenario class is independent — failures in one don't block the rest.

Track the result against `docs/v2/Galaxy.ParityMatrix.md`. Update each row to:

- **green** if the scenario passes
- **yellow** if it skipped because the dev Galaxy doesn't have the right shape (see coverage matrix below)
- **red** if it asserted a real delta — those are the deltas that block PR 7.2; chase each before retiring the legacy backend

## Galaxy shape needed for full coverage

Skip-on-empty-shape scenarios fail-soft today. To turn a skip into a real result, the dev Galaxy needs the shape in the right column:

| Scenario | Needs | Local rig |
|---|---|---|
| `BrowseAndReadParityTests` (3 tests) | Any deployed objects with attributes | ✅ existing seed |
| `SubscribeAndEventRateParityTests` event-rate | ≥5 attributes whose values *change* in 3s | ⚙ scriptable via graccess-cli |
| `WriteByClassificationParityTests` (FreeAccess/Operate) | A FreeAccess/Operate numeric attribute | ⚙ scriptable via graccess-cli |
| `WriteByClassificationParityTests` (Configure/Tune) | A Configure/Tune attribute | ⚙ scriptable via graccess-cli |
| `AlarmTransitionParityTests` (2 tests) | Attributes with the `$Alarm*` extension | ⚙ scriptable via graccess-cli |
| `HistoryReadParityTests` (historized set) | Attributes with the History extension | ⚙ scriptable via graccess-cli |
| `ScanStateProbeParityTests` (2 tests) | Multiple `$WinPlatform` / `$AppEngine` objects | ❌ **deferred to customer rig** — this dev box is provisioned for one platform only |

### The single-platform constraint

The dev box at `DESKTOP-6JL3KKO` is licensed / configured for a single deployed `$WinPlatform`.
Adding a second platform isn't feasible here, so `ScanStateProbeParityTests` will skip in a "no overlap" branch on this rig. Both of its scenarios already handle that case gracefully (`Assert.Skip("no overlapping platform hosts between backends — likely the transport names differ but no $WinPlatform was discovered")`), so the matrix reports them as **n/a (deferred)** rather than red.

Plan: defer the two ScanState scenarios to a customer rig with multiple platforms. The PR 7.2 gate accepts "n/a, deferred" on these rows provided the legacy `GalaxyRuntimeProbeManager` and the in-process `PerPlatformProbeWatcher` have matching unit-test coverage of the state-decoder + member-tracking logic — which they do (PR 4.7's tests). Treat the runtime parity check as a customer-rig acceptance gate before that customer goes live, not a precondition for retiring the legacy projects on this dev box.

### Provisioning the rest via graccess-cli

`C:\Users\dohertj2\Desktop\graccess\graccess_cli\` is a .NET Framework 4.8 console app over the ArchestrA GRAccess COM API. It can configure templates, instances, attributes, UDAs, extensions, and attribute security — i.e. every row above marked ⚙ scriptable. Full surface in `graccess/graccess_cli/docs/usage.md` and per-area workflow guides (`attribute-editing.md`, `template-editing.md`, `template-instance-editing.md`).

Reserve a sandbox UDO (e.g. `OtOpcUaParityTest`) to avoid mutating attributes on plant-relevant objects.
Concrete commands per requirement:

**A FreeAccess/Operate numeric attribute** (covers WriteByClassification FreeAccess/Operate scenario):

```powershell
graccess object uda add `
  --galaxy ZB --name OtOpcUaParityTest --type template `
  --uda OperateValue --data-type MxFloat `
  --category MxCategoryWriteable_C --security MxSecurityOperate `
  --confirm --confirm-target OtOpcUaParityTest
```

**A Configure / Tune attribute** (covers WriteByClassification Configure/Tune scenario):

```powershell
# Tune
graccess object uda add `
  --galaxy ZB --name OtOpcUaParityTest --type template `
  --uda TuneValue --data-type MxFloat `
  --category MxCategoryWriteable_T --security MxSecurityTune `
  --confirm --confirm-target OtOpcUaParityTest

# Configure
graccess object uda add `
  --galaxy ZB --name OtOpcUaParityTest --type template `
  --uda ConfigValue --data-type MxFloat `
  --category MxCategoryWriteable_C --security MxSecurityConfigure `
  --confirm --confirm-target OtOpcUaParityTest
```

**A changing-value attribute** (covers Subscribe event-rate scenario). Two ways:

1. *On-scan increment* — bind a script extension that bumps a counter each scan. Simplest to author with `object extension add` against `ScriptExtension` plus `object attribute set` for the script body (see `attribute-editing.md` §"Edit Extensions" for the pattern).
2. *External writer loop* — leave the attribute as plain Float and run a one-liner that writes incrementing values from the parity-test shell. Uses the legacy backend path so it's available before the mxgw subscriber is up. This keeps the Galaxy template clean.

For first-pass validation pick #2 — no template surgery needed, and the write loop runs only during `dotnet test`.

**Attributes with the `$Alarm*` extension** (covers AlarmTransition scenario). Per `attribute-editing.md` §"Edit Alarm Settings" the likely-named attributes vary by extension type (`Limit`, `RateOfChange`, etc.).
Add the extension via:

```powershell
graccess object extension add `
  --galaxy ZB --name OtOpcUaParityTest --type template `
  --extension-type AnalogLimitAlarm --primitive AlarmInput `
  --object-extension `
  --confirm --confirm-target OtOpcUaParityTest
```

Then set HiHi/Hi/Lo/LoLo limit values + priority on the resulting attributes via `object attribute set`. Inspect first via `object attributes` to see the names the extension introduces — they differ across AVEVA versions.

**Attributes with the History extension** (covers HistoryRead routing scenario). History settings are usually attribute or extension attributes; `attribute-editing.md` §"Edit History Settings" covers the discovery flow. Quick start:

```powershell
graccess object extension add `
  --galaxy ZB --name OtOpcUaParityTest --type template `
  --extension-type HistoryExtension --primitive HistoryRecord `
  --object-extension `
  --confirm --confirm-target OtOpcUaParityTest

# Then enable history on whichever attribute the extension points at
graccess object attribute set `
  --galaxy ZB --name OtOpcUaParityTest --type template `
  --attribute HistoryEnabled --value true --data-type bool `
  --confirm --confirm-target OtOpcUaParityTest
```

**Deploy + restart Galaxy.Host after any of the above** so MxAccess sees the change:

```powershell
graccess object deploy --galaxy ZB --name OtOpcUaParityTest_001 `
  --confirm --confirm-target OtOpcUaParityTest_001

# sc.exe has no restart verb; Restart-Service stops and starts the service in one call
Restart-Service OtOpcUaGalaxyHost  # service no longer exists post-PR-7.2; in the modern shape, restart mxaccessgw instead
```

Then re-run the parity matrix. The previously-skipped scenarios should now find a sandbox attribute matching their selector and assert.

## Soak run

The 24h × 50k soak gates the production confidence half of PR 7.2.
```powershell
$env:OTOPCUA_SOAK_RUN = "1"
$env:OTOPCUA_SOAK_TAGS = ""
$env:OTOPCUA_SOAK_MINUTES = "1440"  # default 24h; compress for first runs
$env:OTOPCUA_SOAK_DROP_PCT = "0.5"

dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
  --filter "Category=Soak"
```

The test logs a per-minute CSV-style line to stdout:

```
soak,1.0,received=51234,dispatched=51234,dropped=0,ws_mb=412
soak,2.0,received=102468,dispatched=102468,dropped=0,ws_mb=415
...
```

Capture stdout to a file for post-run analysis. The three guards (`received` growing, `dropped/received` ratio, working-set delta) all fire mid-run rather than at end-of-test, so a failure surfaces within the first few minutes if the architecture is wrong.

## Compressed-tag soak (when Galaxy isn't 50k tags)

A first-pass validation is fine with the override:

```powershell
$env:OTOPCUA_SOAK_RUN = "1"
$env:OTOPCUA_SOAK_TAGS = "500"      # whatever the dev Galaxy has
$env:OTOPCUA_SOAK_MINUTES = "60"    # one hour is enough to surface plumbing bugs
$env:OTOPCUA_SOAK_DROP_PCT = "1.0"
```

This validates the *plumbing* (bounded channel, pump invariants, leak guard) but doesn't pin the 50k-tag scaling assertion. Defer the full 50k validation to a customer rig with that scale, or build a synthetic Galaxy with a script that imports 50k attributes onto a generated UDO (~2 hours of one-off work).

## Troubleshooting

- **`MxGatewaySkipReason` says "mxaccessgw not reachable"** — the gw isn't listening, or it's on a different port. `Test-NetConnection localhost -Port 5120` is the quick check.
- **`MxGatewaySkipReason` says "mxgateway backend boot failed: RpcException: Unauthenticated"** — API key mismatch. Verify the `OTOPCUA_PARITY_GW_API_KEY` env var matches the gw's configured key.
- **`LegacySkipReason` says "Galaxy ZB SQL not reachable on localhost:1433"** — SQL Server isn't running, or its TCP listener is off. Check `services.msc` for the SQL Server (default) instance.
- **`LegacySkipReason` says "Galaxy.Host EXE not built"** — at rig time the parity harness looked under `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Debug/net48/` for the EXE it spawned as a subprocess, separate from the published copy at `C:\publish\OtOpcUaGalaxyHost\` used by the Windows service. **Both the source project and the published binary were removed in PR 7.2, so this troubleshooting branch no longer applies — the legacy half cannot be brought up at all.**
- **Both halves resolve but parity scenarios assert deltas** — that's the expected outcome the rig exists to surface. Review each delta against `docs/v2/Galaxy.ParityMatrix.md`'s "Accepted deltas" section to decide whether it's a real bug or a pre-accepted divergence.

## After the rig is green

When the matrix is fully green or carries documented accepted-deltas, PR 7.2 (legacy project deletion) is unblocked. The only follow-up is to promote any newly-discovered accepted-delta to the matrix doc with the why so the matrix history stays auditable.
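For the post-run analysis mentioned under "Soak run", the captured per-minute lines can be re-checked offline against the same three guards. The sketch below is an approximation, not the test's own implementation: `ConvertFrom-SoakLine` and `Test-SoakLog` are hypothetical helpers, `MaxDropPct` assumes `OTOPCUA_SOAK_DROP_PCT` is a percentage, and the `MaxWorkingSetGrowthMb` default is an arbitrary placeholder because the runbook does not state the real working-set threshold.

```powershell
# Hypothetical parser: turns one "soak,<minute>,received=..,..." line into an
# object, or returns nothing for lines that do not match that shape.
function ConvertFrom-SoakLine {
    param([string]$Line)
    $pattern = '^soak,(?<min>[\d.]+),received=(?<rx>\d+),dispatched=(?<tx>\d+),dropped=(?<drop>\d+),ws_mb=(?<ws>\d+)$'
    if ("$Line".Trim() -match $pattern) {
        [pscustomobject]@{
            Minute       = [double]$Matches['min']
            Received     = [long]$Matches['rx']
            Dispatched   = [long]$Matches['tx']
            Dropped      = [long]$Matches['drop']
            WorkingSetMb = [long]$Matches['ws']
        }
    }
}

# Hypothetical checker: returns one message per guard violation (none if clean).
function Test-SoakLog {
    param(
        [string[]]$Lines,
        [double]$MaxDropPct = 0.5,            # assumed to be a percentage
        [long]$MaxWorkingSetGrowthMb = 100    # placeholder threshold, not from the runbook
    )
    $rows = @($Lines | ForEach-Object { ConvertFrom-SoakLine $_ } | Where-Object { $_ })
    $problems = @()
    for ($i = 0; $i -lt $rows.Count; $i++) {
        $r = $rows[$i]
        # Guard 1: received must keep growing minute over minute.
        if ($i -gt 0 -and $r.Received -le $rows[$i - 1].Received) {
            $problems += "minute $($r.Minute): received did not grow"
        }
        # Guard 2: dropped/received ratio, expressed as a percentage.
        if ($r.Received -gt 0 -and (100.0 * $r.Dropped / $r.Received) -gt $MaxDropPct) {
            $problems += "minute $($r.Minute): drop ratio above $MaxDropPct%"
        }
        # Guard 3: working-set growth relative to the first sample.
        if (($r.WorkingSetMb - $rows[0].WorkingSetMb) -gt $MaxWorkingSetGrowthMb) {
            $problems += "minute $($r.Minute): working set grew more than $MaxWorkingSetGrowthMb MB"
        }
    }
    $problems
}
```

Assuming stdout was captured to a file such as `soak.log` (a hypothetical name), `@(Test-SoakLog -Lines (Get-Content soak.log))` yields an empty array for a clean run.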