Audit (three parallel agent passes) found 43 markdown files carrying stale references to the deleted Galaxy.Host/Proxy/Shared projects after the v2-mxgw merge. This commit lands the prioritized fixes.

Track 1: high-traffic in-place rewrites (3 files, ~454 lines deleted)
- README.md (202 → 91 lines): drops .NET 4.8 / x86 / TopShelf install text; leads with the multi-driver .NET 10 server identity and points at scripts/install/Install-Services.ps1 and the parity rig.
- docs/v2/driver-specs.md §1 Galaxy (~289 → ~66 lines): replaces the Tier-C out-of-process spec with a Tier-A in-process description matching the current GalaxyDriver code, with the four-section GalaxyDriverOptions JSON shape pulled verbatim from Config/GalaxyDriverOptions.cs.
- docs/drivers/Galaxy.md (211 → 92 lines): full rewrite around the current Browse/Runtime/Health/Config sub-folders.

Track 2: historical banners (5 files)
- lmx_mxgw.md, lmx_mxgw_impl.md, lmx_backend.md, docs/v2/Galaxy.ParityMatrix.md, docs/v2/implementation/phase-2-galaxy-out-of-process.md each get a "✅ Completed 2026-04-30 - historical record" banner block. lmx_mxgw.md also fixes two dead links (`docs/Galaxy.Driver.md` and `docs/v2/Galaxy.Driver.md`) → `docs/drivers/Galaxy.md`.

Track 3: v1 archive sweep (10 git mv + 1 new index + 2 in-place scrubs)
- Moved 10 v1 docs under docs/v1/ preserving subpath structure: AlarmTracking, Configuration, DataTypeMapping, HistoricalDataAccess, Subscriptions (top-level); drivers/Galaxy-Repository, drivers/Galaxy-Test-Fixture; reqs/GalaxyRepositoryReqs, reqs/MxAccessClientReqs, reqs/ServiceHostReqs.
- New docs/v1/README.md is the shared archive banner + per-file table.
- docs/README.md repointed to the v1 paths and updated to reflect the v2 two-process deploy shape (Server + Admin + optional OtOpcUaWonderwareHistorian).
- docs/v2/Galaxy.ParityRig.md got a historical banner + four inline scrubs marking the OtOpcUaGalaxyHost service / Driver.Galaxy.Host EXE / Driver.Galaxy.ParityTests project as deleted-in-PR-7.2.

The repo's live-reading surface (README + CLAUDE.md + docs/v2/) now describes only the post-PR-7.2 architecture. v1 docs are preserved as a labelled archive under docs/v1/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Galaxy parity rig — runbook
✅ Completed 2026-04-30 - historical record. This runbook is the recipe that produced the green parity matrix that gated PR 7.2 (retire legacy Galaxy projects, merged at commit ae7106d). The matrix it produced is captured in Galaxy.ParityMatrix.md, also marked historical. The test project this doc drove (Driver.Galaxy.ParityTests) was deleted in PR 7.2, along with Driver.Galaxy.{Host,Proxy,Shared} and the OtOpcUaGalaxyHost Windows service. You cannot re-run this rig today. Current Galaxy testing flows through the gateway's own test suite in the sibling mxaccessgw repo.

The text below is preserved as-written so the migration trail (what was tested, against what shape, with what env vars) stays auditable.
Brings up both Galaxy backends side-by-side against a single live Galaxy
so the parity matrix in docs/v2/Galaxy.ParityMatrix.md and the soak
scenario in tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/SoakScenarioTests.cs
can run for real. Closing the parity matrix was the gate for PR 7.2
(retire legacy Galaxy projects).
Conceptual layout
    Galaxy ZB SQL ──┬── OtOpcUaGalaxyHost (NSSM service, net48 x86)   [DELETED in PR 7.2]
                    │     └── MxAccess COM, ClientName "OtOpcUa-Galaxy.Host"
                    │           └── named pipe "OtOpcUaGalaxy"
                    │                  ▲
                    │                  │ pipe IPC
                    │                  │
                    │           GalaxyProxyDriver  ◄── parity test (legacy half)
                    │
                    └── mxaccessgw service
                          └── MxAccess COM, ClientName "OtOpcUa-Parity"
                                └── gRPC on http://localhost:5120
                                       ▲
                                       │ gRPC
                                       │
                                GalaxyDriver (in-process)  ◄── parity test (mxgw half)
Both halves talk to the same Galaxy through two distinct MxAccess sessions (different ClientNames so they don't evict each other).
What was on the dev box at the time
Per ~/.claude/projects/.../memory/ as of the rig run:
- AVEVA System Platform + Galaxy + MXAccess runtime (see project_aveva_platform_installed.md).
- OtOpcUaGalaxyHost Windows service running as dohertj2, NSSM-wrapped, binary at C:\publish\OtOpcUaGalaxyHost\OtOpcUa.Driver.Galaxy.Host.exe, shared secret at .local/galaxy-host-secret.txt, ZB SQL on localhost:1433 (see project_galaxy_host_installed.md). (Service uninstalled and binary retired as part of PR 7.2; the host source project no longer exists in this repo.)
- Parity test project (Driver.Galaxy.ParityTests): committed and skip-clean at the time of the rig run. Deleted in PR 7.2.
Setup steps (one-time)
1. Build + run mxaccessgw
The gateway source is at c:\Users\dohertj2\Desktop\mxaccessgw\.
Build both halves — the worker has to be x86 net48 (MxAccess COM
bitness), the server is .NET 10:
cd C:\Users\dohertj2\Desktop\mxaccessgw
dotnet build src\MxGateway.Worker -c Release # produces bin\x86\Release\net48\MxGateway.Worker.exe
dotnet build src\MxGateway.Server -c Release # produces bin\Release\net10.0\MxGateway.Server.dll
Initialize the auth database and mint an API key. The CLI mode is
gated by an apikey first-arg prefix:
$env:MxGateway__ApiKeyPepper = "parity-rig-dev-pepper" # any stable string for dev
$srv = "C:\Users\dohertj2\Desktop\mxaccessgw\src\MxGateway.Server\bin\Release\net10.0\MxGateway.Server.dll"
dotnet $srv apikey init-db # → "init-db: initialized"
dotnet $srv apikey create-key `
--key-id parity-rig `
--display-name "OtOpcUa-Parity" `
--scopes "session:open,session:close,invoke:read,invoke:write,invoke:secure,events:read,metadata:read"
# → "API key: mxgw_parity-rig_<base64suffix>" ← capture this; you can't list secrets later
Save that exact key string for OTOPCUA_PARITY_GW_API_KEY in step 2.
Run the server with the same pepper plus three env-var overrides (listen URL, HTTP protocol, worker path); the defaults don't quite match what gRPC + the parity test need:
$env:MxGateway__ApiKeyPepper = "parity-rig-dev-pepper" # MUST match the create-key invocation
$env:Kestrel__Endpoints__Http__Url = "http://localhost:5120"
$env:Kestrel__Endpoints__Http__Protocols = "Http2" # gRPC needs h2c on plain HTTP
$env:MxGateway__Worker__ExecutablePath = `
"C:\Users\dohertj2\Desktop\mxaccessgw\src\MxGateway.Worker\bin\x86\Release\net48\MxGateway.Worker.exe"
# appsettings.json's relative path is missing the \net48 segment; absolute path sidesteps that
dotnet $srv
# → "Now listening on: http://localhost:5120"
The worker spawns lazily on the first OpenSession RPC — there's no
worker process visible in Task Manager until the first session. If
the worker can't spawn, the server returns Failed to open session session-… with a WorkerProcessLaunchException in the server log.
NSSM-wrap it later if the rig becomes long-lived; for first-pass provisioning a console window is easier to inspect.
2. Set the parity env vars
In the test-runner shell:
$env:OTOPCUA_PARITY_GW_ENDPOINT = "http://localhost:5120"
$env:OTOPCUA_PARITY_GW_API_KEY = "mxgw_parity-rig_<base64suffix>" # the exact key string captured from create-key in step 1
$env:OTOPCUA_PARITY_CLIENT_NAME = "OtOpcUa-Parity"
Elevation status doesn't matter — the legacy Galaxy.Host pipe ACL accepts
elevated and non-elevated dohertj2 shells alike (the Administrators deny
ACE was removed 2026-04-24; see project_galaxy_host_installed.md).
3. Verify both halves resolve
cd C:\Users\dohertj2\Desktop\lmxopcua
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
--filter "FullyQualifiedName~HarnessShapeTests"
Harness_records_a_skip_reason_for_each_unavailable_backend is the
two-line truth-teller:
- Both LegacyDriver and MxGatewayDriver non-null → rig is up.
- One side null → read its LegacySkipReason / MxGatewaySkipReason and fix.
Running the matrix
Once both halves resolve:
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
--filter "Category=ParityE2E"
This runs all 17 scenario tests across the seven scenario classes (BrowseAndRead / Subscribe / Write / Alarm / History / Reconnect / ScanState). Each scenario class is independent — failures in one don't block the rest.
Track the result against docs/v2/Galaxy.ParityMatrix.md. Update each
row to:
- green if the scenario passes
- yellow if it skipped because the dev Galaxy doesn't have the right shape (see coverage matrix below)
- red if it asserted a real delta — those are the deltas that block PR 7.2; chase each before retiring the legacy backend
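The row-colouring rule above can be sketched as a small helper for tallying a run. This is only an illustration: the outcome labels ("passed", "skipped", "failed") and the function names are hypothetical, not something the parity test project emitted.

```python
from enum import Enum


class RowStatus(Enum):
    GREEN = "green"    # scenario passed
    YELLOW = "yellow"  # skipped: dev Galaxy lacks the required shape
    RED = "red"        # asserted a real delta


def classify(outcome: str) -> RowStatus:
    """Map a test outcome label to a matrix row colour.

    The labels are assumed (hypothetical); adapt them to whatever
    your test log actually prints.
    """
    mapping = {
        "passed": RowStatus.GREEN,
        "skipped": RowStatus.YELLOW,
        "failed": RowStatus.RED,
    }
    try:
        return mapping[outcome.lower()]
    except KeyError:
        raise ValueError(f"unknown outcome: {outcome!r}") from None


def blocking_rows(results: dict[str, str]) -> list[str]:
    """Return the scenario names that are red, i.e. the ones to chase first."""
    return sorted(
        name for name, outcome in results.items()
        if classify(outcome) is RowStatus.RED
    )
```

Feeding it a dictionary of scenario name to outcome gives the list of rows that would have blocked PR 7.2; yellow rows are deliberately not treated as blocking.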
Galaxy shape needed for full coverage
Skip-on-empty-shape scenarios fail-soft today. To turn a skip into a real result, the dev Galaxy needs the shape in the Needs column:
| Scenario | Needs | Local rig |
|---|---|---|
| BrowseAndReadParityTests (3 tests) | Any deployed objects with attributes | ✅ existing seed |
| SubscribeAndEventRateParityTests event-rate | ≥5 attributes whose values change in 3s | ⚙ scriptable via graccess-cli |
| WriteByClassificationParityTests (FreeAccess/Operate) | A FreeAccess/Operate numeric attribute | ⚙ scriptable via graccess-cli |
| WriteByClassificationParityTests (Configure/Tune) | A Configure/Tune attribute | ⚙ scriptable via graccess-cli |
| AlarmTransitionParityTests (2 tests) | Attributes with the $Alarm* extension | ⚙ scriptable via graccess-cli |
| HistoryReadParityTests (historized set) | Attributes with the History extension | ⚙ scriptable via graccess-cli |
| ScanStateProbeParityTests (2 tests) | Multiple $WinPlatform / $AppEngine objects | ❌ deferred to customer rig; this dev box is provisioned for one platform only |
The single-platform constraint
The dev box at DESKTOP-6JL3KKO is licensed / configured for a single
deployed $WinPlatform. Adding a second platform isn't feasible here,
so ScanStateProbeParityTests will skip in a "no overlap" branch on
this rig. Both of its scenarios already handle that case gracefully
(Assert.Skip("no overlapping platform hosts between backends — likely the transport names differ but no $WinPlatform was discovered")), so
the matrix reports them as n/a (deferred) rather than red.
Plan: defer the two ScanState scenarios to a customer rig with multiple
platforms. The PR 7.2 gate accepts "n/a, deferred" on these rows
provided the legacy GalaxyRuntimeProbeManager and the in-process
PerPlatformProbeWatcher have matching unit-test coverage of the
state-decoder + member-tracking logic — which they do (PR 4.7's tests).
Treat the runtime parity check as a customer-rig acceptance gate before
that customer goes live, not a precondition for retiring the legacy
projects on this dev box.
Provisioning the rest via graccess-cli
C:\Users\dohertj2\Desktop\graccess\graccess_cli\ is a .NET Framework
4.8 console app over the ArchestrA GRAccess COM API. It can configure
templates, instances, attributes, UDAs, extensions, and attribute
security — i.e. every row above marked ⚙ scriptable. Full surface in
graccess/graccess_cli/docs/usage.md and per-area workflow guides
(attribute-editing.md, template-editing.md,
template-instance-editing.md).
Reserve a sandbox UDO (e.g. OtOpcUaParityTest) to avoid mutating
attributes on plant-relevant objects. Concrete commands per requirement:
A FreeAccess/Operate numeric attribute (covers WriteByClassification FreeAccess/Operate scenario):
graccess object uda add `
--galaxy ZB --name OtOpcUaParityTest --type template `
--uda OperateValue --data-type MxFloat `
--category MxCategoryWriteable_C --security MxSecurityOperate `
--confirm --confirm-target OtOpcUaParityTest
A Configure / Tune attribute (covers WriteByClassification Configure/Tune scenario):
# Tune
graccess object uda add `
--galaxy ZB --name OtOpcUaParityTest --type template `
--uda TuneValue --data-type MxFloat `
--category MxCategoryWriteable_T --security MxSecurityTune `
--confirm --confirm-target OtOpcUaParityTest
# Configure
graccess object uda add `
--galaxy ZB --name OtOpcUaParityTest --type template `
--uda ConfigValue --data-type MxFloat `
--category MxCategoryWriteable_C --security MxSecurityConfigure `
--confirm --confirm-target OtOpcUaParityTest
A changing-value attribute (covers Subscribe event-rate scenario). Two ways:
1. On-scan increment: bind a script extension that bumps a counter each scan. Simplest to author with object extension add against ScriptExtension plus object attribute set for the script body (see attribute-editing.md §"Edit Extensions" for the pattern).
2. External writer loop: leave the attribute as plain Float and run a one-liner that writes incrementing values from the parity-test shell. Uses the legacy backend path so it's available before the mxgw subscriber is up. This keeps the Galaxy template clean.
For first-pass validation pick #2 — no template surgery needed, and the
write loop runs only during dotnet test.
Attributes with the $Alarm* extension (covers AlarmTransition
scenario). Per attribute-editing.md §"Edit Alarm Settings" the
likely-named attributes vary by extension type
(Limit, RateOfChange, etc.). Add the extension via:
graccess object extension add `
--galaxy ZB --name OtOpcUaParityTest --type template `
--extension-type AnalogLimitAlarm --primitive AlarmInput `
--object-extension `
--confirm --confirm-target OtOpcUaParityTest
Then set HiHi/Hi/Lo/LoLo limit values + priority on the resulting
attributes via object attribute set. Inspect first via
object attributes to see the names the extension introduces — they
differ across AVEVA versions.
Attributes with the History extension (covers HistoryRead routing
scenario). History settings are usually attribute or extension
attributes; attribute-editing.md §"Edit History Settings" covers the
discovery flow. Quick start:
graccess object extension add `
--galaxy ZB --name OtOpcUaParityTest --type template `
--extension-type HistoryExtension --primitive HistoryRecord `
--object-extension `
--confirm --confirm-target OtOpcUaParityTest
# Then enable history on whichever attribute the extension points at
graccess object attribute set `
--galaxy ZB --name OtOpcUaParityTest --type template `
--attribute HistoryEnabled --value true --data-type bool `
--confirm --confirm-target OtOpcUaParityTest
Deploy + restart Galaxy.Host after any of the above so MxAccess sees the change:
graccess object deploy --galaxy ZB --name OtOpcUaParityTest_001 `
--confirm --confirm-target OtOpcUaParityTest_001
sc.exe restart OtOpcUaGalaxyHost # service no longer exists post-PR-7.2; in the modern shape, restart mxaccessgw instead
Then re-run the parity matrix. The previously-skipped scenarios should now find a sandbox attribute matching their selector and assert.
Soak run
The 24h × 50k soak gates the production confidence half of PR 7.2.
$env:OTOPCUA_SOAK_RUN = "1"
$env:OTOPCUA_SOAK_TAGS = "<actual tag count if Galaxy < 50k>"
$env:OTOPCUA_SOAK_MINUTES = "1440" # default 24h; compress for first runs
$env:OTOPCUA_SOAK_DROP_PCT = "0.5"
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
--filter "Category=Soak"
The test logs a per-minute CSV-style line to stdout:
soak,1.0,received=51234,dispatched=51234,dropped=0,ws_mb=412
soak,2.0,received=102468,dispatched=102468,dropped=0,ws_mb=415
...
Capture stdout to a file for post-run analysis. The three guards
(received growing, dropped/received ratio, working-set delta) all
fire mid-run rather than at end-of-test, so a failure surfaces within
the first few minutes if the architecture is wrong.
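For post-run analysis of a captured log, a minimal sketch along these lines may help. It assumes the second field is elapsed minutes and that the drop threshold is a percentage (inferred from the OTOPCUA_SOAK_DROP_PCT name); the working-set limit `max_ws_growth_mb` and all function names are hypothetical, and this is not the test's own guard code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoakSample:
    minute: float
    received: int
    dispatched: int
    dropped: int
    ws_mb: int


def parse_soak_line(line: str) -> SoakSample:
    """Parse one 'soak,<minute>,key=value,...' line into a SoakSample."""
    parts = line.strip().split(",")
    if len(parts) < 3 or parts[0] != "soak":
        raise ValueError(f"not a soak line: {line!r}")
    fields = dict(p.split("=", 1) for p in parts[2:])
    return SoakSample(
        minute=float(parts[1]),
        received=int(fields["received"]),
        dispatched=int(fields["dispatched"]),
        dropped=int(fields["dropped"]),
        ws_mb=int(fields["ws_mb"]),
    )


def check_guards(samples, drop_pct=0.5, max_ws_growth_mb=200):
    """Return a list of guard violations; an empty list means all held.

    max_ws_growth_mb is an assumed limit, not a value from the runbook.
    """
    problems = []
    # Guard 1: the received counter must keep growing between samples.
    for prev, cur in zip(samples, samples[1:]):
        if cur.received <= prev.received:
            problems.append(f"minute {cur.minute}: received did not grow")
    # Guard 2: dropped/received ratio, expressed as a percentage.
    for s in samples:
        if s.received and 100.0 * s.dropped / s.received > drop_pct:
            problems.append(f"minute {s.minute}: drop ratio above {drop_pct}%")
    # Guard 3: working-set delta between first and last sample.
    if samples and samples[-1].ws_mb - samples[0].ws_mb > max_ws_growth_mb:
        problems.append("working set grew more than the allowed delta")
    return problems
```

Run over the two sample lines above, `check_guards` returns an empty list; lines that are not soak lines raise `ValueError`, so filter the captured file for the `soak,` prefix first.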
Compressed-tag soak (when Galaxy isn't 50k tags)
A first-pass validation is fine with the override:
$env:OTOPCUA_SOAK_RUN = "1"
$env:OTOPCUA_SOAK_TAGS = "500" # whatever the dev Galaxy has
$env:OTOPCUA_SOAK_MINUTES = "60" # one hour is enough to surface plumbing bugs
$env:OTOPCUA_SOAK_DROP_PCT = "1.0"
This validates the plumbing (bounded channel, pump invariants, leak guard) but doesn't pin the 50k-tag scaling assertion. Defer the full 50k validation to a customer rig with that scale, or build a synthetic Galaxy with a script that imports 50k attributes onto a generated UDO (~2 hours of one-off work).
Troubleshooting
- MxGatewaySkipReason says "mxaccessgw not reachable": the gw isn't listening, or it's on a different port. Test-NetConnection localhost -Port 5120 is the quick check.
- MxGatewaySkipReason says "mxgateway backend boot failed: RpcException: Unauthenticated": API key mismatch. Verify the OTOPCUA_PARITY_GW_API_KEY env var matches the gw's configured key.
- LegacySkipReason says "Galaxy ZB SQL not reachable on localhost:1433": SQL Server isn't running, or its TCP listener is off. Check services.msc for the SQL Server (default) instance.
- LegacySkipReason says "Galaxy.Host EXE not built": at rig time the parity harness looked under src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Debug/net48/ for the EXE it spawned as a subprocess, separate from the published copy at C:\publish\OtOpcUaGalaxyHost\ used by the Windows service. Both the source project and the published binary were removed in PR 7.2, so this troubleshooting branch no longer applies; the legacy half cannot be brought up at all.
- Both halves resolve but parity scenarios assert deltas: that's the expected outcome the rig exists to surface. Review each delta against docs/v2/Galaxy.ParityMatrix.md's "Accepted deltas" section to decide whether it's a real bug or a pre-accepted divergence.
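The two "not reachable" branches above come down to a TCP reachability check. A rough scripted equivalent of the Test-NetConnection check might look like the sketch below; the endpoint names and the `preflight` helper are hypothetical, while the ports are the ones this runbook uses. It only proves that something accepts connections on the port and says nothing about API-key or SQL authentication.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def preflight(endpoints: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Check each named endpoint and report which ones accept TCP connections."""
    return {name: port_open(host, port) for name, (host, port) in endpoints.items()}


# Ports taken from the runbook: gateway gRPC on 5120, ZB SQL on 1433.
RIG_ENDPOINTS = {
    "mxaccessgw": ("localhost", 5120),
    "galaxy-zb-sql": ("localhost", 1433),
}

if __name__ == "__main__":
    for name, ok in preflight(RIG_ENDPOINTS).items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```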
After the rig is green
When the matrix is fully green or carries documented accepted-deltas, PR 7.2 (legacy project deletion) is unblocked. The only follow-up is to promote any newly-discovered accepted-delta to the matrix doc with the why so the matrix history stays auditable.