Foundational PRs from lmx_mxgw_impl.md, all green. Bodies only; DI/wiring deferred to PR 1+2.W (combined wire-up) and PR 3.W.

PR 1.1: IHistorianDataSource lifted to Core.Abstractions/Historian/
Reuses existing DataValueSnapshot + HistoricalEvent shapes; sidecar (PR 3.4) translates byte-quality → uint StatusCode internally.

PR 1.2: IHistoryRouter + HistoryRouter on the server
Longest-prefix-match resolution, case-insensitive, ObjectDisposed-guarded, swallow-on-shutdown disposal of misbehaving sources.

PR 1.3: DriverNodeManager.HistoryRead* dispatch through IHistoryRouter
Per-tag resolution with LegacyDriverHistoryAdapter wrapping `_driver as IHistoryProvider` so existing tests + drivers keep working until PR 7.2 retires the fallback.

PR 2.1: AlarmConditionInfo extended with five sub-attribute refs
InAlarmRef / PriorityRef / DescAttrNameRef / AckedRef / AckMsgWriteRef. Optional defaulted parameters preserve all existing 3-arg call sites.

PR 2.2: AlarmConditionService state machine in Server/Alarms/
Driver-agnostic port of GalaxyAlarmTracker. Sub-attribute refs come from AlarmConditionInfo, values arrive as DataValueSnapshot, ack writes route through IAlarmAcknowledger. State machine preserves Active/Acknowledged/Inactive transitions, Acked-on-active reset, post-disposal silence.

PR 2.3: DriverNodeManager wires AlarmConditionService
MarkAsAlarmCondition registers each alarm-bearing variable with the service; DriverWritableAcknowledger routes ack-message writes through the driver's IWritable + CapabilityInvoker. Service-raised transitions route via OnAlarmServiceTransition → matching ConditionSink. Legacy IAlarmSource path unchanged for null service.

PR 3.1: Driver.Historian.Wonderware shell project (net48 x86)
Console host shell + smoke test; SDK references + code lift come in PR 3.2.

Tests: 9 (PR 1.1) + 5 (PR 2.1) + 10 (PR 1.2) + 19 (PR 2.2) + 1 (PR 3.1) all pass. Existing AlarmSubscribeIntegrationTests + HistoryReadIntegrationTests unchanged.

Plan + audit docs (lmx_backend.md, lmx_mxgw.md, lmx_mxgw_impl.md) included so parallel subagent worktrees can read them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Galaxy → MxGateway Migration — Detailed Implementation Plan
Companion to lmx_mxgw.md (design plan). This document breaks the plan into
PR-sized tasks with concrete file paths, acceptance checks, test deltas, and
explicit parallel-safety analysis for subagent execution.
Cross-repo scope:
- lmxopcua (this repo): drivers, server, install scripts, e2e, docs.
- mxaccessgw (C:\Users\dohertj2\Desktop\mxaccessgw): gRPC gateway, worker, .NET client.
How to use parallel subagents safely
The plan lists each task with a parallel-key. Two tasks share a key when
they touch the same file(s); tasks with disjoint keys are safe to run in
parallel. Tasks within the same phase that share a key MUST run
sequentially.
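The parallel-key rule can be sketched in a few lines. This is a minimal illustration, not tooling that exists in either repo: `plan_batches` and `parallel_safe` are hypothetical names, and the sketch only looks at keys, so explicit "Depends on" ordering between tasks still has to be applied on top of it.

```python
from collections import defaultdict


def plan_batches(tasks):
    """Group (task_id, parallel_key) pairs into one sequential chain per key.

    Chains for different keys may run in parallel worktrees; tasks inside
    one chain run in the order they appear in the plan.
    """
    chains = defaultdict(list)
    for task_id, key in tasks:
        chains[key].append(task_id)
    return dict(chains)


def parallel_safe(key_a, key_b):
    """Two tasks are parallel-safe only when their parallel-keys differ."""
    return key_a != key_b


# Phase 0 tasks and keys as listed in the Phase 0 table.
PHASE_0 = [
    ("0.1", "gw-proto-galaxy"),
    ("0.2", "gw-proto-mxaccess"),
    ("0.3", "gw-proto-mxaccess"),
    ("0.4", "gw-proto-mxaccess"),
    ("0.5", "gw-docs"),
    ("0.6", "gw-dotnet-client"),
    ("0.7", "gw-auth"),
]

if __name__ == "__main__":
    for key, chain in plan_batches(PHASE_0).items():
        print(key, "->", " → ".join(chain))
```

Each printed line corresponds to one worktree: a single-task chain is a plain parallel PR, a multi-task chain is a sequential run on one agent.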
Subagent execution rules
- One git worktree per parallel subagent. Spawn each parallel agent with Agent({ isolation: "worktree", ... }) so they never collide on the working tree. Merge back to a shared integration branch after each parallel batch completes.
- Interface-defining tasks run first, then their consumers. Anywhere the plan says "PR X.0: define interface", that PR must merge to the integration branch before its consumers fan out in parallel.
- Shared-file edits serialize. Files touched by more than one PR in a batch (ZB.MOM.WW.OtOpcUa.slnx, Install-Services.ps1, appsettings.json, CLAUDE.md, MEMORY.md) get a single dedicated "wire-up" PR at the end of the batch that ingests every parallel branch's needed line. Don't let parallel agents edit them.
- Test fixtures own their fixture file. When two PRs both need a FakeMxGatewayClient, the first PR creates it and exposes the contract; subsequent PRs add cases to the same file or extend it via partial class in their own test files.
- Subagent prompt must include the parallel-key and disallowed paths. Any agent prompt must say "you may NOT edit <sln file>, <wire-up files>, or files outside <your scope>. If you discover a needed change there, surface it as a task for the wire-up PR; do not make it yourself." This prevents merge conflicts at integration time.
- Choose the right subagent type.
  - Explore: read-only research/locate. Cheap. Use before any PR that needs to learn the surrounding code.
  - Plan: produce a step-by-step PR plan from a brief; no code writes. Use when a task description below is too coarse for a fresh agent.
  - general-purpose: code-writing. Use for PRs that create/modify source.
  - code-simplifier: post-PR cleanup pass on the same files.
  - codex:rescue: a stuck PR; use sparingly.
- Foreground vs. background. Run one PR foreground if its result gates the rest of your work this turn. Run the rest in background and read results when they complete.
- Trust but verify. After every subagent claims completion, the parent runs the build (dotnet build ZB.MOM.WW.OtOpcUa.slnx) and the target tests. The agent's report is hearsay until the build is green.
- Worktree cleanup. When isolation: "worktree" returns no path, nothing was changed; if it returns a path, integrate by cherry-picking or fast-forwarding into the integration branch, then prune the worktree.
Locked files (never edit from a parallel batch)
These get a dedicated wire-up PR at the end of each phase's parallel fanout:
| File | Why locked |
|---|---|
| ZB.MOM.WW.OtOpcUa.slnx | New project additions stack and conflict |
| src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json | Config schema additions stack |
| src/ZB.MOM.WW.OtOpcUa.Server/Program.cs (or Startup.cs) | DI registrations stack |
| scripts/install/Install-Services.ps1 | Service registrations stack |
| scripts/e2e/e2e-config.sample.json | E2E config stacks |
| CLAUDE.md, docs/v2/dev-environment.md | Doc edits stack |
| MEMORY.md (auto-memory index) | One line per change; conflicts often |
| mxaccessgw/MxGateway.sln | Same reason as our slnx |
| mxaccessgw/clients/proto/*.proto files | Proto edits stack and reorder field numbers |
Phase 0 — mxaccessgw foundation work
Repo: C:\Users\dohertj2\Desktop\mxaccessgw. Branch off main per task.
| PR | Title | Parallel-key | Files |
|---|---|---|---|
| 0.1 | Galaxy attribute metadata parity | gw-proto-galaxy | clients/proto/galaxy_repository.proto, src/MxGateway.Server/Galaxy/AttributeMapper.cs, src/MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs, gr/-equivalent SQL in src/MxGateway.Server/Galaxy/Sql/, contract tests |
| 0.2 | Bulk subscribe with publishing-interval hint | gw-proto-mxaccess | clients/proto/mxaccess_gateway.proto (extend SubscribeBulkCommand with optional uint32 buffered_update_interval_ms), src/MxGateway.Worker/MxAccess/Commands/SubscribeBulkHandler.cs, src/MxGateway.Server/Sessions/Mappers.cs, worker tests |
| 0.3 | Subscription replay RPC | gw-proto-mxaccess | Same proto file as 0.2 (add ReplaySubscriptionsCommand), src/MxGateway.Worker/MxAccess/Commands/ReplaySubscriptionsHandler.cs, gateway forwarder, tests |
| 0.4 | Session health stream | gw-proto-mxaccess | Same proto (add StreamSessionHealth(SessionId) returns (stream SessionHealth)), src/MxGateway.Server/Sessions/SessionHealthService.cs, dashboard projection, tests |
| 0.5 | Document event-stream resume contract | gw-docs | docs/Sessions.md, docs/gateway-process-design.md: define retention bound, events_lost signal in MxEvent envelope |
| 0.6 | .NET client MxValue adapter + SubscribeWithCallback | gw-dotnet-client | clients/dotnet/MxGateway.Client/MxValueAdapter.cs (new), clients/dotnet/MxGateway.Client/MxGatewaySession.cs (extend with SubscribeWithCallbackAsync), clients/dotnet/MxGateway.Client.Tests/ |
| 0.7 | API key scopes + mxgw-key minting CLI | gw-auth | src/MxGateway.Server/Auth/, src/MxGateway.Cli/, docs/Authentication.md |
Phase 0 parallel batches
- Batch 0a (parallel): 0.1 (gw-proto-galaxy), 0.5 (gw-docs), 0.6 (gw-dotnet-client), 0.7 (gw-auth). Four worktrees, four general-purpose agents.
- Batch 0b (sequential within key, parallel across keys): 0.2 → 0.3 → 0.4 all share gw-proto-mxaccess. Land them in order on the same agent (or three sequential calls). Field number assignment must be coordinated through the wire-up PR.
- Wire-up 0.W: integrate proto-generated descriptors, regenerate clients/proto/descriptors, run cross-language smoke matrix.
Phase 0 exit: mxaccessgw main carries all seven PRs. Tag the gw NuGet
release. Bump MxGateway.Client consumed by lmxopcua.
Phase 1 — Server-level historian extension point (lmxopcua)
Goal: detach IHistorianDataSource from the Galaxy driver. Server's
HistoryRead* operations call into a registered data source by namespace,
not into IHistoryProvider on the driver.
Tasks
PR 1.1 — Lift IHistorianDataSource to Core.Abstractions
Parallel-key: core-abs-historian (locks files in
Core.Abstractions/Historian/).
Files
- Create:
  - src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/IHistorianDataSource.cs
  - src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/HistorianSample.cs
  - src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/HistorianAggregateSample.cs
  - src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/HistorianEvent.cs
  - src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/HistorianHealthSnapshot.cs
- Move-from (Galaxy.Host originals stay until phase 7; new copies live in Core.Abstractions and are pure POCO):
  - source bodies in src/.../Driver.Galaxy.Host/Backend/Historian/
- Modify:
  - src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj (no change if files auto-included)
- Tests:
  - tests/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/Historian/IHistorianDataSourceContractTests.cs: contract documentation tests (null arg behavior, time-range ordering).
Acceptance
- dotnet build clean.
- New tests run and pass.
- Galaxy.Host still compiles (it keeps its own copies until phase 7).
Subagent prompt boilerplate (template — re-use this shape for each PR):
You are working in worktree <path>. Create the files listed in PR 1.1 of lmx_mxgw_impl.md. Do NOT edit any file under Driver.Galaxy.Host/, appsettings.json, the .slnx, or Program.cs. The DTOs are pure value records; do not import OPC UA types or COM types. Run dotnet build src/ZB.MOM.WW.OtOpcUa.Core.Abstractions before reporting.
PR 1.2 — IHistoryService plugin host on the server
Parallel-key: server-history.
Files
- Create:
  - src/ZB.MOM.WW.OtOpcUa.Server/History/IHistoryRouter.cs: namespace → IHistorianDataSource.
  - src/ZB.MOM.WW.OtOpcUa.Server/History/HistoryRouter.cs: registry impl.
  - src/ZB.MOM.WW.OtOpcUa.Server/History/HistoryServiceAdapter.cs: bridges OPC UA HistoryRead / HistoryReadProcessed / HistoryReadAtTime / HistoryReadEvents to the router.
- Modify:
  - src/ZB.MOM.WW.OtOpcUa.Server/OpcUaServerService.cs: register HistoryServiceAdapter. Locked file; defer to wire-up PR 1.W.
- Tests:
  - tests/ZB.MOM.WW.OtOpcUa.Server.Tests/History/HistoryRouterTests.cs.
Acceptance
- Router resolves data source by namespace prefix.
- Unknown namespace returns BadHistoryOperationUnsupported (or the current status used for that case; verify against existing server behavior in OpcUaServerService.cs before coding).
Depends on: 1.1 merged.
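The router's resolution rule (namespace prefix, longest match wins, case-insensitive per the PR 1.2 summary) can be sketched as follows. This is an illustrative Python model, not the C# HistoryRouter: the method names `register` and `resolve` are assumptions, and returning `None` stands in for whatever status code the server uses for an unknown namespace.

```python
class HistoryRouter:
    """Maps namespace prefixes to historian data sources (illustrative only)."""

    def __init__(self):
        # Keys are stored lower-cased so lookups are case-insensitive.
        self._sources = {}

    def register(self, prefix, source):
        self._sources[prefix.lower()] = source

    def resolve(self, node_id):
        """Return the source registered under the longest matching prefix."""
        key = node_id.lower()
        best = None
        for prefix in self._sources:
            if key.startswith(prefix) and (best is None or len(prefix) > len(best)):
                best = prefix
        # None models the "unknown namespace" case; the caller would translate
        # it into the server's chosen Bad status.
        return self._sources[best] if best is not None else None
```

With `Galaxy` and `Galaxy/Area1` both registered, a tag under `galaxy/area1/...` resolves to the narrower registration, while any other `Galaxy/...` tag falls back to the wider one.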
PR 1.3 — Driver capability shrink: drop IHistoryProvider requirement
Parallel-key: server-history.
Files
- Modify:
  - src/ZB.MOM.WW.OtOpcUa.Server/DriverNodeManager.cs (or wherever IHistoryProvider is consumed; locate via Grep "IHistoryProvider"). Replace direct calls with IHistoryRouter.Resolve(...).
- Tests:
  - Update any test that exercised IHistoryProvider paths to register a fake data source via the router.
Depends on: 1.2 merged.
PR 1.W — Phase 1 wire-up
Parallel-key: locked-files.
Files
- src/ZB.MOM.WW.OtOpcUa.Server/OpcUaServerService.cs: DI registration of HistoryRouter + the legacy Galaxy.Host historian adapter.
- ZB.MOM.WW.OtOpcUa.slnx: no change unless a new project was added; if PR 1.1 went into the existing Core.Abstractions project, no slnx edit.
Phase 1 parallel batches
- Batch 1a (sequential): 1.1 → 1.2 → 1.3 → 1.W. Each blocks the next.
- Total: one foreground sequence; no parallelism in Phase 1. Use one general-purpose agent across all four PRs, or one PR per agent in order.
Phase 2 — Server-level alarm condition subsystem (lmxopcua)
Goal: drop GalaxyAlarmTracker from the driver's responsibilities; the
server runs the AlarmCondition state machine driven by IsAlarm=true
attribute metadata.
Tasks
PR 2.1 — Address-space builder alarm-declaration API
Parallel-key: core-abs-alarms.
Files
- Modify:
  - src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAddressSpaceBuilder.cs: add IAlarmConditionDeclaration MarkAsAlarmCondition(...) (the method already exists per GalaxyProxyDriver.cs:146; verify shape and extend with the five sub-attribute references).
  - src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Alarms/AlarmConditionInfo.cs: add InAlarmRef, PriorityRef, DescAttrNameRef, AckedRef, AckMsgWriteRef fields.
- Tests:
  - tests/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/Alarms/AlarmConditionInfoTests.cs.
Acceptance
- Existing call sites (GalaxyProxyDriver.DiscoverAsync) still compile: add the new fields with safe defaults.
PR 2.2 — AlarmConditionService (state machine)
Parallel-key: server-alarms.
Files
- Create:
  - src/ZB.MOM.WW.OtOpcUa.Server/Alarms/AlarmConditionService.cs
  - src/ZB.MOM.WW.OtOpcUa.Server/Alarms/AlarmConditionState.cs
  - src/ZB.MOM.WW.OtOpcUa.Server/Alarms/IAlarmAcknowledger.cs
- Reference impl to port (do not duplicate; read it for invariants):
  - src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Backend/Alarms/GalaxyAlarmTracker.cs
- Tests:
  - tests/ZB.MOM.WW.OtOpcUa.Server.Tests/Alarms/AlarmConditionServiceTests.cs: port the existing tracker tests (tests/.../Galaxy.Host.Tests/).
Subagent guidance
- Two-step. First a Plan agent: read GalaxyAlarmTracker.cs and produce a state-transition table + a list of tests to port. Then a general-purpose agent: implement AlarmConditionService against that table.
Depends on: 2.1 merged.
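As a starting point for the state-transition table, here is a rough Python model of the invariants named in the PR 2.2 summary (Active/Acknowledged/Inactive transitions, Acked-on-active reset, post-disposal silence). It is an assumption-laden sketch: the authoritative rules live in GalaxyAlarmTracker.cs, and details such as "acks only count while the alarm is active" are guesses to be checked against that file.

```python
class AlarmCondition:
    """Toy model of one alarm condition; not the real AlarmConditionService."""

    def __init__(self):
        self.in_alarm = False
        self.acked = True
        self.disposed = False
        self.events = []  # transitions raised so far, in order

    def on_in_alarm(self, value):
        """Feed a new value of the InAlarm sub-attribute."""
        if self.disposed or value == self.in_alarm:
            return
        self.in_alarm = value
        if value:
            # Acked-on-active reset: a fresh activation needs a fresh ack.
            self.acked = False
            self.events.append("Active")
        else:
            self.events.append("Inactive")

    def on_acked(self, value):
        """Feed a new value of the Acked sub-attribute."""
        if self.disposed:
            return
        # Assumption: an ack is only meaningful for an active, unacked alarm.
        if value and self.in_alarm and not self.acked:
            self.acked = True
            self.events.append("Acknowledged")

    def dispose(self):
        """After disposal, incoming values raise no further transitions."""
        self.disposed = True
```

A Plan agent's table would replace the branches above with the tracker's real rules; the shape (value in, at most one transition out, silence after disposal) is the part intended to carry over.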
PR 2.3 — Wire alarm service into DriverNodeManager
Parallel-key: server-alarms.
Files
- Modify:
  - src/ZB.MOM.WW.OtOpcUa.Server/DriverNodeManager.cs: on each driver's discovery, collect alarm declarations and hand to AlarmConditionService along with the driver's ISubscribable and IWritable for sub-attribute advise + ack writes.
- Tests:
  - extend DriverNodeManagerTests with a fake driver that declares one alarm-bearing node.
Depends on: 2.2 merged.
PR 2.W — Phase 2 wire-up
DI registration of AlarmConditionService in OpcUaServerService.cs.
Phase 2 parallel batches
- Batch 2a (sequential): 2.1 → 2.2 → 2.3 → 2.W.
Phases 1 + 2 cross-batch parallelism
PR 1.1 and PR 2.1 touch different files in Core.Abstractions/ (one
under Historian/, one in IAddressSpaceBuilder.cs + Alarms/). They are
parallel-safe.
PR 1.2/1.3 and PR 2.2/2.3 both modify OpcUaServerService.cs and
DriverNodeManager.cs. They share two locked files — but only at the
DI-registration level. If we split the OpcUaServerService.cs edits into a
single combined wire-up PR (1+2.W), the body PRs 1.2/1.3 and 2.2/2.3 don't
touch them. Then the body PRs can run in parallel batches across
phase 1 and phase 2.
Recommended Phase 1+2 plan (parallel):
- Run PR 1.1 and PR 2.1 in parallel (two worktrees, two general-purpose agents). Both target Core.Abstractions only.
- Merge both to integration branch.
- Run PR 1.2/1.3 and PR 2.2/2.3 in parallel, each as a sequential 2-PR chain on its own worktree. Constraint: neither chain edits OpcUaServerService.cs or DriverNodeManager.cs; defer all DI/wiring to the combined wire-up.
- Merge both chains.
- Combined wire-up PR 1+2.W edits OpcUaServerService.cs and DriverNodeManager.cs once.
Phase 3 — Driver.Historian.Wonderware sidecar
Goal: house the existing HistorianDataSource code in its own .NET 4.8 x86
service, exposed over named pipe; ship a .NET 10 client implementing
IHistorianDataSource.
Tasks
PR 3.1 — Create the sidecar shell project
Parallel-key: historian-sidecar-host.
Files
- Create project:
  - src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/Driver.Historian.Wonderware.csproj (<TargetFramework>net48</TargetFramework>, <PlatformTarget>x86</PlatformTarget>).
  - Program.cs: Serilog + console host + named pipe server (mirror Driver.Galaxy.Host/Program.cs shape: env-driven pipe name, allowed SID, shared secret).
- Create test project:
  - tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/
- Locked: .slnx, Install-Services.ps1 (wire-up).
PR 3.2 — Lift HistorianDataSource & friends
Parallel-key: historian-sidecar-host.
Files
- Move (preserve git history with git mv):
  - src/.../Driver.Galaxy.Host/Backend/Historian/HistorianDataSource.cs → src/.../Driver.Historian.Wonderware/Backend/HistorianDataSource.cs
  - HistorianClusterEndpointPicker.cs
  - HistorianClusterNodeState.cs
  - HistorianConfiguration.cs
  - HistorianEventDto.cs
  - HistorianHealthSnapshot.cs
  - HistorianQualityMapper.cs
  - HistorianSample.cs
  - IHistorianConnectionFactory.cs
- Add a thin IHistorianDataSource shim in the sidecar that re-implements the interface from Core.Abstractions/Historian/ (after PR 1.1).
- Galaxy.Host needs to keep building until phase 7. Either:
  - Add a Driver.Historian.Wonderware ProjectReference from Driver.Galaxy.Host and re-use the moved code, OR
  - Leave a stub copy in Galaxy.Host that delegates to the sidecar via the new client.

  Pick option 1 (cleaner).
- Tests:
  - git mv matching test files from tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/Backend/Historian/ to tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/.
Depends on: PR 1.1 merged (interface lives in Core.Abstractions).
PR 3.3 — Pipe contract + handler
Parallel-key: historian-sidecar-pipe.
Files
- Create:
  - src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/Ipc/Contracts.cs (MessagePack DTOs: ReadRawRequest/Reply, ReadProcessedRequest/Reply, ReadAtTimeRequest/Reply, ReadEventsRequest/Reply, WriteAlarmEventsRequest/Reply). WriteAlarmEvents is the alarm-event persistence write path; mirror today's GalaxyHistorianWriter.WriteBatchAsync payload so the SQLite store-and-forward sink in Core.AlarmHistorian can drain into the Wonderware historian event store after Galaxy.Proxy is deleted.
  - Ipc/PipeServer.cs: copy + adapt Driver.Galaxy.Host/Ipc/PipeServer.cs (same ACL/secret model).
  - Ipc/HistorianFrameHandler.cs: handles all five contract pairs above.
- Tests:
  - tests/.../Driver.Historian.Wonderware.Tests/Ipc/PipeRoundTripTests.cs: round-trip every contract pair including WriteAlarmEvents.
PR 3.4 — .NET 10 client
Parallel-key: historian-sidecar-client.
Files
- Create project:
  - src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/ (.NET 10 x64). Implements:
    - IHistorianDataSource (read path: raw / processed / at-time / events) against the sidecar pipe.
    - IAlarmHistorianWriter (write path: alarm-event persistence) against the sidecar pipe WriteAlarmEvents contract from PR 3.3.
- Tests:
  - tests/.../Driver.Historian.Wonderware.Client.Tests/ against an in-proc fake pipe server. Cover both the read interface and the alarm-event write interface; verify the SQLite store-and-forward sink (Core.AlarmHistorian.SqliteStoreAndForwardSink) drains successfully when the client is plugged in as its target.
Depends on: PR 3.3 merged (contracts published).
PR 3.W — Phase 3 wire-up
Files
- ZB.MOM.WW.OtOpcUa.slnx: register three new projects + two new test projects.
- scripts/install/Install-Services.ps1: register OtOpcUaWonderwareHistorian NSSM service.
- src/ZB.MOM.WW.OtOpcUa.Server/OpcUaServerService.cs: register the client as both an IHistorianDataSource for the Galaxy namespace and the IAlarmHistorianWriter target for the SQLite store-and-forward sink, replacing today's GalaxyProxyDriver.WriteBatchAsync route.
- src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json: Historian:Wonderware block.
Phase 3 parallel batches
- Batch 3a (sequential): 3.1 (shell) → 3.2 (lift code).
- Batch 3b (after 3.2): 3.3 (pipe) → 3.4 (client). The two look parallel, but 3.4 depends on 3.3's contracts, so Phase 3 is sequential internally.
- Batch 3c: 3.W.
Phase 3 is, however, fully independent of PR 1.1's downstream work once 1.1 has merged. Phase 3 can run in parallel with PR 1.2/1.3 and all of Phase 2.
Recommended phasing: kick off Phase 3 in parallel with Phase 2, both gated only on Phase 1.1's merge.
Phase 4 — New Driver.Galaxy (Tier-A, .NET 10) against gw
This is the bulk of the work. Each PR adds one capability to the new driver. The driver builds and links from PR 4.0 onward; capabilities arrive as incremental green bars.
The driver lives at src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/ (note: same
short name as the old .Proxy, but new project. The .Host, .Proxy,
.Shared projects continue to coexist until phase 7).
Tasks
PR 4.0 — Project skeleton, options, factory
Parallel-key: galaxy-shell.
Files
- Create project:
  - src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Driver.Galaxy.csproj (.NET 10 x64), references Core.Abstractions, Core, MxGateway.Client (NuGet from gw repo).
  - GalaxyDriver.cs: IDriver + IDisposable skeleton; Initialize creates MxGatewayClient and opens a session; Shutdown disposes.
  - Config/GalaxyDriverOptions.cs: POCO matching the JSON shape in lmx_mxgw.md.
  - GalaxyDriverFactoryExtensions.cs: AddGalaxyDriver(IServiceCollection).
- Tests:
  - tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/ (new project)
  - Tests/GalaxyDriverInitializationTests.cs: uses a fake IMxGatewayClientTransport to verify open-session behavior.
- Locked: .slnx (wire-up PR 4.W).
Acceptance
- Driver builds, Initialize opens a session against a fake transport, Shutdown closes it.
- IDriver.RecycleAsync (if present in the interface today) returns the same stub shape as the legacy backend, {Accepted = true, GraceSeconds = 15}, and is documented in the file as intentionally a no-op until a future PR wires it through gw. Today's MxAccessGalaxyBackend.RecycleAsync is itself a stub, so this preserves behavior exactly.
PR 4.1 — ITagDiscovery via GalaxyRepositoryClient
Parallel-key: galaxy-discover.
Files
- Create:
  - src/.../Driver.Galaxy/Browse/GalaxyDiscoverer.cs
  - src/.../Driver.Galaxy/Browse/DataTypeMap.cs: mx_data_type → DriverDataType. Port table from GalaxyProxyDriver.MapDataType (lines 523–532) and verify against gr/data_type_mapping.md.
  - src/.../Driver.Galaxy/Browse/SecurityMap.cs: port GalaxyProxyDriver.MapSecurity (lines 534–544).
  - src/.../Driver.Galaxy/Browse/AlarmRefBuilder.cs: for any attribute where IsAlarm=true, compute the five sub-attribute references by Galaxy naming convention (<tag>.<attr>.InAlarm, <tag>.<attr>.Priority, <tag>.<attr>.DescAttrName, <tag>.<attr>.Acked, <tag>.<attr>.AckMsg) and populate AlarmConditionInfo.{InAlarmRef, PriorityRef, DescAttrNameRef, AckedRef, AckMsgWriteRef} before passing to MarkAsAlarmCondition. Mirrors today's behavior in MxAccessGalaxyBackend.SubscribeAlarmsAsync so the server-level AlarmConditionService (Phase 2) has every ref it needs.
- Modify:
  - GalaxyDriver.cs: implement ITagDiscovery.DiscoverAsync calling discoverer.
- Tests:
  - Tests/Browse/GalaxyDiscovererTests.cs: fake IGalaxyRepositoryClientTransport with canned GalaxyObject list.
  - Tests/Browse/AlarmRefBuilderTests.cs: for an alarm-bearing attribute, verify all five refs match the <tag>.<attr>.{...} shape and round-trip cleanly through MarkAsAlarmCondition.
Acceptance
- Discovered nodes carry mx_data_type, IsArray, ArrayDim, SecurityClassification, IsHistorized, IsAlarm matching what the legacy backend produces (snapshot-compared in Phase 5).
- Every IsAlarm=true attribute calls MarkAsAlarmCondition with all five sub-attribute refs populated. The AlarmConditionService from Phase 2 must be able to subscribe and ack without further help from the driver.
Subagent guidance
- Use an Explore agent first: "find every place in Driver.Galaxy.Proxy/GalaxyProxyDriver.cs that consumes DiscoverHierarchyResponse and list every wire field it reads, so we know what gw's proto must surface."
Depends on: PR 4.0 merged + PR 0.1 (gw attribute parity) NuGet bumped.
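The naming convention that AlarmRefBuilder applies in PR 4.1 is mechanical enough to show directly. The sketch below is Python rather than the driver's C#, `build_alarm_refs` is a hypothetical helper name, and the tag/attribute values in the usage note are made up; only the five suffixes and the AlarmConditionInfo field names come from the plan.

```python
def build_alarm_refs(tag, attr):
    """Derive the five sub-attribute references for one alarm-bearing attribute.

    Follows the <tag>.<attr>.<suffix> convention described for AlarmRefBuilder;
    the dict keys mirror the AlarmConditionInfo field names.
    """
    base = f"{tag}.{attr}"
    return {
        "InAlarmRef": f"{base}.InAlarm",
        "PriorityRef": f"{base}.Priority",
        "DescAttrNameRef": f"{base}.DescAttrName",
        "AckedRef": f"{base}.Acked",
        "AckMsgWriteRef": f"{base}.AckMsg",
    }
```

For an illustrative attribute `HiAlarm` on tag `Tank01`, the ack-message write target would come out as `Tank01.HiAlarm.AckMsg`, which is the reference the server-side acknowledger writes to.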
PR 4.2 — IReadable (one-shot read path)
Parallel-key: galaxy-read.
Files
- Create:
  - src/.../Driver.Galaxy/Runtime/GalaxyMxSession.cs: owns MxGatewaySession, Register server handle, in-memory tag → itemHandle registry.
  - src/.../Driver.Galaxy/Runtime/MxValueDecoder.cs: MxValue → object (boolean/int32/float/double/string/datetime, plus array variants).
  - src/.../Driver.Galaxy/Runtime/StatusCodeMap.cs: explicit MxStatusProxy → uint OPC UA StatusCode mapping table. Today's coarse vtq.Quality >= 192 ? Good : Uncertain_Placeholder becomes a full mapping covering at minimum: Good (0x0), Uncertain (0x40000000), Uncertain_LastUsableValue (0x40900000), Bad (0x80000000), Bad_NotConnected (0x808A0000), Bad_NoCommunication (0x80310000), Bad_OutOfService (0x808D0000). Map any unmapped category to Bad_InternalError (documented as the fallback) and log once with the raw MxStatusProxy so the matrix can be extended from field data.
- Modify:
  - GalaxyDriver.cs: implement IReadable.ReadAsync: per tag, AddItem → short-lived Advise → first OnDataChange. (If Phase 0 added a synchronous ReadAsync RPC, use that; flag a follow-up if missing.)
- Tests:
  - Tests/Runtime/GalaxyReadTests.cs: fake transport with scripted OnDataChange responses.
  - Tests/Runtime/StatusCodeMapTests.cs: exhaustive mapping cases plus "unknown category falls back to Bad_InternalError and emits a single diagnostic log" assertion.
Depends on: PR 4.0.
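The "explicit table plus logged fallback" shape of StatusCodeMap can be sketched as below. Several things here are assumptions: the category strings are invented placeholders (the real MxStatusProxy categories are not listed in this plan), "log once" is interpreted as once per distinct unmapped category, and the `warn` callable stands in for the driver's logger. The numeric values are standard OPC UA status codes.

```python
GOOD = 0x00000000
UNCERTAIN = 0x40000000
BAD = 0x80000000
BAD_INTERNAL_ERROR = 0x80020000
BAD_NO_COMMUNICATION = 0x80310000
BAD_NOT_CONNECTED = 0x808A0000
BAD_OUT_OF_SERVICE = 0x808D0000

# Hypothetical category names; the real keys come from MxStatusProxy.
_CATEGORY_TO_STATUS = {
    "ok": GOOD,
    "uncertain": UNCERTAIN,
    "bad": BAD,
    "not_connected": BAD_NOT_CONNECTED,
    "comms_lost": BAD_NO_COMMUNICATION,
    "out_of_service": BAD_OUT_OF_SERVICE,
}


class StatusCodeMap:
    def __init__(self, warn):
        self._warn = warn              # e.g. logger.warning
        self._seen_unmapped = set()    # categories already reported

    def to_status_code(self, category, raw=None):
        code = _CATEGORY_TO_STATUS.get(category)
        if code is not None:           # note: GOOD is 0, so test against None
            return code
        if category not in self._seen_unmapped:
            self._seen_unmapped.add(category)
            self._warn(f"unmapped MxStatusProxy category {category!r} (raw={raw!r})")
        return BAD_INTERNAL_ERROR
```

Keeping the fallback in one place makes the "single diagnostic log" assertion in StatusCodeMapTests easy to express: call the map twice with the same unknown category and count the warnings.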
PR 4.3 — IWritable + secured-write routing
Parallel-key: galaxy-write.
Files
- Create:
  - src/.../Driver.Galaxy/Runtime/MxValueEncoder.cs: object → MxValue (the inverse of 4.2's decoder; unify into one type if simpler).
- Modify:
  - GalaxyDriver.cs: implement IWritable.WriteAsync. Route writes whose attribute carries SecurityClassification.SecuredWrite / VerifiedWrite through WriteSecuredAsync (mxaccessgw exposes this in MxGatewaySession).
- Tests:
  - Tests/Runtime/GalaxyWriteTests.cs: verify the routing decision given each SecurityClassification value.
Depends on: PR 4.2 merged (shares GalaxyMxSession + value type code).
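The routing decision in PR 4.3 reduces to a small predicate, sketched here in Python. Only SecuredWrite and VerifiedWrite are named in the plan; the `OPERATE` member and the `"plain"`/`"secured"` labels are illustrative assumptions, and the real enum in the codebase may carry more values.

```python
from enum import Enum


class SecurityClassification(Enum):
    OPERATE = "Operate"              # assumed example of a non-secured class
    SECURED_WRITE = "SecuredWrite"
    VERIFIED_WRITE = "VerifiedWrite"


# Classifications that must go through the gateway's secured-write call.
_SECURED = {SecurityClassification.SECURED_WRITE, SecurityClassification.VERIFIED_WRITE}


def choose_write_path(classification):
    """Return "secured" for WriteSecuredAsync routing, otherwise "plain"."""
    return "secured" if classification in _SECURED else "plain"
```

A table-driven test over every enum member, as GalaxyWriteTests is meant to do, guards against a new classification silently defaulting to the plain path.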
PR 4.4 — ISubscribable + EventPump
Parallel-key: galaxy-subscribe.
Files
- Create:
  - src/.../Driver.Galaxy/Runtime/SubscriptionRegistry.cs: (driverSubId → list<itemHandle>) and reverse map.
  - src/.../Driver.Galaxy/Runtime/EventPump.cs: single consumer of MxGatewaySession.StreamEventsAsync. Maps each OnDataChange to a DataChangeEventArgs per registered driver subscription.
  - src/.../Driver.Galaxy/Runtime/GalaxySubscriptionHandle.cs (port from Proxy).
- Modify:
  - GalaxyDriver.cs: implement ISubscribable.SubscribeAsync using SubscribeBulkAsync with the buffered_update_interval_ms hint from PR 0.2.
- Tests:
  - Tests/Runtime/EventPumpFanoutTests.cs: one item → multiple driver subscriptions → one event per driver subscription.
  - Tests/Runtime/SubscribeBulkTests.cs: partial failures.
Depends on: PR 4.3.
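The forward and reverse maps in SubscriptionRegistry, and the fan-out the EventPump performs with them, can be modelled as follows. This is a simplified Python sketch under stated assumptions: method names are invented, events are plain tuples rather than DataChangeEventArgs, and threading concerns around the single stream consumer are ignored.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """driverSubId -> item handles, plus the reverse map used for fan-out."""

    def __init__(self):
        self._handles_by_sub = {}
        self._subs_by_handle = defaultdict(set)

    def add(self, sub_id, item_handles):
        self._handles_by_sub[sub_id] = list(item_handles)
        for handle in item_handles:
            self._subs_by_handle[handle].add(sub_id)

    def remove(self, sub_id):
        for handle in self._handles_by_sub.pop(sub_id, []):
            self._subs_by_handle[handle].discard(sub_id)
            if not self._subs_by_handle[handle]:
                del self._subs_by_handle[handle]

    def subscribers(self, item_handle):
        return sorted(self._subs_by_handle.get(item_handle, ()))


def fan_out(registry, item_handle, value):
    """Turn one gateway data change into one event per driver subscription."""
    return [(sub_id, item_handle, value) for sub_id in registry.subscribers(item_handle)]
```

This is the behavior EventPumpFanoutTests describes: one item shared by several driver subscriptions yields exactly one event per subscription, and removing a subscription stops its events without affecting the others.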
PR 4.5 — ReconnectSupervisor
Parallel-key: galaxy-reconnect.
Files
- Create:
  - src/.../Driver.Galaxy/Runtime/ReconnectSupervisor.cs: state machine (Healthy → TransportLost → ReopeningSession → ReplayingSubscriptions → Healthy). Surfaces DriverState.Degraded while not Healthy.
- Modify:
  - GalaxyDriver.cs + GalaxyMxSession.cs: wire transport-error callbacks into the supervisor; replay subscriptions via ReplaySubscriptionsCommand (PR 0.3).
- Tests:
  - Tests/Runtime/ReconnectSupervisorTests.cs with simulated drops.
Depends on: PR 4.4. Strongly recommended: PR 0.3 (replay RPC) merged first.
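The supervisor's state cycle can be written as a transition table. In this Python sketch the four states come from the plan; the event names and the "any transport error during recovery drops back to TransportLost" edges are assumptions that ReconnectSupervisorTests would need to confirm or replace.

```python
from enum import Enum


class State(Enum):
    HEALTHY = "Healthy"
    TRANSPORT_LOST = "TransportLost"
    REOPENING_SESSION = "ReopeningSession"
    REPLAYING_SUBSCRIPTIONS = "ReplayingSubscriptions"


# (current state, event) -> next state. Event names are hypothetical.
_NEXT = {
    (State.HEALTHY, "transport_error"): State.TRANSPORT_LOST,
    (State.TRANSPORT_LOST, "retry"): State.REOPENING_SESSION,
    (State.REOPENING_SESSION, "session_opened"): State.REPLAYING_SUBSCRIPTIONS,
    (State.REOPENING_SESSION, "transport_error"): State.TRANSPORT_LOST,
    (State.REPLAYING_SUBSCRIPTIONS, "replay_done"): State.HEALTHY,
    (State.REPLAYING_SUBSCRIPTIONS, "transport_error"): State.TRANSPORT_LOST,
}


class ReconnectSupervisor:
    def __init__(self):
        self.state = State.HEALTHY

    def handle(self, event):
        # Events with no entry for the current state are ignored.
        self.state = _NEXT.get((self.state, event), self.state)
        return self.state

    @property
    def driver_state(self):
        # The driver reports Degraded for every state except Healthy.
        return "Healthy" if self.state is State.HEALTHY else "Degraded"
```

Driving the table with a scripted event list is one way to express the "simulated drops" tests without a real transport.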
PR 4.6 — IRediscoverable via WatchDeployEvents
Parallel-key: galaxy-deploy.
Files
- Create:
  - src/.../Driver.Galaxy/Browse/DeployWatcher.cs: long-lived consumer of GalaxyRepositoryClient.WatchDeployEventsAsync.
- Modify:
  - GalaxyDriver.cs: start watcher on Initialize; raise OnRediscoveryNeeded per event.
- Tests:
  - Tests/Browse/DeployWatcherTests.cs.
Depends on: PR 4.0. Independent of PR 4.2–4.5 — can run in parallel with all of them.
PR 4.7 — IHostConnectivityProbe (transport health + per-platform probes)
Parallel-key: galaxy-health.
The current driver reports two flavors of host connectivity:
- Top-level transport health: flips Running/Stopped on the synthetic host named after OTOPCUA_GALAXY_CLIENT_NAME whenever the MXAccess COM proxy connects/disconnects.
- Per-platform ScanState probes: for each discovered $WinPlatform and $AppEngine gobject, advise its ScanState attribute and translate value transitions into per-host Running/Stopped/Unknown. Lives in Driver.Galaxy.Host/Backend/Stability/GalaxyRuntimeProbeManager.cs.
This PR ports both.
Files
- Create:
  - src/.../Driver.Galaxy/Health/HostConnectivityForwarder.cs: consumes PR 0.4 StreamSessionHealth and surfaces the synthetic top-level host entry (named after the configured MXAccess ClientName).
  - src/.../Driver.Galaxy/Health/PerPlatformProbeWatcher.cs: port of GalaxyRuntimeProbeManager. On Discover, takes the list of discovered $WinPlatform / $AppEngine tag names, subscribes their ScanState via the driver's own GalaxyMxSession.SubscribeBulkAsync (or directly through the gw session), runs the same state machine (OnProbeCallback interpretation logic; port verbatim with tests), and raises per-host HostStatusChangedEventArgs through the aggregator below.
  - src/.../Driver.Galaxy/Health/HostStatusAggregator.cs: single sink that merges the forwarder's transport entry with the watcher's per-platform entries into the IReadOnlyList<HostConnectivityStatus> surfaced by IHostConnectivityProbe.GetHostStatuses(). Owns the de-dup + diff logic that today lives in GalaxyProxyDriver.OnHostConnectivityUpdate.
- Modify:
  - GalaxyDriver.cs: wire forwarder + watcher + aggregator into Initialize. On every ITagDiscovery.DiscoverAsync completion (incl. re-discovery from PR 4.6), feed the watcher the fresh platform list so probe subscriptions follow Galaxy redeploys.
- Tests:
  - Tests/Health/HostConnectivityForwarderTests.cs.
  - Tests/Health/PerPlatformProbeWatcherTests.cs: port the existing GalaxyRuntimeProbeManagerTests (or whatever covers OnProbeCallback) verbatim. Cover: initial subscribe on Discover, re-subscribe after Rediscover, value-transition state machine, cleanup on Shutdown.
  - Tests/Health/HostStatusAggregatorTests.cs: transport entry plus multiple per-platform entries, transitions, aggregator emits OnHostStatusChanged only on actual state change.
Acceptance
- Top-level transport up/down reflected within 1s of gw SessionHealth flip.
- Each $WinPlatform / $AppEngine gobject in the discovered hierarchy produces exactly one entry in GetHostStatuses(), transitioning on ScanState changes.
- After a redeploy that adds a new platform, the watcher subscribes its ScanState without restarting the driver.
Depends on: PR 4.0 + PR 4.1 (needs the discoverer's platform list). Independent of PR 4.2–4.6 — parallel-safe with the runtime track.
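The aggregator's de-dup + diff rule ("emit only on actual state change") is sketched below in Python. It is a simplified model: host names and states are plain strings, `update` and `get_host_statuses` are assumed method names, and the callback stands in for raising HostStatusChangedEventArgs.

```python
class HostStatusAggregator:
    """Merges transport and per-platform host states; emits only real changes."""

    def __init__(self, on_changed):
        self._states = {}              # host name -> last known state
        self._on_changed = on_changed  # called as on_changed(host, old, new)

    def update(self, host, state):
        """Record a reported state; return True if it was an actual change."""
        old = self._states.get(host)   # None for a host seen for the first time
        if old == state:
            return False               # duplicate report, swallow it
        self._states[host] = state
        self._on_changed(host, old, state)
        return True

    def get_host_statuses(self):
        """One entry per host, sorted by name for stable output."""
        return sorted(self._states.items())
```

Both the forwarder (transport entry) and the per-platform watcher would call `update`, so repeated ScanState callbacks carrying an unchanged value never reach subscribers.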
PR 4.W — Backend-flag wiring
Parallel-key: locked-files.
Files
- src/.../Server/Configuration/DriverFactoryRegistry.cs (or wherever drivers are wired): add a Galaxy:Backend switch:
  - legacy-host → existing GalaxyProxyDriver registration (untouched).
  - mxgateway → new GalaxyDriver registration via PR 4.0's extension.
- src/.../Server/appsettings.json: sample new config block.
- ZB.MOM.WW.OtOpcUa.slnx: register Driver.Galaxy and its tests.
- CLAUDE.md: note new driver, retain old driver pointers.
Acceptance
- With Galaxy:Backend=legacy-host (default), unchanged behavior.
- With Galaxy:Backend=mxgateway, server boots against the new driver and passes a smoke test against the dev gw.
Phase 4 parallel batches
Dependency graph:
4.0 (shell) ──┬── 4.1 (discover) ──┬── 4.6 (deploy)
│ └── 4.7 (health: needs platform list)
├── 4.2 (read) ── 4.3 (write) ── 4.4 (subscribe) ── 4.5 (reconnect)
│ \
│ → 4.W (wire-up)
└── (no longer parallel-with-4.1: 4.7 moved under 4.1)
- After 4.0 merges, 4.1 and the 4.2-chain head can run in two parallel worktrees.
- After 4.1 merges, 4.6 and 4.7 can run in two parallel worktrees.
- 4.2 → 4.3 → 4.4 → 4.5 is one sequential chain on its own worktree (they all touch GalaxyDriver.cs and GalaxyMxSession.cs) and runs alongside the discover/deploy/health track.
- 4.W gathers everything.
Recommended Phase 4 plan:
- Stage 1 (after 4.0): two worktrees — W1: 4.1; W2: 4.2 → 4.3 → 4.4 → 4.5.
- Stage 2 (after 4.1 merges, W2 still running): three worktrees — W1: 4.6; W3: 4.7; W2: continues runtime chain.
- Stage 3: 4.W wire-up.
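The staging above can be checked mechanically by layering the Phase 4 dependency graph. The Python sketch below encodes the edges as drawn in the graph (4.6 and 4.7 under 4.1, the runtime chain under 4.0, 4.W last); `layers` is a hypothetical helper, and it only shows the earliest point each PR could start, not worktree assignment.

```python
PHASE_4_DEPS = {
    "4.0": [],
    "4.1": ["4.0"],
    "4.2": ["4.0"],
    "4.3": ["4.2"],
    "4.4": ["4.3"],
    "4.5": ["4.4"],
    "4.6": ["4.1"],
    "4.7": ["4.1"],
    "4.W": ["4.5", "4.6", "4.7"],
}


def layers(deps):
    """Group tasks into waves; each wave only needs tasks from earlier waves."""
    done, result = set(), []
    while len(done) < len(deps):
        ready = sorted(
            task
            for task, needs in deps.items()
            if task not in done and all(n in done for n in needs)
        )
        if not ready:
            raise ValueError("dependency cycle")
        result.append(ready)
        done.update(ready)
    return result


if __name__ == "__main__":
    for i, wave in enumerate(layers(PHASE_4_DEPS)):
        print(f"wave {i}: {', '.join(wave)}")
```

The second wave is 4.1 alongside 4.2, and the third puts 4.6 and 4.7 next to 4.3, which matches Stage 1 and Stage 2 of the recommended plan; the later single-task waves are the tail of the runtime chain followed by 4.W.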
Phase 5 — Parity test matrix
Tasks
PR 5.1 — Driver.Galaxy.ParityTests project
Parallel-key: parity-shell.
Files
- Create:
  - `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/ParityHarness.cs`: boots the OtOpcUa server twice, once per backend, drives the same OPC UA scenarios against each, and captures structured snapshots.
  - Theory data per scenario (browse, subscribe, alarm transition, write by classification, history read).
- Reuses existing live-Galaxy fixtures from `tests/.../Driver.Galaxy.E2E/`.
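The comparison step of the harness can be illustrated with a small sketch. It assumes each captured snapshot can be flattened to a mapping of field name to value. That shape, the `Delta` record, and `diff_snapshots` are all hypothetical; the real harness is a C# test project whose snapshot types are not defined in this plan.

```python
from dataclasses import dataclass

_MISSING = object()  # marks a field that one backend did not report


@dataclass(frozen=True)
class Delta:
    """One field that differs between the two backends for a scenario."""
    scenario: str
    field: str
    legacy: object
    mxgateway: object


def diff_snapshots(scenario, legacy, mxgateway):
    """Return one Delta per field whose value differs between the backends."""
    deltas = []
    for field in sorted(legacy.keys() | mxgateway.keys()):
        old = legacy.get(field, _MISSING)
        new = mxgateway.get(field, _MISSING)
        if old != new:
            deltas.append(Delta(scenario, field, old, new))
    return deltas
```

An empty list means the scenario is at parity. Each returned `Delta` is a candidate row for the parity matrix, to be marked resolved or accepted.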
PR 5.2 — Browse + read parity scenarios
Parallel-key: parity-browse.
PR 5.3 — Subscribe + event-rate parity scenarios
Parallel-key: parity-subscribe.
PR 5.4 — Write-by-classification parity scenarios
Parallel-key: parity-write.
PR 5.5 — Alarm-transition parity scenarios
Parallel-key: parity-alarms.
Cover both:
- Live transitions: Active / Acknowledged / Inactive sequences against `.InAlarm` / `.Acked` value flips on the dev Galaxy. Must match legacy-host event ordering and severity mapping.
- Alarm-event persistence: trigger N alarm transitions, then verify the SQLite store-and-forward sink drains them into the Wonderware historian event store via the new sidecar's `WriteAlarmEvents` contract (PR 3.3). Compare the persisted rows to those produced by the legacy `GalaxyHistorianWriter` path.
PR 5.6 — History-read parity scenarios
Parallel-key: parity-history.
PR 5.7 — Reconnect/disruption scenarios
Parallel-key: parity-reconnect.
PR 5.8 — Per-platform ScanState probe parity
Parallel-key: parity-probes.
Verify the new PerPlatformProbeWatcher (PR 4.7) produces the same
per-host HostConnectivityStatus stream as the legacy
GalaxyRuntimeProbeManager:
- Initial state on Discover for each `$WinPlatform` / `$AppEngine`.
- Transition events when a runtime is stopped/started on the dev Galaxy.
- Re-subscription after a redeploy that adds/removes a platform.
- Cleanup of probe subscriptions on Shutdown (no leaked advises in gw).
PR 5.W — Parity matrix doc
Files
- `docs/v2/Galaxy.ParityMatrix.md`: table of scenario × result for both backends. Resolved deltas marked, accepted deltas justified.
Phase 5 parallel batches
After 5.1 lands, scenarios 5.2–5.8 are fully parallel — they each add
a separate test class file. Seven worktrees, seven general-purpose agents.
5.W runs after all scenarios merge and pass.
Phase 6 — Performance + hardening
Tasks
PR 6.1 — OpenTelemetry traces
Parallel-key: perf-otel.
PR 6.2 — Bounded channel + drop-newest metrics
Parallel-key: perf-eventpump.
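The plan does not spell out the drop policy, so the sketch below makes an assumption: "drop-newest" is read the way .NET's `BoundedChannelFullMode.DropNewest` defines it, where a full buffer evicts its newest buffered item to make room for the incoming one. The class `DropNewestBuffer` and its `dropped` counter are hypothetical, written in Python only to show the semantics and the metric. The real EventPump is C#.

```python
from collections import deque


class DropNewestBuffer:
    """Bounded FIFO that evicts its newest buffered item when full."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = deque()
        self.dropped = 0  # the metric: how many events were discarded

    def offer(self, item) -> None:
        """Add an item, evicting the newest buffered one if at capacity."""
        if len(self._items) >= self._capacity:
            self._items.pop()  # remove the most recently buffered item
            self.dropped += 1
        self._items.append(item)

    def drain(self) -> list:
        """Return all buffered items in arrival order and empty the buffer."""
        items = list(self._items)
        self._items.clear()
        return items
```

Memory stays bounded by `capacity` regardless of producer rate, which is the property the soak test relies on. If the intended policy is instead to discard the incoming item, `offer` would return early without touching the buffer.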
PR 6.3 — Buffered update interval landing
Parallel-key: perf-buffered.
Wire MxAccess:PublishingIntervalMs → SetBufferedUpdateInterval once
gw exposes it.
PR 6.4 — Soak test scenario
Parallel-key: perf-soak.
50k tags, 24h, automated metric collection.
PR 6.5 — Tune MxGatewayClientOptions defaults
Parallel-key: perf-tuning.
Based on soak data.
PR 6.W — Performance doc
docs/v2/Galaxy.Performance.md.
Phase 6 parallel batches
6.1, 6.2, 6.3 all touch Driver.Galaxy/Runtime/. Serialize them, OR split
files explicitly:
- 6.1 owns a new `Runtime/Tracing.cs` injected via decorator. Parallel-safe.
- 6.2 owns `Runtime/EventPump.cs`. Conflicts with PR 4.4 only if reordered; not in parallel with 6.1 if 6.1 also wraps EventPump. Decide upfront: PR 6.1 wraps at the gateway-client boundary, PR 6.2 owns EventPump internals. Parallel-safe.
- 6.3 modifies `GalaxyDriver.SubscribeAsync` only. Parallel-safe.
So 6.1, 6.2, 6.3 parallel, then 6.4 (depends on all three). 6.5 sequential after 6.4 (uses its data). 6.W last.
Phase 7 — Retire legacy
Tasks
PR 7.1 — Default flip
Parallel-key: retire-defaults.
Files
- `src/.../Server/appsettings.json` → `Galaxy:Backend = mxgateway`.
- `scripts/e2e/e2e-config.sample.json` → drop `OTOPCUA_GALAXY_*` pipe vars, add gw endpoint.
- `scripts/install/Install-Services.ps1` → remove `OtOpcUaGalaxyHost` registration; keep `OtOpcUaWonderwareHistorian` from PR 3.W.
PR 7.2 — Delete legacy projects
Parallel-key: retire-delete.
Files
- Delete:
  - `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/`
  - `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/`
  - `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/`
  - `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/`
  - `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/`
  - `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/`
- Modify:
  - `ZB.MOM.WW.OtOpcUa.slnx`: remove the six entries.
  - `Server/Configuration/DriverFactoryRegistry.cs`: remove the `legacy-host` switch arm.
Depends on: PR 7.1 fully soaked (no rollback risk).
PR 7.3 — Doc + memory housekeeping
Parallel-key: retire-docs.
Files
- `CLAUDE.md`: rewrite Galaxy section.
- `docs/v2/dev-environment.md`: drop `OtOpcUaGalaxyHost` references.
- `docs/ServiceHosting.md`, `docs/Redundancy.md`, `docs/security.md`: scrub `Galaxy.Host` / `Galaxy.Proxy` mentions.
- `~/.claude/projects/.../memory/MEMORY.md`: retire entries:
  - `project_galaxy_host_service.md`
  - `project_galaxy_host_installed.md`
  - `project_aveva_platform_installed.md` (revise: server box no longer needs AVEVA; gw box does)
- Delete: `mxaccess_documentation.md` (no longer consumed by this repo).
- Add memory entry: `project_galaxy_via_mxgateway.md`.
Phase 7 parallel batches
- Batch 7a (sequential, gated by phase 6 production soak): 7.1.
- Batch 7b (parallel after 7.1): 7.2 (`retire-delete`) and 7.3 (`retire-docs`), which touch disjoint files.
Cross-phase dependency graph

    Phase 0 (gw repo) ────────────────────────────────────┐
                                                          │
    Phase 1.1 (Core.Abs/Historian) ──┐                    │
                                     ├── Phase 1.2/1.3    │
                                     │    (server History)│
    Phase 2.1 (Core.Abs/Alarms) ─────┤                    │
                                     ├── Phase 2.2/2.3    │
                                     │    (server Alarms) │
                                     │                    │
                                     └── Phase 3 (sidecar host + client)
                                                │         │
                                                └─────────┴── Phase 4 (Driver.Galaxy)
                                                                    │
                                                              Phase 5 (parity)
                                                                    │
                                                              Phase 6 (perf)
                                                                    │
                                                              Phase 7 (retire)

Maximum-parallelism rollout (one possible execution)
- Day 0–N (mxaccessgw): Phase 0 batches 0a + 0b + 0.W in parallel worktrees, separate repo from this one — runs in parallel with everything below until consumers need the gw bump.
- Day 0–N (this repo): Phases 1.1 and 2.1 in parallel (two worktrees). Merge.
- Day N+: Phases 1.2/1.3, 2.2/2.3, 3.1+3.2+3.3+3.4 in parallel (three worktrees, each a sequential chain).
- Day M: combined wire-up PR 1+2.W, then PR 3.W. Server passes existing e2e against legacy backend.
- Day M+: Phase 4.0 lands. Phase 4 fan-out (four worktrees) starts.
- Day P: Phase 4 wire-up. Phase 5 fan-out (seven worktrees, one per scenario PR 5.2–5.8) starts.
- Day Q: Phase 5 wire-up. Phase 6 fan-out (three worktrees + sequential).
- Day R: Phase 7. Done.
Subagent prompt template
Re-use this shell when launching any of the parallel coding tasks. Replace
<bracketed> parts.
You are implementing PR <id> from lmx_mxgw_impl.md ("<title>").
Repo: <C:\Users\dohertj2\Desktop\lmxopcua | C:\Users\dohertj2\Desktop\mxaccessgw>.
Worktree: <path>.
Scope (you may create/edit only these files):
<list>
DO NOT edit:
- Any file outside the scope above
- ZB.MOM.WW.OtOpcUa.slnx / mxaccessgw/MxGateway.sln
- src/.../Server/Program.cs, OpcUaServerService.cs, appsettings.json
- scripts/install/Install-Services.ps1
- scripts/e2e/e2e-config.sample.json
- CLAUDE.md, docs/**, MEMORY.md, mxaccess_documentation.md
Acceptance:
<list>
Tests:
<list>
If you find a needed change outside scope, STOP and surface it as a
finding rather than editing — it will be picked up by the wire-up PR.
Before reporting completion:
1. Run `dotnet build <smallest project tree that covers your scope>`.
2. Run the new/changed tests.
3. Report: files changed, test command + result, any out-of-scope
findings.
Risk register (operational)
| Risk | When it bites | Mitigation |
|---|---|---|
| Phase 0 gw bump breaks existing mxaccessgw consumers | Phase 0 wire-up | Cross-language smoke matrix in mxaccessgw must run before merge |
| Two parallel agents both edit `OpcUaServerService.cs` despite the rule | Phases 1+2 parallel | Wire-up convention + grep-based pre-merge check (`git diff --stat origin/main` of locked files in the integration branch must be empty until the wire-up PR) |
| Subagent silently adds a stray `using` to a locked file | Anytime | The build-and-test step in the prompt will fail if the locked file changed and broke compile; a `git diff --name-only` whitelist check at integration-branch merge time enforces it |
| Galaxy.Host can't build during phase 3.2 because lifted files vanished | Phase 3 mid-flight | PR 3.2 adds a ProjectReference from Galaxy.Host to Driver.Historian.Wonderware so the moved files remain reachable; tests cover both call sites |
| Phase 4 chain stalls because gw exposes no synchronous read | PR 4.2 | Surface as a Phase 0 finding immediately — add a ReadCommand to gw or accept short-lived advise as the read mechanism (document as a perf accepted delta in 5.W) |
| Phase 5 parity matrix exposes a delta no one wants to fix | Phase 5 | Phase 7 gating: Galaxy:Backend=mxgateway does not become default until every parity delta is either resolved or has a written acceptance from the user |
| Soak test in 6.4 finds a memory leak in EventPump | Phase 6 | EventPump bounded-channel design (PR 6.2) is shipped before soak so the leak is bounded by design |
| Stale memory file references retired code after phase 7 | Phase 7 | PR 7.3 explicitly retires project_galaxy_host_* entries; add a memory-audit step to phase-close checklist |
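The `git diff --name-only` whitelist check named in the table could look roughly like the sketch below. It is a sketch under assumptions, not an existing script in the repo: `locked_file_violations` and `LOCKED_PATTERNS` are hypothetical names, the input is assumed to be the list of paths printed by `git diff --name-only` against the base branch, and the locked entries whose paths are elided in the prompt template (`src/.../Server/...`) are deliberately left out instead of guessed.

```python
from fnmatch import fnmatchcase

# Locked paths copied from the subagent prompt template. Entries whose path
# is elided there ("src/.../Server/...") are omitted, not guessed.
LOCKED_PATTERNS = (
    "ZB.MOM.WW.OtOpcUa.slnx",
    "scripts/install/Install-Services.ps1",
    "scripts/e2e/e2e-config.sample.json",
    "CLAUDE.md",
    "docs/*",  # fnmatch's "*" also matches "/", so this covers docs/**
)


def locked_file_violations(changed_paths, patterns=LOCKED_PATTERNS):
    """Return the changed paths that match a locked pattern, sorted."""
    return sorted(
        path
        for path in set(changed_paths)
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    )
```

A non-wire-up branch passes when the function returns an empty list. For the wire-up PR itself the check would be skipped or inverted, since that PR is the one place these files are expected to change.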
Phase-close checklist (apply at the end of each phase)
Before declaring a phase done:
- `dotnet build ZB.MOM.WW.OtOpcUa.slnx` clean on integration branch.
- `dotnet test ZB.MOM.WW.OtOpcUa.slnx` clean (or all-but-known-skipped).
- Live-Galaxy smoke (when applicable) green on dev box.
- No locked files modified outside their wire-up PR (`git log --name-only origin/main..HEAD -- <locked-paths>` shows only the wire-up commit).
- `MEMORY.md` updated for any persistent context this phase introduced.
- Doc updates limited to the phase's scope (no doc edits sprinkled across non-doc PRs).