lmxopcua/lmx_mxgw_impl.md
Joseph Doherty 006af51768 docs: post-PR-7.2 cleanup — audit + three-track scrub
Audit (three parallel agent passes) found 43 markdown files carrying
stale references to the deleted Galaxy.Host/Proxy/Shared projects
after the v2-mxgw merge. This commit lands the prioritized fixes.

Track 1 — high-traffic in-place rewrites (3 files, ~454 lines deleted)
- README.md (202 → 91 lines): drops .NET 4.8 / x86 / TopShelf install
  text; leads with the multi-driver .NET 10 server identity and points
  at scripts/install/Install-Services.ps1 and the parity rig.
- docs/v2/driver-specs.md §1 Galaxy (~289 → ~66 lines): replaces the
  Tier-C out-of-process spec with a Tier-A in-process description
  matching the current GalaxyDriver code, with the four-section
  GalaxyDriverOptions JSON shape pulled verbatim from
  Config/GalaxyDriverOptions.cs.
- docs/drivers/Galaxy.md (211 → 92 lines): full rewrite around the
  current Browse/Runtime/Health/Config sub-folders.

Track 2 — historical banners (5 files)
- lmx_mxgw.md, lmx_mxgw_impl.md, lmx_backend.md,
  docs/v2/Galaxy.ParityMatrix.md,
  docs/v2/implementation/phase-2-galaxy-out-of-process.md each get a
  " Completed 2026-04-30 — historical record" banner block. lmx_mxgw.md
  also fixes two dead links (`docs/Galaxy.Driver.md` and
  `docs/v2/Galaxy.Driver.md`) → `docs/drivers/Galaxy.md`.

Track 3 — v1 archive sweep (10 git mv + 1 new index + 2 in-place scrubs)
- Moved 10 v1 docs under docs/v1/ preserving subpath structure:
  AlarmTracking, Configuration, DataTypeMapping, HistoricalDataAccess,
  Subscriptions (top-level); drivers/Galaxy-Repository,
  drivers/Galaxy-Test-Fixture; reqs/GalaxyRepositoryReqs,
  reqs/MxAccessClientReqs, reqs/ServiceHostReqs.
- New docs/v1/README.md is the shared archive banner + per-file table.
- docs/README.md repointed to the v1 paths and updated to reflect the
  v2 two-process deploy shape (Server + Admin + optional
  OtOpcUaWonderwareHistorian).
- docs/v2/Galaxy.ParityRig.md got a historical banner + four inline
  scrubs marking the OtOpcUaGalaxyHost service / Driver.Galaxy.Host
  EXE / Driver.Galaxy.ParityTests project as deleted-in-PR-7.2.

The repo's live-reading surface (README + CLAUDE.md + docs/v2/) now
describes only the post-PR-7.2 architecture. v1 docs are preserved as
a labelled archive under docs/v1/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 08:59:59 -04:00

> **✅ Completed 2026-04-30 — historical record of the v2-mxgw implementation plan.**
>
> All 39 PRs across 7 phases (1.1-1.3 + 2.1-2.3 + 1+2.W + 3.1-3.W +
> 4.0-4.W + 5.1-5.W + 6.1-6.W + 7.1-7.3) shipped and merged to master
> at commit `ae7106d`. Per-phase status tracking below is preserved as
> the historical PR-execution log; phase descriptions are
> retrospective, not pending. Parity matrix verified green on the dev
> rig 2026-04-30 (14 passed / 1 skipped / 0 failed —
> see `docs/v2/Galaxy.ParityMatrix.md`).
# Galaxy → MxGateway Migration — Detailed Implementation Plan
Companion to `lmx_mxgw.md` (design plan). This document breaks the plan into
PR-sized tasks with concrete file paths, acceptance checks, test deltas, and
explicit parallel-safety analysis for subagent execution.
Cross-repo scope:
- **`lmxopcua`** (this repo) — drivers, server, install scripts, e2e, docs.
- **`mxaccessgw`** (`C:\Users\dohertj2\Desktop\mxaccessgw`) — gRPC gateway,
worker, .NET client.
---
## How to use parallel subagents safely
The plan lists each task with a `parallel-key`. Two tasks share a key when
they touch the same file(s); tasks with **disjoint keys are safe to run in
parallel**. Tasks within the same phase that share a key MUST run
sequentially.
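
The key rule above can be sketched as a tiny scheduler. This is an illustrative Python sketch, not code from either repo: `plan_batches` is a hypothetical helper, and the task/key pairs are copied from the Phase 0 table later in this document. Tasks that share a key stay in their listed order, one per wave; tasks with disjoint keys may share a wave.

```python
from collections import defaultdict


def plan_batches(tasks):
    """Group (task, parallel_key) pairs into waves.

    Tasks sharing a key form one sequential lane; lanes with
    different keys advance side by side, one task per wave.
    """
    lanes = defaultdict(list)
    for task, key in tasks:
        lanes[key].append(task)
    depth = max((len(lane) for lane in lanes.values()), default=0)
    return [
        [lane[i] for lane in lanes.values() if i < len(lane)]
        for i in range(depth)
    ]


# Task/key pairs as listed in the Phase 0 table.
phase0 = [
    ("0.1", "gw-proto-galaxy"),
    ("0.2", "gw-proto-mxaccess"),
    ("0.3", "gw-proto-mxaccess"),
    ("0.4", "gw-proto-mxaccess"),
    ("0.5", "gw-docs"),
    ("0.6", "gw-dotnet-client"),
    ("0.7", "gw-auth"),
]
batches = plan_batches(phase0)
# batches == [["0.1", "0.2", "0.5", "0.6", "0.7"], ["0.3"], ["0.4"]]
```

The plan's own Phase 0 batching is slightly more conservative than this sketch: it keeps 0.2 out of the first parallel batch and runs the whole `gw-proto-mxaccess` lane on one agent.
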
### Subagent execution rules
1. **One git worktree per parallel subagent.** Spawn each parallel agent
with `Agent({ isolation: "worktree", ... })` so they never collide on the
working tree. Merge back to a shared integration branch after each
parallel batch completes.
2. **Interface-defining tasks run first, then their consumers.** Anywhere
the plan says "PR X.0: define interface", that PR must merge to the
integration branch before its consumers fan out in parallel.
3. **Shared-file edits serialize.** Files touched by more than one PR in a
batch — `ZB.MOM.WW.OtOpcUa.slnx`, `Install-Services.ps1`,
`appsettings.json`, `CLAUDE.md`, `MEMORY.md` — get a single dedicated
"wire-up" PR at the end of the batch that ingests every parallel branch's
needed line. Don't let parallel agents edit them.
4. **Test fixtures own their fixture file.** When two PRs both need a
`FakeMxGatewayClient`, the first PR creates it and exposes the contract;
subsequent PRs add cases to the same file or extend it via partial class
in their own test files.
5. **Subagent prompt must include the parallel-key and disallowed paths.**
Any agent prompt must say "you may NOT edit `<sln file>`,
`<wire-up files>`, or files outside `<your scope>`. If you discover a
needed change there, surface it as a task for the wire-up PR; do not
make it yourself." This prevents merge conflicts at integration time.
6. **Choose the right subagent type.**
- `Explore` — read-only research/locate. Cheap. Use before any PR that
needs to learn the surrounding code.
- `Plan` — produce a step-by-step PR plan from a brief; no code writes.
Use when a task description below is too coarse for a fresh agent.
- `general-purpose` — code-writing. Use for PRs that create/modify
source.
- `code-simplifier` — post-PR cleanup pass on the same files.
- `codex:rescue` — a stuck PR; use sparingly.
7. **Foreground vs. background.** Run one PR foreground if its result
gates the rest of your work this turn. Run the rest in background and
read results when they complete.
8. **Trust but verify.** After every subagent claims completion, the
parent runs the build (`dotnet build ZB.MOM.WW.OtOpcUa.slnx`) and the
target tests. The agent's report is hearsay until the build is green.
9. **Worktree cleanup.** When `isolation: "worktree"` returns no path,
nothing was changed; if it returns a path, integrate by cherry-picking
or fast-forwarding into the integration branch, then prune the worktree.
### Locked files (never edit from a parallel batch)
These get a dedicated wire-up PR at the **end** of each phase's parallel
fanout:
| File | Why locked |
|---|---|
| `ZB.MOM.WW.OtOpcUa.slnx` | New project additions stack and conflict |
| `src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json` | Config schema additions stack |
| `src/ZB.MOM.WW.OtOpcUa.Server/Program.cs` (or `Startup.cs`) | DI registrations stack |
| `scripts/install/Install-Services.ps1` | Service registrations stack |
| `scripts/e2e/e2e-config.sample.json` | E2E config stacks |
| `CLAUDE.md`, `docs/v2/dev-environment.md` | Doc edits stack |
| `MEMORY.md` (auto-memory index) | One line per change; conflicts often |
| `mxaccessgw/MxGateway.sln` | Same reason as our slnx |
| `mxaccessgw/clients/proto/*.proto` files | Proto edits stack and reorder field numbers |
---
## Phase 0 — mxaccessgw foundation work
Repo: `C:\Users\dohertj2\Desktop\mxaccessgw`. Branch off `main` per task.
| PR | Title | Parallel-key | Files |
|----|-------|--------------|-------|
| 0.1 | Galaxy attribute metadata parity | `gw-proto-galaxy` | `clients/proto/galaxy_repository.proto`, `src/MxGateway.Server/Galaxy/AttributeMapper.cs`, `src/MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`, `gr/`-equivalent SQL in `src/MxGateway.Server/Galaxy/Sql/`, contract tests |
| 0.2 | Bulk subscribe with publishing-interval hint | `gw-proto-mxaccess` | `clients/proto/mxaccess_gateway.proto` (extend `SubscribeBulkCommand` with `optional uint32 buffered_update_interval_ms`), `src/MxGateway.Worker/MxAccess/Commands/SubscribeBulkHandler.cs`, `src/MxGateway.Server/Sessions/Mappers.cs`, worker tests |
| 0.3 | Subscription replay RPC | `gw-proto-mxaccess` | Same proto file as 0.2 (add `ReplaySubscriptionsCommand`), `src/MxGateway.Worker/MxAccess/Commands/ReplaySubscriptionsHandler.cs`, gateway forwarder, tests |
| 0.4 | Session health stream | `gw-proto-mxaccess` | Same proto (add `StreamSessionHealth(SessionId) returns (stream SessionHealth)`), `src/MxGateway.Server/Sessions/SessionHealthService.cs`, dashboard projection, tests |
| 0.5 | Document event-stream resume contract | `gw-docs` | `docs/Sessions.md`, `docs/gateway-process-design.md` — define retention bound, `events_lost` signal in `MxEvent` envelope |
| 0.6 | .NET client `MxValue` adapter + `SubscribeWithCallback` | `gw-dotnet-client` | `clients/dotnet/MxGateway.Client/MxValueAdapter.cs` (new), `clients/dotnet/MxGateway.Client/MxGatewaySession.cs` (extend with `SubscribeWithCallbackAsync`), `clients/dotnet/MxGateway.Client.Tests/` |
| 0.7 | API key scopes + `mxgw-key` minting CLI | `gw-auth` | `src/MxGateway.Server/Auth/`, `src/MxGateway.Cli/`, `docs/Authentication.md` |
### Phase 0 parallel batches
- **Batch 0a (parallel):** 0.1 (`gw-proto-galaxy`), 0.5 (`gw-docs`),
0.6 (`gw-dotnet-client`), 0.7 (`gw-auth`). Four worktrees, four
`general-purpose` agents.
- **Batch 0b (sequential within key, parallel across keys):** 0.2 → 0.3 →
0.4 all share `gw-proto-mxaccess`. Land them in order on the same agent
(or three sequential calls). Field number assignment must be coordinated
through the wire-up PR.
- **Wire-up 0.W:** integrate proto-generated descriptors, regenerate
`clients/proto/descriptors`, run cross-language smoke matrix.
**Phase 0 exit:** mxaccessgw `main` carries all seven PRs. Tag the gw NuGet
release. Bump `MxGateway.Client` consumed by lmxopcua.
---
## Phase 1 — Server-level historian extension point (lmxopcua)
Goal: detach `IHistorianDataSource` from the Galaxy driver. Server's
`HistoryRead*` operations call into a registered data source by namespace,
not into `IHistoryProvider` on the driver.
### Tasks
#### PR 1.1 — Lift `IHistorianDataSource` to `Core.Abstractions`
**Parallel-key:** `core-abs-historian` (locks files in
`Core.Abstractions/Historian/`).
**Files**
- Create:
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/IHistorianDataSource.cs`
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/HistorianSample.cs`
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/HistorianAggregateSample.cs`
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/HistorianEvent.cs`
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/HistorianHealthSnapshot.cs`
- Move-from (Galaxy.Host originals stay until phase 7; new copies live in
Core.Abstractions and are pure POCO):
- source bodies in `src/.../Driver.Galaxy.Host/Backend/Historian/`
- Modify:
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj` (no change if files auto-included)
- Tests:
- `tests/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/Historian/IHistorianDataSourceContractTests.cs`:
contract documentation tests (null-argument behavior, time-range ordering).
**Acceptance**
- `dotnet build` clean.
- New tests run and pass.
- Galaxy.Host still compiles (it keeps its own copies until phase 7).
**Subagent prompt boilerplate** (template — re-use this shape for each PR):
> You are working in worktree `<path>`. Create the files listed in PR 1.1 of
> `lmx_mxgw_impl.md`. Do NOT edit any file under `Driver.Galaxy.Host/`,
> `appsettings.json`, the `.slnx`, or `Program.cs`. The DTOs are pure value
> records — do not import OPC UA types or COM types. Run
> `dotnet build src/ZB.MOM.WW.OtOpcUa.Core.Abstractions` before reporting.
#### PR 1.2 — `IHistoryService` plugin host on the server
**Parallel-key:** `server-history`.
**Files**
- Create:
- `src/ZB.MOM.WW.OtOpcUa.Server/History/IHistoryRouter.cs` — namespace → `IHistorianDataSource`.
- `src/ZB.MOM.WW.OtOpcUa.Server/History/HistoryRouter.cs` — registry impl.
- `src/ZB.MOM.WW.OtOpcUa.Server/History/HistoryServiceAdapter.cs`:
bridges OPC UA `HistoryRead`/`HistoryReadProcessed`/`HistoryReadAtTime`/
`HistoryReadEvents` to the router.
- Modify:
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUaServerService.cs` — register
`HistoryServiceAdapter`. *Locked file* — defer to wire-up PR 1.W.
- Tests:
- `tests/ZB.MOM.WW.OtOpcUa.Server.Tests/History/HistoryRouterTests.cs`.
**Acceptance**
- Router resolves data source by namespace prefix.
- Unknown namespace returns `BadHistoryOperationUnsupported` (or current
status used for that case — verify against existing server behavior in
`OpcUaServerService.cs` before coding).
**Depends on:** 1.1 merged.
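
The two acceptance checks can be sketched as follows. This is a minimal Python illustration of the routing idea, not the C# `HistoryRouter`: the class shape, the `history_read` helper, the use of plain callables as data sources, and the longest-prefix tie-break are all assumptions made for the example. The status is shown as a string because the plan itself leaves the exact code to be verified against existing server behavior.

```python
class HistoryRouter:
    """Maps a namespace prefix to a historian data source (sketch)."""

    def __init__(self):
        self._sources = {}

    def register(self, namespace_prefix, source):
        self._sources[namespace_prefix] = source

    def resolve(self, namespace_uri):
        # Assumption: when several prefixes match, the longest one wins.
        matches = [p for p in self._sources if namespace_uri.startswith(p)]
        if not matches:
            return None
        return self._sources[max(matches, key=len)]


def history_read(router, namespace_uri, node_id):
    """Return (status, samples); unknown namespaces are rejected."""
    source = router.resolve(namespace_uri)
    if source is None:
        return ("BadHistoryOperationUnsupported", [])
    return ("Good", source(node_id))


router = HistoryRouter()
# Hypothetical data source: returns one canned sample per node.
router.register("urn:galaxy", lambda node_id: [(node_id, 1.0)])

known = history_read(router, "urn:galaxy:area1", "Tank01.Level")
unknown = history_read(router, "urn:modbus", "Coil1")
```
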
#### PR 1.3 — Driver capability shrink: drop `IHistoryProvider` requirement
**Parallel-key:** `server-history`.
**Files**
- Modify:
- `src/ZB.MOM.WW.OtOpcUa.Server/DriverNodeManager.cs` (or wherever
`IHistoryProvider` is consumed; locate via `Grep "IHistoryProvider"`).
Replace direct calls with `IHistoryRouter.Resolve(...)`.
- Tests:
- Update any test that exercised `IHistoryProvider` paths to register a
fake data source via the router.
**Depends on:** 1.2 merged.
#### PR 1.W — Phase 1 wire-up
**Parallel-key:** locked-files.
**Files**
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUaServerService.cs` — DI registration of
`HistoryRouter` + the legacy Galaxy.Host historian adapter.
- `ZB.MOM.WW.OtOpcUa.slnx` — no change unless a new project was added; if
PR 1.1 went into the existing `Core.Abstractions` project, no slnx edit.
### Phase 1 parallel batches
- **Batch 1a (sequential):** 1.1 → 1.2 → 1.3 → 1.W. Each blocks the next.
- Total: one foreground sequence; no parallelism in Phase 1. Use one
`general-purpose` agent across all four PRs, or one PR per agent in
order.
---
## Phase 2 — Server-level alarm condition subsystem (lmxopcua)
Goal: drop `GalaxyAlarmTracker` from the driver's responsibilities; the
server runs the AlarmCondition state machine driven by `IsAlarm=true`
attribute metadata.
### Tasks
#### PR 2.1 — Address-space builder alarm-declaration API
**Parallel-key:** `core-abs-alarms`.
**Files**
- Modify:
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAddressSpaceBuilder.cs`:
add `IAlarmConditionDeclaration MarkAsAlarmCondition(...)` (the
method already exists per `GalaxyProxyDriver.cs:146`; verify shape and
extend with the five sub-attribute references).
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Alarms/AlarmConditionInfo.cs`
— add `InAlarmRef`, `PriorityRef`, `DescAttrNameRef`, `AckedRef`,
`AckMsgWriteRef` fields.
- Tests:
- `tests/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/Alarms/AlarmConditionInfoTests.cs`.
**Acceptance**
- Existing call sites (`GalaxyProxyDriver.DiscoverAsync`) still compile —
add the new fields with safe defaults.
#### PR 2.2 — `AlarmConditionService` (state machine)
**Parallel-key:** `server-alarms`.
**Files**
- Create:
- `src/ZB.MOM.WW.OtOpcUa.Server/Alarms/AlarmConditionService.cs`
- `src/ZB.MOM.WW.OtOpcUa.Server/Alarms/AlarmConditionState.cs`
- `src/ZB.MOM.WW.OtOpcUa.Server/Alarms/IAlarmAcknowledger.cs`
- Reference impl to **port** (do not duplicate — read it for invariants):
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Backend/Alarms/GalaxyAlarmTracker.cs`
- Tests:
- `tests/ZB.MOM.WW.OtOpcUa.Server.Tests/Alarms/AlarmConditionServiceTests.cs`:
port the existing tracker tests (`tests/.../Galaxy.Host.Tests/`).
**Subagent guidance**
- **Two-step.** First a `Plan` agent: read `GalaxyAlarmTracker.cs` and
produce a state-transition table + a list of tests to port. Then a
`general-purpose` agent: implement `AlarmConditionService` against that
table.
**Depends on:** 2.1 merged.
#### PR 2.3 — Wire alarm service into `DriverNodeManager`
**Parallel-key:** `server-alarms`.
**Files**
- Modify:
- `src/ZB.MOM.WW.OtOpcUa.Server/DriverNodeManager.cs` — on each driver's
discovery, collect alarm declarations and hand to `AlarmConditionService`
along with the driver's `ISubscribable` and `IWritable` for sub-attribute
advise + ack writes.
- Tests:
- extend `DriverNodeManagerTests` with a fake driver that declares one
alarm-bearing node.
**Depends on:** 2.2 merged.
#### PR 2.W — Phase 2 wire-up
DI registration of `AlarmConditionService` in `OpcUaServerService.cs`.
### Phase 2 parallel batches
- **Batch 2a (sequential):** 2.1 → 2.2 → 2.3 → 2.W.
### Phases 1 + 2 cross-batch parallelism
PR 1.1 and PR 2.1 touch **different files** in `Core.Abstractions/` (one
under `Historian/`, one in `IAddressSpaceBuilder.cs` + `Alarms/`). They are
**parallel-safe**.
PR 1.2/1.3 and PR 2.2/2.3 both modify `OpcUaServerService.cs` and
`DriverNodeManager.cs`. They share **two locked files** — but only at the
DI-registration level. If we split the `OpcUaServerService.cs` edits into a
single combined wire-up PR (1+2.W), the body PRs 1.2/1.3 and 2.2/2.3 don't
touch them. Then the body PRs *can* run in parallel batches across
phase 1 and phase 2.
**Recommended Phase 1+2 plan** (parallel):
1. Run **PR 1.1 and PR 2.1 in parallel** (two worktrees, two
`general-purpose` agents). Both target `Core.Abstractions` only.
2. Merge both to integration branch.
3. Run **PR 1.2/1.3 and PR 2.2/2.3 in parallel**, each as a sequential
2-PR chain on its own worktree. Constraint: neither chain edits
`OpcUaServerService.cs` or `DriverNodeManager.cs` — defer all DI/wiring
to the combined wire-up.
4. Merge both chains.
5. **Combined wire-up PR 1+2.W** edits `OpcUaServerService.cs` and
`DriverNodeManager.cs` once.
---
## Phase 3 — `Driver.Historian.Wonderware` sidecar
Goal: house the existing `HistorianDataSource` code in its own .NET 4.8 x86
service, exposed over named pipe; ship a .NET 10 client implementing
`IHistorianDataSource`.
### Tasks
#### PR 3.1 — Create the sidecar shell project
**Parallel-key:** `historian-sidecar-host`.
**Files**
- Create project: `src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/`
- `Driver.Historian.Wonderware.csproj` (`<TargetFramework>net48</TargetFramework>`,
`<PlatformTarget>x86</PlatformTarget>`).
- `Program.cs` — Serilog + console host + named pipe server (mirror
`Driver.Galaxy.Host/Program.cs` shape: env-driven pipe name, allowed SID,
shared secret).
- Create test project:
- `tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/`
- *Locked:* `.slnx`, `Install-Services.ps1` (wire-up).
#### PR 3.2 — Lift `HistorianDataSource` & friends
**Parallel-key:** `historian-sidecar-host`.
**Files**
- Move (preserve git history with `git mv`):
- `src/.../Driver.Galaxy.Host/Backend/Historian/HistorianDataSource.cs` →
`src/.../Driver.Historian.Wonderware/Backend/HistorianDataSource.cs`
- `HistorianClusterEndpointPicker.cs`
- `HistorianClusterNodeState.cs`
- `HistorianConfiguration.cs`
- `HistorianEventDto.cs`
- `HistorianHealthSnapshot.cs`
- `HistorianQualityMapper.cs`
- `HistorianSample.cs`
- `IHistorianConnectionFactory.cs`
- Add a thin `IHistorianDataSource` shim in the sidecar that re-implements
the **interface from `Core.Abstractions/Historian/`** (after PR 1.1).
- Galaxy.Host needs to keep building until phase 7. Either:
- Add `Driver.Historian.Wonderware` ProjectReference from
`Driver.Galaxy.Host` and re-use the moved code, OR
- Leave a stub copy in Galaxy.Host that delegates to the sidecar via the
new client. Pick option 1 (cleaner).
- Tests:
- `git mv` matching test files from
`tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/Backend/Historian/`
to `tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/`.
**Depends on:** PR 1.1 merged (interface lives in Core.Abstractions).
#### PR 3.3 — Pipe contract + handler
**Parallel-key:** `historian-sidecar-pipe`.
**Files**
- Create:
- `src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/Ipc/Contracts.cs`
(MessagePack DTOs: `ReadRawRequest/Reply`, `ReadProcessedRequest/Reply`,
`ReadAtTimeRequest/Reply`, `ReadEventsRequest/Reply`,
**`WriteAlarmEventsRequest/Reply`** — alarm-event persistence write
path; mirror today's `GalaxyHistorianWriter.WriteBatchAsync` payload
so the SQLite store-and-forward sink in `Core.AlarmHistorian` can
drain into the Wonderware historian event store after Galaxy.Proxy is
deleted).
- `Ipc/PipeServer.cs` — copy + adapt
`Driver.Galaxy.Host/Ipc/PipeServer.cs` (same ACL/secret model).
- `Ipc/HistorianFrameHandler.cs` — handles all five contract pairs
above.
- Tests:
- `tests/.../Driver.Historian.Wonderware.Tests/Ipc/PipeRoundTripTests.cs`
— round-trip every contract pair including `WriteAlarmEvents`.
#### PR 3.4 — .NET 10 client
**Parallel-key:** `historian-sidecar-client`.
**Files**
- Create project: `src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/`
(.NET 10 x64). Implements:
- `IHistorianDataSource` (read path: raw / processed / at-time / events)
against the sidecar pipe.
- `IAlarmHistorianWriter` (write path: alarm-event persistence) against
the sidecar pipe `WriteAlarmEvents` contract from PR 3.3.
- Tests:
- `tests/.../Driver.Historian.Wonderware.Client.Tests/` against an
in-proc fake pipe server. Cover both the read interface and the
alarm-event write interface; verify the SQLite store-and-forward sink
(`Core.AlarmHistorian.SqliteStoreAndForwardSink`) drains successfully
when the client is plugged in as its target.
**Depends on:** PR 3.3 merged (contracts published).
#### PR 3.W — Phase 3 wire-up
**Files**
- `ZB.MOM.WW.OtOpcUa.slnx` — register three new projects + two new test
projects.
- `scripts/install/Install-Services.ps1` — register
`OtOpcUaWonderwareHistorian` NSSM service.
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUaServerService.cs` — register the
client as both an `IHistorianDataSource` for the Galaxy namespace **and**
the `IAlarmHistorianWriter` target for the SQLite store-and-forward
sink, replacing today's `GalaxyProxyDriver.WriteBatchAsync` route.
- `src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json`: `Historian:Wonderware`
block.
### Phase 3 parallel batches
- **Batch 3a (sequential):** 3.1 (shell) → 3.2 (lift code).
- **Batch 3b (sequential, after 3.2):** 3.3 (pipe) → 3.4 (client). The two
look parallel, but 3.4 depends on 3.3's contracts, so Phase 3 is
sequential internally.
- **Batch 3c:** 3.W.
But Phase 3 is **fully independent of Phase 1.1's downstream work** once
1.1 has merged. Phase 3 can run in parallel with Phase 1.2/1.3 and all of
Phase 2.
**Recommended phasing**: kick off Phase 3 in parallel with Phase 2, both
gated only on Phase 1.1's merge.
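
The cross-phase ordering described so far can be checked mechanically. The sketch below is a hedged illustration using Python's standard `graphlib`; the dependency edges are transcribed from the "Depends on" lines and batch notes above (with Phase 3 gated on 1.1, as recommended), and the `waves` helper is hypothetical.

```python
from graphlib import TopologicalSorter

# node -> set of PRs that must merge first (transcribed from the plan).
deps = {
    "1.1": set(),
    "2.1": set(),
    "1.2": {"1.1"},
    "1.3": {"1.2"},
    "2.2": {"2.1"},
    "2.3": {"2.2"},
    "1+2.W": {"1.3", "2.3"},
    "3.1": {"1.1"},
    "3.2": {"3.1"},
    "3.3": {"3.2"},
    "3.4": {"3.3"},
    "3.W": {"3.4"},
}


def waves(graph):
    """Return lists of PRs that can run concurrently, in order."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


schedule = waves(deps)
# schedule[0] == ["1.1", "2.1"]
# schedule[1] == ["1.2", "2.2", "3.1"]
```

This only models merge-order dependencies. It does not model the locked-file rule, so the two wire-up PRs still need to be serialized by hand where they edit the same file.
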
---
## Phase 4 — New `Driver.Galaxy` (Tier-A, .NET 10) against gw
This is the bulk of the work. Each PR adds one capability to the new driver.
The driver builds and links from PR 4.0 onward; capabilities arrive as
incremental green bars.
The driver lives at `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/` (note: same
short name as the old `.Proxy`, but new project. The `.Host`, `.Proxy`,
`.Shared` projects continue to coexist until phase 7).
### Tasks
#### PR 4.0 — Project skeleton, options, factory
**Parallel-key:** `galaxy-shell`.
**Files**
- Create project: `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/`
- `Driver.Galaxy.csproj` (.NET 10 x64), references
`Core.Abstractions`, `Core`, `MxGateway.Client` (NuGet from gw repo).
- `GalaxyDriver.cs`: `IDriver` + `IDisposable` skeleton; `Initialize`
creates `MxGatewayClient` and opens a session; `Shutdown` disposes.
- `Config/GalaxyDriverOptions.cs` — POCO matching the JSON shape in
`lmx_mxgw.md`.
- `GalaxyDriverFactoryExtensions.cs`: `AddGalaxyDriver(IServiceCollection)`.
- Tests:
- `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/` (new project)
- `Tests/GalaxyDriverInitializationTests.cs` — uses a fake
`IMxGatewayClientTransport` to verify open-session behavior.
- *Locked:* `.slnx` (wire-up PR 4.W).
**Acceptance**
- Driver builds, `Initialize` opens a session against a fake transport,
`Shutdown` closes it.
- `IDriver.RecycleAsync` (if present in the interface today) returns the
same stub shape as the legacy backend — `{Accepted = true, GraceSeconds
= 15}` — and is documented in the file as intentionally a no-op until a
future PR wires it through gw. Today's `MxAccessGalaxyBackend.RecycleAsync`
is itself a stub, so this preserves behavior exactly.
#### PR 4.1 — `ITagDiscovery` via `GalaxyRepositoryClient`
**Parallel-key:** `galaxy-discover`.
**Files**
- Create:
- `src/.../Driver.Galaxy/Browse/GalaxyDiscoverer.cs`
- `src/.../Driver.Galaxy/Browse/DataTypeMap.cs`:
`mx_data_type → DriverDataType`. Port table from
`GalaxyProxyDriver.MapDataType` (lines 523-532) and verify against
`gr/data_type_mapping.md`.
- `src/.../Driver.Galaxy/Browse/SecurityMap.cs` — port
`GalaxyProxyDriver.MapSecurity` (lines 534-544).
- `src/.../Driver.Galaxy/Browse/AlarmRefBuilder.cs` — for any attribute
where `IsAlarm=true`, compute the five sub-attribute references by
Galaxy naming convention (`<tag>.<attr>.InAlarm`,
`<tag>.<attr>.Priority`, `<tag>.<attr>.DescAttrName`,
`<tag>.<attr>.Acked`, `<tag>.<attr>.AckMsg`) and populate
`AlarmConditionInfo.{InAlarmRef, PriorityRef, DescAttrNameRef,
AckedRef, AckMsgWriteRef}` before passing to `MarkAsAlarmCondition`.
Mirrors today's behavior in
`MxAccessGalaxyBackend.SubscribeAlarmsAsync` so the server-level
`AlarmConditionService` (Phase 2) has every ref it needs.
- Modify:
- `GalaxyDriver.cs` — implement `ITagDiscovery.DiscoverAsync` calling
discoverer.
- Tests:
- `Tests/Browse/GalaxyDiscovererTests.cs` — fake
`IGalaxyRepositoryClientTransport` with canned `GalaxyObject` list.
- `Tests/Browse/AlarmRefBuilderTests.cs` — for an alarm-bearing
attribute, verify all five refs match the `<tag>.<attr>.{...}` shape
and round-trip cleanly through `MarkAsAlarmCondition`.
**Acceptance**
- Discovered nodes carry `mx_data_type`, `IsArray`, `ArrayDim`,
`SecurityClassification`, `IsHistorized`, `IsAlarm` matching what the
legacy backend produces (snapshot-compared in Phase 5).
- Every `IsAlarm=true` attribute calls `MarkAsAlarmCondition` with all
five sub-attribute refs populated. The `AlarmConditionService` from
Phase 2 must be able to subscribe and ack without further help from
the driver.
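
The naming convention behind `AlarmRefBuilder` is simple enough to sketch. The following Python illustration only shows the string construction described above; the `AlarmRefs` dataclass and `build_alarm_refs` function are hypothetical stand-ins for the C# `AlarmConditionInfo` fields, and the tag and attribute names in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmRefs:
    """Stand-in for the five AlarmConditionInfo reference fields."""
    in_alarm_ref: str
    priority_ref: str
    desc_attr_name_ref: str
    acked_ref: str
    ack_msg_write_ref: str


def build_alarm_refs(tag: str, attr: str) -> AlarmRefs:
    """Derive the five sub-attribute references for an alarm attribute."""
    base = f"{tag}.{attr}"
    return AlarmRefs(
        in_alarm_ref=f"{base}.InAlarm",
        priority_ref=f"{base}.Priority",
        desc_attr_name_ref=f"{base}.DescAttrName",
        acked_ref=f"{base}.Acked",
        ack_msg_write_ref=f"{base}.AckMsg",
    )


# Hypothetical tag and attribute names, for illustration only.
refs = build_alarm_refs("Tank01", "HiAlarm")
```
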
**Subagent guidance**
- Use an `Explore` agent first: "find every place in
`Driver.Galaxy.Proxy/GalaxyProxyDriver.cs` that consumes
`DiscoverHierarchyResponse` and list every wire field it reads, so we
know what gw's proto must surface."
**Depends on:** PR 4.0 merged + PR 0.1 (gw attribute parity) NuGet bumped.
#### PR 4.2 — `IReadable` (one-shot read path)
**Parallel-key:** `galaxy-read`.
**Files**
- Create:
- `src/.../Driver.Galaxy/Runtime/GalaxyMxSession.cs` — owns
`MxGatewaySession`, `Register` server handle, in-memory
`tag → itemHandle` registry.
- `src/.../Driver.Galaxy/Runtime/MxValueDecoder.cs`:
`MxValue → object` (boolean/int32/float/double/string/datetime, plus
array variants).
- `src/.../Driver.Galaxy/Runtime/StatusCodeMap.cs` — explicit
`MxStatusProxy → uint OPC UA StatusCode` mapping table. Today's
coarse `vtq.Quality >= 192 ? Good : Uncertain_Placeholder` becomes a
full mapping covering at minimum:
`Good (0x0)`, `Uncertain (0x40000000)`, `Uncertain_LastUsableValue
(0x40900000)`, `Bad (0x80000000)`, `Bad_NotConnected (0x808A0000)`,
`Bad_NoCommunication (0x80310000)`, `Bad_OutOfService (0x808D0000)`.
Document any unmapped category as `Bad_InternalError` and log once
with the raw `MxStatusProxy` so the matrix can be extended from
field data.
- Modify:
- `GalaxyDriver.cs` — implement `IReadable.ReadAsync`: per tag,
`AddItem` → short-lived `Advise` → first `OnDataChange`. (If
Phase 0 added a synchronous `ReadAsync` RPC, use that; flag a follow-up
if missing.)
- Tests:
- `Tests/Runtime/GalaxyReadTests.cs` — fake transport with scripted
`OnDataChange` responses.
- `Tests/Runtime/StatusCodeMapTests.cs` — exhaustive mapping cases plus
"unknown category falls back to Bad_InternalError and emits a single
diagnostic log" assertion.
**Depends on:** PR 4.0.
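
The "explicit table plus logged fallback" behavior of `StatusCodeMap` could look roughly like this. It is a Python sketch under stated assumptions: the real `MxStatusProxy` shape is not described here, so the example keys the table by category name, and it interprets "log once" as once per unknown category. The numeric values are the standard OPC UA status codes for the names shown.

```python
import logging

logger = logging.getLogger("status_code_map")

# Subset of the mapping table; keys are assumed category names.
STATUS_CODES = {
    "Good": 0x00000000,
    "Uncertain": 0x40000000,
    "Bad": 0x80000000,
    "Bad_NotConnected": 0x808A0000,
    "Bad_NoCommunication": 0x80310000,
    "Bad_OutOfService": 0x808D0000,
}
BAD_INTERNAL_ERROR = 0x80020000

_logged_unknown = set()


def to_status_code(category, raw=None):
    """Map a status category to an OPC UA StatusCode.

    Unmapped categories fall back to Bad_InternalError and are
    logged a single time each, with the raw value attached.
    """
    code = STATUS_CODES.get(category)
    if code is not None:
        return code
    if category not in _logged_unknown:
        _logged_unknown.add(category)
        logger.warning("Unmapped MxStatusProxy category %r (raw=%r)", category, raw)
    return BAD_INTERNAL_ERROR
```

Logging once per category rather than once per process keeps the log quiet while still surfacing every distinct gap in the table.
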
#### PR 4.3 — `IWritable` + secured-write routing
**Parallel-key:** `galaxy-write`.
**Files**
- Create:
- `src/.../Driver.Galaxy/Runtime/MxValueEncoder.cs`:
`object → MxValue` (the inverse of 4.2's decoder; unify into one type
if simpler).
- Modify:
- `GalaxyDriver.cs` — implement `IWritable.WriteAsync`.
Route writes whose attribute carries
`SecurityClassification.SecuredWrite` / `VerifiedWrite` through
`WriteSecuredAsync` (mxaccessgw exposes this in `MxGatewaySession`).
- Tests:
- `Tests/Runtime/GalaxyWriteTests.cs` — verify the routing decision
given each `SecurityClassification` value.
**Depends on:** PR 4.2 merged (shares `GalaxyMxSession` + value type code).
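
The routing decision that `GalaxyWriteTests` is meant to pin down can be sketched as a pure function. This Python illustration is an assumption-laden stand-in: only `SecuredWrite`, `VerifiedWrite`, and `WriteSecuredAsync` come from the text above; the other enum members and the `"WriteAsync"` name for the ordinary path are placeholders.

```python
from enum import Enum


class SecurityClassification(Enum):
    # FREE_ACCESS and OPERATE are placeholder members for the example.
    FREE_ACCESS = "FreeAccess"
    OPERATE = "Operate"
    SECURED_WRITE = "SecuredWrite"
    VERIFIED_WRITE = "VerifiedWrite"


SECURED_CLASSIFICATIONS = frozenset(
    {SecurityClassification.SECURED_WRITE, SecurityClassification.VERIFIED_WRITE}
)


def choose_write_path(classification: SecurityClassification) -> str:
    """Pick the session call to use for a write to this attribute."""
    if classification in SECURED_CLASSIFICATIONS:
        return "WriteSecuredAsync"
    return "WriteAsync"  # assumed name for the ordinary write call


routes = {c.value: choose_write_path(c) for c in SecurityClassification}
```

Keeping the decision in a side-effect-free function makes it easy to cover every classification value in a table-driven test.
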
#### PR 4.4 — `ISubscribable` + `EventPump`
**Parallel-key:** `galaxy-subscribe`.
**Files**
- Create:
- `src/.../Driver.Galaxy/Runtime/SubscriptionRegistry.cs`:
`(driverSubId → list<itemHandle>)` and reverse map.
- `src/.../Driver.Galaxy/Runtime/EventPump.cs` — single consumer of
`MxGatewaySession.StreamEventsAsync`. Maps each `OnDataChange` to a
`DataChangeEventArgs` per registered driver subscription.
- `src/.../Driver.Galaxy/Runtime/GalaxySubscriptionHandle.cs` (port from
Proxy).
- Modify:
- `GalaxyDriver.cs` — implement `ISubscribable.SubscribeAsync` using
`SubscribeBulkAsync` with the `buffered_update_interval_ms` hint
from PR 0.2.
- Tests:
- `Tests/Runtime/EventPumpFanoutTests.cs` — one item → multiple driver
subscriptions → one event per driver subscription.
- `Tests/Runtime/SubscribeBulkTests.cs` — partial failures.
**Depends on:** PR 4.3.
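
The forward/reverse map and the fan-out rule ("one item, multiple driver subscriptions, one event per driver subscription") can be sketched as below. This is an illustrative Python model, not the C# `SubscriptionRegistry` or `EventPump`; the method names and the tuple used as an event are assumptions.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """driverSubId -> item handles, plus the reverse map (sketch)."""

    def __init__(self):
        self._handles_by_sub = {}
        self._subs_by_handle = defaultdict(set)

    def add(self, sub_id, item_handles):
        self._handles_by_sub[sub_id] = list(item_handles)
        for handle in item_handles:
            self._subs_by_handle[handle].add(sub_id)

    def remove(self, sub_id):
        for handle in self._handles_by_sub.pop(sub_id, []):
            subs = self._subs_by_handle[handle]
            subs.discard(sub_id)
            if not subs:
                del self._subs_by_handle[handle]

    def subscribers(self, item_handle):
        return sorted(self._subs_by_handle.get(item_handle, ()))


def fan_out(registry, item_handle, value):
    """Turn one data change into one event per subscribed driver sub."""
    return [(sub_id, item_handle, value) for sub_id in registry.subscribers(item_handle)]


registry = SubscriptionRegistry()
registry.add(1, [100, 101])
registry.add(2, [100])
events = fan_out(registry, 100, 42.0)
```
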
#### PR 4.5 — `ReconnectSupervisor`
**Parallel-key:** `galaxy-reconnect`.
**Files**
- Create:
- `src/.../Driver.Galaxy/Runtime/ReconnectSupervisor.cs` — state machine
`(Healthy → TransportLost → ReopeningSession → ReplayingSubscriptions
→ Healthy)`. Surfaces `DriverState.Degraded` while not Healthy.
- Modify:
- `GalaxyDriver.cs` + `GalaxyMxSession.cs` — wire transport-error
callbacks into the supervisor; replay subscriptions via
`ReplaySubscriptionsCommand` (PR 0.3).
- Tests:
- `Tests/Runtime/ReconnectSupervisorTests.cs` with simulated drops.
**Depends on:** PR 4.4. Strongly recommended: Phase 0.3 (replay RPC) merged first.
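
The supervisor's cycle can be modeled as a small table-driven state machine. The Python sketch below mirrors the four states named above and reports `Degraded` whenever the state is not `Healthy`; the `advance` and `on_transport_error` method names, and the choice to restart the cycle when a drop happens mid-recovery, are assumptions for illustration.

```python
from enum import Enum, auto


class State(Enum):
    HEALTHY = auto()
    TRANSPORT_LOST = auto()
    REOPENING_SESSION = auto()
    REPLAYING_SUBSCRIPTIONS = auto()


# The single cycle described in the plan.
NEXT_STATE = {
    State.HEALTHY: State.TRANSPORT_LOST,
    State.TRANSPORT_LOST: State.REOPENING_SESSION,
    State.REOPENING_SESSION: State.REPLAYING_SUBSCRIPTIONS,
    State.REPLAYING_SUBSCRIPTIONS: State.HEALTHY,
}


class ReconnectSupervisor:
    def __init__(self):
        self.state = State.HEALTHY

    @property
    def driver_state(self):
        return "Healthy" if self.state is State.HEALTHY else "Degraded"

    def on_transport_error(self):
        # Assumption: a drop at any point restarts the recovery cycle.
        self.state = State.TRANSPORT_LOST

    def advance(self):
        """Move to the next step once the current one has succeeded."""
        self.state = NEXT_STATE[self.state]
        return self.state


supervisor = ReconnectSupervisor()
supervisor.on_transport_error()
seen = [supervisor.driver_state]
while supervisor.state is not State.HEALTHY:
    supervisor.advance()
    seen.append(supervisor.driver_state)
```
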
#### PR 4.6 — `IRediscoverable` via `WatchDeployEvents`
**Parallel-key:** `galaxy-deploy`.
**Files**
- Create:
- `src/.../Driver.Galaxy/Browse/DeployWatcher.cs` — long-lived consumer
of `GalaxyRepositoryClient.WatchDeployEventsAsync`.
- Modify:
- `GalaxyDriver.cs` — start watcher on Initialize; raise
`OnRediscoveryNeeded` per event.
- Tests:
- `Tests/Browse/DeployWatcherTests.cs`.
**Depends on:** PR 4.0. **Independent of PR 4.2-4.5**, so it can run in
parallel with all of them.
#### PR 4.7 — `IHostConnectivityProbe` (transport health + per-platform probes)
**Parallel-key:** `galaxy-health`.
The current driver reports two flavors of host connectivity:
1. **Top-level transport health** — flips `Running`/`Stopped` on the
synthetic host named after `OTOPCUA_GALAXY_CLIENT_NAME` whenever the
MXAccess COM proxy connects/disconnects.
2. **Per-platform `ScanState` probes** — for each discovered
`$WinPlatform` and `$AppEngine` gobject, advise its `ScanState`
attribute and translate value transitions into per-host
`Running`/`Stopped`/`Unknown`. Lives in
`Driver.Galaxy.Host/Backend/Stability/GalaxyRuntimeProbeManager.cs`.
This PR ports both.
**Files**
- Create:
- `src/.../Driver.Galaxy/Health/HostConnectivityForwarder.cs`:
consumes PR 0.4 `StreamSessionHealth` and surfaces the synthetic
top-level host entry (named after the configured MXAccess
`ClientName`).
- `src/.../Driver.Galaxy/Health/PerPlatformProbeWatcher.cs` — port of
`GalaxyRuntimeProbeManager`. On `Discover`, takes the list of
discovered `$WinPlatform`/`$AppEngine` tag names, subscribes their
`ScanState` via the driver's own `GalaxyMxSession.SubscribeBulkAsync`
(or directly through the gw session), runs the same state machine
(`OnProbeCallback` interpretation logic — port verbatim with tests),
and raises per-host `HostStatusChangedEventArgs` through the
aggregator below.
- `src/.../Driver.Galaxy/Health/HostStatusAggregator.cs` — single
sink that merges the forwarder's transport entry with the watcher's
per-platform entries into the `IReadOnlyList<HostConnectivityStatus>`
surfaced by `IHostConnectivityProbe.GetHostStatuses()`. Owns the
de-dup + diff logic that today lives in
`GalaxyProxyDriver.OnHostConnectivityUpdate`.
- Modify:
- `GalaxyDriver.cs` — wire forwarder + watcher + aggregator into
Initialize. On every `ITagDiscovery.DiscoverAsync` completion (incl.
re-discovery from PR 4.6), feed the watcher the fresh platform list
so probe subscriptions follow Galaxy redeploys.
- Tests:
- `Tests/Health/HostConnectivityForwarderTests.cs`.
- `Tests/Health/PerPlatformProbeWatcherTests.cs` — port the existing
`GalaxyRuntimeProbeManagerTests` (or whatever covers
`OnProbeCallback`) verbatim. Cover: initial subscribe on Discover,
re-subscribe after Rediscover, value-transition state machine,
cleanup on Shutdown.
- `Tests/Health/HostStatusAggregatorTests.cs` — transport entry plus
multiple per-platform entries, transitions, aggregator emits
`OnHostStatusChanged` only on actual state change.
**Acceptance**
- Top-level transport up/down reflected within 1s of gw `SessionHealth`
flip.
- Each `$WinPlatform` / `$AppEngine` gobject in the discovered hierarchy
produces exactly one entry in `GetHostStatuses()`, transitioning on
`ScanState` changes.
- After a redeploy that adds a new platform, the watcher subscribes its
`ScanState` without restarting the driver.
**Depends on:** PR 4.0 + PR 4.1 (needs the discoverer's platform list).
**Independent of PR 4.2–4.6** — parallel-safe with the runtime track.
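
The aggregator's de-dup + diff behaviour can be sketched as follows. This is a minimal illustration in Python, not the planned C# implementation: `HostStatusAggregator` and `GetHostStatuses()` are names from the plan, while the `update`/`remove` methods, the callback shape, and the host names in the usage note are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class HostStatus:
    host: str
    state: str  # "Running", "Stopped", or "Unknown"


class HostStatusAggregator:
    """Hypothetical stand-in for HostStatusAggregator.cs (shape assumed)."""

    def __init__(self, on_changed: Callable[[HostStatus], None]) -> None:
        self._states: Dict[str, str] = {}
        self._on_changed = on_changed

    def update(self, host: str, state: str) -> bool:
        """Record a report from the forwarder or the watcher.

        Returns True and fires the callback only when the state differs
        from the last one recorded for that host.
        """
        if self._states.get(host) == state:
            return False  # duplicate report, suppressed
        self._states[host] = state
        self._on_changed(HostStatus(host, state))
        return True

    def remove(self, host: str) -> None:
        """Forget a host, e.g. a platform dropped by a redeploy."""
        self._states.pop(host, None)

    def get_host_statuses(self) -> List[HostStatus]:
        """One entry per known host, sorted by name for stable output."""
        return [HostStatus(h, s) for h, s in sorted(self._states.items())]
```

Feeding the same `("Platform_001", "Running")` report twice fires the callback once, which is the "only on actual state change" behaviour the aggregator tests are meant to pin down.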
#### PR 4.W — Backend-flag wiring
**Parallel-key:** locked-files.
**Files**
- `src/.../Server/Configuration/DriverFactoryRegistry.cs` (or wherever
drivers are wired) — add a `Galaxy:Backend` switch:
- `legacy-host` → existing `GalaxyProxyDriver` registration (untouched).
- `mxgateway` → new `GalaxyDriver` registration via PR 4.0's extension.
- `src/.../Server/appsettings.json` — sample new config block.
- `ZB.MOM.WW.OtOpcUa.slnx` — register `Driver.Galaxy` and its tests.
- `CLAUDE.md` — note new driver, retain old driver pointers.
**Acceptance**
- With `Galaxy:Backend=legacy-host` (default), unchanged behavior.
- With `Galaxy:Backend=mxgateway`, server boots against the new driver and
passes a smoke test against the dev gw.
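
A minimal sketch of the switch, in Python rather than the server's C#. The key `Galaxy:Backend`, its two values, and the `legacy-host` default come from the plan; the function names, the returned strings, and the choice to reject unknown values are assumptions for illustration.

```python
from typing import Callable, Dict, Mapping

DEFAULT_BACKEND = "legacy-host"  # default named in the acceptance criteria


def register_legacy_proxy() -> str:
    # Stands in for the existing GalaxyProxyDriver registration.
    return "GalaxyProxyDriver"


def register_mxgateway() -> str:
    # Stands in for the new GalaxyDriver registration (PR 4.0 extension).
    return "GalaxyDriver"


BACKENDS: Dict[str, Callable[[], str]] = {
    "legacy-host": register_legacy_proxy,
    "mxgateway": register_mxgateway,
}


def select_driver(config: Mapping[str, str]) -> str:
    """Pick the driver registration from a flat config mapping."""
    backend = config.get("Galaxy:Backend", DEFAULT_BACKEND)
    try:
        factory = BACKENDS[backend]
    except KeyError:
        # Assumption: an unknown value fails fast instead of falling back.
        raise ValueError(f"unknown Galaxy:Backend value: {backend!r}") from None
    return factory()
```

With an empty config, `select_driver({})` takes the legacy arm, so the first acceptance bullet holds without any configuration change.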
### Phase 4 parallel batches
Dependency graph:
```
4.0 (shell) ──┬── 4.1 (discover) ──┬── 4.6 (deploy)
              │                    └── 4.7 (health: needs platform list)
              ├── 4.2 (read) ── 4.3 (write) ── 4.4 (subscribe) ── 4.5 (reconnect)
              │                                                       \
              │                                                         → 4.W (wire-up)
              └── (no longer parallel-with-4.1: 4.7 moved under 4.1)
```
- After 4.0 merges, **4.1 and the 4.2-chain head** can run in two parallel
worktrees.
- After 4.1 merges, **4.6 and 4.7** can run in two parallel worktrees.
- 4.2 → 4.3 → 4.4 → 4.5 is one sequential chain on its own worktree
(they all touch `GalaxyDriver.cs` and `GalaxyMxSession.cs`) and runs
alongside the discover/deploy/health track.
- 4.W gathers everything.
**Recommended Phase 4 plan:**
- Stage 1 (after 4.0): two worktrees — W1: 4.1; W2: 4.2 → 4.3 → 4.4 → 4.5.
- Stage 2 (after 4.1 merges, W2 still running): three worktrees —
W1: 4.6; W3: 4.7; W2: continues runtime chain.
- Stage 3: 4.W wire-up.
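
The staging can be checked mechanically by peeling the dependency graph into waves of PRs whose prerequisites have all merged. The sketch below is illustrative Python: the edges follow the graph above (with 4.6 placed under 4.1, as drawn), and the function name is hypothetical.

```python
from typing import Dict, List, Set

# Edges read off the Phase 4 dependency graph; 4.W depends on everything.
DEPS: Dict[str, Set[str]] = {
    "4.0": set(),
    "4.1": {"4.0"},
    "4.2": {"4.0"},
    "4.3": {"4.2"},
    "4.4": {"4.3"},
    "4.5": {"4.4"},
    "4.6": {"4.1"},
    "4.7": {"4.0", "4.1"},
    "4.W": {"4.0", "4.1", "4.2", "4.3", "4.4", "4.5", "4.6", "4.7"},
}


def parallel_batches(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group PRs into waves; each wave only needs earlier waves merged."""
    remaining = {pr: set(needs) for pr, needs in deps.items()}
    batches: List[List[str]] = []
    while remaining:
        ready = sorted(pr for pr, needs in remaining.items() if not needs)
        if not ready:
            raise ValueError("dependency cycle")
        batches.append(ready)
        for pr in ready:
            del remaining[pr]
        for needs in remaining.values():
            needs.difference_update(ready)
    return batches


if __name__ == "__main__":
    for i, batch in enumerate(parallel_batches(DEPS)):
        print(i, batch)
```

The waves are an upper bound on parallelism. The recommended plan is narrower because 4.2 through 4.5 share `GalaxyDriver.cs` and `GalaxyMxSession.cs` and therefore stay on one worktree.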
---
## Phase 5 — Parity test matrix
### Tasks
#### PR 5.1 — `Driver.Galaxy.ParityTests` project
**Parallel-key:** `parity-shell`.
**Files**
- Create: `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/`
- `ParityHarness.cs` — boots the OtOpcUa server twice with each backend,
drives the same OPC UA scenarios, captures structured snapshots.
- Theory data per scenario (browse, subscribe, alarm transition, write
by classification, history read).
- Reuses existing live-Galaxy fixtures from
`tests/.../Driver.Galaxy.E2E/`.
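
One way to compare the structured snapshots captured from the two backends is a keyed diff. The sketch below is a Python illustration only: the snapshot shape (a flat mapping from node name to an observed tuple) and the `diff_snapshots` name are assumptions, not the harness's actual design.

```python
from typing import Any, Dict, List, Tuple

MISSING = "<missing>"  # marker for a key present on only one side


def diff_snapshots(
    legacy: Dict[str, Any], mxgateway: Dict[str, Any]
) -> List[Tuple[str, Any, Any]]:
    """Return (key, legacy_value, mxgateway_value) for every difference."""
    deltas: List[Tuple[str, Any, Any]] = []
    for key in sorted(legacy.keys() | mxgateway.keys()):
        old = legacy.get(key, MISSING)
        new = mxgateway.get(key, MISSING)
        if old != new:
            deltas.append((key, old, new))
    return deltas
```

An empty result means the scenario is at parity; each returned tuple is a candidate row for the parity matrix, to be resolved or accepted.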
#### PR 5.2 — Browse + read parity scenarios
**Parallel-key:** `parity-browse`.
#### PR 5.3 — Subscribe + event-rate parity scenarios
**Parallel-key:** `parity-subscribe`.
#### PR 5.4 — Write-by-classification parity scenarios
**Parallel-key:** `parity-write`.
#### PR 5.5 — Alarm-transition parity scenarios
**Parallel-key:** `parity-alarms`.
Cover both:
- **Live transitions:** Active / Acknowledged / Inactive sequences against
`.InAlarm` / `.Acked` value flips on the dev Galaxy. Must match
legacy-host event ordering and severity mapping.
- **Alarm-event persistence:** trigger N alarm transitions, then verify
the SQLite store-and-forward sink drains them into the Wonderware
historian event store via the new sidecar's `WriteAlarmEvents`
contract (PR 3.3). Compare the persisted rows to those produced by the
legacy `GalaxyHistorianWriter` path.
#### PR 5.6 — History-read parity scenarios
**Parallel-key:** `parity-history`.
#### PR 5.7 — Reconnect/disruption scenarios
**Parallel-key:** `parity-reconnect`.
#### PR 5.8 — Per-platform `ScanState` probe parity
**Parallel-key:** `parity-probes`.
Verify the new `PerPlatformProbeWatcher` (PR 4.7) produces the same
per-host `HostConnectivityStatus` stream as the legacy
`GalaxyRuntimeProbeManager`:
- Initial state on Discover for each `$WinPlatform` / `$AppEngine`.
- Transition events when a runtime is stopped/started on the dev Galaxy.
- Re-subscription after a redeploy that adds/removes a platform.
- Cleanup of probe subscriptions on Shutdown (no leaked advises in gw).
#### PR 5.W — Parity matrix doc
**Files**
- `docs/v2/Galaxy.ParityMatrix.md` — table of scenario × result for both
backends. Resolved deltas marked, accepted deltas justified.
### Phase 5 parallel batches
After 5.1 lands, scenarios 5.2–5.8 are **fully parallel** — they each add
a separate test class file. Seven worktrees, seven `general-purpose` agents.
5.W runs after all scenarios merge and pass.
---
## Phase 6 — Performance + hardening
### Tasks
#### PR 6.1 — OpenTelemetry traces
**Parallel-key:** `perf-otel`.
#### PR 6.2 — Bounded channel + drop-newest metrics
**Parallel-key:** `perf-eventpump`.
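
As a rough illustration of drop-newest semantics: when the buffer is full, the incoming event is discarded and a counter is bumped, so memory stays bounded and the loss is visible as a metric. The Python class below is a sketch under that assumption; `DropNewestBuffer`, `offer`, and `drain` are hypothetical names, not the EventPump API.

```python
import queue
from typing import Any, List


class DropNewestBuffer:
    """Bounded FIFO that rejects new items when full and counts the drops."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            # queue.Queue treats maxsize <= 0 as unbounded, so reject it.
            raise ValueError("capacity must be positive")
        self._q: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)
        self.dropped = 0  # metric: events discarded because the buffer was full

    def offer(self, event: Any) -> bool:
        """Try to enqueue without blocking; return False if the event was dropped."""
        try:
            self._q.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self) -> List[Any]:
        """Remove and return everything currently buffered, oldest first."""
        items: List[Any] = []
        while True:
            try:
                items.append(self._q.get_nowait())
            except queue.Empty:
                return items
```

With capacity 2, offering three events keeps the first two and increments `dropped` to 1; older events are never evicted.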
#### PR 6.3 — Buffered update interval landing
**Parallel-key:** `perf-buffered`.
Wire `MxAccess:PublishingIntervalMs` → `SetBufferedUpdateInterval` once
gw exposes it.
#### PR 6.4 — Soak test scenario
**Parallel-key:** `perf-soak`.
50k tags, 24h, automated metric collection.
#### PR 6.5 — Tune `MxGatewayClientOptions` defaults
**Parallel-key:** `perf-tuning`.
Based on soak data.
#### PR 6.W — Performance doc
`docs/v2/Galaxy.Performance.md`.
### Phase 6 parallel batches
6.1, 6.2, 6.3 all touch `Driver.Galaxy/Runtime/`. Serialize them, OR split
files explicitly:
- 6.1 owns a new `Runtime/Tracing.cs` injected via decorator. Parallel-safe.
- 6.2 owns `Runtime/EventPump.cs`. It conflicts with PR 4.4 only if the
phases are reordered, and with 6.1 only if 6.1 also wraps EventPump.
Decide upfront: PR 6.1 wraps at the gateway-client boundary and PR 6.2
owns EventPump internals. Parallel-safe.
- 6.3 modifies `GalaxyDriver.SubscribeAsync` only. Parallel-safe.
So 6.1, 6.2, and 6.3 run in parallel, then 6.4 (depends on all three). 6.5
follows 6.4 sequentially (it uses the soak data). 6.W lands last.
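
The "split files explicitly" rule reduces to a pairwise disjointness check over the files each PR owns. A small Python sketch, using the ownership stated above; the function name and the mapping shape are assumptions.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# File ownership as stated for the Phase 6 runtime PRs.
OWNERSHIP: Dict[str, Set[str]] = {
    "6.1": {"Runtime/Tracing.cs"},
    "6.2": {"Runtime/EventPump.cs"},
    "6.3": {"GalaxyDriver.cs"},
}


def find_conflicts(
    ownership: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], Set[str]]:
    """Map each pair of PRs to the files both of them touch, if any."""
    conflicts: Dict[Tuple[str, str], Set[str]] = {}
    for a, b in combinations(sorted(ownership), 2):
        shared = ownership[a] & ownership[b]
        if shared:
            conflicts[(a, b)] = shared
    return conflicts
```

An empty result means the PRs can run in separate worktrees without merge conflicts on owned files. If 6.1 also edited `Runtime/EventPump.cs`, the pair `("6.1", "6.2")` would be reported and the two PRs would have to be serialized.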
---
## Phase 7 — Retire legacy
### Tasks
#### PR 7.1 — Default flip
**Parallel-key:** `retire-defaults`.
**Files**
- `src/.../Server/appsettings.json` → `Galaxy:Backend = mxgateway`.
- `scripts/e2e/e2e-config.sample.json` → drop `OTOPCUA_GALAXY_*` pipe vars,
add gw endpoint.
- `scripts/install/Install-Services.ps1` → remove
`OtOpcUaGalaxyHost` registration; keep `OtOpcUaWonderwareHistorian` from
PR 3.W.
#### PR 7.2 — Delete legacy projects
**Parallel-key:** `retire-delete`.
**Files**
- Delete:
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/`
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/`
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/`
- `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/`
- `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/`
- `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/`
- Modify:
- `ZB.MOM.WW.OtOpcUa.slnx` — remove the six entries.
- `Server/Configuration/DriverFactoryRegistry.cs` — remove the
`legacy-host` switch arm.
**Depends on:** parity matrix in `docs/v2/Galaxy.ParityMatrix.md` is
fully green or carries documented accepted-deltas (verified
2026-04-30 on the dev rig: 14 passed / 1 skipped / 0 failed).
#### PR 7.3 — Doc + memory housekeeping
**Parallel-key:** `retire-docs`.
**Files**
- `CLAUDE.md` — rewrite Galaxy section.
- `docs/v2/dev-environment.md` — drop `OtOpcUaGalaxyHost` references.
- `docs/ServiceHosting.md`, `docs/Redundancy.md`, `docs/security.md` —
scrub `Galaxy.Host`/`Galaxy.Proxy` mentions.
- `~/.claude/projects/.../memory/MEMORY.md` — retire entries:
- `project_galaxy_host_service.md`
- `project_galaxy_host_installed.md`
- `project_aveva_platform_installed.md` (revise — server box no longer
needs AVEVA; gw box does)
- Delete:
- `mxaccess_documentation.md` (no longer consumed by this repo).
- Add memory entry: `project_galaxy_via_mxgateway.md`.
### Phase 7 parallel batches
- **Batch 7a (sequential, gated by phase 6 production soak):** 7.1.
- **Batch 7b (parallel after 7.1):** 7.2 (`retire-delete`) and 7.3
(`retire-docs`) — disjoint files.
---
## Cross-phase dependency graph
```
Phase 0 (gw repo) ────────────────────────────────────────────────────┐
Phase 1.1 (Core.Abs/Historian) ──┐                                    │
                                 ├── Phase 1.2/1.3                    │
                                 │   (server History)                 │
Phase 2.1 (Core.Abs/Alarms) ─────┤                                    │
                                 ├── Phase 2.2/2.3                    │
                                 │   (server Alarms)                  │
                                 │                                    │
                                 └── Phase 3 (sidecar host + client)  │
                                        │                             │
                                        └─────────────────────────────┴── Phase 4 (Driver.Galaxy)
                                                                          Phase 5 (parity)
                                                                          Phase 6 (perf)
                                                                          Phase 7 (retire)
```
### Maximum-parallelism rollout (one possible execution)
- **Day 0–N (mxaccessgw):** Phase 0 batches 0a + 0b + 0.W in parallel
worktrees, separate repo from this one — runs in parallel with everything
below until consumers need the gw bump.
- **Day 0–N (this repo):** Phases 1.1 and 2.1 in parallel (two worktrees).
Merge.
- **Day N+:** Phases 1.2/1.3, 2.2/2.3, 3.1+3.2+3.3+3.4 in parallel (three
worktrees, each a sequential chain).
- **Day M:** combined wire-up PR 1+2.W, then PR 3.W. Server passes existing
e2e against legacy backend.
- **Day M+:** Phase 4.0 lands. Phase 4 fan-out (four worktrees) starts.
- **Day P:** Phase 4 wire-up. Phase 5 fan-out (seven worktrees) starts.
- **Day Q:** Phase 5 wire-up. Phase 6 fan-out (three worktrees + sequential).
- **Day R:** Phase 7. Done.
---
## Subagent prompt template
Re-use this shell when launching any of the parallel coding tasks. Replace
`<bracketed>` parts.
```
You are implementing PR <id> from lmx_mxgw_impl.md ("<title>").
Repo: <C:\Users\dohertj2\Desktop\lmxopcua | C:\Users\dohertj2\Desktop\mxaccessgw>.
Worktree: <path>.
Scope (you may create/edit only these files):
<list>
DO NOT edit:
- Any file outside the scope above
- ZB.MOM.WW.OtOpcUa.slnx / mxaccessgw/MxGateway.sln
- src/.../Server/Program.cs, OpcUaServerService.cs, appsettings.json
- scripts/install/Install-Services.ps1
- scripts/e2e/e2e-config.sample.json
- CLAUDE.md, docs/**, MEMORY.md, mxaccess_documentation.md
Acceptance:
<list>
Tests:
<list>
If you find a needed change outside scope, STOP and surface it as a
finding rather than editing — it will be picked up by the wire-up PR.
Before reporting completion:
1. Run `dotnet build <smallest project tree that covers your scope>`.
2. Run the new/changed tests.
3. Report: files changed, test command + result, any out-of-scope
findings.
```
---
## Risk register (operational)
| Risk | When it bites | Mitigation |
|---|---|---|
| Phase 0 gw bump breaks existing mxaccessgw consumers | Phase 0 wire-up | Cross-language smoke matrix in mxaccessgw must run before merge |
| Two parallel agents both edit `OpcUaServerService.cs` despite the rule | Phases 1+2 parallel | Wire-up convention + grep-based pre-merge check (`git diff --stat origin/main` of locked files in the integration branch must be empty until the wire-up PR) |
| Subagent silently adds a stray `using` to a locked file | Anytime | The build-and-test step in the prompt will fail if the locked file changed and broke compile; a `git diff --name-only` whitelist check at integration-branch merge time enforces it |
| Galaxy.Host can't build during phase 3.2 because lifted files vanished | Phase 3 mid-flight | PR 3.2 adds a ProjectReference from Galaxy.Host to Driver.Historian.Wonderware so the moved files remain reachable; tests cover both call sites |
| Phase 4 chain stalls because gw exposes no synchronous read | PR 4.2 | Surface as a Phase 0 finding immediately — add a `ReadCommand` to gw or accept short-lived advise as the read mechanism (document as a perf accepted delta in 5.W) |
| Phase 5 parity matrix exposes a delta no one wants to fix | Phase 5 | Phase 7 gating: `Galaxy:Backend=mxgateway` does not become default until every parity delta is either resolved or has a written acceptance from the user |
| Soak test in 6.4 finds a memory leak in `EventPump` | Phase 6 | EventPump bounded-channel design (PR 6.2) is shipped before soak so the leak is bounded by design |
| Stale memory file references retired code after phase 7 | Phase 7 | PR 7.3 explicitly retires `project_galaxy_host_*` entries; add a memory-audit step to phase-close checklist |
---
## Phase-close checklist (apply at the end of each phase)
Before declaring a phase done:
1. `dotnet build ZB.MOM.WW.OtOpcUa.slnx` clean on integration branch.
2. `dotnet test ZB.MOM.WW.OtOpcUa.slnx` clean (or all-but-known-skipped).
3. Live-Galaxy smoke (when applicable) green on dev box.
4. No locked files modified outside their wire-up PR
(`git log --name-only origin/main..HEAD -- <locked-paths>` shows only
the wire-up commit).
5. `MEMORY.md` updated for any persistent context this phase introduced.
6. Doc updates limited to the phase's scope (no doc edits sprinkled across
non-doc PRs).
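
Item 4 can be automated by filtering the changed-file list (for example, the output of `git diff --name-only`) against the locked paths. The Python sketch below is illustrative: the helper name is hypothetical, and the pattern list is taken from the locked-file entries with fully spelled-out paths in the subagent prompt template, with `docs/*` standing in for `docs/**`.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

# Locked paths from the prompt template; entries with elided paths are omitted.
LOCKED: Sequence[str] = (
    "ZB.MOM.WW.OtOpcUa.slnx",
    "CLAUDE.md",
    "docs/*",  # fnmatch's * also matches "/", so this covers nested docs
    "scripts/install/Install-Services.ps1",
    "scripts/e2e/e2e-config.sample.json",
)


def locked_file_violations(
    changed: Iterable[str], locked: Sequence[str] = LOCKED
) -> List[str]:
    """Return the changed paths that match any locked pattern, sorted."""
    return sorted(
        path
        for path in changed
        if any(fnmatchcase(path, pattern) for pattern in locked)
    )
```

For a non-wire-up PR the function should return an empty list; anything else is a finding to hand to the wire-up PR instead of merging.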