Phase 2 Stream D Option B: archive v1 surface + new Driver.Galaxy.E2E parity suite

Non-destructive intermediate state: the v1 OtOpcUa.Host + Historian.Aveva + Tests + IntegrationTests projects all still build (494 v1 unit + 6 v1 integration tests still pass when run explicitly), but solution-level dotnet test ZB.MOM.WW.OtOpcUa.slnx now skips them via IsTestProject=false on the test projects + archive-status PropertyGroup comments on the src projects. The destructive deletion is reserved for Phase 2 PR 3 with explicit operator review per CLAUDE.md "only use destructive operations when truly the best approach".

tests/ZB.MOM.WW.OtOpcUa.Tests/ renamed via git mv to tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/; csproj <AssemblyName> kept as the original ZB.MOM.WW.OtOpcUa.Tests so v1 OtOpcUa.Host's [InternalsVisibleTo("ZB.MOM.WW.OtOpcUa.Tests")] still matches and the project rebuilds clean. tests/ZB.MOM.WW.OtOpcUa.IntegrationTests gets <IsTestProject>false</IsTestProject>. src/ZB.MOM.WW.OtOpcUa.Host + src/ZB.MOM.WW.OtOpcUa.Historian.Aveva get PropertyGroup archive-status comments documenting they're functionally superseded but kept in-build because cascading dependencies (Historian.Aveva → Host; IntegrationTests → Host) make a single-PR deletion high blast-radius.

New tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ project (.NET 10) with ParityFixture that spawns OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a Process.Start subprocess with OTOPCUA_GALAXY_BACKEND=db env vars, awaits 2s for the PipeServer to bind, then exposes a connected GalaxyProxyDriver; skips on non-Windows / Administrator shells (PipeAcl denies admins per decision #76) / ZB unreachable / Host EXE not built. Each skip carries a SkipReason string the test method reads via Assert.Skip(SkipReason). RecordingAddressSpaceBuilder captures every Folder/Variable/AddProperty registration so parity tests can assert on the same shape v1 LmxNodeManager produced.

HierarchyParityTests (3): Discover returns gobjects with attributes; attribute full references match the tag.attribute Galaxy reference grammar; HistoryExtension flag flows through correctly.

StabilityFindingsRegressionTests (4): one test per 2026-04-13 stability finding from commits c76ab8f and 7310925: phantom probe subscription doesn't corrupt unrelated host status; HostStatusChangedEventArgs structurally carries a specific HostName + OldState + NewState (event signature mathematically prevents the v1 cross-host quality-clear bug); all GalaxyProxyDriver capability methods return Task or Task<T> (sync-over-async would deadlock OPC UA stack thread); AcknowledgeAsync completes before returning (no fire-and-forget background work that could race shutdown).

Solution test count: 470 pass / 7 skip (E2E on admin shell) / 1 pre-existing Phase 0 baseline. Run archived suites explicitly: dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive (494 pass) + dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests (6 pass).

docs/v2/V1_ARCHIVE_STATUS.md inventories every archived surface with run-it-explicitly instructions + a 10-step deletion plan for PR 3 + rollback procedure (git revert restores all four projects). docs/v2/implementation/exit-gate-phase-2-final.md supersedes the two partial-exit docs with the per-stream status table (A/B/C/D/E all addressed, D split across PR 2/3 per safety protocol), the test count breakdown, fresh adversarial review of PR 2 deltas (4 new findings: medium IsTestProject=false safety net loss, medium structural-vs-behavioral stability tests, low backend=db default, low Process.Start env inheritance), the 8 carried-forward findings from exit-gate-phase-2.md, the recommended PR order (1 → 2 → 3 → 4). docs/v2/implementation/pr-2-body.md is the Gitea web-UI paste-in for opening PR 2 once pushed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-04-18 00:56:21 -04:00
parent 50f81a156d
commit a3d16a28f1
76 changed files with 692 additions and 2 deletions

@@ -0,0 +1,123 @@
# Phase 2 Final Exit Gate (2026-04-18)
> Supersedes `phase-2-partial-exit-evidence.md` and `exit-gate-phase-2.md`. Captures the
> as-built state at the close of Phase 2 work delivered across two PRs.
## Status: **All five Phase 2 streams addressed. Stream D split across PR 2 (archive) + PR 3 (delete) per safety protocol.**
## Stream-by-stream status
| Stream | Plan §reference | Status | PR |
|---|---|---|---|
| A: Driver.Galaxy.Shared | §A.1–A.3 | ✅ Complete | PR 1 (merged or pending) |
| B: Driver.Galaxy.Host | §B.1–B.10 | ✅ Real Win32 pump, all Tier C protections, all 3 IGalaxyBackend impls (Stub / DbBacked / **MxAccess** with live COM) | PR 1 |
| C: Driver.Galaxy.Proxy | §C.1–C.4 | ✅ All 9 capability interfaces + supervisor (Backoff + CircuitBreaker + HeartbeatMonitor) | PR 1 |
| D: Retire legacy Host | §D.1–D.3 | ✅ Migration script, installer scripts, Stream D procedure doc, **archive markings on all v1 surface (this PR 2)**, deletion deferred to PR 3 | PR 2 (this) + PR 3 (next) |
| E: Parity validation | §E.1–E.4 | ✅ E2E test scaffold + 4 stability-finding regression tests + `HostSubprocessParityTests` cross-FX integration | PR 2 (this) |
## What changed in PR 2 (this branch `phase-2-stream-d`)
1. **`tests/ZB.MOM.WW.OtOpcUa.Tests/`** renamed to `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`.
`<AssemblyName>` stays `ZB.MOM.WW.OtOpcUa.Tests` so the v1 Host's `InternalsVisibleTo`
still matches; `<IsTestProject>false</IsTestProject>` makes solution-level `dotnet test` exclude it.
2. **Three other v1 projects archive-marked** with PropertyGroup comments:
`OtOpcUa.Host`, `Historian.Aveva`, `IntegrationTests`. `IntegrationTests` also gets
`<IsTestProject>false</IsTestProject>`.
3. **New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/`** project (.NET 10):
- `ParityFixture` spawns `OtOpcUa.Driver.Galaxy.Host.exe` (net48 x86) as subprocess via
`Process.Start`, connects via real named pipe, exposes a connected `GalaxyProxyDriver`.
Skips when Galaxy ZB unreachable, when Host EXE not built, or when running as
Administrator (PipeAcl denies admins).
- `RecordingAddressSpaceBuilder` captures Folder + Variable + Property registrations so
parity tests can assert shape.
- `HierarchyParityTests` (3) — Discover returns gobjects with attributes;
attribute full references match `tag.attribute` shape; HistoryExtension flag flows
through.
- `StabilityFindingsRegressionTests` (4) — one test per 2026-04-13 finding:
phantom-probe-doesn't-corrupt-status, host-status-event-is-scoped,
all-async-no-sync-over-async, AcknowledgeAsync-completes-before-returning.
4. **`docs/v2/V1_ARCHIVE_STATUS.md`** — inventory + deletion plan for PR 3.
5. **`docs/v2/implementation/exit-gate-phase-2-final.md`** (this doc) — supersedes the two
partial-exit docs.
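
The skip-or-spawn behaviour described for `ParityFixture` in item 3 can be sketched roughly as follows. This is a minimal illustration, not the fixture's actual code: the class and method names (`ParityFixtureSketch`, `GetSkipReason`, `BuildStartInfo`) are hypothetical, the ZB reachability probe is left out, and only the `OTOPCUA_GALAXY_BACKEND=db` variable and the skip conditions come from this document.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Principal;

// Hypothetical helper, not the real ParityFixture: it only decides whether the
// suite should skip and how the Host subprocess would be configured.
public static class ParityFixtureSketch
{
    // Returns null when the suite can run, otherwise a human-readable skip reason.
    public static string? GetSkipReason(string hostExePath)
    {
        if (!OperatingSystem.IsWindows())
            return "Galaxy Host is Windows-only";

        using var identity = WindowsIdentity.GetCurrent();
        if (new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator))
            return "PipeAcl denies Administrators; run from a non-elevated shell";

        if (!File.Exists(hostExePath))
            return $"Host EXE not built: {hostExePath}";

        return null; // assumption: the ZB reachability check would go here as well
    }

    // Builds the start info for the Host subprocess with the db backend selected.
    public static ProcessStartInfo BuildStartInfo(string hostExePath)
    {
        var psi = new ProcessStartInfo(hostExePath)
        {
            UseShellExecute = false, // required to edit the child's environment
            CreateNoWindow = true,
        };
        psi.Environment["OTOPCUA_GALAXY_BACKEND"] = "db";
        return psi;
    }
}
```

A test method would then read the reason and, when it is non-null, pass it to `Assert.Skip` as the commit message describes, so the seven E2E tests report a documented skip rather than a failure.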
## Test counts
**Solution-level `dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **470 pass / 7 skip / 1 baseline failure**.
| Project | Pass | Skip |
|---|---:|---:|
| Core.Abstractions.Tests | 24 | 0 |
| Configuration.Tests | 42 | 0 |
| Core.Tests | 4 | 0 |
| Server.Tests | 2 | 0 |
| Admin.Tests | 21 | 0 |
| Driver.Galaxy.Shared.Tests | 6 | 0 |
| Driver.Galaxy.Host.Tests | 30 | 0 |
| Driver.Galaxy.Proxy.Tests | 10 | 0 |
| **Driver.Galaxy.E2E (NEW)** | **0** | **7** (all skip with documented reason — admin shell) |
| Client.Shared.Tests | 131 | 0 |
| Client.UI.Tests | 98 | 0 |
| Client.CLI.Tests | 51 / 1 fail | 0 |
| Historian.Aveva.Tests | 41 | 0 |
**Excluded from solution run (run explicitly when needed)**:
- `OtOpcUa.Tests.v1Archive` — 494 pass (v1 unit tests, kept as parity reference)
- `OtOpcUa.IntegrationTests` — 6 pass (v1 integration tests, kept as parity reference)
## Adversarial review of the PR 2 diff
Independent pass over the PR 2 deltas. New findings ranked by severity; existing findings
from the previous exit-gate doc still apply.
### New findings
**Medium 1 — `IsTestProject=false` on `OtOpcUa.IntegrationTests` removes the safety net.**
The 6 v1 integration tests no longer run in the solution-level `dotnet test`. *Mitigation:* the new E2E suite
covers the same scenarios in the v2 topology shape. *Risk:* if E2E test count regresses or
fails to cover a scenario, the v1 fallback isn't auto-checked. **Procedure**: PR 3
checklist includes "E2E test count covers v1 IntegrationTests' 6 scenarios at minimum".
**Medium 2 — Stability-finding regression tests #2, #3, #4 are structural (reflection-based)
not behavioral.** Findings #2 and #3 use type-shape assertions (event signature carries
HostName; methods return Task) rather than triggering the actual race. *Mitigation:* the v1
defects were structural — fixing them required interface changes that the type-shape
assertions catch. *Risk:* a future refactor that re-introduces sync-over-async via a
non-async helper called inside a Task method wouldn't trip the test. **Filed as v2.1**: add a
runtime async-call-stack analyzer (Roslyn or post-build).
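
To make the structural-versus-behavioral gap concrete, here is a minimal sketch of a reflection-based return-type check of the kind Medium 2 describes. The names `AsyncShapeCheck`, `SampleDriver`, and `LegacyDriver` are hypothetical stand-ins, not types from this repo, and the real tests may select methods differently (for example by capability interface).

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Threading.Tasks;

public static class AsyncShapeCheck
{
    // Names of public instance methods declared on `type` whose return type
    // is neither Task nor Task<T>.
    public static string[] NonTaskMethods(Type type) =>
        type.GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Where(m => !m.IsSpecialName)                              // skip property/event accessors
            .Where(m => !typeof(Task).IsAssignableFrom(m.ReturnType))  // Task<T> derives from Task
            .Select(m => m.Name)
            .ToArray();
}

// Hypothetical driver: Task-returning signature, but the body reads .Result.
public sealed class SampleDriver
{
    public Task<int> ReadAsync() => Task.FromResult(Helper().Result); // sync-over-async hidden in the body
    private static Task<int> Helper() => Task.FromResult(42);
}

// Hypothetical v1-style driver with a plainly synchronous capability method.
public sealed class LegacyDriver
{
    public int ReadSync() => 1;
}
```

With these types, `NonTaskMethods(typeof(LegacyDriver))` reports `ReadSync`, while `NonTaskMethods(typeof(SampleDriver))` returns an empty array even though `ReadAsync` reads `.Result` internally. That is the class of regression a type-shape assertion cannot see.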
**Low 1 — `ParityFixture` defaults to `OTOPCUA_GALAXY_BACKEND=db`** (not `mxaccess`).
Discover works against ZB without needing live MXAccess. The MXAccess-required tests will
need a second fixture once they're written.
**Low 2 — `Process.Start(EnvironmentVariables)` doesn't always inherit clean state.** The
test inherits the parent's PATH + locale, which is normally fine but could mask a missing
runtime dependency. *Mitigation:* in CI, pin a clean environment block.
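
One way to pin a clean environment block, sketched under assumptions: the helper name `CleanEnvironment.Build` is hypothetical, and the choice to re-add only `SystemRoot` is illustrative. The real Host may need more variables than this.

```csharp
using System;
using System.Diagnostics;

public static class CleanEnvironment
{
    // Builds start info whose environment holds only explicitly chosen variables.
    public static ProcessStartInfo Build(string exePath)
    {
        var psi = new ProcessStartInfo(exePath) { UseShellExecute = false };

        // The Environment dictionary starts as a copy of the parent's variables;
        // clearing it drops the inherited PATH, locale settings, and everything else.
        psi.Environment.Clear();

        // Re-add only what the child is assumed to need (illustrative choice).
        var systemRoot = Environment.GetEnvironmentVariable("SystemRoot");
        if (systemRoot is not null)
            psi.Environment["SystemRoot"] = systemRoot;

        psi.Environment["OTOPCUA_GALAXY_BACKEND"] = "db";
        return psi;
    }
}
```

Starting the Host this way makes a missing runtime dependency fail loudly in CI instead of being satisfied by whatever the build agent happens to have on its PATH.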
### Existing findings (carried forward from `exit-gate-phase-2.md`)
All 8 still apply unchanged. Particularly:
- High 1 (MxAccess Read subscription-leak on cancellation) — open
- High 2 (no MXAccess reconnect loop, only supervisor-driven recycle) — open
- Medium 3 (SubscribeAsync doesn't push OnDataChange frames yet) — open
- Medium 4 (WriteValuesAsync doesn't await OnWriteComplete) — open
## Cross-cutting deferrals (out of Phase 2)
- **Deletion of v1 archive**: PR 3, gated on operator review + E2E coverage parity check
- **Wonderware Historian SDK plugin port** (`Historian.Aveva` → `Driver.Galaxy.Host/Backend/Historian/`): Task B.1.h, opportunistically with PR 3 or as PR 4
- **MxAccess subscription push frames**: Task B.1.s, follow-up to enable real-time data
flow (subscriptions currently register, but values aren't pushed back)
- **Wonderware Historian-backed HistoryRead**: depends on B.1.h
- **Alarm subsystem wire-up**: `MxAccessGalaxyBackend.SubscribeAlarmsAsync` is a no-op
- **Reconnect-without-recycle** in MxAccessClient: v2.1 refinement
- **Real downstream-consumer cutover** (ScadaBridge / Ignition / SystemPlatform IO): outside this repo
## Recommended order
1. **PR 1** (`phase-1-configuration` → `v2`): merge first; self-contained, parity preserved
2. **PR 2** (`phase-2-stream-d` → `v2`, this PR): merge after PR 1; introduces E2E suite +
archive markings; v1 surface still builds and can be run explicitly
3. **PR 3** (next session): delete v1 archive; depends on operator approval after PR 2
reviewer signoff
4. **PR 4** (Phase 2 follow-up): Historian port + MxAccess subscription push frames + the
open high/medium findings