lmxopcua/docs/v2/implementation/stream-d-removal-procedure.md
Joseph Doherty 7403b92b72 Phase 2 Stream D progress — non-destructive deliverables: appsettings → DriverConfig migration script, two-service Windows installer scripts, process-spawn cross-FX parity test, Stream D removal procedure doc with both Option A (rewrite 494 v1 tests) and Option B (archive + new v2 E2E suite) spelled out step-by-step. Cannot one-shot the actual legacy-Host deletion in any unattended session — explained in the procedure doc; the parity-defect debug cycle is intrinsically interactive (each iteration requires inspecting a v1↔v2 diff and deciding if it's a legitimate v2 improvement or a regression, then either widening the assertion or fixing the v2 code), and git rm -r src/ZB.MOM.WW.OtOpcUa.Host is destructive enough to need explicit operator authorization on a real PR review. scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 takes a v1 appsettings.json and emits the v2 DriverInstance.DriverConfig JSON blob (MxAccess/Database/Historian sections) ready to upsert into the central Configuration DB; null-leaf stripping; -DryRun mode; smoke-tested against the dev appsettings.json and produces the expected three-section ordered-dictionary output. scripts/install/Install-Services.ps1 registers the two v2 services with sc.exe — OtOpcUaGalaxyHost first (net48 x86 EXE with OTOPCUA_GALAXY_PIPE/OTOPCUA_ALLOWED_SID/OTOPCUA_GALAXY_SECRET/OTOPCUA_GALAXY_BACKEND/OTOPCUA_GALAXY_ZB_CONN/OTOPCUA_GALAXY_CLIENT_NAME env vars set via HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost\Environment registry), then OtOpcUa with depend=OtOpcUaGalaxyHost; resolves down-level account names to SID for the IPC ACL; generates a fresh 32-byte base64 shared secret per install if not supplied (kept out of registry — operators record offline for service rebinding scenarios); echoes start commands. scripts/install/Uninstall-Services.ps1 stops + removes both services. 
tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/HostSubprocessParityTests.cs is the production-shape parity test — Proxy (.NET 10) spawns the actual OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a subprocess via Process.Start with backend=db env vars, connects via real named pipe, calls Discover, asserts at least one Galaxy gobject comes back. Skipped when running as Administrator (PipeAcl denies admins, same guard as other IPC integration tests), when the Host EXE hasn't been built, or when the ZB SQL endpoint is unreachable. This is the cross-FX integration that the parity suite genuinely needs — the previous IPC tests all ran in-process; this one validates the production deployment topology where Proxy and Host are separate processes communicating only over the named pipe. docs/v2/implementation/stream-d-removal-procedure.md is the next-session playbook: Option A (rewrite 494 v1 tests via a ProxyMxAccessClientAdapter that implements v1's IMxAccessClient by forwarding to GalaxyProxyDriver — Vtq↔DataValueSnapshot, Quality↔StatusCode, OnTagValueChanged↔OnDataChange mapping; 3-5 days, full coverage), Option B (rename OtOpcUa.Tests → OtOpcUa.Tests.v1Archive with [Trait("Category", "v1Archive")] for opt-in CI runs; new OtOpcUa.Driver.Galaxy.E2E test project with 10-20 representative tests via the HostSubprocessParityTests pattern; 1-2 days, accreted coverage); deletion checklist with eight pre-conditions, ten ordered steps, and a rollback path (git revert restores the legacy Host alongside the v2 stack — both topologies remain installable until the downstream consumer cutover). Full solution 964 pass / 1 pre-existing Phase 0 baseline; the 494 v1 IntegrationTests + 6 v1 IntegrationTests-net48 still pass because legacy OtOpcUa.Host stays untouched until an interactive session executes the procedure doc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:38:44 -04:00

Stream D — Legacy OtOpcUa.Host Removal Procedure

Sequenced playbook for the next session that takes Phase 2 to its full exit gate. All Stream A/B/C work is committed. The blocker is structural: the 494 v1 OtOpcUa.Tests instantiate v1 Host classes directly, so they must be retargeted (or archived) before the Host project can be deleted.

Decision: Option A or Option B

Option A — Rewrite the 494 v1 tests to use v2 topology

Effort: 3-5 days. Highest fidelity (full v1 test coverage carries forward).

Steps:

  1. Build a ProxyMxAccessClientAdapter in a new OtOpcUa.LegacyTestCompat/ project that implements v1's IMxAccessClient by forwarding to Driver.Galaxy.Proxy.GalaxyProxyDriver. Maps v1 Vtq ↔ v2 DataValueSnapshot, v1 Quality enum ↔ v2 StatusCode u32, the v1 OnTagValueChanged event ↔ v2 ISubscribable.OnDataChange.
  2. Build an equivalent adapter for IGalaxyRepository that wraps v2's Backend.Galaxy.GalaxyRepository.
  3. Replace MxAccessClient constructions in OtOpcUa.Tests test fixtures with the adapter. Most tests use a single fixture so the change-set is concentrated.
  4. For each test class: run; iterate on parity defects until green. Expected defect families: timing-sensitive assertions (IPC adds ~5ms latency; widen tolerances), Quality enum vs StatusCode mismatches, value-byte-encoding differences.
  5. Once all 494 pass: proceed to deletion checklist below.
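
The type mappings named in step 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the adapter itself: the Quality, Vtq, and DataValueSnapshot shapes below are hypothetical stand-ins (the real v1 and v2 types may carry different members), and VtqMapping is an invented helper name. The only fixed facts it relies on are the OPC UA severity bits of a 32-bit status code (Good 0x00000000, Uncertain 0x40000000, Bad 0x80000000).

```csharp
using System;

// Hypothetical stand-ins for illustration only; the real v1 Vtq/Quality and
// v2 DataValueSnapshot types in the repo may have different members.
public enum Quality { Bad, Uncertain, Good }

public readonly record struct Vtq(object? Value, DateTime TimestampUtc, Quality Quality);

public readonly record struct DataValueSnapshot(object? Value, DateTime SourceTimestampUtc, uint StatusCode);

public static class VtqMapping
{
    // OPC UA carries severity in the top two bits of the 32-bit status code.
    private const uint SeverityMask = 0xC0000000;
    private const uint StatusGood = 0x00000000;
    private const uint StatusUncertain = 0x40000000;
    private const uint StatusBad = 0x80000000;

    // v2 -> v1: collapse any status code to its severity class.
    public static Quality ToQuality(uint statusCode)
    {
        switch (statusCode & SeverityMask)
        {
            case StatusGood: return Quality.Good;
            case StatusUncertain: return Quality.Uncertain;
            default: return Quality.Bad;
        }
    }

    // v1 -> v2: only the severity survives; sub-codes cannot be recovered.
    public static uint ToStatusCode(Quality quality)
    {
        switch (quality)
        {
            case Quality.Good: return StatusGood;
            case Quality.Uncertain: return StatusUncertain;
            default: return StatusBad;
        }
    }

    public static Vtq ToVtq(DataValueSnapshot s) =>
        new Vtq(s.Value, s.SourceTimestampUtc, ToQuality(s.StatusCode));

    public static DataValueSnapshot ToSnapshot(Vtq v) =>
        new DataValueSnapshot(v.Value, v.TimestampUtc, ToStatusCode(v.Quality));
}
```

Under these assumptions a round trip preserves severity only: a bad status with a sub-code set (for example 0x80310000) maps to Quality.Bad and comes back as plain 0x80000000. That lossiness is one plausible source of the "Quality enum vs StatusCode mismatches" defect family expected in step 4.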

When to pick A: regulatory environments that need the full historical test suite green, or when the v2 parity gate is itself a release-blocking artifact downstream consumers will look for.

Option B — Archive the 494 v1 tests, build a smaller v2 parity suite

Effort: 1-2 days. Faster to green; less coverage initially, accreted over time.

Steps:

  1. Rename tests/ZB.MOM.WW.OtOpcUa.Tests/ → tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/. Add <IsTestProject>false</IsTestProject> so CI doesn't run them; mark every class with [Trait("Category", "v1Archive")] so a future operator can opt in via --filter.
  2. New tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ project (.NET 10):
    • ParityFixture spawns Galaxy.Host EXE per test class with OTOPCUA_GALAXY_BACKEND=mxaccess pointing at the dev box's live Galaxy. Pattern from HostSubprocessParityTests.
    • 10-20 representative tests covering the core paths: hierarchy shape, attribute count, read-Manufacturer-Boolean, write-Operate-Float roundtrip, subscribe-receives-OnDataChange, Bad-quality on disconnect, alarm-event-shape.
  3. The four 2026-04-13 stability findings get individual regression tests in this project.
  4. Once green: proceed to deletion checklist below.
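
As a rough sketch of the ParityFixture launch described in step 2, assuming the Host reads its configuration from the OTOPCUA_GALAXY_* environment variables that the installer script sets. HostLauncher, BuildStartInfo, and its parameters are hypothetical names; only ProcessStartInfo and its Environment dictionary are real .NET APIs.

```csharp
using System;
using System.Diagnostics;

public static class HostLauncher
{
    // Builds the start info for the Galaxy Host subprocess without starting it,
    // so the environment can be inspected or adjusted by the fixture first.
    // hostExePath, pipeName, and sharedSecret are supplied by the caller;
    // nothing here guesses the real publish path or secret format.
    public static ProcessStartInfo BuildStartInfo(string hostExePath, string pipeName, string sharedSecret)
    {
        var psi = new ProcessStartInfo(hostExePath)
        {
            // Per-process environment variables only apply when the shell is bypassed.
            UseShellExecute = false,
            CreateNoWindow = true,
        };

        // Variable names taken from the installer script; values are per-fixture.
        psi.Environment["OTOPCUA_GALAXY_BACKEND"] = "mxaccess";
        psi.Environment["OTOPCUA_GALAXY_PIPE"] = pipeName;
        psi.Environment["OTOPCUA_GALAXY_SECRET"] = sharedSecret;
        return psi;
    }
}
```

A fixture would typically pass the result to Process.Start in its constructor and kill and dispose the process in Dispose, using a unique pipe name per test class so parallel classes do not collide.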

When to pick B: the typical case, where development velocity matters most. The v1 archive serves as a reference; the new suite becomes the live parity bar.

Deletion checklist (after Option A or B is green)

Pre-conditions:

  • Chosen-option test suite green (494 retargeted OR new E2E suite passing on this box)
  • phase-2-compliance.ps1 runs and exits 0
  • Get-Service aaGR, aaBootstrap → Running
  • Driver.Galaxy.Host x86 publish output verified at src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Release/net48/
  • Migration script tested: scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 -AppSettingsPath src/ZB.MOM.WW.OtOpcUa.Host/appsettings.json -DryRun produces a well-formed DriverConfig
  • Service installer scripts dry-run on a test box: scripts/install/Install-Services.ps1 -InstallRoot C:\OtOpcUa -ServiceAccount LOCALHOST\testuser registers both services and they start

Steps:

  1. Delete src/ZB.MOM.WW.OtOpcUa.Host/ (the legacy in-process Host project).
  2. Edit ZB.MOM.WW.OtOpcUa.slnx — remove the legacy Host <Project> line; keep all v2 project lines.
  3. Migrate the dev appsettings.json Galaxy sections to DriverConfig JSON via the migration script; insert into the Configuration DB for the dev cluster's Galaxy driver instance.
  4. Run the chosen test suite once more — confirm zero regressions from the deletion.
  5. Build full solution (dotnet build ZB.MOM.WW.OtOpcUa.slnx) — confirm clean build with no references to the deleted project.
  6. Commit: git rm -r src/ZB.MOM.WW.OtOpcUa.Host followed by the slnx + cleanup edits in one atomic commit titled "Phase 2 Stream D — retire legacy OtOpcUa.Host".
  7. Run /codex:adversarial-review --base v2 on the merged Phase 2 diff.
  8. Record exit-gate-phase-2-final.md with: Option chosen, deletion-commit SHA, parity test count + duration, adversarial-review findings (each closed or deferred with link).
  9. Open PR against v2, link the exit-gate doc + compliance script output + parity report.
  10. Merge after one reviewer signoff.
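
Steps 2 and 5 both depend on the solution file no longer referencing the deleted project. A small check along these lines could catch a leftover entry before the build does. It assumes the .slnx lists projects as <Project Path="..."> elements without an XML namespace; SlnxCheck and FindLegacyHostEntries are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SlnxCheck
{
    // Returns every <Project> path that still points into the legacy Host directory.
    // Matching on the full directory prefix (with trailing slash) keeps the v2
    // ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host project from being flagged by mistake.
    public static IReadOnlyList<string> FindLegacyHostEntries(string slnxXml)
    {
        const string legacyDir = "src/ZB.MOM.WW.OtOpcUa.Host/";

        return XDocument.Parse(slnxXml)
            .Descendants("Project")
            .Select(p => ((string?)p.Attribute("Path") ?? string.Empty).Replace('\\', '/'))
            .Where(path => path.StartsWith(legacyDir, StringComparison.OrdinalIgnoreCase))
            .ToList();
    }
}
```

An empty result after step 2 is the expected state; any returned path is a line that still needs removing before the commit in step 6.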

Rollback

If Stream D causes downstream consumer failures (ScadaBridge / Ignition / SystemPlatform IO clients seeing different OPC UA behavior), roll back with git revert of the deletion commit. The revert restores the legacy Host alongside the v2 stack: Galaxy.Proxy and Galaxy.Host stay in the codebase and remain installable, so production can run either topology. OtOpcUa.Driver.Galaxy.Proxy stays dormant until the next attempt.

Why this can't be done in one shot in an autonomous session

  • The parity-defect debug cycle is intrinsically interactive: each iteration requires running the test suite against live Galaxy, inspecting the diff, deciding if the difference is a legitimate v2 improvement or a regression, then either widening the assertion or fixing the v2 code. That decision-making is the bottleneck, not the typing.
  • The legacy-Host deletion is destructive — needs explicit operator authorization on a real PR review, not unattended automation.
  • The downstream consumer cutover (ScadaBridge, Ignition, AppServer) lives outside this repo and on an integration-team track; "Phase 2 done" inside this repo is a precondition, not the full release.