diff --git a/docs/v2/implementation/stream-d-removal-procedure.md b/docs/v2/implementation/stream-d-removal-procedure.md new file mode 100644 index 0000000..75916f0 --- /dev/null +++ b/docs/v2/implementation/stream-d-removal-procedure.md @@ -0,0 +1,103 @@ +# Stream D — Legacy `OtOpcUa.Host` Removal Procedure + +> Sequenced playbook for the next session that takes Phase 2 to its full exit gate. +> All Stream A/B/C work is committed. The blocker is structural: the 494 v1 +> `OtOpcUa.Tests` instantiate v1 `Host` classes directly, so they must be +> retargeted (or archived) before the Host project can be deleted. + +## Decision: Option A or Option B + +### Option A — Rewrite the 494 v1 tests to use v2 topology + +**Effort**: 3-5 days. Highest fidelity (full v1 test coverage carries forward). + +**Steps**: +1. Build a `ProxyMxAccessClientAdapter` in a new `OtOpcUa.LegacyTestCompat/` project that + implements v1's `IMxAccessClient` by forwarding to `Driver.Galaxy.Proxy.GalaxyProxyDriver`. + Maps v1 `Vtq` ↔ v2 `DataValueSnapshot`, v1 `Quality` enum ↔ v2 `StatusCode` u32, the v1 + `OnTagValueChanged` event ↔ v2 `ISubscribable.OnDataChange`. +2. Same idea for `IGalaxyRepository` — adapter that wraps v2's `Backend.Galaxy.GalaxyRepository`. +3. Replace `MxAccessClient` constructions in `OtOpcUa.Tests` test fixtures with the adapter. + Most tests use a single fixture so the change-set is concentrated. +4. For each test class: run; iterate on parity defects until green. Expected defect families: + timing-sensitive assertions (IPC adds ~5ms latency; widen tolerances), Quality enum vs + StatusCode mismatches, value-byte-encoding differences. +5. Once all 494 pass: proceed to deletion checklist below. + +**When to pick A**: regulatory environments that need the full historical test suite green, +or when the v2 parity gate is itself a release-blocking artifact downstream consumers will +look for. 
+ +### Option B — Archive the 494 v1 tests, build a smaller v2 parity suite + +**Effort**: 1-2 days. Faster to green; less coverage initially, accreted over time. + +**Steps**: +1. Rename `tests/ZB.MOM.WW.OtOpcUa.Tests/` → `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`. + Add `<IsTestProject>false</IsTestProject>` so CI doesn't run them; mark every class with + `[Trait("Category", "v1Archive")]` so a future operator can opt in via `--filter`. +2. New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/` project (.NET 10): + - `ParityFixture` spawns Galaxy.Host EXE per test class with `OTOPCUA_GALAXY_BACKEND=mxaccess` + pointing at the dev box's live Galaxy. Pattern from `HostSubprocessParityTests`. + - 10-20 representative tests covering the core paths: hierarchy shape, attribute count, + read-Manufacturer-Boolean, write-Operate-Float roundtrip, subscribe-receives-OnDataChange, + Bad-quality on disconnect, alarm-event-shape. +3. The four 2026-04-13 stability findings get individual regression tests in this project. +4. Once green: proceed to deletion checklist below. + +**When to pick B**: typical dev velocity case. The v1 archive is reference, the new suite is +the live parity bar. + +## Deletion checklist (after Option A or B is green) + +Pre-conditions: +- [ ] Chosen-option test suite green (494 retargeted OR new E2E suite passing on this box) +- [ ] `phase-2-compliance.ps1` runs and exits 0 +- [ ] `Get-Service aaGR, aaBootstrap` → Running +- [ ] `Driver.Galaxy.Host` x86 publish output verified at + `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Release/net48/` +- [ ] Migration script tested: `scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 + -AppSettingsPath src/ZB.MOM.WW.OtOpcUa.Host/appsettings.json -DryRun` produces a + well-formed DriverConfig +- [ ] Service installer scripts dry-run on a test box: `scripts/install/Install-Services.ps1 + -InstallRoot C:\OtOpcUa -ServiceAccount LOCALHOST\testuser` registers both services + and they start + +Steps: +1.
Delete `src/ZB.MOM.WW.OtOpcUa.Host/` (the legacy in-process Host project). +2. Edit `ZB.MOM.WW.OtOpcUa.slnx` — remove the legacy Host `<Project>` line; keep all v2 + project lines. +3. Migrate the dev `appsettings.json` Galaxy sections to `DriverConfig` JSON via the + migration script; insert into the Configuration DB for the dev cluster's Galaxy driver + instance. +4. Run the chosen test suite once more — confirm zero regressions from the deletion. +5. Build full solution (`dotnet build ZB.MOM.WW.OtOpcUa.slnx`) — confirm clean build with + no references to the deleted project. +6. Commit: + `git rm -r src/ZB.MOM.WW.OtOpcUa.Host` followed by the slnx + cleanup edits in one + atomic commit titled "Phase 2 Stream D — retire legacy OtOpcUa.Host". +7. Run `/codex:adversarial-review --base v2` on the merged Phase 2 diff. +8. Record `exit-gate-phase-2-final.md` with: Option chosen, deletion-commit SHA, parity + test count + duration, adversarial-review findings (each closed or deferred with link). +9. Open PR against `v2`, link the exit-gate doc + compliance script output + parity report. +10. Merge after one reviewer signoff. + +## Rollback + +If Stream D causes downstream consumer failures (ScadaBridge / Ignition / SystemPlatform IO +clients seeing different OPC UA behavior), the rollback is `git revert` of the deletion +commit — the whole v2 codebase keeps Galaxy.Proxy + Galaxy.Host installed alongside the +restored legacy Host. Production can run either topology. `OtOpcUa.Driver.Galaxy.Proxy` +becomes dormant until the next attempt. + +## Why this can't one-shot in an autonomous session + +- The parity-defect debug cycle is intrinsically interactive: each iteration requires running + the test suite against live Galaxy, inspecting the diff, deciding if the difference is a + legitimate v2 improvement or a regression, then either widening the assertion or fixing the + v2 code. That decision-making is the bottleneck, not the typing.
+- The legacy-Host deletion is destructive — needs explicit operator authorization on a real + PR review, not unattended automation. +- The downstream consumer cutover (ScadaBridge, Ignition, AppServer) lives outside this repo + and on an integration-team track; "Phase 2 done" inside this repo is a precondition, not + the full release. diff --git a/scripts/install/Install-Services.ps1 b/scripts/install/Install-Services.ps1 new file mode 100644 index 0000000..d0bceca --- /dev/null +++ b/scripts/install/Install-Services.ps1 @@ -0,0 +1,102 @@ +<# +.SYNOPSIS + Registers the two v2 Windows services on a node: OtOpcUa (main server, net10) and + OtOpcUaGalaxyHost (out-of-process Galaxy COM host, net48 x86). + +.DESCRIPTION + Phase 2 Stream D.2 — replaces the v1 single-service install (TopShelf-based OtOpcUa.Host). + Installs both services with the correct service-account SID + per-process shared secret + provisioning per `driver-stability.md §"IPC Security"`. OtOpcUa depends on Galaxy.Host + (Galaxy.Host must be reachable when OtOpcUa starts; service dependency wiring + retry + handled by OtOpcUa.Server NodeBootstrap). + +.PARAMETER InstallRoot + Where the binaries live (typically C:\Program Files\OtOpcUa). + +.PARAMETER ServiceAccount + Service account SID or DOMAIN\name. Both services run under this account; the + Galaxy.Host pipe ACL only allows this SID to connect (decision #76). + +.PARAMETER GalaxySharedSecret + Per-process secret passed to Galaxy.Host via env var. Generated freshly per install. + +.PARAMETER ZbConnection + Galaxy ZB SQL connection string (passed to Galaxy.Host via env var).
+ +.EXAMPLE + .\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' -ServiceAccount 'OTOPCUA\svc-otopcua' +#> +[CmdletBinding()] +param( + [Parameter(Mandatory)] [string]$InstallRoot, + [Parameter(Mandatory)] [string]$ServiceAccount, + [string]$GalaxySharedSecret, + [string]$ZbConnection = 'Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;', + [string]$GalaxyClientName = 'OtOpcUa-Galaxy.Host', + [string]$GalaxyPipeName = 'OtOpcUaGalaxy' +) + +$ErrorActionPreference = 'Stop' + +if (-not (Test-Path "$InstallRoot\OtOpcUa.Server.exe")) { + Write-Error "OtOpcUa.Server.exe not found at $InstallRoot — copy the publish output first" + exit 1 +} +if (-not (Test-Path "$InstallRoot\Galaxy\OtOpcUa.Driver.Galaxy.Host.exe")) { + Write-Error "OtOpcUa.Driver.Galaxy.Host.exe not found at $InstallRoot\Galaxy — copy the publish output first" + exit 1 +} + +# Generate a fresh shared secret per install if not supplied. NOTE: this script does not write +# a DPAPI-protected file; the secret is placed in the service's registry Environment value below. +if (-not $GalaxySharedSecret) { + $bytes = New-Object byte[] 32 + [System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes) + $GalaxySharedSecret = [Convert]::ToBase64String($bytes) +} + +# Resolve the SID — the IPC ACL needs the SID, not the down-level name. +$sid = if ($ServiceAccount.StartsWith('S-1-')) { + $ServiceAccount +} else { + (New-Object System.Security.Principal.NTAccount $ServiceAccount).Translate([System.Security.Principal.SecurityIdentifier]).Value +} + +# --- Install OtOpcUaGalaxyHost first (OtOpcUa starts after, depends on it being up).
+$galaxyEnv = @( + "OTOPCUA_GALAXY_PIPE=$GalaxyPipeName" + "OTOPCUA_ALLOWED_SID=$sid" + "OTOPCUA_GALAXY_SECRET=$GalaxySharedSecret" + "OTOPCUA_GALAXY_BACKEND=mxaccess" + "OTOPCUA_GALAXY_ZB_CONN=$ZbConnection" + "OTOPCUA_GALAXY_CLIENT_NAME=$GalaxyClientName" +) -join "`0" +$galaxyEnv += "`0`0" + +Write-Host "Installing OtOpcUaGalaxyHost..." +& sc.exe create OtOpcUaGalaxyHost binPath= "`"$InstallRoot\Galaxy\OtOpcUa.Driver.Galaxy.Host.exe`"" ` + DisplayName= 'OtOpcUa Galaxy Host (out-of-process MXAccess)' ` + start= auto ` + obj= $ServiceAccount | Out-Null + +# Set per-service environment variables via the registry — sc.exe doesn't expose them directly. +$svcKey = "HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost" +$envValue = $galaxyEnv.Split("`0") | Where-Object { $_ -ne '' } +Set-ItemProperty -Path $svcKey -Name 'Environment' -Type MultiString -Value $envValue + +# --- Install OtOpcUa (depends on Galaxy host being installed; doesn't strictly require it +# started — OtOpcUa.Server NodeBootstrap retries on the IPC connect path). +Write-Host "Installing OtOpcUa..." +& sc.exe create OtOpcUa binPath= "`"$InstallRoot\OtOpcUa.Server.exe`"" ` + DisplayName= 'OtOpcUa Server' ` + start= auto ` + depend= 'OtOpcUaGalaxyHost' ` + obj= $ServiceAccount | Out-Null + +Write-Host "" +Write-Host "Installed. Start with:" +Write-Host " sc.exe start OtOpcUaGalaxyHost" +Write-Host " sc.exe start OtOpcUa" +Write-Host "" +Write-Host "Galaxy shared secret (record this offline — required for service rebinding):" +Write-Host " $GalaxySharedSecret" diff --git a/scripts/install/Uninstall-Services.ps1 b/scripts/install/Uninstall-Services.ps1 new file mode 100644 index 0000000..c811226 --- /dev/null +++ b/scripts/install/Uninstall-Services.ps1 @@ -0,0 +1,18 @@ +<# +.SYNOPSIS + Stops + removes the two v2 services. Mirrors Install-Services.ps1. 
+#> +[CmdletBinding()] param() +$ErrorActionPreference = 'Continue' + +foreach ($svc in 'OtOpcUa', 'OtOpcUaGalaxyHost') { + if (Get-Service $svc -ErrorAction SilentlyContinue) { + Write-Host "Stopping $svc..." + Stop-Service $svc -Force -ErrorAction SilentlyContinue + Write-Host "Removing $svc..." + & sc.exe delete $svc | Out-Null + } else { + Write-Host "$svc not installed — skipping" + } +} +Write-Host "Done." diff --git a/scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 b/scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 new file mode 100644 index 0000000..5f5a0d3 --- /dev/null +++ b/scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 @@ -0,0 +1,107 @@ +<# +.SYNOPSIS + Translates a v1 OtOpcUa.Host appsettings.json into a v2 DriverInstance.DriverConfig JSON + blob suitable for upserting into the central Configuration DB. + +.DESCRIPTION + Phase 2 Stream D.3 — moves the legacy MxAccess + GalaxyRepository + Historian sections out + of node-local appsettings.json and into the central DB so each node only needs Cluster.NodeId + + ClusterId + DB conn (per decision #18). Idempotent + dry-run-able. + + Output shape matches the Galaxy DriverType schema in `docs/v2/plan.md` §"Galaxy DriverConfig": + + { + "MxAccess": { "ClientName": "...", "RequestTimeoutSeconds": 30 }, + "Database": { "ConnectionString": "...", "PollIntervalSeconds": 60 }, + "Historian": { "Enabled": false } + } + +.PARAMETER AppSettingsPath + Path to the v1 appsettings.json. Defaults to ../../src/ZB.MOM.WW.OtOpcUa.Host/appsettings.json + relative to the script. + +.PARAMETER OutputPath + Where to write the generated DriverConfig JSON. Defaults to stdout. + +.PARAMETER DryRun + Print what would be written without writing. 
+ +.EXAMPLE + pwsh ./Migrate-AppSettings-To-DriverConfig.ps1 -AppSettingsPath C:\OtOpcUa\appsettings.json -OutputPath C:\tmp\galaxy-driverconfig.json +#> +[CmdletBinding()] +param( + [string]$AppSettingsPath, + [string]$OutputPath, + [switch]$DryRun +) + +$ErrorActionPreference = 'Stop' + +if (-not $AppSettingsPath) { + $AppSettingsPath = Join-Path (Split-Path -Parent $PSScriptRoot) '..\src\ZB.MOM.WW.OtOpcUa.Host\appsettings.json' +} + +if (-not (Test-Path $AppSettingsPath)) { + Write-Error "AppSettings file not found: $AppSettingsPath" + exit 1 +} + +$src = Get-Content -Raw $AppSettingsPath | ConvertFrom-Json + +$mx = $src.MxAccess +$gr = $src.GalaxyRepository +$hi = $src.Historian + +$driverConfig = [ordered]@{ + MxAccess = [ordered]@{ + ClientName = $mx.ClientName + NodeName = $mx.NodeName + GalaxyName = $mx.GalaxyName + RequestTimeoutSeconds = $mx.ReadTimeoutSeconds + WriteTimeoutSeconds = $mx.WriteTimeoutSeconds + MaxConcurrentOps = $mx.MaxConcurrentOperations + MonitorIntervalSec = $mx.MonitorIntervalSeconds + AutoReconnect = $mx.AutoReconnect + ProbeTag = $mx.ProbeTag + } + Database = [ordered]@{ + ConnectionString = $gr.ConnectionString + ChangeDetectionIntervalSec = $gr.ChangeDetectionIntervalSeconds + CommandTimeoutSeconds = $gr.CommandTimeoutSeconds + ExtendedAttributes = $gr.ExtendedAttributes + Scope = $gr.Scope + PlatformName = $gr.PlatformName + } + Historian = [ordered]@{ + Enabled = if ($null -ne $hi -and $null -ne $hi.Enabled) { $hi.Enabled } else { $false } + } +} + +# Strip null-valued leaves so the resulting JSON is compact and round-trippable. 
+function Remove-Nulls($obj) { + $keys = @($obj.Keys) + foreach ($k in $keys) { + if ($null -eq $obj[$k]) { $obj.Remove($k) | Out-Null } + elseif ($obj[$k] -is [System.Collections.Specialized.OrderedDictionary]) { Remove-Nulls $obj[$k] } + } +} +Remove-Nulls $driverConfig + +$json = $driverConfig | ConvertTo-Json -Depth 8 + +if ($DryRun) { + Write-Host "=== DriverConfig (dry-run, would write to $OutputPath) ===" + Write-Host $json + return +} + +if ($OutputPath) { + $dir = Split-Path -Parent $OutputPath + if ($dir -and -not (Test-Path $dir)) { New-Item -ItemType Directory -Path $dir | Out-Null } + Set-Content -Path $OutputPath -Value $json -Encoding UTF8 + Write-Host "Wrote DriverConfig to $OutputPath" +} +else { + $json +} diff --git a/tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/HostSubprocessParityTests.cs b/tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/HostSubprocessParityTests.cs new file mode 100644 index 0000000..c9e8fe1 --- /dev/null +++ b/tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/HostSubprocessParityTests.cs @@ -0,0 +1,130 @@ +using System.Diagnostics; +using System.Reflection; +using System.Security.Principal; +using Shouldly; +using Xunit; +using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Ipc; +using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts; + +namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests; + +/// <summary> +/// The honest cross-FX parity test — spawns the actual OtOpcUa.Driver.Galaxy.Host.exe +/// subprocess (net48 x86), the Proxy connects via real named pipe, exercises Discover +/// against the live Galaxy ZB DB, and asserts gobjects come back. This is the production +/// deployment shape (Tier C: separate process, IPC over named pipe, Proxy in the .NET 10 +/// server process). Skipped when the Host EXE isn't built or Galaxy is unreachable. +/// </summary> +[Trait("Category", "ProcessSpawnParity")] +public sealed class HostSubprocessParityTests : IDisposable +{ + private Process?
_hostProcess; + + public void Dispose() + { + if (_hostProcess is not null && !_hostProcess.HasExited) + { + try { _hostProcess.Kill(entireProcessTree: true); } catch { /* ignore */ } + try { _hostProcess.WaitForExit(5_000); } catch { /* ignore */ } + } + _hostProcess?.Dispose(); + } + + private static string? FindHostExe() + { + // The test assembly lives at tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/bin/Debug/net10.0/. + // The Host EXE lives at src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Debug/net48/. + var asmDir = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)!; + var solutionRoot = asmDir; + for (var i = 0; i < 8 && solutionRoot is not null; i++) + { + if (File.Exists(Path.Combine(solutionRoot, "ZB.MOM.WW.OtOpcUa.slnx"))) + break; + solutionRoot = Path.GetDirectoryName(solutionRoot); + } + if (solutionRoot is null) return null; + + var candidate = Path.Combine(solutionRoot, + "src", "ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host", "bin", "Debug", "net48", + "OtOpcUa.Driver.Galaxy.Host.exe"); + return File.Exists(candidate) ? 
candidate : null; + } + + private static bool IsAdministrator() + { + if (!OperatingSystem.IsWindows()) return false; + using var identity = WindowsIdentity.GetCurrent(); + return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator); + } + + private static async Task<bool> ZbReachableAsync() + { + try + { + using var client = new System.Net.Sockets.TcpClient(); + var task = client.ConnectAsync("localhost", 1433); + return await Task.WhenAny(task, Task.Delay(1_500)) == task && client.Connected; + } + catch { return false; } + } + + [Fact] + public async Task Spawned_Host_in_db_mode_lets_Proxy_Discover_real_Galaxy_gobjects() + { + if (!OperatingSystem.IsWindows() || IsAdministrator()) return; + if (!await ZbReachableAsync()) return; + + var hostExe = FindHostExe(); + if (hostExe is null) return; // skip when the Host hasn't been built + + using var identity = WindowsIdentity.GetCurrent(); + var sid = identity.User!; + var pipeName = $"OtOpcUaGalaxyParity-{Guid.NewGuid():N}"; + const string secret = "parity-secret"; + + var psi = new ProcessStartInfo(hostExe) + { + UseShellExecute = false, + CreateNoWindow = true, + RedirectStandardOutput = true, + RedirectStandardError = true, + EnvironmentVariables = + { + ["OTOPCUA_GALAXY_PIPE"] = pipeName, + ["OTOPCUA_ALLOWED_SID"] = sid.Value, + ["OTOPCUA_GALAXY_SECRET"] = secret, + ["OTOPCUA_GALAXY_BACKEND"] = "db", // SQL-only — doesn't need MXAccess + ["OTOPCUA_GALAXY_ZB_CONN"] = "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;", + }, + }; + + _hostProcess = Process.Start(psi) + ?? throw new InvalidOperationException("Failed to spawn Galaxy.Host"); + + // Wait for the pipe to come up — the Host's PipeServer takes ~100ms to bind.
+ await Task.Delay(2_000); + + await using var client = await GalaxyIpcClient.ConnectAsync( + pipeName, secret, TimeSpan.FromSeconds(5), CancellationToken.None); + + var sessionResp = await client.CallAsync( + MessageKind.OpenSessionRequest, + new OpenSessionRequest { DriverInstanceId = "parity", DriverConfigJson = "{}" }, + MessageKind.OpenSessionResponse, + CancellationToken.None); + sessionResp.Success.ShouldBeTrue(sessionResp.Error); + + var discoverResp = await client.CallAsync( + MessageKind.DiscoverHierarchyRequest, + new DiscoverHierarchyRequest { SessionId = sessionResp.SessionId }, + MessageKind.DiscoverHierarchyResponse, + CancellationToken.None); + + discoverResp.Success.ShouldBeTrue(discoverResp.Error); + discoverResp.Objects.Length.ShouldBeGreaterThan(0, + "live Galaxy ZB has at least one deployed gobject"); + + await client.SendOneWayAsync(MessageKind.CloseSessionRequest, + new CloseSessionRequest { SessionId = sessionResp.SessionId }, CancellationToken.None); + } +}