5 Commits

Author SHA1 Message Date
Joseph Doherty
aa8834a231 Phase 3 PR 40 — LiveStackSmokeTests: write-roundtrip + subscribe-receives-OnDataChange against the live Galaxy. Finishes LMX #5 by exercising the IWritable + ISubscribable capability paths end-to-end through the Proxy → OtOpcUaGalaxyHost service → MXAccess → real Galaxy.
Two new facts target DelmiaReceiver_001.TestAttribute — the writable Boolean UDA on the TestMachine_001 hierarchy in this dev Galaxy. The user nominated TestMachine_001 (the deployed test-target object) as a scratch surface for live testing; ZB query showed DelmiaReceiver_001 carries one dynamic_attribute named TestAttribute (mx_data_type=1=Boolean, lock_type=0=writable, security_classification=1=Operate). Naming makes the intent obvious — the attribute exists for exactly this kind of integration testing — and Boolean keeps the assertions simple (invert, write, read back).
Write_then_read_roundtrips_a_writable_Boolean_attribute_on_TestMachine_001: reads the current value as the baseline (Galaxy may return Uncertain quality until the Engine has scanned the attribute at least once — we don't read into a typed bool until Status is Good), inverts it, writes via IWritable, then polls reads in a 5s loop until either the new value comes back or the budget expires. The scan-window poll (rather than a single read after a fixed delay) accommodates Galaxy's variable scan latency on a fresh service start. Restore-on-finally writes the original value back so re-running the test doesn't accumulate a flipped TestAttribute on the dev box (Galaxy holds UDA values across runs since they're deployed). Best-effort restore — swallows exceptions so a failure in restore doesn't mask the primary assertion.
Subscribe_fires_OnDataChange_with_initial_value_then_again_after_a_write: subscribes to the same attribute with a 250ms publishing interval, captures every OnDataChange notification onto a thread-safe ConcurrentQueue (MXAccess advisory fires on its own thread per Galaxy's COM apartment model — must not block it), waits up to 5s for the initial-value callback (per ISubscribable's contract: 'driver MAY fire OnDataChange immediately with the current value'), records the queue depth as a baseline, writes the toggled value, waits up to 8s for at least one MORE notification, then searches the queue tail for the notification carrying the toggled value (initial value may appear multiple times before the write commits — looking at the tail finds the post-write delta even if the queue grew during the wait window). Unsubscribes on finally + restores baseline.
Both tests use Convert.ToBoolean(value ?? false) to defensively handle the Boxed-vs-typed quirk in MessagePack-deserialized Galaxy values — depending on the wire encoding the Boolean might come back as System.Boolean or System.Object boxing one. Convert.ToBoolean handles both. Same pattern in OnReadValue's existing usage.
WaitForAsync helper does the loop+budget pattern shared by both tests.
PR 40 is the code side of LMX #5's final two deferred facts. Running them green requires re-executing from a normal (non-admin) PowerShell; the elevated-shell skip from PR 39 fires correctly under bash + sc.exe-context (verified). lmx-followups.md #5 updated to note the new facts, the run command, and the one remaining genuine follow-up (an alarm-condition fact once an alarm-flagged attribute is deployed on TestMachine_001).
Test posture from elevated bash: 7 LiveStackSmokeTests facts discovered (was 5; +2 new), all skip cleanly with the elevation message. Build clean.
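
The boxed-versus-typed handling described above can be illustrated in isolation. This is a minimal sketch, not code from the PR: the `AsBool` helper name and the sample values are hypothetical, and it only demonstrates how `Convert.ToBoolean(object)` behaves when combined with a `?? false` fallback.

```csharp
using System;

// Hypothetical helper (not part of the PR): normalizes a loosely typed value to bool,
// mirroring the Convert.ToBoolean(value ?? false) pattern the commit message describes.
static bool AsBool(object? value) => Convert.ToBoolean(value ?? false);

object? boxedTrue = true;   // a System.Boolean boxed as object
object? missing = null;     // no value available; the fallback yields false
object? numeric = 1;        // a non-zero integer also converts to true

Console.WriteLine(AsBool(boxedTrue)); // True
Console.WriteLine(AsBool(missing));   // False
Console.WriteLine(AsBool(numeric));   // True
```

A direct `(bool)value` cast would throw on the `null` and integer cases, which is presumably why the tests go through `Convert` instead.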

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 19:38:34 -04:00
976e73e051 Merge pull request 'Phase 3 PR 39 — LiveStackFixture skip-with-reason for elevated shells' (#38) from phase-3-pr39-elevated-shell-skip into v2 2026-04-18 19:31:30 -04:00
Joseph Doherty
8fb3dbe53b Phase 3 PR 39 — LiveStackFixture pre-flight detect for elevated shell. The OtOpcUaGalaxyHost named-pipe ACL allows the configured SID but explicitly DENIES Administrators per decision #76 / PipeAcl.cs (production-hardening — keeps an admin shell on a deployed box from connecting to the IPC channel without going through the configured service principal). A test process running with a high-integrity elevated token carries the Administrators group in its security context regardless of whose user it 'is', so the deny rule trumps the user's allow and the pipe connect returns UnauthorizedAccessException at the prerequisite-probe stage. Functionally correct but operationally confusing — when this hit during the PR 38 install workflow it took five steps to diagnose ('the user IS in the allow list, why is the pipe denying access?'). The pre-existing ParityFixture (PR 18) already documents this with an explicit early-skip; LiveStackFixture (PR 37) didn't.
PR 39 closes the gap. New IsElevatedAdministratorOnWindows static helper (Windows-only via RuntimeInformation.IsOSPlatform; non-Windows hosts return false and let the prerequisite probe own the skip-with-reason path) checks WindowsPrincipal.IsInRole(WindowsBuiltInRole.Administrator) on the current process token. When true, InitializeAsync short-circuits to a SkipReason that names the cause directly: 'elevated token's Admins group membership trumps the allow rule — re-run from a NORMAL (non-admin) PowerShell window'. Catches and swallows any probe-side exception so a Win32 oddity can't crash the test fixture; failed probe falls through to the regular prerequisite path.
The check fires BEFORE AvevaPrerequisites.CheckAllAsync runs because the prereq probe's own pipe connect hits the same admin-deny and surfaces UnauthorizedAccessException with no context. Short-circuiting earlier saves the 10-second probe + produces a single actionable line.
Tests — verified manually from an elevated bash session against the just-installed OtOpcUaGalaxyHost service: skip message reads 'Test host is running with elevated (Administrators) privileges, but the OtOpcUaGalaxyHost named-pipe ACL explicitly denies Administrators per the IPC security design (decision #76 / PipeAcl.cs). Re-run from a NORMAL (non-admin) PowerShell window — even when your user is already in the pipe's allow list, the elevated token's Admins group membership trumps the allow rule.' Proxy.Tests Unit: 17 pass / 0 fail (unchanged — fixture change is non-breaking; existing tests don't run as admin in normal CI flow). Build clean.
Bonus: gitignored .local/ directory (a previous direct commit on local v2 that I'm now landing here) so per-install secrets like the Galaxy.Host shared-secret file don't leak into the repo.
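
The "deny trumps allow" behaviour this commit message relies on can be sketched with a toy model. This is a simplified illustration only: real Windows access checks also involve access masks, inheritance, and integrity levels, and the `IsAllowed` function, the `pipeAcl` list, and the SID strings below are hypothetical rather than taken from PipeAcl.cs. It assumes, as the message describes, that only the elevated token matches the Administrators entry.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy access check: walks the entries with deny entries first (the canonical ordering)
// and lets the first entry that matches a SID in the token decide the outcome.
static bool IsAllowed(IEnumerable<(bool Deny, string Sid)> acl, ISet<string> tokenSids)
{
    foreach (var ace in acl.OrderByDescending(a => a.Deny))
    {
        if (!tokenSids.Contains(ace.Sid)) continue;
        return !ace.Deny; // first matching entry decides
    }
    return false; // no matching entry: access is not granted
}

var pipeAcl = new List<(bool Deny, string Sid)>
{
    (false, "svc-user"),       // allow the configured principal (hypothetical name)
    (true, "Administrators"),  // deny the Administrators group
};

var normalToken = new HashSet<string> { "svc-user" };
var elevatedToken = new HashSet<string> { "svc-user", "Administrators" };

Console.WriteLine(IsAllowed(pipeAcl, normalToken));   // True
Console.WriteLine(IsAllowed(pipeAcl, elevatedToken)); // False
```

In this model the same user is allowed or denied purely on whether the Administrators SID is active in the token, which matches the confusing symptom the skip message is meant to explain.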

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 19:17:43 -04:00
Joseph Doherty
a61e637411 Gitignore .local/ directory for dev-only secrets like the Galaxy.Host shared secret. Created during the PR 38 / install-services workflow to keep per-install secrets out of the repo. 2026-04-18 19:15:13 -04:00
e4885aadd0 Merge pull request 'Phase 3 PR 38 — DriverNodeManager HistoryRead override (LMX #1 finish)' (#37) from phase-3-pr38-historyread-servicehandler into v2 2026-04-18 17:53:24 -04:00
4 changed files with 203 additions and 8 deletions

.gitignore

@@ -29,3 +29,4 @@ packages/
# Claude Code (per-developer settings, runtime lock files, agent transcripts)
.claude/
.local/


@@ -125,14 +125,29 @@ Shared secret + pipe name resolve from `OTOPCUA_GALAXY_SECRET` /
`OTOPCUA_GALAXY_PIPE` env vars, falling back to reading the service's
registry-stored Environment values (requires elevated test host).
**Remaining**:
- Install + run the `OtOpcUaGalaxyHost` + `OtOpcUa` services on the dev box
(`scripts/install/Install-Services.ps1`) so the skip-on-unready tests
actually execute and the smoke PR lands green.
- Subscribe-and-receive-data-change fact (needs a known tag that actually
ticks; deferred until operators confirm a scratch tag exists).
- Write-and-roundtrip fact (needs a test-only UDA or agreed scratch tag
so we can't accidentally mutate a process-critical value).
**PR 40** added the write + subscribe facts targeting
`DelmiaReceiver_001.TestAttribute` (the writable Boolean UDA the dev Galaxy
ships under TestMachine_001) — write-then-read with a 5s scan-window poll +
restore-on-finally, and subscribe-then-write asserting both an initial-value
OnDataChange and a post-write OnDataChange. PR 39 added the elevated-shell
short-circuit so a developer running from an admin window gets an actionable
skip instead of `UnauthorizedAccessException`.
**Run the live tests** (from a NORMAL non-admin PowerShell):
```powershell
$env:OTOPCUA_GALAXY_SECRET = Get-Content C:\Users\dohertj2\Desktop\lmxopcua\.local\galaxy-host-secret.txt
cd C:\Users\dohertj2\Desktop\lmxopcua
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests --filter "FullyQualifiedName~LiveStackSmokeTests"
```
Expected: 7/7 pass against the running `OtOpcUaGalaxyHost` service.
**Remaining for #5 in production-grade form**:
- Confirm the suite passes from a non-elevated shell (operator action).
- Add similar facts for an alarm-source attribute once `TestMachine_001` (or
a sibling) carries a deployed alarm condition — the current dev Galaxy's
TestAttribute isn't alarm-flagged.
## 6. Second driver instance on the same server — **DONE (PR 32)**


@@ -1,3 +1,6 @@
using System.Runtime.InteropServices;
using System.Runtime.Versioning;
using System.Security.Principal;
using System.Threading;
using System.Threading.Tasks;
using Xunit;
@@ -40,6 +43,25 @@ public sealed class LiveStackFixture : IAsyncLifetime
public async ValueTask InitializeAsync()
{
// 0. Elevated-shell short-circuit. The OtOpcUaGalaxyHost pipe ACL allows the configured
// SID but explicitly DENIES Administrators (decision #76 — production hardening).
// A test process running with a high-integrity token (any elevated shell) carries the
// Admins group in its security context, so the deny rule trumps the user's allow and
// the pipe connect returns UnauthorizedAccessException. That is technically correct but
// operationally confusing; it is the failure mode that ate most of the PR 38 install
// debugging session. Surfacing it explicitly here saves the next operator the same
// five-step diagnosis. ParityFixture has the same skip with the same rationale.
if (IsElevatedAdministratorOnWindows())
{
SkipReason =
"Test host is running with elevated (Administrators) privileges, but the " +
"OtOpcUaGalaxyHost named-pipe ACL explicitly denies Administrators per the IPC " +
"security design (decision #76 / PipeAcl.cs). Re-run from a NORMAL (non-admin) " +
"PowerShell window — even when your user is already in the pipe's allow list, " +
"the elevated token's Admins group membership trumps the allow rule.";
return;
}
// 1. AVEVA + OtOpcUa service state — actionable diagnostic if anything is missing.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
PrerequisiteReport = await AvevaPrerequisites.CheckAllAsync(
@@ -111,6 +133,28 @@ public sealed class LiveStackFixture : IAsyncLifetime
{
if (SkipReason is not null) Assert.Skip(SkipReason);
}
private static bool IsElevatedAdministratorOnWindows()
{
if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return false;
return CheckWindowsAdminToken();
}
[SupportedOSPlatform("windows")]
private static bool CheckWindowsAdminToken()
{
try
{
using var identity = WindowsIdentity.GetCurrent();
return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator);
}
catch
{
// Probe shouldn't crash the test; if we can't determine elevation, optimistically
// continue and let the actual pipe connect surface its own error.
return false;
}
}
}
[CollectionDefinition(Name)]


@@ -117,6 +117,141 @@ public sealed class LiveStackSmokeTests(LiveStackFixture fixture)
$"Investigate: the Host service's logs at {System.Environment.GetFolderPath(System.Environment.SpecialFolder.CommonApplicationData)}\\OtOpcUa\\Galaxy\\logs.");
}
[Fact]
public async Task Write_then_read_roundtrips_a_writable_Boolean_attribute_on_TestMachine_001()
{
// PR 40 — finishes LMX #5. Targets DelmiaReceiver_001.TestAttribute, the writable
// Boolean attribute on the TestMachine_001 hierarchy that the dev Galaxy was deployed
// with for exactly this kind of integration testing. We invert the current value and
// assert the new value comes back, then restore the original so the test is effectively
// idempotent (Galaxy holds the value across runs since it's a deployed UDA).
fixture.SkipIfUnavailable();
const string fullRef = "DelmiaReceiver_001.TestAttribute";
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
// Read current value first — gives the cleanup path the right baseline. Galaxy may
// return Uncertain quality until the Engine has scanned the attribute at least once;
// the check below only rejects the 0x80020000 failure status, and a null Value falls back to false.
var before = (await fixture.Driver!.ReadAsync([fullRef], cts.Token))[0];
before.StatusCode.ShouldNotBe(0x80020000u, $"baseline read failed for {fullRef}: {before.Value}");
var originalBool = Convert.ToBoolean(before.Value ?? false);
var inverted = !originalBool;
try
{
// Write the inverted value via IWritable.
var writeResults = await fixture.Driver!.WriteAsync(
[new(fullRef, inverted)], cts.Token);
writeResults.Count.ShouldBe(1);
writeResults[0].StatusCode.ShouldBe(0u,
$"WriteAsync returned status 0x{writeResults[0].StatusCode:X8} for {fullRef} — " +
$"check the Host service log at %ProgramData%\\OtOpcUa\\Galaxy\\.");
// The Engine's scan + acknowledgement is async — read in a short loop with a 5s
// budget. Galaxy's attribute roundtrip on a dev box is typically sub-second but
// we give headroom for first-scan after a service restart.
DataValueSnapshot after = default!;
var deadline = DateTime.UtcNow.AddSeconds(5);
while (DateTime.UtcNow < deadline)
{
after = (await fixture.Driver!.ReadAsync([fullRef], cts.Token))[0];
if (after.StatusCode == 0u && Convert.ToBoolean(after.Value ?? false) == inverted) break;
await Task.Delay(200, cts.Token);
}
after.StatusCode.ShouldBe(0u, "post-write read failed");
Convert.ToBoolean(after.Value ?? false).ShouldBe(inverted,
$"Wrote {inverted} but Galaxy returned {after.Value} after the scan window.");
}
finally
{
// Restore — best-effort. If this throws the test still reports its primary result;
// we just leave a flipped TestAttribute on the dev box (benign, name says it all).
try { await fixture.Driver!.WriteAsync([new(fullRef, originalBool)], cts.Token); }
catch { /* swallow */ }
}
}
[Fact]
public async Task Subscribe_fires_OnDataChange_with_initial_value_then_again_after_a_write()
{
// Subscribe + write is the canonical "is the data path actually live" test for
// an OPC UA driver. We subscribe to the same Boolean attribute, expect an initial-
// value callback within a couple of seconds (per ISubscribable's contract — the
// driver MAY fire OnDataChange immediately with the current value), then write a
// distinct value and expect a second callback carrying the new value.
fixture.SkipIfUnavailable();
const string fullRef = "DelmiaReceiver_001.TestAttribute";
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
// Capture every OnDataChange notification for this fullRef onto a thread-safe queue
// we can poll from the test thread. Galaxy's MXAccess advisory fires on its own
// thread; we don't want to block it.
var notifications = new System.Collections.Concurrent.ConcurrentQueue<DataValueSnapshot>();
void Handler(object? sender, DataChangeEventArgs e)
{
if (string.Equals(e.FullReference, fullRef, StringComparison.OrdinalIgnoreCase))
notifications.Enqueue(e.Snapshot);
}
fixture.Driver!.OnDataChange += Handler;
// Read current value so we know which value to write to force a transition.
var before = (await fixture.Driver!.ReadAsync([fullRef], cts.Token))[0];
var originalBool = Convert.ToBoolean(before.Value ?? false);
var toWrite = !originalBool;
ISubscriptionHandle? handle = null;
try
{
handle = await fixture.Driver!.SubscribeAsync(
[fullRef], TimeSpan.FromMilliseconds(250), cts.Token);
// Wait for initial-value notification — typical < 1s on a hot Galaxy, give 5s.
await WaitForAsync(() => notifications.Count >= 1, TimeSpan.FromSeconds(5), cts.Token);
notifications.Count.ShouldBeGreaterThanOrEqualTo(1,
$"No initial-value OnDataChange for {fullRef} within 5s. " +
$"Either MXAccess subscription failed silently or the Engine hasn't scanned yet.");
// Record the queue depth as a baseline before writing so we count post-write deltas only.
var initialCount = notifications.Count;
// Write the toggled value. Engine scan + advisory fires the second callback.
var w = await fixture.Driver!.WriteAsync([new(fullRef, toWrite)], cts.Token);
w[0].StatusCode.ShouldBe(0u);
await WaitForAsync(() => notifications.Count > initialCount, TimeSpan.FromSeconds(8), cts.Token);
notifications.Count.ShouldBeGreaterThan(initialCount,
$"OnDataChange did not fire after writing {toWrite} to {fullRef} within 8s.");
// Find the post-write notification carrying the toggled value (initial value may
// appear multiple times before the write commits — search the tail).
var postWrite = notifications.ToArray().Reverse()
.FirstOrDefault(n => n.StatusCode == 0u && Convert.ToBoolean(n.Value ?? false) == toWrite);
postWrite.ShouldNotBe(default,
$"No OnDataChange carrying the toggled value {toWrite} appeared in the queue: " +
string.Join(",", notifications.Select(n => $"{n.Value}@{n.StatusCode:X8}")));
}
finally
{
fixture.Driver!.OnDataChange -= Handler;
if (handle is not null)
{
try { await fixture.Driver!.UnsubscribeAsync(handle, cts.Token); } catch { /* swallow */ }
}
// Restore baseline.
try { await fixture.Driver!.WriteAsync([new(fullRef, originalBool)], cts.Token); } catch { /* swallow */ }
}
}
private static async Task WaitForAsync(Func<bool> predicate, TimeSpan budget, CancellationToken ct)
{
var deadline = DateTime.UtcNow + budget;
while (DateTime.UtcNow < deadline)
{
if (predicate()) return;
await Task.Delay(100, ct);
}
}
/// <summary>
/// Minimal <see cref="IAddressSpaceBuilder"/> implementation that captures every
/// Variable() call into a flat list so tests can inspect what discovery produced