Phase 2 Stream D Option B: archive v1 surface + new Driver.Galaxy.E2E parity suite

Non-destructive intermediate state: the v1 OtOpcUa.Host + Historian.Aveva + Tests + IntegrationTests projects all still build (494 v1 unit + 6 v1 integration tests still pass when run explicitly), but solution-level dotnet test ZB.MOM.WW.OtOpcUa.slnx now skips them via IsTestProject=false on the test projects + archive-status PropertyGroup comments on the src projects. The destructive deletion is reserved for Phase 2 PR 3 with explicit operator review per CLAUDE.md "only use destructive operations when truly the best approach".

tests/ZB.MOM.WW.OtOpcUa.Tests/ renamed via git mv to tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/; csproj <AssemblyName> kept as the original ZB.MOM.WW.OtOpcUa.Tests so v1 OtOpcUa.Host's [InternalsVisibleTo("ZB.MOM.WW.OtOpcUa.Tests")] still matches and the project rebuilds clean. tests/ZB.MOM.WW.OtOpcUa.IntegrationTests gets <IsTestProject>false</IsTestProject>. src/ZB.MOM.WW.OtOpcUa.Host + src/ZB.MOM.WW.OtOpcUa.Historian.Aveva get PropertyGroup archive-status comments documenting that they're functionally superseded but kept in-build because cascading dependencies (Historian.Aveva → Host; IntegrationTests → Host) make a single-PR deletion high blast-radius.

New tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ project (.NET 10) with ParityFixture that spawns OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a Process.Start subprocess with OTOPCUA_GALAXY_BACKEND=db env vars, waits 2s for the PipeServer to bind, then exposes a connected GalaxyProxyDriver. It skips on non-Windows / Administrator shells (PipeAcl denies admins per decision #76) / ZB unreachable / Host EXE not built; each skip carries a SkipReason string the test method reads via Assert.Skip(SkipReason). RecordingAddressSpaceBuilder captures every Folder/Variable/AddProperty registration so parity tests can assert on the same shape v1 LmxNodeManager produced.

HierarchyParityTests (3): Discover returns gobjects with attributes; attribute full references match the tag.attribute Galaxy reference grammar; HistoryExtension flag flows through correctly.

StabilityFindingsRegressionTests (4), one test per 2026-04-13 stability finding from commits c76ab8f and 7310925: phantom probe subscription doesn't corrupt unrelated host status; HostStatusChangedEventArgs structurally carries a specific HostName + OldState + NewState (event signature mathematically prevents the v1 cross-host quality-clear bug); all GalaxyProxyDriver capability methods return Task or Task<T> (sync-over-async would deadlock OPC UA stack thread); AcknowledgeAsync completes before returning (no fire-and-forget background work that could race shutdown).

Solution test count: 470 pass / 7 skip (E2E on admin shell) / 1 pre-existing Phase 0 baseline. Run archived suites explicitly: dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive (494 pass) + dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests (6 pass).

docs/v2/V1_ARCHIVE_STATUS.md inventories every archived surface with run-it-explicitly instructions + a 10-step deletion plan for PR 3 + rollback procedure (git revert restores all four projects). docs/v2/implementation/exit-gate-phase-2-final.md supersedes the two partial-exit docs with the per-stream status table (A/B/C/D/E all addressed, D split across PR 2/3 per safety protocol), the test count breakdown, fresh adversarial review of PR 2 deltas (4 new findings: medium IsTestProject=false safety net loss, medium structural-vs-behavioral stability tests, low backend=db default, low Process.Start env inheritance), the 8 carried-forward findings from exit-gate-phase-2.md, and the recommended PR order (1 → 2 → 3 → 4). docs/v2/implementation/pr-2-body.md is the Gitea web-UI paste-in for opening PR 2 once pushed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,104 @@
using System;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Tests.Helpers;

namespace ZB.MOM.WW.OtOpcUa.Tests.OpcUa
{
    /// <summary>
    /// Verifies that subscription and unsubscription failures in the MXAccess client
    /// are handled gracefully by the node manager instead of silently lost.
    /// </summary>
    public class LmxNodeManagerSubscriptionFaultTests
    {
        /// <summary>
        /// Confirms that a faulted SubscribeAsync is caught and logged rather than silently discarded.
        /// </summary>
        [Fact]
        public async Task SubscribeTag_WhenClientFaults_DoesNotThrowAndDoesNotHang()
        {
            var mxClient = new FakeMxAccessClient
            {
                SubscribeException = new InvalidOperationException("COM connection lost")
            };
            var fixture = OpcUaServerFixture.WithFakeMxAccessClient(mxClient);
            await fixture.InitializeAsync();
            try
            {
                var nodeManager = fixture.Service.NodeManagerInstance!;

                // SubscribeTag should catch the fault: not throw and not hang
                Should.NotThrow(() => nodeManager.SubscribeTag("TestMachine_001.MachineID"));
            }
            finally
            {
                await fixture.DisposeAsync();
            }
        }

        /// <summary>
        /// Confirms that a faulted UnsubscribeAsync is caught and logged rather than silently discarded.
        /// </summary>
        [Fact]
        public async Task UnsubscribeTag_WhenClientFaults_DoesNotThrowAndDoesNotHang()
        {
            var mxClient = new FakeMxAccessClient();
            var fixture = OpcUaServerFixture.WithFakeMxAccessClient(mxClient);
            await fixture.InitializeAsync();
            try
            {
                var nodeManager = fixture.Service.NodeManagerInstance!;

                // Subscribe first (succeeds)
                nodeManager.SubscribeTag("TestMachine_001.MachineID");
                mxClient.ActiveSubscriptionCount.ShouldBe(1);

                // Now inject fault for unsubscribe
                mxClient.UnsubscribeException = new InvalidOperationException("COM connection lost");

                // UnsubscribeTag should catch the fault: not throw and not hang
                Should.NotThrow(() => nodeManager.UnsubscribeTag("TestMachine_001.MachineID"));
            }
            finally
            {
                await fixture.DisposeAsync();
            }
        }

        /// <summary>
        /// Confirms that subscription failure does not corrupt the ref-count bookkeeping,
        /// allowing a retry to succeed after the fault clears.
        /// </summary>
        [Fact]
        public async Task SubscribeTag_AfterFaultClears_CanSubscribeAgain()
        {
            var mxClient = new FakeMxAccessClient
            {
                SubscribeException = new InvalidOperationException("transient fault")
            };
            var fixture = OpcUaServerFixture.WithFakeMxAccessClient(mxClient);
            await fixture.InitializeAsync();
            try
            {
                var nodeManager = fixture.Service.NodeManagerInstance!;

                // First subscribe faults (caught)
                nodeManager.SubscribeTag("TestMachine_001.MachineID");
                mxClient.ActiveSubscriptionCount.ShouldBe(0); // subscribe failed

                // Clear the fault
                mxClient.SubscribeException = null;

                // Unsubscribe to reset ref count, then subscribe again
                nodeManager.UnsubscribeTag("TestMachine_001.MachineID");
                nodeManager.SubscribeTag("TestMachine_001.MachineID");
                mxClient.ActiveSubscriptionCount.ShouldBe(1);
            }
            finally
            {
                await fixture.DisposeAsync();
            }
        }
    }
}