fix(code-review): resolve Batch 3 wave A (OpcUaServer history/guard, ControlPlane topology gate)

- OpcUaServer-002: HistoryRead-Events NumValuesPerNode==0 now maps to unbounded (int.MaxValue) instead of the backend default-cap sentinel; no Core.Abstractions contract change (+EventMaxEvents helper tests)
- OpcUaServer-004: EnsureAddressSpaceCreated guard on public mutators -> clear InvalidOperationException instead of bare NRE if called pre-start (+tests)
- OpcUaServer-003: Deferred (endUtc inclusive/exclusive needs live Wonderware boundary confirmation)
- Configuration-013: wire DraftValidator.ValidateClusterTopology into AdminOperationsActor deploy gate (read-only, no migration) (+2 tests)
This commit is contained in:
Joseph Doherty
2026-06-20 22:53:29 -04:00
parent c817d7720e
commit 94eec70fb0
8 changed files with 455 additions and 13 deletions
+3 -3
View File
@@ -7,7 +7,7 @@
| Review date | 2026-06-19 (re-review; first reviewed 2026-05-22) |
| Commit reviewed | `7286d320` (re-review; was `76d35d1`) |
| Status | Reviewed |
| Open findings | 1 |
| Open findings | 0 |
## Checklist coverage
@@ -232,13 +232,13 @@ Prior findings Configuration-001…011 remain Resolved. Notable since the first
| Severity | Medium |
| Category | Design-document adherence |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Validation/DraftValidator.cs:243` (`ValidateClusterTopology`) |
| Status | Open |
| Status | Resolved |
**Description:** `DraftValidator.ValidateClusterTopology` is documented as the managed pre-publish guard that catches cluster-topology drift the SQL `CK_ServerCluster_RedundancyMode_NodeCount` check cannot see — specifically an operator disabling a `ClusterNode` (effective enabled-count = 1) while `RedundancyMode` stays `Hot`/`Warm`, which would boot the runtime into an invalid-topology band. It is fully unit-tested (`DraftValidatorTests` §"ValidateClusterTopology") but **no production code calls it.** The deploy gate in `AdminOperationsActor.StartDeployment` runs `DraftValidator.Validate(...)` (the snapshot rules) but never `ValidateClusterTopology(...)`, so the documented enabled-node-count guard is inert at deploy time — the only thing standing is the row-level SQL CHECK, which the doc explicitly says is insufficient.
**Recommendation:** Wire `ValidateClusterTopology` into the deploy/publish path — load the `ServerCluster` row(s) + their `ClusterNode`s and run it alongside `Validate`, folding its errors into the same reject summary. The fix belongs in `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/AdminOperationsActor.cs` (a different module), so it is **deferred from this module's edit scope** and recorded here against the now-dead Configuration-layer method. Cross-module: ControlPlane.
**Resolution:** _(open — fix is in the ControlPlane module's `AdminOperationsActor`, outside Configuration's edit scope)_
**Resolution:** Resolved 2026-06-20 — wired `DraftValidator.ValidateClusterTopology` into the deploy gate in the ControlPlane module's `AdminOperationsActor.HandleStartDeploymentAsync` (`src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/AdminOperationsActor.cs`). Immediately after the existing `DraftValidator.Validate(draft)` call, the handler now loads the `ServerCluster` rows (ClusterId-ordered for a deterministic summary) and their `ClusterNode`s from the **same** `db` context already open in the handler — read-only via `AsNoTracking()`, no second DbContext lifetime, no schema/migration or entity change — groups the nodes by `ClusterId`, and runs `ValidateClusterTopology(cluster, nodes)` per cluster. Its errors are appended to the SAME error list (`Validate(...)` now collected into a `List<ValidationError>`), so a deploy failing either the snapshot rules or the topology guard is rejected with both sets of messages folded into the single reject summary string; ordering stays deterministic (snapshot rules first, then per-cluster topology errors in ClusterId order). The previously-inert enabled-node-count guard (e.g. `RedundancyMode = Hot` with one `ClusterNode` toggled off, effective enabled-count = 1) now rejects at deploy time rather than relying solely on the row-level SQL CHECK the doc says is insufficient. New ControlPlane tests `AdminOperationsActorTests.StartDeployment_rejects_on_invalid_cluster_topology_disabled_node` (Hot + one disabled node → `Rejected` with `ClusterEnabledNodeCountMismatch`, no coordinator dispatch, no Deployment row) and `StartDeployment_accepts_when_cluster_topology_is_valid` (Hot + two enabled nodes → `Accepted`, no topology error, row inserted) pin the wiring; the rejecting test was confirmed red against the unwired handler before the fix. ControlPlane.Tests 62/62 green; the existing `DraftValidatorTests` §"ValidateClusterTopology" (Configuration.Tests 103/103) unchanged and still green.
### Configuration-014
+40 -7
View File
@@ -7,7 +7,7 @@
| Review date | 2026-06-19 |
| Commit reviewed | `7286d320` |
| Status | Reviewed |
| Open findings | 4 |
| Open findings | 1 |
## Checklist coverage
@@ -72,7 +72,7 @@ which is outside this module's edit boundary.
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `OtOpcUaNodeManager.cs:1748` (`HistoryReadEvents`), `OtOpcUaNodeManager.cs:1814` (`ClampToInt`) |
| Status | Open |
| Status | Resolved |
**Description:** For HistoryRead-Events, `HistoryReadEvents` passes
`ClampToInt(details.NumValuesPerNode)` to `IHistorianDataSource.ReadEventsAsync(maxEvents)` and
@@ -93,7 +93,21 @@ backend truncation and surface a continuation point / `GoodMoreData` for events.
event backends (cross-module, Core.Abstractions contract); option (b) requires the backend to report
truncation. Both cross this module's boundary.
**Resolution:** _(Open — deferred: rooted in the cross-module `IHistoryProvider.ReadEventsAsync` `maxEvents <= 0` sentinel contract (Core.Abstractions-006) and the Wonderware/OpcUaClient event backends; cannot be fixed safely inside OpcUaServer alone.)_
**Resolution:** Resolved — 2026-06-20 (SHA pending): fixed locally inside OpcUaServer without touching
the cross-module `IHistoryProvider.ReadEventsAsync` `maxEvents <= 0` sentinel. Added a small testable
`internal static int EventMaxEvents(uint numValuesPerNode)` helper next to `ClampToInt` that translates
the OPC UA Part 4/11 "no limit" request (`NumValuesPerNode == 0`) to UNBOUNDED (`int.MaxValue`, a very
large positive cap) rather than the backend's `<= 0` "use the default cap" sentinel; a positive value
still passes through `ClampToInt` unchanged. `HistoryReadEvents` now calls `EventMaxEvents(details.NumValuesPerNode)`
instead of `ClampToInt(details.NumValuesPerNode)`, so a "give me the whole window" events read is no
longer silently truncated at the backend default. The sentinel contract + the Wonderware/OpcUaClient
backends are untouched (a positive `int.MaxValue` is never the `<= 0` sentinel). Tests:
`NodeManagerEventMaxEventsTests` (helper purity — `0u→int.MaxValue`, normal passthrough,
`>int.MaxValue→int.MaxValue` clamp, exact-`int.MaxValue` boundary) plus
`NodeManagerHistoryReadEventsTests.Events_unbounded_request_passes_int_max_to_backend` (the recording fake
`IHistorianDataSource` receives `int.MaxValue` when `NumValuesPerNode == 0`). Note: option (b) — surfacing
a continuation point / `GoodMoreData` on backend truncation — remains a cross-module/backend change and is
out of scope; option (a) here removes the silent-truncation defect for the common "all events" request.
### OpcUaServer-003
@@ -102,7 +116,7 @@ truncation. Both cross this module's boundary.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `OtOpcUaNodeManager.cs:1978` (`ServeRawPaged`), `HistoryPaging.cs` (whole), `HistoryPaging.cs:213` (`SliceTieCluster` `next <= endUtc`) |
| Status | Open |
| Status | Deferred |
**Description:** The Raw paging chain treats `endUtc` as an **inclusive** upper bound throughout —
the `HistoryContinuationState`/`HistoryPaging` XML docs all say "the original (inclusive) end of
@@ -126,7 +140,14 @@ the inclusive/exclusive question requires confirming the Wonderware backend's ac
semantics (cross-module / infra), and changing a comparison without that confirmation risks the
opposite off-by-one.
**Resolution:** _(Open — deferred: needs the backend's authoritative endUtc boundary semantics confirmed before the comparison/doc is changed; flipping it blindly risks an off-by-one in the other direction.)_
**Resolution:** Deferred — 2026-06-20: infra-gated. Resolving the `endUtc` inclusive-vs-exclusive
disagreement requires confirming the actual Wonderware historian backend's boundary semantics, which is
hardware/infra-gated and not reachable from this macOS dev host. The impact is benign and bounded — because
the backend is the authority on which samples exist (a sample at exactly `endUtc` never appears in an
exclusive-end read), the disagreement only ever yields ONE extra empty resume page (`[endUtc, endUtc)`
GoodNoData, no continuation point) rather than any duplicated or dropped data. Changing the
`SliceTieCluster` comparison / paging XML docs without confirming the live backend boundary risks
introducing the opposite off-by-one, so no code is changed here pending that live confirmation.
### OpcUaServer-004
@@ -135,7 +156,7 @@ opposite off-by-one.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `OtOpcUaNodeManager.cs:1597` (`ResolveParentFolder`), and every public sink mutator that calls it (`EnsureFolder` 1278, `EnsureVariable` 1335, `MaterialiseAlarmCondition` 597, plus `WriteValue`/`WriteAlarmCondition` `CreateVariable`) |
| Status | Open |
| Status | Resolved |
**Description:** `ResolveParentFolder` dereferences `_root!` with the null-forgiving operator, and
`CreateVariable` uses `_root` (`AddChild`). `_root` is only assigned in `CreateAddressSpace`, which
@@ -153,7 +174,19 @@ mutators, so a too-early call fails legibly instead of with a bare NRE. Low prio
hardening, not a live defect. Left Open to avoid an unscoped change to the mutator entry points on
this critical class without a regression scenario that reproduces the early-call ordering.
**Resolution:** _(Open — defensive-only; latent given current boot ordering. Deferred to avoid an unscoped guard-add across five mutators without a reproducing pre-start ordering scenario.)_
**Resolution:** Resolved — 2026-06-20 (SHA pending): added a private `EnsureAddressSpaceCreated()` helper
that throws `InvalidOperationException("OPC UA address space has not been created yet (server not started.)")`
when `_root` is null, and call it at the top of `ResolveParentFolder` and at every public address-space
mutator entry point (`WriteValue`, `WriteAlarmCondition`, `EnsureFolder`, `EnsureVariable`,
`MaterialiseAlarmCondition`) — right after argument validation, before any `_root` dereference. A too-early
call (a sink wired or a publish replayed before `StartAsync` drives `CreateAddressSpace`) now fails legibly
instead of with a bare NRE out of `ResolveParentFolder` / `CreateVariable`. Happy-path behaviour is
unchanged. The guard was test-feasible after all: `NodeManagerPreStartGuardTests` boots a real host,
borrows the live node manager's real `IServerInternal`, constructs a SECOND, never-started
`OtOpcUaNodeManager` from it (so `_root` is null), and asserts each of the four lock-taking mutators
(`EnsureFolder`/`EnsureVariable`/`WriteValue`/`MaterialiseAlarmCondition`) throws `InvalidOperationException`
(not NRE), with the folder case asserting the message text. (`WriteAlarmCondition`'s guard is identical and
sits on the same path; it is build-verified.) Full `OpcUaServer.Tests` suite green (284/284).
### OpcUaServer-005
@@ -173,7 +173,35 @@ public sealed class AdminOperationsActor : ReceiveActor
// committed/visible when the snapshot is read — operators seeing a spurious one should
// check ExternalIdReservation state before re-submitting.
var draft = await DraftSnapshotFactory.FromConfigDbAsync(db);
var errors = DraftValidator.Validate(draft);
var errors = DraftValidator.Validate(draft).ToList();
// Cluster-topology guard (decision #91 / task #148 part 2). The SQL
// CK_ServerCluster_RedundancyMode_NodeCount CHECK enforces the (NodeCount, RedundancyMode)
// pair on the row itself, but it cannot see the per-node ClusterNode.Enabled flag — an
// operator can disable a node (effective enabled-count = 1) while leaving RedundancyMode at
// Hot/Warm and the constraint stays green, which would boot the runtime into an
// InvalidTopology band. ValidateClusterTopology catches that drift, but it isn't carried on
// the generation-versioned DraftSnapshot (the cluster/node rows aren't versioned), so it must
// be run separately here against the live rows. Read-only (AsNoTracking); errors fold into the
// same reject summary alongside the snapshot rules so a deploy failing either check is
// rejected with both sets of messages. ClusterId-ordered for a deterministic summary.
var clusters = await db.ServerClusters
.AsNoTracking()
.OrderBy(c => c.ClusterId)
.ToListAsync();
var nodesByCluster = (await db.ClusterNodes
.AsNoTracking()
.ToListAsync())
.GroupBy(n => n.ClusterId, StringComparer.Ordinal)
.ToDictionary(g => g.Key, g => g.ToList(), StringComparer.Ordinal);
foreach (var cluster in clusters)
{
var nodes = nodesByCluster.TryGetValue(cluster.ClusterId, out var ns)
? (IReadOnlyList<ClusterNode>)ns
: [];
errors.AddRange(DraftValidator.ValidateClusterTopology(cluster, nodes));
}
if (errors.Count > 0)
{
var summary = string.Join("; ", errors.Select(e => $"[{e.Code}] {e.Message}"));
@@ -261,6 +261,7 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
public void WriteValue(string nodeId, object? value, OpcUaQuality quality, DateTime sourceTimestampUtc)
{
ArgumentException.ThrowIfNullOrEmpty(nodeId);
EnsureAddressSpaceCreated(); // OpcUaServer-004: fail legibly if called before the server started.
lock (Lock)
{
@@ -296,6 +297,7 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
{
ArgumentException.ThrowIfNullOrEmpty(alarmNodeId);
ArgumentNullException.ThrowIfNull(state);
EnsureAddressSpaceCreated(); // OpcUaServer-004: fail legibly if called before the server started.
// Look up + project under a SINGLE Lock so a concurrent RebuildAddressSpace can't clear
// _alarmConditions / detach the condition node between the lookup and the Set* calls.
@@ -584,6 +586,7 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
{
ArgumentException.ThrowIfNullOrEmpty(alarmNodeId);
ArgumentException.ThrowIfNullOrEmpty(displayName);
EnsureAddressSpaceCreated(); // OpcUaServer-004: fail legibly if called before the server started.
lock (Lock)
{
@@ -1280,6 +1283,7 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
{
ArgumentException.ThrowIfNullOrEmpty(folderNodeId);
ArgumentException.ThrowIfNullOrEmpty(displayName);
EnsureAddressSpaceCreated(); // OpcUaServer-004: fail legibly if called before the server started.
if (_folders.ContainsKey(folderNodeId)) return;
@@ -1336,6 +1340,7 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
{
ArgumentException.ThrowIfNullOrEmpty(variableNodeId);
ArgumentException.ThrowIfNullOrEmpty(displayName);
EnsureAddressSpaceCreated(); // OpcUaServer-004: fail legibly if called before the server started.
// If already present, leave it alone (idempotent re-applies).
if (_variables.ContainsKey(variableNodeId)) return;
@@ -1608,10 +1613,29 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
private FolderState ResolveParentFolder(string? parentNodeId)
{
EnsureAddressSpaceCreated();
if (string.IsNullOrEmpty(parentNodeId)) return _root!;
return _folders.TryGetValue(parentNodeId, out var existing) ? existing : _root!;
}
/// <summary>OpcUaServer-004: guard the address-space mutators against a too-early call. <c>_root</c>
/// is only assigned in <see cref="CreateAddressSpace"/>, which the SDK invokes during
/// <c>StandardServer</c> start; every public mutator (<see cref="WriteValue"/>,
/// <see cref="WriteAlarmCondition"/>, <see cref="EnsureFolder"/>, <see cref="EnsureVariable"/>,
/// <see cref="MaterialiseAlarmCondition"/>) and <see cref="ResolveParentFolder"/> assume it has run.
/// If one is invoked before the server has started (a sink wired or a publish replayed before
/// <c>StartAsync</c> completes), <c>_root</c> is null and the dereference would NRE; throw a legible
/// <see cref="InvalidOperationException"/> instead. Happy-path behaviour is unchanged.</summary>
/// <exception cref="InvalidOperationException">When the address space has not been created yet.</exception>
private void EnsureAddressSpaceCreated()
{
if (_root is null)
{
throw new InvalidOperationException(
"OPC UA address space has not been created yet (server not started).");
}
}
// ---------------------------------------------------------------------------------------------
// Phase C — OPC UA HistoryRead over historized variable nodes.
//
@@ -1790,8 +1814,11 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
sourceName,
details.StartTime,
details.EndTime,
// NumValuesPerNode is uint; ReadEventsAsync takes int (<=0 ⇒ backend default cap).
ClampToInt(details.NumValuesPerNode),
// OpcUaServer-002: NumValuesPerNode==0 means "no limit — return ALL values" per OPC UA
// Part 4/11, so translate it to UNBOUNDED (int.MaxValue) here. Passing the int<=0
// backend-default-cap sentinel instead would silently truncate a "give me everything"
// events read at the backend default. A positive cap passes through (clamped).
EventMaxEvents(details.NumValuesPerNode),
CancellationToken.None).GetAwaiter().GetResult();
var historyEvent = ProjectEvents(sourceResult.Events, selectClauses);
@@ -1825,6 +1852,18 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
/// <returns>The clamped non-negative int.</returns>
private static int ClampToInt(uint value) => value > int.MaxValue ? int.MaxValue : (int)value;
/// <summary>OpcUaServer-002: map a HistoryRead-Events <c>NumValuesPerNode</c> request cap onto the
/// <see cref="IHistorianDataSource.ReadEventsAsync"/> <c>maxEvents</c> argument, honouring the OPC UA
/// Part 4/11 semantics that <c>NumValuesPerNode == 0</c> means "no limit — return ALL values".
/// We translate 0 to UNBOUNDED (<see cref="int.MaxValue"/>) — a very large positive cap — rather than
/// the backend's <c>maxEvents &lt;= 0</c> "use the default cap" sentinel, so a client asking for the
/// whole window is not silently truncated at the backend default. A positive value passes through
/// clamped to <see cref="int.MaxValue"/> (mirroring <see cref="ClampToInt"/>).</summary>
/// <param name="numValuesPerNode">The request's <c>NumValuesPerNode</c> cap (0 ⇒ no limit).</param>
/// <returns><see cref="int.MaxValue"/> when 0 (unbounded); otherwise the clamped non-negative int.</returns>
internal static int EventMaxEvents(uint numValuesPerNode) =>
numValuesPerNode == 0 ? int.MaxValue : ClampToInt(numValuesPerNode);
/// <summary>
/// Project a sequence of <see cref="HistoricalEvent"/>s into an SDK <see cref="HistoryEvent"/> —
/// one <see cref="HistoryEventFieldList"/> per event, each carrying the requested
@@ -435,6 +435,123 @@ public sealed class AdminOperationsActorTests : ControlPlaneActorTestBase
reply.Message.ShouldContain("1 script(s) will compile");
}
/// <summary>Verifies the cluster-topology guard is wired into the deploy gate (Configuration-013):
/// a <see cref="RedundancyMode.Hot"/> cluster with only ONE enabled <see cref="ClusterNode"/>
/// (the second toggled off) is <see cref="StartDeploymentOutcome.Rejected"/> with the
/// <c>ClusterEnabledNodeCountMismatch</c> topology error in the message — no coordinator dispatch,
/// no Deployment row. The row-level SQL CHECK cannot see the disabled-node flag, so this proves the
/// managed <see cref="Configuration.Validation.DraftValidator.ValidateClusterTopology"/> guard runs
/// at deploy time rather than sitting inert.</summary>
[Fact]
public void StartDeployment_rejects_on_invalid_cluster_topology_disabled_node()
{
var dbFactory = NewInMemoryDbFactory();
using (var db = dbFactory.CreateDbContext())
{
db.ServerClusters.Add(new Configuration.Entities.ServerCluster
{
ClusterId = "LINE3-OPCUA",
Name = "Line 3",
Enterprise = "zb",
Site = "dev",
NodeCount = 2,
RedundancyMode = RedundancyMode.Hot, // declared 2 + Hot, but only 1 enabled below
CreatedBy = "seed",
});
db.ClusterNodes.Add(new Configuration.Entities.ClusterNode
{
NodeId = "LINE3-OPCUA-A",
ClusterId = "LINE3-OPCUA",
Host = "host-a",
ApplicationUri = "urn:line3:a",
Enabled = true,
CreatedBy = "seed",
});
db.ClusterNodes.Add(new Configuration.Entities.ClusterNode
{
NodeId = "LINE3-OPCUA-B",
ClusterId = "LINE3-OPCUA",
Host = "host-b",
ApplicationUri = "urn:line3:b",
Enabled = false, // toggled off → effective enabled-count = 1 while mode stays Hot
CreatedBy = "seed",
});
db.SaveChanges();
}
var coordinator = CreateTestProbe("coord");
var actor = Sys.ActorOf(AdminOperationsActor.Props(dbFactory, coordinator.Ref, Enumerable.Empty<IDriverProbe>()));
actor.Tell(new StartDeployment("joe", CorrelationId.NewId()));
coordinator.ExpectNoMsg(TimeSpan.FromMilliseconds(500));
var reply = ExpectMsg<StartDeploymentResult>(TimeSpan.FromSeconds(3));
reply.Outcome.ShouldBe(StartDeploymentOutcome.Rejected);
reply.Message.ShouldNotBeNull();
reply.Message.ShouldContain("ClusterEnabledNodeCountMismatch");
using var verify = dbFactory.CreateDbContext();
verify.Deployments.Count().ShouldBe(0);
}
/// <summary>Verifies the topology guard does NOT spuriously reject a well-formed cluster: a
/// <see cref="RedundancyMode.Hot"/> cluster whose two <see cref="ClusterNode"/>s are both enabled
/// passes the topology check, so a deploy of an otherwise-valid config is
/// <see cref="StartDeploymentOutcome.Accepted"/> with no topology error in the message and a row
/// inserted. Pairs with the rejecting test to prove the guard is discriminating, not blanket.</summary>
[Fact]
public void StartDeployment_accepts_when_cluster_topology_is_valid()
{
var dbFactory = NewInMemoryDbFactory();
using (var db = dbFactory.CreateDbContext())
{
db.ServerClusters.Add(new Configuration.Entities.ServerCluster
{
ClusterId = "LINE3-OPCUA",
Name = "Line 3",
Enterprise = "zb",
Site = "dev",
NodeCount = 2,
RedundancyMode = RedundancyMode.Hot,
CreatedBy = "seed",
});
db.ClusterNodes.Add(new Configuration.Entities.ClusterNode
{
NodeId = "LINE3-OPCUA-A",
ClusterId = "LINE3-OPCUA",
Host = "host-a",
ApplicationUri = "urn:line3:a",
Enabled = true,
CreatedBy = "seed",
});
db.ClusterNodes.Add(new Configuration.Entities.ClusterNode
{
NodeId = "LINE3-OPCUA-B",
ClusterId = "LINE3-OPCUA",
Host = "host-b",
ApplicationUri = "urn:line3:b",
Enabled = true, // both enabled → matches declared NodeCount=2 + Hot
CreatedBy = "seed",
});
db.SaveChanges();
}
var coordinator = CreateTestProbe("coord");
var actor = Sys.ActorOf(AdminOperationsActor.Props(dbFactory, coordinator.Ref, Enumerable.Empty<IDriverProbe>()));
actor.Tell(new StartDeployment("joe", CorrelationId.NewId()));
coordinator.ExpectMsg<DispatchDeployment>(TimeSpan.FromSeconds(3));
var reply = ExpectMsg<StartDeploymentResult>(TimeSpan.FromSeconds(3));
reply.Outcome.ShouldBe(StartDeploymentOutcome.Accepted);
(reply.Message is null || !reply.Message.Contains("ClusterEnabledNodeCountMismatch")).ShouldBeTrue();
(reply.Message is null || !reply.Message.Contains("ClusterRedundancyModeInvalid")).ShouldBeTrue();
using var verify = dbFactory.CreateDbContext();
verify.Deployments.Count().ShouldBe(1);
}
/// <summary>Verifies that starting a deployment is refused when another is in flight.</summary>
[Fact]
public void StartDeployment_refuses_when_another_is_in_flight()
@@ -0,0 +1,45 @@
using Shouldly;
using Xunit;
namespace ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests;
/// <summary>
/// OpcUaServer-002 — unit coverage for <see cref="OtOpcUaNodeManager.EventMaxEvents"/>, the pure
/// helper that maps a HistoryRead-Events <c>NumValuesPerNode</c> request cap onto the
/// <c>IHistorianDataSource.ReadEventsAsync</c> <c>maxEvents</c> argument. Per OPC UA Part 4/11,
/// <c>NumValuesPerNode == 0</c> means "no limit — return ALL values", so the helper translates 0 to
/// UNBOUNDED (<see cref="int.MaxValue"/>) rather than the backend's <c>maxEvents &lt;= 0</c>
/// "use the default cap" sentinel; a positive value passes through clamped to <see cref="int.MaxValue"/>.
/// </summary>
public sealed class NodeManagerEventMaxEventsTests
{
/// <summary>0 ("no limit" per the spec) ⇒ int.MaxValue (unbounded), NOT the 0/default-cap sentinel.</summary>
[Fact]
public void Zero_maps_to_int_max()
{
OtOpcUaNodeManager.EventMaxEvents(0u).ShouldBe(int.MaxValue);
}
/// <summary>A normal positive cap passes through unchanged.</summary>
[Fact]
public void Normal_value_passes_through()
{
OtOpcUaNodeManager.EventMaxEvents(50u).ShouldBe(50);
OtOpcUaNodeManager.EventMaxEvents(1u).ShouldBe(1);
}
/// <summary>A value above int.MaxValue clamps to int.MaxValue (mirrors ClampToInt's saturation).</summary>
[Fact]
public void Value_above_int_max_clamps()
{
OtOpcUaNodeManager.EventMaxEvents((uint)int.MaxValue + 1u).ShouldBe(int.MaxValue);
OtOpcUaNodeManager.EventMaxEvents(uint.MaxValue).ShouldBe(int.MaxValue);
}
/// <summary>int.MaxValue exactly passes through (boundary — not clamped down).</summary>
[Fact]
public void Int_max_exactly_passes_through()
{
OtOpcUaNodeManager.EventMaxEvents((uint)int.MaxValue).ShouldBe(int.MaxValue);
}
}
@@ -94,6 +94,44 @@ public sealed class NodeManagerHistoryReadEventsTests : IDisposable
await host.DisposeAsync();
}
/// <summary>OpcUaServer-002: a HistoryReadEvents with <c>NumValuesPerNode == 0</c> means "no limit —
/// return ALL values" per OPC UA Part 4/11, so the backend must receive an UNBOUNDED cap
/// (<see cref="int.MaxValue"/>), NOT the <c>maxEvents &lt;= 0</c> "use the default cap" sentinel that
/// would silently truncate a whole-window read.</summary>
[Fact]
public async Task Events_unbounded_request_passes_int_max_to_backend()
{
var (host, server) = await BootAsync();
var nm = server.NodeManager!;
var fake = new RecordingHistorianDataSource();
nm.HistorianDataSource = fake;
const string equipmentId = "eq-unbounded";
nm.EnsureFolder(equipmentId, parentNodeId: null, displayName: "Equipment");
nm.MaterialiseAlarmCondition("alarm-0", equipmentId, "Cond", "OffNormalAlarm", severity: 600);
var notifierNodeId = nm.TryGetFolder(equipmentId)!.NodeId;
fake.EventsResult = new HistoricalEventsResult(
new[] { new HistoricalEvent("evt-x", "Src", DateTime.UtcNow, DateTime.UtcNow, "msg", 600) }, null);
var details = new ReadEventDetails
{
StartTime = DateTime.UtcNow.AddHours(-1),
EndTime = DateTime.UtcNow,
// 0 ⇒ "no limit" — the override must translate this to int.MaxValue for the backend.
NumValuesPerNode = 0,
Filter = SelectFilter("EventId"),
};
var (_, errors) = InvokeHistoryRead(server, nm, details, notifierNodeId);
errors[0].StatusCode.Code.ShouldBe(StatusCodes.Good);
// The backend saw the unbounded cap, NOT the 0/default-cap sentinel.
fake.LastMaxEvents.ShouldBe(int.MaxValue);
await host.DisposeAsync();
}
/// <summary>An unsupported select operand (BrowsePath ["EventType"]) projects to Variant.Null — a field
/// the server can't supply is null (spec-conformant) — while supported siblings still project.</summary>
[Fact]
@@ -0,0 +1,142 @@
using Microsoft.Extensions.Logging.Abstractions;
using Opc.Ua;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Commons.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests;
/// <summary>
/// OpcUaServer-004 — a fresh <see cref="OtOpcUaNodeManager"/> whose <c>CreateAddressSpace</c> has NOT
/// yet run (i.e. the server has not started) has a null <c>_root</c>. Every public address-space mutator
/// (<see cref="OtOpcUaNodeManager.WriteValue"/>, <see cref="OtOpcUaNodeManager.WriteAlarmCondition"/>,
/// <see cref="OtOpcUaNodeManager.EnsureFolder"/>, <see cref="OtOpcUaNodeManager.EnsureVariable"/>,
/// <see cref="OtOpcUaNodeManager.MaterialiseAlarmCondition"/>) must now fail with a legible
/// <see cref="InvalidOperationException"/> instead of a bare NRE out of <c>ResolveParentFolder</c> /
/// <c>CreateVariable</c>.
/// <para>
/// The node manager's ctor needs a real <see cref="Opc.Ua.Server.IServerInternal"/> +
/// <see cref="ApplicationConfiguration"/>, which only the SDK boot produces — so we boot a real
/// <see cref="OpcUaApplicationHost"/>, borrow those two from the LIVE (already-started) node manager
/// (its public <c>Server</c> + <c>Server.Configuration</c>), then construct a SECOND, fresh node
/// manager from them. That second manager never had <c>CreateAddressSpace</c> driven, so it
/// reproduces the pre-start ordering hazard exactly.
/// </para>
/// </summary>
public sealed class NodeManagerPreStartGuardTests : IDisposable
{
private static CancellationToken Ct => TestContext.Current.CancellationToken;
private readonly string _pkiRoot = Path.Combine(
Path.GetTempPath(),
$"otopcua-prestartguard-{Guid.NewGuid():N}");
[Fact]
public async Task EnsureFolder_before_CreateAddressSpace_throws_InvalidOperationException()
{
var (host, nm) = await BuildPreStartNodeManagerAsync();
try
{
var ex = Should.Throw<InvalidOperationException>(() =>
nm.EnsureFolder("eq-1", parentNodeId: null, displayName: "Equipment"));
ex.Message.ShouldContain("address space has not been created");
}
finally
{
await host.DisposeAsync();
}
}
[Fact]
public async Task EnsureVariable_before_CreateAddressSpace_throws_InvalidOperationException()
{
var (host, nm) = await BuildPreStartNodeManagerAsync();
try
{
Should.Throw<InvalidOperationException>(() =>
nm.EnsureVariable("eq-1/temp", parentFolderNodeId: null, displayName: "Temp",
dataType: "Float", writable: false));
}
finally
{
await host.DisposeAsync();
}
}
[Fact]
public async Task WriteValue_before_CreateAddressSpace_throws_InvalidOperationException()
{
var (host, nm) = await BuildPreStartNodeManagerAsync();
try
{
Should.Throw<InvalidOperationException>(() =>
nm.WriteValue("eq-1/temp", 1.0, OpcUaQuality.Good, DateTime.UtcNow));
}
finally
{
await host.DisposeAsync();
}
}
[Fact]
public async Task MaterialiseAlarmCondition_before_CreateAddressSpace_throws_InvalidOperationException()
{
var (host, nm) = await BuildPreStartNodeManagerAsync();
try
{
Should.Throw<InvalidOperationException>(() =>
nm.MaterialiseAlarmCondition("alarm-1", "eq-1", "Cond", "OffNormalAlarm", severity: 500));
}
finally
{
await host.DisposeAsync();
}
}
/// <summary>Boot a real host, borrow the live node manager's real
/// <see cref="Opc.Ua.Server.IServerInternal"/>, then construct a SECOND node manager from it (with a
/// fresh <see cref="ApplicationConfiguration"/> — the ctor only records it + sets namespaces) that has
/// NEVER had <c>CreateAddressSpace</c> driven (so <c>_root</c> is null). The host is returned so the
/// caller disposes it after exercising the guard.</summary>
private async Task<(OpcUaApplicationHost Host, OtOpcUaNodeManager NodeManager)> BuildPreStartNodeManagerAsync()
{
var host = new OpcUaApplicationHost(
new OpcUaApplicationHostOptions
{
ApplicationName = "OtOpcUa.PreStartGuardTest",
ApplicationUri = $"urn:OtOpcUa.PreStartGuardTest:{Guid.NewGuid():N}",
OpcUaPort = AllocateFreePort(),
PublicHostname = "localhost",
PkiStoreRoot = _pkiRoot,
},
NullLogger<OpcUaApplicationHost>.Instance);
var server = new OtOpcUaSdkServer();
await host.StartAsync(server, Ct);
var live = server.NodeManager!;
// Borrow the SDK's real IServerInternal from the live manager and build a brand-new node manager —
// CreateAddressSpace has not been driven on THIS instance, so _root is null and every mutator must
// hit the EnsureAddressSpaceCreated guard.
var fresh = new OtOpcUaNodeManager(live.Server, new ApplicationConfiguration());
return (host, fresh);
}
private static int AllocateFreePort()
{
using var listener = new System.Net.Sockets.TcpListener(System.Net.IPAddress.Loopback, 0);
listener.Start();
var port = ((System.Net.IPEndPoint)listener.LocalEndpoint).Port;
listener.Stop();
return port;
}
public void Dispose()
{
if (Directory.Exists(_pkiRoot))
{
try { Directory.Delete(_pkiRoot, recursive: true); }
catch { /* best-effort cleanup */ }
}
}
}