fix(core): resolve Low code-review findings (Core-004,008,009,010,011,012)

- Core-004: add ConfigureAwait(false) to DriverHost.RegisterAsync / UnregisterAsync / DisposeAsync.
- Core-008: rewrite the BuildAddressSpaceAsync XML doc to correctly name the caller (OpcUaApplicationHost.PopulateAddressSpaces) that owns the per-driver isolation.
- Core-009: snapshot DriverResilienceOptions once per non-idempotent write in CapabilityInvoker.ExecuteWriteAsync.
- Core-010: switch DriverResilienceOptions.Resolve to TryGetValue with a diagnostic error message when a tier table is missing a capability.
- Core-011: add an optional diagnostic callback to PermissionTrieBuilder so production callers can surface scope-path mismatches.
- Core-012: correct the stale WedgeDetector ctor summary and add the Reconnecting row to DriverHealthReport's state matrix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -7,7 +7,7 @@
|
|||||||
| Review date | 2026-05-22 |
|
| Review date | 2026-05-22 |
|
||||||
| Commit reviewed | `76d35d1` |
|
| Commit reviewed | `76d35d1` |
|
||||||
| Status | Reviewed |
|
| Status | Reviewed |
|
||||||
| Open findings | 6 |
|
| Open findings | 0 |
|
||||||
|
|
||||||
## Checklist coverage
|
## Checklist coverage
|
||||||
|
|
||||||
@@ -78,13 +78,13 @@
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | OtOpcUa conventions |
|
| Category | OtOpcUa conventions |
|
||||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Hosting/DriverHost.cs:55,72,87` |
|
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Hosting/DriverHost.cs:55,72,87` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** `DriverHost` is a library type whose async calls (`driver.InitializeAsync`, `driver.ShutdownAsync`) do not use `ConfigureAwait(false)`, whereas the sibling `CapabilityInvoker` and `AlarmSurfaceInvoker` in the same module consistently do. The server host has no synchronization context so behaviour is currently correct, but the inconsistency is a maintenance hazard and a deviation from the established convention in `Core.Resilience`.
|
**Description:** `DriverHost` is a library type whose async calls (`driver.InitializeAsync`, `driver.ShutdownAsync`) do not use `ConfigureAwait(false)`, whereas the sibling `CapabilityInvoker` and `AlarmSurfaceInvoker` in the same module consistently do. The server host has no synchronization context so behaviour is currently correct, but the inconsistency is a maintenance hazard and a deviation from the established convention in `Core.Resilience`.
|
||||||
|
|
||||||
**Recommendation:** Add `.ConfigureAwait(false)` to the three awaited calls in `DriverHost.RegisterAsync`, `UnregisterAsync`, and `DisposeAsync`.
|
**Recommendation:** Add `.ConfigureAwait(false)` to the three awaited calls in `DriverHost.RegisterAsync`, `UnregisterAsync`, and `DisposeAsync`.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — added `.ConfigureAwait(false)` to the three awaited driver calls in `RegisterAsync`, `UnregisterAsync`, and `DisposeAsync`; added three `RegisterAsync/UnregisterAsync/DisposeAsync_Does_Not_Capture_SynchronizationContext` regression tests that install a tracking `SynchronizationContext` on a dedicated thread and assert zero captured posts.
|
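The Core-004 resolution can be illustrated with a small sketch of the regression-test idea it describes: install a tracking `SynchronizationContext` on a dedicated thread and count how many continuations are posted back to it. Every name below (`CountingSyncContext`, `ConfigureAwaitSketch`, `FakeDriverCallAsync`) is hypothetical and stands in for the real `DriverHost` and driver types, which are not reproduced here.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical tracking context: counts every continuation posted back to it.
public sealed class CountingSyncContext : SynchronizationContext
{
    private int _posts;
    public int Posts => Volatile.Read(ref _posts);

    public override void Post(SendOrPostCallback d, object? state)
    {
        Interlocked.Increment(ref _posts);
        // Run the callback on the thread pool so a captured continuation cannot deadlock.
        ThreadPool.QueueUserWorkItem(_ => d(state));
    }
}

public static class ConfigureAwaitSketch
{
    // Stand-in for an awaited driver call that does not complete synchronously.
    private static Task FakeDriverCallAsync() => Task.Delay(10);

    // Library-style method: the continuation does not need the caller's context.
    public static async Task LibraryMethodAsync()
    {
        await FakeDriverCallAsync().ConfigureAwait(false);
    }

    // Runs the method on a dedicated thread that has the tracking context installed
    // and returns the number of continuations that were posted to that context.
    public static int CountCapturedPosts()
    {
        var posts = -1;
        var thread = new Thread(() =>
        {
            var context = new CountingSyncContext();
            SynchronizationContext.SetSynchronizationContext(context);
            LibraryMethodAsync().GetAwaiter().GetResult();
            posts = context.Posts;
        });
        thread.Start();
        thread.Join();
        return posts;
    }
}
```

With `ConfigureAwait(false)` in place the count should stay at zero; removing it from `LibraryMethodAsync` should make the same harness report a captured post, which is what makes this shape usable as a regression check.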
||||||
|
|
||||||
### Core-005
|
### Core-005
|
||||||
|
|
||||||
@@ -138,13 +138,13 @@
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Error handling & resilience |
|
| Category | Error handling & resilience |
|
||||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/OpcUa/GenericDriverNodeManager.cs:42-64` |
|
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/OpcUa/GenericDriverNodeManager.cs:42-64` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** The XML summary of `BuildAddressSpaceAsync` states "Driver exceptions are isolated per decision #12 — the driver's subtree is marked Faulted, but other drivers remain available." The method body contains no such isolation: an exception from `discovery.DiscoverAsync` propagates straight out unhandled, and nothing here marks a subtree Faulted. The isolation is presumably done by the server-layer caller, but the comment asserts behaviour this class does not implement.
|
**Description:** The XML summary of `BuildAddressSpaceAsync` states "Driver exceptions are isolated per decision #12 — the driver's subtree is marked Faulted, but other drivers remain available." The method body contains no such isolation: an exception from `discovery.DiscoverAsync` propagates straight out unhandled, and nothing here marks a subtree Faulted. The isolation is presumably done by the server-layer caller, but the comment asserts behaviour this class does not implement.
|
||||||
|
|
||||||
**Recommendation:** Either implement the documented isolation in `GenericDriverNodeManager`, or correct the XML doc to state that exception isolation is the caller's responsibility and name the type that performs it.
|
**Recommendation:** Either implement the documented isolation in `GenericDriverNodeManager`, or correct the XML doc to state that exception isolation is the caller's responsibility and name the type that performs it.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — corrected the `BuildAddressSpaceAsync` XML doc to (a) explicitly state exception isolation is the caller's responsibility, and (b) name the type that performs it (`Server.OpcUa.OpcUaApplicationHost.PopulateAddressSpaces`); added `BuildAddressSpaceAsync_Propagates_Discovery_Exceptions_To_Caller` regression test verifying the documented propagation behaviour.
|
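As a rough illustration of the division of responsibility described in Core-008, the sketch below shows a caller-side loop that isolates each driver's build step in its own try/catch while the callee simply lets exceptions propagate. `IsolationSketch` and `PopulateAll` are hypothetical names; the actual `OpcUaApplicationHost.PopulateAddressSpaces` implementation is not shown in this commit and may differ.

```csharp
using System;
using System.Collections.Generic;

public static class IsolationSketch
{
    // Hypothetical caller-side loop: each driver's build step runs inside its own
    // try/catch, so one failing driver does not stop the others from being populated.
    public static IReadOnlyDictionary<string, Exception> PopulateAll(
        IReadOnlyDictionary<string, Action> buildSteps)
    {
        var failures = new Dictionary<string, Exception>();
        foreach (var (driverId, build) in buildSteps)
        {
            try
            {
                build(); // the callee does not swallow anything; it may throw
            }
            catch (Exception ex)
            {
                // A real host would log here; this sketch only records the failure.
                failures[driverId] = ex;
            }
        }
        return failures;
    }
}
```

The returned map makes the outcome observable: drivers that threw are listed with their exception, and every other driver's step still ran.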
||||||
|
|
||||||
### Core-009
|
### Core-009
|
||||||
|
|
||||||
@@ -153,13 +153,13 @@
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Performance & resource management |
|
| Category | Performance & resource management |
|
||||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/CapabilityInvoker.cs:121-128` |
|
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/CapabilityInvoker.cs:121-128` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** `ExecuteWriteAsync` calls `_optionsAccessor()` three times for a single non-idempotent write (once for the `with` expression, once inside the dictionary initializer for `.Resolve(...)`, plus the discarded base). On the per-write hot path it rebuilds a fresh `DriverResilienceOptions` and a one-entry dictionary on every non-idempotent write, and the redundant accessor calls could observe two different snapshots if an Admin edit lands between them. Phase 6.1 budgets a 1% pipeline overhead; this is unnecessary allocation plus a minor consistency hazard.
|
**Description:** `ExecuteWriteAsync` calls `_optionsAccessor()` three times for a single non-idempotent write (once for the `with` expression, once inside the dictionary initializer for `.Resolve(...)`, plus the discarded base). On the per-write hot path it rebuilds a fresh `DriverResilienceOptions` and a one-entry dictionary on every non-idempotent write, and the redundant accessor calls could observe two different snapshots if an Admin edit lands between them. Phase 6.1 budgets a 1% pipeline overhead; this is unnecessary allocation plus a minor consistency hazard.
|
||||||
|
|
||||||
**Recommendation:** Capture `var options = _optionsAccessor();` once at the top of the non-idempotent branch and derive both the `with` and the `Resolve` call from that snapshot. Consider caching the no-retry pipeline keyed on `(hostName, non-idempotent)`.
|
**Recommendation:** Capture `var options = _optionsAccessor();` once at the top of the non-idempotent branch and derive both the `with` and the `Resolve` call from that snapshot. Consider caching the no-retry pipeline keyed on `(hostName, non-idempotent)`.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — `ExecuteWriteAsync` now captures `_optionsAccessor()` into a single `snapshot` local at the top of the non-idempotent branch; the `with` expression and the `Resolve(Write)` call both derive from that snapshot so the two values are guaranteed coherent and only one accessor invocation occurs per call. Added `ExecuteWriteAsync_NonIdempotent_Snapshots_Options_Once_Per_Call` (counts invocations) and `ExecuteWriteAsync_NonIdempotent_Uses_Consistent_Options_Snapshot` (alternating-accessor) regression tests.
|
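The snapshot-once pattern from the Core-009 resolution can be sketched in isolation. The record and class names below (`WritePolicy`, `ResilienceOptions`, `NonIdempotentWriteSketch`) are simplified, hypothetical stand-ins for `CapabilityPolicy`, `DriverResilienceOptions`, and `CapabilityInvoker`; only the shape of the fix is meant to carry over.

```csharp
using System;

// Simplified stand-ins for the real policy and options records.
public sealed record WritePolicy(int RetryCount, int TimeoutSeconds);
public sealed record ResilienceOptions(string Revision, WritePolicy Write);

public sealed class NonIdempotentWriteSketch
{
    private readonly Func<ResilienceOptions> _optionsAccessor;

    public NonIdempotentWriteSketch(Func<ResilienceOptions> optionsAccessor)
        => _optionsAccessor = optionsAccessor;

    public ResilienceOptions BuildNoRetryOptions()
    {
        // One accessor call per write: the outer options and the derived write policy
        // both come from the same snapshot, so a concurrent edit cannot split them.
        var snapshot = _optionsAccessor();
        return snapshot with { Write = snapshot.Write with { RetryCount = 0 } };
    }
}
```

An accessor that returns a different value on every call is a convenient way to exercise this: if the method read the accessor twice, the revision and the write policy would come from different snapshots.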
||||||
|
|
||||||
### Core-010
|
### Core-010
|
||||||
|
|
||||||
@@ -168,13 +168,13 @@
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Code organization & conventions |
|
| Category | Code organization & conventions |
|
||||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/DriverResilienceOptions.cs:45-52` |
|
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/DriverResilienceOptions.cs:45-52` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** `DriverResilienceOptions.Resolve` indexes the tier-default dictionary directly (`defaults[capability]`) with no fallback. Any future addition to `DriverCapability` that is not also added to all three tier tables in `GetTierDefaults` will make `Resolve` throw `KeyNotFoundException` at runtime on the capability hot path rather than failing at build time. The two are coupled by convention only.
|
**Description:** `DriverResilienceOptions.Resolve` indexes the tier-default dictionary directly (`defaults[capability]`) with no fallback. Any future addition to `DriverCapability` that is not also added to all three tier tables in `GetTierDefaults` will make `Resolve` throw `KeyNotFoundException` at runtime on the capability hot path rather than failing at build time. The two are coupled by convention only.
|
||||||
|
|
||||||
**Recommendation:** Either add a `default` arm to `Resolve` returning a conservative policy (and logging), or add a unit-test invariant asserting every `DriverCapability` value is present in each tier's default table.
|
**Recommendation:** Either add a `default` arm to `Resolve` returning a conservative policy (and logging), or add a unit-test invariant asserting every `DriverCapability` value is present in each tier's default table.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — `Resolve` now uses `TryGetValue` and throws a diagnostic `KeyNotFoundException` whose message names the missing capability + tier and points to `GetTierDefaults` when a capability is missing from both the override map and the tier table; the existing `TierDefaults_Cover_EveryCapability` test invariant prevents this in shipped code, and added `Resolve_Returns_NonNull_Policy_For_Every_Capability` (per-tier exhaustive) + `Resolve_Throws_Diagnostic_When_Capability_Missing_From_Tier_Defaults` regression tests.
|
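A minimal sketch of the Core-010 lookup order, assuming a simplified `Capability` enum and integer retry counts in place of the real `DriverCapability` and `CapabilityPolicy` types. `ResolveSketch`, `ResolveRetryCount`, and `MissingFrom` are hypothetical names; `MissingFrom` mirrors the idea behind the coverage invariant test rather than its actual code, and `Enum.GetValues<T>()` assumes .NET 5 or later.

```csharp
using System;
using System.Collections.Generic;

public enum Capability { Read, Write, Subscribe }

public static class ResolveSketch
{
    // Override map first, then tier defaults, then a descriptive failure.
    public static int ResolveRetryCount(
        Capability capability,
        string tier,
        IReadOnlyDictionary<Capability, int> overrides,
        IReadOnlyDictionary<Capability, int> tierDefaults)
    {
        if (overrides.TryGetValue(capability, out var configured))
            return configured;

        if (tierDefaults.TryGetValue(capability, out var fallback))
            return fallback;

        // Name both the capability and the tier so the gap is easy to locate.
        throw new KeyNotFoundException(
            $"No policy defined for capability '{capability}' under tier '{tier}'.");
    }

    // Invariant helper: lists enum members that a tier table does not cover.
    public static IReadOnlyList<Capability> MissingFrom(
        IReadOnlyDictionary<Capability, int> tierDefaults)
    {
        var missing = new List<Capability>();
        foreach (var capability in Enum.GetValues<Capability>())
        {
            if (!tierDefaults.ContainsKey(capability))
                missing.Add(capability);
        }
        return missing;
    }
}
```

The exception type is unchanged from a bare indexer failure; what the explicit `TryGetValue` buys is a message that says which capability and tier are involved.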
||||||
|
|
||||||
### Core-011
|
### Core-011
|
||||||
|
|
||||||
@@ -183,13 +183,13 @@
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Testing coverage |
|
| Category | Testing coverage |
|
||||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/PermissionTrieBuilder.cs:58-75` |
|
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/PermissionTrieBuilder.cs:58-75` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** `PermissionTrieBuilder.Descend` has a two-branch behaviour: with a `scopePaths` lookup it descends the real hierarchy; without one it falls back to placing every non-cluster row directly under the root keyed by `ScopeId` ("works for deterministic tests, not for production"). The fallback silently produces a structurally incorrect trie when `scopePaths` is null or a row's `ScopeId` is missing — a UnsLine-scoped grant ends up as a direct child of the root, so `WalkEquipment` / `WalkSystemPlatform` never reach it and the grant is effectively dropped, with no diagnostic. There is no test asserting the production multi-level descent versus the fallback.
|
**Description:** `PermissionTrieBuilder.Descend` has a two-branch behaviour: with a `scopePaths` lookup it descends the real hierarchy; without one it falls back to placing every non-cluster row directly under the root keyed by `ScopeId` ("works for deterministic tests, not for production"). The fallback silently produces a structurally incorrect trie when `scopePaths` is null or a row's `ScopeId` is missing — a UnsLine-scoped grant ends up as a direct child of the root, so `WalkEquipment` / `WalkSystemPlatform` never reach it and the grant is effectively dropped, with no diagnostic. There is no test asserting the production multi-level descent versus the fallback.
|
||||||
|
|
||||||
**Recommendation:** Add unit tests covering `Build` with `scopePaths` producing the correct multi-level trie and the missing-`ScopeId` fallback. Have `Descend` surface a diagnostic (or throw outside test configuration) when a sub-cluster row cannot be located in `scopePaths`.
|
**Recommendation:** Add unit tests covering `Build` with `scopePaths` producing the correct multi-level trie and the missing-`ScopeId` fallback. Have `Descend` surface a diagnostic (or throw outside test configuration) when a sub-cluster row cannot be located in `scopePaths`.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — added optional `Action<PermissionTrieBuildDiagnostic>? diagnostic` parameter to `PermissionTrieBuilder.Build`; `Descend` now invokes the callback with a `MissingScopePath` diagnostic when a sub-cluster row's `ScopeId` is absent from a supplied (non-null) `scopePaths` lookup so production callers can log + surface orphan grants instead of silently dropping them. New `PermissionTrieBuilderTests` covers (a) production multi-level descent with sibling-line non-leakage, (b) the deterministic-test fallback, (c) the diagnostic firing on a missing scope-path entry, (d) no diagnostic when all rows resolve, and (e) no diagnostic when `scopePaths` is null (explicit test mode).
|
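The three cases in the Core-011 resolution (path found, path missing from a supplied lookup, no lookup at all) can be sketched with plain strings. `PlacementSketch`, `Place`, and `PlacementDiagnostic` are hypothetical simplifications; the real builder works on `NodeAcl` rows and trie nodes, and its diagnostic record carries more fields.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Simplified diagnostic payload for this sketch only.
public sealed record PlacementDiagnostic(string RowId, string ScopeId);

public static class PlacementSketch
{
    // Returns the trie path a row would be placed under, as a slash-joined string.
    public static string Place(
        string rowId,
        string scopeId,
        IReadOnlyDictionary<string, string[]>? scopePaths,
        Action<PlacementDiagnostic>? diagnostic = null)
    {
        // Lookup supplied and the scope is known: use the full multi-level path.
        if (scopePaths is not null && scopePaths.TryGetValue(scopeId, out var segments))
            return string.Join("/", segments);

        // Lookup supplied but the scope is missing: report it before falling back.
        // A null lookup is treated as explicit test mode and stays silent.
        if (scopePaths is not null)
            diagnostic?.Invoke(new PlacementDiagnostic(rowId, scopeId));

        // Fallback: direct child of the root, keyed on the scope id.
        return scopeId;
    }
}
```

A production caller would typically pass a callback that logs the diagnostic; a test that passes `null` for the lookup should observe no callback at all.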
||||||
|
|
||||||
### Core-012
|
### Core-012
|
||||||
|
|
||||||
@@ -198,10 +198,10 @@
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Documentation & comments |
|
| Category | Documentation & comments |
|
||||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Stability/WedgeDetector.cs:26`, `src/Core/ZB.MOM.WW.OtOpcUa.Core/Observability/DriverHealthReport.cs:11-22` |
|
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core/Stability/WedgeDetector.cs:26`, `src/Core/ZB.MOM.WW.OtOpcUa.Core/Observability/DriverHealthReport.cs:11-22` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** Two stale doc comments. (1) `WedgeDetector` — the `<summary>` above the constructor reads "Whether the driver reported itself `DriverState.Healthy` at construction." The constructor takes only a `TimeSpan threshold` and the detector is documented as stateless; the comment describes nothing the constructor does. (2) `DriverHealthReport` — the `<remarks>` state matrix lists Unknown, Initializing, Healthy, Degraded, Faulted but `Aggregate` (lines 42-44) also folds `DriverState.Reconnecting` into the Degraded verdict. `Reconnecting` is a real `DriverState` member absent from the documented matrix.
|
**Description:** Two stale doc comments. (1) `WedgeDetector` — the `<summary>` above the constructor reads "Whether the driver reported itself `DriverState.Healthy` at construction." The constructor takes only a `TimeSpan threshold` and the detector is documented as stateless; the comment describes nothing the constructor does. (2) `DriverHealthReport` — the `<remarks>` state matrix lists Unknown, Initializing, Healthy, Degraded, Faulted but `Aggregate` (lines 42-44) also folds `DriverState.Reconnecting` into the Degraded verdict. `Reconnecting` is a real `DriverState` member absent from the documented matrix.
|
||||||
|
|
||||||
**Recommendation:** Replace the `WedgeDetector` constructor `<summary>` with an accurate description (e.g. "Construct with the wedge-detection threshold; values below 60 s clamp to 60 s"). Add `Reconnecting` to the `DriverHealthReport` `<remarks>` state matrix and state it maps to Degraded.
|
**Recommendation:** Replace the `WedgeDetector` constructor `<summary>` with an accurate description (e.g. "Construct with the wedge-detection threshold; values below 60 s clamp to 60 s"). Add `Reconnecting` to the `DriverHealthReport` `<remarks>` state matrix and state it maps to Degraded.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — replaced the `WedgeDetector(.ctor)` `<summary>` with an accurate "Construct with the wedge-detection threshold; values below 60 s clamp to 60 s" description plus a `<param>` block; added the `Reconnecting` row to the `DriverHealthReport` `<remarks>` state matrix and updated the verdict-rule prose. Added `WedgeDetectorTests.Doc_Constructor_Summary_Describes_Threshold_Clamp` and `DriverHealthReportTests.Doc_State_Matrix_Includes_Reconnecting` regression tests that parse the generated `.xml` doc to assert the strings, plus `Any_Reconnecting_WithoutFaultedOrNotReady_IsDegraded` confirming the documented Reconnecting → Degraded behaviour.
|
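The fleet verdict rule that the updated `DriverHealthReport` remarks describe (any Faulted, then any Unknown or Initializing, then any Degraded or Reconnecting, else Healthy) can be written out as a short sketch. The `State` and `Verdict` enums and the `VerdictSketch.Aggregate` method are hypothetical stand-ins; the member names of the real verdict type are not shown in this commit.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum State { Unknown, Initializing, Healthy, Degraded, Reconnecting, Faulted }
public enum Verdict { Healthy, NotReady, Degraded, Faulted }

public static class VerdictSketch
{
    // Applies the documented precedence: Faulted, then NotReady, then Degraded.
    public static Verdict Aggregate(IReadOnlyCollection<State> states)
    {
        // An empty fleet is Healthy: there is nothing to degrade.
        if (states.Count == 0) return Verdict.Healthy;

        if (states.Contains(State.Faulted)) return Verdict.Faulted;

        if (states.Any(s => s is State.Unknown or State.Initializing))
            return Verdict.NotReady;

        // Reconnecting shares the Degraded verdict: alive, but not serving live data.
        if (states.Any(s => s is State.Degraded or State.Reconnecting))
            return Verdict.Degraded;

        return Verdict.Healthy;
    }
}
```

The ordering matters: a fleet with one Reconnecting and one Initializing driver is NotReady, not Degraded, because the NotReady check runs first.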
||||||
|
|||||||
@@ -26,11 +26,27 @@ public static class PermissionTrieBuilder
|
|||||||
/// Build a trie for one cluster/generation from the supplied rows. The caller is
|
/// Build a trie for one cluster/generation from the supplied rows. The caller is
|
||||||
/// responsible for pre-filtering rows to the target generation + cluster.
|
/// responsible for pre-filtering rows to the target generation + cluster.
|
||||||
/// </summary>
|
/// </summary>
|
||||||
|
/// <param name="clusterId">Cluster the trie is being built for; rows for other clusters are skipped.</param>
|
||||||
|
/// <param name="generationId">Config-generation the rows belong to; stamped on the returned trie.</param>
|
||||||
|
/// <param name="rows">ACL rows for this cluster + generation.</param>
|
||||||
|
/// <param name="scopePaths">
|
||||||
|
/// Optional <c>ScopeId</c> → multi-level trie-path lookup. When supplied, sub-cluster rows
|
||||||
|
/// descend to their structurally-correct trie node. When null, sub-cluster rows fall back
|
||||||
|
/// to a direct child of the trie root keyed on <c>ScopeId</c> — deterministic-test mode.
|
||||||
|
/// </param>
|
||||||
|
/// <param name="diagnostic">
|
||||||
|
/// Optional callback invoked when a sub-cluster row's <c>ScopeId</c> cannot be located
|
||||||
|
/// in <paramref name="scopePaths"/>. Production callers should wire a logger here so
|
||||||
|
/// orphaned grants surface — silently dropping them under the wrong trie level was the
|
||||||
|
/// Core-011 production hazard. The callback fires only when <paramref name="scopePaths"/>
|
||||||
|
/// is non-null (a null lookup is the explicit deterministic-test fallback mode).
|
||||||
|
/// </param>
|
||||||
public static PermissionTrie Build(
|
public static PermissionTrie Build(
|
||||||
string clusterId,
|
string clusterId,
|
||||||
long generationId,
|
long generationId,
|
||||||
IReadOnlyList<NodeAcl> rows,
|
IReadOnlyList<NodeAcl> rows,
|
||||||
IReadOnlyDictionary<string, NodeAclPath>? scopePaths = null)
|
IReadOnlyDictionary<string, NodeAclPath>? scopePaths = null,
|
||||||
|
Action<PermissionTrieBuildDiagnostic>? diagnostic = null)
|
||||||
{
|
{
|
||||||
ArgumentException.ThrowIfNullOrWhiteSpace(clusterId);
|
ArgumentException.ThrowIfNullOrWhiteSpace(clusterId);
|
||||||
ArgumentNullException.ThrowIfNull(rows);
|
ArgumentNullException.ThrowIfNull(rows);
|
||||||
@@ -45,7 +61,7 @@ public static class PermissionTrieBuilder
|
|||||||
var node = row.ScopeKind switch
|
var node = row.ScopeKind switch
|
||||||
{
|
{
|
||||||
NodeAclScopeKind.Cluster => trie.Root,
|
NodeAclScopeKind.Cluster => trie.Root,
|
||||||
_ => Descend(trie.Root, row, scopePaths),
|
_ => Descend(trie.Root, row, scopePaths, diagnostic),
|
||||||
};
|
};
|
||||||
|
|
||||||
if (node is not null)
|
if (node is not null)
|
||||||
@@ -55,16 +71,30 @@ public static class PermissionTrieBuilder
|
|||||||
return trie;
|
return trie;
|
||||||
}
|
}
|
||||||
|
|
||||||
private static PermissionTrieNode? Descend(PermissionTrieNode root, NodeAcl row, IReadOnlyDictionary<string, NodeAclPath>? scopePaths)
|
private static PermissionTrieNode? Descend(
|
||||||
|
PermissionTrieNode root,
|
||||||
|
NodeAcl row,
|
||||||
|
IReadOnlyDictionary<string, NodeAclPath>? scopePaths,
|
||||||
|
Action<PermissionTrieBuildDiagnostic>? diagnostic)
|
||||||
{
|
{
|
||||||
if (string.IsNullOrEmpty(row.ScopeId)) return null;
|
if (string.IsNullOrEmpty(row.ScopeId)) return null;
|
||||||
|
|
||||||
// For sub-cluster scopes the caller supplies a path lookup so we know the containing
|
// For sub-cluster scopes the caller supplies a path lookup so we know the containing
|
||||||
// namespace / UnsArea / UnsLine ids. Without a path lookup we fall back to putting the
|
// namespace / UnsArea / UnsLine ids. Without a path lookup we fall back to putting the
|
||||||
// row directly under the root using its ScopeId — works for deterministic tests, not
|
// row directly under the root using its ScopeId — works for deterministic tests, not
|
||||||
// for production where the hierarchy must be honored.
|
// for production where the hierarchy must be honored. If a scopePaths lookup IS
|
||||||
|
// provided but is missing the row's ScopeId, surface a diagnostic so the caller can
|
||||||
|
// log the orphan instead of silently dropping the grant under an unreachable node.
|
||||||
if (scopePaths is null || !scopePaths.TryGetValue(row.ScopeId, out var path))
|
if (scopePaths is null || !scopePaths.TryGetValue(row.ScopeId, out var path))
|
||||||
{
|
{
|
||||||
|
if (scopePaths is not null)
|
||||||
|
{
|
||||||
|
diagnostic?.Invoke(new PermissionTrieBuildDiagnostic(
|
||||||
|
NodeAclId: row.NodeAclId,
|
||||||
|
ScopeKind: row.ScopeKind,
|
||||||
|
ScopeId: row.ScopeId,
|
||||||
|
Reason: PermissionTrieBuildDiagnosticReason.MissingScopePath));
|
||||||
|
}
|
||||||
return EnsureChild(root, row.ScopeId);
|
return EnsureChild(root, row.ScopeId);
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -95,3 +125,30 @@ public static class PermissionTrieBuilder
|
|||||||
/// applicable; or (for SystemPlatform kind) NamespaceId / FolderSegment / .../TagId.
|
/// applicable; or (for SystemPlatform kind) NamespaceId / FolderSegment / .../TagId.
|
||||||
/// </param>
|
/// </param>
|
||||||
public sealed record NodeAclPath(IReadOnlyList<string> Segments);
|
public sealed record NodeAclPath(IReadOnlyList<string> Segments);
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Diagnostic emitted by <see cref="PermissionTrieBuilder.Build"/> when a row could not be
|
||||||
|
/// placed at its structurally-correct trie node. Production callers should log these so
|
||||||
|
/// orphaned grants surface instead of being silently dropped under an unreachable node
|
||||||
|
/// (Core-011).
|
||||||
|
/// </summary>
|
||||||
|
/// <param name="NodeAclId">The offending row's logical id.</param>
|
||||||
|
/// <param name="ScopeKind">The row's <see cref="NodeAclScopeKind"/>.</param>
|
||||||
|
/// <param name="ScopeId">The row's <c>ScopeId</c> that could not be located.</param>
|
||||||
|
/// <param name="Reason">Why the diagnostic fired.</param>
|
||||||
|
public sealed record PermissionTrieBuildDiagnostic(
|
||||||
|
string NodeAclId,
|
||||||
|
NodeAclScopeKind ScopeKind,
|
||||||
|
string ScopeId,
|
||||||
|
PermissionTrieBuildDiagnosticReason Reason);
|
||||||
|
|
||||||
|
/// <summary>Reasons <see cref="PermissionTrieBuildDiagnostic"/> can be emitted.</summary>
|
||||||
|
public enum PermissionTrieBuildDiagnosticReason
|
||||||
|
{
|
||||||
|
/// <summary>
|
||||||
|
/// The row's <c>ScopeId</c> was not present in the supplied <c>scopePaths</c> lookup.
|
||||||
|
/// The grant is placed as a direct child of the trie root keyed on <c>ScopeId</c> — a
|
||||||
|
/// position the production trie walker cannot reach for multi-level scopes.
|
||||||
|
/// </summary>
|
||||||
|
MissingScopePath,
|
||||||
|
}
|
||||||
|
|||||||
@@ -52,7 +52,7 @@ public sealed class DriverHost : IAsyncDisposable
|
|||||||
_drivers[id] = driver;
|
_drivers[id] = driver;
|
||||||
}
|
}
|
||||||
|
|
||||||
try { await driver.InitializeAsync(driverConfigJson, ct); }
|
try { await driver.InitializeAsync(driverConfigJson, ct).ConfigureAwait(false); }
|
||||||
catch
|
catch
|
||||||
{
|
{
|
||||||
// Keep the driver registered — operator will see Faulted state and can reinitialize.
|
// Keep the driver registered — operator will see Faulted state and can reinitialize.
|
||||||
@@ -69,7 +69,7 @@ public sealed class DriverHost : IAsyncDisposable
|
|||||||
_drivers.Remove(driverInstanceId);
|
_drivers.Remove(driverInstanceId);
|
||||||
}
|
}
|
||||||
|
|
||||||
try { await driver.ShutdownAsync(ct); }
|
try { await driver.ShutdownAsync(ct).ConfigureAwait(false); }
|
||||||
catch { /* shutdown is best-effort; logs elsewhere */ }
|
catch { /* shutdown is best-effort; logs elsewhere */ }
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -84,7 +84,7 @@ public sealed class DriverHost : IAsyncDisposable
|
|||||||
|
|
||||||
foreach (var driver in snapshot)
|
foreach (var driver in snapshot)
|
||||||
{
|
{
|
||||||
try { await driver.ShutdownAsync(CancellationToken.None); } catch { /* ignore */ }
|
try { await driver.ShutdownAsync(CancellationToken.None).ConfigureAwait(false); } catch { /* ignore */ }
|
||||||
(driver as IDisposable)?.Dispose();
|
(driver as IDisposable)?.Dispose();
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -15,11 +15,13 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Observability;
|
|||||||
/// → /readyz 503 (not yet ready).</item>
|
/// → /readyz 503 (not yet ready).</item>
|
||||||
/// <item><see cref="DriverState.Healthy"/> → /readyz 200.</item>
|
/// <item><see cref="DriverState.Healthy"/> → /readyz 200.</item>
|
||||||
/// <item><see cref="DriverState.Degraded"/> → /readyz 200 with flagged driver IDs.</item>
|
/// <item><see cref="DriverState.Degraded"/> → /readyz 200 with flagged driver IDs.</item>
|
||||||
|
/// <item><see cref="DriverState.Reconnecting"/> → /readyz 200 with flagged driver IDs
|
||||||
|
/// (driver alive but not serving live data; same verdict as Degraded).</item>
|
||||||
/// <item><see cref="DriverState.Faulted"/> → /readyz 503.</item>
|
/// <item><see cref="DriverState.Faulted"/> → /readyz 503.</item>
|
||||||
/// </list>
|
/// </list>
|
||||||
/// The overall verdict is computed across the fleet: any Faulted → Faulted; any
|
/// The overall verdict is computed across the fleet: any Faulted → Faulted; any
|
||||||
/// Unknown/Initializing → NotReady; any Degraded → Degraded; else Healthy. An empty fleet
|
/// Unknown/Initializing → NotReady; any Degraded or Reconnecting → Degraded; else
|
||||||
/// is Healthy (nothing to degrade).
|
/// Healthy. An empty fleet is Healthy (nothing to degrade).
|
||||||
/// </remarks>
|
/// </remarks>
|
||||||
public static class DriverHealthReport
|
public static class DriverHealthReport
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -39,8 +39,11 @@ public class GenericDriverNodeManager(IDriver driver) : IDisposable
|
|||||||
/// If called a second time (e.g. Galaxy redeploy via <c>IRediscoverable.OnRediscoveryNeeded</c>)
|
/// If called a second time (e.g. Galaxy redeploy via <c>IRediscoverable.OnRediscoveryNeeded</c>)
|
||||||
/// the previous alarm subscription is torn down and the sink registry is cleared before
|
/// the previous alarm subscription is torn down and the sink registry is cleared before
|
||||||
/// re-walking, preventing double delivery of alarm transitions.
|
/// re-walking, preventing double delivery of alarm transitions.
|
||||||
/// Exception isolation (marking the driver's subtree Faulted) is the caller's responsibility —
|
/// Exception isolation (per decision #12 — marking the driver's subtree Faulted while other
|
||||||
/// exceptions from <see cref="ITagDiscovery.DiscoverAsync"/> propagate to the caller.
|
/// drivers stay available) is the caller's responsibility; exceptions from
|
||||||
|
/// <see cref="ITagDiscovery.DiscoverAsync"/> propagate unhandled to the caller. The Server
|
||||||
|
/// project's <c>OpcUaApplicationHost.PopulateAddressSpaces</c> wraps this call in a per-driver
|
||||||
|
/// try/catch that logs + leaves the driver's subtree empty until a Reinitialize succeeds.
|
||||||
/// </summary>
|
/// </summary>
|
||||||
public async Task BuildAddressSpaceAsync(IAddressSpaceBuilder builder, CancellationToken ct)
|
public async Task BuildAddressSpaceAsync(IAddressSpaceBuilder builder, CancellationToken ct)
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -118,11 +118,15 @@ public sealed class CapabilityInvoker
|
|||||||
|
|
||||||
if (!isIdempotent)
|
if (!isIdempotent)
|
||||||
{
|
{
|
||||||
var noRetryOptions = _optionsAccessor() with
|
// Snapshot the options exactly once per call — invoking _optionsAccessor twice can
|
||||||
|
// (a) observe two different snapshots if an Admin edit lands between them and
|
||||||
|
// (b) wastes an allocation on the per-write hot path (Phase 6.1 1% pipeline budget).
|
||||||
|
var snapshot = _optionsAccessor();
|
||||||
|
var noRetryOptions = snapshot with
|
||||||
{
|
{
|
||||||
CapabilityPolicies = new Dictionary<DriverCapability, CapabilityPolicy>
|
CapabilityPolicies = new Dictionary<DriverCapability, CapabilityPolicy>
|
||||||
{
|
{
|
||||||
[DriverCapability.Write] = _optionsAccessor().Resolve(DriverCapability.Write) with { RetryCount = 0 },
|
[DriverCapability.Write] = snapshot.Resolve(DriverCapability.Write) with { RetryCount = 0 },
|
||||||
},
|
},
|
||||||
};
|
};
|
||||||
var pipeline = _builder.GetOrCreate(_driverInstanceId, $"{hostName}::non-idempotent", DriverCapability.Write, noRetryOptions);
|
var pipeline = _builder.GetOrCreate(_driverInstanceId, $"{hostName}::non-idempotent", DriverCapability.Write, noRetryOptions);
|
||||||
|
|||||||
@@ -42,13 +42,27 @@ public sealed record DriverResilienceOptions
|
|||||||
/// Look up the effective policy for a capability, falling back to tier defaults when no
|
/// Look up the effective policy for a capability, falling back to tier defaults when no
|
||||||
/// override is configured. Never returns null.
|
/// override is configured. Never returns null.
|
||||||
/// </summary>
|
/// </summary>
|
||||||
|
/// <exception cref="KeyNotFoundException">
|
||||||
|
/// Thrown when neither the override map nor the tier defaults carry an entry for the
|
||||||
|
/// requested capability. The <c>TierDefaults_Cover_EveryCapability</c> invariant test
|
||||||
|
/// in <c>DriverResilienceOptionsTests</c> guarantees every defined enum value is present
|
||||||
|
/// in each tier's table, so this only fires when a caller passes an out-of-range value
|
||||||
|
/// or someone adds a <see cref="DriverCapability"/> member without updating
|
||||||
|
/// <see cref="GetTierDefaults"/>. The message names the missing capability and tier.
|
||||||
|
/// </exception>
|
||||||
public CapabilityPolicy Resolve(DriverCapability capability)
|
public CapabilityPolicy Resolve(DriverCapability capability)
|
||||||
{
|
{
|
||||||
if (CapabilityPolicies.TryGetValue(capability, out var policy))
|
if (CapabilityPolicies.TryGetValue(capability, out var policy))
|
||||||
return policy;
|
return policy;
|
||||||
|
|
||||||
var defaults = GetTierDefaults(Tier);
|
var defaults = GetTierDefaults(Tier);
|
||||||
return defaults[capability];
|
if (defaults.TryGetValue(capability, out var fallback))
|
||||||
|
return fallback;
|
||||||
|
|
||||||
|
throw new KeyNotFoundException(
|
||||||
|
$"No policy defined for capability '{capability}' under tier '{Tier}'. " +
|
||||||
|
$"This indicates a {nameof(DriverCapability)} enum value missing from {nameof(GetTierDefaults)} — " +
|
||||||
|
"add the capability to every tier's default table.");
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
|
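The Core-010 change above swaps a bare indexer lookup for `TryGetValue` plus a descriptive exception. A minimal sketch of that pattern follows, assuming hypothetical stand-in types (`Capability`, `Tier`, `PolicyLookup`, and an `int` timeout in place of `CapabilityPolicy`); none of these names come from the project, and the message text is illustrative only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; not the project's real types.
enum Capability { Read, Write }
enum Tier { A, B }

static class PolicyLookup
{
    // Tier B deliberately omits Write so the diagnostic path can be exercised.
    static readonly Dictionary<Tier, Dictionary<Capability, int>> Defaults = new()
    {
        [Tier.A] = new() { [Capability.Read] = 2, [Capability.Write] = 5 },
        [Tier.B] = new() { [Capability.Read] = 4 },
    };

    public static int Resolve(
        IReadOnlyDictionary<Capability, int> overrides, Tier tier, Capability capability)
    {
        // 1. A per-instance override wins when present.
        if (overrides.TryGetValue(capability, out var timeout))
            return timeout;

        // 2. Otherwise fall back to the tier's default table.
        if (Defaults[tier].TryGetValue(capability, out var fallback))
            return fallback;

        // 3. Neither table has an entry: fail with a message naming both keys.
        throw new KeyNotFoundException(
            $"No policy defined for capability '{capability}' under tier '{tier}'.");
    }
}

static class Program
{
    static void Main()
    {
        var none = new Dictionary<Capability, int>();

        // Falls through to the tier A defaults.
        Console.WriteLine(PolicyLookup.Resolve(none, Tier.A, Capability.Write));

        // Tier B has no Write entry, so the diagnostic exception is thrown.
        try { PolicyLookup.Resolve(none, Tier.B, Capability.Write); }
        catch (KeyNotFoundException ex) { Console.WriteLine(ex.Message); }
    }
}
```

The exception type stays `KeyNotFoundException`, so callers that already catch the indexer's failure keep working; only the message becomes actionable.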
|||||||
@@ -23,7 +23,15 @@ public sealed class WedgeDetector
|
|||||||
/// <summary>Wedge-detection threshold; pass &lt; 60 s and the detector clamps to 60 s.</summary>
|
/// <summary>Wedge-detection threshold; pass &lt; 60 s and the detector clamps to 60 s.</summary>
|
||||||
public TimeSpan Threshold { get; }
|
public TimeSpan Threshold { get; }
|
||||||
|
|
||||||
/// <summary>Whether the driver reported itself <see cref="DriverState.Healthy"/> at construction.</summary>
|
/// <summary>
|
||||||
|
/// Construct with the wedge-detection threshold; values below 60 s clamp to 60 s so
|
||||||
|
/// the detector never fires below the documented floor.
|
||||||
|
/// </summary>
|
||||||
|
/// <param name="threshold">
|
||||||
|
/// Time without a successful unit of work after which a Healthy driver with pending
|
||||||
|
/// work is considered Faulted. Clamped to a minimum of 60 s per the plan-default of
|
||||||
|
/// 5 × PublishingInterval.
|
||||||
|
/// </param>
|
||||||
public WedgeDetector(TimeSpan threshold)
|
public WedgeDetector(TimeSpan threshold)
|
||||||
{
|
{
|
||||||
Threshold = threshold < TimeSpan.FromSeconds(60) ? TimeSpan.FromSeconds(60) : threshold;
|
Threshold = threshold < TimeSpan.FromSeconds(60) ? TimeSpan.FromSeconds(60) : threshold;
|
||||||
|
|||||||
@@ -0,0 +1,155 @@
|
|||||||
|
using Shouldly;
|
||||||
|
using Xunit;
|
||||||
|
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
|
||||||
|
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
|
||||||
|
using ZB.MOM.WW.OtOpcUa.Core.Authorization;
|
||||||
|
|
||||||
|
namespace ZB.MOM.WW.OtOpcUa.Core.Tests.Authorization;
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-011 regression coverage for <see cref="PermissionTrieBuilder.Build"/>'s
|
||||||
|
/// <c>Descend</c> helper:
|
||||||
|
/// <list type="bullet">
|
||||||
|
/// <item>With a <c>scopePaths</c> lookup the row must land at the correct multi-level
|
||||||
|
/// trie node — a deep <see cref="NodeAclScopeKind.UnsLine"/> grant must be visible
|
||||||
|
/// ONLY when the requested scope walks the same namespace/area/line chain.</item>
|
||||||
|
/// <item>Without a <c>scopePaths</c> entry the row falls back to a direct child of
|
||||||
|
/// the namespace root keyed on the row's <c>ScopeId</c>. The builder must surface
|
||||||
|
/// this fallback (warning callback) so callers know a grant was placed where the
|
||||||
|
/// walker can't reach it for production hierarchies — silently dropping the grant
|
||||||
|
/// is the Core-011 production hazard.</item>
|
||||||
|
/// </list>
|
||||||
|
/// </summary>
|
||||||
|
[Trait("Category", "Unit")]
|
||||||
|
public sealed class PermissionTrieBuilderTests
|
||||||
|
{
|
||||||
|
private static NodeAcl Row(string group, NodeAclScopeKind scope, string? scopeId, NodePermissions flags, string clusterId = "c1") =>
|
||||||
|
new()
|
||||||
|
{
|
||||||
|
NodeAclRowId = Guid.NewGuid(),
|
||||||
|
NodeAclId = $"acl-{Guid.NewGuid():N}",
|
||||||
|
GenerationId = 1,
|
||||||
|
ClusterId = clusterId,
|
||||||
|
LdapGroup = group,
|
||||||
|
ScopeKind = scope,
|
||||||
|
ScopeId = scopeId,
|
||||||
|
PermissionFlags = flags,
|
||||||
|
};
|
||||||
|
|
||||||
|
private static NodeScope EquipmentTag(string cluster, string ns, string area, string line, string equip, string tag) =>
|
||||||
|
new()
|
||||||
|
{
|
||||||
|
ClusterId = cluster,
|
||||||
|
NamespaceId = ns,
|
||||||
|
UnsAreaId = area,
|
||||||
|
UnsLineId = line,
|
||||||
|
EquipmentId = equip,
|
||||||
|
TagId = tag,
|
||||||
|
Kind = NodeHierarchyKind.Equipment,
|
||||||
|
};
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void Build_With_ScopePaths_Places_UnsLine_Row_At_Correct_Multi_Level_Node()
|
||||||
|
{
|
||||||
|
// Scope path mirrors the production hierarchy: namespace → area → line.
|
||||||
|
var paths = new Dictionary<string, NodeAclPath>(StringComparer.OrdinalIgnoreCase)
|
||||||
|
{
|
||||||
|
["line-42"] = new(new[] { "ns", "area-1", "line-42" }),
|
||||||
|
};
|
||||||
|
var rows = new[] { Row("cn=ops", NodeAclScopeKind.UnsLine, "line-42", NodePermissions.Read) };
|
||||||
|
|
||||||
|
var trie = PermissionTrieBuilder.Build("c1", 1, rows, paths);
|
||||||
|
|
||||||
|
// Walk through the same chain — the grant must be reachable.
|
||||||
|
var matchOnLine = trie.CollectMatches(
|
||||||
|
EquipmentTag("c1", "ns", "area-1", "line-42", "eq-A", "tag-A"),
|
||||||
|
["cn=ops"]);
|
||||||
|
matchOnLine.Count.ShouldBe(1, "row must land at the correct multi-level trie node");
|
||||||
|
|
||||||
|
// A different line under the same area must not pick up the grant.
|
||||||
|
var matchOnOtherLine = trie.CollectMatches(
|
||||||
|
EquipmentTag("c1", "ns", "area-1", "line-99", "eq-A", "tag-A"),
|
||||||
|
["cn=ops"]);
|
||||||
|
matchOnOtherLine.ShouldBeEmpty(
|
||||||
|
"grant anchored at line-42 must not leak to sibling line-99 under the same area");
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void Build_Without_ScopePaths_Falls_Back_To_Root_Child_For_Tests()
|
||||||
|
{
|
||||||
|
// Fallback path — deterministic tests pass without a scope-path lookup. The row
|
||||||
|
// is placed as a direct child of the trie root keyed by ScopeId.
|
||||||
|
var rows = new[] { Row("cn=ops", NodeAclScopeKind.UnsLine, "line-42", NodePermissions.Read) };
|
||||||
|
|
||||||
|
var trie = PermissionTrieBuilder.Build("c1", 1, rows);
|
||||||
|
|
||||||
|
// Root has one child — "line-42".
|
||||||
|
trie.Root.Children.ShouldContainKey("line-42");
|
||||||
|
var node = trie.Root.Children["line-42"];
|
||||||
|
node.Grants.Count.ShouldBe(1);
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-011 regression: when a sub-cluster row's ScopeId is not in the supplied
|
||||||
|
/// <c>scopePaths</c>, the fallback diagnostic callback must fire so the caller can
|
||||||
|
/// surface a warning. Silently dropping the grant under the wrong trie level is the
|
||||||
|
/// production hazard the finding flagged.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public void Build_Missing_ScopePath_Entry_Invokes_Diagnostic_Callback()
|
||||||
|
{
|
||||||
|
var paths = new Dictionary<string, NodeAclPath>(StringComparer.OrdinalIgnoreCase)
|
||||||
|
{
|
||||||
|
["line-known"] = new(new[] { "ns", "area-1", "line-known" }),
|
||||||
|
};
|
||||||
|
// Row references a line that is NOT in the path lookup.
|
||||||
|
var rows = new[]
|
||||||
|
{
|
||||||
|
Row("cn=ops", NodeAclScopeKind.UnsLine, "line-orphan", NodePermissions.Read),
|
||||||
|
Row("cn=ops", NodeAclScopeKind.UnsLine, "line-known", NodePermissions.Read),
|
||||||
|
};
|
||||||
|
var diagnostics = new List<PermissionTrieBuildDiagnostic>();
|
||||||
|
|
||||||
|
var trie = PermissionTrieBuilder.Build("c1", 1, rows, paths, diagnostics.Add);
|
||||||
|
|
||||||
|
diagnostics.Count.ShouldBe(1, "exactly one row had no matching scope-path entry");
|
||||||
|
diagnostics[0].ScopeId.ShouldBe("line-orphan");
|
||||||
|
diagnostics[0].ScopeKind.ShouldBe(NodeAclScopeKind.UnsLine);
|
||||||
|
diagnostics[0].Reason.ShouldBe(PermissionTrieBuildDiagnosticReason.MissingScopePath);
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void Build_No_Diagnostic_When_All_Sub_Cluster_Rows_Have_ScopePaths()
|
||||||
|
{
|
||||||
|
var paths = new Dictionary<string, NodeAclPath>(StringComparer.OrdinalIgnoreCase)
|
||||||
|
{
|
||||||
|
["line-A"] = new(new[] { "ns", "area-1", "line-A" }),
|
||||||
|
["line-B"] = new(new[] { "ns", "area-1", "line-B" }),
|
||||||
|
};
|
||||||
|
var rows = new[]
|
||||||
|
{
|
||||||
|
Row("cn=ops", NodeAclScopeKind.Cluster, null, NodePermissions.Read), // cluster-level — no descent
|
||||||
|
Row("cn=ops", NodeAclScopeKind.UnsLine, "line-A", NodePermissions.Read),
|
||||||
|
Row("cn=ops", NodeAclScopeKind.UnsLine, "line-B", NodePermissions.Read),
|
||||||
|
};
|
||||||
|
var diagnostics = new List<PermissionTrieBuildDiagnostic>();
|
||||||
|
|
||||||
|
PermissionTrieBuilder.Build("c1", 1, rows, paths, diagnostics.Add);
|
||||||
|
|
||||||
|
diagnostics.ShouldBeEmpty("no rows are missing a scope-path entry");
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void Build_Diagnostic_Callback_Optional_When_ScopePaths_Null()
|
||||||
|
{
|
||||||
|
// No diagnostics callback should fire when scopePaths itself is null — that's the
|
||||||
|
// "deterministic-test fallback" mode, not a production drop.
|
||||||
|
var rows = new[] { Row("cn=ops", NodeAclScopeKind.UnsLine, "line-42", NodePermissions.Read) };
|
||||||
|
var diagnostics = new List<PermissionTrieBuildDiagnostic>();
|
||||||
|
|
||||||
|
PermissionTrieBuilder.Build("c1", 1, rows, scopePaths: null, diagnostic: diagnostics.Add);
|
||||||
|
|
||||||
|
diagnostics.ShouldBeEmpty(
|
||||||
|
"scopePaths=null is the explicit test-fallback mode and must not emit per-row warnings");
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -77,4 +77,162 @@ public sealed class DriverHostTests
|
|||||||
host.RegisteredDriverIds.ShouldNotContain("d-1");
|
host.RegisteredDriverIds.ShouldNotContain("d-1");
|
||||||
driver.ShutDown.ShouldBeTrue();
|
driver.ShutDown.ShouldBeTrue();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-004 regression — DriverHost is a library type whose async calls must use
|
||||||
|
/// ConfigureAwait(false) to match the convention used by CapabilityInvoker /
|
||||||
|
/// AlarmSurfaceInvoker. Asserts the awaited driver call does not post its
|
||||||
|
/// continuation back to a captured SynchronizationContext.
|
||||||
|
/// The driver awaits an unsettled TaskCompletionSource so it does not introduce its
|
||||||
|
/// own capture — only DriverHost's await of the returned Task can drive a post.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public async Task RegisterAsync_Does_Not_Capture_SynchronizationContext()
|
||||||
|
{
|
||||||
|
var tcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||||
|
var driver = new TcsDriver("d-cfg-1", tcs);
|
||||||
|
var ctx = new TrackingSynchronizationContext();
|
||||||
|
|
||||||
|
// Run the DriverHost call on a dedicated thread that has our tracking SyncContext installed.
|
||||||
|
var workerCtx = await RunOnContextAsync(ctx, async () =>
|
||||||
|
{
|
||||||
|
var host = new DriverHost();
|
||||||
|
var registerTask = host.RegisterAsync(driver, "{}", CancellationToken.None);
|
||||||
|
// Complete the driver's InitializeAsync from a background thread so DriverHost's
|
||||||
|
// await must resume via the captured context if ConfigureAwait(false) was missing.
|
||||||
|
_ = Task.Run(() => tcs.SetResult());
|
||||||
|
await registerTask.ConfigureAwait(false);
|
||||||
|
await host.DisposeAsync().ConfigureAwait(false);
|
||||||
|
});
|
||||||
|
|
||||||
|
workerCtx.PostCount.ShouldBe(0,
|
||||||
|
"RegisterAsync's awaited driver call must use ConfigureAwait(false) so the continuation does not post back to the captured context");
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public async Task UnregisterAsync_Does_Not_Capture_SynchronizationContext()
|
||||||
|
{
|
||||||
|
var initTcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||||
|
initTcs.SetResult();
|
||||||
|
var shutdownTcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||||
|
var driver = new TcsDriver("d-cfg-2", initTcs, shutdownTcs);
|
||||||
|
var ctx = new TrackingSynchronizationContext();
|
||||||
|
|
||||||
|
var workerCtx = await RunOnContextAsync(ctx, async () =>
|
||||||
|
{
|
||||||
|
var host = new DriverHost();
|
||||||
|
await host.RegisterAsync(driver, "{}", CancellationToken.None).ConfigureAwait(false);
|
||||||
|
|
||||||
|
// After RegisterAsync we re-enter the context. Reset the post counter so we only
|
||||||
|
// observe UnregisterAsync's behaviour from here on.
|
||||||
|
((TrackingSynchronizationContext)SynchronizationContext.Current!).Reset();
|
||||||
|
|
||||||
|
var task = host.UnregisterAsync("d-cfg-2", CancellationToken.None);
|
||||||
|
_ = Task.Run(() => shutdownTcs.SetResult());
|
||||||
|
await task.ConfigureAwait(false);
|
||||||
|
});
|
||||||
|
|
||||||
|
workerCtx.PostCount.ShouldBe(0,
|
||||||
|
"UnregisterAsync's awaited shutdown call must use ConfigureAwait(false)");
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public async Task DisposeAsync_Does_Not_Capture_SynchronizationContext()
|
||||||
|
{
|
||||||
|
var initTcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||||
|
initTcs.SetResult();
|
||||||
|
var shutdownTcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||||
|
var driver = new TcsDriver("d-cfg-3", initTcs, shutdownTcs);
|
||||||
|
var ctx = new TrackingSynchronizationContext();
|
||||||
|
|
||||||
|
var workerCtx = await RunOnContextAsync(ctx, async () =>
|
||||||
|
{
|
||||||
|
var host = new DriverHost();
|
||||||
|
await host.RegisterAsync(driver, "{}", CancellationToken.None).ConfigureAwait(false);
|
||||||
|
((TrackingSynchronizationContext)SynchronizationContext.Current!).Reset();
|
||||||
|
|
||||||
|
var task = host.DisposeAsync();
|
||||||
|
_ = Task.Run(() => shutdownTcs.SetResult());
|
||||||
|
await task.ConfigureAwait(false);
|
||||||
|
});
|
||||||
|
|
||||||
|
workerCtx.PostCount.ShouldBe(0,
|
||||||
|
"DisposeAsync's awaited shutdown call must use ConfigureAwait(false)");
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Run <paramref name="body"/> on a dedicated thread with <paramref name="ctx"/>
|
||||||
|
/// installed as the current SynchronizationContext, and return <paramref name="ctx"/>
|
||||||
|
/// after the body completes. The dedicated thread guarantees that resuming via the
|
||||||
|
/// captured context observably routes through our Post hook (the ThreadPool would
|
||||||
|
/// otherwise clear the context on the resuming worker).
|
||||||
|
/// </summary>
|
||||||
|
private static Task<TrackingSynchronizationContext> RunOnContextAsync(TrackingSynchronizationContext ctx, Func<Task> body)
|
||||||
|
{
|
||||||
|
var done = new TaskCompletionSource<TrackingSynchronizationContext>(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||||
|
var t = new Thread(() =>
|
||||||
|
{
|
||||||
|
SynchronizationContext.SetSynchronizationContext(ctx);
|
||||||
|
try
|
||||||
|
{
|
||||||
|
// Pump posted continuations until the body completes.
|
||||||
|
var task = body();
|
||||||
|
while (!task.IsCompleted)
|
||||||
|
{
|
||||||
|
if (ctx.TryDequeue(out var work)) work();
|
||||||
|
else Thread.Sleep(1);
|
||||||
|
}
|
||||||
|
// Drain any tail continuations.
|
||||||
|
while (ctx.TryDequeue(out var work)) work();
|
||||||
|
task.GetAwaiter().GetResult();
|
||||||
|
done.SetResult(ctx);
|
||||||
|
}
|
||||||
|
catch (Exception ex) { done.SetException(ex); }
|
||||||
|
}) { IsBackground = true };
|
||||||
|
t.Start();
|
||||||
|
return done.Task;
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>Driver whose Initialize / Shutdown completions are caller-controlled via TCS.</summary>
|
||||||
|
private sealed class TcsDriver(string id, TaskCompletionSource initTcs, TaskCompletionSource? shutdownTcs = null) : IDriver
|
||||||
|
{
|
||||||
|
public string DriverInstanceId { get; } = id;
|
||||||
|
public string DriverType => "Tcs";
|
||||||
|
|
||||||
|
public Task InitializeAsync(string _, CancellationToken ct) => initTcs.Task;
|
||||||
|
public Task ReinitializeAsync(string _, CancellationToken ct) => Task.CompletedTask;
|
||||||
|
public Task ShutdownAsync(CancellationToken ct) => (shutdownTcs ?? CompletedTcs).Task;
|
||||||
|
public DriverHealth GetHealth() => new(DriverState.Healthy, null, null);
|
||||||
|
public long GetMemoryFootprint() => 0;
|
||||||
|
public Task FlushOptionalCachesAsync(CancellationToken ct) => Task.CompletedTask;
|
||||||
|
|
||||||
|
private static readonly TaskCompletionSource CompletedTcs = MakeCompleted();
|
||||||
|
private static TaskCompletionSource MakeCompleted()
|
||||||
|
{
|
||||||
|
var t = new TaskCompletionSource();
|
||||||
|
t.SetResult();
|
||||||
|
return t;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>SynchronizationContext that queues posts to a thread-safe work list and counts them.</summary>
|
||||||
|
private sealed class TrackingSynchronizationContext : SynchronizationContext
|
||||||
|
{
|
||||||
|
private readonly System.Collections.Concurrent.ConcurrentQueue<Action> _queue = new();
|
||||||
|
public int PostCount;
|
||||||
|
public int SendCount;
|
||||||
|
|
||||||
|
public override void Post(SendOrPostCallback d, object? state)
|
||||||
|
{
|
||||||
|
Interlocked.Increment(ref PostCount);
|
||||||
|
_queue.Enqueue(() => d(state));
|
||||||
|
}
|
||||||
|
public override void Send(SendOrPostCallback d, object? state)
|
||||||
|
{
|
||||||
|
Interlocked.Increment(ref SendCount);
|
||||||
|
d(state);
|
||||||
|
}
|
||||||
|
public bool TryDequeue(out Action work) => _queue.TryDequeue(out work!);
|
||||||
|
public void Reset() { Interlocked.Exchange(ref PostCount, 0); Interlocked.Exchange(ref SendCount, 0); }
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -143,6 +143,41 @@ public sealed class GenericDriverNodeManagerTests
|
|||||||
nm.BuildAddressSpaceAsync(new RecordingBuilder(), CancellationToken.None));
|
nm.BuildAddressSpaceAsync(new RecordingBuilder(), CancellationToken.None));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-008 regression: the XML doc states exception isolation is the caller's
|
||||||
|
/// responsibility — exceptions from <see cref="ITagDiscovery.DiscoverAsync"/> must propagate
|
||||||
|
/// out of <c>BuildAddressSpaceAsync</c> unhandled so the Server layer's per-driver try/catch
|
||||||
|
/// (<c>OpcUaApplicationHost.PopulateAddressSpaces</c>) can mark the subtree Faulted.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public async Task BuildAddressSpaceAsync_Propagates_Discovery_Exceptions_To_Caller()
|
||||||
|
{
|
||||||
|
var driver = new ThrowingDiscoveryDriver();
|
||||||
|
using var nm = new GenericDriverNodeManager(driver);
|
||||||
|
|
||||||
|
var ex = await Should.ThrowAsync<InvalidOperationException>(() =>
|
||||||
|
nm.BuildAddressSpaceAsync(new RecordingBuilder(), CancellationToken.None));
|
||||||
|
ex.Message.ShouldBe("discovery boom",
|
||||||
|
"exceptions from DiscoverAsync must propagate unhandled — exception isolation is the caller's responsibility (e.g. OpcUaApplicationHost)");
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>Driver whose DiscoverAsync throws — exercises the exception-isolation boundary.</summary>
|
||||||
|
private sealed class ThrowingDiscoveryDriver : IDriver, ITagDiscovery
|
||||||
|
{
|
||||||
|
public string DriverInstanceId => "throwing";
|
||||||
|
public string DriverType => "Throwing";
|
||||||
|
|
||||||
|
public Task InitializeAsync(string _, CancellationToken __) => Task.CompletedTask;
|
||||||
|
public Task ReinitializeAsync(string _, CancellationToken __) => Task.CompletedTask;
|
||||||
|
public Task ShutdownAsync(CancellationToken _) => Task.CompletedTask;
|
||||||
|
public DriverHealth GetHealth() => new(DriverState.Healthy, null, null);
|
||||||
|
public long GetMemoryFootprint() => 0;
|
||||||
|
public Task FlushOptionalCachesAsync(CancellationToken _) => Task.CompletedTask;
|
||||||
|
|
||||||
|
public Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken ct)
|
||||||
|
=> throw new InvalidOperationException("discovery boom");
|
||||||
|
}
|
||||||
|
|
||||||
// --- test doubles ---
|
// --- test doubles ---
|
||||||
|
|
||||||
private sealed class FakeDriver : IDriver, ITagDiscovery, IAlarmSource
|
private sealed class FakeDriver : IDriver, ITagDiscovery, IAlarmSource
|
||||||
|
|||||||
@@ -67,4 +67,53 @@ public sealed class DriverHealthReportTests
|
|||||||
{
|
{
|
||||||
DriverHealthReport.HttpStatus(verdict).ShouldBe(expected);
|
DriverHealthReport.HttpStatus(verdict).ShouldBe(expected);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-012 regression: <see cref="DriverState.Reconnecting"/> must aggregate to
|
||||||
|
/// <see cref="ReadinessVerdict.Degraded"/> — the doc remarks state matrix lists this
|
||||||
|
/// mapping (after the Core-012 doc fix that added the Reconnecting row).
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public void Any_Reconnecting_WithoutFaultedOrNotReady_IsDegraded()
|
||||||
|
{
|
||||||
|
var verdict = DriverHealthReport.Aggregate([
|
||||||
|
new DriverHealthSnapshot("a", DriverState.Healthy),
|
||||||
|
new DriverHealthSnapshot("b", DriverState.Reconnecting),
|
||||||
|
]);
|
||||||
|
verdict.ShouldBe(ReadinessVerdict.Degraded,
|
||||||
|
"Reconnecting = driver alive but not serving live data → /readyz stays 200 while operators see the affected driver in the body");
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-012 regression: assert the XML <c>&lt;remarks&gt;</c> on
|
||||||
|
/// <see cref="DriverHealthReport"/> names <see cref="DriverState.Reconnecting"/> in its
|
||||||
|
/// state matrix. Catches a future doc-drift if someone re-aliases Reconnecting without
|
||||||
|
/// updating the matrix.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public void Doc_State_Matrix_Includes_Reconnecting()
|
||||||
|
{
|
||||||
|
var xmlPath = Path.Combine(
|
||||||
|
AppContext.BaseDirectory,
|
||||||
|
"ZB.MOM.WW.OtOpcUa.Core.xml");
|
||||||
|
File.Exists(xmlPath).ShouldBeTrue($"expected XML doc file at {xmlPath}");
|
||||||
|
|
||||||
|
var content = File.ReadAllText(xmlPath);
|
||||||
|
var driverHealthReportRemarks = ExtractRemarksFor(content, "T:ZB.MOM.WW.OtOpcUa.Core.Observability.DriverHealthReport");
|
||||||
|
|
||||||
|
driverHealthReportRemarks.ShouldContain("Reconnecting");
|
||||||
|
}
|
||||||
|
|
||||||
|
private static string ExtractRemarksFor(string xml, string member)
|
||||||
|
{
|
||||||
|
var memberStart = xml.IndexOf($"<member name=\"{member}\"", StringComparison.Ordinal);
|
||||||
|
if (memberStart < 0) return string.Empty;
|
||||||
|
var memberEnd = xml.IndexOf("</member>", memberStart, StringComparison.Ordinal);
|
||||||
|
if (memberEnd < 0) return string.Empty;
|
||||||
|
var slice = xml.Substring(memberStart, memberEnd - memberStart);
|
||||||
|
var remarksStart = slice.IndexOf("<remarks>", StringComparison.Ordinal);
|
||||||
|
if (remarksStart < 0) return string.Empty;
|
||||||
|
var remarksEnd = slice.IndexOf("</remarks>", remarksStart, StringComparison.Ordinal);
|
||||||
|
return remarksEnd < 0 ? string.Empty : slice.Substring(remarksStart, remarksEnd - remarksStart);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -148,4 +148,66 @@ public sealed class CapabilityInvokerTests
|
|||||||
|
|
||||||
builder.CachedPipelineCount.ShouldBe(2);
|
builder.CachedPipelineCount.ShouldBe(2);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-009 regression: ExecuteWriteAsync's non-idempotent branch must snapshot
|
||||||
|
/// <c>_optionsAccessor</c> exactly once per call. Calling it multiple times allocates
|
||||||
|
/// redundant options objects on the per-write hot path and creates a consistency hazard
|
||||||
|
/// where an Admin edit mid-call could observe two different snapshots.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public async Task ExecuteWriteAsync_NonIdempotent_Snapshots_Options_Once_Per_Call()
|
||||||
|
{
|
||||||
|
var options = new DriverResilienceOptions
|
||||||
|
{
|
||||||
|
Tier = DriverTier.A,
|
||||||
|
CapabilityPolicies = new Dictionary<DriverCapability, CapabilityPolicy>
|
||||||
|
{
|
||||||
|
[DriverCapability.Write] = new(TimeoutSeconds: 2, RetryCount: 3, BreakerFailureThreshold: 5),
|
||||||
|
},
|
||||||
|
};
|
||||||
|
var accessorCalls = 0;
|
||||||
|
var invoker = new CapabilityInvoker(
|
||||||
|
new DriverResiliencePipelineBuilder(),
|
||||||
|
"drv-test",
|
||||||
|
() => { Interlocked.Increment(ref accessorCalls); return options; });
|
||||||
|
|
||||||
|
await invoker.ExecuteWriteAsync(
|
||||||
|
"host-1",
|
||||||
|
isIdempotent: false,
|
||||||
|
_ => ValueTask.FromResult(0),
|
||||||
|
CancellationToken.None);
|
||||||
|
|
||||||
|
accessorCalls.ShouldBe(1,
|
||||||
|
"ExecuteWriteAsync's non-idempotent branch must capture the options snapshot exactly once per call");
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-009 regression — companion consistency assertion: the non-idempotent branch must
|
||||||
|
/// not observe two different option snapshots even if the accessor's returned value changes
|
||||||
|
/// between calls (simulating an Admin edit landing mid-flight). With a single snapshot the
|
||||||
|
/// two derived values (<c>with</c> base + <c>Resolve(Write)</c>) come from the same options
|
||||||
|
/// instance.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public async Task ExecuteWriteAsync_NonIdempotent_Uses_Consistent_Options_Snapshot()
|
||||||
|
{
|
||||||
|
var a = new DriverResilienceOptions { Tier = DriverTier.A };
|
||||||
|
var b = new DriverResilienceOptions { Tier = DriverTier.B };
|
||||||
|
var alternating = new[] { a, b, a, b }.AsEnumerable().GetEnumerator();
|
||||||
|
var invoker = new CapabilityInvoker(
|
||||||
|
new DriverResiliencePipelineBuilder(),
|
||||||
|
"drv-test",
|
||||||
|
() => { alternating.MoveNext(); return alternating.Current; });
|
||||||
|
|
||||||
|
// If options is read twice, the with-expression and Resolve(Write) come from
|
||||||
|
// different tier tables (A then B) — the resulting one-entry dictionary is
|
||||||
|
// inconsistent with the snapshot used for the rest of the options. Single-snapshot
|
||||||
|
// semantics guarantee the call sees a coherent view.
|
||||||
|
await Should.NotThrowAsync(async () => await invoker.ExecuteWriteAsync(
|
||||||
|
"host-1",
|
||||||
|
isIdempotent: false,
|
||||||
|
_ => ValueTask.FromResult(0),
|
||||||
|
CancellationToken.None));
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
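The two Core-009 tests above rely on the invoker reading its options accessor once and deriving every value from that single snapshot. A minimal sketch of the idea, assuming a hypothetical `Options` record and `ReadOnce` helper (neither exists in the project), might look like this:

```csharp
using System;

// Hypothetical options type for illustration; not the project's DriverResilienceOptions.
sealed record Options(int TimeoutSeconds, int RetryCount);

static class SnapshotDemo
{
    // Invokes the accessor exactly once, then derives both values from the
    // same instance so a concurrent edit cannot produce a mixed view.
    public static (int Timeout, int Retries) ReadOnce(Func<Options> accessor)
    {
        var snapshot = accessor();
        return (snapshot.TimeoutSeconds, snapshot.RetryCount);
    }

    static void Main()
    {
        var calls = 0;
        var result = ReadOnce(() => { calls++; return new Options(2, 3); });

        // Prints "2 3 calls=1": one accessor call served both reads.
        Console.WriteLine($"{result.Timeout} {result.Retries} calls={calls}");
    }
}
```

Counting accessor invocations, as the first test does with `Interlocked.Increment`, is a direct way to pin this behaviour down without inspecting the pipeline itself.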
|||||||
@@ -99,4 +99,49 @@ public sealed class DriverResilienceOptionsTests
|
|||||||
options.Resolve(DriverCapability.Write).ShouldBe(
|
options.Resolve(DriverCapability.Write).ShouldBe(
|
||||||
DriverResilienceOptions.GetTierDefaults(DriverTier.A)[DriverCapability.Write]);
|
DriverResilienceOptions.GetTierDefaults(DriverTier.A)[DriverCapability.Write]);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-010 regression: every <see cref="DriverCapability"/> value must successfully resolve
|
||||||
|
/// under every tier with a default <see cref="DriverResilienceOptions"/>. A future
|
||||||
|
/// enum-only addition that forgets to update <c>GetTierDefaults</c> would otherwise blow up
|
||||||
|
/// on the hot path with <see cref="KeyNotFoundException"/>.
|
||||||
|
/// </summary>
|
||||||
|
[Theory]
|
||||||
|
[InlineData(DriverTier.A)]
|
||||||
|
[InlineData(DriverTier.B)]
|
||||||
|
[InlineData(DriverTier.C)]
|
||||||
|
public void Resolve_Returns_NonNull_Policy_For_Every_Capability(DriverTier tier)
|
||||||
|
{
|
||||||
|
var options = new DriverResilienceOptions { Tier = tier };
|
||||||
|
|
||||||
|
foreach (var capability in Enum.GetValues<DriverCapability>())
|
||||||
|
{
|
||||||
|
var policy = options.Resolve(capability);
|
||||||
|
policy.ShouldNotBeNull(
|
||||||
|
$"every DriverCapability must resolve to a non-null policy for tier {tier} — {capability} did not");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-010 regression: when a capability is somehow missing from BOTH the override
|
||||||
|
/// map and the tier defaults (defensive — should be impossible thanks to the
|
||||||
|
/// <c>TierDefaults_Cover_EveryCapability</c> invariant, but is the failure mode the
|
||||||
|
/// finding flagged), <c>Resolve</c> must throw a diagnostic <see cref="KeyNotFoundException"/>
|
||||||
|
/// that names the missing capability and tier — not a bare lookup failure.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public void Resolve_Throws_Diagnostic_When_Capability_Missing_From_Tier_Defaults()
|
||||||
|
{
|
||||||
|
// Use a CapabilityPolicies dict that purposely omits one capability and use reflection
|
||||||
|
// to confirm the message names the capability when the tier defaults also omit it.
|
||||||
|
// We can't easily mutate GetTierDefaults so we exercise the documented behavior on a
|
||||||
|
// synthetic non-tier-known capability (we cast an out-of-range enum value).
|
||||||
|
var options = new DriverResilienceOptions { Tier = DriverTier.A };
|
||||||
|
var bogus = (DriverCapability)int.MaxValue;
|
||||||
|
|
||||||
|
var ex = Should.Throw<KeyNotFoundException>(() => options.Resolve(bogus));
|
||||||
|
ex.Message.ShouldContain(bogus.ToString());
|
||||||
|
ex.Message.ShouldContain(DriverTier.A.ToString());
|
||||||
|
ex.Message.ShouldContain(nameof(DriverResilienceOptions.GetTierDefaults));
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -109,4 +109,40 @@ public sealed class WedgeDetectorTests
|
|||||||
new DemandSignal(0, 0, 1, Now).HasPendingWork.ShouldBeTrue();
|
new DemandSignal(0, 0, 1, Now).HasPendingWork.ShouldBeTrue();
|
||||||
new DemandSignal(0, 0, 0, Now).HasPendingWork.ShouldBeFalse();
|
new DemandSignal(0, 0, 0, Now).HasPendingWork.ShouldBeFalse();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Core-012 regression: the XML <c>&lt;summary&gt;</c> on the <see cref="WedgeDetector"/>
|
||||||
|
/// constructor must accurately describe what the constructor does (take + clamp the
|
||||||
|
/// threshold). The previous text — "Whether the driver reported itself Healthy at
|
||||||
|
/// construction" — referenced behaviour the constructor doesn't perform.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public void Doc_Constructor_Summary_Describes_Threshold_Clamp()
|
||||||
|
{
|
||||||
|
var xmlPath = Path.Combine(
|
||||||
|
AppContext.BaseDirectory,
|
||||||
|
"ZB.MOM.WW.OtOpcUa.Core.xml");
|
||||||
|
File.Exists(xmlPath).ShouldBeTrue($"expected XML doc file at {xmlPath}");
|
||||||
|
|
||||||
|
var content = File.ReadAllText(xmlPath);
|
||||||
|
var ctorSummary = ExtractSummaryFor(content,
|
||||||
|
"M:ZB.MOM.WW.OtOpcUa.Core.Stability.WedgeDetector.#ctor(System.TimeSpan)");
|
||||||
|
|
||||||
|
ctorSummary.ShouldNotBeNullOrWhiteSpace();
|
||||||
|
ctorSummary.ShouldNotContain("reported itself");
|
||||||
|
ctorSummary.ShouldContain("threshold");
|
||||||
|
}
|
||||||
|
|
||||||
|
private static string ExtractSummaryFor(string xml, string member)
|
||||||
|
{
|
||||||
|
var memberStart = xml.IndexOf($"<member name=\"{member}\"", StringComparison.Ordinal);
|
||||||
|
if (memberStart < 0) return string.Empty;
|
||||||
|
var memberEnd = xml.IndexOf("</member>", memberStart, StringComparison.Ordinal);
|
||||||
|
if (memberEnd < 0) return string.Empty;
|
||||||
|
var slice = xml.Substring(memberStart, memberEnd - memberStart);
|
||||||
|
var sumStart = slice.IndexOf("<summary>", StringComparison.Ordinal);
|
||||||
|
if (sumStart < 0) return string.Empty;
|
||||||
|
var sumEnd = slice.IndexOf("</summary>", sumStart, StringComparison.Ordinal);
|
||||||
|
return sumEnd < 0 ? string.Empty : slice.Substring(sumStart, sumEnd - sumStart);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||