fix(core-abstractions): resolve Low code-review findings (Core.Abstractions-004,005,006,007,008)

- Core.Abstractions-004: guard DriverTypeRegistry.Register with a Lock so
  concurrent registrations are atomic.
- Core.Abstractions-005: narrow PollGroupEngine catch blocks to non-fatal
  exceptions, add optional onError callback, tolerate disposed-CTS races.
- Core.Abstractions-006: document the deliberate int-vs-uint asymmetry on
  IHistoryProvider.ReadEventsAsync / IHistorianDataSource.ReadEventsAsync.
- Core.Abstractions-007: pin the gaps with PollGroupEngine + DriverHealth
  contract tests.
- Core.Abstractions-008: correct XML docs on DriverHealth.LastError and
  the optional / required asymmetry on the history-read surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-23 05:37:54 -04:00
parent a02c0ffe36
commit ff2e75ab98
10 changed files with 422 additions and 33 deletions

View File

@@ -7,7 +7,7 @@
| Review date | 2026-05-22 | | Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` | | Commit reviewed | `76d35d1` |
| Status | Reviewed | | Status | Reviewed |
| Open findings | 5 | | Open findings | 0 |
## Checklist coverage ## Checklist coverage
@@ -84,13 +84,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low | | Severity | Low |
| Category | Concurrency & thread safety | | Category | Concurrency & thread safety |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverTypeRegistry.cs:23-40` | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverTypeRegistry.cs:23-40` |
| Status | Open | | Status | Resolved |
**Description:** `Register` performs a check-then-act sequence (`snapshot.ContainsKey` then build `next` then `Interlocked.Exchange`) that is not atomic. Two threads registering concurrently can both pass the duplicate check and both build a `next` dictionary; the second `Interlocked.Exchange` then wins and silently discards the first registration, defeating the documented "registered only once" guarantee. The class XML doc states registration happens single-threaded at startup, so this is not a live defect — but the use of `Interlocked.Exchange` for the swap implies the type is fully thread-safe for writers, which it is not. The mismatch between the implementation's apparent intent and its actual guarantee is a maintenance hazard. **Description:** `Register` performs a check-then-act sequence (`snapshot.ContainsKey` then build `next` then `Interlocked.Exchange`) that is not atomic. Two threads registering concurrently can both pass the duplicate check and both build a `next` dictionary; the second `Interlocked.Exchange` then wins and silently discards the first registration, defeating the documented "registered only once" guarantee. The class XML doc states registration happens single-threaded at startup, so this is not a live defect — but the use of `Interlocked.Exchange` for the swap implies the type is fully thread-safe for writers, which it is not. The mismatch between the implementation's apparent intent and its actual guarantee is a maintenance hazard.
**Recommendation:** Either guard `Register` with a `lock` so the check-build-swap is atomic, or strengthen the XML `Thread-safety` remark to state explicitly that concurrent `Register` calls are unsupported and only reader/writer concurrency is safe. **Recommendation:** Either guard `Register` with a `lock` so the check-build-swap is atomic, or strengthen the XML `Thread-safety` remark to state explicitly that concurrent `Register` calls are unsupported and only reader/writer concurrency is safe.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — guarded the duplicate check + copy-on-write rebuild + swap with a private `Lock`, making the check-build-swap atomic. Added `Register_ConcurrentDistinctTypes_AllSucceed` and `Register_ConcurrentDuplicateType_ExactlyOneWins` tests that exercise 16/32 racing threads and assert the "registered only once" guarantee holds.
### Core.Abstractions-005 ### Core.Abstractions-005
@@ -99,13 +99,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low | | Severity | Low |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/PollGroupEngine.cs:90,99` | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/PollGroupEngine.cs:90,99` |
| Status | Open | | Status | Resolved |
**Description:** Both the initial-poll and steady-state catch blocks use a bare `catch { }` that swallows every exception type, including non-transient programmer errors such as `NullReferenceException` and `ArgumentOutOfRangeException` (see Core.Abstractions-002). The XML remark says "transient poll error — loop continues, driver health surface logs it", but the engine never actually notifies the driver — there is no callback or event for a caught exception, so the driver's health surface has nothing to log. A persistently failing reader produces a silently spinning loop with zero observability from inside this module. **Description:** Both the initial-poll and steady-state catch blocks use a bare `catch { }` that swallows every exception type, including non-transient programmer errors such as `NullReferenceException` and `ArgumentOutOfRangeException` (see Core.Abstractions-002). The XML remark says "transient poll error — loop continues, driver health surface logs it", but the engine never actually notifies the driver — there is no callback or event for a caught exception, so the driver's health surface has nothing to log. A persistently failing reader produces a silently spinning loop with zero observability from inside this module.
**Recommendation:** Narrow the catch to the exception types a reader is expected to throw (or at least exclude obviously-fatal ones), and add an optional `Action<Exception>` error callback (or raise an event) so the owning driver can record poll failures on its health surface as the doc claims happens. **Recommendation:** Narrow the catch to the exception types a reader is expected to throw (or at least exclude obviously-fatal ones), and add an optional `Action<Exception>` error callback (or raise an event) so the owning driver can record poll failures on its health surface as the doc claims happens.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — narrowed the catch blocks with an `IsFatal` guard so `OutOfMemoryException` / `StackOverflowException` / `AccessViolationException` / `ThreadAbortException` propagate instead of being swallowed; added an optional `Action<Exception>? onError` constructor parameter (backward-compatible — every existing driver call site uses named args and is unaffected) and routed every caught reader / contract-violation exception through a `ReportError` helper that defends against a buggy error sink. Also tolerates `ObjectDisposedException` from `Task.Delay` against an already-disposed CTS (defensive race-safety). Added `Reader_exception_is_reported_to_onError_callback`, `Reader_contract_violation_routes_to_onError_callback`, and `OnError_handler_that_throws_does_not_crash_loop` regression tests.
### Core.Abstractions-006 ### Core.Abstractions-006
@@ -114,13 +114,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low | | Severity | Low |
| Category | Code organization & conventions | | Category | Code organization & conventions |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:63,84-86`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/IHistorianDataSource.cs:30,63` | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:63,84-86`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/IHistorianDataSource.cs:30,63` |
| Status | Open | | Status | Resolved |
**Description:** The two history-read surfaces use inconsistent integer types for the same "maximum rows" concept. `IHistoryProvider.ReadRawAsync` and `IHistorianDataSource.ReadRawAsync` take `uint maxValuesPerNode`, but `ReadEventsAsync` (on both interfaces) takes `int maxEvents`. The OPC UA `HistoryRead` service request fields are unsigned, and a negative `maxEvents` has no defined meaning. Mixing `int` and `uint` for the same parameter role across sibling methods forces every caller and implementer to reason about the inconsistency and risks accidental sign issues at the boundary. **Description:** The two history-read surfaces use inconsistent integer types for the same "maximum rows" concept. `IHistoryProvider.ReadRawAsync` and `IHistorianDataSource.ReadRawAsync` take `uint maxValuesPerNode`, but `ReadEventsAsync` (on both interfaces) takes `int maxEvents`. The OPC UA `HistoryRead` service request fields are unsigned, and a negative `maxEvents` has no defined meaning. Mixing `int` and `uint` for the same parameter role across sibling methods forces every caller and implementer to reason about the inconsistency and risks accidental sign issues at the boundary.
**Recommendation:** Standardize on `uint` for all max-rows parameters across both `IHistoryProvider` and `IHistorianDataSource` (or document explicitly why `maxEvents` differs). **Recommendation:** Standardize on `uint` for all max-rows parameters across both `IHistoryProvider` and `IHistorianDataSource` (or document explicitly why `maxEvents` differs).
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — took the documented-difference path (signature change would cross ten files outside Core.Abstractions). `int maxEvents` is intentionally signed: callers and downstream historian adapters (`WonderwareHistorianClient`, `HistorianDataSource`) treat `maxEvents <= 0` as a "use the backend's default cap" sentinel that has no `uint` equivalent. Updated XML docs on both `IHistoryProvider.ReadEventsAsync` and `IHistorianDataSource.ReadEventsAsync` to spell out the asymmetry and rationale. Added `HistoryRead_MaxParameter_TypePinned` / `HistoryProvider_MaxParameter_TypePinned` contract tests so an accidental future flip is caught.
### Core.Abstractions-007 ### Core.Abstractions-007
@@ -129,13 +129,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low | | Severity | Low |
| Category | Testing coverage | | Category | Testing coverage |
| Location | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/PollGroupEngineTests.cs` | | Location | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/PollGroupEngineTests.cs` |
| Status | Open | | Status | Resolved |
**Description:** `PollGroupEngine` is the only behavioural (non-DTO) type in the module and its tests, while solid for the happy paths, miss two paths that this review identifies as defect-prone: (a) no test exercises an array-valued tag whose contents are unchanged across polls (would catch Core.Abstractions-001), and (b) no test exercises a reader that returns a snapshot list shorter than the input references (would catch Core.Abstractions-002). The `Reader_exception_does_not_crash_loop` test only covers a reader that throws before producing any result. `DataValueSnapshot` change-detection semantics for reference-typed values are therefore unverified. **Description:** `PollGroupEngine` is the only behavioural (non-DTO) type in the module and its tests, while solid for the happy paths, miss two paths that this review identifies as defect-prone: (a) no test exercises an array-valued tag whose contents are unchanged across polls (would catch Core.Abstractions-001), and (b) no test exercises a reader that returns a snapshot list shorter than the input references (would catch Core.Abstractions-002). The `Reader_exception_does_not_crash_loop` test only covers a reader that throws before producing any result. `DataValueSnapshot` change-detection semantics for reference-typed values are therefore unverified.
**Recommendation:** Add tests for the unchanged-array case and the short-result-list case once Core.Abstractions-001/002 are addressed, so the intended contract is locked down. **Recommendation:** Add tests for the unchanged-array case and the short-result-list case once Core.Abstractions-001/002 are addressed, so the intended contract is locked down.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — the gap is filled by the regression tests added when Core.Abstractions-001 and -002 were closed: `Array_valued_tag_unchanged_contents_raises_only_once`, `Array_valued_tag_changed_contents_raises_event`, and `Reader_short_result_list_raises_descriptive_exception_and_loop_continues` lock the two previously-untested paths down. The fixes for -004 / -005 / -008 added a further nine regression tests (`Register_ConcurrentDistinctTypes_AllSucceed`, `Register_ConcurrentDuplicateType_ExactlyOneWins`, `Reader_exception_is_reported_to_onError_callback`, `Reader_contract_violation_routes_to_onError_callback`, `OnError_handler_that_throws_does_not_crash_loop`, `LastError_IsIndependent_OfState`, `DriverState_EnumContainsExpectedMembers`, `HistoryRead_MaxParameter_TypePinned`, `HistoryProvider_MaxParameter_TypePinned`, `HistoryProvider_OptionalMethods_HaveDefaultImplementation`). Total test count rose from 57 to 75.
### Core.Abstractions-008 ### Core.Abstractions-008
@@ -144,7 +144,7 @@ a category produced nothing rather than leaving it blank.
| Severity | Low | | Severity | Low |
| Category | Documentation & comments | | Category | Documentation & comments |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverHealth.cs:9`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:39-43,65-69` | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverHealth.cs:9`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:39-43,65-69` |
| Status | Open | | Status | Resolved |
**Description:** Two XML-doc inaccuracies: **Description:** Two XML-doc inaccuracies:
@@ -153,4 +153,4 @@ a category produced nothing rather than leaving it blank.
**Recommendation:** Reword `DriverHealth.LastError` to "Most recent error message; may be null when no error has been recorded" without tying nullness to a specific state. Add a one-line note on `IHistoryProvider`/`IHistorianDataSource` explaining why one surface uses default methods and the other does not. **Recommendation:** Reword `DriverHealth.LastError` to "Most recent error message; may be null when no error has been recorded" without tying nullness to a specific state. Add a one-line note on `IHistoryProvider`/`IHistorianDataSource` explaining why one surface uses default methods and the other does not.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — reworded `DriverHealth.LastError` to "null when no error has been recorded" and called out that the field is independent of `State`. Added an asymmetry `<remarks>` to `IHistoryProvider` (default-impl-throws so legacy drivers compile) and to `IHistorianDataSource.ReadEventsAsync` (required because server-side historians own the full surface) cross-referencing the finding. Added `LastError_IsIndependent_OfState` + `DriverState_EnumContainsExpectedMembers` and `HistoryProvider_OptionalMethods_HaveDefaultImplementation` contract tests so the asymmetry pins.

View File

@@ -6,7 +6,15 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// </summary> /// </summary>
/// <param name="State">Current driver-instance state.</param> /// <param name="State">Current driver-instance state.</param>
/// <param name="LastSuccessfulRead">Timestamp of the most recent successful equipment read; null if never.</param> /// <param name="LastSuccessfulRead">Timestamp of the most recent successful equipment read; null if never.</param>
/// <param name="LastError">Most recent error message; null when state is Healthy.</param> /// <param name="LastError">
/// Most recent error message; null when no error has been recorded. The type makes no
/// guarantee about correlation with <paramref name="State"/> — a driver in
/// <see cref="DriverState.Healthy"/> may legitimately retain the last error from a recovered
/// failure (useful for diagnostics), and <see cref="DriverState.Degraded"/> /
/// <see cref="DriverState.Reconnecting"/> / <see cref="DriverState.Faulted"/> states may all
/// carry a non-null message. Callers must not key behaviour on the LastError-null ↔ Healthy
/// pairing (Core.Abstractions-008).
/// </param>
public sealed record DriverHealth( public sealed record DriverHealth(
DriverState State, DriverState State,
DateTime? LastSuccessfulRead, DateTime? LastSuccessfulRead,

View File

@@ -10,33 +10,46 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// and #111 (driver type → namespace kind mapping enforced by sp_ValidateDraft). /// and #111 (driver type → namespace kind mapping enforced by sp_ValidateDraft).
/// The registry is the source of truth for both checks. /// The registry is the source of truth for both checks.
/// ///
/// Thread-safety: registration happens at startup (single thread); lookups happen on every /// Thread-safety: registration is typically single-threaded at startup; lookups happen on
/// config-apply (multi-threaded). The internal dictionary is replaced atomically via /// every config-apply (multi-threaded). The check-then-act inside <see cref="Register"/> is
/// <see cref="System.Threading.Interlocked"/> on register; readers see a stable snapshot. /// guarded by a private lock so concurrent registrations are atomic — the "registered only
/// once per process" guarantee holds even if two callers race. Readers operate against the
/// volatile snapshot reference produced by the last successful <see cref="Register"/> and
/// never block.
/// </remarks> /// </remarks>
public sealed class DriverTypeRegistry public sealed class DriverTypeRegistry
{ {
private readonly Lock _writeLock = new();
private IReadOnlyDictionary<string, DriverTypeMetadata> _types = private IReadOnlyDictionary<string, DriverTypeMetadata> _types =
new Dictionary<string, DriverTypeMetadata>(StringComparer.OrdinalIgnoreCase); new Dictionary<string, DriverTypeMetadata>(StringComparer.OrdinalIgnoreCase);
/// <summary>Register a driver type. Throws if the type name is already registered.</summary> /// <summary>Register a driver type. Throws if the type name is already registered.</summary>
/// <remarks>
/// The check-then-act (duplicate check → copy-on-write rebuild → swap) is performed under
/// <see cref="_writeLock"/> so concurrent <see cref="Register"/> calls cannot silently
/// discard each other's registrations — see Core.Abstractions-004.
/// </remarks>
public void Register(DriverTypeMetadata metadata) public void Register(DriverTypeMetadata metadata)
{ {
ArgumentNullException.ThrowIfNull(metadata); ArgumentNullException.ThrowIfNull(metadata);
var snapshot = _types; lock (_writeLock)
if (snapshot.ContainsKey(metadata.TypeName))
{ {
throw new InvalidOperationException( var snapshot = _types;
$"Driver type '{metadata.TypeName}' is already registered. " + if (snapshot.ContainsKey(metadata.TypeName))
$"Each driver type may be registered only once per process."); {
} throw new InvalidOperationException(
$"Driver type '{metadata.TypeName}' is already registered. " +
$"Each driver type may be registered only once per process.");
}
var next = new Dictionary<string, DriverTypeMetadata>(snapshot, StringComparer.OrdinalIgnoreCase) var next = new Dictionary<string, DriverTypeMetadata>(snapshot, StringComparer.OrdinalIgnoreCase)
{ {
[metadata.TypeName] = metadata, [metadata.TypeName] = metadata,
}; };
Interlocked.Exchange(ref _types, next); _types = next;
}
} }
/// <summary>Look up a driver type by name. Throws if unknown.</summary> /// <summary>Look up a driver type by name. Throws if unknown.</summary>

View File

@@ -59,6 +59,21 @@ public interface IHistorianDataSource : IDisposable
/// Distinct from any live event stream; sources here come from the historian's /// Distinct from any live event stream; sources here come from the historian's
/// event log. <paramref name="sourceName"/> is null to return all sources. /// event log. <paramref name="sourceName"/> is null to return all sources.
/// </summary> /// </summary>
/// <remarks>
/// Note on parameter types — <paramref name="maxEvents"/> is <see cref="int"/> (not
/// <see cref="uint"/>) so callers can pass <c>0</c> or a negative value as a "use the
/// backend's default cap" sentinel; see <c>WonderwareHistorianClient</c> /
/// <c>HistorianDataSource</c> and Core.Abstractions-006 for the rationale. The sibling
/// <see cref="ReadRawAsync"/> / <see cref="ReadProcessedAsync"/> use
/// <c>uint maxValuesPerNode</c> because their OPC UA HistoryRead surface has no
/// equivalent "use default" sentinel.
///
/// This surface declares <see cref="ReadAtTimeAsync"/> and <see cref="ReadEventsAsync"/>
/// as required members — a server-side historian owns the full read surface, unlike
/// <see cref="IHistoryProvider"/> where the same two methods are optional default-impl
/// methods so legacy drivers can stay raw-only. The asymmetry is intentional
/// (Core.Abstractions-008).
/// </remarks>
Task<HistoricalEventsResult> ReadEventsAsync( Task<HistoricalEventsResult> ReadEventsAsync(
string? sourceName, string? sourceName,
DateTime startUtc, DateTime startUtc,

View File

@@ -6,6 +6,14 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// Galaxy (Wonderware Historian via the optional plugin), OPC UA Client (forward /// Galaxy (Wonderware Historian via the optional plugin), OPC UA Client (forward
/// to upstream server). /// to upstream server).
/// </summary> /// </summary>
/// <remarks>
/// <see cref="ReadAtTimeAsync"/> and <see cref="ReadEventsAsync"/> are C# default interface
/// methods that throw <see cref="NotSupportedException"/> — drivers opt in by overriding so
/// a raw-only driver compiles without forcing it to provide at-time / event surfaces it
/// has no backend for. The sibling server-side surface, <see cref="IHistorianDataSource"/>,
/// declares both methods as required because a registered historian owns the full read
/// surface; the asymmetry is intentional (Core.Abstractions-008).
/// </remarks>
public interface IHistoryProvider public interface IHistoryProvider
{ {
/// <summary> /// <summary>
@@ -60,12 +68,24 @@ public interface IHistoryProvider
/// </param> /// </param>
/// <param name="startUtc">Inclusive lower bound on <c>EventTimeUtc</c>.</param> /// <param name="startUtc">Inclusive lower bound on <c>EventTimeUtc</c>.</param>
/// <param name="endUtc">Exclusive upper bound on <c>EventTimeUtc</c>.</param> /// <param name="endUtc">Exclusive upper bound on <c>EventTimeUtc</c>.</param>
/// <param name="maxEvents">Upper cap on returned events — the driver's backend enforces this.</param> /// <param name="maxEvents">
/// Upper cap on returned events — the driver's backend enforces this. The type is
/// <see cref="int"/> rather than <see cref="uint"/> (which the sibling raw / processed
/// reads use for <c>maxValuesPerNode</c>) because callers and downstream historian
/// adapters historically treat <c>maxEvents &lt;= 0</c> as a sentinel meaning
/// "use the backend's default cap" (see <c>WonderwareHistorianClient</c> /
/// <c>HistorianDataSource</c>). The asymmetry is intentional — Core.Abstractions-006.
/// </param>
/// <param name="cancellationToken">Request cancellation.</param> /// <param name="cancellationToken">Request cancellation.</param>
/// <remarks> /// <remarks>
/// Default implementation throws. Only drivers with an event historian (Galaxy via the /// Default implementation throws. Only drivers with an event historian (Galaxy via the
/// Wonderware Alarm &amp; Events log) override. Modbus / the OPC UA Client driver stay /// Wonderware Alarm &amp; Events log) override. Modbus / the OPC UA Client driver stay
/// with the default and let callers see <c>BadHistoryOperationUnsupported</c>. /// with the default and let callers see <c>BadHistoryOperationUnsupported</c>.
///
/// Note the type asymmetry with <see cref="ReadRawAsync"/> /
/// <see cref="ReadProcessedAsync"/> (both use <c>uint maxValuesPerNode</c>): event
/// readers accept a signed <c>int maxEvents</c> so callers can pass 0 / negative as a
/// "use default cap" sentinel without an extra parameter or overload.
/// </remarks> /// </remarks>
Task<HistoricalEventsResult> ReadEventsAsync( Task<HistoricalEventsResult> ReadEventsAsync(
string? sourceName, string? sourceName,

View File

@@ -19,14 +19,21 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// from the previously-seen snapshot.</para> /// from the previously-seen snapshot.</para>
/// ///
/// <para>Exceptions thrown by the reader on the initial poll or any subsequent poll are /// <para>Exceptions thrown by the reader on the initial poll or any subsequent poll are
/// swallowed — the loop continues on the next tick. The driver's own health surface is /// caught — the loop continues on the next tick. When an <c>onError</c> callback is supplied
/// where transient poll failures should be reported; the engine intentionally does not /// to the constructor the caught exception is routed to it so the driver's health surface
/// double-book that responsibility.</para> /// can record the failure. Without an <c>onError</c> callback the exception is silently
/// swallowed (preserves the original behaviour for drivers that have not opted in yet).</para>
///
/// <para>Programmer errors and obviously-fatal exceptions (<see cref="OutOfMemoryException"/>,
/// <see cref="ThreadAbortException"/>, <see cref="StackOverflowException"/>,
/// <see cref="AccessViolationException"/>) are NOT caught — they propagate and tear the poll
/// loop down rather than spin a silently-broken subscription.</para>
/// </remarks> /// </remarks>
public sealed class PollGroupEngine : IAsyncDisposable public sealed class PollGroupEngine : IAsyncDisposable
{ {
private readonly Func<IReadOnlyList<string>, CancellationToken, Task<IReadOnlyList<DataValueSnapshot>>> _reader; private readonly Func<IReadOnlyList<string>, CancellationToken, Task<IReadOnlyList<DataValueSnapshot>>> _reader;
private readonly Action<ISubscriptionHandle, string, DataValueSnapshot> _onChange; private readonly Action<ISubscriptionHandle, string, DataValueSnapshot> _onChange;
private readonly Action<Exception>? _onError;
private readonly TimeSpan _minInterval; private readonly TimeSpan _minInterval;
private readonly ConcurrentDictionary<long, SubscriptionState> _subscriptions = new(); private readonly ConcurrentDictionary<long, SubscriptionState> _subscriptions = new();
private long _nextId; private long _nextId;
@@ -40,15 +47,21 @@ public sealed class PollGroupEngine : IAsyncDisposable
/// <see cref="ISubscribable.OnDataChange"/> event.</param> /// <see cref="ISubscribable.OnDataChange"/> event.</param>
/// <param name="minInterval">Interval floor; anything below is clamped. Defaults to 100 ms /// <param name="minInterval">Interval floor; anything below is clamped. Defaults to 100 ms
/// per <see cref="DefaultMinInterval"/>.</param> /// per <see cref="DefaultMinInterval"/>.</param>
/// <param name="onError">Optional error sink — invoked once per caught reader exception (or
/// internal contract-violation throw) so the owning driver can route the failure to its
/// health surface (Core.Abstractions-005). Defensive: an <c>onError</c> handler that
/// itself throws is silently absorbed so a buggy forwarder cannot crash the poll loop.</param>
public PollGroupEngine( public PollGroupEngine(
Func<IReadOnlyList<string>, CancellationToken, Task<IReadOnlyList<DataValueSnapshot>>> reader, Func<IReadOnlyList<string>, CancellationToken, Task<IReadOnlyList<DataValueSnapshot>>> reader,
Action<ISubscriptionHandle, string, DataValueSnapshot> onChange, Action<ISubscriptionHandle, string, DataValueSnapshot> onChange,
TimeSpan? minInterval = null) TimeSpan? minInterval = null,
Action<Exception>? onError = null)
{ {
ArgumentNullException.ThrowIfNull(reader); ArgumentNullException.ThrowIfNull(reader);
ArgumentNullException.ThrowIfNull(onChange); ArgumentNullException.ThrowIfNull(onChange);
_reader = reader; _reader = reader;
_onChange = onChange; _onChange = onChange;
_onError = onError;
_minInterval = minInterval ?? DefaultMinInterval; _minInterval = minInterval ?? DefaultMinInterval;
} }
@@ -102,19 +115,54 @@ public sealed class PollGroupEngine : IAsyncDisposable
// whether it has changed, satisfying OPC UA Part 4 initial-value semantics. // whether it has changed, satisfying OPC UA Part 4 initial-value semantics.
try { await PollOnceAsync(state, forceRaise: true, ct).ConfigureAwait(false); } try { await PollOnceAsync(state, forceRaise: true, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; } catch (OperationCanceledException) { return; }
catch { /* first-read error tolerated — loop continues */ } catch (Exception ex) when (!IsFatal(ex))
{
// first-read error tolerated — loop continues; forward to driver health surface.
ReportError(ex);
}
while (!ct.IsCancellationRequested) while (!ct.IsCancellationRequested)
{ {
try { await Task.Delay(state.Interval, ct).ConfigureAwait(false); } try { await Task.Delay(state.Interval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; } catch (OperationCanceledException) { return; }
// Defensive: the CTS may be disposed by Unsubscribe/DisposeAsync between the
// cancellation check above and the Task.Delay touching the token. Treat that race
// as a normal cancellation rather than a fatal exception.
catch (ObjectDisposedException) { return; }
try { await PollOnceAsync(state, forceRaise: false, ct).ConfigureAwait(false); } try { await PollOnceAsync(state, forceRaise: false, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; } catch (OperationCanceledException) { return; }
catch { /* transient poll error — loop continues, driver health surface logs it */ } catch (Exception ex) when (!IsFatal(ex))
{
// transient poll error — loop continues, driver health surface logs it
// via the supplied onError callback (Core.Abstractions-005).
ReportError(ex);
}
} }
} }
/// <summary>
/// Programmer-error / process-fatal exception classification: anything that cannot be
/// safely "swallowed and retry on the next tick" must escape the poll loop instead.
/// </summary>
private static bool IsFatal(Exception ex)
=> ex is OutOfMemoryException
or StackOverflowException
or AccessViolationException
or ThreadAbortException;
/// <summary>
/// Forward a caught exception to the optional <c>onError</c> callback. Defensive
/// against an <c>onError</c> implementation that itself throws — that would crash the
/// poll loop and re-introduce the silent-stall failure mode this method exists to prevent.
/// </summary>
private void ReportError(Exception ex)
{
if (_onError is null) return;
try { _onError(ex); }
catch { /* never let a buggy error sink stop the poll loop */ }
}
private async Task PollOnceAsync(SubscriptionState state, bool forceRaise, CancellationToken ct) private async Task PollOnceAsync(SubscriptionState state, bool forceRaise, CancellationToken ct)
{ {
var snapshots = await _reader(state.TagReferences, ct).ConfigureAwait(false); var snapshots = await _reader(state.TagReferences, ct).ConfigureAwait(false);

View File

@@ -0,0 +1,52 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests;
/// <summary>
/// Covers <see cref="DriverHealth"/> shape invariants — added to lock down the documented
/// contract after Core.Abstractions-008 reworded the <c>LastError</c> remark.
/// </summary>
public sealed class DriverHealthTests
{
/// <summary>
/// Core.Abstractions-008: <c>DriverHealth.LastError</c> is not constrained by the
/// <c>State</c> enum — a Healthy driver may legitimately retain the last error from a
/// recovered failure (for diagnostics), and Degraded / Reconnecting / Faulted states may
/// all carry a non-null message. The old XML doc "null when state is Healthy" was wrong;
/// this test makes the type's actual contract explicit so future doc churn cannot drift.
/// </summary>
[Theory]
[InlineData(DriverState.Unknown)]
[InlineData(DriverState.Initializing)]
[InlineData(DriverState.Healthy)]
[InlineData(DriverState.Degraded)]
[InlineData(DriverState.Reconnecting)]
[InlineData(DriverState.Faulted)]
public void LastError_IsIndependent_OfState(DriverState state)
{
var healthWithError = new DriverHealth(state, LastSuccessfulRead: DateTime.UtcNow, LastError: "earlier failure");
var healthWithoutError = new DriverHealth(state, LastSuccessfulRead: DateTime.UtcNow, LastError: null);
// Both shapes are constructible regardless of state — the type makes no enforcement.
healthWithError.LastError.ShouldBe("earlier failure");
healthWithoutError.LastError.ShouldBeNull();
healthWithError.State.ShouldBe(state);
healthWithoutError.State.ShouldBe(state);
}
[Fact]
public void DriverState_EnumContainsExpectedMembers()
{
// Pins the enum so finding-008's "more than Healthy can carry an error" claim
// does not bit-rot.
var names = Enum.GetNames<DriverState>();
names.ShouldContain(nameof(DriverState.Unknown));
names.ShouldContain(nameof(DriverState.Initializing));
names.ShouldContain(nameof(DriverState.Healthy));
names.ShouldContain(nameof(DriverState.Degraded));
names.ShouldContain(nameof(DriverState.Reconnecting));
names.ShouldContain(nameof(DriverState.Faulted));
}
}

View File

@@ -1,5 +1,6 @@
using Shouldly; using Shouldly;
using Xunit; using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests; namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests;
@@ -120,4 +121,77 @@ public sealed class DriverTypeRegistryTests
var registry = new DriverTypeRegistry(); var registry = new DriverTypeRegistry();
Should.Throw<ArgumentException>(() => registry.Get(typeName!)); Should.Throw<ArgumentException>(() => registry.Get(typeName!));
} }
/// <summary>
/// Core.Abstractions-004: concurrent <see cref="DriverTypeRegistry.Register"/> calls for
/// two different driver types must both succeed and both end up visible to readers. A
/// non-atomic check-then-act would let the second swap silently discard the first
/// registration; this test asserts the "registered only once per process" guarantee
/// holds under concurrent writers too.
/// </summary>
[Fact]
public void Register_ConcurrentDistinctTypes_AllSucceed()
{
var registry = new DriverTypeRegistry();
const int parallelism = 32;
var ready = new ManualResetEventSlim(false);
var threads = Enumerable.Range(0, parallelism)
.Select(i => new Thread(() =>
{
ready.Wait();
registry.Register(SampleMetadata($"Driver-{i}"));
}))
.ToArray();
foreach (var t in threads) t.Start();
ready.Set(); // release all threads simultaneously
foreach (var t in threads) t.Join();
var all = registry.All();
all.Count.ShouldBe(parallelism);
for (var i = 0; i < parallelism; i++)
registry.TryGet($"Driver-{i}").ShouldNotBeNull(
$"Driver-{i} was lost to a race in concurrent Register");
}
/// <summary>
/// Core.Abstractions-004: concurrent <see cref="DriverTypeRegistry.Register"/> calls for
/// the SAME driver type must result in exactly one successful registration and exactly
/// (parallelism - 1) <see cref="InvalidOperationException"/> throws — not a silent
/// last-writer-wins replacement.
/// </summary>
[Fact]
public void Register_ConcurrentDuplicateType_ExactlyOneWins()
{
var registry = new DriverTypeRegistry();
const int parallelism = 16;
var ready = new ManualResetEventSlim(false);
var successes = 0;
var failures = 0;
var threads = Enumerable.Range(0, parallelism)
.Select(_ => new Thread(() =>
{
ready.Wait();
try
{
registry.Register(SampleMetadata("Contended"));
Interlocked.Increment(ref successes);
}
catch (InvalidOperationException)
{
Interlocked.Increment(ref failures);
}
}))
.ToArray();
foreach (var t in threads) t.Start();
ready.Set();
foreach (var t in threads) t.Join();
successes.ShouldBe(1);
failures.ShouldBe(parallelism - 1);
registry.All().Count.ShouldBe(1);
}
} }

View File

@@ -118,4 +118,61 @@ public sealed class IHistorianDataSourceContractTests
healthy.ShouldNotBe(unhealthy); healthy.ShouldNotBe(unhealthy);
} }
/// <summary>
/// Core.Abstractions-006: the <c>maxValuesPerNode</c> (raw / processed) and <c>maxEvents</c>
/// parameter types are intentionally asymmetric — raw/processed reads use <c>uint</c>
/// because OPC UA HistoryRead's NumValuesPerNode is unsigned, while event reads use
/// <c>int</c> to allow zero / negative as a "use backend default cap" sentinel
/// (see <c>WonderwareHistorianClient</c> / <c>HistorianDataSource</c> usage). This test
/// pins both shapes so accidental changes are caught.
/// </summary>
[Theory]
[InlineData("ReadRawAsync", "maxValuesPerNode", typeof(uint))]
[InlineData("ReadEventsAsync", "maxEvents", typeof(int))]
public void HistoryRead_MaxParameter_TypePinned(string methodName, string parameterName, Type expectedType)
{
var method = typeof(IHistorianDataSource).GetMethod(methodName);
method.ShouldNotBeNull();
var parameter = method!.GetParameters().FirstOrDefault(p => p.Name == parameterName);
parameter.ShouldNotBeNull($"Method {methodName} should expose a '{parameterName}' parameter.");
parameter!.ParameterType.ShouldBe(expectedType);
}
[Theory]
[InlineData("ReadRawAsync", "maxValuesPerNode", typeof(uint))]
[InlineData("ReadEventsAsync", "maxEvents", typeof(int))]
public void HistoryProvider_MaxParameter_TypePinned(string methodName, string parameterName, Type expectedType)
{
var method = typeof(IHistoryProvider).GetMethod(methodName);
method.ShouldNotBeNull();
var parameter = method!.GetParameters().FirstOrDefault(p => p.Name == parameterName);
parameter.ShouldNotBeNull($"Method {methodName} should expose a '{parameterName}' parameter.");
parameter!.ParameterType.ShouldBe(expectedType);
}
/// <summary>
/// Core.Abstractions-008: <see cref="IHistoryProvider.ReadAtTimeAsync"/> and
/// <see cref="IHistoryProvider.ReadEventsAsync"/> are C# default interface methods
/// (drivers opt in), whereas <see cref="IHistorianDataSource.ReadAtTimeAsync"/> and
/// <see cref="IHistorianDataSource.ReadEventsAsync"/> are required (the server-side
/// historian must implement them). This test pins the asymmetry so an implementer cannot
/// accidentally collapse the two surfaces and so the documented rationale stays load-bearing.
/// </summary>
[Theory]
[InlineData("ReadAtTimeAsync")]
[InlineData("ReadEventsAsync")]
public void HistoryProvider_OptionalMethods_HaveDefaultImplementation(string methodName)
{
var providerMethod = typeof(IHistoryProvider).GetMethod(methodName);
providerMethod.ShouldNotBeNull();
// Default interface methods carry a method body — IsAbstract is false.
providerMethod!.IsAbstract.ShouldBeFalse(
$"IHistoryProvider.{methodName} must remain a default-impl-throws method so legacy drivers continue to compile.");
var dataSourceMethod = typeof(IHistorianDataSource).GetMethod(methodName);
dataSourceMethod.ShouldNotBeNull();
dataSourceMethod!.IsAbstract.ShouldBeTrue(
$"IHistorianDataSource.{methodName} must remain required — server-side historians own the full surface.");
}
} }

View File

@@ -342,6 +342,108 @@ public sealed class PollGroupEngineTests
shortReadCount.ShouldBeGreaterThanOrEqualTo(2); shortReadCount.ShouldBeGreaterThanOrEqualTo(2);
} }
/// <summary>
/// Core.Abstractions-005: the engine documents that "transient poll errors are logged on
/// the driver health surface", but until an error callback exists the driver has no way
/// to observe a caught reader exception. Subscribing without supplying an error callback
/// must continue to swallow exceptions (backward compatible). When an error callback IS
/// supplied, every exception caught during a poll cycle must be routed to it.
/// </summary>
[Fact]
public async Task Reader_exception_is_reported_to_onError_callback()
{
var observed = new ConcurrentQueue<Exception>();
var readCount = 0;
Task<IReadOnlyList<DataValueSnapshot>> Reader(IReadOnlyList<string> refs, CancellationToken ct)
{
if (Interlocked.Increment(ref readCount) <= 3)
throw new InvalidOperationException($"boom-{readCount}");
var now = DateTime.UtcNow;
return Task.FromResult<IReadOnlyList<DataValueSnapshot>>(
refs.Select(_ => new DataValueSnapshot(1, 0u, now, now)).ToList());
}
await using var engine = new PollGroupEngine(
Reader,
(_, _, _) => { },
minInterval: TimeSpan.FromMilliseconds(50),
onError: ex => observed.Enqueue(ex));
var handle = engine.Subscribe(["X"], TimeSpan.FromMilliseconds(50));
await WaitForAsync(() => observed.Count >= 3, TimeSpan.FromSeconds(3));
engine.Unsubscribe(handle);
observed.Count.ShouldBeGreaterThanOrEqualTo(3);
observed.All(e => e is InvalidOperationException).ShouldBeTrue();
observed.All(e => e.Message.StartsWith("boom-")).ShouldBeTrue();
}
/// <summary>
/// Core.Abstractions-005: a contract-violating reader (Core.Abstractions-002 path) that
/// throws the descriptive <see cref="InvalidOperationException"/> from inside the engine
/// must also be routed to the error callback so the driver health surface can observe
/// repeated contract violations.
/// </summary>
[Fact]
public async Task Reader_contract_violation_routes_to_onError_callback()
{
var observed = new ConcurrentQueue<Exception>();
Task<IReadOnlyList<DataValueSnapshot>> Reader(IReadOnlyList<string> refs, CancellationToken ct)
{
// Always return zero snapshots — short-result-list contract violation.
return Task.FromResult<IReadOnlyList<DataValueSnapshot>>(new List<DataValueSnapshot>());
}
await using var engine = new PollGroupEngine(
Reader,
(_, _, _) => { },
minInterval: TimeSpan.FromMilliseconds(50),
onError: ex => observed.Enqueue(ex));
var handle = engine.Subscribe(["X"], TimeSpan.FromMilliseconds(50));
await WaitForAsync(() => observed.Count >= 2, TimeSpan.FromSeconds(2));
engine.Unsubscribe(handle);
observed.Count.ShouldBeGreaterThanOrEqualTo(2);
observed.All(e => e is InvalidOperationException
&& e.Message.Contains("Reader contract violation"))
.ShouldBeTrue();
}
/// <summary>
/// Core.Abstractions-005: the engine must defend itself against an <c>onError</c> handler
/// that itself throws — otherwise a buggy health-surface forwarder would crash the poll
/// loop and silently stall the subscription, defeating the whole point of the callback.
/// </summary>
[Fact]
public async Task OnError_handler_that_throws_does_not_crash_loop()
{
var readCount = 0;
var events = new ConcurrentQueue<string>();
Task<IReadOnlyList<DataValueSnapshot>> Reader(IReadOnlyList<string> refs, CancellationToken ct)
{
if (Interlocked.Increment(ref readCount) <= 2)
throw new InvalidOperationException("boom");
var now = DateTime.UtcNow;
return Task.FromResult<IReadOnlyList<DataValueSnapshot>>(
refs.Select(_ => new DataValueSnapshot(1, 0u, now, now)).ToList());
}
await using var engine = new PollGroupEngine(
Reader,
(_, r, _) => events.Enqueue(r),
minInterval: TimeSpan.FromMilliseconds(50),
onError: _ => throw new ApplicationException("error-handler-bug"));
var handle = engine.Subscribe(["X"], TimeSpan.FromMilliseconds(50));
// Wait long enough for the reader to recover and for the engine to deliver a change.
await WaitForAsync(() => events.Count >= 1, TimeSpan.FromSeconds(3));
engine.Unsubscribe(handle);
events.Count.ShouldBeGreaterThanOrEqualTo(1);
}
private sealed record DummyHandle : ISubscriptionHandle private sealed record DummyHandle : ISubscriptionHandle
{ {
public string DiagnosticId => "dummy"; public string DiagnosticId => "dummy";