# Code Review — Core.Abstractions | Field | Value | |---|---| | Module | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions` | | Reviewer | Claude Code | | Review date | 2026-06-19 (re-review; original 2026-05-22) | | Commit reviewed | `7286d320` | | Status | Reviewed | | Open findings | 0 | ## Checklist coverage — original review (76d35d1, 2026-05-22) A comprehensive review completes every category, recording "No issues found" where a category produced nothing rather than leaving it blank. | # | Category | Result | |---|---|---| | 1 | Correctness & logic bugs | Core.Abstractions-001, Core.Abstractions-002 | | 2 | OtOpcUa conventions | No issues found | | 3 | Concurrency & thread safety | Core.Abstractions-003, Core.Abstractions-004 | | 4 | Error handling & resilience | Core.Abstractions-005 | | 5 | Security | No issues found | | 6 | Performance & resource management | No issues found | | 7 | Design-document adherence | No issues found | | 8 | Code organization & conventions | Core.Abstractions-006 | | 9 | Testing coverage | Core.Abstractions-007 | | 10 | Documentation & comments | Core.Abstractions-008 | ## Checklist coverage — re-review (7286d320, 2026-06-19) | # | Category | Result | |---|---|---| | 1 | Correctness & logic bugs | No new issues found | | 2 | OtOpcUa conventions | No issues found | | 3 | Concurrency & thread safety | No new issues found | | 4 | Error handling & resilience | No new issues found | | 5 | Security | No issues found | | 6 | Performance & resource management | No issues found | | 7 | Design-document adherence | No issues found | | 8 | Code organization & conventions | No new issues found | | 9 | Testing coverage | Core.Abstractions-010 | | 10 | Documentation & comments | Core.Abstractions-009 | ## Findings ### Core.Abstractions-001 | Field | Value | |---|---| | Severity | Medium | | Category | Correctness & logic bugs | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/PollGroupEngine.cs:112` | | Status | Resolved | **Description:** `PollOnceAsync` detects a change with `!Equals(lastSeen?.Value, current.Value)`. `object.Equals` falls back to reference equality for reference types that do not override it — including `T[]` array values. The capability interfaces explicitly support 1-D array attributes (`DriverAttributeInfo.IsArray`, `ValueRank=1`), and a driver's batch reader produces a fresh array instance on every poll. As a result every poll of an array-valued tag is treated as a change, so `OnDataChange` fires on every tick regardless of whether the array contents actually changed. This produces spurious data-change notifications and unnecessary OPC UA monitored-item publishes for any poll-based driver (Modbus, S7, AB CIP, FOCAS) that exposes array tags. **Recommendation:** Compare array values structurally — e.g. when both `lastSeen?.Value` and `current.Value` are arrays, compare with `StructuralComparisons.StructuralEqualityComparer.Equals` (or element-wise) — instead of relying on `object.Equals`. Add a test covering an array-valued tag whose contents are unchanged across polls. **Resolution:** Resolved 2026-05-22 — introduced `ValuesAreDifferent` helper in `PollGroupEngine` that uses `StructuralComparisons.StructuralEqualityComparer` for `Array` values, falling back to `object.Equals` for scalars; added `Array_valued_tag_unchanged_contents_raises_only_once` and `Array_valued_tag_changed_contents_raises_event` tests. ### Core.Abstractions-002 | Field | Value | |---|---| | Severity | Medium | | Category | Correctness & logic bugs | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/PollGroupEngine.cs:105-109` | | Status | Resolved | **Description:** `PollOnceAsync` iterates `state.TagReferences` and indexes the reader's result with `snapshots[i]`, assuming the driver-supplied `_reader` delegate returns exactly one snapshot per input reference in input order. The contract is documented (ctor XML doc: "snapshots MUST be returned in the same order as the input references"), but it is never validated. A reader that returns a shorter list — a plausible driver bug, or a partial result on a backend error — throws `ArgumentOutOfRangeException` from the indexer. That exception escapes `PollOnceAsync`, is swallowed by the catch-all in `PollLoopAsync` (line 99), and the subscription then silently produces no further `OnDataChange` callbacks for the rest of its lifetime with no diagnostic. The failure mode is a permanently stalled subscription that looks healthy. **Recommendation:** Validate `snapshots.Count == state.TagReferences.Count` at the top of `PollOnceAsync` and throw a descriptive exception (or skip the tick with a logged diagnostic) so the contract violation is visible rather than silently degrading. Consider surfacing repeated reader-contract failures through a callback the driver can route to its health surface. **Resolution:** Resolved 2026-05-22 — added count-guard at the top of `PollOnceAsync` that throws `InvalidOperationException` with a descriptive message when the reader returns the wrong number of snapshots; added `Reader_short_result_list_raises_descriptive_exception_and_loop_continues` test verifying the loop survives contract violations and resumes delivering events once the reader recovers. ### Core.Abstractions-003 | Field | Value | |---|---| | Severity | Medium | | Category | Concurrency & thread safety | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/PollGroupEngine.cs:64,121-130` | | Status | Resolved | **Description:** `Subscribe` starts the poll loop with a fire-and-forget `Task.Run` and keeps no reference to the returned `Task`. Neither `Unsubscribe` nor `DisposeAsync` awaits the loop's completion — they only cancel the `CancellationTokenSource` and dispose it. Two consequences: 1. After `DisposeAsync`/`Unsubscribe` returns, a poll already in flight inside `PollOnceAsync` can still complete and invoke the `_onChange` callback. A driver that disposes the engine during shutdown can therefore receive a data-change callback after it considers the engine torn down, with no way to know the engine is gone. 2. `Unsubscribe`/`DisposeAsync` call `state.Cts.Dispose()` immediately while the loop may still be inside `Task.Delay(state.Interval, ct)`. Cancelling-then-disposing a CTS while a consumer still touches the token can race; `Task.Delay` on a disposed token can throw `ObjectDisposedException` rather than `OperationCanceledException`, which the `Task.Delay` await in `PollLoopAsync` does not catch (it catches only `OperationCanceledException`). **Recommendation:** Track each loop `Task` in `SubscriptionState` and await it (with a timeout) in `Unsubscribe`/`DisposeAsync` before disposing the CTS, so disposal is deterministic and no callback can fire after teardown. At minimum, defer `Cts.Dispose()` until the loop task has observed cancellation, or wrap the `Task.Delay` await to also tolerate `ObjectDisposedException`. **Resolution:** Resolved 2026-05-22 — stored the loop `Task` in `SubscriptionState.LoopTask`; `Unsubscribe` calls `StopState` which cancels then awaits the task (5 s timeout) before disposing the CTS; `DisposeAsync` cancels all loops in parallel then awaits them all via `Task.WhenAll` with a 5 s timeout before disposing CTSs, making teardown deterministic and preventing post-disposal callbacks. ### Core.Abstractions-004 | Field | Value | |---|---| | Severity | Low | | Category | Concurrency & thread safety | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverTypeRegistry.cs:23-40` | | Status | Resolved | **Description:** `Register` performs a check-then-act sequence (`snapshot.ContainsKey` then build `next` then `Interlocked.Exchange`) that is not atomic. Two threads registering concurrently can both pass the duplicate check and both build a `next` dictionary; the second `Interlocked.Exchange` then wins and silently discards the first registration, defeating the documented "registered only once" guarantee. The class XML doc states registration happens single-threaded at startup, so this is not a live defect — but the use of `Interlocked.Exchange` for the swap implies the type is fully thread-safe for writers, which it is not. The mismatch between the implementation's apparent intent and its actual guarantee is a maintenance hazard. **Recommendation:** Either guard `Register` with a `lock` so the check-build-swap is atomic, or strengthen the XML `Thread-safety` remark to state explicitly that concurrent `Register` calls are unsupported and only reader/writer concurrency is safe. **Resolution:** Resolved 2026-05-23 — guarded the duplicate check + copy-on-write rebuild + swap with a private `Lock`, making the check-build-swap atomic. Added `Register_ConcurrentDistinctTypes_AllSucceed` and `Register_ConcurrentDuplicateType_ExactlyOneWins` tests that exercise 16/32 racing threads and assert the "registered only once" guarantee holds. ### Core.Abstractions-005 | Field | Value | |---|---| | Severity | Low | | Category | Error handling & resilience | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/PollGroupEngine.cs:90,99` | | Status | Resolved | **Description:** Both the initial-poll and steady-state catch blocks use a bare `catch { }` that swallows every exception type, including non-transient programmer errors such as `NullReferenceException` and `ArgumentOutOfRangeException` (see Core.Abstractions-002). The XML remark says "transient poll error — loop continues, driver health surface logs it", but the engine never actually notifies the driver — there is no callback or event for a caught exception, so the driver's health surface has nothing to log. A persistently failing reader produces a silently spinning loop with zero observability from inside this module. **Recommendation:** Narrow the catch to the exception types a reader is expected to throw (or at least exclude obviously-fatal ones), and add an optional `Action` error callback (or raise an event) so the owning driver can record poll failures on its health surface as the doc claims happens. **Resolution:** Resolved 2026-05-23 — narrowed the catch blocks with an `IsFatal` guard so `OutOfMemoryException` / `StackOverflowException` / `AccessViolationException` / `ThreadAbortException` propagate instead of being swallowed; added an optional `Action? onError` constructor parameter (backward-compatible — every existing driver call site uses named args and is unaffected) and routed every caught reader / contract-violation exception through a `ReportError` helper that defends against a buggy error sink. Also tolerates `ObjectDisposedException` from `Task.Delay` against an already-disposed CTS (defensive race-safety). Added `Reader_exception_is_reported_to_onError_callback`, `Reader_contract_violation_routes_to_onError_callback`, and `OnError_handler_that_throws_does_not_crash_loop` regression tests. ### Core.Abstractions-006 | Field | Value | |---|---| | Severity | Low | | Category | Code organization & conventions | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:63,84-86`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/IHistorianDataSource.cs:30,63` | | Status | Resolved | **Description:** The two history-read surfaces use inconsistent integer types for the same "maximum rows" concept. `IHistoryProvider.ReadRawAsync` and `IHistorianDataSource.ReadRawAsync` take `uint maxValuesPerNode`, but `ReadEventsAsync` (on both interfaces) takes `int maxEvents`. The OPC UA `HistoryRead` service request fields are unsigned, and a negative `maxEvents` has no defined meaning. Mixing `int` and `uint` for the same parameter role across sibling methods forces every caller and implementer to reason about the inconsistency and risks accidental sign issues at the boundary. **Recommendation:** Standardize on `uint` for all max-rows parameters across both `IHistoryProvider` and `IHistorianDataSource` (or document explicitly why `maxEvents` differs). **Resolution:** Resolved 2026-05-23 — took the documented-difference path (signature change would cross ten files outside Core.Abstractions). `int maxEvents` is intentionally signed: callers and downstream historian adapters (`WonderwareHistorianClient`, `HistorianDataSource`) treat `maxEvents <= 0` as a "use the backend's default cap" sentinel that has no `uint` equivalent. Updated XML docs on both `IHistoryProvider.ReadEventsAsync` and `IHistorianDataSource.ReadEventsAsync` to spell out the asymmetry and rationale. Added `HistoryRead_MaxParameter_TypePinned` / `HistoryProvider_MaxParameter_TypePinned` contract tests so an accidental future flip is caught. ### Core.Abstractions-007 | Field | Value | |---|---| | Severity | Low | | Category | Testing coverage | | Location | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/PollGroupEngineTests.cs` | | Status | Resolved | **Description:** `PollGroupEngine` is the only behavioural (non-DTO) type in the module and its tests, while solid for the happy paths, miss two paths that this review identifies as defect-prone: (a) no test exercises an array-valued tag whose contents are unchanged across polls (would catch Core.Abstractions-001), and (b) no test exercises a reader that returns a snapshot list shorter than the input references (would catch Core.Abstractions-002). The `Reader_exception_does_not_crash_loop` test only covers a reader that throws before producing any result. `DataValueSnapshot` change-detection semantics for reference-typed values are therefore unverified. **Recommendation:** Add tests for the unchanged-array case and the short-result-list case once Core.Abstractions-001/002 are addressed, so the intended contract is locked down. **Resolution:** Resolved 2026-05-23 — the gap is filled by the regression tests added when Core.Abstractions-001 and -002 were closed: `Array_valued_tag_unchanged_contents_raises_only_once`, `Array_valued_tag_changed_contents_raises_event`, and `Reader_short_result_list_raises_descriptive_exception_and_loop_continues` lock the two previously-untested paths down. The fixes for -004 / -005 / -008 added a further nine regression tests (`Register_ConcurrentDistinctTypes_AllSucceed`, `Register_ConcurrentDuplicateType_ExactlyOneWins`, `Reader_exception_is_reported_to_onError_callback`, `Reader_contract_violation_routes_to_onError_callback`, `OnError_handler_that_throws_does_not_crash_loop`, `LastError_IsIndependent_OfState`, `DriverState_EnumContainsExpectedMembers`, `HistoryRead_MaxParameter_TypePinned`, `HistoryProvider_MaxParameter_TypePinned`, `HistoryProvider_OptionalMethods_HaveDefaultImplementation`). Total test count rose from 57 to 75. ### Core.Abstractions-008 | Field | Value | |---|---| | Severity | Low | | Category | Documentation & comments | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverHealth.cs:9`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:39-43,65-69` | | Status | Resolved | **Description:** Two XML-doc inaccuracies: 1. `DriverHealth.LastError` is documented as "Most recent error message; null when state is Healthy." The `DriverState` enum also defines `Degraded`, `Reconnecting`, and `Faulted` states, all of which carry an error; and a driver in `Healthy` state may legitimately retain the last error from a previously-recovered failure. The "null when Healthy" claim is stronger than the type enforces and than callers should rely on. 2. `IHistoryProvider.ReadAtTimeAsync` and `ReadEventsAsync` are C# default interface methods whose `` say "Default implementation throws". This is accurate, but the sibling `IHistorianDataSource` declares the same methods as required (non-default) members — the asymmetry between the two history surfaces is undocumented and could surprise an implementer who assumes parity. **Recommendation:** Reword `DriverHealth.LastError` to "Most recent error message; may be null when no error has been recorded" without tying nullness to a specific state. Add a one-line note on `IHistoryProvider`/`IHistorianDataSource` explaining why one surface uses default methods and the other does not. **Resolution:** Resolved 2026-05-23 — reworded `DriverHealth.LastError` to "null when no error has been recorded" and called out that the field is independent of `State`. Added an asymmetry `` to `IHistoryProvider` (default-impl-throws so legacy drivers compile) and to `IHistorianDataSource.ReadEventsAsync` (required because server-side historians own the full surface) cross-referencing the finding. Added `LastError_IsIndependent_OfState` + `DriverState_EnumContainsExpectedMembers` and `HistoryProvider_OptionalMethods_HaveDefaultImplementation` contract tests so the asymmetry pins. --- ## Re-review 2026-06-19 (commit 7286d320) Changes since `76d35d1`: all eight original findings resolved. New code added: `EquipmentTagRefResolver` (generic helper bridging authored tag-table lookup and equipment-tag JSON ref parsing, with `ConcurrentDictionary` negative-result cache), `NullHistorianDataSource` singleton, `AlarmAcknowledgeRequest.OperatorUser` optional param, `AlarmEventArgs.Kind` (`AlarmTransitionKind`), `NamespaceKindCompatibility.SystemPlatform` removed, `AlarmConditionInfo` sub-attribute refs, numerous XML-doc backfill. Test count rose to 88 at the prior commit. Re-review scope: verify all prior findings are properly closed, evaluate all new code against the 10 categories, and assess the OpcUaServer-002 cross-module item. ### Core.Abstractions-009 | Field | Value | |---|---| | Severity | Low | | Category | Documentation & comments | | Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:89-107`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/Historian/IHistorianDataSource.cs:91-101` | | Status | Resolved | **Description (OpcUaServer-002 verdict):** The `ReadEventsAsync` `maxEvents <= 0` sentinel contract is present (Core.Abstractions-006 fix), but neither `IHistoryProvider.ReadEventsAsync` nor `IHistorianDataSource.ReadEventsAsync` specifies what a compliant implementation MUST do with `ContinuationPoint` when the backend caps the result. Without this, callers cannot distinguish "all matching events were returned" from "the backend hit its cap and silently truncated." The OpcUaServer review noted that when `NumValuesPerNode == 0` ("all events") the server dispatches with `maxEvents = 0` and gets back a result whose `ContinuationPoint` is null even when the backend capped it — causing the server to report `GoodNoData` / empty with no `GoodMoreData` response. This is a contract-under-specification issue in the interface definition: it needs to state that when the backend cap truncates, `ContinuationPoint` MUST be non-null. Changing the signature is forbidden (public-contract break); the fix is a documentation clarification. **Recommendation:** Add a `Continuation contract when using the sentinel:` note to `IHistoryProvider.ReadEventsAsync` and update the `` doc in `IHistorianDataSource.ReadEventsAsync` to state: implementations MUST set `ContinuationPoint` non-null when the backend cap truncates the result; null means all matching events were returned. **Resolution:** Resolved 2026-06-19 — added explicit continuation-contract note to `IHistoryProvider.ReadEventsAsync` `` and to `IHistorianDataSource.ReadEventsAsync` `` (both: "when backend cap truncates, MUST set ContinuationPoint non-null; null means all events returned"). Added `ReadEventsAsync_sentinel_and_continuation_contract_types_pinned` contract test in `IHistorianDataSourceContractTests` pinning `maxEvents` type and `ContinuationPoint` member presence. SHA: (2026-06-19, blank — not committed). ### Core.Abstractions-010 | Field | Value | |---|---| | Severity | Low | | Category | Testing coverage | | Location | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/PollGroupEngineTests.cs:68,111,128,246,280` | | Status | Resolved | **Description:** Five `Task.Delay(ms)` calls in `PollGroupEngineTests` omit a `CancellationToken`, triggering xUnit v3 analyser warning xUnit1051 ("Calls to methods which accept CancellationToken should use `TestContext.Current.CancellationToken`"). These delays cannot be cancelled when a test is cancelled or times out, making those test methods unresponsive to test-runner abort signals. **Recommendation:** Pass `TestContext.Current.CancellationToken` as the second argument to every `Task.Delay` call inside test methods. **Resolution:** Resolved 2026-06-19 — updated all five bare `Task.Delay(ms)` calls in `PollGroupEngineTests` (lines 68, 111, 128, 246, 280 at review commit) to `Task.Delay(ms, TestContext.Current.CancellationToken)`. Build warnings eliminated; test count unchanged at 88 (+ 2 new from -009 fix = 90 total). SHA: (2026-06-19, blank — not committed). #### Note: EquipmentTagRefResolver null-safety improvement (not a separately-tracked finding) The `TryResolve` out-parameter `def` was typed as `out TDef def` (non-nullable) but set to `null!` on the false path — idiomatic but silently unsound in NRT-enabled callers. Added `[MaybeNullWhen(false)]` attribute and changed `null!` to `default`. Pinned by new `TryResolve_false_path_sets_def_to_default` test in `EquipmentTagRefResolverTests`.