bac6613dd2
First review of the v2 OPC UA core at HEAD 7286d320. 6 findings (2 Medium, 4 Low).
OpcUaServer-006 fixed (stale NodeManager/ApplicationHost XML docs). 001-004 deferred
(cross-module: Runtime publish-actor / Core.Abstractions history contract / Wonderware
boundary semantics, or latent-only). 005 re-triaged Won't-Fix (coverage already exists).
High-scrutiny paths (Lock discipline, OnWriteValue fire-and-forget, WriteOperate/AlarmAck
gates, HistoryRead AccessLevel bits) verified correct.
212 lines
13 KiB
Markdown
212 lines
13 KiB
Markdown
# Code Review — OpcUaServer
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Module | `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer` |
|
|
| Reviewer | Claude Code |
|
|
| Review date | 2026-06-19 |
|
|
| Commit reviewed | `7286d320` |
|
|
| Status | Reviewed |
|
|
| Open findings | 4 |
|
|
|
|
## Checklist coverage
|
|
|
|
A comprehensive review completes every category, recording "No issues found" where
|
|
a category produced nothing rather than leaving it blank.
|
|
|
|
| # | Category | Result |
|
|
|---|---|---|
|
|
| 1 | Correctness & logic bugs | OpcUaServer-001, -002, -003 |
|
|
| 2 | OtOpcUa conventions | No issues found |
|
|
| 3 | Concurrency & thread safety | No issues found (Lock discipline + fire-and-forget dispatch verified correct) |
|
|
| 4 | Error handling & resilience | OpcUaServer-004 |
|
|
| 5 | Security | No issues found (WriteOperate / AlarmAck gates fail closed; anonymous → BadUserAccessDenied; HistoryRead AccessLevel bits correct) |
|
|
| 6 | Performance & resource management | No issues found |
|
|
| 7 | Design-document adherence | No issues found (ConditionId=dotted-FullName routing, MapSeverity buckets, fire-and-forget dispatch all match CLAUDE.md / docs) |
|
|
| 8 | Code organization & conventions | No issues found |
|
|
| 9 | Testing coverage | No issues found (HistoryRead paging incl. oversized-tie-cluster + backstop is covered end-to-end; OpcUaServer-005 re-triaged to Won't Fix) |
|
|
| 10 | Documentation & comments | OpcUaServer-006 (Resolved) |
|
|
|
|
## Findings
|
|
|
|
<!-- One ### entry per finding. IDs are OpcUaServer-NNN, sequential within the module,
|
|
never reused. Findings are never deleted — close them by changing Status and
|
|
completing Resolution. -->
|
|
|
|
### OpcUaServer-001
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Location | `AddressSpacePlan.cs:56` (`AddressSpacePlan.IsEmpty`), `AddressSpacePlan.cs:80` (`AddressSpacePlanner.Compute`) |
|
|
| Status | Open |
|
|
|
|
**Description:** `AddressSpaceComposition` carries the UNS topology (`UnsAreas` + `UnsLines`), and
|
|
`AddressSpaceApplier.MaterialiseHierarchy` uses each area's/line's `DisplayName` for the OPC UA
|
|
folder display name. But `AddressSpacePlanner.Compute` only diffs Equipment / DriverInstance /
|
|
ScriptedAlarm / EquipmentTag / EquipmentVirtualTag — it never diffs `UnsAreas` / `UnsLines`. So a
|
|
deployment whose ONLY change is a UNS Area or Line **rename** (no equipment/driver/alarm/tag/vtag
|
|
delta) produces a plan whose `IsEmpty` is `true`. In `OpcUaPublishActor.HandleRebuild`
|
|
(`src/Server/ZB.MOM.WW.OtOpcUa.Runtime/OpcUa/OpcUaPublishActor.cs:313-318`) an empty plan
|
|
short-circuits BEFORE `MaterialiseHierarchy` runs, so the renamed Area/Line folder keeps its stale
|
|
display name until some unrelated structural change forces a full rebuild. Compounding it,
|
|
`OtOpcUaNodeManager.EnsureFolder` (`OtOpcUaNodeManager.cs:1272`) early-returns for an
|
|
already-present folder id and never updates an existing folder's `DisplayName`, so even if
|
|
`MaterialiseHierarchy` were reached on a rename it would be a no-op until `RebuildAddressSpace`
|
|
clears `_folders`.
|
|
|
|
**Recommendation:** Carry UNS area/line diff sets into `AddressSpacePlan` (mirroring the
|
|
init-only EquipmentTag/VirtualTag pattern) so a rename is no longer "empty", and have the apply
|
|
path drive a hierarchy refresh (or make `EnsureFolder` update an existing folder's `DisplayName`
|
|
in place + `ClearChangeMasks`). Deferred: a complete fix spans the Runtime module
|
|
(`OpcUaPublishActor` must honour the new plan flag and call a hierarchy-refresh / rebuild path),
|
|
which is outside this module's edit boundary.
|
|
|
|
**Resolution:** _(Open — deferred: needs a coordinated change in the Runtime module's `OpcUaPublishActor` to act on a UNS-changed plan; an in-`AddressSpacePlan` change alone is inert because `Apply`/`MaterialiseHierarchy` do not refresh existing folder names.)_
|
|
|
|
### OpcUaServer-002
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Location | `OtOpcUaNodeManager.cs:1748` (`HistoryReadEvents`), `OtOpcUaNodeManager.cs:1814` (`ClampToInt`) |
|
|
| Status | Open |
|
|
|
|
**Description:** For HistoryRead-Events, `HistoryReadEvents` passes
|
|
`ClampToInt(details.NumValuesPerNode)` to `IHistorianDataSource.ReadEventsAsync(maxEvents)` and
|
|
always returns the result with `ContinuationPoint = null` ("the full window in one shot"). The
|
|
events arm never issues continuation points. By the `IHistoryProvider.ReadEventsAsync` contract
|
|
(`src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:88-95`), `maxEvents <= 0` is a
|
|
sentinel meaning "use the backend's **default cap**". But OPC UA Part 4/11 define
|
|
`NumValuesPerNode == 0` on a HistoryRead as "no limit — return ALL values". So a client that asks
|
|
for all events in a window (`NumValuesPerNode == 0`) is silently truncated at the backend's
|
|
default cap, and because no continuation point (and no `GoodMoreData`) is returned, the client has
|
|
no signal that the result was truncated — it believes it received the complete window. This is a
|
|
silent-data-loss / spec-deviation on the events arm.
|
|
|
|
**Recommendation:** Either (a) translate `NumValuesPerNode == 0` to "unbounded" for the events
|
|
backend (e.g. saturate to a very large cap) rather than the "default cap" sentinel, or (b) detect
|
|
backend truncation and surface a continuation point / `GoodMoreData` for events. Deferred: option
|
|
(a) changes the documented `maxEvents <= 0` sentinel semantics shared with the Wonderware/OpcUaClient
|
|
event backends (cross-module, Core.Abstractions contract); option (b) requires the backend to report
|
|
truncation. Both cross this module's boundary.
|
|
|
|
**Resolution:** _(Open — deferred: rooted in the cross-module `IHistoryProvider.ReadEventsAsync` `maxEvents <= 0` sentinel contract (Core.Abstractions-006) and the Wonderware/OpcUaClient event backends; cannot be fixed safely inside OpcUaServer alone.)_
|
|
|
|
### OpcUaServer-003
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Correctness & logic bugs |
|
|
| Location | `OtOpcUaNodeManager.cs:1978` (`ServeRawPaged`), `HistoryPaging.cs` (whole), `HistoryPaging.cs:213` (`SliceTieCluster` `next <= endUtc`) |
|
|
| Status | Open |
|
|
|
|
**Description:** The Raw paging chain treats `endUtc` as an **inclusive** upper bound throughout —
|
|
the `HistoryContinuationState`/`HistoryPaging` XML docs all say "the original (inclusive) end of
|
|
the window", and `SliceTieCluster` advances with `next <= endUtc`. But the backend contract
|
|
(`IHistoryProvider.ReadRawAsync`, `IHistoryProvider.cs:25`) defines `endUtc` as an **exclusive**
|
|
upper bound. The two arms therefore disagree at the exact `endUtc` boundary. Impact is small
|
|
because the backend is the authority on which samples actually exist (a sample at exactly `endUtc`
|
|
never appears in an exclusive-end read), so the disagreement only ever yields one extra empty
|
|
resume page (`[endUtc, endUtc)` → GoodNoData, no CP) rather than duplicated or dropped data.
|
|
Separately, the SDK permits a Raw read with `StartTime == DateTime.MinValue` + a `NumValuesPerNode`
|
|
cap (the OPC UA "read N values backward from EndTime" request — validated in
|
|
`CustomNodeManager.HistoryRead`); the forward-only resume cursor (`ComputeResumeCursor` resumes from
|
|
the LAST sample forward) is not defined for a backward read, so paging a backward Raw read is
|
|
unsound if the backend honours the backward direction.
|
|
|
|
**Recommendation:** Pin the `endUtc` inclusive-vs-exclusive convention against the actual backend
|
|
(align the paging XML docs + `SliceTieCluster` comparison to the contract), and either explicitly
|
|
reject a `StartTime == MinValue` paged Raw read with a clear status or document that the backend is
|
|
assumed to always return chronological forward results regardless of direction. Deferred: resolving
|
|
the inclusive/exclusive question requires confirming the Wonderware backend's actual boundary
|
|
semantics (cross-module / infra), and changing a comparison without that confirmation risks the
|
|
opposite off-by-one.
|
|
|
|
**Resolution:** _(Open — deferred: needs the backend's authoritative endUtc boundary semantics confirmed before the comparison/doc is changed; flipping it blindly risks an off-by-one in the other direction.)_
|
|
|
|
### OpcUaServer-004
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Error handling & resilience |
|
|
| Location | `OtOpcUaNodeManager.cs:1597` (`ResolveParentFolder`), and every public sink mutator that calls it (`EnsureFolder` 1278, `EnsureVariable` 1335, `MaterialiseAlarmCondition` 597, plus `WriteValue`/`WriteAlarmCondition` `CreateVariable`) |
|
|
| Status | Open |
|
|
|
|
**Description:** `ResolveParentFolder` dereferences `_root!` with the null-forgiving operator, and
|
|
`CreateVariable` uses `_root` (`AddChild`). `_root` is only assigned in `CreateAddressSpace`, which
|
|
the SDK invokes during `StandardServer` start. Every public address-space mutator
|
|
(`WriteValue`, `WriteAlarmCondition`, `EnsureFolder`, `EnsureVariable`, `MaterialiseAlarmCondition`)
|
|
assumes `CreateAddressSpace` has already run. If any of these is ever called before the server has
|
|
started (e.g. a sink wired or a publish replayed before `StartAsync` completes), `_root` is `null`
|
|
and the call NREs out of the node manager. In the current boot ordering the host wires the
|
|
`SdkAddressSpaceSink` only after start, so this is latent rather than live, but it is an unguarded
|
|
ordering hazard on the highest-risk class.
|
|
|
|
**Recommendation:** Add an explicit guard (e.g. throw a clear `InvalidOperationException("address
|
|
space not yet created")` or no-op with a logged warning) when `_root` is null at the top of the
|
|
mutators, so a too-early call fails legibly instead of with a bare NRE. Low priority — defensive
|
|
hardening, not a live defect. Left Open to avoid an unscoped change to the mutator entry points on
|
|
this critical class without a regression scenario that reproduces the early-call ordering.
|
|
|
|
**Resolution:** _(Open — defensive-only; latent given current boot ordering. Deferred to avoid an unscoped guard-add across five mutators without a reproducing pre-start ordering scenario.)_
|
|
|
|
### OpcUaServer-005
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Testing coverage |
|
|
| Location | `OtOpcUaNodeManager.cs:2049` (`ServeRawPaged` tie-cluster stall path), `OtOpcUaNodeManager.cs:2068` (absurd-burst backstop) |
|
|
| Status | Won't Fix |
|
|
|
|
**Description:** (Re-triaged to a non-issue.) Initial concern was that the oversized-tie-cluster
|
|
paging path in `ServeRawPaged` (over-fetch the whole cluster, `SliceTieCluster`, the
|
|
`MaxTieClusterOverfetch` absurd-burst `BadHistoryOperationUnsupported` backstop) lacked an
|
|
end-to-end node-manager regression and was only covered at the pure level by `HistoryPagingTests`.
|
|
|
|
**Recommendation:** None required.
|
|
|
|
**Resolution:** Won't Fix — 2026-06-19: coverage already exists end-to-end. Verified that
|
|
`NodeManagerHistoryReadPagingTests.Raw_oversized_tie_cluster_pages_within_the_timestamp` drives a
|
|
real multi-page resume INTO the stall (a 5-way tie cluster larger than the page cap) through a
|
|
booted `OtOpcUaSdkServer` + `InMemoryHistoryContinuationStore` and asserts lossless duplicate-free
|
|
paging, and `Raw_tie_cluster_beyond_overfetch_bound_fails_loudly` exercises the
|
|
`> MaxTieClusterOverfetch` backstop. The module's HistoryRead paging surface is comprehensively
|
|
tested; no test gap.
|
|
|
|
### OpcUaServer-006
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Documentation & comments |
|
|
| Location | `OtOpcUaNodeManager.cs:11-30` (class XML doc), `OpcUaApplicationHost.cs:88-93` / `OpcUaApplicationHost.cs:421-423` (F13/F13c follow-up notes) |
|
|
| Status | Resolved |
|
|
|
|
**Description:** Several XML comments reference superseded plan milestones as if still pending. The
|
|
`OtOpcUaNodeManager` class doc says equipment-folder hierarchy + type metadata "still come from the
|
|
AddressSpaceApplier / EquipmentNodeWalker integration (F14b, tracked under #85)", but
|
|
`EquipmentNodeWalker` is no longer the integration path (the composer → applier → sink → node-manager
|
|
chain replaced it, per `AddressSpaceComposer`'s own header) and #85 hierarchy materialisation has
|
|
shipped. `OpcUaApplicationHost` still carries "Full extraction … is tracked as follow-up F13" and
|
|
"F13c will plug a real LDAP-bound validator" notes, but the `IOpcUaUserAuthenticator` /
|
|
`RoleCarryingUserIdentity` impersonation wiring is implemented and live (the doc/`docs/OpcUaServer.md`
|
|
describe it as done). These are stale-comment / undocumented-as-shipped issues, not behaviour bugs.
|
|
|
|
**Recommendation:** Refresh the class- and method-level doc comments to drop the
|
|
`EquipmentNodeWalker`/F13/F13c/F14b "pending" framing and describe the shipped behaviour, matching
|
|
`docs/OpcUaServer.md`.
|
|
|
|
**Resolution:** Resolved — 2026-06-19 (SHA pending): rewrote the `OtOpcUaNodeManager` class XML
|
|
doc to describe the shipped folder-hierarchy / typed-variable / historized-node / Part 9
|
|
condition materialisation (dropping the false "treats every id as a flat BaseDataVariableState"
|
|
claim and the retired `EquipmentNodeWalker`/F14b framing), and the two `OpcUaApplicationHost`
|
|
doc blocks (class summary + `BuildUserTokenPolicies`) to describe the shipped impersonation/auth
|
|
wiring instead of the "F13/F13c pending" framing. Doc-comment-only — no behaviour change; build
|
|
re-verified green.
|