C1 (critical): a boundary tie cluster larger than NumValuesPerNode could
silently truncate a resumed read to GoodNoData, permanently dropping the
un-emitted ties — the (timestamp, skip) cursor cannot advance past a single
timestamp the fixed-(start,end,cap) backend keeps re-returning. Now detected
and failed LOUDLY per node with BadHistoryOperationUnsupported + a log naming
the tag/timestamp/cap; documented in Historian.md with the larger-cap remedy.
Regression test Raw_tie_cluster_larger_than_page_fails_loudly_not_silently.
I3: build HistoryData before Save() so a projection failure can never orphan a
stored continuation cursor.
N1 (YAGNI): drop the never-produced HistoryReadKind enum + Processed-only
Aggregate/IntervalTicks fields from HistoryContinuationState — only Raw pages.
N3: ComputeResumeCursor guards its documented non-empty precondition.
I1: document InMemoryHistoryContinuationStore's eventual-consistency (test double).
Build clean, 182/182 OpcUaServer tests pass.
The Wonderware historian backend is single-shot — it returns up to
NumValuesPerNode samples with a null continuation point — so paging is
synthesised server-side, time-based, for the only count-capped arm (Raw):
- A full page (count == NumValuesPerNode, NumValuesPerNode > 0) emits an
opaque 16-byte continuation point and stores a resume cursor; a short page
(or NumValuesPerNode == 0 "all values") emits none.
- A resume read takes the stored cursor, reads the next page from the boundary
forward, and emits a fresh CP only if that page is also full.
- The resume cursor is tie-safe (HistoryPaging.ComputeResumeCursor /
TrimBoundaryDuplicates): the next page resumes from the boundary timestamp
INCLUSIVE and drops the head ties already returned, so samples sharing the
boundary SourceTimestamp are neither duplicated nor skipped.
Continuation points are bound to the OPC UA session via the SDK's
ISession.SaveHistoryContinuationPoint / RestoreHistoryContinuationPoint store
(SessionHistoryContinuationStore) — capped by ServerConfiguration.
MaxHistoryContinuationPoints (default 100, oldest-evicted) and disposed on
session close. releaseContinuationPoints is honoured via an override of
HistoryReleaseContinuationPoints (the base dispatcher routes release-only reads
there, never to the per-details arms). An unknown / evicted / released point
resumes to BadContinuationPointInvalid.
Processed and AtTime stay single-shot: neither details type carries a client
count cap, so the single-shot backend returns the complete result in one read
and there is no "full page" signal to page on (spec-conformant). Modified-value
history remains out of scope.
The pure paging decisions + CP store contract are unit-tested via HistoryPaging
+ InMemoryHistoryContinuationStore; the full multi-page round trip is driven
end-to-end through the node manager with an in-memory store + a series-backed
fake historian (the in-process harness is session-less).
- A.1 (false-rejection safety): restrict the structural fail-fast's confident-mismatch check to
the CLOSED set of built-in types ResolveBuiltInDataType emits (numeric families + Boolean/
String/DateTime/ByteString). Any other expected type (Enumeration, Guid, …) now defers to the
SDK, so a coercible write (Int32→Enumeration) is never false-rejected. + A7/A8 regression tests.
- C.1: guard BuildWriteFailureAuditEvent (under Lock) in try/catch like ReportAuditEvent, so a
SetChildValue surprise is swallowed+logged, never thrown out of the fire-and-forget continuation.
Three deferred 'surface the failed write' enhancements on the write-outcome
self-correction path in OtOpcUaNodeManager:
- Item A: synchronous structural fail-fast. EvaluateEquipmentWriteStructure
(pure static) rejects a structurally-invalid write INLINE (Bad sync) after
the authz gate but before the optimistic dispatch, so the SDK never applies
it. Null payload -> BadTypeMismatch; plus a confidence-gated cheap built-in
type compatibility check (numeric widening + BaseDataType wildcard tolerated;
uncertain cases defer to the SDK's own coercion).
- Item B: Bad-quality blip on device-write failure. On a revert,
RevertOptimisticWriteIfNeeded first publishes the still-applied optimistic
value with StatusCode BadDeviceFailure, then restores the prior value/status
(both under the existing Lock). Documents the queue-coalescing caveat (a slow
subscriber may see only the restored value -> the audit event is the reliable
signal).
- Item C: Part 8 AuditWriteUpdateEvent on device-write failure. Builds an
AuditWriteUpdateEventState (SourceNode=node, AttributeId=Value, OldValue=prior,
NewValue=attempted, ClientUserId from the threaded identity, Message carries
outcome.Reason) under Lock and reports it via Server.ReportEvent OUTSIDE Lock.
Guarded so auditing-disabled / report failure never breaks the revert.
Threads the writing identity's user-id + node into the continuation. Adds 6
unit tests for EvaluateEquipmentWriteStructure. Build clean (0 warnings);
158/158 OpcUaServer.Tests green.
- HistoryReadEvents miss path + catch path now both set results[handle.Index] explicitly
(new SdkHistoryReadResult { StatusCode = BadHistoryOperationUnsupported }) — don't rely on
base pre-seeding results[i] so every path sets BOTH errors and results coherently (#1)
- ProjectEventField: SourceName null now emits Variant.Null instead of a String-typed null
variant (evt.SourceName is null ? Variant.Null : new Variant(evt.SourceName)) (#3)
- Comment near the HistoryRead dispatcher block updated: all four arms (Raw/Processed/AtTime
+ Events/Task 4) are now overridden — "left to the base" wording was stale (#5)
- Happy-path test adds ReceiveTime to select clauses and asserts it projects ReceivedTimeUtc
as a DateTime Variant at the correct select-order position (#4)
- Backend-throw test hardened: asserts errors[0] via ServiceResult.IsBad + explicit code,
asserts results[0] is non-null with the Bad code (no longer relies on base seeding),
and asserts EventsEntered to prove the override reached the bridge before the throw (#1)
- RecordingHistorianDataSource gains EventsEntered flag (set before ThrowOnRead check) (#1)
- Events_non_source_node test gains clarifying doc comment explaining the SDK base rejects
variable nodes (EventNotifier=None) for event reads before our override runs; the
override's source-guard is exercised by the promoted-without-source test instead (#2)
Fix A: add Raw_multi_node_per_node_error_isolation test — two historized variables
(eqA/good→A.PV, eqB/bad→B.PV) in one Raw batch; per-tagname fake throws for B.PV,
returns a sample for A.PV; asserts errors[0]=Good+sample, errors[1]=Bad,
HistoryData[1]=null (no cross-slot leak), no exception escapes.
Fix B: collapse double ConcurrentDictionary lookup in ServeNode — TryGetHistorizedTagname
now captures `out var tagname` on the guard; the resolved tagname is threaded into the
read callback as a second parameter (Func<IHistorianDataSource, string, Task<HistorianRead>>),
removing the redundant ResolveTagname helper (deleted) and the tiny race window between
the check and the second lookup. All three call-sites (Raw/Processed/AtTime) updated.
Fix C: rewrite the IsReadModified comment at NodeManagerHistoryReadTests.cs:102 — the
SDK's ReadRawModifiedDetails.Initialize() sets m_isReadModified=true (generated ctor body
in Opc.Ua.DataTypes.cs), so the default IS true; the test must explicitly clear it to
false for a plain raw read. Previous comment said the same thing but imprecisely; now
cites the SDK mechanism (Initialize() call in the public ctor).
I1: DeferredAddressSpaceSinkTests.RecordingSink now captures HistorianTagname
per EnsureVariable call (HistorianQueue/HistorianCalls, matching the
Phase7ApplierTests pattern); new test EnsureVariable_forwards_historianTagname_to_inner_sink
asserts the arg is forwarded unchanged through DeferredAddressSpaceSink.
M1: OtOpcUaNodeManager.EnsureVariable doc-comment notes that a changed
historize intent on an already-registered node is silently ignored until
a RebuildAddressSpace (rebuild precondition for Task 3 implementers).
N2: DeploymentArtifact.ExtractTagHistorize doc wording: "The live-edit
side" → "The live-edit composer side".
Stop parsing TagConfig twice per tag on the deploy hot path: Phase7Composer's
equipment-tag Select lambda is now block-bodied (captures isHistorized/historianTagname
once), and DeploymentArtifact.BuildEquipmentTagPlans captures locals before result.Add.
Add wrong-type-historianTagname InlineData to ExtractTagHistorizeTests. Extend the
parity round-trip fixture with a 4th tag (isHistorized:false + JSON-null tagname)
exercising the artifact-side private guard path. Align DeploymentArtifact's
ExtractTagHistorize doc-comment with the composer-side phrasing (ExtractTagFullName /
ExtractTagAlarm cross-reference).
Resolves the code-review notes on 95be607a + the AdminUI bundle: the
EnsureVariable docs (IOpcUaAddressSpaceSink, OtOpcUaNodeManager) and the Tag
entity doc no longer say 'Galaxy / SystemPlatform / alias'; the DriverHostActor
ForwardToMux comment now states the real equipment-tag value-routing gap (the
FullName→NodeId 'live values' milestone) instead of claiming Galaxy values map
straight through.
All five suppressed advisories are now resolved at baseline/resolved versions,
so every NuGetAuditSuppress is removed repo-wide:
- System.Security.Cryptography.Xml (GHSA-37gx-xxp4-5rgx / GHSA-w3x6-4m5h-cxqf)
-> fixed by the .NET 10 baseline (10.0.6)
- OPCFoundation Opc.Ua.Core (GHSA-h958-fxgg-g7w3) -> fixed at resolved 1.5.378.106
Two were still live and are now patched via direct security pins:
- OpenTelemetry.Api 1.9.0 -> 1.15.3 (GHSA-g94r-2vxg-569j) pinned in Cluster;
Runtime/ControlPlane/AdminUI + tests inherit via project reference
- Tmds.DBus.Protocol 0.20.0 -> 0.21.3 (GHSA-xrw6-gwf8-vvr9) pinned in Client.UI
Also correct the Historian sidecar runtime comments (x86 -> x64, matching the
csproj PlatformTarget). Solution audit: 0 vulnerable packages; full build clean.
Three code-review points on commit 004558c2 were correct behavior
that was under-documented, not bugs:
1. AlarmConditionDelta gains explicit paragraphs explaining why
CommentAdded is absent: it always originates from a client
AddComment call whose T18 OnAddComment handler returns Good →
SDK auto-fires the comment event (E2); the engine re-projection
carries no delta-field change, so the gate correctly suppresses
the duplicate. Force-firing would double-emit.
2. Same doc explains Retain is intentionally absent: Retain is a
pure function of Active/Acknowledged (both compared), so it
cannot flip without a real delta. Notes future risk if that
ever changes.
3. ReportConditionEvent Time/ReceiveTime comment corrected: the
projection was already applied by WriteAlarmCondition above
with identical values; the restamp is a locality repeat, not a
reorder guard.
Also adds one seam unit-test (103 total, was 102) pinning the
null-vs-empty Message normalization boundary so a change to the
?? string.Empty coalescing is caught at the seam level.
Inbound client acks now route through the engine (T18/T19). On a successful
Acknowledge the T18 gate returns Good, so the SDK applies the acked state to the
AlarmConditionState node and auto-fires its own condition event (E2) -- directly
on the node, bypassing WriteAlarmCondition. The engine then re-projects that same
transition through WriteAlarmCondition, which fired again (E3): a double-emit.
Gate WriteAlarmCondition's ReportConditionEvent on a genuine delta computed
against the node's CURRENT live state (read before projecting the snapshot), not
a last-written cache (which would be stale, since the SDK-applied ack never went
through this method). For a re-projected ack the snapshot equals the node's
already-applied state -> no delta -> suppress E3. Genuine engine-driven
transitions still differ -> fire.
Compared fields (value-equality via AlarmConditionDelta record): Active, Acked,
Confirmed, Enabled, Shelving (mapped from the shelving state machine), Severity
(mapped through MapSeverity to match the bucket the node stores), Message.
Optional Confirmed/Shelving fold to the node read-back default when the child is
absent so they can't register a phantom delta.
Tests prove both: suppression of the simulated inbound ack re-projection
(EventId unchanged) and that genuine transitions fire while identical
re-projections suppress; plus a direct unit test of the ShouldFireConditionEvent
seam. 102/102 OpcUaServer.Tests green.
The SDK fires OnTimedUnshelve with the node manager's system context (no
session, no user identity) when a TimedShelve duration expires. Routing
through the shared HandleAlarmCommand hit the AlarmAck gate and returned
BadUserAccessDenied, leaving the alarm permanently shelved.
Replace the delegated HandleAlarmCommand call with an inline lambda that
bypasses the client gate, extracts the AlarmId the same way, and routes an
Unshelve command so the engine clears its shelve state. The manual-client
Unshelve path via OnShelve(shelving:false) remains gated.
Update the AlarmCommandRouterTests OnTimedUnshelve test to use a real
system context (no UserIdentity) — reproducing the actual SDK invocation
path — and assert Good, AlarmId, Operation==Unshelve, User==empty.
Add a doc note to AlarmCommand.Operation that Enable/Disable are in the
vocabulary but not yet wired at the node-manager seam.
Wire the materialised AlarmConditionState method handlers so a client calling
Acknowledge/Confirm/Shelve/AddComment is gated on the AlarmAck data-plane role
and, when allowed, routed back to the scripted-alarm engine via a new
`alarm-commands` DistributedPubSub topic.
- Commons: new AlarmCommand DTO (AlarmId/Operation/User/Comment/UnshelveAtUtc).
- ScriptedAlarmHostActor: add AlarmCommandsTopic const.
- OtOpcUaNodeManager: settable AlarmCommandRouter + wire OnAcknowledge/OnConfirm/
OnAddComment/OnShelve/OnTimedUnshelve. Each resolves the principal off
ISessionOperationContext.UserIdentity as RoleCarryingUserIdentity, fails closed
(BadUserAccessDenied) when the AlarmAck role is absent or no identity, else maps
+ routes an AlarmCommand and returns Good. OnShelve discriminates OneShotShelve/
TimedShelve/Unshelve from the SDK flags; TimedShelve expiry = UtcNow + ms.
No Akka/IActorRef handle — only the Action<AlarmCommand> delegate. T20 de-dup
note left; WriteAlarmCondition untouched.
- OpcUaServer.Security: OpcUaDataPlaneRoles.AlarmAck shared const (the role was a
bare string everywhere; introduced one symbol for the gate + tests).
- OtOpcUaSdkServer: SetAlarmCommandRouter pass-through.
- Host: boot wiring publishes each command via mediator.Tell(Publish(...)) using a
lazy ActorSystem accessor (mirrors DpsScriptLogPublisher).
- Tests: 11 new gate + mapping tests (OpcUaServer.Tests 88->99, all green).
Code-review follow-up to T17: copy the roles into a fresh array at construction so a
caller mutating the source list cannot retroactively alter a session's granted roles,
and so the T18 ack gate's per-invocation .Contains(...) runs over a known-small frozen
array.
Stop discarding the authenticator's resolved roles during impersonation.
HandleImpersonation now sets args.Identity to a RoleCarryingUserIdentity
(: UserIdentity) that carries result.Roles, so a downstream method handler
can read them off context.UserIdentity for the inbound AlarmAck gate (T18).
Verified via the decompiled SDK (1.5.378.106) that the instance we assign to
ImpersonateEventArgs.Identity is stored by reference onto Session.Identity /
EffectiveIdentity and surfaced unchanged on OperationContext.UserIdentity --
the custom subclass survives the round-trip. No auth-decision logic changes.