fix(historian): address code review on Raw HistoryRead paging

C1 (critical): when more samples share the resume-boundary timestamp than
NumValuesPerNode allows, a resumed read could be silently truncated to
GoodNoData, permanently dropping the un-emitted ties. The (timestamp, skip)
cursor cannot advance past a single timestamp, because the fixed-(start,end,cap)
backend keeps re-returning the same first `cap` ties. This is now detected and
failed LOUDLY per node with BadHistoryOperationUnsupported plus a log entry
naming the tag/timestamp/cap; documented in Historian.md with the larger-cap
remedy.
Regression test: Raw_tie_cluster_larger_than_page_fails_loudly_not_silently.
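
To illustrate the stall, here is a minimal sketch. It assumes a backend that only accepts (start, end, cap) and a trim step that drops up to `skip` leading samples at the boundary timestamp. `ReadRaw`, `TrimBoundary` and `IsStalled` are hypothetical stand-ins written for this example, not the project's HistoryPaging helpers, and the full-page test is assumed to be `count >= cap`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TieClusterSketch
{
    // Hypothetical backend: at most `cap` samples in [start, end], oldest first.
    // It has no notion of a skip count, so identical inputs give identical pages.
    public static List<DateTime> ReadRaw(IEnumerable<DateTime> store, DateTime start, DateTime end, int cap) =>
        store.Where(t => t >= start && t <= end).OrderBy(t => t).Take(cap).ToList();

    // Assumed trim: drop up to `skip` leading samples stamped exactly at `boundary`.
    public static List<DateTime> TrimBoundary(List<DateTime> page, DateTime boundary, int skip)
    {
        var drop = 0;
        while (drop < page.Count && drop < skip && page[drop] == boundary) drop++;
        return page.Skip(drop).ToList();
    }

    // The guarded condition: a resume read whose full backend page trims to nothing.
    public static bool IsStalled(bool isResume, int rawCount, int cap, int trimmedCount) =>
        isResume && rawCount >= cap && trimmedCount == 0;

    public static void Main()
    {
        var t0 = new DateTime(2026, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        var end = t0.AddHours(1);
        const int cap = 3;

        // Five samples share t0, then one later sample: the cluster exceeds the cap.
        var store = Enumerable.Repeat(t0, 5).Append(t0.AddSeconds(1)).ToList();

        var first = ReadRaw(store, t0, end, cap);            // three ties at t0
        // Resume cursor is (t0, skip = 3); the backend re-returns the same three ties.
        var raw = ReadRaw(store, t0, end, cap);
        var trimmed = TrimBoundary(raw, t0, first.Count);

        Console.WriteLine(IsStalled(true, raw.Count, cap, trimmed.Count)); // prints True
    }
}
```

With a cap of 5 or more the whole cluster fits in one backend page and the same check returns false, which matches the larger-cap remedy described above.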

I3: build HistoryData before Save() so a projection failure can never orphan a
stored continuation cursor.
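
The ordering argument can be sketched as follows. This is a simplified illustration, not the node manager's code: `CursorStore` and `ReadPage` are hypothetical names, and strings stand in for the real HistoryData and continuation-state types.

```csharp
using System;
using System.Collections.Generic;

public static class SaveOrderingSketch
{
    // Hypothetical cursor store that records every cursor it is asked to keep.
    public sealed class CursorStore
    {
        public List<string> Saved { get; } = new List<string>();
        public string Save(string cursor) { Saved.Add(cursor); return cursor; }
    }

    // Project first, save second: if `project` throws, Save is never reached,
    // so no cursor is stored for a page the client never received.
    public static string ReadPage(CursorStore store, Func<string> project, string cursor)
    {
        var payload = project();               // may throw
        var continuation = store.Save(cursor); // only runs once the payload exists
        return payload + "|" + continuation;
    }

    public static void Main()
    {
        var store = new CursorStore();
        try
        {
            ReadPage(store, () => throw new InvalidOperationException("projection failed"), "cursor-1");
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine(store.Saved.Count); // prints 0: nothing was orphaned
        }
    }
}
```

Had the two statements in `ReadPage` been swapped, the same failure would leave one saved cursor that no client holds.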

N1 (YAGNI): drop the never-produced HistoryReadKind enum and the Processed-only
Aggregate/IntervalTicks fields from HistoryContinuationState; only Raw reads page.

N3: ComputeResumeCursor guards its documented non-empty precondition.
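
A precondition guard of this kind might look like the sketch below. It is an assumption-laden illustration: the real helper's signature, exception type and tie accounting are not shown in this commit page, so the sketch takes bare timestamps and simply counts the trailing samples that share the last timestamp.

```csharp
using System;
using System.Collections.Generic;

public static class ResumeCursorSketch
{
    // Hypothetical re-implementation: the next read starts at the last emitted
    // timestamp and skips the samples at that timestamp already on this page.
    public static void ComputeResumeCursor(IReadOnlyList<DateTime> samples, out DateTime nextStart, out int skip)
    {
        if (samples is null) throw new ArgumentNullException(nameof(samples));
        // Guard the documented precondition instead of indexing into an empty list.
        if (samples.Count == 0)
            throw new ArgumentException("A resume cursor needs at least one sample.", nameof(samples));

        nextStart = samples[samples.Count - 1];
        skip = 0;
        for (var i = samples.Count - 1; i >= 0 && samples[i] == nextStart; i--) skip++;
    }

    public static void Main()
    {
        var t0 = new DateTime(2026, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        var page = new[] { t0, t0.AddSeconds(1), t0.AddSeconds(1) };

        ComputeResumeCursor(page, out var next, out var skip);
        Console.WriteLine(skip); // prints 2: two samples share the last timestamp

        try { ComputeResumeCursor(Array.Empty<DateTime>(), out _, out _); }
        catch (ArgumentException) { Console.WriteLine("guarded"); }
    }
}
```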

I1: document InMemoryHistoryContinuationStore's eventual-consistency (test double).

Build clean, 182/182 OpcUaServer tests pass.
Joseph Doherty
2026-06-15 05:15:07 -04:00
parent 94c3ca60fc
commit bea0b482d4
6 changed files with 101 additions and 36 deletions
@@ -1751,28 +1751,53 @@ public sealed class OtOpcUaNodeManager : CustomNodeManager2
 .ReadRawAsync(tagname, startUtc, endUtc, numValuesPerNode, CancellationToken.None)
 .GetAwaiter().GetResult();
+var backendFull = HistoryPaging.IsFullPage(sourceResult.Samples.Count, numValuesPerNode);
 // On a resume read, drop the boundary ties already returned on the prior page.
 var samples = inboundCp is { Length: > 0 }
 ? HistoryPaging.TrimBoundaryDuplicates(sourceResult.Samples, startUtc, boundarySkip)
 : sourceResult.Samples;
+// Degenerate tie cluster: a resume read returned a FULL backend page that the boundary-tie trim
+// emptied entirely. That can only happen when more than NumValuesPerNode samples share the resume
+// boundary timestamp — a tie cluster larger than the page cap. The fixed-(start,end,cap) backend
+// can only ever return the first `cap` of those ties, so a (timestamp, skip) cursor can never
+// advance past the cluster. Fail LOUDLY for this node rather than silently truncate to GoodNoData
+// (which would permanently drop the un-emitted ties). The operator's remedy is a larger
+// NumValuesPerNode; see docs/Historian.md "Paging limitation".
+if (inboundCp is { Length: > 0 } && backendFull && samples.Count == 0)
+{
+#pragma warning disable CS0618 // Type or member is obsolete
+Utils.LogError(
+"OtOpcUaNodeManager: HistoryReadRaw paging stalled — tie cluster at {0:O} for tag '{1}' " +
+"exceeds NumValuesPerNode={2}; cannot page past it. Increase NumValuesPerNode.",
+startUtc, tagname, numValuesPerNode);
+#pragma warning restore CS0618
+errors[handle.Index] = StatusCodes.BadHistoryOperationUnsupported;
+results[handle.Index] = new SdkHistoryReadResult { StatusCode = StatusCodes.BadHistoryOperationUnsupported };
+return;
+}
 // The "full page" test is against the RAW backend count (before trimming): the backend honoured
 // the cap, so a full backend page ⇒ there may be more even if we trimmed some boundary ties.
+var historyData = ToHistoryDataFromSamples(samples);
 byte[]? outboundCp = null;
-if (HistoryPaging.IsFullPage(sourceResult.Samples.Count, numValuesPerNode) && samples.Count > 0)
+if (backendFull && samples.Count > 0)
 {
 HistoryPaging.ComputeResumeCursor(samples, out var nextStart, out var skip);
 var nextState = new HistoryContinuationState(
-HistoryReadKind.Raw, tagname, nextStart, endUtc, skip, numValuesPerNode,
-Aggregate: default, IntervalTicks: 0);
+tagname, nextStart, endUtc, skip, numValuesPerNode);
 // Save may return null (no session on this request) ⇒ degrade to single-shot for this node.
+// Built AFTER historyData so a failure projecting samples can never orphan a stored cursor.
 outboundCp = _historyContinuationStore.Save(session, nextState);
 }
-var historyData = ToHistoryDataFromSamples(samples);
 results[handle.Index] = new SdkHistoryReadResult
 {
-// No samples ⇒ GoodNoData (the node is historized, the window just held no data).
+// No samples ⇒ GoodNoData (the node is historized, the window just held no data). With the
+// degenerate-cluster guard above, a resumed empty page now only means the window/cluster is
+// genuinely drained — never silent data loss.
 StatusCode = samples.Count == 0 ? StatusCodes.GoodNoData : StatusCodes.Good,
 HistoryData = new ExtensionObject(historyData),
 ContinuationPoint = outboundCp,