Historian within-timestamp paging (#400) + AbCip/TwinCAT UDT member-paths — Design
Status: Approved 2026-06-17. Branch feat/stillpending-historian-paging-udt-members off master c402872c. Next step: writing-plans → subagent-driven-development → finish (merge to master + push).
Goal
Bundle two stillpending.md §2 backlog items into one phase. They touch disjoint
projects (server/historian vs. two driver projects), so their implementers run
concurrently.
- WS-A: #400 within-timestamp tie-cluster paging. Retire the loud BadHistoryOperationUnsupported fail when a single SourceTimestamp carries more raw ties than NumValuesPerNode; page within the timestamp instead.
- WS-B: AbCip + TwinCAT UDT member-paths. Make individual UDT/struct members individually addressable (discover + read), which is what both §2 lines explicitly point to (stillpending.md:50 "Must address individual member paths", :51 "discover UDT members individually").
WS-A — #400: within-timestamp tie-cluster paging
Problem
HistoryRead-Raw is paged server-side over a single-shot backend that only accepts
(start, end, cap) — it cannot skip/offset. The tie-safe resume cursor
(HistoryContinuationState) is a (NextStartUtc=T, BoundarySkipCount=skip) pair. When
more samples share one SourceTimestamp than NumValuesPerNode (an oversized tie
cluster), the cursor cannot advance past T: every resume re-reads the same first cap
ties, the boundary-tie trim empties the page, and ServeRawPaged
(OtOpcUaNodeManager.cs:1881-1892) fails that node loudly with
BadHistoryOperationUnsupported. The operator's only remedy today is to re-issue with a
bigger NumValuesPerNode. For a single tag's raw history this is a data anomaly (raw
samples normally carry strictly increasing distinct timestamps) — hence it was deferred
as task #400.
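The stuck-cursor condition can be reproduced in a few lines. The following is a minimal sketch in Python (the production code is C#); `read_raw` and `trim_boundary` are hypothetical stand-ins for the single-shot backend and the boundary-tie trim, not the real `ReadRawAsync` or `TrimBoundaryDuplicates`.

```python
from datetime import datetime, timedelta, timezone

def read_raw(samples, start, end, cap):
    """Stand-in backend: inclusive [start, end], at most `cap` samples, no skip/offset."""
    hits = [s for s in samples if start <= s[0] <= end]
    return hits[:cap]

def trim_boundary(page, boundary_t, skip):
    """Drop the first `skip` samples stamped exactly boundary_t (already served earlier)."""
    out, dropped = [], 0
    for ts, value in page:
        if ts == boundary_t and dropped < skip:
            dropped += 1
            continue
        out.append((ts, value))
    return out

T = datetime(2026, 1, 1, tzinfo=timezone.utc)
END = T + timedelta(hours=1)
CAP = 2  # plays the role of NumValuesPerNode

# Five ties at T (an oversized cluster for CAP = 2), then one later sample.
samples = [(T, i) for i in range(5)] + [(T + timedelta(seconds=1), 99)]

# Fresh read: the first CAP ties are served, and the cursor records (T, skip).
first = read_raw(samples, T, END, CAP)
last_ts = first[-1][0]
cursor = (last_ts, sum(1 for ts, _ in first if ts == last_ts))

# Resume: the backend cannot skip, so it returns the same CAP ties again,
# and trimming the already-served boundary ties leaves nothing to serve.
resumed = read_raw(samples, cursor[0], END, CAP)
served = trim_boundary(resumed, cursor[0], cursor[1])
stuck = len(resumed) == CAP and not served
print(cursor[1], served, stuck)
```

Every further resume repeats the same read with the same cursor, which is why the current code has to fail the node rather than loop.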
Grounding facts (verified against the code)
- The backend (HistorianDataSource.ReadRawAsync:405) queries StartDateTime = startTime / EndDateTime = endTime (inclusive both ends, RetrievalMode.Full) and honors an explicit maxValues cap. So a degenerate (T, T, bounded) read returns the whole cluster at exactly T.
- cap == 0 is NOT truly unlimited at the backend: it falls back to _config.MaxValuesPerRead (HistorianDataSource.cs:431-432,441). So the over-fetch must pass an explicit large cap, never rely on 0 = unlimited.
- The pure paging helpers already live in HistoryPaging.cs (IsFullPage, ComputeResumeCursor, TrimBoundaryDuplicates), all static + SDK-free, unit-tested directly. The new decision logic joins them there.
- Inclusive-start resume + boundary-tie trim are already load-bearing invariants of the current tie-safe cursor; the over-fetch inherits them and adds no new assumption beyond stable backend ordering within a single timestamp (which the existing skip-count cursor already relies on).
- The config knobs live in ServerHistorianOptions.cs (AddServerHistorian DI mirror).
Approaches considered
- A: within-timestamp over-fetch + cap-sliced serve (CHOSEN). On detecting the stuck cluster, issue a dedicated degenerate read (T, T, MaxTieClusterOverfetch + 1) to pull the whole cluster at T, then serve it cap samples per page by advancing the existing BoundarySkipCount within the cluster. When the cluster drains, advance the cursor to T.AddTicks(1) with skip 0 (safe: all T-ties emitted; no DateTime tick exists strictly between T and T+1). Re-reading (T, T, bounded) each page is idempotent over immutable history. MaxTieClusterOverfetch (new ServerHistorianOptions field) bounds the over-fetch; a cluster larger than the bound preserves today's loud BadHistoryOperationUnsupported as the ultimate backstop. No contract / backend / sidecar change. The slice + next-cursor decision goes in a new pure HistoryPaging.SliceTieCluster(...) (unit-testable without a server/session/SDK); the node-manager owns only the (T, T, bounded) I/O.
- B: backend skip/offset parameter (REJECTED). Native within-timestamp offset via a (start, end, cap, skip) backend surface. Requires an IHistorianDataSource.ReadRawAsync contract change and a Wonderware sidecar wire-protocol change, which is infra-gated and off-limits (no wire/proto change).
- C: buffer the cluster in the continuation store (REJECTED). Over-fetch once, stash the whole cluster in the continuation-point state, serve slices from the buffer. Avoids re-reads (O(N) vs A's O(N²/cap) transfer) but holds the whole cluster in the session-bound, capped, oldest-evicted store: more state, more memory under many concurrent pathological reads, more eviction edge-cases. A re-reads instead; the re-read cost only bites on a documented anomaly path, and steady-state memory stays one page (plus the transient bounded over-fetch).
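To make the O(N) vs O(N²/cap) transfer comparison between C and A concrete, here is a rough estimate in Python. The function names are hypothetical, and the model is deliberately simplified: it treats every page of the cluster as an over-fetch page and ignores the initial fresh read.

```python
import math

def pages_needed(cluster_size: int, cap: int) -> int:
    """Number of pages required to serve the cluster at `cap` samples per page."""
    return math.ceil(cluster_size / cap)

def transferred_reread(cluster_size: int, cap: int) -> int:
    """Approach A: each page re-reads the whole cluster at T, so about N * (N / cap)."""
    return pages_needed(cluster_size, cap) * cluster_size

def transferred_buffered(cluster_size: int) -> int:
    """Approach C: the cluster is fetched once and then served from the buffer."""
    return cluster_size

# A cluster of 10,000 ties served 1,000 per page.
print(transferred_reread(10_000, 1_000), transferred_buffered(10_000))
```

For that example, A moves roughly ten times the data C would, but only while a pathological cluster is being drained, and it never holds more than one bounded over-fetch in memory at a time.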
Architecture (Approach A)
The change is localized to the existing "stuck" branch of ServeRawPaged
(OtOpcUaNodeManager.cs:1881-1892). Today that branch fails the node; instead it:
// stuck: inboundCp present, backend page full, boundary-trim emptied the page
cluster = ReadRawAsync(tagname, T, T, MaxTieClusterOverfetch + 1)   // whole cluster at T
if cluster.Count > MaxTieClusterOverfetch:
    log + BadHistoryOperationUnsupported   // preserve today's loud backstop for absurd clusters
    return
(sliceStart, sliceCount, nextCursor) = HistoryPaging.SliceTieCluster(   // PURE
    clusterCount: cluster.Count, skip: boundarySkip, cap: numValuesPerNode, boundaryT: T, endUtc: endUtc)
serve cluster[sliceStart .. sliceStart+sliceCount)   // Good (or GoodNoData if empty + drained + no window left)
emit continuation point iff nextCursor is not null
SliceTieCluster decides:
- more ties remain at T (skip + sliceCount < clusterCount) → next cursor (T, skip + sliceCount) (stay within the cluster).
- cluster drained (skip + sliceCount >= clusterCount) → next cursor (T.AddTicks(1), 0) iff T.AddTicks(1) <= endUtc (there may be window left after T); else terminal (no continuation). A drained slice may be short (< cap) yet still emit a CP: the one legitimate exception to "short page ⇒ terminal", because we know there may be un-read window past T. Documented.
A fresh read that lands on an oversized cluster needs no change: it returns the
first cap ties, ComputeResumeCursor stores (T, cap), and the next (resume) read
hits the stuck branch, which now over-fetches. The degenerate cursor (T, N) with
skip >= clusterCount self-heals (slice empty → drained branch → advance past T).
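The slice and next-cursor decision described above can be sketched as a pure function. This is an illustrative Python rendering, not the C# HistoryPaging.SliceTieCluster itself: the snake_case names are hypothetical, skip is clamped as one assumed way to get the self-heal behaviour, and TICK is one microsecond (Python's finest datetime step), whereas a .NET tick is 100 ns.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

TICK = timedelta(microseconds=1)   # assumption: stands in for DateTime.AddTicks(1)
Cursor = Tuple[datetime, int]      # (NextStartUtc, BoundarySkipCount)

def slice_tie_cluster(cluster_count: int, skip: int, cap: int,
                      boundary_t: datetime, end_utc: datetime
                      ) -> Tuple[int, int, Optional[Cursor]]:
    """Return (slice_start, slice_count, next_cursor); next_cursor None means terminal."""
    if cap < 1:
        raise ValueError("cap must be >= 1 for paged reads")
    # Clamp so that skip >= cluster_count yields an empty slice (self-heal case).
    skip = min(max(skip, 0), cluster_count)
    slice_count = min(cap, cluster_count - skip)

    if skip + slice_count < cluster_count:
        # More ties remain at T: stay within the cluster.
        return skip, slice_count, (boundary_t, skip + slice_count)

    # Cluster drained: advance past T only if the window extends beyond it.
    next_start = boundary_t + TICK
    if next_start <= end_utc:
        return skip, slice_count, (next_start, 0)
    return skip, slice_count, None

T = datetime(2026, 1, 1, tzinfo=timezone.utc)
END = T + timedelta(hours=1)

# Drain a 5-tie cluster at cap 2, starting from the cursor (T, 2) left by the fresh read.
served, cursor = [], (T, 2)
while cursor is not None and cursor[0] == T:
    start, count, cursor = slice_tie_cluster(5, cursor[1], 2, T, END)
    served.extend(range(start, start + count))
print(served, cursor)
```

Keeping the function free of I/O means each of the edge cases (mid-cluster, short last slice, exact cap boundary, over-large skip, T == endUtc) can be asserted directly without a server or session.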
Testing (WS-A)
- Pure HistoryPaging.SliceTieCluster unit tests: mid-cluster slice, last (short) slice + advance-past-T, exact-cap-boundary, skip >= clusterCount self-heal, T == endUtc terminal, drained-with-window-left emits CP.
- ServeRawPaged integration vs a fake IHistorianDataSource returning an oversized cluster (cluster bigger than cap at one timestamp): assert the full cluster is paged out across N resumes with no duplicates/drops, then the window past T is read, then terminal; also assert that a cluster > MaxTieClusterOverfetch still loud-fails.
- Regression: the existing small-tie-cluster + normal Raw paging tests stay green.
WS-B — AbCip + TwinCAT UDT member-paths
Problem
Both §2 lines point to the same feature — make individual UDT members addressable:
- AbCip (stillpending.md:50): bare-name read of a member-bearing UDT returns BadNotSupported (AbCipDriver.cs:515-522); UDT type maps to a String placeholder (AbCipDataTypeExtensions.cs:33). "Must address individual member paths."
- TwinCAT (stillpending.md:51): pre-declared Structure tags are rejected at parse (TwinCATDriverFactoryExtensions.cs:89-100, by design: the error tells the operator to "discover UDT members individually"); discovered UDTs/FBs are silently skipped (MapSymbolType → null → DiscoverAsync drops, AdsTwinCATClient.cs:349/364).
Both path parsers already handle member paths. AbCipTagPath.TryParse /
TwinCATSymbolPath.TryParse structurally parse Motor.Speed.Setpoint and reassemble the
exact native name libplctag (ToLibplctagName) / ADS (ToAdsSymbolName) read — proven by
the just-shipped bit-RMW work (which read/wrote parent.bit through these same surfaces).
So the read path for a member reference already works; the gap is purely
discovery — neither enumerator walks members:
- LibplctagTagEnumerator.EnumerateAsync (:28-49) emits one AbCipDiscoveredTag per top-level @tags entry; a UDT instance surfaces as a bare Structure tag with Members metadata, never expanded into addressable member tags.
- AdsTwinCATClient.BrowseSymbolsAsync (:319-339) emits one TwinCATDiscoveredSymbol per top-level symbol and drops Structure symbols (MapSymbolType → null).
Approaches considered
- A — discovery-time member expansion, read (CHOSEN). When discovery hits a
struct/UDT, walk its members and emit each atomic leaf as an individually-addressable
discovered tag carrying its concrete member path + resolved atomic data type. The read
path already serves it. Minimal, symmetric across both drivers, fully fake-testable.
  - AbCip: expand from the already-decoded AbCipUdtShape.Members (each carries name + offset + atomic type) the template cache (AbCipTemplateCache) holds. Emit AbCipDiscoveredTag per atomic member with reference Parent.Member and the member's DriverDataType. (Whole-UDT grouped reads via AbCipUdtReadPlanner are an existing efficiency path and stay; member tags are the addressable surface.)
  - TwinCAT: recurse ISymbol.SubSymbols in BrowseSymbolsAsync; for each atomic leaf yield a TwinCATDiscoveredSymbol (InstancePath, mapped AtomicType, readOnly, arrayLength). A struct sub-symbol recurses; an unsupported leaf (nested-UDT-array / pointer / FB internal) is dropped (null), never mis-reported, which is the same discipline as the existing multi-dim-array drop.
- B — runtime member resolution, no discovery change (REJECTED). Operator hand-types member tags; AbCip resolver walks the template at read time. No browse (worse UX) and more resolver complexity for AbCip; A is cleaner.
- C — whole-UDT structured read (REJECTED). Return the container as one OPC UA structured value — needs a structured OPC UA DataType + encoder + the retirement of the String placeholder; large surface. Defer.
Architecture (Approach A)
- AbCip: add member expansion to the discovery emit. Where a discovered tag is a Structure with decoded Members, additionally emit one AbCipDiscoveredTag per atomic member (reference Name.Member, atomic DriverDataType from AbCipDataTypeExtensions/the template member type). Bounded recursion depth for nested structs (config/const cap, e.g. 8). The bare Structure container keeps returning BadNotSupported on direct read (unchanged); members are the addressable surface.
- TwinCAT: in BrowseSymbolsAsync, when a symbol's IDataType.Category == Struct, recurse symbol.SubSymbols and yield atomic leaves (full InstancePath, MapSymbolType-resolved type). Bounded recursion depth. The pre-declared-Structure-tag reject at TwinCATDriverFactoryExtensions.cs:89 stays (its message already points to discovery), but now discovery actually surfaces the members it points to.
- No IDriver / wire / proto / EF change: both edits are driver-internal discovery.
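Both drivers follow the same shape: walk the struct tree, emit atomic leaves with their full dotted path, drop anything unsupported, and stop at a depth cap. A minimal driver-neutral sketch in Python follows; the Symbol type and expand_members are hypothetical illustrations and do not mirror AbCipUdtShape, ISymbol, or any real driver API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

@dataclass
class Symbol:
    """Hypothetical discovery node: a struct has members, an atomic leaf has a type."""
    name: str
    atomic_type: Optional[str] = None          # e.g. "Float32"; None for structs
    members: List["Symbol"] = field(default_factory=list)
    supported: bool = True                     # False for pointers, FB internals, etc.

def expand_members(symbol: Symbol, parent_path: str = "", depth: int = 0,
                   max_depth: int = 8) -> Iterator[Tuple[str, str]]:
    """Yield (member_path, atomic_type) for every supported atomic leaf."""
    path = f"{parent_path}.{symbol.name}" if parent_path else symbol.name
    if not symbol.supported:
        return                                 # dropped, never mis-reported
    if symbol.members:
        if depth >= max_depth:
            return                             # beyond the depth cap: deferred
        for member in symbol.members:
            yield from expand_members(member, path, depth + 1, max_depth)
    elif symbol.atomic_type is not None:
        yield path, symbol.atomic_type

motor = Symbol("Motor", members=[
    Symbol("Speed", members=[
        Symbol("Setpoint", atomic_type="Float32"),
        Symbol("Actual", atomic_type="Float32"),
    ]),
    Symbol("Running", atomic_type="Boolean"),
    Symbol("Ptr", supported=False),
])

for member_path, atomic_type in expand_members(motor):
    print(member_path, atomic_type)
```

The struct container itself is never yielded, which matches the design's stance that members are the addressable surface and the bare container stays unsupported.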
Named deferrals (not silent)
- Member writes — read-only members this slice (writes via the atomic path may "just work" but add round-trip risk; deferred + documented).
- Whole-UDT (container) read/write — stays unsupported; address members.
- Arrays-of-UDT, UDT-typed array members, pointers/references, FB internals beyond plain struct members — dropped in discovery (never mis-reported), deferred.
- Nested-struct recursion beyond the bounded depth cap.
Testing (WS-B)
- AbCip: fake template reader / enumerator yields a UDT with atomic + nested members → assert discovery emits per-member tags with the right reference + atomic type; assert a member-path read resolves + reads (existing fake runtime); assert the bare container still returns BadNotSupported; depth-cap honored.
- TwinCAT: FakeTwinCATClient symbol tree with a struct + nested struct + an unsupported leaf → assert BrowseSymbolsAsync yields the atomic leaves (full paths, mapped types), drops the unsupported leaf, honors the depth cap; member read via ToAdsSymbolName works.
- No bUnit.
Live /run
- WS-A (#400): unit + integration-proven. A real oversized-tie anomaly needs the AVEVA Historian on 10.100.0.48 (infra-gated, same gate as all of Phase C) plus an actual same-timestamp tie cluster. The fake-source integration test is the canonical proof; a live /run is best-effort/operator-gated.
- WS-B AbCip: best-effort against a local ab_server ControlLogix (--plc=ControlLogix). UDT support in ab_server is partial, so likely unit-proven; attempt a member round-trip and record honestly.
- WS-B TwinCAT: operator-gated (Windows-only ADS runtime; no local target), unit-proven only.
Task slicing (subagent-driven)
Disjoint projects ⇒ WS-A and WS-B implementers run concurrently.
- T1 (WS-A): HistoryPaging.SliceTieCluster pure helper + MaxTieClusterOverfetch option + ServeRawPaged stuck-branch rewrite + tests. standard. ∥ T2/T3.
- T2 (WS-B AbCip): discovery member expansion + member-read verification + tests. standard. ∥ T1/T3.
- T3 (WS-B TwinCAT): BrowseSymbolsAsync sub-symbol recursion + tests. standard. ∥ T1/T2.
- T4: Docs (docs/Historian.md paging-limitation section → resolved; docs/drivers/AbCip.md + docs/drivers/TwinCAT.md UDT member-path sections) + clear the §2 lines via the plan record only (do not stage stillpending.md). small.
- T5: Full build + OpcUaServer + AbCip + TwinCAT tests + final integration review. standard.
- T6: Live /run best-effort (WS-A fake-integration; WS-B AbCip ab_server best-effort, TwinCAT operator-gated) + finish branch (merge to master + push) + memory. standard.
Dependency graph: {T1 ∥ T2 ∥ T3} → T4 → T5 → T6.
Hard rules (carried from prior phases)
Stage by explicit path, never git add .; never stage sql_login.txt /
src/Server/.../pki/ / pending.md / current.md / docker-dev/docker-compose.yml /
stillpending.md; never echo/commit secrets; no force-push; no --no-verify; NO EF
migration; NO Commons wire/proto contract change (all edits are driver-internal or
server-internal — HistoryPaging/OtOpcUaNodeManager/ServerHistorianOptions and the two
drivers' discovery; no IDriver/IHistorianDataSource/WriteResult change); NO bUnit;
dangerouslyDisableSandbox for all build/test/rig commands.