docs(plan): historian within-timestamp paging (#400) + AbCip/TwinCAT UDT member-paths design

This commit is contained in:
Joseph Doherty
2026-06-17 19:52:52 -04:00
parent c402872c0c
commit ad66ecc97e
@@ -0,0 +1,254 @@
# Historian within-timestamp paging (#400) + AbCip/TwinCAT UDT member-paths — Design
> **Status:** Approved 2026-06-17. Branch `feat/stillpending-historian-paging-udt-members` off master `c402872c`.
> Next step: writing-plans → subagent-driven-development → finish (merge to master + push).
## Goal
Bundle two `stillpending.md` §2 backlog items into one phase. They touch **disjoint
projects** (server/historian vs. two driver projects), so their implementers run
concurrently.
- **WS-A — #400 within-timestamp tie-cluster paging.** Retire the loud
`BadHistoryOperationUnsupported` fail when a single `SourceTimestamp` carries more raw
ties than `NumValuesPerNode`; page *within* the timestamp instead.
- **WS-B — AbCip + TwinCAT UDT member-paths.** Make individual UDT/struct members
individually addressable (discover + read) — what both §2 lines explicitly point to
(`stillpending.md:50` *"Must address individual member paths"*, `:51` *"discover UDT
members individually"*).
---
## WS-A — #400: within-timestamp tie-cluster paging
### Problem
`HistoryRead-Raw` is paged server-side over a single-shot backend that only accepts
`(start, end, cap)` — it cannot skip/offset. The tie-safe resume cursor
(`HistoryContinuationState`) is a `(NextStartUtc=T, BoundarySkipCount=skip)` pair. When
**more samples share one `SourceTimestamp` than `NumValuesPerNode`** (an oversized tie
cluster), the cursor cannot advance past T: every resume re-reads the same first `cap`
ties, the boundary-tie trim empties the page, and `ServeRawPaged`
(`OtOpcUaNodeManager.cs:1881-1892`) fails that node **loudly** with
`BadHistoryOperationUnsupported`. The operator's only remedy today is to re-issue with a
bigger `NumValuesPerNode`. For a single tag's raw history this is a data anomaly (raw
samples normally carry strictly increasing distinct timestamps) — hence it was deferred
as task #400.
### Grounding facts (verified against the code)
- The backend (`HistorianDataSource.ReadRawAsync:405`) queries
`StartDateTime = startTime` / `EndDateTime = endTime` (inclusive both ends,
`RetrievalMode.Full`) and honors an **explicit** `maxValues` cap. So a degenerate
`(T, T, bounded)` read returns the *whole* cluster at exactly T.
- **`cap == 0` is NOT truly unlimited at the backend** — it falls back to
`_config.MaxValuesPerRead` (`HistorianDataSource.cs:431-432,441`). So the over-fetch must
pass an **explicit** large cap, never rely on `0 = unlimited`.
- The pure paging helpers already live in `HistoryPaging.cs` (`IsFullPage`,
`ComputeResumeCursor`, `TrimBoundaryDuplicates`) — all static + SDK-free, unit-tested
directly. The new decision logic joins them there.
- Inclusive-start resume + boundary-tie trim are already load-bearing invariants of the
current tie-safe cursor; the over-fetch inherits them and adds no new assumption beyond
*stable backend ordering within a single timestamp* (which the existing skip-count
cursor already relies on).
- The config knobs live in `ServerHistorianOptions.cs` (`AddServerHistorian` DI mirror).
### Approaches considered
- **A — within-timestamp over-fetch + cap-sliced serve (CHOSEN).** On detecting the stuck
cluster, issue a dedicated degenerate read `(T, T, MaxTieClusterOverfetch + 1)` to pull
the whole cluster at T, then serve it `cap` samples per page by advancing the existing
`BoundarySkipCount` *within* the cluster. When the cluster drains, advance the cursor to
`T.AddTicks(1)` with skip 0 (safe: all T-ties emitted; no DateTime tick exists strictly
between T and T+1). Re-reading `(T,T,bounded)` each page is idempotent over immutable
history. `MaxTieClusterOverfetch` (new `ServerHistorianOptions` field) bounds the
over-fetch; a cluster larger than the bound preserves **today's** loud
`BadHistoryOperationUnsupported` as the ultimate backstop. **No contract / backend /
sidecar change.** The slice + next-cursor decision goes in a new pure
`HistoryPaging.SliceTieCluster(...)` (unit-testable without a server/session/SDK); the
node-manager owns only the `(T,T,bounded)` I/O.
- **B — backend skip/offset parameter (REJECTED).** Native within-timestamp offset via a
`(start, end, cap, skip)` backend surface. Requires an `IHistorianDataSource.ReadRawAsync`
contract change **and** a Wonderware sidecar wire-protocol change — infra-gated and
off-limits (no wire/proto change).
- **C — buffer the cluster in the continuation store (REJECTED).** Over-fetch once, stash
the whole cluster in the continuation-point state, serve slices from the buffer. Avoids
re-reads (O(N) vs A's O(N²/cap) transfer) but holds the whole cluster in the
session-bound, capped, oldest-evicted store — more state, more memory under many
concurrent pathological reads, more eviction edge-cases. A re-reads instead; the
re-read cost only bites on a documented anomaly path, and steady-state memory stays one
page (plus the transient bounded over-fetch).
### Architecture (Approach A)
The change is **localized to the existing "stuck" branch** of `ServeRawPaged`
(`OtOpcUaNodeManager.cs:1881-1892`). Today that branch fails the node; instead it:
```
// stuck: inboundCp present, backend page full, boundary-trim emptied the page
cluster = ReadRawAsync(tagname, T, T, MaxTieClusterOverfetch + 1) // whole cluster at T
if cluster.Count > MaxTieClusterOverfetch:
log + BadHistoryOperationUnsupported // preserve today's loud backstop for absurd clusters
return
(sliceStart, sliceCount, nextCursor) = HistoryPaging.SliceTieCluster( // PURE
clusterCount: cluster.Count, skip: boundarySkip, cap: numValuesPerNode, boundaryT: T, endUtc: endUtc)
serve cluster[sliceStart .. sliceStart+sliceCount) // Good (or GoodNoData if empty + drained + no window left)
emit continuation point iff nextCursor is not null
```
`SliceTieCluster` decides:
- **more ties remain at T** (`skip + sliceCount < clusterCount`) → next cursor
`(T, skip + sliceCount)` (stay within the cluster).
- **cluster drained** (`skip + sliceCount >= clusterCount`) → next cursor
`(T.AddTicks(1), 0)` **iff** `T.AddTicks(1) <= endUtc` (there may be window left after T);
else terminal (no continuation). A drained slice may be **short** (< cap) yet still emit
a CP — the one legitimate exception to "short page ⇒ terminal", because we *know* there
may be un-read window past T. Documented.
A **fresh** read that lands on an oversized cluster needs **no change**: it returns the
first `cap` ties, `ComputeResumeCursor` stores `(T, cap)`, and the *next* (resume) read
hits the stuck branch — which now over-fetches. The `(T, N)`/`skip >= clusterCount`
degenerate self-heals (slice empty → drained branch → advance past T).
### Testing (WS-A)
- Pure `HistoryPaging.SliceTieCluster` unit tests: mid-cluster slice, last (short) slice +
advance-past-T, exact-cap-boundary, `skip >= clusterCount` self-heal, `T == endUtc`
terminal, drained-with-window-left emits CP.
- `ServeRawPaged` integration vs a fake `IHistorianDataSource` returning an oversized
cluster (cluster bigger than cap at one timestamp): assert the full cluster is paged out
across N resumes with no duplicates/drops, then the window past T is read, then terminal —
and that `> MaxTieClusterOverfetch` still loud-fails.
- Regression: the existing small-tie-cluster + normal Raw paging tests stay green.
---
## WS-B — AbCip + TwinCAT UDT member-paths
### Problem
Both §2 lines point to the **same feature** — make individual UDT members addressable:
- **AbCip** (`stillpending.md:50`): bare-name read of a member-bearing UDT returns
`BadNotSupported` (`AbCipDriver.cs:515-522`); UDT type maps to a `String` placeholder
(`AbCipDataTypeExtensions.cs:33`). *"Must address individual member paths."*
- **TwinCAT** (`stillpending.md:51`): pre-declared Structure tags are **rejected** at parse
(`TwinCATDriverFactoryExtensions.cs:89-100`, by design — the error tells the operator to
*"discover UDT members individually"*); discovered UDTs/FBs are **silently skipped**
(`MapSymbolType``null``DiscoverAsync` drops, `AdsTwinCATClient.cs:349/364`).
**Both path parsers already handle member paths.** `AbCipTagPath.TryParse` /
`TwinCATSymbolPath.TryParse` structurally parse `Motor.Speed.Setpoint` and reassemble the
exact native name libplctag (`ToLibplctagName`) / ADS (`ToAdsSymbolName`) read — proven by
the just-shipped bit-RMW work (which read/wrote `parent.bit` through these same surfaces).
So the **read path for a member reference already works**; the gap is purely
**discovery** — neither enumerator walks members:
- `LibplctagTagEnumerator.EnumerateAsync` (`:28-49`) emits one `AbCipDiscoveredTag` per
top-level `@tags` entry; a UDT instance surfaces as a bare Structure tag with `Members`
metadata, never expanded into addressable member tags.
- `AdsTwinCATClient.BrowseSymbolsAsync` (`:319-339`) emits one `TwinCATDiscoveredSymbol`
per top-level symbol and drops Structure symbols (`MapSymbolType` → null).
### Approaches considered
- **A — discovery-time member expansion, read (CHOSEN).** When discovery hits a
struct/UDT, walk its members and emit each **atomic leaf** as an individually-addressable
discovered tag carrying its concrete member path + resolved atomic data type. The read
path already serves it. Minimal, symmetric across both drivers, fully fake-testable.
- **AbCip:** expand from the already-decoded `AbCipUdtShape.Members` (each carries name +
offset + atomic type) the template cache (`AbCipTemplateCache`) holds. Emit
`AbCipDiscoveredTag` per atomic member with reference `Parent.Member` and the member's
`DriverDataType`. (Whole-UDT grouped reads via `AbCipUdtReadPlanner` are an existing
efficiency path and stay; member tags are the addressable surface.)
- **TwinCAT:** recurse `ISymbol.SubSymbols` in `BrowseSymbolsAsync`; for each atomic leaf
yield a `TwinCATDiscoveredSymbol(InstancePath, mappedAtomicType, readOnly, arrayLength)`.
A struct sub-symbol recurses; an unsupported leaf (nested-UDT-array / pointer / FB
internal) is dropped (null), never mis-reported — same discipline as the existing
multi-dim-array drop.
- **B — runtime member resolution, no discovery change (REJECTED).** Operator hand-types
member tags; AbCip resolver walks the template at read time. No browse (worse UX) and
more resolver complexity for AbCip; A is cleaner.
- **C — whole-UDT structured read (REJECTED).** Return the container as one OPC UA
structured value — needs a structured OPC UA DataType + encoder + the retirement of the
String placeholder; large surface. Defer.
### Architecture (Approach A)
- **AbCip:** add member expansion to the discovery emit. Where a discovered tag is a
`Structure` with decoded `Members`, additionally emit one `AbCipDiscoveredTag` per atomic
member (reference `Name.Member`, atomic `DriverDataType` from
`AbCipDataTypeExtensions`/the template member type). Bounded recursion depth for
nested structs (config/const cap, e.g. 8). The bare Structure container keeps returning
`BadNotSupported` on direct read (unchanged) — members are the addressable surface.
- **TwinCAT:** in `BrowseSymbolsAsync`, when a symbol's `IDataType.Category == Struct`,
recurse `symbol.SubSymbols` and yield atomic leaves (full `InstancePath`,
`MapSymbolType`-resolved type). Bounded recursion depth. The pre-declared-Structure-tag
reject at `TwinCATDriverFactoryExtensions.cs:89` stays (its message already points to
discovery) — but now discovery actually surfaces the members it points to.
- **No** `IDriver` / wire / proto / EF change — both edits are driver-internal discovery.
### Named deferrals (not silent)
- **Member writes** — read-only members this slice (writes via the atomic path may "just
work" but add round-trip risk; deferred + documented).
- Whole-UDT (container) read/write — stays unsupported; address members.
- Arrays-of-UDT, UDT-typed array members, pointers/references, FB internals beyond plain
struct members — dropped in discovery (never mis-reported), deferred.
- Nested-struct recursion beyond the bounded depth cap.
### Testing (WS-B)
- **AbCip:** fake template reader / enumerator yields a UDT with atomic + nested members →
assert discovery emits per-member tags with the right reference + atomic type; assert a
member-path read resolves + reads (existing fake runtime); assert the bare container
still `BadNotSupported`; depth-cap honored.
- **TwinCAT:** `FakeTwinCATClient` symbol tree with a struct + nested struct + an
unsupported leaf → assert `BrowseSymbolsAsync` yields the atomic leaves (full paths,
mapped types), drops the unsupported leaf, honors the depth cap; member read via
`ToAdsSymbolName` works.
- **No bUnit.**
---
## Live `/run`
- **WS-A (#400):** **unit + integration-proven.** A real oversized-tie anomaly needs the
AVEVA Historian on `10.100.0.48` (infra-gated — same gate as all of Phase C) plus an
actual same-timestamp tie cluster. The fake-source integration test is the canonical
proof; a live `/run` is best-effort/operator-gated.
- **WS-B AbCip:** **best-effort** against a local `ab_server` ControlLogix
(`--plc=ControlLogix`) — UDT support in `ab_server` is partial, so likely unit-proven;
attempt a member round-trip and record honestly.
- **WS-B TwinCAT:** **operator-gated** (Windows-only ADS runtime; no local target) —
unit-proven only.
## Task slicing (subagent-driven)
Disjoint projects ⇒ WS-A and WS-B implementers run **concurrently**.
- **T1 (WS-A)** — `HistoryPaging.SliceTieCluster` pure helper + `MaxTieClusterOverfetch`
option + `ServeRawPaged` stuck-branch rewrite + tests. `standard`. ∥ T2/T3.
- **T2 (WS-B AbCip)** — discovery member expansion + member-read verification + tests.
`standard`. ∥ T1/T3.
- **T3 (WS-B TwinCAT)** — `BrowseSymbolsAsync` sub-symbol recursion + tests. `standard`.
∥ T1/T2.
- **T4** — Docs (`docs/Historian.md` paging-limitation section → resolved;
`docs/drivers/AbCip.md` + `docs/drivers/TwinCAT.md` UDT member-path sections) + clear the
§2 lines **via the plan record only** (do not stage `stillpending.md`). `small`.
- **T5** — Full build + OpcUaServer + AbCip + TwinCAT tests + final integration review.
`standard`.
- **T6** — Live `/run` best-effort (WS-A fake-integration; WS-B AbCip ab_server best-effort,
TwinCAT operator-gated) + finish branch (merge to master + push) + memory. `standard`.
Dependency graph: `{T1 ∥ T2 ∥ T3} → T4 → T5 → T6`.
## Hard rules (carried from prior phases)
Stage by explicit path, never `git add .`; never stage `sql_login.txt` /
`src/Server/.../pki/` / `pending.md` / `current.md` / `docker-dev/docker-compose.yml` /
`stillpending.md`; never echo/commit secrets; no force-push; no `--no-verify`; **NO EF
migration**; **NO Commons wire/proto contract change** (all edits are driver-internal or
server-internal — `HistoryPaging`/`OtOpcUaNodeManager`/`ServerHistorianOptions` and the two
drivers' discovery; no `IDriver`/`IHistorianDataSource`/`WriteResult` change); **NO bUnit**;
`dangerouslyDisableSandbox` for all build/test/rig commands.