docs(plan): historian within-timestamp paging (#400) + AbCip/TwinCAT UDT member-paths design
This commit is contained in:
@@ -0,0 +1,254 @@
|
||||
# Historian within-timestamp paging (#400) + AbCip/TwinCAT UDT member-paths — Design
|
||||
|
||||
> **Status:** Approved 2026-06-17. Branch `feat/stillpending-historian-paging-udt-members` off master `c402872c`.
|
||||
> Next step: writing-plans → subagent-driven-development → finish (merge to master + push).
|
||||
|
||||
## Goal
|
||||
|
||||
Bundle two `stillpending.md` §2 backlog items into one phase. They touch **disjoint
|
||||
projects** (server/historian vs. two driver projects), so their implementers run
|
||||
concurrently.
|
||||
|
||||
- **WS-A — #400 within-timestamp tie-cluster paging.** Retire the loud
|
||||
`BadHistoryOperationUnsupported` fail when a single `SourceTimestamp` carries more raw
|
||||
ties than `NumValuesPerNode`; page *within* the timestamp instead.
|
||||
- **WS-B — AbCip + TwinCAT UDT member-paths.** Make individual UDT/struct members
|
||||
individually addressable (discover + read) — what both §2 lines explicitly point to
|
||||
(`stillpending.md:50` *"Must address individual member paths"*, `:51` *"discover UDT
|
||||
members individually"*).
|
||||
|
||||
---
|
||||
|
||||
## WS-A — #400: within-timestamp tie-cluster paging
|
||||
|
||||
### Problem
|
||||
|
||||
`HistoryRead-Raw` is paged server-side over a single-shot backend that only accepts
|
||||
`(start, end, cap)` — it cannot skip/offset. The tie-safe resume cursor
|
||||
(`HistoryContinuationState`) is a `(NextStartUtc=T, BoundarySkipCount=skip)` pair. When
|
||||
**more samples share one `SourceTimestamp` than `NumValuesPerNode`** (an oversized tie
|
||||
cluster), the cursor cannot advance past T: every resume re-reads the same first `cap`
|
||||
ties, the boundary-tie trim empties the page, and `ServeRawPaged`
|
||||
(`OtOpcUaNodeManager.cs:1881-1892`) fails that node **loudly** with
|
||||
`BadHistoryOperationUnsupported`. The operator's only remedy today is to re-issue with a
|
||||
bigger `NumValuesPerNode`. For a single tag's raw history this is a data anomaly (raw
|
||||
samples normally carry strictly increasing distinct timestamps) — hence it was deferred
|
||||
as task #400.
|
||||
|
||||
### Grounding facts (verified against the code)
|
||||
|
||||
- The backend (`HistorianDataSource.ReadRawAsync:405`) queries
|
||||
`StartDateTime = startTime` / `EndDateTime = endTime` (inclusive both ends,
|
||||
`RetrievalMode.Full`) and honors an **explicit** `maxValues` cap. So a degenerate
|
||||
`(T, T, bounded)` read returns the *whole* cluster at exactly T.
|
||||
- **`cap == 0` is NOT truly unlimited at the backend** — it falls back to
|
||||
`_config.MaxValuesPerRead` (`HistorianDataSource.cs:431-432,441`). So the over-fetch must
|
||||
pass an **explicit** large cap, never rely on `0 = unlimited`.
|
||||
- The pure paging helpers already live in `HistoryPaging.cs` (`IsFullPage`,
|
||||
`ComputeResumeCursor`, `TrimBoundaryDuplicates`) — all static + SDK-free, unit-tested
|
||||
directly. The new decision logic joins them there.
|
||||
- Inclusive-start resume + boundary-tie trim are already load-bearing invariants of the
|
||||
current tie-safe cursor; the over-fetch inherits them and adds no new assumption beyond
|
||||
*stable backend ordering within a single timestamp* (which the existing skip-count
|
||||
cursor already relies on).
|
||||
- The config knobs live in `ServerHistorianOptions.cs` (`AddServerHistorian` DI mirror).
|
||||
|
||||
### Approaches considered
|
||||
|
||||
- **A — within-timestamp over-fetch + cap-sliced serve (CHOSEN).** On detecting the stuck
|
||||
cluster, issue a dedicated degenerate read `(T, T, MaxTieClusterOverfetch + 1)` to pull
|
||||
the whole cluster at T, then serve it `cap` samples per page by advancing the existing
|
||||
`BoundarySkipCount` *within* the cluster. When the cluster drains, advance the cursor to
|
||||
`T.AddTicks(1)` with skip 0 (safe: all T-ties emitted; no DateTime tick exists strictly
|
||||
between T and T+1). Re-reading `(T,T,bounded)` each page is idempotent over immutable
|
||||
history. `MaxTieClusterOverfetch` (new `ServerHistorianOptions` field) bounds the
|
||||
over-fetch; a cluster larger than the bound preserves **today's** loud
|
||||
`BadHistoryOperationUnsupported` as the ultimate backstop. **No contract / backend /
|
||||
sidecar change.** The slice + next-cursor decision goes in a new pure
|
||||
`HistoryPaging.SliceTieCluster(...)` (unit-testable without a server/session/SDK); the
|
||||
node-manager owns only the `(T,T,bounded)` I/O.
|
||||
- **B — backend skip/offset parameter (REJECTED).** Native within-timestamp offset via a
|
||||
`(start, end, cap, skip)` backend surface. Requires an `IHistorianDataSource.ReadRawAsync`
|
||||
contract change **and** a Wonderware sidecar wire-protocol change — infra-gated and
|
||||
off-limits (no wire/proto change).
|
||||
- **C — buffer the cluster in the continuation store (REJECTED).** Over-fetch once, stash
|
||||
the whole cluster in the continuation-point state, serve slices from the buffer. Avoids
|
||||
re-reads (O(N) vs A's O(N²/cap) transfer) but holds the whole cluster in the
|
||||
session-bound, capped, oldest-evicted store — more state, more memory under many
|
||||
concurrent pathological reads, more eviction edge-cases. A re-reads instead; the
|
||||
re-read cost only bites on a documented anomaly path, and steady-state memory stays one
|
||||
page (plus the transient bounded over-fetch).
|
||||
|
||||
### Architecture (Approach A)
|
||||
|
||||
The change is **localized to the existing "stuck" branch** of `ServeRawPaged`
|
||||
(`OtOpcUaNodeManager.cs:1881-1892`). Today that branch fails the node; instead it:
|
||||
|
||||
```
|
||||
// stuck: inboundCp present, backend page full, boundary-trim emptied the page
|
||||
cluster = ReadRawAsync(tagname, T, T, MaxTieClusterOverfetch + 1) // whole cluster at T
|
||||
if cluster.Count > MaxTieClusterOverfetch:
|
||||
log + BadHistoryOperationUnsupported // preserve today's loud backstop for absurd clusters
|
||||
return
|
||||
(sliceStart, sliceCount, nextCursor) = HistoryPaging.SliceTieCluster( // PURE
|
||||
clusterCount: cluster.Count, skip: boundarySkip, cap: numValuesPerNode, boundaryT: T, endUtc: endUtc)
|
||||
serve cluster[sliceStart .. sliceStart+sliceCount) // Good (or GoodNoData if empty + drained + no window left)
|
||||
emit continuation point iff nextCursor is not null
|
||||
```
|
||||
|
||||
`SliceTieCluster` decides:
|
||||
- **more ties remain at T** (`skip + sliceCount < clusterCount`) → next cursor
|
||||
`(T, skip + sliceCount)` (stay within the cluster).
|
||||
- **cluster drained** (`skip + sliceCount >= clusterCount`) → next cursor
|
||||
`(T.AddTicks(1), 0)` **iff** `T.AddTicks(1) <= endUtc` (there may be window left after T);
|
||||
else terminal (no continuation). A drained slice may be **short** (< cap) yet still emit
|
||||
a CP — the one legitimate exception to "short page ⇒ terminal", because we *know* there
|
||||
may be un-read window past T. Documented.
|
||||
|
||||
A **fresh** read that lands on an oversized cluster needs **no change**: it returns the
|
||||
first `cap` ties, `ComputeResumeCursor` stores `(T, cap)`, and the *next* (resume) read
|
||||
hits the stuck branch — which now over-fetches. The `(T, N)`/`skip >= clusterCount`
|
||||
degenerate self-heals (slice empty → drained branch → advance past T).
|
||||
|
||||
### Testing (WS-A)
|
||||
|
||||
- Pure `HistoryPaging.SliceTieCluster` unit tests: mid-cluster slice, last (short) slice +
|
||||
advance-past-T, exact-cap-boundary, `skip >= clusterCount` self-heal, `T == endUtc`
|
||||
terminal, drained-with-window-left emits CP.
|
||||
- `ServeRawPaged` integration vs a fake `IHistorianDataSource` returning an oversized
|
||||
cluster (cluster bigger than cap at one timestamp): assert the full cluster is paged out
|
||||
across N resumes with no duplicates/drops, then the window past T is read, then terminal —
|
||||
and that `> MaxTieClusterOverfetch` still loud-fails.
|
||||
- Regression: the existing small-tie-cluster + normal Raw paging tests stay green.
|
||||
|
||||
---
|
||||
|
||||
## WS-B — AbCip + TwinCAT UDT member-paths
|
||||
|
||||
### Problem
|
||||
|
||||
Both §2 lines point to the **same feature** — make individual UDT members addressable:
|
||||
|
||||
- **AbCip** (`stillpending.md:50`): bare-name read of a member-bearing UDT returns
|
||||
`BadNotSupported` (`AbCipDriver.cs:515-522`); UDT type maps to a `String` placeholder
|
||||
(`AbCipDataTypeExtensions.cs:33`). *"Must address individual member paths."*
|
||||
- **TwinCAT** (`stillpending.md:51`): pre-declared Structure tags are **rejected** at parse
|
||||
(`TwinCATDriverFactoryExtensions.cs:89-100`, by design — the error tells the operator to
|
||||
*"discover UDT members individually"*); discovered UDTs/FBs are **silently skipped**
|
||||
(`MapSymbolType` → `null` → `DiscoverAsync` drops, `AdsTwinCATClient.cs:349/364`).
|
||||
|
||||
**Both path parsers already handle member paths.** `AbCipTagPath.TryParse` /
|
||||
`TwinCATSymbolPath.TryParse` structurally parse `Motor.Speed.Setpoint` and reassemble the
|
||||
exact native name libplctag (`ToLibplctagName`) / ADS (`ToAdsSymbolName`) read — proven by
|
||||
the just-shipped bit-RMW work (which read/wrote `parent.bit` through these same surfaces).
|
||||
So the **read path for a member reference already works**; the gap is purely
|
||||
**discovery** — neither enumerator walks members:
|
||||
|
||||
- `LibplctagTagEnumerator.EnumerateAsync` (`:28-49`) emits one `AbCipDiscoveredTag` per
|
||||
top-level `@tags` entry; a UDT instance surfaces as a bare Structure tag with `Members`
|
||||
metadata, never expanded into addressable member tags.
|
||||
- `AdsTwinCATClient.BrowseSymbolsAsync` (`:319-339`) emits one `TwinCATDiscoveredSymbol`
|
||||
per top-level symbol and drops Structure symbols (`MapSymbolType` → null).
|
||||
|
||||
### Approaches considered
|
||||
|
||||
- **A — discovery-time member expansion, read (CHOSEN).** When discovery hits a
|
||||
struct/UDT, walk its members and emit each **atomic leaf** as an individually-addressable
|
||||
discovered tag carrying its concrete member path + resolved atomic data type. The read
|
||||
path already serves it. Minimal, symmetric across both drivers, fully fake-testable.
|
||||
- **AbCip:** expand from the already-decoded `AbCipUdtShape.Members` (each carries name +
|
||||
offset + atomic type) the template cache (`AbCipTemplateCache`) holds. Emit
|
||||
`AbCipDiscoveredTag` per atomic member with reference `Parent.Member` and the member's
|
||||
`DriverDataType`. (Whole-UDT grouped reads via `AbCipUdtReadPlanner` are an existing
|
||||
efficiency path and stay; member tags are the addressable surface.)
|
||||
- **TwinCAT:** recurse `ISymbol.SubSymbols` in `BrowseSymbolsAsync`; for each atomic leaf
|
||||
yield a `TwinCATDiscoveredSymbol(InstancePath, mappedAtomicType, readOnly, arrayLength)`.
|
||||
A struct sub-symbol recurses; an unsupported leaf (nested-UDT-array / pointer / FB
|
||||
internal) is dropped (null), never mis-reported — same discipline as the existing
|
||||
multi-dim-array drop.
|
||||
- **B — runtime member resolution, no discovery change (REJECTED).** Operator hand-types
|
||||
member tags; AbCip resolver walks the template at read time. No browse (worse UX) and
|
||||
more resolver complexity for AbCip; A is cleaner.
|
||||
- **C — whole-UDT structured read (REJECTED).** Return the container as one OPC UA
|
||||
structured value — needs a structured OPC UA DataType + encoder + the retirement of the
|
||||
String placeholder; large surface. Defer.
|
||||
|
||||
### Architecture (Approach A)
|
||||
|
||||
- **AbCip:** add member expansion to the discovery emit. Where a discovered tag is a
|
||||
`Structure` with decoded `Members`, additionally emit one `AbCipDiscoveredTag` per atomic
|
||||
member (reference `Name.Member`, atomic `DriverDataType` from
|
||||
`AbCipDataTypeExtensions`/the template member type). Bounded recursion depth for
|
||||
nested structs (config/const cap, e.g. 8). The bare Structure container keeps returning
|
||||
`BadNotSupported` on direct read (unchanged) — members are the addressable surface.
|
||||
- **TwinCAT:** in `BrowseSymbolsAsync`, when a symbol's `IDataType.Category == Struct`,
|
||||
recurse `symbol.SubSymbols` and yield atomic leaves (full `InstancePath`,
|
||||
`MapSymbolType`-resolved type). Bounded recursion depth. The pre-declared-Structure-tag
|
||||
reject at `TwinCATDriverFactoryExtensions.cs:89` stays (its message already points to
|
||||
discovery) — but now discovery actually surfaces the members it points to.
|
||||
- **No** `IDriver` / wire / proto / EF change — both edits are driver-internal discovery.
|
||||
|
||||
### Named deferrals (not silent)
|
||||
|
||||
- **Member writes** — read-only members this slice (writes via the atomic path may "just
|
||||
work" but add round-trip risk; deferred + documented).
|
||||
- Whole-UDT (container) read/write — stays unsupported; address members.
|
||||
- Arrays-of-UDT, UDT-typed array members, pointers/references, FB internals beyond plain
|
||||
struct members — dropped in discovery (never mis-reported), deferred.
|
||||
- Nested-struct recursion beyond the bounded depth cap.
|
||||
|
||||
### Testing (WS-B)
|
||||
|
||||
- **AbCip:** fake template reader / enumerator yields a UDT with atomic + nested members →
|
||||
assert discovery emits per-member tags with the right reference + atomic type; assert a
|
||||
member-path read resolves + reads (existing fake runtime); assert the bare container
|
||||
still `BadNotSupported`; depth-cap honored.
|
||||
- **TwinCAT:** `FakeTwinCATClient` symbol tree with a struct + nested struct + an
|
||||
unsupported leaf → assert `BrowseSymbolsAsync` yields the atomic leaves (full paths,
|
||||
mapped types), drops the unsupported leaf, honors the depth cap; member read via
|
||||
`ToAdsSymbolName` works.
|
||||
- **No bUnit.**
|
||||
|
||||
---
|
||||
|
||||
## Live `/run`
|
||||
|
||||
- **WS-A (#400):** **unit + integration-proven.** A real oversized-tie anomaly needs the
|
||||
AVEVA Historian on `10.100.0.48` (infra-gated — same gate as all of Phase C) plus an
|
||||
actual same-timestamp tie cluster. The fake-source integration test is the canonical
|
||||
proof; a live `/run` is best-effort/operator-gated.
|
||||
- **WS-B AbCip:** **best-effort** against a local `ab_server` ControlLogix
|
||||
(`--plc=ControlLogix`) — UDT support in `ab_server` is partial, so likely unit-proven;
|
||||
attempt a member round-trip and record honestly.
|
||||
- **WS-B TwinCAT:** **operator-gated** (Windows-only ADS runtime; no local target) —
|
||||
unit-proven only.
|
||||
|
||||
## Task slicing (subagent-driven)
|
||||
|
||||
Disjoint projects ⇒ WS-A and WS-B implementers run **concurrently**.
|
||||
|
||||
- **T1 (WS-A)** — `HistoryPaging.SliceTieCluster` pure helper + `MaxTieClusterOverfetch`
|
||||
option + `ServeRawPaged` stuck-branch rewrite + tests. `standard`. ∥ T2/T3.
|
||||
- **T2 (WS-B AbCip)** — discovery member expansion + member-read verification + tests.
|
||||
`standard`. ∥ T1/T3.
|
||||
- **T3 (WS-B TwinCAT)** — `BrowseSymbolsAsync` sub-symbol recursion + tests. `standard`.
|
||||
∥ T1/T2.
|
||||
- **T4** — Docs (`docs/Historian.md` paging-limitation section → resolved;
|
||||
`docs/drivers/AbCip.md` + `docs/drivers/TwinCAT.md` UDT member-path sections) + clear the
|
||||
§2 lines **via the plan record only** (do not stage `stillpending.md`). `small`.
|
||||
- **T5** — Full build + OpcUaServer + AbCip + TwinCAT tests + final integration review.
|
||||
`standard`.
|
||||
- **T6** — Live `/run` best-effort (WS-A fake-integration; WS-B AbCip ab_server best-effort,
|
||||
TwinCAT operator-gated) + finish branch (merge to master + push) + memory. `standard`.
|
||||
|
||||
Dependency graph: `{T1 ∥ T2 ∥ T3} → T4 → T5 → T6`.
|
||||
|
||||
## Hard rules (carried from prior phases)
|
||||
|
||||
Stage by explicit path, never `git add .`; never stage `sql_login.txt` /
|
||||
`src/Server/.../pki/` / `pending.md` / `current.md` / `docker-dev/docker-compose.yml` /
|
||||
`stillpending.md`; never echo/commit secrets; no force-push; no `--no-verify`; **NO EF
|
||||
migration**; **NO Commons wire/proto contract change** (all edits are driver-internal or
|
||||
server-internal — `HistoryPaging`/`OtOpcUaNodeManager`/`ServerHistorianOptions` and the two
|
||||
drivers' discovery; no `IDriver`/`IHistorianDataSource`/`WriteResult` change); **NO bUnit**;
|
||||
`dangerouslyDisableSandbox` for all build/test/rig commands.
|
||||
Reference in New Issue
Block a user