histsdk/docs/plans/r1.8-r1.9-summary-queries.md
Joseph Doherty 1a7519c803 RE: resolve R1.8/R1.9 analog/state summary via request+response capture
Captured the native StartQuery2 pRequestBuff and the GetNextQueryResultBuffer2
response (instrument-wcf-writemessage + chained instrument-wcf-readmessage) and
decoded both against AnalogSummaryHistory SQL ground truth. Conclusion: the rich
multi-aggregate analog/state summary struct is NOT delivered over the 2020 WCF
binary protocol — the response is the ordinary version-9 row buffer the existing
aggregate parser already handles, carrying one value per cycle selected by
RetrievalMode (QueryType 5-8), not ValueSelector (inert on this path). So
"analog summary" == the existing ReadAggregateAsync; no new src/ code warranted.

Tooling (tools/ + scripts/ only, nothing in src/):
- NativeTraceHarness: drive summary knobs via --value-selector /
  --aggregation-type / --max-states (uint16) / --filter
- Capture-SummaryRequest.ps1: repeatable instrument+stage+matrix capture,
  -WithResponse chains the ReadMessage hook
- decode-summary-capture.py: StartQuery2 request diff vs baseline
- decode-summary-response.py: response decode vs SQL ground truth

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 17:01:42 -04:00

# R1.8 / R1.9 — Analog-summary & State-summary queries (implementation plan)
**Status (2026-06-21): RESOLVED by request + response capture. Conclusion: the rich
multi-aggregate analog/state summary struct is NOT delivered over the 2020 WCF binary protocol.
The per-cycle aggregate values it would expose are ALREADY shipped via `ReadAggregateAsync`
(RetrievalMode → QueryType 5-8). No new `src/` code is warranted for R1.8/R1.9 on 2020 WCF.**
## RESOLVED — what the response capture proved (2026-06-21)
The request side was recovered first (table further down), then the `GetNextQueryResultBuffer2`
**response** was captured (`instrument-wcf-readmessage`, both hooks chained) and decoded against
`AnalogSummaryHistory` SQL ground truth for `SysTimeSec` over a 6 h window / 1 h cycle. Findings:
1. **The response is the ordinary version-9 row buffer** — same layout the existing raw/aggregate
parser (`TryParseGetNextQueryResultBufferAggregateRows`) already handles: `uint16 version=9`,
`uint32 rowCount`, then per-row `tagKey + nameLen + name + ValueCount + cycleEnd FILETIME +
quality + OpcQuality + Value(double) + PercentGood(double) + trailer(cycleStart FILETIME …)`.
The captured 7-row buffer decoded with `Value=31.0`, `PercentGood=100.0`, `ValueCount=1`,
`OpcQuality=192` — matching the SQL row exactly.
2. **There is NO rich `CAnalogSummaryValue` struct on the wire.** Each row carries a *single*
value, not Min+Max+First+Last+Avg+Integral together. The all-aggregates-in-one-row shape that
`CAnalogSummaryValue` / `AnalogSummaryHistory` represents is the **SQL/OLEDB provider's** shape,
not the binary `StartQuery2` retrieval's.
3. **The single value is selected by `RetrievalMode` (QueryType), not by `ValueSelector`.** Proven
against the same constant tag where only the *kind* of aggregate distinguishes the result:
- `RetrievalMode=Integral` (QueryType 8) → `Value = 111600.0` (= SQL `Integral`) ✓
- `RetrievalMode=TimeWeightedAverage` (QueryType 5) → `Value = 31.0` (= SQL `Average`) ✓
- `Cyclic` (QueryType 0) **+ `ValueSelector=Integral`** → `Value = 31.0` (selector **ignored**;
the request byte `ValueSelector@0x59=0x04` was confirmed sent, yet the cyclic value came back).
So `ValueSelector` / `AggregationType` / `MaxStates` are **inert on the WCF retrieval path**;
they configure the SQL provider's summary tables, not this binary query.
4. **Resolution unit is correct in the SDK.** The wire `Resolution` is 100 ns ticks (= ms × 10000).
`SerializeFullHistoryRequest` writes `TimeSpan.Ticks`, which the golden test
`SerializerMatchesInstrumentedNativeTimeWeightedAverageRequest` already verifies byte-for-byte
against native (`FromMinutes(1)` → `600000000`). No bug.
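
The arithmetic behind findings 3 and 4 can be checked in a few lines. This is a sketch, not SDK code: `to_ticks` is a hypothetical helper that mirrors what `TimeSpan.Ticks` reports, and the constants are the values quoted above. It also assumes the integral is expressed in value × seconds, which is what the captured numbers are consistent with.

```python
from datetime import timedelta

TICKS_PER_SECOND = 10_000_000  # one tick = 100 ns


def to_ticks(span: timedelta) -> int:
    """Hypothetical helper: a duration in 100 ns ticks, as TimeSpan.Ticks reports it."""
    return span // timedelta(microseconds=1) * 10


# Finding 4: one minute is 600000000 ticks; one hour is 36e9 ticks.
one_minute = to_ticks(timedelta(minutes=1))
one_hour = to_ticks(timedelta(hours=1))

# Finding 3: a tag held at 31.0 for a whole 1 h cycle integrates to
# value x seconds (assumed unit), the figure returned for RetrievalMode=Integral.
constant_value = 31.0
cycle_seconds = one_hour / TICKS_PER_SECOND
expected_integral = constant_value * cycle_seconds
```

With these inputs `expected_integral` equals the `111600.0` seen in the Integral capture, while the TimeWeightedAverage capture returns the constant itself.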
**Therefore:** "analog summary" over 2020 WCF == the existing aggregate read. To get Min, Max,
Average and Integral for a cycle you issue the corresponding `RetrievalMode` queries
(`MinimumWithTime` / `MaximumWithTime` / `TimeWeightedAverage` / `Integral`), each returning that
one aggregate per cycle. All four are already implemented, mapped (QueryType 5-8) and golden-tested in
`ReadAggregateAsync`. **R1.8/R1.9 need no new protocol code on this server.** A genuine
all-aggregates-at-once summary would require the gRPC front door or the SQL provider, neither of
which is the 2020 WCF binary path.
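
A caller who wants the all-aggregates-in-one-row shape can assemble it client-side from the four per-mode reads. The sketch below is illustrative only: `read_summary`, the `read_aggregate` callable and `fake_read` are hypothetical stand-ins (the real API is `ReadAggregateAsync` in the SDK), and the timestamp is made up.

```python
from typing import Callable, Dict, List, Tuple

# (cycle_end, value) pairs, one per cycle, as a single-mode aggregate read yields them.
Rows = List[Tuple[str, float]]

# Summary column -> RetrievalMode name, taken from the paragraph above.
MODES = {
    "min": "MinimumWithTime",
    "max": "MaximumWithTime",
    "average": "TimeWeightedAverage",
    "integral": "Integral",
}


def read_summary(read_aggregate: Callable[[str], Rows]) -> Dict[str, Dict[str, float]]:
    """Issue one read per mode and merge the results by cycle end time."""
    summary: Dict[str, Dict[str, float]] = {}
    for column, mode in MODES.items():
        for cycle_end, value in read_aggregate(mode):
            summary.setdefault(cycle_end, {})[column] = value
    return summary


# Stand-in for the real client: a constant 31.0 tag over one 1 h cycle.
def fake_read(mode: str) -> Rows:
    value = 111600.0 if mode == "Integral" else 31.0
    return [("2026-06-21T01:00:00Z", value)]  # illustrative timestamp


merged = read_summary(fake_read)
```

This costs four round trips per summary instead of one, which is the trade-off of staying on the 2020 WCF binary path.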
Capture/decode tooling is committed and repeatable: `scripts/Capture-SummaryRequest.ps1`
(`-WithResponse` chains ReadMessage), `scripts/decode-summary-capture.py` (request diff),
`scripts/decode-summary-response.py <config>` (response decode vs SQL ground truth). Raw captures
live under `artifacts/reverse-engineering/instrumented-wcf-writemessage-summary/` (gitignored).
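
Before decoding rows, a response decoder can confirm it is looking at the version-9 buffer described in finding 1. The following is a minimal sketch, not the committed `decode-summary-response.py`: `read_header` is a hypothetical name, little-endian byte order is assumed, and only the two header fields whose widths are stated above are parsed.

```python
import struct

# uint16 version followed by uint32 rowCount; little-endian is an assumption.
HEADER = struct.Struct("<HI")


def read_header(buffer: bytes) -> int:
    """Return rowCount after checking the buffer is a version-9 row buffer."""
    if len(buffer) < HEADER.size:
        raise ValueError("buffer too short for header")
    version, row_count = HEADER.unpack_from(buffer, 0)
    if version != 9:
        raise ValueError(f"unexpected row-buffer version {version}")
    return row_count


# Synthetic header only: version 9 and 7 rows, the row count of the captured buffer.
sample = HEADER.pack(9, 7)
rows = read_header(sample)
```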
---
_Original scoping notes below remain for context. They led to the capture; the conclusion above supersedes their "ready to implement" framing._
Unlike the M1 *read* items gated by the [string-handle wall](../reverse-engineering/wcf-string-handle-wall.md),
summary queries ride the **proven `uint`-handle `StartQuery2`** path — the same call the working
raw/aggregate reads use. So they are genuinely reachable here; the only work is (a) the right
request parameters and (b) decoding the summary row buffer.
## What's already in place
`HistorianDataQueryRequest` + `SerializeFullHistoryRequest`
(`Wcf/HistorianDataQueryProtocol.cs`) already serialize every field a summary query needs:
`QueryType` (INSQL_QUERYTYPE), `SummaryType` (HISTORIAN_SUMMARYTYPE), `AggregationType`,
`ColumnSelectorFlags`, `Resolution`. Normal reads send `SummaryType=0` and
`ColumnSelectorFlags=0x0000_8182_0007_82FF`. A summary query is the **same request with summary
values in those three fields**, then a different row parser on the result buffer.
## Decode targets recovered from `current/aahClientManaged.dll`
Found via `methods … Summary` + `dnlib-method`:
| Native artifact | Token | Use |
|---|---|---|
| `CAnalogSummaryValue.UnpackFromValueBuffer` | `0x06000394` | **the analog-summary row decoder** — a chain of buffer-reader calls (not literal offsets), so decode empirically against a captured buffer |
| `CAnalogSummaryValue.PackToVtq` | `0x06000395` | inverse (for a future write path) |
| `CAnalogSummaryValue` setters | `0x0600038A-92` | wire field set: **StartDateTime, Min, Max, First, Last, ValueCount, TimeGood, Integral, IntegralOfSquares** |
| `CAnalogSummaryStruct` setters | `0x06000369-77` | fuller field set: adds **MinDateTime, MaxDateTime, FirstDateTime, LastDateTime, FirstNullDateTime, LastNullFlag, LinearIntegral** |
| `CStateSummaryStruct` setters | `0x0600039B-A0` | **state-summary fields: MinContained, MaxContained, TotalContained, PartialStart, PartialEnd, StateEntryCount** |
| `QueryColumnSelector.SelectAnalogSummaryColumns` | `0x0600004B` | builds `ColumnSelectorFlags` for analog summary via `CColumnNameMap.GetColumnFlag(name)` per column |
| `QueryColumnSelector.SelectStateSummaryColumns` | `0x0600004C` | same, state summary |
| `QueryColumnSelector.SelectNonSummaryColumns` | `0x0600004D` | the default (matches the `0x…82FF` flags reads already send) |
| `CTypeMetadata.IsAnalogSummary` / `IsStateSummary` | `0x060001A4/A5` | server-side type gating |
| `INSQL_QUERYTYPE` / `HISTORIAN_SUMMARYTYPE` | enums `0200013F` / `02000191` | the `QueryType` / `SummaryType` values to send |
## Native request capture (2026-06-21) — request shape RECOVERED
The earlier blind probing (sweeping `SummaryType`/`ColumnSelectorFlags` over the managed
serializer) was the wrong lever: it returned 0-row buffers because the managed `SummaryType`
field is **not** how the native client encodes a summary. A real capture settled it.
**Capture pipeline (now repeatable):** `scripts/Capture-SummaryRequest.ps1` IL-rewrites a copy
of `aahClientManaged.dll` (`instrument-wcf-writemessage`), stages it alongside the strong-named
`ReverseInstrumentation` logger, then drives the `NativeTraceHarness` history scenario through a
candidate matrix while logging every outgoing MDAS body. `scripts/decode-summary-capture.py`
extracts the `Retr/StartQuery2` `pRequestBuff` from each and diffs the summary candidates against
a tag-matched `baseline-full`. The harness now exposes `--value-selector` / `--aggregation-type`
/ `--max-states` / `--filter` so the native `HistoryQueryArgs` summary knobs can be driven.
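
The baseline-versus-candidate comparison amounts to a positional byte diff. The sketch below shows the idea on synthetic buffers; it is not the committed `decode-summary-capture.py`, and `diff_buffers` is a hypothetical name. The offset and byte values used are the ValueSelector ones from the offset table in this section.

```python
from typing import List, Tuple


def diff_buffers(baseline: bytes, candidate: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, baseline_byte, candidate_byte) for every differing position.

    Only the common prefix is compared; a length change has to be handled by the
    caller, because inserted bytes shift every later offset.
    """
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(baseline, candidate))
        if old != new
    ]


# Synthetic 229-byte buffers standing in for real captures.
baseline = bytearray(229)
baseline[0x59] = 0x01  # ValueSelector = Auto
candidate = bytearray(baseline)
candidate[0x59] = 0x06  # --value-selector Minimum

changes = diff_buffers(bytes(baseline), bytes(candidate))
```

Matching the tag between baseline and candidate matters for this approach: a different tag name changes the buffer length and would make every later offset look like a difference.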
**There is no separate "summary" QueryType or `SummaryType` field.** A summary is an ordinary
`StartQuery2` request (`QueryType` = the chosen `RetrievalMode`, e.g. `Cyclic`=0) with three
things set: the **ValueSelector** byte, the **AggregationType** byte, a non-zero **Resolution**
(which fills the previously-zeroed `AutoSummaryParameters` trailer), and — for state summary —
the **MaxStates** field. The server then returns analog- vs state-summary rows based on the tag
type plus these fields. Offsets below are **into the StartQuery2 `pRequestBuff`** (229-byte
`SysTimeSec` baseline; verified byte-for-byte against the native client):
| Offset | Field | Type | Evidence |
|---|---|---|---|
| `0x01` | QueryType | uint32 LE | Full→`02`, Cyclic→`00` (matches the verified `RetrievalMode` → `QueryType` map) |
| `0x1D` | Resolution | float64 LE | `36e9` ticks → `00 00 00 D0 88 C3 20 42` = `0x4220C388D0000000` (1 h). Zero for non-summary reads |
| `0x32` | Timezone | len-prefixed UTF-16 | `"UTC"` |
| `0x49` | Filter | len-prefixed UTF-16 | `"NoFilter"` default; driven by `--filter` |
| `0x59` | **ValueSelector** | byte | baseline `01` (Auto); `--value-selector Minimum` → `06`, `Maximum` → `07`, `Average` → `08` (exact `HistorianValueSelector` values) |
| `0x5B` | **AggregationType** | byte | baseline `03`; `--aggregation-type Average` → `02` (exact `HistorianAggregationType` values) |
| `~0x5F` | ColumnSelectorFlags | bytes | `FF 82 07 00 82 81` — matches the `0x0000_8182_0007_82FF` reads already send; **unchanged** by summary |
| `0x6B` | Tag name | len-prefixed UTF-16 | `count, "SysTimeSec"` |
| after tag | **MaxStates** | uint16 LE | the `01`-default byte after the tag block; `--max-states 10` → `0A` (state summary, R1.9) |
| `~0xAA` | **AutoSummaryParameters** | block | zero for plain reads; `80 1E 08 6B 47 01` when Resolution set (identical across analog *and* state) — the resolution-derived cycle block |
State summary (R1.9) is the **same request** with `MaxStates` > 0 (the analog `ValueSelector`/
`AggregationType` bytes stay at their `01`/`03` defaults); the analog-vs-state distinction on the
wire is which of those fields is non-default, plus the tag type. Note `MaxStates` is a **UInt16**
on `HistoryQueryArgs` (passing UInt32 throws) — the harness casts accordingly.
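
The fixed-offset fields from the table can be read back with `struct`. This is a sketch under stated assumptions: `SummaryKnobs` and `read_knobs` are hypothetical names, the buffer is synthetic rather than a real capture, and the `0x59` / `0x5B` offsets hold only for the baseline layout (`"UTC"` timezone, `"NoFilter"` filter), since the preceding strings are length-prefixed.

```python
import struct
from dataclasses import dataclass


@dataclass
class SummaryKnobs:
    query_type: int
    resolution_ticks: float
    value_selector: int
    aggregation_type: int


def read_knobs(request: bytes) -> SummaryKnobs:
    """Read the fixed-offset fields from a StartQuery2 pRequestBuff."""
    (query_type,) = struct.unpack_from("<I", request, 0x01)   # uint32 LE
    (resolution,) = struct.unpack_from("<d", request, 0x1D)   # float64 LE
    return SummaryKnobs(query_type, resolution, request[0x59], request[0x5B])


# Synthetic 229-byte buffer populated only at the offsets under test.
buf = bytearray(229)
struct.pack_into("<I", buf, 0x01, 0)       # Cyclic
struct.pack_into("<d", buf, 0x1D, 36e9)    # 1 h in 100 ns ticks
buf[0x59] = 0x06                           # ValueSelector = Minimum
buf[0x5B] = 0x03                           # AggregationType baseline

knobs = read_knobs(bytes(buf))
resolution_bytes = bytes(buf[0x1D:0x1D + 8])
```

Packing `36e9` as a little-endian double reproduces the `00 00 00 D0 88 C3 20 42` bytes listed in the Resolution row.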
Raw captures live under `artifacts/reverse-engineering/instrumented-wcf-writemessage-summary/`
(gitignored). Re-run with `scripts/Capture-SummaryRequest.ps1` (analog: `SysTimeSec`; state:
`-TagName SysPulse`, the local discrete tag).
## Open questions (only the row layout remains)
1. ~~**Request params.**~~ **DONE** — see the table above. ValueSelector @ `0x59`,
AggregationType @ `0x5B`, Resolution @ `0x1D` (→ AutoSummaryParameters @ `~0xAA`),
MaxStates after the tag block. No new QueryType/SummaryType ordinal involved.
2. **Row layout (next concrete step).** Capture the `GetNextQueryResultBuffer2` *response* for an
analog summary of `SysTimeSec` over a multi-hour window with a 1 h resolution — instrument
`ReadMessage` (`instrument-wcf-readmessage`, symmetric to the WriteMessage capture already
wired here) and decode against the `CAnalogSummaryValue` field set
(StartDateTime + Min/Max/First/Last/ValueCount/TimeGood/Integral/IntegralOfSquares). The
request side is no longer a blocker.
## Implementation steps (per the project's two-tests discipline)
1. Add request params to `HistorianDataQueryRequest` builders (a `BuildAnalogSummaryRequest` /
`BuildStateSummaryRequest` alongside `BuildAggregateQueryRequest`).
2. **Live-probe** `SysTimeSec` via a gated diagnostic; sanitize the response into
`fixtures/protocol/analog-summary/` using the CW-1 pipeline.
3. Write `TryParseGetNextQueryResultBufferAnalogSummaryRows` (+ state variant) against the fixture.
4. Public API: `ReadAnalogSummaryAsync` / `ReadStateSummaryAsync` returning new models
`HistorianAnalogSummary` (Min/Max/First/Last/Avg=Integral÷TimeGood/ValueCount/…) and
`HistorianStateSummary` (per-state contained/partial/entry-count). Reuse `RunQuery` plumbing.
5. Golden-byte test on the parser + gated live test on `localhost` (assert non-empty, fields sane).
## State of play
The **request side is fully recovered** from real bytes (table above) — the managed
`HistorianDataQueryRequest` builder can now set `ValueSelector`/`AggregationType`/`Resolution`
(+ `MaxStates` for state) against ground truth rather than guesses. What remains is the
**response row layout**: `CAnalogSummaryValue.UnpackFromValueBuffer` is reader-call-based (no
literal offset table), so the parser needs a captured real *response* buffer to decode against
(step 2 in Open questions — `instrument-wcf-readmessage`, already wired alongside the WriteMessage
capture). Per project rule ("never guess wire bytes; leave throwing until evidence supports it")
no summary code is in `src/` yet — that lands once the response fixture exists.