histsdk/docs/plans/r1.8-r1.9-summary-queries.md
Joseph Doherty 1a7519c803 RE: resolve R1.8/R1.9 analog/state summary via request+response capture
Captured the native StartQuery2 pRequestBuff and the GetNextQueryResultBuffer2
response (instrument-wcf-writemessage + chained instrument-wcf-readmessage) and
decoded both against AnalogSummaryHistory SQL ground truth. Conclusion: the rich
multi-aggregate analog/state summary struct is NOT delivered over the 2020 WCF
binary protocol — the response is the ordinary version-9 row buffer the existing
aggregate parser already handles, carrying one value per cycle selected by
RetrievalMode (QueryType 5-8), not ValueSelector (inert on this path). So
"analog summary" == the existing ReadAggregateAsync; no new src/ code warranted.

Tooling (tools/ + scripts/ only, nothing in src/):
- NativeTraceHarness: drive summary knobs via --value-selector /
  --aggregation-type / --max-states (uint16) / --filter
- Capture-SummaryRequest.ps1: repeatable instrument+stage+matrix capture,
  -WithResponse chains the ReadMessage hook
- decode-summary-capture.py: StartQuery2 request diff vs baseline
- decode-summary-response.py: response decode vs SQL ground truth

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 17:01:42 -04:00


R1.8 / R1.9 — Analog-summary & State-summary queries (implementation plan)

Status (2026-06-21): RESOLVED by request + response capture. Conclusion: the rich multi-aggregate analog/state summary struct is NOT delivered over the 2020 WCF binary protocol. The per-cycle aggregate values it would expose are ALREADY shipped via ReadAggregateAsync (RetrievalMode → QueryType 5-8). No new src/ code is warranted for R1.8/R1.9 on 2020 WCF.

RESOLVED — what the response capture proved (2026-06-21)

The request side was recovered first (table further down), then the GetNextQueryResultBuffer2 response was captured (instrument-wcf-readmessage, both hooks chained) and decoded against AnalogSummaryHistory SQL ground truth for SysTimeSec over a 6 h window / 1 h cycle. Findings:

  1. The response is the ordinary version-9 row buffer — same layout the existing raw/aggregate parser (TryParseGetNextQueryResultBufferAggregateRows) already handles: uint16 version=9, uint32 rowCount, then per-row tagKey + nameLen + name + ValueCount + cycleEnd FILETIME + quality + OpcQuality + Value(double) + PercentGood(double) + trailer(cycleStart FILETIME …). The captured 7-row buffer decoded with Value=31.0, PercentGood=100.0, ValueCount=1, OpcQuality=192 — matching the SQL row exactly.

  2. There is NO rich CAnalogSummaryValue struct on the wire. Each row carries a single value, not Min+Max+First+Last+Avg+Integral together. The all-aggregates-in-one-row shape that CAnalogSummaryValue / AnalogSummaryHistory represents is the SQL/OLEDB provider's shape, not the binary StartQuery2 retrieval's.

  3. The single value is selected by RetrievalMode (QueryType), not by ValueSelector. Proven against the same constant tag where only the kind of aggregate distinguishes the result:

    • RetrievalMode=Integral (QueryType 8) → Value = 111600.0 (= SQL Integral) ✓
    • RetrievalMode=TimeWeightedAverage (QueryType 5) → Value = 31.0 (= SQL Average) ✓
    • RetrievalMode=Cyclic (QueryType 0) + ValueSelector=IntegralValue → Value = 31.0 (selector ignored; the request byte ValueSelector@0x59=0x04 was confirmed sent, yet the cyclic value came back).

    So ValueSelector / AggregationType / MaxStates are inert on the WCF retrieval path — they configure the SQL provider's summary tables, not this binary query.

  4. Resolution unit is correct in the SDK. The wire Resolution is in 100 ns ticks (= ms × 10000). SerializeFullHistoryRequest writes TimeSpan.Ticks, which the golden test SerializerMatchesInstrumentedNativeTimeWeightedAverageRequest already verifies byte-for-byte against native (FromMinutes(1) → 600000000). No bug.
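
As a minimal sketch of the first check a response decoder could make, the snippet below reads only the buffer header named in finding 1 (uint16 version, uint32 rowCount) and converts a FILETIME to a UTC timestamp. Little-endian header fields are an assumption carried over from the request table, the function names are hypothetical, and the buffer is synthetic rather than a capture. It stops at the header on purpose: the per-row field widths are not stated here, so they are not guessed.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100 ns intervals since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def parse_header(buf: bytes) -> tuple:
    """Return (version, row_count) from the first 6 bytes.

    Assumes little-endian packing with no padding (hypothetical helper).
    """
    if len(buf) < 6:
        raise ValueError("buffer too short for version + rowCount header")
    version, row_count = struct.unpack_from("<HI", buf, 0)
    if version != 9:
        raise ValueError(f"unexpected row buffer version {version}")
    return version, row_count


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME tick count to an aware UTC datetime.

    Sub-microsecond ticks are truncated, since datetime has 1 us resolution.
    """
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# Synthetic header only (version 9, 7 rows), not bytes from a real capture.
synthetic_header = struct.pack("<HI", 9, 7)
version, row_count = parse_header(synthetic_header)

# The Unix epoch expressed as a FILETIME.
unix_epoch = filetime_to_datetime(116444736000000000)
```

Rejecting any version other than 9 up front keeps a decoder from silently misreading a buffer whose layout it has no evidence for.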

Therefore: "analog summary" over 2020 WCF == the existing aggregate read. To get Min, Max, Average and Integral for a cycle you issue the corresponding RetrievalMode queries (MinimumWithTime / MaximumWithTime / TimeWeightedAverage / Integral), each returning that one aggregate per cycle. All are already implemented, mapped (QueryType 5-8) and golden-tested in ReadAggregateAsync. R1.8/R1.9 need no new protocol code on this server. A genuine all-aggregates-at-once summary would require the gRPC front door or the SQL provider, neither of which is the 2020 WCF binary path.
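
To illustrate what "one query per aggregate" means for a caller, here is a rough sketch that merges per-cycle results from separate RetrievalMode reads into one record per cycle. `CycleSummary`, `merge_aggregates` and the `"cycle-1"` key are hypothetical names, not SDK API; the input dictionaries stand in for whatever the aggregate reads return. The two values are the ones from the capture above, and the final check treats the integral as value × seconds, which is an inference from those numbers (31.0 × 3600 s = 111600.0), not a documented unit.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CycleSummary:
    """One merged record per cycle (hypothetical model, not an SDK type)."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    average: Optional[float] = None
    integral: Optional[float] = None


# Which summary field each RetrievalMode name fills.
FIELD_FOR_MODE = {
    "MinimumWithTime": "minimum",
    "MaximumWithTime": "maximum",
    "TimeWeightedAverage": "average",
    "Integral": "integral",
}


def merge_aggregates(series_by_mode: Dict[str, Dict[str, float]]) -> Dict[str, CycleSummary]:
    """Merge {mode: {cycle_key: value}} into {cycle_key: CycleSummary}."""
    merged: Dict[str, CycleSummary] = {}
    for mode, series in series_by_mode.items():
        field = FIELD_FOR_MODE[mode]  # KeyError on an unmapped mode is intentional
        for cycle_key, value in series.items():
            setattr(merged.setdefault(cycle_key, CycleSummary()), field, value)
    return merged


# Values taken from the capture findings; only these two modes were reported.
captured = {
    "TimeWeightedAverage": {"cycle-1": 31.0},
    "Integral": {"cycle-1": 111600.0},
}
summary = merge_aggregates(captured)["cycle-1"]

# Consistency check for a constant tag over a 1 h cycle (3600 s).
integral_matches_average = summary.integral / 3600.0 == summary.average
```

Fields for modes that were not queried stay `None`, so a caller can tell "not requested" apart from a real zero.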

Capture/decode tooling is committed and repeatable: scripts/Capture-SummaryRequest.ps1 (-WithResponse chains ReadMessage), scripts/decode-summary-capture.py (request diff), scripts/decode-summary-response.py <config> (response decode vs SQL ground truth). Raw captures live under artifacts/reverse-engineering/instrumented-wcf-writemessage-summary/ (gitignored).


Original scoping notes below remain for context. They led to the capture; the conclusion above supersedes their "ready to implement" framing.

Unlike the M1 read items gated by the string-handle wall, summary queries ride the proven uint-handle StartQuery2 path — the same call the working raw/aggregate reads use. So they are genuinely reachable here; the only work is (a) the right request parameters and (b) decoding the summary row buffer.

What's already in place

HistorianDataQueryRequest + SerializeFullHistoryRequest (Wcf/HistorianDataQueryProtocol.cs) already serialize every field a summary query needs: QueryType (INSQL_QUERYTYPE), SummaryType (HISTORIAN_SUMMARYTYPE), AggregationType, ColumnSelectorFlags, Resolution. Normal reads send SummaryType=0 and ColumnSelectorFlags=0x0000_8182_0007_82FF. A summary query is the same request with summary values in those three fields, then a different row parser on the result buffer.

Decode targets recovered from current/aahClientManaged.dll

Found via methods … Summary + dnlib-method:

| Native artifact | Token | Use |
| --- | --- | --- |
| CAnalogSummaryValue.UnpackFromValueBuffer | 0x06000394 | the analog-summary row decoder: a chain of buffer-reader calls (not literal offsets), so decode empirically against a captured buffer |
| CAnalogSummaryValue.PackToVtq | 0x06000395 | inverse (for a future write path) |
| CAnalogSummaryValue setters | 0x0600038A-92 | wire field set: StartDateTime, Min, Max, First, Last, ValueCount, TimeGood, Integral, IntegralOfSquares |
| CAnalogSummaryStruct setters | 0x06000369-77 | fuller field set: adds MinDateTime, MaxDateTime, FirstDateTime, LastDateTime, FirstNullDateTime, LastNullFlag, LinearIntegral |
| CStateSummaryStruct setters | 0x0600039B-A0 | state-summary fields: MinContained, MaxContained, TotalContained, PartialStart, PartialEnd, StateEntryCount |
| QueryColumnSelector.SelectAnalogSummaryColumns | 0x0600004B | builds ColumnSelectorFlags for analog summary via CColumnNameMap.GetColumnFlag(name) per column |
| QueryColumnSelector.SelectStateSummaryColumns | 0x0600004C | same, state summary |
| QueryColumnSelector.SelectNonSummaryColumns | 0x0600004D | the default (matches the 0x…82FF flags reads already send) |
| CTypeMetadata.IsAnalogSummary / IsStateSummary | 0x060001A4/A5 | server-side type gating |
| INSQL_QUERYTYPE / HISTORIAN_SUMMARYTYPE enums | 0200013F / 02000191 | the QueryType / SummaryType values to send |

Native request capture (2026-06-21) — request shape RECOVERED

The earlier blind probing (sweeping SummaryType/ColumnSelectorFlags over the managed serializer) was the wrong lever: it returned 0-row buffers because the managed SummaryType field is not how the native client encodes a summary. A real capture settled it.

Capture pipeline (now repeatable): scripts/Capture-SummaryRequest.ps1 IL-rewrites a copy of aahClientManaged.dll (instrument-wcf-writemessage), stages it alongside the strong-named ReverseInstrumentation logger, then drives the NativeTraceHarness history scenario through a candidate matrix while logging every outgoing MDAS body. scripts/decode-summary-capture.py extracts the Retr/StartQuery2 pRequestBuff from each and diffs the summary candidates against a tag-matched baseline-full. The harness now exposes --value-selector / --aggregation-type / --max-states / --filter so the native HistoryQueryArgs summary knobs can be driven.
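
The baseline-diff idea can be sketched in a few lines. This is not the code of scripts/decode-summary-capture.py; `diff_offsets` is a hypothetical helper, and both buffers are zero-filled stand-ins with only the bytes under discussion set. It assumes the two buffers have the same length, which is why the real comparison uses a tag-matched baseline.

```python
from typing import List, Tuple


def diff_offsets(baseline: bytes, candidate: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, baseline_byte, candidate_byte) for every differing byte.

    Requires equal-length buffers; a length change would shift later fields
    and make a positional diff meaningless.
    """
    if len(baseline) != len(candidate):
        raise ValueError("buffers differ in length; use a tag-matched baseline")
    return [
        (offset, b, c)
        for offset, (b, c) in enumerate(zip(baseline, candidate))
        if b != c
    ]


# Synthetic 229-byte stand-ins, not captured requests.
baseline = bytearray(229)
baseline[0x59] = 0x01  # ValueSelector baseline (Auto)
baseline[0x5B] = 0x03  # AggregationType baseline

candidate = bytearray(baseline)
candidate[0x59] = 0x06  # what --value-selector Minimum produced in the capture

changes = diff_offsets(bytes(baseline), bytes(candidate))
for offset, before, after in changes:
    print(f"0x{offset:02X}: {before:02X} -> {after:02X}")
```

Changing one harness knob per candidate keeps each diff down to the bytes that knob controls, which is what makes the offsets attributable.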

There is no separate "summary" QueryType or SummaryType field. A summary is an ordinary StartQuery2 request (QueryType = the chosen RetrievalMode, e.g. Cyclic=0) with these fields set: the ValueSelector byte, the AggregationType byte, a non-zero Resolution (which fills the previously-zeroed AutoSummaryParameters trailer), and, for state summary, the MaxStates field. The server then returns analog- vs state-summary rows based on the tag type plus these fields. Offsets below are into the StartQuery2 pRequestBuff (229-byte SysTimeSec baseline; verified byte-for-byte against the native client):

| Offset | Field | Type | Evidence |
| --- | --- | --- | --- |
| 0x01 | QueryType | uint32 LE | Full→02, Cyclic→00 (matches the verified RetrievalMode→QueryType map) |
| 0x1D | Resolution | float64 LE | 36e9 ticks → 00 00 00 D0 88 C3 20 42 = 0x4220C388D0000000 (1 h). Zero for non-summary reads |
| 0x32 | Timezone | len-prefixed UTF-16 | "UTC" |
| 0x49 | Filter | len-prefixed UTF-16 | "NoFilter" default; driven by --filter |
| 0x59 | ValueSelector | byte | baseline 01 (Auto); --value-selector Minimum→06, Maximum→07, Average→08: exact HistorianValueSelector values |
| 0x5B | AggregationType | byte | baseline 03; --aggregation-type Average→02: exact HistorianAggregationType values |
| ~0x5F | ColumnSelectorFlags | bytes | FF 82 07 00 82 81: matches the 0x0000_8182_0007_82FF reads already send; unchanged by summary |
| 0x6B | Tag name | len-prefixed UTF-16 | count, "SysTimeSec" |
| after tag | MaxStates | uint16 LE | the 01-default byte after the tag block; --max-states 10→0A (state summary, R1.9) |
| ~0xAA | AutoSummaryParameters | block | zero for plain reads; 80 1E 08 6B 47 01 when Resolution set (identical across analog and state): the resolution-derived cycle block |
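
A short sketch of reading the fixed-offset fields from the table, and of reproducing the Resolution and ColumnSelectorFlags byte patterns quoted in it. The helper names are hypothetical and the buffer is a zero-filled 229-byte stand-in, not a captured request. The offsets are the ones recovered for the SysTimeSec baseline; since length-prefixed strings sit before 0x59, they presumably shift for other Timezone or Filter values.

```python
import struct
from datetime import timedelta

OFF_QUERY_TYPE = 0x01        # uint32 LE
OFF_RESOLUTION = 0x1D        # float64 LE, in 100 ns ticks
OFF_VALUE_SELECTOR = 0x59    # single byte
OFF_AGGREGATION_TYPE = 0x5B  # single byte


def resolution_ticks(span: timedelta) -> int:
    """Convert a timedelta to 100 ns ticks (1 microsecond = 10 ticks)."""
    return (span // timedelta(microseconds=1)) * 10


def read_fixed_fields(buf: bytes) -> dict:
    """Read the fixed-offset request fields from a baseline-shaped buffer."""
    (query_type,) = struct.unpack_from("<I", buf, OFF_QUERY_TYPE)
    (resolution,) = struct.unpack_from("<d", buf, OFF_RESOLUTION)
    return {
        "query_type": query_type,
        "resolution_ticks": resolution,
        "value_selector": buf[OFF_VALUE_SELECTOR],
        "aggregation_type": buf[OFF_AGGREGATION_TYPE],
    }


# Synthetic stand-in for the 229-byte baseline: only the four fields are set.
buf = bytearray(229)
struct.pack_into("<I", buf, OFF_QUERY_TYPE, 0)  # Cyclic
struct.pack_into("<d", buf, OFF_RESOLUTION, float(resolution_ticks(timedelta(hours=1))))
buf[OFF_VALUE_SELECTOR] = 0x01     # Auto
buf[OFF_AGGREGATION_TYPE] = 0x03   # baseline

fields = read_fixed_fields(bytes(buf))

# Byte patterns as they would appear in a hex dump of the request.
resolution_bytes = bytes(buf[OFF_RESOLUTION:OFF_RESOLUTION + 8]).hex(" ")
flag_bytes = struct.pack("<Q", 0x0000_8182_0007_82FF)[:6].hex(" ")
```

Packing one hour as a float64 of ticks yields the same eight bytes as the table's evidence column, which is a quick way to sanity-check a hand-built request against the capture.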

State summary (R1.9) is the same request with MaxStates > 0 (the analog ValueSelector/AggregationType bytes stay at their 01/03 defaults); the analog-vs-state distinction on the wire is which of those fields is non-default, plus the tag type. Note that MaxStates is a UInt16 on HistoryQueryArgs (passing UInt32 throws), so the harness casts accordingly.
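
The UInt16 constraint is easy to mirror in a decode or test script. The sketch below, with a hypothetical `encode_max_states` helper, range-checks the value before packing it as uint16 LE, so an out-of-range input fails early instead of producing a wider field.

```python
import struct


def encode_max_states(max_states: int) -> bytes:
    """Pack MaxStates as uint16 little-endian, rejecting out-of-range values."""
    if not 0 <= max_states <= 0xFFFF:
        raise ValueError(f"MaxStates must fit in uint16, got {max_states}")
    return struct.pack("<H", max_states)


# --max-states 10 shows up as 0A in the capture; as uint16 LE that is 0A 00.
encoded = encode_max_states(10)
```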

Raw captures live under artifacts/reverse-engineering/instrumented-wcf-writemessage-summary/ (gitignored). Re-run with scripts/Capture-SummaryRequest.ps1 (analog: SysTimeSec; state: -TagName SysPulse, the local discrete tag).

Open questions (only the row layout remains)

  1. Request params. DONE — see the table above. ValueSelector @ 0x59, AggregationType @ 0x5B, Resolution @ 0x1D (→ AutoSummaryParameters @ ~0xAA), MaxStates after the tag block. No new QueryType/SummaryType ordinal involved.
  2. Row layout (next concrete step). Capture the GetNextQueryResultBuffer2 response for an analog summary of SysTimeSec over a multi-hour window with a 1 h resolution — instrument ReadMessage (instrument-wcf-readmessage, symmetric to the WriteMessage capture already wired here) and decode against the CAnalogSummaryValue field set (StartDateTime + Min/Max/First/Last/ValueCount/TimeGood/Integral/IntegralOfSquares). The request side is no longer a blocker.

Implementation steps (per the project's two-tests discipline)

  1. Add request params to HistorianDataQueryRequest builders (a BuildAnalogSummaryRequest / BuildStateSummaryRequest alongside BuildAggregateQueryRequest).
  2. Live-probe SysTimeSec via a gated diagnostic; sanitize the response into fixtures/protocol/analog-summary/ using the CW-1 pipeline.
  3. Write TryParseGetNextQueryResultBufferAnalogSummaryRows (+ state variant) against the fixture.
  4. Public API: ReadAnalogSummaryAsync / ReadStateSummaryAsync returning new models HistorianAnalogSummary (Min/Max/First/Last/Avg=Integral÷TimeGood/ValueCount/…) and HistorianStateSummary (per-state contained/partial/entry-count). Reuse RunQuery plumbing.
  5. Golden-byte test on the parser + gated live test on localhost (assert non-empty, fields sane).

State of play

The request side is fully recovered from real bytes (table above) — the managed HistorianDataQueryRequest builder can now set ValueSelector/AggregationType/Resolution (+ MaxStates for state) against ground truth rather than guesses. What remains is the response row layout: CAnalogSummaryValue.UnpackFromValueBuffer is reader-call-based (no literal offset table), so the parser needs a captured real response buffer to decode against (step 2 in Open questions — instrument-wcf-readmessage, already wired alongside the WriteMessage capture). Per project rule ("never guess wire bytes; leave throwing until evidence supports it") no summary code is in src/ yet — that lands once the response fixture exists.