RE: resolve R1.8/R1.9 analog/state summary via request+response capture

Captured the native StartQuery2 pRequestBuff and the GetNextQueryResultBuffer2
response (instrument-wcf-writemessage + chained instrument-wcf-readmessage) and
decoded both against AnalogSummaryHistory SQL ground truth. Conclusion: the rich
multi-aggregate analog/state summary struct is NOT delivered over the 2020 WCF
binary protocol — the response is the ordinary version-9 row buffer the existing
aggregate parser already handles, carrying one value per cycle selected by
RetrievalMode (QueryType 5-8), not ValueSelector (inert on this path). So
"analog summary" == the existing ReadAggregateAsync; no new src/ code warranted.

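For reference, the header that decode-summary-response.py reads from pResultBuff is a little-endian uint16 version followed by a uint32 row count. A minimal sketch of that read, assuming only those six bytes; parse_result_header is a hypothetical helper name, and the row layout after the header is not covered here:

```python
import struct


def parse_result_header(payload: bytes) -> tuple:
    """Return (version, row_count) from the first 6 bytes of a pResultBuff payload."""
    if len(payload) < 6:
        raise ValueError("payload shorter than the 6-byte header")
    # "<HI": little-endian, no padding; uint16 version then uint32 row count.
    version, row_count = struct.unpack_from("<HI", payload, 0)
    return version, row_count
```
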
Tooling (tools/ + scripts/ only, nothing in src/):
- NativeTraceHarness: drive summary knobs via --value-selector /
  --aggregation-type / --max-states (uint16) / --filter
- Capture-SummaryRequest.ps1: repeatable instrument+stage+matrix capture,
  -WithResponse chains the ReadMessage hook
- decode-summary-capture.py: StartQuery2 request diff vs baseline
- decode-summary-response.py: response decode vs SQL ground truth
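
The baseline diff in decode-summary-capture.py comes down to a per-offset comparison in which any byte past the end of the baseline also counts as different. A minimal sketch of that rule, assuming a hypothetical helper name diff_offsets (the script itself marks the bytes inline in its hex dump rather than returning offsets):

```python
def diff_offsets(candidate: bytes, baseline: bytes) -> list:
    """Offsets in candidate whose byte differs from, or lies beyond, baseline."""
    return [
        k
        for k, c in enumerate(candidate)
        if k >= len(baseline) or baseline[k] != c
    ]
```

For example, diff_offsets(b"\x01\x02\x03\x04", b"\x01\xFF\x03") flags offset 1 (changed byte) and offset 3 (past the end of the baseline).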

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-06-20 17:01:42 -04:00
parent 362fcb0ef4
commit 1a7519c803
5 changed files with 531 additions and 38 deletions
+150
@@ -0,0 +1,150 @@
<#
.SYNOPSIS
Captures the native AVEVA client's StartQuery2 request bytes for analog/state
summary queries (HCAL roadmap R1.8/R1.9) so the managed SDK's summary request
shape can be decoded against ground truth instead of guessed.
.DESCRIPTION
Drives the .NET-Framework NativeTraceHarness against the live Historian with an
IL-rewritten copy of aahClientManaged.dll whose ClientMessageEncoder.WriteMessage
is instrumented to log every outgoing MDAS body (the same pipeline that produced
every other proven request shape). For each candidate HistoryQueryArgs config it
writes a per-config NDJSON capture under
artifacts/reverse-engineering/instrumented-wcf-writemessage-summary/ (gitignored).
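The default matrix is:
- baseline-full : RetrievalMode=Full (the known-good non-summary request)
- analog-avg : RetrievalMode=Cyclic + ValueSelector=Average + Resolution
- analog-min : RetrievalMode=Cyclic + ValueSelector=Minimum + Resolution
- analog-max : RetrievalMode=Cyclic + ValueSelector=Maximum + Resolution
- analog-integral : RetrievalMode=Cyclic + ValueSelector=Integral + Resolution
- mode-integral : RetrievalMode=Integral + Resolution
- mode-twavg : RetrievalMode=TimeWeightedAverage + Resolution
- analog-agg-avg : RetrievalMode=Cyclic + AggregationType=Average + Resolution
- state-summary : RetrievalMode=Cyclic + MaxStates=10 + Resolution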
Diff any candidate against baseline-full (scripts/decode-summary-capture.py) to read
off the exact QueryType / SummaryType / AutoSummaryParameters bytes the native client
sets for a summary, then implement the managed request against that.
.NOTES
Artifacts are diagnostic. Sanitize before copying anything into docs/ — never commit
raw capture NDJSON, credentials, hostnames, or customer tag names.
#>
[CmdletBinding()]
param(
[string]$ServerName = "localhost",
[int]$TcpPort = 32568,
# SysTimeSec is the local data-bearing system tag (OtOpcUaParityTest_001.Counter is stale/empty).
[string]$TagName = "SysTimeSec",
[int]$LookbackMinutes = 240,
[int]$MaxRows = 4,
# 1-hour summary cycle in 100ns ticks (1h = 36,000,000,000 ticks).
[uint64]$ResolutionTicks = 36000000000,
[string]$Configuration = "Debug",
# Restrict the run to a single named config from the matrix (default: run all).
[string]$OnlyConfig = "",
# Also instrument ReadMessage so each capture includes the incoming WCF response bodies
# (the GetNextQueryResultBuffer2 pResultBuff summary rows). Decoded by decode-summary-response.py.
[switch]$WithResponse
)
$ErrorActionPreference = "Stop"
$repoRoot = Split-Path -Parent $PSScriptRoot
Set-Location $repoRoot
$reProj = Join-Path $repoRoot "tools\AVEVA.Historian.ReverseEngineering\AVEVA.Historian.ReverseEngineering.csproj"
$harnessProj = Join-Path $repoRoot "tools\AVEVA.Historian.NativeTraceHarness\AVEVA.Historian.NativeTraceHarness.csproj"
$instrProj = Join-Path $repoRoot "tools\AVEVA.Historian.ReverseInstrumentation\AVEVA.Historian.ReverseInstrumentation.csproj"
$captureDir = Join-Path $repoRoot "artifacts\reverse-engineering\instrumented-wcf-writemessage-summary"
$currentCopy = Join-Path $captureDir "current-copy"
$instrDll = Join-Path $captureDir "aahClientManaged.dll"
Write-Host "== Building tooling ($Configuration) ==" -ForegroundColor Cyan
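# Build in dependency order and stop on the first failure: the later steps use --no-build,
# so a silently failed build would otherwise run stale or missing binaries.
foreach ($proj in @($reProj, $instrProj, $harnessProj)) {
    dotnet build $proj -c $Configuration --nologo -v q | Out-Null
    if ($LASTEXITCODE -ne 0) { throw "dotnet build failed for $proj (exit code $LASTEXITCODE)." }
}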
$instrSourceDll = Get-ChildItem -Recurse (Join-Path $repoRoot "tools\AVEVA.Historian.ReverseInstrumentation\bin\$Configuration") `
-Filter "AVEVA.Historian.ReverseInstrumentation.dll" | Select-Object -First 1 -ExpandProperty FullName
if (-not $instrSourceDll) { throw "ReverseInstrumentation.dll not found under bin\$Configuration." }
Write-Host "== Instrumenting WriteMessage$(if ($WithResponse) { ' + ReadMessage' }) ==" -ForegroundColor Cyan
New-Item -ItemType Directory -Force -Path $captureDir | Out-Null
if ($WithResponse) {
# Chain via a distinct intermediate file (reading+writing the same path drops the second
# hook on the mixed-mode native image). Final dll carries both hooks with distinct Phase
# strings: WCF.WriteMessage.Body and WCF.ReadMessage.Body.
$writeOnly = Join-Path $captureDir "aahClientManaged.write.dll"
dotnet run --no-build -c $Configuration --project $reProj -- `
instrument-wcf-writemessage (Join-Path $repoRoot "current\aahClientManaged.dll") $writeOnly | Out-Null
dotnet run --no-build -c $Configuration --project $reProj -- `
instrument-wcf-readmessage $writeOnly $instrDll | Out-Null
} else {
dotnet run --no-build -c $Configuration --project $reProj -- `
instrument-wcf-writemessage (Join-Path $repoRoot "current\aahClientManaged.dll") $instrDll | Out-Null
}
Write-Host "== Staging current-copy ==" -ForegroundColor Cyan
# Mirror current/ into current-copy, then overwrite the managed dll with the instrumented
# build and drop the strong-named logger assembly alongside it so the injected call binds.
robocopy (Join-Path $repoRoot "current") $currentCopy /MIR /NJH /NJS /NDL /NP /NC /NS | Out-Null
Copy-Item -Force $instrDll (Join-Path $currentCopy "aahClientManaged.dll")
Copy-Item -Force $instrSourceDll (Join-Path $currentCopy "AVEVA.Historian.ReverseInstrumentation.dll")
# Candidate matrix: name + harness arg list. Every summary config passes a resolution. Most
# use Cyclic and differ only in which summary knob is set; mode-integral and mode-twavg
# vary RetrievalMode itself instead.
$matrix = @(
@{ Name = "baseline-full"; Args = @("--retrieval-mode", "Full") },
@{ Name = "analog-avg"; Args = @("--retrieval-mode", "Cyclic", "--value-selector", "Average", "--resolution-ticks", "$ResolutionTicks") },
@{ Name = "analog-min"; Args = @("--retrieval-mode", "Cyclic", "--value-selector", "Minimum", "--resolution-ticks", "$ResolutionTicks") },
@{ Name = "analog-max"; Args = @("--retrieval-mode", "Cyclic", "--value-selector", "Maximum", "--resolution-ticks", "$ResolutionTicks") },
@{ Name = "analog-integral"; Args = @("--retrieval-mode", "Cyclic", "--value-selector", "Integral", "--resolution-ticks", "$ResolutionTicks") },
@{ Name = "mode-integral"; Args = @("--retrieval-mode", "Integral", "--resolution-ticks", "$ResolutionTicks") },
@{ Name = "mode-twavg"; Args = @("--retrieval-mode", "TimeWeightedAverage", "--resolution-ticks", "$ResolutionTicks") },
@{ Name = "analog-agg-avg"; Args = @("--retrieval-mode", "Cyclic", "--aggregation-type", "Average", "--resolution-ticks", "$ResolutionTicks") },
@{ Name = "state-summary"; Args = @("--retrieval-mode", "Cyclic", "--max-states", "10", "--resolution-ticks", "$ResolutionTicks") }
)
if ($OnlyConfig) { $matrix = $matrix | Where-Object { $_.Name -eq $OnlyConfig } }
if (-not $matrix) { throw "No matrix entry named '$OnlyConfig'." }
$harnessDll = Join-Path $currentCopy "aahClientManaged.dll"
$summary = @()
foreach ($cfg in $matrix) {
$name = $cfg.Name
$capturePath = Join-Path $captureDir "summary-capture-$name-latest.ndjson"
if (Test-Path $capturePath) { Remove-Item -Force $capturePath }
$env:AVEVA_HISTORIAN_RE_CAPTURE = $capturePath
Write-Host "== Capturing: $name ==" -ForegroundColor Green
$harnessArgs = @(
"--scenario", "history",
"--server-name", $ServerName,
"--tcp-port", "$TcpPort",
"--tag", $TagName,
"--lookback-minutes", "$LookbackMinutes",
"--max-rows", "$MaxRows",
"--current-dir", $currentCopy,
"--managed-dll-path", $harnessDll
) + $cfg.Args
# Don't let a single config that errors (e.g. state summary on an analog tag) abort the
# whole matrix, and don't treat dotnet's stderr noise as a terminating error.
try {
$prevEap = $ErrorActionPreference
$ErrorActionPreference = "Continue"
& dotnet run --no-build -c $Configuration --project $harnessProj -- @harnessArgs 2>&1 | Out-Null
} catch {
Write-Host " (config '$name' raised: $($_.Exception.Message))" -ForegroundColor Yellow
} finally {
$ErrorActionPreference = $prevEap
}
# @(...) forces an array so .Count is well defined for zero or one record as well.
$recCount = if (Test-Path $capturePath) { @(Get-Content $capturePath | Where-Object { $_.Trim() }).Count } else { 0 }
Write-Host " -> $recCount records -> $capturePath"
$summary += [pscustomobject]@{ Config = $name; Records = $recCount; Capture = $capturePath }
}
Remove-Item Env:\AVEVA_HISTORIAN_RE_CAPTURE -ErrorAction SilentlyContinue
Write-Host "`n== Capture summary ==" -ForegroundColor Cyan
$summary | Format-Table -AutoSize
Write-Host "Decode with: python scripts\decode-summary-capture.py" -ForegroundColor Cyan
+115
@@ -0,0 +1,115 @@
"""Decoder for the analog/state summary request capture (HCAL roadmap R1.8/R1.9).
Reads the per-config NDJSON captures produced by scripts/Capture-SummaryRequest.ps1
under artifacts/reverse-engineering/instrumented-wcf-writemessage-summary/, extracts
the Retr/StartQuery2 `pRequestBuff` payload from each, hex-dumps it, and diffs every
summary candidate against the baseline-full request so the differing bytes (the native
QueryType / SummaryType / AutoSummaryParameters fields) stand out.
Output is diagnostic. The only printed strings are the SDK-chosen system tag name and
protocol field markers — sanitize before copying any of it into docs/.
"""
import base64
import json
import sys
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
CAPTURE_DIR = REPO_ROOT / "artifacts" / "reverse-engineering" / "instrumented-wcf-writemessage-summary"
ACTION = b"aa/Retr/StartQuery2"
PARAM = b"pRequestBuff"


def extract_request_buffer(records):
    """Return the pRequestBuff bytes from the first StartQuery2 write record, or None."""
    for rec in records:
        if rec.get("Phase") != "WCF.WriteMessage.Body":
            continue
        body = base64.b64decode(rec["Base64"])
        if ACTION not in body:
            continue
        i = body.find(PARAM)
        if i < 0:
            continue
        i += len(PARAM)
        marker = body[i]
        # MDAS length markers (same scheme as the write decoder).
        if marker == 0x9E:
            length = body[i + 1]
            return body[i + 2:i + 2 + length]
        if marker == 0x9F:
            length = int.from_bytes(body[i + 1:i + 3], "little")
            return body[i + 3:i + 3 + length]
        if marker == 0xA0:
            length = int.from_bytes(body[i + 1:i + 3], "little")
            return body[i + 3:i + 3 + length + 1]
        return None
    return None


def hexdump(payload, diff_against=None):
    for off in range(0, len(payload), 16):
        chunk = payload[off:off + 16]
        cells = []
        for j, c in enumerate(chunk):
            mark = ""
            if diff_against is not None:
                k = off + j
                if k >= len(diff_against) or diff_against[k] != c:
                    mark = "*"
            cells.append(f"{c:02X}{mark}")
        hp = " ".join(cells)
        ap = "".join(chr(c) if 32 <= c < 127 else "." for c in chunk)
        print(f" {off:04X} {hp:<56} |{ap}|")


def load(path):
    with path.open(encoding="utf-8-sig") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def main() -> int:
    if not CAPTURE_DIR.exists():
        print(f"Capture dir not found: {CAPTURE_DIR}")
        print("Run scripts/Capture-SummaryRequest.ps1 first.")
        return 1
    captures = sorted(CAPTURE_DIR.glob("summary-capture-*-latest.ndjson"))
    if not captures:
        print(f"No capture files in {CAPTURE_DIR}")
        return 1
    buffers = {}
    for path in captures:
        name = path.stem.replace("summary-capture-", "").replace("-latest", "")
        records = load(path)
        buf = extract_request_buffer(records)
        buffers[name] = buf
        status = f"{len(buf)} bytes" if buf else "<no StartQuery2 request found>"
        print(f"{name:<18} records={len(records):>3} pRequestBuff={status}")
    baseline = buffers.get("baseline-full")
    print()
    if not baseline:
        print("No baseline-full request buffer captured; cannot diff. Dumping each raw.")
        for name, buf in buffers.items():
            if buf:
                print(f"\n== {name} pRequestBuff ({len(buf)} bytes) ==")
                hexdump(buf)
        return 0
    print(f"== baseline-full pRequestBuff ({len(baseline)} bytes) ==")
    hexdump(baseline)
    for name, buf in buffers.items():
        if name == "baseline-full" or not buf:
            continue
        print(f"\n== {name} pRequestBuff ({len(buf)} bytes) — '*' marks bytes differing from baseline ==")
        hexdump(buf, diff_against=baseline)
    return 0


if __name__ == "__main__":
    sys.exit(main())
+123
@@ -0,0 +1,123 @@
"""Decode the GetNextQueryResultBuffer2 *response* for an analog summary (HCAL R1.8).
Reads the both-hooks capture produced by
scripts/Capture-SummaryRequest.ps1 -OnlyConfig analog-avg -WithResponse
finds the ReadMessage record carrying GetNextQueryResultBuffer2Response, extracts the
`pResultBuff` payload, hex-dumps it, and annotates every 8-byte window that decodes to a
known ground-truth value (the AnalogSummaryHistory row for SysTimeSec) so the field offsets
of CAnalogSummaryValue can be read off directly.
Output is diagnostic; the only printed strings are the SDK-chosen system tag name and field
markers. Sanitize before copying into docs/.
"""
import base64
import json
import struct
import sys
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
# Config name (analog-avg / analog-min / analog-max / …) selectable via argv[1].
CONFIG = sys.argv[1] if len(sys.argv) > 1 else "analog-avg"
CAPTURE = (REPO_ROOT / "artifacts" / "reverse-engineering"
/ "instrumented-wcf-writemessage-summary" / f"summary-capture-{CONFIG}-latest.ndjson")
RESP = b"GetNextQueryResultBuffer2Response"
PARAM = b"pResultBuff"
# Ground-truth values from AnalogSummaryHistory(SysTimeSec, 1h cycle) — used to label offsets.
KNOWN_DOUBLES = {
31.0: "31.0 (First/Last/Average)",
100.0: "100.0 (PercentGood)",
0.031: "0.031 (Integral)",
111600.0: "111600.0 (Integral, full-cycle)",
1.0: "1.0 (ValueCount as double?)",
}
KNOWN_U32 = {
1: "ValueCount=1",
192: "OPCQuality=192",
100: "PercentGood=100",
9: "version=9",
}


def extract_param(body, param):
    """Return the bytes that follow `param` in an MDAS body, or None if absent/unknown."""
    i = body.find(param)
    if i < 0:
        return None
    i += len(param)
    marker = body[i]
    # MDAS length markers (same scheme as decode-summary-capture.py).
    if marker == 0x9E:
        length = body[i + 1]
        return body[i + 2:i + 2 + length]
    if marker == 0x9F:
        length = int.from_bytes(body[i + 1:i + 3], "little")
        return body[i + 3:i + 3 + length]
    if marker == 0xA0:
        length = int.from_bytes(body[i + 1:i + 3], "little")
        return body[i + 3:i + 3 + length + 1]
    return None


def main() -> int:
    if not CAPTURE.exists():
        print(f"Capture not found: {CAPTURE}")
        # Name the config that was actually requested, not a hard-coded one.
        print(f"Run: scripts/Capture-SummaryRequest.ps1 -OnlyConfig {CONFIG} -WithResponse")
        return 1
    with CAPTURE.open(encoding="utf-8-sig") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    payload = None
    for rec in records:
        if rec.get("Phase") != "WCF.ReadMessage.Body":
            continue
        body = base64.b64decode(rec["Base64"])
        if RESP not in body:
            continue
        payload = extract_param(body, PARAM)
        break
    if payload is None:
        print("No GetNextQueryResultBuffer2Response / pResultBuff found in capture.")
        return 2
    print(f"pResultBuff: {len(payload)} bytes")
    if len(payload) >= 6:
        version = int.from_bytes(payload[0:2], "little")
        row_count = int.from_bytes(payload[2:6], "little")
        print(f" header: version={version} rowCount={row_count}")
    print()
    # Annotated hex dump.
    for off in range(0, len(payload), 16):
        chunk = payload[off:off + 16]
        hp = " ".join(f"{c:02X}" for c in chunk)
        ap = "".join(chr(c) if 32 <= c < 127 else "." for c in chunk)
        print(f" {off:04X} {hp:<48} |{ap}|")
    # Scan every 8-byte window for known doubles, and every 4-byte window for known u32s.
    print("\n== Known-value hits (offset -> field) ==")
    for off in range(0, len(payload) - 7):
        val = struct.unpack_from("<d", payload, off)[0]
        for known, label in KNOWN_DOUBLES.items():
            if val == known or (known != 0 and abs(val - known) < 1e-9 * max(1.0, abs(known))):
                print(f" 0x{off:04X} double {val!r:>14} -> {label}")
    for off in range(0, len(payload) - 3):
        val = int.from_bytes(payload[off:off + 4], "little")
        if val in KNOWN_U32:
            print(f" 0x{off:04X} uint32 {val:>14} -> {KNOWN_U32[val]}")
    # FILETIME windows (2026 timestamps have a high word around 0x01DC).
    print("\n== Plausible FILETIME windows (Int64, year ~2020-2030) ==")
    for off in range(0, len(payload) - 7):
        ft = int.from_bytes(payload[off:off + 8], "little")
        # FILETIME for 2020-01-01 ~= 0x01D5C0.. and for 2030-01-01 ~= 0x01E0F6.. ; the gate on
        # the high word is deliberately loose (roughly mid-2019 to mid-2034).
        if 0x01D5_0000_0000_0000 <= ft <= 0x01E6_0000_0000_0000:
            print(f" 0x{off:04X} filetime 0x{ft:016X}")
    return 0


if __name__ == "__main__":
    sys.exit(main())
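
The FILETIME scan above prints raw 64-bit values. To compare a hit against the SQL ground truth it may help to convert it to a UTC timestamp. A minimal sketch, assuming the value is a standard Windows FILETIME (100 ns ticks since 1601-01-01 UTC); filetime_to_datetime is a hypothetical helper and is not part of this commit:

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME epoch: 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(ft: int) -> datetime:
    """Convert 100 ns FILETIME ticks to an aware UTC datetime (sub-microsecond part dropped)."""
    return FILETIME_EPOCH + timedelta(microseconds=ft // 10)
```

The same tick unit applies to -ResolutionTicks in Capture-SummaryRequest.ps1: 36,000,000,000 ticks divided by 10 is 3,600,000,000 microseconds, i.e. one hour.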