histsdk/docs/reverse-engineering/wcf-string-handle-wall.md
Joseph Doherty 4da5287d01 R1.2 GetRuntimeParameter + string-handle wall RESOLVED (handle-format bug)
Execute HCAL roadmap R1.2 (GetRuntimeParameterAsync) end-to-end, and in doing so
discover that the "string-handle wall" blocking R1.1/R1.4/R1.5/R1.6 was a handle
FORMAT bug, not a missing native session/filter registration.

R1.2 (shipped, live-verified):
- Captured native GetRuntimeParameter -> WCF op aa/Stat/GETRP (string-handle op,
  GETHI's shape), via scripts/Capture-RuntimeParam.ps1 + instrument-wcf-{write,read}message.
- HistorianRuntimeParameterProtocol serializes pRequestBuff (54 67 01 00 + uint
  nameCount + per-name uint charCount + UTF-16) and parses pResponseBuff (version +
  uint resultCount + CRetVariant 0x43 VT_BSTR + uint16 len + uint16 charCount + UTF-16).
- IStatusServiceContract2.GetRuntimeParameter (GETRP) op; HistorianWcfStatusClient
  passes the Open2 storage-session GUID as the string handle, UPPERCASE.
- Public HistorianClient.GetRuntimeParameterAsync(name) via the dialect.
- Golden WcfRuntimeParameterProtocolTests + gated live test; returns HistorianVersion.

String-handle wall RESOLVED (proven, public APIs deferred):
- The Open2 storage GUID works as the string handle when sent UPPERCASE
  (ToString("D").ToUpperInvariant()); earlier "blocked" probes used lowercase.
- Live-probed GETHI (R1.4) -> returns data; ExeC (R1.1) -> Retr.GetV prime -> ExeC ->
  GetR returns a BinaryFormatter-serialized .NET DataTable. Gated
  StringHandleProbeDiagnosticTests + scripts/Capture-ExecSql.ps1 + exec-sql harness scenario.
- Docs flipped: wcf-string-handle-wall.md RESOLVED banner; roadmap R1.1/R1.4 reachable,
  R1.5/R1.6 likely; wcf-status-localhost.md GETRP section.
- R1.1/R1.4 public APIs NOT shipped: ExeC needs a GetR paging loop + a BinaryFormatter-
  stream parser (BinaryFormatter is removed from .NET 10); GETHI full-info struct needs
  its own capture.

223 unit tests pass; gated live tests green against the local 2020 Historian.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-20 22:10:31 -04:00

# The 2020 WCF string-handle wall (2026-06-20)
> ## ✅✅ RESOLVED (2026-06-20): the "wall" was a handle-FORMAT bug, not a registration wall.
>
> The string-handle ops are reachable from the pure-managed client after all. The Open2
> storage-session GUID must be passed as the `string handle` **UPPERCASE, dash-separated,
> no braces** — `storageSessionId.ToString("D").ToUpperInvariant()`. The earlier probes that
> "proved" the wall passed the GUID in .NET's default **lowercase** `ToString("D")`, which the
> server's session table does not match. Live-verified end-to-end against the local 2020 server:
> - **GETRP** (R1.2) → returns the runtime `HistorianVersion` (shipped).
> - **GETHI** (R1.4) → `returned=True`, returns the version buffer (`0C000000` + UTF-16 "20,0,000,000").
> - **ExeC** (R1.1) → `returned=True`, `Retr.GetV` prime + `ExeC("SELECT 1 AS ProbeValue", option=0)`
> yields `queryHandle`, then `GetR(handle, queryHandle, sequence=0)` returns a 1232-byte result =
> a **BinaryFormatter-serialized .NET DataTable** (stream header `…System.Data, Version=4.0.0.0…`).
>
> Probes: gated `StringHandleProbeDiagnosticTests` (GETHI + ExeC). Captures:
> `scripts/Capture-RuntimeParam.ps1`, `scripts/Capture-ExecSql.ps1`. The handle for ExeC/GetR is the
> **same** Open2 storage-session GUID (confirmed equal to `outBuff[5..21]`). The original analysis below is
> retained for history; treat its "blocked" conclusions as **superseded** — the only missing piece
> was the uppercase format. R1.5/R1.6 (GetTepByNm family) and QTB/QTG are very likely reachable the
> same way but have not yet been individually re-probed.
---
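
The GETHI buffer quoted in the banner reads naturally as a length-prefixed string: `0C000000` is 12 as a little-endian `uint32`, which matches the 12 characters of `20,0,000,000`. The Python sketch below, in the spirit of the repo's capture-decoding scripts, decodes a buffer of that shape. It assumes the prefix counts UTF-16 code units (not bytes) and that the text is UTF-16LE with no terminator. `decode_counted_utf16` is a hypothetical helper name, not part of the SDK.

```python
import struct


def decode_counted_utf16(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode a uint32-LE char count followed by that many UTF-16LE code units.

    Returns the decoded text and the offset just past it.
    Assumption: the count is in UTF-16 code units, not bytes.
    """
    if len(buf) < offset + 4:
        raise ValueError("buffer too short for the length prefix")
    (char_count,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    end = start + char_count * 2
    if end > len(buf):
        raise ValueError("length prefix runs past the end of the buffer")
    return buf[start:end].decode("utf-16-le"), end


# Rebuild the buffer described in the banner: 0C000000 + UTF-16 "20,0,000,000".
sample = bytes.fromhex("0C000000") + "20,0,000,000".encode("utf-16-le")
version, consumed = decode_counted_utf16(sample)
```

Returning the end offset alongside the text lets the same helper be reused if a response turns out to carry further fields after the string.
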
Live-probing the local **Historian 2020** (WCF, port 32568) for HCAL roadmap M1
surfaced a clean structural boundary on what the pure-managed client can call. It
explains why R1.1/R1.4/R1.5 all fail and identifies the single RE target that
unblocks the rest of the M1 read surface.
> ⚠️ **Superseded — see the RESOLVED banner above.** The boundary below is real *only* when the
> handle is sent lowercase. With the uppercased storage GUID the string-handle ops succeed.
## The dichotomy
Retrieval/Status/History ops split by the **type of their first (handle) parameter**:
| Handle type | Examples | Status on 2020 WCF |
|---|---|---|
| **`uint` client handle** (Open2 output) | `StartQuery2`, `GetNextQueryResultBuffer2`, `IsOriginalAllowed`, `GetTagInfosFromName`/`GetTagInfoFromName` (GetTgByNm), `GetSystemParameter`, `StartEventQuery`, `GetNextEventQueryResultBuffer`, `RegisterTags2`, `EnsureTags2`, `UpdateClientStatus3` | ✅ **work** — the proven read/browse/metadata/status-param/event/write surface |
| **`string` GUID handle** | `ExecuteSqlCommand` (ExeC), `StartTagQuery` (QTB), `QueryTag` (QTG), `GetHistorianInfo` (GETHI), `GetTagExtendedPropertiesFromName` (GetTepByNm), `GetTagInfosFromName2` (GetTgByNm2), `GetTagidsByTagnameAndSource` | ⛔ **blocked** — native error type 4, code **51 (InvalidParameter)** or **1 (Failure)** |
## Evidence (this probe + prior notes)
- **ExeC** → type 4 / code 51 for every handle variant (storageGuid, contextGuid).
Matches `implementation-status.md` ~982 / ~1404 ("StartTagQuery depends on earlier
native session/filter registration … do not wire through guessed calls").
- **GETHI** (`HistorianVersion` param query — the *exact* native request shape from
`BuildGetHistorianInfoRequest`, with `Stat.GetV ×2` priming) → type 4 / code **1**
for all five handle formats tried: storage-session GUID, context GUID, uint as
decimal, uint as `X8` hex, uint as `0x`-hex. In the only place GETHI is used (the
event-priming chain) its result is wrapped in `TryRun` and **discarded**, so there
was never evidence it actually returns data from the managed client.
- **GetTepByNm / QTB / QTG / GetTgByNm2** all take a `string handle` → same family.
## Why
The string-handle ops are keyed off a **native-side session/filter registration**
that the C++ client performs but the managed replay does not reproduce. The uint
client handle is the Open2 session token the server already trusts; the string GUID
handle indexes a *different* per-service registration table that stays empty unless
the native priming is replicated faithfully. `Stat.GetV ×2` alone is insufficient.
## Consequence for the roadmap
Every remaining **M1 read** item is a string-handle op:
- R1.1 `ExecuteSqlCommandAsync` (ExeC) — blocked
- R1.4 `GetHistorianInfoAsync` (GETHI) — blocked
- R1.5 extended-property read (GetTepByNm) — blocked (string handle, confirmed)
- R1.6 localized-property read — same family

So **M1 read-surface completion on 2020 WCF is gated entirely behind one RE target:
the native session/filter registration for string-handle ops.** Reverse-engineer it
once and the whole family unlocks. Until then, the alternatives are:
1. **RE the registration** — instrument the native `CRetrievalConnectionWCF` /
`CStatusConnectionWCF` priming between Open2 and the first successful string-handle
call (capture-tier; the highest-leverage single RE task for M1).
2. **2023 R2 gRPC server** — these ops are first-class on the gRPC front door, where
the handle/envelope differs and the registration wall may not apply.

Do **not** ship any string-handle op via guessed calls (project discipline:
"leave them throwing until evidence supports an implementation").
## ⚠️ Update (2026-06-20): GETRP punches through — the wall is not absolute
Roadmap **R1.2 `GetRuntimeParameterAsync`** turned out to be a **`string`-handle op**
(`aa/Stat/GETRP(string handle, byte[] pRequestBuff) → (bool, byte[] pResponseBuff,
byte[] errorBuffer)`) — the **same shape as GETHI**, and in the same native session it
uses the **same handle GUID** as GETHI (confirmed: the GUID equals the Open2 `outBuff`
storage-session id at `[5..21]`, the value the managed `ParseOpenConnectionResponse`
already extracts as `StorageSessionId`).

Yet GETRP **works from the pure-managed client**: it is live-verified and returns the runtime
`HistorianVersion` value `20,0,000,000`. The only material difference from the failed
GETHI probe is the **handle string format**: the native client sends the GUID
**UPPERCASE, dash-separated, no braces** (format example
`XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX`, all hex upper), i.e.
`storageSessionId.ToString("D").ToUpperInvariant()`. .NET's `Guid.ToString("D")` emits
lowercase hex, so a probe that passed the GUID without uppercasing it would not byte-match
the key in the server's session table.
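
The format rule is small enough to pin down in a few lines. The Python sketch below reads the 16 bytes at `outBuff[5..21]` and renders them as the uppercase, dash-separated, brace-free handle. It assumes those bytes follow the .NET `Guid(byte[])` layout (first three fields little-endian), which corresponds to Python's `uuid.UUID(bytes_le=...)`. The helper names and the sample buffer are made up for illustration and do not come from a capture.

```python
import uuid


def storage_session_id(out_buff: bytes) -> uuid.UUID:
    """Read the 16-byte storage-session id at outBuff[5..21].

    Assumption: the bytes use the .NET Guid(byte[]) layout, i.e. bytes_le.
    """
    raw = out_buff[5:21]
    if len(raw) != 16:
        raise ValueError("outBuff too short to contain the storage-session id")
    return uuid.UUID(bytes_le=raw)


def to_string_handle(session_id: uuid.UUID) -> str:
    """Render the handle uppercase, dash-separated, without braces."""
    return str(session_id).upper()


# Synthetic Open2 output: 5 placeholder bytes, a made-up GUID, trailing padding.
made_up = uuid.UUID("0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d")
out_buff = bytes(5) + made_up.bytes_le + bytes(4)

handle = to_string_handle(storage_session_id(out_buff))
# The lowercase form, equivalent to what .NET's ToString("D") produces.
lowercase = str(storage_session_id(out_buff))
```

The two strings differ only in case, which is why a case-sensitive key lookup on the server would treat them as different handles.
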

**Implication / open lead (not yet retested):** the GETHI/ExeC/QTB/QTG family failures
may be (at least partly) a **handle-format** issue, not (only) a missing native
registration step. The highest-value cheap follow-up is to **re-probe GETHI and ExeC
with the uppercased storage-session GUID** before assuming the registration wall. If
they also return data, the "wall" collapses to a formatting bug and R1.4/R1.5/R1.6/R1.1
may be reachable without any new RE. This has **not** been done yet — do not reclassify
those items until it is. GETRP is shipped because it was directly captured + live-verified
end-to-end; the rest remain `ProtocolEvidenceMissingException`/unprobed until tested.

See `HistorianRuntimeParameterProtocol`, `IStatusServiceContract2.GetRuntimeParameter`,
golden `WcfRuntimeParameterProtocolTests`, and capture tooling
`scripts/Capture-RuntimeParam.ps1` + `scripts/decode-runtime-param-capture.py`.
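
For orientation, the commit notes describe the GETRP `pRequestBuff` as `54 67 01 00`, a `uint` name count, then for each name a `uint` char count followed by UTF-16 text. The Python sketch below builds a buffer of that shape. It assumes little-endian integers, UTF-16LE text, and a char count measured in UTF-16 code units with no terminator; none of these is stated explicitly in the notes. `build_getrp_request` is a hypothetical name and this is not the `HistorianRuntimeParameterProtocol` implementation.

```python
import struct

# Leading bytes quoted in the commit notes; their meaning is not documented here.
REQUEST_HEADER = bytes.fromhex("54670100")


def build_getrp_request(names: list[str]) -> bytes:
    """Serialize header + uint32 name count + (uint32 char count + UTF-16LE) per name.

    Assumptions: little-endian integers; char count excludes any terminator.
    """
    parts = [REQUEST_HEADER, struct.pack("<I", len(names))]
    for name in names:
        encoded = name.encode("utf-16-le")
        parts.append(struct.pack("<I", len(encoded) // 2))  # count in code units
        parts.append(encoded)
    return b"".join(parts)


# Illustrative request for the parameter name mentioned in this note.
request = build_getrp_request(["HistorianVersion"])
```

Any buffer built this way should be compared against the golden bytes in `WcfRuntimeParameterProtocolTests` before being trusted.
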