Merge feat/m1a-findings-r1.4-gethi: map the 2020 WCF string-handle wall

Probed M1 read items R1.1/R1.3/R1.4 live against the local 2020 server and
found them blocked, then documented the structural boundary (uint-handle ops
work; string-handle ops require unmapped native session registration). No
half-implementations shipped; 208 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-06-20 15:34:19 -04:00
3 changed files with 108 additions and 8 deletions
+34 -8
View File
@@ -24,6 +24,23 @@ HCAL replacement, built on the **2023 R2 gRPC transport**. Derived from
> golden-byte/unit-tested here but **cannot be live-verified** without an actual 2023 R2 server.
> Treat gRPC ops as unverified until then; the byte payloads remain the proven 2020 protocol.
> 🔬 **M1a re-classification (2026-06-20).** Two "trivial" items were live-probed against the
> 2020 WCF server and found **not deliverable here**, both for evidence-backed reasons:
> - **R1.3 `GetServerTimeZoneAsync`** — `Status.GetSystemTimeZoneName` is a client-side *stub*
> on 2020 (rc=0, empty value), same family as `GetServerTime`. gRPC/2023R2-only.
> - **R1.1 `ExecuteSqlCommandAsync`** — `ExeC` returns native error 51 (InvalidParameter);
> the contract-3 string-handle ops require an unmapped native session/filter registration
> step (the `StartTagQuery` wall).
>
> Takeaway: the M1a "cheap surface" is *cheap only on the 2023 R2 gRPC front door*. On 2020 WCF
> the boundary is the **handle type** (see the string-handle wall note under §1b and
> `docs/reverse-engineering/wcf-string-handle-wall.md`): **`uint`-handle ops work, `string`-handle
> ops are blocked.** GETHI/GetTepByNm were probed and confirmed blocked (not, as first guessed,
> reachable). The genuinely reachable next items on 2020 WCF are the remaining **`uint`-handle**
> ops: **R1.8/R1.9 StartQuery summary/state modes** and **R1.7 event filters** (filter bytes ride
> the proven `uint`-handle `StartEventQuery`). Everything string-handle waits on one RE target:
> the native session/filter registration.
## Guiding principles
1. **gRPC-first.** New ops go on the `RemoteGrpc` transport (clean protobuf envelope);
@@ -68,19 +85,28 @@ read/browse/status surface is Windows-free and the gRPC stack is the default pat
### 1a. Trivial (XSS each, no new payload format)
| ID | Capability | gRPC op | Notes |
|---|---|---|---|
| R1.1 | `ExecuteSqlCommandAsync` | `Retrieval.ExecuteSqlCommand` | string in → `iRetValue` + status; thin |
| ~~R1.1~~ | ~~`ExecuteSqlCommandAsync`~~ | `Retrieval.ExecuteSqlCommand` | **Blocked on 2020 WCF.** Live-probed 2026-06-20: `ExeC` returns native error type 4 / code **51 (InvalidParameter)** for every handle variant — same unmapped *native session/filter registration* prerequisite that blocks `StartTagQuery`/`QueryTag` (see `implementation-status.md` lines ~982, ~1404). Needs that registration RE'd, or a 2023 R2 gRPC server. Do not wire via guessed calls. |
| R1.2 | `GetRuntimeParameterAsync` | `Status.GetRuntimeParameter` | mirror `GetSystemParameter` |
| R1.3 | `GetServerTimeZoneAsync` | `Status.GetSystemTimeZoneName` | string out |
| ~~R1.3~~ | ~~`GetServerTimeZoneAsync`~~ | `Status.GetSystemTimeZoneName` | **gRPC/2023R2-only.** Verified 2026-06-20: over **2020 WCF** this op is a stub (rc=0, empty value) in the `GetServerTime` family — not shippable here. Build+verify only against a live 2023 R2 server. See `docs/reverse-engineering/wcf-status-localhost.md`. |
> ⛔ **String-handle wall (2026-06-20).** R1.4/R1.5/R1.6 (and R1.1) are **all blocked on 2020
> WCF** for the *same* reason: their ops take a **`string` GUID handle** and require an unmapped
> native session/filter registration. Probed live — GETHI returns code 1 for the exact native
> request shape across 5 handle formats + Stat.GetV priming; ExeC returns code 51. The proven
> surface uses **`uint`-handle** ops only. **One RE target — the native string-handle session
> registration — unblocks this whole sub-milestone.** Full analysis:
> `docs/reverse-engineering/wcf-string-handle-wall.md`. R1.8/R1.9 (StartQuery summary/state modes)
> are `uint`-handle and remain reachable on 2020 WCF.
### 1b. Bounded (decode one `bytes` payload; SM each)
| ID | Capability | gRPC op | Payload to decode | Depends |
|---|---|---|---|---|
| R1.4 | `GetHistorianInfoAsync` | `Status.GetHistorianInfo` | GETHI buffer (partly decoded; incl. `EventStorageMode`@514) | R0.5 |
| R1.5 | Extended-property **read** | `Retrieval.GetTagExtendedPropertiesFromName` | TEP result buffer | R0.5 |
| R1.6 | Localized-property **read** | `Retrieval.GetTagLocalizedPropertiesFromName` | localized buffer | R0.5 |
| R1.7 | Event **filters** | filter bytes in `Retrieval.StartEventQuery` | filter predicate encoding (name/op/value) | R0.5 |
| R1.8 | Analog-summary query | `Retrieval.StartQuery` (summary mode) | summary row layout | — |
| R1.9 | State-summary query | `Retrieval.StartQuery` (state mode) | state-summary row layout | — |
| ~~R1.4~~ | `GetHistorianInfoAsync` | `Status.GetHistorianInfo` | **string-handle wall** — GETHI returns code 1 on 2020 WCF (all handle/priming variants). GETHI buffer incl. `EventStorageMode`@514. | string-handle RE |
| ~~R1.5~~ | Extended-property **read** | `Retrieval.GetTagExtendedPropertiesFromName` |**string-handle wall** (GetTepByNm takes `string handle`). TEP result buffer. | string-handle RE |
| ~~R1.6~~ | Localized-property **read** | `Retrieval.GetTagLocalizedPropertiesFromName` | **string-handle wall** (same family). | string-handle RE |
| R1.7 | Event **filters** | filter bytes in `Retrieval.StartEventQuery` | filter predicate encoding (name/op/value)**`uint`-handle**, reachable | R0.5 |
| R1.8 | Analog-summary query | `Retrieval.StartQuery` (summary mode) | summary row layout **`uint`-handle**, reachable | — |
| R1.9 | State-summary query | `Retrieval.StartQuery` (state mode) | state-summary row layout **`uint`-handle**, reachable | — |
### 1c. Bounded config writes (SM each)
| ID | Capability | gRPC op | Payload | Notes |
@@ -26,9 +26,25 @@ Observed sanitized localhost results:
- `GetSystemParameter(handle: 0, "Version")` returns `false` with no error
buffer.
Re-tested 2026-06-20 with a **real authenticated client handle** (full Open2 auth
chain), not `handle: 0`:
- `GetSystemParameter(handle, "HistorianVersion")` → real version string (works;
shipped as `GetSystemParameterAsync`).
- `GetSystemTimeZoneName(handle)` → return code `0x00000000` (success) but an
**empty value string**. Same channel/handle that makes `GetSystemParameter`
return real data, so this is the op's own behavior, not an auth/marshalling
gap. `GetSystemTimeZoneName` is a member of the `GetServerTime` stub family:
the 2020 WCF path returns success without producing a value (the native client
computes the zone locally). It only becomes a real round-trip on the 2023 R2
gRPC front door (`Status.GetSystemTimeZoneName`), which is absent on this box.
Interpretation:
- `Stat` endpoint routing is confirmed, but status operations that require a
real client handle are not usable until managed session open is solved.
- `GetServerTime` should not be promoted into the public SDK as a real server
time call from this WCF path; native evidence shows it is a no-op stub here.
- **`GetServerTimeZoneAsync` (roadmap R1.3) is NOT a trivial WCF op on 2020** — it
is a stub returning empty. Do not ship it over the 2020 WCF transport. Deliver
it only against a live 2023 R2 gRPC server. Reclassified in `docs/plans/hcal-roadmap.md`.
@@ -0,0 +1,58 @@
# The 2020 WCF string-handle wall (2026-06-20)
Live-probing the local **Historian 2020** (WCF, port 32568) for HCAL roadmap M1
surfaced a clean structural boundary on what the pure-managed client can call. It
explains why R1.1/R1.4/R1.5 all fail and identifies the single RE target that
unblocks the rest of the M1 read surface.
## The dichotomy
Retrieval/Status/History ops split by the **type of their first (handle) parameter**:
| Handle type | Examples | Status on 2020 WCF |
|---|---|---|
| **`uint` client handle** (Open2 output) | `StartQuery2`, `GetNextQueryResultBuffer2`, `IsOriginalAllowed`, `GetTagInfosFromName`/`GetTagInfoFromName` (GetTgByNm), `GetSystemParameter`, `StartEventQuery`, `GetNextEventQueryResultBuffer`, `RegisterTags2`, `EnsureTags2`, `UpdateClientStatus3` | ✅ **work** — the proven read/browse/metadata/status-param/event/write surface |
| **`string` GUID handle** | `ExecuteSqlCommand` (ExeC), `StartTagQuery` (QTB), `QueryTag` (QTG), `GetHistorianInfo` (GETHI), `GetTagExtendedPropertiesFromName` (GetTepByNm), `GetTagInfosFromName2` (GetTgByNm2), `GetTagidsByTagnameAndSource` | ⛔ **blocked** — native error type 4, code **51 (InvalidParameter)** or **1 (Failure)** |
## Evidence (this probe + prior notes)
- **ExeC** → type 4 / code 51 for every handle variant (storageGuid, contextGuid).
Matches `implementation-status.md` ~982 / ~1404 ("StartTagQuery depends on earlier
native session/filter registration … do not wire through guessed calls").
- **GETHI** (`HistorianVersion` param query — the *exact* native request shape from
`BuildGetHistorianInfoRequest`, with `Stat.GetV ×2` priming) → type 4 / code **1**
for all five handle formats tried: storage-session GUID, context GUID, uint as
decimal, uint as `X8` hex, uint as `0x`-hex. In the only place GETHI is used (the
event-priming chain) its result is wrapped in `TryRun` and **discarded**, so there
was never evidence it actually returns data from the managed client.
- **GetTepByNm / QTB / QTG / GetTgByNm2** all take a `string handle` → same family.
## Why
The string-handle ops are keyed off a **native-side session/filter registration**
that the C++ client performs but the managed replay does not reproduce. The uint
client handle is the Open2 session token the server already trusts; the string GUID
handle indexes a *different* per-service registration table that stays empty unless
the native priming is replicated faithfully. `Stat.GetV ×2` alone is insufficient.
## Consequence for the roadmap
Every remaining **M1 read** item is a string-handle op:
- R1.1 `ExecuteSqlCommandAsync` (ExeC) — blocked
- R1.4 `GetHistorianInfoAsync` (GETHI) — blocked
- R1.5 extended-property read (GetTepByNm) — blocked (string handle, confirmed)
- R1.6 localized-property read — same family
So **M1 read-surface completion on 2020 WCF is gated entirely behind one RE target:
the native session/filter registration for string-handle ops.** Reverse-engineer it
once and the whole family unlocks. Until then, the alternatives are:
1. **RE the registration** — instrument the native `CRetrievalConnectionWCF` /
`CStatusConnectionWCF` priming between Open2 and the first successful string-handle
call (capture-tier; the highest-leverage single RE task for M1).
2. **2023 R2 gRPC server** — these ops are first-class on the gRPC front door, where
the handle/envelope differs and the registration wall may not apply.
Do **not** ship any string-handle op via guessed calls (project discipline:
"leave them throwing until evidence supports an implementation").