handoff: record the event-row parser fix + mark the gRPC event angles exhausted

Updated the live-blocker item (1) to reflect that all four next-session angles for the
gRPC event zero-rows are now tested and ruled out (transport, metadata/cert, topology,
data store) and that the parse path is verified against the provided client with a latent
multi-row bug fixed. Corrected the two historical event-row-layout skeletons that wrongly
described 0x1E as a per-row marker: it is a one-time buffer header field and rows are
markerless (which is why the old parser returned only the first row of any multi-row
buffer, on WCF too).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
Joseph Doherty
2026-06-23 14:12:45 -04:00
parent 6faf8a5f30
commit 6a67a8366c
+61 -37
@@ -38,28 +38,42 @@ reuses the proven 2020 WCF byte serializers/parsers unchanged inside protobuf
**Everything still open is gated — none is a pure-code task:**
1. **gRPC event ROW retrieval** (`ReadEventsAsync` #2) — **v8 EVENT-CONNECTION AUTH
SOLVED 2026-06-23 (merged `9a25fa4`); row retrieval is now CONNECTION-gated, not
a payload gap.** The v8 OpenConnection crypto wall is fully cracked + live-verified:
the event connection authenticates via **`HistoryService.ExchangeKey` (P-256 ECDH)
→ client key = `SHA256(shared secret)` → credential token = `RC4(password-UTF16LE,
key=MD5(client key))`** (the native `HistorianCrypto.NRC4_V2.aahCryptV2` MD5-keyed
RC4 scheme). RE'd via Frida CNG hooks + dnlib IL extraction + an offline cracker;
implemented pure-managed, golden-tested, auth live-PASSES (past `132/171
AuthenticationFailed`). The `StartEventQuery` v6 request and the Event-type v8
`OpenConnection` (`ConnectionType=Event`) are also shipped. **BUT** the event query
still returns version-11 **rowCount-0** while the native returns 50 for a
BYTE-IDENTICAL request. Exhaustively proven NOT a client-payload issue: v8
`openParameters`, all str/uint handles, the request, registration (`RTag/EnsT=True`
+ order), `queryRequestType=3`, gzip header — **all byte-match the native**. The
event RPCs succeed and return a valid EMPTY result (not a transport error), so it's
a **connection/server-level scoping difference** (session affinity tied to the
native `Grpc.Core` HTTP/2 connection or a connection identity). **Next session: see
the "NEXT SESSION — the server-side / connection angle" section of
`docs/reverse-engineering/grpc-event-query-capture.md`** (ordered, testable
hypotheses: HTTP/2-vs-gRPC-Web transport, TLS client cert, HTTP/2 frame capture, SQL
event-store scoping). Orchestrator stays on the no-row throw; `eventConnection: true`
is wired; opt-in `EventReadDiagnostic` test (`HISTORIAN_GRPC_EVENT_DIAG=1`).
1. **gRPC event ROW retrieval** (`ReadEventsAsync` #2) — **AUTH-SOLVED / PARSE-VERIFIED /
RETRIEVAL-SERVER-GATED (every client-side angle exhausted 2026-06-23, merged `6faf8a5`).**
The v8 OpenConnection crypto wall is fully cracked + live-verified: the event connection
authenticates via **`HistoryService.ExchangeKey` (P-256 ECDH) → client key =
`SHA256(shared secret)` → credential token = `RC4(password-UTF16LE, key=MD5(client key))`**
(the native `HistorianCrypto.NRC4_V2.aahCryptV2` MD5-keyed RC4 scheme). RE'd via Frida CNG
hooks + dnlib IL extraction + an offline cracker; implemented pure-managed, golden-tested,
auth live-PASSES. The `StartEventQuery` v6 request and the Event-type v8 `OpenConnection`
(`ConnectionType=Event`) are shipped. **BUT** the query still returns **rowCount-0** while
the native returns 50 for a byte-identical request — and **all four next-session angles are
now tested and ruled out** (`grpc-event-query-capture.md`):
- **transport** — the stock client is *also* gRPC-Web/HTTP-1.1 (decompiled); plain native
HTTP/2 (`CreateHttp2`) returns the same 0 rows;
- **client metadata/cert** — decompiled + TLS-tee captured: gzip-only metadata, no-op
interceptor, **no TLS client cert** on either side;
- **connection topology** — the native splits services across 5 connections and queries on
a dedicated RetrievalService connection; replicating that (`HISTORIAN_GRPC_EVENT_SPLIT_CHANNEL=1`)
still returns 0 rows → the server correlates by **session handle, not connection**;
- **data store** — via SOCKS→SQL: the event store is global/unscoped (no per-connection
column); the `Events` view (served by the engine via the INSQL provider) returns 71,332
events for the same window the gRPC query gets 0.
So the gate is a **server-internal per-connection retrieval working-set** in the native
`HistorianClient` C++ core — not reconstructable from a pure-managed client. **PARSE PATH NOW
VERIFIED + a latent bug FIXED:** fed the provided stock client's real captured result buffer
(63,192 B, 50 events) through `HistorianEventRowProtocol.Parse` — it exposed that the parser
treated the one-time `0x1E` buffer header field as a per-row marker, decoding only the **first
row of any multi-row buffer**. This also hit the **shipped WCF event read** (identical
`0900 <rowCount> 1E000000 0700` header). Fixed to a 10-byte buffer header + **markerless rows**,
accepting container version 9 (WCF) and 11 (gRPC); the real 50-row buffer now decodes to exactly
50 events (`Parse_RealStockClientCapture_DecodesAllEvents`, gated on `HISTORIAN_EVENT_CAPTURE_NDJSON`).
So if the server gate ever opens, decoded events flow through correctly on both transports; until
then the orchestrator stays on the no-row throw (`eventConnection: true` wired; opt-in
`EventReadDiagnostic` test, `HISTORIAN_GRPC_EVENT_DIAG=1`). Diagnostics retained: the `httpcap`
TLS-tee proxy, `CreateHttp2`/`SPLIT_CHANNEL` switches, the `sqlschema` SOCKS→SQL probe, the
`capture-event` harness (native, returns rows).
2. **R4.3 active-SF magnitude** — needs an **SF-active server** (D2 storage-engine
console handle).
3. **SendEvent over gRPC** - **capture-gated**: no distinct RPC, framing uncaptured.
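
The credential-token derivation named in item 1 (`SHA256(shared secret)` → `MD5(client key)` → RC4 over the UTF-16LE password) can be sketched as below. This is a minimal illustration, not the shipped implementation: the P-256 ECDH step is assumed to have already produced `shared_secret` as raw bytes, the function names `rc4` and `credential_token` are hypothetical, and any extra framing the real token may carry is not modelled.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling, then XOR the data with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def credential_token(shared_secret: bytes, password: str) -> bytes:
    """Hypothetical helper following the chain described in the notes."""
    # client key = SHA256(shared secret)
    client_key = hashlib.sha256(shared_secret).digest()
    # RC4 key = MD5(client key)
    rc4_key = hashlib.md5(client_key).digest()
    # token = RC4(password encoded as UTF-16LE)
    return rc4(rc4_key, password.encode("utf-16-le"))
```

Because RC4 is a symmetric stream cipher, applying `rc4` again with the same key recovers the UTF-16LE password bytes, which is a convenient self-check when comparing against a captured token.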
@@ -1041,27 +1055,35 @@ record 5 of `instrumented-wcf-readmessage/readmessage-capture-event-latest.ndjso
### Event-row parser
`Wcf/HistorianEventRowProtocol.Parse(ReadOnlySpan<byte>)` parses the
version-9 row buffer:
> **CORRECTED 2026-06-23 (merged `6faf8a5`).** The skeleton below mis-read the `0x1E`
> as a *per-row* marker. Verifying the parser against the provided stock client's real
> 50-event buffer proved `0x1E` is a **one-time buffer-level header field**, and the rows
> are **markerless** — so the original parser silently returned only the **first** row of
> any multi-row buffer (on WCF too). The corrected layout and behaviour are below.
`Wcf/HistorianEventRowProtocol.Parse(ReadOnlySpan<byte>)` parses the row buffer
(container version 9 for WCF, 11 for 2023 R2 gRPC — both accepted):
```text
UInt16 version = 9
UInt16 version = 9 (WCF) | 11 (gRPC)
UInt32 rowCount
N rows, each:
UInt32 rowMarker = 0x1E
UInt32 headerField = 0x1E // ONE buffer-level field, NOT a per-row marker
N rows, each (MARKERLESS):
UInt16 rowFormat = 7
Int64 filetimeUtc (event time)
UInt16 × 8 fieldOffsets (opaque — purpose not fully decoded)
Property bag (sequence of name=value pairs; first name is the event type)
UInt16 propertyCount
Property bag (propertyCount × name=value pairs; first field is the event type)
```
The parser extracts `EventTimeUtc` and `Type` (the first compact-ASCII-string
in the property bag) for each row, and seeks forward to the next row by
scanning for the next `1E 00 00 00 07 00` marker. Property-bag value
encoding is partially decoded (compact ASCII `09 LEN 00 …`, UTF-16 strings
`43 UInt32 LEN × UInt16`, integers with markers in the `0x88-0x8B` range,
8-byte FILETIMEs) but **value parsing is intentionally not implemented yet**
— it requires more reverse-engineering and would need sanitized fixtures.
The parser reads the 10-byte buffer header (skipping the `0x1E` field once), then walks
each markerless row by length: `rowFormat(2) + filetime(8) + 8×UInt16 slots + compact-ASCII
type + propertyCount + propertyCount × (name + value)`. Value encoding **is** implemented
(compact ASCII `09 LEN 00 …`, Boolean `0x02`, GUID `0x10`, FILETIME `0x18`, Int32 `0x31`,
UTF-16 `0x43`; unknown markers preserve raw bytes). Verified against the provided client's
real buffer: `Parse_RealStockClientCapture_DecodesAllEvents` decodes all 50 events (25
Alarm.Set + 25 Alarm.Clear) to end-of-buffer (gated on `HISTORIAN_EVENT_CAPTURE_NDJSON`),
plus a synthetic v11 golden test.
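
The corrected header handling can be sketched in Python as below. This is an illustrative sketch, not the C# `HistorianEventRowProtocol.Parse` code: it assumes little-endian fields (consistent with the `0900 <rowCount> 1E000000 0700` header bytes), the function names are hypothetical, and it reads only the 10-byte buffer header and the fixed 26-byte prefix of one markerless row. The variable-length type string and property bag are not parsed here.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
HEADER_SIZE = 10       # UInt16 version + UInt32 rowCount + UInt32 headerField
ROW_PREFIX_SIZE = 26   # UInt16 rowFormat + Int64 filetime + 8 x UInt16 slots


def parse_buffer_header(buf: bytes):
    """Read the one-time buffer header; returns (version, row_count, header_field)."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("buffer shorter than the 10-byte header")
    version, row_count, header_field = struct.unpack_from("<HII", buf, 0)
    if version not in (9, 11):  # 9 = WCF, 11 = gRPC
        raise ValueError(f"unsupported container version {version}")
    return version, row_count, header_field


def parse_row_prefix(buf: bytes, offset: int):
    """Read the fixed part of a markerless row starting at `offset`."""
    row_format, filetime = struct.unpack_from("<Hq", buf, offset)
    slots = struct.unpack_from("<8H", buf, offset + 10)
    event_time = FILETIME_EPOCH + timedelta(microseconds=filetime // 10)
    # The caller continues with the type string and property bag from here.
    return row_format, event_time, slots, offset + ROW_PREFIX_SIZE
```

The first row begins at offset 10, immediately after the header. Each later row begins wherever the previous row's property bag ends, so a full parser has to decode every row's length rather than scan for a `0x1E` marker.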
5 unit tests in `HistorianEventRowProtocolTests.cs` cover empty buffer,
zero-row, wrong-version, two-row synthetic, and missing-marker. Test count
@@ -1159,10 +1181,12 @@ Type-marker dispatch:
Unknown markers preserve the raw `length` value bytes as a `byte[]` in
the property dictionary.
Each row layout (refines the earlier skeleton):
Each row layout (**corrected 2026-06-23** — see the "Event-row parser" note above; the
`0x1E` is a one-time buffer header field, NOT a per-row marker, and rows are markerless):
```text
UInt32 rowMarker = 0x1E
buffer header: UInt16 version (9|11) + UInt32 rowCount + UInt32 headerField (0x1E)
each row (markerless):
UInt16 rowFormat = 7
Int64 eventTimeUtcFiletime
UInt16 × 8 // purpose unclear