From 6a67a8366c6ac8ca5dd538d63fef38140682fbb0 Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Tue, 23 Jun 2026 14:12:45 -0400
Subject: [PATCH] handoff: record the event-row parser fix + mark the gRPC event angles exhausted

Updated the live-blocker item (1) to reflect that all four next-session
angles for the gRPC event zero-rows are now tested and ruled out
(transport, metadata/cert, topology, data store) and that the parse path
is verified against the provided client with a latent multi-row bug
fixed. Corrected the two historical event-row-layout skeletons that
wrongly described 0x1E as a per-row marker: it is a one-time buffer
header field and rows are markerless (which is why the old parser
returned only the first row of any multi-row buffer, on WCF too).

Co-Authored-By: Claude Opus 4.8 (1M context)
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
---
 docs/reverse-engineering/handoff.md | 98 ++++++++++++++++++-----------
 1 file changed, 61 insertions(+), 37 deletions(-)

diff --git a/docs/reverse-engineering/handoff.md b/docs/reverse-engineering/handoff.md
index 9092cbb..aa6bf87 100644
--- a/docs/reverse-engineering/handoff.md
+++ b/docs/reverse-engineering/handoff.md
@@ -38,28 +38,42 @@ reuses the proven 2020 WCF byte serializers/parsers unchanged inside protobuf
 
 **Everything still open is gated — none is a pure-code task:**
 
-1. **gRPC event ROW retrieval** (`ReadEventsAsync` #2) — **v8 EVENT-CONNECTION AUTH
-   SOLVED 2026-06-23 (merged `9a25fa4`); row retrieval is now CONNECTION-gated, not
-   a payload gap.** The v8 OpenConnection crypto wall is fully cracked + live-verified:
-   the event connection authenticates via **`HistoryService.ExchangeKey` (P-256 ECDH)
-   → client key = `SHA256(shared secret)` → credential token = `RC4(password-UTF16LE,
-   key=MD5(client key))`** (the native `HistorianCrypto.NRC4_V2.aahCryptV2` MD5-keyed
-   RC4 scheme). RE'd via Frida CNG hooks + dnlib IL extraction + an offline cracker;
-   implemented pure-managed, golden-tested, auth live-PASSES (past `132/171
-   AuthenticationFailed`). The `StartEventQuery` v6 request and the Event-type v8
-   `OpenConnection` (`ConnectionType=Event`) are also shipped. **BUT** the event query
-   still returns version-11 **rowCount-0** while the native returns 50 for a
-   BYTE-IDENTICAL request. Exhaustively proven NOT a client-payload issue: v8
-   `openParameters`, all str/uint handles, the request, registration (`RTag/EnsT=True`
-   + order), `queryRequestType=3`, gzip header — **all byte-match the native**. The
-   event RPCs succeed and return a valid EMPTY result (not a transport error), so it's
-   a **connection/server-level scoping difference** (session affinity tied to the
-   native `Grpc.Core` HTTP/2 connection or a connection identity). **Next session: see
-   the "NEXT SESSION — the server-side / connection angle" section of
-   `docs/reverse-engineering/grpc-event-query-capture.md`** (ordered, testable
-   hypotheses: HTTP/2-vs-gRPC-Web transport, TLS client cert, HTTP/2 frame capture, SQL
-   event-store scoping). Orchestrator stays on the no-row throw; `eventConnection: true`
-   is wired; opt-in `EventReadDiagnostic` test (`HISTORIAN_GRPC_EVENT_DIAG=1`).
+1. **gRPC event ROW retrieval** (`ReadEventsAsync` #2) — **AUTH-SOLVED / PARSE-VERIFIED /
+   RETRIEVAL-SERVER-GATED (every client-side angle exhausted 2026-06-23, merged `6faf8a5`).**
+   The v8 OpenConnection crypto wall is fully cracked + live-verified: the event connection
+   authenticates via **`HistoryService.ExchangeKey` (P-256 ECDH) → client key =
+   `SHA256(shared secret)` → credential token = `RC4(password-UTF16LE, key=MD5(client key))`**
+   (the native `HistorianCrypto.NRC4_V2.aahCryptV2` MD5-keyed RC4 scheme). RE'd via Frida CNG
+   hooks + dnlib IL extraction + an offline cracker; implemented pure-managed, golden-tested,
+   auth live-PASSES. The `StartEventQuery` v6 request and the Event-type v8 `OpenConnection`
+   (`ConnectionType=Event`) are shipped. **BUT** the query still returns **rowCount-0** while
+   the native returns 50 for a byte-identical request — and **all four next-session angles are
+   now tested and ruled out** (`grpc-event-query-capture.md`):
+   - **transport** — the stock client is *also* gRPC-Web/HTTP-1.1 (decompiled); plain native
+     HTTP/2 (`CreateHttp2`) returns the same 0 rows;
+   - **client metadata/cert** — decompiled + TLS-tee captured: gzip-only metadata, no-op
+     interceptor, **no TLS client cert** on either side;
+   - **connection topology** — the native splits services across 5 connections and queries on
+     a dedicated RetrievalService connection; replicating that (`HISTORIAN_GRPC_EVENT_SPLIT_CHANNEL=1`)
+     still returns 0 rows → the server correlates by **session handle, not connection**;
+   - **data store** — via SOCKS→SQL: the event store is global/unscoped (no per-connection
+     column); the `Events` view (served by the engine via the INSQL provider) returns 71,332
+     events for the same window the gRPC query gets 0.
+
+   So the gate is a **server-internal per-connection retrieval working-set** in the native
+   `HistorianClient` C++ core — not reconstructable from a pure-managed client. **PARSE PATH NOW
+   VERIFIED + a latent bug FIXED:** fed the provided stock client's real captured result buffer
+   (63,192 B, 50 events) through `HistorianEventRowProtocol.Parse` — it exposed that the parser
+   treated the one-time `0x1E` buffer header field as a per-row marker, decoding only the **first
+   row of any multi-row buffer**. This also hit the **shipped WCF event read** (identical
+   `0900 1E000000 0700` header). Fixed to a 10-byte buffer header + **markerless rows**,
+   accepting container version 9 (WCF) and 11 (gRPC); the real 50-row buffer now decodes to exactly
+   50 events (`Parse_RealStockClientCapture_DecodesAllEvents`, gated on `HISTORIAN_EVENT_CAPTURE_NDJSON`).
+   So if the server gate ever opens, decoded events flow through correctly on both transports; until
+   then the orchestrator stays on the no-row throw (`eventConnection: true` wired; opt-in
+   `EventReadDiagnostic` test, `HISTORIAN_GRPC_EVENT_DIAG=1`). Diagnostics retained: the `httpcap`
+   TLS-tee proxy, `CreateHttp2`/`SPLIT_CHANNEL` switches, the `sqlschema` SOCKS→SQL probe, the
+   `capture-event` harness (native, returns rows).
 2. **R4.3 active-SF magnitude** — needs an **SF-active server** (D2 storage-engine console handle).
 3. **SendEvent over gRPC** — **capture-gated**: no distinct RPC, framing uncaptured.
 
@@ -1041,27 +1055,35 @@ record 5 of `instrumented-wcf-readmessage/readmessage-capture-event-latest.ndjso
 
 ### Event-row parser
 
-`Wcf/HistorianEventRowProtocol.Parse(ReadOnlySpan<byte>)` parses the
-version-9 row buffer:
+> **CORRECTED 2026-06-23 (merged `6faf8a5`).** The skeleton below mis-read the `0x1E`
+> as a *per-row* marker. Verifying the parser against the provided stock client's real
+> 50-event buffer proved `0x1E` is a **one-time buffer-level header field**, and the rows
+> are **markerless** — so the original parser silently returned only the **first** row of
+> any multi-row buffer (on WCF too). The corrected layout and behaviour are below.
+
+`Wcf/HistorianEventRowProtocol.Parse(ReadOnlySpan<byte>)` parses the row buffer
+(container version 9 for WCF, 11 for 2023 R2 gRPC — both accepted):
 
 ```text
-UInt16 version = 9
+UInt16 version = 9 (WCF) | 11 (gRPC)
 UInt32 rowCount
-N rows, each:
-  UInt32 rowMarker = 0x1E
+UInt32 headerField = 0x1E // ONE buffer-level field, NOT a per-row marker
+N rows, each (MARKERLESS):
   UInt16 rowFormat = 7
   Int64 filetimeUtc (event time)
   UInt16 × 8 fieldOffsets (opaque — purpose not fully decoded)
-  Property bag (sequence of name=value pairs; first name is the event type)
+  UInt16 propertyCount
+  Property bag (propertyCount × name=value pairs; first field is the event type)
 ```
 
-The parser extracts `EventTimeUtc` and `Type` (the first compact-ASCII-string
-in the property bag) for each row, and seeks forward to the next row by
-scanning for the next `1E 00 00 00 07 00` marker. Property-bag value
-encoding is partially decoded (compact ASCII `09 LEN 00 …`, UTF-16 strings
-`43 UInt32 LEN × UInt16`, integers with markers in the `0x88–0x8B` range,
-8-byte FILETIMEs) but **value parsing is intentionally not implemented yet**
-— it requires more reverse-engineering and would need sanitized fixtures.
+The parser reads the 10-byte buffer header (skipping the `0x1E` field once), then walks
+each markerless row by length: `rowFormat(2) + filetime(8) + 8×UInt16 slots + compact-ASCII
+type + propertyCount + propertyCount × (name + value)`. Value encoding **is** implemented
+(compact ASCII `09 LEN 00 …`, Boolean `0x02`, GUID `0x10`, FILETIME `0x18`, Int32 `0x31`,
+UTF-16 `0x43`; unknown markers preserve raw bytes). Verified against the provided client's
+real buffer: `Parse_RealStockClientCapture_DecodesAllEvents` decodes all 50 events (25
+Alarm.Set + 25 Alarm.Clear) to end-of-buffer (gated on `HISTORIAN_EVENT_CAPTURE_NDJSON`),
+plus a synthetic v11 golden test.
 
 5 unit tests in `HistorianEventRowProtocolTests.cs` cover empty buffer,
 zero-row, wrong-version, two-row synthetic, and missing-marker. Test count
@@ -1159,10 +1181,12 @@ Typemarker dispatch:
 Unknown markers preserve the raw `length` value bytes as a `byte[]` in the
 property dictionary.
 
-Each row layout (refines the earlier skeleton):
+Each row layout (**corrected 2026-06-23** — see the "Event-row parser" note above; the
+`0x1E` is a one-time buffer header field, NOT a per-row marker, and rows are markerless):
 
 ```text
-UInt32 rowMarker = 0x1E
+buffer header: UInt16 version (9|11) + UInt32 rowCount + UInt32 headerField (0x1E)
+each row (markerless):
 UInt16 rowFormat = 7
 Int64 eventTimeUtcFiletime
 UInt16 × 8 // purpose unclear