docs(grpc-events): document the server-side/connection angle for next session

Records the row-retrieval pickup now that the v8 ExchangeKey auth is solved and the
gap is proven connection-level (not client payload):
- grpc-event-query-capture.md: a "NEXT SESSION — the server-side / connection angle"
  section — what's already proven (don't redo), the in-place tooling, and ordered,
  testable hypotheses (HTTP/2 vs gRPC-Web transport [leading], TLS client cert,
  HTTP/2 frame capture, SQL event-store scoping).
- handoff.md item 1: updated to "v8 auth solved; row retrieval connection-gated",
  pointing at the NEXT SESSION section; the "to move any item" summary updated.

Doc-only; sanitization scan clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
Joseph Doherty
2026-06-23 12:37:22 -04:00
parent 9a25fa4ef7
commit 88287a8c66
2 changed files with 74 additions and 24 deletions
@@ -333,6 +333,53 @@ unfixable by, client payload matching.** Closing it needs server-side insight or
hard wall — pure managed, golden-tested, auth live-verified. Orchestrator stays on the no-row throw;
gated test unchanged.
### NEXT SESSION — the server-side / connection angle (row retrieval pickup)
Client payloads are exhausted (byte-identical to the native, proven above). The next investigation is
**connection-level**, not wire-payload. Pursue in roughly this order; each is concrete and testable.
**Already proven — do NOT redo:** auth works (ExchangeKey ECDH + RC4 token, live-verified); v8
`openParameters`, all handles (str/uint), `StartEventQuery` request, registration (`RTag/EnsT=True` +
order), `queryRequestType=3`, gzip header — all byte-match the native. Events exist (native returns 50
*now*). The event RPCs succeed over our transport and return a valid version-11 **rowCount-0** (not a
transport error). So the server scopes 0 events to *our* connection specifically.
**Tooling already in place:** opt-in diagnostic test `EventReadDiagnostic_OverGrpc_PrintsJourney`
(env `HISTORIAN_GRPC_EVENT_DIAG=1`, prints registration outcomes, handles, result hex, v8 buffer);
the `capture-event` harness scenario (native, returns rows); `instrument-grpc-nonstream` now logs
string/uint handle fields too; the CNG Frida hook. Live recipe: set `HISTORIAN_GRPC_HOST`/`_PORT
32565`/`_TLS true`/`_DNSID` to the 2023 R2 server + domain creds (strip quotes); reach the box per the
live-server access reference.
1. **Transport: native `Grpc.Core` HTTP/2 vs our `Grpc.Net.Client` + `GrpcWebHandler` (gRPC-Web).**
This is the leading hypothesis — the strongest remaining difference. Reads work over gRPC-Web *and*
return rows, so gRPC-Web isn't broken in general; but events are connection-scoped and may require a
**native HTTP/2** connection. TEST: build the event channel WITHOUT the `GrpcWebHandler` wrap (plain
HTTP/2 `GrpcChannel`) in `HistorianGrpcChannelFactory` for the event path only, and re-run the
diagnostic. If rows flow → gate found. (Mind TLS/ALPN over the loopback tunnel — may need
`HttpVersion = 2.0`/`HttpVersionPolicy.RequestVersionExact`.)
2. **TLS client identity / certificate.** The native used `SecurityMode=TransportCertificate`. Determine
whether it presents a **client certificate** the server uses to scope events (our SDK presents none —
`AllowUntrustedServerCertificate=true`, server cert only). TEST: capture the TLS handshake (e.g.
`SSLKEYLOGFILE` + Wireshark, or a decrypting proxy) for a native `capture-event` run and check the
Certificate message; if a client cert is presented, replicate it.
3. **HTTP/2-level capture.** The byte[]/handle capture is RPC-payload only. Capture the actual HTTP/2
frames (HEADERS/SETTINGS/stream IDs, connection reuse) for the native run vs ours — via a
TLS-decrypting mitm on the loopback forward — to see any connection-level header/affinity our capture
can't see.
4. **Server-side ground truth.** Via the SOCKS→SQL relay (user-authorized, read-only), inspect the
`Runtime.dbo.Events` schema for any per-connection / per-client / source-session column that would
explain why the server returns the rows to the native connection but not ours. Also check whether the
StorageService/event-store path has a connection-scoping notion the History-service event query
depends on.
If 1–4 don't crack it, the realistic conclusion is that gRPC event-row retrieval has a server-side
connection-identity dependency not reachable from a pure-managed client, and it stays documented as
auth-solved / retrieval-connection-gated.
**2 of 3 layers cleared** (key exchange + client key); the 3rd (token construction) is localized to a
specific managed method, pending dnlib extraction. ExchangeKey + the v8 serializer are committed; the
orchestrator stays on v6 (set `eventConnection: true` to re-arm once the token construction lands). The
+27 -24
@@ -38,26 +38,28 @@ reuses the proven 2020 WCF byte serializers/parsers unchanged inside protobuf
**Everything still open is gated — none is a pure-code task:**
-1. **gRPC event ROW retrieval** (`ReadEventsAsync` #2) — **CAPTURED + DIAGNOSED
-2026-06-22 (merged `8ad160b`); remaining gate is now a scoped RE+impl effort,
-not a capture.** The `capture-event` harness scenario drove the stock 2023 R2
-client over an Event-type gRPC connection and captured it reading **50 events**
-(the live server is event-bearing — SQL ground truth via the INSQL linked
-server: `Runtime.dbo.Events` = 19,356 rows/30d, 90,944/365d). Two gaps vs the
-SDK: **(a)** the working `StartEventQuery` request is **version 6** (byte 0 =
-`06` + a 5-byte trailing pad), the SDK sent v5 — **SHIPPED**
-(`HistorianEventQueryProtocol` `version` param, default 5 = WCF; the gRPC
-orchestrator passes 6; golden-tested). **(b) the real gate:** rows flow only on
-an **Event-type connection**. The native `OpenConnection.openParameters` is
-format **v8** with a `ConnectionType` byte (Event `01` / Process `02`) right
-after `ClientType`; the SDK's **v6** Open2 buffer has no such field (it writes
-`ClientType` then `ConnectionMode` back-to-back), so the read-capable v6
-connection cannot be marked as Event. `ConnectionMode` is not the lever (2020
-WCF events work at `0x402`). Making event rows flow needs the SDK to emit the
-native v8 `OpenConnection` with `ConnectionType=Event` (and likely the
-`ExchangeKey` cert auth path) — a substantial follow-on. Full evidence +
-byte-level diagnosis: `docs/reverse-engineering/grpc-event-query-capture.md`.
-Gated test still pins the no-row throw; bounded ≤30s.
+1. **gRPC event ROW retrieval** (`ReadEventsAsync` #2) — **v8 EVENT-CONNECTION AUTH
+SOLVED 2026-06-23 (merged `9a25fa4`); row retrieval is now CONNECTION-gated, not
+a payload gap.** The v8 OpenConnection crypto wall is fully cracked + live-verified:
+the event connection authenticates via **`HistoryService.ExchangeKey` (P-256 ECDH)
+→ client key = `SHA256(shared secret)` → credential token = `RC4(password-UTF16LE,
+key=MD5(client key))`** (the native `HistorianCrypto.NRC4_V2.aahCryptV2` MD5-keyed
+RC4 scheme). RE'd via Frida CNG hooks + dnlib IL extraction + an offline cracker;
+implemented pure-managed, golden-tested, auth live-PASSES (past `132/171
+AuthenticationFailed`). The `StartEventQuery` v6 request and the Event-type v8
+`OpenConnection` (`ConnectionType=Event`) are also shipped. **BUT** the event query
+still returns version-11 **rowCount-0** while the native returns 50 for a
+BYTE-IDENTICAL request. Exhaustively proven NOT a client-payload issue: v8
+`openParameters`, all str/uint handles, the request, registration (`RTag/EnsT=True`
++ order), `queryRequestType=3`, gzip header — **all byte-match the native**. The
+event RPCs succeed and return a valid EMPTY result (not a transport error), so it's
+a **connection/server-level scoping difference** (session affinity tied to the
+native `Grpc.Core` HTTP/2 connection or a connection identity). **Next session: see
+the "NEXT SESSION — the server-side / connection angle" section of
+`docs/reverse-engineering/grpc-event-query-capture.md`** (ordered, testable
+hypotheses: HTTP/2-vs-gRPC-Web transport, TLS client cert, HTTP/2 frame capture, SQL
+event-store scoping). Orchestrator stays on the no-row throw; `eventConnection: true`
+is wired; opt-in `EventReadDiagnostic` test (`HISTORIAN_GRPC_EVENT_DIAG=1`).
2. **R4.3 active-SF magnitude** — needs an **SF-active server** (D2 storage-engine
console handle).
3. **SendEvent over gRPC** – **capture-gated**: no distinct RPC, framing uncaptured.
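The auth chain summarized in item 1 (client key = `SHA256(shared secret)`, credential token = `RC4(password-UTF16LE, key=MD5(client key))`) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the committed pure-managed implementation: the function names are hypothetical, the P-256 ECDH step is omitted (the shared secret is taken as input, and its exact byte encoding is an assumption), and any framing the native `aahCryptV2` scheme adds around the token is not shown.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling, then XOR with the generated keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def derive_client_key(shared_secret: bytes) -> bytes:
    # Hypothetical helper: client key = SHA256(shared secret).
    # Assumes the ECDH output is already available as raw bytes.
    return hashlib.sha256(shared_secret).digest()


def build_credential_token(client_key: bytes, password: str) -> bytes:
    # Hypothetical helper: token = RC4(password as UTF-16LE, key = MD5(client key)).
    rc4_key = hashlib.md5(client_key).digest()
    return rc4(rc4_key, password.encode("utf-16-le"))
```

Because RC4 is symmetric, applying `rc4` again with the same MD5-derived key recovers the UTF-16LE password, which is a convenient offline check when comparing a computed token against a captured one.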
@@ -75,10 +77,11 @@ reuses the proven 2020 WCF byte serializers/parsers unchanged inside protobuf
8. **Deferred-by-design** items (`write-commands` D1–D3, non-analog tag create,
etc.) — bounded out until an explicit customer/user demand signal.
-To move any remaining item you need a **scoped RE+impl effort** (the v8
-Event-type `OpenConnection` — item 1, already captured + diagnosed), a **fresh
-native capture** (SendEvent gRPC framing — item 3), a **different server**
-(SF-active for item 2), or a **demand signal** to unlock a deferred item.
+To move any remaining item you need a **server-side / connection-level angle**
+(item 1 — v8 event auth is solved; row retrieval is connection-gated, see the
+NEXT SESSION section of `grpc-event-query-capture.md`), a **fresh native capture**
+(SendEvent gRPC framing — item 3), a **different server** (SF-active for item 2),
+or a **demand signal** to unlock a deferred item.
Live-server gRPC probe recipe: set
`HISTORIAN_GRPC_HOST`/`_PORT 32565`/`_TLS true`/`_DNSID` + domain creds (strip
quotes — `reference_wonder_sql_vd03_credentials`) and run the gated