docs(grpc-events): document the server-side/connection angle for next session

Records the row-retrieval pickup now that the v8 ExchangeKey auth is solved and the gap is proven connection-level (not client payload):

- grpc-event-query-capture.md: a "NEXT SESSION — the server-side / connection angle" section — what's already proven (don't redo), the in-place tooling, and ordered, testable hypotheses (HTTP/2 vs gRPC-Web transport [leading], TLS client cert, HTTP/2 frame capture, SQL event-store scoping).
- handoff.md item 1: updated to "v8 auth solved; row retrieval connection-gated", pointing at the NEXT SESSION section; the "to move any item" summary updated.

Doc-only; sanitization scan clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC

@@ -333,6 +333,53 @@ unfixable by, client payload matching.** Closing it needs server-side insight or
 hard wall — pure managed, golden-tested, auth live-verified. Orchestrator stays on the no-row throw;
 gated test unchanged.

+### NEXT SESSION — the server-side / connection angle (row retrieval pickup)
+
+Client payloads are exhausted (byte-identical to the native, proven above). The next investigation is
+**connection-level**, not wire-payload. Pursue in roughly this order; each is concrete and testable.
+
+**Already proven — do NOT redo:** auth works (ExchangeKey ECDH + RC4 token, live-verified); v8
+`openParameters`, all handles (str/uint), `StartEventQuery` request, registration (`RTag/EnsT=True` +
+order), `queryRequestType=3`, gzip header — all byte-match the native. Events exist (native returns 50
+*now*). The event RPCs succeed over our transport and return a valid version-11 **rowCount-0** (not a
+transport error). So the server scopes 0 events to *our* connection specifically.
+
+**Tooling already in place:** opt-in diagnostic test `EventReadDiagnostic_OverGrpc_PrintsJourney`
+(env `HISTORIAN_GRPC_EVENT_DIAG=1`, prints registration outcomes, handles, result hex, v8 buffer);
+the `capture-event` harness scenario (native, returns rows); `instrument-grpc-nonstream` now logs
+string/uint handle fields too; the CNG Frida hook. Live recipe: set `HISTORIAN_GRPC_HOST`/`_PORT
+32565`/`_TLS true`/`_DNSID` to the 2023 R2 server + domain creds (strip quotes); reach the box per the
+live-server access reference.
+
+1. **Transport: native `Grpc.Core` HTTP/2 vs our `Grpc.Net.Client` + `GrpcWebHandler` (gRPC-Web).**
+This is the leading hypothesis — the strongest remaining difference. Reads work over gRPC-Web *and*
+return rows, so gRPC-Web isn't broken in general; but events are connection-scoped and may require a
+**native HTTP/2** connection. TEST: build the event channel WITHOUT the `GrpcWebHandler` wrap (plain
+HTTP/2 `GrpcChannel`) in `HistorianGrpcChannelFactory` for the event path only, and re-run the
+diagnostic. If rows flow → gate found. (Mind TLS/ALPN over the loopback tunnel — may need
+`HttpVersion = 2.0`/`HttpVersionPolicy.RequestVersionExact`.)
+
+2. **TLS client identity / certificate.** The native used `SecurityMode=TransportCertificate`. Determine
+whether it presents a **client certificate** the server uses to scope events (our SDK presents none —
+`AllowUntrustedServerCertificate=true`, server cert only). TEST: capture the TLS handshake (e.g.
+`SSLKEYLOGFILE` + Wireshark, or a decrypting proxy) for a native `capture-event` run and check the
+Certificate message; if a client cert is presented, replicate it.
+
+3. **HTTP/2-level capture.** The byte[]/handle capture is RPC-payload only. Capture the actual HTTP/2
+frames (HEADERS/SETTINGS/stream IDs, connection reuse) for the native run vs ours — via a
+TLS-decrypting mitm on the loopback forward — to see any connection-level header/affinity our capture
+can't see.
+
+4. **Server-side ground truth.** Via the SOCKS→SQL relay (user-authorized, read-only), inspect the
+`Runtime.dbo.Events` schema for any per-connection / per-client / source-session column that would
+explain why the server returns the rows to the native connection but not ours. Also check whether the
+StorageService/event-store path has a connection-scoping notion the History-service event query
+depends on.
+
+If 1–4 don't crack it, the realistic conclusion is that gRPC event-row retrieval has a server-side
+connection-identity dependency not reachable from a pure-managed client, and it stays documented as
+auth-solved / retrieval-connection-gated.
+
 **2 of 3 layers cleared** (key exchange + client key); the 3rd (token construction) is localized to a
 specific managed method, pending dnlib extraction. ExchangeKey + the v8 serializer are committed; the
 orchestrator stays on v6 (set `eventConnection: true` to re-arm once the token construction lands). The
@@ -38,26 +38,28 @@ reuses the proven 2020 WCF byte serializers/parsers unchanged inside protobuf

 **Everything still open is gated — none is a pure-code task:**

-1. **gRPC event ROW retrieval** (`ReadEventsAsync` #2) — **CAPTURED + DIAGNOSED
-2026-06-22 (merged `8ad160b`); remaining gate is now a scoped RE+impl effort,
-not a capture.** The `capture-event` harness scenario drove the stock 2023 R2
-client over an Event-type gRPC connection and captured it reading **50 events**
-(the live server is event-bearing — SQL ground truth via the INSQL linked
-server: `Runtime.dbo.Events` = 19,356 rows/30d, 90,944/365d). Two gaps vs the
-SDK: **(a)** the working `StartEventQuery` request is **version 6** (byte 0 =
-`06` + a 5-byte trailing pad), the SDK sent v5 — **SHIPPED**
-(`HistorianEventQueryProtocol` `version` param, default 5 = WCF; the gRPC
-orchestrator passes 6; golden-tested). **(b) the real gate:** rows flow only on
-an **Event-type connection**. The native `OpenConnection.openParameters` is
-format **v8** with a `ConnectionType` byte (Event `01` / Process `02`) right
-after `ClientType`; the SDK's **v6** Open2 buffer has no such field (it writes
-`ClientType` then `ConnectionMode` back-to-back), so the read-capable v6
-connection cannot be marked as Event. `ConnectionMode` is not the lever (2020
-WCF events work at `0x402`). Making event rows flow needs the SDK to emit the
-native v8 `OpenConnection` with `ConnectionType=Event` (and likely the
-`ExchangeKey` cert auth path) — a substantial follow-on. Full evidence +
-byte-level diagnosis: `docs/reverse-engineering/grpc-event-query-capture.md`.
-Gated test still pins the no-row throw; bounded ≤30s.
+1. **gRPC event ROW retrieval** (`ReadEventsAsync` #2) — **v8 EVENT-CONNECTION AUTH
+SOLVED 2026-06-23 (merged `9a25fa4`); row retrieval is now CONNECTION-gated, not
+a payload gap.** The v8 OpenConnection crypto wall is fully cracked + live-verified:
+the event connection authenticates via **`HistoryService.ExchangeKey` (P-256 ECDH)
+→ client key = `SHA256(shared secret)` → credential token = `RC4(password-UTF16LE,
+key=MD5(client key))`** (the native `HistorianCrypto.NRC4_V2.aahCryptV2` MD5-keyed
+RC4 scheme). RE'd via Frida CNG hooks + dnlib IL extraction + an offline cracker;
+implemented pure-managed, golden-tested, auth live-PASSES (past `132/171
+AuthenticationFailed`). The `StartEventQuery` v6 request and the Event-type v8
+`OpenConnection` (`ConnectionType=Event`) are also shipped. **BUT** the event query
+still returns version-11 **rowCount-0** while the native returns 50 for a
+BYTE-IDENTICAL request. Exhaustively proven NOT a client-payload issue: v8
+`openParameters`, all str/uint handles, the request, registration (`RTag/EnsT=True`
++ order), `queryRequestType=3`, gzip header — **all byte-match the native**. The
+event RPCs succeed and return a valid EMPTY result (not a transport error), so it's
+a **connection/server-level scoping difference** (session affinity tied to the
+native `Grpc.Core` HTTP/2 connection or a connection identity). **Next session: see
+the "NEXT SESSION — the server-side / connection angle" section of
+`docs/reverse-engineering/grpc-event-query-capture.md`** (ordered, testable
+hypotheses: HTTP/2-vs-gRPC-Web transport, TLS client cert, HTTP/2 frame capture, SQL
+event-store scoping). Orchestrator stays on the no-row throw; `eventConnection: true`
+is wired; opt-in `EventReadDiagnostic` test (`HISTORIAN_GRPC_EVENT_DIAG=1`).
 2. **R4.3 active-SF magnitude** — needs an **SF-active server** (D2 storage-engine
 console handle).
 3. **SendEvent over gRPC** — **capture-gated**: no distinct RPC, framing uncaptured.
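
The credential-token chain recorded under item 1 above (shared secret, then `SHA256`, then `MD5`, then RC4 over the UTF-16LE password) can be sketched in a few lines. This is an illustrative sketch only, not the SDK's code: the function names `rc4` and `derive_credential_token` are hypothetical, the P-256 ECDH exchange itself is not shown (the shared secret is taken as an input), and the exact byte form of the shared secret and any framing or terminator around the password are assumptions the notes above do not specify.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key-scheduling pass, then XOR data with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key-scheduling algorithm (KSA)
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    out = bytearray()
    i = j = 0
    for byte in data:  # pseudo-random generation algorithm (PRGA)
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def derive_credential_token(shared_secret: bytes, password: str) -> bytes:
    """Hypothetical helper mirroring the chain described in item 1.

    Assumption: shared_secret is the raw ECDH output as bytes, and the
    password is encrypted as bare UTF-16LE with no prefix or terminator.
    """
    client_key = hashlib.sha256(shared_secret).digest()  # client key
    rc4_key = hashlib.md5(client_key).digest()           # MD5-keyed RC4
    return rc4(rc4_key, password.encode("utf-16-le"))
```

Because RC4 is symmetric, running `rc4` again with the same derived key recovers the UTF-16LE password bytes, which gives a cheap offline sanity check of a captured token against a known credential.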
@@ -75,10 +77,11 @@ reuses the proven 2020 WCF byte serializers/parsers unchanged inside protobuf
 8. **Deferred-by-design** items (`write-commands` D1–D3, non-analog tag create,
 etc.) — bounded out until an explicit customer/user demand signal.

-To move any remaining item you need a **scoped RE+impl effort** (the v8
-Event-type `OpenConnection` — item 1, already captured + diagnosed), a **fresh
-native capture** (SendEvent gRPC framing — item 3), a **different server**
-(SF-active for item 2), or a **demand signal** to unlock a deferred item.
+To move any remaining item you need a **server-side / connection-level angle**
+(item 1 — v8 event auth is solved; row retrieval is connection-gated, see the
+NEXT SESSION section of `grpc-event-query-capture.md`), a **fresh native capture**
+(SendEvent gRPC framing — item 3), a **different server** (SF-active for item 2),
+or a **demand signal** to unlock a deferred item.
 Live-server gRPC probe recipe: set
 `HISTORIAN_GRPC_HOST`/`_PORT 32565`/`_TLS true`/`_DNSID` + domain creds (strip
 quotes — `reference_wonder_sql_vd03_credentials`) and run the gated