gRPC events: capture decrypted HTTP/1.1 frames native vs ours — topology found + tested null

Pursued hypothesis #3 (connection-frame capture). Built a TLS-terminating tee proxy
(artifacts/.../httpcap/, gitignored: self-signed server cert, forwards through the loopback
tunnel, logs decrypted HTTP/1.1 + gRPC-Web both directions) and ran a native capture-event
(returns 50 rows) and our SDK diagnostic (0 rows) through the SAME proxy/upstream for a clean A/B.

Findings:
- The stock client is gRPC-Web/HTTP-1.1 (alpn empty), and clientCert=none on every connection,
  confirming (with the decompile) that hypothesis #2 (TLS client cert) is moot: the native
  presents no client cert.
- Connection topology differs: the native opens 5 TLS connections, one per service, and runs the
  event query (StartEventQuery/GetNext/EndEventQuery) on a DEDICATED RetrievalService connection,
  separate from the HistoryService connection that opened and registered the session. Our SDK
  collapses every service onto one connection. (Matches the decompile: the stock client has a
  separate GrpcClientBase per service.)
- Framing differs benignly: native uses content-length + Expect: 100-continue; SDK uses
  transfer-encoding: chunked. The server accepts both (StartEventQuery returns a valid handle),
  so framing is not the gate. No hidden header on either side.

Tested the topology hypothesis with a new env-gated switch (HISTORIAN_GRPC_EVENT_SPLIT_CHANNEL=1):
run StartEventQuery/GetNext/EndEventQuery on a dedicated RetrievalService connection (no
re-handshake, reusing the session handle, mirroring native conn4), with registration staying on
the main connection. Result: still 0B00000000001E000000 (0 rows), QH=1063. Splitting the event
query onto its own connection does not make rows flow: the server correlates by session handle,
not connection, so topology is not the row-scoping gate.

Every angle is now exhausted (payload, transport, metadata/interceptor/cert, topology, data
store). The gate is a server-internal per-connection retrieval working-set in the native
HistorianClient C++ core, unreachable from a pure-managed client. Conclusion unchanged:
auth-solved / retrieval-server-gated; ReadEventsAsync over gRPC keeps the no-row throw, event
reads use WCF.

56 offline gRPC/event tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
@@ -370,29 +370,34 @@ live-server access reference.
 retained for future connection-level probing. This eliminates the leading hypothesis and tightens the
 conclusion: the server scopes 0 events to our connection at a layer **above** the gRPC transport.
 
-2. **TLS client identity / certificate.** The native used `SecurityMode=TransportCertificate`. Determine
-whether it presents a **client certificate** the server uses to scope events (our SDK presents none —
-`AllowUntrustedServerCertificate=true`, server cert only). TEST: capture the TLS handshake (e.g.
-`SSLKEYLOGFILE` + Wireshark, or a decrypting proxy) for a native `capture-event` run and check the
-Certificate message; if a client cert is presented, replicate it.
+2. ~~**TLS client identity / certificate.**~~ **DISPROVEN 2026-06-23 (decompile + capture).** The stock
+client's `GrpcClientBase.InitializeBase` creates a bare `HttpClientHandler` and sets only
+`ServerCertificateCustomValidationCallback` — it **never adds a client certificate**. The TLS-tee
+capture (below) confirms `clientCert=none` on every native connection. So the native presents no client
+cert; this is not the gate.
 
-3. **HTTP/2-level capture.** The byte[]/handle capture is RPC-payload only. Capture the actual HTTP/2
-frames (HEADERS/SETTINGS/stream IDs, connection reuse) for the native run vs ours — via a
-TLS-decrypting mitm on the loopback forward — to see any connection-level header/affinity our capture
-can't see.
-
-2. **TLS client identity / certificate.** The native used `SecurityMode=TransportCertificate`. Determine
-whether it presents a **client certificate** the server uses to scope events (our SDK presents none —
-`AllowUntrustedServerCertificate=true`, server cert only). TEST: capture the TLS handshake (e.g.
-`SSLKEYLOGFILE` + Wireshark, or a decrypting proxy) for a native `capture-event` run and check the
-Certificate message; if a client cert is presented, replicate it. **Lower-probability after #1: the
-plain-HTTP/2 path presents no client cert either, yet auth + registration still succeed and the gate
-persists — so the gate is not at the TLS-identity layer the cert would affect.**
-
-3. **HTTP/2-level capture.** The byte[]/handle capture is RPC-payload only. Capture the actual HTTP/2
-frames (HEADERS/SETTINGS/stream IDs, connection reuse) for the native run vs ours — via a
-TLS-decrypting mitm on the loopback forward — to see any connection-level header/affinity our capture
-can't see.
+3. ~~**HTTP/2-level / connection-frame capture.**~~ **DONE 2026-06-23 — topology difference found, tested,
+NULL.** Built a TLS-terminating tee proxy (`artifacts/.../httpcap/`, gitignored: self-signed server
+cert, forwards through the loopback tunnel, logs decrypted HTTP/1.1 + gRPC-Web both ways) and ran a
+**native `capture-event` (returns 50 rows) and our SDK diagnostic (0 rows) through the same
+proxy/upstream**. Note: the stock client is gRPC-Web/HTTP-1.1 (not HTTP/2 — `alpn` empty), so the
+capture is HTTP/1.1 framing. Findings:
+- **Connection topology differs.** The native opens **5 TLS connections, one per service** —
+`HistoryService` (ExchangeKey/OpenConnection/Register/EnsureTags), `StatusService` (×2), and
+**`RetrievalService` (the event query: GetRetrievalInterfaceVersion → StartEventQuery → GetNext →
+EndEventQuery) on its own dedicated connection**. Our SDK collapses **every service onto one
+connection**. (Matches the decompile: stock has a separate `GrpcClientBase` per service.)
+- **Framing differs** (benign): native uses `content-length` + `Expect: 100-continue`; SDK uses
+`transfer-encoding: chunked`. The server accepts both (our `StartEventQuery` returns a valid handle),
+so framing is not the gate. No extra/hidden header on either side; `clientCert=none` throughout.
+- **TESTED the topology hypothesis (`HISTORIAN_GRPC_EVENT_SPLIT_CHANNEL=1`):** ran
+`StartEventQuery`/`GetNext`/`EndEventQuery` on a **dedicated RetrievalService connection** (no
+re-handshake, reusing the session handle — exactly mirroring native conn4), registration staying on
+the main connection. **Result: still `0B00000000001E000000` (0 rows), `QH=1063`.** Splitting the
+event query onto its own connection — the one concrete structural difference the capture revealed —
+**does not make rows flow.** So the server correlates by session handle, not by connection, and the
+topology is **not** the row-scoping gate. The `CreateHttp2`/`SPLIT_CHANNEL` switches + the
+`httpcap` proxy are retained as diagnostics.
 
 4. ~~**Server-side ground truth.**~~ **ANSWERED 2026-06-23 (DISPROVES the data-scoping premise).** Via
 the SOCKS→SQL relay (read-only; `artifacts/.../sqlschema/`, gitignored), dumped the full event schema
@@ -449,17 +454,24 @@ So **every client-controllable layer is now confirmed identical by reading the s
 by wire match: request bytes, transport, channel options, gRPC metadata, interceptor. The remaining
 difference is below the managed surface (native core) / server-side.
 
-**Conclusion (after #1 disproven + #4 answered + stock client decompiled).** Four independent angles are
-now exhausted: client payload (byte-identical), transport (stock client is *also* gRPC-Web/HTTP-1.1 —
-HTTP/2 makes no difference, both 0 rows), client-side metadata/interceptor/channel (decompiled — identical,
-no hidden header), and data store (global, unscoped, 71,332 events the engine serves via INSQL but
-withholds from our gRPC connection). The gate is a **server-internal per-connection retrieval working-set**
-that a pure-managed client cannot reconstruct by matching wire bytes, transport, metadata, or data — and
-the establishing logic is in the native `HistorianClient` C++ core, not in any decompilable managed step.
-The remaining angle (#3 HTTP/2-frame capture) is low-probability given the stock client uses the same
-gRPC-Web/HTTP-1.1 channel. **gRPC event-row retrieval stands documented as auth-solved /
-retrieval-server-gated**; `ReadEventsAsync` over gRPC keeps the honest no-row throw, and event reads use
-the WCF transport.
+**Conclusion (after #1–#4 + stock client decompiled + TLS-tee capture).** Every angle is now exhausted:
+- **client payload** — byte-identical (IL capture + decompile);
+- **transport** — stock client is *also* gRPC-Web/HTTP-1.1; native HTTP/2 makes no difference, both 0 rows;
+- **client metadata/interceptor/channel** — decompiled: identical gzip-only header, no-op interceptor, no
+client cert; the TLS-tee capture confirms no hidden header and `clientCert=none`;
+- **connection topology** — the native splits services across 5 connections and queries on a dedicated
+RetrievalService connection; replicating that (`SPLIT_CHANNEL`) still returns 0 rows → the server
+correlates by session handle, not connection;
+- **data store** — global, unscoped; 71,332 events the engine serves via INSQL but withholds from our
+gRPC connection.
+
+The gate is a **server-internal per-connection retrieval working-set** that a pure-managed client cannot
+reconstruct by matching wire bytes, transport, metadata, topology, or data — and the establishing logic is
+in the native `HistorianClient` C++ core, not in any decompilable managed step or observable on the wire.
+**gRPC event-row retrieval stands documented as auth-solved / retrieval-server-gated**; `ReadEventsAsync`
+over gRPC keeps the honest no-row throw, and event reads use the WCF transport. Diagnostics retained for
+any future server-side investigation: the `httpcap` TLS-tee proxy, the `CreateHttp2` / `SPLIT_CHANNEL`
+switches, the `EventReadDiagnostic` test, and the `capture-event` harness (native, returns rows).
 
 **2 of 3 layers cleared** (key exchange + client key); the 3rd (token construction) is localized to a
 specific managed method, pending dnlib extraction. ExchangeKey + the v8 serializer are committed; the
@@ -281,9 +281,21 @@ internal sealed class HistorianGrpcEventOrchestrator
         HistorianEventFilter? filter,
         CancellationToken cancellationToken)
     {
-        var retrievalClient = new GrpcRetrieval.RetrievalService.RetrievalServiceClient(connection.Channel);
+        // HTTP/2-frame capture (grpc-event-query-capture.md #3) showed the stock client runs the event
+        // query on a DEDICATED RetrievalService TLS connection, separate from the HistoryService
+        // connection that opened+registered the session (correlated only by the session handle); our SDK
+        // collapses every service onto one connection. Opt in via HISTORIAN_GRPC_EVENT_SPLIT_CHANNEL=1 to
+        // run StartEventQuery/GetNext/EndEventQuery on their own connection (mirrors native conn4: no
+        // re-handshake, just the existing handle), to test whether topology is the row-scoping gate.
+        bool splitChannel = string.Equals(
+            Environment.GetEnvironmentVariable("HISTORIAN_GRPC_EVENT_SPLIT_CHANNEL"), "1", StringComparison.Ordinal);
+        HistorianGrpcConnection rconn = splitChannel ? HistorianGrpcChannelFactory.Create(_options) : connection;
+        try
+        {
+
+        var retrievalClient = new GrpcRetrieval.RetrievalService.RetrievalServiceClient(rconn.Channel);
         GrpcRetrieval.GetRetrievalInterfaceVersionResponse retrievalVersion = retrievalClient.GetRetrievalInterfaceVersion(
-            new GrpcRetrieval.GetRetrievalInterfaceVersionRequest(), connection.Metadata, Deadline(), cancellationToken);
+            new GrpcRetrieval.GetRetrievalInterfaceVersionRequest(), rconn.Metadata, Deadline(), cancellationToken);
         HistorianServerVersionGate.Validate(HistorianServiceInterface.Retrieval, retrievalVersion.UiVersion, _options);
 
         // Version 6 envelope: the stock 2023 R2 client sends v6 (the WCF path's v5 request is accepted
@@ -306,7 +318,7 @@ internal sealed class HistorianGrpcEventOrchestrator
                 UiQueryRequestType = HistorianEventQueryProtocol.QueryRequestTypeEvent,
                 BtRequest = ByteString.CopyFrom(requestBuffer)
             },
-            connection.Metadata,
+            rconn.Metadata,
             Deadline(),
             cancellationToken);
 
@@ -331,7 +343,7 @@ internal sealed class HistorianGrpcEventOrchestrator
                 {
                     nextResponse = retrievalClient.GetNextEventQueryResultBuffer(
                         new GrpcRetrieval.GetNextEventQueryResultBufferRequest { UiHandle = session.ClientHandle, UiQueryHandle = queryHandle },
-                        connection.Metadata,
+                        rconn.Metadata,
                         EventPollDeadline(),
                         cancellationToken);
                 }
@@ -384,7 +396,12 @@ internal sealed class HistorianGrpcEventOrchestrator
             }
             finally
             {
-                EndEventQuerySafely(retrievalClient, connection, session.ClientHandle, queryHandle);
+                EndEventQuerySafely(retrievalClient, rconn, session.ClientHandle, queryHandle);
+            }
+        }
+        finally
+        {
+            if (splitChannel) { rconn.Dispose(); }
             }
         }
 
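
The diff above wraps the existing method body in an outer `try`/`finally` so the dedicated connection is disposed only when the switch created it. One alternative way to express that ownership rule is a small lease type, sketched below. This is an illustration under stated assumptions, not the committed code: `ChannelLease<T>` and `SplitChannelSwitch` are hypothetical names, and the sketch assumes the connection type implements `IDisposable` (the diff only shows that it exposes `Dispose()`).

```csharp
using System;

// Hypothetical wrapper: disposes the connection only if this lease created it.
public sealed class ChannelLease<T> : IDisposable where T : class, IDisposable
{
    private readonly bool _owns;

    public T Connection { get; }

    private ChannelLease(T connection, bool owns)
    {
        Connection = connection;
        _owns = owns;
    }

    // split == true: create and own a dedicated connection.
    // split == false: borrow the shared connection and never dispose it.
    public static ChannelLease<T> Acquire(T shared, Func<T> createDedicated, bool split)
        => split
            ? new ChannelLease<T>(createDedicated(), owns: true)
            : new ChannelLease<T>(shared, owns: false);

    public void Dispose()
    {
        if (_owns) Connection.Dispose();
    }
}

// Hypothetical helper mirroring the env-gated check in the diff ("1" enables, anything else disables).
public static class SplitChannelSwitch
{
    public const string VariableName = "HISTORIAN_GRPC_EVENT_SPLIT_CHANNEL";

    // The lookup is injected so the check can be exercised without touching the process environment.
    public static bool IsEnabled(Func<string, string> getEnv)
        => string.Equals(getEnv(VariableName), "1", StringComparison.Ordinal);
}
```

With such a type, the orchestrator could hold the lease in a `using` declaration and read `lease.Connection` wherever the diff uses `rconn`, which would avoid the extra nesting level. Whether that is worth it for a diagnostic-only switch is a judgment call.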