Event-row parser: verify against the provided 2023 R2 client; fix latent multi-row bug

Used the provided stock client as an oracle to verify the event read path. The capture-event harness returns 50 real events, and the instrument-grpc-nonstream rewrite captured the exact GetNextEventQueryResultBuffer.result buffer (63,192 bytes, version 0x0B=11, rowCount 50 = 25 Alarm.Set + 25 Alarm.Clear).

Feeding that real buffer through HistorianEventRowProtocol.Parse exposed a latent parser bug. The real buffer layout is: version(2) + rowCount(4) + headerField(4, =0x1E) followed by MARKERLESS rows (rowFormat(2)=7 + filetime(8) + 8x u16 slots + compact-ascii type + propCount + props). The parser wrongly treated the one-time 0x1E field as a per-row marker and re-consumed [marker+format] for every row, so it decoded only the FIRST row of any multi-row buffer and stopped. This is not gRPC-specific: the captured WCF v9 buffer has the identical 0900 <rowCount> 1E000000 0700 header, so the shipped WCF event read had the same latent multi-row truncation.

Fix: read a 10-byte buffer header (skip the 0x1E field once) and parse markerless rows; accept container version 9 (WCF) and 11 (gRPC), mirroring the interface-version gate that accepts History 11 and 12.

Verified: the real 50-row buffer now decodes to exactly 50 events, ending cleanly at end-of-buffer (Parse_RealStockClientCapture_DecodesAllEvents, gated on HISTORIAN_EVENT_CAPTURE_NDJSON so it skips without the gitignored capture), plus a synthetic v11 golden test. 328 offline tests pass.

The parse path is now verified against the provided client's real event data on both transports; the only remaining gap for gRPC events is the server delivering rows to our connection (the documented retrieval-server-gate).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
@@ -473,6 +473,30 @@ over gRPC keeps the honest no-row throw, and event reads use the WCF transport.
any future server-side investigation: the `httpcap` TLS-tee proxy, the `CreateHttp2` / `SPLIT_CHANNEL`
switches, the `EventReadDiagnostic` test, and the `capture-event` harness (native, returns rows).

### Verify the parse path against the provided client's real data (2026-06-23): found + fixed a latent bug

Used the provided 2023 R2 client as an **oracle**: the `capture-event` harness returns 50 real events
(verified live and through the `httpcap` proxy), and the `instrument-grpc-nonstream` rewrite captured the
exact `GetNextEventQueryResultBuffer.result` buffer the stock client received: **63,192 bytes, version
`0x0B` (11), rowCount 50** (25 `Alarm.Set` + 25 `Alarm.Clear`). Fed that real buffer through our
`HistorianEventRowProtocol.Parse` to verify the read path decodes genuine gRPC event data, and it
**exposed a latent parser bug**:

- The real row buffer is `version(2) + rowCount(4) + headerField(4, =0x1E)` followed by **markerless rows**
(`rowFormat(2)=7 + filetime(8) + 8×u16 slots + compact-ascii type + propCount + props`). Our parser
wrongly treated the one-time `0x1E` field as a **per-row marker** and re-consumed `[marker+format]`
for every row, so it parsed only the **first** row of any multi-row buffer and stopped. This is **not
gRPC-specific**: the captured **WCF v9** buffer has the identical `0900 <rowCount> 1E000000 0700 …`
header, so the shipped WCF event read had the same latent multi-row truncation.
- **Fix:** read a 10-byte buffer header (skip the `0x1E` field once) and parse markerless rows; accept
container version **9 (WCF) and 11 (gRPC)**. Verified: the real 50-row buffer now decodes to exactly 50
events, ending cleanly at end-of-buffer (`Parse_RealStockClientCapture_DecodesAllEvents`, gated on
`HISTORIAN_EVENT_CAPTURE_NDJSON`), plus a synthetic v11 golden test. 328 offline tests pass.

So the **parse path is now verified against the provided client's real event data**. The one remaining
gap is strictly the server delivering rows to our gRPC connection (the working-set gate above). If that
were ever opened, the decoded events would now flow through correctly on both transports.
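
The buffer layout described in the bullets above can be illustrated with a small, language-neutral
sketch in Python. It is not the `HistorianEventRowProtocol.Parse` implementation. The names
`parse_buffer_header` and `parse_row_prefix` are hypothetical, little-endian byte order is inferred from
the `0900 … 1E000000 0700` hex prefix, and the filetime is assumed to be an unsigned little-endian
64-bit value. The variable-length tail of each row (compact-ascii type, propCount, props) is left out on
purpose, because its encoding is not spelled out here.

```python
import struct

# Container versions named in the notes above: 9 (WCF) and 11 (gRPC).
ACCEPTED_VERSIONS = {9: "WCF", 11: "gRPC"}

# version(2) + rowCount(4) + headerField(4) = 10 bytes, read ONCE per buffer.
BUFFER_HEADER = struct.Struct("<HII")

# Fixed part of a markerless row: rowFormat(2) + filetime(8) + 8 x u16 slots = 26 bytes.
# Assumption: filetime is an unsigned little-endian 64-bit integer.
ROW_PREFIX = struct.Struct("<HQ8H")


def parse_buffer_header(buf):
    """Return (version, row_count, header_field, offset_of_first_row)."""
    if len(buf) < BUFFER_HEADER.size:
        raise ValueError("buffer is shorter than the 10-byte header")
    version, row_count, header_field = BUFFER_HEADER.unpack_from(buf, 0)
    if version not in ACCEPTED_VERSIONS:
        raise ValueError(f"unsupported container version {version}")
    # header_field was 0x1E in the captures; it is returned, not enforced.
    return version, row_count, header_field, BUFFER_HEADER.size


def parse_row_prefix(buf, offset):
    """Parse the fixed 26-byte row prefix; no per-row marker is consumed."""
    if len(buf) - offset < ROW_PREFIX.size:
        raise ValueError("truncated row prefix")
    row_format, filetime, *slots = ROW_PREFIX.unpack_from(buf, offset)
    if row_format != 7:
        raise ValueError(f"unexpected rowFormat {row_format}")
    return filetime, slots, offset + ROW_PREFIX.size


# Synthetic bytes for illustration only (not the captured buffer).
header = bytes.fromhex("0b00" "32000000" "1e000000")
row = ROW_PREFIX.pack(7, 0x01D9A5B2C3D4E5F6, *range(8))
version, row_count, header_field, offset = parse_buffer_header(header + row)
filetime, slots, offset = parse_row_prefix(header + row, offset)
print(version, row_count, hex(header_field), slots, offset)
```

The point of the sketch is the split between the two functions: the `0x1E` field belongs to the
one-time buffer header, so a row loop would call only `parse_row_prefix` (plus whatever decodes the
variable tail) for each of the `rowCount` rows.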

**2 of 3 layers cleared** (key exchange + client key); the 3rd (token construction) is localized to a
specific managed method, pending dnlib extraction. ExchangeKey + the v8 serializer are committed; the
orchestrator stays on v6 (set `eventConnection: true` to re-arm once the token construction lands). The