histsdk/docs/reverse-engineering/wcf-event-read-spike-results.md
Joseph Doherty 54bcf8a5e0 docs(c2): record 2020-vs-2023R2 positive control for the event-read server gate
Adds the positive control that the prior C2 evidence lacked. The SAME native WCF
event-read client returns real events (5) from a local AVEVA Historian 2020 but 0
from the 2023 R2 server over the identical sequence and window, while both boxes
hold tens of thousands of events in SQL — isolating the zero-rows to the 2023 R2
server, not the client, protocol, or serializers.

- wcf-event-read-spike-results.md: new "2026-06-26 positive control" section
  (2020 vs 2023 R2 A/B from one WCF client; stock-2020-client version-self-block
  caveat; stock-2023R2 gRPC cross-check).
- grpc-event-query-capture.md: re-control note — the 2026-06-22 stock 50-row
  capture did NOT reproduce; the stock 2023 R2 client now also returns 0 rows.
- HistorianGrpcIntegrationTests: correct the stale "capture-gated, NOT
  server-gated" comment to the server-gate conclusion backed by the controls.

Sanitized throughout (counts, native return codes, buffer lengths only).

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
2026-06-26 11:08:20 -04:00


WCF event-read spike — live result (2026-06-25/26): transport+auth viable, row-retrieval server-gated

Settles the open question behind C2 ("event reads over gRPC are gated; the only listed unblock is route event reads via WCF"). The gRPC event-read path is a proven server-side dead-end (grpc-event-query-capture.md: auth fully solved, every client-controllable layer byte-matched to the stock client, yet the server scopes 0 rows to our connection). This spike resolved the WCF leg.

Correction to an earlier draft of this doc. A first pass concluded "the 2023 R2 historian does not serve the legacy WCF transport (connection reset at framing)." That was a test error, not a server fact. It connected to the historian's real WCF port 32568 directly and used the Windows-integrated transport. In this environment the historian is reached through a reverse SSH tunnel (local 42568 → historian 32568), and integrated/Kerberos auth does not work through that tunnel. The socket-RST was the tunnel/transport mismatch, not an absent listener. Corrected below.

What was run

A Windows-only-by-default, env-gated diagnostic (tests/AVEVA.Historian.Client.Tests/WcfEventReadSpikeTests.cs) drives HistorianWcfEventOrchestrator.ReadEventsAsync directly. The decisive run was cross-platform, direct (no tunnel): from the VPN-holding host straight to the historian's real WCF endpoint net.tcp://<historian>:32568/HistCert, using the certificate transport (RemoteTcpCertificate, TLS, AllowUntrustedServerCertificate) and NegotiateAuthentication (cross-platform, explicit domain credentials). The SDK's interface-version gate was bypassed (VerifyServerInterfaceVersion=false) — the 2023 R2 WCF History interface reports version 13 (this SDK's serializers target 11/12).

Result — transport+auth viable; row-retrieval server-gated (sanitized)

Progression of the live errors as the addressing/transport was corrected:

| attempt | error |
| --- | --- |
| direct :32568, integrated | SocketException "forcibly closed" (wrong port + transport for the tunnel) |
| tunnel :42568, integrated | ProtocolException at the security UpgradeResponse (integrated can't negotiate through the tunnel) |
| tunnel :42568, certificate | reached the WCF dispatcher → AddressFilter mismatch (tunnel rewrites the port) |
| direct :32568, certificate, cross-platform | past auth → ProtocolEvidenceMissingException: History interface version 13 |
| + VerifyServerInterfaceVersion=false | full chain runs; query returns a 10-byte 0-row header, then GetNext long-polls |
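
The long-poll in the last row is the reason a diagnostic like this has to be bounded. This document lists a per-call timeout plus an overall budget with a phase-diagnostic dump among the spike's properties. A minimal Python sketch of that pattern follows. It is not the spike's actual implementation: `run_bounded`, the two fake phases, and the timeout values are hypothetical and only illustrate the idea.

```python
import asyncio
import time


async def run_bounded(phases, per_call_timeout, overall_budget):
    """Run (name, coroutine-factory) phases in order, stopping at the first timeout.

    Returns a list of (phase_name, status, detail) tuples: the phase-diagnostic dump.
    """
    deadline = time.monotonic() + overall_budget
    dump = []
    for name, make_call in phases:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            dump.append((name, "skipped", "overall budget exhausted"))
            break
        try:
            # Each call gets the per-call timeout, capped by what is left of the budget.
            result = await asyncio.wait_for(
                make_call(), timeout=min(per_call_timeout, remaining)
            )
            dump.append((name, "ok", result))
        except asyncio.TimeoutError:
            dump.append((name, "timeout", None))
            break
    return dump


async def fake_start_query():
    # Hypothetical stand-in: reports only a buffer length, as in the sanitized output.
    return 10


async def fake_get_next():
    # Hypothetical stand-in for a long-poll that never delivers rows.
    await asyncio.sleep(3600)


def demo():
    phases = [
        ("StartEventQuery", fake_start_query),
        ("GetNextEventQueryResultBuffer", fake_get_next),
    ]
    return asyncio.run(run_bounded(phases, per_call_timeout=0.05, overall_budget=1.0))
```

Calling `demo()` yields one `ok` entry for the query start and one `timeout` entry for the long-polling call, so the run ends with a record of which phase stalled instead of hanging.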

Connection-mode experiment (certificate transport, direct, version-bypassed, a 1-day window that holds events), comparing the native OpenConnection mode used for the event-read chain:

| connMode | RegisterTags (RTag2) | EnsureTags (EnsT2) | result buffer | events |
| --- | --- | --- | --- | --- |
| 0x501 (event) | 0 (success) | 1 (benign-false, as in the 2020 flow) | 10 bytes (0-row header) | 0 |
| 0x401 (write) | 1 (fail) | 1 | 10 bytes | 0 |
| 0x402 (read-only, default) | 1 (fail) | 1 | 10 bytes | 0 |
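
The reasoning this table supports can be written down as data plus two small checks. The sketch below is illustrative Python only: `ModeResult` and both helper functions are hypothetical names, and the values are copied from the sanitized table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeResult:
    conn_mode: int
    label: str
    register_tags_rc: int  # RTag2: 0 means success
    ensure_tags_rc: int    # EnsT2: 1 was also seen in the 2020 flow
    buffer_len: int        # bytes in the result buffer
    events: int


RESULTS = [
    ModeResult(0x501, "event", 0, 1, 10, 0),
    ModeResult(0x401, "write", 1, 1, 10, 0),
    ModeResult(0x402, "read-only (default)", 1, 1, 10, 0),
]


def modes_where_registration_succeeds(results):
    """Connection modes whose RegisterTags call returned 0."""
    return [r.conn_mode for r in results if r.register_tags_rc == 0]


def registration_is_ruled_out(results):
    """True if some mode registers successfully and still returns no events.

    If failed registration were the cause of the empty result, the mode that
    registers cleanly would be expected to return rows. It does not.
    """
    return any(r.register_tags_rc == 0 and r.events == 0 for r in results)
```

With the table's values, only `0x501` passes registration and `registration_is_ruled_out(RESULTS)` is true, which is the basis for treating the empty result as a server-side gate and not a registration problem.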

Conclusion

  1. WCF transport + auth ARE viable on 2023 R2. The certificate (TLS) transport negotiates and the NegotiateAuthentication app-level handshake authenticates — cross-platform (proven from a non-Windows VPN host). The earlier "WCF not served" conclusion was wrong. (Integrated/Windows transport security is not usable through the reverse tunnel — net.tcp Kerberos does not tunnel.)
  2. The event-read chain needs the 0x501 event connection mode. With it, CM_EVENT RegisterTags succeeds (it fails on 0x402/0x401). EnsureTags returns false, but that is documented as benign in the 2020 flow that did return rows.
  3. Row retrieval is server-gated — same as gRPC. Even with auth solved and RegisterTags succeeding, over a window that holds events, StartEventQuery succeeds but GetNextEventQueryResultBuffer returns a 0-row header (10 bytes) and long-polls. Registration and window are ruled out as the cause; the server simply does not scope event rows to a managed connection. This is the identical server-side per-connection retrieval working-set gate proven for gRPC in grpc-event-query-capture.md.

Therefore event reads do not return rows on the 2023 R2 historian over either transport: gRPC is retrieval-gated by the server, and WCF connects and authenticates but hits the same server-side row gate. The only remaining theoretical unblock is server-side (AVEVA exposing event-row retrieval to a managed connection), which is not client-fixable. C2 stays closed as won't-fix, for this corrected reason.

SDK additions from this investigation (retained, build-clean, golden where applicable)

  • HistorianClientOptions.ConnectViaAddress — WCF Via (connect to a tunnel/proxy while addressing the SOAP To the real endpoint), so a port-forward whose local port differs from the server's real port satisfies the server-side WCF AddressFilter.
  • HistorianClientOptions.EventReadConnectionModeOverride — diagnostic override of the event-read OpenConnection mode (the 0x501 finding above).
  • The C2 spike is now transport-selectable (integrated|certificate), cross-platform for the cert transport, bounded (per-call timeout + overall budget with a phase-diagnostic dump), and version-gate bypassable. Output stays sanitized (counts, native return codes, buffer lengths, sha256).

2026-06-26 — positive control: same WCF client, 2020 historian vs 2023 R2

The earlier evidence triangulated the gate but lacked a clean positive control — proof that the native event-query path returns rows for some historian, so that the 0-row 2023 R2 result can be attributed to the server rather than to the client/protocol. This run supplies it, A/B against two historians from the same WCF event-read client (HistorianWcfEventOrchestrator, whose wire protocol is byte-replayed from stock 2020 captures), same 365-day window:

| target historian | transport | RegisterTags (RTag2) | result buffer | events |
| --- | --- | --- | --- | --- |
| local AVEVA Historian 2020 | WCF, integrated | 0 (success) | terminal after rows | 5 |
| 2023 R2 (the C2 server) | WCF, certificate | (gate, as documented) | 10-byte 0-row header → long-poll | 0 |

SQL ground truth (Runtime.dbo.Events) for the same two boxes: the 2020 historian holds ~51.6k events over the window, the 2023 R2 holds ~71.5k — both populated. So the identical native WCF event-read client returns real events from a 2020 historian and zero from the 2023 R2 server. That isolates the zero-rows to the 2023 R2 server: not the client, not the protocol, not our serializers.
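
The attribution logic of this A/B control can be stated as a small check. This is an illustrative Python sketch with hypothetical names (`Run`, `zero_rows_isolated_to_server`). The counts are the approximate figures quoted above, and the check deliberately does not model the transport difference between the two runs (integrated vs certificate).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    target: str
    client: str
    window_days: int
    sql_events: int       # approximate ground-truth count from SQL
    returned_events: int  # events the WCF client actually received


def zero_rows_isolated_to_server(control: Run, subject: Run) -> bool:
    """True when the same client and window return rows on the control only,
    while both servers demonstrably hold events."""
    same_conditions = (
        control.client == subject.client
        and control.window_days == subject.window_days
    )
    control_positive = control.sql_events > 0 and control.returned_events > 0
    subject_empty = subject.sql_events > 0 and subject.returned_events == 0
    return same_conditions and control_positive and subject_empty


CONTROL = Run("AVEVA Historian 2020", "HistorianWcfEventOrchestrator", 365, 51_600, 5)
SUBJECT = Run("AVEVA Historian 2023 R2", "HistorianWcfEventOrchestrator", 365, 71_500, 0)
```

`zero_rows_isolated_to_server(CONTROL, SUBJECT)` holds for these values. It would stop holding if the control had also returned 0 events, which was the gap in the earlier evidence.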

Notes / honesty caveats:

  • The genuinely-stock 2020 client (aahClientManaged.dll v2020.0406.2652.2, driven by reflection) could not be run end-to-end here: against the local 2020 historian (services patched to build 3383.3) it self-blocks at StartEventQuery with Invalid InterfaceVersion (242) — a client-side build/version gate, and the stock client has no version-bypass. Our client (which does bypass the version check and byte-replays the same native sequence) is the faithful proxy that reaches the rows.
  • Cross-check on the gRPC leg the same day: the stock 2023 R2 client (native Event connection, its own correct event query) returned 0 rows over 30d/90d/365d/3yr against the 2023 R2 server; the 2026-06-22 "50 rows" stock capture did not reproduce. Same server-gate, both transports, both clients.
  • Output sanitized throughout (counts, native return codes, buffer lengths only — no event identity, host, or credentials).