docs(c2): correct the WCF finding — transport+auth viable, row-retrieval server-gated

Overturns the earlier wrong "WCF not served on 2023 R2" conclusion (that was a
test error: wrong port/transport for the reverse tunnel). Corrected: the cert
(TLS) transport + NegotiateAuthentication auth reach the 2023 R2 historian
cross-platform; the 0x501 event connection mode makes CM_EVENT RegisterTags
succeed; yet StartEventQuery returns a 0-row buffer + long-polls over a window
that has events. Registration and window ruled out -> the same server-side
per-connection row gate as gRPC. Event reads stay server-gated over BOTH
transports; not client-fixable. Evidence doc rewritten; gRPC + WCF orchestrator
gating messages corrected.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Joseph Doherty
2026-06-26 04:41:21 -04:00
parent de8d5e91ce
commit f2297315b9
3 changed files with 80 additions and 63 deletions
@@ -1,63 +1,75 @@
# WCF event-read spike — live result (2026-06-25): WCF transport not served on 2023 R2 # WCF event-read spike — live result (2026-06-25/26): transport+auth viable, row-retrieval server-gated
Settles the open question behind **C2** ("event reads over gRPC are gated; the only listed unblock is Settles the open question behind **C2** ("event reads over gRPC are gated; the only listed unblock is
*route event reads via WCF*"). The gRPC event-read path is a proven server-side dead-end *route event reads via WCF*"). The gRPC event-read path is a proven server-side dead-end
(`grpc-event-query-capture.md`: auth fully solved, every client-controllable layer byte-matched to the (`grpc-event-query-capture.md`: auth fully solved, every client-controllable layer byte-matched to the
stock client, yet the server scopes 0 rows to our connection). This spike tested the **WCF** leg. stock client, yet the server scopes 0 rows to our connection). This spike resolved the **WCF** leg.
> **Correction to an earlier draft of this doc.** A first pass concluded "the 2023 R2 historian does not
> serve the legacy WCF transport (connection reset at framing)." **That was a test error, not a server
> fact.** It connected to the historian's real WCF port `32568` *directly* and used the Windows-integrated
> transport. In this environment the historian is reached through a **reverse SSH tunnel** (local
> `42568` → historian `32568`), and integrated/Kerberos auth does not work through that tunnel. The
> socket-RST was the tunnel/transport mismatch, not an absent listener. Corrected below.
## What was run ## What was run
A Windows-only, env-gated diagnostic (`tests/AVEVA.Historian.Client.Tests/WcfEventReadSpikeTests.cs`, A Windows-only-by-default, env-gated diagnostic (`tests/AVEVA.Historian.Client.Tests/WcfEventReadSpikeTests.cs`)
gated by `HISTORIAN_WCF_EVENT_HOST`) drove `HistorianWcfEventOrchestrator.ReadEventsAsync` directly drives `HistorianWcfEventOrchestrator.ReadEventsAsync` directly. The decisive run was **cross-platform,
over `RemoteTcpIntegrated` (WCF `net.tcp`, port 32568) against the **live 2023 R2 historian**, with a direct** (no tunnel): from the VPN-holding host straight to the historian's real WCF endpoint
90d window (the engine holds tens of thousands of events in that range), run from the native Windows `net.tcp://<historian>:32568/HistCert`, using the **certificate transport** (`RemoteTcpCertificate`,
capture rig over VPN. Auth supplied as explicit domain credentials (consumed by the app-level TLS, `AllowUntrustedServerCertificate`) and `NegotiateAuthentication` (cross-platform, explicit domain
`ValidateClientCredential` SSPI rounds). credentials). The SDK's interface-version gate was bypassed (`VerifyServerInterfaceVersion=false`) —
the 2023 R2 WCF **History interface reports version 13** (this SDK's serializers target 11/12).
## Result — RED (transport not served), sanitized ## Result — transport+auth viable; row-retrieval server-gated (sanitized)
Event spike: Progression of the live errors as the addressing/transport was corrected:
| field | value | | attempt | error |
|---|---| |---|---|
| outcome | `THREW System.ServiceModel.CommunicationException` ("The socket connection was aborted") | | direct `:32568`, integrated | `SocketException` "forcibly closed" (wrong port + transport for the tunnel) |
| inner | `System.Net.Sockets.SocketException` — "An existing connection was forcibly closed by the remote host" | | tunnel `:42568`, integrated | `ProtocolException` at the security UpgradeResponse (integrated can't negotiate through the tunnel) |
| events observed | 0 | | tunnel `:42568`, certificate | reached the WCF dispatcher → `AddressFilter` mismatch (tunnel rewrites the port) |
| LastUpdC3ReturnCode / LastRTag2ReturnCode / LastAddReturnCode(EnsT2) | 0 / 0 / 0 | | **direct `:32568`, certificate, cross-platform** | **past auth** → `ProtocolEvidenceMissingException`: History interface version **13** |
| LastEnsT2PayloadSha256 | empty | | + `VerifyServerInterfaceVersion=false` | **full chain runs**; query returns a 10-byte **0-row** header, then `GetNext` long-polls |
| LastResultBufferLength | 0 |
All native return codes are `0` and the EnsT2 payload sha256 is empty: the chain failed at the **first Connection-mode experiment (certificate transport, direct, version-bypassed, a 1-day window that holds
WCF call** (`GetInterfaceVersion`), *before* any auth token round or CM_EVENT registration ran. events), comparing the native OpenConnection mode used for the event-read chain:
Corroboration — a basic (non-event) `RemoteTcpIntegrated` `ProbeAsync` + `ReadRawAsync` (the committed | connMode | RegisterTags (RTag2) | EnsureTags (EnsT2) | result buffer | events |
`RemoteTcpIntegrationTests`) throws the **identical** exception, with the stack landing in |---|---|---|---|---|
`System.ServiceModel.Channels.SocketConnection.WriteAsync` — i.e. the failure is **transport-wide**, not | `0x501` (event) | **0 — success** | 1 (benign-false, as in the 2020 flow) | 10 bytes (0-row header) | **0** |
event-specific, and not auth-specific (it never reaches auth). | `0x401` (write) | 1 (fail) | 1 | 10 bytes | 0 |
| `0x402` (read-only, default) | 1 (fail) | 1 | 10 bytes | 0 |
Phase 0 (reachability) had confirmed TCP 32568 is **open** (the connect succeeds). So the port accepts a
socket, but the moment the SDK writes its `net.tcp` binary-SOAP framing the server **resets the
connection** (RST at the socket-write layer).
## Conclusion ## Conclusion
The **2023 R2 historian does not serve the legacy WCF NetTcp transport.** A raw RST at the first socket 1. **WCF transport + auth ARE viable on 2023 R2.** The certificate (TLS) transport negotiates and the
write — before any security negotiation, SOAP fault, or auth exchange — is the signature of a listener `NegotiateAuthentication` app-level handshake authenticates — **cross-platform** (proven from a
that does not speak `net.tcp` binary SOAP, not of an auth/SPN problem or event-row scoping. (The earlier non-Windows VPN host). The earlier "WCF not served" conclusion was wrong. (Integrated/Windows
WCF event-chain native return codes 76/85 documented in `HistorianWcfEventOrchestrator` were only ever transport security is not usable through the reverse tunnel — `net.tcp` Kerberos does not tunnel.)
observed against a **2020** historian; against 2023 R2 there is no WCF endpoint to reach at all.) 2. **The event-read chain needs the `0x501` event connection mode.** With it, CM_EVENT `RegisterTags`
**succeeds** (it fails on `0x402`/`0x401`). `EnsureTags` returns false, but that is documented as
benign in the 2020 flow that *did* return rows.
3. **Row retrieval is server-gated — same as gRPC.** Even with auth solved and `RegisterTags` succeeding,
over a window that holds events, `StartEventQuery` succeeds but `GetNextEventQueryResultBuffer` returns
a **0-row** header (10 bytes) and long-polls. Registration and window are ruled out as the cause; the
server simply does not scope event rows to a managed connection. This is the **identical** server-side
per-connection retrieval working-set gate proven for gRPC in `grpc-event-query-capture.md`.
Therefore **C2's "route event reads via WCF" unblock is moot on 2023 R2** — there is no WCF endpoint to **Therefore event reads do not return rows on the 2023 R2 historian over either transport** — gRPC
route to. Event reads are unavailable on the 2023 R2 historian over **both** transports: (retrieval-server-gated) and WCF (transport+auth work, but the same server-side row gate). The only
remaining theoretical unblock is server-side (AVEVA exposing event-row retrieval to a managed
connection) — not client-fixable. **C2 stays closed won't-fix**, for this (corrected) reason.
- **gRPC** — auth-solved but retrieval-server-gated (server scopes 0 rows to our connection; ## SDK additions from this investigation (retained, build-clean, golden where applicable)
`grpc-event-query-capture.md`).
- **WCF (`net.tcp`)** — transport not served on 2023 R2 (connection reset at framing).
The WCF event-read managed path would only ever apply to a legacy **2020** historian, which the gateway - `HistorianClientOptions.ConnectViaAddress` — WCF `Via` (connect to a tunnel/proxy while addressing the
does not target (the gateway runs `RemoteGrpc` against 2023 R2). The only remaining theoretical unblock SOAP `To` the real endpoint), so a port-forward whose local port differs from the server's real port
is server-side (AVEVA exposing event-row retrieval to a managed gRPC connection) — not client-fixable. satisfies the server-side WCF AddressFilter.
- `HistorianClientOptions.EventReadConnectionModeOverride` — diagnostic override of the event-read
**C2 is closed won't-fix** for the gateway's target (2023 R2). `ReadEventsAsync` over gRPC keeps its OpenConnection mode (the `0x501` finding above).
honest no-row throw; the gating messages are corrected so they no longer point operators at the WCF - The C2 spike is now transport-selectable (integrated|certificate), cross-platform for the cert
transport as a live fallback on 2023 R2. transport, bounded (per-call timeout + overall budget with a phase-diagnostic dump), and version-gate
bypassable. Output stays sanitized (counts, native return codes, buffer lengths, sha256).
@@ -104,11 +104,12 @@ internal sealed class HistorianGrpcEventOrchestrator
{ {
throw new ProtocolEvidenceMissingException( throw new ProtocolEvidenceMissingException(
$"ReadEvents over gRPC did not return rows within {OverallBudget.TotalSeconds:0}s: StartEventQuery " + $"ReadEvents over gRPC did not return rows within {OverallBudget.TotalSeconds:0}s: StartEventQuery " +
"succeeds but GetNextEventQueryResultBuffer long-polls to the no-data terminal. Event-row retrieval " + "succeeds but GetNextEventQueryResultBuffer long-polls to the no-data terminal. Event-row retrieval is " +
"over gRPC is auth-solved but server-gated — the 2023 R2 server scopes 0 rows to a managed connection " + "auth-solved but SERVER-GATED on 2023 R2 over both transports — the server scopes 0 rows to a managed " +
"(see docs/reverse-engineering/grpc-event-query-capture.md). The legacy WCF transport is NOT a fallback " + "connection (gRPC: docs/reverse-engineering/grpc-event-query-capture.md). The WCF transport reaches the " +
"on 2023 R2 (live-disproven 2026-06-25: net.tcp is reset at the framing layer — see " + "2023 R2 historian (certificate transport + auth work, CM_EVENT registration succeeds on the 0x501 event " +
"docs/reverse-engineering/wcf-event-read-spike-results.md), so there is no event-read path on a 2023 R2 historian."); "connection) but hits the SAME server-side row gate — 0-row buffer + long-poll (see " +
"docs/reverse-engineering/wcf-event-read-spike-results.md). Not client-fixable on either transport.");
} }
foreach (HistorianEvent evt in events) foreach (HistorianEvent evt in events)
@@ -175,18 +176,20 @@ internal sealed class HistorianGrpcEventOrchestrator
// returning the WCF code-85 terminal), we cannot distinguish "genuinely no events in range" // returning the WCF code-85 terminal), we cannot distinguish "genuinely no events in range"
// from "the CM_EVENT registration replay didn't fully land over gRPC" — so we refuse to return // from "the CM_EVENT registration replay didn't fully land over gRPC" — so we refuse to return
// a possibly-false empty list and surface the gated state instead. Proven server-gated: the live // a possibly-false empty list and surface the gated state instead. Proven server-gated: the live
// 2023 R2 server holds tens of thousands of events yet scopes 0 to a managed gRPC connection // 2023 R2 server holds tens of thousands of events yet scopes 0 to a managed connection
// (grpc-event-query-capture.md); WCF is not a 2023 R2 fallback (wcf-event-read-spike-results.md). // (grpc-event-query-capture.md). WCF reaches the same historian (cert transport + auth work,
// CM_EVENT registers on the 0x501 event connection) but hits the SAME row gate — not a fallback
// (wcf-event-read-spike-results.md).
if (events.Count == 0) if (events.Count == 0)
{ {
throw new ProtocolEvidenceMissingException( throw new ProtocolEvidenceMissingException(
"ReadEvents over gRPC: the chain completes and StartEventQuery succeeds, but " + "ReadEvents over gRPC: the chain completes and StartEventQuery succeeds, but " +
"GetNextEventQueryResultBuffer returns no rows (it long-polls to the no-data terminal " + "GetNextEventQueryResultBuffer returns no rows (it long-polls to the no-data terminal " +
$"after the CM_EVENT registration replay; last={LastErrorBufferDescription}). Event-row retrieval " + $"after the CM_EVENT registration replay; last={LastErrorBufferDescription}). Event-row retrieval is " +
"over gRPC is auth-solved but server-gated — the 2023 R2 server scopes 0 rows to a managed connection " + "auth-solved but SERVER-GATED on 2023 R2 over both transports — the server scopes 0 rows to a managed " +
"(see docs/reverse-engineering/grpc-event-query-capture.md). The legacy WCF transport is NOT a fallback " + "connection (gRPC: docs/reverse-engineering/grpc-event-query-capture.md; WCF reaches the historian and " +
"on 2023 R2 (live-disproven 2026-06-25: net.tcp is reset at the framing layer — see " + "registers on the 0x501 event connection yet hits the same row gate: " +
"docs/reverse-engineering/wcf-event-read-spike-results.md)."); "docs/reverse-engineering/wcf-event-read-spike-results.md). Not client-fixable on either transport.");
} }
return events; return events;
@@ -8,13 +8,15 @@ using AVEVA.Historian.Client.Wcf.Contracts;
namespace AVEVA.Historian.Client.Wcf; namespace AVEVA.Historian.Client.Wcf;
/// <remarks> /// <remarks>
/// Mirrors HistorianWcfReadOrchestrator but targets IRetrievalServiceContract4 for the event flow. /// Mirrors HistorianWcfReadOrchestrator but targets IRetrievalServiceContract4 for the event flow. The
/// Applies to <b>legacy 2020-era WCF (net.tcp) historians only</b>. The event row-buffer layout is now /// event row-buffer layout is decoded (<see cref="HistorianEventRowProtocol"/>; verified against real
/// decoded (<see cref="HistorianEventRowProtocol"/>; verified against real captured rows). Note: a /// captured rows). A <b>2023 R2</b> historian <i>does</i> serve this transport via the <b>certificate</b>
/// <b>2023 R2</b> historian does NOT serve this WCF transport at all — net.tcp is reset at the framing /// (TLS) endpoint (the cert transport + <c>NegotiateAuthentication</c> auth work cross-platform; the
/// layer before any auth (live-disproven 2026-06-25; see /// integrated/Windows transport does not tunnel). With the <c>0x501</c> event connection mode CM_EVENT
/// <c>docs/reverse-engineering/wcf-event-read-spike-results.md</c>), so this orchestrator is not a /// registration succeeds — but <c>StartEventQuery</c> still returns a 0-row buffer and long-polls: event
/// fallback for 2023 R2 deployments. The native return codes 76/85 noted below were 2020-historian /// rows are <b>server-gated</b> per connection on 2023 R2, the same wall as the gRPC path, and not
/// client-fixable (see <c>docs/reverse-engineering/wcf-event-read-spike-results.md</c> and
/// <c>grpc-event-query-capture.md</c>). The native return codes 76/85 noted below were 2020-historian
/// observations. /// observations.
/// </remarks> /// </remarks>
internal sealed class HistorianWcfEventOrchestrator internal sealed class HistorianWcfEventOrchestrator