# WCF event-read spike — live result (2026-06-25/26): transport+auth viable, row-retrieval server-gated

Settles the open question behind **C2** ("event reads over gRPC are gated; the only listed unblock is
*route event reads via WCF*"). The gRPC event-read path is a proven server-side dead end
(`grpc-event-query-capture.md`: auth fully solved, every client-controllable layer byte-matched to the
stock client, yet the server scopes 0 rows to our connection). This spike resolved the **WCF** leg.

> **Correction to an earlier draft of this doc.** A first pass concluded "the 2023 R2 historian does not
> serve the legacy WCF transport (connection reset at framing)." **That was a test error, not a server
> fact.** That pass connected to the historian's real WCF port `32568` *directly* and used the
> Windows-integrated transport. In this environment the historian is reached through a **reverse SSH
> tunnel** (local `42568` → historian `32568`), and integrated/Kerberos auth does not work through that
> tunnel. The socket RST came from the tunnel/transport mismatch, not from an absent listener. Corrected below.

## What was run

A Windows-only-by-default, env-gated diagnostic (`tests/AVEVA.Historian.Client.Tests/WcfEventReadSpikeTests.cs`)
drives `HistorianWcfEventOrchestrator.ReadEventsAsync` directly. The decisive run was **cross-platform,
direct** (no tunnel): from the VPN-holding host straight to the historian's real WCF endpoint
`net.tcp://<historian>:32568/HistCert`, using the **certificate transport** (`RemoteTcpCertificate`,
TLS, `AllowUntrustedServerCertificate`) and `NegotiateAuthentication` (cross-platform, explicit domain
credentials). The SDK's interface-version gate was bypassed (`VerifyServerInterfaceVersion=false`)
because the 2023 R2 WCF **History interface reports version 13**, while this SDK's serializers target 11/12.

## Result — transport+auth viable; row-retrieval server-gated (sanitized)

Progression of the live errors as the addressing/transport was corrected:

| attempt | error |
|---|---|
| direct `:32568`, integrated | `SocketException` "forcibly closed" (wrong port + transport for the tunnel) |
| tunnel `:42568`, integrated | `ProtocolException` at the security UpgradeResponse (integrated can't negotiate through the tunnel) |
| tunnel `:42568`, certificate | reached the WCF dispatcher → `AddressFilter` mismatch (tunnel rewrites the port) |
| **direct `:32568`, certificate, cross-platform** | **past auth** → `ProtocolEvidenceMissingException`: History interface version **13** |
| + `VerifyServerInterfaceVersion=false` | **full chain runs**; query returns a 10-byte **0-row** header, then `GetNext` long-polls |

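
The `AddressFilter` mismatch in the third row is what you would expect when the tunnel's local port (`42568`) ends up in the message's `To` address while the server listens on `32568`. The sketch below is a toy Python model of that check, not WCF itself. It assumes an exact match on scheme, port, and path, ignores the host, and uses `historian.example` as a placeholder. The function name is hypothetical.

```python
from urllib.parse import urlsplit


def address_filter_accepts(listen_uri: str, to_header: str) -> bool:
    """Toy stand-in for an exact-match endpoint address filter.

    Assumption: a message is accepted only when the scheme, port, and path of
    its `To` address equal those of the listening endpoint. Host comparison is
    deliberately left out of this model.
    """
    listen, to = urlsplit(listen_uri), urlsplit(to_header)
    return (listen.scheme, listen.port, listen.path) == (to.scheme, to.port, to.path)


SERVER = "net.tcp://historian.example:32568/HistCert"  # placeholder host, real port
TUNNEL = "net.tcp://localhost:42568/HistCert"          # local end of the port-forward

# No Via: the message is addressed to the same socket the client dials.
plain = address_filter_accepts(SERVER, to_header=TUNNEL)

# With a Via: dial the tunnel, but keep `To` pointing at the real endpoint.
with_via = address_filter_accepts(SERVER, to_header=SERVER)

print(plain, with_via)
```

In this model only the second call passes, which is the behaviour the `ConnectViaAddress` option (listed under SDK additions) relies on.
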

Connection-mode experiment (certificate transport, direct, version-bypassed, a 1-day window that holds
events), comparing the native OpenConnection mode used for the event-read chain:

| connMode | RegisterTags (RTag2) | EnsureTags (EnsT2) | result buffer | events |
|---|---|---|---|---|
| `0x501` (event) | **0 (success)** | 1 (benign-false, as in the 2020 flow) | 10 bytes (0-row header) | **0** |
| `0x401` (write) | 1 (fail) | 1 | 10 bytes | 0 |
| `0x402` (read-only, default) | 1 (fail) | 1 | 10 bytes | 0 |

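
Read as data, the connection-mode table answers two separate questions: which mode lets `RegisterTags` succeed, and whether failed registration can account for the empty results. A small Python transcription of the rows makes both explicit. The class and function names are illustrative and not part of the SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeResult:
    conn_mode: int      # native OpenConnection mode
    register_tags: int  # native return code; 0 is success in the table
    ensure_tags: int    # 1 in every row, including the one that registers
    buffer_len: int     # bytes in the first result buffer
    events: int


# Rows transcribed from the connection-mode table.
RESULTS = [
    ModeResult(0x501, register_tags=0, ensure_tags=1, buffer_len=10, events=0),
    ModeResult(0x401, register_tags=1, ensure_tags=1, buffer_len=10, events=0),
    ModeResult(0x402, register_tags=1, ensure_tags=1, buffer_len=10, events=0),
]


def modes_that_register(results):
    """Connection modes whose RegisterTags call returned 0 (success)."""
    return [hex(r.conn_mode) for r in results if r.register_tags == 0]


def registration_explains_zero_rows(results):
    """True only if every 0-event run also failed to register its tags."""
    return all(r.register_tags != 0 for r in results if r.events == 0)


print(modes_that_register(RESULTS))              # ['0x501']
print(registration_explains_zero_rows(RESULTS))  # False
```

The second result is the useful one: the `0x501` run registers its tags and still gets the 10-byte 0-row header, so registration alone cannot be the cause.
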
## Conclusion

1. **WCF transport + auth ARE viable on 2023 R2.** The certificate (TLS) transport negotiates and the
   `NegotiateAuthentication` app-level handshake authenticates, and it does so **cross-platform** (proven
   from a non-Windows VPN host). The earlier "WCF not served" conclusion was wrong. (Integrated/Windows
   transport security is not usable through the reverse tunnel: `net.tcp` Kerberos does not tunnel.)
2. **The event-read chain needs the `0x501` event connection mode.** With it, CM_EVENT `RegisterTags`
   **succeeds** (it fails on `0x402`/`0x401`). `EnsureTags` returns false, but that is documented as
   benign in the 2020 flow that *did* return rows.
3. **Row retrieval is server-gated, same as gRPC.** Even with auth solved and `RegisterTags` succeeding,
   over a window that holds events, `StartEventQuery` succeeds but `GetNextEventQueryResultBuffer` returns
   a **0-row** header (10 bytes) and long-polls. Registration and window are ruled out as the cause; the
   server does not scope event rows to a managed connection. This is the **identical** server-side
   per-connection retrieval working-set gate proven for gRPC in `grpc-event-query-capture.md`.

**Therefore event reads do not return rows on the 2023 R2 historian over either transport**: gRPC
(retrieval server-gated) and WCF (transport+auth work, but the same server-side row gate applies). The
only remaining theoretical unblock is server-side (AVEVA exposing event-row retrieval to a managed
connection), which is not client-fixable. **C2 stays closed as won't-fix**, for this (corrected) reason.

## SDK additions from this investigation (retained, build-clean, golden where applicable)

- `HistorianClientOptions.ConnectViaAddress`: WCF `Via` (connect to a tunnel/proxy while addressing the
  SOAP `To` to the real endpoint), so a port-forward whose local port differs from the server's real port
  still satisfies the server-side WCF `AddressFilter`.
- `HistorianClientOptions.EventReadConnectionModeOverride`: diagnostic override of the event-read
  OpenConnection mode (the `0x501` finding above).
- The C2 spike is now transport-selectable (integrated|certificate), cross-platform for the cert
  transport, bounded (per-call timeout + overall budget with a phase-diagnostic dump), and version-gate
  bypassable. Output stays sanitized (counts, native return codes, buffer lengths, sha256).

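
The "bounded" property in the last bullet matters because `GetNext` long-polls instead of failing. The spike's own implementation is not reproduced here. The following is a minimal Python `asyncio` sketch of the general pattern (a per-call timeout capped by an overall budget, with a per-phase report), and `run_bounded`, `quick`, and `long_poll` are hypothetical stand-ins.

```python
import asyncio
import time


async def run_bounded(phases, per_call_timeout, overall_budget):
    """Run named phases in order; stop at the first timeout or when the budget is spent.

    `phases` is a list of (name, coroutine_function) pairs. The returned list of
    (name, outcome, elapsed_seconds) tuples doubles as the diagnostic dump.
    """
    deadline = time.monotonic() + overall_budget
    report = []
    for name, call in phases:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            report.append((name, "skipped: budget exhausted", 0.0))
            break
        started = time.monotonic()
        try:
            # Each call gets the smaller of its own timeout and what is left overall.
            await asyncio.wait_for(call(), timeout=min(per_call_timeout, remaining))
            outcome = "ok"
        except asyncio.TimeoutError:
            outcome = "timeout"
        report.append((name, outcome, round(time.monotonic() - started, 3)))
        if outcome != "ok":
            break
    return report


# Stand-ins for the real calls: two that return at once and one that long-polls.
async def quick():
    await asyncio.sleep(0)


async def long_poll():
    await asyncio.sleep(60)


report = asyncio.run(run_bounded(
    [("RegisterTags", quick), ("StartEventQuery", quick), ("GetNext", long_poll)],
    per_call_timeout=0.2,
    overall_budget=1.0,
))
outcomes = [(name, outcome) for name, outcome, _ in report]
print(outcomes)
```

With these stand-ins the run ends after about 0.2 s with the first two phases `ok` and `GetNext` marked `timeout`, so a long-polling server cannot hang the test.
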
## 2026-06-26 — positive control: same WCF client, 2020 historian vs 2023 R2

The earlier evidence triangulated the gate but lacked a clean *positive* control: proof that the
native event-query path returns rows for **some** historian, so that the 0-row 2023 R2 result can be
attributed to the server rather than to the client/protocol. This run supplies it as an A/B against two
historians from the **same** WCF event-read client (`HistorianWcfEventOrchestrator`, whose wire
protocol is byte-replayed from stock 2020 captures), over the same 365-day window:

| target historian | transport | RegisterTags (RTag2) | result buffer | events |
|---|---|---|---|---|
| **local AVEVA Historian 2020** | WCF, integrated | **0 (success)** | terminal after rows | **5** |
| **2023 R2** (the C2 server) | WCF, certificate | (gate, as documented) | 10-byte 0-row header → long-poll | **0** |

SQL ground truth (`Runtime.dbo.Events`) for the same two boxes: the 2020 historian holds ~51.6k events
over the window and the 2023 R2 holds ~71.5k, so both are populated. The **identical native WCF event-read
client returns real events from a 2020 historian and zero from the 2023 R2 server**. That isolates the
zero-rows to the 2023 R2 server: not the client, not the protocol, not our serializers.

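
The attribution argument is a small piece of logic: one client, one window, two populated servers, and only one of them returns rows. A Python sketch of that reasoning follows. The names are illustrative, and the SQL counts are the approximate figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    server: str
    rows_returned: int  # events the client received
    rows_in_sql: int    # approximate Runtime.dbo.Events count for the same window


def attribute_zero_rows(runs):
    """Classify runs made with one client over one window.

    'server'       : a populated server returned rows and another populated one did not
    'inconclusive' : no run returned rows, so the client itself is not cleared
    'none'         : every populated server returned rows
    """
    populated = [r for r in runs if r.rows_in_sql > 0]
    working = [r for r in populated if r.rows_returned > 0]
    silent = [r for r in populated if r.rows_returned == 0]
    if not silent:
        return "none", []
    if not working:
        return "inconclusive", [r.server for r in silent]
    return "server", [r.server for r in silent]


runs = [
    Run("Historian 2020 (local)", rows_returned=5, rows_in_sql=51_600),
    Run("Historian 2023 R2", rows_returned=0, rows_in_sql=71_500),
]
verdict, suspects = attribute_zero_rows(runs)
print(verdict, suspects)
```

Dropping the 2020 run from `runs` changes the verdict to `inconclusive`, which is the position the evidence was in before this control existed.
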
Notes / honesty caveats:

- The genuinely **stock** 2020 client (`aahClientManaged.dll` v2020.0406.2652.2, driven by reflection)
  could **not** be run end-to-end here: against the local 2020 historian (services patched to build
  3383.3) it self-blocks at `StartEventQuery` with `Invalid InterfaceVersion` (242). That is a client-side
  build/version gate, and the stock client has no version bypass. Our client (which *does* bypass the
  version check and byte-replays the same native sequence) is the faithful proxy that reaches the rows.
- Cross-check on the gRPC leg the same day: the **stock 2023 R2** client (native Event connection, its
  own correct event query) returned **0 rows** over 30d/90d/365d/3yr against the 2023 R2 server; the
  2026-06-22 "50 rows" stock capture did not reproduce. Same server gate, both transports, both clients.
- Output sanitized throughout (counts, native return codes, buffer lengths only; no event identity,
  host, or credentials).

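
A sanitized record of the kind described in the last note can be sketched as follows. This is an assumed shape, not the spike's actual output format. `sanitize_buffer` and its field names are hypothetical, and the payload is placeholder bytes, not the real header.

```python
import hashlib


def sanitize_buffer(label: str, return_code: int, payload: bytes) -> dict:
    """Describe a native result buffer without exposing its contents."""
    return {
        "phase": label,
        "return_code": return_code,
        "length": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


# Placeholder bytes and return code; the real 10-byte header layout is not shown here.
record = sanitize_buffer("GetNext", 0, bytes(10))
print(record["length"], len(record["sha256"]))
```

The digest lets two captures be compared for equality without printing any of their bytes.
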