Resolve write-path silent fails + expand EnsureTagAsync, RetrievalMode coverage

DelT and EnsT2 had two distinct silent-fail blockers; both now resolved live
end-to-end. Read path's RetrievalMode mapping was missing 11 of 15 enum values
(plus a latent Cyclic→4 bug). Investigation tooling kept as env-gated helpers.

DelT silent fail: Open2 was using NativeIntegratedReadOnlyConnectionMode (0x402);
server returned err 132 OperationNotEnabled silently. Added
NativeIntegratedWriteEnabledConnectionMode (0x401) per
HistorianAccessUtil.SetConnectionMode bit map (Process=1 | IntegratedSecurity=0x400).
Write orchestrator now opens with write-enabled mode.

EnsT2 silent fail: byte-by-byte comparison via inspector revealed two bugs in
SerializeAnalogCTagMetadata. The original "146-byte byte-for-byte match" was
misaligned — it omitted the leading 0x4E marker byte and treated WCF's `01 01 01`
EndElement closing markers as if they were part of the InBuff payload. Real native
InBuff is 144 bytes with 0x4E lead and 2-byte `FE 00` trailer. Golden test bytes
corrected.

EnsureTagAsync expansion: probed every analog data type via
instrument-wcf-writemessage; byte 11 of CTagMetadata is the data-type
discriminator (Float=0x01, Double=0x21, UInt2=0x09, UInt4=0x11, Int2=0x29,
Int4=0x31). String/Int1/Int8/UInt8 fail at native AddTag — out of scope for
this op. Range encoding decoded: defaults emit compact `1A 03`; non-default
emit `1F 00` + 4 doubles in order MinEU/MaxEU/MinRaw/MaxRaw. MinRaw/MaxRaw
sent on the wire but server mirrors them to MinEU/MaxEU when ApplyScaling=false
(verified against native — server quirk, not SDK bug).

RetrievalMode mapping: probed all 15 enum values; QueryType is just the native
enum ordinal. Replaced the broken switch with `(uint)mode`. Existing SDK
mapped Cyclic→4 (BestFit's value); Cyclic is actually 0.

CLAUDE.md updated: stale "Active Protocol Blocker" rewritten as resolved-status
block; SDK surface now reflects the read-blocker resolution and the new write
ops; "Remaining gaps" punch list refreshed.

Tools added (both env-gated, no runtime overhead unless flipped on):
- HistorianWcfMessageCaptureBehavior — captures all WCF body bytes when
  AVEVA_HISTORIAN_SDK_WIRE_CAPTURE is set; used for byte-level diff vs native.
- HistorianWcfHistAddressingBehavior — explicitly sets wsa:To header on the
  Hist channel for parity with native bytes (kept though not load-bearing).
- WriteDiag in TagWriteOrchestrator — env-gated EnsT2/DelT response logging
  (AVEVA_HISTORIAN_DELT_DIAG).

NativeTraceHarness CLI: added --write-min-eu/--write-max-eu/--write-min-raw/
--write-max-raw for capturing non-default-range EnsT2 payloads.

Tests: 130 → 161 passing (+31). Includes 16-mode RetrievalMode mapping table,
4 per-data-type EnsT2 golden tests, NonDefaultRanges golden test, 6 live
round-trip integration tests covering Float/Double/Int2/Int4/UInt4/FloatRanges,
3 live tests for previously-unmapped RetrievalMode values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-04 14:52:13 -04:00
parent 200493c990
commit 7a3cd9b76e
12 changed files with 591 additions and 79 deletions
+29 -4
View File
@@ -10,12 +10,19 @@ Read `AGENTS.md` (standing constraints), `instructions.md` (decision record), an
## Required SDK Surface
Read-only operations only. Do not implement write-back unless explicitly requested:
Reads (the original required surface, all working live as of 2026-05-04):
- `ProbeAsync`, `ReadRawAsync`, `ReadAggregateAsync`, `ReadAtTimeAsync`, `ReadEventsAsync`
- `BrowseTagNamesAsync`, `GetTagMetadataAsync`
- Status helpers: `GetConnectionStatusAsync`, `GetStoreForwardStatusAsync`, `GetSystemParameterAsync`
Writes (added 2026-05-04 by explicit user request — do not extend further without one):
- `EnsureTagAsync` for analog types: Float, Double, Int2, Int4, UInt4 (live-verified end-to-end). Other types (SingleByteString/DoubleByteString/Int1/Int8/UInt8) fail at native AddTag — likely require a different path and are intentionally not supported. `MinEU`/`MaxEU` round-trip correctly into the DB; `MinRaw`/`MaxRaw` are sent on the wire but the server mirrors them to MinEU/MaxEU when ApplyScaling=false (verified against native — server quirk, not SDK bug).
- `DeleteTagAsync`
`AddS2` (write samples) is architecturally blocked — server cache only ingests from configured IOServers/ApplicationServer pipelines. Do not add write-samples support.
Methods without protocol evidence currently throw `ProtocolEvidenceMissingException` from `Historian2020ProtocolDialect`. Do not stub fake behavior — leave them throwing until evidence supports an implementation.
## Build & Test
@@ -68,11 +75,29 @@ Three layered subsystems, intentionally decoupled so protocol parsing can be uni
`InternalsVisibleTo` exposes internals to the test assembly and the reverse-engineering tool.
### The Active Protocol Blocker
### Read-path status (resolved 2026-05-04)
The native wrapper does **not** use the simple `Open2` session handle for query reads. The successful native flow is `CClientContext.AuthenticateClient` → two `ValidateClientCredential` SSPI rounds → `CHistoryConnectionWCF.OpenConnection3``CClientCommon.StartQuery``/Retr.StartQuery2`. `OpenConnection3` mints the transient `/Retr` client handle the server accepts. Managed `Open2` alone reaches server logic but `Retr.StartQuery2` returns false with empty buffers.
The original blocker — `Open2` reaching server logic but `Retr.StartQuery2` returning false with empty buffers — is **resolved**. Root causes were:
`DataQueryRequest` and `EventQueryRequest` byte serialization is already byte-matched against native captures. The remaining gap is reproducing the auth/session state that lets the server accept a client-generated context GUID before `OpenConnection3`. See handoff.md "Active Blocker" and `docs/reverse-engineering/openconnection3-correlation-latest.json`.
1. WCF parameter-name mismatches — server contracts use `inBuff`/`outBuff`/`pRequestBuff`/etc.; the SDK's default C#-derived names made the deserializer ignore the body. Fixed via `[MessageParameter(Name = "...")]` attributes across `IHistoryServiceContract2`, `IRetrievalServiceContract2..4`, `IStatusServiceContract2`, `ITransactionServiceContract`.
2. Native SSPI request flags — round 0 = `0x2081C` (adds `IDENTIFY` + `REPLAY_DETECT` + `SEQUENCE_DETECT`); rounds 1+ = `0x81C`. Without `REPLAY_DETECT|SEQUENCE_DETECT`, NTLM MIC generation is skipped and `AcceptSecurityContext` rejects round 1. Implemented in `HistorianSspiClient` via P/Invoke `InitializeSecurityContextW`.
3. Cross-service version probes (`Trx/GetV`, `Stat/GetV`, `Retr/GetV`) between RTag2 and EnsT2 in the event flow — required to register the client with each service's session table.
End-to-end chain working from a pure managed .NET 10 client: `Hist.GetV → Hist.ValCl × N → Hist.Open2 → /Retr.GetV → Retr.IsOriginalAllowed → Retr.StartQuery2 → loop Retr.GetNextQueryResultBuffer2`. 23 live integration tests against `localhost` cover all required reads + the two write ops.
### Write-path notes (added 2026-05-04)
`EnsureTagAsync` and `DeleteTagAsync` chain follow the same pattern as reads but require Open2 with `NativeIntegratedWriteEnabledConnectionMode = 0x401` (Process | Write | IntegratedSecurity) — the read-path's `0x402` (read-only) makes the server return err 132 `OperationNotEnabled` silently. The analog Float `CTagMetadata` payload is 144 bytes with a leading `0x4E` marker byte and a 2-byte trailer `FE 00`. See `docs/reverse-engineering/handoff.md` and the `WriteDiag` env-gated diagnostic helper in `HistorianWcfTagWriteOrchestrator` for capture details.
### Remaining gaps
Smaller, isolated items — none block the production read surface:
- Remote TCP transports (`RemoteTcpIntegrated`, `RemoteTcpCertificate`) untested against an actual remote Historian (tests skip without `HISTORIAN_REMOTE_TCP_HOST`).
- Explicit username/password tag-metadata path (`HistorianWcfTagClient` line ~357) throws — only integrated security wired for that op.
- `Historian2020ProtocolDialect.GetTagInfoByName/GetTagInfos` throws — currently dead code; `GetTagMetadataAsync` works through the WCF tag client instead.
- Per-row trailing ~24 bytes of `GetNextQueryResultBuffer` are not decoded (likely per-sample value/source/state metadata).
- `EnsureTagAsync` distinct `MinRaw`/`MaxRaw` persistence requires `ApplyScaling=true` + a follow-up `UpdateTags` call — not yet wired (no API user has asked).
### Tools Layer