histsdk/docs/reverse-engineering/handoff.md
dohertj2 c95824a65d Initial commit: managed .NET 10 AVEVA Historian SDK + reverse-engineering toolkit
Full read-only SDK (src/AVEVA.Historian.Client) implementing the CLAUDE.md required
surface against AVEVA Historian's binary WCF protocol — no native AVEVA runtime
dependency. All operations live-verified against a local Historian:

- ProbeAsync, ReadRawAsync, ReadAggregateAsync, ReadAtTimeAsync, ReadEventsAsync
- BrowseTagNamesAsync, GetTagMetadataAsync (17 native data-type codes mapped)
- GetConnectionStatusAsync, GetStoreForwardStatusAsync, GetSystemParameterAsync
- 108/108 unit + integration tests pass

Includes the reverse-engineering toolkit (tools/AVEVA.Historian.ReverseEngineering)
used to decode the protocol: WCF probes, IL inspection via dnlib, and IL-rewrite
instrumentation (instrument-wcf-{write,read}message etc.) plus the .NET Framework
trace harness (tools/AVEVA.Historian.NativeTraceHarness) for parity testing.

Sanitized handoff evidence under docs/reverse-engineering/. Native AVEVA binaries
(current/, aveva-install-x64/, aveva-install-x86/) are gitignored — fetch separately
from the AVEVA installer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 06:31:48 -04:00

# AVEVA Historian Managed Driver Handoff
Last updated: 2026-05-04 (event-flow prereqs)
## Project Direction
The project goal is still a fully managed .NET 10 C# AVEVA Historian client.
The production SDK must not depend on `aahClientManaged.dll`, `aahClient.dll`,
or any other AVEVA native runtime binary.
Do not pivot to REST or a P/Invoke production shim unless the project
requirements change. Native and P/Invoke tools in this repo are reverse
engineering aids only.
Required production surface remains narrowly scoped:
- `ProbeAsync`
- `ReadRawAsync`
- `ReadAggregateAsync`
- `ReadAtTimeAsync`
- `ReadEventsAsync`
- `BrowseTagNamesAsync`
- `GetTagMetadataAsync`
Writes are out of scope for the current pass.
## Repository Map
- `AGENTS.md` - standing project instructions and constraints.
- `instructions.md` - original plan and decision record.
- `current\` - deployed sidecar dependency DLL set; use this first for wrapper
behavior.
- `aveva-install-x64\` and `aveva-install-x86\` - full installed AVEVA DLL
sets for comparison.
- `src\AVEVA.Historian.Client\` - production managed SDK.
- `tests\AVEVA.Historian.Client.Tests\` - unit and gated integration tests.
- `tools\AVEVA.Historian.ReverseEngineering\` - .NET 10 CLI for static
inspection, WCF probes, and IL-rewrite generation.
- `tools\AVEVA.Historian.NativeTraceHarness\` - .NET Framework native-wrapper
comparison harness. Reverse-engineering only.
- `tools\AVEVA.Historian.NetFxWcfProbe\` - .NET Framework WCF probe used to
rule out .NET 10 WCF-only differences.
- `tools\AVEVA.Historian.ReverseInstrumentation\` - helper assembly injected
into rewritten wrapper copies for sanitized logging.
- `tools\AVEVA.Historian.WcfCaptureServer\` - fake WCF capture server used for
endpoint experiments.
- `scripts\` - PowerShell runners and Frida scripts.
- `docs\reverse-engineering\` - sanitized notes and small evidence summaries.
- `artifacts\reverse-engineering\` - ignored raw/sensitive runtime artifacts.
Do not commit raw captures or identity-bearing logs.
## Build And Test
From the repository root, normally `%USERPROFILE%\Desktop\histsdk`:
```powershell
dotnet build .\Histsdk.slnx --no-restore
dotnet test .\Histsdk.slnx --no-build --logger "console;verbosity=minimal"
```
Current known-good result:
- Build succeeds.
- Unit tests pass: 55/55.
The repository folder is not currently a Git working tree in this checkout, so
use file timestamps or your own external backup if you need change tracking.
## Environment Variables
Live integration tests and probes are gated by environment variables:
```powershell
$env:HISTORIAN_HOST = "<host>"
$env:HISTORIAN_PORT = "32568"
$env:HISTORIAN_USER = "<DOMAIN\user or machine\user>"
$env:HISTORIAN_PASSWORD = "<password>"
$env:HISTORIAN_TEST_TAG = "<known historized tag>"
$env:HISTORIAN_TAG_FILTER = "<LIKE filter, optional>"
```
Do not write actual credentials into docs, scripts, captures, or command logs.
The scripts read these values from the process environment.
## Useful Commands
Probe managed WCF endpoints:
```powershell
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- wcf-probe $env:HISTORIAN_HOST 32568
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- wcf-cert-probe $env:HISTORIAN_HOST 32568 localhost
```
Test the positive managed tag-browse route:
```powershell
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- wcf-like-tag-browse $env:HISTORIAN_HOST 32568 $env:HISTORIAN_TAG_FILTER
```
Run a bounded negative `StartQuery2` replay without burning the full matrix:
```powershell
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- wcf-start-query $env:HISTORIAN_HOST 32568 $env:HISTORIAN_TEST_TAG --max-attempts 1 --timeout-seconds 3
```
Run the native wrapper comparison harness:
```powershell
dotnet run --project tools\AVEVA.Historian.NativeTraceHarness -- --scenario history --tag $env:HISTORIAN_TEST_TAG --lookback-minutes 1440
dotnet run --project tools\AVEVA.Historian.NativeTraceHarness -- --scenario event --lookback-minutes 10080
```
Search local Galaxy Repository for historized tags:
```powershell
powershell.exe -NoProfile -ExecutionPolicy Bypass -File .\scripts\Find-GalaxyHistorizedTags.ps1
```
Prompt for Historian credentials in a PowerShell window:
```powershell
powershell.exe -NoProfile -ExecutionPolicy Bypass -File .\scripts\Prompt-HistorianCredentialsAndOpen2.ps1
```
## Script Locations
Credential/session helpers:
- `scripts\Prompt-HistorianCredentialsAndOpen2.ps1`
- `scripts\Test-AahClientManagedOpen.ps1`
- `scripts\Test-AahClientManagedReadIntegrated.ps1`
Native/wrapper capture runners:
- `scripts\Run-AahClientManagedFridaCapture.ps1`
- `scripts\Attach-AahClientManagedFridaCapture.ps1`
- `scripts\Attach-NativeTraceHarnessRuntimePointerCapture.ps1`
- `scripts\Attach-NativeTraceHarnessWinsockCapture.ps1`
- `scripts\Attach-NativeTraceHarnessSystemBoundaryCapture.ps1`
- `scripts\Attach-NativeTraceHarnessAahClientExportCapture.ps1`
Server-side ValCl probe:
- `scripts\Capture-AahClientAccessPointValClContext.ps1`
- `scripts\frida\aahclientaccesspoint-valcl-context.js`
Network/relay experiments:
- `scripts\Attach-SystemBoundaryViaDebianRelay.ps1`
- `scripts\Run-DebianHistorianRelayCapture.ps1`
- `scripts\Run-PktmonDebianRelayCapture.ps1`
- `scripts\Start-WcfOpen2CaptureServer.ps1`
Frida hook implementations:
- `scripts\frida\aahclientmanaged-open-query.js`
- `scripts\frida\aahclientmanaged-system-boundary.js`
- `scripts\frida\aahclientmanaged-winsock.js`
- `scripts\frida\aahclient-exports.js`
## Current Evidence Summary
Positive evidence:
- Fully managed WCF/MDAS endpoint probing works.
- `/Hist`, `/Retr`, `/Stat`, and `/Trx` `GetV` calls are reachable.
- `/HistCert` is reachable with MDAS over transport security.
- `/Hist-Integrated` accepts managed Windows integrated `Open2`.
- The returned `Open2` handle is accepted by `Retr.IsOriginalAllowed`.
- Managed wildcard tag browse works through:
  - `Retr.StartLikeTagNameSearch`
  - `Retr.GetLikeTagnames`
- Native wrapper history reads succeed in the direct/local path for known
  historized tags.
- Native wrapper event query succeeds and returns sanitized local-dev rows.
- `DataQueryRequest` serialization is byte-matched for:
  - full/raw request
  - time-weighted aggregate request
  - interpolated request
- `EventQueryRequest` serialization is byte-matched for the current empty-filter
  event query fixture.
- `OpenConnection3` request/response layout is partially decoded:
  - request byte `0`: version `6`
  - request bytes `1..16`: authenticated context GUID
  - request byte `17`: content selector
  - response byte `0`: version `3`
  - response bytes `1..4`: transient `/Retr` client handle
  - response includes storage session id, connect time, and server time
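
As a rough illustration of the partially decoded `OpenConnection3` layout above, the following Python sketch slices only the documented header offsets. The function names are hypothetical, and two details are assumptions the evidence does not confirm: the GUID is read in .NET `Guid.ToByteArray()` (mixed-endian) order, and the 4-byte handle is read as little-endian.

```python
import struct
import uuid


def parse_open_connection3_request_header(buf: bytes) -> dict:
    """Slice the documented request header: version, context GUID, selector."""
    if len(buf) < 18:
        raise ValueError("request header needs at least 18 bytes")
    return {
        "version": buf[0],  # documented value: 6
        # Assumption: GUID bytes follow .NET Guid.ToByteArray() ordering.
        "context_guid": uuid.UUID(bytes_le=bytes(buf[1:17])),
        "content_selector": buf[17],
    }


def parse_open_connection3_response_header(buf: bytes) -> dict:
    """Slice the documented response header: version and /Retr client handle."""
    if len(buf) < 5:
        raise ValueError("response header needs at least 5 bytes")
    # Assumption: the transient handle is a little-endian unsigned 32-bit value.
    (handle,) = struct.unpack_from("<I", buf, 1)
    return {"version": buf[0], "retr_client_handle": handle}  # documented version: 3
```

The remaining response fields (storage session id, connect time, server time) are deliberately left unparsed because their offsets are not stated here.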
Negative evidence:
- `Open2` by itself is not enough for history/event query starts.
- Direct managed `/Retr.StartQuery2` fails even with byte-matched
`DataQueryRequest` bytes.
- The bounded current replay shape is:
  - `/Hist-Integrated Open2` succeeds
  - `Retr.IsOriginalAllowed` returns true
  - `StartQuery2` returns `false`
  - response and error buffers are empty
  - legacy `StartQuery` may fault with a server null-reference
- Query failure is not caused by:
  - wrong basic WCF service path
  - wrong MDAS content type
  - wrong `DataQueryRequest` serializer
  - wrong `QueryType` sweep
  - wrong common selector flag variants
  - missing `IsOriginalAllowed`
  - simple explicit username/password mismatch
- Managed standalone `ValCl` replay reproduces the first native wrapped NTLM
token but still fails at round 0.
- Running the same managed `ValCl` path through .NET Framework also fails, so
this is not just a .NET 10 WCF behavior difference.
## Active Blocker
**Resolved on `2026-05-04`.** The previous blocker — managed `ValCl`
rejected by the server — had two causes, both now fixed:
1. **WCF parameter-name mismatch.** The SDK and probe declared the
   `ValidateClientCredential` byte parameters as `inputBuffer` /
   `outputBuffer`; the actual server contract (per `ildasm` of
   `aahClientAccessPoint.exe`) uses `inBuff` / `outBuff`. WCF derives
   body element names from the C# parameter name, so the server's
   deserialiser ignored the unknown `<inputBuffer>` element, left
   `arg.2` null, and threw a `NullReferenceException` at IL `0x01AA`.
   Fixed via `[MessageParameter(Name = "inBuff")]` and
   `[MessageParameter(Name = "outBuff")]` in the probe and in
   `src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs`
   and `IStorageServiceContract.cs`.
2. **SSPI request-flag mismatch.** The probe used `ALLOCATE_MEMORY |
   CONFIDENTIALITY | INTEGRITY | CONNECTION = 0x10910`; the native
   wrapper uses `0x2081C` in round 0 and `0x81C` in round 1+ (it adds
   `IDENTIFY` in round 0 and `REPLAY_DETECT` + `SEQUENCE_DETECT` in
   every round). The REPLAY/SEQUENCE pair gates NTLM MIC generation;
   without it, `AcceptSecurityContext` rejects round 1 with
   `SEC_E_INVALID_TOKEN`. Fixed in the probe's `SspiClient`.
The full chain a successful native read uses is now reproducible from
a fully managed client end-to-end:
1. `Hist-Integrated.GetV` → version `11`
2. `Hist-Integrated.ValCl` round 0 (69 → 239 bytes) ✓
3. `Hist-Integrated.ValCl` round 1 (93 → 1 byte terminal) ✓
The next evidence layers — `OpenConnection3` (with the now-known
context key), `Retr.IsOriginalAllowed`, and `Retr.StartQuery2` —
should now work, because the native context-map registration that
`ProcessServerToken` performs has finally been completed by a managed
client. Run the same managed sequence and observe whether
`OpenConnection3` returns the expected 42-byte response and whether
`StartQuery2` returns a non-empty result for
`OtOpcUaParityTest_001.Counter`.
## Next Pickup Steps
`scripts\Capture-AahClientAccessPointValClContext.ps1` cannot get server-side
helper visibility on this host. Both scenarios were re-run on `2026-05-03`
from an elevated PowerShell session (Admin, High Mandatory Label,
`SeDebugPrivilege` enabled) and Frida attach into `aahClientAccessPoint.exe`
(running as `NT SERVICE\aahClientAccessPoint`) was rejected with
`Failed to attach: process with pid <pid> either refused to load frida-agent,
or terminated during injection`. The actual Frida Python exception is
`frida.ProcessNotRespondingError`, which means the agent injection
handshake did not complete in time, not a load-time refusal. The probes
themselves still ran cleanly: NativeRead reproduced the canonical fixture
row, and ManagedValCl reproduced the type-4/code-1 round-zero failure with
the canonical wrapped-NTLM prefix.
Hypotheses already ruled out on this host:
- **Process mitigation policy.** `Get-ProcessMitigation -Id <pid>` reports
every category OFF for the service, including
`BinarySignature.MicrosoftSignedOnly`, `DynamicCode.BlockDynamicCode`,
`Cfg.Enable`, `ImageLoad.BlockRemoteImageLoads`,
`ExtensionPoint.DisableExtensionPoints`, and `UserShadowStack.*`.
- **DACL / token.** `OpenProcess(PROCESS_ALL_ACCESS)` from the elevated
token succeeds, including `PROCESS_VM_OPERATION`, `PROCESS_VM_WRITE`, and
`PROCESS_CREATE_THREAD`.
- **Bitness.** Cross-bitness Frida (64-bit Python attaching to a fresh
`C:\Windows\SysWOW64\cmd.exe`) works.
- **AV / EDR.** Defender real-time protection, behavior monitoring, and
on-access protection are OFF; no third-party AV/EDR is registered with
`SecurityCenter2`; no EDR-style filter driver is active.
- **IFEO / AppInit.** No IFEO debugger entry for `aahClientAccessPoint.exe`;
`AppInit_DLLs` empty in 64-bit and WOW64 hives.
- **Frida realm / persist_timeout knobs.** `realm='native'`,
`realm='emulated'`, and `persist_timeout=30` all fail identically.
Likely remaining cause: service-internal — `aahClientAccessPoint.exe` runs
~150 threads, many in `EventPairLow` ALPC/SCM waits, and Frida's manual
mapper does not get a cooperative thread to complete its RPC bootstrap.
ETW SSPI tracing then produced the actionable evidence Frida could not.
A `logman` session capturing `LsaSrv`, `LSA`,
`Microsoft-Windows-NTLM`, `NTLM Security Protocol`, and
`Security: NTLM Authentication` providers at level `0xFF` and keywords
`0xFFFFFFFFFFFFFFFF` recorded **10 SSPI events from
`aahClientAccessPoint` during a successful native read (Ids 30, 34, 35,
40, 84, 10, 12, 16, 17, 86 in a 47 ms burst) and zero from the same
process during a failing managed ValCl run**. lsass-side SSPI activity
also drops roughly 36x in the failing run (4330 → 121 events). The implication
is that the long-standing
`HistoryService.ValidateClientCredential caught NullReferenceException
at line 1593` fires *before* reaching `CServerNode.ProcessServerToken`
at IL `0x01DC`, i.e. between `Guid.TryParse(handle)` at IL `0x012A` and
the ProcessServerToken call site. Likely culprits: `CServerBuffer`
vtable allocation at IL `0x0183`, the byte-array pointer/length copy
into buffer `+72/+76`, or a parameter pull from
`ServiceSecurityContext.Current` whose `WindowsIdentity` is null on the
plain `Security.Mode = None` pipe binding.
Static IL inspection of `HistoryService.ValidateClientCredential`
(token `0x06000774`, 779 instructions, in mixed-mode
`aahClientAccessPoint.exe`) enumerates every NRE-capable instruction
on the straight-line path before the ProcessServerToken call and
narrows the failure to five candidates (full table in
`openconnection3-correlation-latest.json`
`ValidateClientCredentialIlNreCandidates`):
- `0x00ED` — `LogHistorianMessage(... CServerClient*, ...)` in the
prologue. NREs if the `CServerClient*` is null on the failing
binding.
- `0x017E` and `0x0182` — vtable derefs in the allocator chain at
`&g_ClientAccessPoint + 2328` → vtable → +40. NREs if the field is
uninitialised; ruled out as the differentiator because
`g_ClientAccessPoint` is a process-wide singleton.
- `0x01AA` (`ldelema`) and `0x01B2` (`ldlen`) on `arg.2 = byte[]
inputBuffer`. NREs if WCF deserialises the buffer as null even
though 69 bytes are on the wire.
The two custom-error paths in this method (code `28` for invalid GUID
text at `0x012F`, code `204` for allocator-null at `0x018A`) are both
explicitly handled, so neither would manifest as the logged
`NullReferenceException`.
Differential analysis against the successful native local read (which
uses the same `Security.Mode = None` pipe binding) rules out the
prologue and the static-singleton vtable chain as differentiators. The
**byte-array deref at `0x01AA`/`0x01B2` is the most plausible remaining
candidate** because it depends on WCF body deserialisation which can
silently differ between the managed probe and the native wrapper even
when both sides claim the same operation contract.
SOAP-body comparison via WCF message logging in the .NET Framework
probe resolved this. The wire body sent
`<inputBuffer>BASE64DATA</inputBuffer>` but the response used
`<outBuff b:nil="true"/>`. `ildasm` against `aahClientAccessPoint.exe`
confirmed the actual server contract is
```il
ValidateClientCredential(string handle, uint8[] inBuff,
[out] uint8[]& outBuff,
[out] uint8[]& errorBuffer)
```
WCF derives the request body element name from the C# parameter name,
so the probe's `inputBuffer` parameter produced `<inputBuffer>` on the
wire and the server's WCF deserialiser ignored that unknown element,
leaving server-side `arg.2 = inBuff = null`. IL `0x01AA` `ldelema
System.Byte` then NREs and the C++/CLI catch handler converts it to
native error type 4 / code 1.
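
The failure mode described above can be illustrated with a toy model. This is not WCF and not the MDAS encoding: it is a minimal Python sketch, assuming a deserialiser that binds parameters strictly by element name and silently skips unknown elements. The wrapper element `ValCl` and the helper name `bind_byte_parameters` are hypothetical, and namespaces are omitted.

```python
import base64
import xml.etree.ElementTree as ET


def bind_byte_parameters(body_xml: str, expected_names: list) -> dict:
    """Bind byte[] parameters by element name; unknown elements are ignored."""
    root = ET.fromstring(body_xml)
    present = {child.tag: child.text for child in root}
    bound = {}
    for name in expected_names:
        text = present.get(name)
        # A parameter with no matching element stays None (null on the server).
        bound[name] = base64.b64decode(text) if text is not None else None
    return bound


if __name__ == "__main__":
    payload = base64.b64encode(b"\x01" * 69).decode("ascii")
    wrong = "<ValCl><inputBuffer>" + payload + "</inputBuffer></ValCl>"
    right = "<ValCl><inBuff>" + payload + "</inBuff></ValCl>"
    print(bind_byte_parameters(wrong, ["inBuff"]))  # inBuff stays None
    print(len(bind_byte_parameters(right, ["inBuff"])["inBuff"]))
```

In the toy model the 69 payload bytes are on the wire in both cases, but only the body that uses the server's element name binds them, which mirrors the null `arg.2` observed at IL `0x01AA`.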
Adding `[MessageParameter(Name = "inBuff")]` and `[MessageParameter(Name
= "outBuff")]` to the probe's `ValidateClientCredential` declaration
unblocks the request:
- **Round 0:** `ServerSuccess=true`, `ServerOutputLength=239`,
`ServerContinue=true`, output prefix `01 4e 54 4c 4d 53 53 50 00 02
...` (continue byte + NTLMSSP type-2 challenge). Matches the
documented native-success "69→239 byte" first round exactly.
- **Round 1:** `Type=129 Code=0x80090308 = SEC_E_INVALID_TOKEN` with a
100-byte error buffer whose ASCII payload includes
`aahClientAccessPoint::CServerContext::ProcessClientToken` and
`InitializeSecurityContext`. The original parameter-binding NRE is
gone; the next layer of failure is real SSPI rejection inside
`AcceptSecurityContext`.
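
When comparing probe runs, a small helper for classifying the `ValCl` server output can save time. The Python sketch below assumes the first byte is the continue flag (non-zero meaning another round is expected, `0x00` meaning terminal, as the round summaries suggest) and that any remaining bytes are a standard NTLMSSP message whose type is a little-endian 32-bit value after the 8-byte signature. The function name is hypothetical.

```python
import struct

NTLMSSP_SIGNATURE = b"NTLMSSP\x00"


def describe_valcl_output(buf: bytes) -> dict:
    """Split a ValCl server output into continue flag and NTLM message type."""
    if not buf:
        raise ValueError("empty ValCl output")
    result = {"continue": buf[0] != 0, "ntlm_message_type": None}
    token = buf[1:]
    # NTLMSSP messages start with an 8-byte signature, then a uint32 type.
    if token.startswith(NTLMSSP_SIGNATURE) and len(token) >= 12:
        (result["ntlm_message_type"],) = struct.unpack_from("<I", token, 8)
    return result
```

A round-0 output with the prefix `01 4e 54 4c 4d 53 53 50 00 02` would be reported as continue with message type 2 (challenge), and a single `0x00` byte as terminal with no NTLM token.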
The same `[MessageParameter]` fix is now applied to the production SDK
contracts `IHistoryServiceContract2.ValidateClientCredential` and
`IStorageServiceContract.ValidateClientCredential`. `ildasm` also
revealed the same parameter-naming mismatch on
`EnsT`/`EnsT2`/`RTag2`/`ExKey`/`StJb`/`GtJb` with their current SDK
declarations; those operations are not on the read-only SDK path so
they are intentionally left alone for now (audit when those flows
become required — see `ServerContractAuditedOtherOperationsWithLikelySameMismatch`
in `openconnection3-correlation-latest.json` for the table).
Native SSPI flag replication on `2026-05-04` resolved
`SEC_E_INVALID_TOKEN`. Decoded native flags:
- `0x2081C` round 0 = `ISC_REQ_IDENTIFY | ISC_REQ_CONNECTION |
ISC_REQ_CONFIDENTIALITY | ISC_REQ_SEQUENCE_DETECT |
ISC_REQ_REPLAY_DETECT`
- `0x81C` round 1+ = same minus `ISC_REQ_IDENTIFY`
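
The flag arithmetic above can be checked mechanically. The Python sketch below uses the `ISC_REQ_*` values from the Windows SDK `sspi.h` header; the helper names `decode_isc_req` and `native_flags_for_round` are hypothetical and exist only for this illustration.

```python
# ISC_REQ_* request flags as defined in the Windows SDK sspi.h header.
ISC_REQ_FLAGS = {
    "ISC_REQ_REPLAY_DETECT": 0x00000004,
    "ISC_REQ_SEQUENCE_DETECT": 0x00000008,
    "ISC_REQ_CONFIDENTIALITY": 0x00000010,
    "ISC_REQ_ALLOCATE_MEMORY": 0x00000100,
    "ISC_REQ_CONNECTION": 0x00000800,
    "ISC_REQ_INTEGRITY": 0x00010000,
    "ISC_REQ_IDENTIFY": 0x00020000,
}


def decode_isc_req(mask: int) -> list:
    """Return the names of the known flags set in mask, plus any leftover bits."""
    names = []
    remaining = mask
    for name, bit in ISC_REQ_FLAGS.items():
        if mask & bit:
            names.append(name)
            remaining &= ~bit
    if remaining:
        names.append(hex(remaining))  # bits this table does not know about
    return names


def native_flags_for_round(round_index: int) -> int:
    """Native-equivalent request flags: IDENTIFY is only sent in round 0."""
    base = 0x81C
    if round_index == 0:
        return base | ISC_REQ_FLAGS["ISC_REQ_IDENTIFY"]
    return base


if __name__ == "__main__":
    for mask in (0x2081C, 0x81C, 0x10910):
        print(hex(mask), decode_isc_req(mask))
```

Decoding `0x10910` this way shows the earlier probe mask carried `INTEGRITY` and `ALLOCATE_MEMORY` but neither `REPLAY_DETECT` nor `SEQUENCE_DETECT`.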
The probe was missing `ISC_REQ_REPLAY_DETECT`,
`ISC_REQ_SEQUENCE_DETECT`, and round-0 `ISC_REQ_IDENTIFY`. The
REPLAY/SEQUENCE pair gates NTLM MIC generation in the type-3
response; without it the server's `AcceptSecurityContext` rejects with
`SEC_E_INVALID_TOKEN`. Adding those flags (and tracking the round
count internally in `SspiClient`, keeping `ALLOCATE_MEMORY` for
buffer convenience) reproduces the documented native two-round
sequence byte-for-byte from a managed client:

| Round | Wire | Server output | Continue | Error |
|---|---|---|---|---|
| 0 | 69 wrapped | 239 (NTLM type-2 challenge) | true | none |
| 1 | 93 wrapped | **1 byte (`0x00` terminal)** | false | **none** |

`FinalServerSuccess: true`, `FinalNativeError: null`. The long-standing
managed `ValCl` blocker is resolved. The chain a successful native
read uses is now reproducible from a managed client end-to-end:
1. `Hist-Integrated.GetV` → version `11`
2. `Hist-Integrated.ValCl` round 0 (69 → 239 bytes) ✓
3. `Hist-Integrated.ValCl` round 1 (93 → 1 byte terminal) ✓
End-to-end chain verification on `2026-05-04`. The .NET Framework
probe was extended to chain `Hist.Open2` (replaying the captured
1346-byte v6 request with the leading 16 context-key bytes spliced to
match the managed `ValCl` GUID), then `Retr.IsOriginalAllowed`, then
`Retr.StartQuery2` (replaying the captured 251-byte
`OtOpcUaParityTest_001.Counter` `DataQueryRequest`). Result:

| Step | Outcome |
|---|---|
| `Hist.Open2` | 42 bytes, version `0x03`, transient `/Retr` client handle decoded |
| `Retr.GetV` | version `4` |
| `Retr.IsOriginalAllowed(handle)` | return code `0`, `isAllowed = true` |
| `Retr.StartQuery2(handle, 1, 251 bytes, ...)` | `Success=true`, response **31 bytes**, `QueryHandlePresent=true`, no error |

The 31-byte `StartQuery2` response SHA-256
`4c062b5ce8181308f0f46bfd8c6088acb52e6ade94401651b7d3ccc8952edfb5`
is **byte-for-byte identical** to the previously captured native
success response. The full AVEVA Historian native wire protocol chain
through `StartQuery2` is now reproducible end-to-end from a fully
managed client.
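
A digest comparison like the one above is easy to script when re-verifying a new capture. This is a minimal Python sketch; the function name is hypothetical and the response bytes are assumed to come from wherever the probe or SDK exposes them.

```python
import hashlib
import hmac

# SHA-256 of the previously captured native StartQuery2 success response.
NATIVE_STARTQUERY2_RESPONSE_SHA256 = (
    "4c062b5ce8181308f0f46bfd8c6088acb52e6ade94401651b7d3ccc8952edfb5"
)


def matches_fixture_digest(
    response: bytes, expected_hex: str = NATIVE_STARTQUERY2_RESPONSE_SHA256
) -> bool:
    """Compare the SHA-256 of response against a known hex digest."""
    actual = hashlib.sha256(response).hexdigest()
    return hmac.compare_digest(actual, expected_hex.lower())
```

Comparing digests rather than raw bytes keeps transient handle values out of logs, in line with the sanitization rules for this repository.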
This required one additional contract fix:
`IRetrievalServiceContract2` had the same parameter-name mismatch
class. Server uses `pRequestBuff` / `pResponseBuff` / `errSize` / `err`
on `StartQuery2` (and `pResultBuff` / `errSize` / `err` on
`GetNextQueryResultBuffer2`, `errSize` / `err` on `EndQuery2`).
`[MessageParameter(Name = ...)]` attributes added to
`src/AVEVA.Historian.Client/Wcf/Contracts/IRetrievalServiceContract2.cs`.
Reproduce the chain with:
```powershell
.\tools\AVEVA.Historian.NetFxWcfProbe\bin\Debug\net481\AVEVA.Historian.NetFxWcfProbe.exe `
--endpoint "net.pipe://localhost/Hist" `
--retr-endpoint "net.pipe://localhost/Retr" `
--open2-replay .\artifacts\reverse-engineering\openconnection3-request-replay.bin `
--data-query-replay .\artifacts\reverse-engineering\startdataquery-request-replay.bin
```
The two `*.bin` inputs are extracted from
`artifacts/reverse-engineering/instrumented-openconnection3-correlation/capture.ndjson`
(`OpenConnection3.Request` and `StartDataQuery.Request` Base64
fields) and stay under `artifacts/` (gitignored). The probe stdout
JSON only echoes lengths, SHAs, version bytes, and prefix hex; it
does not echo identity payloads or transient handle values.
Production SDK note: `src/AVEVA.Historian.Client` currently has no
SSPI client (only wrap/unwrap helpers in
`HistorianWcfAuthenticationProtocol`). When the SDK auth flow is
wired for the production read path, it must use the same
native-equivalent flags. .NET 10's
`System.Net.Security.NegotiateAuthentication` does not expose
`ISC_REQ_*` directly; P/Invoke `InitializeSecurityContextW` (or
equivalent) to set `IDENTIFY` + `REPLAY_DETECT` + `SEQUENCE_DETECT`
explicitly. Reference implementation in
`tools/AVEVA.Historian.NetFxWcfProbe/Program.cs` `SspiClient`.
The protocol is now fully understood end-to-end for the read path;
remaining work is plumbing — replace the captured-replay `Open2`
payload with `HistorianOpen2Protocol.SerializeNativeOpenConnection3Version6`
(already in the SDK), then chain `ValCl → Open2 → /Retr.StartQuery2 →
/Retr.GetNextQueryResultBuffer2` for the canonical read fixture.
Production SDK plumbing landed on `2026-05-04`. The fully managed
.NET 10 SDK now reads history end-to-end against the live local
Historian. New SDK pieces:
- `Wcf/HistorianSspiClient.cs` — managed SSPI client, P/Invokes
`InitializeSecurityContextW` with native flags `0x2081C` round 0 /
`0x81C` later. `[SupportedOSPlatform("windows")]`.
- `Wcf/HistorianWcfBindingFactory.CreateMdasNetNamedPipeBinding` +
`CreatePipeEndpointAddress` — Named Pipe transport for the local
Historian. `[SupportedOSPlatform("windows")]`.
- `Wcf/HistorianDataQueryProtocol.TryParseGetNextQueryResultBufferRows` —
parses `UInt16 version=9` + `UInt32 rowCount` + N self-describing
rows; recognises the 5-byte `04 1E 00 00 00` ("no more data")
terminal.
- `Wcf/HistorianWcfReadOrchestrator.cs` — chains `Hist.GetV →
Hist.ValCl × N → Hist.Open2 → /Retr.GetV →
Retr.IsOriginalAllowed → Retr.StartQuery2 → loop
Retr.GetNextQueryResultBuffer2`. Builds the OpenConnection3 v6
request through `HistorianOpen2Protocol.SerializeNativeOpenConnection3Version6`
with documented native constants (`ClientType=4`,
`ConnectionMode=0x402`, `FormatVersion=4`, `HcalVersion=17`,
`DataSourceId="2020.406.2652.2"`).
- `HistorianClientOptions.Transport` (defaults to `LocalPipe`) and
`HistorianClientOptions.TargetSpn` (defaults to
`NT SERVICE\aahClientAccessPoint`).
- `Models/HistorianSample.PercentGood`.
- `Protocol/Historian2020ProtocolDialect.ReadRawAsync` now delegates
to the orchestrator on Windows + `LocalPipe`.
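
The result-buffer framing described for `TryParseGetNextQueryResultBufferRows` can be sketched as follows. This Python illustration is not the SDK parser: it only reads the header and recognises the documented terminal, it assumes little-endian fields packed at offset 0, and it leaves the self-describing row bodies untouched. The function name and returned keys are hypothetical.

```python
import struct

# Documented 5-byte "no more data" terminal buffer.
NO_MORE_DATA = bytes.fromhex("041e000000")
EXPECTED_VERSION = 9


def read_result_buffer_header(buf: bytes) -> dict:
    """Read UInt16 version + UInt32 rowCount, or flag the terminal buffer."""
    if buf == NO_MORE_DATA:
        return {"terminal": True, "version": None, "row_count": 0, "rows_offset": None}
    if len(buf) < 6:
        raise ValueError("buffer too short for version + rowCount header")
    # Assumption: both header fields are little-endian with no padding.
    version, row_count = struct.unpack_from("<HI", buf, 0)
    if version != EXPECTED_VERSION:
        raise ValueError("unexpected result buffer version: %d" % version)
    return {"terminal": False, "version": version, "row_count": row_count, "rows_offset": 6}
```

A caller would loop on `GetNextQueryResultBuffer2` until the terminal buffer appears, handing bytes from `rows_offset` onward to the row parser.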
`ReadRawAsync` against the live local Historian for the canonical
`OtOpcUaParityTest_001.Counter` fixture returns parsed
`HistorianSample` rows including `Quality`, `OpcQuality`,
`QualityDetail`, `NumericValue`, `PercentGood`, and `TimestampUtc`.
Test coverage:
- **Without** the integration env vars: 64/64 unit tests pass
(golden-byte coverage of SSPI flag selection, Named Pipe binding
shape, and the row-buffer parser for the captured 570-byte
fixture).
- **With** `HISTORIAN_HOST=localhost` +
`HISTORIAN_TEST_TAG=OtOpcUaParityTest_001.Counter`: 69/69 pass,
including
`HistorianClientIntegrationTests.ReadRawAsync_AgainstLocalHistorian_ReturnsAtLeastOneRow`
which exercises the full managed chain end-to-end.
Reverse-engineering for the read path is **complete**. Remaining
follow-up work (not blocked by protocol discovery — only plumbing):
- Aggregate row layouts (`Interpolated`, `TimeWeightedAverage`) and
`ReadAggregateAsync` / `ReadAtTimeAsync` wiring (use the per-mode
`dnlib` row captures already in `docs/reverse-engineering/`).
- `ReadEventsAsync` wiring (`StartEventQuery` request bytes are
already byte-matched; need event row layout + a similar
orchestrator).
- Remote TCP transports (`RemoteTcpIntegrated`,
`RemoteTcpCertificate`).
- Explicit username/password authentication (current orchestrator is
integrated-only).
- `[MessageParameter]` audit on the other contracts ildasm flagged
with parameter-name mismatches: `EnsT`, `EnsT2`, `RTag2`, `ExKey`,
`StJb`, `GtJb` (none on the read path so far).
- Decode the trailing 34 bytes per row (likely string-value
placeholder + aggregate end-timestamp slot).
All of the above landed on `2026-05-04`:
- The SDK now exposes `ReadRawAsync`, `ReadAggregateAsync`,
  `ReadAtTimeAsync`, and `ReadEventsAsync` end-to-end.
- `[MessageParameter]` audits were applied to ~30 parameter-name
  mismatches across `IHistoryServiceContract`,
  `IHistoryServiceContract2`, `IRetrievalServiceContract`,
  `IRetrievalServiceContract3`, and `IRetrievalServiceContract4`.
- `HistorianWcfBindingFactory.CreateBindingPair(options)` now selects
  the right `Hist` + `Retr` binding/endpoint pair for `LocalPipe`,
  `RemoteTcpIntegrated`, and `RemoteTcpCertificate` transports.
- `HistorianSspiClient` has an explicit-creds constructor overload that
  builds `SEC_WINNT_AUTH_IDENTITY`.

**72/72 tests pass with
`HISTORIAN_HOST=localhost` + `HISTORIAN_TEST_TAG=...` set, including
seven live integration tests against the local Historian.**
Event-flow verification surfaced a new evidence target:
`Retr.GetNextEventQueryResultBuffer` returns native error type=4
code=85 (`0x55`), a server response not seen before. It is
likely caused by the missing `RegisterTags2(CM_EVENT)` prerequisite
that the native wrapper's `CreateDefaultEventTag` performs before any
event read. The orchestrator treats the 5-byte type=4 buffer as a
soft terminal so the chain doesn't throw; `LastErrorBufferDescription`
surfaces the full code for diagnostics.
Open items (each isolated, no protocol discovery required):
1. **Event default-tag registration (CM_EVENT prerequisite): partially
   decoded, full chain incomplete.** Built `instrument-wcf-writemessage`
   IL-rewrite tooling that hooks `aahMDASEncoder.ClientMessageEncoder.WriteMessage`
   (token `0x06005E65`, MDAS encoder layer) to capture every outgoing
   WCF body via the existing CaptureLogger pattern. The captured event
   scenario flow has **27 outgoing WCF calls** between session startup
   and the first event row:

   | # | Action | Notes |
   |---|---|---|
   | 0 | Hist/GetV | version probe |
   | 1-2 | Hist/GetI | get-info |
   | 3-4 | Hist/ValCl ×2 | auth (handle = ValCl context key GUID) |
   | 5 | Hist/Open2 | 1472-byte v6 buffer (we replicate this) |
   | 6-7 | unknown 105-byte | session setup |
   | **8-9** | **unknown 211-byte** | **first appearance of session GUID `6D332FCD-…` (later used as EnsT2 handle)** |
   | 10 | Hist/UpdC3 | status update; uses 6D332FCD |
   | 11-16 | unknown 183/185/188/192-byte | more setup |
   | 17 | Hist/RTag2 | uses 6D332FCD |
   | 18 | unknown 184-byte | |
   | 19 | Trx/GetV | transaction service version probe |
   | 20 | unknown 105-byte | |
   | 21 | Retr/GetV | retrieval version probe |
   | **22** | **Hist/EnsT2** | **CTagMetadata(CM_EVENT); uses 6D332FCD** |
   | 23 | Retr/StartEventQuery | succeeds when 22 succeeds |
   | 24 | Retr/GetNextEventQueryResultBuffer | returns row buffer |
   | 25 | Retr/EndEventQuery | terminal |
   | 26 | Hist/Close2 | session close |

   **CTagMetadata payload is now byte-for-byte verified.** The captured
   83-byte CM_EVENT payload from record 22 matches our SDK
   `HistorianAddTagsProtocol.SerializeCmEventCTagMetadata` exactly
   when the captured FILETIME is substituted in (verified via
   reflection unit dump: 83/83 bytes match). Layout corrections from
   the wire capture vs. the previously documented format:

   - Action URI is `aa/Hist/EnsT2`, NOT `aa/Hist/AddT`.
   - 7-byte storage block ends with `0x01`, not `0x00`.
   - Layout is `flags(7) + uint(0) + FILETIME(8) + GUID(16) + tail(5)`,
     NOT `FILETIME + flags + uint(rate) + uint(deadband) + GUID`.
   - Common Archestra event type GUID is
     `5f59ae42-3bb6-4760-91a5-ab0be01f9f02` (NOT `…e01f2f27` as
     previously documented from IL inspection).
   - 5-byte tail `2F 27 01 01 01` (3 unknown bytes + 2 trailing 01s).

   **Live event reads still return zero events** because:

   - Records 6-9 (which establish the session GUID `6D332FCD-…` used
     by every subsequent call) and records 11-16 (six unknown setup
     calls) have NOT been decoded yet.
   - Without those calls, our SDK's EnsT2 uses the storage session id
     from the Open2 response as the handle, but the server expects
     the session GUID established by records 8-9, which it never
     received because we never made those calls. EnsT2 returns false
     and `Retr.GetNextEventQueryResultBuffer` returns native code 85.
   - The SDK's EnsT2 attempt is wrapped in try/catch and surfaces the
     return code via `HistorianWcfEventOrchestrator.LastAddReturnCode`
     for diagnostics; the chain doesn't throw.

   Concrete remaining work for live event reads:

   - Identify and decode records 6-9 from
     `artifacts/reverse-engineering/instrumented-wcf-writemessage/writemessage-capture-event-latest.ndjson`.
     The action URI of each will be visible as ASCII in the body
     (e.g. `aa/Hist/Foo`). For each, decode the request body shape and
     identify which call returns the session GUID `6D332FCD-…` that
     subsequent calls use as their handle.
   - Implement those calls in the orchestrator before EnsT2.
   - Do the same for records 11-16 (unknown 183/185/188/192-byte calls).
   - Then re-test: EnsT2 should return true and events should flow.
   - Once events flow, capture the `GetNextEventQueryResultBuffer`
     response bytes (this also requires instrumenting `ReadMessage`,
     symmetric to `WriteMessage`) and write the event-row parser.

   The IL-rewrite tooling (`tools/AVEVA.Historian.ReverseEngineering`
   `instrument-wcf-writemessage` command) and corresponding
   `LogByteArraySegment` helper in `CaptureLogger` are now in place
   for any future capture work. Reproduce a fresh capture with:

   ```powershell
   dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- instrument-wcf-writemessage
   # Then stage the modified DLL into a current-copy dir alongside
   # AVEVA.Historian.ReverseInstrumentation.dll, set AVEVA_HISTORIAN_RE_CAPTURE,
   # and run the native trace harness with --current-dir <copy> --managed-dll-path <copy>/aahClientManaged.dll
   ```
2. Capture a `Wcf.GetNextEventQueryResultBuffer.ResultBytes` fixture
(only possible AFTER the registration step above succeeds and
rows actually flow), then write a parser using the same approach
as `TryParseGetNextQueryResultBufferRows`.
3. Verify `RemoteTcpIntegrated` and `RemoteTcpCertificate` against
an actual remote Historian.
4. Verify explicit-creds path with a non-current user account.
5. Add `RetrievalMode` → `QueryType` mappings for the modes beyond
`Full` / `Interpolated` / `TimeWeightedAverage` / `Cyclic`.
6. Decode the trailing ~24 bytes of each row body (vary across rows
for the same tag — likely per-sample value/source/state metadata).
Diagnostic helper: `EventChainDiagnosticTests.EventOrchestrator_DiagnosticDump_AgainstLocalHistorian`
calls the orchestrator directly via `InternalsVisibleTo` and prints
`LastResultBufferLength` + `LastErrorBufferDescription`. Useful when
iterating on the registration step. Run with:
```powershell
$env:HISTORIAN_HOST = 'localhost'
dotnet test .\Histsdk.slnx --no-build --logger "console;verbosity=detailed" --filter "FullyQualifiedName~EventOrchestrator_DiagnosticDump"
```
SQL ground-truth check for events (verified against the live
Historian on `2026-05-04`):
```powershell
sqlcmd -E -S . -d Runtime -W -Q "SELECT TOP 10 EventTimeUtc, Type, Source_Object FROM Events WHERE EventTimeUtc > DATEADD(DAY, -7, GETUTCDATE()) ORDER BY EventTimeUtc"
```
Returns event rows like `System.OffScan`, `System.Stop`, `Alarm.Set`
that the managed `ReadEventsAsync` should also surface once the
registration step is wired.
If runtime confirmation is later required (e.g., to capture the actual
NRE stack frame), pick exactly one of these escalation paths and do not
retry plain elevated Frida:
1. **SYSTEM-token injection (requires explicit user consent — spawns a
SYSTEM shell).** Whether or not this clears
`ProcessNotRespondingError` is uncertain (the bottleneck looks like
the agent RPC handshake, not the caller token). Cheapest test, but
ETW already answered the immediate question.
```powershell
PsExec64.exe -accepteula -s -i frida -p <aahClientAccessPoint-pid> -l .\scripts\frida\aahclientaccesspoint-valcl-context.js -o .\artifacts\reverse-engineering\valcl-context-system.ndjson
```
2. **Signed Detours/EasyHook DLL.** Slowest path, but does not depend on
Frida's bootstrap handshake completing.
3. **WinDbg non-invasive attach (`windbg -p <pid> -pv`).** Useful for
one-shot stack/handle inspection rather than live hook coverage, and
it confirms whether the process responds to a debugger at all.
To rerun the ETW capture (no service touch, only ETW providers and the
existing harness/probe binaries):
```powershell
$artifacts = "$PWD\artifacts\reverse-engineering"; New-Item -ItemType Directory -Force -Path $artifacts | Out-Null
$stamp = Get-Date -Format "yyyyMMdd-HHmmss"
$nativeEtl = Join-Path $artifacts "etw-sspi-nativeread-$stamp.etl"
$managedEtl = Join-Path $artifacts "etw-sspi-managedvalcl-$stamp.etl"
$providers = @(
'{199FE037-2B82-40A9-82AC-E1D46C792B99}', # LsaSrv
'{CC85922F-DB41-11D2-9244-006008269001}', # LSA
'{AC43300D-5FCC-4800-8E99-1BD3F85F0320}', # Microsoft-Windows-NTLM
'{C92CF544-91B3-4DC0-8E11-C580339A0BF8}', # NTLM Security Protocol
'{5BBB6C18-AA45-49B1-A15F-085F7ED0AA90}' # Security: NTLM Authentication
)
function Start-Sspi($name, $etl) {
logman create trace $name -ow -o $etl -p $providers[0] 0xFFFFFFFFFFFFFFFF 0xFF -ets | Out-Null
foreach ($p in $providers[1..($providers.Count-1)]) { logman update trace $name -p $p 0xFFFFFFFFFFFFFFFF 0xFF -ets | Out-Null }
}
Start-Sspi 'histsdk-sspi-nativeread' $nativeEtl
.\tools\AVEVA.Historian.NativeTraceHarness\bin\Debug\net481\AVEVA.Historian.NativeTraceHarness.exe --scenario history --server-name localhost --tcp-port 32568 --tag OtOpcUaParityTest_001.Counter --lookback-minutes 1440 --max-rows 1 --connection-wait-seconds 15 | Out-Null
logman stop 'histsdk-sspi-nativeread' -ets | Out-Null
Start-Sspi 'histsdk-sspi-managedvalcl' $managedEtl
.\tools\AVEVA.Historian.NetFxWcfProbe\bin\Debug\net481\AVEVA.Historian.NetFxWcfProbe.exe --endpoint "net.pipe://localhost/Hist" | Out-Null
logman stop 'histsdk-sspi-managedvalcl' -ets | Out-Null
```
Decode with `Get-WinEvent -Path <etl> -Oldest`, then group by
`ProcessId`. Only `aahClientAccessPoint`'s event count + Id list belongs
in committed docs; ETL files contain SSPI tokens and identity metadata
and stay under `artifacts\reverse-engineering\` (gitignored).
After the chosen path produces server-helper telemetry:
5. Compare native vs managed runs for whether first-round setup helper
`0x0050FFC0` runs, whether lookup helper `0x00517AB0` returns a context,
whether `AcquireCredentialsHandleW` succeeds, whether
`AcceptSecurityContext` is reached, and whether failures occur before or
after native context-map insertion.
6. Update:
- `docs\reverse-engineering\implementation-status.md`
- `docs\reverse-engineering\openconnection3-correlation-latest.json`
7. Re-run:
```powershell
dotnet test .\Histsdk.slnx --no-build --logger "console;verbosity=minimal"
```
8. Run a targeted secret scan after touching auth/capture docs:
```powershell
rg -n "(?i)(password|credential|secret|token|<known-sensitive-host>|<known-sensitive-machine>|<known-sensitive-user>)" docs\reverse-engineering scripts tools
```
Expected scan output includes generic words like `token`, `credential`, and
environment variable names. It must not include real passwords, unsanitized
server names, or customer tag data.
## Primary Reference Docs
Read these first when resuming:
- `docs\reverse-engineering\implementation-status.md`
- `docs\reverse-engineering\wcf-contract-evidence.md`
- `docs\reverse-engineering\managed-wrapper-findings.md`
- `docs\reverse-engineering\openconnection3-correlation-latest.json`
- `docs\reverse-engineering\query-handle-correlation-latest.json`
- `docs\reverse-engineering\cclientcommon-startquery-correlation-latest.json`
- `docs\reverse-engineering\capture-workflow.md`
## Event-flow prereqs (2026-05-04)
`HistorianWcfEventOrchestrator.AddCmEventTagViaAddT` now replays the prerequisite
calls captured via `instrument-wcf-writemessage` against the live native event
read. Before invoking `EnsT2(CM_EVENT)`, the orchestrator now calls:
1. **`UpdC3` (UpdateClientStatus3)** — handle = storage session id (string GUID),
`clientStatusSize=81`, `clientStatus` = `02 01 00…00 1E 00 00 00` (81-byte
blob: 2 leading bytes + 75 zero bytes + uint32 0x1E trailer).
2. **`RTag2` (RegisterTags2)** — handle = same GUID, `ElementCount=1`,
`pInBuff` = `50 67 02 00 01 00 00 00` + 16-byte `CmEventTagId`
(`353b8145-5df0-4d46-a253-871aef49b321`) = 24 bytes total.
3. **`EnsT2` (EnsureTags2)** — unchanged byte-for-byte CTagMetadata payload.
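
The two prerequisite payloads above can be sketched in Python for fixture building. This is a minimal sketch, not the orchestrator's code: the function names are hypothetical, the zero fill is derived from the captured `clientStatusSize` of 81 rather than hard-coded, and the GUID is assumed to be serialized in .NET `Guid.ToByteArray()` order, which the captures described here do not confirm.

```python
import struct
import uuid

CLIENT_STATUS_SIZE = 81  # clientStatusSize observed in the UpdC3 capture
CM_EVENT_TAG_ID = uuid.UUID("353b8145-5df0-4d46-a253-871aef49b321")


def build_client_status() -> bytes:
    """Build the UpdC3 clientStatus blob: 02 01, zero fill, uint32 0x1E."""
    head = bytes([0x02, 0x01])
    trailer = struct.pack("<I", 0x1E)
    # Zero fill is sized so the total matches the captured size.
    padding = bytes(CLIENT_STATUS_SIZE - len(head) - len(trailer))
    return head + padding + trailer


def build_rtag2_in_buff(tag_id: uuid.UUID = CM_EVENT_TAG_ID) -> bytes:
    """Build the RTag2 pInBuff: 8-byte prefix followed by the 16-byte tag id."""
    prefix = bytes.fromhex("5067020001000000")
    # Assumption: little-endian (.NET Guid.ToByteArray) field order.
    return prefix + tag_id.bytes_le


status = build_client_status()
in_buff = build_rtag2_in_buff()
print(len(status), status[:2].hex(), status[-4:].hex())
print(len(in_buff), in_buff[:8].hex())
```

If a capture shows the tag id in big-endian order instead, swapping `bytes_le` for `bytes` is the only change needed.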
Live diagnostic against `localhost`:
| Stage | Result |
|---|---|
| `UpdC3` | success (return = 0) |
| `RTag2` | success (return = 0) |
| `EnsT2` | returns false (likely benign — CM_EVENT exists with same metadata) |
| `StartEventQuery` | success, query handle returned |
| `GetNextEventQueryResultBuffer` | empty result + 5-byte error `04 55 00 00 00` (type=4 code=85) |
The Stat-service queries the native client also issues
(`Stat/GetV`, `Stat/GETHI` for `HistorianVersion`, `Stat/GetSystemParameter`
for `AllowOriginals`, `HistorianPartner`, `HistorianVersion`,
`MaxCyclicStorageTimeout`, `RealTimeWindow`, `FutureTimeThreshold`,
`AllowRenameTags`) appear informational and are skipped.
Decoded native `aa/Retr/StartEventQuery` `pRequestBuff` (63 bytes captured vs
65 bytes our SDK sends) — diff narrowed to the trailing 4 bytes of
`HistorianEventQueryProtocol.CreateNativeEmptyFilterAttempt`. Reverting the
trailer to a `ushort 0` yielded code 46 (validation reject) instead of code 85,
so the uint trailer is structurally correct against this server even though the
captured native bytes appear to use 2 bytes there. Either the server tolerates
both shapes or the metadata-namespace encoding is off; resolution requires a
ReadMessage capture.
`SELECT COUNT(*) FROM Events WHERE EventTimeUtc >= DATEADD(DAY, -7, GETUTCDATE())`
returns 24,773 events for the last 7 days, so code 85 here is not explained by
an empty event table.
## ReadMessage instrumentation + decoded event responses (2026-05-04)
`instrument-wcf-readmessage` CLI command added to
`tools/AVEVA.Historian.ReverseEngineering`. Mirror of
`instrument-wcf-writemessage`; targets
`aahMDASEncoder.ClientMessageEncoder.ReadMessage(ArraySegment<byte>, BufferManager, string)`
(token `0x06005E63`). Injects at method entry (IL_0000) capturing `arg.1`
(the incoming `ArraySegment<byte>`) so both the compressed
(post-`DecompressBuffer` V_1) and uncompressed (direct `arg.1` at IL_009C)
paths are recorded.
Capture obtained (28 records;
`artifacts/reverse-engineering/instrumented-wcf-readmessage/readmessage-capture-event-latest.ndjson`,
gitignored). Key responses:
| Record | Response | Length | Decoded |
|---|---|---|---|
| 5 | `Open2Response` | 1586 | encoded user identity + session state — must not commit |
| 18 | `StartEventQueryResponse` | 299 | `responseSize=1`, `pResponseBuff=nil`, `queryHandle=0x3E (=62)`, `errSize=1`, `err=nil` |
| 23 | `RTag2Response` | 208 | `outBuff` 24 bytes (echoes input shape), `errorBuffer=nil` |
| 24 | `GetNextEventQueryResultBufferResponse` | 2783 | `resultSize=2506`, `pResultBuff` starts `09 00 02 00 00 00 1E 00 00 00 07 00…Alarm.Set…` |
| 25 | `EnsT2Response` | 229 | **`EnsT2Result=true`**, OutBuff 45 bytes echoing `CmEventTagId` |
**Critical finding:** native `EnsT2` returns **true** with a 45-byte `OutBuff`
that echoes `CmEventTagId`. Our SDK's `EnsT2` returns **false**. Since the
request bytes are byte-identical (verified prior pass), the difference is
server-side session state. Between `UpdC3` (record 10) and `RTag2` (record 17)
the native flow issues 7 `Stat/GetSystemParameter` queries
(`AllowOriginals`, `HistorianPartner`, `HistorianVersion`,
`MaxCyclicStorageTimeout`, `RealTimeWindow`, `FutureTimeThreshold`,
`AllowRenameTags`) plus 2 `Stat/GETHI` for `HistorianVersion`. These were
previously assumed informational; the EnsT2 false vs true differential
suggests at least one of them primes the session for tag operations.
**Event-row wire shape** (from record 24 `pResultBuff`):
```text
UInt16 version = 9
UInt32 rowCount
N rows, each:
UInt32 rowMarker = 0x1E
UInt16 fieldCount = 7
Int64 filetimeUtc
UInt16[fieldCount-1] fieldOffsets // running offsets into the trailing string blob
variable-length UTF-16 strings (Alarm.Set, …)
```
The 2506-byte fixture contains exactly 2 event rows (matches `--max-rows 2`
passed to the harness). Once the EnsT2-priming gap is closed, this layout
plugs directly into `HistorianWcfEventOrchestrator.RunEventQuery`.
Reproduce with:
```powershell
$captureDir = "artifacts\reverse-engineering\instrumented-wcf-readmessage"
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- `
instrument-wcf-readmessage current\aahClientManaged.dll "$captureDir\aahClientManaged.dll"
Copy-Item -Force "$captureDir\aahClientManaged.dll" "$captureDir\current-copy\aahClientManaged.dll"
$env:AVEVA_HISTORIAN_RE_CAPTURE = (Resolve-Path $captureDir).Path + "\readmessage-capture-event-latest.ndjson"
dotnet run --no-build --project tools\AVEVA.Historian.NativeTraceHarness -- `
--scenario event --tag CM_EVENT --lookback-minutes 1440 --max-rows 2 `
--current-dir (Resolve-Path "$captureDir\current-copy").Path `
--managed-dll-path (Resolve-Path "$captureDir\current-copy\aahClientManaged.dll").Path
python scripts\decode-readmessage-capture.py
```
## Stat-priming + event-row parser landed (2026-05-04)
`HistorianWcfEventOrchestrator.AddCmEventTagViaAddT` now replays the Stat-service
priming sequence captured from native:
1. `Stat/GetV` ×2 (records 6, 7)
2. `Stat/GETHI(HistorianVersion)` ×2 (records 8, 9) — builds the 39-byte
`pRequestBuff` via `BuildGetHistorianInfoRequest("HistorianVersion")`
3. `Hist/UpdC3` (record 10)
4. `Stat/GetSystemParameter` ×6 for `AllowOriginals`, `HistorianPartner`,
`HistorianVersion`, `MaxCyclicStorageTimeout`, `RealTimeWindow`,
`FutureTimeThreshold` (records 11-16)
5. `Hist/RTag2(CmEventTagId)` (record 17)
6. `Stat/GetSystemParameter("AllowRenameTags")` (record 18)
7. `Stat/GetV` (record 20)
8. `Hist/EnsT2(CTagMetadata)` (record 22)
Each Stat call is wrapped in best-effort `TryRun(...)` so individual rejections
don't abort the chain. Also fixed:
- `IStatusServiceContract2.GetHistorianInfo` parameter naming —
`[MessageParameter(Name = "pRequestBuff")]` and `Name = "pResponseBuff"`
attributes added to match the wire (default would have been `<requestBuffer>`
and the server would have ignored the body).
- Event-flow `ConnectionMode` switched from `0x501` to `0x402` — decoded from
the native Open2 request bytes (writemessage record 5 offset `0x26`). The
previous `0x501` was an unverified guess; native uses the same `0x402`
read-only mode for both data and event scenarios.
**Diagnostic against `localhost`:**
| Stage | Result |
|---|---|
| `UpdC3` | success (return = 0) |
| `RTag2` | success (return = 0) |
| `EnsT2` | still returns false |
| `GetNextEventQueryResultBuffer` | type=4 code=85 |
EnsT2 still doesn't match native (which returns `true` with a 45-byte OutBuff).
Hypothesis under investigation: the `StorageSessionId` extracted at Open2
response offset 5-20 is the v3 layout; the v6 response (1345 bytes payload,
contains user identity) likely has the session GUID at a different offset.
Tested bytes 1-16 — UpdC3+RTag2 then both fail (return 1), so 5-20 is the
acceptable handle for those ops. The right offset for EnsT2 may be elsewhere
in the response. **The Open2 v6 response decode requires bytes-level inspection
of identity-bearing data (kept under `artifacts/`, never committed) — see
record 5 of `instrumented-wcf-readmessage/readmessage-capture-event-latest.ndjson`.**
### Event-row parser
`Wcf/HistorianEventRowProtocol.Parse(ReadOnlySpan<byte>)` parses the
version-9 row buffer:
```text
UInt16 version = 9
UInt32 rowCount
N rows, each:
UInt32 rowMarker = 0x1E
UInt16 rowFormat = 7
Int64 filetimeUtc (event time)
UInt16 × 8 fieldOffsets (opaque — purpose not fully decoded)
Property bag (sequence of name=value pairs; first name is the event type)
```
The parser extracts `EventTimeUtc` and `Type` (the first compact-ASCII-string
in the property bag) for each row, and seeks forward to the next row by
scanning for the next `1E 00 00 00 07 00` marker. Property-bag value
encoding is partially decoded (compact ASCII `09 LEN 00 …`, UTF-16 strings
`43 UInt32 LEN × UInt16`, integers with markers in the `0x88` to `0x8B` range,
8-byte FILETIMEs) but **value parsing is intentionally not implemented yet**
— it requires more reverse-engineering and would need sanitized fixtures.
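
The marker-scan approach described above can be sketched in Python against a synthetic buffer. This is an illustration under stated assumptions, not the `HistorianEventRowProtocol.Parse` implementation: the function names are hypothetical, the event type is assumed to sit directly after the eight `UInt16` values, and the handling of short buffers and missing markers is a choice made for the sketch.

```python
import struct
from datetime import datetime, timedelta, timezone

ROW_MARKER = bytes.fromhex("1e0000000700")  # UInt32 0x1E + UInt16 7
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(filetime: int) -> datetime:
    """Convert 100 ns ticks since 1601-01-01 UTC to an aware datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def read_compact_ascii(buf: bytes, pos: int):
    """Read a compact ASCII string (09 LEN 00 <bytes>); return (text, next_pos)."""
    if buf[pos] != 0x09:
        raise ValueError(f"expected compact ASCII marker at offset {pos}")
    length = buf[pos + 1]
    start = pos + 3  # skip marker, length, and the 00 byte
    return buf[start:start + length].decode("ascii"), start + length


def parse_event_rows(buf: bytes):
    """Return (event_time_utc, event_type) for each row found by marker scan."""
    if len(buf) < 6:
        return []  # sketch choice: treat a short buffer as "no rows"
    version, row_count = struct.unpack_from("<HI", buf, 0)
    if version != 9:
        raise ValueError(f"unsupported row buffer version {version}")
    rows, pos = [], 6
    for _ in range(row_count):
        pos = buf.find(ROW_MARKER, pos)
        if pos < 0:
            break  # fewer markers than rowCount; stop quietly
        (filetime,) = struct.unpack_from("<q", buf, pos + 6)
        # Skip marker (6) + FILETIME (8) + eight UInt16 values (16).
        event_type, _ = read_compact_ascii(buf, pos + 30)
        rows.append((filetime_to_utc(filetime), event_type))
        pos += len(ROW_MARKER)
    return rows


def _synthetic_row(filetime: int, event_type: str) -> bytes:
    name = event_type.encode("ascii")
    return (ROW_MARKER + struct.pack("<q", filetime) + bytes(16)
            + bytes([0x09, len(name), 0x00]) + name)


sample = (struct.pack("<HI", 9, 2)
          + _synthetic_row(133_000_000_000_000_000, "Alarm.Set")
          + _synthetic_row(133_000_000_010_000_000, "System.Stop"))
for event_time, event_type in parse_event_rows(sample):
    print(event_time.isoformat(), event_type)
```

Scanning for the marker is tolerant of undecoded property-bag bytes, but it would misfire if a value happened to contain the same six bytes, so a length-driven walk is preferable once the value encoding is fully known.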
5 unit tests in `HistorianEventRowProtocolTests.cs` cover empty buffer,
zero-row, wrong-version, two-row synthetic, and missing-marker. Test count
went from 73 to 78. The orchestrator's `RunEventQuery` now calls the parser
on each non-empty `resultBuffer`, so events will flow with timestamps + types
once the EnsT2-priming gap is closed.
## Open2 v6 response decoded + live events working (2026-05-04)
A combined Read+Write capture under
`artifacts/reverse-engineering/instrumented-wcf-both/` (gitignored) let us
correlate the session GUID used as `handle` in the UpdC3/RTag2/EnsT2 REQUESTS
with its location in the Open2 RESPONSE.
**Open2Response decoded** (~1586 bytes WCF body):
```text
Open2Response wraps three byte[] outputs:
inParameters (echoed ref param — contains user identity; never commit)
outParameters (the session blob)
err (empty on success)
```
`outParameters` payload (42 bytes):
```text
byte 0 protocol version (server returns 3 even when we send Open3 v6 request)
bytes 1-4 UInt32 (purpose unknown — possibly a connect sequence/checksum)
bytes 5-20 16-byte session GUID — used as `handle` for UpdC3/RTag2/EnsT2/Close2
bytes 21-28 Int64 FILETIME (connect time)
bytes 29-36 Int64 FILETIME (server time)
bytes 37-41 5 trailing bytes (status flags?)
```
This matches `HistorianNativeOpen3Output` exactly — our existing offset 5-20
GUID extraction was always correct. The earlier hypothesis about a "v6
response layout" was wrong; the server returns the v3 layout regardless of
the request version.
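
For offline inspection of a captured blob, the 42-byte `outParameters` layout above maps onto a single `struct` format. This is a minimal sketch with hypothetical names; the GUID byte order (.NET `Guid.ToByteArray()` order) is an assumption, and the sample payload is synthetic placeholder data, not a real session.

```python
import struct
import uuid
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

# version, unknown UInt32, 16-byte GUID, two FILETIMEs, 5 trailing bytes
OPEN2_OUT = struct.Struct("<BI16sqq5s")
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


class Open2Output(NamedTuple):
    version: int
    unknown_u32: int
    session_id: uuid.UUID
    connect_time_utc: datetime
    server_time_utc: datetime
    trailing: bytes


def filetime_to_utc(filetime: int) -> datetime:
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_open2_out_parameters(payload: bytes) -> Open2Output:
    """Split the 42-byte session blob into its documented fields."""
    if len(payload) != OPEN2_OUT.size:
        raise ValueError(f"expected {OPEN2_OUT.size} bytes, got {len(payload)}")
    version, unknown, guid_raw, connect_ft, server_ft, trailing = OPEN2_OUT.unpack(payload)
    return Open2Output(
        version,
        unknown,
        uuid.UUID(bytes_le=guid_raw),  # assumption: .NET Guid byte order
        filetime_to_utc(connect_ft),
        filetime_to_utc(server_ft),
        trailing,
    )


# Synthetic placeholder payload; never paste a real captured blob here.
placeholder_id = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
payload = OPEN2_OUT.pack(3, 0, placeholder_id.bytes_le,
                         116_444_736_000_000_000, 116_444_736_010_000_000,
                         bytes(5))
print(parse_open2_out_parameters(payload))
```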
**Real blocker resolved.** Native does three cross-service version probes
between RTag2 and EnsT2 — `Trx/GetV` (record 19), `Stat/GetV` (record 20),
`Retr/GetV` (record 21) — that register the client with each service's
session table. Without them the server rejects EnsT2 (returns false) and
GetNextEventQueryResultBuffer reports type=4 code=85.
`HistorianWcfEventOrchestrator.AddCmEventTagViaAddT` now opens
`ITransactionServiceContract` and `IRetrievalServiceContract4` channels
inside the setup callback (in addition to the existing `IStatusServiceContract2`
channel) and calls `GetInterfaceVersion` on all three between RTag2 and EnsT2.
**Final live-read diagnostic (`localhost`):**
| Stage | Result |
|---|---|
| `UpdC3` | success (return = 0) |
| `RTag2` | success (return = 0) |
| `Trx/GetV`, `Stat/GetV`, `Retr/GetV` | success |
| `EnsT2` | returns false (benign — "CM_EVENT exists with same metadata") |
| `StartEventQuery` | success |
| `GetNextEventQueryResultBuffer` | returns event-row buffer |
| Parser | **`Events observed: 1`** ✅ |
`LastErrorBufferDescription: type=4 code=85` reaches the orchestrator only
on the terminal (no-more-data) call, after the first batch returned an
event. The existing soft-terminal handling (`if errorBuffer[0] == 4 return`)
is correct.
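
The error buffer check can be illustrated with the 5-byte value `04 55 00 00 00` reported by the diagnostics in this document. This is a sketch with hypothetical names; it assumes the layout is one type byte followed by a little-endian `UInt32` code, which is consistent with that single observed value but not otherwise confirmed.

```python
import struct
from typing import NamedTuple, Optional


class ErrorBuffer(NamedTuple):
    kind: int  # first byte; 4 was observed on the terminal call
    code: int  # assumed UInt32 LE; 85 was observed


def parse_error_buffer(buf: bytes) -> Optional[ErrorBuffer]:
    """Decode a 5-byte error buffer, or return None if it is too short."""
    if len(buf) < 5:
        return None
    kind, code = struct.unpack_from("<BI", buf, 0)
    return ErrorBuffer(kind, code)


def is_soft_terminal(buf: bytes) -> bool:
    """Mirror the soft-terminal rule: a leading type byte of 4 ends the loop."""
    return len(buf) > 0 and buf[0] == 4


observed = bytes.fromhex("0455000000")
print(parse_error_buffer(observed), is_soft_terminal(observed))
```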
The full managed event-read chain is reproducible end-to-end from a pure
.NET 10 SDK: GetV → ValCl × N → Open2 → UpdC3 → 6× GetSystemParameter →
RTag2 → GetSystemParameter(AllowRenameTags) → Trx/GetV → Stat/GetV →
Retr/GetV → EnsT2 → StartEventQuery → GetNextEventQueryResultBuffer loop →
EndEventQuery → Close2.
## Property-bag value-type parser landed (2026-05-04)
Decoded the row property-bag wire format. Unified value layout:
```text
typeMarker (UInt8)
length (UInt8 — bytes of value following the status byte)
status (UInt8 — observed 0x00 in successful captures)
value (length × byte, encoding determined by typeMarker)
```
Typemarker dispatch:
| Marker | Type | Value bytes |
|---|---|---|
| `0x02` | Boolean | 1 byte (0/1) |
| `0x10` | GUID | 16 bytes (.NET Guid byte order) |
| `0x18` | FILETIME UTC | Int64 LE |
| `0x31` | Int32 | 4 bytes LE |
| `0x43` | UTF-16 string | UInt16 charCount + (charCount × 2) UTF-16 LE bytes |
Unknown markers preserve the raw `length` value bytes as a `byte[]` in
the property dictionary.
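
The marker dispatch above can be sketched as a small Python decoder. The function name is hypothetical, only the five markers in the table are handled, and everything else falls back to raw bytes as described. How strings longer than a one-byte `length` can express are encoded is not covered by the layout above, so this sketch does not attempt them.

```python
import struct
import uuid
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def decode_value(buf: bytes, pos: int):
    """Decode one typed value at pos; return (value, next_pos)."""
    marker, length = buf[pos], buf[pos + 1]
    # buf[pos + 2] is the status byte (0x00 in the successful captures).
    start = pos + 3
    raw = bytes(buf[start:start + length])
    if len(raw) != length:
        raise ValueError("truncated value")
    if marker == 0x02:    # Boolean
        value = raw[0] != 0
    elif marker == 0x10:  # GUID in .NET Guid byte order
        value = uuid.UUID(bytes_le=raw)
    elif marker == 0x18:  # FILETIME UTC, Int64 LE
        (ticks,) = struct.unpack("<q", raw)
        value = FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
    elif marker == 0x31:  # Int32 LE
        (value,) = struct.unpack("<i", raw)
    elif marker == 0x43:  # UTF-16 string: UInt16 charCount + UTF-16 LE bytes
        (char_count,) = struct.unpack_from("<H", raw, 0)
        value = raw[2:2 + char_count * 2].decode("utf-16-le")
    else:                 # unknown marker: keep the raw value bytes
        value = raw
    return value, start + length


text = "ok".encode("utf-16-le")
sample = (bytes([0x31, 4, 0]) + struct.pack("<i", -7)
          + bytes([0x43, 2 + len(text), 0]) + struct.pack("<H", 2) + text
          + bytes([0x7F, 2, 0, 0xAA, 0xBB]))
pos = 0
while pos < len(sample):
    value, pos = decode_value(sample, pos)
    print(repr(value))
```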
Each row layout (refines the earlier skeleton):
```text
UInt32 rowMarker = 0x1E
UInt16 rowFormat = 7
Int64 eventTimeUtcFiletime
UInt16 × 8 // purpose unclear
compact ASCII string // event type ("Alarm.Set", …)
UInt16 propertyCount
propertyCount × Property {
compact ASCII string // property name
Value (per the typed format above)
}
```
`HistorianEventRowProtocol.Parse` populates `HistorianEvent` fields by
mapping known property names: `alarm_id`→`Id`, `receivedtime`→
`ReceivedTimeUtc`, `source_processvariable`/`source_object`→`SourceName`,
`namespace`/`provider_system`→`Namespace`, `revisionversion`→
`RevisionVersion`. All decoded properties (typed, not raw bytes) are also
exposed via the `Properties` dictionary.
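
The name-to-field mapping can be pictured as an alias table. This is a hypothetical sketch, not the SDK's code: it assumes that when two aliases for the same field are both present, the one listed first above wins, and that property names are matched case-sensitively. Neither behavior is stated in this document.

```python
# Field name -> property names that may populate it, in assumed priority order.
FIELD_ALIASES = {
    "Id": ("alarm_id",),
    "ReceivedTimeUtc": ("receivedtime",),
    "SourceName": ("source_processvariable", "source_object"),
    "Namespace": ("namespace", "provider_system"),
    "RevisionVersion": ("revisionversion",),
}


def map_known_fields(properties: dict) -> dict:
    """Pick the first matching alias for each known field."""
    fields = {}
    for field, aliases in FIELD_ALIASES.items():
        for name in aliases:
            if name in properties:
                fields[field] = properties[name]
                break
    return fields


# Synthetic placeholder values only.
example = {"alarm_id": "placeholder-id", "source_object": "Pump", "extra": 1}
print(map_known_fields(example))
```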
**Live verification (`localhost`):** `Events observed: 1`,
`Properties.Count: 31`, `Has alarm_id: True`, `EventTimeUtc` and
`ReceivedTimeUtc` decoded as plausible timestamps.
Tests: 78 → 80. Added `Parse_RowWithKnownProperties_PopulatesEventFields`
(verifies all known-name → HistorianEvent-field mappings using synthetic
placeholder values) and `Parse_UnknownTypeMarker_KeepsRawBytesInPropertyBag`
(verifies the unknown-type fallback).
The fully managed event read is now end-to-end: chain auth → Stat priming →
EnsT2 → StartEventQuery → row buffer → typed event with property dictionary.
## Safety Notes
- Keep raw captures and identity-bearing logs under `artifacts\reverse-engineering`.
- Do not commit credentials, hostnames, user names, customer tags, or raw packet
captures.
- Prefer sanitized JSON and Markdown summaries under `docs\reverse-engineering`.
- Production code under `src\AVEVA.Historian.Client` must remain pure managed
.NET 10.
- Reverse-engineering harnesses may reference native AVEVA binaries only for
analysis and parity comparison.