feat(grpc-events): handle-capture cycle — event-row gap proven NOT a client payload issue

Extends the instrument-grpc rewrite to log string (strHandle) + uint (uiHandle /
queryRequestType) params, not just byte[], and captures our SDK's live v8
openParameters for a byte-diff against the native.

Result of the exhaustive comparison (all live-confirmed via the opt-in
EventReadDiagnostic test):
- StartEventQuery request: byte-identical to the native (v6 layout)
- v8 OpenConnection openParameters: byte-identical to the native (302B) once
  ClientNodeName matches — every control byte/ConnectionType/token/ShardId
- handle usage identical: ExchangeKey->contextKey, registration->storage GUID
  (strHandle), query->client uint (uiHandle); handles valid (RTag/EnsT=True)
- queryRequestType=3, registration order, gzip metadata header — all match
- window has events (native returns 50 now); eventCount not it

Every observable client-side byte matches the native, yet the server scopes 0
events to our connection. The event RPCs succeed over our transport and return a
valid EMPTY result (not a transport error), so this is a connection/server-level
difference (session affinity tied to the native Grpc.Core HTTP/2 connection or a
connection identity used to scope events) — invisible to and unfixable by client
payload matching. Needs server-side insight, not more wire RE.

Added opt-in diagnostics (RegistrationDiag, LastResultBufferHex,
LastEventOpenRequestHex). 326/326 offline; gated test still pins the no-row throw.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
This commit is contained in:
Joseph Doherty
2026-06-23 12:31:33 -04:00
parent 32ae301050
commit 0921e21bdb
5 changed files with 74 additions and 3 deletions
@@ -300,6 +300,39 @@ implement `aahCryptV2` (RC4+MD5+prefix) managed-side, set the v8 token = that, a
(non-destructive). The offline correlation data (one run's derived key + token + openParameters) is (non-destructive). The offline correlation data (one run's derived key + token + openParameters) is
captured under `artifacts/.../` to validate the managed reproduction before going live. captured under `artifacts/.../` to validate the managed reproduction before going live.
### Token implemented + auth WORKS live (2026-06-23); row retrieval still 0 — proven NOT a payload issue
`token = RC4(password-UTF16LE, key = MD5(SHA256(ECDH secret)))` was implemented in pure managed C#
(`HistorianNativeHandshake.BuildExchangeKeyCredentialToken` + `Rc4`; client key via
`DeriveKeyFromHash(SHA256)`), golden-tested (RC4 standard vector + token construction), and
**live-verified**: the v8 `OpenConnection` now **authenticates** against the 2023 R2 server (past the
`132/171 AuthenticationFailed` wall). Auth is solved.
The event **query** still returns `version-11 rowCount-0` while the native returns 50 for an
**identical** request. Exhaustively ruled out as the cause (all confirmed live, opt-in
`EventReadDiagnostic` test + the IL rewrite extended to log string/uint handle fields):
- `StartEventQuery` request: **byte-identical** to the native (v6 layout)
- v8 `OpenConnection` `openParameters`: **byte-identical** to the native (302 bytes) once ClientNodeName
is matched — every control byte, ConnectionType, token framing, ShardId, etc.
- Handle usage: identical — `ExchangeKey`→contextKey, registration→storage-session GUID (`strHandle`),
query→client uint (`uiHandle`); our parsed handles are valid (registration `RTag/EnsT=True`, valid
`queryHandle`)
- `queryRequestType = 3`, registration sequence/order, gzip metadata header — all match
- window (events exist; native returns 50 *now*), eventCount — not it
So **every observable client-side byte matches the native**, yet the server scopes 0 events to our
connection. The event RPCs succeed over our transport and return a valid *empty* result (not a
transport error), so it is **not a payload or transport-incompatibility issue** — it is a
connection/server-level difference (e.g. session affinity tied to the native `Grpc.Core` HTTP/2
connection or a connection-identity the server uses to scope events) that is **invisible to, and
unfixable by, client payload matching.** Closing it needs server-side insight or a different angle
(e.g. compare the full HTTP/2 connection setup / TLS identity), not more wire-payload RE.
**Shipped this effort:** the complete ExchangeKey crypto (ECDH + SHA256 + MD5-keyed RC4 token) — the
hard wall — pure managed, golden-tested, auth live-verified. Orchestrator stays on the no-row throw;
gated test unchanged.
**2 of 3 layers cleared** (key exchange + client key); the 3rd (token construction) is localized to a **2 of 3 layers cleared** (key exchange + client key); the 3rd (token construction) is localized to a
specific managed method, pending dnlib extraction. ExchangeKey + the v8 serializer are committed; the specific managed method, pending dnlib extraction. ExchangeKey + the v8 serializer are committed; the
orchestrator stays on v6 (set `eventConnection: true` to re-arm once the token construction lands). The orchestrator stays on v6 (set `eventConnection: true` to re-arm once the token construction lands). The
@@ -306,7 +306,7 @@ internal sealed class HistorianGrpcEventOrchestrator
} }
uint queryHandle = startResponse.UiQueryHandle; uint queryHandle = startResponse.UiQueryHandle;
RegistrationDiag += $"QH={queryHandle} clientH={session.ClientHandle} SEQresp={Convert.ToHexString(startResponse.BtResonse?.ToByteArray() ?? [])}; "; RegistrationDiag += $"QH={queryHandle} clientH={session.ClientHandle} strH={session.StringHandle}; ";
try try
{ {
List<HistorianEvent> events = []; List<HistorianEvent> events = [];
@@ -21,6 +21,9 @@ namespace AVEVA.Historian.Client.Grpc;
/// </summary> /// </summary>
internal static class HistorianGrpcHandshake internal static class HistorianGrpcHandshake
{ {
/// <summary>Diagnostic: hex of the most recent v8 event-connection OpenConnection request.</summary>
internal static string LastEventOpenRequestHex { get; private set; } = string.Empty;
/// <summary> /// <summary>
/// The handles produced by a successful OpenConnection. <see cref="ClientHandle"/> is the /// The handles produced by a successful OpenConnection. <see cref="ClientHandle"/> is the
/// transient <c>uint</c> session token used by StartQuery/GetSystemParameter and the other /// transient <c>uint</c> session token used by StartQuery/GetSystemParameter and the other
@@ -121,6 +124,7 @@ internal static class HistorianGrpcHandshake
byte[] open2Request = eventConnection byte[] open2Request = eventConnection
? HistorianNativeHandshake.BuildEventOpenConnectionVersion8Request(contextKey, options.UserName, eventToken) ? HistorianNativeHandshake.BuildEventOpenConnectionVersion8Request(contextKey, options.UserName, eventToken)
: HistorianNativeHandshake.BuildOpenConnection3Request(options.Host, contextKey, connectionMode); : HistorianNativeHandshake.BuildOpenConnection3Request(options.Host, contextKey, connectionMode);
if (eventConnection) { LastEventOpenRequestHex = Convert.ToHexString(open2Request); }
GrpcHistory.OpenConnectionResponse open2 = historyClient.OpenConnection( GrpcHistory.OpenConnectionResponse open2 = historyClient.OpenConnection(
new GrpcHistory.OpenConnectionRequest { BtConnectionRequest = ByteString.CopyFrom(open2Request) }, new GrpcHistory.OpenConnectionRequest { BtConnectionRequest = ByteString.CopyFrom(open2Request) },
@@ -547,7 +547,8 @@ public sealed class HistorianGrpcIntegrationTests
throw new Xunit.Sdk.XunitException( throw new Xunit.Sdk.XunitException(
$"[DIAG] outcome={outcome} | events={events.Count} | LastResultLen={orch.LastResultBufferLength} " + $"[DIAG] outcome={outcome} | events={events.Count} | LastResultLen={orch.LastResultBufferLength} " +
$"| ResultHex={orch.LastResultBufferHex} | ErrHex={orch.LastErrorBufferHex} | Reg=[{orch.RegistrationDiag}]"); $"| ResultHex={orch.LastResultBufferHex} | Reg=[{orch.RegistrationDiag}] " +
$"| v8open={AVEVA.Historian.Client.Grpc.HistorianGrpcHandshake.LastEventOpenRequestHex}");
} }
[Fact] [Fact]
@@ -677,7 +678,8 @@ public sealed class HistorianGrpcIntegrationTests
IntegratedSecurity = !explicitCreds, IntegratedSecurity = !explicitCreds,
UserName = user ?? string.Empty, UserName = user ?? string.Empty,
Password = password ?? string.Empty, Password = password ?? string.Empty,
RequestTimeout = timeout RequestTimeout = timeout,
Compression = true // the stock client always advertises grpc gzip request encoding
}; };
} }
} }
@@ -1387,10 +1387,14 @@ static int InstrumentGrpcNonStream(string[] args)
ModuleDefMD module = ModuleDefMD.Load(sourcePath); ModuleDefMD module = ModuleDefMD.Load(sourcePath);
MemberRefUser logByteArray = CreateLogByteArrayRef(module); MemberRefUser logByteArray = CreateLogByteArrayRef(module);
MemberRefUser logString = CreateLogStringRef(module);
MemberRefUser logUInt32 = CreateLogUInt32Ref(module);
// Cast a wide net: instrument EVERY byte[]-input method on every Grpc*Client type, so whichever // Cast a wide net: instrument EVERY byte[]-input method on every Grpc*Client type, so whichever
// path the native non-streamed write actually drives (History/Transaction RegisterTags + // path the native non-streamed write actually drives (History/Transaction RegisterTags +
// AddNonStreamValues, or a Storage-service route) is captured. Phase = "<Type>.<Method>.<param>". // AddNonStreamValues, or a Storage-service route) is captured. Phase = "<Type>.<Method>.<param>".
// Also captures string (strHandle) and uint (uiHandle / queryRequestType) inputs so the event-read
// handle fields are visible, not just byte[] params.
var instrumented = new List<object>(); var instrumented = new List<object>();
foreach (TypeDef type in module.GetTypes() foreach (TypeDef type in module.GetTypes()
.Where(t => t.Name.String.StartsWith("Grpc", StringComparison.Ordinal) && t.Name.String.EndsWith("Client", StringComparison.Ordinal))) .Where(t => t.Name.String.StartsWith("Grpc", StringComparison.Ordinal) && t.Name.String.EndsWith("Client", StringComparison.Ordinal)))
@@ -1431,6 +1435,34 @@ static int InstrumentGrpcNonStream(string[] args)
}); });
} }
// ENTRY: log string (strHandle) and uint (uiHandle / queryRequestType / count) inputs.
foreach (dnlib.DotNet.Parameter scalarParam in method.Parameters
.Where(p => !p.IsHiddenThisParameter && (p.Type.FullName == "System.String" || p.Type.FullName == "System.UInt32"))
.ToArray())
{
bool isString = scalarParam.Type.FullName == "System.String";
string phase = $"{type.Name}.{method.Name}.{scalarParam.Name}.{(isString ? "str" : "u32")}";
Instruction[] injected =
[
Instruction.Create(OpCodes.Ldstr, phase),
Instruction.Create(OpCodes.Ldarg, scalarParam),
Instruction.Create(OpCodes.Call, isString ? logString : logUInt32),
];
foreach (Instruction instruction in injected.Reverse())
{
method.Body.Instructions.Insert(0, instruction);
}
method.Body.MaxStack = (ushort)Math.Max((int)method.Body.MaxStack, 8);
instrumented.Add(new
{
Type = type.Name.String,
Method = method.Name.String,
Phase = phase,
Direction = "in",
Token = "0x" + method.MDToken.Raw.ToString("X8"),
});
}
// EXIT: log out/ref byte[] responses ("System.Byte[]&") before each ret. ldarg loads the // EXIT: log out/ref byte[] responses ("System.Byte[]&") before each ret. ldarg loads the
// managed pointer; ldind.ref dereferences it to the byte[]. (RPC wrappers set the out // managed pointer; ldind.ref dereferences it to the byte[]. (RPC wrappers set the out
// param right before a single ret, so branch-to-ret skew is not a concern here.) // param right before a single ret, so branch-to-ret skew is not a concern here.)