Commit Graph

78 Commits

Author SHA1 Message Date
Joseph Doherty 8199dde452 gRPC events: capture decrypted HTTP/1.1 frames, native vs ours; topology difference found, tested, null result
Pursued hypothesis #3 (connection-frame capture). Built a TLS-terminating tee proxy
(artifacts/.../httpcap/, gitignored: self-signed server cert, forwards through the
loopback tunnel, logs decrypted HTTP/1.1 + gRPC-Web both directions) and ran a native
capture-event (returns 50 rows) and our SDK diagnostic (0 rows) through the SAME
proxy/upstream for a clean A/B.

Findings:
- The stock client is gRPC-Web/HTTP-1.1 (alpn empty), and clientCert=none on every
  connection — confirming (with the decompile) that hypothesis #2 (TLS client cert) is
  moot: the native presents no client cert.
- Connection topology differs: the native opens 5 TLS connections, one per service, and
  runs the event query (StartEventQuery/GetNext/EndEventQuery) on a DEDICATED
  RetrievalService connection, separate from the HistoryService connection that opened and
  registered the session. Our SDK collapses every service onto one connection. (Matches the
  decompile: the stock client has a separate GrpcClientBase per service.)
- Framing differs benignly: native uses content-length + Expect: 100-continue; SDK uses
  transfer-encoding: chunked. The server accepts both (StartEventQuery returns a valid
  handle), so framing is not the gate. No hidden header on either side.

Tested the topology hypothesis with a new env-gated switch
(HISTORIAN_GRPC_EVENT_SPLIT_CHANNEL=1): run StartEventQuery/GetNext/EndEventQuery on a
dedicated RetrievalService connection (no re-handshake, reusing the session handle —
mirroring native conn4), registration staying on the main connection. Result: still
0B00000000001E000000 (0 rows), QH=1063. Splitting the event query onto its own connection
does not make rows flow — the server correlates by session handle, not connection, so
topology is not the row-scoping gate.

Every angle is now exhausted (payload, transport, metadata/interceptor/cert, topology,
data store). The gate is a server-internal per-connection retrieval working-set in the
native HistorianClient C++ core, unreachable from a pure-managed client. Conclusion
unchanged: auth-solved / retrieval-server-gated; ReadEventsAsync over gRPC keeps the
no-row throw, event reads use WCF. 56 offline gRPC/event tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 13:43:21 -04:00
Joseph Doherty 6cf4dd13fe gRPC events: decompile the stock managed client — confirms no hidden client-side difference
Closed the blind spot the zero-rows conclusion had leaned on: prior cycles used wire
capture (instrument-grpc-nonstream hooks only byte[] params), blind to gRPC metadata,
interceptors, channel options. Read the stock managed source directly
(histsdk-2023r2-analysis/decompiled/Archestra.Historian.GrpcClient + HistorianAccess;
the pure-managed assemblies decompile cleanly though mixed-mode aahClientManaged crashes
ILSpy).

Findings:
- GrpcClientBase.InitializeBase uses GrpcWebHandler (GrpcWebMode, HttpVersion 1.1) — the
  stock client speaks gRPC-Web over HTTP/1.1, the SAME transport as our SDK. This corrects
  the premise of hypothesis #1: there was never a native Grpc.Core HTTP/2 path to differ
  from; the stock client returning 50 rows is itself gRPC-Web. The HTTP/2 disproof's
  conclusion stands and is reinforced (identical transport on both sides).
- m_metadata on every RPC (incl. StartEventQuery/GetNextEventQueryResultBuffer) is only
  grpc-internal-encoding-request: gzip — exactly our header set. The ClientInterceptor is
  a no-op (empty LogCall). So the "invisible per-connection metadata/header" blind spot is
  confirmed empty — no hidden client-side identity the byte[] capture missed.
- CreateEventQuery/StartQuery/MoveNext are not in managed code; the managed
  GrpcRetrievalClient.StartEventQuery is a thin one-RPC stub. The query logic lives in the
  native C++/CLI HistorianClient core — consistent with the working-set being native/server-side.

Every client-controllable layer is now confirmed identical by reading the stock source,
not just by wire match: request bytes, transport, channel options, gRPC metadata,
interceptor. The remaining difference is below the managed surface / server-side.
Conclusion unchanged: gRPC event-row retrieval is auth-solved / retrieval-server-gated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 13:20:39 -04:00
Joseph Doherty f19eb3b821 gRPC events: answer hypothesis #4 (SQL ground truth) — event store is global, not connection-scoped
Pursued the server-side SQL angle for the gRPC event zero-rows. Built a read-only
SOCKS5-relay + Microsoft.Data.SqlClient probe (gitignored, artifacts/.../sqlschema/)
and dumped the live Runtime event schema. Findings:

- No per-connection / per-client / per-session column exists anywhere in the event
  store. The only scoping-like columns on Events/EventHistory/snapshots are event
  content (Source_* origin, User_* acker, Provider_NodeName, SourceServer replication).
- The rich Events view is not a relational table — it is served live by the Historian
  engine via the INSQL OLE DB provider (linked servers INSQL/INSQLD; encrypted remote
  view). The SQL EventHistory base table holds only 168 rows / 1 tag.
- Decisive: for the SAME -90d..now window, the gRPC StartEventQuery diagnostic returned
  0 rows while the Events view via INSQL returned 71,332 events (most recent Alarm.Set firing
  seconds before the probe). Same engine, same window — INSQL serves the data, gRPC
  withholds it from our connection.

So there is nothing in the data to scope by: the zero-row gate is the gRPC
RetrievalService's per-connection in-process execution state, not data scoping or
transport (the same class of wall as DeleteTagExtendedProperties). Combined with the
transport disproof, three independent angles are now exhausted — client payload
(byte-identical), transport (HTTP/2 == gRPC-Web), and data store (global, unscoped).
gRPC event-row retrieval stands documented as auth-solved / retrieval-server-gated;
ReadEventsAsync over gRPC keeps the no-row throw and event reads use WCF.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 13:05:19 -04:00
Joseph Doherty b0388e7a40 gRPC events: disprove transport hypothesis (native HTTP/2 also returns zero rows)
Tested grpc-event-query-capture.md's leading next-session hypothesis — that the
native client's Grpc.Core HTTP/2 transport (vs our Grpc.Net.Client + GrpcWebHandler
gRPC-Web) is why event reads return zero rows. Added HistorianGrpcChannelFactory
.CreateHttp2 (plain HTTP/2 over SocketsHttpHandler, no gRPC-Web wrap) and an
HISTORIAN_GRPC_EVENT_HTTP2 switch on the event orchestrator (event path only; reads
stay gRPC-Web).

Live side-by-side against the event-bearing 2023 R2 server, everything else held
constant: the full v8 chain (ExchangeKey auth, CM_EVENT RegisterTags/EnsureTags=True,
StartEventQuery with a valid handle) runs end-to-end over BOTH native HTTP/2 and
gRPC-Web, and the server returns the byte-identical version-11 rowCount-0 terminal
(0B00000000001E000000) on both transports. Transport choice makes no difference —
the leading hypothesis is disproven and the zero-row scoping sits above the gRPC
transport layer.

Also confirmed the native capture-event harness queries a 30-day historical window
(returns 50 rows), so the native read is connection-scoped historical retrieval,
not a live subscription.

CreateHttp2 + the env switch + the EventChannelMode diagnostic are retained for
further connection-level probing. 44 offline tests pass; orchestrator stays on the
no-row throw.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 12:55:09 -04:00
Joseph Doherty 88287a8c66 docs(grpc-events): document the server-side/connection angle for next session
Records the row-retrieval pickup now that the v8 ExchangeKey auth is solved and the
gap is proven connection-level (not client payload):
- grpc-event-query-capture.md: a "NEXT SESSION — the server-side / connection angle"
  section — what's already proven (don't redo), the in-place tooling, and ordered,
  testable hypotheses (HTTP/2 vs gRPC-Web transport [leading], TLS client cert,
  HTTP/2 frame capture, SQL event-store scoping).
- handoff.md item 1: updated to "v8 auth solved; row retrieval connection-gated",
  pointing at the NEXT SESSION section; the "to move any item" summary updated.

Doc-only; sanitization scan clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 12:37:22 -04:00
Joseph Doherty 0921e21bdb feat(grpc-events): handle-capture cycle — event-row gap proven NOT a client payload issue
Extends the instrument-grpc rewrite to log string (strHandle) + uint (uiHandle /
queryRequestType) params, not just byte[], and captures our SDK's live v8
openParameters for a byte-diff against the native.

Result of the exhaustive comparison (all live-confirmed via the opt-in
EventReadDiagnostic test):
- StartEventQuery request: byte-identical to the native (v6 layout)
- v8 OpenConnection openParameters: byte-identical to the native (302B) once
  ClientNodeName matches — every control byte/ConnectionType/token/ShardId
- handle usage identical: ExchangeKey->contextKey, registration->storage GUID
  (strHandle), query->client uint (uiHandle); handles valid (RTag/EnsT=True)
- queryRequestType=3, registration order, gzip metadata header — all match
- the window has events (native returns 50 now), so eventCount is not the cause

Every observable client-side byte matches the native, yet the server scopes 0
events to our connection. The event RPCs succeed over our transport and return a
valid EMPTY result (not a transport error), so this is a connection/server-level
difference (session affinity tied to the native Grpc.Core HTTP/2 connection or a
connection identity used to scope events) — invisible to and unfixable by client
payload matching. Needs server-side insight, not more wire RE.

Added opt-in diagnostics (RegistrationDiag, LastResultBufferHex,
LastEventOpenRequestHex). 326/326 offline; gated test still pins the no-row throw.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 12:31:33 -04:00
Joseph Doherty c45f1a957b docs(grpc-events): token scheme fully RE'd via dnlib — aahCryptV2 (MD5-keyed RC4 + prefix)
Loaded dnlib in PowerShell (ILSpy crashes on the mixed-mode assembly) and scanned
the IL to recover the entire v8 token construction:

- <Module>::CHistoryConnectionGrpc.GetClientKey drives the ECDH: ECDiffieHellmanCng
  {KeyDerivationFunction=Hash, HashAlgorithm=SHA256, KeySize=256} -> ExchangeKey ->
  CngKey.Import(serverPub, EccPublicBlob) -> DeriveKeyMaterial = SHA256(shared secret),
  the 32-byte client key.
- aahClientCommon.CClientBase.ConfigureOpenConnection (the lone GetClientKey caller)
  builds the 26-byte token via HistorianCrypto.NRC4_V2.aahCryptV2 = a custom MD5-keyed
  RC4 stream cipher with a version prefix:
    * body/HashData = MD5 (verified by the round constants 0xd76aa478... + shifts 7/12/17/22)
    * prepare_key = RC4 KSA from a 16-byte MD5 key
    * enc_buffer = MD5 -> key, then rc4encrypt; enc prepends PrefixV2/InnerPrefixV2
      (the constant 0x8e token marker)
  So token = prefix + RC4(plaintext, key=MD5(keyMaterial)), keyMaterial tied to the
  SHA256(ECDH secret) client key. 100% reproducible in pure managed code (RC4+MD5).
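The construction above can be sketched in a few lines. This is a minimal Python illustration of the shape only (prefix + RC4 under an MD5-derived key), not the recovered aahCryptV2 code: the function names, the prefix bytes, the plaintext, and the exact key material are all placeholders, since those bytes have not been read out of ConfigureOpenConnection yet. The helper md5_round_constant shows why 0xd76aa478 identifies MD5.

```python
import hashlib
import math


def md5_round_constant(i: int) -> int:
    """MD5's i-th additive constant: floor(2**32 * |sin(i + 1)|)."""
    return int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key schedule (KSA), then keystream XOR (PRGA)."""
    state = list(range(256))
    j = 0
    for i in range(256):  # KSA: permute the state under the key
        j = (j + state[i] + key[i % len(key)]) & 0xFF
        state[i], state[j] = state[j], state[i]
    out = bytearray()
    i = j = 0
    for byte in data:  # PRGA: one keystream byte per input byte
        i = (i + 1) & 0xFF
        j = (j + state[i]) & 0xFF
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) & 0xFF])
    return bytes(out)


def build_token(prefix: bytes, key_material: bytes, plaintext: bytes) -> bytes:
    """Hypothetical helper: prefix + RC4(plaintext, key=MD5(key_material)).

    The real prefix, key material, and plaintext are assumptions here.
    """
    rc4_key = hashlib.md5(key_material).digest()  # 16-byte RC4 key
    return prefix + rc4(rc4_key, plaintext)
```

RC4 is symmetric, so running rc4 again with the same key recovers the plaintext, which makes a captured native token a direct check once the real key material is known.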

Remaining (next cycle): read ConfigureOpenConnection's exact key/plaintext/prefix bytes,
implement aahCryptV2 managed-side, set the v8 token, live-test. Frida CNG + dnlib are
the RE path; nothing AVEVA is shipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 11:21:55 -04:00
Joseph Doherty b2ac35b98e docs(grpc-events): trace the ExchangeKey token crypto — KDF=SHA256(secret); token construction localized
Frida-hooked Windows CNG (scripts/frida/aahclientmanaged-cng-exchangekey.js) during
a real native ExchangeKey to recover the token derivation:

- The ECDH + KDF are standard CNG driven by managed System.Security.Cryptography
  .ECDiffieHellmanCng: NCryptSecretAgreement (P-256) -> NCryptDeriveKey(KDF=HASH,
  SHA256, 32 bytes). So the derived key = SHA256(ECDH shared secret).
- "ECK1" is the standard CNG magic for a P-256 ECDH public key
  (BCRYPT_ECDH_PUBLIC_P256_MAGIC, as carried in a BCRYPT_ECCPUBLIC_BLOB), confirming our
  BuildExchangeKeyClientHello wire format.
- The 26-byte token (constant 0x8e marker) is a custom construction over the
  derived key: a 528-candidate offline cracker (HMAC/SHA/AES-GCM/CBC/CTR over the
  derived key x request slices x creds) found no match, and it matches none of the
  traced hash digests. It is built in aahClientManaged's C++/CLI <Module> code
  between the DeriveKeyMaterial call and the openParameters assembly.

Next: ILSpy cannot decompile the mixed-mode assembly (crashes, exit 70); use dnlib
(IL-level) to dump the <Module> method referencing DeriveKeyMaterial and read the
post-derive token construction. 2 of 3 layers cleared (key exchange + client key);
the 3rd (token) is localized, pending dnlib extraction. Orchestrator stays on v6.
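For reference, the two cleared layers can be sketched as follows. This is a Python illustration under stated assumptions, not the SDK code: it follows the documented Windows BCRYPT_ECCKEY_BLOB layout (4-byte magic, 4-byte coordinate length, then X and Y), and whether the length field travels on the Historian wire is not established by this note. The function names are hypothetical, and shared_secret stands in for the ECDH agreement output.

```python
import hashlib
import struct

# "ECK1" when stored little-endian (BCRYPT_ECDH_PUBLIC_P256_MAGIC).
ECDH_PUBLIC_P256_MAGIC = 0x314B4345
P256_COORD_LEN = 32


def build_ecc_public_blob(x: bytes, y: bytes) -> bytes:
    """Pack a P-256 public point in the documented CNG blob layout."""
    if len(x) != P256_COORD_LEN or len(y) != P256_COORD_LEN:
        raise ValueError("P-256 coordinates must be 32 bytes each")
    header = struct.pack("<II", ECDH_PUBLIC_P256_MAGIC, P256_COORD_LEN)
    return header + x + y


def parse_ecc_public_blob(blob: bytes) -> tuple:
    """Return (x, y) after checking the magic and the coordinate length."""
    magic, coord_len = struct.unpack_from("<II", blob, 0)
    if magic != ECDH_PUBLIC_P256_MAGIC or coord_len != P256_COORD_LEN:
        raise ValueError("not a P-256 ECDH public blob")
    body = blob[8:]
    if len(body) != 2 * coord_len:
        raise ValueError("truncated or oversized blob")
    return body[:coord_len], body[coord_len:]


def derive_client_key(shared_secret: bytes) -> bytes:
    """KDF=HASH with SHA256 and no extra inputs: SHA256(shared secret)."""
    return hashlib.sha256(shared_secret).digest()
```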

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 11:11:21 -04:00
Joseph Doherty 3fd522fa10 docs(grpc-events): Path B — ExchangeKey ECDH clears 2 of 3 layers
Records that the pure-managed P-256 ExchangeKey works (cleared the v8 client-key
check; error advanced to 132/171 AuthenticationFailed). The remaining layer is the
26-byte credential-token KDF, which requires recovering the native key derivation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 10:31:37 -04:00
Joseph Doherty 7284fdc976 docs(grpc-events): Path A disproven — v8 OpenConnection coupled to ExchangeKey
Records the full v8 openParameters byte map, the ECDH ExchangeKey finding, and
the Path A live result: the v8 OpenConnection on a ValidateClientCredential
session is rejected with native 132/34 "EstablishConnection Failed to get client
key". The v8 path requires the client key established by HistoryService.ExchangeKey
(ECDH), so the next route is Path B — implement ExchangeKey ("ECK1" + 64-byte
P-256 point) via .NET ECDiffieHellman, then reissue the v8 open.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 09:46:07 -04:00
Joseph Doherty 876cbc5d94 docs(handoff): refresh event-row item — captured + v8 connection-type gate
Item 1 is no longer capture-gated (the capture is done, merged 8ad160b). Update it
to: the capture-event run read 50 events from the stock client; the v6 request is
shipped; the remaining gate is the native v8 Event-type OpenConnection (a
ConnectionType field the SDK's v6 Open2 format lacks), a scoped RE+impl follow-on.
Points at docs/reverse-engineering/grpc-event-query-capture.md. Also corrected the
"to move any item" summary so item 1 no longer reads as needing a fresh capture.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-23 09:06:05 -04:00
Joseph Doherty c6752804ee docs(grpc-events): event-query capture finding + v8 connection-type gate
Records the 2026-06-22 capture of the stock 2023 R2 gRPC event read and the
diagnosis of why row retrieval is gated:

1. The working StartEventQuery request is version 6 (vs the SDK's v5) — shipped in
   the companion code commit.
2. Rows additionally require an EVENT-type connection. Decoding the captured
   OpenConnection.openParameters (native format v8) shows a ConnectionType byte
   (Event=01 / Process=02) right after ClientType — a field the SDK's v6 Open2
   format does not have (it writes ClientType then ConnectionMode back-to-back).
   So the v6 buffer the SDK sends (accepted for reads) cannot mark the connection
   as Event, and the 2023 R2 server returns event rows only on an Event
   connection. The native client also used the ExchangeKey cert auth path.

Conclusion: making event rows flow over gRPC requires the SDK to emit the native
v8 OpenConnection format with ConnectionType=Event (a larger RE+implementation
effort), scoped as a follow-on. v6 is retained as the captured-correct request.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 10:41:29 -04:00
Joseph Doherty 73f66cbf27 docs(handoff): re-characterize gRPC event-row gate — capture-gated, not server-gated
Live SQL ground truth (user-authorized one-time read via SOCKS->SQL relay)
disproves the gate on the open gRPC event-row item. The live 2023 R2 server
IS event-bearing — Runtime.dbo.Events holds 19,356 rows in the last 30 days
(90,944 in 365) — yet the empty-filter gRPC event query still returns zero rows
and long-polls to the deadline over that same window.

So GetNextEventQueryResultBuffer returning nothing is NOT "no events on the
server"; the empty-filter request shape (filter / namespace / event-tag
registration) doesn't match existing rows. The remaining work is a fresh native
gRPC event-query capture of the stock client, not access to a different server.

- handoff.md: rewrite open-item #1 with the SQL numbers + capture-gated framing;
  update the "to move any item" summary to match.
- HistorianGrpcIntegrationTests: correct the event-read test comment (drop the
  false "idle dev box holds no events" rationale; document 19k-events-yet-zero-rows).

No behavior change (test edit is comment-only). Sanitization scan clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 08:11:27 -04:00
Joseph Doherty 8b966f3d80 docs(handoff): refresh to current state — roadmap exhausted
The handoff doc was anchored at 2026-05-04 (read-path blocker era). Add a
"Current Status (2026-06-22)" section at the top summarizing the full shipped
read/write/config/client-side surface across WCF + gRPC, the 8 remaining gated
items (event-row retrieval, active-SF magnitude, SendEvent capture, SQL wall,
R4.2 edits, ReadBlocks, the disproven DelTep gRPC probe, deferred-by-design),
and what would unblock each. Reframe the old "Active Blocker" as a historical
read-path record; fix the stale 55/55 test count to 321/321 offline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 07:54:29 -04:00
Joseph Doherty b3417c2f6a docs(grpc): record DelTep multiplexed-channel probe as disproven
README transport matrix + grpc-tooling-completion.md §Out-of-scope: the gRPC
multiplexed-channel hypothesis for DeleteTagExtendedProperties was probed live
2026-06-22 and disproven — primes succeed on the shared channel but DelTep is
still rejected (native code=1), property survives. Stays server-blocked on both
transports, not shipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 06:55:05 -04:00
Joseph Doherty df28bcfa53 docs(roadmap): mark HCAL roadmap exhausted; remaining items are all gated
M0/M1/M2/M3 done + live-verified; M4 R4.1/R4.3(idle)/R4.4 merged to main; the
grpc-tooling-completion plan is fully executed. Add a top-of-file status banner
enumerating the only remaining items and why each is gated (infra-gated event-row
retrieval + active-SF magnitude; capture-gated SendEvent; server-walled SQL +
revision edits; out-of-scope ReadBlocks / DeleteTagExtendedProperties). Nothing
left is a pure code task. README transport matrix stays authoritative per-op.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 06:12:04 -04:00
Joseph Doherty ecf446965a docs(grpc): matrix + plan reflect ext-prop fix, SQL prime result, ConnStatus
- README transport matrix: GetTagExtendedProperties notes the multi-property parser
  fix; AddTagExtendedProperties read-back now round-trips; GetConnectionStatus gRPC
  -> live-verified; ExecuteSqlCommand notes the RegisterTags prime does not help.
  Refresh the closing production-pattern guidance.
- grpc-tooling-completion.md: mark #5 (ConnStatus) done, #4 (SQL prime) negative, and
  the #1 ext-prop read-back follow-up done.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 06:03:59 -04:00
Joseph Doherty 27e969f86d docs(grpc): transport matrix + plan reflect ReadEvents + live-verified writes
- README transport matrix: gRPC writes (EnsureTag/DeleteTag/RenameTags/
  AddTagExtendedProperties) flip to live-verified; note the async-rename retry and
  the extended-property read-back parser gap. ReadEvents gRPC -> tooled-but-bounded
  (StartEventQuery works, GetNext long-polls, throws on no-row pending an
  event-bearing server). Refresh the closing production-pattern guidance.
- grpc-tooling-completion.md: mark items #1 (writes, done) and #2 (ReadEvents,
  tooled/bounded) with the live outcomes and follow-ups.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 04:58:44 -04:00
Joseph Doherty 7e8bb07df3 docs(grpc): add gRPC tooling completion plan
Self-contained plan for finishing gRPC surface parity: live-verify the
sandbox-gated writes, port ReadEvents (CM_EVENT registration state machine),
SendEvent (capture-blocked), the SQL server-wall stretch, and optional
GetConnectionStatus. Includes the proven reuse pattern and live-verification setup
so it survives context compaction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 01:30:04 -04:00
Joseph Doherty e45c615a79 docs: record R4.3 measured idle-state status in hcal-roadmap
Update the M4 table row, one-glance status line, and M4 narrative note to
reflect R4.3: measured idle-state GetStoreForwardStatusAsync SHIPPED over
gRPC; active-SF magnitude + R4.2 revision edits stay deferred behind the
shared D2 storage-engine console-pipe wall.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 23:20:51 -04:00
Joseph Doherty c2d8fb9bc8 R4.3: gRPC store-forward status probe + re-scope
Add HistorianGrpcStoreForwardStatusProbe and the `grpc-sf-status-probe` CLI
command. The idle-baseline run against the live 2023 R2 server resolves the
plan's §9.3 handle question: the direct StorageService SF pull RPCs
(GetSFParameter / GetRemainingSnapshotsSize) require the OpenStorageConnection
console handle and are D2-gated (err 132, identical under read-only and
write-enabled sessions), while StatusService.GetHistorianConsoleStatus IS
reachable on the session string handle (=3 at idle).

Records the gRPC re-scope and the idle-baseline findings in
docs/plans/store-forward-cache-reverse-engineering.md §9. The probe writes
nothing and releases any console session immediately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 23:14:05 -04:00
Joseph Doherty 60b3673f01 M4 R4.4: client-side multi-historian redundancy
Adds AVEVA.Historian.Client.Redundancy — HistorianRedundantClient orchestrates
N single-historian members (IHistorianMember; default HistorianClientMember
over HistorianClient) as one logical client. Pure client-side, no server-side
redundancy protocol, no RE.

- Reads fail over to the next member in priority order. Streaming reads only
  fail over BEFORE the first row is observed; a mid-stream failure propagates
  (failing over mid-stream would risk duplicated/skipped rows).
- Writes fan out: WriteFanout AllMembers | PreferredOnly, with All | Any ack
  policy, returning a per-member HistorianRedundantWriteResult.
- Per-member health: FailureThreshold demotes a failing member out of the
  preferred pool; a background watchdog (PeriodicTimer) + CheckHealthAsync
  re-probe and restore recovered members. GetStatus() snapshot + ActiveMember.
- Composes with R4.1: back a member's writes with a HistorianStoreForwardWriter
  so a down member buffers and replays on recovery — the pragmatic client-side
  equivalent of native ReSyncTags.
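The read failover rule (fail over only before the first row) can be sketched as below. This is a simplified synchronous Python illustration, not the HistorianRedundantClient implementation: read_with_failover and the member callables are hypothetical, and health tracking, priority ordering, and async streaming are left out.

```python
from typing import Callable, Iterable, Iterator, Sequence


def read_with_failover(members: Sequence[Callable[[], Iterable]]) -> Iterator:
    """Yield rows from the first member that produces a stream.

    A member that fails before yielding any row is skipped and the next
    one is tried. Once a row has been observed, a failure propagates,
    because restarting on another member could duplicate or skip rows.
    """
    errors = []
    for open_stream in members:
        row_observed = False
        try:
            for row in open_stream():
                row_observed = True
                yield row
            return  # stream completed normally
        except Exception as exc:
            if row_observed:
                raise  # mid-stream failure: do not fail over
            errors.append(exc)  # pre-first-row failure: try next member
    raise RuntimeError(f"all {len(errors)} members failed: {errors!r}")
```

The all-fail case aggregates the per-member errors into one exception so the caller can see why each member was rejected.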

14 unit tests (no server): failover order, mid-stream no-failover, all-fail
aggregation, probe-any-up, fan-out ack policies, PreferredOnly, soft reject,
health demotion + CheckHealthAsync restore, watchdog recovery. Full suite 307
green. Roadmap R4.4 marked shipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 22:46:10 -04:00
Joseph Doherty dd2aec3b8b M4 R4.1: pragmatic store-and-forward durable outbox
Adds AVEVA.Historian.Client.StoreForward — a client-side store-and-forward
layer over the historian write surface (AddHistoricalValuesAsync /
SendEventAsync). Producers enqueue writes; the writer persists them and
replays on reconnect so a transient disconnect never drops data. This is the
roadmap's recommended pragmatic outbox, NOT a bit-faithful reimplementation of
AVEVA's native SF cache (that stays deferred) — pure managed, no RE.

- HistorianOutboxEntry / HistorianOutboxEntryKind: buffered-write envelope
- IHistorianOutboxStore + InMemoryHistorianOutboxStore (tests) +
  FileHistorianOutboxStore (crash-durable: atomic temp+move JSON per entry,
  FIFO by filename sequence that resumes past on-disk max, corrupt-file
  quarantine). OutboxJson normalizes event object? properties off JsonElement.
- IHistorianWriteSink + HistorianClientWriteSink (HistorianClient-backed)
- HistorianStoreForwardWriter: enqueue, single-flight FIFO FlushAsync with
  head-of-line blocking, optional MaxDeliveryAttempts dead-lettering,
  DropOldest/Reject overflow policy, background drain loop (retry on reconnect),
  GetStatusAsync snapshot mirroring server SF Pending/Storing/ErrorOccurred.
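The crash-durable store idea (atomic temp+move per entry, FIFO by filename sequence, resume past the on-disk max, corrupt-file quarantine) can be sketched as below. This is a minimal Python illustration, not FileHistorianOutboxStore itself: the FileOutbox class, the 20-digit filename scheme, and the .corrupt suffix are assumptions made for the example.

```python
import json
import os


class FileOutbox:
    """One JSON file per entry; the filename sequence gives FIFO order."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
        # Resume numbering past the highest sequence already on disk.
        self._next_seq = max(self._sequences(), default=0) + 1

    def _sequences(self) -> list:
        return sorted(
            int(name[:-5])
            for name in os.listdir(self.directory)
            if name.endswith(".json") and name[:-5].isdigit()
        )

    def _path(self, seq: int) -> str:
        return os.path.join(self.directory, f"{seq:020d}.json")

    def enqueue(self, entry: dict) -> int:
        seq = self._next_seq
        self._next_seq += 1
        final_path = self._path(seq)
        temp_path = final_path + ".tmp"
        with open(temp_path, "w", encoding="utf-8") as handle:
            json.dump(entry, handle)
            handle.flush()
            os.fsync(handle.fileno())  # reach the disk before the rename
        os.replace(temp_path, final_path)  # atomic move into place
        return seq

    def pending(self) -> list:
        """Return (seq, entry) pairs in FIFO order, quarantining bad files."""
        entries = []
        for seq in self._sequences():
            path = self._path(seq)
            try:
                with open(path, encoding="utf-8") as handle:
                    entries.append((seq, json.load(handle)))
            except ValueError:  # covers JSON and text-decoding errors
                os.replace(path, path + ".corrupt")
        return entries

    def remove(self, seq: int) -> None:
        os.remove(self._path(seq))
```

A writer would call remove only after the sink acknowledges the entry, so a crash between delivery and removal replays the entry rather than losing it.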

12 unit tests (no server): durability-across-restart, reconnect-drain, FIFO
order/head-of-line, dead-letter, overflow policies, background auto-drain.
Full suite 293 green. Roadmap R4.1 marked shipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 22:35:30 -04:00
Joseph Doherty a91f126287 docs(hcal-roadmap): M3 R3.2 ships all 5 analog types, not Float-only
R3.2 and the one-glance table still read "Float-only"; the shipped
AddHistoricalValuesAsync covers Float/Double/Int2/Int4/UInt4 (golden-tested
+ live write/read-back). Correct both lines to match code + tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 22:20:25 -04:00
Joseph Doherty d1e96f48de M3 R3.2: AddHistoricalValuesAsync supports Double + Int (Int2/Int4/UInt4)
Extended the historical-write serializer from Float-only to all five analog types EnsureTagAsync
supports. Captured each type's "ON" buffer live from the native client (sandbox tag per type,
written + captured + deleted):

- The 4-byte value descriptor (C0 10 01 00) is CONSTANT across types — it does not encode the type.
- The value is u32(0) + native-width value, width by the tag's declared type:
  Float->float32, Double->float64, Int2->int16, Int4->int32, UInt4->uint32.
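The per-type value encoding can be sketched as below. This is a Python illustration of the rule only, not the SerializeAddStreamValuesBuffer code: little-endian byte order is an assumption, encode_value is a hypothetical name, and the position of the descriptor relative to the value inside the full "ON" buffer is not shown because it is not stated here.

```python
import struct

# Constant across all five types; it does not encode the type.
VALUE_DESCRIPTOR = bytes.fromhex("C0100100")

# Native-width struct formats per declared tag type (little-endian assumed).
_VALUE_FORMATS = {
    "Float": "<f",   # float32
    "Double": "<d",  # 64-bit double
    "Int2": "<h",    # int16
    "Int4": "<i",    # int32
    "UInt4": "<I",   # uint32
}


def encode_value(data_type: str, value) -> bytes:
    """Return u32(0) followed by the value at the type's native width."""
    try:
        fmt = _VALUE_FORMATS[data_type]
    except KeyError:
        raise ValueError(f"unsupported data type: {data_type}") from None
    return struct.pack("<I", 0) + struct.pack(fmt, value)
```

Because the descriptor is constant, the encoded length alone differs by type: 8 bytes for Float, Int4, and UInt4, 12 for Double, and 6 for Int2.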

HistorianHistoricalWriteProtocol.SerializeAddStreamValuesBuffer now takes the HistorianDataType and
encodes accordingly (unsupported types throw ProtocolEvidenceMissingException). The orchestrator
resolves the type from the tag-info NativeDataTypeDescriptor via MapDataType. Harness capture-write
gained --data-type. Golden-tested against all five live captures; the gated write/read-back test
validated each type end-to-end through the pure-managed SDK. 281 unit tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 21:48:29 -04:00
Joseph Doherty dafafa0c98 M3 R3.2 SHIPPED: docs — AddHistoricalValuesAsync recorded in roadmap, plan, and CLAUDE.md surface
Marks M3 historical writes SHIPPED + live-validated across the roadmap (R3.2/R3.3/one-glance),
revision-write-path.md §"R3.1 CAPTURED", and the CLAUDE.md Required SDK Surface (the new write op,
gRPC-only, AddStreamValues "ON" path, Float-only, distinct from the still-blocked AddS2 streaming
path).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 21:24:37 -04:00
Joseph Doherty 0e78d638d0 M3 R3.1: document the captured + validated AddStreamValues "ON" write path
revision-write-path.md §"R3.1 CAPTURED" + roadmap R3.1/R3.2/one-glance now record the validated
finding: the historical write is HistoryService.AddStreamValues ("ON" storage-sample buffer, AddS2
"OS" family) + EnsureTags, not AddNonStreamValues/TransactionService. Includes the decoded 56-byte
"ON" buffer layout, the working priming/batch sequence, the tag-GUID keying, and that the D2 cache
gate does not block the primed 2023 R2 client. Remaining work to ship AddHistoricalValuesAsync is
the managed "ON" serializer (adapt HistorianEventWriteProtocol) + gRPC orchestrator wiring.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 21:04:17 -04:00
Joseph Doherty 222eed9c02 M3 R3.1: durable capture plan — drive native 2023 R2 gRPC client + IL-rewrite byte[] payloads
Records the feasibility-verified plan to capture the two remaining buffers (regular-tag
RegisterTags btTagInfos + AddNonStreamValues btInput):

- 2023 R2 aahClientManaged.dll is self-contained mixed-mode C++/CLI (only Windows + VC++
  runtime native imports) — loadable in a net481 x64 process, no AVEVA install needed.
- gRPC routes through the managed Archestra.Historian.GrpcClient.dll, so the byte[] payloads
  are capturable by IL-rewriting GrpcHistoryClient.RegisterTags / AddNonStreamValues (dnlib,
  the instrument-wcf-writemessage pattern; rewrite a copy, never the originals).
- Connection is reflection-drivable: HistorianAccess.OpenConnection(HistorianConnectionArgs)
  with ConnectionMode=HistorianConnectionMode.Historian (the gRPC mode), TcpPort=32565, cert.
- gRPC runtime deps (Grpc.Net.Client / Grpc.Core.Api / Google.Protobuf / ...) are present in
  msi-extract/ArchestrA/Toolkits/Bin/x64.
- Risk: the C++ AddNonStreamedValue TagNotFoundInCache(129) gate (the 2020 D2 blocker) may
  block btInput; mitigation = read the tag first. RegisterTags is emitted before that gate.

Build order documented (read-only connect -> IL-rewrite -> write capture -> serializer ->
commit+read-back -> AddHistoricalValuesAsync), each live step gated on per-action auth.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 18:58:38 -04:00
Joseph Doherty 57b9506d01 M3 R3.1: OpenStorageConnection is a dead end (error 85); precondition is front-door RegisterTags
Live-probed StorageService.OpenStorageConnection against the 2023 R2 server over a
write-enabled (0x401) session. Every attempt — sweeping ConnectionMode (0x401/0x402/0x1),
StorageSessionId-in (Open2-GUID / empty), and FreeDiskSpace — returns the IDENTICAL native
error type=4 code=85 ("session not registered"), so it's a structural refusal, not a bad
field value.

Decode (two corroborating facts):
- Error 85 is the same code the event read returns before RegisterTags2 (see
  HistorianWcfEventOrchestrator) — a generic "session not registered for this op".
- The 2023 R2 decompile shows OpenStorageConnection lives on a SEPARATE GrpcStorageClient
  (the storage engine's SF/snapshot channel, own port + service identity); HistorianAccess
  drives non-streamed writes through the native C++ HistorianClient, never this op.

So the roadmap's mapped "missing console session" step was wrong. The real non-streamed-write
precondition is the front-door HistoryService.RegisterTags (RTag2-family) for the target tag —
which is exactly why the R3.1 batch failed at AddNonStreamValues (no tag registered ->
StoreNonStreamValues had no route). Matches the original 2020-WCF D2 hypothesis.

Remaining (both need a native gRPC capture; do not guess bytes): the regular-tag RegisterTags
btTagInfos (only CM_EVENT's tag-GUID form is known) and the AddNonStreamValues btInput.

- HistorianGrpcStorageConnectionProbe + grpc-open-storage-connection CLI (opens nothing
  persistent; CloseStorageConnection on success)
- corrected revision-write-path.md §R3.1 follow-up + hcal-roadmap R3.1/R3.2 rows
- gated regression test pinning the error-85 refusal

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 18:51:16 -04:00
Joseph Doherty 1a08dd9ec2 M3: roadmap reflects mapped non-streamed sequence (OpenStorageConnection follow-up)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 18:20:35 -04:00
Joseph Doherty ac28679a1f M3 R3.1: map the required non-streamed write sequence (OpenStorageConnection is the missing step)
Static decompile mining of the 2023 R2 client corroborates the live R3.1 error:
the AddNonStreamValues failure is the missing StorageService.OpenStorageConnection,
which creates exactly the \\.\pipe\aahStorageEngine\console,sid(...) session named
in the server error. Mapped the full native sequence:

  HistoryService.OpenConnection (have) -> StorageService.OpenStorageConnection
  (MISSING) -> StorageService.RegisterTags -> AddNonStreamValuesBegin (works) ->
  AddNonStreamValues(btInput) (fails - no console session) -> End(commit).

Two hard parts remain, each a live-production decode loop with no static shortcut:
(1) reproduce the 12-arg OpenStorageConnection handshake (several args inferred);
(2) decode the AddNonStreamValues btInput (C++-built, absent from decompiles; only
the 44-byte packed HISTORIAN_VALUE2 is known). Documented in revision-write-path.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 18:12:40 -04:00
Joseph Doherty 8fbb868813 M3 R3.1 decode: AddNonStreamValues reaches server StoreNonStreamValues (storage-engine console pipe)
Empirically decoded the AddNonStreamValues btInput framing against the live 2023
R2 server (grpc-nonstream-decode command + ProbeNonStreamedBuffersAsync driver).
Every transaction rolled back (bCommit=false) — no data written.

Finding: the btInput is assembled native-C++-side (not in any decompile), so 6
evidence-based framings (44-54B, packed HISTORIAN_VALUE2 variants) were probed.
All 6 returned the IDENTICAL server error while an empty buffer returned a
different InvalidParameter — so non-empty buffers pass parameter validation into
CHistStorageConnection::StoreNonStreamValues, which routes to the
\\.\pipe\aahStorageEngine\console pipe server-side. Identical-across-framings =>
the blocker is NOT the btInput layout but a missing storage-engine console
session / tag-registration precondition for the connection.

Next step (untested): StorageService.OpenStorageConnection + tag registration
(RegisterTags/AddTagidPairs/AddShardTagids) before AddNonStreamValues, then
commit + read-back on a sandbox tag. Documented in revision-write-path.md
(R3.1 decode section); raw artifact gitignored.

272 unit tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 18:08:27 -04:00
Joseph Doherty 23798db1ef M3 probe: non-streamed write transaction reachable over 2023 R2 gRPC (Begin/End live-verified)
The D2 storage-engine-pipe wall is WCF-transport-specific. On the 2023 R2 gRPC
front door, TransactionService is a first-class service AND the gateway to the
storage engine, so the Open2 storage-session GUID (uppercase) is accepted
directly as strHandle with no legacy pipe.

Live-verified against the real 2023 R2 server over a write-enabled (0x401) gRPC
session: AddNonStreamValuesBegin returns a real strTransactionId, and
AddNonStreamValuesEnd(bCommit=false) discards it cleanly (no data written). On
2020 WCF the same op returns UnknownClient(51) for every handle + priming chain.

- HistorianGrpcRevisionProbe + grpc-revision-probe CLI command + gated test
  NonStreamedWriteTransaction_OverGrpc_BeginsAndDiscards (live pass).
- HistorianGrpcHandshake.OpenSession gains an optional connectionMode param
  (default read-only 0x402; pass 0x401 for write-enabled) — non-breaking.
- Docs: revision-write-path.md "the wall is gone" section; roadmap M3 section,
  R3.1-R3.3 rows, one-glance table, and status note updated honestly.

Not yet shipped: the AddNonStreamValues btInput VTQ buffer is uncaptured (never
guess wire bytes), so no value-commit is implemented. Scope is non-streamed
ORIGINAL backfill; revision EDITS (R4.2) remain pipe-only even on gRPC.

272 unit tests pass; sanitization scan clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 17:51:17 -04:00
Joseph Doherty 04ea0b9a1f R1.3 GetServerTimeZoneAsync over gRPC (live-verified); R1.4 bounded out on gRPC
Live-probed both R1.3 and R1.4 against a real 2023 R2 server over the gRPC
StatusService; implemented the one that carries an evidence-backed value.

R1.3 GetServerTimeZoneAsync — SHIPPED:
- StatusService.GetSystemTimeZoneName(uiHandle) returns the real server zone
  over RemoteGrpc (the 2020 WCF op is a client-side stub returning empty).
- HistorianGrpcStatusClient.GetSystemTimeZoneNameAsync -> dialect routing ->
  public HistorianClient.GetServerTimeZoneAsync. Non-gRPC transports fail
  closed with ProtocolEvidenceMissingException (no empty-string lie).
- Golden message-shape unit test + non-gRPC guardrail unit test + gated live
  test. 271 unit tests pass.

R1.4 GetHistorianInfoAsync (EventStorageMode) — bounded out on gRPC too:
- gRPC GetHistorianInfo is the same named-value query as 2020 WCF (only
  HistorianVersion resolves); EventStorageMode + 7 variants fail on both
  GetHistorianInfo and GetSystemParameter. The 518-byte struct is filled by a
  native vtable+648 HCAL call, not the gRPC op (per the 2023 R2 decompile), so
  the field is never on the wire. Not shipped on any transport. Closes the
  roadmap's open "build against a live 2023 R2 server" caveat.

Also correct the stale M3 roadmap section: D2 already proved
Transaction.AddNonStreamValues* rides the storage-engine pipe (STransactPipeClient2
-> aaStorageEngine), not WCF — same wall as R4.2 — so M3-over-WCF is blocked, not
"the path that is NOT the gated cache push".

Docs: hcal-roadmap.md, wcf-historian-info.md, wcf-status-localhost.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 17:24:10 -04:00
Joseph Doherty 25aff409dc Merge re/grpc-2023r2-handshake: M0 gRPC parity (probe/system-param/metadata/browse) + handshake fix
2026-06-21 16:32:02 -04:00
Joseph Doherty d23722ea73 Merge re/r1.10-rename-tags: RenameTagsAsync via History StartJob
# Conflicts:
#	docs/plans/hcal-capability-matrix.md
#	docs/plans/hcal-roadmap.md
#	src/AVEVA.Historian.Client/Wcf/HistorianWcfTagWriteOrchestrator.cs
#	tests/AVEVA.Historian.Client.Tests/HistorianClientIntegrationTests.cs
#	tools/AVEVA.Historian.NativeTraceHarness/Program.cs
2026-06-21 16:31:44 -04:00
Joseph Doherty 4de222c950 Merge re/r1.4-gethi-finding: R1.1 ExecuteSqlCommand + R1.4 GetHistorianInfo (bounded)
# Conflicts:
#	docs/plans/hcal-roadmap.md
#	src/AVEVA.Historian.Client/HistorianClient.cs
#	src/AVEVA.Historian.Client/Protocol/Historian2020ProtocolDialect.cs
#	tests/AVEVA.Historian.Client.Tests/HistorianClientIntegrationTests.cs
#	tools/AVEVA.Historian.NativeTraceHarness/Program.cs
2026-06-21 16:18:49 -04:00
Joseph Doherty 85ff1b48df R0.1 browse over gRPC SHIPPED — QueryTag cracked, M0 gRPC parity complete
Wires HistorianClient.BrowseTagNamesAsync over gRPC (Transport==RemoteGrpc) via
Grpc/HistorianGrpcTagClient.BrowseTagNamesAsync: StartTagQuery(OData) -> paged
QueryTag -> EndTagQuery. Live-verified against a real 2023 R2 server (returns Sys* tags).

QueryTag packet-id recovered WITHOUT native disassembly: a .rdata packet-descriptor
table in aahClientManaged.dll lists {0x6751,1}=StartTagQuery immediately followed by
{0x6752,1}=QueryTag (found via pefile byte-scan of .rdata), confirmed live.

Wire format (live-verified):
- request btRequest = u16 0x6752 + u16 version(1) + u16 queryType(1=names) + u32 startIndex + u32 count
- response btResponse = u32 count + per-name(u32 charCount + UTF-16LE) + trailer (NextIndex/metadata, ignored)
- new HistorianTagQueryProtocol.ParseTagNameQueryPage tolerates the trailer
- GlobToODataFilter translates the SDK glob filter to OData (Pre*->startswith, *suf->endswith,
  *mid*->contains, exact->eq); the 2023 R2 metadata-server parses filters as OData.
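
A minimal Python sketch of the QueryTag framing listed above, for illustration only
(the shipped codec is the C# HistorianTagQueryProtocol). The function names are
hypothetical, and little-endian integers are an assumption: the list states UTF-16LE for
strings but does not state the integer byte order.

```python
import struct

QUERY_TAG_PACKET_ID = 0x6752  # packet id recovered from the .rdata descriptor table
QUERY_TYPE_NAMES = 1          # queryType 1 = names


def build_query_tag_request(start_index: int, count: int) -> bytes:
    """u16 packetId + u16 version(1) + u16 queryType + u32 startIndex + u32 count."""
    # "<" = little-endian with no padding (ASSUMPTION: integer byte order).
    return struct.pack("<HHHII", QUERY_TAG_PACKET_ID, 1, QUERY_TYPE_NAMES,
                       start_index, count)


def parse_tag_name_page(buf: bytes) -> list:
    """u32 count + per-name(u32 charCount + UTF-16LE); trailing bytes are ignored."""
    (count,) = struct.unpack_from("<I", buf, 0)
    offset = 4
    names = []
    for _ in range(count):
        (char_count,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        byte_len = char_count * 2  # charCount is in UTF-16 code units
        names.append(buf[offset:offset + byte_len].decode("utf-16-le"))
        offset += byte_len
    return names  # whatever follows (NextIndex/metadata trailer) is left unread
```

The parser stops after `count` names, which is how it tolerates the trailer without
decoding it.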

Replaces the earlier RE probe helpers with the shipped browse path. Adds golden-byte
(BuildQueryTagRequest) + 8 glob-translation unit tests + gated live browse test.
226 unit tests pass; 5/5 live gRPC tests pass (read, probe, system-param, metadata, browse).

Milestone 0 (full gRPC parity) is complete.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 16:01:15 -04:00
Joseph Doherty 630295bd18 docs: QueryTag native-RE attempt — lightweight tooling insufficient, needs Ghidra
Recorded the native-disassembly attempt on aahClientManaged.dll (mixed-mode):
ilspycmd cannot decompile it; capstone byte-search can't locate the StartTagQuery
0x6751 marker (not a plain immediate — it's an .rdata constant loaded RIP-relative,
the .text "51 67 00 00" hits are coincidental jump-table data). Managed metadata
gives QueryTag field semantics but not the binary packet-id. Finishing QueryTag
needs Ghidra/IDA xref analysis or the live IL-rewrite capture.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 15:46:54 -04:00
Joseph Doherty 4c9f0d476c docs: QueryTag error = InvalidPacketId (72); needs native aahClient.dll RE
Deepened the R0.1 browse finding. QueryTag's constant rejection decodes to
ArchestrA.CloudHistorian.Contract.ErrorCode.InvalidPacketId (72): the btRequest needs
a QueryTag-specific packet-id header (the generic 0x6751/v1 header StartTagQuery accepts
is rejected). The semantic fields are known from CloudHistorian.Contract
(QueryHandle/QueryType/StartIndex/TagCount request; TagNames[]+TagMetadataBuffer response),
but the binary packet framing lives in native aahClient.dll — aahClientManaged.dll is
mixed-mode (ilspycmd cannot decompile it) and no managed assembly builds the buffer.
Finishing QueryTag needs native RE (Ghidra/IDA) or a live gRPC capture of the stock client.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 15:16:19 -04:00
Joseph Doherty 26ef5e5645 R0.1 browse probe: StartTagQuery over gRPC takes an OData filter (live)
Probes the 2023 R2 gRPC browse path and records the finding. The front door does
NOT hit the 2020 WCF metadata-server-pipe wall.

- RetrievalService.StartTagQuery is cracked: the server (CMdServer::StartActiveTagnamesQuery
  over \.\pipe\aahMetadataServer\console) parses the filter as OData. startswith()/
  contains()/eq/empty succeed and return the 8-byte (queryHandle, tagCount); SQL-LIKE "%"
  and glob "*" fail with "ODataFilter: bad token". Live: 220 Sys* tags counted.
- QueryTag (paging) remains: every guessed btRequest returns a constant native error
  type 4 / code 72 (content-independent) -> framing needs a native capture, not guessing.
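
Since glob "*" is rejected and OData functions are accepted, a glob filter has to be
translated before it reaches StartTagQuery. A hedged Python sketch of one such translation
follows. The property name `TagName` and the function name are assumptions (the commit
does not record the property the server expects), and only leading/trailing wildcards are
handled.

```python
def glob_to_odata(pattern: str, field: str = "TagName") -> str:
    """Translate a simple glob into an OData filter string.

    `field` is a HYPOTHETICAL property name; substitute the one the server accepts.
    """
    def literal(text: str) -> str:
        # OData string literals use single quotes, doubled to escape.
        return "'" + text.replace("'", "''") + "'"

    core = pattern.strip("*")
    if core == "":
        return ""  # empty filter, which the server accepts
    if "*" in core:
        raise ValueError("interior wildcards are not handled in this sketch")

    leading = pattern.startswith("*")
    trailing = pattern.endswith("*")
    if leading and trailing:
        return f"contains({field},{literal(core)})"
    if trailing:
        return f"startswith({field},{literal(core)})"
    if leading:
        return f"endswith({field},{literal(core)})"
    return f"{field} eq {literal(core)}"
```

For example, `glob_to_odata("Sys*")` yields a `startswith(...)` filter, the form that
counted the Sys* tags in the live probe.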

Adds RE probe helpers Grpc/HistorianGrpcTagClient.ProbeStartTagQuery + ProbeTagQuerySequence,
a gated StartTagQuery_OverGrpc_AcceptsODataFilter test, and the finding doc
docs/reverse-engineering/grpc-tag-query-odata.md. Browse is not yet wired (QueryTag open).

217 unit tests pass; 5/5 live gRPC tests pass. No tag names/identities committed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 14:58:12 -04:00
Joseph Doherty 0e19adae68 gRPC M0 R0.2: tag metadata over gRPC (GetTagInfosFromName, live-verified)
Routes HistorianClient.GetTagMetadataAsync over gRPC when Transport==RemoteGrpc,
via the new Grpc/HistorianGrpcTagClient calling RetrievalService.GetTagInfosFromName
(the plural string-handle metadata op).

- String handle = the Open2 storage-session GUID formatted uppercase (the format
  that resolves the native string-handle path); threaded out of the shared handshake
  via a new HistorianGrpcHandshake.Session { ClientHandle, StorageSessionId, StringHandle }.
- Request btTagNames = uint count + per-name(uint charCount + UTF-16LE) — golden-byte
  unit-tested (BuildTagNamesBuffer).
- Response btTagInfos = uint count + CTagMetadata records — decoded by the existing
  HistorianTagQueryProtocol.ParseGetTagInfoResponse; data type via the shared MapDataType.
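
The btTagNames request layout above can be sketched in Python as follows. This is an
illustration, not the shipped BuildTagNamesBuffer; the function name is hypothetical and
little-endian uints are an assumption (only the string encoding is stated).

```python
import struct


def build_tag_names_buffer(names) -> bytes:
    """uint count + per-name(uint charCount + UTF-16LE)."""
    out = bytearray(struct.pack("<I", len(names)))  # ASSUMPTION: little-endian uint
    for name in names:
        encoded = name.encode("utf-16-le")
        # charCount counts UTF-16 code units, not bytes.
        out += struct.pack("<I", len(encoded) // 2)
        out += encoded
    return bytes(out)
```

There is no terminator or padding between names in this sketch; each name is delimited
only by its own charCount prefix.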

The 2020 WCF string-handle wall does NOT apply on the gRPC front door, as the
string-handle-wall RE note predicted. LIVE-VERIFIED against a real 2023 R2 server:
GetTagMetadataAsync returns the requested tag with a valid decoded data type.

216 unit tests pass. Captured framing confirmed live then discarded; no tag names
or identities committed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 14:35:52 -04:00
Joseph Doherty b0703ebf80 docs: R0.3 live-verified; correct the auth-blocker note (harness quote bug)
R0.3 system-param over gRPC is now LIVE-VERIFIED against the real 2023 R2 server
(returned HistorianVersion), alongside the re-confirmed read chain and probe.

The apparent NTLM round-1 SEC_E_LOGON_DENIED "blocker" was a test-harness
credential-parsing bug, not a server/account/SDK issue: the gitignored creds
file stores quoted values and the env-setup must strip surrounding quotes before
exporting HISTORIAN_USER/PASSWORD. With quotes stripped, the NAM domain account
authenticates and the full chain passes. The round-failure diagnostic added
during the hunt (HistorianNativeHandshake.DescribeError) is kept.
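
A small Python sketch of the quote-stripping fix described above. The KEY=VALUE file
layout and both function names are assumptions for illustration; the actual env-setup
script is not shown in this commit.

```python
def strip_surrounding_quotes(value: str) -> str:
    """Remove one pair of matching surrounding quotes, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value


def load_creds(text: str) -> dict:
    """Parse KEY=VALUE lines (ASSUMED format), stripping quotes from each value."""
    creds = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        creds[key.strip()] = strip_surrounding_quotes(raw)
    return creds
```

Only a matching outer pair is removed, so a quote character inside the password survives.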

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 14:29:12 -04:00
Joseph Doherty c4b8d0dde4 gRPC M0: probe (R0.4, live-verified) + system-param (R0.3) + shared handshake
Roadmap docs/plans/hcal-roadmap.md, milestone M0 (gRPC parity for the DONE
surface). Now unblocked for live verification by a reachable 2023 R2 server.

- R0.4 Probe over gRPC: new HistorianGrpcProbe calls History/Retrieval/Status
  GetInterfaceVersion (unauthenticated). ProbeAsync routes over gRPC when
  Transport==RemoteGrpc. LIVE-VERIFIED against a real 2023 R2 server — needs no
  credentials (runs before the auth loop), so it works despite the auth blocker.

- R0.3 System parameter over gRPC: new HistorianGrpcStatusClient calls
  StatusService.GetSystemParameter over the authenticated session; routed in the
  dialect. Built + unit-tested (request/response field mapping pinned).
  Live-verification pending an auth fix (see below).

- Extracted the proven auth handshake from HistorianGrpcReadOrchestrator into
  shared Grpc/HistorianGrpcHandshake (reused by read + status + future
  browse/metadata). Repointed the IL structural guardrail test to it.

- Diagnostics: round-failure now decodes the native server error + hex/ASCII
  preview (HistorianNativeHandshake.DescribeError). This surfaced the live auth
  blocker as SEC_E_LOGON_DENIED (0x8009030C) at NTLM round 1 — framing is correct,
  the credential did not validate. Probable cause: stale file password or NAM-domain
  NTLM restriction (Kerberos/RDP works, NTLM denied; no SPN path over the tunnel).

216 unit tests pass; live gRPC probe passes. Sanitization scan clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 13:32:04 -04:00
Joseph Doherty c1b1b3d23b R1.11 DelTep capture + R1.3/R1.4/R1.12/R1.13 bounded out
DelTep (extended-property delete) — wire format captured + serializer
golden-proven, but live delete is server-blocked and NOT exposed publicly:
- Captured the DelTep inBuff via a cross-session trick (harness add-tep gains
  --tep-skip-add + read-for-sync before --tep-delete;
  Capture-DeleteTagExtendedProperties.ps1 / decode-del-tep-capture.py). Layout = same group
  framing as AddTEx but property-name-only (no 0x43 value) + 0x00 group trailer.
- SerializeDeleteRequest + 4 golden tests pin the server-accepted buffer.
- A decisive experiment shows SDK-added properties ARE deletable (the native
  client read-syncs and deletes one), so SDK-add is complete; the SDK's own
  DelTep is rejected by CHistStorage::DeleteTagExtendedProperties even with
  byte-identical inBuff, matching mode/handle, GetTgByNm+GetTepByNm prime, open
  channel, and 60s retries. Root cause: the native multiplexes services over one
  connection (per-connection working set); the SDK's per-service WCF channels
  don't reproduce it. Kept as documented-but-blocked internal orchestrator path;
  no public HistorianClient delete API.

Bounded out with evidence (no code; docs + roadmap + probe):
- R1.12 localized-property write — no op on 2020 (mirror of R1.6); no
  *LocalizedPropert*/TagLocalized* symbol in any current/*.dll.
- R1.13 non-analog tag create — GATED; native AddTag rejects every non-analog
  type client-side (ValidationFailed, before any WCF op): SingleByteString,
  DoubleByteString, Int1 all fail, Float works. No Discrete type in the native
  enum, no TagType setter. No wire request to capture.
- R1.3 timezone + R1.4 EventStorageMode — re-confirmed 2023R2/gRPC-only from
  the Runtime DB schema (no timezone param, no EventStorageMode anywhere) and a
  parameter-op probe (GetSystemParameter + GETRP return null/throw for every
  candidate; only HistorianVersion works).

238 unit tests pass; full solution builds with 0 warnings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 11:26:21 -04:00
Joseph Doherty 08b950caee R1.11 AddTagExtendedPropertiesAsync: extended-property write via AddTEx
Adds user-defined extended properties to an existing tag via the 2020 WCF
AddTEx (AddTagExtendedProperties) op. Write-enabled connection + uppercase
storage-session GUID handle; reuses the write orchestrator open/priming chain.

The AddTEx inBuff is the exact inverse of the R1.5 GetTepByNm read-response
framing, so the serializer mirrors the read parser:
  uint32 groupCount + 0x01(group) + [0x09+u16+ASCII tag] + uint32 propCount
  + per prop{ 0x02 + [0x09+u16+ASCII name] + 0x43 VT_BSTR + u16 payloadLen
  + u16 charCount + UTF-16 value } + 0x01(group trailer) + 0x00(terminator).
The trailing 0x00 is required — without it inBuff is one byte short and the
server throws SErrorException in CHistStorage::AddTagExtendedProperties. The
golden fixture pins the clean inBuff the live server accepted (dumped via
AVEVA_HISTORIAN_TEP_DUMP); read-back verified via R1.5. String (0x43) values only.

Delete (DelTep) is deferred: the native DeleteTagExtendedPropertiesByName does a
client-side sync check and returns err 229 for a just-added property, so the
DelTep request never reaches the wire and its inBuff can't be captured yet.

Shipped: HistorianClient.AddTagExtendedPropertiesAsync/AddTagExtendedPropertyAsync;
HistorianTagExtendedPropertyProtocol.SerializeAddRequest; orchestrator path;
golden WcfTagExtendedPropertyWriteProtocolTests (4); gated live write/read-back test;
native-harness `add-tep` scenario + Capture-AddTagExtendedProperties.ps1 +
decode-add-tep-capture.py. Doc: wcf-add-tag-extended-properties.md. 233 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 01:43:19 -04:00
Joseph Doherty bc353df8c4 R1.10 RenameTagsAsync: async tag rename via History StartJob (StJb)
Tag rename has no dedicated WCF op — the (old,new) name batch rides the
generic History StartJob (StJb) job buffer; the server returns a job id and
applies renames asynchronously. Handle is the uppercase storage-session GUID,
Open2 in write mode; reuses the write orchestrator's open+priming chain.

jobBuffer layout (decoded + server-validated): byte[7] zero prefix + uint32
pairCount + per pair (uint32 oldCharCount + UTF-16 oldName + uint32
newCharCount + UTF-16 newName), order (old,new). The raw instrument capture
mangles the final byte with MDAS chunk markers (the R1.1 lesson), so the golden
fixture pins the CLEAN byte[] the SDK handed the channel (dumped via
AVEVA_HISTORIAN_RENAME_DUMP) — the exact buffer the live server accepted and
renamed with.
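
For reference, the jobBuffer layout above can be sketched in Python. This is an
illustration, not the shipped HistorianTagRenameProtocol; the function name is
hypothetical, and little-endian uint32 plus UTF-16LE are assumptions (the layout says
only "uint32" and "UTF-16").

```python
import struct


def build_rename_job_buffer(pairs) -> bytes:
    """byte[7] zero prefix + uint32 pairCount + per pair (old, new) name strings."""
    out = bytearray(7)                    # seven zero bytes
    out += struct.pack("<I", len(pairs))  # ASSUMPTION: little-endian uint32
    for old_name, new_name in pairs:
        for name in (old_name, new_name):  # order is (old, new)
            encoded = name.encode("utf-16-le")
            out += struct.pack("<I", len(encoded) // 2)  # char count, not bytes
            out += encoded
    return bytes(out)
```

The 7-byte prefix leaves pairCount unaligned, so the sketch appends fields sequentially
rather than packing them as one aligned struct.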

Gated server-side by the AllowRenameTags system parameter (default 0): when
disabled the native client rejects pre-wire (err 132); the managed SDK surfaces
it as StartJob=false -> Accepted=false. Enabling needs a Historian config
reload, not just a storage-engine restart.

Shipped: HistorianClient.RenameTagAsync/RenameTagsAsync -> HistorianTagRenameResult;
HistorianTagRenameProtocol; orchestrator RenameTags/SendStartJobRename; golden
WcfTagRenameProtocolTests (4, pins server-accepted buffer); gated live test
RenameTagsAsync_AgainstLocalHistorian_RenamesSandboxTag (passed end-to-end).
Native-harness `rename` scenario + Capture-RenameTags.ps1 + decode-rename-capture.py.
Doc: docs/reverse-engineering/wcf-rename-tags.md. 213 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 01:18:41 -04:00
Joseph Doherty fbd839077b R1.4 GetHistorianInfo: bounded out on 2020 WCF (named-value-only, no struct)
Captured the native HistorianAccess.GetHistorianInfo(out HistorianInfo, out err)
and decoded the wire: over 2020 WCF, GETHI is a named-value query whose only
working key is "HistorianVersion" (response ~30 bytes = the version string).
Probed 7 storage-mode key names -> all ok=False/err. The 518-byte HISTORIAN_INFO
struct + EventStorageMode@514 is the 2023R2 HCAL-native/gRPC model (confirmed
from the decompiled 2023R2 source); on 2020 the native client derives the mode
outside the WCF wire.

Version is already exposed (ProbeAsync/GetRuntimeParameterAsync), so no hollow
GetHistorianInfoAsync is shipped (same disposition as R1.3 timezone). This
completes the reachable 2020-WCF M1 read surface; remaining M1 = config writes
(gated on explicit request) or gRPC/2023R2-only items.

RE aids kept: harness `historian-info` scenario, Capture-HistorianInfo.ps1,
decode-historian-info-capture.py, and
StringHandleProbeDiagnosticTests.GETHI_CandidateInfoNames (asserts the named-value-only
finding; gated).
Docs: wcf-historian-info.md (new) + roadmap/matrix/wall-doc updates. 230 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-20 23:42:27 -04:00
Joseph Doherty 1a539882d0 R1.1 ExecuteSqlCommandAsync (ExeC + GetR, NRBF DataTable, no BinaryFormatter)
Ship SQL command execution over the 2020 WCF aa/Retr/ExeC + aa/Retr/GetR ops:
HistorianClient.ExecuteSqlCommandAsync(sql) -> HistorianSqlResult (columns +
typed rows). String-handle ops reached with the Open2 storage-session GUID
formatted uppercase (the handle format that unlocked GETRP/GETHI).

Chain: Retr.GetV prime -> ExeC(handle, sql, option=0, ref queryHandle) ->
GetR loop. Key gotcha captured: GetR returns FALSE even on success -- the byte
stream is in pResultBuff regardless; false just signals the final page. So the
orchestrator consumes the buffer first, then stops on a false result / empty page.
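
The GetR gotcha can be sketched in Python as a paging loop. `get_r` is a hypothetical
callable standing in for the WCF op, returning (result flag, pResultBuff); the real
orchestrator is C#. Pages are collected as a list because this commit does not state how
multiple pages are joined.

```python
from typing import Callable, List, Tuple


def read_all_pages(get_r: Callable[[], Tuple[bool, bytes]]) -> List[bytes]:
    """Call get_r until it returns False or an empty page, keeping every buffer."""
    pages = []
    while True:
        ok, page = get_r()
        if page:
            pages.append(page)  # consume the buffer BEFORE checking the flag
        if not ok or not page:
            break               # False marks the final page, not a failure
    return pages
```

Checking the flag first would drop the last page (or the only page, for a small result),
which is the failure mode the paragraph above warns about.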

GetR's pResultBuff is an NRBF-serialized System.Data.DataTable
(SerializationFormat.Xml: members XmlSchema (XSD) + XmlDiffGram (rows)).
BinaryFormatter is removed from .NET 10, so the stream is decoded read-only with
the System.Formats.Nrbf package (NrbfDecoder) + XDocument -- no BinaryFormatter,
no code execution. Values are typed per the XSD type, falling back to string.

Adds: HistorianSqlResult / HistorianSqlColumn / HistorianSqlExecuteOption models,
HistorianSqlResultProtocol (NRBF + diffgram parser), HistorianWcfSqlClient
(ExeC/GetR orchestration with an AVEVA_HISTORIAN_SQL_DUMP diagnostic), dialect +
public API. Golden WcfSqlResultProtocolTests pinned to the real clean GetR stream
for the benign "SELECT 1 AS ProbeValue" (no sensitive data); gated live tests
(single cell + multi-column/multi-row/NULL). Doc: wcf-exec-sql.md; roadmap R1.1
DONE; wall doc + memory updated (incl. the QTB-server-side nuance). 229 tests green.

Note: a raw instrument-wcf capture corrupts a large pResultBuff with MDAS
transport chunk markers (0x9F); the clean contract-level byte[] is dumped via the
AVEVA_HISTORIAN_SQL_DUMP env var for the golden fixture.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-20 23:16:06 -04:00
Joseph Doherty 108220c36b R1.5 GetTagExtendedPropertiesAsync (GetTepByNm) + R1.6 closed (no op)
Ship tag extended-property reads over the 2020 WCF aa/Retr/GetTepByNm op:
HistorianClient.GetTagExtendedPropertiesAsync(tag) -> name/value pairs.

String-handle op reached with the Open2 storage-session GUID formatted
uppercase (same format that unlocked GETRP/GETHI/ExeC). Routed via the
name-based native path (GetTagExtendedPropertiesByName, server-fetch flag),
not the index-based TagQuery path.

Evidence-backed findings from the capture:
- GetTepByNm (and GetTgByNm) succeed with the uppercase handle -- further
  validates the resolved string-handle wall.
- QTB (StartTagQuery) does NOT punch through: captured uppercase, it still
  fails server-side (CMdServer::StartActiveTagnamesQuery over the
  aahMetadataServer pipe) -- a metadata-server blocker, not handle format.
- R1.6 (localized properties) has NO distinct op (only error-message/UI-text
  localization in the managed client); collapses into R1.5. Closed, not throwing.

Wire format (golden-pinned, synthetic bytes -- no dev tag names committed):
- request tagNames = uint count + per-name(uint charCount + UTF-16)
- response = uint tagCount + per-tag(marker + compact-ASCII name +
  uint propCount + per-prop(marker + compact-ASCII name + 0x43 VT_BSTR value)
  + trailer); sequence-paged.

Adds: HistorianTagExtendedProperty model, HistorianTagExtendedPropertyProtocol
(codec), HistorianWcfTagExtendedPropertyClient (orchestration), dialect +
public API; golden WcfTagExtendedPropertyProtocolTests (4) + gated live test
(HISTORIAN_TEP_TAG). Tooling: Capture-TagExtendedProperties.ps1,
decode-tag-properties-capture.py, harness tag-extended-properties scenario.
Docs: wcf-tag-extended-properties.md; roadmap R1.5 DONE / R1.6 collapsed;
wall doc + memory updated with the QTB-server-side nuance. 228 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-20 22:52:07 -04:00