histsdk/docs/plans/grpc-tooling-completion.md
Joseph Doherty 27e969f86d docs(grpc): transport matrix + plan reflect ReadEvents + live-verified writes
- README transport matrix: gRPC writes (EnsureTag/DeleteTag/RenameTags/
  AddTagExtendedProperties) flip to live-verified; note the async-rename retry and
  the extended-property read-back parser gap. ReadEvents gRPC -> tooled-but-bounded
  (StartEventQuery works, GetNext long-polls, throws on no-row pending an
  event-bearing server). Refresh the closing production-pattern guidance.
- grpc-tooling-completion.md: mark items #1 (writes, done) and #2 (ReadEvents,
  tooled/bounded) with the live outcomes and follow-ups.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 04:58:44 -04:00


gRPC Tooling Completion Plan

Status as of 2026-06-22. Tracks the remaining work to finish tooling the AVEVA Historian SDK's RemoteGrpc (2023 R2) transport so it reaches WCF surface parity. Self-contained for pickup after context compaction.

Where things stand

The gRPC transport already tools: probe, raw/aggregate/at-time reads, browse, metadata, system-parameter, server time-zone, measured store-forward status, AddHistoricalValues backfill write, and (newest, branch grpc-config-ops, 3 commits, NOT yet merged — main = 035d8a9):

  • GetRuntimeParameterAsync: live-verified
  • GetTagExtendedPropertiesAsync (read): live-verified
  • ExecuteSqlCommandAsync: server-walled, bounded behind ProtocolEvidenceMissingException
  • EnsureTag / DeleteTag / RenameTags / AddTagExtendedProperties: 🧪 tooled + routed, sandbox-gated; run live against a sandbox tag on 2026-06-22 (item #1 below)
  • ReadEventsAsync: ⚠️ tooled + routed 2026-06-22 (item #2 below). The chain runs and StartEventQuery succeeds, but GetNextEventQueryResultBuffer long-polls on no data; the read is hard-bounded (≤30s) and throws ProtocolEvidenceMissingException on the no-row path. Row retrieval is pending an event-bearing server.

Test baseline: 317 offline green, 19 gRPC-live green. Relevant memory: project_grpc_config_ops_tooling, project_m0_grpc_parity, project_roadmap_exhausted_2020wcf, reference_2023r2_live_server_access, reference_wonder_sql_vd03_credentials.

Proven pattern (reuse for everything below)

A WCF config op is tooled over gRPC by reusing its existing byte serializer/parser verbatim inside the protobuf bytes fields, keyed by the Open2 session handle:

  • HistorianGrpcConnection connection = HistorianGrpcChannelFactory.Create(options);
  • HistorianGrpcHandshake.Session session = HistorianGrpcHandshake.OpenSession(connection, options, ct[, connectionMode]);
    • session.StringHandle = uppercase Open2 GUID → string-handle ops (Retrieval/Status/History string-handle RPCs).
    • session.ClientHandle = transient uint → uint-handle ops (StartQuery, DeleteTags, GetNext*).
    • write ops pass connectionMode: HistorianWcfAuthChainHelper.NativeIntegratedWriteEnabledConnectionMode (0x401).
  • Call new <Service>.<Service>Client(connection.Channel).<Rpc>(request, connection.Metadata, DateTime.UtcNow.Add(options.RequestTimeout), ct).
  • Check response.Status?.BSuccess; decode error via response.Status?.BtError (hex = native byte0 0x84 + LE u32 code, often followed by facility/file/message ASCII — this decode cracked the SQL + extended-prop cases).
  • The gRPC RetrievalService string-handle ops do NOT need the WCF Retr.GetV prime.
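
The BtError layout described in the list above (byte0 0x84, then a little-endian u32 code, often followed by ASCII text) can be sketched as a small decoder. This is a minimal illustration under that stated layout only; NativeStatusError and NativeStatusErrorDecoder are hypothetical names, not SDK types, and the handling of the trailing text (non-printable bytes become spaces) is an assumption.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

// Hypothetical result type for this sketch; not part of the SDK.
public readonly record struct NativeStatusError(uint Code, string TrailingText);

public static class NativeStatusErrorDecoder
{
    private const byte NativeErrorMarker = 0x84;

    // Returns false when the buffer is shorter than 5 bytes or does not start with 0x84.
    public static bool TryDecode(ReadOnlySpan<byte> btError, out NativeStatusError error)
    {
        error = default;
        if (btError.Length < 5 || btError[0] != NativeErrorMarker)
        {
            return false;
        }

        // Bytes 1..4 carry the native error code as a little-endian u32.
        uint code = BinaryPrimitives.ReadUInt32LittleEndian(btError.Slice(1, 4));

        // Assumption: the tail is mostly ASCII; non-printable bytes are replaced by spaces.
        var text = new StringBuilder();
        foreach (byte b in btError.Slice(5))
        {
            text.Append(b is >= 0x20 and <= 0x7E ? (char)b : ' ');
        }

        error = new NativeStatusError(code, text.ToString().Trim());
        return true;
    }
}
```

For example, a buffer of 84 26 00 00 00 decodes to code 38, the native error number that item 4 reports for ExecuteSqlCommand.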

Proto field-name reference and WCF serializer signatures: see the mapping captured in project_grpc_config_ops_tooling memory and Grpc/Protos/*.proto.

Remaining items (priority order)

1. Live-verify the write ops — DONE 2026-06-22

Outcome: ran the gated lifecycle against a synthetic sandbox tag (ZZ_SdkGrpcWriteProbe); the writes flip from 🧪 to live-verified. EnsureTags (create), AddTagExtendedProperties, StartJob rename, and DeleteTags all succeed live over gRPC (write-enabled 0x401 session, WCF serializers reused); NO priming discovery-dance was needed. Two findings: (a) Rename is an async StartJob that the server can transiently reject right after the create commits and on target-name collision. The test now pre-cleans both names and retries rename (4×); callers should likewise retry. (b) Reading a written extended property back via GetTagExtendedPropertiesAsync hits a shared-parser evidence gap (value marker 0x01 where the parser expects compact-string 0x09). This is a read-side gap, not a write failure; the test tolerates it. The lifecycle test is self-cleaning and asserts that no litter remains (verified over two consecutive clean passes). Next read-side follow-up: capture the 0x01 extended-property value encoding and extend HistorianTagExtendedPropertyProtocol.ParseResponse.
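
The retry advice for callers can be sketched as a small bounded-retry helper. This is an illustration, not the test's actual code: RenameRetry is a hypothetical name, the delegate stands in for whatever call submits the rename job, and reading "4×" as four total attempts with a fixed delay between them is an assumption.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RenameRetry
{
    // 'attempt' returns true when the server accepted the rename job.
    // Retries only on a false result; exceptions propagate to the caller.
    public static async Task<bool> TryWithRetriesAsync(
        Func<CancellationToken, Task<bool>> attempt,
        int maxAttempts,
        TimeSpan delayBetweenAttempts,
        CancellationToken ct)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (int i = 1; i <= maxAttempts; i++)
        {
            if (await attempt(ct).ConfigureAwait(false))
            {
                return true;
            }

            // Give the server time to finish committing the create before trying again.
            if (i < maxAttempts)
            {
                await Task.Delay(delayBetweenAttempts, ct).ConfigureAwait(false);
            }
        }

        return false;
    }
}
```

A caller would pass maxAttempts: 4 and wrap its own rename call in the delegate. Pre-cleaning both the source and target names, as the test does, still has to happen before the first attempt.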

Original notes:

  • Goal: flip the 🧪 writes to live-verified by running the gated lifecycle test against a sandbox tag.
  • How: set HISTORIAN_GRPC_WRITE_SANDBOX_TAG to a throwaway name and run TagWriteLifecycle_OverGrpc_CreatesAddsPropRenamesDeletes against the live 2023 R2 box.
  • Risk/gotcha: if any write is rejected, the first fix is to add the WCF write priming discovery-dance (HistorianWcfTagWriteOrchestrator.RunWritePriming: UpdC3 + 6 GetSystemParameter + AllowRenameTags + Trx/Stat/Retr GetV) to HistorianGrpcTagWriteOrchestrator over the gRPC StatusService/HistoryService. Rename also needs server AllowRenameTags enabled. Needs explicit user OK to mutate the shared server (they previously chose "no live mutate").
  • Files: tests/.../HistorianGrpcIntegrationTests.cs (run only), src/.../Grpc/HistorianGrpcTagWriteOrchestrator.cs (priming only if rejected).

2. ReadEvents over gRPC (heaviest read op) — TOOLED 2026-06-22 (rows pending event-bearing server)

Outcome: ReadEventsAsync is routed over gRPC (HistorianGrpcEventOrchestrator). The CM_EVENT registration replay (UpdateClientStatus → 6 GetSystemParameter → RegisterTags → cross-service version probes → EnsureTags, captured buffers shared with WCF via HistorianEventRegistrationProtocol) runs, and StartEventQuery succeeds live. The remaining blocker is server behavior, not the port: GetNextEventQueryResultBuffer long-polls when the query has no rows, blocking until the call deadline instead of returning the synchronous 5-byte type=4 code=85 terminal that the 2020 WCF op returns. Per-call gRPC-Web deadlines proved unreliable over the tunnel (a chain with 4s deadlines still ran >90s), so the read is hard-bounded by an overall linked-CTS budget (≤30s, scaled to RequestTimeout); gRPC honors token cancellation. On the no-row path the orchestrator throws ProtocolEvidenceMissingException rather than asserting a false-empty list. The idle dev box holds no events, so row-level retrieval is not yet live-verified. Once an event-bearing 2023 R2 server is available, flip the gated test ReadEventsAsync_OverGrpc_StartsQueryButRowRetrievalIsLongPollBlocked to assert parsed rows, and consider whether the long-poll needs a "fetch historical then stop" request flag that the native client may set. The README row is ⚠️.
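
The overall linked-CTS budget can be sketched as follows. This is a minimal sketch of the technique, not the orchestrator's code: BoundedRead is a hypothetical name, the scale factor is a caller-supplied assumption (the plan only says the budget is scaled to RequestTimeout and capped at 30s), and TimeoutException stands in for the SDK's ProtocolEvidenceMissingException.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedRead
{
    private static readonly TimeSpan HardCap = TimeSpan.FromSeconds(30);

    // Budget = requestTimeout * scale, never more than the 30 s cap.
    public static TimeSpan ComputeBudget(TimeSpan requestTimeout, double scale)
    {
        TimeSpan scaled = TimeSpan.FromTicks((long)(requestTimeout.Ticks * scale));
        return scaled < HardCap ? scaled : HardCap;
    }

    // Runs the whole call chain under one token that fires on caller cancellation
    // or when the overall budget elapses, whichever comes first.
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> chain,
        TimeSpan budget,
        CancellationToken callerToken)
    {
        using var budgetCts = CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        budgetCts.CancelAfter(budget);
        try
        {
            return await chain(budgetCts.Token).ConfigureAwait(false);
        }
        catch (OperationCanceledException) when (!callerToken.IsCancellationRequested)
        {
            // The budget expired, not the caller: surface it as a distinct failure
            // instead of reporting an empty result.
            throw new TimeoutException($"Call chain exceeded its {budget.TotalSeconds:0.#}s budget.");
        }
    }
}
```

Because the single budget token is passed to every call in the chain, the bound holds even when individual per-call deadlines are not honored by the transport.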

Original notes (still the reference for the registration replay):

  • Goal: route ReadEventsAsync over gRPC.
  • RPCs (exist): RetrievalService.StartEventQuery (uiHandle, uiQueryRequestType, btRequest) → {Status, uiQueryHandle, btResonse}; GetNextEventQueryResultBuffer (uiHandle, uiQueryHandle) → {Status, btResult}; EndEventQuery.
  • Reuse: HistorianEventQueryProtocol.CreateStartEventQueryAttempts(...) for the request buffer (QueryRequestTypeEvent), HistorianEventRowProtocol.Parse(...) for rows.
  • The hard part: port the CM_EVENT registration state machine. Without it, GetNextEventQueryResultBuffer returns native error type=4 code=85. WCF does this in HistorianWcfEventOrchestrator.AddCmEventTagViaAddT: UpdC3 → 6 system params → RegisterTags2 (CM_EVENT tag id 353b8145-5df0-4d46-a253-871aef49b321, 24-byte RTag2 buffer) → cross-service GetV → EnsureTags2 (CM_EVENT CTagMetadata via HistorianAddTagsProtocol.SerializeCmEventCTagMetadata). gRPC equivalents: HistoryService.RegisterTags, HistoryService.EnsureTags, HistoryService.UpdateClientStatus, StatusService.GetSystemParameter.
  • Approach: new Grpc/HistorianGrpcEventOrchestrator. Open a read-only session, replay the registration over gRPC (RegisterTags + EnsureTags + the discovery calls), then run StartEventQuery → loop GetNextEventQueryResultBuffer → EndEventQuery, parsing rows. Route in Historian2020ProtocolDialect.ReadEventsAsync on UseGrpc.
  • Verify: live (read-only, safe) against the 2023 R2 box; dev box may return no rows (env) — assert "no error 85 + chain completes," mirror the WCF event test.
  • Risk: medium-high. Registration may need exact call ordering; capture the error buffer (hex+ASCII) at each step if code 85 persists.
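
For the "capture the error buffer (hex+ASCII) at each step" note, a dump helper along these lines may be enough. It is a sketch with a hypothetical name (ErrorBufferDump) and an arbitrary row layout; it only formats bytes and assumes nothing about the wire format.

```csharp
using System;
using System.Text;

public static class ErrorBufferDump
{
    // Formats a buffer as rows of: 4-digit hex offset, hex bytes, printable-ASCII column.
    public static string Format(ReadOnlySpan<byte> buffer, int bytesPerLine = 16)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < buffer.Length; offset += bytesPerLine)
        {
            ReadOnlySpan<byte> row = buffer.Slice(offset, Math.Min(bytesPerLine, buffer.Length - offset));

            sb.Append(offset.ToString("X4")).Append("  ");
            foreach (byte b in row)
            {
                sb.Append(b.ToString("X2")).Append(' ');
            }

            // Pad a short final row so the ASCII column stays aligned.
            sb.Append(' ', (bytesPerLine - row.Length) * 3);
            sb.Append(' ');

            foreach (byte b in row)
            {
                sb.Append(b is >= 0x20 and <= 0x7E ? (char)b : '.');
            }

            sb.Append('\n');
        }

        return sb.ToString();
    }
}
```

Logging one dump per registration step keeps any facility/file/message text in the buffer readable next to the raw bytes.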

3. SendEvent over gRPC

  • Goal: route SendEventAsync over gRPC.
  • Blocker: no distinct event-send RPC; WCF rides AddStreamValues2 (the HistorianEventWriteProtocol.SerializeAddStreamValuesBuffer VTQ). The gRPC framing is uncaptured — needs a native-client gRPC capture before implementing (per "capture first, never guess"). Depends on #2 (same CM_EVENT registration).
  • Risk: high / blocked on capture. Lowest priority.

4. (Stretch) SQL server-wall investigation

  • ExecuteSqlCommand over gRPC faults server-side in CSrvDbConnection.ExecuteSqlCommand (IndexOutOfRange / native err 38) — a DB-connection precondition the managed session doesn't establish. Next avenue: try a HistoryService.RegisterTags-family prime before ExecuteSqlCommand (same fix that unblocked the M3 write path / OpenStorageConnection class of wall). If it works, replace the bounded throw in HistorianGrpcSqlClient with the real GetNextQueryResultBuffer fetch loop (already written there) and flip the test.

5. (Optional) GetConnectionStatus over gRPC

  • Currently WCF-only, synthesized from an authenticated probe (no dedicated RPC either transport). Could synthesize the same over gRPC via StatusService.PingServer / GetHistorianConsoleStatus. Low value; do only if parity is wanted.

Out of scope

  • ReadBlocks (StartBlockRetrievalQuery) — never captured on either transport; leave throwing ProtocolEvidenceMissingException.
  • DeleteTagExtendedProperties — server-blocked on WCF (per-connection working set); gRPC's single multiplexed channel might fix it — opportunistic probe only.

Live verification setup (every live run)

Tunnel to WONDER-SQL-VD03 must be up (gRPC localhost:32565, TLS, cert CN WONDER-SQL-VD03; hosts entry present). Creds in gitignored wonder-sql-vd03.txt (QUOTED, colon-delimited — strip quotes; use the domainusername/domainpassword NAM domain account, which works for Historian gRPC; wonderapp does NOT). Env:

HISTORIAN_GRPC_HOST=wonder-sql-vd03  HISTORIAN_GRPC_PORT=32565
HISTORIAN_GRPC_TLS=true  HISTORIAN_GRPC_DNSID=WONDER-SQL-VD03
HISTORIAN_USER=<domain user>  HISTORIAN_PASSWORD=<domain pass>
HISTORIAN_TEST_TAG=SysTimeSec
# writes only, destructive: HISTORIAN_GRPC_WRITE_SANDBOX_TAG=<throwaway>
# slow links: HISTORIAN_GRPC_TIMEOUT=120

Run a subset: dotnet test ./Histsdk.slnx --no-build --filter "FullyQualifiedName~<name>". Aggregate tests self-calibrate their window from a real raw sample (the box is idle and not collecting). Before any commit, run a sanitization scan for wonder-sql-vd03|zimmer|nam\\|dohertj2|ADOBuild over commit-safe files.
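
The environment variables listed above could be read into one settings object that live tests consult before running. The sketch below is illustrative only: LiveGrpcSettings is a hypothetical type, not the test suite's actual helper, and treating HISTORIAN_GRPC_TIMEOUT as whole seconds is an assumption.

```csharp
#nullable enable
using System;

// Hypothetical settings holder for live gRPC tests; not an SDK type.
public sealed record LiveGrpcSettings(
    string Host, int Port, bool UseTls, string? DnsId,
    string? User, string? Password, string? TestTag,
    string? WriteSandboxTag, TimeSpan? Timeout)
{
    // Returns null when host or port is missing, so a live test can skip itself.
    public static LiveGrpcSettings? FromEnvironment()
    {
        string? host = Get("HISTORIAN_GRPC_HOST");
        if (host is null || !int.TryParse(Get("HISTORIAN_GRPC_PORT"), out int port))
        {
            return null;
        }

        // Assumption: the timeout variable is a whole number of seconds.
        TimeSpan? timeout = int.TryParse(Get("HISTORIAN_GRPC_TIMEOUT"), out int seconds)
            ? TimeSpan.FromSeconds(seconds)
            : null;

        return new LiveGrpcSettings(
            host, port,
            UseTls: string.Equals(Get("HISTORIAN_GRPC_TLS"), "true", StringComparison.OrdinalIgnoreCase),
            DnsId: Get("HISTORIAN_GRPC_DNSID"),
            User: Get("HISTORIAN_USER"),
            Password: Get("HISTORIAN_PASSWORD"),
            TestTag: Get("HISTORIAN_TEST_TAG"),
            // Destructive write tests should run only when this is set.
            WriteSandboxTag: Get("HISTORIAN_GRPC_WRITE_SANDBOX_TAG"),
            Timeout: timeout);
    }

    private static string? Get(string name)
    {
        string? value = Environment.GetEnvironmentVariable(name);
        return string.IsNullOrWhiteSpace(value) ? null : value.Trim();
    }
}
```

A write test would then skip unless WriteSandboxTag is non-null, which keeps destructive runs opt-in.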

Standing constraints

  • Never commit credentials/hostnames/customer tag names/raw captures — placeholders only.
  • src/ stays pure managed .NET 10 (one allowed P/Invoke: SSPI). Never modify current/ or aveva-install-*/.
  • Commit only when asked; branch first if on main; required footers (Co-Authored-By + Claude-Session). Capture wire bytes before implementing — never guess.