histsdk/docs/plans/grpc-tooling-completion.md
Joseph Doherty 7e8bb07df3 docs(grpc): add gRPC tooling completion plan
Self-contained plan for finishing gRPC surface parity: live-verify the
sandbox-gated writes, port ReadEvents (CM_EVENT registration state machine),
SendEvent (capture-blocked), the SQL server-wall stretch, and optional
GetConnectionStatus. Includes the proven reuse pattern and live-verification setup
so it survives context compaction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-22 01:30:04 -04:00


# gRPC Tooling Completion Plan
Status as of 2026-06-22. Tracks the remaining work to finish tooling the AVEVA
Historian SDK's `RemoteGrpc` (2023 R2) transport so it reaches WCF surface parity.
Self-contained for pickup after context compaction.
## Where things stand
The gRPC transport already tools: probe, raw/aggregate/at-time reads, browse,
metadata, system-parameter, server time-zone, measured store-forward status,
`AddHistoricalValues` backfill write, **and** (newest, branch `grpc-config-ops`,
3 commits, NOT yet merged — `main` = `035d8a9`):
- `GetRuntimeParameterAsync` — ✅ live-verified
- `GetTagExtendedPropertiesAsync` (read) — ✅ live-verified
- `ExecuteSqlCommandAsync` — ⛔ server-walled, bounded behind `ProtocolEvidenceMissingException`
- `EnsureTag` / `DeleteTag` / `RenameTags` / `AddTagExtendedProperties` — 🧪 tooled + routed, sandbox-gated, **not yet run destructively live**
Test baseline: 317 offline green, 19 gRPC-live green. Relevant memory:
`project_grpc_config_ops_tooling`, `project_m0_grpc_parity`,
`project_roadmap_exhausted_2020wcf`, `reference_2023r2_live_server_access`,
`reference_wonder_sql_vd03_credentials`.
## Proven pattern (reuse for everything below)
A WCF config op is tooled over gRPC by reusing its **existing byte serializer/parser
verbatim** inside the protobuf `bytes` fields, keyed by the Open2 session handle:
- `HistorianGrpcConnection connection = HistorianGrpcChannelFactory.Create(options);`
- `HistorianGrpcHandshake.Session session = HistorianGrpcHandshake.OpenSession(connection, options, ct[, connectionMode]);`
- `session.StringHandle` = uppercase Open2 GUID → **string-handle** ops (Retrieval/Status/History string-handle RPCs).
- `session.ClientHandle` = transient `uint` → **uint-handle** ops (StartQuery, DeleteTags, GetNext*).
- write ops pass `connectionMode: HistorianWcfAuthChainHelper.NativeIntegratedWriteEnabledConnectionMode` (0x401).
- Call `new <Service>.<Service>Client(connection.Channel).<Rpc>(request, connection.Metadata, DateTime.UtcNow.Add(options.RequestTimeout), ct)`.
- Check `response.Status?.BSuccess`; decode error via `response.Status?.BtError` (hex = native byte0 0x84 + LE u32 code, often followed by facility/file/message ASCII — this decode cracked the SQL + extended-prop cases).
- The gRPC RetrievalService string-handle ops do NOT need the WCF `Retr.GetV` prime.
Proto field-name reference and WCF serializer signatures: see the mapping captured
in `project_grpc_config_ops_tooling` memory and `Grpc/Protos/*.proto`.
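
The `BtError` decode described above can be sketched as a small helper. This is a minimal illustration, not SDK code: `NativeErrorSketch` and `NativeError` are hypothetical names, and the layout (marker byte `0x84`, little-endian `u32` code in bytes 1..4, optional ASCII tail) is taken from the note above and assumed to hold for every error buffer.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

// Hypothetical helper for illustration; not part of the SDK.
public static class NativeErrorSketch
{
    public readonly record struct NativeError(uint Code, string Detail);

    // Assumed layout: byte 0 = 0x84 marker, bytes 1..4 = little-endian u32 code,
    // any remaining bytes = facility/file/message text in ASCII.
    public static bool TryDecode(ReadOnlySpan<byte> buffer, out NativeError error)
    {
        error = default;
        if (buffer.Length < 5 || buffer[0] != 0x84)
        {
            return false; // too short or not the expected marker
        }

        uint code = BinaryPrimitives.ReadUInt32LittleEndian(buffer.Slice(1, 4));

        // Keep printable ASCII and replace everything else with '.'.
        var detail = new StringBuilder();
        foreach (byte b in buffer.Slice(5))
        {
            detail.Append(b >= 0x20 && b < 0x7F ? (char)b : '.');
        }

        error = new NativeError(code, detail.ToString());
        return true;
    }
}
```

Under this assumed layout, the bytes `84 26 00 00 00` would decode to code 38 with an empty detail string. Returning `false` on an unexpected marker keeps the caller free to log the raw buffer instead of a misleading code.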
## Remaining items (priority order)
### 1. Live-verify the write ops (cheapest, highest-confidence-gain)
- **Goal:** flip the 🧪 writes to ✅ by running the gated lifecycle test against a sandbox tag.
- **How:** set `HISTORIAN_GRPC_WRITE_SANDBOX_TAG` to a throwaway name and run
`TagWriteLifecycle_OverGrpc_CreatesAddsPropRenamesDeletes` against the live 2023 R2 box.
- **Risk/gotcha:** if any write is rejected, the first fix is to add the WCF write
**priming discovery-dance** (`HistorianWcfTagWriteOrchestrator.RunWritePriming`:
UpdC3 + 6 `GetSystemParameter` + `AllowRenameTags` + Trx/Stat/Retr `GetV`) to
`HistorianGrpcTagWriteOrchestrator` over the gRPC StatusService/HistoryService.
Rename also needs server `AllowRenameTags` enabled. Needs explicit user OK to
mutate the shared server (they previously chose "no live mutate").
- **Files:** `tests/.../HistorianGrpcIntegrationTests.cs` (run only),
`src/.../Grpc/HistorianGrpcTagWriteOrchestrator.cs` (priming only if rejected).
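
The sandbox gate for the destructive lifecycle test might look like the sketch below. `WriteSandboxGate` is a hypothetical name and the existing test may gate differently; only the variable name `HISTORIAN_GRPC_WRITE_SANDBOX_TAG` comes from this plan. The environment lookup is passed in as a delegate so the gate can be exercised offline.

```csharp
#nullable enable
using System;

// Hypothetical gate for illustration; the real test may be structured differently.
public static class WriteSandboxGate
{
    public const string VariableName = "HISTORIAN_GRPC_WRITE_SANDBOX_TAG";

    // Returns true only when a non-blank sandbox tag name is configured,
    // so destructive writes never run by accident.
    public static bool TryGetSandboxTag(Func<string, string?> readEnv, out string tag)
    {
        tag = (readEnv(VariableName) ?? string.Empty).Trim();
        return tag.Length > 0;
    }
}
```

A live test would call `WriteSandboxGate.TryGetSandboxTag(Environment.GetEnvironmentVariable, out var tag)` and skip when it returns `false`.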
### 2. ReadEvents over gRPC (heaviest read op)
- **Goal:** route `ReadEventsAsync` over gRPC.
- **RPCs (exist):** `RetrievalService.StartEventQuery` (`uiHandle`, `uiQueryRequestType`,
`btRequest`) → `{Status, uiQueryHandle, btResonse}`; `GetNextEventQueryResultBuffer`
(`uiHandle`, `uiQueryHandle`) → `{Status, btResult}`; `EndEventQuery`.
- **Reuse:** `HistorianEventQueryProtocol.CreateStartEventQueryAttempts(...)` for the
request buffer (`QueryRequestTypeEvent`), `HistorianEventRowProtocol.Parse(...)` for rows.
- **The hard part — port the CM_EVENT registration state machine.** Without it,
`GetNextEventQueryResultBuffer` returns native error type=4 **code=85**. WCF does this
in `HistorianWcfEventOrchestrator.AddCmEventTagViaAddT`: UpdC3 → 6 system params →
`RegisterTags2` (CM_EVENT tag id `353b8145-5df0-4d46-a253-871aef49b321`, 24-byte
RTag2 buffer) → cross-service `GetV` → `EnsureTags2` (CM_EVENT CTagMetadata via
`HistorianAddTagsProtocol.SerializeCmEventCTagMetadata`). gRPC equivalents:
`HistoryService.RegisterTags`, `HistoryService.EnsureTags`,
`HistoryService.UpdateClientStatus`, `StatusService.GetSystemParameter`.
- **Approach:** new `Grpc/HistorianGrpcEventOrchestrator`. Open a read-only session,
replay the registration over gRPC (RegisterTags + EnsureTags + the discovery calls),
then run StartEventQuery → loop GetNextEventQueryResultBuffer → EndEventQuery, parsing
rows. Route in `Historian2020ProtocolDialect.ReadEventsAsync` on `UseGrpc`.
- **Verify:** live (read-only, safe) against the 2023 R2 box; dev box may return no
rows (env) — assert "no error 85 + chain completes," mirror the WCF event test.
- **Risk:** medium-high. Registration may need exact call ordering; capture the error
buffer (hex+ASCII) at each step if code 85 persists.
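
Because ordering is the main risk, one way to drive the registration is a runner that executes named steps in sequence and stops at the first failure with a hex+ASCII dump. The sketch below is hypothetical throughout: `StepResult` and `RegistrationChainSketch` are not SDK types, and each step is a delegate the caller supplies (for example wrappers around `UpdateClientStatus`, `GetSystemParameter`, `RegisterTags`, `EnsureTags`).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical types for illustration; none of these exist in the SDK.
public readonly record struct StepResult(bool Success, byte[] ErrorBuffer);

public static class RegistrationChainSketch
{
    // Runs the steps in the given order and stops at the first failure.
    // Returns the failing step's name (null if all succeeded) and a
    // hex+ASCII dump of its error buffer.
    public static (string? FailedStep, string Dump) Run(
        IEnumerable<(string Name, Func<StepResult> Call)> steps)
    {
        foreach (var (name, call) in steps)
        {
            StepResult result = call();
            if (!result.Success)
            {
                return (name, HexAscii(result.ErrorBuffer));
            }
        }
        return (null, string.Empty);
    }

    // Formats a buffer as "HH HH ... | ascii", with '.' for non-printable bytes.
    public static string HexAscii(byte[] buffer)
    {
        var hex = new StringBuilder();
        var ascii = new StringBuilder();
        foreach (byte b in buffer)
        {
            hex.Append(b.ToString("X2")).Append(' ');
            ascii.Append(b >= 0x20 && b < 0x7F ? (char)b : '.');
        }
        return hex.ToString().TrimEnd() + " | " + ascii.ToString();
    }
}
```

Keeping the order in a single list makes it cheap to reorder steps between live runs and compare the dumps when code 85 persists.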
### 3. SendEvent over gRPC
- **Goal:** route `SendEventAsync` over gRPC.
- **Blocker:** no distinct event-send RPC; WCF rides `AddStreamValues2` (the
`HistorianEventWriteProtocol.SerializeAddStreamValuesBuffer` VTQ). The gRPC framing is
**uncaptured** — needs a native-client gRPC capture before implementing (per
"capture first, never guess"). Depends on #2 (same CM_EVENT registration).
- **Risk:** high / blocked on capture. Lowest priority.
### 4. (Stretch) SQL server-wall investigation
- `ExecuteSqlCommand` over gRPC faults server-side in `CSrvDbConnection.ExecuteSqlCommand`
(IndexOutOfRange / native err 38) — a DB-connection precondition the managed session
doesn't establish. Next avenue: try a `HistoryService.RegisterTags`-family prime before
`ExecuteSqlCommand` (same fix that unblocked the M3 write path / OpenStorageConnection
class of wall). If it works, replace the bounded throw in `HistorianGrpcSqlClient` with
the real GetNextQueryResultBuffer fetch loop (already written there) and flip the test.
### 5. (Optional) GetConnectionStatus over gRPC
- Currently WCF-only, synthesized from an authenticated probe (no dedicated RPC on either
transport). Could synthesize the same over gRPC via `StatusService.PingServer` /
`GetHistorianConsoleStatus`. Low value; do only if parity is wanted.
### Out of scope
- `ReadBlocks` (`StartBlockRetrievalQuery`) — never captured on either transport; leave
throwing `ProtocolEvidenceMissingException`.
- `DeleteTagExtendedProperties` — server-blocked on WCF (per-connection working set);
gRPC's single multiplexed channel *might* fix it — opportunistic probe only.
## Live verification setup (every live run)
Tunnel to `WONDER-SQL-VD03` must be up (gRPC `localhost:32565`, TLS, cert CN
`WONDER-SQL-VD03`; hosts entry present). Creds in gitignored `wonder-sql-vd03.txt`
(**QUOTED, colon-delimited** — strip quotes; use the `domainusername`/`domainpassword`
NAM domain account, which works for Historian gRPC; `wonderapp` does NOT). Env:
```
HISTORIAN_GRPC_HOST=wonder-sql-vd03 HISTORIAN_GRPC_PORT=32565
HISTORIAN_GRPC_TLS=true HISTORIAN_GRPC_DNSID=WONDER-SQL-VD03
HISTORIAN_USER=<domain user> HISTORIAN_PASSWORD=<domain pass>
HISTORIAN_TEST_TAG=SysTimeSec
# writes only, destructive: HISTORIAN_GRPC_WRITE_SANDBOX_TAG=<throwaway>
# slow links: HISTORIAN_GRPC_TIMEOUT=120
```
Run a subset: `dotnet test ./Histsdk.slnx --no-build --filter "FullyQualifiedName~<name>"`.
Aggregate tests self-calibrate their window from a real raw sample (the box is idle/
not-collecting). Sanitization scan before any commit:
`wonder-sql-vd03|zimmer|nam\\|dohertj2|ADOBuild` over commit-safe files.
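
A live test fixture could read the variables above along the lines of the sketch below. `LiveGrpcSettings` is a hypothetical record, not the suite's actual reader, and treating `HISTORIAN_GRPC_TIMEOUT` as whole seconds is an assumption (the plan does not state the unit).

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical settings record for illustration; the test suite's reader may differ.
public sealed record LiveGrpcSettings(
    string Host, int Port, bool UseTls, string? DnsId,
    string? User, string? Password, string? TestTag, TimeSpan? Timeout)
{
    // Returns null when host or port is missing so callers can skip live tests.
    public static LiveGrpcSettings? FromEnvironment(Func<string, string?> readEnv)
    {
        string? host = readEnv("HISTORIAN_GRPC_HOST");
        if (string.IsNullOrWhiteSpace(host) ||
            !int.TryParse(readEnv("HISTORIAN_GRPC_PORT"), NumberStyles.None,
                CultureInfo.InvariantCulture, out int port))
        {
            return null;
        }

        bool useTls = bool.TryParse(readEnv("HISTORIAN_GRPC_TLS"), out bool tls) && tls;

        // Assumption: the timeout variable holds whole seconds.
        TimeSpan? timeout = null;
        if (int.TryParse(readEnv("HISTORIAN_GRPC_TIMEOUT"), NumberStyles.None,
                CultureInfo.InvariantCulture, out int seconds))
        {
            timeout = TimeSpan.FromSeconds(seconds);
        }

        return new LiveGrpcSettings(
            host, port, useTls, readEnv("HISTORIAN_GRPC_DNSID"),
            readEnv("HISTORIAN_USER"), readEnv("HISTORIAN_PASSWORD"),
            readEnv("HISTORIAN_TEST_TAG"), timeout);
    }
}
```

In a test this would be called as `LiveGrpcSettings.FromEnvironment(Environment.GetEnvironmentVariable)`, with a `null` result treated as "live box not configured".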
## Standing constraints
- Never commit credentials/hostnames/customer tag names/raw captures — placeholders only.
- `src/` stays pure managed .NET 10 (one allowed P/Invoke: SSPI). Never modify `current/`
or `aveva-install-*/`.
- Commit only when asked; branch first if on `main`; required footers
(Co-Authored-By + Claude-Session). Capture wire bytes before implementing — never guess.