docs(grpc): transport matrix + plan reflect ReadEvents + live-verified writes

- README transport matrix: gRPC writes (EnsureTag/DeleteTag/RenameTags/
  AddTagExtendedProperties) flip to live-verified; note the async-rename retry and
  the extended-property read-back parser gap. ReadEvents gRPC -> tooled-but-bounded
  (StartEventQuery works, GetNext long-polls, throws on no-row pending an
  event-bearing server). Refresh the closing production-pattern guidance.
- grpc-tooling-completion.md: mark items #1 (writes, done) and #2 (ReadEvents,
  tooled/bounded) with the live outcomes and follow-ups.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
Joseph Doherty
2026-06-22 04:58:44 -04:00
parent 274466c050
commit 27e969f86d
2 changed files with 49 additions and 11 deletions
+32 -2
@@ -15,6 +15,7 @@ metadata, system-parameter, server time-zone, measured store-forward status,
- `GetTagExtendedPropertiesAsync` (read) — ✅ live-verified
- `ExecuteSqlCommandAsync` — ⛔ server-walled, bounded behind `ProtocolEvidenceMissingException`
- `EnsureTag` / `DeleteTag` / `RenameTags` / `AddTagExtendedProperties` — 🧪 tooled + routed, sandbox-gated, **not yet run destructively live**
- `ReadEventsAsync` — ⚠️ tooled + routed 2026-06-22 (item #2 below): chain runs, `StartEventQuery` succeeds, but `GetNextEventQueryResultBuffer` long-polls on no data; hard-bounded (≤30s) and throws `ProtocolEvidenceMissingException` on the no-row path. Row retrieval pending an event-bearing server.
Test baseline: 317 offline green, 19 gRPC-live green. Relevant memory:
`project_grpc_config_ops_tooling`, `project_m0_grpc_parity`,
@@ -40,7 +41,20 @@ in `project_grpc_config_ops_tooling` memory and `Grpc/Protos/*.proto`.
## Remaining items (priority order)
### 1. Live-verify the write ops (cheapest, highest-confidence-gain)
### 1. Live-verify the write ops — ✅ DONE 2026-06-22
**Outcome:** ran the gated lifecycle against a synthetic sandbox tag (`ZZ_SdkGrpcWriteProbe`); the
writes flip 🧪→✅. `EnsureTags` (create), `AddTagExtendedProperties`, `StartJob` rename, and
`DeleteTags` all succeed live over gRPC (write-enabled 0x401 session, WCF serializers reused) — NO
priming discovery-dance needed. Two findings: (a) **rename** is an async StartJob that the server can
transiently reject right after the create commits and on target-name collision — the test now
pre-cleans both names and retries rename (4×); callers should likewise retry. (b) **reading a written
extended property back** via `GetTagExtendedPropertiesAsync` hits a shared-parser evidence gap (value
marker `0x01` where the parser expects compact-string `0x09`) — a read-side gap, not a write failure;
the test tolerates it. The lifecycle test is self-cleaning and asserts no litter remains (verified over two
consecutive clean passes). Next read-side follow-up: capture the `0x01` extended-property value
encoding and extend `HistorianTagExtendedPropertyProtocol.ParseResponse`.
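
A minimal sketch of the caller-side retry suggested in finding (a), written in Python purely for illustration (the SDK itself is not Python). `TransientRenameRejected`, `rename_with_retry`, the linear back-off, and the reading of "4×" as four attempts in total are all assumptions, not names or behavior taken from the SDK:

```python
import time


class TransientRenameRejected(Exception):
    """Hypothetical stand-in for a transient server rejection of the rename job."""


def rename_with_retry(start_rename, attempts=4, delay_s=0.5, sleep=time.sleep):
    """Call start_rename() up to `attempts` times, backing off between tries.

    start_rename is any zero-argument callable that raises
    TransientRenameRejected when the server rejects the job.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return start_rename()
        except TransientRenameRejected as exc:
            last_error = exc
            if attempt < attempts:
                # Assumed linear back-off; the real delay policy is not documented here.
                sleep(delay_s * attempt)
    raise last_error
```

Only the transient rejection is retried; any other exception propagates immediately, so a genuine failure is not masked by the loop.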
_Original notes:_
- **Goal:** flip the 🧪 writes to ✅ by running the gated lifecycle test against a sandbox tag.
- **How:** set `HISTORIAN_GRPC_WRITE_SANDBOX_TAG` to a throwaway name and run
`TagWriteLifecycle_OverGrpc_CreatesAddsPropRenamesDeletes` against the live 2023 R2 box.
@@ -53,7 +67,23 @@ in `project_grpc_config_ops_tooling` memory and `Grpc/Protos/*.proto`.
- **Files:** `tests/.../HistorianGrpcIntegrationTests.cs` (run only),
`src/.../Grpc/HistorianGrpcTagWriteOrchestrator.cs` (priming only if rejected).
### 2. ReadEvents over gRPC (heaviest read op)
### 2. ReadEvents over gRPC (heaviest read op) — ✅ TOOLED 2026-06-22 (rows pending event-bearing server)
**Outcome:** `ReadEventsAsync` is routed over gRPC (`HistorianGrpcEventOrchestrator`). The CM_EVENT
registration replay (`UpdateClientStatus`→6 `GetSystemParameter`→`RegisterTags`→cross-service version
probes→`EnsureTags`, captured buffers shared with WCF via `HistorianEventRegistrationProtocol`) runs
and **`StartEventQuery` succeeds live**. The blocker that remains is server behavior, not the port:
`GetNextEventQueryResultBuffer` **long-polls** when the query has no rows — it blocks to the call
deadline instead of returning the synchronous 5-byte type=4 code=85 terminal the 2020 WCF op returns.
Per-call gRPC-Web deadlines proved unreliable over the tunnel (a 4s-deadline chain still ran >90s), so
the read is hard-bounded by an **overall linked-CTS budget** (≤30s, scaled to `RequestTimeout`); gRPC
honors token cancellation. On the no-row path the orchestrator throws `ProtocolEvidenceMissingException`
rather than return a falsely empty list. The idle dev box holds no events, so **row-level retrieval is
not yet live-verified** — flip the gated test
`ReadEventsAsync_OverGrpc_StartsQueryButRowRetrievalIsLongPollBlocked` to assert parsed rows once an
event-bearing 2023 R2 server is available (and consider whether the long-poll needs a "fetch historical
then stop" request flag the native client may set). README row is ⚠️.
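
The overall-budget idea above can be sketched as follows, in Python with `asyncio` as an illustrative analogue of the linked-CTS pattern (the SDK's own code is not shown here). `read_events_bounded`, `ProtocolEvidenceMissing`, the two callables, and the fixed 30-second default are hypothetical; the real budget is described as scaled to `RequestTimeout`:

```python
import asyncio


class ProtocolEvidenceMissing(Exception):
    """Illustrative stand-in for ProtocolEvidenceMissingException."""


async def read_events_bounded(start_query, get_next_buffer, budget_s=30.0):
    """Run the whole query chain under one overall time budget.

    start_query and get_next_buffer are hypothetical async callables
    standing in for StartEventQuery and GetNextEventQueryResultBuffer.
    """

    async def chain():
        handle = await start_query()
        # May long-poll indefinitely when the query has no rows.
        return await get_next_buffer(handle)

    try:
        # One budget for the whole chain; on expiry the chain is cancelled.
        return await asyncio.wait_for(chain(), timeout=budget_s)
    except asyncio.TimeoutError:
        # Surface the gap instead of pretending the result was empty.
        raise ProtocolEvidenceMissing(
            "no rows before the overall budget elapsed"
        ) from None
```

Bounding the chain as a whole, rather than trusting each call's own deadline, means a single stalled call cannot push the total run time past the budget.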
_Original notes (still the reference for the registration replay):_
- **Goal:** route `ReadEventsAsync` over gRPC.
- **RPCs (exist):** `RetrievalService.StartEventQuery` (`uiHandle`, `uiQueryRequestType`,
`btRequest`) → `{Status, uiQueryHandle, btResonse}`; `GetNextEventQueryResultBuffer`