085f01123c
Established that analog/state-summary queries are reachable on 2020 WCF — they
ride the proven uint-handle StartQuery2 path, and the request serializer already
carries QueryType/SummaryType/ColumnSelectorFlags. Located every decode target in
aahClientManaged.dll:
- CAnalogSummaryValue.UnpackFromValueBuffer (0x06000394) — row decoder
- CAnalogSummaryValue/Struct fields — Min/Max/First/Last/ValueCount/TimeGood/
Integral/IntegralOfSquares (+ per-field DateTimes, LinearIntegral)
- CStateSummaryStruct — MinContained/MaxContained/TotalContained/PartialStart/
PartialEnd/StateEntryCount
- QueryColumnSelector.Select{Analog,State,NonSummary}Columns — column flags
- INSQL_QUERYTYPE / HISTORIAN_SUMMARYTYPE — the query/summary enum values
UnpackFromValueBuffer is reader-call-based (no literal offsets), so a correct
parser needs a captured real buffer. Per project discipline no guessed summary
code was added to src/. New plan doc lays out the recover-params -> live-capture
-> decode -> implement+test path. Roadmap R1.8/R1.9 marked scoped/ready.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
213 lines
14 KiB
Markdown
# HCAL modern-.NET client — implementation roadmap

Ordered, actionable plan to grow **histsdk** from "reads + basic config" into a broad
HCAL replacement, built on the **2023 R2 gRPC transport**. Derived from
[`hcal-capability-matrix.md`](hcal-capability-matrix.md); event details in
[`histevents.md`](histevents.md).

> Move to the repo's `docs/plans/` when execution starts. Each work item lands as: a
> protocol serializer/parser + golden-byte unit test + an env-gated live integration
> test against the local Historian.

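The "golden-byte unit test + env-gated live test" pairing can be sketched as follows. This is an illustration in Python rather than the SDK's own code: `encode_request` and its byte layout are hypothetical stand-ins (not the Historian wire format), and the golden bytes are inlined where a real test would load them from `fixtures/protocol/<op>/`. Only the `HISTORIAN_HOST` variable name comes from this roadmap.

```python
import os
import struct
import unittest


def encode_request(handle: int, tag_name: str) -> bytes:
    """Hypothetical serializer: uint32 handle, uint16 byte length, UTF-16LE name."""
    name = tag_name.encode("utf-16-le")
    return struct.pack("<IH", handle, len(name)) + name


# Stand-in for a sanitized fixture file; the layout is an assumption of this sketch.
GOLDEN = bytes.fromhex("07000000" "0400" "54003100")


class GoldenByteTest(unittest.TestCase):
    """Deterministic: needs no server, compares output to recorded bytes."""

    def test_matches_fixture(self):
        self.assertEqual(encode_request(7, "T1"), GOLDEN)


@unittest.skipUnless(os.environ.get("HISTORIAN_HOST"), "HISTORIAN_HOST not set")
class LiveTest(unittest.TestCase):
    """Gated: skipped unless the environment names a live server."""

    def test_host_is_configured(self):
        # A real test would connect to the host and issue the op here.
        self.assertTrue(os.environ["HISTORIAN_HOST"])
```

The gate is evaluated when the class is defined, so an unset variable yields a reported skip rather than a silent pass.
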
## Progress (updated 2026-06-19)

- ✅ **R0.6 version gate** — `HistorianServerVersionGate` + `HistorianClientOptions.VerifyServerInterfaceVersion`;
  fail-closed on connect, wired into both WCF and gRPC paths. Supported versions are
  evidence-based (Hist=11, Retr=4, Trx=2; Status reachability-only), captured from the
  live server. 10 unit tests.
- ✅ **CW-1 capture pipeline** — `ProtocolCaptureSanitizer` + `ProtocolFixtureWriter` +
  `capture-tag-info` CLI command; produces sanitized `fixtures/protocol/<op>/` golden files.
  11 unit tests. First fixture: `get-tag-info/analog-*.json`.

> ⚠️ **Live-verification constraint:** the local Historian is **2020** (WCF, port 32568) — the
> 2023 R2 gRPC endpoint (32565) is absent. M0's gRPC routing (R0.1–R0.4) can be built and
> golden-byte/unit-tested here but **cannot be live-verified** without an actual 2023 R2 server.
> Treat gRPC ops as unverified until then; the byte payloads remain the proven 2020 protocol.

> 🔬 **M1a re-classification (2026-06-20).** Two "trivial" items were live-probed against the
> 2020 WCF server and found **not deliverable here**, both for evidence-backed reasons:
> - **R1.3 `GetServerTimeZoneAsync`** — `Status.GetSystemTimeZoneName` is a client-side *stub*
>   on 2020 (rc=0, empty value), same family as `GetServerTime`. gRPC/2023R2-only.
> - **R1.1 `ExecuteSqlCommandAsync`** — `ExeC` returns native error 51 (InvalidParameter);
>   the contract-3 string-handle ops require an unmapped native session/filter registration
>   step (the `StartTagQuery` wall).
>
> Takeaway: the M1a "cheap surface" is *cheap only on the 2023 R2 gRPC front door*. On 2020 WCF
> the boundary is the **handle type** (see the string-handle wall note under §1b and
> `docs/reverse-engineering/wcf-string-handle-wall.md`): **`uint`-handle ops work, `string`-handle
> ops are blocked.** GETHI/GetTepByNm were probed and confirmed blocked (not, as first guessed,
> reachable). The genuinely reachable next items on 2020 WCF are the remaining **`uint`-handle**
> ops: **R1.8/R1.9 StartQuery summary/state modes** and **R1.7 event filters** (filter bytes ride
> the proven `uint`-handle `StartEventQuery`). Everything string-handle waits on one RE target:
> the native session/filter registration.

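The handle-type boundary described above can be written down as a small lookup. This is a sketch only: the table restates the handle types recorded in this roadmap for R1.1 and R1.4–R1.9, the helper name `reachable_on_2020_wcf` is hypothetical, and nothing in the code probes a server.

```python
from enum import Enum


class HandleType(Enum):
    UINT = "uint"
    STRING = "string"


# Handle types as recorded in this roadmap; illustrative, not discovered by code.
OP_HANDLE_TYPES = {
    "Retrieval.StartQuery": HandleType.UINT,                           # R1.8 / R1.9
    "Retrieval.StartEventQuery": HandleType.UINT,                      # R1.7
    "Retrieval.ExecuteSqlCommand": HandleType.STRING,                  # R1.1
    "Status.GetHistorianInfo": HandleType.STRING,                      # R1.4
    "Retrieval.GetTagExtendedPropertiesFromName": HandleType.STRING,   # R1.5
    "Retrieval.GetTagLocalizedPropertiesFromName": HandleType.STRING,  # R1.6
}


def reachable_on_2020_wcf(op: str) -> bool:
    """True only for ops recorded as uint-handle; unknown ops fail closed."""
    return OP_HANDLE_TYPES.get(op) is HandleType.UINT
```

Returning `False` for unlisted ops mirrors the fail-closed stance elsewhere in this plan: an op counts as reachable only when there is evidence for it.
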
## Guiding principles

1. **gRPC-first.** New ops go on the `RemoteGrpc` transport (clean protobuf envelope);
   the inner `bytes` blob is the only thing to RE. Keep WCF as the legacy/Windows path.
2. **Two tests per op, always.** A golden-byte test (deterministic, no server) **and** a
   gated live test (`HISTORIAN_GRPC_HOST` / `HISTORIAN_HOST`). No op is "done" without both.
3. **Version-pin, fail closed.** Read server version at connect; gate every byte
   serializer on it; throw `ProtocolEvidenceMissingException` on mismatch — never
   best-effort parse.
4. **Capture once, encode forever.** For CAPTURE-tier items, instrument one native call,
   save a sanitized fixture under `fixtures/protocol/`, then implement against the fixture.
5. **Ship per milestone.** Each milestone is independently releasable.

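Principle 3 can be illustrated with a minimal fail-closed gate. This is a Python sketch, not the SDK's `HistorianServerVersionGate`: the function name and the `status_reachable` flag are assumptions, and `ProtocolEvidenceMissingError` is a stand-in for the .NET `ProtocolEvidenceMissingException`. The version numbers are the evidence-based ones listed under Progress (Hist=11, Retr=4, Trx=2; Status is reachability-only).

```python
from typing import Mapping

# Interface versions captured from the live server, per the Progress section.
SUPPORTED_INTERFACE_VERSIONS = {"Hist": 11, "Retr": 4, "Trx": 2}


class ProtocolEvidenceMissingError(Exception):
    """Stand-in for the SDK's ProtocolEvidenceMissingException."""


def verify_server(reported: Mapping[str, int], status_reachable: bool) -> None:
    """Raise unless every pinned interface reports exactly the proven version."""
    if not status_reachable:
        raise ProtocolEvidenceMissingError("Status interface is not reachable")
    for interface, expected in SUPPORTED_INTERFACE_VERSIONS.items():
        actual = reported.get(interface)
        if actual != expected:
            # A missing or newer version is treated the same way: no evidence, no parse.
            raise ProtocolEvidenceMissingError(
                f"{interface}: expected version {expected}, server reported {actual}"
            )
```

A server that reports a higher version is rejected as well, since "newer" is not evidence that the byte layout is unchanged.
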
Effort: **S** ≈ days · **M** ≈ 1 week · **L** ≈ weeks. Estimates are incremental on
histsdk's existing infra (auth chain, transport, frame primitives, test harness).

---

## Milestone 0 — Foundation: full gRPC parity for the DONE surface (M)

*Goal: everything already working over WCF also works over `RemoteGrpc`, so the whole
read/browse/status surface is Windows-free and the gRPC stack is the default path.*

| ID | Work | gRPC op | Files | Verify | Effort |
|---|---|---|---|---|---|
| R0.1 | Route browse over gRPC | `Retrieval.StartTagQuery`/`QueryTag` or `GetTagInfosFromName` | `Grpc/HistorianGrpcReadOrchestrator` (+ new `…GrpcBrowseClient`), `Historian2020ProtocolDialect` | browse tags live over gRPC | S |
| R0.2 | Route tag metadata over gRPC | `Retrieval.GetTagInfosFromName` | dialect + grpc client | metadata matches WCF result | S |
| R0.3 | Route status/system-param over gRPC | `Status.GetSystemParameter`, `Status.GetHistorianConsoleStatus` | new `Grpc/HistorianGrpcStatusClient` | system param + conn status live | S |
| R0.4 | Probe over gRPC | `*.GetInterfaceVersion` | grpc clients | `ProbeAsync` Windows-free | XS |
| R0.5 | **Capture harness for gRPC payloads** | n/a | reuse `instrument-wcf-*` tooling (same byte blobs) + add a `grpc-call-dump` helper | dump any request/response `bytes` to a fixture | S |
| R0.6 | **Version gate** | server version at connect | `HistorianClientOptions`, orchestrators | mismatched version → throws | S |

**Acceptance:** the entire Phase-0 capability set runs end-to-end over `RemoteGrpc`
(incl. Linux), no WCF on the path. 188+ unit tests green; live gRPC integration suite green.

---

## Milestone 1 — Cheap surface completion (TRIVIAL/BOUNDED) (M–L total)

*Goal: knock out the remaining read/config surface. Order = ascending payload difficulty.*

### 1a. Trivial (XS–S each, no new payload format)
| ID | Capability | gRPC op | Notes |
|---|---|---|---|
| ~~R1.1~~ | ~~`ExecuteSqlCommandAsync`~~ | `Retrieval.ExecuteSqlCommand` | ⚠ **Blocked on 2020 WCF.** Live-probed 2026-06-20: `ExeC` returns native error type 4 / code **51 (InvalidParameter)** for every handle variant — same unmapped *native session/filter registration* prerequisite that blocks `StartTagQuery`/`QueryTag` (see `implementation-status.md` lines ~982, ~1404). Needs that registration RE'd, or a 2023 R2 gRPC server. Do not wire via guessed calls. |
| R1.2 | `GetRuntimeParameterAsync` | `Status.GetRuntimeParameter` | mirror `GetSystemParameter` |
| ~~R1.3~~ | ~~`GetServerTimeZoneAsync`~~ | `Status.GetSystemTimeZoneName` | ⚠ **gRPC/2023R2-only.** Verified 2026-06-20: over **2020 WCF** this op is a stub (rc=0, empty value) in the `GetServerTime` family — not shippable here. Build+verify only against a live 2023 R2 server. See `docs/reverse-engineering/wcf-status-localhost.md`. |

> ⛔ **String-handle wall (2026-06-20).** R1.4/R1.5/R1.6 (and R1.1) are **all blocked on 2020
> WCF** for the *same* reason: their ops take a **`string` GUID handle** and require an unmapped
> native session/filter registration. Probed live — GETHI returns code 1 for the exact native
> request shape across 5 handle formats + Stat.GetV priming; ExeC returns code 51. The proven
> surface uses **`uint`-handle** ops only. **One RE target — the native string-handle session
> registration — unblocks this whole sub-milestone.** Full analysis:
> `docs/reverse-engineering/wcf-string-handle-wall.md`. R1.8/R1.9 (StartQuery summary/state modes)
> are `uint`-handle and remain reachable on 2020 WCF.

### 1b. Bounded (decode one `bytes` payload; S–M each)
| ID | Capability | gRPC op | Payload to decode | Depends |
|---|---|---|---|---|
| ~~R1.4~~ | `GetHistorianInfoAsync` | `Status.GetHistorianInfo` | ⛔ **string-handle wall** — GETHI returns code 1 on 2020 WCF (all handle/priming variants). GETHI buffer incl. `EventStorageMode`@514. | string-handle RE |
| ~~R1.5~~ | Extended-property **read** | `Retrieval.GetTagExtendedPropertiesFromName` | ⛔ **string-handle wall** (GetTepByNm takes `string handle`). TEP result buffer. | string-handle RE |
| ~~R1.6~~ | Localized-property **read** | `Retrieval.GetTagLocalizedPropertiesFromName` | ⛔ **string-handle wall** (same family). | string-handle RE |
| R1.7 | Event **filters** | filter bytes in `Retrieval.StartEventQuery` | filter predicate encoding (name/op/value) — **`uint`-handle**, reachable | R0.5 |
| R1.8 | Analog-summary query | `Retrieval.StartQuery` (summary mode) | summary row layout — **`uint`-handle, reachable. Scoped + decode targets located** (`CAnalogSummaryValue.UnpackFromValueBuffer`, fields Min/Max/First/Last/ValueCount/Integral/…). Plan: [`r1.8-r1.9-summary-queries.md`](r1.8-r1.9-summary-queries.md) | — |
| R1.9 | State-summary query | `Retrieval.StartQuery` (state mode) | state-summary row layout — **`uint`-handle, reachable. Scoped** (`CStateSummaryStruct`: MinContained/MaxContained/TotalContained/PartialStart/PartialEnd/StateEntryCount). Plan: [`r1.8-r1.9-summary-queries.md`](r1.8-r1.9-summary-queries.md) | — |

### 1c. Bounded config writes (S–M each)
| ID | Capability | gRPC op | Payload | Notes |
|---|---|---|---|---|
| R1.10 | `RenameTagsAsync` | History rename op | rename request buffer | `AllowRenameTags` already probed |
| R1.11 | Extended-property **write** | `History.AddTagExtendedProperties` (+ groups) / `DeleteTagExtendedProperties` | TEP serialize | mirror analog CTagMetadata discipline |
| R1.12 | Localized-property **write** | `History.AddTagLocalizedProperties` / `DeleteTagLocalizedProperties` | localized serialize | |
| R1.13 | Non-analog tag create (string/discrete) | `History.EnsureTags` | distinct CTagMetadata variant | ⚠ native AddTag rejected some types — confirm server path first; may be GATED |

**Acceptance:** read + browse + metadata + system/status + property R/W + summaries +
event-filtered reads + rename all live-verified over gRPC.

---

## Milestone 2 — Event sending (CAPTURE) (S–M) ← headline gap

*Goal: `SendEventAsync(HistorianEvent)`. Path fully mapped in histevents.md; one capture away.*

| ID | Work | Detail |
|---|---|---|
| R2.1 | Capture the event value blob | Instrument `CCommonArchestraEventValue::PackToVtq` (or dump the VTQ value bytes) on a live `AddStreamedValue(HistorianEvent)`; save sanitized fixture |
| R2.2 | `HistorianEventWriteProtocol` | Serialize header (`ReceivedTime, EventType, EventTime, Id, RevisionVersion, IsUpdate/IsDelete, Namespace`) + typed property bag — **inverse of `HistorianEventRowProtocol`** (reuse typemarkers `0x02/0x10/0x18/0x31/0x43/…`) |
| R2.3 | Event write orchestrator | Open **Event** connection (write mode) → register CM_EVENT (already have) → `Storage.AddStreamValues` with the event VTQ |
| R2.4 | Public API | `HistorianClient.SendEventAsync(HistorianEvent)` (+ `HistorianEvent` model: Type, EventTime, property bag) |
| R2.5 | Round-trip test | Send an event → read it back via `StartEventQuery` / `v_AlarmEventHistory2`; golden-byte on R2.2 |

**Acceptance:** an event sent from histsdk appears in the historian and is read back with
matching Type + properties. **Now practical** — Historian is installed locally.

---

## Milestone 3 — Historical / non-streamed value writes (BOUNDED) (M)

*Goal: insert original historical VTQs (backfill), the path that is NOT the gated cache push.*

| ID | Work | gRPC op |
|---|---|---|
| R3.1 | Decode non-streamed VTQ packet | `Transaction.AddNonStreamValuesBegin/AddNonStreamValues/End` |
| R3.2 | `AddHistoricalValuesAsync` | batched begin→values→end |
| R3.3 | Ingest-permission validation | confirm the target accepts original-data insert (distinct from `AddS2` cache wall) |

**Acceptance:** historical points inserted and read back. Document clearly where this
differs from (gated) streaming sample writes.

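The "batched begin→values→end" shape of R3.2 can be sketched as below. Everything here is an assumption made for illustration: the `begin` / `add_values` / `end` method names, the integer transaction handle, the `(timestamp, value, quality)` tuple, and the default batch size are hypothetical, and whether the real `End` op should be sent after a failed batch is not established by this roadmap.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Vtq = Tuple[float, float, int]  # placeholder (timestamp, value, quality)


def batched(items: Iterable[Vtq], size: int) -> Iterator[List[Vtq]]:
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


class RecordingTransport:
    """Fake transport that records the call sequence instead of sending bytes."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def begin(self) -> int:
        self.calls.append("begin")
        return 1  # hypothetical transaction handle

    def add_values(self, txn: int, batch: List[Vtq]) -> None:
        self.calls.append(f"values:{len(batch)}")

    def end(self, txn: int) -> None:
        self.calls.append("end")


def add_historical_values(transport, values: Iterable[Vtq], batch_size: int = 500) -> int:
    """Send values in batches inside one begin/end bracket; return the count sent."""
    txn = transport.begin()
    sent = 0
    try:
        for batch in batched(values, batch_size):
            transport.add_values(txn, batch)
            sent += len(batch)
    finally:
        # Assumption: the bracket is always closed, even if a batch fails.
        transport.end(txn)
    return sent
```
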

---

## Milestone 4 — HARD subsystems (deferred / optional) (L each)

Only if the use case demands them. Each is a real subsystem, not an op.

| ID | Capability | Approach | Risk |
|---|---|---|---|
| R4.1 | Store-and-forward | **Pragmatic local queue** (durable outbox + replay on reconnect) rather than bit-faithful SF cache + `Forward*Snapshot`. Faithful SF = decode SF cache format + snapshot framing + recovery log | high; consider "good enough" |
| R4.2 | Revision / edit writes | `AddRevisionValue(s)` go via the **non-WCF storage-engine pipe** (`STransactPipeClient2`) — separate transport RE | high |
| R4.3 | Real store-forward **status** | duplex push (`SetStoreForwardEvent`) or a decoded pull endpoint — see store-forward plan | medium |
| R4.4 | Multi-historian / redundancy | client-side orchestration over N single-historian sessions (failover, ReSyncTags, partner watchdog) — build last | medium |

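The "pragmatic local queue" option in R4.1 (durable outbox + replay on reconnect) could look roughly like this. It is a sketch under stated assumptions, not a design decision: the SQLite storage, the `Outbox` class, and the rule "stop replay at the first `ConnectionError`" are all hypothetical choices, and the payload is treated as an opaque byte blob.

```python
import sqlite3
from typing import Callable


class Outbox:
    """Durable FIFO of opaque payloads; rows are deleted only after a successful send."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path for durability; ":memory:" is for demonstration only.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload BLOB NOT NULL)"
        )
        self._db.commit()

    def enqueue(self, payload: bytes) -> None:
        self._db.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
        self._db.commit()

    def pending(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]

    def replay(self, send: Callable[[bytes], None]) -> int:
        """Send queued payloads oldest-first; stop at the first connection failure."""
        delivered = 0
        rows = self._db.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
        for row_id, payload in rows:
            try:
                send(payload)
            except ConnectionError:
                break  # keep this row and everything after it for the next reconnect
            self._db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            self._db.commit()
            delivered += 1
        return delivered
```
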

---

## Won't-do from the client (GATED)

- **Streaming process-sample writes** (`AddStreamedValue(HistorianDataValue)` / `AddS2`):
  runtime cache only ingests from configured IOServer/AppServer pipelines. Confirm your
  ingestion architecture instead of pursuing this.

---

## Cross-cutting workstreams (run alongside all milestones)

- **CW-1 Capture tooling** (enables R0.5, R1.x, R2.1): one reusable "call op → dump
  request/response `bytes` → sanitized fixture" path. Highest leverage — do first.
- **CW-2 Version compatibility:** matrix of tested Historian versions; serializers keyed
  by version; CI gate.
- **CW-3 Cross-platform CI:** run the gRPC suite on Linux/macOS (transport is portable;
  explicit-cred auth path).
- **CW-4 Fixtures discipline:** every new op ships a `fixtures/protocol/<op>/` golden file;
  sanitize hostnames/tags/GUIDs before commit.
- **CW-5 Public API shape:** keep the modern surface (async, `IAsyncEnumerable`,
  cancellation, options record, DI-friendly) consistent as the surface grows.

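The sanitize step in CW-1 and CW-4 can be sketched as a text pass over a captured fixture. This does not describe how `ProtocolCaptureSanitizer` actually works: the `sanitize` function, the `HOST` and `GUID-n` placeholders, and the caller-supplied hostname list are assumptions of this example. Mapping each distinct GUID to a stable placeholder keeps cross-references inside a fixture intact.

```python
import re
from typing import Dict, Iterable

# Canonical 8-4-4-4-12 hexadecimal GUID text form.
GUID_RE = re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")


def sanitize(text: str, hostnames: Iterable[str]) -> str:
    """Replace known hostnames with HOST and each distinct GUID with GUID-<n>."""
    seen: Dict[str, str] = {}

    def replace_guid(match: "re.Match[str]") -> str:
        key = match.group(0).lower()  # same GUID in any letter case maps to one placeholder
        return seen.setdefault(key, f"GUID-{len(seen) + 1}")

    for host in hostnames:
        text = re.sub(re.escape(host), "HOST", text, flags=re.IGNORECASE)
    return GUID_RE.sub(replace_guid, text)
```
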

---

## Sequencing (critical path)

```
CW-1 capture tooling ─┐
M0 gRPC parity ───────┼─→ M1 cheap surface ─→ M2 event send ─→ M3 historical writes ─→ (M4 optional)
R0.6 version gate ────┘
```

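The same critical path can be expressed as a dependency map and ordered mechanically, which may help if the plan grows more branches. The sketch below uses Python's standard `graphlib`; the node labels are taken from the diagram, and the mapping direction (each item lists what it depends on) is how `TopologicalSorter` expects its input.

```python
from graphlib import TopologicalSorter

# Each key depends on every item in its set (edges read off the diagram above).
DEPENDS_ON = {
    "M1 cheap surface": {"CW-1 capture tooling", "M0 gRPC parity", "R0.6 version gate"},
    "M2 event send": {"M1 cheap surface"},
    "M3 historical writes": {"M2 event send"},
    "M4 optional": {"M3 historical writes"},
}

# Prerequisites always appear before the items that need them.
ORDER = list(TopologicalSorter(DEPENDS_ON).static_order())
```

The three prerequisites of M1 have no ordering among themselves, so their relative position in `ORDER` is not significant.
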
Recommended first sprint: **CW-1 + M0 (R0.1–R0.6)** → a fully Windows-free, version-safe
gRPC client at today's capability. Second sprint: **M1a + M2** (cheap wins + the headline
event-send). M3/M4 as demand dictates.

## One-glance status

| Milestone | Tier | Effort | Value | When |
|---|---|---|---|---|
| M0 gRPC parity + capture tooling | foundation | M | unblocks everything, Windows-free | **now** |
| M1 cheap surface | TRIVIAL/BOUNDED | M–L | most remaining read/config | next |
| M2 event send | CAPTURE | S–M | headline write capability | next |
| M3 historical writes | BOUNDED | M | backfill | on demand |
| M4 SF / revisions / redundancy | HARD | L×N | parity completeness | defer |