histsdk/docs/plans/store-forward-cache-reverse-engineering.md
Joseph Doherty c2d8fb9bc8 R4.3: gRPC store-forward status probe + re-scope
Add HistorianGrpcStoreForwardStatusProbe and the `grpc-sf-status-probe` CLI
command. The idle-baseline run against the live 2023 R2 server resolves the
plan's §9.3 handle question: the direct StorageService SF pull RPCs
(GetSFParameter / GetRemainingSnapshotsSize) require the OpenStorageConnection
console handle and are D2-gated (err 132, identical under read-only and
write-enabled sessions), while StatusService.GetHistorianConsoleStatus IS
reachable on the session string handle (=3 at idle).

Records the gRPC re-scope and the idle-baseline findings in
docs/plans/store-forward-cache-reverse-engineering.md §9. The probe writes
nothing and releases any console session immediately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 23:14:05 -04:00


Store/Forward Cache Reverse-Engineering Plan

Last updated: 2026-06-21

2026-06-21 R4.3 re-scope — read this first. The original plan below (2026-05-04) was written against the 2020 Net.TCP/WCF transport, before the 2023 R2 gRPC transport existed. Its single biggest open risk — "is SF state readable via a one-shot pull, or only via a duplex push contract we'd have to add?" (Q1/Q2 + §3 Step 3 + Risk 4) — is now answered: pull, no duplex. The recovered gRPC StorageService contract exposes SF state as plain request/response RPCs. The current R4.3 scope and recommended path are in §9 ("2026-06-21 gRPC re-scope"); the 2020-WCF body below is retained as background, not the recommended route.

Original last-updated: 2026-05-04

This document plans the reverse-engineering effort needed to replace the synthesized GetStoreForwardStatusAsync in src/AVEVA.Historian.Client/Wcf/HistorianWcfStatusClient.cs (lines 101-117) with a real, evidence-backed implementation. It is a plan, not the work itself. No code changes; no captures collected.

Read this together with:

  • docs/reverse-engineering/handoff.md — read/event protocol decoding state
  • src/AVEVA.Historian.Client/Wcf/Contracts/IStorageServiceContract.cs — the WCF contract that already declares the SF parameter ops
  • src/AVEVA.Historian.Client/Models/HistorianStoreForwardStatus.cs — the output model the implementation must populate

1. Goal

"SF support works" means, end-to-end:

  1. Primary deliverable. client.GetStoreForwardStatusAsync() against a live local Historian returns a HistorianStoreForwardStatus whose Pending, Storing, DataStored, ErrorOccurred, Error, ServerName, and ConnectionKind fields reflect actual server-reported state, not the synthesized defaults at HistorianWcfStatusClient.cs:107-117.
  2. Secondary deliverable. The SDK can also answer the higher-level "is SF currently buffering?" question accurately when the runtime DB is down, not just when it is up. That is the case the real native client handles correctly and where the synthesized default (Storing = false, ErrorOccurred = false) is silently wrong today.
  3. Non-goals. Writing into SF, replaying SF buffers, configuring SF parameters, redundant-partner SF aggregation (HistorianStoreForwardStatus.AddPartnerStoreForwardStatus, token 0x060060B8). Read-only matches the project mission in CLAUDE.md.

The success bar is parity with the native wrapper's ArchestrA.HistorianAccess.GetStoreForwardStatus (MD token 0x06006186 in current/aahClientManaged.dll), not a superset.

2. Architecture Investigation (open questions, in priority order)

Answer these before writing any production code. Each has a discovery action in §3.

Q1. Is SF status read from a local in-process struct, a separate WCF endpoint, or a Named Pipe IPC?

Current evidence: all three are plausible, but the wrapper actually uses "in-process struct kept current by server-pushed WCF events". Specifically:

  • ArchestrA.HistorianAccess.GetStoreForwardStatus (token 0x06006187, the private 2-arg overload) does not call WCF. It calls mdas_GetStorageStatus (a calli against the INSQL_MDAS_ERROR (IntPtr handle, uint, HISTORIAN_STORAGE_STATUS*) C signature in current/aahClient.dll exports) and then maps the result through HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus (token 0x060060E4).
  • Mutators like CConfigStatusClient.SetMdasStoreForwardEvent (token 0x060029DC) and aahClientCommon.CStatus.SetStoreForwardEvent (token 0x06002A04) are wired to the WCF callback IStatusServiceContract2.SetStoreForwardEvent (StatusServiceContract.IStatusServiceContract2.SetStoreForwardEvent, token 0x06005F57). The server pushes SF state changes; the client caches them.
  • Confirm: read the IL of token 0x06006187 and verify the only system call is mdas_GetStorageStatus. The first 200 instructions confirm this: GetClient(ConnectionIndex) → calli against the INSQL_MDAS_ERROR(IntPtr,uint,HISTORIAN_STORAGE_STATUS*) signature → ConvertUnmanagedSFStorageStatusToManagedStorageStatus.

Implication: the SDK cannot ship a synchronous probe that calls one WCF operation and gets the answer. It must subscribe to the same status-event stream the native wrapper subscribes to, or call a status query that returns the cached snapshot from the server.

Q2. Is there a single-shot WCF query that returns the same snapshot?

Likely yes. Hypothesis: IStatusServiceContract2.GetHistorianInfo (GETHI, see IStatusServiceContract2.cs:24-30) returns a multi-key status blob whose schema includes SF state. Alternative: a status-only key passed to GetSystemParameter (already plumbed via HistorianWcfStatusClient.GetSystemParameterAsync). Both are testable without writing protocol code by sending probe payloads and observing the response shape.

Q3. Does SF have its own sidecar process / pipe / WCF endpoint we are missing?

Strong evidence the answer is yes when SF is enabled:

  • aahClientCommon.CSFConnection.GetSFPipeName (token 0x06004B72), GetSFPath (0x06004B71), IsConnected (0x06004B73), IsEnabled (0x06004B6F) — there is a separately-named SF Named Pipe distinct from the main MDAS pipe.
  • aahClientCommon.CSFConnection.StartStoreforward (token 0x06004BC6).
  • IStorageServiceContract already declares GetStoreForwardParameter / SetStoreForwardParameter (GetSFP/SetSFP, see IStorageServiceContract.cs:81-85) and Storage is a separate WCF service slot in HistorianWcfServiceNames.cs:15.
  • CWcfConfig.ConfigurePipeProxy<IStorageServiceContract> (token 0x06004B1C) and CWcfConfig.ConfigureTcpProxy<IStorageServiceContract> (token 0x06004B1B) confirm the storage proxy supports both transports — same dual-transport pattern the History/Retrieval proxies use.
  • CStorageEngineConsoleClient.GetPipeNameStr (token 0x06000E2D) / GetFullPipeNameStr (token 0x06000E2E) wraps the storage-engine console pipe via STransactPipeClient2 (a non-WCF binary pipe protocol).

Open: is the SF sidecar even running on the dev host this SDK is being tested against? handoff.md does not record an SF process being observed. aveva-install-x64/ and aveva-install-x86/ ship only DLLs (no aahStoreForwardClient.exe / aahSFClient.exe / similar). The SF sidecar is part of the Historian server install, not the client redistributable. So:

  • On the developer machine, SF is reachable only because the local Historian server is installed.
  • A pure-client install (the deployment target this SDK ships into) may never have SF.

This shapes the success criteria: when SF is not configured, a correct implementation returns Pending = false, ErrorOccurred = false, DataStored = false, Storing = false — i.e. the same shape the synthesized defaults produce today. The interesting case is when SF is configured and active.

Q4. Is SF state authoritative on the Historian server or on a per-client basis?

Native wrapper reads it from HistorianClient* (the per-connection C++ object). This means it is connection-scoped server-pushed state. We do not need to enumerate cluster-wide SF state — the server reports "my SF buffer for this client's writes" only. This matches our read-only mission: we are not a writer, so the only SF state of interest is the server-side cache for other writers, which the server can report to us as a passive observer.

Q5. Does any SF probe require Admin?

CSFConnection.GetSFPipeName returns a kernel object name. Reading from it requires the pipe ACL to permit the caller. If the SF pipe is ACL'd to LocalSystem only, the SDK cannot read it without impersonation — and the SDK runs as the calling process. This is a hard limit, not a bug.

3. Discovery Workstreams

Workstreams A, B, and C can run in parallel; D follows A and C, and E is only a fallback if D fails. None require a live server beyond what the existing test rig already has.

Workstream A — Static IL inspection (parallel-safe, read-only)

Owner action items, in order:

  1. Dump full IL of token 0x06006187 (HistorianAccess.GetStoreForwardStatus(ConnectionIndex,out)):
    dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- `
      dnlib-method current\aahClientManaged.dll HistorianAccess.GetStoreForwardStatus --instructions
    
    Save under docs/reverse-engineering/historianaccess-getstoreforwardstatus-il-latest.txt. Confirm the calli target signature INSQL_MDAS_ERROR(IntPtr,uint,HISTORIAN_STORAGE_STATUS*) and that the method touches no WCF entry points at all.
  2. Dump IL of HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus (token 0x060060E4). This is the unmanaged→managed mapping; it tells us which fields of HISTORIAN_STORAGE_STATUS populate which fields of HistorianStoreForwardStatus. We will need the same mapping in reverse on the wire response.
  3. Inventory every method that writes into the local SF status struct:
    methods current\aahClientManaged.dll SetStoreForward
    methods current\aahClientManaged.dll SetMdasStoreForward
    
    The known set as of writing: CConfigStatusClient.SetMdasStoreForwardEvent (0x060029DC), aahClientCommon.CStatus.SetStoreForwardEvent (0x06002A04), CStatusConnectionDirect.SetStoreForwardEvent (0x06004DF8), CStatusConnectionWCF.SetStoreForwardEvent (0x06004E4E), CClientCommon.SetStoreForwardEventOnServer (0x06002EC0). The WCF variant is the one whose IL maps onto IStatusServiceContract2.SetStoreForwardEvent (token 0x06005F57) — read its IL and document the request/response shape.
  4. Dump IL of IStatusServiceContract2.SetStoreForwardEvent (0x06005F57) parameter types. The [OperationContract] declaration in the wrapper assembly already encodes the wire shape; this gives us the bytes the server pushes us.

Workstream B — Install inventory (parallel-safe)

  1. Inventory aveva-install-x64\ and aveva-install-x86\ for any binary whose name contains Store, Forward, SF, Cache, Spool. As of this checkout: none, only DLLs. Confirm.
  2. Inventory the deployed Historian server (out-of-band; not in this repo) for aahStoreForwardClient.exe, aahStoreForwardServer.exe, aahSFCache.exe, or any service registered with Description matching *Forward*. Capture the service name, account identity, and pipe ACLs (accesschk -wuvc).
  3. Walk the registry: HKLM\SOFTWARE\ArchestrA\Historian and any sub-key matching *StoreForward*, recording paths and pipe names. Sanitize before committing.

Workstream C — WCF probe (parallel-safe)

Use the existing wcf-probe and wcf-status subcommands of tools\AVEVA.Historian.ReverseEngineering:

  1. wcf-probe $env:HISTORIAN_HOST 32568 — confirm Storage/GetV is reachable. (It is the third service slot in HistorianWcfServiceNames.) Document the returned interface version.
  2. wcf-status $env:HISTORIAN_HOST 32568 <param-name> — sweep plausible SF parameter names (SF.Status, StoreForward.State, SFCacheBytes, etc.) through GetSystemParameter and record what the server accepts. Cheap, read-only, no session needed beyond the already-decoded auth chain.
  3. Probe GetHistorianInfo (GETHI, IStatusServiceContract2.cs:24) with the byte request shape used by the native wrapper. The request bytes are visible if we run instrument-wcf-readquery-style instrumentation against CConfigStatusClient.SetMdasStoreForwardEvent's upstream caller — see Workstream D.

Workstream D — Native capture (sequential after A and C)

Two captures are needed:

  1. Native call to mdas_GetStorageStatus. Run tools\AVEVA.Historian.NativeTraceHarness with a new scenario --scenario sfstatus (to be added) that invokes HistorianAccess.GetStoreForwardStatus() and dumps the HISTORIAN_STORAGE_STATUS C struct memory before the managed conversion runs. This pins the binary layout of the struct (offsets, field widths, endianness) without us guessing.
  2. WCF push of SF events. Configure the local Historian to enter SF mode (stop the runtime DB writer; let the writer's queue trigger SF) and capture the WCF traffic with the existing instrument-wcf-readquery sibling — i.e. add an instrument-wcf-setstoreforwardevent subcommand that IL-rewrites aahClientManaged.dll to log the bytes the server sends to IStatusServiceContract2.SetStoreForwardEvent. Save the rewrite under docs/reverse-engineering/dnlib-write-copy/, never current/.

Workstream D is the only step that needs an actively-storing SF sidecar. Plan: stop the Historian Runtime DB SQL service, write a single test point via the wrapper's writer harness, and capture the SF event push, then restart Runtime DB and capture the "end-of-SF / data drained" push.

Workstream E — On-disk cache (only if Workstream D fails)

If the WCF push protocol turns out to be impractical to reproduce (e.g. requires duplex contract, callback channel, or a server-side session-bind we cannot match from our managed client), fall back to inspecting the on-disk SF cache directly. Steps:

  1. Resolve CSFConnection.GetSFPath IL to find the cache directory convention (likely %ProgramData%\ArchestrA\Historian\Cache\ or similar — to be confirmed, never assume the path).
  2. Inventory file types: .sfdata, .sfindex, .cache — whatever the directory contains.
  3. Decode the file header. The presence/size of .sfdata files is sufficient to populate DataStored and Pending; we do not need to decode the value payload.

This fallback is only for DataStored / Pending. Storing and Error fundamentally require a live server-state read.
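The presence check in step 3 can be sketched as follows. This is a minimal illustration in Python rather than SDK code: the function name `summarize_sf_cache` is hypothetical, the `.sfdata` suffix is only the guess listed in step 2, and the cache directory must come from the GetSFPath IL, never from an assumed path.

```python
from pathlib import Path
from typing import Iterable


def summarize_sf_cache(cache_dir: Path, suffixes: Iterable[str] = (".sfdata",)) -> dict:
    """Report presence and total size of candidate cache files in one directory.

    ASSUMPTION: the suffix list is a placeholder; use whatever the real
    directory turns out to contain.
    """
    wanted = {s.lower() for s in suffixes}
    sizes = [
        entry.stat().st_size
        for entry in cache_dir.iterdir()
        if entry.is_file() and entry.suffix.lower() in wanted
    ]
    total = sum(sizes)
    return {
        "file_count": len(sizes),
        "total_bytes": total,
        # Non-empty cache files are the only signal this fallback can offer.
        "DataStored": total > 0,
        "Pending": total > 0,
    }
```

The sketch never opens the files, which keeps it within the "presence / size only" limit stated above. Storing and Error are deliberately absent from the result.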

4. Concrete Reverse-Engineering Steps (execution order)

Mirrors the read/event decoding workflow that succeeded for raw queries.

Step 1 — Find native methods that touch SF

Already done; baseline evidence is recorded in §2 Q1/Q3 above. Key tokens to reference:

  • 0x06006186, 0x06006187: public/private HistorianAccess.GetStoreForwardStatus
  • 0x060060E4: HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus
  • 0x060029DC: CConfigStatusClient.SetMdasStoreForwardEvent
  • 0x06002A04: aahClientCommon.CStatus.SetStoreForwardEvent
  • 0x06002DFF: aahClientCommon.CClientCommon.IsInStoreForward
  • 0x06002E18: aahClientCommon.CClientCommon.SetStoreForwardParams
  • 0x06002EC0: CClientCommon.SetStoreForwardEventOnServer
  • 0x06004BC6: aahClientCommon.CSFConnection.StartStoreforward
  • 0x06004B6F..0x06004B73: CSFConnection getters (path, pipe, enabled, connected)
  • 0x06004DF8, 0x06004E4E: direct vs WCF status connections
  • 0x06005F57: IStatusServiceContract2.SetStoreForwardEvent MD ref
  • 0x06006193: HistorianAccess.IsBothConnectionRequested (used by the public arity-0 GetStoreForwardStatus to decide whether to fan out to a redundant partner)

Step 2 — Decode HISTORIAN_STORAGE_STATUS layout

Run Workstream A.2 (decode token 0x060060E4) and Workstream D.1 (native struct memory dump). Together they pin the field layout.

The managed struct fields we already know we need to populate (from HistorianStoreForwardStatus.cs): ServerName, Pending, ErrorOccurred, Error, DataStored, Storing, ConnectionKind. The native struct will have ≥7 fields plus padding. Express the mapping as a comment table in the implementation.

Step 3 — Decide the wire model

Two possible implementations:

  1. Push-mode (native parity). SDK opens an authenticated WCF session that the server treats as a status subscriber, listens for IStatusServiceContract2.SetStoreForwardEvent callbacks, maintains a local cache, and GetStoreForwardStatusAsync returns from the cache. This requires WCF duplex (CallbackContract) which is not currently exercised anywhere in src/AVEVA.Historian.Client/Wcf/.
  2. Pull-mode (probe). SDK calls GetHistorianInfo (GETHI) or a discovered Storage-service equivalent and maps the one-shot response. No subscription state required.

Pull-mode is strongly preferred: it matches the SDK's existing WCF style, avoids duplex contracts, and the existing code path in HistorianWcfStatusClient.GetSystemParameter is the right shape. Only fall back to push-mode if Workstream C.3 proves the server has no pull endpoint that returns SF state.

Step 4 — Implement the managed contract method

Once Step 3 picks pull-mode, implement against the WCF contract (likely a new [OperationContract] on IStatusServiceContract2 or a method on IStorageServiceContract). Follow the existing parameter-naming discipline from the resolved ValidateClientCredential blocker: use [MessageParameter(Name = "...")] to match exact server element names — do not let WCF derive them from C# parameter names. See handoff.md "Active Blocker" entry for the 2026-05-04 fix.

Step 5 — Add golden-byte fixtures

Add a request and response fixture under fixtures/protocol/store-forward-status/:

  • request-get-storage-status.bin — bytes the SDK sends.
  • response-get-storage-status-running-normal.bin — server not in SF.
  • response-get-storage-status-active-sf.bin — server actively storing.
  • response-get-storage-status-error.bin — server's SF errored.

Capture sources: the same instrumented native wrapper runs that populate Workstream D. Sanitize hostnames, GUIDs, and timestamps before committing.

Step 6 — Replace the synthesized stub

Replace SynthesizeStoreForwardStatus (lines 107-117 of HistorianWcfStatusClient.cs) with a real implementation. Keep the synthesized fallback for the case where the storage service returns a "no SF configured" sentinel — that is not an error condition, it is the normal state for client-only deployments.

Add a unit test class WcfStoreForwardStatusProtocolTests next to the existing WcfDataQueryProtocolTests etc., with golden-byte parse tests using the fixtures from Step 5.

Update the operation status table in README.md:20 from "synthesized defaults (no SF sidecar to probe)" to "live-verified" once the integration test passes.

5. Risks and Gotchas

  1. SF may not be present on the test host. The dev Historian probably has SF disabled by default; turning it on means stopping Runtime DB SQL services, which is invasive. Plan to do capture work on a dedicated sacrificial Historian VM, not the shared dev box.
  2. SF sidecar may require Admin or LocalSystem to query. Any pipe-direct fallback (Workstream E) will fail under standard user accounts. Document the privilege requirement explicitly in the SDK XML doc comments on GetStoreForwardStatusAsync.
  3. State is volatile. Probes that take >100 ms can race against the server's own SF state machine. Capture both request and response in the same instrumented run; do not try to correlate two captures.
  4. Push-mode would force a duplex WCF contract. None of the existing decoded operations use duplex. Adding it widens the managed WCF surface significantly and risks .NET-WCF compatibility issues we have not yet hit. Pull-mode first.
  5. The wrapper's IsBothConnectionRequested (token 0x06006193) path indicates a "primary + partner" topology. Out of scope for this pass per §1, but if the server returns partner data in the same response we must skip-decode (not throw on) unknown trailing bytes.
  6. Open2-only sessions never receive SF events. handoff.md "Active Blocker" notes the wrapper's full chain (OpenConnection3 after the ValCl rounds) is the path that produces a session the server treats as a real client. SF probes must run from inside that chain — re-using HistorianWcfAuthChainHelper.OpenAuthenticatedConnection, the same call site already used by GetSystemParameter at HistorianWcfStatusClient.cs:42.
  7. HISTORIAN_STORAGE_STATUS field order is not contractual. The struct is C++ inside the closed source. If AVEVA reorders fields between Historian versions, our decoder breaks. Pin the decoder to the Historian server version observed at session open (already exposed via IRetrievalServiceContractN) and reject mismatched versions explicitly with ProtocolEvidenceMissingException. Do not silently best-effort parse.
  8. Sanitization. Pipe names, registry paths, and SF cache directory paths can leak hostnames and account names. Run the rg sanitizer (handoff.md "Next Pickup Steps") after every doc edit.
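Risks 5 and 7 together describe a decoder that is strict about the server version and lenient about trailing bytes. A minimal Python sketch of that policy follows. The four-field little-endian layout, the `"example-version"` key, and the `ProtocolEvidenceMissing` class are all invented for illustration; the real HISTORIAN_STORAGE_STATUS layout is still unknown at this point in the plan.

```python
import struct


class ProtocolEvidenceMissing(Exception):
    """Stand-in for the SDK's ProtocolEvidenceMissingException."""


# HYPOTHETICAL layout: four little-endian u32 fields, chosen only so the
# sketch runs. It is NOT the observed HISTORIAN_STORAGE_STATUS layout.
_LAYOUT_BY_VERSION = {"example-version": struct.Struct("<IIII")}


def decode_storage_status(server_version: str, payload: bytes) -> dict:
    layout = _LAYOUT_BY_VERSION.get(server_version)
    if layout is None:
        # Risk 7: refuse to best-effort parse an unverified version.
        raise ProtocolEvidenceMissing(
            f"no verified layout for server version {server_version!r}"
        )
    if len(payload) < layout.size:
        raise ProtocolEvidenceMissing(
            f"payload is {len(payload)} bytes, layout needs {layout.size}"
        )
    pending, storing, data_stored, error_code = layout.unpack_from(payload, 0)
    return {
        "Pending": pending != 0,
        "Storing": storing != 0,
        "DataStored": data_stored != 0,
        "ErrorOccurred": error_code != 0,
        # Risk 5: partner data may follow; skip it instead of throwing.
        "ignored_trailing_bytes": len(payload) - layout.size,
    }
```

Keying the layout table by version means a new Historian release fails loudly until someone records evidence for it, which is the behaviour risk 7 asks for.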

6. Success Criteria

A real implementation is "done" when all of the following hold:

  1. client.GetStoreForwardStatusAsync() returns Pending = true and Storing = true while the local Historian's SF cache is actively buffering writes (verifiable by stopping the Runtime DB and writing a value).
  2. Returns Pending = false and Storing = false within 5 seconds after the Runtime DB recovers and SF drains.
  3. Returns ErrorOccurred = true and a non-null, actionable Error message when the SF cache itself fails (disk full, pipe closed, etc.).
  4. Returns the synthesized "no SF" shape (all-false) without throwing on a Historian where SF is not configured.
  5. Two new golden-byte unit tests pass (active-SF and idle-SF responses).
  6. ProtocolGuardrailTests no longer needs to exempt GetStoreForwardStatusAsync from any "must throw ProtocolEvidenceMissingException" rule — the method is now evidence-backed.
  7. Live integration test HistorianClientIntegrationTests.GetStoreForwardStatusAsync_ReturnsServerState (to be added) passes when HISTORIAN_HOST is set, skips cleanly otherwise.
  8. README.md:20 operation status table is updated from "synthesized defaults" to "live-verified".
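Criterion 2 is a timing bound, so an integration test would need some form of bounded polling. One possible shape is sketched below in Python. The helper name, the 0.25 s interval, and the dict-shaped status are assumptions, and the sketch starts its clock when it is called, which may not match exactly when the server considers SF drained.

```python
import time
from typing import Callable


def wait_until_drained(
    get_status: Callable[[], dict],
    timeout_s: float = 5.0,
    interval_s: float = 0.25,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until Pending and Storing are both false, or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if not status["Pending"] and not status["Storing"]:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `clock` and `sleep` lets a unit test drive the loop with a fake clock instead of waiting in real time.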

7. Open Questions for the Implementer

Resolve these before writing production code:

  1. Does the server expose a pull endpoint that returns the full HISTORIAN_STORAGE_STATUS snapshot, or only push events? (Workstream C.3 answers this.)
  2. What is the binary layout of HISTORIAN_STORAGE_STATUS? (Workstream A.2 + D.1.)
  3. What is the [OperationContract] shape on IStatusServiceContract2.SetStoreForwardEvent? Specifically: parameter count, byte-buffer parameters, and exact MessageParameter names? (Workstream A.4.)
  4. Is the Storage service slot at net.pipe://<host>/Storage and net.tcp://<host>:32568/Storage reachable on a non-Historian-server install? Or does it 404 when only the client redistributable is present? (Workstream B + C.1.)
  5. Does the SF status snapshot include partner / redundant SF state inline, or is it returned from a separate call? (Workstream A.1, look for branches under IsBothConnectionRequested.)
  6. Does the SF status read require OpenConnection3 to have succeeded, or is Open2 enough? (Trial: try the discovered pull endpoint after Open2 only, before doing OpenConnection3. If it works, the implementation is much simpler.)
  7. What happens when SF is disabled by configuration vs enabled but idle? Both should map to Pending=false, Storing=false, but the underlying server response may be a sentinel error vs an all-zeros struct. The implementation must distinguish "no SF" (return defaults silently) from "SF errored" (return ErrorOccurred = true).
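Question 7 amounts to a small classification step between the raw server response and the output model. The Python sketch below shows one way to keep "no SF" and "SF errored" apart. The `ProbeOutcome` names are hypothetical, and how each outcome is recognised on the wire (sentinel error versus all-zeros struct) is exactly what remains to be discovered.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ProbeOutcome(Enum):
    # HYPOTHETICAL categories; the wire-level markers for each are unknown.
    NOT_CONFIGURED = "not_configured"
    IDLE = "idle"
    ACTIVE = "active"
    FAILED = "failed"


@dataclass(frozen=True)
class StoreForwardStatus:
    pending: bool = False
    storing: bool = False
    data_stored: bool = False
    error_occurred: bool = False
    error: Optional[str] = None


def to_status(outcome: ProbeOutcome, error_text: Optional[str] = None) -> StoreForwardStatus:
    if outcome in (ProbeOutcome.NOT_CONFIGURED, ProbeOutcome.IDLE):
        # Both map to the all-false shape; neither is an error condition.
        return StoreForwardStatus()
    if outcome is ProbeOutcome.ACTIVE:
        return StoreForwardStatus(pending=True, storing=True)
    return StoreForwardStatus(
        error_occurred=True,
        error=error_text or "store/forward probe failed",
    )
```

Keeping NOT_CONFIGURED and IDLE as separate outcomes, even though they produce the same status, leaves room to log the difference without changing the public model.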

8. Out of Scope

Explicitly not part of this plan:

  • SF write-back (the project mission is read-only; IStorageServiceContract.AddStreamValues etc. stay unimplemented).
  • Setting SF parameters (IStorageServiceContract.SetStoreForwardParameter).
  • Redundant-partner SF aggregation (HistorianStoreForwardStatus.AddPartnerStoreForwardStatus).
  • Reverse-engineering the on-disk SF cache file format beyond presence / file count (Workstream E is a fallback, not a primary deliverable).
  • Anything in the aahClientCommon.CSFConnection.StartStoreforward / SetStorageStopped / SetTagSynchronized write surface.

9. 2026-06-21 gRPC re-scope (current R4.3 plan)

This supersedes the recommended route in §2/§3/§4. The deliverable (§1) and success criteria (§6) are unchanged. What changed is the transport and the resolved architecture risk.

9.1 What the recovered gRPC contract already gives us

The 2023 R2 contract under src/AVEVA.Historian.Client/Grpc/Protos/ exposes SF state through first-class pull RPCs on StorageService (StorageService.proto) — no duplex/callback contract, no native HISTORIAN_STORAGE_STATUS C-struct decode:

  • GetSFParameter(uint32 Handle, string ParameterName) → (Status status, string ParamaterValue) — the direct analogue of the already-shipped GetSystemParameter/GetRuntimeParameter string-keyed pulls. This is the primary SF-state lever: a name→value read.
  • GetRemainingSnapshotsSize(uint32 Handle) → (Status status, uint64 SnapshotSize) — the pending-buffer magnitude in one call. Non-zero ⇒ data is queued (Pending/DataStored=true); zero ⇒ drained. The cleanest single signal for the idle-vs-active split.
  • GetInfo(string Request) → (Status status, bytes info) — generic server info blob; a fallback if a named SF key lives here instead of in GetSFParameter.
  • OpenStorageConnectionResponse.ServerStatus (field 5) and the GetSnapshots/StartQuerySnapshot family — secondary signals.

SetSFParameter exists too but is out of scope (read-only mission, §8).

The TransactionService.ForwardSnapshot{,Begin,End} RPCs are the SF cache replay/transfer path (write-side), not a status read — also out of scope here; they belong to the deferred bit-faithful SF cache work, not to GetStoreForwardStatusAsync.

9.2 Plumbing that already exists (reuse, don't rebuild)

  • HistorianGrpcHandshake.OpenSession — authenticated gRPC session (ValidateClientCredential NTLM loop + Open2) yielding ClientHandle (uint) + storage-session GUID. Live-verified against the 2023 R2 box.
  • HistorianGrpcStorageConnectionProbe — already constructs a StorageService.StorageServiceClient, primes GetInterfaceVersion, and calls OpenStorageConnection/CloseStorageConnection. The SF-status probe is a near-clone that swaps the OpenStorageConnection body for GetSFParameter/GetRemainingSnapshotsSize calls.
  • HistorianGrpcChannelFactory / HistorianGrpcConnection — channel, metadata, deadlines.

9.3 The one open risk that survives: which Handle?

GetSFParameter/GetRemainingSnapshotsSize both take uint32 Handle. Unknown: do they accept the session ClientHandle (from OpenSession, which is cheap and unblocked), or do they require the storage console Handle returned by OpenStorageConnection — which is the D2 wall (OpenStorageConnection routes to the \\.\pipe\aahStorageEngine\console session and is the same storage-engine pipe that blocks revision writes)? See project_roadmap_exhausted_2020wcf and HistorianGrpcStorageConnectionProbe header.

  • Best case: these read-only status RPCs accept the session ClientHandle (status reads shouldn't need a console writer session). Then R4.3-over-gRPC is unblocked end-to-end and is a small, shippable feature.
  • Worst case: they require the OpenStorageConnection Handle ⇒ R4.3 inherits the D2 storage-engine-pipe wall and stays blocked on the same root cause as R4.2. Either way the probe answers it in one run.

9.4 Discovery steps (execution order)

  1. Add grpc-sf-status-probe to tools/AVEVA.Historian.ReverseEngineering (mirror HistorianGrpcStorageConnectionProbe). Against the live 2023 R2 server it:
    • opens an authenticated session, gets ClientHandle;
    • calls GetRemainingSnapshotsSize(ClientHandle) and reports status.bSuccess + SnapshotSize + any error buffer;
    • sweeps GetSFParameter(ClientHandle, name) over a candidate name list (Status, Storing, Pending, DataStored, SF.Status, StoreForwardStatus, Forward, CacheSize, ErrorOccurred, plus any names surfaced by Workstream A's IL of ConvertUnmanagedSFStorageStatusToManagedStorageStatus);
    • records which names the server accepts and the returned values.
    • If every call fails with an auth/handle-shaped error, retry once with the OpenStorageConnection Handle to disambiguate §9.3.
  2. Idle baseline first: run against the server with SF not active. This establishes the "no SF / drained" response shape (expected: SnapshotSize=0, parameter reads succeed-with-defaults or return a "not configured" sentinel). This alone may be enough to ship an honest idle-state implementation that is strictly better than today's hardcoded all-false synthesis, because its false values would be measured rather than assumed.
  3. Active-SF capture: only if step 2 proves the read works and we need the active-state fixtures. Force SF on the sacrificial Historian VM (stop Runtime DB writer; let the queue spill to SF), re-run the probe, capture the non-zero/Storing=true response. This is the one invasive step and the gate on full success criteria §6.1 to §6.3.
  4. Map + implement — add GrpcGetStoreForwardStatus to the gRPC read orchestrator, map the probed fields onto HistorianStoreForwardStatus, route GetStoreForwardStatusAsync to it when Transport == RemoteGrpc (keep the synthesized fallback for non-gRPC transports and for the "no SF configured" sentinel). Add golden-byte fixtures (idle + active) and WcfStoreForwardStatusProtocolTests-style parse tests. Gate the live integration test on HISTORIAN_GRPC_HOST.
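The name sweep and handle fallback from step 1 can be sketched independently of gRPC. In the Python sketch below, `read` is a stand-in callable for the GetSFParameter call and is an assumption, as are the function names. The candidate list is copied from step 1. One simplification: the sketch retries whenever nothing was accepted, rather than inspecting whether the errors look auth- or handle-shaped.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# (handle, parameter name) -> (bSuccess, value). Stand-in for GetSFParameter.
ReadParameter = Callable[[int, str], Tuple[bool, Optional[str]]]

CANDIDATE_NAMES = (
    "Status", "Storing", "Pending", "DataStored", "SF.Status",
    "StoreForwardStatus", "Forward", "CacheSize", "ErrorOccurred",
)


def sweep(read: ReadParameter, handle: int, names: Sequence[str] = CANDIDATE_NAMES):
    """Try every name once; record accepted values and per-name failures."""
    accepted: Dict[str, Optional[str]] = {}
    rejected: List[Tuple[str, str]] = []
    for name in names:
        try:
            ok, value = read(handle, name)
        except Exception as exc:  # e.g. a transport-level RPC failure
            rejected.append((name, f"{type(exc).__name__}: {exc}"))
            continue
        if ok:
            accepted[name] = value
        else:
            rejected.append((name, "bSuccess=false"))
    return accepted, rejected


def probe(read: ReadParameter, client_handle: int, console_handle: Optional[int] = None):
    """Sweep with the session handle; retry once with the console handle."""
    accepted, rejected = sweep(read, client_handle)
    if accepted or console_handle is None:
        return "session", accepted, rejected
    accepted, rejected = sweep(read, console_handle)
    return "console", accepted, rejected
```

Returning which handle produced the result is what lets a single run answer the §9.3 question.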

9.5 Effort / feasibility summary

  • Risk collapsed: pull-vs-push (the old plan's worst risk) is settled — it's a pull. No duplex WCF/gRPC callback contract.
  • No native struct decode: GetSFParameter returns a string; we skip the HISTORIAN_STORAGE_STATUS C-layout RE entirely (Workstream A.2 / D.1 become "nice-to-have for field names", not blocking).
  • Reuses shipped plumbing: session open + StorageServiceClient + channel already exist and are live-verified.
  • Remaining unknowns are empirical, one probe-run each: (a) the accepted parameter-name vocabulary, (b) which Handle the status RPCs want (§9.3 — the only thing that could re-block it), (c) the active-SF response shape (needs the invasive force-SF step).
  • Net: Steps 1-2 are low-risk and could land a measured idle-state GetStoreForwardStatusAsync over gRPC quickly. Steps 3-4 (full success criteria) still need the sacrificial-VM force-SF capture and are gated on §9.3 not landing on the D2 wall.

9.6 Out of scope (unchanged from §8, restated for gRPC)

SetSFParameter, ForwardSnapshot* (SF replay/transfer), the on-disk cache file format, and redundant-partner SF aggregation all remain out of scope. R4.3 is read-only status, gRPC-first.

9.7 Idle-baseline run — RESULTS (2026-06-21)

Built HistorianGrpcStoreForwardStatusProbe + the grpc-sf-status-probe CLI command and ran it against the live 2023 R2 server with the historian in its idle / not-actively-storing state (storage interface v4, authenticated session opened OK). Tested both read-only (0x402) and write-enabled (0x401) sessions. Findings, with the §9.3 handle question resolved:

  1. Direct StorageService SF pull RPCs are D2-gated — confirmed the §9.3 worst-case branch.

    • GetRemainingSnapshotsSize(session.ClientHandle) → bSuccess=false, error buffer 04 84 00 00 00 (= status 0x84 / 132 OperationNotEnabled). Identical under 0x401 and 0x402, so it is NOT the read/write connection-mode gate; the History-session ClientHandle is simply not a valid handle for this op's handle-space.
    • GetSFParameter(session.ClientHandle, <name>) → server-side RpcException(Unknown, "Exception was thrown by handler") for all 16 candidate names, both session modes.
    • These two ops need the OpenStorageConnection console handle, and OpenStorageConnection itself fails with the storage-engine console error (84 55 00 00 00 01 02 00 09 15 00 + ASCII "OpenStorageConnection"). This is the D2 storage-engine-pipe wall, the same root cause that blocks R4.2 revision writes. We cannot obtain the console handle, so these two SF RPCs are unreachable from a pure managed client. See project_roadmap_exhausted_2020wcf.
  2. One reachable session-handle lever found: StatusService.GetHistorianConsoleStatus(strHandle) SUCCEEDS with the session string handle (uppercase Open2 GUID) — no console handle needed — and returns uiConsoleStatus = 3 at idle. This is the only SF-adjacent signal reachable from the managed client. Its enum semantics are unknown (3 = presumably "running/normal"); whether it shifts when SF is actively storing is the open question.

  3. StatusService.GetHistorianInfo(strHandle, btRequest) → bSuccess=false for every btRequest candidate (empty / u32(0) / ascii+utf16 "StoreForward"); its request framing is not yet known. Lower-yield than GetHistorianConsoleStatus; revisit only if needed.
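The 5-byte error buffer in finding 1 reads as a single leading byte followed by a little-endian u32. The Python sketch below reproduces that arithmetic. The framing is an assumption inferred from this one observed buffer, the meaning of the leading byte is not established, and the function name is hypothetical.

```python
import struct


def decode_status_code(error_buffer: bytes) -> int:
    """Read the u32 that follows the single leading byte.

    ASSUMED framing: byte 0 is a tag or prefix of unknown meaning,
    bytes 1..4 are the status code as a little-endian u32.
    """
    if len(error_buffer) < 5:
        raise ValueError(f"expected at least 5 bytes, got {len(error_buffer)}")
    (code,) = struct.unpack_from("<I", error_buffer, 1)
    return code


# The buffer observed for GetRemainingSnapshotsSize at idle.
observed = bytes.fromhex("04 84 00 00 00")
```

`decode_status_code(observed)` returns 132 (0x84), matching the OperationNotEnabled reading in finding 1. The longer OpenStorageConnection buffer starts with a different byte and is not covered by this sketch. Its final two bytes, 15 00, equal 21 as a little-endian u16, which is the length of the ASCII string "OpenStorageConnection"; that is consistent with a length prefix but remains an unconfirmed observation.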

Net idle-baseline conclusion. R4.3's clean direct route (GetSFParameter / GetRemainingSnapshotsSize) is blocked behind the D2 storage-engine console pipe, exactly like R4.2 — a pure managed client cannot open the console session those ops require. The only reachable SF-adjacent signal is GetHistorianConsoleStatus → a status uint. Two paths forward:

  • (a) Ship a measured idle-state only. — SHIPPED + LIVE-VERIFIED 2026-06-21. HistorianGrpcStatusClient.GetStoreForwardStatusAsync opens a session, calls GetHistorianConsoleStatus, and returns HistorianStoreForwardStatus all-false but measured: it actually contacts the server and reports ErrorOccurred=true (with the underlying error) when the server is unreachable / the console-status call fails — strictly better than the blind hardcoded synthesis, which never contacts the server. Routed via Historian2020ProtocolDialect.GetStoreForwardStatusAsync when Transport == RemoteGrpc (non-gRPC keeps the synthesized fallback). Gated live test HistorianGrpcIntegrationTests.GetStoreForwardStatusAsync_OverGrpc_ReturnsMeasuredIdleState passes against the real 2023 R2 server. Storing/Pending/DataStored magnitude is intentionally NOT surfaced — it lives behind the D2 wall (see path (b)).
  • (b) Full success criteria (§6) stay blocked on the D2 console-pipe wall. Decoding the active-SF uiConsoleStatus value and any GetSystemParameter SF keys still needs the invasive force-SF capture on a sacrificial Historian — and even then Storing/DataStored magnitude is only available via the D2-gated GetRemainingSnapshotsSize.
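The "all-false but measured" behaviour of path (a) can be illustrated with a short Python sketch. It is not the shipped HistorianGrpcStatusClient code: `get_console_status` is a stand-in callable for the GetHistorianConsoleStatus call, and the dict keys simply mirror the output model's field names.

```python
from typing import Callable, Tuple


def measured_idle_status(get_console_status: Callable[[], Tuple[bool, int]]) -> dict:
    """Contact the server once and report all-false unless the call fails."""
    status = {
        "Pending": False,
        "Storing": False,
        "DataStored": False,
        "ErrorOccurred": False,
        "Error": None,
    }
    try:
        ok, _console_status = get_console_status()
    except Exception as exc:  # server unreachable, RPC failure, etc.
        status.update(ErrorOccurred=True, Error=f"{type(exc).__name__}: {exc}")
        return status
    if not ok:
        status.update(ErrorOccurred=True, Error="console-status call reported failure")
    # The console status value is deliberately not interpreted here:
    # its enum semantics are unknown, so it cannot set Storing or Pending.
    return status
```

The difference from the synthesized default is only in the failure branches: a hardcoded all-false result can never report ErrorOccurred, while this shape does whenever the server cannot be reached.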

Probe code: src/AVEVA.Historian.Client/Grpc/HistorianGrpcStoreForwardStatusProbe.cs, CLI grpc-sf-status-probe <host> [port] [--tls] [--dnsid <n>] [--write-session]. Writes nothing; releases any console session immediately.