histsdk/docs/plans/store-forward-cache-reverse-engineering.md
Joseph Doherty c2d8fb9bc8 R4.3: gRPC store-forward status probe + re-scope
Add HistorianGrpcStoreForwardStatusProbe and the `grpc-sf-status-probe` CLI
command. The idle-baseline run against the live 2023 R2 server resolves the
plan's §9.3 handle question: the direct StorageService SF pull RPCs
(GetSFParameter / GetRemainingSnapshotsSize) require the OpenStorageConnection
console handle and are D2-gated (err 132, identical under read-only and
write-enabled sessions), while StatusService.GetHistorianConsoleStatus IS
reachable on the session string handle (=3 at idle).

Records the gRPC re-scope and the idle-baseline findings in
docs/plans/store-forward-cache-reverse-engineering.md §9. The probe writes
nothing and releases any console session immediately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B6mcaT2PjRFKcogzp9UkfC
2026-06-21 23:14:05 -04:00

# Store/Forward Cache Reverse-Engineering Plan
Last updated: 2026-06-21
> **2026-06-21 R4.3 re-scope — read this first.** The original plan below
> (2026-05-04) was written against the 2020 Net.TCP/WCF transport, before the
> 2023 R2 gRPC transport existed. Its single biggest open risk — *"is SF state
> readable via a one-shot pull, or only via a duplex push contract we'd have to
> add?"* (Q1/Q2 + §3 Step 3 + Risk 4) — is now **answered: pull, no duplex**.
> The recovered gRPC `StorageService` contract exposes SF state as plain
> request/response RPCs. The current R4.3 scope and recommended path are in
> §9 ("2026-06-21 gRPC re-scope"); the 2020-WCF body below is retained as
> background, not the recommended route.
Original last-updated: 2026-05-04
This document plans the reverse-engineering effort needed to replace the
synthesized `GetStoreForwardStatusAsync` in
`src/AVEVA.Historian.Client/Wcf/HistorianWcfStatusClient.cs` (lines 101-117)
with a real, evidence-backed implementation. It is a *plan*, not the work
itself. No code changes; no captures collected.
Read this together with:
- `docs/reverse-engineering/handoff.md` — read/event protocol decoding state
- `src/AVEVA.Historian.Client/Wcf/Contracts/IStorageServiceContract.cs` — the
WCF contract that already declares the SF parameter ops
- `src/AVEVA.Historian.Client/Models/HistorianStoreForwardStatus.cs` — the
output model the implementation must populate
## 1. Goal
"SF support works" means, end-to-end:
1. **Primary deliverable.** `client.GetStoreForwardStatusAsync()` against a
live local Historian returns a `HistorianStoreForwardStatus` whose
`Pending`, `Storing`, `DataStored`, `ErrorOccurred`, `Error`, `ServerName`,
and `ConnectionKind` fields reflect actual server-reported state, not the
synthesized defaults at
`HistorianWcfStatusClient.cs:107-117`.
2. **Secondary deliverable.** The SDK can also answer the higher-level
"is SF currently buffering?" question accurately when the runtime DB is
*down*, not just when it is up. That is the case the real native client
handles correctly and where the synthesized default (`Storing = false`,
`ErrorOccurred = false`) is silently wrong today.
3. **Non-goals.** Writing into SF, replaying SF buffers, configuring SF
parameters, redundant-partner SF aggregation
(`HistorianStoreForwardStatus.AddPartnerStoreForwardStatus`,
token `0x060060B8`). Read-only matches the project mission in
`CLAUDE.md`.
The success bar is parity with the native wrapper's
`ArchestrA.HistorianAccess.GetStoreForwardStatus`
(MD token `0x06006186` in `current/aahClientManaged.dll`),
not a superset.
## 2. Architecture Investigation (open questions, in priority order)
Answer these before writing any production code. Each has a discovery action
in §3.
### Q1. Is SF status read from a local in-process struct, a separate WCF endpoint, or a Named Pipe IPC?
Current evidence: **all three are plausible, but the wrapper actually uses
"in-process struct kept current by server-pushed WCF events"**. Specifically:
- `ArchestrA.HistorianAccess.GetStoreForwardStatus`
(token `0x06006187`, the private 2-arg overload) does *not* call WCF.
It calls `mdas_GetStorageStatus` (a `calli` against the
`INSQL_MDAS_ERROR (IntPtr handle, uint, HISTORIAN_STORAGE_STATUS*)` C
signature in `current/aahClient.dll` exports) and then maps the result
through `HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus`
(token `0x060060E4`).
- Mutators like `CConfigStatusClient.SetMdasStoreForwardEvent`
(token `0x060029DC`) and `aahClientCommon.CStatus.SetStoreForwardEvent`
(token `0x06002A04`) are wired to the WCF callback
`IStatusServiceContract2.SetStoreForwardEvent`
(`StatusServiceContract.IStatusServiceContract2.SetStoreForwardEvent`,
token `0x06005F57`). The server *pushes* SF state changes; the client
caches them.
- Confirm: read the IL of token `0x06006187` and verify the only system call
is `mdas_GetStorageStatus`. The first 200 instructions confirm this:
`GetClient(ConnectionIndex)` → `calli` against the
`INSQL_MDAS_ERROR(IntPtr,uint,HISTORIAN_STORAGE_STATUS*)` signature →
`ConvertUnmanagedSFStorageStatusToManagedStorageStatus`.
Implication: **the native path is not a single WCF call the SDK can copy
as a synchronous probe**. The SDK must either subscribe to the same
status-event stream the native wrapper subscribes to, or find a status
query that returns the cached snapshot from the server (see Q2).
### Q2. Is there a single-shot WCF query that returns the same snapshot?
Likely yes. Hypothesis: `IStatusServiceContract2.GetHistorianInfo`
(`GETHI`, see `IStatusServiceContract2.cs:24-30`) returns a multi-key status
blob whose schema includes SF state. Alternative: a status-only key passed to
`GetSystemParameter` (already plumbed via `HistorianWcfStatusClient.GetSystemParameterAsync`).
Both are testable without writing protocol code by sending probe payloads
and observing the response shape.
### Q3. Does SF have its own sidecar process / pipe / WCF endpoint we are missing?
Strong evidence the answer is yes when SF is *enabled*:
- `aahClientCommon.CSFConnection.GetSFPipeName` (token `0x06004B72`),
`GetSFPath` (`0x06004B71`), `IsConnected` (`0x06004B73`), `IsEnabled`
(`0x06004B6F`) — there is a separately-named SF Named Pipe distinct
from the main MDAS pipe.
- `aahClientCommon.CSFConnection.StartStoreforward` (token `0x06004BC6`).
- `IStorageServiceContract` already declares `GetStoreForwardParameter`
/ `SetStoreForwardParameter` (`GetSFP`/`SetSFP`,
see `IStorageServiceContract.cs:81-85`) and `Storage` is a separate
WCF service slot in `HistorianWcfServiceNames.cs:15`.
- `CWcfConfig.ConfigurePipeProxy<IStorageServiceContract>` (token
`0x06004B1C`) and `CWcfConfig.ConfigureTcpProxy<IStorageServiceContract>`
(token `0x06004B1B`) confirm the storage proxy supports both transports —
same dual-transport pattern the History/Retrieval proxies use.
- `CStorageEngineConsoleClient.GetPipeNameStr` (token `0x06000E2D`) /
`GetFullPipeNameStr` (token `0x06000E2E`) wraps the storage-engine
console pipe via `STransactPipeClient2` (a *non-WCF* binary pipe
protocol).
Open: **is the SF sidecar even running on the dev host this SDK is being
tested against?** `handoff.md` does not record an SF process being
observed. `aveva-install-x64/` and `aveva-install-x86/` ship only DLLs
(no `aahStoreForwardClient.exe` / `aahSFClient.exe` / similar). The SF
sidecar is part of the Historian *server* install, not the client
redistributable. So:
- On the developer machine, SF is reachable only because the local
Historian server is installed.
- A pure-client install (the deployment target this SDK ships into) may
*never* have SF.
This shapes the success criteria: when SF is not configured, a correct
implementation returns `Pending = false`, `ErrorOccurred = false`,
`DataStored = false`, `Storing = false` — i.e. the same shape the
synthesized defaults produce today. The interesting case is *when SF is
configured and active*.
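As a minimal sketch of that "no SF configured" shape, the model below mirrors the field names listed in §1. It is a Python stand-in for illustration only: the class, the helper names, and the field types (in particular `connection_kind` as a string) are assumptions, not the SDK's actual `HistorianStoreForwardStatus` definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoreForwardStatus:
    """Illustrative stand-in for HistorianStoreForwardStatus (field names from section 1)."""
    server_name: str = ""
    connection_kind: str = ""  # assumption: the real type is not shown in this plan
    pending: bool = False
    storing: bool = False
    data_stored: bool = False
    error_occurred: bool = False
    error: Optional[str] = None


def no_store_forward(server_name: str) -> StoreForwardStatus:
    """Shape to return when SF is not configured: every flag stays False."""
    return StoreForwardStatus(server_name=server_name)


def is_interesting(status: StoreForwardStatus) -> bool:
    """True only when SF is configured and reporting activity or an error."""
    return (status.pending or status.storing
            or status.data_stored or status.error_occurred)
```

Keeping the all-false value as the dataclass default means the "not configured" path and today's synthesized defaults produce the same object, so callers cannot tell them apart by shape alone.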
### Q4. Is SF state authoritative on the Historian server or on a per-client basis?
The native wrapper reads it from `HistorianClient*` (the per-connection C++
object). This means it is *connection-scoped* server-pushed state. We
do not need to enumerate cluster-wide SF state — the server reports
"my SF buffer for this client's writes" only. This matches our read-only
mission: we are not a writer, so the only SF state of interest is the
server-side cache for *other* writers, which the server can report to
us as a passive observer.
### Q5. Does any SF probe require Admin?
`CSFConnection.GetSFPipeName` returns a kernel object name. Reading
from it requires the pipe ACL to permit the caller. If the SF pipe is
ACL'd to `LocalSystem` only, the SDK cannot read it without
impersonation — and the SDK runs as the calling process. This is a
hard limit, not a bug.
## 3. Discovery Workstreams
Workstreams A, B, and C can run in parallel; D follows A and C, and E is a
fallback. Only Workstream D needs more from the live server than the
existing test rig already provides.
### Workstream A — Static IL inspection (parallel-safe, read-only)
Owner action items, in order:
1. Dump full IL of token `0x06006187`
(`HistorianAccess.GetStoreForwardStatus(ConnectionIndex,out)`):
```powershell
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- `
dnlib-method current\aahClientManaged.dll HistorianAccess.GetStoreForwardStatus --instructions
```
Save under `docs/reverse-engineering/historianaccess-getstoreforwardstatus-il-latest.txt`.
Confirm the `calli` target signature
`INSQL_MDAS_ERROR(IntPtr,uint,HISTORIAN_STORAGE_STATUS*)` and that
the method touches no WCF entry points.
2. Dump IL of `HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus`
(token `0x060060E4`). This is the unmanaged→managed mapping; it
tells us which fields of `HISTORIAN_STORAGE_STATUS` populate which
fields of `HistorianStoreForwardStatus`. We will need the same
mapping in reverse on the wire response.
3. Inventory every method that *writes* into the local SF status
struct:
```
methods current\aahClientManaged.dll SetStoreForward
methods current\aahClientManaged.dll SetMdasStoreForward
```
The known set as of writing:
`CConfigStatusClient.SetMdasStoreForwardEvent` (`0x060029DC`),
`aahClientCommon.CStatus.SetStoreForwardEvent` (`0x06002A04`),
`CStatusConnectionDirect.SetStoreForwardEvent` (`0x06004DF8`),
`CStatusConnectionWCF.SetStoreForwardEvent` (`0x06004E4E`),
`CClientCommon.SetStoreForwardEventOnServer` (`0x06002EC0`).
The `WCF` variant is the one whose IL maps onto
`IStatusServiceContract2.SetStoreForwardEvent`
(token `0x06005F57`) — read its IL and document the request/response
shape.
4. Dump IL of `IStatusServiceContract2.SetStoreForwardEvent`
(`0x06005F57`) parameter types. The `[OperationContract]`
declaration in the wrapper assembly already encodes the wire shape;
this gives us the bytes the server pushes us.
### Workstream B — Install inventory (parallel-safe)
1. Inventory `aveva-install-x64\` and `aveva-install-x86\` for any
binary whose name contains `Store`, `Forward`, `SF`, `Cache`,
`Spool`. As of this checkout: **none**, only DLLs. Confirm.
2. Inventory the deployed Historian server (out-of-band; not in this
repo) for `aahStoreForwardClient.exe`,
`aahStoreForwardServer.exe`, `aahSFCache.exe`, or any service
registered with `Description` matching `*Forward*`. Capture the
service name, account identity, and pipe ACLs (`accesschk -wuvc`).
3. Walk the registry: `HKLM\SOFTWARE\ArchestrA\Historian` and any
sub-key matching `*StoreForward*`, recording paths and pipe names.
Sanitize before committing.
### Workstream C — WCF probe (parallel-safe)
Use the existing `wcf-probe` and `wcf-status` subcommands of
`tools\AVEVA.Historian.ReverseEngineering`:
1. `wcf-probe $env:HISTORIAN_HOST 32568` — confirm `Storage/GetV` is
reachable. (It is the third service slot in
`HistorianWcfServiceNames`.) Document the returned interface
version.
2. `wcf-status $env:HISTORIAN_HOST 32568 <param-name>` — sweep
plausible SF parameter names (`SF.Status`, `StoreForward.State`,
`SFCacheBytes`, etc.) through `GetSystemParameter` and record what
the server accepts. Cheap, read-only, no session needed beyond the
already-decoded auth chain.
3. Probe `GetHistorianInfo` (`GETHI`,
`IStatusServiceContract2.cs:24`) with the byte request shape used
by the native wrapper. The request bytes are visible if we run
`instrument-wcf-readquery`-style instrumentation against
`CConfigStatusClient.SetMdasStoreForwardEvent`'s upstream caller —
see Workstream D.
### Workstream D — Native capture (sequential after A and C)
Two captures are needed:
1. **Native call to `mdas_GetStorageStatus`.** Run
`tools\AVEVA.Historian.NativeTraceHarness` with a new scenario
`--scenario sfstatus` (to be added) that invokes
`HistorianAccess.GetStoreForwardStatus()` and dumps the
`HISTORIAN_STORAGE_STATUS` C struct memory before the managed
conversion runs. This pins the binary layout of the struct
(offsets, field widths, endianness) without us guessing.
2. **WCF push of SF events.** Configure the local Historian to enter
SF mode (stop the runtime DB writer; let the writer's queue
trigger SF) and capture the WCF traffic with the existing
`instrument-wcf-readquery` sibling — i.e. add an
`instrument-wcf-setstoreforwardevent` subcommand that
IL-rewrites `aahClientManaged.dll` to log the bytes the server
sends to `IStatusServiceContract2.SetStoreForwardEvent`. Save
the rewrite under `docs/reverse-engineering/dnlib-write-copy/`,
never `current/`.
Workstream D is the only step that needs an actively-storing SF
sidecar. Plan: stop the Historian Runtime DB SQL service, write a
single test point via the wrapper's writer harness, and capture the
SF event push, then restart Runtime DB and capture the
"end-of-SF / data drained" push.
### Workstream E — On-disk cache (only if Workstream D fails)
If the WCF push protocol turns out to be impractical to reproduce
(e.g. requires duplex contract, callback channel, or a server-side
session-bind we cannot match from our managed client), fall back to
inspecting the on-disk SF cache directly. Steps:
1. Resolve `CSFConnection.GetSFPath` IL to find the cache directory
convention (likely `%ProgramData%\ArchestrA\Historian\Cache\` or
similar — to be confirmed, **never assume the path**).
2. Inventory file types: `.sfdata`, `.sfindex`, `.cache` — whatever
the directory contains.
3. Decode the file header. The presence/size of `.sfdata` files is
sufficient to populate `DataStored` and `Pending`; we do not
need to decode the value payload.
This fallback is only for `DataStored` / `Pending`. `Storing` and
`Error` fundamentally require a live server-state read.
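The presence/size check in step 3 could be sketched as below. This is a hedged illustration, not a decoder: the `*.sfdata` pattern is only the candidate extension named above, the cache directory is passed in by the caller because its location is unconfirmed, and `scan_sf_cache` is a hypothetical helper name.

```python
from pathlib import Path


def scan_sf_cache(cache_dir: Path, pattern: str = "*.sfdata") -> dict:
    """Derive DataStored/Pending from file presence and size only.

    Assumption: cache files match `pattern`; the real extension and
    directory must come from the CSFConnection.GetSFPath evidence.
    """
    files = [p for p in cache_dir.glob(pattern) if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    has_data = total_bytes > 0
    return {
        "file_count": len(files),
        "total_bytes": total_bytes,
        "data_stored": has_data,
        "pending": has_data,
    }
```

The function never opens a file, which keeps it inside the "presence / file count" limit set for this fallback.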
## 4. Concrete Reverse-Engineering Steps (execution order)
Mirrors the read/event decoding workflow that succeeded for raw
queries.
### Step 1 — Find native methods that touch SF
Already done; baseline evidence is recorded in §2 Q1/Q3 above. Key
tokens to reference:
- `0x06006186`, `0x06006187`: public/private
`HistorianAccess.GetStoreForwardStatus`
- `0x060060E4`:
`HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus`
- `0x060029DC`: `CConfigStatusClient.SetMdasStoreForwardEvent`
- `0x06002A04`: `aahClientCommon.CStatus.SetStoreForwardEvent`
- `0x06002DFF`: `aahClientCommon.CClientCommon.IsInStoreForward`
- `0x06002E18`: `aahClientCommon.CClientCommon.SetStoreForwardParams`
- `0x06002EC0`: `CClientCommon.SetStoreForwardEventOnServer`
- `0x06004BC6`: `aahClientCommon.CSFConnection.StartStoreforward`
- `0x06004B6F`..`0x06004B73`: CSFConnection getters (path, pipe,
enabled, connected)
- `0x06004DF8`, `0x06004E4E`: direct vs WCF status connections
- `0x06005F57`: `IStatusServiceContract2.SetStoreForwardEvent` MD ref
- `0x06006193`: `HistorianAccess.IsBothConnectionRequested` (used by
the public arity-0 GetStoreForwardStatus to decide whether to fan
out to a redundant partner)
### Step 2 — Decode `HISTORIAN_STORAGE_STATUS` layout
Run Workstream A.2 (decode token `0x060060E4`) and Workstream D.1
(native struct memory dump). Together they pin the field layout.
The managed struct fields we already know we need to populate
(from `HistorianStoreForwardStatus.cs`):
`ServerName`, `Pending`, `ErrorOccurred`, `Error`, `DataStored`,
`Storing`, `ConnectionKind`. The native struct presumably has at least
seven fields plus padding; confirm the count from the memory dump.
Express the mapping as a comment table in the implementation.
### Step 3 — Decide the wire model
Two possible implementations:
1. **Push-mode (native parity).** SDK opens an authenticated WCF
session that the server treats as a status subscriber, listens
for `IStatusServiceContract2.SetStoreForwardEvent` callbacks,
maintains a local cache, and `GetStoreForwardStatusAsync`
returns from the cache. This requires WCF duplex
(`CallbackContract`) which is not currently exercised
anywhere in `src/AVEVA.Historian.Client/Wcf/`.
2. **Pull-mode (probe).** SDK calls `GetHistorianInfo` (`GETHI`)
or a discovered `Storage`-service equivalent and maps the
one-shot response. No subscription state required.
Pull-mode is strongly preferred: it matches the SDK's existing
WCF style, avoids duplex contracts, and the existing code path
in `HistorianWcfStatusClient.GetSystemParameter` is the right
shape. Only fall back to push-mode if Workstream C.3 proves the
server has no pull endpoint that returns SF state.
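The difference between the two models can be sketched as follows. Neither function talks to a real server: `PushModeCache` and `pull_mode_status` are hypothetical names, and the snapshot is a plain dictionary rather than the SDK's status type.

```python
import threading
from typing import Callable, Optional


class PushModeCache:
    """Push-mode: the server calls back with SF events; reads hit a local cache."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest: Optional[dict] = None

    def on_store_forward_event(self, snapshot: dict) -> None:
        # Invoked from the callback channel, possibly on another thread.
        with self._lock:
            self._latest = dict(snapshot)

    def get_status(self) -> Optional[dict]:
        # Returns None until the first event arrives.
        with self._lock:
            return None if self._latest is None else dict(self._latest)


def pull_mode_status(query: Callable[[], dict]) -> dict:
    """Pull-mode: one request/response call, no subscription state to keep."""
    return query()
```

The push variant carries a lock, a lifetime, and a "no event yet" state, while the pull variant is a single call. That asymmetry is the practical reason to prefer pull-mode whenever the server offers it.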
### Step 4 — Implement the managed contract method
Once Step 3 picks pull-mode, implement against the WCF contract
(likely a new `[OperationContract]` on `IStatusServiceContract2`
or a method on `IStorageServiceContract`). Follow the existing
parameter-naming discipline from the resolved
`ValidateClientCredential` blocker:
**use `[MessageParameter(Name = "...")]` to match exact server
element names — do not let WCF derive them from C# parameter
names.** See `handoff.md` "Active Blocker" entry for the
2026-05-04 fix.
### Step 5 — Add golden-byte fixtures
Add a request and response fixture under
`fixtures/protocol/store-forward-status/`:
- `request-get-storage-status.bin` — bytes the SDK sends.
- `response-get-storage-status-running-normal.bin` — server
not in SF.
- `response-get-storage-status-active-sf.bin` — server actively
storing.
- `response-get-storage-status-error.bin` — server's SF errored.
Capture sources: the same instrumented native wrapper runs that
populate Workstream D. Sanitize hostnames, GUIDs, and timestamps
before committing.
### Step 6 — Replace the synthesized stub
Replace `SynthesizeStoreForwardStatus` (lines 107-117 of
`HistorianWcfStatusClient.cs`) with a real implementation. Keep
the synthesized fallback for the case where the storage service
returns a "no SF configured" sentinel — that is *not* an error
condition, it is the normal state for client-only deployments.
Add a unit test class `WcfStoreForwardStatusProtocolTests` next
to the existing `WcfDataQueryProtocolTests` etc., with golden-byte
parse tests using the fixtures from Step 5.
Update the operation status table in `README.md:20` from
"synthesized defaults (no SF sidecar to probe)" to
"live-verified" once the integration test passes.
## 5. Risks and Gotchas
1. **SF may not be present on the test host.** The dev Historian
probably has SF disabled by default; turning it on means
stopping Runtime DB SQL services, which is invasive. Plan to do
capture work on a dedicated sacrificial Historian VM, not the
shared dev box.
2. **SF sidecar may require Admin or LocalSystem to query.** Any
pipe-direct fallback (Workstream E) will fail under standard
user accounts. Document the privilege requirement explicitly
in the SDK XML doc comments on `GetStoreForwardStatusAsync`.
3. **State is volatile.** Probes that take >100 ms can race
against the server's own SF state machine. Capture *both*
request and response in the same instrumented run; do not
try to correlate two captures.
4. **Push-mode would force a duplex WCF contract.** None of the
existing decoded operations use duplex. Adding it widens the
managed WCF surface significantly and risks .NET-WCF
compatibility issues we have not yet hit. Pull-mode first.
5. **The wrapper's `IsBothConnectionRequested` (token `0x06006193`)
path indicates a "primary + partner" topology.** Out of scope
for this pass per §1, but if the server returns partner data
in the same response we must skip-decode (not throw on)
unknown trailing bytes.
6. **`Open2`-only sessions never receive SF events.** `handoff.md`
"Active Blocker" notes the wrapper's full chain
(`OpenConnection3` after the `ValCl` rounds) is the path that
produces a session the server treats as a real client. SF
probes must run from inside that chain — re-using
`HistorianWcfAuthChainHelper.OpenAuthenticatedConnection`,
the same call site already used by `GetSystemParameter` at
`HistorianWcfStatusClient.cs:42`.
7. **`HISTORIAN_STORAGE_STATUS` field order is not contractual.**
The struct is C++ inside the closed source. If AVEVA reorders
fields between Historian versions, our decoder breaks. Pin the
decoder to the Historian server version observed at session
open (already exposed via `IRetrievalServiceContractN`) and
reject mismatched versions explicitly with
`ProtocolEvidenceMissingException`. Do not silently best-effort
parse.
8. **Sanitization.** Pipe names, registry paths, and SF cache
directory paths can leak hostnames and account names. Run the
`rg` sanitizer (handoff.md "Next Pickup Steps") after every
doc edit.
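Risks 5 and 7 together suggest a decoder that is strict about versions and lenient about trailing bytes. The sketch below shows that combination under loud assumptions: the layout table, the version key, and the four `uint32` fields are entirely hypothetical placeholders, since the real `HISTORIAN_STORAGE_STATUS` layout is not yet known, and `ProtocolEvidenceMissingError` is a Python stand-in for the SDK's `ProtocolEvidenceMissingException`.

```python
import struct


class ProtocolEvidenceMissingError(Exception):
    """Python stand-in for the SDK's ProtocolEvidenceMissingException."""


# HYPOTHETICAL layouts keyed by server version. Real field order and widths
# are unknown until Workstream A.2 / D.1 produce evidence.
KNOWN_LAYOUTS = {
    "hypothetical-1.0": ("<4I", ("pending", "storing", "data_stored", "error_code")),
}


def decode_storage_status(server_version: str, payload: bytes) -> dict:
    layout = KNOWN_LAYOUTS.get(server_version)
    if layout is None:
        # Risk 7: refuse to best-effort parse an unverified version.
        raise ProtocolEvidenceMissingError(
            f"no verified layout for server version {server_version!r}")
    fmt, names = layout
    size = struct.calcsize(fmt)
    if len(payload) < size:
        raise ProtocolEvidenceMissingError(
            f"payload too short: {len(payload)} < {size}")
    decoded = dict(zip(names, struct.unpack_from(fmt, payload, 0)))
    # Risk 5: unknown trailing bytes (e.g. partner data) are counted, not fatal.
    decoded["trailing_bytes"] = len(payload) - size
    return decoded
```

Recording the trailing byte count instead of discarding it silently leaves a trace in logs if a later server version starts appending data.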
## 6. Success Criteria
A real implementation is "done" when all of the following hold:
1. `client.GetStoreForwardStatusAsync()` returns
`Pending = true` and `Storing = true` while the local
Historian's SF cache is actively buffering writes (verifiable
by stopping the Runtime DB and writing a value).
2. Returns `Pending = false` and `Storing = false` within
≤ 5 seconds after the Runtime DB recovers and SF drains.
3. Returns `ErrorOccurred = true` and a non-null, actionable
`Error` message when the SF cache itself fails (disk full,
pipe closed, etc.).
4. Returns the synthesized "no SF" shape (all-false) without
throwing on a Historian where SF is not configured.
5. Two new golden-byte unit tests pass (active-SF and idle-SF
responses).
6. `ProtocolGuardrailTests` no longer needs to exempt
`GetStoreForwardStatusAsync` from any "must throw
`ProtocolEvidenceMissingException`" rule — the method is now
evidence-backed.
7. Live integration test
`HistorianClientIntegrationTests.GetStoreForwardStatusAsync_ReturnsServerState`
(to be added) passes when `HISTORIAN_HOST` is set, skips
cleanly otherwise.
8. `README.md:20` operation status table is updated from
"synthesized defaults" to "live-verified".
## 7. Open Questions for the Implementer
Resolve these before writing production code:
1. Does the server expose a *pull* endpoint that returns the full
`HISTORIAN_STORAGE_STATUS` snapshot, or only push events?
(Workstream C.3 answers this.)
2. What is the binary layout of `HISTORIAN_STORAGE_STATUS`?
(Workstream A.2 + D.1.)
3. What is the `[OperationContract]` shape on
`IStatusServiceContract2.SetStoreForwardEvent`? Specifically:
parameter count, byte-buffer parameters, and exact
`MessageParameter` names? (Workstream A.4.)
4. Is the `Storage` service slot at
`net.pipe://<host>/Storage` and `net.tcp://<host>:32568/Storage`
reachable on a non-Historian-server install? Or does it 404
when only the client redistributable is present? (Workstream
B + C.1.)
5. Does the SF status snapshot include partner / redundant SF
state inline, or is it returned from a separate call?
(Workstream A.1, look for branches under
`IsBothConnectionRequested`.)
6. Does the SF status read require `OpenConnection3` to have
succeeded, or is `Open2` enough? (Trial: try the discovered
pull endpoint after `Open2` only, before doing
`OpenConnection3`. If it works, the implementation is much
simpler.)
7. What happens when SF is *disabled* by configuration vs
*enabled but idle*? Both should map to `Pending=false,
Storing=false`, but the underlying server response may be a
sentinel error vs an all-zeros struct. The implementation must
distinguish "no SF" (return defaults silently) from "SF errored"
(return `ErrorOccurred = true`).
## 8. Out of Scope
Explicitly not part of this plan:
- SF write-back (the project mission is read-only;
`IStorageServiceContract.AddStreamValues` etc. stay
unimplemented).
- Setting SF parameters
(`IStorageServiceContract.SetStoreForwardParameter`).
- Redundant-partner SF aggregation
(`HistorianStoreForwardStatus.AddPartnerStoreForwardStatus`).
- Reverse-engineering the on-disk SF cache file format beyond
presence / file count (Workstream E is a fallback, not a
primary deliverable).
- Anything in the
`aahClientCommon.CSFConnection.StartStoreforward` /
`SetStorageStopped` / `SetTagSynchronized` write surface.
## 9. 2026-06-21 gRPC re-scope (current R4.3 plan)
This supersedes the recommended *route* in §2/§3/§4. The deliverable
(§1) and success criteria (§6) are unchanged. What changed is the
transport and the resolved architecture risk.
### 9.1 What the recovered gRPC contract already gives us
The 2023 R2 contract under `src/AVEVA.Historian.Client/Grpc/Protos/`
exposes SF state through **first-class pull RPCs** on `StorageService`
(`StorageService.proto`) — no duplex/callback contract, no native
`HISTORIAN_STORAGE_STATUS` C-struct decode:
- `GetSFParameter(uint32 Handle, string ParameterName)
→ (Status status, string ParamaterValue)` — the direct analogue of the
already-shipped `GetSystemParameter`/`GetRuntimeParameter` string-keyed
pulls. This is the primary SF-state lever: a name→value read.
- `GetRemainingSnapshotsSize(uint32 Handle)
→ (Status status, uint64 SnapshotSize)` — the pending-buffer magnitude
in one call. Non-zero ⇒ data is queued (`Pending`/`DataStored=true`);
zero ⇒ drained. The cleanest single signal for the idle-vs-active split.
- `GetInfo(string Request) → (Status status, bytes info)` — generic
server info blob; a fallback if a named SF key lives here instead of in
`GetSFParameter`.
- `OpenStorageConnectionResponse.ServerStatus` (field 5) and the
`GetSnapshots`/`StartQuerySnapshot` family — secondary signals.
`SetSFParameter` exists too but is **out of scope** (read-only mission, §8).
The `TransactionService.ForwardSnapshot{,Begin,End}` RPCs are the SF
cache *replay/transfer* path (write-side), **not** a status read — also
out of scope here; they belong to the deferred bit-faithful SF cache work,
not to `GetStoreForwardStatusAsync`.
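The idle-vs-active split described for `GetRemainingSnapshotsSize` could be mapped as in this minimal sketch. The function name and the `measured` flag are illustrative assumptions; only the rule "non-zero means queued, zero means drained" comes from the text above.

```python
def map_snapshot_size(success: bool, snapshot_size: int) -> dict:
    """Map a GetRemainingSnapshotsSize-style result onto status flags.

    A failed call yields no measurement rather than a guessed False.
    """
    if not success:
        return {"pending": None, "data_stored": None, "measured": False}
    queued = snapshot_size > 0
    return {"pending": queued, "data_stored": queued, "measured": True}
```

Returning `None` on failure keeps "the server said zero" distinct from "the call never succeeded", which matters if the RPC turns out to be unreachable.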
### 9.2 Plumbing that already exists (reuse, don't rebuild)
- `HistorianGrpcHandshake.OpenSession` — authenticated gRPC session
(`ValidateClientCredential` NTLM loop + Open2) yielding `ClientHandle`
(uint) + storage-session GUID. Live-verified against the 2023 R2 box.
- `HistorianGrpcStorageConnectionProbe` — already constructs a
`StorageService.StorageServiceClient`, primes `GetInterfaceVersion`, and
calls `OpenStorageConnection`/`CloseStorageConnection`. The SF-status
probe is a near-clone that swaps the `OpenStorageConnection` body for
`GetSFParameter`/`GetRemainingSnapshotsSize` calls.
- `HistorianGrpcChannelFactory` / `HistorianGrpcConnection` — channel,
metadata, deadlines.
### 9.3 The one open risk that survives: which `Handle`?
`GetSFParameter`/`GetRemainingSnapshotsSize` both take `uint32 Handle`.
Unknown: do they accept the **session `ClientHandle`** (from
`OpenSession`, which is cheap and unblocked), or do they require the
**storage console `Handle`** returned by `OpenStorageConnection` — which
is the D2 wall (`OpenStorageConnection` routes to the
`\\.\pipe\aahStorageEngine\console` session and is the same storage-engine
pipe that blocks revision writes)? See
[[project_roadmap_exhausted_2020wcf]] and `HistorianGrpcStorageConnectionProbe`
header.
- **Best case:** these read-only status RPCs accept the session
`ClientHandle` (status reads shouldn't need a console writer session).
Then R4.3-over-gRPC is unblocked end-to-end and is a small, shippable
feature.
- **Worst case:** they require the `OpenStorageConnection` `Handle` ⇒
R4.3 inherits the D2 storage-engine-pipe wall and stays blocked on the
same root cause as R4.2. Either way the probe answers it in one run.
### 9.4 Discovery steps (execution order)
1. **Add `grpc-sf-status-probe` to `tools/AVEVA.Historian.ReverseEngineering`**
(mirror `HistorianGrpcStorageConnectionProbe`). Against the live 2023 R2
server it:
- opens an authenticated session, gets `ClientHandle`;
- calls `GetRemainingSnapshotsSize(ClientHandle)` and reports
`status.bSuccess` + `SnapshotSize` + any error buffer;
- sweeps `GetSFParameter(ClientHandle, name)` over a candidate
name list (`Status`, `Storing`, `Pending`, `DataStored`,
`SF.Status`, `StoreForwardStatus`, `Forward`, `CacheSize`,
`ErrorOccurred`, plus any names surfaced by Workstream A's IL of
`ConvertUnmanagedSFStorageStatusToManagedStorageStatus`);
- records which names the server accepts and the returned values.
- If every call fails with an auth/handle-shaped error, retry once
with the `OpenStorageConnection` `Handle` to disambiguate §9.3.
2. **Idle baseline first** — run against the server with SF *not* active.
Establishes the "no SF / drained" response shape (expected:
`SnapshotSize=0`, parameter reads succeed-with-defaults or
return a "not configured" sentinel). This alone may be enough to ship
an honest idle-state implementation that is strictly better than
today's hardcoded all-false synthesis (it would be *measured* false).
3. **Active-SF capture** — only if step 2 proves the read works and we
need the active-state fixtures. Force SF on the sacrificial Historian
VM (stop Runtime DB writer; let the queue spill to SF), re-run the
probe, capture the non-zero/`Storing=true` response. This is the one
invasive step and the gate on full success criteria §6.1-6.3.
4. **Map + implement** — add `GrpcGetStoreForwardStatus` to the gRPC
read orchestrator, map the probed fields onto
`HistorianStoreForwardStatus`, route `GetStoreForwardStatusAsync`
to it when `Transport == RemoteGrpc` (keep the synthesized fallback
for non-gRPC transports and for the "no SF configured" sentinel).
Add golden-byte fixtures (idle + active) and
`WcfStoreForwardStatusProtocolTests`-style parse tests. Gate the live
integration test on `HISTORIAN_GRPC_HOST`.
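The probe logic in step 1 (sweep candidate names with the session handle, then retry once with the console handle) can be sketched independently of gRPC. In this illustration the RPC is injected as a plain callable returning `(success, value)`, and `sweep`, `probe`, and `open_console_handle` are hypothetical names; the candidate list is the one given in step 1.

```python
from typing import Callable, Iterable, Optional

CANDIDATE_NAMES = [
    "Status", "Storing", "Pending", "DataStored", "SF.Status",
    "StoreForwardStatus", "Forward", "CacheSize", "ErrorOccurred",
]


def sweep(get_sf_parameter: Callable[[int, str], tuple],
          handle: int, names: Iterable[str]) -> dict:
    """Return {name: value} for every name the server accepts."""
    accepted = {}
    for name in names:
        try:
            ok, value = get_sf_parameter(handle, name)
        except Exception:
            continue  # handler-side failure: treat the name as rejected
        if ok:
            accepted[name] = value
    return accepted


def probe(get_sf_parameter: Callable[[int, str], tuple],
          client_handle: int,
          open_console_handle: Callable[[], Optional[int]],
          names: Iterable[str] = CANDIDATE_NAMES) -> dict:
    names = list(names)
    accepted = sweep(get_sf_parameter, client_handle, names)
    if accepted:
        return {"handle_kind": "session", "accepted": accepted}
    console = open_console_handle()  # None when the console session is blocked
    if console is None:
        return {"handle_kind": None, "accepted": {}}
    return {"handle_kind": "console",
            "accepted": sweep(get_sf_parameter, console, names)}
```

A result with `handle_kind` of `None` corresponds to the worst case in §9.3: the session handle is rejected and no console handle can be obtained.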
### 9.5 Effort / feasibility summary
- **Risk collapsed:** pull-vs-push (the old plan's worst risk) is settled
— it's a pull. No duplex WCF/gRPC callback contract.
- **No native struct decode:** `GetSFParameter` returns a *string*; we
skip the `HISTORIAN_STORAGE_STATUS` C-layout RE entirely (Workstream
A.2 / D.1 become "nice-to-have for field names", not blocking).
- **Reuses shipped plumbing:** session open + `StorageServiceClient` +
channel already exist and are live-verified.
- **Remaining unknowns are empirical, one probe-run each:** (a) the
accepted parameter-name vocabulary, (b) which `Handle` the status RPCs
want (§9.3 — the only thing that could re-block it), (c) the
active-SF response shape (needs the invasive force-SF step).
- **Net:** Steps 1-2 are low-risk and could land a *measured* idle-state
`GetStoreForwardStatusAsync` over gRPC quickly. Steps 3-4 (full
success criteria) still need the sacrificial-VM force-SF capture and
are gated on §9.3 not landing on the D2 wall.
### 9.6 Out of scope (unchanged from §8, restated for gRPC)
`SetSFParameter`, `ForwardSnapshot*` (SF replay/transfer), the on-disk
cache file format, and redundant-partner SF aggregation all remain out of
scope. R4.3 is read-only status, gRPC-first.
### 9.7 Idle-baseline run — RESULTS (2026-06-21)
Built `HistorianGrpcStoreForwardStatusProbe` + the `grpc-sf-status-probe`
CLI command and ran it against the **live 2023 R2 server** with the
historian in its **idle / not-actively-storing** state (storage interface
v4, authenticated session opened OK). Tested both read-only (`0x402`) and
write-enabled (`0x401`) sessions. Findings, with the §9.3 handle question
**resolved**:
1. **Direct `StorageService` SF pull RPCs are D2-gated — confirmed the
§9.3 worst-case branch.**
- `GetRemainingSnapshotsSize(session.ClientHandle)` →
`bSuccess=false`, error buffer `04 84 00 00 00` (= status `0x84` /
**132 `OperationNotEnabled`**). **Identical under `0x401` and
`0x402`** — so it is NOT the read/write connection-mode gate; the
History-session `ClientHandle` is simply not a valid handle for this
op's handle-space.
- `GetSFParameter(session.ClientHandle, <name>)` → server-side
`RpcException(Unknown, "Exception was thrown by handler")` for **all
16** candidate names, both session modes.
- These two ops need the **`OpenStorageConnection` console handle**,
and `OpenStorageConnection` itself fails with the storage-engine
console error (`84 55 00 00 00 01 02 00 09 15 00`
+ ASCII `"OpenStorageConnection"`) — the **D2 storage-engine-pipe
wall**, the same root cause that blocks R4.2 revision writes. We
cannot obtain the console handle, so these two SF RPCs are
unreachable from a pure managed client. See
[[project_roadmap_exhausted_2020wcf]].
2. **One reachable session-handle lever found:**
`StatusService.GetHistorianConsoleStatus(strHandle)` **SUCCEEDS** with
the session string handle (uppercase Open2 GUID) — no console handle
needed — and returns `uiConsoleStatus = 3` at idle. This is the only
SF-adjacent signal reachable from the managed client. **Its enum
semantics are unknown** (3 = presumably "running/normal"); whether it
shifts when SF is actively storing is the open question.
3. `StatusService.GetHistorianInfo(strHandle, btRequest)` → `bSuccess=false`
for every `btRequest` candidate (empty / `u32(0)` / ASCII and UTF-16
`"StoreForward"`); its request framing is not yet known. Lower-yield
than `GetHistorianConsoleStatus`; revisit only if needed.
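The arithmetic behind the `04 84 00 00 00` reading in finding 1 can be checked with a short sketch. It assumes the buffer is one leading byte (whose meaning is not established here) followed by a little-endian `uint32` status code; that framing is an inference from the five bytes shown, not a decoded specification, and it does not apply to the longer `OpenStorageConnection` error buffer.

```python
import struct


def decode_error_buffer(buf: bytes) -> int:
    """Extract the status code from a 5-byte error buffer.

    Assumption: byte 0 is a tag or length, bytes 1..4 are a
    little-endian uint32 status.
    """
    if len(buf) < 5:
        raise ValueError(f"expected at least 5 bytes, got {len(buf)}")
    (status,) = struct.unpack_from("<I", buf, 1)
    return status


status = decode_error_buffer(bytes.fromhex("04 84 00 00 00"))
print(hex(status), status)  # 0x84 132
```

Under that assumption the bytes `84 00 00 00` decode to `0x84`, which is 132 in decimal and matches the `OperationNotEnabled` code quoted above.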
**Net idle-baseline conclusion.** R4.3's clean direct route
(`GetSFParameter` / `GetRemainingSnapshotsSize`) is **blocked behind the
D2 storage-engine console pipe**, exactly like R4.2 — a pure managed
client cannot open the console session those ops require. The *only*
reachable SF-adjacent signal is `GetHistorianConsoleStatus` → a status
uint. Two paths forward:
- **(a) Ship a measured idle-state only. — SHIPPED + LIVE-VERIFIED 2026-06-21.**
`HistorianGrpcStatusClient.GetStoreForwardStatusAsync` opens a session,
calls `GetHistorianConsoleStatus`, and returns
`HistorianStoreForwardStatus` all-false but *measured*: it actually
contacts the server and reports `ErrorOccurred=true` (with the underlying
error) when the server is unreachable / the console-status call fails —
strictly better than the blind hardcoded synthesis, which never contacts
the server. Routed via `Historian2020ProtocolDialect.GetStoreForwardStatusAsync`
when `Transport == RemoteGrpc` (non-gRPC keeps the synthesized fallback).
Gated live test `HistorianGrpcIntegrationTests.GetStoreForwardStatusAsync_OverGrpc_ReturnsMeasuredIdleState`
passes against the real 2023 R2 server. `Storing`/`Pending`/`DataStored`
magnitude is intentionally NOT surfaced — it lives behind the D2 wall (see
path (b)).
- **(b) Full success criteria (§6) stay blocked** on the D2 console-pipe
wall. Decoding the active-SF `uiConsoleStatus` value and any
`GetSystemParameter` SF keys still needs the invasive force-SF capture
on a sacrificial Historian — and even then `Storing`/`DataStored`
magnitude is only available via the D2-gated `GetRemainingSnapshotsSize`.
Probe code: `src/AVEVA.Historian.Client/Grpc/HistorianGrpcStoreForwardStatusProbe.cs`,
CLI `grpc-sf-status-probe <host> [port] [--tls] [--dnsid <n>] [--write-session]`.
Writes nothing; releases any console session immediately.
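The behaviour of path (a) can be summarised in a small sketch: contact the server, keep every flag false on success, and surface the failure otherwise. This is an illustration of the described behaviour, not the shipped `HistorianGrpcStatusClient` code; `measured_idle_status` is a hypothetical name and the console-status call is injected as a callable.

```python
from typing import Callable


def measured_idle_status(get_console_status: Callable[[], int]) -> dict:
    """All-false status that is measured: the server is actually contacted."""
    status = {
        "pending": False,
        "storing": False,
        "data_stored": False,
        "error_occurred": False,
        "error": None,
    }
    try:
        # The returned value (3 at idle) is not interpreted: its enum
        # semantics are still unknown.
        get_console_status()
    except Exception as exc:
        status["error_occurred"] = True
        status["error"] = f"console-status call failed: {exc}"
    return status
```

The returned flags are identical to the synthesized defaults on success; the difference is that an unreachable server now produces `error_occurred = True` instead of a silent all-false answer.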