diff --git a/docs/plans/store-forward-cache-reverse-engineering.md b/docs/plans/store-forward-cache-reverse-engineering.md new file mode 100644 index 0000000..8172570 --- /dev/null +++ b/docs/plans/store-forward-cache-reverse-engineering.md @@ -0,0 +1,501 @@ +# Store/Forward Cache Reverse-Engineering Plan + +Last updated: 2026-05-04 + +This document plans the reverse-engineering effort needed to replace the +synthesized `GetStoreForwardStatusAsync` in +`src/AVEVA.Historian.Client/Wcf/HistorianWcfStatusClient.cs` (lines 101-117) +with a real, evidence-backed implementation. It is a *plan*, not the work +itself. No code changes; no captures collected. + +Read this together with: + +- `docs/reverse-engineering/handoff.md` — read/event protocol decoding state +- `src/AVEVA.Historian.Client/Wcf/Contracts/IStorageServiceContract.cs` — the + WCF contract that already declares the SF parameter ops +- `src/AVEVA.Historian.Client/Models/HistorianStoreForwardStatus.cs` — the + output model the implementation must populate + +## 1. Goal + +"SF support works" means, end-to-end: + +1. **Primary deliverable.** `client.GetStoreForwardStatusAsync()` against a + live local Historian returns a `HistorianStoreForwardStatus` whose + `Pending`, `Storing`, `DataStored`, `ErrorOccurred`, `Error`, `ServerName`, + and `ConnectionKind` fields reflect actual server-reported state, not the + synthesized defaults at + `HistorianWcfStatusClient.cs:107-117`. +2. **Secondary deliverable.** The SDK can also answer the higher-level + "is SF currently buffering?" question accurately when the runtime DB is + *down*, not just when it is up. That is the case the real native client + handles correctly and where the synthesized default (`Storing = false`, + `ErrorOccurred = false`) is silently wrong today. +3. 
**Non-goals.** Writing into SF, replaying SF buffers, configuring SF + parameters, redundant-partner SF aggregation + (`HistorianStoreForwardStatus.AddPartnerStoreForwardStatus`, + token `0x060060B8`). Read-only matches the project mission in + `CLAUDE.md`. + +The success bar is parity with the native wrapper's +`ArchestrA.HistorianAccess.GetStoreForwardStatus` +(MD token `0x06006186` in `current/aahClientManaged.dll`), +not a superset. + +## 2. Architecture Investigation (open questions, in priority order) + +Answer these before writing any production code. Each has a discovery action +in §3. + +### Q1. Is SF status read from a local in-process struct, a separate WCF endpoint, or a Named Pipe IPC? + +Current evidence: **all three are plausible, but the wrapper actually uses +"in-process struct kept current by server-pushed WCF events"**. Specifically: + +- `ArchestrA.HistorianAccess.GetStoreForwardStatus` + (token `0x06006187`, the private 2-arg overload) does *not* call WCF. + It calls `mdas_GetStorageStatus` (a `calli` against the + `INSQL_MDAS_ERROR (IntPtr handle, uint, HISTORIAN_STORAGE_STATUS*)` C + signature in `current/aahClient.dll` exports) and then maps the result + through `HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus` + (token `0x060060E4`). +- Mutators like `CConfigStatusClient.SetMdasStoreForwardEvent` + (token `0x060029DC`) and `aahClientCommon.CStatus.SetStoreForwardEvent` + (token `0x06002A04`) are wired to the WCF callback + `IStatusServiceContract2.SetStoreForwardEvent` + (`StatusServiceContract.IStatusServiceContract2.SetStoreForwardEvent`, + token `0x06005F57`). The server *pushes* SF state changes; the client + caches them. +- Confirm: read the IL of token `0x06006187` and verify the only system call + is `mdas_GetStorageStatus`. 
The first 200 instructions confirm this: + `GetClient(ConnectionIndex)` → `calli` against the + `INSQL_MDAS_ERROR(IntPtr,uint,HISTORIAN_STORAGE_STATUS*)` signature → + `ConvertUnmanagedSFStorageStatusToManagedStorageStatus`. + +Implication: **the SDK cannot ship a synchronous probe that calls one WCF +operation and gets the answer**. It must subscribe to the same status-event +stream the native wrapper subscribes to, or call a status query that returns +the cached snapshot from the server. + +### Q2. Is there a single-shot WCF query that returns the same snapshot? + +Likely yes. Hypothesis: `IStatusServiceContract2.GetHistorianInfo` +(`GETHI`, see `IStatusServiceContract2.cs:24-30`) returns a multi-key status +blob whose schema includes SF state. Alternative: a status-only key passed to +`GetSystemParameter` (already plumbed via `HistorianWcfStatusClient.GetSystemParameterAsync`). +Both are testable without writing protocol code by sending probe payloads +and observing the response shape. + +### Q3. Does SF have its own sidecar process / pipe / WCF endpoint we are missing? + +Strong evidence the answer is yes when SF is *enabled*: + +- `aahClientCommon.CSFConnection.GetSFPipeName` (token `0x06004B72`), + `GetSFPath` (`0x06004B71`), `IsConnected` (`0x06004B73`), `IsEnabled` + (`0x06004B6F`) — there is a separately-named SF Named Pipe distinct + from the main MDAS pipe. +- `aahClientCommon.CSFConnection.StartStoreforward` (token `0x06004BC6`). +- `IStorageServiceContract` already declares `GetStoreForwardParameter` + / `SetStoreForwardParameter` (`GetSFP`/`SetSFP`, + see `IStorageServiceContract.cs:81-85`) and `Storage` is a separate + WCF service slot in `HistorianWcfServiceNames.cs:15`. +- `CWcfConfig.ConfigurePipeProxy` (token + `0x06004B1C`) and `CWcfConfig.ConfigureTcpProxy` + (token `0x06004B1B`) confirm the storage proxy supports both transports — + same dual-transport pattern the History/Retrieval proxies use. 
+- `CStorageEngineConsoleClient.GetPipeNameStr` (token `0x06000E2D`) / + `GetFullPipeNameStr` (token `0x06000E2E`) wraps the storage-engine + console pipe via `STransactPipeClient2` (a *non-WCF* binary pipe + protocol). + +Open: **is the SF sidecar even running on the dev host this SDK is being +tested against?** `handoff.md` does not record an SF process being +observed. `aveva-install-x64/` and `aveva-install-x86/` ship only DLLs +(no `aahStoreForwardClient.exe` / `aahSFClient.exe` / similar). The SF +sidecar is part of the Historian *server* install, not the client +redistributable. So: + +- On the developer machine, SF is reachable only because the local + Historian server is installed. +- A pure-client install (the deployment target this SDK ships into) may + *never* have SF. + +This shapes the success criteria: when SF is not configured, a correct +implementation returns `Pending = false`, `ErrorOccurred = false`, +`DataStored = false`, `Storing = false` — i.e. the same shape the +synthesized defaults produce today. The interesting case is *when SF is +configured and active*. + +### Q4. Is SF state authoritative on the Historian server or on a per-client basis? + +Native wrapper reads it from `HistorianClient*` (the per-connection C++ +object). This means it is *connection-scoped* server-pushed state. We +do not need to enumerate cluster-wide SF state — the server reports +"my SF buffer for this client's writes" only. This matches our read-only +mission: we are not a writer, so the only SF state of interest is the +server-side cache for *other* writers, which the server can report to +us as a passive observer. + +### Q5. Does any SF probe require Admin? + +`CSFConnection.GetSFPipeName` returns a kernel object name. Reading +from it requires the pipe ACL to permit the caller. If the SF pipe is +ACL'd to `LocalSystem` only, the SDK cannot read it without +impersonation — and the SDK runs as the calling process. This is a +hard limit, not a bug. + +## 3. 
Discovery Workstreams + +Run A, B, and C in parallel, D after A and C, and E only if D fails. Apart +from D, none require a live server beyond what the existing test rig has. + +### Workstream A — Static IL inspection (parallel-safe, read-only) + +Owner action items, in order: + +1. Dump full IL of token `0x06006187` + (`HistorianAccess.GetStoreForwardStatus(ConnectionIndex,out)`): + ```powershell + dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- ` + dnlib-method current\aahClientManaged.dll HistorianAccess.GetStoreForwardStatus --instructions + ``` + Save under `docs/reverse-engineering/historianaccess-getstoreforwardstatus-il-latest.txt`. + Confirm the `calli` target signature + `INSQL_MDAS_ERROR(IntPtr,uint,HISTORIAN_STORAGE_STATUS*)` and that + the method touches no WCF entry points. +2. Dump IL of `HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus` + (token `0x060060E4`). This is the unmanaged→managed mapping; it + tells us which fields of `HISTORIAN_STORAGE_STATUS` populate which + fields of `HistorianStoreForwardStatus`. We will need the same + mapping in reverse on the wire response. +3. Inventory every method that *writes* into the local SF status + struct: + ``` + methods current\aahClientManaged.dll SetStoreForward + methods current\aahClientManaged.dll SetMdasStoreForward + ``` + The known set as of writing: + `CConfigStatusClient.SetMdasStoreForwardEvent` (`0x060029DC`), + `aahClientCommon.CStatus.SetStoreForwardEvent` (`0x06002A04`), + `CStatusConnectionDirect.SetStoreForwardEvent` (`0x06004DF8`), + `CStatusConnectionWCF.SetStoreForwardEvent` (`0x06004E4E`), + `CClientCommon.SetStoreForwardEventOnServer` (`0x06002EC0`). + The `WCF` variant is the one whose IL maps onto + `IStatusServiceContract2.SetStoreForwardEvent` + (token `0x06005F57`) — read its IL and document the request/response + shape. +4. Dump the signature of `IStatusServiceContract2.SetStoreForwardEvent` + (`0x06005F57`) and record its parameter types. 
The `[OperationContract]` + declaration in the wrapper assembly already encodes the wire shape; + this gives us the bytes the server pushes us. + +### Workstream B — Install inventory (parallel-safe) + +1. Inventory `aveva-install-x64\` and `aveva-install-x86\` for any + binary whose name contains `Store`, `Forward`, `SF`, `Cache`, + `Spool`. As of this checkout: **none**, only DLLs. Confirm. +2. Inventory the deployed Historian server (out-of-band; not in this + repo) for `aahStoreForwardClient.exe`, + `aahStoreForwardServer.exe`, `aahSFCache.exe`, or any service + registered with `Description` matching `*Forward*`. Capture the + service name, account identity, and pipe ACLs (`accesschk -wuvc`). +3. Walk the registry: `HKLM\SOFTWARE\ArchestrA\Historian` and any + sub-key matching `*StoreForward*`, recording paths and pipe names. + Sanitize before committing. + +### Workstream C — WCF probe (parallel-safe) + +Use the existing `wcf-probe` and `wcf-status` subcommands of +`tools\AVEVA.Historian.ReverseEngineering`: + +1. `wcf-probe $env:HISTORIAN_HOST 32568` — confirm `Storage/GetV` is + reachable. (It is the third service slot in + `HistorianWcfServiceNames`.) Document the returned interface + version. +2. `wcf-status $env:HISTORIAN_HOST 32568 ` — sweep + plausible SF parameter names (`SF.Status`, `StoreForward.State`, + `SFCacheBytes`, etc.) through `GetSystemParameter` and record what + the server accepts. Cheap, read-only, no session needed beyond the + already-decoded auth chain. +3. Probe `GetHistorianInfo` (`GETHI`, + `IStatusServiceContract2.cs:24`) with the byte request shape used + by the native wrapper. The request bytes are visible if we run + `instrument-wcf-readquery`-style instrumentation against + `CConfigStatusClient.SetMdasStoreForwardEvent`'s upstream caller — + see Workstream D. + +### Workstream D — Native capture (sequential after A and C) + +Two captures are needed: + +1. 
**Native call to `mdas_GetStorageStatus`.** Run + `tools\AVEVA.Historian.NativeTraceHarness` with a new scenario + `--scenario sfstatus` (to be added) that invokes + `HistorianAccess.GetStoreForwardStatus()` and dumps the + `HISTORIAN_STORAGE_STATUS` C struct memory before the managed + conversion runs. This pins the binary layout of the struct + (offsets, field widths, endianness) without us guessing. +2. **WCF push of SF events.** Configure the local Historian to enter + SF mode (stop the runtime DB writer; let the writer's queue + trigger SF) and capture the WCF traffic with the existing + `instrument-wcf-readquery` sibling — i.e. add an + `instrument-wcf-setstoreforwardevent` subcommand that + IL-rewrites `aahClientManaged.dll` to log the bytes the server + sends to `IStatusServiceContract2.SetStoreForwardEvent`. Save + the rewrite under `docs/reverse-engineering/dnlib-write-copy/`, + never `current/`. + +Workstream D is the only step that needs an actively-storing SF +sidecar. Plan: stop the Historian Runtime DB SQL service, write a +single test point via the wrapper's writer harness, and capture the +SF event push, then restart Runtime DB and capture the +"end-of-SF / data drained" push. + +### Workstream E — On-disk cache (only if Workstream D fails) + +If the WCF push protocol turns out to be impractical to reproduce +(e.g. requires duplex contract, callback channel, or a server-side +session-bind we cannot match from our managed client), fall back to +inspecting the on-disk SF cache directly. Steps: + +1. Resolve `CSFConnection.GetSFPath` IL to find the cache directory + convention (likely `%ProgramData%\ArchestrA\Historian\Cache\` or + similar — to be confirmed, **never assume the path**). +2. Inventory file types: `.sfdata`, `.sfindex`, `.cache` — whatever + the directory contains. +3. Decode the file header. The presence/size of `.sfdata` files is + sufficient to populate `DataStored` and `Pending`; we do not + need to decode the value payload. 
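The file-inventory part of this fallback could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the planned implementation: the helper name `SfCacheInventory`, the `*.sfdata` search pattern, and the rule "any non-empty cache file means `DataStored` and `Pending`" are all hypothetical. The real directory and file types have to come from the `CSFConnection.GetSFPath` IL and the actual directory listing.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: summarizes an SF cache directory without decoding any payload.
// The "*.sfdata" pattern is a placeholder until Workstream E.2 records the real file types.
public static class SfCacheInventory
{
    public readonly record struct Summary(int FileCount, long TotalBytes)
    {
        // Assumption: any non-empty cache file means data is stored and still pending.
        public bool DataStored => TotalBytes > 0;
        public bool Pending => TotalBytes > 0;
    }

    public static Summary Scan(string cacheDirectory, string searchPattern = "*.sfdata")
    {
        // A missing directory is treated as "no SF configured", not as an error.
        if (!Directory.Exists(cacheDirectory))
        {
            return new Summary(0, 0);
        }

        var files = new DirectoryInfo(cacheDirectory)
            .EnumerateFiles(searchPattern, SearchOption.TopDirectoryOnly)
            .ToList();

        return new Summary(files.Count, files.Sum(f => f.Length));
    }
}
```

The cache directory is deliberately a parameter with no default, in line with the "never assume the path" rule above.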
+ +This fallback is only for `DataStored` / `Pending`. `Storing` and +`Error` fundamentally require a live server-state read. + +## 4. Concrete Reverse-Engineering Steps (execution order) + +Mirrors the read/event decoding workflow that succeeded for raw +queries. + +### Step 1 — Find native methods that touch SF + +Already done; baseline evidence is recorded in §2 Q1/Q3 above. Key +tokens to reference: + +- `0x06006186`, `0x06006187` — public/private + `HistorianAccess.GetStoreForwardStatus` +- `0x060060E4` — + `HistorianAccessUtil.ConvertUnmanagedSFStorageStatusToManagedStorageStatus` +- `0x060029DC` — `CConfigStatusClient.SetMdasStoreForwardEvent` +- `0x06002A04` — `aahClientCommon.CStatus.SetStoreForwardEvent` +- `0x06002DFF` — `aahClientCommon.CClientCommon.IsInStoreForward` +- `0x06002E18` — `aahClientCommon.CClientCommon.SetStoreForwardParams` +- `0x06002EC0` — `CClientCommon.SetStoreForwardEventOnServer` +- `0x06004BC6` — `aahClientCommon.CSFConnection.StartStoreforward` +- `0x06004B6F`..`0x06004B73` — CSFConnection getters (path, pipe, + enabled, connected) +- `0x06004DF8`, `0x06004E4E` — direct vs WCF status connections +- `0x06005F57` — `IStatusServiceContract2.SetStoreForwardEvent` MD ref +- `0x06006193` — `HistorianAccess.IsBothConnectionRequested` (used by + the public arity-0 GetStoreForwardStatus to decide whether to fan + out to a redundant partner) + +### Step 2 — Decode `HISTORIAN_STORAGE_STATUS` layout + +Run Workstream A.2 (decode token `0x060060E4`) and Workstream D.1 +(native struct memory dump). Together they pin the field layout. + +The managed struct fields we already know we need to populate +(from `HistorianStoreForwardStatus.cs`): +`ServerName`, `Pending`, `ErrorOccurred`, `Error`, `DataStored`, +`Storing`, `ConnectionKind`. The native struct will have ≥7 +fields plus padding. Express the mapping as a comment table in +the implementation. + +### Step 3 — Decide the wire model + +Two possible implementations: + +1. 
**Push-mode (native parity).** SDK opens an authenticated WCF + session that the server treats as a status subscriber, listens + for `IStatusServiceContract2.SetStoreForwardEvent` callbacks, + maintains a local cache, and `GetStoreForwardStatusAsync` + returns from the cache. This requires WCF duplex + (`CallbackContract`) which is not currently exercised + anywhere in `src/AVEVA.Historian.Client/Wcf/`. +2. **Pull-mode (probe).** SDK calls `GetHistorianInfo` (`GETHI`) + or a discovered `Storage`-service equivalent and maps the + one-shot response. No subscription state required. + +Pull-mode is strongly preferred: it matches the SDK's existing +WCF style, avoids duplex contracts, and the existing code path +in `HistorianWcfStatusClient.GetSystemParameter` is the right +shape. Only fall back to push-mode if Workstream C.3 proves the +server has no pull endpoint that returns SF state. + +### Step 4 — Implement the managed contract method + +Once Step 3 picks pull-mode, implement against the WCF contract +(likely a new `[OperationContract]` on `IStatusServiceContract2` +or a method on `IStorageServiceContract`). Follow the existing +parameter-naming discipline from the resolved +`ValidateClientCredential` blocker: +**use `[MessageParameter(Name = "...")]` to match exact server +element names — do not let WCF derive them from C# parameter +names.** See `handoff.md` "Active Blocker" entry for the +2026-05-04 fix. + +### Step 5 — Add golden-byte fixtures + +Add a request and response fixture under +`fixtures/protocol/store-forward-status/`: + +- `request-get-storage-status.bin` — bytes the SDK sends. +- `response-get-storage-status-running-normal.bin` — server + not in SF. +- `response-get-storage-status-active-sf.bin` — server actively + storing. +- `response-get-storage-status-error.bin` — server's SF errored. + +Capture sources: the same instrumented native wrapper runs that +populate Workstream D. Sanitize hostnames, GUIDs, and timestamps +before committing. 
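For the golden-byte comparisons these fixtures feed, a small helper that reports where two buffers diverge makes failures much easier to read. The sketch below is illustrative only; `GoldenBytes` and its members are hypothetical names, not existing test-project code.

```csharp
using System;

// Hypothetical test helper for illustration; it is not part of the existing test project.
public static class GoldenBytes
{
    // Returns the offset of the first differing byte, or -1 when both buffers match.
    // When one buffer is a strict prefix of the other, the mismatch is reported at the
    // length of the shorter buffer.
    public static int FirstMismatch(ReadOnlySpan<byte> expected, ReadOnlySpan<byte> actual)
    {
        int shared = Math.Min(expected.Length, actual.Length);
        for (int i = 0; i < shared; i++)
        {
            if (expected[i] != actual[i])
            {
                return i;
            }
        }
        return expected.Length == actual.Length ? -1 : shared;
    }

    // Renders up to `count` bytes starting at `offset` as hex, for assertion messages.
    public static string HexAt(ReadOnlySpan<byte> buffer, int offset, int count = 16)
    {
        if (offset < 0 || offset >= buffer.Length)
        {
            return string.Empty;
        }
        int length = Math.Min(count, buffer.Length - offset);
        return Convert.ToHexString(buffer.Slice(offset, length));
    }
}
```

A test would load a fixture such as `request-get-storage-status.bin` with `File.ReadAllBytes`, pass it and the serializer output to `FirstMismatch`, and on a non-negative result include `HexAt` of both buffers in the failure message.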
+ +### Step 6 — Replace the synthesized stub + +Replace `SynthesizeStoreForwardStatus` (lines 107-117 of +`HistorianWcfStatusClient.cs`) with a real implementation. Keep +the synthesized fallback for the case where the storage service +returns a "no SF configured" sentinel — that is *not* an error +condition, it is the normal state for client-only deployments. + +Add a unit test class `WcfStoreForwardStatusProtocolTests` next +to the existing `WcfDataQueryProtocolTests` etc., with golden-byte +parse tests using the fixtures from Step 5. + +Update the operation status table in `README.md:20` from +"synthesized defaults (no SF sidecar to probe)" to +"live-verified" once the integration test passes. + +## 5. Risks and Gotchas + +1. **SF may not be present on the test host.** The dev Historian + probably has SF disabled by default; turning it on means + stopping Runtime DB SQL services, which is invasive. Plan to do + capture work on a dedicated sacrificial Historian VM, not the + shared dev box. +2. **SF sidecar may require Admin or LocalSystem to query.** Any + pipe-direct fallback (Workstream E) will fail under standard + user accounts. Document the privilege requirement explicitly + in the SDK XML doc comments on `GetStoreForwardStatusAsync`. +3. **State is volatile.** Probes that take >100 ms can race + against the server's own SF state machine. Capture *both* + request and response in the same instrumented run; do not + try to correlate two captures. +4. **Push-mode would force a duplex WCF contract.** None of the + existing decoded operations use duplex. Adding it widens the + managed WCF surface significantly and risks .NET-WCF + compatibility issues we have not yet hit. Pull-mode first. +5. 
**The wrapper's `IsBothConnectionRequested` (token `0x06006193`) + path indicates a "primary + partner" topology.** Out of scope + for this pass per §1, but if the server returns partner data + in the same response we must skip-decode (not throw on) + unknown trailing bytes. +6. **`Open2`-only sessions never receive SF events.** `handoff.md` + "Active Blocker" notes the wrapper's full chain + (`OpenConnection3` after the `ValCl` rounds) is the path that + produces a session the server treats as a real client. SF + probes must run from inside that chain — re-using + `HistorianWcfAuthChainHelper.OpenAuthenticatedConnection`, + the same call site already used by `GetSystemParameter` at + `HistorianWcfStatusClient.cs:42`. +7. **`HISTORIAN_STORAGE_STATUS` field order is not contractual.** + The struct is C++ inside the closed source. If AVEVA reorders + fields between Historian versions, our decoder breaks. Pin the + decoder to the Historian server version observed at session + open (already exposed via `IRetrievalServiceContractN`) and + reject mismatched versions explicitly with + `ProtocolEvidenceMissingException`. Do not silently best-effort + parse. +8. **Sanitization.** Pipe names, registry paths, and SF cache + directory paths can leak hostnames and account names. Run the + `rg` sanitizer (handoff.md "Next Pickup Steps") after every + doc edit. + +## 6. Success Criteria + +A real implementation is "done" when all of the following hold: + +1. `client.GetStoreForwardStatusAsync()` returns + `Pending = true` and `Storing = true` while the local + Historian's SF cache is actively buffering writes (verifiable + by stopping the Runtime DB and writing a value). +2. Returns `Pending = false` and `Storing = false` within + ≤ 5 seconds after the Runtime DB recovers and SF drains. +3. Returns `ErrorOccurred = true` and a non-null, actionable + `Error` message when the SF cache itself fails (disk full, + pipe closed, etc.). +4. 
Returns the synthesized "no SF" shape (all-false) without + throwing on a Historian where SF is not configured. +5. Two new golden-byte unit tests pass (active-SF and idle-SF + responses). +6. `ProtocolGuardrailTests` no longer needs to exempt + `GetStoreForwardStatusAsync` from any "must throw + `ProtocolEvidenceMissingException`" rule — the method is now + evidence-backed. +7. Live integration test + `HistorianClientIntegrationTests.GetStoreForwardStatusAsync_ReturnsServerState` + (to be added) passes when `HISTORIAN_HOST` is set, skips + cleanly otherwise. +8. `README.md:20` operation status table is updated from + "synthesized defaults" to "live-verified". + +## 7. Open Questions for the Implementer + +Resolve these before writing production code: + +1. Does the server expose a *pull* endpoint that returns the full + `HISTORIAN_STORAGE_STATUS` snapshot, or only push events? + (Workstream C.3 answers this.) +2. What is the binary layout of `HISTORIAN_STORAGE_STATUS`? + (Workstream A.2 + D.1.) +3. What is the `[OperationContract]` shape on + `IStatusServiceContract2.SetStoreForwardEvent`? Specifically: + parameter count, byte-buffer parameters, and exact + `MessageParameter` names? (Workstream A.4.) +4. Is the `Storage` service slot at + `net.pipe:///Storage` and `net.tcp://:32568/Storage` + reachable on a non-Historian-server install? Or does it 404 + when only the client redistributable is present? (Workstream + B + C.1.) +5. Does the SF status snapshot include partner / redundant SF + state inline, or is it returned from a separate call? + (Workstream A.1, look for branches under + `IsBothConnectionRequested`.) +6. Does the SF status read require `OpenConnection3` to have + succeeded, or is `Open2` enough? (Trial: try the discovered + pull endpoint after `Open2` only, before doing + `OpenConnection3`. If it works, the implementation is much + simpler.) +7. What happens when SF is *disabled* by configuration vs + *enabled but idle*? 
Both should map to `Pending=false, + Storing=false`, but the underlying server response may be a + sentinel error vs an all-zeros struct. The implementation must + distinguish "no SF" (return defaults silently) from "SF errored" + (return `ErrorOccurred = true`). + +## 8. Out of Scope + +Explicitly not part of this plan: + +- SF write-back (the project mission is read-only; + `IStorageServiceContract.AddStreamValues` etc. stay + unimplemented). +- Setting SF parameters + (`IStorageServiceContract.SetStoreForwardParameter`). +- Redundant-partner SF aggregation + (`HistorianStoreForwardStatus.AddPartnerStoreForwardStatus`). +- Reverse-engineering the on-disk SF cache file format beyond + presence / file count (Workstream E is a fallback, not a + primary deliverable). +- Anything in the + `aahClientCommon.CSFConnection.StartStoreforward` / + `SetStorageStopped` / `SetTagSynchronized` write surface. diff --git a/docs/plans/write-commands-reverse-engineering.md b/docs/plans/write-commands-reverse-engineering.md new file mode 100644 index 0000000..c4bc4b4 --- /dev/null +++ b/docs/plans/write-commands-reverse-engineering.md @@ -0,0 +1,425 @@ +# Plan: Reverse-Engineering Write Commands + +Status: PLAN ONLY (no implementation yet). Extends the read/event +work in `docs/reverse-engineering/handoff.md` (2026-05-04). + +## 1. Goal + +"Write commands work" means the production SDK at +`src/AVEVA.Historian.Client/` performs these operations end-to-end +against a live AVEVA Historian, with parsed responses, golden-byte +unit tests, and gated live integration tests. + +In scope: + +1. **`AddS2` (`IHistoryServiceContract2.AddStreamValues2`)** — push + one or more timestamped samples for an existing historized tag. + Primary use case: an OPC UA driver pushing values to the + Historian. +2. 
**`EnsT2` (`IHistoryServiceContract2.EnsureTags2`) for + analog/discrete/string data tags** — partially decoded for the + `CM_EVENT` AnE-event tag in + `src/AVEVA.Historian.Client/Wcf/HistorianAddTagsProtocol.cs`. The + `CTagMetadata` byte layout for `CDataType` ∈ {1, 2, 3, 4} is the + new evidence target. +3. **`DelT` (`IHistoryServiceContract2.DeleteTags`)** — needed for + safe sandbox cleanup during RE. +4. **`ModifyData` / `DeleteData`** — only if §3.4 method discovery + confirms a managed WCF op exists. + +Out of scope: tag-extended-properties (`AddTEx` / `DelTep`), +`ExKey`, `SetSFP`, snapshot send (`SendSnapshotBegin/End/Snapshot`), +tag-id-pair maintenance, shard splits, flush ops, all +`IStorageServiceContract` writes (engine-internal — see §6.d), event +writes (events come from AVEVA AnE, we only read them), schema +changes (forbidden over the wire). + +## 2. Safety Constraints + +The Runtime DB is production data even on `localhost`. `AddS2` +writes are persistent — they go to compressed history blocks and +cannot be removed through any client-facing surface. + +Hard rules: + +1. **Single dedicated sandbox tag.** Add env var + `HISTORIAN_WRITE_SANDBOX_TAG = "RetestSdkWriteSandbox"`. Live + write tests refuse to run when unset, even when other + `HISTORIAN_*` vars are set. +2. **Never write to** any tag named in `HISTORIAN_TEST_TAG`, + `HISTORIAN_TAG_FILTER`, the docs, the test fixtures, or the + captured RE ndjson. The read fixture + `OtOpcUaParityTest_001.Counter` is OFF-LIMITS for writes. +3. **Documented rollback.** Every write session records its time + window to + `artifacts/reverse-engineering/write-sandbox-window-.json` + so SQL `SELECT * FROM History WHERE wwTagKey = ? AND DateTime + BETWEEN @s AND @e` can identify exactly which rows the session + inserted. Tag rollback is via decoded `DelT` (§3.3) once + available, or manually via System Management Console until then. +4. 
**Time bounds on writes.** Every `AddS2` test uses + `DateTime.UtcNow` ± a small offset, so writes always land inside + the live `RealTimeWindow` / `FutureTimeThreshold` system + parameters and cannot accidentally overwrite older blocks. +5. **No customer / corporate hosts.** `localhost` only. +6. **Sanitization scan after every session:** + `rg -n "(?i)(password|credential|secret|token|||)" docs\reverse-engineering scripts tools docs\plans`. + +Soft rules: + +- Use a separate captures dir + (`artifacts/reverse-engineering/instrumented-wcf-writemessage-writes/`) + so write captures don't contaminate the existing read/event + ndjson. +- New integration tests follow the existing gating pattern in + `tests/AVEVA.Historian.Client.Tests/HistorianClientIntegrationTests.cs` + (`Skip = ...` when env var unset). + +## 3. Discovery Workstreams + +### 3.1 EnsT2 for analog/discrete/string tags (priority 1) + +- WCF op: `aa/Hist/EnsT2`. +- Contract: + `src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:82-89`, + already declared with `[MessageParameter(Name = "InBuff" / "OutBuff")]`. +- Existing code: `HistorianAddTagsProtocol.SerializeCmEventCTagMetadata` + builds the `CDataType=5` (event) shape. +- Missing: the `CTagMetadata` byte layout for `CDataType ∈ {1, 2, + 3, 4}` (analog double, discrete, string, analog int per the + type-code table in `data-query-request-ctor-il-latest.txt`); + whether the optional-mask `0x0086` and the 5-byte trailer + `2F 27 01 01 01` change per type; analog engineering-units / range + / deadband fields (likely populate the bytes that are zero in the + event-tag fixture). + +### 3.2 AddS2 stream values (priority 1) + +- WCF op: `aa/Hist/AddS2`. +- Contract: + `src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:75-80`, + already has `[MessageParameter(Name = "pBuf")]`. 
**Audit + requirement:** verify against `ildasm aahClientAccessPoint.exe` + that `Handle` and `errorBuffer` parameter names also match — the + handoff's parameter-name-mismatch class has bitten ~30 ops. +- Missing: entire `pBuf` byte layout (likely `UInt16 version + UInt32 + sampleCount + N × {tagId GUID, FILETIME, qualityByte, value typed + by CDataType}`); whether `Handle` is the same Open2 v6 session GUID + as `UpdC3`/`RTag2`/`EnsT2`; the auth-chain prereqs (event flow + needed Stat priming + Trx/Stat/Retr `GetV` between RTag2 and EnsT2; + writes may have a different chain); success vs error response + shape. + +### 3.3 DelT tag deletion (priority 2 — needed for safe RE) + +- WCF op: `aa/Hist/DelT`. +- Contract: + `src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:21-30`. +- Missing: `tagNames` byte layout (likely length-prefixed + compact-ASCII per the handoff convention); whether server refuses + to delete tags with stored history or cascades; whether `DelT` is + sufficient to fully unregister or leaves orphan rows in + `Runtime.dbo.Tag`. + +### 3.4 ModifyData / DeleteData (priority 3 — exists?) + +No corresponding WCF op is currently declared. **First step:** static +inspection to confirm any managed wrapper exists. + +```powershell +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EditValue +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll ModifyValue +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EditData +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll DeleteData +``` + +If no managed wrapper exists, this op is REST-only / SMC-only — +mark as **out of scope** in this doc. Otherwise decode like +§3.1/§3.2. 
+ +Parallelism: 3.1 and 3.3 can be developed in parallel because the +operator can create the sandbox tag manually via SMC while SDK code +is being written. 3.2 cannot meaningfully proceed until 3.1 (or the +manual tag) exists. 3.4 method discovery is cheap and may eliminate +its own scope. + +## 4. RE Steps in Execution Order + +For each workstream above, run these five steps. Mirrors the read ++ event flows that recovered the existing protocol. + +### 4.a Static method discovery + +Find the native serializer: + +```powershell +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll AddS +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EnsureTag +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll DeleteTag +``` + +Dump IL for each method of interest: + +```powershell +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- dnlib-method --instructions current\aahClientManaged.dll +``` + +Save sanitized excerpts to +`docs/reverse-engineering/dnlib--il-latest.txt`. 
+ +### 4.b Wire-byte capture for the request + +Same IL-rewrite tooling that captured the 27 outgoing event calls: + +```powershell +$captureDir = "artifacts\reverse-engineering\instrumented-wcf-writemessage-writes" +New-Item -ItemType Directory -Force -Path $captureDir | Out-Null +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- instrument-wcf-writemessage current\aahClientManaged.dll "$captureDir\aahClientManaged.dll" +Copy-Item -Force "$captureDir\aahClientManaged.dll" "$captureDir\current-copy\aahClientManaged.dll" +$env:AVEVA_HISTORIAN_RE_CAPTURE = (Resolve-Path $captureDir).Path + "\writemessage-capture-write-latest.ndjson" +``` + +A new harness scenario `--scenario write` needs to be added to +`tools/AVEVA.Historian.NativeTraceHarness` to drive the native +wrapper's `AddStreamValues2` against the sandbox tag. Suggested +new args: `--write-sandbox-tag`, `--write-value`. + +### 4.c Wire-byte capture for the response + +Symmetric `instrument-wcf-readmessage`: + +```powershell +dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- instrument-wcf-readmessage current\aahClientManaged.dll "$captureDir\aahClientManaged.dll" +``` + +The success response for `AddS2` is just `true` + +empty `errorBuffer`. **Capture at least one negative case** (write +to non-existent tag, or write with malformed CDataType) so the +orchestrator can surface diagnostics like +`HistorianWcfEventOrchestrator.LastErrorBufferDescription`. + +### 4.d Decode against IL + +Strip SOAP/MDAS envelope; align byte offsets against the native +serializer IL from 4.a (the `ldc.i4 / call WriteByte` sequence +makes field order and constants explicit); cross-reference the +`CDataType` table from `data-query-request-ctor-il-latest.txt` to +interpret typed value bytes; write a parser-and-builder pair and +verify against the captured bytes before committing. 
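A parser-and-builder pair of the kind described in 4.d might start out like the sketch below. It encodes only the *unverified guess* from §3.2 (`UInt16 version`, `UInt32 sampleCount`, then per-sample GUID, FILETIME, quality byte, value) and additionally assumes little-endian encoding and an 8-byte double value. `SampleSketch` and `AddStreamValuesSketch` are hypothetical names, and none of this layout is decoded evidence.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Sketch only: the layout is the unverified hypothesis from §3.2, not a decoded format.
public readonly record struct SampleSketch(Guid TagId, DateTime TimestampUtc, byte Quality, double Value);

public static class AddStreamValuesSketch
{
    public static byte[] Build(ushort version, IReadOnlyList<SampleSketch> samples)
    {
        using var stream = new MemoryStream();
        using var writer = new BinaryWriter(stream);        // BinaryWriter emits little-endian
        writer.Write(version);                              // UInt16 version (assumed)
        writer.Write((uint)samples.Count);                  // UInt32 sampleCount (assumed)
        foreach (var sample in samples)
        {
            writer.Write(sample.TagId.ToByteArray());       // 16-byte GUID
            writer.Write(sample.TimestampUtc.ToFileTimeUtc()); // FILETIME as Int64
            writer.Write(sample.Quality);                   // single quality byte (assumed)
            writer.Write(sample.Value);                     // 8-byte double (assumed)
        }
        writer.Flush();
        return stream.ToArray();
    }

    public static (ushort Version, List<SampleSketch> Samples) Parse(byte[] buffer)
    {
        using var reader = new BinaryReader(new MemoryStream(buffer));
        ushort version = reader.ReadUInt16();
        uint count = reader.ReadUInt32();
        var samples = new List<SampleSketch>();
        for (uint i = 0; i < count; i++)
        {
            var tagId = new Guid(reader.ReadBytes(16));
            var timestamp = DateTime.FromFileTimeUtc(reader.ReadInt64());
            byte quality = reader.ReadByte();
            double value = reader.ReadDouble();
            samples.Add(new SampleSketch(tagId, timestamp, quality, value));
        }
        return (version, samples);
    }
}
```

The useful property is the round trip: `Parse(Build(...))` should reproduce its input, and once real captures exist, `Build` output should match the captured request bytes exactly. Any mismatch means the hypothesized layout is wrong and must be corrected from the IL before anything is committed.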
+ +### 4.e Implement managed serializer + tests + +New code under `src/AVEVA.Historian.Client/Wcf/`: + +- `HistorianAddStreamValuesProtocol.cs` — `Serialize(...)` returns + `byte[] pBuf`, mirroring `HistorianAddTagsProtocol`. +- Extend (or split) `HistorianAddTagsProtocol` for the analog / + discrete / string `EnsT2` shapes. +- `HistorianWcfWriteOrchestrator.cs` — chains `Hist.GetV → + Hist.ValCl × 2 → Hist.Open2 → UpdC3 → priming chain (TBD per + §3.2) → AddS2 loop → Close2`. + +Public surface on `HistorianClient`: + +- `WriteValueAsync(tag, value, timestampUtc, quality)` +- `WriteValuesAsync(IReadOnlyList)` +- `EnsureTagAsync(HistorianTagDefinition)` +- `DeleteTagAsync(string tagName)` + +Until evidence supports each path, throw +`ProtocolEvidenceMissingException` (mirrors the existing read +guardrail). + +Unit tests under `tests/AVEVA.Historian.Client.Tests/Wcf/`: + +- `WcfAddStreamValuesProtocolTests` — golden-byte tests for one + analog, one discrete, one string write. +- `WcfEnsureTagsProtocolTests` — golden-byte tests for the + analog/discrete/string `CTagMetadata` shapes. +- Extend `ProtocolGuardrailTests` so any not-yet-implemented write + path still throws `ProtocolEvidenceMissingException`. + +Live integration tests in `HistorianClientIntegrationTests.cs`, +gated on `HISTORIAN_WRITE_SANDBOX_TAG`: +`WriteValueAsync_WithinDocumentedWindow_PersistsToHistorianDb` +writes a unique value, reads it back via `ReadRawAsync`, and +verifies via direct `sqlcmd` to the History extension table. + +## 5. Order of Operations + +``` +3.4 method discovery (cheap; may eliminate scope) + │ + ▼ +3.1 EnsT2 (analog/discrete/string) ──► sandbox tag exists + │ + ├─────────────────────────────┐ + ▼ ▼ +3.2 AddS2 (priority 1) 3.3 DelT (sandbox cleanup) + │ + ▼ +3.4 ModifyData/DeleteData (only if 3.4 confirmed scope) + │ + ▼ +public surface, golden-byte tests, integration tests +``` + +3.2 is the headline win and depends only on 3.1 (or a manually +created sandbox tag). 
3.3 must land before any commit that +programmatically creates new tags; until then, manual SMC deletion +is the documented rollback. + +## 6. Risks and Mitigations + +### 6.a Auth chain may differ for writes + +Reads use `Hist.Open2(ConnectionMode = 0x402)`. Events use the same +`0x402` plus a Stat-priming chain. Writes may need a different +mode (the handoff notes `0x501` was an unverified guess for +events; writes may legitimately need `0x401` or another value). + +Mitigation: capture the *full* WriteMessage sequence for a native +write session (not just `AddS2`) to see what `Open2` payload and +priming calls the native wrapper sends. + +### 6.b Server-side session-table requirement + +Writes may require `RTag2` after `EnsT2` and before `AddS2` (the +event flow needs `RTag2(CmEventTagId)`). The "tag identifier" the +server returns from `EnsT2` may differ from the GUID the client +seeded. + +Mitigation: capture the analog `EnsT2` `OutBuff` (event flow's was +a 45-byte echo) and verify whether subsequent `AddS2` payloads +reference the client-seeded GUID, the server-returned GUID, or a +numeric `wwTagKey`. SQL ground truth: `SELECT TagName, wwTagKey +FROM Tag WHERE TagName = '...'`. + +### 6.c Silent-success failure mode + +`AddS2` may return `true` but no row appears in the History +extension table — the engine silently drops samples outside the +`FutureTimeThreshold` / `RealTimeWindow` system parameters (which +the event flow now reads). + +Mitigation: always write at `DateTime.UtcNow`; cross-check with +SQL after every test: + +```sql +SELECT TOP 5 DateTime, Value, QualityDetail +FROM History +WHERE wwTagKey = (SELECT wwTagKey FROM Tag WHERE TagName = @sandbox) + AND DateTime BETWEEN @windowStart AND @windowEnd +ORDER BY DateTime DESC; +``` + +Surface `FutureTimeThreshold` / `RealTimeWindow` via existing +`GetSystemParameterAsync` so failures are diagnosable. 
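To make a silent drop diagnosable, a test could classify the sample timestamp against the two system parameters before comparing with SQL. The sketch below assumes both parameters can be treated as plain durations relative to the server clock; their actual units and semantics are not established in this plan. `WriteWindow`, `Classify`, and `WindowVerdict` are hypothetical names.

```csharp
using System;

public enum WindowVerdict { Accepted, TooFarInFuture, TooOld }

// Hypothetical diagnostic helper (not an existing SDK type).
public static class WriteWindow
{
    // ASSUMPTION: futureThreshold and realTimeWindow are durations measured
    // from serverNowUtc; the caller converts the raw system parameter values.
    public static WindowVerdict Classify(
        DateTime sampleUtc,
        DateTime serverNowUtc,
        TimeSpan futureThreshold,
        TimeSpan realTimeWindow)
    {
        if (sampleUtc > serverNowUtc + futureThreshold)
            return WindowVerdict.TooFarInFuture;
        if (sampleUtc < serverNowUtc - realTimeWindow)
            return WindowVerdict.TooOld;
        return WindowVerdict.Accepted;
    }
}
```

If `AddS2` reports success, the SQL check finds no row, and the verdict is anything other than `Accepted`, the test failure message can name the window that was violated instead of reporting only a missing row.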
+ +### 6.d Storage service vs History service + +`IStorageServiceContract` also exposes `AddT/AddS/AddS2/DelT`. The +working hypothesis is that `/Hist` is client-facing and `/Stor` is +engine-internal, but it's not yet verified. + +Mitigation: the WriteMessage capture (§4.b) shows the actual +service path on the wire. If it goes to `/Stor`, update the +orchestrator. Do NOT preemptively implement against both. + +### 6.e Parameter-name mismatches + +Handoff already flagged `EnsT`, `EnsT2`, `RTag2`, `ExKey`, `StJb`, +`GtJb` for the same `inBuff`/`inputBuffer` mismatch class that +broke reads for weeks. Until each is audited against the server +contract, requests bind to null and the server NREs. + +Mitigation: before the first write WriteMessage capture, run an +`ildasm` audit against `aahClientAccessPoint.exe` for the exact +parameter names of `EnsT2`, `AddS2`, and `DelT`, and reconcile +against the existing `[MessageParameter]` attributes. + +### 6.f Customer-data exposure in capture files + +Write captures contain the sandbox tag name and any value the test +wrote. Not secrets, but noise. + +Mitigation: keep all +`instrumented-wcf-writemessage-writes/` artifacts under +`artifacts/` (already gitignored). Sanitize tag names to +`` before committing decoded bytes into +`docs/reverse-engineering/`. + +## 7. Success Criteria + +Per op: + +- **`EnsT2(analog)`**: `EnsureTagAsync(new HistorianTagDefinition { + Name = sandbox, DataType = Analog })` returns success; + `sqlcmd -E -S . -d Runtime -Q "SELECT TagName FROM Tag WHERE + TagName = '...'"` returns one row. +- **`EnsT2(discrete, string)`**: same shape with corresponding + `DataType`; SQL check uses `DiscreteTag` / `StringTag` view. +- **`AddS2`**: `WriteValueAsync(sandbox, 42.0, DateTime.UtcNow)` + returns success; `ReadRawAsync` returns the value; + `SELECT TOP 1 Value FROM History WHERE wwTagKey = ? AND DateTime + BETWEEN ? AND ?` returns the same value. 
+- **`DelT`**: `DeleteTagAsync(sandbox)` returns success and SQL + returns zero rows from `Tag`. +- **`ModifyData` / `DeleteData`**: deferred until §3.4 method + discovery confirms scope. + +Cross-cutting: + +- All new code in `src/AVEVA.Historian.Client/` is pure managed + .NET 10. No new P/Invoke beyond the existing `HistorianSspiClient`. +- Every new op has a golden-byte unit test. +- `dotnet test .\Histsdk.slnx --no-build --logger + "console;verbosity=minimal"` passes 100%. +- With `HISTORIAN_HOST=localhost`, + `HISTORIAN_WRITE_SANDBOX_TAG=RetestSdkWriteSandbox` set, write + integration tests pass and leave zero residue (test `Dispose` + calls `DelT` for cleanup). +- Sanitization scan returns no real secrets. +- `CLAUDE.md` "Required SDK Surface" updated to add the new write + ops — this is a SCOPE CHANGE that must land *alongside* the + evidence, not before. Do not update the SDK surface doc until + 3.1 + 3.2 are at least live-test-green. + +## 8. Open Questions + +1. Does `AddS2` go through `/Hist` or `/Stor` on the wire? +2. Does the sandbox tag need pre-configuration via System + Management Console once before `EnsT2` will accept it from a + client (e.g. for `Storage` / `wwDomain` rows the wire protocol + may not be able to populate)? +3. What `ConnectionMode` does the native wrapper use for write + sessions — `0x402` (read mode reused), `0x401`, or something + else? +4. Does `EnsT2(analog)` require any optional Archestra + engineering-units fields, or are they purely cosmetic? Affects + how minimal `HistorianTagDefinition` can be. +5. Server-side throttles on writes (max samples per AddS2, max + calls per second) — need to surface as batching guidance? +6. What does the server return when `AddS2` is called with a + timestamp older than the tag's earliest stored block? Some + historians silently drop, some error, some accept-and-overwrite. +7. 
Does the SDK expose write quality as the same + `HistorianSample.Quality` enum used on reads, or a smaller + subset (good/bad)? +8. Is there a managed-side `DelT` path at all? If + `aahClientManaged` only exposes deletion via SMC, §3.3 is + "manual SMC only" and must be documented as such. + +## 9. Docs To Update Once Each Workstream Lands + +- `CLAUDE.md` "Required SDK Surface" — add `WriteValueAsync`, + `EnsureTagAsync`, `DeleteTagAsync` once 3.1+3.2+3.3 land. +- `AGENTS.md` "Required SDK Surface" — same; update the "alarm-event + write path is dormant" note. +- `docs/reverse-engineering/handoff.md` — add a "Write-flow prereqs" + section symmetric to the existing "Event-flow prereqs". +- `docs/reverse-engineering/wcf-contract-evidence.md` — add evidence + rows for `EnsT2(analog/discrete/string)`, `AddS2`, `DelT`. +- `docs/reverse-engineering/implementation-status.md` — flip + status from "out of scope" to "implemented". +- `README.md` — operation status table.