histsdk/docs/plans/write-commands-reverse-engineering.md
dohertj2 6f01b83313 Plan two reverse-engineering campaigns: write commands + store/forward cache
docs/plans/write-commands-reverse-engineering.md (425 lines):
  Plan for adding WriteValueAsync (AddS2 stream values), EnsureTags2 for
  analog/discrete/string tags, and DelT for sandbox cleanup. Hard safety
  rules center on a dedicated sandbox tag gated by env var, time-bounded
  writes, SQL ground-truth verification per session, explicit rollback.
  Five-step RE workflow mirrors the read/event decode (static IL discovery
  -> instrument-wcf-writemessage capture -> instrument-wcf-readmessage
  capture -> byte/IL alignment -> managed serializer + golden-byte tests).
  Risks call out auth-chain unknowns, parameter-name-mismatch class,
  silent-success failure modes, History-vs-Storage service question.

docs/plans/store-forward-cache-reverse-engineering.md (501 lines):
  Plan for replacing the synthesized GetStoreForwardStatusAsync with a
  real implementation. Architecture investigation already partially
  answered via IL inspection during planning: ArchestrA.HistorianAccess.
  GetStoreForwardStatus (token 0x06006187) reads an in-process C struct
  via calli to mdas_GetStorageStatus, kept current by server-pushed WCF
  callbacks (IStatusServiceContract2.SetStoreForwardEvent). CSFConnection.
  GetSFPipeName indicates a separate Named Pipe sidecar exists when SF
  is configured. Five parallelizable discovery workstreams, six concrete
  RE steps with cited tokens, eight risks, eight success criteria.

Both plans deliberately produce no code changes and no captures. They
exist so the next implementer can start with full context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 07:16:32 -04:00

# Plan: Reverse-Engineering Write Commands
Status: PLAN ONLY (no implementation yet). Extends the read/event
work in `docs/reverse-engineering/handoff.md` (2026-05-04).
## 1. Goal
"Write commands work" means the production SDK at
`src/AVEVA.Historian.Client/` performs these operations end-to-end
against a live AVEVA Historian, with parsed responses, golden-byte
unit tests, and gated live integration tests.

In scope:
1. **`AddS2` (`IHistoryServiceContract2.AddStreamValues2`)** — push
one or more timestamped samples for an existing historized tag.
Primary use case: an OPC UA driver pushing values to the
Historian.
2. **`EnsT2` (`IHistoryServiceContract2.EnsureTags2`) for
analog/discrete/string data tags** — partially decoded for the
`CM_EVENT` AnE-event tag in
`src/AVEVA.Historian.Client/Wcf/HistorianAddTagsProtocol.cs`. The
`CTagMetadata` byte layout for `CDataType` ∈ {1, 2, 3, 4} is the
new evidence target.
3. **`DelT` (`IHistoryServiceContract2.DeleteTags`)** — needed for
safe sandbox cleanup during RE.
4. **`ModifyData` / `DeleteData`** — only if §3.4 method discovery
confirms a managed WCF op exists.

Out of scope: tag-extended-properties (`AddTEx` / `DelTep`),
`ExKey`, `SetSFP`, snapshot send (`SendSnapshotBegin/End/Snapshot`),
tag-id-pair maintenance, shard splits, flush ops, all
`IStorageServiceContract` writes (engine-internal — see §6.d), event
writes (events come from AVEVA AnE, we only read them), schema
changes (forbidden over the wire).
## 2. Safety Constraints
The Runtime DB is production data even on `localhost`. `AddS2`
writes are persistent — they go to compressed history blocks and
cannot be removed through any client-facing surface.

Hard rules:
1. **Single dedicated sandbox tag.** Add env var
`HISTORIAN_WRITE_SANDBOX_TAG = "RetestSdkWriteSandbox"`. Live
write tests refuse to run when unset, even when other
`HISTORIAN_*` vars are set.
2. **Never write to** any tag named in `HISTORIAN_TEST_TAG`,
`HISTORIAN_TAG_FILTER`, the docs, the test fixtures, or the
captured RE ndjson. The read fixture
`OtOpcUaParityTest_001.Counter` is OFF-LIMITS for writes.
3. **Documented rollback.** Every write session records its time
window to
`artifacts/reverse-engineering/write-sandbox-window-<stamp>.json`
so SQL `SELECT * FROM History WHERE wwTagKey = @tagKey AND DateTime
BETWEEN @s AND @e` can identify exactly which rows the session
inserted. Tag rollback is via decoded `DelT` (§3.3) once
available, or manually via System Management Console until then.
4. **Time bounds on writes.** Every `AddS2` test uses
`DateTime.UtcNow` ± a small offset, so writes always land inside
the live `RealTimeWindow` / `FutureTimeThreshold` system
parameters and cannot accidentally overwrite older blocks.
5. **No customer / corporate hosts.** `localhost` only.
6. **Sanitization scan after every session:**
`rg -n "(?i)(password|credential|secret|token|<known-sensitive-host>|<known-sensitive-machine>|<known-sensitive-user>)" docs\reverse-engineering scripts tools docs\plans`.

Soft rules:
- Use a separate captures dir
(`artifacts/reverse-engineering/instrumented-wcf-writemessage-writes/`)
so write captures don't contaminate the existing read/event
ndjson.
- New integration tests follow the existing gating pattern in
`tests/AVEVA.Historian.Client.Tests/HistorianClientIntegrationTests.cs`
(`Skip = ...` when env var unset).
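
The gating and rollback-record rules above can be sketched as a small scratch helper. This is an illustration in Python, not SDK code: the function names and the JSON field names are hypothetical, and treating `HISTORIAN_TAG_FILTER` as a literal tag name (rather than a pattern) is a simplifying assumption.

```python
import json
import os
from datetime import datetime, timezone

# Rule 2: the read fixture named in this plan is never a write target.
READ_FIXTURE_TAG = "OtOpcUaParityTest_001.Counter"
PROTECTED_ENV_VARS = ("HISTORIAN_TEST_TAG", "HISTORIAN_TAG_FILTER")


def resolve_sandbox_tag(env=os.environ):
    """Return the sandbox tag name, or raise if a write test must not run."""
    tag = env.get("HISTORIAN_WRITE_SANDBOX_TAG", "").strip()
    if not tag:
        raise RuntimeError("HISTORIAN_WRITE_SANDBOX_TAG is unset; refusing to write")
    protected = {READ_FIXTURE_TAG.casefold()}
    # Assumption: the protected env vars hold plain tag names, compared case-insensitively.
    protected.update(env[name].casefold() for name in PROTECTED_ENV_VARS if env.get(name))
    if tag.casefold() in protected:
        raise RuntimeError(f"{tag!r} is a protected tag; refusing to write")
    return tag


def build_window_record(tag, start_utc, end_utc):
    """Build the body of write-sandbox-window-<stamp>.json (field names are assumed)."""
    return json.dumps(
        {
            "sandboxTag": tag,
            "windowStartUtc": start_utc.isoformat(),
            "windowEndUtc": end_utc.isoformat(),
        },
        indent=2,
    )
```

The guard fails closed: an unset variable and a protected tag both raise before any write is attempted, which matches the "refuse to run" wording of rule 1.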
## 3. Discovery Workstreams
### 3.1 EnsT2 for analog/discrete/string tags (priority 1)
- WCF op: `aa/Hist/EnsT2`.
- Contract:
`src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:82-89`,
already declared with `[MessageParameter(Name = "InBuff" / "OutBuff")]`.
- Existing code: `HistorianAddTagsProtocol.SerializeCmEventCTagMetadata`
builds the `CDataType=5` (event) shape.
- Missing: the `CTagMetadata` byte layout for `CDataType ∈ {1, 2,
3, 4}` (analog double, discrete, string, analog int per the
type-code table in `data-query-request-ctor-il-latest.txt`);
whether the optional-mask `0x0086` and the 5-byte trailer
`2F 27 01 01 01` change per type; analog engineering-units / range
/ deadband fields (likely populate the bytes that are zero in the
event-tag fixture).
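
When comparing the optional-mask across tag types, it helps to look at individual bits rather than raw hex. A minimal Python scratch sketch follows; the helper names are hypothetical, the second mask in the usage note is invented for illustration, and nothing here assigns meaning to any bit.

```python
def set_bits(mask):
    """Return the positions of the bits set in mask, least significant first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def mask_diff(event_mask, other_mask):
    """Return (bits set only in event_mask, bits set only in other_mask)."""
    only_event = set_bits(event_mask & ~other_mask)
    only_other = set_bits(other_mask & ~event_mask)
    return only_event, only_other
```

For the event-tag mask, `set_bits(0x0086)` returns `[1, 2, 7]`. Against a made-up analog mask of `0x008E`, `mask_diff(0x0086, 0x008E)` returns `([], [3])`, which would point at bit 3 as the field to chase in the IL.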
### 3.2 AddS2 stream values (priority 1)
- WCF op: `aa/Hist/AddS2`.
- Contract:
`src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:75-80`,
already has `[MessageParameter(Name = "pBuf")]`. **Audit
requirement:** verify against `ildasm aahClientAccessPoint.exe`
that `Handle` and `errorBuffer` parameter names also match — the
handoff's parameter-name-mismatch class has bitten ~30 ops.
- Missing: entire `pBuf` byte layout (likely `UInt16 version + UInt32
sampleCount + N × {tagId GUID, FILETIME, qualityByte, value typed
by CDataType}`); whether `Handle` is the same Open2 v6 session GUID
as `UpdC3`/`RTag2`/`EnsT2`; the auth-chain prereqs (event flow
needed Stat priming + Trx/Stat/Retr `GetV` between RTag2 and EnsT2;
writes may have a different chain); success vs error response
shape.
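
To make the guessed `pBuf` layout concrete before any capture exists, here is a Python scratch sketch that packs it. Everything about the layout is an assumption taken from the "likely" shape above: little-endian integers, .NET `Guid.ToByteArray` byte order for the tag id, a one-byte quality, and an 8-byte double value. The function names are hypothetical. The real layout must come from the IL and the captured bytes.

```python
import struct
import uuid
from datetime import datetime, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def to_filetime(ts_utc):
    """Convert a timezone-aware datetime to 100 ns ticks since 1601-01-01 UTC."""
    delta = ts_utc - FILETIME_EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def pack_samples(version, samples):
    """Pack (tag_id, ts_utc, quality, value) tuples using the GUESSED layout."""
    out = bytearray(struct.pack("<HI", version, len(samples)))  # UInt16 + UInt32
    for tag_id, ts_utc, quality, value in samples:
        out += tag_id.bytes_le  # 16 bytes, same order as .NET Guid.ToByteArray
        # FILETIME (UInt64), quality (1 byte), value as double: all assumed widths.
        out += struct.pack("<QBd", to_filetime(ts_utc), quality, value)
    return bytes(out)
```

Under these assumptions one analog sample occupies 6 header bytes plus 33 sample bytes, so a single-sample buffer is 39 bytes. If the first capture has a different length for one sample, the guess is wrong in at least one field width.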
### 3.3 DelT tag deletion (priority 2 — needed for safe RE)
- WCF op: `aa/Hist/DelT`.
- Contract:
`src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:21-30`.
- Missing: `tagNames` byte layout (likely length-prefixed
compact-ASCII per the handoff convention); whether server refuses
to delete tags with stored history or cascades; whether `DelT` is
sufficient to fully unregister or leaves orphan rows in
`Runtime.dbo.Tag`.
### 3.4 ModifyData / DeleteData (priority 3 — exists?)
No corresponding WCF op is currently declared. **First step:** static
inspection to confirm whether any managed wrapper exists.
```powershell
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EditValue
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll ModifyValue
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EditData
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll DeleteData
```
If no managed wrapper exists, this op is REST-only / SMC-only;
mark it as **out of scope** in this doc. Otherwise decode it like
§3.1/§3.2.

Parallelism: 3.1 and 3.3 can be developed in parallel because the
operator can create the sandbox tag manually via SMC while SDK code
is being written. 3.2 cannot meaningfully proceed until 3.1 (or the
manual tag) exists. 3.4 method discovery is cheap and may eliminate
its own scope.
## 4. RE Steps in Execution Order
For each workstream above, run these five steps. They mirror the
read and event flows that recovered the existing protocol.
### 4.a Static method discovery
Find the native serializer:
```powershell
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll AddS
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EnsureTag
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll DeleteTag
```
Dump IL for each method of interest:
```powershell
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- dnlib-method --instructions current\aahClientManaged.dll <Type::Method>
```
Save sanitized excerpts to
`docs/reverse-engineering/dnlib-<op>-il-latest.txt`.
### 4.b Wire-byte capture for the request
Use the same IL-rewrite tooling that captured the 27 outgoing event calls:
```powershell
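# Keep write captures apart from the existing read/event captures.
$captureDir = "artifacts\reverse-engineering\instrumented-wcf-writemessage-writes"
New-Item -ItemType Directory -Force -Path $captureDir | Out-Null
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- instrument-wcf-writemessage current\aahClientManaged.dll "$captureDir\aahClientManaged.dll"
# Copy-Item does not create a missing destination folder, so make sure current-copy exists.
New-Item -ItemType Directory -Force -Path "$captureDir\current-copy" | Out-Null
Copy-Item -Force "$captureDir\aahClientManaged.dll" "$captureDir\current-copy\aahClientManaged.dll"
# Path of the ndjson capture file for this session.
$env:AVEVA_HISTORIAN_RE_CAPTURE = (Resolve-Path $captureDir).Path + "\writemessage-capture-write-latest.ndjson"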
```
A new harness scenario `--scenario write` needs to be added to
`tools/AVEVA.Historian.NativeTraceHarness` to drive the native
wrapper's `AddStreamValues2` against the sandbox tag. Suggested
new args: `--write-sandbox-tag`, `--write-value`.
### 4.c Wire-byte capture for the response
Capture the response symmetrically with `instrument-wcf-readmessage`:
```powershell
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- instrument-wcf-readmessage current\aahClientManaged.dll "$captureDir\aahClientManaged.dll"
```
The success response for `AddS2` is expected to be just
`<AddS2Result>true</…>` plus an empty `errorBuffer`. **Capture at
least one negative case** (a write to a non-existent tag, or a write
with a malformed CDataType) so the orchestrator can surface
diagnostics like
`HistorianWcfEventOrchestrator.LastErrorBufferDescription`.
### 4.d Decode against IL
Strip SOAP/MDAS envelope; align byte offsets against the native
serializer IL from 4.a (the `ldc.i4 / call WriteByte` sequence
makes field order and constants explicit); cross-reference the
`CDataType` table from `data-query-request-ctor-il-latest.txt` to
interpret typed value bytes; write a parser-and-builder pair and
verify against the captured bytes before committing.
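
The "verify against the captured bytes" step comes down to finding the first offset where the builder output diverges from the capture. A minimal Python scratch sketch, with hypothetical helper names, could look like this:

```python
def first_mismatch(captured, built):
    """Return the first differing byte offset, or None when the buffers are identical."""
    for offset, (a, b) in enumerate(zip(captured, built)):
        if a != b:
            return offset
    if len(captured) != len(built):
        # One buffer is a strict prefix of the other.
        return min(len(captured), len(built))
    return None


def describe_mismatch(captured, built, context=4):
    """Render a short hex window around the first mismatch for a failure message."""
    offset = first_mismatch(captured, built)
    if offset is None:
        return "buffers match"
    lo = max(0, offset - context)
    hi = offset + context + 1
    return (
        f"first mismatch at offset 0x{offset:04X}\n"
        f"captured: {captured[lo:hi].hex(' ')}\n"
        f"built:    {built[lo:hi].hex(' ')}"
    )
```

Reporting the offset rather than a bare "not equal" lets the mismatch be mapped straight back to the `ldc.i4 / call WriteByte` position in the IL dump from 4.a.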
### 4.e Implement managed serializer + tests
New code under `src/AVEVA.Historian.Client/Wcf/`:
- `HistorianAddStreamValuesProtocol.cs` — `Serialize(...)` returns
`byte[] pBuf`, mirroring `HistorianAddTagsProtocol`.
- Extend (or split) `HistorianAddTagsProtocol` for the analog /
discrete / string `EnsT2` shapes.
- `HistorianWcfWriteOrchestrator.cs` — chains `Hist.GetV →
Hist.ValCl × 2 → Hist.Open2 → UpdC3 → priming chain (TBD per
§3.2) → AddS2 loop → Close2`.

Public surface on `HistorianClient`:
- `WriteValueAsync(tag, value, timestampUtc, quality)`
- `WriteValuesAsync(IReadOnlyList<HistorianSampleWrite>)`
- `EnsureTagAsync(HistorianTagDefinition)`
- `DeleteTagAsync(string tagName)`

Until evidence supports each path, throw
`ProtocolEvidenceMissingException` (mirrors the existing read
guardrail).

Unit tests under `tests/AVEVA.Historian.Client.Tests/Wcf/`:
- `WcfAddStreamValuesProtocolTests` — golden-byte tests for one
analog, one discrete, one string write.
- `WcfEnsureTagsProtocolTests` — golden-byte tests for the
analog/discrete/string `CTagMetadata` shapes.
- Extend `ProtocolGuardrailTests` so any not-yet-implemented write
path still throws `ProtocolEvidenceMissingException`.

Live integration tests in `HistorianClientIntegrationTests.cs`,
gated on `HISTORIAN_WRITE_SANDBOX_TAG`:
`WriteValueAsync_WithinDocumentedWindow_PersistsToHistorianDb`
writes a unique value, reads it back via `ReadRawAsync`, and
verifies the row with a direct `sqlcmd` query against the History
extension table.
## 5. Order of Operations
```
3.4 method discovery (cheap; may eliminate scope)
3.1 EnsT2 (analog/discrete/string) ──► sandbox tag exists
        ├─────────────────────────────┐
        ▼                             ▼
3.2 AddS2 (priority 1)        3.3 DelT (sandbox cleanup)
3.4 ModifyData/DeleteData (only if 3.4 confirmed scope)
public surface, golden-byte tests, integration tests
```
3.2 is the headline win and depends only on 3.1 (or a manually
created sandbox tag). 3.3 must land before any commit that
programmatically creates new tags; until then, manual SMC deletion
is the documented rollback.
## 6. Risks and Mitigations
### 6.a Auth chain may differ for writes
Reads use `Hist.Open2(ConnectionMode = 0x402)`. Events use the same
`0x402` plus a Stat-priming chain. Writes may need a different
mode (the handoff notes `0x501` was an unverified guess for
events; writes may legitimately need `0x401` or another value).
Mitigation: capture the *full* WriteMessage sequence for a native
write session (not just `AddS2`) to see what `Open2` payload and
priming calls the native wrapper sends.
### 6.b Server-side session-table requirement
Writes may require `RTag2` after `EnsT2` and before `AddS2` (the
event flow needs `RTag2(CmEventTagId)`). The "tag identifier" the
server returns from `EnsT2` may differ from the GUID the client
seeded.
Mitigation: capture the analog `EnsT2` `OutBuff` (event flow's was
a 45-byte echo) and verify whether subsequent `AddS2` payloads
reference the client-seeded GUID, the server-returned GUID, or a
numeric `wwTagKey`. SQL ground truth: `SELECT TagName, wwTagKey
FROM Tag WHERE TagName = '...'`.
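
One way to answer the "which identifier" question from a captured `AddS2` payload is a plain byte search for each candidate encoding. The Python scratch sketch below is hypothetical throughout: the function name is invented, and the candidate encodings (GUID in both byte orders, `wwTagKey` as a little-endian int32) are guesses to be extended as evidence arrives.

```python
import struct
import uuid


def locate_identifiers(payload, client_guid, server_guid, ww_tag_key):
    """Return {label: offset} for each candidate identifier found in payload."""
    candidates = {
        "client GUID (little-endian fields)": client_guid.bytes_le,
        "client GUID (big-endian)": client_guid.bytes,
        "server GUID (little-endian fields)": server_guid.bytes_le,
        "server GUID (big-endian)": server_guid.bytes,
        "wwTagKey (int32 LE)": struct.pack("<i", ww_tag_key),
    }
    hits = {}
    for label, needle in candidates.items():
        offset = payload.find(needle)  # first occurrence only
        if offset >= 0:
            hits[label] = offset
    return hits
```

A 16-byte GUID hit is strong evidence. A 4-byte `wwTagKey` hit can occur by chance, so it should be confirmed across captures that use tags with different keys.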
### 6.c Silent-success failure mode
`AddS2` may return `true` but no row appears in the History
extension table — the engine silently drops samples outside the
`FutureTimeThreshold` / `RealTimeWindow` system parameters (which
the event flow now reads).
Mitigation: always write at `DateTime.UtcNow`; cross-check with
SQL after every test:
```sql
SELECT TOP 5 DateTime, Value, QualityDetail
FROM History
WHERE wwTagKey = (SELECT wwTagKey FROM Tag WHERE TagName = @sandbox)
AND DateTime BETWEEN @windowStart AND @windowEnd
ORDER BY DateTime DESC;
```
Surface `FutureTimeThreshold` / `RealTimeWindow` via existing
`GetSystemParameterAsync` so failures are diagnosable.
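
A pre-write check against the two system parameters might be sketched as follows in Python. The interpretation is an assumption: `RealTimeWindow` is treated as how far behind "now" a sample may be, and `FutureTimeThreshold` as how far ahead. The actual units and semantics need to be confirmed from the values `GetSystemParameterAsync` returns. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def within_live_window(ts_utc, now_utc, real_time_window, future_time_threshold):
    """Return True when ts_utc lies inside [now - window, now + threshold].

    Both bounds are passed as timedelta so the caller owns the unit conversion.
    """
    earliest = now_utc - real_time_window
    latest = now_utc + future_time_threshold
    return earliest <= ts_utc <= latest
```

Running this check before `AddS2` turns a silent drop into a diagnosable client-side refusal, provided the assumed semantics hold.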
### 6.d Storage service vs History service
`IStorageServiceContract` also exposes `AddT/AddS/AddS2/DelT`. The
working hypothesis is that `/Hist` is client-facing and `/Stor` is
engine-internal, but it's not yet verified.
Mitigation: the WriteMessage capture (§4.b) shows the actual
service path on the wire. If it goes to `/Stor`, update the
orchestrator. Do NOT preemptively implement against both.
### 6.e Parameter-name mismatches
The handoff already flagged `EnsT`, `EnsT2`, `RTag2`, `ExKey`, `StJb`,
`GtJb` for the same `inBuff`/`inputBuffer` mismatch class that
broke reads for weeks. Until each is audited against the server
contract, mismatched request parameters bind to null and the server
throws a `NullReferenceException`.
Mitigation: before the first write WriteMessage capture, run an
`ildasm` audit against `aahClientAccessPoint.exe` for the exact
parameter names of `EnsT2`, `AddS2`, and `DelT`, and reconcile
against the existing `[MessageParameter]` attributes.
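
The reconciliation step is a positional, case-sensitive comparison of two name lists per op (XML element names are case-sensitive). A Python scratch sketch, with a hypothetical function name and purely illustrative inputs in the usage note:

```python
def audit_parameter_names(declared, server):
    """Return {op: [(position, declared_name, server_name), ...]} for every mismatch.

    declared: names from the SDK's [MessageParameter] attributes, per op.
    server:   names read from the ildasm dump of the server contract, per op.
    """
    mismatches = {}
    for op, server_names in server.items():
        declared_names = declared.get(op, [])
        diffs = []
        for pos in range(max(len(server_names), len(declared_names))):
            d = declared_names[pos] if pos < len(declared_names) else None
            s = server_names[pos] if pos < len(server_names) else None
            if d != s:  # exact, case-sensitive comparison
                diffs.append((pos, d, s))
        if diffs:
            mismatches[op] = diffs
    return mismatches
```

For example, with invented inputs `declared = {"EnsT2": ["Handle", "InBuff", "OutBuff"]}` and `server = {"EnsT2": ["Handle", "inputBuffer", "OutBuff"]}`, the result is `{"EnsT2": [(1, "InBuff", "inputBuffer")]}`. These server names are placeholders, not audited values.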
### 6.f Customer-data exposure in capture files
Write captures contain the sandbox tag name and any value the test
wrote. Not secrets, but noise.
Mitigation: keep all
`instrumented-wcf-writemessage-writes/` artifacts under
`artifacts/` (already gitignored). Sanitize tag names to
`<sandbox-tag>` before committing decoded bytes into
`docs/reverse-engineering/`.
## 7. Success Criteria
Per op:
- **`EnsT2(analog)`**: `EnsureTagAsync(new HistorianTagDefinition {
Name = sandbox, DataType = Analog })` returns success;
`sqlcmd -E -S . -d Runtime -Q "SELECT TagName FROM Tag WHERE
TagName = '...'"` returns one row.
- **`EnsT2(discrete, string)`**: same shape with corresponding
`DataType`; SQL check uses `DiscreteTag` / `StringTag` view.
- **`AddS2`**: `WriteValueAsync(sandbox, 42.0, DateTime.UtcNow)`
returns success; `ReadRawAsync` returns the value;
`SELECT TOP 1 Value FROM History WHERE wwTagKey = ? AND DateTime
BETWEEN ? AND ?` returns the same value.
- **`DelT`**: `DeleteTagAsync(sandbox)` returns success and SQL
returns zero rows from `Tag`.
- **`ModifyData` / `DeleteData`**: deferred until §3.4 method
discovery confirms scope.

Cross-cutting:
- All new code in `src/AVEVA.Historian.Client/` is pure managed
.NET 10. No new P/Invoke beyond the existing `HistorianSspiClient`.
- Every new op has a golden-byte unit test.
- `dotnet test .\Histsdk.slnx --no-build --logger
"console;verbosity=minimal"` passes 100%.
- With `HISTORIAN_HOST=localhost`,
`HISTORIAN_WRITE_SANDBOX_TAG=RetestSdkWriteSandbox` set, write
integration tests pass and leave zero residue (test `Dispose`
calls `DelT` for cleanup).
- Sanitization scan returns no real secrets.
- `CLAUDE.md` "Required SDK Surface" updated to add the new write
ops — this is a SCOPE CHANGE that must land *alongside* the
evidence, not before. Do not update the SDK surface doc until
3.1 + 3.2 are at least live-test-green.
## 8. Open Questions
1. Does `AddS2` go through `/Hist` or `/Stor` on the wire?
2. Does the sandbox tag need pre-configuration via System
Management Console once before `EnsT2` will accept it from a
client (e.g. for `Storage` / `wwDomain` rows the wire protocol
may not be able to populate)?
3. What `ConnectionMode` does the native wrapper use for write
sessions — `0x402` (read mode reused), `0x401`, or something
else?
4. Does `EnsT2(analog)` require any optional ArchestrA
engineering-units fields, or are they purely cosmetic? Affects
how minimal `HistorianTagDefinition` can be.
5. Are there server-side throttles on writes (max samples per
`AddS2`, max calls per second), and do they need to surface as
batching guidance?
6. What does the server return when `AddS2` is called with a
timestamp older than the tag's earliest stored block? Some
historians silently drop, some error, some accept-and-overwrite.
7. Does the SDK expose write quality as the same
`HistorianSample.Quality` enum used on reads, or a smaller
subset (good/bad)?
8. Is there a managed-side `DelT` path at all? If
`aahClientManaged` only exposes deletion via SMC, §3.3 is
"manual SMC only" and must be documented as such.
## 9. Docs To Update Once Each Workstream Lands
- `CLAUDE.md` "Required SDK Surface" — add `WriteValueAsync`,
`EnsureTagAsync`, `DeleteTagAsync` once 3.1+3.2+3.3 land.
- `AGENTS.md` "Required SDK Surface" — same; update the "alarm-event
write path is dormant" note.
- `docs/reverse-engineering/handoff.md` — add a "Write-flow prereqs"
section symmetric to the existing "Event-flow prereqs".
- `docs/reverse-engineering/wcf-contract-evidence.md` — add evidence
rows for `EnsT2(analog/discrete/string)`, `AddS2`, `DelT`.
- `docs/reverse-engineering/implementation-status.md` — flip
status from "out of scope" to "implemented".
- `README.md` — operation status table.