dohertj2 6f01b83313 Plan two reverse-engineering campaigns: write commands + store/forward cache
docs/plans/write-commands-reverse-engineering.md (425 lines):
  Plan for adding WriteValueAsync (AddS2 stream values), EnsureTags2 for
  analog/discrete/string tags, and DelT for sandbox cleanup. Hard safety
  rules center on a dedicated sandbox tag gated by env var, time-bounded
  writes, SQL ground-truth verification per session, explicit rollback.
  Five-step RE workflow mirrors the read/event decode (static IL discovery
  -> instrument-wcf-writemessage capture -> instrument-wcf-readmessage
  capture -> byte/IL alignment -> managed serializer + golden-byte tests).
  Risks call out auth-chain unknowns, parameter-name-mismatch class,
  silent-success failure modes, History-vs-Storage service question.

docs/plans/store-forward-cache-reverse-engineering.md (501 lines):
  Plan for replacing the synthesized GetStoreForwardStatusAsync with a
  real implementation. Architecture investigation already partially
  answered via IL inspection during planning: ArchestrA.HistorianAccess.
  GetStoreForwardStatus (token 0x06006187) reads an in-process C struct
  via calli to mdas_GetStorageStatus, kept current by server-pushed WCF
  callbacks (IStatusServiceContract2.SetStoreForwardEvent). CSFConnection.
  GetSFPipeName indicates a separate Named Pipe sidecar exists when SF
  is configured. Five parallelizable discovery workstreams, six concrete
  RE steps with cited tokens, eight risks, eight success criteria.

Both plans deliberately produce no code changes and no captures. They
exist so the next implementer can start with full context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 07:16:32 -04:00


Plan: Reverse-Engineering Write Commands

Status: PLAN ONLY (no implementation yet). Extends the read/event work in docs/reverse-engineering/handoff.md (2026-05-04).

1. Goal

"Write commands work" means the production SDK at src/AVEVA.Historian.Client/ performs these operations end-to-end against a live AVEVA Historian, with parsed responses, golden-byte unit tests, and gated live integration tests.

In scope:

  1. AddS2 (IHistoryServiceContract2.AddStreamValues2) — push one or more timestamped samples for an existing historized tag. Primary use case: an OPC UA driver pushing values to the Historian.
  2. EnsT2 (IHistoryServiceContract2.EnsureTags2) for analog/discrete/string data tags — partially decoded for the CM_EVENT AnE-event tag in src/AVEVA.Historian.Client/Wcf/HistorianAddTagsProtocol.cs. The CTagMetadata byte layout for CDataType ∈ {1, 2, 3, 4} is the new evidence target.
  3. DelT (IHistoryServiceContract2.DeleteTags) — needed for safe sandbox cleanup during RE.
  4. ModifyData / DeleteData — only if §3.4 method discovery confirms a managed WCF op exists.

Out of scope: tag-extended-properties (AddTEx / DelTep), ExKey, SetSFP, snapshot send (SendSnapshotBegin/End/Snapshot), tag-id-pair maintenance, shard splits, flush ops, all IStorageServiceContract writes (engine-internal — see §6.d), event writes (events come from AVEVA AnE, we only read them), schema changes (forbidden over the wire).

2. Safety Constraints

The Runtime DB is production data even on localhost. AddS2 writes are persistent — they go to compressed history blocks and cannot be removed through any client-facing surface.

Hard rules:

  1. Single dedicated sandbox tag. Add env var HISTORIAN_WRITE_SANDBOX_TAG = "RetestSdkWriteSandbox". Live write tests refuse to run when unset, even when other HISTORIAN_* vars are set.
  2. Never write to any tag named in HISTORIAN_TEST_TAG, HISTORIAN_TAG_FILTER, the docs, the test fixtures, or the captured RE ndjson. The read fixture OtOpcUaParityTest_001.Counter is OFF-LIMITS for writes.
  3. Documented rollback. Every write session records its time window to artifacts/reverse-engineering/write-sandbox-window-<stamp>.json so SQL SELECT * FROM History WHERE wwTagKey = ? AND DateTime BETWEEN @s AND @e can identify exactly which rows the session inserted. Tag rollback is via decoded DelT (§3.3) once available, or manually via System Management Console until then.
  4. Time bounds on writes. Every AddS2 test uses DateTime.UtcNow ± a small offset, so writes always land inside the live RealTimeWindow / FutureTimeThreshold system parameters and cannot accidentally overwrite older blocks.
  5. No customer / corporate hosts. localhost only.
  6. Sanitization scan after every session: rg -n "(?i)(password|credential|secret|token|<known-sensitive-host>|<known-sensitive-machine>|<known-sensitive-user>)" docs\reverse-engineering scripts tools docs\plans.
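
Hard rules 1 and 2 can be sketched as a single pre-flight check. This is a minimal illustration in Python, not SDK code. The helper name resolve_sandbox_tag and the treatment of HISTORIAN_TAG_FILTER as a comma-separated list are assumptions; only the environment variable names and the off-limits fixture name come from this plan. It does not cover tags named in the docs, other fixtures, or the captured ndjson.

```python
import os

# Read fixture named in hard rule 2; never a write target.
OFF_LIMITS_TAGS = {"OtOpcUaParityTest_001.Counter"}


def resolve_sandbox_tag(env=None):
    """Return the sandbox tag name, or raise if writes must not run.

    Hypothetical helper mirroring hard rules 1 and 2; not an SDK API.
    """
    env = os.environ if env is None else env
    sandbox = (env.get("HISTORIAN_WRITE_SANDBOX_TAG") or "").strip()
    if not sandbox:
        raise RuntimeError("HISTORIAN_WRITE_SANDBOX_TAG is unset; refusing to write")

    protected = set(OFF_LIMITS_TAGS)
    test_tag = (env.get("HISTORIAN_TEST_TAG") or "").strip()
    if test_tag:
        protected.add(test_tag)
    # Assumption: HISTORIAN_TAG_FILTER may hold several comma-separated names.
    for name in (env.get("HISTORIAN_TAG_FILTER") or "").split(","):
        if name.strip():
            protected.add(name.strip())

    # Compare case-insensitively so a differently cased spelling is still refused.
    if sandbox.lower() in {p.lower() for p in protected}:
        raise RuntimeError(f"sandbox tag {sandbox!r} collides with a protected read tag")
    return sandbox
```

Passing the environment as a plain dict keeps the check testable without mutating the process environment.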

Soft rules:

  • Use a separate captures dir (artifacts/reverse-engineering/instrumented-wcf-writemessage-writes/) so write captures don't contaminate the existing read/event ndjson.
  • New integration tests follow the existing gating pattern in tests/AVEVA.Historian.Client.Tests/HistorianClientIntegrationTests.cs (Skip = ... when env var unset).
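
The window record required by hard rule 3 could look like the following sketch. The file name pattern write-sandbox-window-<stamp>.json comes from this plan; the stamp format and the JSON field names (tag, windowStartUtc, windowEndUtc) are illustrative assumptions, as is the helper name.

```python
import json
from pathlib import Path


def write_window_record(out_dir, tag, start_utc, end_utc):
    """Persist one session's write window so SQL can locate its rows later.

    start_utc / end_utc are timezone-aware UTC datetimes.
    Field names and stamp format are assumptions, not a decided schema.
    """
    stamp = start_utc.strftime("%Y%m%dT%H%M%SZ")
    path = Path(out_dir) / f"write-sandbox-window-{stamp}.json"
    record = {
        "tag": tag,
        "windowStartUtc": start_utc.isoformat(),
        "windowEndUtc": end_utc.isoformat(),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path
```

The two timestamps map directly onto the @s and @e parameters of the rollback query in hard rule 3.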

3. Discovery Workstreams

3.1 EnsT2 for analog/discrete/string tags (priority 1)

  • WCF op: aa/Hist/EnsT2.
  • Contract: src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:82-89, already declared with [MessageParameter(Name = "InBuff" / "OutBuff")].
  • Existing code: HistorianAddTagsProtocol.SerializeCmEventCTagMetadata builds the CDataType=5 (event) shape.
  • Missing: the CTagMetadata byte layout for CDataType ∈ {1, 2, 3, 4} (analog double, discrete, string, analog int per the type-code table in data-query-request-ctor-il-latest.txt); whether the optional-mask 0x0086 and the 5-byte trailer 2F 27 01 01 01 change per type; analog engineering-units / range / deadband fields (likely populate the bytes that are zero in the event-tag fixture).

3.2 AddS2 stream values (priority 1)

  • WCF op: aa/Hist/AddS2.
  • Contract: src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:75-80, already has [MessageParameter(Name = "pBuf")]. Audit requirement: verify against ildasm aahClientAccessPoint.exe that Handle and errorBuffer parameter names also match — the handoff's parameter-name-mismatch class has bitten ~30 ops.
  • Missing: entire pBuf byte layout (likely UInt16 version + UInt32 sampleCount + N × {tagId GUID, FILETIME, qualityByte, value typed by CDataType}); whether Handle is the same Open2 v6 session GUID as UpdC3/RTag2/EnsT2; the auth-chain prereqs (event flow needed Stat priming + Trx/Stat/Retr GetV between RTag2 and EnsT2; writes may have a different chain); success vs error response shape.
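
To make the pBuf hypothesis concrete enough to compare against a capture, the guessed layout can be packed in a scratch script. Nothing below is decoded evidence: little-endian byte order, a version value of 1, .NET Guid byte order, and an 8-byte double value are all assumptions, and the function names are hypothetical. Only the FILETIME definition (100 ns ticks since 1601-01-01 UTC) is a known fact.

```python
import struct
import uuid
from datetime import datetime, timezone

_FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def to_filetime(ts_utc):
    """Convert an aware UTC datetime to 100 ns ticks since 1601-01-01."""
    delta = ts_utc - _FILETIME_EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def build_candidate_pbuf(samples, version=1):
    """Pack samples using the UNVERIFIED layout hypothesised in section 3.2.

    samples: iterable of (tag_guid: uuid.UUID, ts_utc: datetime, quality: int, value: float)
    Layout guess: UInt16 version, UInt32 sampleCount, then per sample
    16-byte GUID, 8-byte FILETIME, 1-byte quality, 8-byte double.
    """
    samples = list(samples)
    out = bytearray(struct.pack("<HI", version, len(samples)))
    for tag_guid, ts_utc, quality, value in samples:
        out += tag_guid.bytes_le  # same byte order as .NET Guid.ToByteArray()
        out += struct.pack("<QBd", to_filetime(ts_utc), quality, value)
    return bytes(out)
```

If the first captured AddS2 request has a different length for one analog sample than this candidate produces (39 bytes under these assumptions), that difference alone narrows which field guesses are wrong.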

3.3 DelT tag deletion (priority 2 — needed for safe RE)

  • WCF op: aa/Hist/DelT.
  • Contract: src/AVEVA.Historian.Client/Wcf/Contracts/IHistoryServiceContract2.cs:21-30.
  • Missing: tagNames byte layout (likely length-prefixed compact-ASCII per the handoff convention); whether server refuses to delete tags with stored history or cascades; whether DelT is sufficient to fully unregister or leaves orphan rows in Runtime.dbo.Tag.
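
A length-prefixed ASCII encoding, as guessed for tagNames, can be sketched as a builder and parser pair. The prefix width is an assumption (little-endian UInt16 here, overridable through prefix_fmt), and the real buffer may also carry a leading count or version field that this sketch omits; both questions stay open until a DelT capture exists.

```python
import struct


def encode_tag_names(names, prefix_fmt="<H"):
    """Encode names as <length><ASCII bytes> records.

    Assumption: little-endian UInt16 length prefix; not yet confirmed by capture.
    """
    out = bytearray()
    for name in names:
        raw = name.encode("ascii")  # raises UnicodeEncodeError on non-ASCII names
        out += struct.pack(prefix_fmt, len(raw)) + raw
    return bytes(out)


def decode_tag_names(buf, prefix_fmt="<H"):
    """Inverse of encode_tag_names; raises ValueError on a truncated buffer."""
    size = struct.calcsize(prefix_fmt)
    names, pos = [], 0
    while pos < len(buf):
        if pos + size > len(buf):
            raise ValueError(f"truncated length prefix at offset {pos}")
        (length,) = struct.unpack_from(prefix_fmt, buf, pos)
        pos += size
        if pos + length > len(buf):
            raise ValueError(f"truncated name at offset {pos}")
        names.append(buf[pos:pos + length].decode("ascii"))
        pos += length
    return names
```

Running the decoder over a captured buffer with a few candidate prefix formats ("<B", "<H", "<I") is a quick way to see which width, if any, parses cleanly to the end.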

3.4 ModifyData / DeleteData (priority 3 — exists?)

No corresponding WCF op is currently declared. First step: static inspection to confirm any managed wrapper exists.

dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EditValue
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll ModifyValue
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EditData
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll DeleteData

If no managed wrapper exists, this op is REST-only / SMC-only — mark as out of scope in this doc. Otherwise decode like §3.1/§3.2.

Parallelism: 3.1 and 3.3 can be developed in parallel because the operator can create the sandbox tag manually via SMC while SDK code is being written. 3.2 cannot meaningfully proceed until 3.1 (or the manual tag) exists. 3.4 method discovery is cheap and may eliminate its own scope.

4. RE Steps in Execution Order

For each workstream above, run these five steps. They mirror the read + event flows that recovered the existing protocol.

4.a Static method discovery

Find the native serializer:

dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll AddS
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll EnsureTag
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- methods current\aahClientManaged.dll DeleteTag

Dump IL for each method of interest:

dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- dnlib-method --instructions current\aahClientManaged.dll <Type::Method>

Save sanitized excerpts to docs/reverse-engineering/dnlib-<op>-il-latest.txt.

4.b Wire-byte capture for the request

Same IL-rewrite tooling that captured the 27 outgoing event calls:

$captureDir = "artifacts\reverse-engineering\instrumented-wcf-writemessage-writes"
New-Item -ItemType Directory -Force -Path $captureDir | Out-Null
dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- instrument-wcf-writemessage current\aahClientManaged.dll "$captureDir\aahClientManaged.dll"
Copy-Item -Force "$captureDir\aahClientManaged.dll" "$captureDir\current-copy\aahClientManaged.dll"
$env:AVEVA_HISTORIAN_RE_CAPTURE = (Resolve-Path $captureDir).Path + "\writemessage-capture-write-latest.ndjson"

A new harness scenario --scenario write needs to be added to tools/AVEVA.Historian.NativeTraceHarness to drive the native wrapper's AddStreamValues2 against the sandbox tag. Suggested new args: --write-sandbox-tag, --write-value.

4.c Wire-byte capture for the response

Symmetric instrument-wcf-readmessage:

dotnet run --no-build --project tools\AVEVA.Historian.ReverseEngineering -- instrument-wcf-readmessage current\aahClientManaged.dll "$captureDir\aahClientManaged.dll"

The success response for AddS2 is just <AddS2Result>true</…> + empty errorBuffer. Capture at least one negative case (write to non-existent tag, or write with malformed CDataType) so the orchestrator can surface diagnostics like HistorianWcfEventOrchestrator.LastErrorBufferDescription.

4.d Decode against IL

Strip SOAP/MDAS envelope; align byte offsets against the native serializer IL from 4.a (the ldc.i4 / call WriteByte sequence makes field order and constants explicit); cross-reference the CDataType table from data-query-request-ctor-il-latest.txt to interpret typed value bytes; write a parser-and-builder pair and verify against the captured bytes before committing.
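
The "verify against the captured bytes" step is easier to act on when a failure reports where the rebuilt buffer first diverges. A minimal sketch of such a comparison helper follows; the function names are hypothetical and it is independent of any particular op.

```python
def first_mismatch(captured, rebuilt):
    """Return the first differing byte offset, or None when the buffers are identical."""
    for offset, (a, b) in enumerate(zip(captured, rebuilt)):
        if a != b:
            return offset
    if len(captured) != len(rebuilt):
        # One buffer is a strict prefix of the other.
        return min(len(captured), len(rebuilt))
    return None


def describe_mismatch(captured, rebuilt, context=8):
    """Format a short hex window around the first mismatch for a failure message."""
    offset = first_mismatch(captured, rebuilt)
    if offset is None:
        return "buffers match"
    lo = max(0, offset - context)
    hi = offset + context
    return (
        f"first mismatch at offset 0x{offset:04X}\n"
        f"  captured: {captured[lo:hi].hex(' ')}\n"
        f"  rebuilt:  {rebuilt[lo:hi].hex(' ')}"
    )
```

The reported offset can be looked up against the field order recovered from the serializer IL in 4.a to identify which field the builder got wrong.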

4.e Implement managed serializer + tests

New code under src/AVEVA.Historian.Client/Wcf/:

  • HistorianAddStreamValuesProtocol.cs: Serialize(...) returns byte[] pBuf, mirroring HistorianAddTagsProtocol.
  • Extend (or split) HistorianAddTagsProtocol for the analog / discrete / string EnsT2 shapes.
  • HistorianWcfWriteOrchestrator.cs — chains Hist.GetV → Hist.ValCl × 2 → Hist.Open2 → UpdC3 → priming chain (TBD per §3.2) → AddS2 loop → Close2.

Public surface on HistorianClient:

  • WriteValueAsync(tag, value, timestampUtc, quality)
  • WriteValuesAsync(IReadOnlyList<HistorianSampleWrite>)
  • EnsureTagAsync(HistorianTagDefinition)
  • DeleteTagAsync(string tagName)

Until evidence supports each path, throw ProtocolEvidenceMissingException (mirrors the existing read guardrail).

Unit tests under tests/AVEVA.Historian.Client.Tests/Wcf/:

  • WcfAddStreamValuesProtocolTests — golden-byte tests for one analog, one discrete, one string write.
  • WcfEnsureTagsProtocolTests — golden-byte tests for the analog/discrete/string CTagMetadata shapes.
  • Extend ProtocolGuardrailTests so any not-yet-implemented write path still throws ProtocolEvidenceMissingException.

Live integration tests in HistorianClientIntegrationTests.cs, gated on HISTORIAN_WRITE_SANDBOX_TAG: WriteValueAsync_WithinDocumentedWindow_PersistsToHistorianDb writes a unique value, reads it back via ReadRawAsync, and verifies via direct sqlcmd to the History extension table.

5. Order of Operations

3.4 method discovery (cheap; may eliminate scope)
        │
        ▼
3.1 EnsT2 (analog/discrete/string)  ──► sandbox tag exists
        │
        ├─────────────────────────────┐
        ▼                             ▼
3.2 AddS2 (priority 1)         3.3 DelT (sandbox cleanup)
        │
        ▼
3.4 ModifyData/DeleteData (only if 3.4 confirmed scope)
        │
        ▼
public surface, golden-byte tests, integration tests

3.2 is the headline win and depends only on 3.1 (or a manually created sandbox tag). 3.3 must land before any commit that programmatically creates new tags; until then, manual SMC deletion is the documented rollback.

6. Risks and Mitigations

6.a Auth chain may differ for writes

Reads use Hist.Open2(ConnectionMode = 0x402). Events use the same 0x402 plus a Stat-priming chain. Writes may need a different mode (the handoff notes 0x501 was an unverified guess for events; writes may legitimately need 0x401 or another value).

Mitigation: capture the full WriteMessage sequence for a native write session (not just AddS2) to see what Open2 payload and priming calls the native wrapper sends.

6.b Server-side session-table requirement

Writes may require RTag2 after EnsT2 and before AddS2 (the event flow needs RTag2(CmEventTagId)). The "tag identifier" the server returns from EnsT2 may differ from the GUID the client seeded.

Mitigation: capture the analog EnsT2 OutBuff (event flow's was a 45-byte echo) and verify whether subsequent AddS2 payloads reference the client-seeded GUID, the server-returned GUID, or a numeric wwTagKey. SQL ground truth: SELECT TagName, wwTagKey FROM Tag WHERE TagName = '...'.
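
Once an AddS2 request is captured, the question of which identifier it carries can be answered mechanically by searching the payload for each candidate. The sketch below assumes GUIDs appear in .NET Guid.ToByteArray order and that wwTagKey, if present, is a little-endian Int32; both are assumptions, and the function name is hypothetical.

```python
import struct
import uuid


def find_tag_identifiers(payload, client_guid, server_guid, ww_tag_key):
    """Report the offsets at which each candidate identifier occurs in payload.

    A 4-byte key can match by coincidence, so treat ww_tag_key hits as leads
    to confirm against the serializer IL, not as proof.
    """
    candidates = {
        "client_guid": client_guid.bytes_le,
        "server_guid": server_guid.bytes_le,
        "ww_tag_key": struct.pack("<i", ww_tag_key),
    }
    hits = {}
    for label, needle in candidates.items():
        offsets = []
        start = payload.find(needle)
        while start != -1:
            offsets.append(start)
            start = payload.find(needle, start + 1)
        hits[label] = offsets
    return hits
```

An empty result for all three candidates is itself informative: it suggests a different byte order or a session-scoped handle, which would point back at the RTag2 exchange.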

6.c Silent-success failure mode

AddS2 may return true while no row ever appears in the History extension table: the engine silently drops samples that fall outside the FutureTimeThreshold / RealTimeWindow system parameters (which the event flow now reads).

Mitigation: always write at DateTime.UtcNow; cross-check with SQL after every test:

SELECT TOP 5 DateTime, Value, QualityDetail
FROM History
WHERE wwTagKey = (SELECT wwTagKey FROM Tag WHERE TagName = @sandbox)
  AND DateTime BETWEEN @windowStart AND @windowEnd
ORDER BY DateTime DESC;

Surface FutureTimeThreshold / RealTimeWindow via existing GetSystemParameterAsync so failures are diagnosable.
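
A client-side pre-check can make a silent drop diagnosable before the write is even sent. The sketch below assumes both system parameters are expressed in seconds and that together they bound the window [now - RealTimeWindow, now + FutureTimeThreshold]; the units and exact semantics need confirming against the live server, and the function name is hypothetical.

```python
from datetime import timedelta


def classify_timestamp(ts_utc, now_utc, real_time_window_s, future_time_threshold_s):
    """Classify a sample timestamp relative to the assumed acceptance window.

    All datetimes are timezone-aware UTC. Window semantics are an assumption.
    """
    if ts_utc < now_utc - timedelta(seconds=real_time_window_s):
        return "too-old"
    if ts_utc > now_utc + timedelta(seconds=future_time_threshold_s):
        return "too-far-in-future"
    return "in-window"
```

Logging the classification next to the AddS2 result lets a "returned true, no row in SQL" failure be attributed to the window immediately rather than to the serializer.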

6.d Storage service vs History service

IStorageServiceContract also exposes AddT/AddS/AddS2/DelT. The working hypothesis is that /Hist is client-facing and /Stor is engine-internal, but it's not yet verified.

Mitigation: the WriteMessage capture (§4.b) shows the actual service path on the wire. If it goes to /Stor, update the orchestrator. Do NOT preemptively implement against both.

6.e Parameter-name mismatches

Handoff already flagged EnsT, EnsT2, RTag2, ExKey, StJb, GtJb for the same inBuff/inputBuffer mismatch class that broke reads for weeks. Until each is audited against the server contract, any mismatched request parameter binds to null on the server, which then throws a NullReferenceException.

Mitigation: before the first write WriteMessage capture, run an ildasm audit against aahClientAccessPoint.exe for the exact parameter names of EnsT2, AddS2, and DelT, and reconcile against the existing [MessageParameter] attributes.
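
The reconciliation step can be kept as a small script so the audit is repeatable. This sketch compares two hand-maintained maps of op name to ordered parameter names: one transcribed from the client's [MessageParameter] attributes, one transcribed from the ildasm output. The function name and the input shape are assumptions; it does not parse ildasm output itself.

```python
def reconcile_parameter_names(declared, audited):
    """Return {op: [(position, declared_name, audited_name), ...]} for mismatches.

    declared: names from the client's [MessageParameter] attributes.
    audited:  names read from the server contract via ildasm.
    Both map an op name to an ordered list of parameter names.
    """
    problems = {}
    for op, audited_names in audited.items():
        declared_names = declared.get(op, [])
        diffs = []
        for i in range(max(len(declared_names), len(audited_names))):
            d = declared_names[i] if i < len(declared_names) else None
            a = audited_names[i] if i < len(audited_names) else None
            if d != a:  # case-sensitive on purpose: XML element names are case-sensitive
                diffs.append((i, d, a))
        if diffs:
            problems[op] = diffs
    return problems
```

An empty result means every audited op matches the client declaration position by position; anything else should block the first write capture until fixed.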

6.f Customer-data exposure in capture files

Write captures contain the sandbox tag name and any value the test wrote. Not secrets, but noise.

Mitigation: keep all instrumented-wcf-writemessage-writes/ artifacts under artifacts/ (already gitignored). Sanitize tag names to <sandbox-tag> before committing decoded bytes into docs/reverse-engineering/.
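
The tag-name substitution can be scripted so it is applied the same way every time. The sketch below handles plain text and also produces the hex spellings of the tag name, since a name left inside a committed hex dump would survive a text-only replace. Helper names are hypothetical, and the two encodings checked (ASCII and UTF-16LE) are assumptions about how the name may appear on the wire.

```python
import re


def sanitize_text(text, sandbox_tag, placeholder="<sandbox-tag>"):
    """Replace every case-insensitive occurrence of the sandbox tag name."""
    if not sandbox_tag:
        raise ValueError("sandbox_tag must be non-empty")
    # A callable replacement avoids any backslash interpretation in the placeholder.
    return re.sub(re.escape(sandbox_tag), lambda _m: placeholder, text, flags=re.IGNORECASE)


def hex_needles(sandbox_tag):
    """Hex spellings of the tag name as it may appear in a byte dump."""
    return {
        "ascii": sandbox_tag.encode("ascii").hex(" "),
        "utf-16le": sandbox_tag.encode("utf-16-le").hex(" "),
    }
```

Searching the decoded-bytes docs for either hex needle before committing catches the case where the name was sanitized in prose but not in the dump beside it.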

7. Success Criteria

Per op:

  • EnsT2(analog): EnsureTagAsync(new HistorianTagDefinition { Name = sandbox, DataType = Analog }) returns success; sqlcmd -E -S . -d Runtime -Q "SELECT TagName FROM Tag WHERE TagName = '...'" returns one row.
  • EnsT2(discrete, string): same shape with corresponding DataType; SQL check uses DiscreteTag / StringTag view.
  • AddS2: WriteValueAsync(sandbox, 42.0, DateTime.UtcNow) returns success; ReadRawAsync returns the value; SELECT TOP 1 Value FROM History WHERE wwTagKey = ? AND DateTime BETWEEN ? AND ? returns the same value.
  • DelT: DeleteTagAsync(sandbox) returns success and SQL returns zero rows from Tag.
  • ModifyData / DeleteData: deferred until §3.4 method discovery confirms scope.

Cross-cutting:

  • All new code in src/AVEVA.Historian.Client/ is pure managed .NET 10. No new P/Invoke beyond the existing HistorianSspiClient.
  • Every new op has a golden-byte unit test.
  • dotnet test .\Histsdk.slnx --no-build --logger "console;verbosity=minimal" passes 100%.
  • With HISTORIAN_HOST=localhost, HISTORIAN_WRITE_SANDBOX_TAG=RetestSdkWriteSandbox set, write integration tests pass and leave zero residue (test Dispose calls DelT for cleanup).
  • Sanitization scan returns no real secrets.
  • CLAUDE.md "Required SDK Surface" updated to add the new write ops — this is a SCOPE CHANGE that must land alongside the evidence, not before. Do not update the SDK surface doc until 3.1 + 3.2 are at least live-test-green.

8. Open Questions

  1. Does AddS2 go through /Hist or /Stor on the wire?
  2. Does the sandbox tag need pre-configuration via System Management Console once before EnsT2 will accept it from a client (e.g. for Storage / wwDomain rows the wire protocol may not be able to populate)?
  3. What ConnectionMode does the native wrapper use for write sessions — 0x402 (read mode reused), 0x401, or something else?
  4. Does EnsT2(analog) require any optional Archestra engineering-units fields, or are they purely cosmetic? Affects how minimal HistorianTagDefinition can be.
  5. Server-side throttles on writes (max samples per AddS2, max calls per second) — need to surface as batching guidance?
  6. What does the server return when AddS2 is called with a timestamp older than the tag's earliest stored block? Some historians silently drop, some error, some accept-and-overwrite.
  7. Does the SDK expose write quality as the same HistorianSample.Quality enum used on reads, or a smaller subset (good/bad)?
  8. Is there a managed-side DelT path at all? If aahClientManaged only exposes deletion via SMC, §3.3 is "manual SMC only" and must be documented as such.

9. Docs To Update Once Each Workstream Lands

  • CLAUDE.md "Required SDK Surface" — add WriteValueAsync, EnsureTagAsync, DeleteTagAsync once 3.1+3.2+3.3 land.
  • AGENTS.md "Required SDK Surface" — same; update the "alarm-event write path is dormant" note.
  • docs/reverse-engineering/handoff.md — add a "Write-flow prereqs" section symmetric to the existing "Event-flow prereqs".
  • docs/reverse-engineering/wcf-contract-evidence.md — add evidence rows for EnsT2(analog/discrete/string), AddS2, DelT.
  • docs/reverse-engineering/implementation-status.md — flip status from "out of scope" to "implemented".
  • README.md — operation status table.