CW-1: reusable capture -> sanitize -> golden-fixture pipeline

Adds the highest-leverage reverse-engineering primitive from the roadmap: one
path to turn a live operation buffer into a committable golden fixture. Unblocks
every capture-tier item (R0.5, R1.x, R2.1).

- ProtocolCaptureSanitizer: redacts identity-bearing values (host, tag, user,
  machine) from a native buffer in BOTH ASCII and UTF-16LE, overwriting in place
  with an 'X' fill so length and every field offset are preserved (keeps the
  fixture useful for byte-layout RE). ASCII-letter matching is case-insensitive;
  secrets shorter than 3 characters are skipped, because values that short would
  also match unrelated bytes and corrupt the buffer. AssertNoSecretsRemain is a
  fail-closed safety net: it refuses to emit a fixture if any value survives.
- ProtocolFixtureWriter: serializes a capture to fixtures/protocol/<op>/<name>.json
  with sanitized hex, length, SHA-256 of the sanitized bytes, and a scrub report.
  Timestamps are passed in (deterministic / testable).
- capture-tag-info CLI command: captures a live GetTagInfoFromName response and
  writes the fixture. The same native bytes ride inside 2023 R2 gRPC
  GetTagInfosFromName, so the fixture is transport-agnostic.
- 11 unit tests for the sanitizer/writer (test project now references the RE tool).
- First real fixture: get-tag-info/analog-*.json — a 98-byte Int4 CTagMetadata
  buffer captured live from the local Historian 2020 server, tag name redacted,
  verified to contain no identity (descriptor 03 c3 00 31 = Int4, as documented).
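
The in-place redaction described in the sanitizer bullet can be sketched roughly as
follows. This is an illustrative sketch, not the ProtocolCaptureSanitizer added by
this commit: the class name RedactionSketch, the method RedactInPlace, and the choice
to write 'X' followed by 0x00 for each UTF-16LE code unit are all assumptions.

```csharp
using System;

// Hypothetical sketch; names and details are assumptions, not the commit's code.
public static class RedactionSketch
{
    private const byte Fill = (byte)'X';
    private const int MinSecretLength = 3;

    // Overwrites every ASCII and UTF-16LE occurrence of `secret` in `buffer`
    // without changing the buffer length. Returns the number of occurrences hit.
    public static int RedactInPlace(byte[] buffer, string secret)
    {
        // Very short secrets are skipped: they would also match unrelated bytes.
        if (secret is null || secret.Length < MinSecretLength)
        {
            return 0;
        }

        return Overwrite(buffer, secret, unitSize: 1)   // one byte per character
             + Overwrite(buffer, secret, unitSize: 2);  // UTF-16LE code units
    }

    private static int Overwrite(byte[] buffer, string secret, int unitSize)
    {
        int needed = secret.Length * unitSize;
        int hits = 0;

        // Every byte offset is tried, so unaligned UTF-16LE fields are found too.
        for (int start = 0; start + needed <= buffer.Length; start++)
        {
            if (!MatchesAt(buffer, start, secret, unitSize))
            {
                continue;
            }

            for (int i = 0; i < secret.Length; i++)
            {
                buffer[start + i * unitSize] = Fill;
                if (unitSize == 2)
                {
                    buffer[start + i * 2 + 1] = 0; // high byte of 'X' in UTF-16LE
                }
            }

            hits++;
            start += needed - 1; // resume scanning after the redacted span
        }

        return hits;
    }

    private static bool MatchesAt(byte[] buffer, int start, string secret, int unitSize)
    {
        for (int i = 0; i < secret.Length; i++)
        {
            int offset = start + i * unitSize;
            int unit = unitSize == 1
                ? buffer[offset]
                : buffer[offset] | (buffer[offset + 1] << 8);
            if (Fold(unit) != Fold(secret[i]))
            {
                return false;
            }
        }

        return true;
    }

    // Case-folds ASCII letters only; every other value compares exactly.
    private static int Fold(int unit) => unit >= 'A' && unit <= 'Z' ? unit + 32 : unit;
}
```

Because each match is overwritten rather than removed, offsets of the surrounding
fields stay where they were in the live buffer.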

180 non-live unit tests green.
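
For the fixture writer bullet above, a minimal sketch of a deterministic fixture
body might look like the following. The JSON property names, the FixtureSketch
class, and the ToFixtureJson method are hypothetical; the actual schema written
by ProtocolFixtureWriter is not shown in this diff. It assumes .NET 5 or later
for Convert.ToHexString and SHA256.HashData.

```csharp
using System;
using System.Security.Cryptography;
using System.Text.Json;

// Hypothetical sketch; the real fixture schema may differ.
public static class FixtureSketch
{
    // The timestamp is passed in rather than read from the clock, so the same
    // inputs always produce the same JSON (which keeps the writer testable).
    public static string ToFixtureJson(string op, byte[] sanitized, string capturedUtc)
    {
        var fixture = new
        {
            op,
            capturedUtc,
            length = sanitized.Length,
            // Hash the sanitized bytes, never the raw capture.
            sha256 = Convert.ToHexString(SHA256.HashData(sanitized)).ToLowerInvariant(),
            hex = Convert.ToHexString(sanitized).ToLowerInvariant(),
        };

        return JsonSerializer.Serialize(fixture, new JsonSerializerOptions { WriteIndented = true });
    }
}
```

Hashing the sanitized bytes lets a test recompute the digest from the committed
hex alone, without access to the original capture.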

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-06-19 14:56:48 -04:00
parent 6b892b69ba
commit fa9cde3e2f
6 changed files with 502 additions and 0 deletions
@@ -12,8 +12,10 @@ using System.Security.Cryptography;
using System.Runtime.Versioning;
using System.Text;
using System.Text.Json;
using AVEVA.Historian.Client;
using AVEVA.Historian.Client.Wcf;
using AVEVA.Historian.Client.Wcf.Contracts;
using AVEVA.Historian.ReverseEngineering.Capture;
using dnlib.DotNet;
using dnlib.DotNet.Emit;
@@ -68,6 +70,7 @@ try
"wcf-start-event-query" => StartWcfEventQuery(args),
"wcf-register-event-tag" => RegisterEventTagAndStartQuery(args),
"wcf-add-event-tag" => AddEventTagAndStartQuery(args),
"capture-tag-info" => CaptureTagInfo(args),
_ => UnknownCommand(args[0])
};
}
@@ -3605,6 +3608,90 @@ static int ProbeWcfTagInfo(string[] args)
return result.Success ? 0 : 1;
}
// CW-1: capture a live GetTagInfoFromName response buffer and persist it as a sanitized,
// committable golden fixture under fixtures/protocol/get-tag-info/. The same native byte blob
// travels inside the 2023 R2 gRPC RetrievalService.GetTagInfosFromName response, so the fixture
// is transport-agnostic. Usage: capture-tag-info [host] [port] [tag] [fixture-root]
static int CaptureTagInfo(string[] args)
{
    string host = args.Length > 1 ? args[1] : "localhost";
    int port = args.Length > 2 && int.TryParse(args[2], out int parsedPort)
        ? parsedPort
        : HistorianWcfBindingFactory.DefaultPort;
    string tag = args.Length > 3 ? args[3] : "OtOpcUaParityTest_001.Counter";
    string fixtureRoot = args.Length > 4 ? args[4] : ResolveFixtureRoot();

    var options = new HistorianClientOptions
    {
        Host = host,
        Port = port,
        IntegratedSecurity = true,
    };

    IReadOnlyDictionary<string, byte[]?> raw = HistorianWcfTagClient.GetTagInfoRawBytesForProbe(options, [tag]);
    byte[]? response = raw.TryGetValue(tag, out byte[]? bytes) ? bytes : null;
    if (response is null || response.Length == 0)
    {
        Console.Error.WriteLine($"GetTagInfoFromName returned no bytes for the requested tag against {host}:{port}.");
        return 1;
    }

    // Redact every identity-bearing value that could appear in the buffer: the requested tag,
    // the host/machine name, and the captured user. The sanitizer scrubs ASCII + UTF-16LE and
    // refuses to emit if any value survives.
    var secrets = new List<CaptureSecret>
    {
        new("tag", tag),
        new("host", host),
        new("machine", Environment.MachineName),
        new("user", Environment.UserName),
    };
    string? envUser = Environment.GetEnvironmentVariable("HISTORIAN_USER");
    if (!string.IsNullOrWhiteSpace(envUser))
    {
        secrets.Add(new CaptureSecret("env-user", envUser));
    }

    var capture = new ProtocolCapture(
        Op: "get-tag-info",
        Request: null,
        Response: response,
        Notes: "RetrievalService.GetTagInfoFromName response (CTagMetadata buffer); identical bytes on 2023 R2 gRPC GetTagInfosFromName.");

    // Read the clock once so the fixture file name and the recorded capture time cannot disagree.
    DateTime capturedAt = DateTime.UtcNow;
    string capturedUtc = capturedAt.ToString("o");
    string path = ProtocolFixtureWriter.Write(fixtureRoot, $"analog-{capturedAt:yyyyMMddHHmmss}", capture, secrets, capturedUtc);

    // Run the sanitizer again only to surface the per-secret match counts in the console summary.
    var summary = new
    {
        Op = capture.Op,
        ResponseLength = response.Length,
        FixturePath = path,
        Redactions = ProtocolCaptureSanitizer.Sanitize(response, secrets).Report
            .Where(r => r.Total > 0)
            .Select(r => new { r.Name, r.AsciiMatches, r.Utf16Matches }),
    };
    Console.WriteLine(JsonSerializer.Serialize(summary, CreateJsonOptions()));
    return 0;
}
// Walk up from the working directory to the repo root (the directory holding Histsdk.slnx) and
// return its fixtures/protocol path; fall back to fixtures/protocol under the CWD.
static string ResolveFixtureRoot()
{
    DirectoryInfo? dir = new(Directory.GetCurrentDirectory());
    while (dir is not null)
    {
        if (File.Exists(Path.Combine(dir.FullName, "Histsdk.slnx")))
        {
            return Path.Combine(dir.FullName, "fixtures", "protocol");
        }
        dir = dir.Parent;
    }
    return Path.Combine(Directory.GetCurrentDirectory(), "fixtures", "protocol");
}
static int ProbeWcfLikeTagBrowse(string[] args)
{
string host = args.Length > 1 ? args[1] : "localhost";
@@ -6370,6 +6457,9 @@ static void PrintHelp()
instrument-tagquery-gettaginfo [dll-path] [output-path]
Write a reverse-only wrapper copy that logs TagQuery CTagMetadata vectors.
mark <scenario-name> Emit a timestamp marker for Wireshark/API Monitor notes.
capture-tag-info [host] [port] [tag] [fixture-root]
CW-1: capture a live GetTagInfoFromName buffer and write a
sanitized golden fixture to fixtures/protocol/get-tag-info/.
wcf-probe [host] [port] Probe Hist/Retr/Stat WCF GetV endpoints with MDAS encoding.
wcf-cert-probe [host] [port] [dns]
Probe HistCert GetV with MDAS over TLS transport security.