docs(historian): design for sidecar TCP transport (replace named pipe)

Joseph Doherty
2026-06-12 11:03:32 -04:00
parent c6edef0efb
commit 3d3f8a47a9
@@ -0,0 +1,176 @@
# Wonderware Historian Sidecar — TCP Transport Design
**Date:** 2026-06-12
**Status:** Approved (design); implementation plan to follow.
## Goal
Replace the Wonderware Historian sidecar's **local named-pipe IPC** with a
**TCP transport** so a remote OtOpcUa host (e.g. the dev server running in
Linux Docker on a MacBook) can reach the net48 sidecar on the Windows
Historian VM. Today the IPC is a local, Windows-SID-gated named pipe, so the
only possible consumer is a Windows OtOpcUa process on the same machine; once
the host moves off the VM the sidecar is orphaned.
## Why not gRPC (the "like mxaccessgw" question)
mxaccessgw uses gRPC/HTTP2, but it can do so only because it is split into a
**net10 Server** (Kestrel/`Grpc.AspNetCore`) + a **net48 Worker**. The
Historian sidecar must stay **net48** (the AVEVA `aahClientManaged` + native
`aahClient.dll` SDK is .NET Framework 4.8), and **net48 cannot host
Kestrel/`Grpc.AspNetCore`**. The only ways to get gRPC on net48 are the **EOL
`Grpc.Core` C-core** library or a second net10 front process.
Decision: **plain TCP reusing the existing MessagePack frame protocol.** The
protocol is 5 unary request/reply ops (`ReadRaw`, `ReadProcessed`,
`ReadAtTime`, `ReadEvents`, `WriteAlarmEvents`) + a `Hello` handshake — no
streaming — and is already abstracted behind a `Stream` on both ends, so a TCP
swap is small, native to net48, depends on no EOL libraries, and reuses every
contract.
## Locked decisions
| Decision | Choice |
|---|---|
| Transport | **TCP only** — named pipe fully removed |
| Concurrency | **Single active connection, serial accept** (mirrors today's pipe `maxInstances:1`) |
| Caller auth | **Shared-secret `Hello`**, required in every mode (replaces the SID ACL) |
| Transport security | **TLS optional** — plaintext in dev, TLS in prod (config-driven) |
| mTLS / client-cert | Out of scope now; future hardening follow-up |
## Architecture
```
Before:                                After:
OtOpcUa host (same VM, Windows)        OtOpcUa host (anywhere, .NET 10)
  | named pipe (local, SID-gated)        | TCP (+ optional TLS), MessagePack frames
  v                                      v
Sidecar (net48) PipeServer             Sidecar (net48) TcpFrameServer
```
Everything above the socket is unchanged: `Hello`/`HelloAck` handshake,
length-prefixed `MessageKind` framing, MessagePack DTOs, `FrameReader`/
`FrameWriter` (they operate on `Stream`; `NetworkStream`/`SslStream` are
`Stream`s), `HistorianFrameHandler` dispatch, and the AVEVA SDK backends
(`HistorianDataSource` reads, `SdkAlarmHistorianWriteBackend` writes).
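
To illustrate why the swap stays confined to the socket layer, here is a minimal sketch of length-prefixed framing written purely against `Stream`. The wire layout (4-byte big-endian length, 1-byte kind, payload) and the `SketchFraming` name are assumptions made for illustration; the real layout lives in `Ipc/Framing.cs` and is not reproduced here.

```csharp
using System;
using System.IO;

// Hypothetical sketch: NOT the real FrameReader/FrameWriter.
public static class SketchFraming
{
    // Assumed layout: [4-byte big-endian payload length][1-byte kind][payload].
    public static void WriteFrame(Stream stream, byte kind, byte[] payload)
    {
        var header = new byte[5];
        header[0] = (byte)(payload.Length >> 24);
        header[1] = (byte)(payload.Length >> 16);
        header[2] = (byte)(payload.Length >> 8);
        header[3] = (byte)payload.Length;
        header[4] = kind;
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
        stream.Flush();
    }

    public static byte[] ReadFrame(Stream stream, out byte kind)
    {
        byte[] header = ReadExactly(stream, 5);
        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0) throw new InvalidDataException("Negative frame length.");
        kind = header[4];
        return ReadExactly(stream, length);
    }

    // Stream.Read may return fewer bytes than requested (common on sockets),
    // so keep reading until the full count has arrived.
    private static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0) throw new EndOfStreamException("Peer closed mid-frame.");
            offset += read;
        }
        return buffer;
    }
}
```

Because nothing here names a pipe or a socket, the same code would run over a `MemoryStream` in tests, a `NetworkStream` in plaintext mode, or an `SslStream` in TLS mode.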
## Detailed design
### Server (net48 sidecar — `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/`)
- **New `Ipc/TcpFrameServer.cs`** replaces `Ipc/PipeServer.cs`:
  - `TcpListener` bound to `<bind>:<port>`; single-active **serial** accept
    loop (accept the next connection only after the current one disconnects).
  - Per connection: if TLS is enabled, wrap `NetworkStream` in `SslStream` and
    call `AuthenticateAsServer(cert)`; otherwise use the raw `NetworkStream`.
  - Run the **existing** `Hello` handshake (verify the shared secret; reject with
    `HelloAck{Accepted=false}` on secret mismatch or major-version mismatch), then
    the existing `reader.ReadFrameAsync` → `handler.HandleAsync` → `writer.WriteAsync`
    loop.
  - Keep the `RunAsync` backoff (`250ms…8s`) + `MaxConsecutiveFailures=20`→throw
    behavior so `Program.Main`'s exit-2 + NSSM restart semantics are identical.
- **Remove:** `PipeServer`, `PipeAcl`, `VerifyCaller`/`CallerVerifier`
(Windows-pipe-only), the `OTOPCUA_ALLOWED_SID` env + `SecurityIdentifier`.
- **`Program.cs`:** swap `new PipeServer(pipeName, allowedSid, sharedSecret, …)`
(line ~62) for `new TcpFrameServer(bind, port, sharedSecret, tlsCert?, …)`;
drop the SID read; keep `OTOPCUA_HISTORIAN_ENABLED` (pipe-only-idle behavior
becomes "tcp-only-idle" / listen-but-SDK-disabled, semantics preserved).
- **New env vars:** `OTOPCUA_HISTORIAN_TCP_PORT` (default e.g. 32569),
`OTOPCUA_HISTORIAN_BIND` (default `0.0.0.0`), `OTOPCUA_HISTORIAN_TLS_ENABLED`
(default `false`), `OTOPCUA_HISTORIAN_TLS_CERT` (pfx path or cert-store
thumbprint), `OTOPCUA_HISTORIAN_TLS_CERT_PASSWORD`. Keep
`OTOPCUA_HISTORIAN_SECRET`, `OTOPCUA_HISTORIAN_ENABLED`, and the SDK vars
(`OTOPCUA_HISTORIAN_SERVER`/`PORT`/…).
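
The single-active serial accept loop described for `TcpFrameServer` could look roughly like the sketch below. It is written in C# 7.3-compatible style so it would compile on net48. The class name `SketchSerialTcpServer` and the `serveConnection` delegate (standing in for the `Hello` handshake plus the frame loop) are hypothetical, and the backoff and `MaxConsecutiveFailures` logic is deliberately left out.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Security;
using System.Net.Sockets;
using System.Security.Authentication;
using System.Security.Cryptography.X509Certificates;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of a serial accept loop; NOT the real TcpFrameServer.
public sealed class SketchSerialTcpServer
{
    private readonly TcpListener _listener;
    private readonly X509Certificate2 _tlsCert; // null means plaintext (dev)
    private readonly Func<Stream, CancellationToken, Task> _serveConnection;

    public SketchSerialTcpServer(IPAddress bind, int port, X509Certificate2 tlsCert,
        Func<Stream, CancellationToken, Task> serveConnection)
    {
        _listener = new TcpListener(bind, port);
        _tlsCert = tlsCert;
        _serveConnection = serveConnection;
    }

    // After Start(), reports the bound port (useful when constructed with port 0).
    public int Port { get { return ((IPEndPoint)_listener.LocalEndpoint).Port; } }

    public void Start() { _listener.Start(); }

    public async Task RunAsync(CancellationToken ct)
    {
        // Stopping the listener unblocks a pending accept on cancellation.
        using (ct.Register(() => _listener.Stop()))
        {
            while (!ct.IsCancellationRequested)
            {
                TcpClient client;
                try { client = await _listener.AcceptTcpClientAsync().ConfigureAwait(false); }
                catch (Exception) when (ct.IsCancellationRequested) { break; }

                // Serial: the next accept happens only after this connection ends.
                using (client)
                {
                    try
                    {
                        Stream stream = client.GetStream();
                        if (_tlsCert != null)
                        {
                            var ssl = new SslStream(stream, false);
                            await ssl.AuthenticateAsServerAsync(_tlsCert).ConfigureAwait(false);
                            stream = ssl;
                        }
                        using (stream)
                            await _serveConnection(stream, ct).ConfigureAwait(false);
                    }
                    catch (IOException) { /* peer dropped; accept the next caller */ }
                    catch (AuthenticationException) { /* TLS handshake failed */ }
                }
            }
        }
    }
}
```

A second caller that connects while one is being served simply waits in the listener backlog until the loop comes back around, which is the closest TCP analogue to the pipe's `maxInstances:1`.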
### Client (.NET 10 — `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/`)
- **`Internal/PipeChannel.cs` → `Internal/FrameChannel.cs`** (cosmetic rename;
already transport-agnostic). Add **`DefaultTcpConnectFactory`**:
`TcpClient.ConnectAsync(host, port, ct)` → if TLS,
`SslStream.AuthenticateAsClientAsync` (validate server cert: thumbprint-pin
*or* CA-chain per config; skip in plaintext mode) → return the stream. The
`FrameReader`/`FrameWriter`/`Hello`/MessagePack layer is reused unchanged.
- **`WonderwareHistorianClient.cs`:** default ctor switches to the TCP factory;
the injectable `connect`-func ctor stays (used by tests).
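
A connect factory along these lines might be sketched as follows, targeting modern .NET. `SketchTcpConnectFactory` and `IsServerCertAccepted` are hypothetical names, and the rule "pin if a thumbprint is configured, otherwise require a clean CA chain" is one plausible reading of the "thumbprint-pin or CA-chain per config" requirement, not a confirmed design.

```csharp
using System;
using System.IO;
using System.Net.Security;
using System.Net.Sockets;
using System.Security.Cryptography.X509Certificates;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; NOT the real DefaultTcpConnectFactory.
public static class SketchTcpConnectFactory
{
    public static async Task<Stream> ConnectAsync(
        string host, int port, bool useTls, string pinnedThumbprint, CancellationToken ct)
    {
        var tcp = new TcpClient();
        try
        {
            await tcp.ConnectAsync(host, port, ct);
            Stream stream = tcp.GetStream(); // disposing this stream closes the socket
            if (!useTls) return stream;      // plaintext mode: no certificate checks

            var ssl = new SslStream(stream, leaveInnerStreamOpen: false);
            var options = new SslClientAuthenticationOptions
            {
                TargetHost = host,
                RemoteCertificateValidationCallback = (sender, cert, chain, errors) =>
                    IsServerCertAccepted(cert, errors, pinnedThumbprint),
            };
            await ssl.AuthenticateAsClientAsync(options, ct);
            return ssl;
        }
        catch
        {
            tcp.Dispose(); // do not leak the socket on connect or handshake failure
            throw;
        }
    }

    // Assumed policy: with a pin configured, only the thumbprint decides;
    // without one, the platform's chain and name validation must be clean.
    public static bool IsServerCertAccepted(
        X509Certificate cert, SslPolicyErrors errors, string pinnedThumbprint)
    {
        if (string.IsNullOrEmpty(pinnedThumbprint))
            return errors == SslPolicyErrors.None;
        if (cert == null) return false;

        string expected = pinnedThumbprint.Replace(" ", "").Replace(":", "");
        string actual = cert.GetCertHashString(); // SHA-1 hash as hex, i.e. the thumbprint
        return string.Equals(actual, expected, StringComparison.OrdinalIgnoreCase);
    }
}
```

The returned `Stream` is what the injectable `connect` func would hand to the frame layer, so tests can keep substituting an in-memory stream.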
### Options + host wiring
- **`…Client.Contracts/WonderwareHistorianClientOptions.cs`:** replace
`PipeName` with `Host` + `Port`; add `UseTls` and `ServerCertThumbprint`
(optional pin) / validation mode. Keep `SharedSecret`, `PeerName`,
`ConnectTimeout`, `CallTimeout`, `ProbeTimeoutSeconds`.
- **`Historian:Wonderware` appsettings:** `Host`/`Port`/`UseTls`/
`ServerCertThumbprint` replace `PipeName`. Bound where the client is built:
`src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ServiceCollectionExtensions.cs`,
`src/Server/ZB.MOM.WW.OtOpcUa.Host/Drivers/DriverFactoryBootstrap.cs`,
`src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Historian/AlarmHistorianOptions.cs`.
- **AdminUI Test-Connect probe:**
`src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Shared/Drivers/Pickers/HistorianWonderwareAddressBuilder.cs`
+ the probe path updated to host/port/TLS.
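
As a rough picture of the options change, the sketch below shows the TCP-era fields with a small validation helper. The class name `SketchHistorianClientOptions`, the property types, and the `Validate` method are assumptions; the timeout and `PeerName` properties that the design keeps are omitted for brevity.

```csharp
using System;

// Hypothetical shape; NOT the real WonderwareHistorianClientOptions.
public sealed class SketchHistorianClientOptions
{
    public string Host { get; set; } = "";
    public int Port { get; set; }
    public bool UseTls { get; set; }
    // Optional pin; null or empty is assumed to mean CA-chain validation.
    public string ServerCertThumbprint { get; set; }
    public string SharedSecret { get; set; } = "";

    // Returns null when the options are usable, otherwise a description of the problem.
    public string Validate()
    {
        if (string.IsNullOrWhiteSpace(Host)) return "Host is required.";
        if (Port < 1 || Port > 65535) return "Port must be in 1..65535.";
        if (string.IsNullOrEmpty(SharedSecret)) return "SharedSecret is required in every mode.";
        if (!UseTls && !string.IsNullOrEmpty(ServerCertThumbprint))
            return "ServerCertThumbprint has no effect unless UseTls is true.";
        return null;
    }
}
```

Validating at bind time would let the AdminUI Test-Connect probe report a misconfigured host or port before any socket is opened.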
## Security model
- **Caller auth:** shared-secret `Hello`, required in all modes — this is the
replacement for the named pipe's Windows-SID ACL.
- **Transport:** TLS optional, config-driven. Dev = plaintext (`UseTls=false`).
Prod = server cert; client pins the thumbprint *or* validates the CA chain
(both supported). Server cert can live in the existing
`C:\ProgramData\OtOpcUa\pki`.
- **Network exposure:** a new inbound port requires a **Windows Firewall rule**
on the VM. Bind to a specific management NIC instead of `0.0.0.0` to scope
exposure.
- **mTLS** (client-cert auth) is a future follow-up; the shared secret covers
caller authentication for now.
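
Since the shared secret becomes the only caller check once the SID ACL is gone, the comparison itself deserves some care. The design does not say how the secret is compared; the sketch below assumes a constant-time comparison written by hand, because `CryptographicOperations.FixedTimeEquals` is not available on net48. `SketchSecretCheck` is a hypothetical name.

```csharp
using System;
using System.Text;

// Hypothetical helper; the real Hello handler may compare secrets differently.
public static class SketchSecretCheck
{
    public static bool SecretsMatch(string expected, string presented)
    {
        // An unset or empty configured secret never authenticates anyone.
        if (string.IsNullOrEmpty(expected) || presented == null) return false;

        byte[] a = Encoding.UTF8.GetBytes(expected);
        byte[] b = Encoding.UTF8.GetBytes(presented);

        // Fold the length difference and every byte difference into one value,
        // so the loop does not exit early at the first mismatching byte.
        int diff = a.Length ^ b.Length;
        for (int i = 0; i < b.Length; i++)
            diff |= a[i % a.Length] ^ b[i];
        return diff == 0;
    }
}
```

In plaintext dev mode the secret still crosses the network unencrypted, so this only addresses timing leaks, not eavesdropping; that part is what TLS in prod covers.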
## Deployment
- `scripts/install/Install-Services.ps1` / `Refresh-Services.ps1`: swap the
historian service env block (drop `OTOPCUA_ALLOWED_SID`, add
`OTOPCUA_HISTORIAN_TCP_PORT` + `OTOPCUA_HISTORIAN_TLS_*`), add an inbound
firewall rule for the port, and provision the server cert (prod). The Step 4b
deploy-completeness assertion stays as-is.
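
On the sidecar side, the env block written by the install scripts might be consumed as in the sketch below. `SketchSidecarTcpSettings` is a hypothetical type, `32569` is only the example default mentioned in this design, and treating `OTOPCUA_HISTORIAN_TLS_CERT` as an opaque string (pfx path or thumbprint) is an assumption; certificate loading is not shown.

```csharp
using System;
using System.Globalization;
using System.Net;

// Hypothetical settings holder; NOT the real Program.cs parsing.
public sealed class SketchSidecarTcpSettings
{
    public IPAddress Bind { get; private set; }
    public int Port { get; private set; }
    public bool TlsEnabled { get; private set; }
    public string TlsCert { get; private set; }
    public string Secret { get; private set; }

    // getEnv is injected so tests need not mutate the process environment;
    // production code would pass Environment.GetEnvironmentVariable.
    public static SketchSidecarTcpSettings Load(Func<string, string> getEnv)
    {
        string secret = getEnv("OTOPCUA_HISTORIAN_SECRET");
        if (string.IsNullOrEmpty(secret))
            throw new InvalidOperationException("OTOPCUA_HISTORIAN_SECRET is required in every mode.");

        int port = int.Parse(getEnv("OTOPCUA_HISTORIAN_TCP_PORT") ?? "32569",
            CultureInfo.InvariantCulture);
        if (port < 1 || port > 65535)
            throw new InvalidOperationException("OTOPCUA_HISTORIAN_TCP_PORT must be in 1..65535.");

        bool tls;
        bool.TryParse(getEnv("OTOPCUA_HISTORIAN_TLS_ENABLED"), out tls); // unset or unparsable means false

        string cert = getEnv("OTOPCUA_HISTORIAN_TLS_CERT");
        if (tls && string.IsNullOrEmpty(cert))
            throw new InvalidOperationException("TLS is enabled but OTOPCUA_HISTORIAN_TLS_CERT is not set.");

        return new SketchSidecarTcpSettings
        {
            Bind = IPAddress.Parse(getEnv("OTOPCUA_HISTORIAN_BIND") ?? "0.0.0.0"),
            Port = port,
            TlsEnabled = tls,
            TlsCert = cert,
            Secret = secret,
        };
    }
}
```

Failing fast on a missing secret or certificate at startup keeps a half-configured service from listening on an open port.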
## Testing (no live sign-in by the agent)
- **Reuse** the existing byte-parity / round-trip contract tests (contracts
unchanged).
- **New unit/integration (xUnit + Shouldly):** TCP connect factory; self-signed
TLS loopback handshake; **end-to-end loopback** (`TcpFrameServer` + client
over `127.0.0.1`, both plaintext and TLS); `Hello`-reject on bad shared
secret over TCP; single-active-connection serial-accept behavior.
- **Live (user-driven):** MacBook OtOpcUa → VM sidecar over TCP — dev plaintext
first: `ReadRaw` returns live samples and a `WriteAlarmEvents` call round-trips; then
flip `UseTls=true` and re-verify. Open the VM firewall port. Done = build
clean + `dotnet test` green + live read/write pass.
## Rollout / migration
- The pipe is fully replaced, so **both ends move together** (no mixed
pipe/TCP). The protocol above the socket is byte-identical, so this is a
transport swap, not a contract change. Sequence: deploy the TCP sidecar
(+firewall +env), then the TCP-client host, with the same shared secret.
## Open items / follow-ups (not blockers)
- mTLS / client-cert auth (hardening).
- Optional: bind-NIC scoping vs `0.0.0.0`.
- Same-machine deploys now use loopback TCP (`127.0.0.1:<port>`) instead of a
pipe — expected given full replacement.
## Touched code (authoritative file list)
- Sidecar: `Ipc/TcpFrameServer.cs` (new, replaces `Ipc/PipeServer.cs`), remove
`Ipc/PipeAcl.cs`, `Program.cs`.
- Client: `Internal/FrameChannel.cs` (rename of `PipeChannel.cs` + TCP factory),
`WonderwareHistorianClient.cs`.
- Contracts: `WonderwareHistorianClientOptions.cs`.
- Host: `Runtime/ServiceCollectionExtensions.cs`,
`Host/Drivers/DriverFactoryBootstrap.cs`,
`Runtime/Historian/AlarmHistorianOptions.cs`,
`AdminUI/.../Pickers/HistorianWonderwareAddressBuilder.cs`.
- Deploy: `scripts/install/Install-Services.ps1`,
`scripts/install/Refresh-Services.ps1`.
- Docs: `docs/drivers/Historian.Wonderware.md`, `docs/ServiceHosting.md`,
`docs/AlarmHistorian.md`.
- Unchanged (reused): both `Ipc/Contracts.cs`, `Ipc/Framing.cs`,
`Ipc/FrameReader.cs`, `Ipc/HistorianFrameHandler.cs`, the AVEVA SDK backends.