review(Driver.Historian.Wonderware.Client): async frame-header write + wire-parity test
Re-review at 7286d320. -011: FrameWriter folded the sync WriteByte (could block on SslStream
past the call timeout) into one async 5-byte header write. -012: DefaultTcpConnectFactory
readonly. -013: wire-parity test for PerEventStatus [Key(4)]. No wire change.
This commit is contained in:
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-22 |
|
||||
| Commit reviewed | `76d35d1` |
|
||||
| Review date | 2026-06-19 |
|
||||
| Commit reviewed | `7286d320` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -292,3 +292,67 @@ short remark on `GetHealthSnapshot` explaining that the single-channel client ma
|
||||
connection flags to one transport and does not track per-node health.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — (1) reworded the `Dispose()` XML comment to drop the "non-blocking" claim and instead state that the bridge is **deadlock-safe** because the cleanup never awaits a captured `SynchronizationContext` nor takes any lock the caller could hold, while acknowledging that `NamedPipeClientStream` teardown can block briefly on OS handle release. (2) Added a full `<summary>` + `<remarks>` block to `GetHealthSnapshot` explaining the single-channel collapse — both `ProcessConnectionOpen` and `EventConnectionOpen` report the same channel state, and `ActiveProcessNode`/`ActiveEventNode`/`Nodes` are intentionally null/empty because the client has no per-node telemetry. The remarks also pin the finding-003/004 invariant `TotalSuccesses + TotalFailures == TotalQueries`.
|
||||
|
||||
## Re-review 2026-06-19 (commit 7286d320)
|
||||
|
||||
Significant changes since commit `76d35d1`: named-pipe transport retired and replaced with TCP/TLS (`72f32045`, `6e152047`, `fd4d0553`, `35ac0b8c`, `fcf84adb`); `FrameChannel` introduced (replacing `PipeChannel`); `WonderwareHistorianDriverProbe` added; `WonderwareHistorianClientOptions` extracted to a `.Contracts` project; `PerEventStatus` wire field added for granular `PermanentFail` signalling (`feddc2b8`); `Total` aggregate derived client-side from `Average × interval-seconds` (`5e27b5f7`); test project TCP-ified. All 10 prior findings remained Resolved at this commit.
|
||||
|
||||
#### Checklist coverage
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | No new issues found |
|
||||
| 2 | OtOpcUa conventions | No issues found |
|
||||
| 3 | Concurrency & thread safety | Driver.Historian.Wonderware.Client-011 |
|
||||
| 4 | Error handling & resilience | No new issues found |
|
||||
| 5 | Security | No new issues found — SharedSecret not logged; TLS pin-check preserved |
|
||||
| 6 | Performance & resource management | Driver.Historian.Wonderware.Client-012 |
|
||||
| 7 | Design-document adherence | No issues found |
|
||||
| 8 | Code organization & conventions | No issues found |
|
||||
| 9 | Testing coverage | Driver.Historian.Wonderware.Client-013 |
|
||||
| 10 | Documentation & comments | No new issues found |
|
||||
|
||||
### Driver.Historian.Wonderware.Client-011
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Concurrency & thread safety |
|
||||
| Location | `Ipc/FrameWriter.cs:47` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `FrameWriter.WriteAsync` called `_stream.WriteByte((byte)kind)` synchronously inside the `_gate`-locked section. For a `NetworkStream` (plaintext TCP) this is benign — the kernel send buffer absorbs the byte. For a `SslStream` (TLS path), `Stream.WriteByte` delegates to `Write(byte[])` synchronously: it encrypts and hands the ciphertext to the kernel, which CAN block the calling thread-pool thread if the peer's receive window is exhausted. The `CancellationToken` cannot interrupt a synchronous `WriteByte`; a stuck TLS write could wedge the single-in-flight gate indefinitely. This is the same class of bug as finding 005 (synchronous `ReadByte` on the read path), fixed in `75580fb4`.
|
||||
|
||||
**Recommendation:** Fold the kind byte into the 4-byte length-prefix buffer to form a 5-byte header array, then emit it with a single `await _stream.WriteAsync(header, ct)`. This makes every write inside the gate async+cancellable and eliminates the synchronous call entirely, without changing the on-wire layout.
|
||||
|
||||
**Resolution:** Resolved 2026-06-19 — replaced `_stream.WriteByte((byte)kind)` with a 5-byte header array (`Framing.LengthPrefixSize + Framing.KindByteSize`) containing the big-endian body length and the kind byte together, emitted with a single `await _stream.WriteAsync(header, ct)` call; all three TCP round-trip tests (`Plaintext_ReturnsConnectedStream_ByteRoundTrips`, `Tls_PinnedThumbprintMatches_ConnectsSuccessfully`, and the full `WonderwareHistorianClientTests` suite) pass green confirming the on-wire format is unchanged.
|
||||
|
||||
### Driver.Historian.Wonderware.Client-012
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Performance & resource management |
|
||||
| Location | `Internal/FrameChannel.cs:38` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `FrameChannel.DefaultTcpConnectFactory` was declared as a non-`readonly` mutable `public static` field. Because `FrameChannel` is `internal` but `InternalsVisibleTo` exposes it to the test project, the field could be replaced at runtime by any test that imports the assembly. A test that forgets to restore the field after mutation would silently contaminate subsequent tests in the same process (test-isolation hazard). In production code, any code in the same assembly or the test project can swap in a hostile factory. There is no legitimate reason for the field to be mutable after initialization.
|
||||
|
||||
**Recommendation:** Add `readonly` to the field declaration. The field initializer (a lambda) is a compile-time constant, so this requires no other change.
|
||||
|
||||
**Resolution:** Resolved 2026-06-19 — added `readonly` to `DefaultTcpConnectFactory`; the field is now `public static readonly Func<...>`. All 41 tests pass. No consumer assigned to the field.
|
||||
|
||||
### Driver.Historian.Wonderware.Client-013
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Testing coverage |
|
||||
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ContractsWireParityTests.cs` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `ContractsWireParityTests.WriteAlarmEventsReply_RoundTrips` covered only the legacy `PerEventOk [Key(3)]` bool array; it did not include the `PerEventStatus [Key(4)]` byte-array field added in commit `feddc2b8`. The whole purpose of `ContractsWireParityTests` is to catch silent `[Key]` index drift between the client and sidecar mirror copies. Without a test pinning `Key(4)`, a future refactor that accidentally shifts `PerEventStatus` to a different key index on either side would go undetected until runtime, causing the `PermanentFail` dead-lettering logic to silently stop working.
|
||||
|
||||
**Recommendation:** Add a test that serializes a `WriteAlarmEventsReply` with a non-empty `PerEventStatus` (e.g. `[0, 1, 2]`), asserts the header byte is `0x95` (fixarray of 5 elements confirming both Key(3) and Key(4) are present), and round-trips to verify `PerEventStatus` survives independently of `PerEventOk`.
|
||||
|
||||
**Resolution:** Resolved 2026-06-19 — added `WriteAlarmEventsReply_PerEventStatus_IsAtKey4_AndRoundTrips` to `ContractsWireParityTests.cs`; asserts the 5-field fixarray header (`0x95`), round-trips `PerEventOk=[true]` and `PerEventStatus=[0,1,2]` independently, and pins `Key(4)` against the `[0,1,2]` byte layout. 41 tests pass (was 40).
|
||||
|
||||
+1
-1
@@ -35,7 +35,7 @@ internal sealed class FrameChannel : IAsyncDisposable
|
||||
/// in TLS (server-auth; pinned-thumbprint or CA-chain validation). The Hello handshake +
|
||||
/// shared secret still authenticate the caller on top of this.
|
||||
/// </summary>
|
||||
public static Func<WonderwareHistorianClientOptions, CancellationToken, Task<Stream>> DefaultTcpConnectFactory =
|
||||
public static readonly Func<WonderwareHistorianClientOptions, CancellationToken, Task<Stream>> DefaultTcpConnectFactory =
|
||||
async (opts, ct) =>
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(opts.Host))
|
||||
|
||||
+12
-8
@@ -33,18 +33,22 @@ public sealed class FrameWriter : IDisposable
|
||||
throw new InvalidOperationException(
|
||||
$"Sidecar IPC frame body {body.Length} exceeds {Framing.MaxFrameBodyBytes} byte cap.");
|
||||
|
||||
var lengthPrefix = new byte[Framing.LengthPrefixSize];
|
||||
// Big-endian.
|
||||
lengthPrefix[0] = (byte)((body.Length >> 24) & 0xFF);
|
||||
lengthPrefix[1] = (byte)((body.Length >> 16) & 0xFF);
|
||||
lengthPrefix[2] = (byte)((body.Length >> 8) & 0xFF);
|
||||
lengthPrefix[3] = (byte)( body.Length & 0xFF);
|
||||
// 5-byte header: [4-byte big-endian body length][1-byte message kind].
|
||||
// The kind byte is folded into the header array so every write inside the gate
|
||||
// is async+cancellable — a synchronous Stream.WriteByte() blocks the calling
|
||||
// thread-pool thread and cannot be interrupted by the call-timeout token when
|
||||
// the peer's receive window is full (same class of bug as finding 005 on reads).
|
||||
var header = new byte[Framing.LengthPrefixSize + Framing.KindByteSize];
|
||||
header[0] = (byte)((body.Length >> 24) & 0xFF);
|
||||
header[1] = (byte)((body.Length >> 16) & 0xFF);
|
||||
header[2] = (byte)((body.Length >> 8) & 0xFF);
|
||||
header[3] = (byte)( body.Length & 0xFF);
|
||||
header[4] = (byte)kind;
|
||||
|
||||
await _gate.WaitAsync(ct).ConfigureAwait(false);
|
||||
try
|
||||
{
|
||||
await _stream.WriteAsync(lengthPrefix, ct).ConfigureAwait(false);
|
||||
_stream.WriteByte((byte)kind);
|
||||
await _stream.WriteAsync(header, ct).ConfigureAwait(false);
|
||||
await _stream.WriteAsync(body, ct).ConfigureAwait(false);
|
||||
await _stream.FlushAsync(ct).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
+31
-1
@@ -183,7 +183,7 @@ public sealed class ContractsWireParityTests
|
||||
rt.Events[0].EventTimeUtcTicks.ShouldBe(999L);
|
||||
}
|
||||
|
||||
/// <summary>Verifies that WriteAlarmEventsReply round-trips correctly.</summary>
|
||||
/// <summary>Verifies that WriteAlarmEventsReply round-trips correctly (legacy PerEventOk path).</summary>
|
||||
[Fact]
|
||||
public void WriteAlarmEventsReply_RoundTrips()
|
||||
{
|
||||
@@ -202,6 +202,36 @@ public sealed class ContractsWireParityTests
|
||||
rt.PerEventOk.ShouldBe(new[] { true, false, true });
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Pins the <c>[Key(4)]</c> index for <see cref="WriteAlarmEventsReply.PerEventStatus"/>,
|
||||
/// the additive granular status field added in the <c>feddc2b8</c> commit. A silent
|
||||
/// Key-index drift in either the client or the sidecar mirror copy would swap the legacy
|
||||
/// <c>PerEventOk</c> bool array and the new status byte array, misclassifying outcomes
|
||||
/// at runtime. (Finding 013.)
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public void WriteAlarmEventsReply_PerEventStatus_IsAtKey4_AndRoundTrips()
|
||||
{
|
||||
var original = new WriteAlarmEventsReply
|
||||
{
|
||||
CorrelationId = "s",
|
||||
Success = true,
|
||||
PerEventOk = [true],
|
||||
PerEventStatus = [0, 1, 2], // Ack, Retry, Permanent
|
||||
};
|
||||
var bytes = MessagePackSerializer.Serialize(original);
|
||||
|
||||
// The array must start with fixarray(5) — five keys at indices 0-4.
|
||||
bytes[0].ShouldBe((byte)0x95, "WriteAlarmEventsReply must be a 5-field MessagePack array");
|
||||
|
||||
var rt = MessagePackSerializer.Deserialize<WriteAlarmEventsReply>(bytes);
|
||||
rt.CorrelationId.ShouldBe("s");
|
||||
rt.Success.ShouldBeTrue();
|
||||
rt.PerEventOk.ShouldBe(new[] { true });
|
||||
// Key(4): PerEventStatus must round-trip independently of Key(3): PerEventOk.
|
||||
rt.PerEventStatus.ShouldBe(new byte[] { 0, 1, 2 });
|
||||
}
|
||||
|
||||
// ---- MessageKind enum values are pinned ----
|
||||
// Changing a MessageKind value is a wire break; pin them explicitly.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user