diff --git a/code-reviews/Driver.Historian.Wonderware.Client/findings.md b/code-reviews/Driver.Historian.Wonderware.Client/findings.md index 4f345fa5..25037ecb 100644 --- a/code-reviews/Driver.Historian.Wonderware.Client/findings.md +++ b/code-reviews/Driver.Historian.Wonderware.Client/findings.md @@ -4,8 +4,8 @@ |---|---| | Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client` | | Reviewer | Claude Code | -| Review date | 2026-05-22 | -| Commit reviewed | `76d35d1` | +| Review date | 2026-06-19 | +| Commit reviewed | `7286d320` | | Status | Reviewed | | Open findings | 0 | @@ -292,3 +292,67 @@ short remark on `GetHealthSnapshot` explaining that the single-channel client ma connection flags to one transport and does not track per-node health. **Resolution:** Resolved 2026-05-23 — (1) reworded the `Dispose()` XML comment to drop the "non-blocking" claim and instead state that the bridge is **deadlock-safe** because the cleanup never awaits a captured `SynchronizationContext` nor takes any lock the caller could hold, while acknowledging that `NamedPipeClientStream` teardown can block briefly on OS handle release. (2) Added a full `` + `` block to `GetHealthSnapshot` explaining the single-channel collapse — both `ProcessConnectionOpen` and `EventConnectionOpen` report the same channel state, and `ActiveProcessNode`/`ActiveEventNode`/`Nodes` are intentionally null/empty because the client has no per-node telemetry. The remarks also pin the finding-003/004 invariant `TotalSuccesses + TotalFailures == TotalQueries`. + +## Re-review 2026-06-19 (commit 7286d320) + +Significant changes since commit `76d35d1`: named-pipe transport retired and replaced with TCP/TLS (`72f32045`, `6e152047`, `fd4d0553`, `35ac0b8c`, `fcf84adb`); `FrameChannel` introduced (replacing `PipeChannel`); `WonderwareHistorianDriverProbe` added; `WonderwareHistorianClientOptions` extracted to a `.Contracts` project; `PerEventStatus` wire field added for granular `PermanentFail` signalling (`feddc2b8`); `Total` aggregate derived client-side from `Average × interval-seconds` (`5e27b5f7`); test project TCP-ified. All 10 prior findings remained Resolved at this commit. + +#### Checklist coverage + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | No new issues found | +| 2 | OtOpcUa conventions | No issues found | +| 3 | Concurrency & thread safety | Driver.Historian.Wonderware.Client-011 | +| 4 | Error handling & resilience | No new issues found | +| 5 | Security | No new issues found — SharedSecret not logged; TLS pin-check preserved | +| 6 | Performance & resource management | Driver.Historian.Wonderware.Client-012 | +| 7 | Design-document adherence | No issues found | +| 8 | Code organization & conventions | No issues found | +| 9 | Testing coverage | Driver.Historian.Wonderware.Client-013 | +| 10 | Documentation & comments | No new issues found | + +### Driver.Historian.Wonderware.Client-011 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Concurrency & thread safety | +| Location | `Ipc/FrameWriter.cs:47` | +| Status | Resolved | + +**Description:** `FrameWriter.WriteAsync` called `_stream.WriteByte((byte)kind)` synchronously inside the `_gate`-locked section. For a `NetworkStream` (plaintext TCP) this is benign — the kernel send buffer absorbs the byte. For a `SslStream` (TLS path), `Stream.WriteByte` delegates to `Write(byte[])` synchronously: it encrypts and hands the ciphertext to the kernel, which CAN block the calling thread-pool thread if the peer's receive window is exhausted. The `CancellationToken` cannot interrupt a synchronous `WriteByte`; a stuck TLS write could wedge the single-in-flight gate indefinitely. This is the same class of bug as finding 005 (synchronous `ReadByte` on the read path), fixed in `75580fb4`. + +**Recommendation:** Fold the kind byte into the 4-byte length-prefix buffer to form a 5-byte header array, then emit it with a single `await _stream.WriteAsync(header, ct)`. This makes every write inside the gate async+cancellable and eliminates the synchronous call entirely, without changing the on-wire layout. + +**Resolution:** Resolved 2026-06-19 — replaced `_stream.WriteByte((byte)kind)` with a 5-byte header array (`Framing.LengthPrefixSize + Framing.KindByteSize`) containing the big-endian body length and the kind byte together, emitted with a single `await _stream.WriteAsync(header, ct)` call; all three TCP round-trip tests (`Plaintext_ReturnsConnectedStream_ByteRoundTrips`, `Tls_PinnedThumbprintMatches_ConnectsSuccessfully`, and the full `WonderwareHistorianClientTests` suite) pass green confirming the on-wire format is unchanged. + +### Driver.Historian.Wonderware.Client-012 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Performance & resource management | +| Location | `Internal/FrameChannel.cs:38` | +| Status | Resolved | + +**Description:** `FrameChannel.DefaultTcpConnectFactory` was declared as a non-`readonly` mutable `public static` field. Because `FrameChannel` is `internal` but `InternalsVisibleTo` exposes it to the test project, the field could be replaced at runtime by any test that imports the assembly. A test that forgets to restore the field after mutation would silently contaminate subsequent tests in the same process (test-isolation hazard). In production code, any code in the same assembly or the test project can swap in a hostile factory. There is no legitimate reason for the field to be mutable after initialization. + +**Recommendation:** Add `readonly` to the field declaration. The field initializer (a lambda) is a compile-time constant, so this requires no other change. + +**Resolution:** Resolved 2026-06-19 — added `readonly` to `DefaultTcpConnectFactory`; the field is now `public static readonly Func<...>`. All 41 tests pass. No consumer assigned to the field. + +### Driver.Historian.Wonderware.Client-013 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Testing coverage | +| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ContractsWireParityTests.cs` | +| Status | Resolved | + +**Description:** `ContractsWireParityTests.WriteAlarmEventsReply_RoundTrips` covered only the legacy `PerEventOk [Key(3)]` bool array; it did not include the `PerEventStatus [Key(4)]` byte-array field added in commit `feddc2b8`. The whole purpose of `ContractsWireParityTests` is to catch silent `[Key]` index drift between the client and sidecar mirror copies. Without a test pinning `Key(4)`, a future refactor that accidentally shifts `PerEventStatus` to a different key index on either side would go undetected until runtime, causing the `PermanentFail` dead-lettering logic to silently stop working. + +**Recommendation:** Add a test that serializes a `WriteAlarmEventsReply` with a non-empty `PerEventStatus` (e.g. `[0, 1, 2]`), asserts the header byte is `0x95` (fixarray of 5 elements confirming both Key(3) and Key(4) are present), and round-trips to verify `PerEventStatus` survives independently of `PerEventOk`. + +**Resolution:** Resolved 2026-06-19 — added `WriteAlarmEventsReply_PerEventStatus_IsAtKey4_AndRoundTrips` to `ContractsWireParityTests.cs`; asserts the 5-field fixarray header (`0x95`), round-trips `PerEventOk=[true]` and `PerEventStatus=[0,1,2]` independently, and pins `Key(4)` against the `[0,1,2]` byte layout. 41 tests pass (was 40). diff --git a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Internal/FrameChannel.cs b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Internal/FrameChannel.cs index bdd2f305..7b4dc1b4 100644 --- a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Internal/FrameChannel.cs +++ b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Internal/FrameChannel.cs @@ -35,7 +35,7 @@ internal sealed class FrameChannel : IAsyncDisposable /// in TLS (server-auth; pinned-thumbprint or CA-chain validation). The Hello handshake + /// shared secret still authenticate the caller on top of this. /// - public static Func> DefaultTcpConnectFactory = + public static readonly Func> DefaultTcpConnectFactory = async (opts, ct) => { if (string.IsNullOrWhiteSpace(opts.Host)) diff --git a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Ipc/FrameWriter.cs b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Ipc/FrameWriter.cs index 33e9609d..87b7c5eb 100644 --- a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Ipc/FrameWriter.cs +++ b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Ipc/FrameWriter.cs @@ -33,18 +33,22 @@ public sealed class FrameWriter : IDisposable throw new InvalidOperationException( $"Sidecar IPC frame body {body.Length} exceeds {Framing.MaxFrameBodyBytes} byte cap."); - var lengthPrefix = new byte[Framing.LengthPrefixSize]; - // Big-endian. - lengthPrefix[0] = (byte)((body.Length >> 24) & 0xFF); - lengthPrefix[1] = (byte)((body.Length >> 16) & 0xFF); - lengthPrefix[2] = (byte)((body.Length >> 8) & 0xFF); - lengthPrefix[3] = (byte)( body.Length & 0xFF); + // 5-byte header: [4-byte big-endian body length][1-byte message kind]. + // The kind byte is folded into the header array so every write inside the gate + // is async+cancellable — a synchronous Stream.WriteByte() blocks the calling + // thread-pool thread and cannot be interrupted by the call-timeout token when + // the peer's receive window is full (same class of bug as finding 005 on reads). + var header = new byte[Framing.LengthPrefixSize + Framing.KindByteSize]; + header[0] = (byte)((body.Length >> 24) & 0xFF); + header[1] = (byte)((body.Length >> 16) & 0xFF); + header[2] = (byte)((body.Length >> 8) & 0xFF); + header[3] = (byte)( body.Length & 0xFF); + header[4] = (byte)kind; await _gate.WaitAsync(ct).ConfigureAwait(false); try { - await _stream.WriteAsync(lengthPrefix, ct).ConfigureAwait(false); - _stream.WriteByte((byte)kind); + await _stream.WriteAsync(header, ct).ConfigureAwait(false); await _stream.WriteAsync(body, ct).ConfigureAwait(false); await _stream.FlushAsync(ct).ConfigureAwait(false); } diff --git a/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ContractsWireParityTests.cs b/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ContractsWireParityTests.cs index df2478b5..d4132e04 100644 --- a/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ContractsWireParityTests.cs +++ b/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ContractsWireParityTests.cs @@ -183,7 +183,7 @@ public sealed class ContractsWireParityTests rt.Events[0].EventTimeUtcTicks.ShouldBe(999L); } - /// Verifies that WriteAlarmEventsReply round-trips correctly. + /// Verifies that WriteAlarmEventsReply round-trips correctly (legacy PerEventOk path). [Fact] public void WriteAlarmEventsReply_RoundTrips() { @@ -202,6 +202,36 @@ public sealed class ContractsWireParityTests rt.PerEventOk.ShouldBe(new[] { true, false, true }); } + /// + /// Pins the [Key(4)] index for , + /// the additive granular status field added in the feddc2b8 commit. A silent + /// Key-index drift in either the client or the sidecar mirror copy would swap the legacy + /// PerEventOk bool array and the new status byte array, misclassifying outcomes + /// at runtime. (Finding 013.) + /// + [Fact] + public void WriteAlarmEventsReply_PerEventStatus_IsAtKey4_AndRoundTrips() + { + var original = new WriteAlarmEventsReply + { + CorrelationId = "s", + Success = true, + PerEventOk = [true], + PerEventStatus = [0, 1, 2], // Ack, Retry, Permanent + }; + var bytes = MessagePackSerializer.Serialize(original); + + // The array must start with fixarray(5) — five keys at indices 0-4. + bytes[0].ShouldBe((byte)0x95, "WriteAlarmEventsReply must be a 5-field MessagePack array"); + + var rt = MessagePackSerializer.Deserialize(bytes); + rt.CorrelationId.ShouldBe("s"); + rt.Success.ShouldBeTrue(); + rt.PerEventOk.ShouldBe(new[] { true }); + // Key(4): PerEventStatus must round-trip independently of Key(3): PerEventOk. + rt.PerEventStatus.ShouldBe(new byte[] { 0, 1, 2 }); + } + // ---- MessageKind enum values are pinned ---- // Changing a MessageKind value is a wire break; pin them explicitly.