# Code Review — Driver.Historian.Wonderware.Client

| Field | Value |
|---|---|
| Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client` |
| Reviewer | Claude Code |
| Review date | 2026-06-19 |
| Commit reviewed | `7286d320` |
| Status | Reviewed |
| Open findings | 0 |

## Checklist coverage

A comprehensive review completes every category, recording "No issues found" where
a category produced nothing rather than leaving it blank.

| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Driver.Historian.Wonderware.Client-001, Driver.Historian.Wonderware.Client-002 |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | Driver.Historian.Wonderware.Client-003, Driver.Historian.Wonderware.Client-004 |
| 4 | Error handling & resilience | Driver.Historian.Wonderware.Client-005, Driver.Historian.Wonderware.Client-006 |
| 5 | Security | Driver.Historian.Wonderware.Client-007, Driver.Historian.Wonderware.Client-008 |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Driver.Historian.Wonderware.Client-009 |
| 10 | Documentation & comments | Driver.Historian.Wonderware.Client-010 |

## Findings

### Driver.Historian.Wonderware.Client-001

| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `WonderwareHistorianClient.cs:98-113` |
| Status | Resolved |

**Description:** `ReadAtTimeAsync` violates the explicit `IHistorianDataSource.ReadAtTimeAsync`
contract. The interface XML doc states: the returned list MUST be the same length and
order as `timestampsUtc`, and gaps are returned as Bad-quality snapshots. The client passes
`reply.Samples` straight through `ToSnapshots` with no check that the sidecar returned
exactly one sample per requested timestamp, nor that the order matches. If the sidecar
returns fewer/more samples (e.g. it drops boundary-less timestamps), the OPC UA
HistoryReadAtTime service receives a result that the spec-compliant caller expects to
index positionally against the request timestamps, silently misaligning values with
timestamps. The matching `ReadAtTimeAsync_PreservesTimestampOrder` test only passes because
the fake echoes the request verbatim; it never exercises a short/reordered reply.

**Recommendation:** After receiving the reply, reconcile `reply.Samples` against
`timestampsUtc` by timestamp: build the result array at `timestampsUtc.Count`, fill matched
entries, and emit a Bad-quality (`0x80000000`) snapshot for any requested timestamp the
sidecar did not return. Alternatively assert `reply.Samples.Length == timestampsUtc.Count`
and fail loudly. Add a test where the fake returns a partial/reordered sample set.

**Resolution:** Resolved 2026-05-22: `ReadAtTimeAsync` now reconciles the sidecar reply against the requested timestamps via a new `AlignAtTimeSnapshots` helper: it indexes returned samples by timestamp ticks, builds the result at `timestampsUtc.Count` in request order, and emits a Bad-quality (`0x80000000`) snapshot for any requested timestamp the sidecar did not return; added the `ReadAtTimeAsync_PartialAndReorderedReply_AlignsByTimestamp_AndFillsGapsAsBad` regression test.
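
The reconciliation described above can be sketched in a few lines. This is an illustration in Python, not the module's C# `AlignAtTimeSnapshots` code; the tuple layout `(timestamp, value, quality)` and the name `align_at_time` are assumptions made for the example. Only the `0x80000000` Bad-quality code and the "same length, same order" rule come from the finding.

```python
from datetime import datetime, timezone

BAD_QUALITY = 0x80000000  # Bad-quality status code cited in the finding


def align_at_time(requested, samples):
    """Return one (timestamp, value, quality) tuple per requested timestamp.

    `samples` is whatever the peer returned: it may be short, reordered,
    or contain timestamps that were never requested (those are ignored).
    """
    # Index the reply by timestamp so lookup does not depend on reply order.
    by_ts = {ts: (value, quality) for ts, value, quality in samples}
    aligned = []
    for ts in requested:
        # A requested timestamp missing from the reply becomes a Bad-quality gap.
        value, quality = by_ts.get(ts, (None, BAD_QUALITY))
        aligned.append((ts, value, quality))
    return aligned


t0 = datetime(2026, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
t1 = datetime(2026, 1, 1, 0, 1, 0, tzinfo=timezone.utc)
t2 = datetime(2026, 1, 1, 0, 2, 0, tzinfo=timezone.utc)

# The peer answers out of order and omits t1.
reply = [(t2, 7.5, 0), (t0, 1.25, 0)]
result = align_at_time([t0, t1, t2], reply)
```

The result always has `len(requested)` entries in request order, so a caller that indexes positionally against the request timestamps stays aligned even when the reply is partial.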

### Driver.Historian.Wonderware.Client-002

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `WonderwareHistorianClient.cs:154-199`, `IAlarmHistorianSink.cs:66-74` |
| Status | Resolved |

**Description:** `WriteBatchAsync` can never return `HistorianWriteOutcome.PermanentFail`.
`HistorianWriteOutcome` defines three states (`Ack`, `RetryPlease`, `PermanentFail`) and
the drain worker is documented to move the event to the dead-letter table on
`PermanentFail`. The client maps the sidecar `WriteAlarmEventsReply.PerEventOk` bool array
to only `Ack`/`RetryPlease`, and the whole-call-failure and catch paths also only emit
`RetryPlease`. A malformed alarm event the sidecar can never persist (unrecoverable SDK
error on that specific row) therefore retries forever, blocking the head of the
store-and-forward queue and never dead-lettering. The wire contract
(`WriteAlarmEventsReply`) carries no per-event permanent/transient distinction, so the
limitation is structural.

**Recommendation:** Extend the wire contract: replace `bool[] PerEventOk` with a
per-event status enum (Ack/Retry/Permanent), coordinated as an additive change on both
sidecar and client per the Contracts.cs versioning rules, so unrecoverable events can be
dead-lettered. Until then, document explicitly that this writer never produces
`PermanentFail` and that poison events retry indefinitely.

**Resolution:** Resolved 2026-05-22: extending the wire contract (replacing `bool[] PerEventOk` with a per-event status enum) requires a coordinated change to the .NET 4.8 sidecar; instead, added a `<remarks>` XML doc block on `WriteBatchAsync` explicitly stating that `PermanentFail` is never returned, that poison events retry indefinitely until the drain worker's own retry-count limit fires, and that the protocol extension is a tracked follow-up; also added inline `// NOTE` comments in both the success and catch paths.

### Driver.Historian.Wonderware.Client-003

| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `WonderwareHistorianClient.cs:207`, `WonderwareHistorianClient.cs:132-150` |
| Status | Resolved |

**Description:** `_totalQueries` is mutated with `Interlocked.Increment` in `Invoke`, but
read inside `GetHealthSnapshot` under `_healthLock`, and every other counter
(`_totalSuccesses`, `_totalFailures`, `_consecutiveFailures`) is mutated only under
`_healthLock`. The two synchronization mechanisms do not compose: an `Interlocked`
increment is not ordered against `lock`-protected reads, so a snapshot can observe a
`_totalQueries` value inconsistent with the lock-protected counters. The window is small
and the counters are advisory, but the mixed model is a latent hazard.

**Recommendation:** Pick one mechanism. Simplest: move the `_totalQueries++` into the
`_healthLock` block (a new `RecordQuery()` helper, or fold it into `RecordSuccess`/
`RecordFailure`) so all six health fields share a single lock.

**Resolution:** Resolved 2026-05-23: replaced the mixed `Interlocked.Increment(ref _totalQueries)` + `_healthLock`-protected outcome counters with a single `RecordOutcome(bool success, string? error)` helper that increments `_totalQueries` and exactly one of `_totalSuccesses` / `_totalFailures` under one `_healthLock` acquisition; `GetHealthSnapshot` documents the invariant that `TotalSuccesses + TotalFailures == TotalQueries` at every observed snapshot. Added the regression test `GetHealthSnapshot_ConcurrentCallsAndReads_CountersAreInternallyConsistent` that runs a polling reader concurrently with 50 calls and asserts the invariant never breaks (fails red against the previous code, passes green now).

### Driver.Historian.Wonderware.Client-004

| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `WonderwareHistorianClient.cs:203-267` |
| Status | Resolved |

**Description:** A sidecar-reported failure is recorded in two non-atomic steps under
separate lock acquisitions: `Invoke` calls `RecordSuccess()` (line 211) and then the
caller calls `ThrowIfFailed` which calls `ReclassifySuccessAsFailure()` (line 256),
decrementing `_totalSuccesses` and incrementing `_totalFailures`. Between those two locked
regions a concurrent `GetHealthSnapshot` can observe a transient state where the operation
counts as both a success and not-yet-a-failure (`_totalSuccesses` inflated,
`_consecutiveFailures` still 0). The undo-a-success/record-a-failure dance is also fragile:
if a future change adds an early return or exception between `RecordSuccess` and
`ThrowIfFailed`, the success is never reversed.

**Recommendation:** Classify the call once: do not call `RecordSuccess` until the
sidecar-level `Success` flag has been checked, or pass the reply success/error into a
single `RecordOutcome(bool transportOk, bool sidecarOk, string? error)` that updates all
counters under one lock acquisition.

**Resolution:** Resolved 2026-05-23: eliminated the `RecordSuccess` → `ReclassifySuccessAsFailure` undo dance. `InvokeAsync` now takes a `Func<TReply, (bool ok, string? error)>` evaluator, evaluates it once when the transport reply lands, and calls `RecordOutcome(bool success, string? error)` exactly once per call under a single `_healthLock` acquisition. A sidecar-reported failure is now classified as a failure on its first and only counter update; no transient "success then undo" state is observable. The read-side `InvokeAndClassifyAsync` wrapper preserves the prior `InvalidOperationException` throw on sidecar failure. Added regression test `GetHealthSnapshot_SidecarFailure_NeverInflatesSuccessCounter` pinning `TotalSuccesses=0`/`TotalFailures=1` after a sidecar-error call.
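
A minimal sketch of the single-lock pattern that findings 003 and 004 converge on, written in Python for illustration rather than taken from the module's C# source. The class and method names (`HealthCounters`, `record_outcome`, `snapshot`) are hypothetical stand-ins; the point is only that every counter is written and read under the same lock, once per call.

```python
import threading


class HealthCounters:
    """All counters share one lock, so a snapshot never sees a half-applied update."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total_queries = 0
        self._total_successes = 0
        self._total_failures = 0
        self._consecutive_failures = 0
        self._last_error = None

    def record_outcome(self, success, error=None):
        # One classification per call: no "record success, then undo" step.
        with self._lock:
            self._total_queries += 1
            if success:
                self._total_successes += 1
                self._consecutive_failures = 0
            else:
                self._total_failures += 1
                self._consecutive_failures += 1
                self._last_error = error

    def snapshot(self):
        with self._lock:
            return (self._total_queries, self._total_successes, self._total_failures)


def worker(counters, calls):
    for i in range(calls):
        # Every third call is treated as a peer-reported failure.
        counters.record_outcome(success=(i % 3 != 0), error="peer reported failure")


counters = HealthCounters()
threads = [threading.Thread(target=worker, args=(counters, 500)) for _ in range(4)]
for t in threads:
    t.start()

# Poll while the writers run and count any snapshot that breaks the invariant.
violations = 0
while any(t.is_alive() for t in threads):
    q, s, f = counters.snapshot()
    if s + f != q:
        violations += 1

for t in threads:
    t.join()
queries, successes, failures = counters.snapshot()
```

Because the query counter and the outcome counter change inside the same critical section, `successes + failures == queries` holds at every snapshot, not just after the workers finish.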

### Driver.Historian.Wonderware.Client-005

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `Ipc/FrameReader.cs:31-32` |
| Status | Resolved |

**Description:** After reading the 4-byte length prefix, `ReadFrameAsync` reads the kind
byte with the synchronous, blocking `_stream.ReadByte()` and ignores the
`CancellationToken`. On a `NamedPipeClientStream` with `PipeOptions.Asynchronous`, a
synchronous `ReadByte()` blocks the calling thread until a byte arrives or the pipe
closes. If the sidecar sends a length prefix and then stalls (slow/hung peer), the call
hangs on a thread-pool thread and the `EffectiveCallTimeout` linked token in
`PipeChannel.InvokeAsync` cannot interrupt it because the timeout only fires between
awaits. This defeats the documented cap on a single read/write call once connected and can
wedge the single-in-flight call gate.

**Recommendation:** Read the kind byte asynchronously and cancellably: extend the length
prefix read to 5 bytes, or do a second `ReadExactAsync(new byte[1], ct)`. This makes the
whole frame read honor the call-timeout token and matches the async style of the rest of
the reader.

**Resolution:** Resolved 2026-05-22: replaced the synchronous, non-cancellable `_stream.ReadByte()` for the kind byte with an async `ReadExactAsync(new byte[1], ct)` call so the full frame read honours the call-timeout token and cannot wedge the channel on a stalled peer.
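
The failure mode is easy to reproduce outside .NET. The following Python `asyncio` sketch is an analogy, not the module's code: `read_header` and the 50 ms timeout are hypothetical, and the header is read as one 5-byte unit (the first option in the recommendation). It shows that when every read is awaited, a peer that sends the length prefix and then stalls produces a timeout instead of a hang.

```python
import asyncio
import struct

HEADER_SIZE = 5  # 4-byte big-endian length prefix + 1 kind byte


async def read_header(reader):
    # readexactly is awaited, so a surrounding timeout can cancel it mid-read.
    header = await reader.readexactly(HEADER_SIZE)
    length, kind = struct.unpack(">IB", header)
    return length, kind


async def demo():
    # Stalled peer: the length prefix arrives, the kind byte never does.
    stalled = asyncio.StreamReader()
    stalled.feed_data(struct.pack(">I", 3))
    try:
        await asyncio.wait_for(read_header(stalled), timeout=0.05)
        stalled_result = "completed"
    except asyncio.TimeoutError:
        stalled_result = "timed out"

    # Healthy peer: the full 5-byte header is available.
    healthy = asyncio.StreamReader()
    healthy.feed_data(struct.pack(">IB", 3, 2))
    healthy_result = await asyncio.wait_for(read_header(healthy), timeout=0.05)
    return stalled_result, healthy_result


stalled_result, healthy_result = asyncio.run(demo())
```

A blocking single-byte read in the same position would ignore the timeout entirely, which is the behaviour the finding describes.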

### Driver.Historian.Wonderware.Client-006

| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `Internal/PipeChannel.cs:96-107`, `WonderwareHistorianClientOptions.cs:11-12` |
| Status | Resolved |

**Description:** `PipeChannel.InvokeAsync` retries exactly once on transport failure and
otherwise propagates. The options expose `ReconnectInitialBackoff` and
`ReconnectMaxBackoff` and `WonderwareHistorianClientOptions` documents them as exponential
backoff between reconnects, but neither field is referenced anywhere in the module: the
single retry reconnects immediately with no delay. A sidecar that is restarting will
reject or refuse the immediate reconnect, the call fails, and there is no backoff before
the next caller-driven attempt. Either the backoff belongs in the channel and is missing,
or the options are dead config that misleads operators.

**Recommendation:** Either implement the documented exponential backoff in the reconnect
path, or remove the two unused option fields and their XML docs and state plainly that
retry/backoff is owned by the caller (the alarm drain worker / history router).

**Resolution:** Resolved 2026-05-23: removed the dead `ReconnectInitialBackoff`/`ReconnectMaxBackoff` fields (and their `Effective*` accessors) from `WonderwareHistorianClientOptions` and added a `<remarks>` block stating that retry/backoff is owned by the caller (the alarm drain worker and the read-side history router) and that the channel itself performs exactly one in-place reconnect with no delay. Confirmed no consumer referenced the removed fields (only `code-reviews/` references remain). Solution-level build is clean; the Server picks up the new options shape without change.
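
For readers unfamiliar with the channel, the documented behaviour ("exactly one in-place reconnect with no delay") amounts to the following shape. This is a hedged Python sketch with hypothetical names (`TransportError`, `invoke_with_single_reconnect`, `FlakyPeer`), not the channel's actual implementation.

```python
class TransportError(Exception):
    """Stand-in for a dropped or refused connection."""


def invoke_with_single_reconnect(call, reconnect):
    """Try once; on a transport failure reconnect immediately and try once more.

    There is no sleep and no loop: a second failure propagates to the caller,
    which owns any retry/backoff policy.
    """
    try:
        return call()
    except TransportError:
        reconnect()
        return call()


class FlakyPeer:
    """Fails the first `failures` calls, then replies."""

    def __init__(self, failures):
        self.failures_left = failures
        self.reconnects = 0

    def call(self):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise TransportError("connection dropped")
        return "reply"

    def reconnect(self):
        self.reconnects += 1


# One dropped connection: the in-place reconnect hides it from the caller.
recovering = FlakyPeer(failures=1)
first_result = invoke_with_single_reconnect(recovering.call, recovering.reconnect)

# Peer still down after the reconnect: the failure reaches the caller.
restarting = FlakyPeer(failures=2)
try:
    invoke_with_single_reconnect(restarting.call, restarting.reconnect)
    second_result = "reply"
except TransportError:
    second_result = "propagated"
```

The second case is why the removed backoff options were misleading: nothing in this path ever waits, so a restarting peer is the caller's problem.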

### Driver.Historian.Wonderware.Client-007

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Security |
| Location | `WonderwareHistorianClient.cs:276` |
| Status | Resolved |

**Description:** `ToSnapshots` deserializes peer-supplied bytes with
`MessagePackSerializer.Deserialize<object>(dto.ValueBytes)`, typeless MessagePack
deserialization. The `object` overload resolves runtime types from the wire payload. The
client treats the pipe peer as untrusted elsewhere (16 MiB frame cap stated to protect
the receiver from a hostile or buggy peer, shared-secret Hello). Typeless deserialization
of bytes that originate from the historian database widens the trust surface. The
MessagePack standard resolver is primitive-only by default so the practical blast radius
is limited, but this is the pattern called out by the two suppressed MessagePack
advisories on this project (see finding 008).

**Recommendation:** Confirm the serializer options here use the default (non-typeless)
resolver and that no `TypelessContractlessStandardResolver` is in play; if so, document
that. Prefer round-tripping the value as a constrained set of known primitive types rather
than `object`, and validate `ValueBytes.Length` against a sane per-sample cap before
deserializing.

**Resolution:** Resolved 2026-05-22: added `DeserializeSampleValue()` helper that enforces a 64 KiB per-sample `ValueBytes` cap before deserialization and documents that the default `StandardResolver` (primitive-only, no `TypelessContractlessStandardResolver`) is in use; both `ToSnapshots` and `AlignAtTimeSnapshots` now route through the helper; added inline XML comments to the two `NuGetAuditSuppress` entries in the csproj stating the advisory title, why it does not apply to this usage, and the revisit trigger.
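
The two guards in the recommendation (a size cap before decoding, and a closed set of accepted value types after it) can be sketched as below. This is a Python illustration only: it uses the standard-library `json` decoder as a stand-in for MessagePack, and `deserialize_sample_value` and `ALLOWED_TYPES` are hypothetical names. The 64 KiB figure is the cap stated in the resolution.

```python
import json

MAX_VALUE_BYTES = 64 * 1024  # per-sample cap stated in the resolution
ALLOWED_TYPES = (bool, int, float, str, type(None))  # assumed primitive set


def deserialize_sample_value(value_bytes):
    """Decode one sample value, rejecting oversized or non-primitive payloads."""
    # Check the size first so a hostile payload is never handed to the decoder.
    if len(value_bytes) > MAX_VALUE_BYTES:
        raise ValueError(f"sample value is {len(value_bytes)} bytes, cap is {MAX_VALUE_BYTES}")
    value = json.loads(value_bytes)  # stand-in decoder, not MessagePack
    # Accept only a closed set of primitives instead of "whatever the wire says".
    if not isinstance(value, ALLOWED_TYPES):
        raise ValueError(f"unsupported sample value type: {type(value).__name__}")
    return value


ok_value = deserialize_sample_value(b"42.5")

try:
    deserialize_sample_value(b"9" * (MAX_VALUE_BYTES + 1))
    oversize_outcome = "accepted"
except ValueError:
    oversize_outcome = "rejected"

try:
    deserialize_sample_value(b"[1, 2]")
    container_outcome = "accepted"
except ValueError:
    container_outcome = "rejected"
```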

### Driver.Historian.Wonderware.Client-008

| Field | Value |
|---|---|
| Severity | Low |
| Category | Security |
| Location | `ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.csproj:29-32` |
| Status | Resolved |

**Description:** The csproj suppresses two NuGet audit advisories
(`GHSA-37gx-xxp4-5rgx`, `GHSA-w3x6-4m5h-cxqf`) for the `MessagePack` 2.5.187 dependency
with no inline comment recording why the suppression is safe, who reviewed it, or when it
should be revisited. Blanket `NuGetAuditSuppress` entries silence the very signal that
would flag the next related CVE. Combined with finding 007 (typeless deserialization), an
unexplained MessagePack advisory suppression is a maintainability and audit-trail gap.

**Recommendation:** Add an XML comment next to each `NuGetAuditSuppress` stating the
advisory title, why it does not apply to this module usage, and a revisit trigger. Track a
follow-up to upgrade `MessagePack` once a patched version is available so the suppressions
can be dropped.

**Resolution:** Resolved 2026-05-23: the suppression block in the csproj (already added under finding 007) records each advisory title (GHSA-37gx-xxp4-5rgx unsafe-dynamic-codegen, GHSA-w3x6-4m5h-cxqf typeless-resolver gadget chain), why neither applies to this module (default `StandardResolver` only, no `TypelessContractlessStandardResolver` / `DynamicUnion` / `DynamicGenericResolver`, plus the 64 KiB per-sample ValueBytes cap in `DeserializeSampleValue` from finding 007), and the revisit trigger ("Revisit once MessagePack 3.x is available and drop these suppressions at that time"). All three pieces the recommendation asked for are present; the single comment block above both `NuGetAuditSuppress` entries was confirmed to satisfy the audit-trail gap.

### Driver.Historian.Wonderware.Client-009

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/WonderwareHistorianClientTests.cs` |
| Status | Resolved |

**Description:** The suite covers happy paths, server-error, bad-secret, a single
reconnect and health counters, but several critical paths are untested:
(1) `ReadAtTimeAsync` with a partial/reordered sidecar reply, the contract-alignment case
from finding 001 (the existing test only echoes the request);
(2) the `WriteBatchAsync` catch branch, a transport/deserialization throw during a write,
which must return `RetryPlease` for every event;
(3) `InvokeAsync` second-attempt-also-fails path (the test only proves a successful
reconnect, never a reconnect that fails again and propagates);
(4) the `CallTimeout` path, no test asserts that a stalled sidecar produces a timed-out
`OperationCanceledException`;
(5) `MapAggregate` for `HistoryAggregateType.Total` throwing `NotSupportedException`;
(6) the `InvalidDataException` path when the sidecar replies with an unexpected
`MessageKind`. The byte-equality / round-trip parity test the Contracts.cs and Framing.cs
comments repeatedly promise is not present in this test project.

**Recommendation:** Add the missing-edge-case tests above. In particular add the
wire-parity test the source comments commit to: serialize each DTO with the client copy
and assert byte-equality against the sidecar `Driver.Historian.Wonderware.Ipc` copy, so a
silent `[Key]` drift between the two duplicated contract sets is caught at build time.

**Resolution:** Resolved 2026-05-22: added the five remaining missing tests to `WonderwareHistorianClientTests.cs` (WriteBatchAsync transport-drop catch path returns RetryPlease; InvokeAsync both-attempts-fail propagates exception; stalled sidecar fires OperationCanceledException within CallTimeout; ReadProcessedAsync Total aggregate throws NotSupportedException; sidecar wrong-kind reply throws InvalidDataException) and extended `FakeSidecarServer` with `DisconnectBeforeReply`, `ReplyWithWrongKind`, and `StallAfterRequest` test knobs; gap (1) was already covered by the regression test added under finding 001. Added new `ContractsWireParityTests.cs` with 11 tests pinning MessagePack byte layout, round-trip correctness, MessageKind enum values, and Framing constants to catch silent `[Key]` index drift between the client and sidecar mirror copies. Total test count grew from 11 to 27, all passing.

### Driver.Historian.Wonderware.Client-010

| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `WonderwareHistorianClient.cs:355-361`, `WonderwareHistorianClient.cs:132-150` |
| Status | Resolved |

**Description:** Two doc/behaviour mismatches.
(1) The `Dispose()` XML comment asserts the underlying channel async cleanup is
non-blocking so the `GetAwaiter().GetResult()` bridge is safe. `PipeChannel.DisposeAsync`
calls `ResetTransport()`, which invokes synchronous `Stream.Dispose()` on a
`NamedPipeClientStream`; pipe disposal can block briefly on OS handle teardown. The bridge
is safe (no deadlock, no captured context) but not strictly non-blocking; the comment
should say "does not deadlock".
(2) `GetHealthSnapshot` populates both `ProcessConnectionOpen` and `EventConnectionOpen`
from the same `_channel.IsConnected`, and `ActiveProcessNode`/`ActiveEventNode`/`Nodes`
are hard-coded to null/empty. A consumer reading `HistorianHealthSnapshot` would assume
two independent connections and per-node health; this client has a single channel and no
node concept. The collapse is reasonable but undocumented.

**Recommendation:** Reword the `Dispose()` comment to claim only deadlock-safety. Add a
short remark on `GetHealthSnapshot` explaining that the single-channel client maps both
connection flags to one transport and does not track per-node health.

**Resolution:** Resolved 2026-05-23: (1) reworded the `Dispose()` XML comment to drop the "non-blocking" claim and instead state that the bridge is **deadlock-safe** because the cleanup never awaits a captured `SynchronizationContext` nor takes any lock the caller could hold, while acknowledging that `NamedPipeClientStream` teardown can block briefly on OS handle release. (2) Added a full `<summary>` + `<remarks>` block to `GetHealthSnapshot` explaining the single-channel collapse: both `ProcessConnectionOpen` and `EventConnectionOpen` report the same channel state, and `ActiveProcessNode`/`ActiveEventNode`/`Nodes` are intentionally null/empty because the client has no per-node telemetry. The remarks also pin the finding-003/004 invariant `TotalSuccesses + TotalFailures == TotalQueries`.

## Re-review 2026-06-19 (commit 7286d320)

Significant changes since commit `76d35d1`: named-pipe transport retired and replaced with TCP/TLS (`72f32045`, `6e152047`, `fd4d0553`, `35ac0b8c`, `fcf84adb`); `FrameChannel` introduced (replacing `PipeChannel`); `WonderwareHistorianDriverProbe` added; `WonderwareHistorianClientOptions` extracted to a `.Contracts` project; `PerEventStatus` wire field added for granular `PermanentFail` signalling (`feddc2b8`); `Total` aggregate derived client-side from `Average × interval-seconds` (`5e27b5f7`); test project TCP-ified. All 10 prior findings remained Resolved at this commit.

### Checklist coverage

| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No new issues found |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | Driver.Historian.Wonderware.Client-011 |
| 4 | Error handling & resilience | No new issues found |
| 5 | Security | No new issues found; SharedSecret not logged, TLS pin-check preserved |
| 6 | Performance & resource management | Driver.Historian.Wonderware.Client-012 |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Driver.Historian.Wonderware.Client-013 |
| 10 | Documentation & comments | No new issues found |

### Driver.Historian.Wonderware.Client-011

| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `Ipc/FrameWriter.cs:47` |
| Status | Resolved |

**Description:** `FrameWriter.WriteAsync` called `_stream.WriteByte((byte)kind)` synchronously inside the `_gate`-locked section. For a `NetworkStream` (plaintext TCP) this is benign: the kernel send buffer absorbs the byte. For an `SslStream` (TLS path), `Stream.WriteByte` delegates to the synchronous `Write`: it encrypts and hands the ciphertext to the kernel, which can block the calling thread-pool thread if the peer's receive window is exhausted. The `CancellationToken` cannot interrupt a synchronous `WriteByte`; a stuck TLS write could wedge the single-in-flight gate indefinitely. This is the same class of bug as finding 005 (synchronous `ReadByte` on the read path), fixed in `75580fb4`.

**Recommendation:** Fold the kind byte into the 4-byte length-prefix buffer to form a 5-byte header array, then emit it with a single `await _stream.WriteAsync(header, ct)`. This makes every write inside the gate async and cancellable and eliminates the synchronous call entirely, without changing the on-wire layout.

**Resolution:** Resolved 2026-06-19: replaced `_stream.WriteByte((byte)kind)` with a 5-byte header array (`Framing.LengthPrefixSize + Framing.KindByteSize`) containing the big-endian body length and the kind byte together, emitted with a single `await _stream.WriteAsync(header, ct)` call. The TCP round-trip tests `Plaintext_ReturnsConnectedStream_ByteRoundTrips` and `Tls_PinnedThumbprintMatches_ConnectsSuccessfully` and the full `WonderwareHistorianClientTests` suite pass green, confirming the on-wire format is unchanged.
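
To see why folding the kind byte into the header leaves the wire format untouched, compare the bytes produced by the two approaches. The Python sketch below is illustrative; `build_header` and `legacy_bytes` are hypothetical helpers, and the only facts taken from the finding are the 4-byte big-endian length followed by one kind byte.

```python
import struct

LENGTH_PREFIX_SIZE = 4  # big-endian body length
KIND_BYTE_SIZE = 1      # message kind


def build_header(body_length, kind):
    """New shape: one 5-byte buffer, suitable for a single async write."""
    return struct.pack(">IB", body_length, kind)


def legacy_bytes(body_length, kind):
    """Old shape: a 4-byte prefix write followed by a separate 1-byte write."""
    return struct.pack(">I", body_length) + bytes([kind])


header = build_header(258, 3)  # 258 == 0x00000102
```

Both helpers yield the same five bytes, so the change affects only how many write calls are issued, not what the peer receives.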

### Driver.Historian.Wonderware.Client-012

| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | `Internal/FrameChannel.cs:38` |
| Status | Resolved |

**Description:** `FrameChannel.DefaultTcpConnectFactory` was declared as a non-`readonly` mutable `public static` field. Because `FrameChannel` is `internal` but `InternalsVisibleTo` exposes it to the test project, the field could be replaced at runtime by any test that imports the assembly. A test that forgets to restore the field after mutation would silently contaminate subsequent tests in the same process (test-isolation hazard). In production code, any code in the same assembly or the test project can swap in a hostile factory. There is no legitimate reason for the field to be mutable after initialization.

**Recommendation:** Add `readonly` to the field declaration. A `readonly` field can keep its existing initializer (a lambda), so this requires no other change as long as nothing reassigns the field.

**Resolution:** Resolved 2026-06-19: added `readonly` to `DefaultTcpConnectFactory`; the field is now `public static readonly Func<...>`. All 41 tests pass. No consumer assigned to the field.

### Driver.Historian.Wonderware.Client-013

| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ContractsWireParityTests.cs` |
| Status | Resolved |

**Description:** `ContractsWireParityTests.WriteAlarmEventsReply_RoundTrips` covered only the legacy `PerEventOk [Key(3)]` bool array; it did not include the `PerEventStatus [Key(4)]` byte-array field added in commit `feddc2b8`. The whole purpose of `ContractsWireParityTests` is to catch silent `[Key]` index drift between the client and sidecar mirror copies. Without a test pinning `Key(4)`, a future refactor that accidentally shifts `PerEventStatus` to a different key index on either side would go undetected until runtime, causing the `PermanentFail` dead-lettering logic to silently stop working.

**Recommendation:** Add a test that serializes a `WriteAlarmEventsReply` with a non-empty `PerEventStatus` (e.g. `[0, 1, 2]`), asserts the header byte is `0x95` (fixarray of 5 elements confirming both Key(3) and Key(4) are present), and round-trips to verify `PerEventStatus` survives independently of `PerEventOk`.

**Resolution:** Resolved 2026-06-19: added `WriteAlarmEventsReply_PerEventStatus_IsAtKey4_AndRoundTrips` to `ContractsWireParityTests.cs`; asserts the 5-field fixarray header (`0x95`), round-trips `PerEventOk=[true]` and `PerEventStatus=[0,1,2]` independently, and pins `Key(4)` against the `[0,1,2]` byte layout. 41 tests pass (was 40).
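
The `0x95` check works because MessagePack encodes a short array's element count in the low four bits of its first byte (the fixarray format, `0x90` through `0x9f`). The Python sketch below only computes that header byte; `fixarray_header` is a hypothetical helper, and it assumes, as the finding does, that the reply is serialized as an array with one slot per integer key.

```python
def fixarray_header(count):
    """First byte of a MessagePack fixarray holding `count` elements (0..15)."""
    if not 0 <= count <= 15:
        raise ValueError("fixarray holds at most 15 elements")
    return 0x90 | count


four_fields = fixarray_header(4)  # keys 0..3 only: the legacy reply shape
five_fields = fixarray_header(5)  # keys 0..4: the slot for Key(4) is present
```

A reply that lost its fifth slot would start with `0x94` instead of `0x95`, so the header assertion fails before any field-level comparison runs.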