lmxopcua/code-reviews/Driver.Historian.Wonderware.Client/findings.md
Joseph Doherty cd072baad8 review(Driver.Historian.Wonderware.Client): async frame-header write + wire-parity test
Re-review at 7286d320. -011: FrameWriter folded the sync WriteByte (could block on SslStream
past the call timeout) into one async 5-byte header write. -012: DefaultTcpConnectFactory
readonly. -013: wire-parity test for PerEventStatus [Key(4)]. No wire change.
2026-06-19 11:58:15 -04:00

# Code Review — Driver.Historian.Wonderware.Client
| Field | Value |
|---|---|
| Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client` |
| Reviewer | Claude Code |
| Review date | 2026-06-19 |
| Commit reviewed | `7286d320` |
| Status | Reviewed |
| Open findings | 0 |
## Checklist coverage
A comprehensive review completes every category, recording "No issues found" where
a category produced nothing rather than leaving it blank.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Driver.Historian.Wonderware.Client-001, Driver.Historian.Wonderware.Client-002 |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | Driver.Historian.Wonderware.Client-003, Driver.Historian.Wonderware.Client-004 |
| 4 | Error handling & resilience | Driver.Historian.Wonderware.Client-005, Driver.Historian.Wonderware.Client-006 |
| 5 | Security | Driver.Historian.Wonderware.Client-007, Driver.Historian.Wonderware.Client-008 |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Driver.Historian.Wonderware.Client-009 |
| 10 | Documentation & comments | Driver.Historian.Wonderware.Client-010 |
## Findings
### Driver.Historian.Wonderware.Client-001
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `WonderwareHistorianClient.cs:98-113` |
| Status | Resolved |
**Description:** `ReadAtTimeAsync` violates the explicit `IHistorianDataSource.ReadAtTimeAsync`
contract. The interface XML doc states: the returned list MUST be the same length and
order as `timestampsUtc`, and gaps are returned as Bad-quality snapshots. The client passes
`reply.Samples` straight through `ToSnapshots` with no check that the sidecar returned
exactly one sample per requested timestamp, nor that the order matches. If the sidecar
returns fewer/more samples (e.g. it drops boundary-less timestamps), the OPC UA
HistoryReadAtTime service receives a result that the spec-compliant caller expects to
index positionally against the request timestamps, silently misaligning values with
timestamps. The matching `ReadAtTimeAsync_PreservesTimestampOrder` test only passes because
the fake echoes the request verbatim; it never exercises a short/reordered reply.
**Recommendation:** After receiving the reply, reconcile `reply.Samples` against
`timestampsUtc` by timestamp: build the result array at `timestampsUtc.Count`, fill matched
entries, and emit a Bad-quality (`0x80000000`) snapshot for any requested timestamp the
sidecar did not return. Alternatively assert `reply.Samples.Length == timestampsUtc.Count`
and fail loudly. Add a test where the fake returns a partial/reordered sample set.
**Resolution:** Resolved 2026-05-22 — `ReadAtTimeAsync` now reconciles the sidecar reply against the requested timestamps via a new `AlignAtTimeSnapshots` helper: it indexes returned samples by timestamp ticks, builds the result at `timestampsUtc.Count` in request order, and emits a Bad-quality (`0x80000000`) snapshot for any requested timestamp the sidecar did not return; added the `ReadAtTimeAsync_PartialAndReorderedReply_AlignsByTimestamp_AndFillsGapsAsBad` regression test.
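
The alignment described in this resolution can be sketched as follows. This is an illustrative sketch only, not the module's `AlignAtTimeSnapshots`: the `Snapshot` record, the `AtTimeAlignment` class, and the last-sample-wins handling of duplicate timestamps are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the module's snapshot type (not the real DTO).
public readonly record struct Snapshot(DateTime TimestampUtc, object? Value, uint StatusCode);

public static class AtTimeAlignment
{
    public const uint Bad = 0x80000000; // OPC UA Bad status code

    // Returns exactly one snapshot per requested timestamp, in request order.
    public static IReadOnlyList<Snapshot> Align(
        IReadOnlyList<DateTime> timestampsUtc,
        IEnumerable<Snapshot> returned)
    {
        // Index the reply by timestamp ticks. Assumption: the last sample wins on duplicates.
        var byTicks = new Dictionary<long, Snapshot>();
        foreach (var sample in returned)
            byTicks[sample.TimestampUtc.Ticks] = sample;

        var result = new Snapshot[timestampsUtc.Count];
        for (var i = 0; i < timestampsUtc.Count; i++)
        {
            var ts = timestampsUtc[i];
            result[i] = byTicks.TryGetValue(ts.Ticks, out var hit)
                ? hit
                : new Snapshot(ts, null, Bad); // gap: nothing returned for this timestamp
        }
        return result;
    }
}
```

Because the result array is sized from the request rather than the reply, a short, long, or reordered reply cannot shift values against their timestamps.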
### Driver.Historian.Wonderware.Client-002
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `WonderwareHistorianClient.cs:154-199`, `IAlarmHistorianSink.cs:66-74` |
| Status | Resolved |
**Description:** `WriteBatchAsync` can never return `HistorianWriteOutcome.PermanentFail`.
`HistorianWriteOutcome` defines three states (`Ack`, `RetryPlease`, `PermanentFail`) and
the drain worker is documented to move the event to the dead-letter table on
`PermanentFail`. The client maps the sidecar `WriteAlarmEventsReply.PerEventOk` bool array
to only `Ack`/`RetryPlease`, and the whole-call-failure and catch paths also only emit
`RetryPlease`. A malformed alarm event the sidecar can never persist (unrecoverable SDK
error on that specific row) therefore retries forever, blocking the head of the
store-and-forward queue and never dead-lettering. The wire contract
(`WriteAlarmEventsReply`) carries no per-event permanent/transient distinction, so the
limitation is structural.
**Recommendation:** Extend the wire contract: replace `bool[] PerEventOk` with a
per-event status enum (Ack/Retry/Permanent), coordinated as an additive change on both
sidecar and client per the Contracts.cs versioning rules, so unrecoverable events can be
dead-lettered. Until then, document explicitly that this writer never produces
`PermanentFail` and that poison events retry indefinitely.
**Resolution:** Resolved 2026-05-22 — extending the wire contract (replacing `bool[] PerEventOk` with a per-event status enum) requires a coordinated change to the .NET 4.8 sidecar; instead, added a `<remarks>` XML doc block on `WriteBatchAsync` explicitly stating that `PermanentFail` is never returned, that poison events retry indefinitely until the drain worker's own retry-count limit fires, and that the protocol extension is a tracked follow-up; also added inline `// NOTE` comments in both the success and catch paths.
### Driver.Historian.Wonderware.Client-003
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `WonderwareHistorianClient.cs:207`, `WonderwareHistorianClient.cs:132-150` |
| Status | Resolved |
**Description:** `_totalQueries` is mutated with `Interlocked.Increment` in `Invoke`, but
read inside `GetHealthSnapshot` under `_healthLock`, and every other counter
(`_totalSuccesses`, `_totalFailures`, `_consecutiveFailures`) is mutated only under
`_healthLock`. The two synchronization mechanisms do not compose: an `Interlocked`
increment is not ordered against `lock`-protected reads, so a snapshot can observe a
`_totalQueries` value inconsistent with the lock-protected counters. The window is small
and the counters are advisory, but the mixed model is a latent hazard.
**Recommendation:** Pick one mechanism. Simplest: move the `_totalQueries++` into the
`_healthLock` block (a new `RecordQuery()` helper, or fold it into `RecordSuccess`/
`RecordFailure`) so all six health fields share a single lock.
**Resolution:** Resolved 2026-05-23 — replaced the mixed `Interlocked.Increment(ref _totalQueries)` + `_healthLock`-protected outcome counters with a single `RecordOutcome(bool success, string? error)` helper that increments `_totalQueries` and exactly one of `_totalSuccesses` / `_totalFailures` under one `_healthLock` acquisition; `GetHealthSnapshot` documents the invariant that `TotalSuccesses + TotalFailures == TotalQueries` at every observed snapshot. Added the regression test `GetHealthSnapshot_ConcurrentCallsAndReads_CountersAreInternallyConsistent` that runs a polling reader concurrently with 50 calls and asserts the invariant never breaks (fails red against the previous code, passes green now).
### Driver.Historian.Wonderware.Client-004
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `WonderwareHistorianClient.cs:203-267` |
| Status | Resolved |
**Description:** A sidecar-reported failure is recorded in two non-atomic steps under
separate lock acquisitions: `Invoke` calls `RecordSuccess()` (line 211) and then the
caller calls `ThrowIfFailed` which calls `ReclassifySuccessAsFailure()` (line 256),
decrementing `_totalSuccesses` and incrementing `_totalFailures`. Between those two locked
regions a concurrent `GetHealthSnapshot` can observe a transient state where the operation
counts as both a success and not-yet-a-failure (`_totalSuccesses` inflated,
`_consecutiveFailures` still 0). The undo-a-success/record-a-failure dance is also fragile:
if a future change adds an early return or exception between `RecordSuccess` and
`ThrowIfFailed`, the success is never reversed.
**Recommendation:** Classify the call once: do not call `RecordSuccess` until the
sidecar-level `Success` flag has been checked, or pass the reply success/error into a
single `RecordOutcome(bool transportOk, bool sidecarOk, string? error)` that updates all
counters under one lock acquisition.
**Resolution:** Resolved 2026-05-23 — eliminated the `RecordSuccess` → `ReclassifySuccessAsFailure` undo dance. `InvokeAsync` now takes a `Func<TReply, (bool ok, string? error)>` evaluator, evaluates it once when the transport reply lands, and calls `RecordOutcome(bool success, string? error)` exactly once per call under a single `_healthLock` acquisition. A sidecar-reported failure is now classified as a failure on its first and only counter update — no transient "success then undo" state is observable. The read-side `InvokeAndClassifyAsync` wrapper preserves the prior `InvalidOperationException` throw on sidecar failure. Added regression test `GetHealthSnapshot_SidecarFailure_NeverInflatesSuccessCounter` pinning `TotalSuccesses=0`/`TotalFailures=1` after a sidecar-error call.
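
The single-lock classification that resolves findings 003 and 004 might look roughly like the sketch below. The `HealthCounters` class, its `Snapshot` tuple shape, and the reset of the consecutive-failure counter on success are assumptions for illustration; only the `RecordOutcome(bool success, string? error)` signature and the `_healthLock` name come from the resolutions above.

```csharp
using System;

// Hypothetical container; the real client keeps these fields inline.
public sealed class HealthCounters
{
    private readonly object _healthLock = new();
    private long _totalQueries, _totalSuccesses, _totalFailures;
    private int _consecutiveFailures;
    private string? _lastError;

    // Called exactly once per call, after both the transport result and the
    // sidecar-level Success flag are known.
    public void RecordOutcome(bool success, string? error)
    {
        lock (_healthLock)
        {
            _totalQueries++;
            if (success)
            {
                _totalSuccesses++;
                _consecutiveFailures = 0; // assumption: a success resets the streak
            }
            else
            {
                _totalFailures++;
                _consecutiveFailures++;
                _lastError = error;
            }
        }
    }

    // All fields are read under the same lock, so Successes + Failures == Queries
    // holds in every snapshot.
    public (long Queries, long Successes, long Failures, int ConsecutiveFailures, string? LastError) Snapshot()
    {
        lock (_healthLock)
            return (_totalQueries, _totalSuccesses, _totalFailures, _consecutiveFailures, _lastError);
    }
}
```

Since every counter changes inside one critical section, a concurrent reader can never see a call counted as a query but not yet as a success or failure.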
### Driver.Historian.Wonderware.Client-005
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `Ipc/FrameReader.cs:31-32` |
| Status | Resolved |
**Description:** After reading the 4-byte length prefix, `ReadFrameAsync` reads the kind
byte with the synchronous, blocking `_stream.ReadByte()` and ignores the
`CancellationToken`. On a `NamedPipeClientStream` with `PipeOptions.Asynchronous`, a
synchronous `ReadByte()` blocks the calling thread until a byte arrives or the pipe
closes. If the sidecar sends a length prefix and then stalls (slow/hung peer), the call
hangs on a thread-pool thread and the `EffectiveCallTimeout` linked token in
`PipeChannel.InvokeAsync` cannot interrupt it, because cancellation is only observed by
async operations that accept the token. This defeats the documented cap on a single read/write call once connected and can
wedge the single-in-flight call gate.
**Recommendation:** Read the kind byte asynchronously and cancellably: extend the length
prefix read to 5 bytes, or do a second `ReadExactAsync(new byte[1], ct)`. This makes the
whole frame read honor the call-timeout token and matches the async style of the rest of
the reader.
**Resolution:** Resolved 2026-05-22 — replaced the synchronous, non-cancellable `_stream.ReadByte()` for the kind byte with an async `ReadExactAsync(new byte[1], ct)` call so the full frame read honours the call-timeout token and cannot wedge the channel on a stalled peer.
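
For reference, a cancellable exact-length read can be sketched as below. This follows the first option in the recommendation (one 5-byte header read) rather than the second `ReadExactAsync(new byte[1], ct)` call the resolution describes. The `FrameHeaderReader` class and its method shapes are hypothetical, and the header layout (4-byte big-endian body length followed by the kind byte) is taken from the finding 011 resolution.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class FrameHeaderReader
{
    // Fills the buffer completely or throws; every underlying read observes the token.
    public static async Task ReadExactAsync(Stream stream, Memory<byte> buffer, CancellationToken ct)
    {
        var read = 0;
        while (read < buffer.Length)
        {
            var n = await stream.ReadAsync(buffer.Slice(read), ct).ConfigureAwait(false);
            if (n == 0) throw new EndOfStreamException("Peer closed the stream mid-frame.");
            read += n;
        }
    }

    // Reads the length prefix and the kind byte in one cancellable operation.
    public static async Task<(int BodyLength, byte Kind)> ReadHeaderAsync(Stream stream, CancellationToken ct)
    {
        var header = new byte[5];
        await ReadExactAsync(stream, header, ct).ConfigureAwait(false);
        return Parse(header);
    }

    // Kept synchronous so span handling stays out of the async method.
    private static (int BodyLength, byte Kind) Parse(byte[] header) =>
        (BinaryPrimitives.ReadInt32BigEndian(header.AsSpan(0, 4)), header[4]);
}
```

With no synchronous call left on the path, a stalled peer surfaces as an `OperationCanceledException` when the call-timeout token fires, instead of a blocked thread.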
### Driver.Historian.Wonderware.Client-006
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `Internal/PipeChannel.cs:96-107`, `WonderwareHistorianClientOptions.cs:11-12` |
| Status | Resolved |
**Description:** `PipeChannel.InvokeAsync` retries exactly once on transport failure and
otherwise propagates. The options expose `ReconnectInitialBackoff` and
`ReconnectMaxBackoff` and `WonderwareHistorianClientOptions` documents them as exponential
backoff between reconnects, but neither field is referenced anywhere in the module: the
single retry reconnects immediately with no delay. A sidecar that is restarting will
reject or refuse the immediate reconnect, the call fails, and there is no backoff before
the next caller-driven attempt. Either the backoff belongs in the channel and is missing,
or the options are dead config that misleads operators.
**Recommendation:** Either implement the documented exponential backoff in the reconnect
path, or remove the two unused option fields and their XML docs and state plainly that
retry/backoff is owned by the caller (the alarm drain worker / history router).
**Resolution:** Resolved 2026-05-23 — removed the dead `ReconnectInitialBackoff`/`ReconnectMaxBackoff` fields (and their `Effective*` accessors) from `WonderwareHistorianClientOptions` and added a `<remarks>` block stating that retry/backoff is owned by the caller (the alarm drain worker and the read-side history router) and that the channel itself performs exactly one in-place reconnect with no delay. Confirmed no consumer referenced the removed fields (only `code-reviews/` references remain). Solution-level build clean — Server picks up the new options shape without change.
### Driver.Historian.Wonderware.Client-007
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Security |
| Location | `WonderwareHistorianClient.cs:276` |
| Status | Resolved |
**Description:** `ToSnapshots` deserializes peer-supplied bytes with
`MessagePackSerializer.Deserialize<object>(dto.ValueBytes)`, typeless MessagePack
deserialization. The `object` overload resolves runtime types from the wire payload. The
client treats the pipe peer as untrusted elsewhere (16 MiB frame cap stated to protect
the receiver from a hostile or buggy peer, shared-secret Hello). Typeless deserialization
of bytes that originate from the historian database widens the trust surface. The
MessagePack standard resolver is primitive-only by default so the practical blast radius
is limited, but this is the pattern called out by the two suppressed MessagePack
advisories on this project (see finding 008).
**Recommendation:** Confirm the serializer options here use the default (non-typeless)
resolver and that no `TypelessContractlessStandardResolver` is in play; if so, document
that. Prefer round-tripping the value as a constrained set of known primitive types rather
than `object`, and validate `ValueBytes.Length` against a sane per-sample cap before
deserializing.
**Resolution:** Resolved 2026-05-22 — added `DeserializeSampleValue()` helper that enforces a 64 KiB per-sample `ValueBytes` cap before deserialization and documents that the default `StandardResolver` (primitive-only, no `TypelessContractlessStandardResolver`) is in use; both `ToSnapshots` and `AlignAtTimeSnapshots` now route through the helper; added inline XML comments to the two `NuGetAuditSuppress` entries in the csproj stating the advisory title, why it does not apply to this usage, and the revisit trigger.
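
The per-sample cap can be illustrated with the sketch below. It is not the module's `DeserializeSampleValue`: the `SampleValueGuard` class is hypothetical, the `deserialize` delegate stands in for the real MessagePack call so the example has no package dependency, and returning `null` for empty input and throwing `InvalidDataException` on oversize input are assumptions.

```csharp
using System;
using System.IO;

public static class SampleValueGuard
{
    public const int MaxValueBytes = 64 * 1024; // 64 KiB per-sample cap from the resolution

    // Rejects oversized payloads before any deserializer touches them.
    public static object? DeserializeCapped(byte[]? valueBytes, Func<byte[], object?> deserialize)
    {
        // Assumption: a missing or empty payload maps to a null value.
        if (valueBytes is null || valueBytes.Length == 0) return null;

        if (valueBytes.Length > MaxValueBytes)
            throw new InvalidDataException(
                $"Sample ValueBytes is {valueBytes.Length} bytes; the cap is {MaxValueBytes}.");

        return deserialize(valueBytes);
    }
}
```

Checking the length first bounds the work a hostile or buggy peer can force on the deserializer for any single sample.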
### Driver.Historian.Wonderware.Client-008
| Field | Value |
|---|---|
| Severity | Low |
| Category | Security |
| Location | `ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.csproj:29-32` |
| Status | Resolved |
**Description:** The csproj suppresses two NuGet audit advisories
(`GHSA-37gx-xxp4-5rgx`, `GHSA-w3x6-4m5h-cxqf`) for the `MessagePack` 2.5.187 dependency
with no inline comment recording why the suppression is safe, who reviewed it, or when it
should be revisited. Blanket `NuGetAuditSuppress` entries silence the very signal that
would flag the next related CVE. Combined with finding 007 (typeless deserialization), an
unexplained MessagePack advisory suppression is a maintainability and audit-trail gap.
**Recommendation:** Add an XML comment next to each `NuGetAuditSuppress` stating the
advisory title, why it does not apply to this module usage, and a revisit trigger. Track a
follow-up to upgrade `MessagePack` once a patched version is available so the suppressions
can be dropped.
**Resolution:** Resolved 2026-05-23 — the suppression block in the csproj (already added under finding 007) records each advisory title (GHSA-37gx-xxp4-5rgx unsafe-dynamic-codegen, GHSA-w3x6-4m5h-cxqf typeless-resolver gadget chain), why neither applies to this module (default `StandardResolver` only, no `TypelessContractlessStandardResolver` / `DynamicUnion` / `DynamicGenericResolver`, plus the 64 KiB per-sample ValueBytes cap in `DeserializeSampleValue` from finding 007), and the revisit trigger ("Revisit once MessagePack 3.x is available and drop these suppressions at that time"). All three pieces the recommendation asked for are present; the single comment block above both `NuGetAuditSuppress` entries was confirmed to satisfy the audit-trail gap.
### Driver.Historian.Wonderware.Client-009
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/WonderwareHistorianClientTests.cs` |
| Status | Resolved |
**Description:** The suite covers happy paths, server-error, bad-secret, a single
reconnect and health counters, but several critical paths are untested:
(1) `ReadAtTimeAsync` with a partial/reordered sidecar reply, the contract-alignment case
from finding 001 (the existing test only echoes the request);
(2) the `WriteBatchAsync` catch branch, a transport/deserialization throw during a write,
which must return `RetryPlease` for every event;
(3) `InvokeAsync` second-attempt-also-fails path (the test only proves a successful
reconnect, never a reconnect that fails again and propagates);
(4) the `CallTimeout` path, no test asserts that a stalled sidecar produces a timed-out
`OperationCanceledException`;
(5) `MapAggregate` for `HistoryAggregateType.Total` throwing `NotSupportedException`;
(6) the `InvalidDataException` path when the sidecar replies with an unexpected
`MessageKind`. The byte-equality / round-trip parity test the Contracts.cs and Framing.cs
comments repeatedly promise is not present in this test project.
**Recommendation:** Add the missing-edge-case tests above. In particular add the
wire-parity test the source comments commit to: serialize each DTO with the client copy
and assert byte-equality against the sidecar `Driver.Historian.Wonderware.Ipc` copy, so a
silent `[Key]` drift between the two duplicated contract sets is caught at build time.
**Resolution:** Resolved 2026-05-22 — added the five missing tests to `WonderwareHistorianClientTests.cs` (WriteBatchAsync transport-drop catch path returns RetryPlease; InvokeAsync both-attempts-fail propagates exception; stalled sidecar fires OperationCanceledException within CallTimeout; ReadProcessedAsync Total aggregate throws NotSupportedException; sidecar wrong-kind reply throws InvalidDataException), with case (1) already covered by the finding 001 regression test, and extended `FakeSidecarServer` with `DisconnectBeforeReply`, `ReplyWithWrongKind`, and `StallAfterRequest` test knobs; added new `ContractsWireParityTests.cs` with 11 tests pinning MessagePack byte layout, round-trip correctness, MessageKind enum values, and Framing constants to catch silent `[Key]` index drift between the client and sidecar mirror copies. Total test count grew from 11 to 27, all passing.
### Driver.Historian.Wonderware.Client-010
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `WonderwareHistorianClient.cs:355-361`, `WonderwareHistorianClient.cs:132-150` |
| Status | Resolved |
**Description:** Two doc/behaviour mismatches.
(1) The `Dispose()` XML comment asserts the underlying channel async cleanup is
non-blocking so the `GetAwaiter().GetResult()` bridge is safe. `PipeChannel.DisposeAsync`
calls `ResetTransport()`, which invokes synchronous `Stream.Dispose()` on a
`NamedPipeClientStream`; pipe disposal can block briefly on OS handle teardown. The bridge
is safe (no deadlock, no captured context) but not strictly non-blocking; the comment
should say "does not deadlock".
(2) `GetHealthSnapshot` populates both `ProcessConnectionOpen` and `EventConnectionOpen`
from the same `_channel.IsConnected`, and `ActiveProcessNode`/`ActiveEventNode`/`Nodes`
are hard-coded to null/empty. A consumer reading `HistorianHealthSnapshot` would assume
two independent connections and per-node health; this client has a single channel and no
node concept. The collapse is reasonable but undocumented.
**Recommendation:** Reword the `Dispose()` comment to claim only deadlock-safety. Add a
short remark on `GetHealthSnapshot` explaining that the single-channel client maps both
connection flags to one transport and does not track per-node health.
**Resolution:** Resolved 2026-05-23 — (1) reworded the `Dispose()` XML comment to drop the "non-blocking" claim and instead state that the bridge is **deadlock-safe** because the cleanup never awaits a captured `SynchronizationContext` nor takes any lock the caller could hold, while acknowledging that `NamedPipeClientStream` teardown can block briefly on OS handle release. (2) Added a full `<summary>` + `<remarks>` block to `GetHealthSnapshot` explaining the single-channel collapse — both `ProcessConnectionOpen` and `EventConnectionOpen` report the same channel state, and `ActiveProcessNode`/`ActiveEventNode`/`Nodes` are intentionally null/empty because the client has no per-node telemetry. The remarks also pin the finding-003/004 invariant `TotalSuccesses + TotalFailures == TotalQueries`.
## Re-review 2026-06-19 (commit 7286d320)
Significant changes since commit `76d35d1`: named-pipe transport retired and replaced with TCP/TLS (`72f32045`, `6e152047`, `fd4d0553`, `35ac0b8c`, `fcf84adb`); `FrameChannel` introduced (replacing `PipeChannel`); `WonderwareHistorianDriverProbe` added; `WonderwareHistorianClientOptions` extracted to a `.Contracts` project; `PerEventStatus` wire field added for granular `PermanentFail` signalling (`feddc2b8`); `Total` aggregate derived client-side from `Average × interval-seconds` (`5e27b5f7`); test project TCP-ified. All 10 prior findings remained Resolved at this commit.
### Checklist coverage
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No new issues found |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | Driver.Historian.Wonderware.Client-011 |
| 4 | Error handling & resilience | No new issues found |
| 5 | Security | No new issues found — SharedSecret not logged; TLS pin-check preserved |
| 6 | Performance & resource management | Driver.Historian.Wonderware.Client-012 |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Driver.Historian.Wonderware.Client-013 |
| 10 | Documentation & comments | No new issues found |
### Driver.Historian.Wonderware.Client-011
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `Ipc/FrameWriter.cs:47` |
| Status | Resolved |
**Description:** `FrameWriter.WriteAsync` called `_stream.WriteByte((byte)kind)` synchronously inside the `_gate`-locked section. For a `NetworkStream` (plaintext TCP) this is benign — the kernel send buffer absorbs the byte. For a `SslStream` (TLS path), `Stream.WriteByte` delegates to `Write(byte[])` synchronously: it encrypts and hands the ciphertext to the kernel, which CAN block the calling thread-pool thread if the peer's receive window is exhausted. The `CancellationToken` cannot interrupt a synchronous `WriteByte`; a stuck TLS write could wedge the single-in-flight gate indefinitely. This is the same class of bug as finding 005 (synchronous `ReadByte` on the read path), fixed in `75580fb4`.
**Recommendation:** Fold the kind byte into the 4-byte length-prefix buffer to form a 5-byte header array, then emit it with a single `await _stream.WriteAsync(header, ct)`. This makes every write inside the gate async+cancellable and eliminates the synchronous call entirely, without changing the on-wire layout.
**Resolution:** Resolved 2026-06-19 — replaced `_stream.WriteByte((byte)kind)` with a 5-byte header array (`Framing.LengthPrefixSize + Framing.KindByteSize`) containing the big-endian body length and the kind byte together, emitted with a single `await _stream.WriteAsync(header, ct)` call; the two TCP round-trip tests (`Plaintext_ReturnsConnectedStream_ByteRoundTrips`, `Tls_PinnedThumbprintMatches_ConnectsSuccessfully`) and the full `WonderwareHistorianClientTests` suite pass green, confirming the on-wire format is unchanged.
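
A minimal sketch of the folded header write is shown below, assuming the layout stated in this resolution (4-byte big-endian body length, then one kind byte). The `FrameHeaderWriter` class and `BuildHeader` helper are hypothetical names, the constants only mirror `Framing.LengthPrefixSize` and `Framing.KindByteSize`, and the `_gate` locking of the real `FrameWriter` is left out.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class FrameHeaderWriter
{
    public const int LengthPrefixSize = 4; // mirrors Framing.LengthPrefixSize
    public const int KindByteSize = 1;     // mirrors Framing.KindByteSize

    // Builds the 5-byte header: big-endian body length followed by the kind byte.
    public static byte[] BuildHeader(int bodyLength, byte kind)
    {
        var header = new byte[LengthPrefixSize + KindByteSize];
        BinaryPrimitives.WriteInt32BigEndian(header, bodyLength); // fills bytes 0..3
        header[LengthPrefixSize] = kind;
        return header;
    }

    // Every write is async and observes the token; no synchronous WriteByte remains.
    public static async Task WriteFrameAsync(
        Stream stream, byte kind, ReadOnlyMemory<byte> body, CancellationToken ct)
    {
        var header = BuildHeader(body.Length, kind);
        await stream.WriteAsync(header, ct).ConfigureAwait(false);
        await stream.WriteAsync(body, ct).ConfigureAwait(false);
    }
}
```

The bytes on the wire are the same as with separate length and kind writes; only the number of write calls and their cancellability change.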
### Driver.Historian.Wonderware.Client-012
| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | `Internal/FrameChannel.cs:38` |
| Status | Resolved |
**Description:** `FrameChannel.DefaultTcpConnectFactory` was declared as a non-`readonly` mutable `public static` field. Because `FrameChannel` is `internal` but `InternalsVisibleTo` exposes it to the test project, the field could be replaced at runtime by any test that imports the assembly. A test that forgets to restore the field after mutation would silently contaminate subsequent tests in the same process (test-isolation hazard). In production code, any code in the same assembly or the test project can swap in a hostile factory. There is no legitimate reason for the field to be mutable after initialization.
**Recommendation:** Add `readonly` to the field declaration. The field is assigned only in its initializer (a lambda), which `readonly` permits, so this requires no other change.
**Resolution:** Resolved 2026-06-19 — added `readonly` to `DefaultTcpConnectFactory`; the field is now `public static readonly Func<...>`. All 41 tests pass. No consumer assigned to the field.
### Driver.Historian.Wonderware.Client-013
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ContractsWireParityTests.cs` |
| Status | Resolved |
**Description:** `ContractsWireParityTests.WriteAlarmEventsReply_RoundTrips` covered only the legacy `PerEventOk [Key(3)]` bool array; it did not include the `PerEventStatus [Key(4)]` byte-array field added in commit `feddc2b8`. The whole purpose of `ContractsWireParityTests` is to catch silent `[Key]` index drift between the client and sidecar mirror copies. Without a test pinning `Key(4)`, a future refactor that accidentally shifts `PerEventStatus` to a different key index on either side would go undetected until runtime, causing the `PermanentFail` dead-lettering logic to silently stop working.
**Recommendation:** Add a test that serializes a `WriteAlarmEventsReply` with a non-empty `PerEventStatus` (e.g. `[0, 1, 2]`), asserts the header byte is `0x95` (fixarray of 5 elements confirming both Key(3) and Key(4) are present), and round-trips to verify `PerEventStatus` survives independently of `PerEventOk`.
**Resolution:** Resolved 2026-06-19 — added `WriteAlarmEventsReply_PerEventStatus_IsAtKey4_AndRoundTrips` to `ContractsWireParityTests.cs`; asserts the 5-field fixarray header (`0x95`), round-trips `PerEventOk=[true]` and `PerEventStatus=[0,1,2]` independently, and pins `Key(4)` against the `[0,1,2]` byte layout. 41 tests pass (was 40).
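
The `0x95` assertion rests on two facts: MessagePack encodes an array of up to 15 elements as a single `fixarray` byte `0x90 | count`, and MessagePack-CSharp serializes an integer-keyed `[Key(n)]` contract as an array whose length is the highest key plus one. The sketch below decodes that header byte by hand; the `MsgPackHeader` class and its method names are hypothetical and are not part of the test project.

```csharp
using System;

public static class MsgPackHeader
{
    // MessagePack fixarray: high nibble 0x9, low nibble is the element count (0..15).
    public static bool TryGetFixArrayCount(byte first, out int count)
    {
        count = first & 0x0F;
        return (first & 0xF0) == 0x90;
    }

    // Header byte expected for an integer-keyed contract whose highest key is maxKey.
    public static byte ExpectedHeaderForMaxKey(int maxKey)
    {
        if (maxKey < 0 || maxKey > 14)
            throw new ArgumentOutOfRangeException(nameof(maxKey), "fixarray holds at most 15 elements.");
        return (byte)(0x90 | (maxKey + 1));
    }
}
```

For a contract whose highest key is `Key(4)`, `ExpectedHeaderForMaxKey(4)` yields `0x95`. Dropping the highest key on one side would change the header to `0x94`, which a header assertion catches without inspecting the payload.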