Files
lmxopcua/code-reviews/Driver.AbLegacy/findings.md
Joseph Doherty d89be2a011 fix(driver-ablegacy): resolve High code-review findings (Driver.AbLegacy-001, Driver.AbLegacy-006)
Driver.AbLegacy-001 — PCCC bit-index range. AbLegacyAddress.TryParse
accepted a bit index of 0..31 for every file type, but a 16-bit
N/B/I/O/S/A word only has bits 0..15. TryParse now range-checks the
bit index against the file's word width (0..15 for 16-bit element
files, 0..31 for the 32-bit L file, no bits on float files), so
addresses like N7:0/20 are rejected at parse time instead of silently
truncating in the (short) cast. WriteBitInWordAsync reads and writes
an L-file parent word as 32-bit Long and masks the RMW arithmetic to
the native width, so a sign-extended 16-bit decode can no longer
corrupt the high bits.

Driver.AbLegacy-006 — shared-runtime concurrency. A per-tag libplctag
Tag handle is cached and reused by both the server read path and the
poll loop, with no synchronisation around Read/GetStatus/DecodeValue.
Added a per-runtime SemaphoreSlim (DeviceState.GetRuntimeLock, keyed
by tag name); ReadAsync and WriteAsync now hold it across the whole
Read -> GetStatus -> Decode / Encode -> Write -> GetStatus sequence so
no two threads touch the same Tag handle concurrently.

Added xUnit + Shouldly regression coverage: AbLegacyBitIndexRangeTests
(per-file bit-range validation + L-file 32-bit RMW + sign-extension
safety) and AbLegacyRuntimeConcurrencyTests (overlap-detecting fake
proving concurrent read/read and read/write are serialised).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 06:41:26 -04:00

17 KiB

Code Review - Driver.AbLegacy

Field Value
Module src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy
Reviewer Claude Code
Review date 2026-05-22
Commit reviewed 76d35d1
Status Reviewed
Open findings 11

Checklist coverage

A comprehensive review completes every category, recording "No issues found" where a category produced nothing rather than leaving it blank.

# Category Result
1 Correctness & logic bugs Driver.AbLegacy-001, Driver.AbLegacy-002, Driver.AbLegacy-003, Driver.AbLegacy-004
2 OtOpcUa conventions Driver.AbLegacy-005
3 Concurrency & thread safety Driver.AbLegacy-006, Driver.AbLegacy-007, Driver.AbLegacy-008
4 Error handling & resilience Driver.AbLegacy-009, Driver.AbLegacy-010
5 Security No issues found
6 Performance & resource management Driver.AbLegacy-011
7 Design-document adherence Driver.AbLegacy-012
8 Code organization & conventions Driver.AbLegacy-013
9 Testing coverage No issues found
10 Documentation & comments No issues found

Findings

Driver.AbLegacy-001

Field Value
Severity High
Category Correctness & logic bugs
Location AbLegacyAddress.cs:54, AbLegacyDriver.cs:368-374
Status Resolved

Description: AbLegacyAddress.TryParse accepts a BitIndex of 0..31 for every file type. A PCCC N-file word is a signed 16-bit integer, so valid bit indices are 0..15. When a tag is Bit-typed against an N-file with a bit suffix of 16..31 (e.g. N7:0/20), WriteBitInWordAsync reads the parent as AbLegacyDataType.Int (16-bit), then computes current | (1 << bit) / current & ~(1 << bit) with bit up to 31. 1 << 20 produces a value outside the 16-bit range, the result is cast (short)updated, and the high bits are silently truncated - the wrong bit (or no bit) is written and no error is surfaced. The mask arithmetic is also done on a sign-extended int. For L-file (32-bit) bits the parent is still read as Int (16-bit), so bits 16..31 of a long can never be addressed correctly.

Recommendation: Validate BitIndex against the parent word width during parse or in WriteBitInWordAsync - reject bit > 15 for N/B/I/O/S files and bit > 31 for L files. For bit-in-word RMW against L files, read the parent as Long. Mask the read-back value to the word width before applying the bit operation.

Resolution: Resolved 2026-05-22 — AbLegacyAddress.TryParse now range-checks the bit index against per-file word width (0..15 for N/B/I/O/S/A, 0..31 for L, no bits on F); WriteBitInWordAsync reads/writes an L-file parent as 32-bit Long and masks the RMW arithmetic to the native width so sign-extension can no longer corrupt high bits.

Driver.AbLegacy-002

Field Value
Severity Medium
Category Correctness & logic bugs
Location AbLegacyDriver.cs:368
Status Open

Description: In WriteBitInWordAsync the parent word is decoded with Convert.ToInt32(parentRuntime.DecodeValue(AbLegacyDataType.Int, ...)). LibplctagLegacyTagRuntime.DecodeValue for AbLegacyDataType.Int returns (int)_tag.GetInt16(0) - a sign-extended int. When the current word has its high bit set (value 0x8000..0xFFFF, decoded as a negative int), the subsequent (short)updated cast re-encodes the low 16 bits correctly, but current | (1 << bit) is performed on the sign-extended value. The result is bit-correct for the low 16 bits only because the cast preserves them; any future change to widen the mask range will break silently. Combined with finding 001 this is a latent correctness hazard.

Recommendation: Mask current to current & 0xFFFF before the bit operation and operate on an explicitly 16-bit value, or document the reliance on low-16-bit preservation explicitly.

Resolution: (open)

Driver.AbLegacy-003

Field Value
Severity Medium
Category Correctness & logic bugs
Location AbLegacyAddress.cs:62-95
Status Open

Description: TryParse does not reject several malformed PCCC addresses that the XML docs imply are invalid:

  • A sub-element and a bit index together (T4:0.ACC/2) parse successfully even though no PCCC element supports both.
  • I/O/S files with a file number (I3:0, S2:1) parse successfully - I/O and S are single-letter files with no file number per the doc comment, but the parser only requires "letter then optional digits".
  • B-file addresses with a sub-element (B3:0.DN) parse successfully. ToLibplctagName() re-emits whatever was parsed, so a malformed address is passed through to libplctag rather than rejected early with a clear error.

Recommendation: Tighten the parser: reject sub-element + bit-index combinations, reject file numbers on I/O/S, and restrict which file letters may carry a sub-element (T/C/R only). Add unit coverage for the rejection cases.

Resolution: (open)

Driver.AbLegacy-004

Field Value
Severity Medium
Category Correctness & logic bugs
Location LibplctagLegacyTagRuntime.cs:36-37
Status Open

Description: DecodeValue for AbLegacyDataType.Bit with bitIndex == null returns _tag.GetInt8(0) != 0. A bit-file element (B3:0/0) is a single bit inside a 16-bit word; reading only the low byte (GetInt8(0)) means a Bit tag whose live bit sits in bits 8..15 of the word, or a B-file element addressed without an explicit bit suffix, decodes incorrectly. The driver passes parsed.ToLibplctagName() which preserves the /bit suffix, so libplctag resolves the bit when a suffix is present - but a Bit-typed tag configured with an address that has no /bit suffix (e.g. B3:0) silently decodes the wrong thing.

Recommendation: For Bit with no bitIndex, decide explicitly: either require a bit suffix on Bit-typed tags (validate in CreateInstance/DiscoverAsync) or decode the full 16-bit word and test bit 0.

Resolution: (open)

Driver.AbLegacy-005

Field Value
Severity Low
Category OtOpcUa conventions
Location AbLegacyDriver.cs (whole file)
Status Open

Description: The driver uses no ILogger/Serilog at all. Probe-loop failures, runtime initialisation failures, libplctag non-zero statuses, and read/write exceptions are folded into DriverHealth.Detail strings but never logged. CLAUDE.md names Serilog with a rolling daily file sink as the logging library. The complete absence of structured logging makes field diagnosis of a PCCC comms problem (timeout vs route failure vs wrong PLC family) rely entirely on a single overwritten Detail string that the next read or write immediately clobbers.

Recommendation: Inject ILogger<AbLegacyDriver> (optional, like tagFactory) and log probe transitions, runtime-init failures, and the first occurrence of a non-zero libplctag status per device.

Resolution: (open)

Driver.AbLegacy-006

Field Value
Severity High
Category Concurrency & thread safety
Location AbLegacyDriver.cs:107-158, AbLegacyDriver.cs:162-234, LibplctagLegacyTagRuntime.cs
Status Resolved

Description: A per-tag IAbLegacyTagRuntime (wrapping a single libplctag Tag) is cached in DeviceState.Runtimes and reused. ReadAsync (called directly by the server read path) and the PollGroupEngine poll loop (which also calls ReadAsync via the reader delegate) can run concurrently, and two poll subscriptions covering the same tag run on independent background tasks. All of them call EnsureTagRuntimeAsync to the same Tag instance and call runtime.ReadAsync / GetStatus / DecodeValue with no synchronisation. A libplctag Tag is not safe for concurrent operations on the same handle: an interleaved Read/GetStatus/DecodeValue from two threads can read a value mid-update or observe a status that belongs to the other operation. WriteAsync shares the same runtime dictionary and compounds the hazard. Only the bit-in-word RMW path is serialised (per-parent SemaphoreSlim).

Recommendation: Serialise all operations against a given runtime - a per-runtime SemaphoreSlim, or a per-device read lock - so no two threads touch the same Tag handle concurrently.

Resolution: Resolved 2026-05-22 — added a per-runtime SemaphoreSlim (DeviceState.GetRuntimeLock, keyed by tag name); ReadAsync and WriteAsync now hold it around the whole Read→GetStatus→Decode / Encode→Write→GetStatus sequence so the shared libplctag Tag handle is never touched by two threads at once.

Driver.AbLegacy-007

Field Value
Severity Medium
Category Concurrency & thread safety
Location AbLegacyDriver.cs:411-438, AbLegacyDriver.cs:386-409
Status Open

Description: EnsureTagRuntimeAsync and EnsureParentRuntimeAsync are check-then-act: device.Runtimes.TryGetValue(...) then, after await runtime.InitializeAsync, device.Runtimes[def.Name] = runtime. Dictionary is not thread-safe, and two concurrent callers for the same tag (read + poll, or two poll loops) both miss the lookup, both Create + InitializeAsync a runtime, and both write the dictionary. One runtime is overwritten and leaked - DisposeRuntimes only disposes what is currently in the dict - and concurrent Dictionary writes can corrupt internal state. ParentRuntimes has the identical pattern.

Recommendation: Replace the runtime caches with ConcurrentDictionary and use GetOrAdd, or guard runtime creation under a per-device lock. Ensure the losing runtime of any race is disposed.

Resolution: (open)

Driver.AbLegacy-008

Field Value
Severity Medium
Category Concurrency & thread safety
Location AbLegacyDriver.cs:21, AbLegacyDriver.cs:138-146, AbLegacyDriver.cs:216-229
Status Open

Description: _health is a plain non-volatile reference field mutated from ReadAsync, WriteAsync (both can run on multiple threads / poll loops) and InitializeAsync/ShutdownAsync, and read by GetHealth() from yet another thread. There is no lock, no volatile, and no Interlocked exchange. The record reference assignment is atomic, but without a memory barrier a reader can observe a stale _health indefinitely, and concurrent writers race so a Healthy write from one successful read can clobber a Degraded write from a concurrent failing read. GetHealth() may therefore report Healthy while reads are persistently failing.

Recommendation: Mark _health volatile, or funnel health transitions through a lock / Interlocked.Exchange. Consider only downgrading on failure and upgrading on a successful poll so a single failed read does not flap the surface.

Resolution: (open)

Driver.AbLegacy-009

Field Value
Severity Medium
Category Error handling & resilience
Location AbLegacyDriver.cs:41-74
Status Open

Description: InitializeAsync starts probe loops with Task.Run inside the try block. If InitializeAsync fails - or is re-entered - after some probe loops are already started, the catch only sets _health = Faulted and rethrows; it does not cancel state.ProbeCts, dispose runtimes, or clear _devices. A caller that catches the exception and retries via ReinitializeAsync is covered (it calls ShutdownAsync first), but a caller that catches and abandons the driver leaves orphaned probe tasks and CancellationTokenSources alive holding libplctag handles. Separately, ProbeLoopAsync never escalates a permanently-unreachable device beyond Stopped.

Recommendation: On the catch path in InitializeAsync, run the same teardown as ShutdownAsync (cancel probe CTSs, dispose runtimes, clear dictionaries) before rethrowing, so a failed initialise leaves no live background work.

Resolution: (open)

Driver.AbLegacy-010

Field Value
Severity Medium
Category Error handling & resilience
Location AbLegacyStatusMapper.cs:26-56
Status Open

Description: MapLibplctagStatus maps the integer codes -5/-7/-14/-16/-17. These do not match the native libplctag PLCTAG_ERR_* constants (PLCTAG_ERR_TIMEOUT = -32, PLCTAG_ERR_NOT_FOUND = -22, PLCTAG_ERR_NOT_ALLOWED = -21, PLCTAG_ERR_OUT_OF_BOUNDS = -25, PLCTAG_ERR_BAD_CONNECTION = -8). The mapper operates on (int)_tag.GetStatus(), where GetStatus() returns the libplctag .NET wrapper Status enum whose underlying ordinals differ from the native codes - so the -5/-7/... values are at best the .NET enum ordinals (unverified, undocumented) and at worst wrong. Any unmatched negative status falls through to BadCommunicationError, so a timeout is reported as a generic comms error rather than BadTimeout. MapPcccStatus is dead code - the PCCC STS byte is never inspected because libplctag surfaces only its own status enum.

Recommendation: Verify the actual libplctag.Status enum values against the 1.5.2 package and map by enum name rather than magic integers. Either wire MapPcccStatus into a real PCCC-STS path or delete it as dead code. The same defect exists in AbCipStatusMapper and should be fixed in lockstep.

Resolution: (open)

Driver.AbLegacy-011

Field Value
Severity Low
Category Performance & resource management
Location AbLegacyDriver.cs:440
Status Open

Description: Dispose() is implemented as DisposeAsync().AsTask().GetAwaiter().GetResult() - sync-over-async. ShutdownAsync awaits _poll.DisposeAsync() (which completes synchronously) and does no other real async work, so a deadlock is unlikely in practice, but the pattern blocks the calling thread and would deadlock if any awaited continuation were ever marshalled back to a single-threaded synchronization context.

Recommendation: Prefer callers use IAsyncDisposable. If a synchronous Dispose() must exist, perform the synchronous teardown directly (cancel CTSs, dispose runtimes) rather than blocking on the async path.

Resolution: (open)

Driver.AbLegacy-012

Field Value
Severity Medium
Category Design-document adherence
Location PlcFamilies/AbLegacyPlcFamilyProfile.cs:7-54, AbLegacyDriver.cs:48-52
Status Open

Description: AbLegacyPlcFamilyProfile declares four record properties - DefaultCipPath, MaxTagBytes, SupportsStringFile, SupportsLongFile - and only LibplctagPlcAttribute is ever consumed. In particular:

  • DefaultCipPath is dead: the per-family default path (empty for MicroLogix, 1,0 for SLC/PLC-5) is never used to substitute an empty CIP path. The CIP path always comes verbatim from AbLegacyHostAddress.CipPath, so a SLC 500 misconfigured with an empty path is never corrected to 1,0 even though the profile knows the right default - contradicting the test-fixture doc, which calls out the /1,0 cip-path workaround as required for SLC.
  • MaxTagBytes is never used to validate or chunk a string/array read.
  • SupportsStringFile/SupportsLongFile are never checked, so a String or Long tag configured against a MicroLogix or PLC-5 (which the profile says lack them) is accepted and only fails at runtime with an opaque comms error.

Recommendation: Either consume the profile fields (substitute DefaultCipPath when the host CIP path is empty; reject Long/String tags against families whose profile sets the corresponding flag false; use MaxTagBytes for validation) or remove the unused fields and the doc comments that imply they are load-bearing.

Resolution: (open)

Driver.AbLegacy-013

Field Value
Severity Low
Category Code organization & conventions
Location AbLegacyDriver.cs:340-345, AbLegacyDriver.cs:238-264
Status Open

Description: Two minor organisational issues:

  1. ResolveHost returns _options.Devices.FirstOrDefault()?.HostAddress ?? DriverInstanceId when the reference is unknown and no devices are configured. DriverInstanceId is not a host address (ab://...), so a downstream IHostConnectivityProbe / host lookup keyed on the returned value never matches a real device. Returning the instance id as a fake host masks a configuration error.
  2. DiscoverAsync always emits IsArray: false / ArrayDim: null. PCCC files are inherently arrays of elements; a tag that genuinely addresses a multi-element region cannot be represented. This is consistent with the PR-staged scope (the doc says array coverage is thin) but should be tracked rather than silently shipped.

Recommendation: For (1), either throw / return a sentinel the caller can detect, or document why falling back to the instance id is acceptable. For (2), record the array-addressing gap as a tracked follow-up.

Resolution: (open)