InitializeAsync catch block now mirrors ShutdownAsync teardown: cancels and disposes probe CancellationTokenSources, calls DisposeRuntimes, and clears _devices/_tagsByName before rethrowing. A caller that catches and abandons (rather than retrying via ReinitializeAsync) no longer leaves orphaned probe tasks or libplctag handles alive. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
18 KiB
Code Review - Driver.AbLegacy
| Field | Value |
|---|---|
| Module | src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | 76d35d1 |
| Status | Reviewed |
| Open findings | 11 |
Checklist coverage
A comprehensive review completes every category, recording "No issues found" where a category produced nothing rather than leaving it blank.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Driver.AbLegacy-001, Driver.AbLegacy-002, Driver.AbLegacy-003, Driver.AbLegacy-004 |
| 2 | OtOpcUa conventions | Driver.AbLegacy-005 |
| 3 | Concurrency & thread safety | Driver.AbLegacy-006, Driver.AbLegacy-007, Driver.AbLegacy-008 |
| 4 | Error handling & resilience | Driver.AbLegacy-009, Driver.AbLegacy-010 |
| 5 | Security | No issues found |
| 6 | Performance & resource management | Driver.AbLegacy-011 |
| 7 | Design-document adherence | Driver.AbLegacy-012 |
| 8 | Code organization & conventions | Driver.AbLegacy-013 |
| 9 | Testing coverage | No issues found |
| 10 | Documentation & comments | No issues found |
Findings
Driver.AbLegacy-001
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | AbLegacyAddress.cs:54, AbLegacyDriver.cs:368-374 |
| Status | Resolved |
Description: AbLegacyAddress.TryParse accepts a BitIndex of 0..31 for every
file type. A PCCC N-file word is a signed 16-bit integer, so valid bit indices are
0..15. When a tag is Bit-typed against an N-file with a bit suffix of 16..31
(e.g. N7:0/20), WriteBitInWordAsync reads the parent as AbLegacyDataType.Int
(16-bit), then computes current | (1 << bit) / current & ~(1 << bit) with bit
up to 31. 1 << 20 produces a value outside the 16-bit range, the result is cast
(short)updated, and the high bits are silently truncated - the wrong bit (or no
bit) is written and no error is surfaced. The mask arithmetic is also done on a
sign-extended int. For L-file (32-bit) bits the parent is still read as Int
(16-bit), so bits 16..31 of a long can never be addressed correctly.
Recommendation: Validate BitIndex against the parent word width during parse or
in WriteBitInWordAsync - reject bit > 15 for N/B/I/O/S files and bit > 31 for L
files. For bit-in-word RMW against L files, read the parent as Long. Mask the
read-back value to the word width before applying the bit operation.
Resolution: Resolved 2026-05-22 — AbLegacyAddress.TryParse now range-checks the
bit index against per-file word width (0..15 for N/B/I/O/S/A, 0..31 for L, no bits on
F); WriteBitInWordAsync reads/writes an L-file parent as 32-bit Long and masks the
RMW arithmetic to the native width so sign-extension can no longer corrupt high bits.
Driver.AbLegacy-002
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | AbLegacyDriver.cs:368 |
| Status | Resolved |
Description: In WriteBitInWordAsync the parent word is decoded with
Convert.ToInt32(parentRuntime.DecodeValue(AbLegacyDataType.Int, ...)).
LibplctagLegacyTagRuntime.DecodeValue for AbLegacyDataType.Int returns
(int)_tag.GetInt16(0) - a sign-extended int. When the current word has its high
bit set (value 0x8000..0xFFFF, decoded as a negative int), the subsequent
(short)updated cast re-encodes the low 16 bits correctly, but current | (1 << bit)
is performed on the sign-extended value. The result is bit-correct for the low 16
bits only because the cast preserves them; any future change to widen the mask range
will break silently. Combined with finding 001 this is a latent correctness hazard.
Recommendation: Mask current to current & 0xFFFF before the bit operation and
operate on an explicitly 16-bit value, or document the reliance on low-16-bit
preservation explicitly.
Resolution: Resolved 2026-05-22 — current & widthMask already applied in WriteBitInWordAsync by the -001 fix; no additional change needed.
Driver.AbLegacy-003
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | AbLegacyAddress.cs:62-95 |
| Status | Resolved |
Description: TryParse does not reject several malformed PCCC addresses that the
XML docs imply are invalid:
- A sub-element and a bit index together (
T4:0.ACC/2) parse successfully even though no PCCC element supports both. - I/O/S files with a file number (
I3:0,S2:1) parse successfully - I/O and S are single-letter files with no file number per the doc comment, but the parser only requires "letter then optional digits". - B-file addresses with a sub-element (
B3:0.DN) parse successfully.ToLibplctagName()re-emits whatever was parsed, so a malformed address is passed through to libplctag rather than rejected early with a clear error.
Recommendation: Tighten the parser: reject sub-element + bit-index combinations, reject file numbers on I/O/S, and restrict which file letters may carry a sub-element (T/C/R only). Add unit coverage for the rejection cases.
Resolution: Resolved 2026-05-22 — TryParse now rejects sub-element+bit-index combinations, file numbers on I/O/S files, and sub-elements on non-T/C/R files; unit tests added in AbLegacyAddressTests.
Driver.AbLegacy-004
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | LibplctagLegacyTagRuntime.cs:36-37 |
| Status | Resolved |
Description: DecodeValue for AbLegacyDataType.Bit with bitIndex == null
returns _tag.GetInt8(0) != 0. A bit-file element (B3:0/0) is a single bit inside
a 16-bit word; reading only the low byte (GetInt8(0)) means a Bit tag whose live
bit sits in bits 8..15 of the word, or a B-file element addressed without an explicit
bit suffix, decodes incorrectly. The driver passes parsed.ToLibplctagName() which
preserves the /bit suffix, so libplctag resolves the bit when a suffix is present -
but a Bit-typed tag configured with an address that has no /bit suffix (e.g.
B3:0) silently decodes the wrong thing.
Recommendation: For Bit with no bitIndex, decide explicitly: either require a
bit suffix on Bit-typed tags (validate in CreateInstance/DiscoverAsync) or
decode the full 16-bit word and test bit 0.
Resolution: Resolved 2026-05-22 — DecodeValue for Bit with no bitIndex now reads the full 16-bit word via GetInt16(0) and tests bit 0, avoiding the silent half-word truncation from GetInt8.
Driver.AbLegacy-005
| Field | Value |
|---|---|
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | AbLegacyDriver.cs (whole file) |
| Status | Open |
Description: The driver uses no ILogger/Serilog at all. Probe-loop failures,
runtime initialisation failures, libplctag non-zero statuses, and read/write
exceptions are folded into DriverHealth.Detail strings but never logged. CLAUDE.md
names Serilog with a rolling daily file sink as the logging library. The complete
absence of structured logging makes field diagnosis of a PCCC comms problem (timeout
vs route failure vs wrong PLC family) rely entirely on a single overwritten Detail
string that the next read or write immediately clobbers.
Recommendation: Inject ILogger<AbLegacyDriver> (optional, like tagFactory) and
log probe transitions, runtime-init failures, and the first occurrence of a non-zero
libplctag status per device.
Resolution: (open)
Driver.AbLegacy-006
| Field | Value |
|---|---|
| Severity | High |
| Category | Concurrency & thread safety |
| Location | AbLegacyDriver.cs:107-158, AbLegacyDriver.cs:162-234, LibplctagLegacyTagRuntime.cs |
| Status | Resolved |
Description: A per-tag IAbLegacyTagRuntime (wrapping a single libplctag Tag)
is cached in DeviceState.Runtimes and reused. ReadAsync (called directly by the
server read path) and the PollGroupEngine poll loop (which also calls ReadAsync
via the reader delegate) can run concurrently, and two poll subscriptions covering
the same tag run on independent background tasks. All of them call
EnsureTagRuntimeAsync to the same Tag instance and call runtime.ReadAsync /
GetStatus / DecodeValue with no synchronisation. A libplctag Tag is not safe
for concurrent operations on the same handle: an interleaved Read/GetStatus/DecodeValue
from two threads can read a value mid-update or observe a status that belongs to the
other operation. WriteAsync shares the same runtime dictionary and compounds the
hazard. Only the bit-in-word RMW path is serialised (per-parent SemaphoreSlim).
Recommendation: Serialise all operations against a given runtime - a per-runtime
SemaphoreSlim, or a per-device read lock - so no two threads touch the same Tag
handle concurrently.
Resolution: Resolved 2026-05-22 — added a per-runtime SemaphoreSlim
(DeviceState.GetRuntimeLock, keyed by tag name); ReadAsync and WriteAsync now
hold it around the whole Read→GetStatus→Decode / Encode→Write→GetStatus sequence so the
shared libplctag Tag handle is never touched by two threads at once.
Driver.AbLegacy-007
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | AbLegacyDriver.cs:411-438, AbLegacyDriver.cs:386-409 |
| Status | Resolved |
Description: EnsureTagRuntimeAsync and EnsureParentRuntimeAsync are
check-then-act: device.Runtimes.TryGetValue(...) then, after await runtime.InitializeAsync, device.Runtimes[def.Name] = runtime. Dictionary is not
thread-safe, and two concurrent callers for the same tag (read + poll, or two poll
loops) both miss the lookup, both Create + InitializeAsync a runtime, and both write
the dictionary. One runtime is overwritten and leaked - DisposeRuntimes only
disposes what is currently in the dict - and concurrent Dictionary writes can
corrupt internal state. ParentRuntimes has the identical pattern.
Recommendation: Replace the runtime caches with ConcurrentDictionary and use
GetOrAdd, or guard runtime creation under a per-device lock. Ensure the losing
runtime of any race is disposed.
Resolution: Resolved 2026-05-22 — Runtimes and ParentRuntimes changed to ConcurrentDictionary; EnsureTagRuntimeAsync and EnsureParentRuntimeAsync now hold a per-key GetCreationLock semaphore around the double-checked create+initialize+store sequence so exactly one runtime is created per key and no race-loser is leaked.
Driver.AbLegacy-008
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | AbLegacyDriver.cs:21, AbLegacyDriver.cs:138-146, AbLegacyDriver.cs:216-229 |
| Status | Resolved |
Description: _health is a plain non-volatile reference field mutated from
ReadAsync, WriteAsync (both can run on multiple threads / poll loops) and
InitializeAsync/ShutdownAsync, and read by GetHealth() from yet another thread.
There is no lock, no volatile, and no Interlocked exchange. The record reference
assignment is atomic, but without a memory barrier a reader can observe a stale
_health indefinitely, and concurrent writers race so a Healthy write from one
successful read can clobber a Degraded write from a concurrent failing read.
GetHealth() may therefore report Healthy while reads are persistently failing.
Recommendation: Mark _health volatile, or funnel health transitions through a
lock / Interlocked.Exchange. Consider only downgrading on failure and upgrading on a
successful poll so a single failed read does not flap the surface.
Resolution: Resolved 2026-05-22 — _health marked volatile; memory barrier comment documents the acquire/release ordering guarantee.
Driver.AbLegacy-009
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | AbLegacyDriver.cs:41-74 |
| Status | Resolved |
Description: InitializeAsync starts probe loops with Task.Run inside the try
block. If InitializeAsync fails - or is re-entered - after some probe loops are
already started, the catch only sets _health = Faulted and rethrows; it does not
cancel state.ProbeCts, dispose runtimes, or clear _devices. A caller that catches
the exception and retries via ReinitializeAsync is covered (it calls ShutdownAsync
first), but a caller that catches and abandons the driver leaves orphaned probe tasks
and CancellationTokenSources alive holding libplctag handles. Separately,
ProbeLoopAsync never escalates a permanently-unreachable device beyond Stopped.
Recommendation: On the catch path in InitializeAsync, run the same teardown as
ShutdownAsync (cancel probe CTSs, dispose runtimes, clear dictionaries) before
rethrowing, so a failed initialise leaves no live background work.
Resolution: Resolved 2026-05-22 — InitializeAsync catch block now cancels and disposes probe CTSs, calls DisposeRuntimes, and clears _devices/_tagsByName before rethrowing, leaving no orphaned background tasks or handles.
Driver.AbLegacy-010
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | AbLegacyStatusMapper.cs:26-56 |
| Status | Open |
Description: MapLibplctagStatus maps the integer codes -5/-7/-14/-16/-17. These
do not match the native libplctag PLCTAG_ERR_* constants (PLCTAG_ERR_TIMEOUT = -32,
PLCTAG_ERR_NOT_FOUND = -22, PLCTAG_ERR_NOT_ALLOWED = -21, PLCTAG_ERR_OUT_OF_BOUNDS =
-25, PLCTAG_ERR_BAD_CONNECTION = -8). The mapper operates on (int)_tag.GetStatus(),
where GetStatus() returns the libplctag .NET wrapper Status enum whose underlying
ordinals differ from the native codes - so the -5/-7/... values are at best the .NET
enum ordinals (unverified, undocumented) and at worst wrong. Any unmatched negative
status falls through to BadCommunicationError, so a timeout is reported as a generic
comms error rather than BadTimeout. MapPcccStatus is dead code - the PCCC STS byte
is never inspected because libplctag surfaces only its own status enum.
Recommendation: Verify the actual libplctag.Status enum values against the 1.5.2
package and map by enum name rather than magic integers. Either wire MapPcccStatus
into a real PCCC-STS path or delete it as dead code. The same defect exists in
AbCipStatusMapper and should be fixed in lockstep.
Resolution: (open)
Driver.AbLegacy-011
| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | AbLegacyDriver.cs:440 |
| Status | Open |
Description: Dispose() is implemented as
DisposeAsync().AsTask().GetAwaiter().GetResult() - sync-over-async. ShutdownAsync
awaits _poll.DisposeAsync() (which completes synchronously) and does no other real
async work, so a deadlock is unlikely in practice, but the pattern blocks the calling
thread and would deadlock if any awaited continuation were ever marshalled back to a
single-threaded synchronization context.
Recommendation: Prefer callers use IAsyncDisposable. If a synchronous Dispose()
must exist, perform the synchronous teardown directly (cancel CTSs, dispose runtimes)
rather than blocking on the async path.
Resolution: (open)
Driver.AbLegacy-012
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Design-document adherence |
| Location | PlcFamilies/AbLegacyPlcFamilyProfile.cs:7-54, AbLegacyDriver.cs:48-52 |
| Status | Open |
Description: AbLegacyPlcFamilyProfile declares four record properties -
DefaultCipPath, MaxTagBytes, SupportsStringFile, SupportsLongFile - and only
LibplctagPlcAttribute is ever consumed. In particular:
DefaultCipPathis dead: the per-family default path (empty for MicroLogix, 1,0 for SLC/PLC-5) is never used to substitute an empty CIP path. The CIP path always comes verbatim fromAbLegacyHostAddress.CipPath, so a SLC 500 misconfigured with an empty path is never corrected to 1,0 even though the profile knows the right default - contradicting the test-fixture doc, which calls out the /1,0 cip-path workaround as required for SLC.MaxTagBytesis never used to validate or chunk a string/array read.SupportsStringFile/SupportsLongFileare never checked, so aStringorLongtag configured against a MicroLogix or PLC-5 (which the profile says lack them) is accepted and only fails at runtime with an opaque comms error.
Recommendation: Either consume the profile fields (substitute DefaultCipPath when
the host CIP path is empty; reject Long/String tags against families whose profile
sets the corresponding flag false; use MaxTagBytes for validation) or remove the
unused fields and the doc comments that imply they are load-bearing.
Resolution: (open)
Driver.AbLegacy-013
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | AbLegacyDriver.cs:340-345, AbLegacyDriver.cs:238-264 |
| Status | Open |
Description: Two minor organisational issues:
ResolveHostreturns_options.Devices.FirstOrDefault()?.HostAddress ?? DriverInstanceIdwhen the reference is unknown and no devices are configured.DriverInstanceIdis not a host address (ab://...), so a downstreamIHostConnectivityProbe/ host lookup keyed on the returned value never matches a real device. Returning the instance id as a fake host masks a configuration error.DiscoverAsyncalways emitsIsArray: false/ArrayDim: null. PCCC files are inherently arrays of elements; a tag that genuinely addresses a multi-element region cannot be represented. This is consistent with the PR-staged scope (the doc says array coverage is thin) but should be tracked rather than silently shipped.
Recommendation: For (1), either throw / return a sentinel the caller can detect, or document why falling back to the instance id is acceptable. For (2), record the array-addressing gap as a tracked follow-up.
Resolution: (open)