# Code Review - Driver.AbLegacy | Field | Value | |---|---| | Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy` | | Reviewer | Claude Code | | Review date | 2026-05-22 | | Commit reviewed | `76d35d1` | | Status | Reviewed | | Open findings | 3 | ## Checklist coverage A comprehensive review completes every category, recording "No issues found" where a category produced nothing rather than leaving it blank. | # | Category | Result | |---|---|---| | 1 | Correctness & logic bugs | Driver.AbLegacy-001, Driver.AbLegacy-002, Driver.AbLegacy-003, Driver.AbLegacy-004 | | 2 | OtOpcUa conventions | Driver.AbLegacy-005 | | 3 | Concurrency & thread safety | Driver.AbLegacy-006, Driver.AbLegacy-007, Driver.AbLegacy-008 | | 4 | Error handling & resilience | Driver.AbLegacy-009, Driver.AbLegacy-010 | | 5 | Security | No issues found | | 6 | Performance & resource management | Driver.AbLegacy-011 | | 7 | Design-document adherence | Driver.AbLegacy-012 | | 8 | Code organization & conventions | Driver.AbLegacy-013 | | 9 | Testing coverage | No issues found | | 10 | Documentation & comments | No issues found | ## Findings ### Driver.AbLegacy-001 | Field | Value | |---|---| | Severity | High | | Category | Correctness & logic bugs | | Location | `AbLegacyAddress.cs:54`, `AbLegacyDriver.cs:368-374` | | Status | Resolved | **Description:** `AbLegacyAddress.TryParse` accepts a `BitIndex` of `0..31` for every file type. A PCCC N-file word is a signed 16-bit integer, so valid bit indices are `0..15`. When a tag is `Bit`-typed against an N-file with a bit suffix of `16..31` (e.g. `N7:0/20`), `WriteBitInWordAsync` reads the parent as `AbLegacyDataType.Int` (16-bit), then computes `current | (1 << bit)` / `current & ~(1 << bit)` with `bit` up to 31. `1 << 20` produces a value outside the 16-bit range, the result is cast `(short)updated`, and the high bits are silently truncated - the wrong bit (or no bit) is written and no error is surfaced. The mask arithmetic is also done on a sign-extended `int`. For L-file (32-bit) bits the parent is still read as `Int` (16-bit), so bits 16..31 of a long can never be addressed correctly. **Recommendation:** Validate `BitIndex` against the parent word width during parse or in `WriteBitInWordAsync` - reject bit > 15 for N/B/I/O/S files and bit > 31 for L files. For bit-in-word RMW against L files, read the parent as `Long`. Mask the read-back value to the word width before applying the bit operation. **Resolution:** Resolved 2026-05-22 — `AbLegacyAddress.TryParse` now range-checks the bit index against per-file word width (0..15 for N/B/I/O/S/A, 0..31 for L, no bits on F); `WriteBitInWordAsync` reads/writes an L-file parent as 32-bit `Long` and masks the RMW arithmetic to the native width so sign-extension can no longer corrupt high bits. ### Driver.AbLegacy-002 | Field | Value | |---|---| | Severity | Medium | | Category | Correctness & logic bugs | | Location | `AbLegacyDriver.cs:368` | | Status | Resolved | **Description:** In `WriteBitInWordAsync` the parent word is decoded with `Convert.ToInt32(parentRuntime.DecodeValue(AbLegacyDataType.Int, ...))`. `LibplctagLegacyTagRuntime.DecodeValue` for `AbLegacyDataType.Int` returns `(int)_tag.GetInt16(0)` - a sign-extended `int`. When the current word has its high bit set (value 0x8000..0xFFFF, decoded as a negative `int`), the subsequent `(short)updated` cast re-encodes the low 16 bits correctly, but `current | (1 << bit)` is performed on the sign-extended value. The result is bit-correct for the low 16 bits only because the cast preserves them; any future change to widen the mask range will break silently. Combined with finding 001 this is a latent correctness hazard. **Recommendation:** Mask `current` to `current & 0xFFFF` before the bit operation and operate on an explicitly 16-bit value, or document the reliance on low-16-bit preservation explicitly. **Resolution:** Resolved 2026-05-22 — `current & widthMask` already applied in `WriteBitInWordAsync` by the -001 fix; no additional change needed. ### Driver.AbLegacy-003 | Field | Value | |---|---| | Severity | Medium | | Category | Correctness & logic bugs | | Location | `AbLegacyAddress.cs:62-95` | | Status | Resolved | **Description:** `TryParse` does not reject several malformed PCCC addresses that the XML docs imply are invalid: - A sub-element and a bit index together (`T4:0.ACC/2`) parse successfully even though no PCCC element supports both. - I/O/S files with a file number (`I3:0`, `S2:1`) parse successfully - I/O and S are single-letter files with no file number per the doc comment, but the parser only requires "letter then optional digits". - B-file addresses with a sub-element (`B3:0.DN`) parse successfully. `ToLibplctagName()` re-emits whatever was parsed, so a malformed address is passed through to libplctag rather than rejected early with a clear error. **Recommendation:** Tighten the parser: reject sub-element + bit-index combinations, reject file numbers on I/O/S, and restrict which file letters may carry a sub-element (T/C/R only). Add unit coverage for the rejection cases. **Resolution:** Resolved 2026-05-22 — `TryParse` now rejects sub-element+bit-index combinations, file numbers on I/O/S files, and sub-elements on non-T/C/R files; unit tests added in `AbLegacyAddressTests`. ### Driver.AbLegacy-004 | Field | Value | |---|---| | Severity | Medium | | Category | Correctness & logic bugs | | Location | `LibplctagLegacyTagRuntime.cs:36-37` | | Status | Resolved | **Description:** `DecodeValue` for `AbLegacyDataType.Bit` with `bitIndex == null` returns `_tag.GetInt8(0) != 0`. A bit-file element (`B3:0/0`) is a single bit inside a 16-bit word; reading only the low byte (`GetInt8(0)`) means a `Bit` tag whose live bit sits in bits 8..15 of the word, or a B-file element addressed without an explicit bit suffix, decodes incorrectly. The driver passes `parsed.ToLibplctagName()` which preserves the `/bit` suffix, so libplctag resolves the bit when a suffix is present - but a `Bit`-typed tag configured with an address that has no `/bit` suffix (e.g. `B3:0`) silently decodes the wrong thing. **Recommendation:** For `Bit` with no `bitIndex`, decide explicitly: either require a bit suffix on `Bit`-typed tags (validate in `CreateInstance`/`DiscoverAsync`) or decode the full 16-bit word and test bit 0. **Resolution:** Resolved 2026-05-22 — `DecodeValue` for `Bit` with no `bitIndex` now reads the full 16-bit word via `GetInt16(0)` and tests bit 0, avoiding the silent half-word truncation from `GetInt8`. ### Driver.AbLegacy-005 | Field | Value | |---|---| | Severity | Low | | Category | OtOpcUa conventions | | Location | `AbLegacyDriver.cs` (whole file) | | Status | Open | **Description:** The driver uses no `ILogger`/Serilog at all. Probe-loop failures, runtime initialisation failures, libplctag non-zero statuses, and read/write exceptions are folded into `DriverHealth.Detail` strings but never logged. CLAUDE.md names Serilog with a rolling daily file sink as the logging library. The complete absence of structured logging makes field diagnosis of a PCCC comms problem (timeout vs route failure vs wrong PLC family) rely entirely on a single overwritten `Detail` string that the next read or write immediately clobbers. **Recommendation:** Inject `ILogger` (optional, like `tagFactory`) and log probe transitions, runtime-init failures, and the first occurrence of a non-zero libplctag status per device. **Resolution:** _(open)_ ### Driver.AbLegacy-006 | Field | Value | |---|---| | Severity | High | | Category | Concurrency & thread safety | | Location | `AbLegacyDriver.cs:107-158`, `AbLegacyDriver.cs:162-234`, `LibplctagLegacyTagRuntime.cs` | | Status | Resolved | **Description:** A per-tag `IAbLegacyTagRuntime` (wrapping a single libplctag `Tag`) is cached in `DeviceState.Runtimes` and reused. `ReadAsync` (called directly by the server read path) and the `PollGroupEngine` poll loop (which also calls `ReadAsync` via the reader delegate) can run concurrently, and two poll subscriptions covering the same tag run on independent background tasks. All of them call `EnsureTagRuntimeAsync` to the same `Tag` instance and call `runtime.ReadAsync` / `GetStatus` / `DecodeValue` with no synchronisation. A libplctag `Tag` is not safe for concurrent operations on the same handle: an interleaved Read/GetStatus/DecodeValue from two threads can read a value mid-update or observe a status that belongs to the other operation. `WriteAsync` shares the same runtime dictionary and compounds the hazard. Only the bit-in-word RMW path is serialised (per-parent `SemaphoreSlim`). **Recommendation:** Serialise all operations against a given runtime - a per-runtime `SemaphoreSlim`, or a per-device read lock - so no two threads touch the same `Tag` handle concurrently. **Resolution:** Resolved 2026-05-22 — added a per-runtime `SemaphoreSlim` (`DeviceState.GetRuntimeLock`, keyed by tag name); `ReadAsync` and `WriteAsync` now hold it around the whole Read→GetStatus→Decode / Encode→Write→GetStatus sequence so the shared libplctag `Tag` handle is never touched by two threads at once. ### Driver.AbLegacy-007 | Field | Value | |---|---| | Severity | Medium | | Category | Concurrency & thread safety | | Location | `AbLegacyDriver.cs:411-438`, `AbLegacyDriver.cs:386-409` | | Status | Resolved | **Description:** `EnsureTagRuntimeAsync` and `EnsureParentRuntimeAsync` are check-then-act: `device.Runtimes.TryGetValue(...)` then, after `await runtime.InitializeAsync`, `device.Runtimes[def.Name] = runtime`. `Dictionary` is not thread-safe, and two concurrent callers for the same tag (read + poll, or two poll loops) both miss the lookup, both Create + InitializeAsync a runtime, and both write the dictionary. One runtime is overwritten and leaked - `DisposeRuntimes` only disposes what is currently in the dict - and concurrent `Dictionary` writes can corrupt internal state. `ParentRuntimes` has the identical pattern. **Recommendation:** Replace the runtime caches with `ConcurrentDictionary` and use `GetOrAdd`, or guard runtime creation under a per-device lock. Ensure the losing runtime of any race is disposed. **Resolution:** Resolved 2026-05-22 — `Runtimes` and `ParentRuntimes` changed to `ConcurrentDictionary`; `EnsureTagRuntimeAsync` and `EnsureParentRuntimeAsync` now hold a per-key `GetCreationLock` semaphore around the double-checked create+initialize+store sequence so exactly one runtime is created per key and no race-loser is leaked. ### Driver.AbLegacy-008 | Field | Value | |---|---| | Severity | Medium | | Category | Concurrency & thread safety | | Location | `AbLegacyDriver.cs:21`, `AbLegacyDriver.cs:138-146`, `AbLegacyDriver.cs:216-229` | | Status | Resolved | **Description:** `_health` is a plain non-volatile reference field mutated from `ReadAsync`, `WriteAsync` (both can run on multiple threads / poll loops) and `InitializeAsync`/`ShutdownAsync`, and read by `GetHealth()` from yet another thread. There is no lock, no `volatile`, and no `Interlocked` exchange. The record reference assignment is atomic, but without a memory barrier a reader can observe a stale `_health` indefinitely, and concurrent writers race so a `Healthy` write from one successful read can clobber a `Degraded` write from a concurrent failing read. `GetHealth()` may therefore report `Healthy` while reads are persistently failing. **Recommendation:** Mark `_health` volatile, or funnel health transitions through a lock / `Interlocked.Exchange`. Consider only downgrading on failure and upgrading on a successful poll so a single failed read does not flap the surface. **Resolution:** Resolved 2026-05-22 — `_health` marked `volatile`; memory barrier comment documents the acquire/release ordering guarantee. ### Driver.AbLegacy-009 | Field | Value | |---|---| | Severity | Medium | | Category | Error handling & resilience | | Location | `AbLegacyDriver.cs:41-74` | | Status | Resolved | **Description:** `InitializeAsync` starts probe loops with `Task.Run` inside the try block. If `InitializeAsync` fails - or is re-entered - after some probe loops are already started, the catch only sets `_health = Faulted` and rethrows; it does not cancel `state.ProbeCts`, dispose runtimes, or clear `_devices`. A caller that catches the exception and retries via `ReinitializeAsync` is covered (it calls `ShutdownAsync` first), but a caller that catches and abandons the driver leaves orphaned probe tasks and `CancellationTokenSource`s alive holding libplctag handles. Separately, `ProbeLoopAsync` never escalates a permanently-unreachable device beyond `Stopped`. **Recommendation:** On the catch path in `InitializeAsync`, run the same teardown as `ShutdownAsync` (cancel probe CTSs, dispose runtimes, clear dictionaries) before rethrowing, so a failed initialise leaves no live background work. **Resolution:** Resolved 2026-05-22 — `InitializeAsync` catch block now cancels and disposes probe CTSs, calls `DisposeRuntimes`, and clears `_devices`/`_tagsByName` before rethrowing, leaving no orphaned background tasks or handles. ### Driver.AbLegacy-010 | Field | Value | |---|---| | Severity | Medium | | Category | Error handling & resilience | | Location | `AbLegacyStatusMapper.cs:26-56` | | Status | Resolved | **Description:** `MapLibplctagStatus` maps the integer codes -5/-7/-14/-16/-17. These do not match the native libplctag PLCTAG_ERR_* constants (PLCTAG_ERR_TIMEOUT = -32, PLCTAG_ERR_NOT_FOUND = -22, PLCTAG_ERR_NOT_ALLOWED = -21, PLCTAG_ERR_OUT_OF_BOUNDS = -25, PLCTAG_ERR_BAD_CONNECTION = -8). The mapper operates on `(int)_tag.GetStatus()`, where `GetStatus()` returns the libplctag .NET wrapper Status enum whose underlying ordinals differ from the native codes - so the -5/-7/... values are at best the .NET enum ordinals (unverified, undocumented) and at worst wrong. Any unmatched negative status falls through to `BadCommunicationError`, so a timeout is reported as a generic comms error rather than `BadTimeout`. `MapPcccStatus` is dead code - the PCCC STS byte is never inspected because libplctag surfaces only its own status enum. **Recommendation:** Verify the actual `libplctag.Status` enum values against the 1.5.2 package and map by enum name rather than magic integers. Either wire `MapPcccStatus` into a real PCCC-STS path or delete it as dead code. The same defect exists in `AbCipStatusMapper` and should be fixed in lockstep. **Resolution:** Resolved 2026-05-22 — `MapLibplctagStatus` now casts to `libplctag.Status` and switches on named enum members (matching the AbCip mapper pattern); `MapPcccStatus` retained with a comment documenting it as a reference mapping for future PCCC-STS inspection; tests updated to use `Status` enum members. ### Driver.AbLegacy-011 | Field | Value | |---|---| | Severity | Low | | Category | Performance & resource management | | Location | `AbLegacyDriver.cs:440` | | Status | Open | **Description:** `Dispose()` is implemented as `DisposeAsync().AsTask().GetAwaiter().GetResult()` - sync-over-async. `ShutdownAsync` awaits `_poll.DisposeAsync()` (which completes synchronously) and does no other real async work, so a deadlock is unlikely in practice, but the pattern blocks the calling thread and would deadlock if any awaited continuation were ever marshalled back to a single-threaded synchronization context. **Recommendation:** Prefer callers use `IAsyncDisposable`. If a synchronous `Dispose()` must exist, perform the synchronous teardown directly (cancel CTSs, dispose runtimes) rather than blocking on the async path. **Resolution:** _(open)_ ### Driver.AbLegacy-012 | Field | Value | |---|---| | Severity | Medium | | Category | Design-document adherence | | Location | `PlcFamilies/AbLegacyPlcFamilyProfile.cs:7-54`, `AbLegacyDriver.cs:48-52` | | Status | Resolved | **Description:** `AbLegacyPlcFamilyProfile` declares four record properties - `DefaultCipPath`, `MaxTagBytes`, `SupportsStringFile`, `SupportsLongFile` - and only `LibplctagPlcAttribute` is ever consumed. In particular: - `DefaultCipPath` is dead: the per-family default path (empty for MicroLogix, 1,0 for SLC/PLC-5) is never used to substitute an empty CIP path. The CIP path always comes verbatim from `AbLegacyHostAddress.CipPath`, so a SLC 500 misconfigured with an empty path is never corrected to 1,0 even though the profile knows the right default - contradicting the test-fixture doc, which calls out the /1,0 cip-path workaround as required for SLC. - `MaxTagBytes` is never used to validate or chunk a string/array read. - `SupportsStringFile`/`SupportsLongFile` are never checked, so a `String` or `Long` tag configured against a MicroLogix or PLC-5 (which the profile says lack them) is accepted and only fails at runtime with an opaque comms error. **Recommendation:** Either consume the profile fields (substitute `DefaultCipPath` when the host CIP path is empty; reject `Long`/`String` tags against families whose profile sets the corresponding flag false; use `MaxTagBytes` for validation) or remove the unused fields and the doc comments that imply they are load-bearing. **Resolution:** Resolved 2026-05-22 — `DeviceState.EffectiveCipPath` applies `DefaultCipPath` when the parsed host address has an empty CIP path; `InitializeAsync` validates `Long`/`String` tag types against `SupportsLongFile`/`SupportsStringFile` and throws early; `MaxTagBytes` tracked as a follow-up (string/array chunking requires broader design work). ### Driver.AbLegacy-013 | Field | Value | |---|---| | Severity | Low | | Category | Code organization & conventions | | Location | `AbLegacyDriver.cs:340-345`, `AbLegacyDriver.cs:238-264` | | Status | Open | **Description:** Two minor organisational issues: 1. `ResolveHost` returns `_options.Devices.FirstOrDefault()?.HostAddress ?? DriverInstanceId` when the reference is unknown and no devices are configured. `DriverInstanceId` is not a host address (ab://...), so a downstream `IHostConnectivityProbe` / host lookup keyed on the returned value never matches a real device. Returning the instance id as a fake host masks a configuration error. 2. `DiscoverAsync` always emits `IsArray: false` / `ArrayDim: null`. PCCC files are inherently arrays of elements; a tag that genuinely addresses a multi-element region cannot be represented. This is consistent with the PR-staged scope (the doc says array coverage is thin) but should be tracked rather than silently shipped. **Recommendation:** For (1), either throw / return a sentinel the caller can detect, or document why falling back to the instance id is acceptable. For (2), record the array-addressing gap as a tracked follow-up. **Resolution:** _(open)_