Runtimes and ParentRuntimes changed from Dictionary to ConcurrentDictionary. EnsureTagRuntimeAsync and EnsureParentRuntimeAsync now use a per-key GetCreationLock semaphore with a double-checked pattern: fast-path read requires no lock; slow-path create+initialize+store is serialised per key so a concurrent caller waits rather than creating a duplicate runtime that would be leaked when DisposeRuntimes runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
366 lines
17 KiB
Markdown
366 lines
17 KiB
Markdown
# Code Review - Driver.AbLegacy
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy` |
|
|
| Reviewer | Claude Code |
|
|
| Review date | 2026-05-22 |
|
|
| Commit reviewed | `76d35d1` |
|
|
| Status | Reviewed |
|
|
| Open findings | 11 |
|
|
|
|
## Checklist coverage
|
|
|
|
A comprehensive review completes every category, recording "No issues found" where
|
|
a category produced nothing rather than leaving it blank.
|
|
|
|
| # | Category | Result |
|
|
|---|---|---|
|
|
| 1 | Correctness & logic bugs | Driver.AbLegacy-001, Driver.AbLegacy-002, Driver.AbLegacy-003, Driver.AbLegacy-004 |
|
|
| 2 | OtOpcUa conventions | Driver.AbLegacy-005 |
|
|
| 3 | Concurrency & thread safety | Driver.AbLegacy-006, Driver.AbLegacy-007, Driver.AbLegacy-008 |
|
|
| 4 | Error handling & resilience | Driver.AbLegacy-009, Driver.AbLegacy-010 |
|
|
| 5 | Security | No issues found |
|
|
| 6 | Performance & resource management | Driver.AbLegacy-011 |
|
|
| 7 | Design-document adherence | Driver.AbLegacy-012 |
|
|
| 8 | Code organization & conventions | Driver.AbLegacy-013 |
|
|
| 9 | Testing coverage | No issues found |
|
|
| 10 | Documentation & comments | No issues found |
|
|
|
|
## Findings
|
|
|
|
### Driver.AbLegacy-001
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | High |
|
|
| Category | Correctness & logic bugs |
|
|
| Location | `AbLegacyAddress.cs:54`, `AbLegacyDriver.cs:368-374` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** `AbLegacyAddress.TryParse` accepts a `BitIndex` of `0..31` for every
|
|
file type. A PCCC N-file word is a signed 16-bit integer, so valid bit indices are
|
|
`0..15`. When a tag is `Bit`-typed against an N-file with a bit suffix of `16..31`
|
|
(e.g. `N7:0/20`), `WriteBitInWordAsync` reads the parent as `AbLegacyDataType.Int`
|
|
(16-bit), then computes `current | (1 << bit)` / `current & ~(1 << bit)` with `bit`
|
|
up to 31. `1 << 20` produces a value outside the 16-bit range, the result is cast
|
|
`(short)updated`, and the high bits are silently truncated - the wrong bit (or no
|
|
bit) is written and no error is surfaced. The mask arithmetic is also done on a
|
|
sign-extended `int`. For L-file (32-bit) bits the parent is still read as `Int`
|
|
(16-bit), so bits 16..31 of a long can never be addressed correctly.
|
|
|
|
**Recommendation:** Validate `BitIndex` against the parent word width during parse or
|
|
in `WriteBitInWordAsync` - reject bit > 15 for N/B/I/O/S files and bit > 31 for L
|
|
files. For bit-in-word RMW against L files, read the parent as `Long`. Mask the
|
|
read-back value to the word width before applying the bit operation.
|
|
|
|
**Resolution:** Resolved 2026-05-22 — `AbLegacyAddress.TryParse` now range-checks the
|
|
bit index against per-file word width (0..15 for N/B/I/O/S/A, 0..31 for L, no bits on
|
|
F); `WriteBitInWordAsync` reads/writes an L-file parent as 32-bit `Long` and masks the
|
|
RMW arithmetic to the native width so sign-extension can no longer corrupt high bits.
|
|
|
|
### Driver.AbLegacy-002
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Location | `AbLegacyDriver.cs:368` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** In `WriteBitInWordAsync` the parent word is decoded with
|
|
`Convert.ToInt32(parentRuntime.DecodeValue(AbLegacyDataType.Int, ...))`.
|
|
`LibplctagLegacyTagRuntime.DecodeValue` for `AbLegacyDataType.Int` returns
|
|
`(int)_tag.GetInt16(0)` - a sign-extended `int`. When the current word has its high
|
|
bit set (value 0x8000..0xFFFF, decoded as a negative `int`), the subsequent
|
|
`(short)updated` cast re-encodes the low 16 bits correctly, but `current | (1 << bit)`
|
|
is performed on the sign-extended value. The result is bit-correct for the low 16
|
|
bits only because the cast preserves them; any future change to widen the mask range
|
|
will break silently. Combined with finding 001 this is a latent correctness hazard.
|
|
|
|
**Recommendation:** Mask `current` to `current & 0xFFFF` before the bit operation and
|
|
operate on an explicitly 16-bit value, or document the reliance on low-16-bit
|
|
preservation explicitly.
|
|
|
|
**Resolution:** Resolved 2026-05-22 — `current & widthMask` already applied in `WriteBitInWordAsync` by the -001 fix; no additional change needed.
|
|
|
|
### Driver.AbLegacy-003
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Location | `AbLegacyAddress.cs:62-95` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** `TryParse` does not reject several malformed PCCC addresses that the
|
|
XML docs imply are invalid:
|
|
- A sub-element and a bit index together (`T4:0.ACC/2`) parse successfully even
|
|
though no PCCC element supports both.
|
|
- I/O/S files with a file number (`I3:0`, `S2:1`) parse successfully - I/O and S are
|
|
single-letter files with no file number per the doc comment, but the parser only
|
|
requires "letter then optional digits".
|
|
- B-file addresses with a sub-element (`B3:0.DN`) parse successfully.
|
|
`ToLibplctagName()` re-emits whatever was parsed, so a malformed address is passed
|
|
through to libplctag rather than rejected early with a clear error.
|
|
|
|
**Recommendation:** Tighten the parser: reject sub-element + bit-index combinations,
|
|
reject file numbers on I/O/S, and restrict which file letters may carry a sub-element
|
|
(T/C/R only). Add unit coverage for the rejection cases.
|
|
|
|
**Resolution:** Resolved 2026-05-22 — `TryParse` now rejects sub-element+bit-index combinations, file numbers on I/O/S files, and sub-elements on non-T/C/R files; unit tests added in `AbLegacyAddressTests`.
|
|
|
|
### Driver.AbLegacy-004
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Location | `LibplctagLegacyTagRuntime.cs:36-37` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** `DecodeValue` for `AbLegacyDataType.Bit` with `bitIndex == null`
|
|
returns `_tag.GetInt8(0) != 0`. A bit-file element (`B3:0/0`) is a single bit inside
|
|
a 16-bit word; reading only the low byte (`GetInt8(0)`) means a `Bit` tag whose live
|
|
bit sits in bits 8..15 of the word, or a B-file element addressed without an explicit
|
|
bit suffix, decodes incorrectly. The driver passes `parsed.ToLibplctagName()` which
|
|
preserves the `/bit` suffix, so libplctag resolves the bit when a suffix is present -
|
|
but a `Bit`-typed tag configured with an address that has no `/bit` suffix (e.g.
|
|
`B3:0`) silently decodes the wrong thing.
|
|
|
|
**Recommendation:** For `Bit` with no `bitIndex`, decide explicitly: either require a
|
|
bit suffix on `Bit`-typed tags (validate in `CreateInstance`/`DiscoverAsync`) or
|
|
decode the full 16-bit word and test bit 0.
|
|
|
|
**Resolution:** Resolved 2026-05-22 — `DecodeValue` for `Bit` with no `bitIndex` now reads the full 16-bit word via `GetInt16(0)` and tests bit 0, avoiding the silent half-word truncation from `GetInt8`.
|
|
|
|
### Driver.AbLegacy-005
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | OtOpcUa conventions |
|
|
| Location | `AbLegacyDriver.cs` (whole file) |
|
|
| Status | Open |
|
|
|
|
**Description:** The driver uses no `ILogger`/Serilog at all. Probe-loop failures,
|
|
runtime initialisation failures, libplctag non-zero statuses, and read/write
|
|
exceptions are folded into `DriverHealth.Detail` strings but never logged. CLAUDE.md
|
|
names Serilog with a rolling daily file sink as the logging library. The complete
|
|
absence of structured logging makes field diagnosis of a PCCC comms problem (timeout
|
|
vs route failure vs wrong PLC family) rely entirely on a single overwritten `Detail`
|
|
string that the next read or write immediately clobbers.
|
|
|
|
**Recommendation:** Inject `ILogger<AbLegacyDriver>` (optional, like `tagFactory`) and
|
|
log probe transitions, runtime-init failures, and the first occurrence of a non-zero
|
|
libplctag status per device.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Driver.AbLegacy-006
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | High |
|
|
| Category | Concurrency & thread safety |
|
|
| Location | `AbLegacyDriver.cs:107-158`, `AbLegacyDriver.cs:162-234`, `LibplctagLegacyTagRuntime.cs` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** A per-tag `IAbLegacyTagRuntime` (wrapping a single libplctag `Tag`)
|
|
is cached in `DeviceState.Runtimes` and reused. `ReadAsync` (called directly by the
|
|
server read path) and the `PollGroupEngine` poll loop (which also calls `ReadAsync`
|
|
via the reader delegate) can run concurrently, and two poll subscriptions covering
|
|
the same tag run on independent background tasks. All of them call
|
|
`EnsureTagRuntimeAsync` to the same `Tag` instance and call `runtime.ReadAsync` /
|
|
`GetStatus` / `DecodeValue` with no synchronisation. A libplctag `Tag` is not safe
|
|
for concurrent operations on the same handle: an interleaved Read/GetStatus/DecodeValue
|
|
from two threads can read a value mid-update or observe a status that belongs to the
|
|
other operation. `WriteAsync` shares the same runtime dictionary and compounds the
|
|
hazard. Only the bit-in-word RMW path is serialised (per-parent `SemaphoreSlim`).
|
|
|
|
**Recommendation:** Serialise all operations against a given runtime - a per-runtime
|
|
`SemaphoreSlim`, or a per-device read lock - so no two threads touch the same `Tag`
|
|
handle concurrently.
|
|
|
|
**Resolution:** Resolved 2026-05-22 — added a per-runtime `SemaphoreSlim`
|
|
(`DeviceState.GetRuntimeLock`, keyed by tag name); `ReadAsync` and `WriteAsync` now
|
|
hold it around the whole Read→GetStatus→Decode / Encode→Write→GetStatus sequence so the
|
|
shared libplctag `Tag` handle is never touched by two threads at once.
|
|
|
|
### Driver.AbLegacy-007
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Concurrency & thread safety |
|
|
| Location | `AbLegacyDriver.cs:411-438`, `AbLegacyDriver.cs:386-409` |
|
|
| Status | Resolved |
|
|
|
|
**Description:** `EnsureTagRuntimeAsync` and `EnsureParentRuntimeAsync` are
|
|
check-then-act: `device.Runtimes.TryGetValue(...)` then, after `await
|
|
runtime.InitializeAsync`, `device.Runtimes[def.Name] = runtime`. `Dictionary` is not
|
|
thread-safe, and two concurrent callers for the same tag (read + poll, or two poll
|
|
loops) both miss the lookup, both Create + InitializeAsync a runtime, and both write
|
|
the dictionary. One runtime is overwritten and leaked - `DisposeRuntimes` only
|
|
disposes what is currently in the dict - and concurrent `Dictionary` writes can
|
|
corrupt internal state. `ParentRuntimes` has the identical pattern.
|
|
|
|
**Recommendation:** Replace the runtime caches with `ConcurrentDictionary` and use
|
|
`GetOrAdd`, or guard runtime creation under a per-device lock. Ensure the losing
|
|
runtime of any race is disposed.
|
|
|
|
**Resolution:** Resolved 2026-05-22 — `Runtimes` and `ParentRuntimes` changed to `ConcurrentDictionary`; `EnsureTagRuntimeAsync` and `EnsureParentRuntimeAsync` now hold a per-key `GetCreationLock` semaphore around the double-checked create+initialize+store sequence so exactly one runtime is created per key and no race-loser is leaked.
|
|
|
|
### Driver.AbLegacy-008
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Concurrency & thread safety |
|
|
| Location | `AbLegacyDriver.cs:21`, `AbLegacyDriver.cs:138-146`, `AbLegacyDriver.cs:216-229` |
|
|
| Status | Open |
|
|
|
|
**Description:** `_health` is a plain non-volatile reference field mutated from
|
|
`ReadAsync`, `WriteAsync` (both can run on multiple threads / poll loops) and
|
|
`InitializeAsync`/`ShutdownAsync`, and read by `GetHealth()` from yet another thread.
|
|
There is no lock, no `volatile`, and no `Interlocked` exchange. The record reference
|
|
assignment is atomic, but without a memory barrier a reader can observe a stale
|
|
`_health` indefinitely, and concurrent writers race so a `Healthy` write from one
|
|
successful read can clobber a `Degraded` write from a concurrent failing read.
|
|
`GetHealth()` may therefore report `Healthy` while reads are persistently failing.
|
|
|
|
**Recommendation:** Mark `_health` volatile, or funnel health transitions through a
|
|
lock / `Interlocked.Exchange`. Consider only downgrading on failure and upgrading on a
|
|
successful poll so a single failed read does not flap the surface.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Driver.AbLegacy-009
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Error handling & resilience |
|
|
| Location | `AbLegacyDriver.cs:41-74` |
|
|
| Status | Open |
|
|
|
|
**Description:** `InitializeAsync` starts probe loops with `Task.Run` inside the try
|
|
block. If `InitializeAsync` fails - or is re-entered - after some probe loops are
|
|
already started, the catch only sets `_health = Faulted` and rethrows; it does not
|
|
cancel `state.ProbeCts`, dispose runtimes, or clear `_devices`. A caller that catches
|
|
the exception and retries via `ReinitializeAsync` is covered (it calls `ShutdownAsync`
|
|
first), but a caller that catches and abandons the driver leaves orphaned probe tasks
|
|
and `CancellationTokenSource`s alive holding libplctag handles. Separately,
|
|
`ProbeLoopAsync` never escalates a permanently-unreachable device beyond `Stopped`.
|
|
|
|
**Recommendation:** On the catch path in `InitializeAsync`, run the same teardown as
|
|
`ShutdownAsync` (cancel probe CTSs, dispose runtimes, clear dictionaries) before
|
|
rethrowing, so a failed initialise leaves no live background work.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Driver.AbLegacy-010
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Error handling & resilience |
|
|
| Location | `AbLegacyStatusMapper.cs:26-56` |
|
|
| Status | Open |
|
|
|
|
**Description:** `MapLibplctagStatus` maps the integer codes -5/-7/-14/-16/-17. These
|
|
do not match the native libplctag PLCTAG_ERR_* constants (PLCTAG_ERR_TIMEOUT = -32,
|
|
PLCTAG_ERR_NOT_FOUND = -22, PLCTAG_ERR_NOT_ALLOWED = -21, PLCTAG_ERR_OUT_OF_BOUNDS =
|
|
-25, PLCTAG_ERR_BAD_CONNECTION = -8). The mapper operates on `(int)_tag.GetStatus()`,
|
|
where `GetStatus()` returns the libplctag .NET wrapper Status enum whose underlying
|
|
ordinals differ from the native codes - so the -5/-7/... values are at best the .NET
|
|
enum ordinals (unverified, undocumented) and at worst wrong. Any unmatched negative
|
|
status falls through to `BadCommunicationError`, so a timeout is reported as a generic
|
|
comms error rather than `BadTimeout`. `MapPcccStatus` is dead code - the PCCC STS byte
|
|
is never inspected because libplctag surfaces only its own status enum.
|
|
|
|
**Recommendation:** Verify the actual `libplctag.Status` enum values against the 1.5.2
|
|
package and map by enum name rather than magic integers. Either wire `MapPcccStatus`
|
|
into a real PCCC-STS path or delete it as dead code. The same defect exists in
|
|
`AbCipStatusMapper` and should be fixed in lockstep.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Driver.AbLegacy-011
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Performance & resource management |
|
|
| Location | `AbLegacyDriver.cs:440` |
|
|
| Status | Open |
|
|
|
|
**Description:** `Dispose()` is implemented as
|
|
`DisposeAsync().AsTask().GetAwaiter().GetResult()` - sync-over-async. `ShutdownAsync`
|
|
awaits `_poll.DisposeAsync()` (which completes synchronously) and does no other real
|
|
async work, so a deadlock is unlikely in practice, but the pattern blocks the calling
|
|
thread and would deadlock if any awaited continuation were ever marshalled back to a
|
|
single-threaded synchronization context.
|
|
|
|
**Recommendation:** Prefer callers use `IAsyncDisposable`. If a synchronous `Dispose()`
|
|
must exist, perform the synchronous teardown directly (cancel CTSs, dispose runtimes)
|
|
rather than blocking on the async path.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Driver.AbLegacy-012
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Medium |
|
|
| Category | Design-document adherence |
|
|
| Location | `PlcFamilies/AbLegacyPlcFamilyProfile.cs:7-54`, `AbLegacyDriver.cs:48-52` |
|
|
| Status | Open |
|
|
|
|
**Description:** `AbLegacyPlcFamilyProfile` declares four record properties -
|
|
`DefaultCipPath`, `MaxTagBytes`, `SupportsStringFile`, `SupportsLongFile` - and only
|
|
`LibplctagPlcAttribute` is ever consumed. In particular:
|
|
- `DefaultCipPath` is dead: the per-family default path (empty for MicroLogix, 1,0
|
|
for SLC/PLC-5) is never used to substitute an empty CIP path. The CIP path always
|
|
comes verbatim from `AbLegacyHostAddress.CipPath`, so a SLC 500 misconfigured with
|
|
an empty path is never corrected to 1,0 even though the profile knows the right
|
|
default - contradicting the test-fixture doc, which calls out the /1,0 cip-path
|
|
workaround as required for SLC.
|
|
- `MaxTagBytes` is never used to validate or chunk a string/array read.
|
|
- `SupportsStringFile`/`SupportsLongFile` are never checked, so a `String` or `Long`
|
|
tag configured against a MicroLogix or PLC-5 (which the profile says lack them) is
|
|
accepted and only fails at runtime with an opaque comms error.
|
|
|
|
**Recommendation:** Either consume the profile fields (substitute `DefaultCipPath` when
|
|
the host CIP path is empty; reject `Long`/`String` tags against families whose profile
|
|
sets the corresponding flag false; use `MaxTagBytes` for validation) or remove the
|
|
unused fields and the doc comments that imply they are load-bearing.
|
|
|
|
**Resolution:** _(open)_
|
|
|
|
### Driver.AbLegacy-013
|
|
|
|
| Field | Value |
|
|
|---|---|
|
|
| Severity | Low |
|
|
| Category | Code organization & conventions |
|
|
| Location | `AbLegacyDriver.cs:340-345`, `AbLegacyDriver.cs:238-264` |
|
|
| Status | Open |
|
|
|
|
**Description:** Two minor organisational issues:
|
|
1. `ResolveHost` returns `_options.Devices.FirstOrDefault()?.HostAddress ??
|
|
DriverInstanceId` when the reference is unknown and no devices are configured.
|
|
`DriverInstanceId` is not a host address (ab://...), so a downstream
|
|
`IHostConnectivityProbe` / host lookup keyed on the returned value never matches a
|
|
real device. Returning the instance id as a fake host masks a configuration error.
|
|
2. `DiscoverAsync` always emits `IsArray: false` / `ArrayDim: null`. PCCC files are
|
|
inherently arrays of elements; a tag that genuinely addresses a multi-element
|
|
region cannot be represented. This is consistent with the PR-staged scope (the doc
|
|
says array coverage is thin) but should be tracked rather than silently shipped.
|
|
|
|
**Recommendation:** For (1), either throw / return a sentinel the caller can detect, or
|
|
document why falling back to the instance id is acceptable. For (2), record the
|
|
array-addressing gap as a tracked follow-up.
|
|
|
|
**Resolution:** _(open)_
|