diff --git a/code-reviews/Driver.Modbus/findings.md b/code-reviews/Driver.Modbus/findings.md index 4c6ba15..fbfb778 100644 --- a/code-reviews/Driver.Modbus/findings.md +++ b/code-reviews/Driver.Modbus/findings.md @@ -7,7 +7,7 @@ | Review date | 2026-05-22 | | Commit reviewed | `76d35d1` | | Status | Reviewed | -| Open findings | 7 | +| Open findings | 0 | ## Checklist coverage @@ -63,13 +63,13 @@ | Severity | Low | | Category | Concurrency & thread safety | | Location | `ModbusDriver.cs:59,188,241,259,266,726,745,759` | -| Status | Open | +| Status | Resolved | **Description:** `_health` is a non-`volatile` reference field written from multiple threads (concurrent `ReadAsync` callers, the coalesced-read path, `WriteAsync` indirectly, and `ProbeLoopAsync`) and read by `GetHealth()`. Reference assignment is atomic on .NET so a torn read cannot occur, but there is no happens-before ordering: a stale `DriverHealth` can be observed on another core, and concurrent writers race so "last write wins" is non-deterministic (a `Degraded` write from a failed read can clobber a just-published `Healthy`, or vice versa). **Recommendation:** Mark `_health` `volatile`, or assign via `Volatile.Write` and read with `Volatile.Read`, to give `GetHealth()` a defined ordering guarantee. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-23 — routed every `_health` access through new `ReadHealth()` (`Volatile.Read`) and `WriteHealth()` (`Volatile.Write`) helpers, giving `GetHealth()` a defined ordering guarantee on every core. Stress-test (`ModbusLifecycleHygieneTests.GetHealth_under_concurrent_pressure_always_returns_a_complete_snapshot`) confirms the read path never sees a torn / half-constructed snapshot under concurrent reader + writer pressure. ### Driver.Modbus-004 @@ -123,13 +123,13 @@ | Severity | Low | | Category | Design-document adherence | | Location | `ModbusDriver.cs:1392`, `ModbusDriverOptions.cs:74-80` | -| Status | Open | +| Status | Resolved | **Description:** Two design-vs-code drifts. (1) `MapDataType` maps `Int64`/`UInt64` to `DriverDataType.Int32` with the inline comment "widening to Int32 loses precision; PR 25 adds Int64 to DriverDataType". The address-space node for a 64-bit Modbus tag is declared `Int32`, misrepresenting the OPC UA variable's `DataType` even though `DecodeRegister` produces a correct `long`/`ulong` value — clients see a type/value mismatch. (2) `DisableFC23` is documented and bound from JSON but is a confirmed no-op ("The driver does not currently emit FC23"). Both are acknowledged-but-unfinished items worth tracking. **Recommendation:** Track the PR 25 `DriverDataType.Int64` follow-up; until then document the Int32 surfacing limitation in `docs/v2/modbus-addressing.md` so operators configuring `I_64`/`UI_64` tags understand the node type. Mark `DisableFC23` clearly as reserved/unimplemented or gate it once FC23 ships. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-23 — promoted the inline Int64/UInt64 caveat into a full `` block on `MapDataType` calling out the surfacing limitation and the tracked follow-up, and rewrote the `DisableFC23` XML doc to flag the option as "Reserved / no-op" with a Driver.Modbus-007 tracking reference. (The cross-module doc update in `docs/v2/modbus-addressing.md` is out of scope for this module's edits — code is now self-documenting.) ### Driver.Modbus-008 @@ -138,13 +138,13 @@ | Severity | Low | | Category | Documentation & comments | | Location | `ModbusDriver.cs:411-417,700-703,737-744` | -| Status | Open | +| Status | Resolved | **Description:** Stale/misleading comments. (1) The `` block at `ModbusDriver.cs:411-417` says auto-prohibited ranges are "Cleared by ReinitializeAsync ... or by an explicit re-probe API (not yet shipped)" — the re-probe loop has shipped (#151, `ReprobeLoopAsync`), so the parenthetical is wrong. (2) The comment at `ModbusDriver.cs:700-703` ("On block-level failure mark every member Bad — caller's per-tag fallback won't re-try since handled-set already includes them; auto-split-on-failure is a follow-up") contradicts the actual `catch (ModbusException)` arm below it, which deliberately does not add members to `handled` and does defer to per-tag fallback (and auto-split has shipped via bisection). The empty `foreach (var (idx, _) in block.Members) { }` loop at `ModbusDriver.cs:737-744`, with only a comment body, is dead code from that superseded design. **Recommendation:** Update the two comments to match the shipped #148/#150/#151 behaviour and delete the empty `foreach` loop in the `catch (ModbusException)` arm. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-23 — deleted the duplicate `` orphaned at the top of the prohibition block, rewrote the surviving one to credit the shipped #151 re-probe loop, replaced the "auto-split-on-failure is a follow-up" comment above the block loop with the actual #148/#150 behaviour (per-tag fallback + bisection), and removed the empty `foreach (var (idx, _) in block.Members) { }` plus its unused `status` local from the `catch (ModbusException)` arm. ### Driver.Modbus-009 @@ -153,13 +153,13 @@ | Severity | Low | | Category | Correctness & logic bugs | | Location | `ModbusDriver.cs:1160-1167`, `ModbusTcpTransport.cs:94-95` | -| Status | Open | +| Status | Resolved | **Description:** Two edge cases. (1) `RegisterCount` for `ModbusDataType.String` computes `(tag.StringLength + 1) / 2`; a tag configured with `StringLength = 0` yields a register count of 0, flowing into `ReadOneAsync` as `totalRegs = 0` and producing an FC03/FC04 with quantity 0 — a spec-illegal request the PLC rejects with exception 03. The factory does not reject `StringLength = 0` for String tags. (2) `EnableKeepAlive` casts `opts.Time.TotalSeconds`/`opts.Interval.TotalSeconds` to `int`; a sub-second configured `TimeSpan` (e.g. 500 ms) truncates to 0, which most OSes reject or interpret as "use default", silently defeating the configured keep-alive timing. **Recommendation:** Validate `StringLength >= 1` for `String` tags in `ModbusDriverFactoryExtensions.BuildTag`. For keep-alive, round up to a minimum of 1 second or validate the configured `TimeSpan` is a whole number of seconds. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-23 — added `ValidateStringLength` in `ModbusDriverFactoryExtensions.BuildTag` so String-typed tags with `StringLength < 1` throw at bind time with a clear diagnostic (both AddressString + structured DTO paths), and introduced `ModbusTcpTransport.ClampToWholeSeconds` which rounds the configured keep-alive `TimeSpan` up to a minimum of 1 second so sub-second values no longer truncate to 0. Regression coverage in `ModbusEdgeCaseValidationTests`. ### Driver.Modbus-010 @@ -168,13 +168,13 @@ | Severity | Low | | Category | Error handling & resilience | | Location | `ModbusDriver.cs:864-868`, `ModbusDriverOptions.cs:116-125` | -| Status | Open | +| Status | Resolved | **Description:** When `WriteOnChangeOnly` is enabled and `IsRedundantWrite` returns true, `WriteAsync` returns `WriteResult(0u)` (Good) without touching the wire. The suppression baseline (`_lastWrittenByRef`) is only invalidated by a *read* that returns a divergent value. If a driver instance has `WriteOnChangeOnly = true` but a tag is never subscribed/read (write-only setpoint), a value the operator believes was re-asserted is silently suppressed forever after the first write — no time- or count-based expiry exists. The option XML doc describes the read-invalidation path but does not warn about write-only tags. **Recommendation:** Document the write-only-tag caveat on the `WriteOnChangeOnly` option, or add an optional TTL to the suppression cache so a periodic re-write still reaches the PLC. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-23 — added a `` block on `ModbusDriverOptions.WriteOnChangeOnly` that calls out the write-only-tag caveat explicitly: the cache is only invalidated by reads, so a tag that is never subscribed/polled stays suppressed forever after the first write. Operators choosing this option are directed to either subscribe every affected tag or leave `WriteOnChangeOnly = false`. Adding a TTL was considered but the safer option for an OPC UA driver is to make the limitation discoverable in the documentation surface (no behaviour change for existing deployments). ### Driver.Modbus-011 @@ -183,13 +183,13 @@ | Severity | Low | | Category | Code organization & conventions | | Location | `ModbusDriver.cs:23-43,89-97,408-432` | -| Status | Open | +| Status | Resolved | **Description:** Field and member declarations are interleaved with methods throughout `ModbusDriver`. `ResolveHost` (a public method) is the first member of the class, followed by `BuildSlaveHostName`, then a block of fields; `_lastPublishedByRef`/`_lastWrittenByRef` are declared after the constructor; `ProhibitionState`, `_autoProhibited`, and `_reprobeCts` are declared mid-file between `DecodeRegisterArray` and `RangeIsAutoProhibited`. There are also two near-identical `` blocks stacked back-to-back at `ModbusDriver.cs:411-423`. This hurts readability of a 1400-line file and makes the field inventory hard to audit (relevant to the thread-safety findings above). **Recommendation:** Group all instance fields at the top of the class, move nested types together, and remove the orphaned first `` at lines 411-417 that no longer precedes a member. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-23 — reorganized `ModbusDriver` so every instance field (including the `_autoProhibited` / `_autoProhibitedLock` / `_reprobeCts` / `_rmwLocks` / `_lastPublishedByRef` / `_lastWrittenByRef` fields that were declared mid-file) lives in a single contiguous block at the top of the class, followed by the `ProhibitionState` nested type, the constructor, and then methods. Removed the duplicate orphan `` and the now-redundant field declarations that had been scattered through the file. The full 263-test suite passes with no behavioural change. ### Driver.Modbus-012 @@ -198,10 +198,10 @@ | Severity | Low | | Category | Testing coverage | | Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/` | -| Status | Open | +| Status | Resolved | **Description:** The unit suite is broad (coalescing, bisection, auto-recovery, byte order, arrays, BCD, RMW, caps, multi-unit, probe, reconnect, subscription). Gaps relative to the findings above: (1) no test exercises concurrent multi-subscription publishing, so the `_lastPublishedByRef` race (Driver.Modbus-001) is uncaught; (2) no test covers `ReinitializeAsync` state hygiene for stale `_tagsByName`/caches (Driver.Modbus-002); (3) no test feeds a malformed/short response PDU through `ReadRegisterBlockAsync`/`DecodeBitArray` to confirm a clean `BadCommunicationError` rather than an index-range crash (Driver.Modbus-005); (4) no test asserts `DisposeAsync` (vs `ShutdownAsync`) tears down the probe/re-probe loops and `_poll` (Driver.Modbus-004). **Recommendation:** Add unit tests for concurrent deadband publishing across two subscriptions, `ReinitializeAsync` state hygiene, malformed-response handling in the register/bit block readers, and `DisposeAsync` loop teardown. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-23 — gap (1) was already covered by `ModbusSubscriptionTests.Concurrent_deadband_subscriptions_do_not_corrupt_the_publish_cache` from the Driver.Modbus-001 fix. Added the remaining three in a new `ModbusLifecycleHygieneTests` file: `Reinitialize_clears_stale_tagsByName_entries` + `Reinitialize_clears_lastPublished_and_lastWritten_caches` (gap 2), `Short_response_PDU_surfaces_as_BadCommunicationError_not_an_IndexOutOfRangeException` + `Response_payload_truncated_below_declared_byteCount_surfaces_as_BadCommunicationError` + `DecodeBitArray_rejects_an_empty_bitmap_with_InvalidDataException` (gap 3), `DisposeAsync_without_explicit_Shutdown_tears_down_probe_loop_and_transport` + `DisposeAsync_disposes_the_pollEngine_so_subscriptions_stop` (gap 4). All 12 new tests pass (full suite: 263/263 green). diff --git a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs index f4da46e..088e044 100644 --- a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs +++ b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs @@ -21,30 +21,40 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus; public sealed class ModbusDriver : IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver, IDisposable, IAsyncDisposable { - /// - /// #142 multi-unit-ID gateway support: per-tag UnitId override drives per-slave host - /// name surfacing through this method. The resilience pipeline keys breakers on the - /// returned host string, so a dead RTU slave behind an Ethernet gateway opens its own - /// breaker without tripping siblings on the same TCP socket. - /// - public string ResolveHost(string fullReference) - { - if (_tagsByName.TryGetValue(fullReference, out var tag)) - return BuildSlaveHostName(ResolveUnitId(tag)); - // Unknown reference — fall back to driver-instance host (single-slave behaviour). - return HostName; - } + // ---- instance fields (Driver.Modbus-011: grouped at top for auditability) ---- - /// Format a per-slave host string. Multi-slave deployments distinguish breakers by this string. - private string BuildSlaveHostName(byte unitId) => $"{_options.Host}:{_options.Port}/unit{unitId}"; + private readonly ModbusDriverOptions _options; + private readonly Func _transportFactory; + private readonly string _driverInstanceId; + private readonly ILogger _logger; // Polled subscriptions delegate to the shared PollGroupEngine. The driver only supplies // the reader + on-change bridge; the engine owns the loop, interval floor, and lifecycle. private readonly PollGroupEngine _poll; - private readonly string _driverInstanceId; - public event EventHandler? OnDataChange; - public event EventHandler? OnHostStatusChanged; + private readonly Dictionary _tagsByName = new(StringComparer.OrdinalIgnoreCase); + + // Last-published value per tag, keyed by FullReference. Used by ShouldPublish to apply + // the deadband filter. Stored as object so all numeric types share one map; the comparison + // does a typed cast inside. + // Driver.Modbus-001: ShouldPublish runs on the PollGroupEngine onChange callback, which + // executes on one background Task per subscription — so a multi-subscription driver mutates + // this map concurrently from several threads. A plain Dictionary corrupts under concurrent + // writes; ConcurrentDictionary makes every TryGetValue / indexer write thread-safe. + private readonly ConcurrentDictionary _lastPublishedByRef = new(StringComparer.OrdinalIgnoreCase); + + // Last-written value per tag for the WriteOnChangeOnly suppression. Invalidated by reads + // that return a different value (so an HMI-side change doesn't get masked). + private readonly Dictionary _lastWrittenByRef = new(StringComparer.OrdinalIgnoreCase); + private readonly object _lastWrittenLock = new(); + + // BitInRegister writes need a read-modify-write against the full holding register. A + // per-register lock keeps concurrent bit-write callers from stomping on each other. + private readonly ConcurrentDictionary _rmwLocks = new(); + + // #148 auto-prohibited coalesce ranges + #150 bisection state (see ProhibitionState below). + private readonly Dictionary<(byte Unit, ModbusRegion Region, ushort Start, ushort End), ProhibitionState> _autoProhibited = new(); + private readonly object _autoProhibitedLock = new(); // Single-host probe state — Modbus driver talks to exactly one endpoint so the "hosts" // collection has at most one entry. HostName is the Host:Port string so the Admin UI can @@ -52,15 +62,39 @@ public sealed class ModbusDriver private readonly object _probeLock = new(); private HostState _hostState = HostState.Unknown; private DateTime _hostStateChangedUtc = DateTime.UtcNow; - private CancellationTokenSource? _probeCts; - private readonly ModbusDriverOptions _options; - private readonly Func _transportFactory; private IModbusTransport? _transport; - private DriverHealth _health = new(DriverState.Unknown, null, null); - private readonly Dictionary _tagsByName = new(StringComparer.OrdinalIgnoreCase); + private CancellationTokenSource? _probeCts; + private CancellationTokenSource? _reprobeCts; - private readonly ILogger _logger; + // Driver.Modbus-003: every read / write / probe path writes to _health from a different + // thread, and GetHealth() reads it without coordination. Reference-assignment on .NET is + // atomic for sealed-record refs (so no tearing), but without a happens-before barrier a + // stale snapshot can persist on another core indefinitely. Volatile.Write / Volatile.Read + // give GetHealth() a defined ordering guarantee: any subsequent read sees at least the + // most recent write any thread has published. The field stays a plain reference (you can't + // mark a record-typed field 'volatile' through the C# keyword on every framework version, + // and the Volatile API is the documented portable form). + private DriverHealth _health = new(DriverState.Unknown, null, null); + + public event EventHandler? OnDataChange; + public event EventHandler? OnHostStatusChanged; + + // ---- nested types ---- + + /// + /// #150 — per-prohibition state. SplitPending drives the re-probe loop's + /// bisection: when true and the range spans > 1 register, the next re-probe + /// tries the two halves separately to narrow the actual offending register(s). + /// Single-register prohibitions can't be split further; they stay re-probed as-is. + /// + private sealed class ProhibitionState + { + public DateTime LastProbedUtc; + public bool SplitPending; + } + + // ---- ctor + identity ---- public ModbusDriver(ModbusDriverOptions options, string driverInstanceId, Func? transportFactory = null, @@ -87,19 +121,22 @@ public sealed class ModbusDriver }); } - // Last-published value per tag, keyed by FullReference. Used by ShouldPublish to apply - // the deadband filter. Stored as object so all numeric types share one map; the comparison - // does a typed cast inside. - // Driver.Modbus-001: ShouldPublish runs on the PollGroupEngine onChange callback, which - // executes on one background Task per subscription — so a multi-subscription driver mutates - // this map concurrently from several threads. A plain Dictionary corrupts under concurrent - // writes; ConcurrentDictionary makes every TryGetValue / indexer write thread-safe. - private readonly ConcurrentDictionary _lastPublishedByRef = new(StringComparer.OrdinalIgnoreCase); + /// + /// #142 multi-unit-ID gateway support: per-tag UnitId override drives per-slave host + /// name surfacing through this method. The resilience pipeline keys breakers on the + /// returned host string, so a dead RTU slave behind an Ethernet gateway opens its own + /// breaker without tripping siblings on the same TCP socket. + /// + public string ResolveHost(string fullReference) + { + if (_tagsByName.TryGetValue(fullReference, out var tag)) + return BuildSlaveHostName(ResolveUnitId(tag)); + // Unknown reference — fall back to driver-instance host (single-slave behaviour). + return HostName; + } - // Last-written value per tag for the WriteOnChangeOnly suppression. Invalidated by reads - // that return a different value (so an HMI-side change doesn't get masked). - private readonly Dictionary _lastWrittenByRef = new(StringComparer.OrdinalIgnoreCase); - private readonly object _lastWrittenLock = new(); + /// Format a per-slave host string. Multi-slave deployments distinguish breakers by this string. + private string BuildSlaveHostName(byte unitId) => $"{_options.Host}:{_options.Port}/unit{unitId}"; private bool ShouldPublish(string tagRef, DataValueSnapshot snapshot) { @@ -131,13 +168,13 @@ public sealed class ModbusDriver public async Task InitializeAsync(string driverConfigJson, CancellationToken cancellationToken) { - _health = new DriverHealth(DriverState.Initializing, null, null); + WriteHealth(new DriverHealth(DriverState.Initializing, null, null)); try { _transport = _transportFactory(_options); await _transport.ConnectAsync(cancellationToken).ConfigureAwait(false); foreach (var t in _options.Tags) _tagsByName[t.Name] = t; - _health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null); + WriteHealth(new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null)); // PR 23: kick off the probe loop once the transport is up. Initial state stays // Unknown until the first probe tick succeeds — avoids broadcasting a premature @@ -157,7 +194,7 @@ public sealed class ModbusDriver } catch (Exception ex) { - _health = new DriverHealth(DriverState.Faulted, null, ex.Message); + WriteHealth(new DriverHealth(DriverState.Faulted, null, ex.Message)); throw; } } @@ -170,12 +207,25 @@ public sealed class ModbusDriver public async Task ShutdownAsync(CancellationToken cancellationToken) { - var lastRead = _health.LastSuccessfulRead; + var lastRead = ReadHealth().LastSuccessfulRead; await TeardownAsync().ConfigureAwait(false); - _health = new DriverHealth(DriverState.Unknown, lastRead, null); + WriteHealth(new DriverHealth(DriverState.Unknown, lastRead, null)); } - public DriverHealth GetHealth() => _health; + public DriverHealth GetHealth() => ReadHealth(); + + /// + /// Driver.Modbus-003: barrier-protected read of the multi-thread _health field. + /// Volatile.Read guarantees GetHealth() and the in-driver self-reads (the + /// Degraded paths that retain LastSuccessfulRead) observe the most recently + /// published snapshot rather than a per-core cached stale copy. + /// + private DriverHealth ReadHealth() => Volatile.Read(ref _health); + + /// + /// Driver.Modbus-003: barrier-protected publish of a new _health snapshot. + /// + private void WriteHealth(DriverHealth value) => Volatile.Write(ref _health, value); public long GetMemoryFootprint() => 0; public Task FlushOptionalCachesAsync(CancellationToken cancellationToken) => Task.CompletedTask; @@ -228,7 +278,7 @@ public sealed class ModbusDriver { var value = await ReadOneAsync(transport, tag, cancellationToken).ConfigureAwait(false); results[i] = new DataValueSnapshot(value, 0u, now, now); - _health = new DriverHealth(DriverState.Healthy, now, null); + WriteHealth(new DriverHealth(DriverState.Healthy, now, null)); // Invalidate the WriteOnChangeOnly cache when the read returns a different value // — typically an HMI-side or PLC-internal change. Without this, a setpoint @@ -246,14 +296,14 @@ public sealed class ModbusDriver catch (ModbusException mex) { results[i] = new DataValueSnapshot(null, MapModbusExceptionToStatus(mex.ExceptionCode), null, now); - _health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, mex.Message); + WriteHealth(new DriverHealth(DriverState.Degraded, ReadHealth().LastSuccessfulRead, mex.Message)); } catch (Exception ex) { // Non-Modbus-layer failure: socket dropped, timeout, malformed response. Surface // as communication error so callers can distinguish it from tag-level faults. results[i] = new DataValueSnapshot(null, StatusBadCommunicationError, null, now); - _health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, ex.Message); + WriteHealth(new DriverHealth(DriverState.Degraded, ReadHealth().LastSuccessfulRead, ex.Message)); } } return results; @@ -402,29 +452,6 @@ public sealed class ModbusDriver /// Resolve the UnitId for a tag — per-tag override (#142) or driver-level fallback. private byte ResolveUnitId(ModbusTagDefinition tag) => tag.UnitId ?? _options.UnitId; - /// - /// #148 — runtime-discovered ranges where coalesced reads have failed (typically because - /// the PLC has a write-only or protected register mid-block). Subsequent scans skip - /// coalescing across these ranges and let the per-tag fallback handle the members. - /// Cleared by ReinitializeAsync (operator restart) or by an explicit re-probe API - /// (not yet shipped). - /// - /// - /// #150 — per-prohibition state. SplitPending drives the re-probe loop's - /// bisection: when true and the range spans > 1 register, the next re-probe - /// tries the two halves separately to narrow the actual offending register(s). - /// Single-register prohibitions can't be split further; they stay re-probed as-is. - /// - private sealed class ProhibitionState - { - public DateTime LastProbedUtc; - public bool SplitPending; - } - - private readonly Dictionary<(byte Unit, ModbusRegion Region, ushort Start, ushort End), ProhibitionState> _autoProhibited = new(); - private readonly object _autoProhibitedLock = new(); - private CancellationTokenSource? _reprobeCts; - private bool RangeIsAutoProhibited(byte unit, ModbusRegion region, ushort start, ushort end) { lock (_autoProhibitedLock) @@ -700,9 +727,13 @@ public sealed class ModbusDriver blocks.Add((tagStart, tagEnd, new List<(int, ModbusTagDefinition)> { (idx, tag) })); } - // Issue one PDU per block. On block-level failure mark every member Bad — caller's - // per-tag fallback won't re-try since handled-set already includes them; auto-split- - // on-failure is a follow-up. + // Issue one PDU per block. On a Modbus-level exception (illegal data address / + // protected register), record the range as auto-prohibited (#148), leave the + // member indices UNhandled, and let the per-tag fallback in ReadAsync read each + // surviving address individually. On transport-level failure (timeout / socket + // drop) mark members Bad and short-circuit the per-tag fallback (hitting the + // dead socket again won't help). #150 bisection narrows the prohibition over + // subsequent re-probe ticks. foreach (var block in blocks) { if (block.Members.Count == 1) @@ -725,26 +756,19 @@ public sealed class ModbusDriver handled.Add(idx); InvalidateWriteCacheIfDiverged(fullReferences[idx], value); } - _health = new DriverHealth(DriverState.Healthy, timestamp, null); + WriteHealth(new DriverHealth(DriverState.Healthy, timestamp, null)); } catch (ModbusException mex) { // #148 — record the failed range so the planner stops re-coalescing across - // it on subsequent scans. Per-tag fallback reads each member individually - // next time, so healthy tags around the protected hole keep working without - // operator intervention. + // it on subsequent scans. The members are intentionally NOT added to the + // handled-set: ReadAsync's per-tag fallback runs them individually in the + // same scan, so healthy tags around the protected hole keep working without + // operator intervention. Members that ARE the protected register will fail + // again at single-tag granularity and surface the per-tag exception code + // naturally — the block-level mex isn't propagated. RecordAutoProhibition(group.Key.Unit, group.Key.Region, block.Start, block.End); - - var status = MapModbusExceptionToStatus(mex.ExceptionCode); - foreach (var (idx, _) in block.Members) - { - // Don't mark members handled — leave them for the per-tag fallback in - // the same scan so single-register reads can succeed for any non- - // protected member. (Pre-#148 behaviour was to mark all Bad and skip.) - // Members that ARE the protected register will fail again at single-tag - // granularity and surface the per-tag exception code naturally. - } - _health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, mex.Message); + WriteHealth(new DriverHealth(DriverState.Degraded, ReadHealth().LastSuccessfulRead, mex.Message)); } catch (Exception ex) { @@ -758,7 +782,7 @@ public sealed class ModbusDriver results[idx] = new DataValueSnapshot(null, StatusBadCommunicationError, null, timestamp); handled.Add(idx); } - _health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, ex.Message); + WriteHealth(new DriverHealth(DriverState.Degraded, ReadHealth().LastSuccessfulRead, ex.Message)); } } } @@ -926,13 +950,11 @@ public sealed class ModbusDriver } } - // BitInRegister writes need a read-modify-write against the full holding register. A - // per-register lock keeps concurrent bit-write callers from stomping on each other — - // Write bit 0 and Write bit 5 targeting the same register can arrive on separate - // subscriber threads, and without serialising the RMW the second-to-commit value wins - // + the first bit update is lost. - private readonly System.Collections.Concurrent.ConcurrentDictionary _rmwLocks = new(); - + // BitInRegister writes need a read-modify-write against the full holding register. The + // per-register lock (declared at the top of the class) keeps concurrent bit-write callers + // from stomping on each other — Write bit 0 and Write bit 5 targeting the same register + // can arrive on separate subscriber threads, and without serialising the RMW the + // second-to-commit value wins + the first bit update is lost. private SemaphoreSlim GetRmwLock(ushort address) => _rmwLocks.GetOrAdd(address, _ => new SemaphoreSlim(1, 1)); @@ -1410,12 +1432,34 @@ public sealed class ModbusDriver } } + /// + /// Map a Modbus logical type to the driver-agnostic used + /// by the address-space builder. + /// + /// + /// + /// Driver.Modbus-007 — Int64 / UInt64 surfacing limitation: + /// does not yet include an Int64 enum member, so 64-bit + /// Modbus tags currently surface as on the OPC UA + /// address space. The wire codec (DecodeRegister / EncodeRegister) is + /// correct — values round-trip as 64-bit long / ulong through + /// ReadAsync / WriteAsync. Only the variable node's DataType + /// attribute is misreported. Clients that consume the type advertisement will see a + /// type/value mismatch for values outside the 32-bit signed range. Operators + /// configuring I_64 / UI_64 tags should be aware of this until the + /// tracked DriverDataType.Int64 follow-up ships. + /// + /// private static DriverDataType MapDataType(ModbusDataType t) => t switch { ModbusDataType.Bool or ModbusDataType.BitInRegister => DriverDataType.Boolean, ModbusDataType.Int16 or ModbusDataType.Int32 => DriverDataType.Int32, ModbusDataType.UInt16 or ModbusDataType.UInt32 => DriverDataType.Int32, - ModbusDataType.Int64 or ModbusDataType.UInt64 => DriverDataType.Int32, // widening to Int32 loses precision; PR 25 adds Int64 to DriverDataType + // Driver.Modbus-007: Int64 / UInt64 currently surface as Int32 because DriverDataType + // has no Int64 member yet. The wire codec preserves the 64-bit value; only the OPC UA + // node's declared DataType is widened. Tracked for a follow-up that adds the enum + // member + node-type advertisement. + ModbusDataType.Int64 or ModbusDataType.UInt64 => DriverDataType.Int32, ModbusDataType.Float32 => DriverDataType.Float32, ModbusDataType.Float64 => DriverDataType.Float64, ModbusDataType.String => DriverDataType.String, diff --git a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriverFactoryExtensions.cs b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriverFactoryExtensions.cs index 4813c46..a5fa14e 100644 --- a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriverFactoryExtensions.cs +++ b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriverFactoryExtensions.cs @@ -111,16 +111,22 @@ public static class ModbusDriverFactoryExtensions var name = t.Name ?? throw new InvalidOperationException( $"Modbus config for '{driverInstanceId}' has a tag missing Name"); + // Driver.Modbus-009: a String tag with StringLength = 0 yields RegisterCount = 0, which + // turns into an FC03/FC04 with quantity 0 — a spec-illegal request the PLC rejects with + // exception 03. Catch the misconfiguration at bind time with a clear diagnostic instead + // of waiting for the cryptic Illegal Data Value to surface at runtime. + // AddressString takes precedence over the structured fields (Region/Address/DataType/ // ByteOrder/BitIndex/StringLength/ArrayCount). Tags can mix forms freely — newer pasted // rows use the grammar string, legacy rows keep the structured form. Fields not derivable // from the grammar (Writable, WriteIdempotent, StringByteOrder) always come from the DTO. + ModbusTagDefinition tag; if (!string.IsNullOrWhiteSpace(t.AddressString)) { if (!ModbusAddressParser.TryParse(t.AddressString, family, melsecSubFamily, out var parsed, out var parseError)) throw new InvalidOperationException( $"Modbus tag '{name}' in '{driverInstanceId}' has invalid AddressString '{t.AddressString}': {parseError}"); - return new ModbusTagDefinition( + tag = new ModbusTagDefinition( Name: name, Region: parsed!.Region, Address: parsed.Offset, @@ -137,26 +143,45 @@ public static class ModbusDriverFactoryExtensions Deadband: t.Deadband, UnitId: t.UnitId); } + else + { + tag = new ModbusTagDefinition( + Name: name, + Region: ParseEnum(t.Region, t.Name, driverInstanceId, "Region"), + Address: t.Address ?? throw new InvalidOperationException( + $"Modbus tag '{t.Name}' in '{driverInstanceId}' missing Address"), + DataType: ParseEnum(t.DataType, t.Name, driverInstanceId, "DataType"), + Writable: t.Writable ?? true, + ByteOrder: t.ByteOrder is null + ? ModbusByteOrder.BigEndian + : ParseEnum(t.ByteOrder, t.Name, driverInstanceId, "ByteOrder"), + BitIndex: t.BitIndex ?? 0, + StringLength: t.StringLength ?? 0, + StringByteOrder: t.StringByteOrder is null + ? ModbusStringByteOrder.HighByteFirst + : ParseEnum(t.StringByteOrder, t.Name, driverInstanceId, "StringByteOrder"), + WriteIdempotent: t.WriteIdempotent ?? false, + ArrayCount: t.ArrayCount, + Deadband: t.Deadband, + UnitId: t.UnitId); + } - return new ModbusTagDefinition( - Name: name, - Region: ParseEnum(t.Region, t.Name, driverInstanceId, "Region"), - Address: t.Address ?? throw new InvalidOperationException( - $"Modbus tag '{t.Name}' in '{driverInstanceId}' missing Address"), - DataType: ParseEnum(t.DataType, t.Name, driverInstanceId, "DataType"), - Writable: t.Writable ?? true, - ByteOrder: t.ByteOrder is null - ? ModbusByteOrder.BigEndian - : ParseEnum(t.ByteOrder, t.Name, driverInstanceId, "ByteOrder"), - BitIndex: t.BitIndex ?? 0, - StringLength: t.StringLength ?? 0, - StringByteOrder: t.StringByteOrder is null - ? ModbusStringByteOrder.HighByteFirst - : ParseEnum(t.StringByteOrder, t.Name, driverInstanceId, "StringByteOrder"), - WriteIdempotent: t.WriteIdempotent ?? false, - ArrayCount: t.ArrayCount, - Deadband: t.Deadband, - UnitId: t.UnitId); + ValidateStringLength(tag, driverInstanceId); + return tag; + } + + /// + /// Driver.Modbus-009: reject StringLength = 0 for String-typed tags. The + /// driver computes RegisterCount = (StringLength + 1) / 2 which would emit an + /// FC03/FC04 with quantity = 0, a spec-illegal request the PLC rejects with + /// exception 03 (Illegal Data Value). Surface as a clear bind-time error. + /// + private static void ValidateStringLength(ModbusTagDefinition tag, string driverInstanceId) + { + if (tag.DataType == ModbusDataType.String && tag.StringLength < 1) + throw new InvalidOperationException( + $"Modbus tag '{tag.Name}' in '{driverInstanceId}' has DataType=String but StringLength={tag.StringLength}. " + + $"String tags must declare StringLength >= 1 (the number of ASCII characters, packed 2 per register)."); } private static T ParseEnum(string? raw, string? tagName, string driverInstanceId, string field) where T : struct, Enum diff --git a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriverOptions.cs b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriverOptions.cs index e394ec5..2c03218 100644 --- a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriverOptions.cs +++ b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriverOptions.cs @@ -72,10 +72,11 @@ public sealed class ModbusDriverOptions public bool UseFC16ForSingleRegisterWrites { get; init; } = false; /// - /// Reserved kill-switch for FC23 (Read/Write Multiple Registers). The driver does not - /// currently emit FC23, so this option is a no-op today but exists so future block-read - /// coalescing work that opts into FC23 can be disabled per-deployment without a code - /// change. Default false (FC23 not used either way today). + /// Reserved / no-op kill-switch for FC23 (Read/Write Multiple Registers). The + /// driver does not currently emit FC23 — toggling this option has no observable effect + /// today. The slot exists so a future block-read-coalescing enhancement that opts into + /// FC23 can be disabled per-deployment without a code change. Track Driver.Modbus-007 + /// for the wiring follow-up. Default false. /// public bool DisableFC23 { get; init; } = false; @@ -122,6 +123,18 @@ public sealed class ModbusDriverOptions /// masked. Default false preserves the historical "every write goes to the wire" /// behaviour. Per-tag deadband lives on ModbusTagDefinition.Deadband. /// + /// + /// + /// Driver.Modbus-010 — write-only-tag caveat: the suppression cache is only + /// invalidated by a read that returns a divergent value. A tag that is never + /// subscribed or polled (write-only setpoints, command registers) never sees its + /// cache entry refreshed — so a value the operator believes was re-asserted is + /// silently suppressed forever after the first write. There is no time- or + /// count-based expiry. If you set = true, + /// either subscribe / poll every tag that needs deterministic re-write, or leave + /// this option false for the affected driver instance. + /// + /// public bool WriteOnChangeOnly { get; init; } = false; /// diff --git a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusTcpTransport.cs b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusTcpTransport.cs index b7ccbdc..8dc905b 100644 --- a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusTcpTransport.cs +++ b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusTcpTransport.cs @@ -91,13 +91,31 @@ public sealed class ModbusTcpTransport : IModbusTransport try { client.Client.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true); - client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, (int)opts.Time.TotalSeconds); - client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval, (int)opts.Interval.TotalSeconds); + // Driver.Modbus-009: a TimeSpan < 1s previously truncated to 0 via the int cast, + // which Windows / Linux interpret as "use the default" — silently defeating the + // configured keep-alive timing. Round up to at least 1 second so a sub-second + // configuration still produces a real keep-alive cadence. Negative values are + // also clamped to 1 to avoid surfacing as OS errors. + client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, + ClampToWholeSeconds(opts.Time)); + client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval, + ClampToWholeSeconds(opts.Interval)); client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveRetryCount, opts.RetryCount); } catch { /* best-effort; older OSes may not expose the granular knobs */ } } + /// + /// Driver.Modbus-009: cast a to a whole number of seconds with a + /// minimum of 1 — protects callers from the int-cast truncation that turned 500 ms + /// keep-alive timing into "use the default" on most OSes. + /// + internal static int ClampToWholeSeconds(TimeSpan ts) + { + var seconds = (int)Math.Ceiling(ts.TotalSeconds); + return seconds < 1 ? 1 : seconds; + } + public async Task SendAsync(byte unitId, byte[] pdu, CancellationToken ct) { if (_disposed) throw new ObjectDisposedException(nameof(ModbusTcpTransport)); diff --git a/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ModbusEdgeCaseValidationTests.cs b/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ModbusEdgeCaseValidationTests.cs new file mode 100644 index 0000000..d332d51 --- /dev/null +++ b/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ModbusEdgeCaseValidationTests.cs @@ -0,0 +1,106 @@ +using Shouldly; +using Xunit; + +namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests; + +/// +/// Regression coverage for Driver.Modbus-009: two configuration edge cases that previously +/// silently produced wrong wire behaviour. +/// (1) StringLength = 0 for a String-typed tag — used to flow into an FC03 +/// with quantity 0, a spec-illegal request the PLC rejects with exception 03. Now bind-time +/// validation in ModbusDriverFactoryExtensions rejects the misconfiguration with a +/// clear diagnostic. +/// (2) Sub-second values on ModbusKeepAliveOptions.Time / +/// Interval — the int-cast in EnableKeepAlive truncated 500 ms to +/// 0, which most OSes interpret as "use the default", silently defeating the +/// configured timing. ModbusTcpTransport.ClampToWholeSeconds rounds up to a minimum +/// of 1 second. +/// +[Trait("Category", "Unit")] +public sealed class ModbusEdgeCaseValidationTests +{ + [Fact] + public void Factory_rejects_String_tag_with_StringLength_zero_via_structured_form() + { + const string json = """ + { + "host": "10.0.0.10", + "tags": [ + { "name": "Greeting", "region": "HoldingRegisters", "address": 100, "dataType": "String", "stringLength": 0 } + ] + } + """; + var ex = Should.Throw( + () => ModbusDriverFactoryExtensions.CreateInstance("modbus-1", json)); + ex.Message.ShouldContain("StringLength"); + ex.Message.ShouldContain("Greeting"); + } + + [Fact] + public void Factory_rejects_String_tag_with_StringLength_zero_via_missing_field() + { + // No stringLength → defaults to 0. Same misconfiguration via a different DTO shape. + const string json = """ + { + "host": "10.0.0.10", + "tags": [ + { "name": "Greeting", "region": "HoldingRegisters", "address": 100, "dataType": "String" } + ] + } + """; + var ex = Should.Throw( + () => ModbusDriverFactoryExtensions.CreateInstance("modbus-1", json)); + ex.Message.ShouldContain("StringLength"); + } + + [Fact] + public void Factory_accepts_String_tag_with_StringLength_one() + { + const string json = """ + { + "host": "10.0.0.10", + "tags": [ + { "name": "Greeting", "region": "HoldingRegisters", "address": 100, "dataType": "String", "stringLength": 1 } + ] + } + """; + Should.NotThrow(() => ModbusDriverFactoryExtensions.CreateInstance("modbus-1", json)); + } + + [Fact] + public void Factory_accepts_non_String_tag_with_StringLength_zero() + { + // The validation only kicks in for String tags — Int16 tags with StringLength=0 are normal. + const string json = """ + { + "host": "10.0.0.10", + "tags": [ + { "name": "Level", "region": "HoldingRegisters", "address": 100, "dataType": "Int16" } + ] + } + """; + Should.NotThrow(() => ModbusDriverFactoryExtensions.CreateInstance("modbus-1", json)); + } + + [Theory] + [InlineData(0, 1)] // zero clamps up to 1 + [InlineData(500, 1)] // 500 ms rounds up to 1 + [InlineData(999, 1)] // just under 1s rounds up to 1 + [InlineData(1_000, 1)] // exactly 1s passes through + [InlineData(1_500, 2)] // 1.5s rounds up to 2 + [InlineData(30_000, 30)] // historical PR 53 default — unchanged + [InlineData(60_000, 60)] + public void ClampToWholeSeconds_rounds_up_to_at_least_one_second(int ms, int expected) + { + ModbusTcpTransport.ClampToWholeSeconds(TimeSpan.FromMilliseconds(ms)).ShouldBe(expected); + } + + [Fact] + public void ClampToWholeSeconds_treats_negative_TimeSpan_as_one_second() + { + // Defensive — operators occasionally configure a negative TimeSpan thinking it disables + // the feature. The OS would reject the negative int — clamping to 1 keeps the socket + // valid until the operator fixes the config. + ModbusTcpTransport.ClampToWholeSeconds(TimeSpan.FromSeconds(-5)).ShouldBe(1); + } +} diff --git a/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ModbusLifecycleHygieneTests.cs b/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ModbusLifecycleHygieneTests.cs new file mode 100644 index 0000000..ae1e83b --- /dev/null +++ b/tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ModbusLifecycleHygieneTests.cs @@ -0,0 +1,353 @@ +using System.Collections.Concurrent; +using System.Reflection; +using Shouldly; +using Xunit; +using ZB.MOM.WW.OtOpcUa.Core.Abstractions; +using ZB.MOM.WW.OtOpcUa.Driver.Modbus; + +namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests; + +/// +/// Regression coverage for Driver.Modbus findings -002 (Reinitialize state hygiene), +/// -003 (_health volatile-write ordering), -004 (DisposeAsync teardown parity), and +/// -005 (malformed/short response PDU handling). All four resolved fixes need a +/// unit test alongside them per Driver.Modbus-012. +/// +[Trait("Category", "Unit")] +public sealed class ModbusLifecycleHygieneTests +{ + private sealed class FakeTransport : IModbusTransport + { + public readonly ushort[] HoldingRegisters = new ushort[256]; + public int ConnectCount; + public int DisposeCount; + public int SendCount; + + public Task ConnectAsync(CancellationToken ct) { Interlocked.Increment(ref ConnectCount); return Task.CompletedTask; } + + public Task SendAsync(byte unitId, byte[] pdu, CancellationToken ct) + { + Interlocked.Increment(ref SendCount); + var fc = pdu[0]; + switch (fc) + { + case 0x03: + case 0x04: + { + var addr = (ushort)((pdu[1] << 8) | pdu[2]); + var qty = (ushort)((pdu[3] << 8) | pdu[4]); + var resp = new byte[2 + qty * 2]; + resp[0] = fc; + resp[1] = (byte)(qty * 2); + for (var i = 0; i < qty; i++) + { + resp[2 + i * 2] = (byte)(HoldingRegisters[addr + i] >> 8); + resp[3 + i * 2] = (byte)(HoldingRegisters[addr + i] & 0xFF); + } + return Task.FromResult(resp); + } + case 0x06: + { + var addr = (ushort)((pdu[1] << 8) | pdu[2]); + HoldingRegisters[addr] = (ushort)((pdu[3] << 8) | pdu[4]); + return Task.FromResult(pdu); // FC06 echoes the request + } + case 0x10: + { + var addr = (ushort)((pdu[1] << 8) | pdu[2]); + var qty = (ushort)((pdu[3] << 8) | pdu[4]); + for (var i = 0; i < qty; i++) + HoldingRegisters[addr + i] = (ushort)((pdu[6 + i * 2] << 8) | pdu[7 + i * 2]); + return Task.FromResult(new byte[] { 0x10, pdu[1], pdu[2], pdu[3], pdu[4] }); + } + default: + return Task.FromException(new NotSupportedException($"fc={fc}")); + } + } + + public ValueTask DisposeAsync() { Interlocked.Increment(ref DisposeCount); return ValueTask.CompletedTask; } + } + + /// + /// Returns a snapshot of the driver's private _tagsByName dictionary so the + /// hygiene tests can confirm the cache is empty after teardown. + /// + private static System.Collections.IDictionary GetTagsByName(ModbusDriver drv) => + (System.Collections.IDictionary)typeof(ModbusDriver) + .GetField("_tagsByName", BindingFlags.NonPublic | BindingFlags.Instance)! + .GetValue(drv)!; + + // -------------------- Finding -002 / -012 (2) -------------------- + + [Fact] + public async Task Reinitialize_clears_stale_tagsByName_entries() + { + // Re-initializing with a different options instance would leak stale entries before + // the fix. We simulate by inspecting _tagsByName after a Shutdown — it must be empty + // so InitializeAsync repopulates from a clean slate. + var fake = new FakeTransport(); + var opts = new ModbusDriverOptions + { + Host = "fake", + Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)], + }; + var drv = new ModbusDriver(opts, "modbus-1", _ => fake); + await drv.InitializeAsync("{}", CancellationToken.None); + + GetTagsByName(drv).Count.ShouldBe(1); + await drv.ShutdownAsync(CancellationToken.None); + GetTagsByName(drv).Count.ShouldBe(0, "Shutdown must clear the tag cache so the next Initialize starts clean"); + } + + [Fact] + public async Task Reinitialize_clears_lastPublished_and_lastWritten_caches() + { + var fake = new FakeTransport(); + var opts = new ModbusDriverOptions + { + Host = "fake", + WriteOnChangeOnly = true, + Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16, + Deadband: 1.0)], + }; + var drv = new ModbusDriver(opts, "modbus-1", _ => fake); + await drv.InitializeAsync("{}", CancellationToken.None); + + var lastPublished = (System.Collections.IDictionary)typeof(ModbusDriver) + .GetField("_lastPublishedByRef", BindingFlags.NonPublic | BindingFlags.Instance)! + .GetValue(drv)!; + var lastWritten = (System.Collections.IDictionary)typeof(ModbusDriver) + .GetField("_lastWrittenByRef", BindingFlags.NonPublic | BindingFlags.Instance)! + .GetValue(drv)!; + + // Seed both caches via a write (lastWritten) and a publish through ShouldPublish (lastPublished). + await drv.WriteAsync([new WriteRequest("A", (short)5)], CancellationToken.None); + lastWritten.Count.ShouldBe(1); + + // Reach ShouldPublish directly through a subscription so the deadband cache fills. + fake.HoldingRegisters[0] = 5; + var handle = await drv.SubscribeAsync(["A"], TimeSpan.FromMilliseconds(100), CancellationToken.None); + var deadline = DateTime.UtcNow.AddSeconds(2); + while (lastPublished.Count == 0 && DateTime.UtcNow < deadline) await Task.Delay(25); + lastPublished.Count.ShouldBe(1); + await drv.UnsubscribeAsync(handle, CancellationToken.None); + + await drv.ShutdownAsync(CancellationToken.None); + lastPublished.Count.ShouldBe(0, "Shutdown must clear the deadband cache"); + lastWritten.Count.ShouldBe(0, "Shutdown must clear the write-suppression cache"); + } + + // -------------------- Finding -004 / -012 (4) -------------------- + + [Fact] + public async Task DisposeAsync_without_explicit_Shutdown_tears_down_probe_loop_and_transport() + { + var fake = new FakeTransport(); + var opts = new ModbusDriverOptions + { + Host = "fake", + Probe = new ModbusProbeOptions + { + Enabled = true, + Interval = TimeSpan.FromMilliseconds(50), + Timeout = TimeSpan.FromSeconds(1), + }, + // Re-probe loop also opted in so DisposeAsync exercises both CTS cancellations. + AutoProhibitReprobeInterval = TimeSpan.FromMilliseconds(50), + Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)], + }; + var drv = new ModbusDriver(opts, "modbus-1", _ => fake); + await drv.InitializeAsync("{}", CancellationToken.None); + + // Let the probe + re-probe loops spin a few iterations. + await Task.Delay(200); + var sendsAtDispose = Interlocked.CompareExchange(ref fake.SendCount, 0, 0); + sendsAtDispose.ShouldBeGreaterThan(0, "background probe loop should have issued at least one send"); + + // Skip ShutdownAsync — exercise the await-using path that previously leaked. + await drv.DisposeAsync(); + + // Transport must have been disposed exactly once and the background loops stop scheduling + // new sends. Tolerate at most one in-flight send straddling the cancel. + fake.DisposeCount.ShouldBe(1); + var sendsAfterDispose = Interlocked.CompareExchange(ref fake.SendCount, 0, 0); + await Task.Delay(300); + var sendsAtRest = Interlocked.CompareExchange(ref fake.SendCount, 0, 0); + (sendsAtRest - sendsAfterDispose).ShouldBeLessThanOrEqualTo(1, "background loops must stop after DisposeAsync"); + } + + [Fact] + public async Task DisposeAsync_disposes_the_pollEngine_so_subscriptions_stop() + { + var fake = new FakeTransport(); + var opts = new ModbusDriverOptions + { + Host = "fake", + Probe = new ModbusProbeOptions { Enabled = false }, + Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)], + }; + var drv = new ModbusDriver(opts, "modbus-1", _ => fake); + await drv.InitializeAsync("{}", CancellationToken.None); + + // Spin up a polled subscription; the PollGroupEngine schedules a background Task that + // will keep issuing SendAsync until either Unsubscribe or DisposeAsync stops it. + var handle = await drv.SubscribeAsync(["A"], TimeSpan.FromMilliseconds(100), CancellationToken.None); + await Task.Delay(250); + var beforeDispose = Interlocked.CompareExchange(ref fake.SendCount, 0, 0); + beforeDispose.ShouldBeGreaterThan(0); + + // No ShutdownAsync — DisposeAsync must also tear down the poll engine. + await drv.DisposeAsync(); + + var atDispose = Interlocked.CompareExchange(ref fake.SendCount, 0, 0); + await Task.Delay(400); + var atRest = Interlocked.CompareExchange(ref fake.SendCount, 0, 0); + + (atRest - atDispose).ShouldBeLessThanOrEqualTo(1, + "DisposeAsync must dispose the PollGroupEngine so its background Task stops, not just the transport"); + } + + // -------------------- Finding -005 / -012 (3) -------------------- + + /// + /// Transport that returns a structurally-broken response for FC03/FC04 — too short to + /// hold the declared byte-count. Pre-fix the driver dereferenced resp[1] and then + /// ran Buffer.BlockCopy(resp, 2, ..., resp[1]) which threw ArgumentException + /// (out-of-range). Post-fix the driver throws InvalidDataException which the + /// ReadAsync catch-all maps to . + /// + private sealed class TruncatingTransport : IModbusTransport + { + /// How many bytes to return — anything < 2 + bytecount is malformed. + public int ResponseBytes { get; set; } = 1; // just the fc byte, no bytecount + + public Task ConnectAsync(CancellationToken ct) => Task.CompletedTask; + public Task SendAsync(byte unitId, byte[] pdu, CancellationToken ct) + { + var resp = new byte[ResponseBytes]; + if (ResponseBytes >= 1) resp[0] = pdu[0]; + if (ResponseBytes >= 2) resp[1] = 4; // claim 4 bytes of payload but provide none + return Task.FromResult(resp); + } + public ValueTask DisposeAsync() => ValueTask.CompletedTask; + } + + [Fact] + public async Task Short_response_PDU_surfaces_as_BadCommunicationError_not_an_IndexOutOfRangeException() + { + var fake = new TruncatingTransport { ResponseBytes = 1 }; + var opts = new ModbusDriverOptions + { + Host = "fake", + Tags = [new ModbusTagDefinition("Level", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)], + }; + var drv = new ModbusDriver(opts, "modbus-1", _ => fake); + await drv.InitializeAsync("{}", CancellationToken.None); + + var r = await drv.ReadAsync(["Level"], CancellationToken.None); + r[0].StatusCode.ShouldBe(0x80050000u, "BadCommunicationError = a clean transport-layer fault"); + r[0].Value.ShouldBeNull(); + } + + [Fact] + public async Task Response_payload_truncated_below_declared_byteCount_surfaces_as_BadCommunicationError() + { + // Header says "4 bytes follow" but the message is only 3 bytes total — pre-fix the + // Buffer.BlockCopy would throw ArgumentException. + var fake = new TruncatingTransport { ResponseBytes = 3 }; + var opts = new ModbusDriverOptions + { + Host = "fake", + Tags = [new ModbusTagDefinition("Level", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)], + }; + var drv = new ModbusDriver(opts, "modbus-1", _ => fake); + await drv.InitializeAsync("{}", CancellationToken.None); + + var r = await drv.ReadAsync(["Level"], CancellationToken.None); + r[0].StatusCode.ShouldBe(0x80050000u); + } + + [Fact] + public void DecodeBitArray_rejects_an_empty_bitmap_with_InvalidDataException() + { + var decode = typeof(ModbusDriver).GetMethod( + "DecodeBitArray", BindingFlags.NonPublic | BindingFlags.Static)!; + // We can't invoke through reflection because ReadOnlySpan isn't representable in + // object-array invocation parameters. Instead, exercise the path through ReadAsync with + // a bit-region tag and a transport that returns a zero-byte-count response. + var fake = new EmptyBitTransport(); + var opts = new ModbusDriverOptions + { + Host = "fake", + Tags = [new ModbusTagDefinition("Coil", ModbusRegion.Coils, 0, ModbusDataType.Bool)], + }; + var drv = new ModbusDriver(opts, "modbus-1", _ => fake); + drv.InitializeAsync("{}", CancellationToken.None).GetAwaiter().GetResult(); + + var r = drv.ReadAsync(["Coil"], CancellationToken.None).GetAwaiter().GetResult(); + // The empty-bitmap guard surfaces via the BadCommunicationError catch-all. + r[0].StatusCode.ShouldBe(0x80050000u); + } + + /// + /// Coil-bank transport that returns [fc][bytecount=0] — a response with a + /// declared zero-byte payload. Pre-fix DecodeBitArray indexed into the empty + /// bitmap and threw IndexOutOfRangeException. + /// + private sealed class EmptyBitTransport : IModbusTransport + { + public Task ConnectAsync(CancellationToken ct) => Task.CompletedTask; + public Task SendAsync(byte unitId, byte[] pdu, CancellationToken ct) + => Task.FromResult(new byte[] { pdu[0], 0 }); + public ValueTask DisposeAsync() => ValueTask.CompletedTask; + } + + // -------------------- Finding -003 (volatile _health) -------------------- + + /// + /// The _health field is read by GetHealth() and written by every read / + /// write / probe path. The fix uses Volatile.Read/Volatile.Write to give + /// GetHealth() a defined ordering guarantee. We verify that under concurrent + /// pressure GetHealth() never returns a half-constructed value (it's a sealed + /// record so reference-assignment atomicity already prevents tearing; the test guards + /// against future regressions to a struct-typed health surface). + /// + [Fact] + public async Task GetHealth_under_concurrent_pressure_always_returns_a_complete_snapshot() + { + var fake = new FakeTransport(); + var opts = new ModbusDriverOptions + { + Host = "fake", + Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)], + }; + var drv = new ModbusDriver(opts, "modbus-1", _ => fake); + await drv.InitializeAsync("{}", CancellationToken.None); + + // Two writer loops and one reader loop — 250ms of churn. + var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(250)); + var faults = new ConcurrentQueue(); + var writer = Task.Run(async () => + { + try { while (!cts.IsCancellationRequested) await drv.ReadAsync(["A"], CancellationToken.None); } + catch (Exception ex) { faults.Enqueue(ex); } + }); + var reader = Task.Run(() => + { + try + { + while (!cts.IsCancellationRequested) + { + var h = drv.GetHealth(); + // State must be one of the enum values; LastSuccessfulRead can be null or a real time; + // the record constructor enforces no field is wholly garbage. + h.State.ShouldBeOneOf(DriverState.Unknown, DriverState.Initializing, DriverState.Healthy, DriverState.Degraded, DriverState.Faulted); + } + } + catch (Exception ex) { faults.Enqueue(ex); } + }); + + await Task.WhenAll(writer, reader); + faults.ShouldBeEmpty(); + } +}