fix(driver-modbus): resolve Low code-review findings (Driver.Modbus-003,007,008,009,010,011,012)

- Driver.Modbus-003: route every _health access through ReadHealth /
  WriteHealth helpers backed by Volatile.Read / Volatile.Write so a
  burst of concurrent ReadAsync callers always sees a complete snapshot.
- Driver.Modbus-007: promoted the Int64 / UInt64 → Int32 surfacing
  caveat to a full <remarks> block; rewrote DisableFC23's doc to flag it
  as reserved / no-op.
- Driver.Modbus-008: deleted stale duplicate doc, rewrote the
  prohibition-block summaries to credit the shipped re-probe loop, and
  removed the unused 'status' local in the ModbusException catch arm.
- Driver.Modbus-009: bind-time validation rejects StringLength < 1 for
  String tags; ModbusTcpTransport clamps keep-alive intervals to whole
  seconds (>=1).
- Driver.Modbus-010: documented WriteOnChangeOnly's cache-invalidation
  policy (reads-only) and the write-only-tag caveat.
- Driver.Modbus-011: collected the scattered instance fields into a
  single contiguous block at the top of ModbusDriver.
- Driver.Modbus-012: covered the previously-uncovered Reinitialize
  state-hygiene, malformed/truncated/empty-bitmap response, and
  DisposeAsync teardown paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-23 08:17:51 -04:00
parent 3c75db7eb6
commit d5322b0f9a
7 changed files with 695 additions and 136 deletions

View File

@@ -7,7 +7,7 @@
| Review date | 2026-05-22 | | Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` | | Commit reviewed | `76d35d1` |
| Status | Reviewed | | Status | Reviewed |
| Open findings | 7 | | Open findings | 0 |
## Checklist coverage ## Checklist coverage
@@ -63,13 +63,13 @@
| Severity | Low | | Severity | Low |
| Category | Concurrency & thread safety | | Category | Concurrency & thread safety |
| Location | `ModbusDriver.cs:59,188,241,259,266,726,745,759` | | Location | `ModbusDriver.cs:59,188,241,259,266,726,745,759` |
| Status | Open | | Status | Resolved |
**Description:** `_health` is a non-`volatile` reference field written from multiple threads (concurrent `ReadAsync` callers, the coalesced-read path, `WriteAsync` indirectly, and `ProbeLoopAsync`) and read by `GetHealth()`. Reference assignment is atomic on .NET so a torn read cannot occur, but there is no happens-before ordering: a stale `DriverHealth` can be observed on another core, and concurrent writers race so "last write wins" is non-deterministic (a `Degraded` write from a failed read can clobber a just-published `Healthy`, or vice versa). **Description:** `_health` is a non-`volatile` reference field written from multiple threads (concurrent `ReadAsync` callers, the coalesced-read path, `WriteAsync` indirectly, and `ProbeLoopAsync`) and read by `GetHealth()`. Reference assignment is atomic on .NET so a torn read cannot occur, but there is no happens-before ordering: a stale `DriverHealth` can be observed on another core, and concurrent writers race so "last write wins" is non-deterministic (a `Degraded` write from a failed read can clobber a just-published `Healthy`, or vice versa).
**Recommendation:** Mark `_health` `volatile`, or assign via `Volatile.Write` and read with `Volatile.Read`, to give `GetHealth()` a defined ordering guarantee. **Recommendation:** Mark `_health` `volatile`, or assign via `Volatile.Write` and read with `Volatile.Read`, to give `GetHealth()` a defined ordering guarantee.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — routed every `_health` access through new `ReadHealth()` (`Volatile.Read`) and `WriteHealth()` (`Volatile.Write`) helpers, giving `GetHealth()` a defined ordering guarantee on every core. Stress-test (`ModbusLifecycleHygieneTests.GetHealth_under_concurrent_pressure_always_returns_a_complete_snapshot`) confirms the read path never sees a torn / half-constructed snapshot under concurrent reader + writer pressure.
### Driver.Modbus-004 ### Driver.Modbus-004
@@ -123,13 +123,13 @@
| Severity | Low | | Severity | Low |
| Category | Design-document adherence | | Category | Design-document adherence |
| Location | `ModbusDriver.cs:1392`, `ModbusDriverOptions.cs:74-80` | | Location | `ModbusDriver.cs:1392`, `ModbusDriverOptions.cs:74-80` |
| Status | Open | | Status | Resolved |
**Description:** Two design-vs-code drifts. (1) `MapDataType` maps `Int64`/`UInt64` to `DriverDataType.Int32` with the inline comment "widening to Int32 loses precision; PR 25 adds Int64 to DriverDataType". The address-space node for a 64-bit Modbus tag is declared `Int32`, misrepresenting the OPC UA variable's `DataType` even though `DecodeRegister` produces a correct `long`/`ulong` value — clients see a type/value mismatch. (2) `DisableFC23` is documented and bound from JSON but is a confirmed no-op ("The driver does not currently emit FC23"). Both are acknowledged-but-unfinished items worth tracking. **Description:** Two design-vs-code drifts. (1) `MapDataType` maps `Int64`/`UInt64` to `DriverDataType.Int32` with the inline comment "widening to Int32 loses precision; PR 25 adds Int64 to DriverDataType". The address-space node for a 64-bit Modbus tag is declared `Int32`, misrepresenting the OPC UA variable's `DataType` even though `DecodeRegister` produces a correct `long`/`ulong` value — clients see a type/value mismatch. (2) `DisableFC23` is documented and bound from JSON but is a confirmed no-op ("The driver does not currently emit FC23"). Both are acknowledged-but-unfinished items worth tracking.
**Recommendation:** Track the PR 25 `DriverDataType.Int64` follow-up; until then document the Int32 surfacing limitation in `docs/v2/modbus-addressing.md` so operators configuring `I_64`/`UI_64` tags understand the node type. Mark `DisableFC23` clearly as reserved/unimplemented or gate it once FC23 ships. **Recommendation:** Track the PR 25 `DriverDataType.Int64` follow-up; until then document the Int32 surfacing limitation in `docs/v2/modbus-addressing.md` so operators configuring `I_64`/`UI_64` tags understand the node type. Mark `DisableFC23` clearly as reserved/unimplemented or gate it once FC23 ships.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — promoted the inline Int64/UInt64 caveat into a full `<remarks>` block on `MapDataType` calling out the surfacing limitation and the tracked follow-up, and rewrote the `DisableFC23` XML doc to flag the option as "Reserved / no-op" with a Driver.Modbus-007 tracking reference. (The cross-module doc update in `docs/v2/modbus-addressing.md` is out of scope for this module's edits — code is now self-documenting.)
### Driver.Modbus-008 ### Driver.Modbus-008
@@ -138,13 +138,13 @@
| Severity | Low | | Severity | Low |
| Category | Documentation & comments | | Category | Documentation & comments |
| Location | `ModbusDriver.cs:411-417,700-703,737-744` | | Location | `ModbusDriver.cs:411-417,700-703,737-744` |
| Status | Open | | Status | Resolved |
**Description:** Stale/misleading comments. (1) The `<summary>` block at `ModbusDriver.cs:411-417` says auto-prohibited ranges are "Cleared by ReinitializeAsync ... or by an explicit re-probe API (not yet shipped)" — the re-probe loop has shipped (#151, `ReprobeLoopAsync`), so the parenthetical is wrong. (2) The comment at `ModbusDriver.cs:700-703` ("On block-level failure mark every member Bad — caller's per-tag fallback won't re-try since handled-set already includes them; auto-split-on-failure is a follow-up") contradicts the actual `catch (ModbusException)` arm below it, which deliberately does not add members to `handled` and does defer to per-tag fallback (and auto-split has shipped via bisection). The empty `foreach (var (idx, _) in block.Members) { }` loop at `ModbusDriver.cs:737-744`, with only a comment body, is dead code from that superseded design. **Description:** Stale/misleading comments. (1) The `<summary>` block at `ModbusDriver.cs:411-417` says auto-prohibited ranges are "Cleared by ReinitializeAsync ... or by an explicit re-probe API (not yet shipped)" — the re-probe loop has shipped (#151, `ReprobeLoopAsync`), so the parenthetical is wrong. (2) The comment at `ModbusDriver.cs:700-703` ("On block-level failure mark every member Bad — caller's per-tag fallback won't re-try since handled-set already includes them; auto-split-on-failure is a follow-up") contradicts the actual `catch (ModbusException)` arm below it, which deliberately does not add members to `handled` and does defer to per-tag fallback (and auto-split has shipped via bisection). The empty `foreach (var (idx, _) in block.Members) { }` loop at `ModbusDriver.cs:737-744`, with only a comment body, is dead code from that superseded design.
**Recommendation:** Update the two comments to match the shipped #148/#150/#151 behaviour and delete the empty `foreach` loop in the `catch (ModbusException)` arm. **Recommendation:** Update the two comments to match the shipped #148/#150/#151 behaviour and delete the empty `foreach` loop in the `catch (ModbusException)` arm.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — deleted the duplicate `<summary>` orphaned at the top of the prohibition block, rewrote the surviving one to credit the shipped #151 re-probe loop, replaced the "auto-split-on-failure is a follow-up" comment above the block loop with the actual #148/#150 behaviour (per-tag fallback + bisection), and removed the empty `foreach (var (idx, _) in block.Members) { }` plus its unused `status` local from the `catch (ModbusException)` arm.
### Driver.Modbus-009 ### Driver.Modbus-009
@@ -153,13 +153,13 @@
| Severity | Low | | Severity | Low |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Location | `ModbusDriver.cs:1160-1167`, `ModbusTcpTransport.cs:94-95` | | Location | `ModbusDriver.cs:1160-1167`, `ModbusTcpTransport.cs:94-95` |
| Status | Open | | Status | Resolved |
**Description:** Two edge cases. (1) `RegisterCount` for `ModbusDataType.String` computes `(tag.StringLength + 1) / 2`; a tag configured with `StringLength = 0` yields a register count of 0, flowing into `ReadOneAsync` as `totalRegs = 0` and producing an FC03/FC04 with quantity 0 — a spec-illegal request the PLC rejects with exception 03. The factory does not reject `StringLength = 0` for String tags. (2) `EnableKeepAlive` casts `opts.Time.TotalSeconds`/`opts.Interval.TotalSeconds` to `int`; a sub-second configured `TimeSpan` (e.g. 500 ms) truncates to 0, which most OSes reject or interpret as "use default", silently defeating the configured keep-alive timing. **Description:** Two edge cases. (1) `RegisterCount` for `ModbusDataType.String` computes `(tag.StringLength + 1) / 2`; a tag configured with `StringLength = 0` yields a register count of 0, flowing into `ReadOneAsync` as `totalRegs = 0` and producing an FC03/FC04 with quantity 0 — a spec-illegal request the PLC rejects with exception 03. The factory does not reject `StringLength = 0` for String tags. (2) `EnableKeepAlive` casts `opts.Time.TotalSeconds`/`opts.Interval.TotalSeconds` to `int`; a sub-second configured `TimeSpan` (e.g. 500 ms) truncates to 0, which most OSes reject or interpret as "use default", silently defeating the configured keep-alive timing.
**Recommendation:** Validate `StringLength >= 1` for `String` tags in `ModbusDriverFactoryExtensions.BuildTag`. For keep-alive, round up to a minimum of 1 second or validate the configured `TimeSpan` is a whole number of seconds. **Recommendation:** Validate `StringLength >= 1` for `String` tags in `ModbusDriverFactoryExtensions.BuildTag`. For keep-alive, round up to a minimum of 1 second or validate the configured `TimeSpan` is a whole number of seconds.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — added `ValidateStringLength` in `ModbusDriverFactoryExtensions.BuildTag` so String-typed tags with `StringLength < 1` throw at bind time with a clear diagnostic (both AddressString + structured DTO paths), and introduced `ModbusTcpTransport.ClampToWholeSeconds` which rounds the configured keep-alive `TimeSpan` up to a minimum of 1 second so sub-second values no longer truncate to 0. Regression coverage in `ModbusEdgeCaseValidationTests`.
### Driver.Modbus-010 ### Driver.Modbus-010
@@ -168,13 +168,13 @@
| Severity | Low | | Severity | Low |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Location | `ModbusDriver.cs:864-868`, `ModbusDriverOptions.cs:116-125` | | Location | `ModbusDriver.cs:864-868`, `ModbusDriverOptions.cs:116-125` |
| Status | Open | | Status | Resolved |
**Description:** When `WriteOnChangeOnly` is enabled and `IsRedundantWrite` returns true, `WriteAsync` returns `WriteResult(0u)` (Good) without touching the wire. The suppression baseline (`_lastWrittenByRef`) is only invalidated by a *read* that returns a divergent value. If a driver instance has `WriteOnChangeOnly = true` but a tag is never subscribed/read (write-only setpoint), a value the operator believes was re-asserted is silently suppressed forever after the first write — no time- or count-based expiry exists. The option XML doc describes the read-invalidation path but does not warn about write-only tags. **Description:** When `WriteOnChangeOnly` is enabled and `IsRedundantWrite` returns true, `WriteAsync` returns `WriteResult(0u)` (Good) without touching the wire. The suppression baseline (`_lastWrittenByRef`) is only invalidated by a *read* that returns a divergent value. If a driver instance has `WriteOnChangeOnly = true` but a tag is never subscribed/read (write-only setpoint), a value the operator believes was re-asserted is silently suppressed forever after the first write — no time- or count-based expiry exists. The option XML doc describes the read-invalidation path but does not warn about write-only tags.
**Recommendation:** Document the write-only-tag caveat on the `WriteOnChangeOnly` option, or add an optional TTL to the suppression cache so a periodic re-write still reaches the PLC. **Recommendation:** Document the write-only-tag caveat on the `WriteOnChangeOnly` option, or add an optional TTL to the suppression cache so a periodic re-write still reaches the PLC.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — added a `<remarks>` block on `ModbusDriverOptions.WriteOnChangeOnly` that calls out the write-only-tag caveat explicitly: the cache is only invalidated by reads, so a tag that is never subscribed/polled stays suppressed forever after the first write. Operators choosing this option are directed to either subscribe every affected tag or leave `WriteOnChangeOnly = false`. Adding a TTL was considered but the safer option for an OPC UA driver is to make the limitation discoverable in the documentation surface (no behaviour change for existing deployments).
### Driver.Modbus-011 ### Driver.Modbus-011
@@ -183,13 +183,13 @@
| Severity | Low | | Severity | Low |
| Category | Code organization & conventions | | Category | Code organization & conventions |
| Location | `ModbusDriver.cs:23-43,89-97,408-432` | | Location | `ModbusDriver.cs:23-43,89-97,408-432` |
| Status | Open | | Status | Resolved |
**Description:** Field and member declarations are interleaved with methods throughout `ModbusDriver`. `ResolveHost` (a public method) is the first member of the class, followed by `BuildSlaveHostName`, then a block of fields; `_lastPublishedByRef`/`_lastWrittenByRef` are declared after the constructor; `ProhibitionState`, `_autoProhibited`, and `_reprobeCts` are declared mid-file between `DecodeRegisterArray` and `RangeIsAutoProhibited`. There are also two near-identical `<summary>` blocks stacked back-to-back at `ModbusDriver.cs:411-423`. This hurts readability of a 1400-line file and makes the field inventory hard to audit (relevant to the thread-safety findings above). **Description:** Field and member declarations are interleaved with methods throughout `ModbusDriver`. `ResolveHost` (a public method) is the first member of the class, followed by `BuildSlaveHostName`, then a block of fields; `_lastPublishedByRef`/`_lastWrittenByRef` are declared after the constructor; `ProhibitionState`, `_autoProhibited`, and `_reprobeCts` are declared mid-file between `DecodeRegisterArray` and `RangeIsAutoProhibited`. There are also two near-identical `<summary>` blocks stacked back-to-back at `ModbusDriver.cs:411-423`. This hurts readability of a 1400-line file and makes the field inventory hard to audit (relevant to the thread-safety findings above).
**Recommendation:** Group all instance fields at the top of the class, move nested types together, and remove the orphaned first `<summary>` at lines 411-417 that no longer precedes a member. **Recommendation:** Group all instance fields at the top of the class, move nested types together, and remove the orphaned first `<summary>` at lines 411-417 that no longer precedes a member.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — reorganized `ModbusDriver` so every instance field (including the `_autoProhibited` / `_autoProhibitedLock` / `_reprobeCts` / `_rmwLocks` / `_lastPublishedByRef` / `_lastWrittenByRef` fields that were declared mid-file) lives in a single contiguous block at the top of the class, followed by the `ProhibitionState` nested type, the constructor, and then methods. Removed the duplicate orphan `<summary>` and the now-redundant field declarations that had been scattered through the file. The full 263-test suite passes with no behavioural change.
### Driver.Modbus-012 ### Driver.Modbus-012
@@ -198,10 +198,10 @@
| Severity | Low | | Severity | Low |
| Category | Testing coverage | | Category | Testing coverage |
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/` | | Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/` |
| Status | Open | | Status | Resolved |
**Description:** The unit suite is broad (coalescing, bisection, auto-recovery, byte order, arrays, BCD, RMW, caps, multi-unit, probe, reconnect, subscription). Gaps relative to the findings above: (1) no test exercises concurrent multi-subscription publishing, so the `_lastPublishedByRef` race (Driver.Modbus-001) is uncaught; (2) no test covers `ReinitializeAsync` state hygiene for stale `_tagsByName`/caches (Driver.Modbus-002); (3) no test feeds a malformed/short response PDU through `ReadRegisterBlockAsync`/`DecodeBitArray` to confirm a clean `BadCommunicationError` rather than an index-range crash (Driver.Modbus-005); (4) no test asserts `DisposeAsync` (vs `ShutdownAsync`) tears down the probe/re-probe loops and `_poll` (Driver.Modbus-004). **Description:** The unit suite is broad (coalescing, bisection, auto-recovery, byte order, arrays, BCD, RMW, caps, multi-unit, probe, reconnect, subscription). Gaps relative to the findings above: (1) no test exercises concurrent multi-subscription publishing, so the `_lastPublishedByRef` race (Driver.Modbus-001) is uncaught; (2) no test covers `ReinitializeAsync` state hygiene for stale `_tagsByName`/caches (Driver.Modbus-002); (3) no test feeds a malformed/short response PDU through `ReadRegisterBlockAsync`/`DecodeBitArray` to confirm a clean `BadCommunicationError` rather than an index-range crash (Driver.Modbus-005); (4) no test asserts `DisposeAsync` (vs `ShutdownAsync`) tears down the probe/re-probe loops and `_poll` (Driver.Modbus-004).
**Recommendation:** Add unit tests for concurrent deadband publishing across two subscriptions, `ReinitializeAsync` state hygiene, malformed-response handling in the register/bit block readers, and `DisposeAsync` loop teardown. **Recommendation:** Add unit tests for concurrent deadband publishing across two subscriptions, `ReinitializeAsync` state hygiene, malformed-response handling in the register/bit block readers, and `DisposeAsync` loop teardown.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-23 — gap (1) was already covered by `ModbusSubscriptionTests.Concurrent_deadband_subscriptions_do_not_corrupt_the_publish_cache` from the Driver.Modbus-001 fix. Added the remaining three in a new `ModbusLifecycleHygieneTests` file: `Reinitialize_clears_stale_tagsByName_entries` + `Reinitialize_clears_lastPublished_and_lastWritten_caches` (gap 2), `Short_response_PDU_surfaces_as_BadCommunicationError_not_an_IndexOutOfRangeException` + `Response_payload_truncated_below_declared_byteCount_surfaces_as_BadCommunicationError` + `DecodeBitArray_rejects_an_empty_bitmap_with_InvalidDataException` (gap 3), `DisposeAsync_without_explicit_Shutdown_tears_down_probe_loop_and_transport` + `DisposeAsync_disposes_the_pollEngine_so_subscriptions_stop` (gap 4). All 12 new tests pass (full suite: 263/263 green).

View File

@@ -21,30 +21,40 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus;
public sealed class ModbusDriver public sealed class ModbusDriver
: IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver, IDisposable, IAsyncDisposable : IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver, IDisposable, IAsyncDisposable
{ {
/// <summary> // ---- instance fields (Driver.Modbus-011: grouped at top for auditability) ----
/// #142 multi-unit-ID gateway support: per-tag UnitId override drives per-slave host
/// name surfacing through this method. The resilience pipeline keys breakers on the
/// returned host string, so a dead RTU slave behind an Ethernet gateway opens its own
/// breaker without tripping siblings on the same TCP socket.
/// </summary>
public string ResolveHost(string fullReference)
{
if (_tagsByName.TryGetValue(fullReference, out var tag))
return BuildSlaveHostName(ResolveUnitId(tag));
// Unknown reference — fall back to driver-instance host (single-slave behaviour).
return HostName;
}
/// <summary>Format a per-slave host string. Multi-slave deployments distinguish breakers by this string.</summary> private readonly ModbusDriverOptions _options;
private string BuildSlaveHostName(byte unitId) => $"{_options.Host}:{_options.Port}/unit{unitId}"; private readonly Func<ModbusDriverOptions, IModbusTransport> _transportFactory;
private readonly string _driverInstanceId;
private readonly ILogger<ModbusDriver> _logger;
// Polled subscriptions delegate to the shared PollGroupEngine. The driver only supplies // Polled subscriptions delegate to the shared PollGroupEngine. The driver only supplies
// the reader + on-change bridge; the engine owns the loop, interval floor, and lifecycle. // the reader + on-change bridge; the engine owns the loop, interval floor, and lifecycle.
private readonly PollGroupEngine _poll; private readonly PollGroupEngine _poll;
private readonly string _driverInstanceId;
public event EventHandler<DataChangeEventArgs>? OnDataChange; private readonly Dictionary<string, ModbusTagDefinition> _tagsByName = new(StringComparer.OrdinalIgnoreCase);
public event EventHandler<HostStatusChangedEventArgs>? OnHostStatusChanged;
// Last-published value per tag, keyed by FullReference. Used by ShouldPublish to apply
// the deadband filter. Stored as object so all numeric types share one map; the comparison
// does a typed cast inside.
// Driver.Modbus-001: ShouldPublish runs on the PollGroupEngine onChange callback, which
// executes on one background Task per subscription — so a multi-subscription driver mutates
// this map concurrently from several threads. A plain Dictionary corrupts under concurrent
// writes; ConcurrentDictionary makes every TryGetValue / indexer write thread-safe.
private readonly ConcurrentDictionary<string, object> _lastPublishedByRef = new(StringComparer.OrdinalIgnoreCase);
// Last-written value per tag for the WriteOnChangeOnly suppression. Invalidated by reads
// that return a different value (so an HMI-side change doesn't get masked).
private readonly Dictionary<string, object?> _lastWrittenByRef = new(StringComparer.OrdinalIgnoreCase);
private readonly object _lastWrittenLock = new();
// BitInRegister writes need a read-modify-write against the full holding register. A
// per-register lock keeps concurrent bit-write callers from stomping on each other.
private readonly ConcurrentDictionary<ushort, SemaphoreSlim> _rmwLocks = new();
// #148 auto-prohibited coalesce ranges + #150 bisection state (see ProhibitionState below).
private readonly Dictionary<(byte Unit, ModbusRegion Region, ushort Start, ushort End), ProhibitionState> _autoProhibited = new();
private readonly object _autoProhibitedLock = new();
// Single-host probe state — Modbus driver talks to exactly one endpoint so the "hosts" // Single-host probe state — Modbus driver talks to exactly one endpoint so the "hosts"
// collection has at most one entry. HostName is the Host:Port string so the Admin UI can // collection has at most one entry. HostName is the Host:Port string so the Admin UI can
@@ -52,15 +62,39 @@ public sealed class ModbusDriver
private readonly object _probeLock = new(); private readonly object _probeLock = new();
private HostState _hostState = HostState.Unknown; private HostState _hostState = HostState.Unknown;
private DateTime _hostStateChangedUtc = DateTime.UtcNow; private DateTime _hostStateChangedUtc = DateTime.UtcNow;
private CancellationTokenSource? _probeCts;
private readonly ModbusDriverOptions _options;
private readonly Func<ModbusDriverOptions, IModbusTransport> _transportFactory;
private IModbusTransport? _transport; private IModbusTransport? _transport;
private DriverHealth _health = new(DriverState.Unknown, null, null); private CancellationTokenSource? _probeCts;
private readonly Dictionary<string, ModbusTagDefinition> _tagsByName = new(StringComparer.OrdinalIgnoreCase); private CancellationTokenSource? _reprobeCts;
private readonly ILogger<ModbusDriver> _logger; // Driver.Modbus-003: every read / write / probe path writes to _health from a different
// thread, and GetHealth() reads it without coordination. Reference-assignment on .NET is
// atomic for sealed-record refs (so no tearing), but without a happens-before barrier a
// stale snapshot can persist on another core indefinitely. Volatile.Write / Volatile.Read
// give GetHealth() a defined ordering guarantee: any subsequent read sees at least the
// most recent write any thread has published. The field stays a plain reference (you can't
// mark a record-typed field 'volatile' through the C# keyword on every framework version,
// and the Volatile API is the documented portable form).
private DriverHealth _health = new(DriverState.Unknown, null, null);
public event EventHandler<DataChangeEventArgs>? OnDataChange;
public event EventHandler<HostStatusChangedEventArgs>? OnHostStatusChanged;
// ---- nested types ----
/// <summary>
/// #150 — per-prohibition state. <c>SplitPending</c> drives the re-probe loop's
/// bisection: when true and the range spans &gt; 1 register, the next re-probe
/// tries the two halves separately to narrow the actual offending register(s).
/// Single-register prohibitions can't be split further; they stay re-probed as-is.
/// </summary>
private sealed class ProhibitionState
{
public DateTime LastProbedUtc;
public bool SplitPending;
}
// ---- ctor + identity ----
public ModbusDriver(ModbusDriverOptions options, string driverInstanceId, public ModbusDriver(ModbusDriverOptions options, string driverInstanceId,
Func<ModbusDriverOptions, IModbusTransport>? transportFactory = null, Func<ModbusDriverOptions, IModbusTransport>? transportFactory = null,
@@ -87,19 +121,22 @@ public sealed class ModbusDriver
}); });
} }
// Last-published value per tag, keyed by FullReference. Used by ShouldPublish to apply /// <summary>
// the deadband filter. Stored as object so all numeric types share one map; the comparison /// #142 multi-unit-ID gateway support: per-tag UnitId override drives per-slave host
// does a typed cast inside. /// name surfacing through this method. The resilience pipeline keys breakers on the
// Driver.Modbus-001: ShouldPublish runs on the PollGroupEngine onChange callback, which /// returned host string, so a dead RTU slave behind an Ethernet gateway opens its own
// executes on one background Task per subscription — so a multi-subscription driver mutates /// breaker without tripping siblings on the same TCP socket.
// this map concurrently from several threads. A plain Dictionary corrupts under concurrent /// </summary>
// writes; ConcurrentDictionary makes every TryGetValue / indexer write thread-safe. public string ResolveHost(string fullReference)
private readonly ConcurrentDictionary<string, object> _lastPublishedByRef = new(StringComparer.OrdinalIgnoreCase); {
if (_tagsByName.TryGetValue(fullReference, out var tag))
return BuildSlaveHostName(ResolveUnitId(tag));
// Unknown reference — fall back to driver-instance host (single-slave behaviour).
return HostName;
}
// Last-written value per tag for the WriteOnChangeOnly suppression. Invalidated by reads /// <summary>Format a per-slave host string. Multi-slave deployments distinguish breakers by this string.</summary>
// that return a different value (so an HMI-side change doesn't get masked). private string BuildSlaveHostName(byte unitId) => $"{_options.Host}:{_options.Port}/unit{unitId}";
private readonly Dictionary<string, object?> _lastWrittenByRef = new(StringComparer.OrdinalIgnoreCase);
private readonly object _lastWrittenLock = new();
private bool ShouldPublish(string tagRef, DataValueSnapshot snapshot) private bool ShouldPublish(string tagRef, DataValueSnapshot snapshot)
{ {
@@ -131,13 +168,13 @@ public sealed class ModbusDriver
public async Task InitializeAsync(string driverConfigJson, CancellationToken cancellationToken) public async Task InitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
{ {
_health = new DriverHealth(DriverState.Initializing, null, null); WriteHealth(new DriverHealth(DriverState.Initializing, null, null));
try try
{ {
_transport = _transportFactory(_options); _transport = _transportFactory(_options);
await _transport.ConnectAsync(cancellationToken).ConfigureAwait(false); await _transport.ConnectAsync(cancellationToken).ConfigureAwait(false);
foreach (var t in _options.Tags) _tagsByName[t.Name] = t; foreach (var t in _options.Tags) _tagsByName[t.Name] = t;
_health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null); WriteHealth(new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null));
// PR 23: kick off the probe loop once the transport is up. Initial state stays // PR 23: kick off the probe loop once the transport is up. Initial state stays
// Unknown until the first probe tick succeeds — avoids broadcasting a premature // Unknown until the first probe tick succeeds — avoids broadcasting a premature
@@ -157,7 +194,7 @@ public sealed class ModbusDriver
} }
catch (Exception ex) catch (Exception ex)
{ {
_health = new DriverHealth(DriverState.Faulted, null, ex.Message); WriteHealth(new DriverHealth(DriverState.Faulted, null, ex.Message));
throw; throw;
} }
} }
@@ -170,12 +207,25 @@ public sealed class ModbusDriver
public async Task ShutdownAsync(CancellationToken cancellationToken) public async Task ShutdownAsync(CancellationToken cancellationToken)
{ {
var lastRead = _health.LastSuccessfulRead; var lastRead = ReadHealth().LastSuccessfulRead;
await TeardownAsync().ConfigureAwait(false); await TeardownAsync().ConfigureAwait(false);
_health = new DriverHealth(DriverState.Unknown, lastRead, null); WriteHealth(new DriverHealth(DriverState.Unknown, lastRead, null));
} }
public DriverHealth GetHealth() => _health; public DriverHealth GetHealth() => ReadHealth();
/// <summary>
/// Driver.Modbus-003: barrier-protected read of the multi-thread <c>_health</c> field.
/// <c>Volatile.Read</c> guarantees <c>GetHealth()</c> and the in-driver self-reads (the
/// Degraded paths that retain <c>LastSuccessfulRead</c>) observe the most recently
/// published snapshot rather than a per-core cached stale copy.
/// </summary>
private DriverHealth ReadHealth() => Volatile.Read(ref _health);
/// <summary>
/// Driver.Modbus-003: barrier-protected publish of a new <c>_health</c> snapshot.
/// </summary>
private void WriteHealth(DriverHealth value) => Volatile.Write(ref _health, value);
public long GetMemoryFootprint() => 0; public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken cancellationToken) => Task.CompletedTask; public Task FlushOptionalCachesAsync(CancellationToken cancellationToken) => Task.CompletedTask;
@@ -228,7 +278,7 @@ public sealed class ModbusDriver
{ {
var value = await ReadOneAsync(transport, tag, cancellationToken).ConfigureAwait(false); var value = await ReadOneAsync(transport, tag, cancellationToken).ConfigureAwait(false);
results[i] = new DataValueSnapshot(value, 0u, now, now); results[i] = new DataValueSnapshot(value, 0u, now, now);
_health = new DriverHealth(DriverState.Healthy, now, null); WriteHealth(new DriverHealth(DriverState.Healthy, now, null));
// Invalidate the WriteOnChangeOnly cache when the read returns a different value // Invalidate the WriteOnChangeOnly cache when the read returns a different value
// — typically an HMI-side or PLC-internal change. Without this, a setpoint // — typically an HMI-side or PLC-internal change. Without this, a setpoint
@@ -246,14 +296,14 @@ public sealed class ModbusDriver
catch (ModbusException mex) catch (ModbusException mex)
{ {
results[i] = new DataValueSnapshot(null, MapModbusExceptionToStatus(mex.ExceptionCode), null, now); results[i] = new DataValueSnapshot(null, MapModbusExceptionToStatus(mex.ExceptionCode), null, now);
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, mex.Message); WriteHealth(new DriverHealth(DriverState.Degraded, ReadHealth().LastSuccessfulRead, mex.Message));
} }
catch (Exception ex) catch (Exception ex)
{ {
// Non-Modbus-layer failure: socket dropped, timeout, malformed response. Surface // Non-Modbus-layer failure: socket dropped, timeout, malformed response. Surface
// as communication error so callers can distinguish it from tag-level faults. // as communication error so callers can distinguish it from tag-level faults.
results[i] = new DataValueSnapshot(null, StatusBadCommunicationError, null, now); results[i] = new DataValueSnapshot(null, StatusBadCommunicationError, null, now);
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, ex.Message); WriteHealth(new DriverHealth(DriverState.Degraded, ReadHealth().LastSuccessfulRead, ex.Message));
} }
} }
return results; return results;
@@ -402,29 +452,6 @@ public sealed class ModbusDriver
/// <summary>Resolve the UnitId for a tag — per-tag override (#142) or driver-level fallback.</summary> /// <summary>Resolve the UnitId for a tag — per-tag override (#142) or driver-level fallback.</summary>
private byte ResolveUnitId(ModbusTagDefinition tag) => tag.UnitId ?? _options.UnitId; private byte ResolveUnitId(ModbusTagDefinition tag) => tag.UnitId ?? _options.UnitId;
/// <summary>
/// #148 — runtime-discovered ranges where coalesced reads have failed (typically because
/// the PLC has a write-only or protected register mid-block). Subsequent scans skip
/// coalescing across these ranges and let the per-tag fallback handle the members.
/// Cleared by ReinitializeAsync (operator restart) or by an explicit re-probe API
/// (not yet shipped).
/// </summary>
/// <summary>
/// #150 — per-prohibition state. <c>SplitPending</c> drives the re-probe loop's
/// bisection: when true and the range spans &gt; 1 register, the next re-probe
/// tries the two halves separately to narrow the actual offending register(s).
/// Single-register prohibitions can't be split further; they stay re-probed as-is.
/// </summary>
private sealed class ProhibitionState
{
public DateTime LastProbedUtc;
public bool SplitPending;
}
private readonly Dictionary<(byte Unit, ModbusRegion Region, ushort Start, ushort End), ProhibitionState> _autoProhibited = new();
private readonly object _autoProhibitedLock = new();
private CancellationTokenSource? _reprobeCts;
private bool RangeIsAutoProhibited(byte unit, ModbusRegion region, ushort start, ushort end) private bool RangeIsAutoProhibited(byte unit, ModbusRegion region, ushort start, ushort end)
{ {
lock (_autoProhibitedLock) lock (_autoProhibitedLock)
@@ -700,9 +727,13 @@ public sealed class ModbusDriver
blocks.Add((tagStart, tagEnd, new List<(int, ModbusTagDefinition)> { (idx, tag) })); blocks.Add((tagStart, tagEnd, new List<(int, ModbusTagDefinition)> { (idx, tag) }));
} }
// Issue one PDU per block. On block-level failure mark every member Bad — caller's // Issue one PDU per block. On a Modbus-level exception (illegal data address /
// per-tag fallback won't re-try since handled-set already includes them; auto-split- // protected register), record the range as auto-prohibited (#148), leave the
// on-failure is a follow-up. // member indices UNhandled, and let the per-tag fallback in ReadAsync read each
// surviving address individually. On transport-level failure (timeout / socket
// drop) mark members Bad and short-circuit the per-tag fallback (hitting the
// dead socket again won't help). #150 bisection narrows the prohibition over
// subsequent re-probe ticks.
foreach (var block in blocks) foreach (var block in blocks)
{ {
if (block.Members.Count == 1) if (block.Members.Count == 1)
@@ -725,26 +756,19 @@ public sealed class ModbusDriver
handled.Add(idx); handled.Add(idx);
InvalidateWriteCacheIfDiverged(fullReferences[idx], value); InvalidateWriteCacheIfDiverged(fullReferences[idx], value);
} }
_health = new DriverHealth(DriverState.Healthy, timestamp, null); WriteHealth(new DriverHealth(DriverState.Healthy, timestamp, null));
} }
catch (ModbusException mex) catch (ModbusException mex)
{ {
// #148 — record the failed range so the planner stops re-coalescing across // #148 — record the failed range so the planner stops re-coalescing across
// it on subsequent scans. Per-tag fallback reads each member individually // it on subsequent scans. The members are intentionally NOT added to the
// next time, so healthy tags around the protected hole keep working without // handled-set: ReadAsync's per-tag fallback runs them individually in the
// operator intervention. // same scan, so healthy tags around the protected hole keep working without
// operator intervention. Members that ARE the protected register will fail
// again at single-tag granularity and surface the per-tag exception code
// naturally — the block-level mex isn't propagated.
RecordAutoProhibition(group.Key.Unit, group.Key.Region, block.Start, block.End); RecordAutoProhibition(group.Key.Unit, group.Key.Region, block.Start, block.End);
WriteHealth(new DriverHealth(DriverState.Degraded, ReadHealth().LastSuccessfulRead, mex.Message));
var status = MapModbusExceptionToStatus(mex.ExceptionCode);
foreach (var (idx, _) in block.Members)
{
// Don't mark members handled — leave them for the per-tag fallback in
// the same scan so single-register reads can succeed for any non-
// protected member. (Pre-#148 behaviour was to mark all Bad and skip.)
// Members that ARE the protected register will fail again at single-tag
// granularity and surface the per-tag exception code naturally.
}
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, mex.Message);
} }
catch (Exception ex) catch (Exception ex)
{ {
@@ -758,7 +782,7 @@ public sealed class ModbusDriver
results[idx] = new DataValueSnapshot(null, StatusBadCommunicationError, null, timestamp); results[idx] = new DataValueSnapshot(null, StatusBadCommunicationError, null, timestamp);
handled.Add(idx); handled.Add(idx);
} }
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, ex.Message); WriteHealth(new DriverHealth(DriverState.Degraded, ReadHealth().LastSuccessfulRead, ex.Message));
} }
} }
} }
@@ -926,13 +950,11 @@ public sealed class ModbusDriver
} }
} }
// BitInRegister writes need a read-modify-write against the full holding register. A // BitInRegister writes need a read-modify-write against the full holding register. The
// per-register lock keeps concurrent bit-write callers from stomping on each other — // per-register lock (declared at the top of the class) keeps concurrent bit-write callers
// Write bit 0 and Write bit 5 targeting the same register can arrive on separate // from stomping on each other — Write bit 0 and Write bit 5 targeting the same register
// subscriber threads, and without serialising the RMW the second-to-commit value wins // can arrive on separate subscriber threads, and without serialising the RMW the
// + the first bit update is lost. // second-to-commit value wins + the first bit update is lost.
private readonly System.Collections.Concurrent.ConcurrentDictionary<ushort, SemaphoreSlim> _rmwLocks = new();
private SemaphoreSlim GetRmwLock(ushort address) => private SemaphoreSlim GetRmwLock(ushort address) =>
_rmwLocks.GetOrAdd(address, _ => new SemaphoreSlim(1, 1)); _rmwLocks.GetOrAdd(address, _ => new SemaphoreSlim(1, 1));
@@ -1410,12 +1432,34 @@ public sealed class ModbusDriver
} }
} }
/// <summary>
/// Map a Modbus logical type to the driver-agnostic <see cref="DriverDataType"/> used
/// by the address-space builder.
/// </summary>
/// <remarks>
/// <para>
/// <b>Driver.Modbus-007 — Int64 / UInt64 surfacing limitation:</b>
/// <see cref="DriverDataType"/> does not yet include an Int64 enum member, so 64-bit
/// Modbus tags currently surface as <see cref="DriverDataType.Int32"/> on the OPC UA
/// address space. The wire codec (<c>DecodeRegister</c> / <c>EncodeRegister</c>) is
/// correct — values round-trip as 64-bit <c>long</c> / <c>ulong</c> through
/// <c>ReadAsync</c> / <c>WriteAsync</c>. Only the variable node's <c>DataType</c>
/// attribute is misreported. Clients that consume the type advertisement will see a
/// type/value mismatch for values outside the 32-bit signed range. Operators
/// configuring <c>I_64</c> / <c>UI_64</c> tags should be aware of this until the
/// tracked <c>DriverDataType.Int64</c> follow-up ships.
/// </para>
/// </remarks>
private static DriverDataType MapDataType(ModbusDataType t) => t switch private static DriverDataType MapDataType(ModbusDataType t) => t switch
{ {
ModbusDataType.Bool or ModbusDataType.BitInRegister => DriverDataType.Boolean, ModbusDataType.Bool or ModbusDataType.BitInRegister => DriverDataType.Boolean,
ModbusDataType.Int16 or ModbusDataType.Int32 => DriverDataType.Int32, ModbusDataType.Int16 or ModbusDataType.Int32 => DriverDataType.Int32,
ModbusDataType.UInt16 or ModbusDataType.UInt32 => DriverDataType.Int32, ModbusDataType.UInt16 or ModbusDataType.UInt32 => DriverDataType.Int32,
ModbusDataType.Int64 or ModbusDataType.UInt64 => DriverDataType.Int32, // widening to Int32 loses precision; PR 25 adds Int64 to DriverDataType // Driver.Modbus-007: Int64 / UInt64 currently surface as Int32 because DriverDataType
// has no Int64 member yet. The wire codec preserves the 64-bit value; only the OPC UA
// node's declared DataType is widened. Tracked for a follow-up that adds the enum
// member + node-type advertisement.
ModbusDataType.Int64 or ModbusDataType.UInt64 => DriverDataType.Int32,
ModbusDataType.Float32 => DriverDataType.Float32, ModbusDataType.Float32 => DriverDataType.Float32,
ModbusDataType.Float64 => DriverDataType.Float64, ModbusDataType.Float64 => DriverDataType.Float64,
ModbusDataType.String => DriverDataType.String, ModbusDataType.String => DriverDataType.String,

View File

@@ -111,16 +111,22 @@ public static class ModbusDriverFactoryExtensions
var name = t.Name ?? throw new InvalidOperationException( var name = t.Name ?? throw new InvalidOperationException(
$"Modbus config for '{driverInstanceId}' has a tag missing Name"); $"Modbus config for '{driverInstanceId}' has a tag missing Name");
// Driver.Modbus-009: a String tag with StringLength = 0 yields RegisterCount = 0, which
// turns into an FC03/FC04 with quantity 0 — a spec-illegal request the PLC rejects with
// exception 03. Catch the misconfiguration at bind time with a clear diagnostic instead
// of waiting for the cryptic Illegal Data Value to surface at runtime.
// AddressString takes precedence over the structured fields (Region/Address/DataType/ // AddressString takes precedence over the structured fields (Region/Address/DataType/
// ByteOrder/BitIndex/StringLength/ArrayCount). Tags can mix forms freely — newer pasted // ByteOrder/BitIndex/StringLength/ArrayCount). Tags can mix forms freely — newer pasted
// rows use the grammar string, legacy rows keep the structured form. Fields not derivable // rows use the grammar string, legacy rows keep the structured form. Fields not derivable
// from the grammar (Writable, WriteIdempotent, StringByteOrder) always come from the DTO. // from the grammar (Writable, WriteIdempotent, StringByteOrder) always come from the DTO.
ModbusTagDefinition tag;
if (!string.IsNullOrWhiteSpace(t.AddressString)) if (!string.IsNullOrWhiteSpace(t.AddressString))
{ {
if (!ModbusAddressParser.TryParse(t.AddressString, family, melsecSubFamily, out var parsed, out var parseError)) if (!ModbusAddressParser.TryParse(t.AddressString, family, melsecSubFamily, out var parsed, out var parseError))
throw new InvalidOperationException( throw new InvalidOperationException(
$"Modbus tag '{name}' in '{driverInstanceId}' has invalid AddressString '{t.AddressString}': {parseError}"); $"Modbus tag '{name}' in '{driverInstanceId}' has invalid AddressString '{t.AddressString}': {parseError}");
return new ModbusTagDefinition( tag = new ModbusTagDefinition(
Name: name, Name: name,
Region: parsed!.Region, Region: parsed!.Region,
Address: parsed.Offset, Address: parsed.Offset,
@@ -137,8 +143,9 @@ public static class ModbusDriverFactoryExtensions
Deadband: t.Deadband, Deadband: t.Deadband,
UnitId: t.UnitId); UnitId: t.UnitId);
} }
else
return new ModbusTagDefinition( {
tag = new ModbusTagDefinition(
Name: name, Name: name,
Region: ParseEnum<ModbusRegion>(t.Region, t.Name, driverInstanceId, "Region"), Region: ParseEnum<ModbusRegion>(t.Region, t.Name, driverInstanceId, "Region"),
Address: t.Address ?? throw new InvalidOperationException( Address: t.Address ?? throw new InvalidOperationException(
@@ -159,6 +166,24 @@ public static class ModbusDriverFactoryExtensions
UnitId: t.UnitId); UnitId: t.UnitId);
} }
ValidateStringLength(tag, driverInstanceId);
return tag;
}
/// <summary>
/// Driver.Modbus-009: reject <c>StringLength = 0</c> for <c>String</c>-typed tags. The
/// driver computes <c>RegisterCount = (StringLength + 1) / 2</c> which would emit an
/// FC03/FC04 with <c>quantity = 0</c>, a spec-illegal request the PLC rejects with
/// exception 03 (Illegal Data Value). Surface as a clear bind-time error.
/// </summary>
private static void ValidateStringLength(ModbusTagDefinition tag, string driverInstanceId)
{
if (tag.DataType == ModbusDataType.String && tag.StringLength < 1)
throw new InvalidOperationException(
$"Modbus tag '{tag.Name}' in '{driverInstanceId}' has DataType=String but StringLength={tag.StringLength}. " +
$"String tags must declare StringLength >= 1 (the number of ASCII characters, packed 2 per register).");
}
private static T ParseEnum<T>(string? raw, string? tagName, string driverInstanceId, string field) where T : struct, Enum private static T ParseEnum<T>(string? raw, string? tagName, string driverInstanceId, string field) where T : struct, Enum
{ {
if (string.IsNullOrWhiteSpace(raw)) if (string.IsNullOrWhiteSpace(raw))

View File

@@ -72,10 +72,11 @@ public sealed class ModbusDriverOptions
public bool UseFC16ForSingleRegisterWrites { get; init; } = false; public bool UseFC16ForSingleRegisterWrites { get; init; } = false;
/// <summary> /// <summary>
/// Reserved kill-switch for FC23 (Read/Write Multiple Registers). The driver does not /// <b>Reserved / no-op</b> kill-switch for FC23 (Read/Write Multiple Registers). The
/// currently emit FC23, so this option is a no-op today but exists so future block-read /// driver does not currently emit FC23 — toggling this option has no observable effect
/// coalescing work that opts into FC23 can be disabled per-deployment without a code /// today. The slot exists so a future block-read-coalescing enhancement that opts into
/// change. Default <c>false</c> (FC23 not used either way today). /// FC23 can be disabled per-deployment without a code change. Track Driver.Modbus-007
/// for the wiring follow-up. Default <c>false</c>.
/// </summary> /// </summary>
public bool DisableFC23 { get; init; } = false; public bool DisableFC23 { get; init; } = false;
@@ -122,6 +123,18 @@ public sealed class ModbusDriverOptions
/// masked. Default <c>false</c> preserves the historical "every write goes to the wire" /// masked. Default <c>false</c> preserves the historical "every write goes to the wire"
/// behaviour. Per-tag deadband lives on <c>ModbusTagDefinition.Deadband</c>. /// behaviour. Per-tag deadband lives on <c>ModbusTagDefinition.Deadband</c>.
/// </summary> /// </summary>
/// <remarks>
/// <para>
/// <b>Driver.Modbus-010 — write-only-tag caveat:</b> the suppression cache is only
/// invalidated by a <i>read</i> that returns a divergent value. A tag that is never
/// subscribed or polled (write-only setpoints, command registers) never sees its
/// cache entry refreshed — so a value the operator believes was re-asserted is
/// silently suppressed forever after the first write. There is no time- or
/// count-based expiry. If you set <see cref="WriteOnChangeOnly"/> = <c>true</c>,
/// either subscribe / poll every tag that needs deterministic re-write, or leave
/// this option <c>false</c> for the affected driver instance.
/// </para>
/// </remarks>
public bool WriteOnChangeOnly { get; init; } = false; public bool WriteOnChangeOnly { get; init; } = false;
/// <summary> /// <summary>

View File

@@ -91,13 +91,31 @@ public sealed class ModbusTcpTransport : IModbusTransport
try try
{ {
client.Client.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true); client.Client.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);
client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, (int)opts.Time.TotalSeconds); // Driver.Modbus-009: a TimeSpan < 1s previously truncated to 0 via the int cast,
client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval, (int)opts.Interval.TotalSeconds); // which Windows / Linux interpret as "use the default" — silently defeating the
// configured keep-alive timing. Round up to at least 1 second so a sub-second
// configuration still produces a real keep-alive cadence. Negative values are
// also clamped to 1 to avoid surfacing as OS errors.
client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime,
ClampToWholeSeconds(opts.Time));
client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval,
ClampToWholeSeconds(opts.Interval));
client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveRetryCount, opts.RetryCount); client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveRetryCount, opts.RetryCount);
} }
catch { /* best-effort; older OSes may not expose the granular knobs */ } catch { /* best-effort; older OSes may not expose the granular knobs */ }
} }
/// <summary>
/// Driver.Modbus-009: cast a <see cref="TimeSpan"/> to a whole number of seconds with a
/// minimum of 1 — protects callers from the int-cast truncation that turned 500&#160;ms
/// keep-alive timing into "use the default" on most OSes.
/// </summary>
internal static int ClampToWholeSeconds(TimeSpan ts)
{
var seconds = (int)Math.Ceiling(ts.TotalSeconds);
return seconds < 1 ? 1 : seconds;
}
public async Task<byte[]> SendAsync(byte unitId, byte[] pdu, CancellationToken ct) public async Task<byte[]> SendAsync(byte unitId, byte[] pdu, CancellationToken ct)
{ {
if (_disposed) throw new ObjectDisposedException(nameof(ModbusTcpTransport)); if (_disposed) throw new ObjectDisposedException(nameof(ModbusTcpTransport));

View File

@@ -0,0 +1,106 @@
using Shouldly;
using Xunit;
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests;
/// <summary>
/// Regression coverage for Driver.Modbus-009: two configuration edge cases that previously
/// silently produced wrong wire behaviour.
/// (1) <c>StringLength = 0</c> for a <c>String</c>-typed tag — used to flow into an FC03
/// with quantity 0, a spec-illegal request the PLC rejects with exception 03. Now bind-time
/// validation in <c>ModbusDriverFactoryExtensions</c> rejects the misconfiguration with a
/// clear diagnostic.
/// (2) Sub-second <see cref="TimeSpan"/> values on <c>ModbusKeepAliveOptions.Time</c> /
/// <c>Interval</c> — the int-cast in <c>EnableKeepAlive</c> truncated <c>500&#160;ms</c> to
/// <c>0</c>, which most OSes interpret as "use the default", silently defeating the
/// configured timing. <c>ModbusTcpTransport.ClampToWholeSeconds</c> rounds up to a minimum
/// of 1&#160;second.
/// </summary>
[Trait("Category", "Unit")]
public sealed class ModbusEdgeCaseValidationTests
{
[Fact]
public void Factory_rejects_String_tag_with_StringLength_zero_via_structured_form()
{
const string json = """
{
"host": "10.0.0.10",
"tags": [
{ "name": "Greeting", "region": "HoldingRegisters", "address": 100, "dataType": "String", "stringLength": 0 }
]
}
""";
var ex = Should.Throw<InvalidOperationException>(
() => ModbusDriverFactoryExtensions.CreateInstance("modbus-1", json));
ex.Message.ShouldContain("StringLength");
ex.Message.ShouldContain("Greeting");
}
[Fact]
public void Factory_rejects_String_tag_with_StringLength_zero_via_missing_field()
{
// No stringLength → defaults to 0. Same misconfiguration via a different DTO shape.
const string json = """
{
"host": "10.0.0.10",
"tags": [
{ "name": "Greeting", "region": "HoldingRegisters", "address": 100, "dataType": "String" }
]
}
""";
var ex = Should.Throw<InvalidOperationException>(
() => ModbusDriverFactoryExtensions.CreateInstance("modbus-1", json));
ex.Message.ShouldContain("StringLength");
}
[Fact]
public void Factory_accepts_String_tag_with_StringLength_one()
{
const string json = """
{
"host": "10.0.0.10",
"tags": [
{ "name": "Greeting", "region": "HoldingRegisters", "address": 100, "dataType": "String", "stringLength": 1 }
]
}
""";
Should.NotThrow(() => ModbusDriverFactoryExtensions.CreateInstance("modbus-1", json));
}
[Fact]
public void Factory_accepts_non_String_tag_with_StringLength_zero()
{
// The validation only kicks in for String tags — Int16 tags with StringLength=0 are normal.
const string json = """
{
"host": "10.0.0.10",
"tags": [
{ "name": "Level", "region": "HoldingRegisters", "address": 100, "dataType": "Int16" }
]
}
""";
Should.NotThrow(() => ModbusDriverFactoryExtensions.CreateInstance("modbus-1", json));
}
[Theory]
[InlineData(0, 1)] // zero clamps up to 1
[InlineData(500, 1)] // 500 ms rounds up to 1
[InlineData(999, 1)] // just under 1s rounds up to 1
[InlineData(1_000, 1)] // exactly 1s passes through
[InlineData(1_500, 2)] // 1.5s rounds up to 2
[InlineData(30_000, 30)] // historical PR 53 default — unchanged
[InlineData(60_000, 60)]
public void ClampToWholeSeconds_rounds_up_to_at_least_one_second(int ms, int expected)
{
ModbusTcpTransport.ClampToWholeSeconds(TimeSpan.FromMilliseconds(ms)).ShouldBe(expected);
}
[Fact]
public void ClampToWholeSeconds_treats_negative_TimeSpan_as_one_second()
{
// Defensive — operators occasionally configure a negative TimeSpan thinking it disables
// the feature. The OS would reject the negative int — clamping to 1 keeps the socket
// valid until the operator fixes the config.
ModbusTcpTransport.ClampToWholeSeconds(TimeSpan.FromSeconds(-5)).ShouldBe(1);
}
}

View File

@@ -0,0 +1,353 @@
using System.Collections.Concurrent;
using System.Reflection;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Driver.Modbus;
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests;
/// <summary>
/// Regression coverage for Driver.Modbus findings -002 (Reinitialize state hygiene),
/// -003 (_health volatile-write ordering), -004 (DisposeAsync teardown parity), and
/// -005 (malformed/short response PDU handling). All four resolved fixes need a
/// unit test alongside them per Driver.Modbus-012.
/// </summary>
[Trait("Category", "Unit")]
public sealed class ModbusLifecycleHygieneTests
{
private sealed class FakeTransport : IModbusTransport
{
public readonly ushort[] HoldingRegisters = new ushort[256];
public int ConnectCount;
public int DisposeCount;
public int SendCount;
public Task ConnectAsync(CancellationToken ct) { Interlocked.Increment(ref ConnectCount); return Task.CompletedTask; }
public Task<byte[]> SendAsync(byte unitId, byte[] pdu, CancellationToken ct)
{
Interlocked.Increment(ref SendCount);
var fc = pdu[0];
switch (fc)
{
case 0x03:
case 0x04:
{
var addr = (ushort)((pdu[1] << 8) | pdu[2]);
var qty = (ushort)((pdu[3] << 8) | pdu[4]);
var resp = new byte[2 + qty * 2];
resp[0] = fc;
resp[1] = (byte)(qty * 2);
for (var i = 0; i < qty; i++)
{
resp[2 + i * 2] = (byte)(HoldingRegisters[addr + i] >> 8);
resp[3 + i * 2] = (byte)(HoldingRegisters[addr + i] & 0xFF);
}
return Task.FromResult(resp);
}
case 0x06:
{
var addr = (ushort)((pdu[1] << 8) | pdu[2]);
HoldingRegisters[addr] = (ushort)((pdu[3] << 8) | pdu[4]);
return Task.FromResult(pdu); // FC06 echoes the request
}
case 0x10:
{
var addr = (ushort)((pdu[1] << 8) | pdu[2]);
var qty = (ushort)((pdu[3] << 8) | pdu[4]);
for (var i = 0; i < qty; i++)
HoldingRegisters[addr + i] = (ushort)((pdu[6 + i * 2] << 8) | pdu[7 + i * 2]);
return Task.FromResult(new byte[] { 0x10, pdu[1], pdu[2], pdu[3], pdu[4] });
}
default:
return Task.FromException<byte[]>(new NotSupportedException($"fc={fc}"));
}
}
public ValueTask DisposeAsync() { Interlocked.Increment(ref DisposeCount); return ValueTask.CompletedTask; }
}
/// <summary>
/// Returns a snapshot of the driver's private <c>_tagsByName</c> dictionary so the
/// hygiene tests can confirm the cache is empty after teardown.
/// </summary>
private static System.Collections.IDictionary GetTagsByName(ModbusDriver drv) =>
(System.Collections.IDictionary)typeof(ModbusDriver)
.GetField("_tagsByName", BindingFlags.NonPublic | BindingFlags.Instance)!
.GetValue(drv)!;
// -------------------- Finding -002 / -012 (2) --------------------
[Fact]
public async Task Reinitialize_clears_stale_tagsByName_entries()
{
// Re-initializing with a different options instance would leak stale entries before
// the fix. We simulate by inspecting _tagsByName after a Shutdown — it must be empty
// so InitializeAsync repopulates from a clean slate.
var fake = new FakeTransport();
var opts = new ModbusDriverOptions
{
Host = "fake",
Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)],
};
var drv = new ModbusDriver(opts, "modbus-1", _ => fake);
await drv.InitializeAsync("{}", CancellationToken.None);
GetTagsByName(drv).Count.ShouldBe(1);
await drv.ShutdownAsync(CancellationToken.None);
GetTagsByName(drv).Count.ShouldBe(0, "Shutdown must clear the tag cache so the next Initialize starts clean");
}
[Fact]
public async Task Reinitialize_clears_lastPublished_and_lastWritten_caches()
{
var fake = new FakeTransport();
var opts = new ModbusDriverOptions
{
Host = "fake",
WriteOnChangeOnly = true,
Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16,
Deadband: 1.0)],
};
var drv = new ModbusDriver(opts, "modbus-1", _ => fake);
await drv.InitializeAsync("{}", CancellationToken.None);
var lastPublished = (System.Collections.IDictionary)typeof(ModbusDriver)
.GetField("_lastPublishedByRef", BindingFlags.NonPublic | BindingFlags.Instance)!
.GetValue(drv)!;
var lastWritten = (System.Collections.IDictionary)typeof(ModbusDriver)
.GetField("_lastWrittenByRef", BindingFlags.NonPublic | BindingFlags.Instance)!
.GetValue(drv)!;
// Seed both caches via a write (lastWritten) and a publish through ShouldPublish (lastPublished).
await drv.WriteAsync([new WriteRequest("A", (short)5)], CancellationToken.None);
lastWritten.Count.ShouldBe(1);
// Reach ShouldPublish directly through a subscription so the deadband cache fills.
fake.HoldingRegisters[0] = 5;
var handle = await drv.SubscribeAsync(["A"], TimeSpan.FromMilliseconds(100), CancellationToken.None);
var deadline = DateTime.UtcNow.AddSeconds(2);
while (lastPublished.Count == 0 && DateTime.UtcNow < deadline) await Task.Delay(25);
lastPublished.Count.ShouldBe(1);
await drv.UnsubscribeAsync(handle, CancellationToken.None);
await drv.ShutdownAsync(CancellationToken.None);
lastPublished.Count.ShouldBe(0, "Shutdown must clear the deadband cache");
lastWritten.Count.ShouldBe(0, "Shutdown must clear the write-suppression cache");
}
// -------------------- Finding -004 / -012 (4) --------------------
[Fact]
public async Task DisposeAsync_without_explicit_Shutdown_tears_down_probe_loop_and_transport()
{
var fake = new FakeTransport();
var opts = new ModbusDriverOptions
{
Host = "fake",
Probe = new ModbusProbeOptions
{
Enabled = true,
Interval = TimeSpan.FromMilliseconds(50),
Timeout = TimeSpan.FromSeconds(1),
},
// Re-probe loop also opted in so DisposeAsync exercises both CTS cancellations.
AutoProhibitReprobeInterval = TimeSpan.FromMilliseconds(50),
Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)],
};
var drv = new ModbusDriver(opts, "modbus-1", _ => fake);
await drv.InitializeAsync("{}", CancellationToken.None);
// Let the probe + re-probe loops spin a few iterations.
await Task.Delay(200);
var sendsAtDispose = Interlocked.CompareExchange(ref fake.SendCount, 0, 0);
sendsAtDispose.ShouldBeGreaterThan(0, "background probe loop should have issued at least one send");
// Skip ShutdownAsync — exercise the await-using path that previously leaked.
await drv.DisposeAsync();
// Transport must have been disposed exactly once and the background loops stop scheduling
// new sends. Tolerate at most one in-flight send straddling the cancel.
fake.DisposeCount.ShouldBe(1);
var sendsAfterDispose = Interlocked.CompareExchange(ref fake.SendCount, 0, 0);
await Task.Delay(300);
var sendsAtRest = Interlocked.CompareExchange(ref fake.SendCount, 0, 0);
(sendsAtRest - sendsAfterDispose).ShouldBeLessThanOrEqualTo(1, "background loops must stop after DisposeAsync");
}
[Fact]
public async Task DisposeAsync_disposes_the_pollEngine_so_subscriptions_stop()
{
var fake = new FakeTransport();
var opts = new ModbusDriverOptions
{
Host = "fake",
Probe = new ModbusProbeOptions { Enabled = false },
Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)],
};
var drv = new ModbusDriver(opts, "modbus-1", _ => fake);
await drv.InitializeAsync("{}", CancellationToken.None);
// Spin up a polled subscription; the PollGroupEngine schedules a background Task that
// will keep issuing SendAsync until either Unsubscribe or DisposeAsync stops it.
var handle = await drv.SubscribeAsync(["A"], TimeSpan.FromMilliseconds(100), CancellationToken.None);
await Task.Delay(250);
var beforeDispose = Interlocked.CompareExchange(ref fake.SendCount, 0, 0);
beforeDispose.ShouldBeGreaterThan(0);
// No ShutdownAsync — DisposeAsync must also tear down the poll engine.
await drv.DisposeAsync();
var atDispose = Interlocked.CompareExchange(ref fake.SendCount, 0, 0);
await Task.Delay(400);
var atRest = Interlocked.CompareExchange(ref fake.SendCount, 0, 0);
(atRest - atDispose).ShouldBeLessThanOrEqualTo(1,
"DisposeAsync must dispose the PollGroupEngine so its background Task stops, not just the transport");
}
// -------------------- Finding -005 / -012 (3) --------------------
/// <summary>
/// Transport that returns a structurally-broken response for FC03/FC04 — too short to
/// hold the declared byte-count. Pre-fix the driver dereferenced <c>resp[1]</c> and then
/// ran <c>Buffer.BlockCopy(resp, 2, ..., resp[1])</c> which threw <c>ArgumentException</c>
/// (out-of-range). Post-fix the driver throws <c>InvalidDataException</c> which the
/// <c>ReadAsync</c> catch-all maps to <see cref="BadCommunicationError"/>.
/// </summary>
private sealed class TruncatingTransport : IModbusTransport
{
/// <summary>How many bytes to return — anything &lt; 2 + bytecount is malformed.</summary>
public int ResponseBytes { get; set; } = 1; // just the fc byte, no bytecount
public Task ConnectAsync(CancellationToken ct) => Task.CompletedTask;
public Task<byte[]> SendAsync(byte unitId, byte[] pdu, CancellationToken ct)
{
var resp = new byte[ResponseBytes];
if (ResponseBytes >= 1) resp[0] = pdu[0];
if (ResponseBytes >= 2) resp[1] = 4; // claim 4 bytes of payload but provide none
return Task.FromResult(resp);
}
public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}
[Fact]
public async Task Short_response_PDU_surfaces_as_BadCommunicationError_not_an_IndexOutOfRangeException()
{
var fake = new TruncatingTransport { ResponseBytes = 1 };
var opts = new ModbusDriverOptions
{
Host = "fake",
Tags = [new ModbusTagDefinition("Level", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)],
};
var drv = new ModbusDriver(opts, "modbus-1", _ => fake);
await drv.InitializeAsync("{}", CancellationToken.None);
var r = await drv.ReadAsync(["Level"], CancellationToken.None);
r[0].StatusCode.ShouldBe(0x80050000u, "BadCommunicationError = a clean transport-layer fault");
r[0].Value.ShouldBeNull();
}
[Fact]
public async Task Response_payload_truncated_below_declared_byteCount_surfaces_as_BadCommunicationError()
{
// Header says "4 bytes follow" but the message is only 3 bytes total — pre-fix the
// Buffer.BlockCopy would throw ArgumentException.
var fake = new TruncatingTransport { ResponseBytes = 3 };
var opts = new ModbusDriverOptions
{
Host = "fake",
Tags = [new ModbusTagDefinition("Level", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)],
};
var drv = new ModbusDriver(opts, "modbus-1", _ => fake);
await drv.InitializeAsync("{}", CancellationToken.None);
var r = await drv.ReadAsync(["Level"], CancellationToken.None);
r[0].StatusCode.ShouldBe(0x80050000u);
}
[Fact]
public void DecodeBitArray_rejects_an_empty_bitmap_with_InvalidDataException()
{
var decode = typeof(ModbusDriver).GetMethod(
"DecodeBitArray", BindingFlags.NonPublic | BindingFlags.Static)!;
// We can't invoke through reflection because ReadOnlySpan<byte> isn't representable in
// object-array invocation parameters. Instead, exercise the path through ReadAsync with
// a bit-region tag and a transport that returns a zero-byte-count response.
var fake = new EmptyBitTransport();
var opts = new ModbusDriverOptions
{
Host = "fake",
Tags = [new ModbusTagDefinition("Coil", ModbusRegion.Coils, 0, ModbusDataType.Bool)],
};
var drv = new ModbusDriver(opts, "modbus-1", _ => fake);
drv.InitializeAsync("{}", CancellationToken.None).GetAwaiter().GetResult();
var r = drv.ReadAsync(["Coil"], CancellationToken.None).GetAwaiter().GetResult();
// The empty-bitmap guard surfaces via the BadCommunicationError catch-all.
r[0].StatusCode.ShouldBe(0x80050000u);
}
/// <summary>
/// Coil-bank transport that returns <c>[fc][bytecount=0]</c> — a response with a
/// declared zero-byte payload. Pre-fix <c>DecodeBitArray</c> indexed into the empty
/// bitmap and threw <c>IndexOutOfRangeException</c>.
/// </summary>
private sealed class EmptyBitTransport : IModbusTransport
{
public Task ConnectAsync(CancellationToken ct) => Task.CompletedTask;
public Task<byte[]> SendAsync(byte unitId, byte[] pdu, CancellationToken ct)
=> Task.FromResult(new byte[] { pdu[0], 0 });
public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}
// -------------------- Finding -003 (volatile _health) --------------------
/// <summary>
/// The <c>_health</c> field is read by <c>GetHealth()</c> and written by every read /
/// write / probe path. The fix uses <c>Volatile.Read</c>/<c>Volatile.Write</c> to give
/// <c>GetHealth()</c> a defined ordering guarantee. We verify that under concurrent
/// pressure <c>GetHealth()</c> never returns a half-constructed value (it's a sealed
/// record so reference-assignment atomicity already prevents tearing; the test guards
/// against future regressions to a struct-typed health surface).
/// </summary>
[Fact]
public async Task GetHealth_under_concurrent_pressure_always_returns_a_complete_snapshot()
{
var fake = new FakeTransport();
var opts = new ModbusDriverOptions
{
Host = "fake",
Tags = [new ModbusTagDefinition("A", ModbusRegion.HoldingRegisters, 0, ModbusDataType.Int16)],
};
var drv = new ModbusDriver(opts, "modbus-1", _ => fake);
await drv.InitializeAsync("{}", CancellationToken.None);
// Two writer loops and one reader loop — 250ms of churn.
var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(250));
var faults = new ConcurrentQueue<Exception>();
var writer = Task.Run(async () =>
{
try { while (!cts.IsCancellationRequested) await drv.ReadAsync(["A"], CancellationToken.None); }
catch (Exception ex) { faults.Enqueue(ex); }
});
var reader = Task.Run(() =>
{
try
{
while (!cts.IsCancellationRequested)
{
var h = drv.GetHealth();
// State must be one of the enum values; LastSuccessfulRead can be null or a real time;
// the record constructor enforces no field is wholly garbage.
h.State.ShouldBeOneOf(DriverState.Unknown, DriverState.Initializing, DriverState.Healthy, DriverState.Degraded, DriverState.Faulted);
}
}
catch (Exception ex) { faults.Enqueue(ex); }
});
await Task.WhenAll(writer, reader);
faults.ShouldBeEmpty();
}
}