diff --git a/code-reviews/Driver.Modbus/findings.md b/code-reviews/Driver.Modbus/findings.md index 85e6332..4c6ba15 100644 --- a/code-reviews/Driver.Modbus/findings.md +++ b/code-reviews/Driver.Modbus/findings.md @@ -7,7 +7,7 @@ | Review date | 2026-05-22 | | Commit reviewed | `76d35d1` | | Status | Reviewed | -| Open findings | 11 | +| Open findings | 7 | ## Checklist coverage @@ -48,13 +48,13 @@ | Severity | Medium | | Category | Correctness & logic bugs | | Location | `ModbusDriver.cs:127-186` | -| Status | Open | +| Status | Resolved | **Description:** `ShutdownAsync` never clears `_tagsByName`, and `InitializeAsync` repopulates it with `_tagsByName[t.Name] = t` (`ModbusDriver.cs:134`) without clearing first. `ReinitializeAsync` calls `ShutdownAsync` then `InitializeAsync`. Because `_options.Tags` is fixed for a driver instance, the same set re-inserts harmlessly today — but the asymmetry is a latent bug: any future path that re-runs init with a different tag set leaves stale tag entries that resolve reads/writes against deleted nodes. `_lastPublishedByRef` and `_lastWrittenByRef` similarly survive a Reinitialize, retaining deadband/write-suppression baselines against the old config, while `_autoProhibited` *is* deliberately cleared (`ModbusDriver.cs:179`) — the inconsistency shows the clearing was simply overlooked. **Recommendation:** Clear `_tagsByName`, `_lastPublishedByRef`, and `_lastWrittenByRef` in `ShutdownAsync` (or at the top of `InitializeAsync`) so a Reinitialize starts from a clean state, consistent with the existing `_autoProhibited.Clear()`. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-22 — added `_tagsByName.Clear()`, `_lastPublishedByRef.Clear()`, and `_lastWrittenByRef.Clear()` to `ShutdownAsync` (via the new shared `TeardownAsync` helper) so a `ReinitializeAsync` cycle always starts from a clean state, consistent with the existing `_autoProhibited.Clear()`. ### Driver.Modbus-003 @@ -78,13 +78,13 @@ | Severity | Medium | | Category | Performance & resource management | | Location | `ModbusDriver.cs:1468-1473` | -| Status | Open | +| Status | Resolved | **Description:** `DisposeAsync()` only disposes `_transport`. Unlike `ShutdownAsync`, it does not cancel/dispose `_probeCts` or `_reprobeCts`, nor dispose `_poll` (the `PollGroupEngine`). A caller that uses `await using` or `using` without first calling `ShutdownAsync` leaks the probe loop, the re-probe loop, and every active polled subscription background `Task`/`CancellationTokenSource`. The two `Task.Run` loops keep running against a disposed transport, throwing on every tick. `Dispose()` (sync) has the same gap and additionally blocks on the async path via `GetAwaiter().GetResult()`. **Recommendation:** Make `DisposeAsync` perform the same teardown as `ShutdownAsync` (cancel both CTSs, dispose them, dispose `_poll`) before disposing `_transport`. Have `ShutdownAsync` and `DisposeAsync` share a private `TeardownAsync` helper. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-22 — refactored teardown into a shared `TeardownAsync` helper; `DisposeAsync` now delegates to it, cancelling both CTS objects, disposing `_poll`, and disposing `_transport` — matching `ShutdownAsync` and eliminating the probe/re-probe/poll-engine leak on `await using` callers. ### Driver.Modbus-005 @@ -93,13 +93,13 @@ | Severity | Medium | | Category | Correctness & logic bugs | | Location | `ModbusDriver.cs:777-798,323-330` | -| Status | Open | +| Status | Resolved | **Description:** `ReadRegisterBlockAsync` and `ReadBitBlockAsync` index `resp[1]` and call `Buffer.BlockCopy(resp, 2, ..., resp[1])` with no bounds validation. `ModbusTcpTransport.SendOnceAsync` validates only the MBAP length field and the exception high-bit — it does not guarantee a non-exception response PDU is long enough to hold function-code + byte-count + the claimed data. A device (or buggy server) returning a 1-byte PDU, or a byte-count larger than the actual payload, produces an `IndexOutOfRangeException`/`ArgumentException` rather than a clean comms error. `DecodeBitArray` similarly indexes `bitmap[0]` (`ModbusDriver.cs:325`) without checking the bitmap is non-empty. In `ReadAsync` these are caught by the catch-all and mapped to `BadCommunicationError`, so impact is limited; in `ReadCoalescedAsync` the exception is opaque to the narrower catch arms. **Recommendation:** In `ReadRegisterBlockAsync`/`ReadBitBlockAsync`, validate `resp.Length >= 2` and `resp.Length >= 2 + resp[1]` before slicing, throwing a descriptive `InvalidDataException`. Validate the decoded byte/bit count matches the request quantity. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-22 — added `resp.Length >= 2`, `resp.Length >= 2 + resp[1]`, and byte-count-vs-quantity checks in both `ReadRegisterBlockAsync` and `ReadBitBlockAsync`, throwing `InvalidDataException` with precise diagnostics; added an empty-bitmap guard in `DecodeBitArray`. ### Driver.Modbus-006 @@ -108,13 +108,13 @@ | Severity | Medium | | Category | Error handling & resilience | | Location | `ModbusDriver.cs:514-524,532-550` | -| Status | Open | +| Status | Resolved | **Description:** `RunReprobeOnceForTestAsync` reads `_transport` once at the top (`var transport = _transport ?? throw ...`). If `ShutdownAsync` runs (setting `_transport = null` and disposing it) while a re-probe pass is mid-iteration, the loop keeps issuing reads against the captured, disposed transport. `ReprobeLoopAsync` only catches `OperationCanceledException when (ct.IsCancellationRequested)` — an `ObjectDisposedException` from the disposed transport escapes `RunReprobeOnceForTestAsync` and faults the fire-and-forget background `Task`, silently killing the re-probe loop with the wrong failure mode. **Recommendation:** Re-check `_transport`/cancellation inside the per-candidate loop, or broaden the `ReprobeLoopAsync` catch to also swallow `ObjectDisposedException` when `ct.IsCancellationRequested`. -**Resolution:** _(open)_ +**Resolution:** Resolved 2026-05-22 — broadened `ReprobeLoopAsync` to catch `ObjectDisposedException when (ct.IsCancellationRequested)` and return cleanly, so a transport disposal race during shutdown exits the background task rather than faulting it. ### Driver.Modbus-007 diff --git a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs index 9aec203..f4da46e 100644 --- a/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs +++ b/src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs @@ -170,24 +170,9 @@ public sealed class ModbusDriver public async Task ShutdownAsync(CancellationToken cancellationToken) { - try { _probeCts?.Cancel(); } catch { } - _probeCts?.Dispose(); - _probeCts = null; - - try { _reprobeCts?.Cancel(); } catch { } - _reprobeCts?.Dispose(); - _reprobeCts = null; - - // #151 — clear the prohibition set on shutdown so an explicit operator restart - // (ReinitializeAsync) starts with a clean slate. The re-probe loop already retries - // automatically when enabled; the restart path is the manual escape hatch. - lock (_autoProhibitedLock) _autoProhibited.Clear(); - - await _poll.DisposeAsync().ConfigureAwait(false); - - if (_transport is not null) await _transport.DisposeAsync().ConfigureAwait(false); - _transport = null; - _health = new DriverHealth(DriverState.Unknown, _health.LastSuccessfulRead, null); + var lastRead = _health.LastSuccessfulRead; + await TeardownAsync().ConfigureAwait(false); + _health = new DriverHealth(DriverState.Unknown, lastRead, null); } public DriverHealth GetHealth() => _health; @@ -327,6 +312,10 @@ public sealed class ModbusDriver /// private static object DecodeBitArray(ReadOnlySpan bitmap, int count, bool isArray) { + // Driver.Modbus-005: guard against empty bitmap (already validated upstream but defensive + // here so the IndexOutOfRangeException path is explicitly closed at decode time too). + if (bitmap.IsEmpty) + throw new InvalidDataException("Modbus bit response produced an empty bitmap — cannot decode coil value"); if (!isArray) return (bitmap[0] & 0x01) == 1; var result = new bool[count]; for (var i = 0; i < count; i++) @@ -525,6 +514,14 @@ public sealed class ModbusDriver catch (OperationCanceledException) { return; } try { await RunReprobeOnceForTestAsync(ct).ConfigureAwait(false); } catch (OperationCanceledException) when (ct.IsCancellationRequested) { return; } + catch (ObjectDisposedException) when (ct.IsCancellationRequested) + { + // Driver.Modbus-006: ShutdownAsync disposes the transport while we may be + // mid-pass. An ObjectDisposedException from the disposed transport is the + // expected shutdown race — swallow it here so the fire-and-forget task + // exits cleanly rather than faulting with the wrong failure mode. + return; + } } } @@ -785,7 +782,20 @@ public sealed class ModbusDriver var pdu = new byte[] { fc, (byte)(address >> 8), (byte)(address & 0xFF), (byte)(quantity >> 8), (byte)(quantity & 0xFF) }; var resp = await transport.SendAsync(unitId, pdu, ct).ConfigureAwait(false); - // resp = [fc][byte-count][data...] + // resp = [fc][byte-count][data...] — validate before indexing to surface a clean error + // rather than an IndexOutOfRangeException when a device returns a truncated PDU. + // Driver.Modbus-005: guard resp.Length >= 2 (fc + byte-count) and that the payload is + // at least as long as the declared byte-count, matching the quantity we requested. + if (resp.Length < 2) + throw new InvalidDataException( + $"Modbus register response too short: expected at least 2 bytes (fc+bytecount), got {resp.Length}"); + if (resp.Length < 2 + resp[1]) + throw new InvalidDataException( + $"Modbus register response truncated: byte-count field declares {resp[1]} bytes but only {resp.Length - 2} available"); + var expectedByteCount = quantity * 2; + if (resp[1] != expectedByteCount) + throw new InvalidDataException( + $"Modbus register response byte-count mismatch: requested {quantity} registers ({expectedByteCount} bytes), got {resp[1]} bytes"); var data = new byte[resp[1]]; Buffer.BlockCopy(resp, 2, data, 0, resp[1]); return data; @@ -797,6 +807,17 @@ public sealed class ModbusDriver var pdu = new byte[] { fc, (byte)(address >> 8), (byte)(address & 0xFF), (byte)(qty >> 8), (byte)(qty & 0xFF) }; var resp = await transport.SendAsync(unitId, pdu, ct).ConfigureAwait(false); + // Driver.Modbus-005: validate the response is structurally sound before indexing. + if (resp.Length < 2) + throw new InvalidDataException( + $"Modbus bit response too short: expected at least 2 bytes (fc+bytecount), got {resp.Length}"); + if (resp.Length < 2 + resp[1]) + throw new InvalidDataException( + $"Modbus bit response truncated: byte-count field declares {resp[1]} bytes but only {resp.Length - 2} available"); + var expectedByteCount = (qty + 7) / 8; + if (resp[1] < expectedByteCount) + throw new InvalidDataException( + $"Modbus bit response byte-count mismatch: requested {qty} bits ({expectedByteCount} bytes), got {resp[1]} bytes"); var bitmap = new byte[resp[1]]; Buffer.BlockCopy(resp, 2, bitmap, 0, resp[1]); return bitmap; @@ -1471,8 +1492,40 @@ public sealed class ModbusDriver }; public void Dispose() => DisposeAsync().AsTask().GetAwaiter().GetResult(); + + /// + /// Driver.Modbus-004: DisposeAsync must perform the same teardown as ShutdownAsync so + /// callers that use await using (without an explicit ShutdownAsync) do not + /// leak the probe loop, re-probe loop, and poll-engine background tasks. Shares + /// with to keep them in sync. + /// public async ValueTask DisposeAsync() { + await TeardownAsync().ConfigureAwait(false); + } + + /// + /// Shared teardown helper used by both and + /// . Cancels both background loops, disposes the poll engine, + /// and disposes the transport. Idempotent — safe to call more than once. + /// + private async Task TeardownAsync() + { + try { _probeCts?.Cancel(); } catch { } + _probeCts?.Dispose(); + _probeCts = null; + + try { _reprobeCts?.Cancel(); } catch { } + _reprobeCts?.Dispose(); + _reprobeCts = null; + + _tagsByName.Clear(); + _lastPublishedByRef.Clear(); + lock (_lastWrittenLock) _lastWrittenByRef.Clear(); + lock (_autoProhibitedLock) _autoProhibited.Clear(); + + await _poll.DisposeAsync().ConfigureAwait(false); + if (_transport is not null) await _transport.DisposeAsync().ConfigureAwait(false); _transport = null; }