review(Driver.Modbus): validate FC03 RMW response + correct write error mapping

Re-review at 7286d320. Modbus-013 (Low): bit RMW now routes the FC03 read through the
validated ReadRegisterBlockAsync (was raw-indexing readResp -> IndexOutOfRange on a truncated
PDU). Modbus-014 (Low): WriteAsync maps InvalidDataException to BadCommunicationError (was
BadInternalError), matching ReadAsync. + TDD.
This commit is contained in:
Joseph Doherty
2026-06-19 11:34:34 -04:00
parent f2bdd8bc1c
commit 6853a0430f
3 changed files with 131 additions and 6 deletions
+53 -2
View File
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus` |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Review date | 2026-06-19 |
| Commit reviewed | `7286d320` |
| Status | Reviewed |
| Open findings | 0 |
@@ -205,3 +205,54 @@
**Recommendation:** Add unit tests for concurrent deadband publishing across two subscriptions, `ReinitializeAsync` state hygiene, malformed-response handling in the register/bit block readers, and `DisposeAsync` loop teardown.
**Resolution:** Resolved 2026-05-23 — gap (1) was already covered by `ModbusSubscriptionTests.Concurrent_deadband_subscriptions_do_not_corrupt_the_publish_cache` from the Driver.Modbus-001 fix. Added the remaining three in a new `ModbusLifecycleHygieneTests` file: `Reinitialize_clears_stale_tagsByName_entries` + `Reinitialize_clears_lastPublished_and_lastWritten_caches` (gap 2), `Short_response_PDU_surfaces_as_BadCommunicationError_not_an_IndexOutOfRangeException` + `Response_payload_truncated_below_declared_byteCount_surfaces_as_BadCommunicationError` + `DecodeBitArray_rejects_an_empty_bitmap_with_InvalidDataException` (gap 3), `DisposeAsync_without_explicit_Shutdown_tears_down_probe_loop_and_transport` + `DisposeAsync_disposes_the_pollEngine_so_subscriptions_stop` (gap 4). All 12 new tests pass (full suite: 263/263 green).
## Re-review 2026-06-19 (commit 7286d320)
#### Context
Re-review following addition of the bit read-modify-write path (`_rmwLocks`, `WriteBitInRegisterAsync`), enum-serialization fix (`JsonStringEnumConverter` on AdminUI page), and equipment-tag ref resolver. Previous findings 001-012 are all Resolved.
#### Checklist coverage
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Driver.Modbus-013, Driver.Modbus-014 |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | No new issues — `_rmwLocks` per-address key over-serializes multi-unit but is not incorrect (transport `_gate` already serializes all sends) |
| 4 | Error handling & resilience | Driver.Modbus-014 |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Driver.Modbus-013 (truncated RMW response) had no test; added |
| 10 | Documentation & comments | No issues found |
### Driver.Modbus-013
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `ModbusDriver.cs:1169-1171` |
| Status | Resolved |
**Description:** `WriteBitInRegisterAsync` called `transport.SendAsync` directly for the FC03 read and then immediately indexed `readResp[2]` and `readResp[3]` without validating the response length. A device returning a malformed or truncated PDU caused `IndexOutOfRangeException`. Unlike the normal read path (`ReadRegisterBlockAsync`) which was hardened in Driver.Modbus-005, the RMW read bypassed that validation entirely. The exception was absorbed by `WriteAsync`'s catch-all and mapped to the wrong status code (see Driver.Modbus-014).
**Recommendation:** Route the FC03 read in `WriteBitInRegisterAsync` through `ReadRegisterBlockAsync` — the same validated path the normal read uses — so truncated responses throw `InvalidDataException` rather than `IndexOutOfRangeException`.
**Resolution:** Resolved 2026-06-19 (SHA blank) — replaced direct `transport.SendAsync` + raw indexing in `WriteBitInRegisterAsync` with a `ReadRegisterBlockAsync(... qty=1 ...)` call. `ReadRegisterBlockAsync` validates `resp.Length >= 2`, `resp.Length >= 2 + resp[1]`, and byte-count-vs-quantity before returning clean `[hi, lo]` register bytes. Regression test: `ModbusBitRmwTests.BitInRegister_write_with_truncated_FC03_response_returns_BadCommunicationError`.
### Driver.Modbus-014
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `ModbusDriver.cs:990-993` |
| Status | Resolved |
**Description:** `WriteAsync`'s generic `catch (Exception)` arm mapped all non-`ModbusException` failures to `StatusBadInternalError` (0x80020000). This includes `InvalidDataException` thrown by `ReadRegisterBlockAsync` for a malformed PDU during a BitInRegister RMW read — a communication/protocol error, not a driver code defect. By contrast, `ReadAsync` correctly returns `StatusBadCommunicationError` (0x80050000) for the same class of failure. The asymmetry makes write-time communication errors look like internal driver bugs to OPC UA clients monitoring StatusCodes, complicating diagnosis.
**Recommendation:** Add a `catch (InvalidDataException)` arm before the generic catch in `WriteAsync` that maps the exception to `StatusBadCommunicationError`, matching the `ReadAsync` error surface.
**Resolution:** Resolved 2026-06-19 (SHA blank) — added `catch (InvalidDataException) { results[i] = new WriteResult(StatusBadCommunicationError); }` between the `ModbusException` and generic catch arms in `WriteAsync`. Covered by the same regression test as Driver.Modbus-013: `ModbusBitRmwTests.BitInRegister_write_with_truncated_FC03_response_returns_BadCommunicationError`.
@@ -987,6 +987,14 @@ public sealed class ModbusDriver
{
results[i] = new WriteResult(MapModbusExceptionToStatus(mex.ExceptionCode));
}
catch (InvalidDataException)
{
// Driver.Modbus-014: malformed/truncated PDU during a write (e.g. the FC03 RMW
// read returning a short response). This is a communication-layer error — surface
// as BadCommunicationError to match the ReadAsync path, not BadInternalError which
// implies a driver code defect.
results[i] = new WriteResult(StatusBadCommunicationError);
}
catch (Exception)
{
results[i] = new WriteResult(StatusBadInternalError);
@@ -1165,10 +1173,16 @@ public sealed class ModbusDriver
try
{
// FC03 read 1 holding register at tag.Address.
var readPdu = new byte[] { 0x03, (byte)(tag.Address >> 8), (byte)(tag.Address & 0xFF), 0x00, 0x01 };
var readResp = await transport.SendAsync(ResolveUnitId(tag), readPdu, ct).ConfigureAwait(false);
// resp = [fc][byte-count=2][hi][lo]
var current = (ushort)((readResp[2] << 8) | readResp[3]);
// Driver.Modbus-013: use ReadRegisterBlockAsync so the response is validated before
// indexing (mirrors the fix applied to the normal read path in Driver.Modbus-005).
// Direct transport.SendAsync previously skipped the length checks, letting a truncated
// PDU throw IndexOutOfRangeException from readResp[2]/[3] — now surfaces as a clean
// InvalidDataException, which WriteAsync's communication-error catch arm maps to
// BadCommunicationError (see Driver.Modbus-014).
var readRaw = await ReadRegisterBlockAsync(transport, ResolveUnitId(tag), 0x03, tag.Address, 1, ct).ConfigureAwait(false);
// readRaw = [hi][lo] — ReadRegisterBlockAsync strips the fc+byte-count header and
// validates the payload length, so [0] and [1] are always safe to index here.
var current = (ushort)((readRaw[0] << 8) | readRaw[1]);
var updated = on
? (ushort)(current | (1 << bit))
@@ -153,4 +153,64 @@ public sealed class ModbusBitRmwTests
fake.HoldingRegisters[30].ShouldBe((ushort)((1 << 5) | (1 << 10)));
}
// ---- Driver.Modbus-013: RMW read-response not validated before indexing ----
/// <summary>
/// Transport that returns a too-short PDU for FC03 reads — simulates a buggy device
/// returning a malformed response during a BitInRegister read-modify-write. Pre-fix,
/// WriteBitInRegisterAsync indexed readResp[2]/[3] without checking resp.Length, causing
/// IndexOutOfRangeException. Post-fix it throws InvalidDataException which WriteAsync's
/// catch-all maps to BadInternalError.
/// </summary>
private sealed class TruncatedRmwTransport : IModbusTransport
{
/// <summary>Connects asynchronously (no-op for fake).</summary>
/// <param name="ct">Cancellation token (unused).</param>
public Task ConnectAsync(CancellationToken ct) => Task.CompletedTask;
/// <summary>Returns a truncated FC03 response to trigger the validation path.</summary>
/// <param name="unitId">The Modbus unit ID (unused).</param>
/// <param name="pdu">The PDU to process.</param>
/// <param name="ct">Cancellation token (unused).</param>
public Task<byte[]> SendAsync(byte unitId, byte[] pdu, CancellationToken ct)
{
if (pdu[0] == 0x03)
// Return only [fc][byte-count=2] — missing the actual register data bytes.
return Task.FromResult(new byte[] { 0x03, 0x02 });
// FC06 write echoes the PDU
return Task.FromResult(pdu);
}
/// <summary>Disposes asynchronously (no-op for fake).</summary>
public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}
/// <summary>
/// Driver.Modbus-013/014: a truncated FC03 read response during a BitInRegister RMW must
/// surface as BadCommunicationError, not BadInternalError (a structural/driver-code
/// problem) — and not as an IndexOutOfRangeException that escapes WriteAsync uncaught.
/// </summary>
[Fact]
public async Task BitInRegister_write_with_truncated_FC03_response_returns_BadCommunicationError()
{
var truncated = new TruncatedRmwTransport();
var opts = new ModbusDriverOptions
{
Host = "fake",
Tags = [new ModbusTagDefinition("Bit5", ModbusRegion.HoldingRegisters, 10, ModbusDataType.BitInRegister, BitIndex: 5)],
Probe = new ModbusProbeOptions { Enabled = false },
};
var drv = new ModbusDriver(opts, "modbus-rmw-trunc", _ => truncated);
await drv.InitializeAsync("{}", CancellationToken.None);
// WriteAsync must complete (not throw) and return BadCommunicationError — the same
// status the ReadAsync path returns for a truncated response. BadInternalError (0x80020000)
// is wrong because the failure is a communication/protocol error, not a driver bug.
const uint BadCommunicationError = 0x80050000u;
var results = await drv.WriteAsync([new WriteRequest("Bit5", true)], CancellationToken.None);
results.Single().StatusCode.ShouldBe(BadCommunicationError,
"A truncated FC03 RMW response is a communication error, not an internal driver error");
}
}