# Code Review: Driver.S7

| Field | Value |
|---|---|
| Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.S7` |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 0 |

## Checklist coverage

A comprehensive review completes every category, recording "No issues found" where a category produced nothing rather than leaving it blank.

| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Driver.S7-001, Driver.S7-002, Driver.S7-003 |
| 2 | OtOpcUa conventions | Driver.S7-004, Driver.S7-005 |
| 3 | Concurrency & thread safety | Driver.S7-006 |
| 4 | Error handling & resilience | Driver.S7-007, Driver.S7-008, Driver.S7-009 |
| 5 | Security | No issues found |
| 6 | Performance & resource management | Driver.S7-010 |
| 7 | Design-document adherence | Driver.S7-011, Driver.S7-012 |
| 8 | Code organization & conventions | Driver.S7-013 |
| 9 | Testing coverage | Driver.S7-014 |
| 10 | Documentation & comments | Driver.S7-012 (shared) |

## Findings

### Driver.S7-001

| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `S7AddressParser.cs:93`, `S7Driver.cs:231` |
| Status | Resolved |

**Description:** S7AddressParser.Parse accepts Timer (T0) and Counter (C0) addresses and the test suite asserts they parse successfully, but the read path cannot serve them. Two problems compound: (1) ReadOneAsync's type-mapping switch (lines 231-250) has no case for any Timer/Counter combination, so a Timer/Counter tag falls through to the default arm and throws InvalidDataException with a misleading "type-mismatch" message on every read; (2) the read is issued via plc.ReadAsync(tag.Address, ...) passing the raw address string, and S7.Net's string-based parser does not understand T{n}/C{n} syntax.
A tag configured with a timer or counter address passes init-time parsing (the docstring promises config typos fail fast at init) and then fails on every read - exactly the un-diagnosable failure mode the fail-fast parse was meant to prevent.

**Recommendation:** Either drop Timer/Counter from S7AddressParser and S7Area until they are wired through to S7.Net, or implement the Timer/Counter read path. If kept, reject Timer/Counter tags at InitializeAsync with a clear "not yet supported" error rather than letting them parse clean.

**Resolution:** Resolved 2026-05-22: `InitializeAsync` now runs `RejectUnsupportedTagAddresses`, which throws `NotSupportedException` with a clear "not yet supported" message (echoing the tag name + address) for any tag whose address parses as a Timer or Counter, so the bad config fails fast at init rather than throwing a misleading type-mismatch on every read.

### Driver.S7-002

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `S7Driver.cs:350` |
| Status | Resolved |

**Description:** MapDataType collapses S7DataType.UInt32 to DriverDataType.Int32. UInt32 values above int.MaxValue (2^31-1) wrap to negative when surfaced to the OPC UA client, silently corrupting the value. The inline comment only flags Int64/UInt64 as "widens; lossy" but UInt32 to Int32 is equally lossy and is not called out.

**Recommendation:** Map UInt32/UInt16 to a DriverDataType wide enough to hold the unsigned range, or add the missing unsigned DriverDataType members. At minimum correct the comment so the lossiness of UInt32 is documented.

**Resolution:** Resolved 2026-05-22: added an inline comment to the `MapDataType` switch explicitly documenting the UInt32→Int32 lossiness (same limitation as Int64/UInt64, tracked for a follow-up PR adding unsigned DriverDataType members); the code mapping is unchanged pending that follow-up.
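
The wrap-around described in Driver.S7-002 can be reproduced in isolation. The sketch below is illustrative only: `UnsignedMappingDemo` and its methods are hypothetical names, not part of the driver, and widening to a 64-bit signed integer is just one example of a type wide enough to hold the unsigned range.

```csharp
using System;

// Hypothetical helper, not the driver's MapDataType: it only demonstrates
// what happens to the bits when a UInt32 is surfaced as an Int32.
public static class UnsignedMappingDemo
{
    // Narrowing reinterpretation: values above int.MaxValue come out negative.
    public static int AsInt32(uint raw) => unchecked((int)raw);

    // Widening alternative: Int64 holds the full UInt32 range without loss.
    public static long AsInt64(uint raw) => raw;
}
```

With these definitions, `UnsignedMappingDemo.AsInt32(3_000_000_000u)` yields `-1294967296`, while `UnsignedMappingDemo.AsInt64(3_000_000_000u)` preserves `3000000000`.
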

### Driver.S7-003

| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `S7Driver.cs:172`, `S7Driver.cs:255` |
| Status | Resolved |

**Description:** ReadAsync and WriteAsync dereference fullReferences.Count / writes.Count with no null guard. A null argument throws NullReferenceException rather than ArgumentNullException, and the NRE escapes before the _gate is taken so it is not wrapped in a per-item status. DiscoverAsync correctly uses ArgumentNullException.ThrowIfNull(builder); the read/write entry points are inconsistent with it.

**Recommendation:** Add ArgumentNullException.ThrowIfNull for the list parameters at the top of ReadAsync and WriteAsync.

**Resolution:** Resolved 2026-05-23: added `ArgumentNullException.ThrowIfNull` at the top of both `ReadAsync` and `WriteAsync`, placed BEFORE `RequirePlc()` so a null argument produces a typed `ArgumentNullException` (consistent with `DiscoverAsync`) rather than either an NRE on `.Count` or the "not initialized" `InvalidOperationException` from `RequirePlc`. Regression tests: `ReadAsync_with_null_fullReferences_throws_ArgumentNullException` and `WriteAsync_with_null_writes_throws_ArgumentNullException`.

### Driver.S7-004

| Field | Value |
|---|---|
| Severity | Medium |
| Category | OtOpcUa conventions |
| Location | `S7Driver.cs` (whole file) |
| Status | Resolved |

**Description:** The driver performs no logging. CLAUDE.md Library Preferences mandate Serilog with a rolling daily file sink. Every error path is an empty catch block (Initialize cleanup line 130, ShutdownAsync lines 142/149/153, ProbeLoop line 483, PollLoop lines 396/406, Dispose line 511). Connection faults, probe transitions, PUT/GET-disabled config errors, and poll-loop exceptions are all silently swallowed. An operator has only the DriverHealth.LastError string and no event trail to diagnose an intermittent PLC.

**Recommendation:** Inject an ILogger/ILoggerFactory and log connect success/failure, probe Running/Stopped transitions, PUT/GET-disabled detection, and swallowed poll-loop / shutdown exceptions.

**Resolution:** Resolved 2026-05-22: injected `ILogger` (optional, defaults to `NullLogger`) into the primary constructor; added structured log calls for connect success/failure, probe Running/Stopped transitions, and swallowed poll-loop exceptions, giving operators an event trail via Serilog.

### Driver.S7-005

| Field | Value |
|---|---|
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `S7Driver.cs:33`, `S7Driver.cs:433` |
| Status | Resolved |

**Description:** System.Collections.Concurrent.ConcurrentDictionary is written out with a fully-qualified namespace at the field declarations instead of a using System.Collections.Concurrent directive. ImplicitUsings is enabled and the rest of the codebase relies on using directives; the inline FQN is inconsistent with house style. Similar redundant global::S7.Net.* qualifiers appear throughout S7Driver.cs despite the file-top using S7.Net.

**Recommendation:** Add using System.Collections.Concurrent and drop the redundant global::S7.Net. qualifiers where using S7.Net already covers them.

**Resolution:** Resolved 2026-05-23: `using System.Collections.Concurrent` was already added by an earlier finding fix; this resolution removes the remaining `global::S7.Net.Plc` qualifiers from the `ReadOneAsync` and `WriteOneAsync` signatures, now using the unqualified `Plc` type (the file-top `using S7.Net` already covers it). House style restored.

### Driver.S7-006

| Field | Value |
|---|---|
| Severity | High |
| Category | Concurrency & thread safety |
| Location | `S7Driver.cs:140`, `S7Driver.cs:457`, `S7Driver.cs:506` |
| Status | Resolved |

**Description:** Disposal races with the in-flight probe / poll tasks.
ShutdownAsync calls _probeCts.Cancel() and cancels each subscription CTS, but it does not await the ProbeLoopAsync / PollLoopAsync tasks (they are fire-and-forget Task.Run with the task handle discarded). DisposeAsync then calls ShutdownAsync followed immediately by _gate.Dispose(). A probe or poll iteration that is between _gate.WaitAsync and _gate.Release() when cancellation fires will either call Release() (line 479) on a disposed semaphore or have WaitAsync observe one, raising ObjectDisposedException. The probe loop's broad catch swallows it, but the disposal-ordering bug is real: the semaphore can be disposed while a worker still holds or is waiting on it. The same applies to _probeCts.Dispose() (line 143) running while ProbeLoopAsync may still touch the linked token.

**Recommendation:** Track the probe and poll Task handles, and in ShutdownAsync (or DisposeAsync) await Task.WhenAll(...) with a bounded timeout after cancelling, before disposing _gate and the CTS objects.

**Resolution:** Resolved 2026-05-22: the probe loop now stores its Task in `_probeTask` and each subscription records its poll Task in `SubscriptionState.PollTask`. `ShutdownAsync` cancels every CTS, awaits `Task.WhenAll` of those handles with a bounded 5 s `DrainTimeout`, and only then disposes `_probeCts`, the subscription CTSs, and (via `DisposeAsync`) `_gate`, so no loop can touch a disposed semaphore. `Task.Run` is now passed `CancellationToken.None` so the handle is always awaitable even if the token is already cancelled.

### Driver.S7-007

| Field | Value |
|---|---|
| Severity | High |
| Category | Error handling & resilience |
| Location | `S7Driver.cs:200`, `S7DriverOptions.cs:13`, `docs/v2/driver-specs.md:434` |
| Status | Resolved |

**Description:** PUT/GET-disabled handling contradicts the design and the module's own docstring.
driver-specs.md section 5 (line 434) and the S7DriverOptions class remark both state PUT/GET-disabled must be mapped to BadNotSupported and surfaced as a configuration alert, not a transient fault, because blind retry is wasted effort. The actual code (ReadAsync, lines 200-208) catches every S7.Net.PlcException and maps it to StatusBadDeviceFailure, then sets health to Degraded. Consequences: (1) a genuinely transient PlcException (e.g. CPU briefly in STOP) is reported identically to a permanent PUT/GET misconfiguration - the operator cannot tell a config problem from a transient one, which is the exact distinction the spec demands; (2) the promised BadNotSupported status code is never produced, so the S7DriverOptions docstring is now false.

**Recommendation:** Inspect PlcException.ErrorCode and map the PUT/GET-disabled / access-denied code to BadNotSupported with a distinct config-alert health state; keep BadDeviceFailure/Degraded only for genuine device-fault error codes.

**Resolution:** Resolved 2026-05-22: `ReadAsync` / `WriteAsync` now split the `PlcException` catch via an `IsAccessDenied` filter. S7.Net exposes no typed error code for the S7 `AccessingObjectNotAllowed` status (its `ValidateResponseCode` throws a plain `Exception` wrapped as the inner exception of a `PlcException`), so `IsAccessDenied` walks the inner-exception chain for the "not allowed" marker. A PUT/GET-disabled fault now maps to `BadNotSupported` and sets health to `Faulted` with a config-alert message pointing operators at the TIA Portal PUT/GET toggle; a genuine device fault still maps to `BadDeviceFailure`/`Degraded`.

### Driver.S7-008

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `S7Driver.cs:286` |
| Status | Resolved |

**Description:** WriteAsync's catch ladder is coarser than ReadAsync's and loses information.
The generic catch (Exception) maps everything - socket errors, timeouts, OverflowException from Convert.ToInt16 of an out-of-range value, NullReferenceException from Convert.ToBoolean(null) - to StatusBadInternalError. A genuine transport fault during a write is reported to the client as an internal error rather than BadCommunicationError, and unlike ReadAsync the write path never updates _health on failure, so a PLC that is down stays Healthy in the dashboard as long as only writes are attempted. OperationCanceledException is also caught and turned into a status code rather than propagating.

**Recommendation:** Mirror the ReadAsync catch structure: let OperationCanceledException propagate, map socket/timeout faults to BadCommunicationError, map value-conversion failures to a distinct out-of-range status, and update _health to Degraded on transport failures.

**Resolution:** Resolved 2026-05-22: restructured the `WriteAsync` catch ladder: `OperationCanceledException` now re-throws, genuine `PlcException` transport faults map to `BadDeviceFailure`/`Degraded`, `NotSupportedException` maps to `BadNotSupported`, the `IsAccessDenied` PlcException path maps to `BadNotSupported`/`Faulted`, and the catch-all maps to `BadCommunicationError` with a health update, matching `ReadAsync`'s structure.

### Driver.S7-009

| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `S7Driver.cs:392` |
| Status | Resolved |

**Description:** The subscription poll loop never reflects sustained polling failure anywhere an operator can see it. PollLoopAsync swallows every non-cancellation exception with an empty catch and the comment claims "the health surface reflects it" - but a poll failure routes through ReadAsync, which only sets DriverState.Degraded when the per-tag read throws inside the gate; exceptions thrown before that (e.g. RequirePlc() when Plc is null after a drop) bypass the health update entirely.
A subscription against an uninitialized or dropped driver loops forever silently, with no backoff - re-polling every Interval indefinitely on a hard failure.

**Recommendation:** Have the poll loop update _health on repeated failure and apply a capped backoff after consecutive errors; at minimum log the swallowed exception (see Driver.S7-004).

**Resolution:** Resolved 2026-05-23: `PollLoopAsync` now tracks `consecutiveFailures`, calls new `HandlePollFailure` which both logs (with the failure count) AND degrades `_health` to `Degraded` once `PollFailureHealthThreshold` (1) consecutive failures have accumulated, and applies a capped exponential backoff via new `ComputeBackoffDelay` (doubles the wait each consecutive failure up to a 30 s `PollBackoffCap`). A healthy tick resets the counter so the cadence snaps back to the configured Interval. `HandlePollFailure` refuses to downgrade a `Faulted` state (reserved for permanent config faults like PUT/GET-denied). Regression test `PollLoop_against_uninitialized_driver_degrades_health` proves the health surface now reflects sustained failure; `PollLoop_applies_capped_backoff_after_consecutive_failures` proves shutdown still completes inside the drain window even under a fault storm.

### Driver.S7-010

| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | `S7Driver.cs:504` |
| Status | Resolved |

**Description:** Dispose() is implemented as DisposeAsync().AsTask().GetAwaiter().GetResult() - sync-over-async. Inside the generic host this is currently safe (no captured SynchronizationContext), but it is a known deadlock pattern. The only async work behind DisposeAsync is ShutdownAsync, which does nothing async (returns Task.CompletedTask). The blocking wrap is unnecessary risk.

**Recommendation:** Since ShutdownAsync is effectively synchronous, have Dispose() perform the teardown directly (cancel CTSs, close Plc, dispose _gate) without round-tripping through the async path.
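
A direct teardown of this kind might look roughly like the sketch below. `LoopOwner` and all of its members are hypothetical stand-ins, not the real `S7Driver`; the 5 s bound and the `CancellationToken.None` argument to `Task.Run` are borrowed from the Driver.S7-006 resolution, and the single background loop is a simplification of the probe and poll loops.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a driver that owns one gated background loop.
public sealed class LoopOwner : IDisposable
{
    private static readonly TimeSpan DrainTimeout = TimeSpan.FromSeconds(5);
    private readonly CancellationTokenSource _cts = new();
    private readonly SemaphoreSlim _gate = new(1, 1);
    private readonly Task _loop;
    private bool _disposed;

    public LoopOwner()
    {
        var token = _cts.Token;
        // CancellationToken.None keeps the handle awaitable even if the
        // token is cancelled before the delegate starts running.
        _loop = Task.Run(() => LoopAsync(token), CancellationToken.None);
    }

    public bool LoopCompleted => _loop.IsCompleted;

    private async Task LoopAsync(CancellationToken ct)
    {
        try
        {
            while (!ct.IsCancellationRequested)
            {
                await _gate.WaitAsync(ct).ConfigureAwait(false);
                try { /* one unit of gated work would go here */ }
                finally { _gate.Release(); }
                await Task.Delay(50, ct).ConfigureAwait(false);
            }
        }
        catch (OperationCanceledException) { /* normal shutdown path */ }
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        _cts.Cancel();
        // Bounded, synchronous drain: no DisposeAsync round-trip, and a
        // wedged loop cannot hang Dispose for longer than DrainTimeout.
        try { _loop.Wait(DrainTimeout); }
        catch (AggregateException) { /* loop faulted; continue teardown */ }

        // Only after the drain are the shared primitives disposed.
        _cts.Dispose();
        _gate.Dispose();
    }
}
```

The ordering is the point of the sketch: cancel, drain with a bound, then dispose the token source and semaphore, so the loop never observes a disposed primitive in the normal case.
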

**Resolution:** Resolved 2026-05-23: `Dispose()` now performs teardown directly via a new private `SynchronousTeardown` method that mirrors `ShutdownAsync` but uses `Task.WhenAll(...).Wait(DrainTimeout)` instead of `await Task.WhenAll(...).WaitAsync(...)`. Probe + poll Tasks are still drained with the bounded 5 s timeout (so a wedged loop cannot hang `Dispose` indefinitely), but the sync path no longer round-trips through `DisposeAsync().AsTask().GetAwaiter().GetResult()`. `DisposeAsync` keeps its existing implementation for callers that opt into the async dispose pattern. Regression tests: `Dispose_completes_synchronously_without_sync_over_async_round_trip` and `Dispose_is_idempotent`.

### Driver.S7-011

| Field | Value |
|---|---|
| Severity | High |
| Category | Design-document adherence |
| Location | `S7Driver.cs:82`, `S7Driver.cs:134`, `IDriver.cs:24` |
| Status | Resolved |

**Description:** S7Driver ignores the driverConfigJson parameter on both InitializeAsync and ReinitializeAsync. The IDriver contract states InitializeAsync initializes the driver "from its DriverConfig JSON" and ReinitializeAsync "applies a config change in place". All configuration is instead captured in the constructor (S7DriverOptions options), and ReinitializeAsync simply calls ShutdownAsync then InitializeAsync with the same options object. Consequently a config change delivered to ReinitializeAsync (the documented IGenerationApplier recovery path per driver-stability.md) is silently discarded - the driver re-opens with the old config. This breaks the only Core-initiated in-process recovery path.

**Recommendation:** Either re-parse driverConfigJson inside InitializeAsync/ReinitializeAsync and rebuild _options from it, or document explicitly that S7 reconfiguration requires instance recreation and have ReinitializeAsync signal that the passed JSON is unused so the contract mismatch is visible.
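
The first option, re-parsing the JSON on every initialize, could be sketched as follows. Everything here is hypothetical: `DemoOptions` is a two-field stand-in for the much larger `S7DriverOptions`, `ConfigReparseDemo.Resolve` is an invented helper, and falling back to the constructor-supplied options for a blank document is an assumption of this sketch rather than something the contract prescribes.

```csharp
using System;
using System.Text.Json;

// Hypothetical options shape; the real S7DriverOptions has more members.
public sealed record DemoOptions(string Host, int Port);

public static class ConfigReparseDemo
{
    private static readonly JsonSerializerOptions JsonOptions =
        new() { PropertyNameCaseInsensitive = true };

    // Rebuilds the options from the JSON handed to (Re)InitializeAsync.
    // Assumption: a null or blank document keeps the constructor options.
    public static DemoOptions Resolve(string driverConfigJson, DemoOptions constructorOptions)
    {
        if (string.IsNullOrWhiteSpace(driverConfigJson))
            return constructorOptions;

        return JsonSerializer.Deserialize<DemoOptions>(driverConfigJson, JsonOptions)
               ?? throw new InvalidOperationException("driverConfigJson deserialized to null.");
    }
}
```

Called with `{"host":"10.0.0.5","port":102}`, `Resolve` returns options carrying the new host, so a reinitialize would reconnect with the changed config instead of the values captured at construction.
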

**Resolution:** Resolved 2026-05-22: config parsing was factored out of the factory into `S7DriverFactoryExtensions.ParseOptions`. `InitializeAsync` (and therefore `ReinitializeAsync`, which delegates to it) now re-parses `driverConfigJson` and rebuilds `_options` from it whenever the document carries a real body, so a config change delivered through `ReinitializeAsync` (the only Core-initiated in-process recovery path) is honoured. An empty / placeholder document (`""`, `{}`, `[]`) keeps the constructor-supplied options so existing lifecycle unit tests that pass `"{}"` are unaffected.

### Driver.S7-012

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Design-document adherence |
| Location | `S7DriverOptions.cs:59`, `S7Driver.cs:457` |
| Status | Resolved |

**Description:** S7ProbeOptions.ProbeAddress is configured (default "MW0"), documented at length ("the driver runs a tick loop that issues a cheap read against S7ProbeOptions.ProbeAddress"), surfaced in the factory DTO (S7ProbeDto.ProbeAddress), and parsed from JSON - but it is never read by any code. ProbeLoopAsync probes liveness via plc.ReadStatusAsync (CPU status), not via a read of ProbeAddress. The XML doc on the S7DriverOptions.Probe property and on ProbeAddress describes behaviour the driver does not implement. An operator who sets ProbeAddress to a known-good DB word expecting the probe to exercise it will see no effect.

**Recommendation:** Either make ProbeLoopAsync actually read ProbeAddress (parsing it once at init and rejecting a bad value early), or delete ProbeAddress from S7ProbeOptions/S7ProbeDto and correct the XML docs to describe the ReadStatusAsync-based probe.

**Resolution:** Resolved 2026-05-22: removed `ProbeAddress` from `S7ProbeOptions` and `S7ProbeDto`; updated the `S7DriverOptions.Probe` XML doc to describe the `ReadStatusAsync`-based probe accurately.
A `probeAddress` field in an existing config is now silently ignored (the deserializer tolerates unknown JSON fields).

### Driver.S7-013

| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `S7DriverOptions.cs:90`, `S7Driver.cs:300` |
| Status | Resolved |

**Description:** S7TagDefinition.StringLength is a public configured/JSON-bound parameter (default 254) but is dead: S7DataType.String reads and writes both throw NotSupportedException ("...land in a follow-up PR"), so StringLength is never consumed. Likewise S7DataType.Int64, UInt64, Float64, String, and DateTime are exposed as configurable, browse through MapDataType into real DriverDataType values, and pass DiscoverAsync - creating address-space nodes - yet every read/write of them throws NotSupportedException, becoming BadNotSupported. A site can configure a Float64 tag, see the node appear, and get BadNotSupported on every access. The scaffold/follow-up-PR split leaks half-implemented types into the configurable surface.

**Recommendation:** Reject the not-yet-implemented S7DataType values (and StringLength) at InitializeAsync / factory validation with a clear "not yet supported" error, so a partially-implemented type cannot be configured into a live address space.

**Resolution:** Resolved 2026-05-23: `InitializeAsync` now runs new `RejectUnsupportedTagDataTypes`, which throws `NotSupportedException` for any tag whose `DataType` is in the `UnimplementedDataTypes` set (`Int64`, `UInt64`, `Float64`, `String`, `DateTime`). The half-implemented types can no longer leak into the live address space: a site that configures one fails fast at init rather than seeing a node that returns `BadNotSupported` on every access. Entries should be removed from `UnimplementedDataTypes` as each type is wired through; the comment on `RejectUnsupportedTagDataTypes` makes it a single grep target for that follow-up.
`StringLength` remains in `S7TagDefinition` because removing it would be a breaking change to existing config JSON; once `String` is implemented it will be consumed without further config changes. Regression tests `Initialize_rejects_not_yet_implemented_data_type_with_NotSupportedException` (Theory, 5 types) and `Initialize_accepts_implemented_data_types` (Theory, 7 types) prove the guard is targeted.

### Driver.S7-014

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.S7.Tests/` |
| Status | Resolved |

**Description:** Test coverage has notable gaps for the driver's behavioural core: (1) no test exercises the ReadOneAsync type-reinterpret switch (Int16 from ushort, Int32 from uint, Float32 from UInt32 bits) - the most logic-heavy method in the driver is untested, and the unsigned/signed unchecked casts are unverified; (2) no test covers a Timer/Counter tag end-to-end, which would have caught Driver.S7-001; (3) no test covers WriteOneAsync boxing conversions or the out-of-range Convert failure paths; (4) the read-write tests only cover error paths (uninitialized, bad address) - the happy path is explicitly deferred to "a follow-up PR" with no mock S7 server, leaving the entire successful read, write, poll, and probe-transition surface untested; (5) ReinitializeAsync and the driverConfigJson-ignored behaviour (Driver.S7-011) have no test.

**Recommendation:** Add unit tests for ReadOneAsync/WriteOneAsync type mapping by factoring the pure reinterpret/boxing logic out of the PLC round-trip so it is testable without a live PLC, and add a Timer/Counter rejection test. Track the live/mock-server happy-path coverage as an explicit follow-up rather than an open-ended deferral.

**Resolution:** Resolved 2026-05-22: factored `ReadOneAsync` type-reinterpret into `internal static ReinterpretRawValue` and `WriteOneAsync` boxing into `internal static BoxValueForWrite`; added `S7TypeMappingTests.cs` (26 tests) covering every implemented type round-trip (Bool/Byte/UInt16/Int16/UInt32/Int32/Float32), unsupported-type `NotSupportedException` assertions, and write overflow paths.
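
To show why factoring the pure logic out makes it testable without a PLC, here is a minimal sketch of reinterpret and boxing helpers of the kind described. It is not the driver's `ReinterpretRawValue` or `BoxValueForWrite`: `DemoType`, `TypeMappingDemo`, and the assumption that a read hands back a boxed `ushort` for 16-bit types and a boxed `uint` for 32-bit types are all illustrative.

```csharp
using System;

// Hypothetical subset of data types, for illustration only.
public enum DemoType { UInt16, Int16, UInt32, Int32, Float32 }

public static class TypeMappingDemo
{
    // Reinterprets an unsigned raw word/dword as the tag's declared type.
    // Each arm is cast to object so the switch does not pick float as a
    // common type and silently convert the integer results.
    public static object Reinterpret(DemoType type, object raw) => type switch
    {
        DemoType.UInt16  => (object)(ushort)raw,
        DemoType.Int16   => (object)unchecked((short)(ushort)raw),
        DemoType.UInt32  => (object)(uint)raw,
        DemoType.Int32   => (object)unchecked((int)(uint)raw),
        DemoType.Float32 => (object)BitConverter.Int32BitsToSingle(unchecked((int)(uint)raw)),
        _ => throw new NotSupportedException($"Data type {type} is not covered by this sketch."),
    };

    // Converts a client-supplied value to the CLR type written to the PLC.
    // Convert.* throws OverflowException for out-of-range integers.
    public static object BoxForWrite(DemoType type, object value) => type switch
    {
        DemoType.UInt16  => (object)Convert.ToUInt16(value),
        DemoType.Int16   => (object)Convert.ToInt16(value),
        DemoType.UInt32  => (object)Convert.ToUInt32(value),
        DemoType.Int32   => (object)Convert.ToInt32(value),
        DemoType.Float32 => (object)Convert.ToSingle(value),
        _ => throw new NotSupportedException($"Data type {type} is not covered by this sketch."),
    };
}
```

Because both helpers are pure, assertions such as "a raw `0xFFFF` read as `Int16` is `-1`", "raw bits `0x3FC00000` read as `Float32` are `1.5`", and "writing `40000` to an `Int16` tag throws `OverflowException`" need no connection at all.
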