Files
lmxopcua/code-reviews/Driver.FOCAS/findings.md
Joseph Doherty 807ea11187 fix(driver-focas): resolve Medium code-review finding (Driver.FOCAS-006)
EnsureConnectedAsync now disposes and nulls any existing non-connected
client before creating a fresh one via _clientFactory.Create().

Previously the method reused a cached client via ConnectAsync, but a
client disposed by a HandleRecycle race or prior teardown would hit
FocasWireClient.ThrowIfDisposed on every subsequent call, leaving the
device permanently wedged with BadCommunicationError and no recovery
path until ReinitializeAsync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 09:27:08 -04:00

16 KiB

Code Review — Driver.FOCAS

Field Value
Module src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.FOCAS
Reviewer Claude Code
Review date 2026-05-22
Commit reviewed 76d35d1
Status Reviewed
Open findings 6

Checklist coverage

A comprehensive review completes every category, recording "No issues found" where a category produced nothing rather than leaving it blank.

# Category Result
1 Correctness & logic bugs Driver.FOCAS-001, Driver.FOCAS-002, Driver.FOCAS-003
2 OtOpcUa conventions Driver.FOCAS-004
3 Concurrency & thread safety Driver.FOCAS-005
4 Error handling & resilience Driver.FOCAS-006, Driver.FOCAS-007
5 Security No issues found
6 Performance & resource management Driver.FOCAS-008
7 Design-document adherence Driver.FOCAS-009
8 Code organization & conventions Driver.FOCAS-010, Driver.FOCAS-011
9 Testing coverage Driver.FOCAS-012
10 Documentation & comments No issues found

Findings

Driver.FOCAS-001

Field Value
Severity High
Category Correctness & logic bugs
Location FocasDriverFactoryExtensions.cs:54-86, FocasDriverFactoryExtensions.cs:132-140
Status Resolved

Description: FocasDriverConfigDto exposes only Backend, Series, TimeoutMs, Devices, Tags, and Probe. It has no FixedTree, AlarmProjection, or HandleRecycle properties, and CreateInstance never sets those three options on FocasDriverOptions. As a result, a deployment that follows the documented config - docs/drivers/FOCAS.md shows "FixedTree": { "Enabled": true }, "AlarmProjection": { "Enabled": true }, and "HandleRecycle": { "Enabled": true } inside Config - is parsed with PropertyNameCaseInsensitive and the unknown sections are discarded. The features stay at their hard-coded defaults (all Enabled = false). The fixed-node tree never appears, alarm subscriptions throw NotSupportedException ("FOCAS alarm projection is disabled"), and handle recycling never runs - despite the operator explicitly opting in.

Recommendation: Add FixedTree, AlarmProjection, and HandleRecycle DTO classes to FocasDriverConfigDto, parse their TimeSpan/bool fields, and populate the corresponding FocasDriverOptions properties in CreateInstance. Consider enabling strict JSON handling (UnmappedMemberHandling.Disallow) so future unknown config sections fail loudly instead of being dropped.

Resolution: Resolved 2026-05-22 — added FixedTreeDto/AlarmProjectionDto/HandleRecycleDto to FocasDriverConfigDto and Build* mappers in CreateInstance that populate the matching FocasDriverOptions properties (missing section / field keeps its default).

Driver.FOCAS-002

Field Value
Severity High
Category Correctness & logic bugs
Location WireFocasClient.cs:164-179, FocasDriver.cs:513, FocasDriver.cs:593
Status Resolved

Description: The fixed-tree bootstrap probes the ProgramInfo capability via SafeTryProbe(() => client.GetProgramInfoAsync(ct)) and treats a non-null result as "supported". But WireFocasClient.GetProgramInfoAsync never throws on a FOCAS error return code: ReadExecutingProgramNameAsync, ReadBlockCountAsync, and ReadOperationModeCodeAsync all return FocasResult<T> envelopes, and the method substitutes defaults (string.Empty, 0) when IsOk is false instead of throwing. It only throws from RequireConnected(). Consequently GetProgramInfoAsync always returns a non-null FocasProgramInfo, so Capabilities.ProgramInfo is set true even on a CNC series that returns EW_FUNC/EW_NOOPT for cnc_exeprgname2/cnc_rdopmode. The driver then emits the Program/ and OperationMode/ subtrees and polls them every tick against a controller that does not support them - the exact "nodes that only ever return BadDeviceFailure" outcome the capability suppression was designed to prevent (docs/drivers/FOCAS.md, "Per-series node suppression").

Recommendation: Make GetProgramInfoAsync throw (or return a nullable result) when the underlying cnc_exeprgname2 / cnc_rdopmode calls report a non-zero RC, so SafeTryProbe can correctly classify the series. At minimum require the program-name or op-mode read to be IsOk before declaring the capability present.

Resolution: Resolved 2026-05-22 — WireFocasClient.GetProgramInfoAsync now throws InvalidOperationException when neither the cnc_exeprgname2 nor the cnc_rdopmode read is IsOk, so SafeTryProbe records ProgramInfo as unsupported on series that answer EW_FUNC/EW_NOOPT.

Driver.FOCAS-003

Field Value
Severity Medium
Category Correctness & logic bugs
Location FocasDriver.cs:71-79
Status Resolved

Description: In InitializeAsync, capability-matrix validation only runs when _devices.TryGetValue(tag.DeviceHostAddress, out var device) succeeds. A tag whose DeviceHostAddress does not match any configured device (a common config typo, e.g. a trailing :8193 mismatch or a wrong host) silently skips validation and is still added to _tagsByName. The mistake is not surfaced at load time - it only manifests at read time as BadNodeIdUnknown (ReadAsync lines 191-194), defeating the documented goal that "config errors now fail at load instead of per-read" (docs/v2/focas-version-matrix.md).

Recommendation: After parsing the tag address, if _devices does not contain tag.DeviceHostAddress, throw an InvalidOperationException naming the tag and the unresolved device host so the operator fixes the typo at startup.

Resolution: Resolved 2026-05-22 — InitializeAsync now throws InvalidOperationException naming the tag and the unresolved device when _devices does not contain tag.DeviceHostAddress, preventing silent skip-and-defer to per-read BadNodeIdUnknown.

Driver.FOCAS-004

Field Value
Severity Medium
Category OtOpcUa conventions
Location FocasDriver.cs:374-379, WireFocasClient.cs:48-50
Status Resolved

Description: DiscoverAsync emits user tags with SecurityClass = tag.Writable ? SecurityClassification.Operate : SecurityClassification.ViewOnly, and FocasTagDefinition.Writable defaults to true (also defaulted to true in the factory - t.Writable ?? true). But the production wire backend's WireFocasClient.WriteAsync unconditionally returns FocasStatusMapper.BadNotWritable

  • the driver is read-only against FOCAS by design (docs/drivers/FOCAS.md). The result is that every tag is advertised in the address space as a writable Operate node, yet every write attempt fails. This is misleading to OPC UA clients and to the DriverNodeManager ACL layer, which will grant write permission on nodes that can never be written.

Recommendation: Either default Writable to false for the FOCAS driver, or have DiscoverAsync force SecurityClassification.ViewOnly when the active backend cannot write. Given the wire backend is read-only and is the only production backend, treating all FOCAS tags as ViewOnly is the simplest correct behaviour.

Resolution: Resolved 2026-05-22 — DiscoverAsync now unconditionally emits SecurityClassification.ViewOnly for all user-authored tags; the Writable config field no longer influences the advertised security class since the wire backend never writes.

Driver.FOCAS-005

Field Value
Severity Medium
Category Concurrency & thread safety
Location FocasDriver.cs:28, FocasDriver.cs:206-215, FocasDriver.cs:261, FocasDriver.cs:274
Status Resolved

Description: _health is a plain (non-volatile) field mutated from multiple concurrent contexts - ReadAsync, WriteAsync, and the per-device ProbeLoopAsync can all run on different threads simultaneously (subscriptions go through PollGroupEngine timers; probe loops are Task.Run). Several updates are read-modify-write - e.g. new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, ...) reads _health then writes a new instance - so a concurrent update can be lost or a stale LastSuccessfulRead propagated. While DriverHealth is an immutable record and the reference write is atomic, the lack of synchronization means GetHealth() can observe torn-in-time state and successful-read timestamps can regress.

Recommendation: Guard _health reads/writes with a lock, or use Interlocked.Exchange/Volatile around the whole record reference and compute the new value from a single captured snapshot. The DeviceState/HostState transition already uses ProbeLock; apply the same discipline to driver health.

Resolution: Resolved 2026-05-22 — All _health reads use Volatile.Read(ref _health) and all writes use Volatile.Write(ref _health, ...), ensuring every thread observes the latest reference and multi-step read-modify-write sequences capture a stable snapshot before computing the new value.

Driver.FOCAS-006

Field Value
Severity Medium
Category Error handling & resilience
Location FocasDriver.cs:859-874, WireFocasClient.cs:22-31
Status Resolved

Description: EnsureConnectedAsync reuses the cached IFocasClient instance across a transient disconnect: it only checks device.Client is { IsConnected: true } and otherwise calls ConnectAsync again on the same object. For a WireFocasClient whose underlying FocasWireClient has been disposed (e.g. via a HandleRecycle / DisposeClient race, or a prior teardown), every subsequent call hits FocasWireClient.ThrowIfDisposed and throws ObjectDisposedException. In ReadAsync that exception is caught only by the generic catch (Exception ex) and mapped to a permanent BadCommunicationError - the device stays wedged with no recovery path until ReinitializeAsync is invoked, because the reconnect logic never discards the disposed client.

Recommendation: On any connect/use failure, treat a disposed or non-connected client as unrecoverable and recreate it from _clientFactory. Simplest: in EnsureConnectedAsync, when device.Client is non-null but not connected, dispose and null it before creating a fresh instance, rather than retrying ConnectAsync on the stale object.

Resolution: Resolved 2026-05-22 — EnsureConnectedAsync now unconditionally disposes and nulls any existing non-connected client before calling _clientFactory.Create(), preventing ObjectDisposedException loops on a stale WireFocasClient after a HandleRecycle race or prior teardown.

Driver.FOCAS-007

Field Value
Severity Low
Category Error handling & resilience
Location FocasDriver.cs:140-148, FocasDriver.cs:478-484, FocasDriver.cs:529-533, FocasAlarmProjection.cs:61-63
Status Open

Description: Numerous try { ... } catch {} blocks swallow every exception with no logging - ShutdownAsync (CTS cancel/dispose), RecycleLoopAsync (DisposeClient), FixedTreeLoopAsync transient catches, ProbeLoopAsync, and the alarm projection's sub.Cts.Cancel(). The driver takes no ILogger dependency at all (only FocasWireClient optionally accepts one, and the driver never supplies it). A CNC that is silently failing every probe/poll tick produces no diagnostic trail, which conflicts with the project's Serilog logging convention and forces field troubleshooting to rely solely on GetHealth().

Recommendation: Inject an ILogger<FocasDriver> and log caught exceptions in the poll/probe/recycle loops at Debug/Warning. Pass a logger into FocasWireClient so the per-response Debug entries it already emits are actually captured.

Resolution: (open)

Driver.FOCAS-008

Field Value
Severity Low
Category Performance & resource management
Location FocasDriver.cs:201, FocasDriver.cs:253
Status Open

Description: ReadAsync and WriteAsync call FocasAddress.TryParse(def.Address) on every operation, even though InitializeAsync already parsed and validated every tag address. On a subscription hot path (each poll tick re-enters ReadAsync) this re-parses and allocates a FocasAddress record per tag per tick unnecessarily.

Recommendation: Parse each tag address once at InitializeAsync and store the parsed FocasAddress on FocasTagDefinition (or in a side dictionary), so the runtime read/write paths use the cached value.

Resolution: (open)

Driver.FOCAS-009

Field Value
Severity Low
Category Design-document adherence
Location FocasDriverOptions.cs:110-115, FocasDriver.cs:468-486, FocasDriverFactoryExtensions.cs:75-80
Status Open

Description: FocasProbeOptions.Timeout is parsed by the factory (FocasProbeDto.TimeoutMs to FocasProbeOptions.Timeout) but never consumed. ProbeLoopAsync calls client.ProbeAsync(ct) with only the probe-loop cancellation token; no per-probe timeout is applied, and EnsureConnectedAsync uses _options.Timeout rather than Probe.Timeout. A hung CNC socket during a probe blocks until the OS TCP timeout rather than the configured Probe.Timeout.

Recommendation: Apply Probe.Timeout as a linked CancellationTokenSource timeout around the ProbeAsync call, or remove the dead Timeout field from FocasProbeOptions / FocasProbeDto if it is genuinely not intended.

Resolution: (open)

Driver.FOCAS-010

Field Value
Severity Low
Category Code organization & conventions
Location IFocasClient.cs:210-227 (FocasOpMode), FocasConstants.cs:42-78 (FocasOperationMode)
Status Open

Description: There are two parallel operation-mode-to-text mappings with divergent labels. FocasOpMode.ToText (used by the driver fixed-tree OperationMode/ModeText node) yields "TJOG", "TEACH_IN_HANDLE"; FocasOperationModeExtensions.ToText (in the Wire layer) yields "T-JOG", "TEACH-IN-HANDLE". They also use different fallback formats (Mode{mode} vs the bare number). The same concept is encoded twice with inconsistent results depending on which path renders it.

Recommendation: Consolidate to a single op-mode enum + ToText helper shared by both the wire layer and the driver projection, with one canonical label set.

Resolution: (open)

Driver.FOCAS-011

Field Value
Severity Low
Category Code organization & conventions
Location IFocasClient.cs:275-287 (FocasAlarmType), FocasAlarmProjection.cs:149-175
Status Open

Description: FocasAlarmType declares its constants as public const int, but the only consumers - FocasAlarmProjection.MapAlarmType(short type) and MapSeverity(short type) - take a short and switch against these int constants. It compiles only because the values (0..13) fit in short range as constant expressions. The type mismatch is a latent maintenance hazard: adding a constant above short.MaxValue, or changing the projection signatures, would break the switch in non-obvious ways. FocasAlarmType.All is -1 and is also passed where a short is expected by ReadAlarmsAsync.

Recommendation: Declare the FocasAlarmType constants as short (or make it an enum : short) so the type matches the wire field width and the projection signatures.

Resolution: (open)

Driver.FOCAS-012

Field Value
Severity Medium
Category Testing coverage
Location FocasDriverFactoryExtensions.cs, FocasDriver.cs:495-629 (FixedTreeLoopAsync)
Status Open

Description: The unit test project does not exercise FocasDriverFactoryExtensions.CreateInstance with FixedTree / AlarmProjection / HandleRecycle config sections - which is why the config-mapping gap in Driver.FOCAS-001 was not caught. There is also no test that drives the fixed-tree bootstrap / capability-probe path (FixedTreeLoopAsync), so the false-positive ProgramInfo capability in Driver.FOCAS-002 is untested, and the EnsureConnectedAsync reconnect-after-disconnect path (Driver.FOCAS-006) has no coverage.

Recommendation: Add factory tests that round-trip a full JSON config including the three opt-in sections and assert the options reach the driver; add a FakeFocasClient-driven test for fixed-tree bootstrap capability classification (including the unsupported-program-info case); add a reconnect test that disposes the fake client mid-session and asserts recovery.

Resolution: (open)