Files
lmxopcua/code-reviews/Driver.Cli.Common/findings.md
Joseph Doherty 41e62b2663 docs(code-reviews): updated re-review at commit a9be809 — 12 new findings
Re-reviewed the four modules with source changes since the previous review
commit 76d35d1, per REVIEW-PROCESS.md section 6. Updated each findings.md
header (date 2026-05-23, commit a9be809) and appended new findings under
continued numbering. Regenerated README.md.

## New findings — 12 total across 4 modules

### Core.Scripting (5 new, IDs -012 to -016)
- **-012 High Security** — broadened BCL references (System.* + netstandard)
  re-expose System.Threading.ThreadPool / Timer / AssemblyLoadContext, which
  the analyzer's deny-list doesn't cover. Re-introduces the background-work
  threat Core.Scripting-003 closed via System.Threading.Tasks deny.
- **-013 Medium Security** — hand-rolled wrapper-source generation lets
  brace-balanced user source inject sibling methods/classes alongside
  CompiledScript.Run. Analyzer still gates forbidden types, but the
  documented 'method body' authoring contract is silently relaxed.
- **-014 Medium Concurrency** — CompiledScriptCache.Clear() uses key-only
  TryRemove(key, out _) — the same race the -006 resolution fixed in
  GetOrCompile's catch is latent here on publish-replace.
- **-015 Low Correctness** — ToCSharpTypeName truncates at first backtick;
  silently drops closed type arguments of nested-generic shapes (Outer<>.Inner<>).
  Latent — no production caller uses this shape today.
- **-016 Medium Performance** — VirtualTagEngine + ScriptedAlarmEngine call
  ScriptEvaluator.Compile directly without going through CompiledScriptCache,
  so the headline -008 collectible-ALC fix doesn't run on the actual
  production path — the per-publish leak is still in effect.

### Core.ScriptedAlarms (1 new, ID -013)
- **-013 Low Documentation** — new internal test accessors return the live
  mutable scratch dictionary; XML docs don't warn future test authors about
  the synchronisation contract.

### Driver.Cli.Common (2 new, IDs -007, -008)
- **-007 High Correctness** — 0x80550000 was added as BadDeviceFailure but
  the real OPC UA spec value for BadDeviceFailure is 0x808B0000 (verified
  against Driver.Galaxy.Runtime.StatusCodeMap and HistorianQualityMapper,
  both of which use the correct 0x808B0000). 0x80550000 is actually
  BadSecurityPolicyRejected. The native mappers (FOCAS / AbCip / AbLegacy)
  all use the wrong 0x80550000; this session's SnapshotFormatter extension
  propagated the wrong name and the test asserts against the same wrong
  value so CI is blind — same shape of bug as Driver.Cli.Common-001.
- **-008 Low Testing** — new FormatStatus_names_native_driver_emitted_codes
  Theory is redundant with the existing well-known Theory (same five
  InlineData rows added to both) and uses weaker ShouldContain assertion
  than the well-known Theory's ShouldBe.

### Driver.Galaxy (4 new, IDs -015 to -018)
- **-015 Medium Security** — vendored DLLs (libs/) have no recorded
  provenance: no source-commit SHA from the mxaccessgw repo, no SHA-256
  checksum in libs/README.md. Tampering / accidental swap undetectable.
- **-016 Medium Performance** — version skew between declared
  PackageReferences (Polly 8.5.2 / Grpc.Net.Client 2.71.0 /
  Microsoft.Extensions.Logging.Abstractions 10.0.0) and what the vendored
  DLL was actually built against (Polly.Core 8.6.6 / Grpc.Net.Client
  2.76.0 / Microsoft.Extensions.Logging.Abstractions 10.0.7). Latent now
  (assembly-version refs are loose) but precise shape that produces a
  runtime MissingMethodException.
- **-017 Low Design** — no contract-version handshake between the driver
  and the gateway; proto could evolve under the gateway without the
  driver noticing.
- **-018 Low Documentation** — libs/README.md points at the wrong sibling
  csproj as the version source-of-truth; missing SpecificVersion=false
  on the Reference items; missing mxaccessgw source-commit SHA.

## Particularly notable

Two findings undercut commits from this session:

- Driver.Cli.Common-007 invalidates commit 5a9c459 (which named 0x80550000
  as BadDeviceFailure across the cross-CLI shortlist).
- Core.Scripting-016 invalidates the production effect of commit 7b6ab2e
  (the collectible-ALC fix wired Dispose only via CompiledScriptCache,
  which the engines don't use).

The wider native-mapper miscoding behind -007 also affects three driver
modules outside this session's edit scope (FocasStatusMapper,
AbCipStatusMapper, AbLegacyStatusMapper all carry the wrong code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 17:02:47 -04:00

335 lines
18 KiB
Markdown

# Code Review — Driver.Cli.Common
| Field | Value |
|---|---|
| Module | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common` |
| Reviewer | Claude Code |
| Review date | 2026-05-23 |
| Commit reviewed | `a9be809` |
| Status | Reviewed |
| Open findings | 2 |
## Checklist coverage
A comprehensive review completes every category, recording "No issues found" where
a category produced nothing rather than leaving it blank.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Driver.Cli.Common-001, Driver.Cli.Common-002, Driver.Cli.Common-007 |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | Driver.Cli.Common-003 |
| 4 | Error handling & resilience | Driver.Cli.Common-004 |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Driver.Cli.Common-005, Driver.Cli.Common-008 |
| 10 | Documentation & comments | Driver.Cli.Common-006 |
## Re-review 2026-05-23 (commit `a9be809`)
Delta scope: commit `5a9c459` extends the `FormatStatus` shortlist with five
`Bad*` codes (`BadInternalError` 0x80020000, `BadNotWritable` 0x803B0000,
`BadOutOfRange` 0x803C0000, `BadNotSupported` 0x803D0000, `BadDeviceFailure`
0x80550000) the FOCAS / AbCip / AbLegacy native-protocol mappers emit. Tests
extended with parallel `[InlineData]` rows on the well-known Theory plus a new
`FormatStatus_names_native_driver_emitted_codes` Theory.
Cross-checked the five new hex literals against the OPC Foundation
`Opc.Ua.StatusCodes` table via DeepWiki:
| Name added | Code in shortlist | Spec value | Verdict |
|---|---|---|---|
| `BadInternalError` | `0x80020000` | `0x80020000` | Correct |
| `BadNotWritable` | `0x803B0000` | `0x803B0000` | Correct |
| `BadOutOfRange` | `0x803C0000` | `0x803C0000` | Correct |
| `BadNotSupported` | `0x803D0000` | `0x803D0000` | Correct |
| `BadDeviceFailure` | `0x80550000` | **`0x808B0000`** | **WRONG — `0x80550000` is `BadSecurityPolicyRejected`** |
The `BadDeviceFailure` mismapping is the same shape of bug as the original
Driver.Cli.Common-001 (wrong hex literal copied into the shortlist); recorded
as Driver.Cli.Common-007. The wrong constant also lives in
`FocasStatusMapper.cs`, `AbCipStatusMapper.cs`, `AbLegacyStatusMapper.cs`,
`TwinCATStatusMapper.cs`, `S7Driver.cs`, and `ModbusDriver.cs` — those are in
other modules' review scope but are noted here so future re-reviewers know
this isn't isolated. (`StatusCodeMap.cs` in Driver.Galaxy + the Wonderware
historian mappers use the correct `0x808B0000`, confirming the discrepancy.)
Testing observation: the new `FormatStatus_names_native_driver_emitted_codes`
Theory is fully redundant with the well-known Theory (the five rows were also
added there in the same commit) and uses `ShouldContain` rather than
`ShouldBe` — recorded as Driver.Cli.Common-008.
Other categories (concurrency, security, performance, design-doc adherence,
code organisation, documentation) are unchanged by this delta — no new
issues found.
## Findings
### Driver.Cli.Common-001
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:106-119` |
| Status | Resolved |
**Description:** The `FormatStatus` shortlist maps four OPC UA status names to incorrect
numeric codes. The correct OPC UA spec values (verified against the OPC Foundation
UA-.NETStandard `Opc.Ua.StatusCodes` table) are:
| Name in shortlist | Code used | Correct code | What the used code actually is |
|---|---|---|---|
| `BadTimeout` | `0x80060000` | `0x800A0000` | `0x80060000` = `BadOutOfMemory` |
| `BadNoCommunication` | `0x80070000` | `0x80310000` | `0x80070000` = `BadResourceUnavailable` |
| `BadWaitingForInitialData` | `0x80080000` | `0x80320000` | `0x80080000` is not this name |
| `BadNodeIdInvalid` | `0x80350000` | `0x80330000` | `0x80350000` = `BadNodeClassInvalid` |
`Good` (`0x00000000`), `Bad` (`0x80000000`), `BadCommunicationError` (`0x80050000`),
`BadNodeIdUnknown` (`0x80340000`), `BadTypeMismatch` (`0x80740000`), and `Uncertain`
(`0x40000000`) are correct.
This is operator-facing and load-bearing: the CLI whole purpose is to label driver
status codes so a human can interpret a probe/read/write. A real device timeout
(`0x800A0000`) renders as bare `0x800A0000` with no name, while an out-of-memory
status (`0x80060000`) is mislabeled `BadTimeout`. A driver returning
`BadNodeClassInvalid` (`0x80350000`) is mislabeled `BadNodeIdInvalid`. The
`SnapshotFormatterTests` `[Theory]` cases for these codes assert against the wrong
expectations and therefore pass while the mapping is wrong (see Driver.Cli.Common-005).
**Recommendation:** Correct the four mappings to the spec values. Prefer deriving names
from the OPC Foundation `Opc.Ua.StatusCodes` constants (the stack the project already
depends on transitively) rather than hand-maintaining a hex shortlist, so the table
cannot drift from the spec again. If a hand-list is kept, add a test that cross-checks
each entry against `Opc.Ua.StatusCodes` reflection.
**Resolution:** Resolved 2026-05-22 — corrected the four mismapped `FormatStatus` codes
to their canonical `Opc.Ua.StatusCodes` values (`BadTimeout` 0x800A0000, `BadNoCommunication`
0x80310000, `BadWaitingForInitialData` 0x80320000, `BadNodeIdInvalid` 0x80330000); the CLI
project does not reference the `Opc.Ua` package so the hex literals were corrected in place
with a sync note, and `SnapshotFormatterTests` was updated with corrected expectations plus
a regression `[Theory]` asserting the pre-fix wrong names no longer apply.
### Driver.Cli.Common-002
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:101-122` |
| Status | Resolved |
**Description:** `FormatStatus` matches the full 32-bit status word for exact equality
against the shortlist. OPC UA status codes carry sub-code/flag bits in the low 16 bits
(info type, structure-changed, semantics-changed, limit bits, overflow, etc.). A
driver-supplied status such as `0x80050001` or any `Good` value with info bits set
(e.g. an overflow bit) falls through the `switch` and renders as bare hex even though
the high bits clearly identify the severity class. The doc comment on `FormatStatus`
claims the well-known statuses are named, but only the bit-exact canonical forms are.
**Recommendation:** Either (a) narrow the doc-comment claim to bit-exact canonical
codes, or (b) match on the severity bits (`code & 0xC0000000`) to at least always emit
`Good` / `Uncertain` / `Bad` even when sub-code bits are set, and match the named codes
on the masked code (`code & 0xFFFF0000`).
**Resolution:** Resolved 2026-05-22 — `FormatStatus` now matches named codes on `code & 0xFFFF0000` and falls back to a severity-class label (`Good`/`Uncertain`/`Bad`) via `code & 0xC0000000` for unknown sub-codes; the stale "bare-hex for unknown codes" test expectation was corrected to reflect the new severity-class fallback.
### Driver.Cli.Common-003
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:51-59` |
| Status | Resolved |
**Description:** `ConfigureLogging` assigns the process-global `Serilog.Log.Logger`
without disposing the previously assigned logger and the library never calls
`Log.CloseAndFlush()`. Each call creates a fresh `Logger` via `CreateLogger()` and
overwrites `Log.Logger`; the prior instance (and its console sink) is never disposed
or flushed. The class is the shared base for every driver CLI and the `subscribe` verb
is long-running — if any command path re-invokes `ConfigureLogging` the buffered
console sink is abandoned without a flush, and on process exit the final logger is also
never flushed. Verbose debug output written just before exit can be lost.
**Recommendation:** Call `Log.CloseAndFlush()` on shutdown (e.g. in a `finally` in the
command `ExecuteAsync`, or via a `protected` disposal helper on this base). Treat
`ConfigureLogging` as call-once / idempotent and document that. At minimum capture and
dispose the previous logger if reconfiguration is genuinely intended.
**Resolution:** Resolved 2026-05-22 — `ConfigureLogging` is now idempotent (guarded by `_loggingConfigured` field) and disposes the previous `Log.Logger` before overwriting; a new `protected static FlushLogging()` helper calls `Log.CloseAndFlush()` for commands to call in their `finally` blocks; XML doc updated accordingly.
### Driver.Cli.Common-004
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:68-70` |
| Status | Resolved |
**Description:** `FormatTable` calls `rows.Max(r => r.Tag.Length)` (and the same for the
value and status columns) without guarding against empty input. When `tagNames` and
`snapshots` are both empty (equal length, so the mismatch check at line 56 passes),
`Enumerable.Max` throws `InvalidOperationException` ("Sequence contains no elements").
A batch read that legitimately returns zero tags therefore crashes the formatter
instead of producing an empty (header-only) table.
**Recommendation:** Short-circuit on `rows.Length == 0` (return just the header +
separator, or an explicit "no rows" line), or use `DefaultIfEmpty(0).Max(...)` for the
width computations.
**Resolution:** Resolved 2026-05-23 — `FormatTable` guards each `rows.Max(...)` width
computation with a `rows.Length == 0 ? "<HEADER>".Length : Math.Max(...)` ternary, so
an empty batch read returns the header + separator rows (no data rows) instead of
throwing `InvalidOperationException`. The fix was landed in commit `1433a1c` alongside
the -002 work, and the regression test
`SnapshotFormatterTests.FormatTable_with_empty_input_returns_header_only` (added under
-005) exercises it.
### Driver.Cli.Common-005
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests/SnapshotFormatterTests.cs:27-37` |
| Status | Resolved |
**Description:** The `FormatStatus_names_well_known_status_codes` `[Theory]` asserts
`0x80060000 => "BadTimeout"`, which encodes the wrong spec value (see
Driver.Cli.Common-001). The test passes because it validates the formatter against the
same incorrect table, so the bug is invisible to CI. Additionally there is no coverage
for: `DriverCommandBase` (`ConfigureLogging` verbose vs non-verbose level selection — no
test exercises the base at all), `FormatTable` with empty input (Driver.Cli.Common-004
would have been caught), `FormatValue` with array / enum / custom `object` values, and
`FormatTimestamp` with `DateTimeKind.Unspecified` (the docs imply Unspecified is
normalised but only `Local` is tested).
**Recommendation:** Fix the `[Theory]` expectations once Driver.Cli.Common-001 is
resolved, and add a test asserting each shortlist entry against the OPC Foundation
`Opc.Ua.StatusCodes` constants so the table cannot silently drift. Add `FormatTable`
empty-input and `DriverCommandBase` level-selection tests.
**Resolution:** Resolved 2026-05-22 — added `FormatTable_with_empty_input_returns_header_only` (exercises the -004 fix), `FormatStatus_with_sub_code_bits_resolves_to_named_class` / `FormatStatus_unknown_sub_code_falls_back_to_severity_class` Theories (cover -002 fix), and a new `DriverCommandBaseTests` class with four tests covering verbose/non-verbose level selection, idempotency of `ConfigureLogging`, and `FlushLogging`; stale `FormatStatus_unknown_codes_fall_back_to_hex_only` expectation corrected to match the -002 severity-class fallback.
### Driver.Cli.Common-006
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:71`, `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:9` |
| Status | Resolved |
**Description:** Two minor doc inaccuracies. (1) The comment at `SnapshotFormatter.cs:71`
states the "source-time column is fixed-width (ISO-8601 to ms) so no max-measurement
needed" — true only when every snapshot has a non-null `SourceTimestampUtc`.
`FormatTimestamp` returns `"-"` for a null timestamp, so a mixed table has a 1-char-wide
cell in an otherwise 24-char column; the column is unaligned. Harmless (right-most, no
padding consumer) but the stated invariant does not hold. (2) The `DriverCommandBase`
class summary enumerates "Modbus / AB CIP / AB Legacy / S7 / TwinCAT" as the driver CLIs
but omits FOCAS, which `docs/DriverClis.md` lists as the sixth CLI built on this shared
library. The XML doc is stale relative to the shipped driver-CLI set.
**Recommendation:** Reword the `SnapshotFormatter.cs:71` comment to note the column is
right-most and intentionally unpadded rather than claiming fixed width. Add FOCAS to the
`DriverCommandBase` class-summary driver list.
**Resolution:** Resolved 2026-05-23 — (1) `SnapshotFormatter.cs:71` comment reworded
to state the source-time column is the right-most one and intentionally not
measured/padded, calling out the null-timestamp `"-"` case explicitly. (2) FOCAS was
added to the `DriverCommandBase` class-summary driver enumeration in commit `7ff356b`
(landed alongside the -003 work).
### Driver.Cli.Common-007
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:129` |
| Status | Open |
**Description:** Commit `5a9c459` added `0x80550000u => "BadDeviceFailure"` to the
`FormatStatus` shortlist, but `0x80550000` is the canonical OPC UA spec value for
`BadSecurityPolicyRejected`, not `BadDeviceFailure`. The correct spec value for
`BadDeviceFailure` is `0x808B0000` (verified against the OPC Foundation
`Opc.Ua.StatusCodes` table via DeepWiki; corroborated locally by
`src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/StatusCodeMap.cs:40`
(`BadDeviceFailure = 0x808B0000u`) and the two Wonderware historian quality mappers,
which all hand-pin the correct value).
This is the same shape of bug Driver.Cli.Common-001 closed: a wrong hex literal
in the shortlist that the test theory (`SnapshotFormatterTests.cs:42`) blindly
asserts against the same wrong value, so the bug is invisible to CI.
Practical impact, two-sided:
1. A driver that returns the real spec `BadDeviceFailure` (`0x808B0000`) — e.g.
the `Driver.Galaxy.StatusCodeMap` path on a deploy-time device fault — falls
through the named shortlist entirely. Since Driver.Cli.Common-002 added the
severity-class fallback, it now renders as `0x808B0000 (Bad)` instead of
`0x808B0000 (BadDeviceFailure)` — operators lose the specific class label
`docs/Driver.FOCAS.Cli.md:153` tells them to read off the output.
2. A driver that returns `0x80550000` (which `FocasStatusMapper`, `AbCipStatusMapper`,
`AbLegacyStatusMapper`, `TwinCATStatusMapper`, `S7Driver`, and `ModbusDriver` all
misuse as "BadDeviceFailure") now renders as `0x80550000 (BadDeviceFailure)`
matching driver intent but contradicting the OPC UA spec, which says any client
that decodes the same payload using the OPC Foundation stack will see
`BadSecurityPolicyRejected`. A security-monitoring tool keying on
`BadSecurityPolicyRejected` will fire on a CPU fault, while real
`BadSecurityPolicyRejected` returns from the secure-channel layer would be
mislabelled as a device fault. Operator-facing CLI output and machine-readable
status semantics disagree.
The deeper bug is the wrong constant in the native-protocol mappers (out of scope
for this module), but the `SnapshotFormatter` shortlist is its own
spec-authoritative reference point — Driver.Cli.Common-001 explicitly framed the
shortlist as canonical, with the in-line "keep [these literals] in sync with [the
Opc.Ua.StatusCodes] table" comment at `SnapshotFormatter.cs:112-113`. That
contract is now broken.
**Recommendation:** Change line 129 to `0x808B0000u => "BadDeviceFailure"`. Update
the matching `[InlineData]` rows in `SnapshotFormatterTests.cs` (line 42 in the
well-known Theory; line 60 in the redundant Theory — see Driver.Cli.Common-008).
Also note in the resolution that the native-protocol mappers (FOCAS / AbCip /
AbLegacy / TwinCAT / S7 / Modbus) need the same fix recorded against their own
module reviews — the constant `0x80550000` should be replaced with `0x808B0000`
everywhere it claims to mean `BadDeviceFailure`. Consider Driver.Cli.Common-001's
original recommendation again: add a CI test that cross-checks every shortlist
entry against `Opc.Ua.StatusCodes` reflection so this class of bug stops
recurring.
### Driver.Cli.Common-008
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests/SnapshotFormatterTests.cs:50-64` |
| Status | Open |
**Description:** Commit `5a9c459` adds a new
`FormatStatus_names_native_driver_emitted_codes` `[Theory]` whose five
`[InlineData]` rows are identical to five rows added to the existing
`FormatStatus_names_well_known_status_codes` `[Theory]` in the same commit
(lines 32, 39, 40, 41, 42). The new Theory therefore adds no coverage. It is
also weaker than the Theory it duplicates: it asserts
`output.ShouldContain($"({expectedName})")` (substring match) where the
well-known Theory asserts `output.ShouldBe($"0x{status:X8} ({expectedName})")`
(exact match including the hex prefix). The substring form would not catch a
regression where the hex literal renders wrong but the name is correct.
This is not a correctness problem — both Theories pass — but it's a
copy-paste inconsistency that costs maintainer attention every time someone
reads the test file and wonders which Theory is authoritative.
**Recommendation:** Either (a) delete the new Theory entirely — its five rows
are already covered by the well-known Theory in the same commit — or (b) keep
it but switch to `ShouldBe($"0x{status:X8} ({expectedName})")` so its
assertion strength matches the rest of the file. Option (a) is cleaner: the
commit's "operator workflow" intent is documented well enough in the
well-known Theory comment block; the redundant Theory is dead weight.