44 Commits

Author SHA1 Message Date
Joseph Doherty f23e368a74 fix(server, admin): wire sp_RegisterNodeGenerationApplied + overlay heartbeat onto ClusterNode
dbo.sp_RegisterNodeGenerationApplied was defined by the initial
StoredProcedures migration but had zero callers in src/. The server
polled sp_GetCurrentGenerationForCluster every 5s but never reported
back, so dbo.ClusterNodeGenerationState stayed empty for every node
and both the Admin UI Fleet status page ("No node state recorded")
and the cluster-detail Redundancy LastSeenAt indicator ("never
STALE") showed broken liveness forever.

Server side (GenerationRefreshHostedService):
* New testable seam: Func<long, NodeApplyStatus, string?, CT, Task>?
  registerAppliedAsync constructor parameter, defaulting to a real
  sp_RegisterNodeGenerationApplied call against the central DB.
* TickAsync now calls the proc at two points: after every successful
  apply with NodeApplyStatus.Applied, and on every no-change tick as
  a heartbeat (also Applied) so LastSeenAt stays fresh.
* Apply failures now wrap the lease + coordinator.RefreshAsync in a
  try/catch, report NodeApplyStatus.Failed with the exception message,
  and advance LastAppliedGenerationId regardless of outcome so we
  don't loop on the same broken apply every 5s.
* Register-call failures are best-effort (LogDebug heartbeat, LogWarning
  apply-report) — a transient DB outage during reporting must not
  crash the publisher or block the next apply.

Admin side (ClusterNodeService.ListByClusterAsync): the Redundancy tab
reads ClusterNode.LastSeenAt, but no current writer maintains that
column — the heartbeat goes to ClusterNodeGenerationState.LastSeenAt.
Overlay the GenerationState heartbeat onto the returned ClusterNode
rows when more recent, so IsStale + the Redundancy table column
reflect actual liveness without a schema change or new write path.

Tests: 3 new cases on GenerationRefreshHostedServiceTests verify
first-apply reports Applied, no-change ticks heartbeat with Applied,
and register-call failure does not roll back the cursor or block
subsequent ticks. All 8 GenerationRefresh tests pass.

Verified live on node-dev-a / cluster-dev: dbo.ClusterNodeGenerationState
now populated with CurrentGenerationId=1, LastAppliedStatus=Applied,
fresh LastSeenAt. Fleet status page shows the node (KPIs NODES 1 /
APPLIED 1 / STALE 0 / FAILED 0). Redundancy tab KPI STALE went 1\xe2\x86\x920 and
the row shows a real LAST SEEN timestamp. Bonus: FleetStatusHub
SignalR push now fires the cluster-page Live update banner on every
heartbeat because there are finally state changes to push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 02:22:59 -04:00
Joseph Doherty c8de58d6d3 fix(admin-ui): render published gen read-only on the 6 cluster-detail content tabs
The Equipment, UNS Structure, Namespaces, Drivers, Tags, and ACLs tabs
all rendered only an "Open a draft to edit" placeholder when no draft
was open — even when the cluster had a fully populated published
generation. docs/v2/admin-ui.md \xa7Cluster Detail describes these as
"read-only views of the published generation" with an "Edit in draft"
affordance; that view was never wired. The earlier code path also
correctly rendered nothing when the cluster had no published gen yet,
which was indistinguishable from the broken state.

Collapse the six per-tab conditions into one shared branch that threads
the published gen ID into the existing tab components when no draft
exists, wrapped in <fieldset disabled> so any Add/Edit button click in
the read-only state cannot silently mutate published rows even though
the tab components themselves don't yet honor an IsReadOnly flag.
Banner above the content explains the state. Surgical: zero changes to
the ~1500 LoC across the six tab components.

Verified live on cluster-dev gen 1: Drivers tab now shows the
cluster-dev-galaxy-main GalaxyMxGateway row read-only; Namespaces tab
shows cluster-dev-galaxy-ns SystemPlatform row; both with the read-only
banner and visibly disabled affordances.

Follow-up worth doing later: refactor each tab component to accept an
IsReadOnly parameter so the disabled-affordance UX is per-tab rather
than a blanket fieldset opacity wash.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 02:22:32 -04:00
Joseph Doherty 8fe7c8bea6 refactor(driver-galaxy): switch to sibling-repo MxGateway client + drop vendored libs
The sibling mxaccessgw repo (clients/dotnet/) restored a proper client
library + contracts under the new ZB.MOM.WW.MxGateway namespace, so the
binary-vendoring stopgap from PR Driver.Galaxy-016 can unwind via plan #1
of libs/README.md.

- csproj: replace <Reference HintPath="libs\MxGateway.*.dll"> with a
  ProjectReference into ..\..\..\..\mxaccessgw\clients\dotnet  ZB.MOM.WW.MxGateway.Client\. The five backfill PackageReference shims
  (Google.Protobuf, Grpc.Core.Api, Grpc.Net.Client, Polly.Core,
  Microsoft.Extensions.Logging.Abstractions) are now transitive again.
- Source: 'using MxGateway.X' -> 'using ZB.MOM.WW.MxGateway.X' across
  19 driver files + 14 test files. No fully-qualified MxGateway.* usages
  in code, so no behavioural changes — purely a using-prefix flip.
- libs/: deleted MxGateway.Client.dll, MxGateway.Contracts.dll, README.md
  (orphan after the unwind).

Verified: dotnet build clean (Release), all 245 Driver.Galaxy unit tests
pass, OtOpcUa service running with the new client DLL loaded
(opc.tcp://localhost:4840/OtOpcUa, no FileNotFound/TypeLoad/
MissingMethod in startup logs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 14:55:15 -04:00
Joseph Doherty c6082aa0b9 fix(admin-e2e): register missing DI services so ClusterDetail interactive circuit boots
UnsTabDragDropE2ETests were timing out at the 'UNS Structure' nav-link
locator because AdminWebAppFactory never registered AdminHubConnectionFactory
/ HubTokenService / DataProtection — ClusterDetail.razor's @inject threw at
circuit boot, so the page never advanced past the Loading placeholder. 2 → 3
pass after the registrations land. Also documents the Modbus standard-vs-
exception_injection coverage matrix in the fixture README + cross-references
docs/drivers/AbServer-Test-Fixture.md from each Emulate test so a developer
landing on a skipped test has a direct doc pointer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 01:07:17 -04:00
Joseph Doherty b1f3e09661 test(modbus, abcip): align failing integration tests with fixtures
Two real bugs uncovered by re-running with the new fixture defaults
pointing at the shared docker host. Both are test-side, not driver-side.

AbCip — Driver_reads_seeded_DInt_from_ab_server (4 parametrized rows):
  Hardcoded 'ab://127.0.0.1:{port}/1,0' in the deviceUri instead of
  the resolved fixture.Host. The new 10.100.0.35 default (and any
  AB_SERVER_ENDPOINT override) silently couldn't reach this test —
  the driver tried to connect to a non-existent localhost:44818 and
  returned BadCommunicationError on all 4 profile rows. The sibling
  Emulate tests already use the fixture's resolved endpoint; this
  smoke test was missed in the original migration.

  Fix: deviceUri = $"ab://{fixture.Host}:{fixture.Port}/1,0".

Modbus — Float32_With_CDAB_Roundtrips_Through_Wire:
  Test wrote a Float32 to HR 100 (2 consecutive registers: 100+101).
  standard.json's writable HR range declares [100,100] only — a
  single-cell auto-incrementing register, not a 2-register pair. The
  write to register 101 was rejected with Illegal Data Address
  (BadOutOfRange).

  Fix: moved the tag from HR 100 to HR 200 (in standard.json's
  '[200, 209]' scratch range — 10 consecutive writable HRs). The
  Float32+CDAB semantic the test exercises is unchanged.

Modbus — Block_Read_Coalescing_Reduces_PDU_Count_End_To_End:
  Test read HR 300, 302, 304 — outside both the writable ranges and
  the uint16 seed list. pymodbus rejects reads to unseeded HRs even
  though 'hr size' is 2048. BadOutOfRange on every read.

  Fix: moved the tags from 300/302/304 to 200/202/204 (within the
  scratch range). The non-contiguous coalescing semantic (3 tags
  inside a 5-register window with MaxReadGap=5) is preserved.

After this commit:
  - Modbus.IntegrationTests: 6/38 pass / 32 skip / 0 fail
    (was 4 pass / 32 skip / 2 fail; 32 skips are profile-gated
    ExceptionInjectionTests — they need MODBUS_SIM_PROFILE=
    exception_injection and a different container, intentional gating)
  - AbCip.IntegrationTests: 10/12 pass / 2 skip / 0 fail
    (was 6 pass / 2 skip / 4 fail; 2 skips are Emulate tests that
    need the fixture for separate scenarios)

No driver code changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:45:57 -04:00
Joseph Doherty 49644fc7fd test(fixtures): migrate integration-test fixture defaults to 10.100.0.35
CLAUDE.md "Docker Workflow" claims (per the 2026-04-28 migration note)
that all fixture-class default endpoints were rewritten to target the
shared Docker host at 10.100.0.35. Audit during today's e2e run showed
the claim was incomplete — five fixture classes still defaulted to
localhost / 127.0.0.1, causing every fixture-touching integration test
to skip with "endpoint unreachable" on a fresh box that hadn't set
the override env vars.

Files corrected:

  - tests/.../Modbus.IntegrationTests/ModbusSimulatorFixture.cs
      DefaultEndpoint: localhost:5020 → 10.100.0.35:5020
  - tests/.../S7.IntegrationTests/Snap7ServerFixture.cs
      DefaultEndpoint: localhost:1102 → 10.100.0.35:1102
  - tests/.../OpcUaClient.IntegrationTests/OpcPlcFixture.cs
      DefaultEndpoint: opc.tcp://localhost:50000 → opc.tcp://10.100.0.35:50000
  - tests/.../AbCip.IntegrationTests/AbServerFixture.cs
      Host default + ResolveHost fallback: 127.0.0.1 → 10.100.0.35
  - tests/.../AbLegacy.IntegrationTests/AbLegacyServerFixture.cs
      Host default + ResolveEndpoint fallback: 127.0.0.1 → 10.100.0.35

XML doc comments referencing the old localhost defaults were updated in
the same pass so the class-summary documentation matches the actual
default. The override-via-env-var mechanism (MODBUS_SIM_ENDPOINT,
AB_SERVER_ENDPOINT, AB_LEGACY_ENDPOINT, S7_SIM_ENDPOINT,
OPCUA_SIM_ENDPOINT) is unchanged — pointing at a real PLC or a
locally-running container still works exactly as before.

Verification:
  - Solution-wide dotnet build: 0 errors.
  - S7.IntegrationTests: 3/3 pass without env-var override.
  - OpcUaClient.IntegrationTests: 3/3 pass without env-var override.
  - Modbus.IntegrationTests: 4/38 (same as the env-var-override run —
    the 2 failures + 32 skips are pre-existing fixture-profile
    mismatches unrelated to this fix).
  - AbCip.IntegrationTests / AbLegacy.IntegrationTests: same results
    as the env-var-override run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:32:23 -04:00
Joseph Doherty 3d982d9a65 docs: sync against recent code changes
Five doc-content updates after this session's code-review resolution
sweep. No code touched; pure documentation drift correction.

1. docs/reqs/HighLevelReqs.md (HLR-007 — Service Hosting):
   Refreshed the deployment description from "three cooperating
   processes (Server, Admin, Galaxy.Host)" to "two cooperating
   Windows services (Server, Admin)". The legacy x86 TopShelf
   Galaxy.Host process was retired in PR 7.2 (2026-04-30); Galaxy
   access now flows through the in-process Tier-A GalaxyDriver
   talking gRPC to the sibling mxaccessgw gateway. Also called out
   decision #30 (AddWindowsService replacing TopShelf) inline.

2. docs/VirtualTags.md:
   - Line 9: "compiled via Microsoft.CodeAnalysis.CSharp.Scripting"
     replaced with the current pipeline (Microsoft.CodeAnalysis.CSharp
     regular compiler — Core.Scripting-008 / -016 retired the
     CSharpScript/ScriptRunner path).
   - Line 39: orphan-thread leak description rewritten. The
     CSharp.Scripting-era "underlying ScriptRunner keeps running on
     its thread-pool thread until the Roslyn runtime returns" is no
     longer accurate — the new pipeline binds the script as a
     regular C# Func<> delegate, so the leak is now "synchronous
     CPU-bound work on a pool thread" (same operator-visible
     effect, different mechanism).

3. docs/v2/plan.md decision #29 ("Galaxy Host is a separate Windows
   service"):
   Annotated both the decision body and the decision-log table row
   with "Reversed PR 7.2, 2026-04-30" + a one-line summary of the
   replacement architecture. The original reasoning is preserved as
   audit trail per the decision-log convention.

4. docs/v2/implementation/phase-7-scripting-and-alarming.md A.1:
   Added an Implementation note describing the
   Core.Scripting-008 / -016 supersession of the original
   CSharpScript pipeline. The historical record stays; the note
   points future readers at docs/VirtualTags.md "Compile cache"
   for the current contract.

5. docs/plans/alarms-over-gateway.md "Files" section under client
   regeneration:
   Updated the .NET regeneration instructions to point at the new
   ZB.MOM.WW.MxGateway.Contracts.csproj path. The old
   clients/dotnet/MxGateway.Client.csproj no longer exists in the
   sibling repo (restructure after this plan was written) and the
   vendored-binaries situation in
   src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/libs/ is called out
   so a reader following the plan won't chase a deleted path.

Verification: grep against docs/ for the pre-fix wordings ("three
cooperating processes", "Galaxy.Host (TopShelf)", "ScriptRunner",
the wrong BadDeviceFailure hex code 0x80550000) returns no hits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 18:57:04 -04:00
Joseph Doherty 23d59d73f2 fix(scripting+alarms): close remaining re-review findings
Single commit covering the four small/medium fixes from the updated
code review.

Core.Scripting-014 (Medium, Concurrency):
  CompiledScriptCache.Clear() used the key-only TryRemove(key, out var
  lazy) overload — same race shape Core.Scripting-006 closed in
  GetOrCompile's catch block. A concurrent re-add between snapshot and
  TryRemove was evicted + disposed while the new caller still held it.
  Replaced with the value-scoped TryRemove(KeyValuePair<,>) overload.
  Regression test
  Clear_uses_value_scoped_TryRemove_so_a_race_inserted_entry_survives
  added.

Core.Scripting-013 (Medium, Security):
  Hand-rolled BuildWrapperSource pastes user source between literal
  braces; brace-balanced source could inject sibling methods/classes
  alongside CompiledScript.Run. Analyzer still walked the injected
  members so it wasn't a direct escape, but it relaxed the documented
  'method body' authoring contract. Added EnforceSingleRunMember:
  after ParseText, the compilation unit must hold exactly one type
  (CompiledScript) and that type must hold exactly one member (the Run
  method). Any deviation throws CompilationErrorException with LMX001/
  LMX002 diagnostic IDs and a Core.Scripting-013 reference in the
  message. Two regression tests added covering the sibling-method and
  sibling-class injection vectors.

Core.Scripting-015 (Low, Correctness, latent):
  ToCSharpTypeName's generic branch truncated at the first backtick via
  IndexOf, silently dropping closed args of nested-generic shapes
  (Outer<T>.Inner<U>). No production caller exercises this shape today
  (all TContext/TResult are top-level non-nested), so the bug was
  latent. Rewrote the generic branch to walk the FullName segment-by-
  segment, consuming generic args per segment so nested shapes emit
  valid C# (global::Ns.Outer<T>.Inner<U> rather than the broken
  Outer<T,U>).

Core.ScriptedAlarms-013 (Low, Documentation):
  The internal test accessors TryGetScratchReadCacheForTest /
  TryGetScratchContextForTest return live mutable scratch refilled in
  place under _evalGate. XML docs didn't warn future test authors about
  the synchronization contract. Added a <remarks> block to each
  documenting the only-safe-on-quiesced-engine + identity-or-single-key
  contract.

Verification (suites green):
  Core.Scripting.Tests: 110/110 (was 107 — +3 new rejection/race tests)
  Core.ScriptedAlarms.Tests: 67/67 (unchanged — doc-only fix)
  Core.VirtualTags.Tests: 57/57 (unchanged)

After this commit, all 12 findings from the updated re-review are
closed (10 Resolved, 1 Won't Fix none, 1 Deferred — Driver.Galaxy-017).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 18:00:59 -04:00
Joseph Doherty c2abbf45bd fix(driver-galaxy): align package versions + record vendored-DLL provenance
Driver.Galaxy-015, -016, -017, -018 resolution (one logical change set).

Driver.Galaxy-016 (Medium, Perf/Resource):
  Reconciled the csproj PackageReferences with what the vendored
  MxGateway.Client.dll was actually built against, verified by
  reflecting Assembly.GetReferencedAssemblies() on the DLL:
    - Polly 8.5.2  →  Polly.Core 8.6.6
      (most consequential — Polly v7 fluent API vs Polly.Core v8
       resilience-pipeline API are DIFFERENT packages; the DLL was
       built against Polly.Core so the prior Polly reference would
       have failed at runtime with MissingMethodException the first
       time the gateway client's retry pipeline ran)
    - Grpc.Net.Client 2.71.0  →  2.76.0  (matches sibling Server/Worker)
    - Microsoft.Extensions.Logging.Abstractions 10.0.0  →  10.0.7
  Google.Protobuf 3.34.1 and Grpc.Core.Api 2.76.0 already matched —
  left unchanged.

Driver.Galaxy-015 (re-triaged from Medium-Security → Low-Documentation):
  Original framing was a security concern about unknown-provenance
  binaries. User clarified the DLLs are their own code, built from
  their own mxaccessgw project, not third-party. Re-triaged to a
  documentation / audit-trail concern. Fix:
    - Added a Provenance section to libs/README.md recording the
      source-commit SHA (dd7ca1634e2d2b8a866c81f0009bf87ee9427750,
      extracted from the AssemblyInformationalVersion baked into
      both DLLs by the original build) and SHA-256 checksums.
    - Documented the re-verification recipe (sha256sum + ilspycmd
      | grep AssemblyInformationalVersion).
  Recommendations about .gitattributes and CI hash-check deferred —
  the DLLs are frozen until an unwinding path is taken, so adding
  LFS or CI infrastructure now would need removal at unwinding.

Driver.Galaxy-018 (Low, Documentation):
  Most of the recommendation folded into the libs/README.md rewrite
  (pointed at sibling Server/Worker csproj as the live version source
  rather than the deleted MxGateway.Client.csproj; recorded source
  commit + SHA-256). <SpecificVersion>false</SpecificVersion> on the
  <Reference> items intentionally not added — MSBuild's default for
  HintPath references with bare-name Include attributes is already
  SpecificVersion=false, so explicitly setting it would be cosmetic
  without changing behaviour.

Driver.Galaxy-017 (Low, Design) — Deferred:
  Recommendation part (b) (record mxaccessgw source-commit SHA in
  libs/README.md) is satisfied by Driver.Galaxy-015's resolution.
  Parts (a) and (c) — a GetVersion RPC at session-open and a parity
  test against the live gateway's proto descriptor — are substantial
  new RPC + plumbing work not in scope for this code-review sweep.
  The risk surface is bounded because either of the libs/README.md
  unwinding paths closes the vendoring + this concern naturally.
  Re-open if neither path is taken within the next quarter and the
  live gateway evolves its proto under the driver.

Verification:
  - Build clean (Driver.Galaxy.csproj 0 errors, 0 warnings).
  - Driver.Galaxy.Tests: 245/245 pass against the corrected
    package set.
  - Solution-wide build remains clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 17:45:24 -04:00
Joseph Doherty 3a53d03d23 fix(scripting): block ThreadPool/Timer/AssemblyLoadContext in sandbox
Core.Scripting-012 (High, Security) resolution.

The Core.Scripting-008 rewrite broadened the BCL references list from a
narrow allow-list to the full System.* + netstandard +
Microsoft.Win32.Registry set, delegating the security gate entirely to
ForbiddenTypeAnalyzer. Three categories of dangerous BCL types were
reachable from script source without a deny-list entry:

  - System.Threading.ThreadPool — QueueUserWorkItem re-introduces the
    background-fanout threat Core.Scripting-003 closed against
    System.Threading.Tasks.
  - System.Threading.Timer — schedules unbounded callback work that
    outlives the per-evaluation timeout.
  - System.Runtime.Loader.AssemblyLoadContext — loads arbitrary DLLs.
    Defense-in-depth gap; invocation needs reflection (already denied)
    but the load itself was reachable.

Fix:
  - Added 'System.Runtime.Loader' to ForbiddenNamespacePrefixes
    (preferred over type-granular per the recommendation so future BCL
    additions to that namespace are denied by default).
  - Added 'System.Threading.ThreadPool' and 'System.Threading.Timer'
    to ForbiddenFullTypeNames — both live in System.Threading shared
    with allowed primitives so they must be type-granular.

Regression tests added to ScriptSandboxTests:
  Rejects_ThreadPool_QueueUserWorkItem_at_compile
  Rejects_Timer_new_at_compile
  Rejects_AssemblyLoadContext_at_compile

Docs:
  docs/v2/implementation/phase-7-scripting-and-alarming.md decision #6
  and the Sandbox-escape compliance-check row both updated to enumerate
  the new entries per the Core.Scripting-009 doc-sync convention.

Two lower-impact suggestions from the finding's recommendation
(System.Console, CultureInfo.DefaultThreadCurrentCulture) were
intentionally not addressed and are recorded as accepted minor risks
in the resolution.

Verification: Core.Scripting.Tests 107/107 (was 104 + 3 new rejection
tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 17:39:20 -04:00
Joseph Doherty fb7c6c7046 fix(scripting): route engines through CompiledScriptCache (Core.Scripting-016)
Both VirtualTagEngine.Load and ScriptedAlarmEngine.LoadAsync were calling
ScriptEvaluator.Compile directly, bypassing CompiledScriptCache. The
Core.Scripting-008 collectible-ALC fix wired Dispose only through the cache's
Clear()/Dispose(), so the per-publish accretion the -008 fix was meant to
eliminate was still in effect on the actual production path — the headline
'no more restarts needed' guarantee wasn't delivered.

Resolution:
  - VirtualTagEngine + ScriptedAlarmEngine each gained a private
    CompiledScriptCache<TContext, TResult> instance.
  - Both Load methods now call _compileCache.GetOrCompile(source).
  - Publish-replace path: _compileCache.Clear() runs alongside the existing
    _tags / _alarms clears so the prior generation's ALCs are disposed
    before recompile.
  - Engine Dispose now calls _compileCache.Dispose() so shutdown actually
    releases the emitted assemblies.

Side-fix in CompiledScriptCache: Dispose() set _disposed=true then called
Clear(), but Clear() had a pre-existing 'if (_disposed) return' guard that
aborted the drain unconditionally — making the Dispose-triggered cleanup a
silent no-op. Removed the disposed-guard on Clear() (clearing an empty/
cleared cache is idempotent).

Side-fix in ScriptedAlarmEngine.Dispose: cleared _alarms AFTER the
Task.WhenAll drain. The drain guarantees no background callback is mid-
flight, so clearing is safe. Previously _alarms was deliberately NOT
cleared on Dispose (per Core.ScriptedAlarms-005), but that left the
AlarmState records holding TimedScriptEvaluator → ScriptEvaluator → delegate
references that rooted the emitted assemblies, defeating the cache's
Dispose work on the engine side.

Regression tests:
  - VirtualTagEngineTests.Dispose_unloads_compiled_script_assembly
  - ScriptedAlarmEngineTests.Dispose_unloads_compiled_predicate_assembly
  Both use WeakReference + bounded GC.Collect() to prove the emitted
  assembly is reclaimable after engine.Dispose(). The alarms test had to
  be synchronous (not 'async Task<WeakReference>') because async state
  machines capture locals as state-struct fields, keeping them alive past
  the method's apparent end and defeating GC.

Verification:
  - Core.Scripting.Tests: 104/104 (unchanged).
  - VirtualTags.Tests: 57/57 (was 56 — +1 unload test).
  - ScriptedAlarms.Tests: 67/67 (was 66 — +1 unload test).
  - All other consumer suites still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 17:33:34 -04:00
Joseph Doherty a6ae4e22d1 fix(status-codes): correct BadDeviceFailure from 0x80550000 to 0x808B0000
Driver.Cli.Common-007 + Driver.Cli.Common-008 resolution.

Driver.Cli.Common-007 (High, Correctness):
  0x80550000 is the canonical OPC UA spec value for BadSecurityPolicyRejected,
  not BadDeviceFailure. The correct spec value for BadDeviceFailure is
  0x808B0000 (verified against OPC Foundation Opc.Ua.StatusCodes;
  corroborated locally by Driver.Galaxy.Runtime.StatusCodeMap and both
  Wonderware historian quality mappers which all hand-pin the correct
  value).

  The bug was duplicated across six driver modules:
    - FocasStatusMapper.BadDeviceFailure
    - AbCipStatusMapper.BadDeviceFailure
    - AbLegacyStatusMapper.BadDeviceFailure
    - TwinCATStatusMapper.BadDeviceFailure
    - ModbusDriver.StatusBadDeviceFailure
    - S7Driver.StatusBadDeviceFailure
  Plus the SnapshotFormatter shortlist that named 0x80550000 as
  BadDeviceFailure, and three downstream Modbus tests that asserted
  against the wrong value (so CI was blind).

  This commit fixes all six native-mapper constants, the formatter
  shortlist, and the three Modbus tests in one pass. Added a regression
  guard to FormatStatus_does_not_apply_pre_fix_wrong_names that pins
  0x80550000 never renders as BadDeviceFailure (mirroring the existing
  -001 wrong-name guards).

  Behavior change: OPC UA clients consuming the native drivers now see
  the canonical BadDeviceFailure (0x808B0000) on device-fault paths
  instead of the misnamed BadSecurityPolicyRejected (0x80550000). Wire-
  level status semantics now match operator-facing CLI labels.

Driver.Cli.Common-008 (Low, Testing):
  Deleted the redundant FormatStatus_names_native_driver_emitted_codes
  Theory — its five InlineData rows were already covered by the
  well-known Theory in the same commit (5a9c459), and used a weaker
  ShouldContain vs the well-known Theory's ShouldBe (exact match).

Verification:
  - Driver.Cli.Common.Tests: 43/43 pass (was 48 after the -008 deletion).
  - Driver.Modbus.Tests: 263/263 pass.
  - Driver.AbCip.Tests: 262/262.
  - Driver.AbLegacy.Tests: 157/157.
  - Driver.FOCAS.Tests: 178/178.
  - Driver.S7.Tests: 112/112.
  - Driver.TwinCAT.Tests: 131/131.
  Total: 1146 tests across the affected modules, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 17:14:28 -04:00
Joseph Doherty 41e62b2663 docs(code-reviews): updated re-review at commit a9be809 — 12 new findings
Re-reviewed the four modules with source changes since the previous review
commit 76d35d1, per REVIEW-PROCESS.md section 6. Updated each findings.md
header (date 2026-05-23, commit a9be809) and appended new findings under
continued numbering. Regenerated README.md.

## New findings — 12 total across 4 modules

### Core.Scripting (5 new, IDs -012 to -016)
- **-012 High Security** — broadened BCL references (System.* + netstandard)
  re-expose System.Threading.ThreadPool / Timer / AssemblyLoadContext, which
  the analyzer's deny-list doesn't cover. Re-introduces the background-work
  threat Core.Scripting-003 closed via System.Threading.Tasks deny.
- **-013 Medium Security** — hand-rolled wrapper-source generation lets
  brace-balanced user source inject sibling methods/classes alongside
  CompiledScript.Run. Analyzer still gates forbidden types, but the
  documented 'method body' authoring contract is silently relaxed.
- **-014 Medium Concurrency** — CompiledScriptCache.Clear() uses key-only
  TryRemove(key, out _) — the same race the -006 resolution fixed in
  GetOrCompile's catch is latent here on publish-replace.
- **-015 Low Correctness** — ToCSharpTypeName truncates at first backtick;
  silently drops closed type arguments of nested-generic shapes (Outer<>.Inner<>).
  Latent — no production caller uses this shape today.
- **-016 Medium Performance** — VirtualTagEngine + ScriptedAlarmEngine call
  ScriptEvaluator.Compile directly without going through CompiledScriptCache,
  so the headline -008 collectible-ALC fix doesn't run on the actual
  production path — the per-publish leak is still in effect.

### Core.ScriptedAlarms (1 new, ID -013)
- **-013 Low Documentation** — new internal test accessors return the live
  mutable scratch dictionary; XML docs don't warn future test authors about
  the synchronisation contract.

### Driver.Cli.Common (2 new, IDs -007, -008)
- **-007 High Correctness** — 0x80550000 was added as BadDeviceFailure but
  the real OPC UA spec value for BadDeviceFailure is 0x808B0000 (verified
  against Driver.Galaxy.Runtime.StatusCodeMap and HistorianQualityMapper,
  both of which use the correct 0x808B0000). 0x80550000 is actually
  BadSecurityPolicyRejected. The native mappers (FOCAS / AbCip / AbLegacy)
  all use the wrong 0x80550000; this session's SnapshotFormatter extension
  propagated the wrong name and the test asserts against the same wrong
  value so CI is blind — same shape of bug as Driver.Cli.Common-001.
- **-008 Low Testing** — new FormatStatus_names_native_driver_emitted_codes
  Theory is redundant with the existing well-known Theory (same five
  InlineData rows added to both) and uses weaker ShouldContain assertion
  than the well-known Theory's ShouldBe.

### Driver.Galaxy (4 new, IDs -015 to -018)
- **-015 Medium Security** — vendored DLLs (libs/) have no recorded
  provenance: no source-commit SHA from the mxaccessgw repo, no SHA-256
  checksum in libs/README.md. Tampering / accidental swap undetectable.
- **-016 Medium Performance** — version skew between declared
  PackageReferences (Polly 8.5.2 / Grpc.Net.Client 2.71.0 /
  Microsoft.Extensions.Logging.Abstractions 10.0.0) and what the vendored
  DLL was actually built against (Polly.Core 8.6.6 / Grpc.Net.Client
  2.76.0 / Microsoft.Extensions.Logging.Abstractions 10.0.7). Latent now
  (assembly-version refs are loose) but precise shape that produces a
  runtime MissingMethodException.
- **-017 Low Design** — no contract-version handshake between the driver
  and the gateway; proto could evolve under the gateway without the
  driver noticing.
- **-018 Low Documentation** — libs/README.md points at the wrong sibling
  csproj as the version source-of-truth; missing SpecificVersion=false
  on the Reference items; missing mxaccessgw source-commit SHA.

## Particularly notable

Two findings undercut commits from this session:

- Driver.Cli.Common-007 invalidates commit 5a9c459 (which named 0x80550000
  as BadDeviceFailure across the cross-CLI shortlist).
- Core.Scripting-016 invalidates the production effect of commit 7b6ab2e
  (the collectible-ALC fix wired Dispose only via CompiledScriptCache,
  which the engines don't use).

The wider native-mapper miscoding behind -007 also affects three driver
modules outside this session's edit scope (FocasStatusMapper,
AbCipStatusMapper, AbLegacyStatusMapper all carry the wrong code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 17:02:47 -04:00
Joseph Doherty a9be80923c docs(v2-release-readiness): drop stale 'this branch task-galaxy-e2e' ref
The task-galaxy-e2e branch was merged + deleted; the durable reference
is PR #205 alone. Tidies a dangling pointer that future readers might
chase looking for a branch that no longer exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 16:50:06 -04:00
Joseph Doherty 994997ba7b fix(driver-galaxy): vendor MxGateway.Client + MxGateway.Contracts as binary refs
The sibling mxaccessgw repo restructured: clients/dotnet/MxGateway.Client
no longer exists, and the proto contracts moved to a new namespace
(ZB.MOM.WW.MxGateway.Contracts.Proto, was MxGateway.Contracts.Proto). The
driver's source still expects the pre-restructure namespace, so the
broken ProjectReference produced 86 build errors in src/ + 1 in tests/
on master.

Resolution: vendor the last known-good build of MxGateway.Client.dll
(99 KB, May 22) and MxGateway.Contracts.dll (490 KB, May 23) under
src/Drivers/.../Driver.Galaxy/libs/, reference them via <Reference
HintPath=...> in both the driver and its test csproj, and declare the
NuGet packages the dropped ProjectReference was supplying transitively
(Google.Protobuf, Grpc.Core.Api, Grpc.Net.Client,
Microsoft.Extensions.Logging.Abstractions, Polly) at versions matching
the sibling repo's ZB.MOM.WW.MxGateway.Contracts.csproj so binary
compatibility is preserved.

Why this over a source migration:
  Source migration would require namespace renames across ~19 driver
  files PLUS reimplementing MxGatewayClient / MxGatewaySession /
  GalaxyRepositoryClient (~2,200 LoC) — the sibling repo dropped the
  client library entirely, keeping only the proto contracts. Vendoring
  the last known-good binaries unblocks the build in minutes, freezes
  the gateway contract surface at a known-good version, and preserves
  the option to migrate properly once the sibling repo decides whether
  to restore a client library or hand the work back to us.

libs/README.md documents the unwinding plan (either path closes the
debt: sibling restores a client library, or driver migrates to the new
contracts namespace + reimplements the client wrapper).

Verification:
  - dotnet build ZB.MOM.WW.OtOpcUa.slnx: 0 errors (was 87).
  - Driver.Galaxy unit tests: 245/245 pass.
  - Integration tests not run here (require a live mxaccessgw gateway).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 16:32:56 -04:00
Joseph Doherty 0001cdd579 fix(scripted-alarms): reuse per-alarm evaluation scratch on the hot path
Core.ScriptedAlarms-009 resolution: replace the per-call Dictionary +
AlarmPredicateContext allocation with a per-alarm reusable AlarmScratch
held in _scratchByAlarmId, refilled in place under _evalGate on each
evaluation. The hot path no longer allocates per upstream tag change.

Why this matters:
  On a busy line where many tags feeding many alarms change frequently,
  the old BuildReadCache allocated a fresh dictionary + context on every
  predicate evaluation — a steady stream of short-lived allocations the
  GC eventually has to reclaim. With the reuse, the dictionary and
  context are allocated once per alarm (on first evaluation) and refilled
  in place across every subsequent re-eval.

Implementation:
  - New private AlarmScratch class holds the reusable
    Dictionary<string, DataValueSnapshot> read cache (pre-sized to the
    alarm's Inputs.Count) and the AlarmPredicateContext that wraps it by
    reference. The context observes refilled values without being
    re-created.
  - ConcurrentDictionary<string, AlarmScratch> _scratchByAlarmId on the
    engine, cleared in LoadAsync alongside _alarms so a config-publish
    drops the prior generation's scratch (Inputs / Logger may change).
  - EvaluatePredicateToStateAsync looks up scratch via GetOrAdd, calls
    the new RefillReadCache(Dictionary, IReadOnlySet) helper to clear +
    repopulate the dictionary in place, then runs the predicate against
    the reused context.
  - BuildReadCache removed.

Safety:
  Reuse is serialised under _evalGate which guarantees no two threads
  ever observe the same scratch in a half-refilled state. The
  AlarmPredicateContext is bound to the scratch dictionary by reference,
  so the predicate's ctx.GetTag(path) sees the freshly-refilled values
  rather than a stale snapshot.

Verification:
  - All 66 ScriptedAlarms tests pass (was 63 — three new regression tests
    locking the reuse contract).
  - All 56 VirtualTags tests still pass (unchanged).
  - All 104 Core.Scripting tests still pass (unchanged).

New tests in ScriptedAlarmEngineTests:
  - Reevaluation_reuses_the_same_read_cache_dictionary — asserts
    ReferenceEquals(scratch_before, scratch_after) across two
    evaluations of the same alarm.
  - Reevaluation_reuses_the_same_predicate_context — same, for the
    context.
  - LoadAsync_drops_the_prior_generations_scratch — asserts a config
    publish wipes the prior scratch (so a stale Logger / Inputs can't
    leak into the new generation).

Internal test hooks TryGetScratchReadCacheForTest /
TryGetScratchContextForTest added via the existing
InternalsVisibleTo for the tests project. Kept internal — not part of
the public engine surface.

Docs:
  - docs/v2/Galaxy.Performance.md "Scripted-alarm engine" section
    rewritten as "hot-path allocation reuse" documenting the new
    contract + reuse safety reasoning + the three regression tests.
  - code-reviews/Core.ScriptedAlarms/findings.md -009 flipped
    Won't Fix → Resolved.
  - code-reviews/README.md regenerated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 16:10:09 -04:00
Joseph Doherty 7b6ab2ec6f fix(scripting): unload compiled-script assemblies via collectible ALC
Core.Scripting-008 resolution: replace the legacy CSharpScript.CreateDelegate
path with hand-rolled CSharpCompilation + Emit + collectible AssemblyLoadContext,
so per-publish compile accretion no longer requires a server restart to reclaim.

Why this was needed:
  Roslyn's CSharpScript path emits dynamically-compiled script assemblies into
  the default AssemblyLoadContext, which is non-collectible. Across config-
  publish generations each Clear() drops dictionary entries but the emitted
  assemblies stay loaded for process lifetime, so memory grows steadily on
  long-running servers with frequent publishes. The accepted-limitation note
  in docs/VirtualTags.md recommended scheduled restarts as the workaround;
  operator feedback was that restarts are difficult, so the underlying
  limitation was the right thing to fix.

Implementation:
  - New ScriptAssemblyLoadContext(name, isCollectible: true) hosts one emitted
    script assembly per evaluator.
  - ScriptEvaluator.Compile synthesises a wrapper class around the user source
    (CompiledScript.Run(globals) — explicit return required per ordinary C#
    semantics, which every existing script already uses), builds a
    CSharpCompilation against the sandbox references, runs the
    ForbiddenTypeAnalyzer over the semantic model unchanged, emits to an
    in-memory PE stream, loads via ScriptAssemblyLoadContext.LoadFromStream,
    and binds a strongly-typed Func<ScriptGlobals<TContext>, TResult> delegate
    via reflection.
  - ScriptEvaluator now implements IDisposable — Dispose calls
    AssemblyLoadContext.Unload(), which makes the emitted assembly eligible
    for GC at the next collection cycle.
  - CompiledScriptCache.Clear() disposes every materialised evaluator before
    dropping its dictionary entry; CompiledScriptCache itself is now
    IDisposable for graceful server shutdown.
  - ScriptSandbox.Build returns a new SandboxConfig (References + Imports)
    instead of a Roslyn ScriptOptions; references now span BCL via the
    TRUSTED_PLATFORM_ASSEMBLIES set filtered to System.* + netstandard +
    Microsoft.Win32.Registry, so forbidden BCL types resolve at compile and
    ForbiddenTypeAnalyzer is the sole security gate (consistent with the
    Core.Scripting-001 / -002 model — references-list-only restriction is
    porous against type forwarding, so the analyzer must be the real gate).

Verification:
  - All 104 Core.Scripting tests pass (was 101 — three new regression tests
    locking the unload contract).
  - All 56 VirtualTags tests pass (unchanged).
  - All 63 ScriptedAlarms tests pass (unchanged).
  - New CompiledScriptCacheTests:
    - Dispose_unloads_compiled_script_assembly_load_context — proves single-
      evaluator ALC unload via WeakReference + bounded GC.Collect() loop.
    - Clear_disposes_every_materialised_evaluator — proves publish-replace
      releases every prior generation's ALC.
    - GetOrCompile_after_Dispose_throws_ObjectDisposedException — locks the
      post-dispose contract.

Docs:
  - docs/VirtualTags.md "Compile cache" section rewritten: the accepted-
    limitation note replaced with the unload contract + the new authoring
    convention (explicit return).
  - docs/ScriptedAlarms.md cross-reference updated to drop the obsolete
    restart guidance.
  - code-reviews/Core.Scripting/findings.md Core.Scripting-008 flipped
    Won't Fix → Resolved with the implementation summary.
  - code-reviews/README.md regenerated.

Pre-existing breakage note: Driver.Galaxy fails the solution-wide build on
master because its ProjectReference to the sibling mxaccessgw repo's
MxGateway.Client targets a path that the sibling repo no longer has after a
recent restructuring. This is unrelated to Core.Scripting-008 and was
verified to exist on master before this branch was cut.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 15:55:04 -04:00
Joseph Doherty 5a9c4591b9 fix(cli-common): name native-driver-emitted status codes in SnapshotFormatter
Driver.FOCAS.Cli-005 follow-up: extend the SnapshotFormatter.FormatStatus
shortlist with the five Bad* codes the native-protocol mappers (FOCAS,
AbCip, AbLegacy) emit but which the shortlist previously left unnamed,
so they rendered only as severity-class 'Bad' instead of the documented
'BadDeviceFailure' / 'BadNotWritable' / ... names operators are told to
read off probe/write output.

Added entries:
  0x80020000 BadInternalError
  0x803B0000 BadNotWritable
  0x803C0000 BadOutOfRange
  0x803D0000 BadNotSupported
  0x80550000 BadDeviceFailure

(BadTimeout 0x800A0000 was already added under Driver.Cli.Common-001.)

Tests: SnapshotFormatterTests gains a new [Theory]
FormatStatus_names_native_driver_emitted_codes covering the five names,
and the existing well-known [Theory] is extended with the same entries
to enforce exact '0x... (Name)' rendering. Suite now 47 green (was 42).

Flips Driver.FOCAS.Cli-005 from Deferred to Resolved; README regenerated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 15:14:08 -04:00
Joseph Doherty 0f8ce1cb80 docs(code-reviews): regenerate index — final batch — 6 Low findings resolved
Batch 7 closed the last Open findings in Client.UI. The review backlog
is now empty: 0 Open findings across all 31 modules.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:25:28 -04:00
Joseph Doherty 1b10194634 fix(client-ui): resolve Low code-review findings (Client.UI-003,004,006,009,010,011)
- Client.UI-003: wire Serilog properly per CLAUDE.md — console sink +
  rolling daily file sink in Program.Main, Log.CloseAndFlush in finally,
  per-VM Log.ForContext<> loggers.
- Client.UI-004: migrate the cert-store folder picker from the obsolete
  OpenFolderDialog to StorageProvider.OpenFolderPickerAsync (with
  TryGetFolderFromPathAsync seed + TryGetLocalPath extraction).
- Client.UI-006: surface formerly silent catch blocks via an observable
  StatusMessage on the Subscriptions / Alarms VMs that bubbles up into
  the shell's status bar; soft fallbacks log at Information level so
  hard failures stay distinguishable.
- Client.UI-009: docs/Client.UI.md now lists Standard Deviation in the
  Aggregate row of the Query Options table.
- Client.UI-010: removed the unused MinDateTimeProperty /
  MaxDateTimeProperty styled properties from DateTimeRangePicker.
- Client.UI-011: updated the cert-store TextBox watermark from the
  legacy AppData/LmxOpcUaClient/pki to the canonical
  AppData/OtOpcUaClient/pki.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:25:20 -04:00
Joseph Doherty 59ecd18169 docs(code-reviews): regenerate index — 25 Low findings resolved
Batch 6 cleared Open findings in Driver.FOCAS.Cli (1 deferred to
Driver.Cli.Common), Driver.Cli.Common, Driver.Historian.Wonderware.Client,
Client.CLI, and Client.Shared.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:13:29 -04:00
Joseph Doherty 2a6ac07111 fix(client-shared): resolve Low code-review findings (Client.Shared-003,004,009,010,011)
- Client.Shared-003: DefaultSessionAdapter.WriteValueAsync / CallMethodAsync
  guard against null/empty Results and throw ServiceResultException with
  the response's ServiceResult code instead of indexing into a missing
  list.
- Client.Shared-004: DefaultSessionAdapter.CloseAsync / HistoryReadRawAsync
  / HistoryReadAggregateAsync use the Session.*Async overloads and honour
  the caller's CancellationToken.
- Client.Shared-009: AcknowledgeAlarmAsync returns the underlying
  ServiceResultException.StatusCode on failure instead of always Good;
  IOpcUaClientService doc updated to describe the new contract.
- Client.Shared-010: ConnectionSettings.CertificateStorePath defaults to
  empty; DefaultApplicationConfigurationFactory resolves the canonical
  PKI path lazily, so per-failover ConnectionSettings copies don't hit
  the filesystem.
- Client.Shared-011: added the alarm-fallback regression test, extracted
  EndpointSelector as a pure static, and added EndpointSelectorTests
  covering security-mode match, Basic256Sha256 preference, fallback,
  diagnostics, hostname rewrite, and null/empty guards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:13:21 -04:00
Joseph Doherty 7fe9f16cf8 fix(client-cli): resolve Low code-review findings (Client.CLI-002,003,004,006,007,008,009,010)
- Client.CLI-002: SubscribeCommand's neverWentBad list now requires the
  node to be present in lastStatus (i.e. received at least one update)
  so the 'suspect' bucket only contains observed nodes.
- Client.CLI-003: every long-running command validates numeric option
  ranges (Interval / Depth / MaxDepth / Duration / Max) and throws
  CliFx CommandException on out-of-range values.
- Client.CLI-004: SubscribeCommand carries XML summary docs on the
  type, ctor, every [CommandOption] property, and ExecuteAsync —
  matching the sibling commands' style.
- Client.CLI-006: HistoryReadCommand parses --start / --end with
  InvariantCulture+UTC and surfaces FormatException as CommandException;
  every NodeIdParser.ParseRequired call wraps FormatException /
  ArgumentException as CommandException.
- Client.CLI-007: CommandBase.ConfigureLogging calls Log.CloseAndFlush()
  before assigning a new Log.Logger so prior sinks are disposed.
- Client.CLI-008: rewrote the subscribe and historyread sections of
  docs/Client.CLI.md (every flag documented, summary-bucket vocabulary,
  StandardDeviation aggregate, UTC --start/--end convention).
- Client.CLI-009: SubscribeCommand / AlarmsCommand use named local
  handlers and detach them via -= after UnsubscribeAsync so no
  notification reaches the console after the command's output phase
  ends.
- Client.CLI-010: added CommandRangeValidationTests,
  EventHandlerLifecycleTests, InputValidationErrorsTests,
  LoggerLifecycleTests, and SubscribeCommandSummaryTests pinning every
  Low fix; FakeOpcUaClientService gained AddDiscoveredVariable +
  RaiseDataChanged + BrowseResultsByParent helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:13:08 -04:00
Joseph Doherty 879925180b fix(driver-historian-wonderware-client): resolve Low code-review findings (Driver.Historian.Wonderware.Client-003,004,006,008,010)
- Driver.Historian.Wonderware.Client-003: replaced the mixed Interlocked
  + healthLock counters with RecordOutcome that touches _totalQueries
  and exactly one of _totalSuccesses / _totalFailures under one
  acquisition.
- Driver.Historian.Wonderware.Client-004: InvokeAndClassifyAsync routes
  transport + sidecar classification through a single RecordOutcome
  call; the legacy ReclassifySuccessAsFailure two-step is gone.
- Driver.Historian.Wonderware.Client-006: removed the dead
  ReconnectInitialBackoff / ReconnectMaxBackoff options and added a
  doc <remarks> stating the channel performs a single in-place
  reconnect; retry/backoff stays with the caller.
- Driver.Historian.Wonderware.Client-008: the audit-suppression comment
  block now records advisory titles, why neither applies, and the
  revisit trigger.
- Driver.Historian.Wonderware.Client-010: reworded Dispose() to claim
  deadlock-safety and added a GetHealthSnapshot summary documenting the
  single-channel collapse + counter invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:12:16 -04:00
Joseph Doherty 3ca569f621 fix(driver-cli-common): resolve Low code-review findings (Driver.Cli.Common-004,006)
- Driver.Cli.Common-004: confirm the FormatTable empty-input guard
  landed earlier (commit 1433a1c); flip status to Resolved with a
  cross-reference.
- Driver.Cli.Common-006: reword the SnapshotFormatter source-time
  column comment to describe the actual behaviour (right-most column,
  unmeasured, '-' for null timestamps) and confirm the
  DriverCommandBase summary now enumerates FOCAS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:12:04 -04:00
Joseph Doherty 6923be3aa2 fix(driver-focas-cli): resolve Low code-review findings (Driver.FOCAS.Cli-001,002,003,004; -005 deferred)
- Driver.FOCAS.Cli-001: WriteCommand.ParseValue now wraps numeric
  FormatException / OverflowException as CliFx CommandException with
  the offending value.
- Driver.FOCAS.Cli-002: SubscribeCommand's OnDataChange handler and the
  banner both take a writeLock so notification-callback and main-thread
  writes can't interleave; handler exceptions are warn-and-swallow.
- Driver.FOCAS.Cli-003: FocasCommandBase.ValidateOptions rejects
  --cnc-port outside 1..65535, non-positive --timeout-ms, and
  non-positive --interval-ms; ExecuteAsync calls it first.
- Driver.FOCAS.Cli-004: 'await using var driver' is the sole driver
  disposal path; dropped the redundant explicit await ShutdownAsync.
- Driver.FOCAS.Cli-005 (Deferred): the fix lives in
  Driver.Cli.Common.SnapshotFormatter — explicitly naming the
  status-code shortlist there benefits every driver CLI. Left as a
  Driver.Cli.Common follow-up.
- Registered the new tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli.Tests
  project in ZB.MOM.WW.OtOpcUa.slnx.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:11:55 -04:00
Joseph Doherty 2a941b255f docs(code-reviews): regenerate index — 29 Low findings resolved
Batch 5 cleared Open findings in Driver.AbCip.Cli, Driver.AbLegacy.Cli,
Driver.S7.Cli, Driver.TwinCAT.Cli, and Driver.Modbus.Cli.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:35:12 -04:00
Joseph Doherty 80ef8806e0 fix(driver-modbus-cli): resolve Low code-review findings (Driver.Modbus.Cli-003,004,005,006,007,008)
- Driver.Modbus.Cli-003: ModbusCommandBase.ValidateEndpoint rejects
  --port outside 1..65535, non-positive --timeout-ms, and --unit-id
  outside 1..247.
- Driver.Modbus.Cli-004: wrapped SubscribeCommand's OnDataChange handler
  body in a try/catch (warn-and-swallow) and serialised the console
  write through a lock.
- Driver.Modbus.Cli-005: Probe / Read / Write now catch the
  cancellation-during-init OperationCanceledException and print
  'Cancelled.' instead of dumping a stack trace.
- Driver.Modbus.Cli-006: ProbeCommand.ComputeVerdict derives the headline
  from BOTH the driver state and the probe snapshot's OPC UA quality
  class so the headline can't disagree with the wire result.
- Driver.Modbus.Cli-007: docs/Driver.Modbus.Cli.md carries an explicit
  'CLI scope' callout — the address-string grammar is a DriverConfig
  JSON feature; the CLI takes the structured triple only.
- Driver.Modbus.Cli-008: pinned BuildOptions, ValidateEndpoint, the
  region-validation guards, ComputeVerdict, and the cancellation-during-
  initialize paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:35:05 -04:00
Joseph Doherty f2ee027145 fix(driver-twincat-cli): resolve Low code-review findings (Driver.TwinCAT.Cli-001,002,003,004,005,006,007)
- Driver.TwinCAT.Cli-001: TwinCATCommandBase.Validate rejects
  non-positive TimeoutMs / IntervalMs and AmsPort outside 1..65535;
  ExecuteAsync calls it first.
- Driver.TwinCAT.Cli-002: SubscribeCommand serialises every WriteLine
  through a writeLock to remove the notification-callback vs banner
  interleave risk.
- Driver.TwinCAT.Cli-003: SubscribeCommand.DescribeMechanism derives
  the banner label from the returned ISubscriptionHandle.DiagnosticId
  so it can't disagree with what the driver actually did.
- Driver.TwinCAT.Cli-004: introduced TwinCATTagCommandBase carrying
  --poll-only + BuildOptions; BrowseCommand stays on the slimmer
  TwinCATCommandBase so --poll-only no longer surfaces in browse --help.
- Driver.TwinCAT.Cli-005: ProbeCommand --type now carries the 't' short
  alias to match the other commands.
- Driver.TwinCAT.Cli-006: 35 new tests covering Gateway / AmsAddress
  parse / BuildOptions / PollOnly / browse-helpers / probe-alias /
  mechanism derivation.
- Driver.TwinCAT.Cli-007: replaced the empty-init <inheritdoc/> with an
  explicit summary warning future maintainers about the no-op init.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:34:57 -04:00
Joseph Doherty 67ef6c4ebc fix(driver-s7-cli): resolve Low code-review findings (Driver.S7.Cli-004,005,006,007)
- Driver.S7.Cli-004: 'await using var driver' is the sole driver
  disposal path; dropped the redundant explicit await ShutdownAsync from
  each command's finally.
- Driver.S7.Cli-005: deleted the stale empty
  tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/ directory (the real test
  project lives under tests/Drivers/Cli/).
- Driver.S7.Cli-006: S7CommandBaseBuildOptionsTests cover the probe
  toggle, timeout mapping, host/port/CPU/rack/slot wiring, and tag list
  passthrough.
- Driver.S7.Cli-007: re-added the SubscribeCommand handler comment
  explaining the CliFx IConsole.Output usage and that the poll-thread
  raises events.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:34:48 -04:00
Joseph Doherty f46e126208 fix(driver-ablegacy-cli): resolve Low code-review findings (Driver.AbLegacy.Cli-002,003,004,005,006,007)
- Driver.AbLegacy.Cli-002: WriteCommand.Value description lists the full
  true/false, 1/0, on/off, yes/no alias set.
- Driver.AbLegacy.Cli-003: SubscribeCommand serialises every WriteLine
  via a per-execution consoleGate lock so the poll-thread OnDataChange
  handler can't interleave with the banner.
- Driver.AbLegacy.Cli-004: dropped 'await using var driver' in favour of
  a plain 'var driver' + explicit await ShutdownAsync in finally; the
  driver is no longer shut down twice.
- Driver.AbLegacy.Cli-005: SubscribeCommand.IntervalMs description
  carries the PollGroupEngine 250ms-floor caveat; docs/Driver.AbLegacy.Cli.md
  spells out the same.
- Driver.AbLegacy.Cli-006: ProbeCommand --type now carries the short
  alias 't' to match the other commands.
- Driver.AbLegacy.Cli-007: BuildOptionsTests cover the probe-disabled,
  device-shape, tag-passthrough, timeout-propagation, and empty-tag-list
  paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:34:32 -04:00
Joseph Doherty 759af8c1bb fix(driver-abcip-cli): resolve Low code-review findings (Driver.AbCip.Cli-003,004,005,006,007,008)
- Driver.AbCip.Cli-003: SubscribeCommand prints the 'Subscribed' banner
  BEFORE wiring OnDataChange so the main thread can't interleave its
  write with the poll-thread handler.
- Driver.AbCip.Cli-004: AbCipCommandBase.Timeout and SubscribeCommand
  validate TimeoutMs / IntervalMs and throw CommandException on
  non-positive values.
- Driver.AbCip.Cli-005: every command now calls FlushLogging() in its
  finally block.
- Driver.AbCip.Cli-006: Timeout init throws NotSupportedException with a
  pointer at TimeoutMs instead of silently swallowing assignments.
- Driver.AbCip.Cli-007: added AbCipCommandBaseTests covering BuildOptions
  shape, probe / controller-browse / alarm toggles, host address, family
  selection, tag list passthrough.
- Driver.AbCip.Cli-008: rewrote the opening paragraph in
  docs/Driver.AbCip.Cli.md to credit the six-CLI roster with a pointer
  at docs/DriverClis.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:34:25 -04:00
Joseph Doherty 61c0311938 docs(code-reviews): regenerate index — 24 Low findings resolved
Batch 4 cleared Open findings in Driver.TwinCAT, Driver.Modbus,
Driver.OpcUaClient, Driver.Historian.Wonderware, and Driver.Modbus.Addressing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:18:21 -04:00
Joseph Doherty 9263519852 fix(driver-modbus-addressing): resolve Low code-review findings (Driver.Modbus.Addressing-006,007,009)
- Driver.Modbus.Addressing-006: broaden the catch in TryParseFamilyNative
  so a future helper throwing a non-Argument/Overflow type still satisfies
  the try-parse contract.
- Driver.Modbus.Addressing-007: document that the address grammar does
  not carry ModbusStringByteOrder (the structured-tag path does);
  add a 'Grammar scope' bullet to docs/v2/dl205.md.
- Driver.Modbus.Addressing-009: reword the ModbusModiconAddress comments
  so they don't imply a leading-digit invariant the parser doesn't
  enforce.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:18:15 -04:00
Joseph Doherty 1f29b215c8 fix(driver-historian-wonderware): resolve Low code-review findings (Driver.Historian.Wonderware-004,005,007,008,010,011,012)
- Driver.Historian.Wonderware-004: ToHistorianEvent synthesises a fresh
  Guid when the upstream EventId is unparseable and logs the substitution
  instead of writing the historian with Guid.Empty.
- Driver.Historian.Wonderware-005: GetHealthSnapshot derives the
  connection-open booleans from the active-node fields so the snapshot
  is self-consistent without depending on the secondary lock.
- Driver.Historian.Wonderware-007: SID-mismatch branch in PipeServer now
  sends a HelloAck { Accepted=false, RejectReason } so the client sees a
  symmetric rejection.
- Driver.Historian.Wonderware-008: classify StartQuery failures —
  connection-class codes drop the connection, query-class codes throw
  QueryClassStartQueryException so the IPC layer surfaces Success=false.
- Driver.Historian.Wonderware-010: RequestTimeoutSeconds now enforced
  via BuildRequestCts linked to the caller's CancellationToken.
- Driver.Historian.Wonderware-011: refreshed XML docs to describe the
  current sidecar / named-pipe architecture (Galaxy.Host / Proxy
  references reframed as historical context).
- Driver.Historian.Wonderware-012: pinned the previously-uncovered
  HistorianDataSource behaviours with five new test files; also removed
  the stale empty tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests
  directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:18:10 -04:00
Joseph Doherty 42aa82de29 fix(driver-opcuaclient): resolve Low code-review findings (Driver.OpcUaClient-011,014)
- Driver.OpcUaClient-011: rewrote the ValueRank comment with the OPC UA
  Part 3 constants and an explicit scalar/array boundary at
  valueRank >= 0.
- Driver.OpcUaClient-014: track every MonitoredItem.Notification handler
  in a MonitoredItemNotificationHandle record; UnsubscribeAsync /
  UnsubscribeAlarmsAsync / ShutdownAsync detach the handler before
  Subscription.DeleteAsync so the SDK's invocation list no longer keeps
  the driver alive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:17:55 -04:00
Joseph Doherty d5322b0f9a fix(driver-modbus): resolve Low code-review findings (Driver.Modbus-003,007,008,009,010,011,012)
- Driver.Modbus-003: route every _health access through ReadHealth /
  WriteHealth helpers backed by Volatile.Read / Volatile.Write so a
  burst of concurrent ReadAsync callers always sees a complete snapshot.
- Driver.Modbus-007: promoted the Int64 / UInt64 → Int32 surfacing
  caveat to a full <remarks> block; rewrote DisableFC23's doc to flag it
  as reserved / no-op.
- Driver.Modbus-008: deleted stale duplicate doc, rewrote the
  prohibition-block summaries to credit the shipped re-probe loop, and
  removed the unused 'status' local in the ModbusException catch arm.
- Driver.Modbus-009: bind-time validation rejects StringLength < 1 for
  String tags; ModbusTcpTransport clamps keep-alive intervals to whole
  seconds (>=1).
- Driver.Modbus-010: documented WriteOnChangeOnly's cache-invalidation
  policy (reads-only) and the write-only-tag caveat.
- Driver.Modbus-011: collected the scattered instance fields into a
  single contiguous block at the top of ModbusDriver.
- Driver.Modbus-012: covered the previously-uncovered Reinitialize
  state-hygiene, malformed/truncated/empty-bitmap response, and
  DisposeAsync teardown paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:17:51 -04:00
Joseph Doherty 3c75db7eb6 fix(driver-twincat): resolve Low code-review findings (Driver.TwinCAT-004,006,014,015,016)
- Driver.TwinCAT-004: corrected the IEC time-type inline comments;
  documented that the driver currently surfaces them as raw UInt32
  counters.
- Driver.TwinCAT-006: ResolveHost returns a documented UnresolvedHost
  sentinel when no devices are configured instead of returning the
  logical DriverInstanceId (which never matches GetHostStatuses).
- Driver.TwinCAT-014: wired Probe.Timeout into the probe-loop call and
  added a NotificationMaxDelayMs config knob threaded through
  AddNotificationAsync.
- Driver.TwinCAT-015: Dispose() runs a genuinely synchronous teardown
  with bounded waits (no sync-over-async deadlock pattern).
- Driver.TwinCAT-016: pinned the Structure-tag rejection and the
  probe-loop vs read disposal race with regression tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:17:42 -04:00
Joseph Doherty bccff1339d docs(code-reviews): regenerate index — 22 Low findings resolved
Batch 3 cleared Open findings in Driver.Galaxy, Driver.AbCip,
Driver.AbLegacy, Driver.FOCAS, and Driver.S7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:45:53 -04:00
Joseph Doherty af0f09d07e fix(driver-s7): resolve Low code-review findings (Driver.S7-003,005,009,010,013)
- Driver.S7-003: ArgumentNullException.ThrowIfNull on the references
  argument at the top of ReadAsync / WriteAsync (was reaching .Count
  before any null check).
- Driver.S7-005: drop the redundant global::S7.Net.Plc qualifiers in
  ReadOneAsync / WriteOneAsync — using S7.Net already covers Plc.
- Driver.S7-009: PollLoopAsync degrades _health to Degraded after
  sustained failure and backs off exponentially up to PollBackoffCap;
  resets on a healthy tick so an operator can see the loop wedge.
- Driver.S7-010: Dispose runs the synchronous teardown directly with a
  bounded WhenAll Wait drain instead of bridging via DisposeAsync().
- Driver.S7-013: reject unsupported S7DataType values (Int64 / UInt64 /
  Float64 / String / DateTime) at InitializeAsync so half-implemented
  types no longer leak BadNotSupported live nodes into the address space.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:45:45 -04:00
Joseph Doherty 6575c6e5f6 fix(driver-focas): resolve Low code-review findings (Driver.FOCAS-007,008,009,010,011)
- Driver.FOCAS-007: optional ILogger<FocasDriver> + alarm-projection
  logger; log Debug around every formerly-empty catch (probe / shutdown
  / fixed-tree / recycle / alarms-read / projection).
- Driver.FOCAS-008: cache the parsed FocasAddress per tag at
  InitializeAsync; Read/WriteAsync look it up instead of re-parsing on
  every call.
- Driver.FOCAS-009: ProbeLoopAsync now wraps client.ProbeAsync in a
  linked CTS honouring Probe.Timeout so a hung CNC socket can't block
  past the configured limit.
- Driver.FOCAS-010: FocasOperationModeExtensions.ToText delegates to
  FocasOpMode.ToText — single canonical op-mode label surface.
- Driver.FOCAS-011: FocasAlarmType constants are typed short to match
  the cnc_rdalmmsg2 wire field and the projection switch arms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:45:38 -04:00
Joseph Doherty f7e3e9885e fix(driver-ablegacy): resolve Low code-review findings (Driver.AbLegacy-005,011,013)
- Driver.AbLegacy-005: optional ILogger<AbLegacyDriver> ctor parameter,
  logged init failure / probe transitions / first non-zero libplctag
  status per device.
- Driver.AbLegacy-011: Dispose() runs the synchronous teardown directly
  instead of bridging via DisposeAsync().AsTask().GetAwaiter().GetResult()
  to remove the documented sync-over-async deadlock pattern.
- Driver.AbLegacy-013: documented the ResolveHost three-tier fallback
  chain in XML and pointed DiscoverAsync's IsArray=false comment at the
  Modbus ArrayCount pattern for the eventual multi-element follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:45:31 -04:00
Joseph Doherty 77b8686199 fix(driver-abcip): resolve Low code-review findings (Driver.AbCip-007,011,012,013,015)
- Driver.AbCip-007: inject an optional ILogger<AbCipDriver> /
  ILogger<AbCipAlarmProjection> (default NullLogger) and log around
  every read / write / template-fetch / probe / alarm-poll failure path.
- Driver.AbCip-011: LogWarning when InitializeAsync is configured with
  Probe.Enabled=true but ProbeTagPath is blank — operators now see why
  GetHostStatuses keeps reporting Unknown.
- Driver.AbCip-012: documented the LibplctagTemplateReader per-call
  Tag cost as accepted given libplctag's own connection pool and the
  low-frequency discovery use-case.
- Driver.AbCip-013: per-device AllowPacking + ConnectionSize overrides
  on AbCipDeviceOptions, threaded through AbCipTagCreateParams; central
  BuildCreateParams helper replaces five ad-hoc clones; AllowPacking
  now reaches Tag.AllowPacking at runtime.
- Driver.AbCip-015: stale-comment sweep — every PR-N forward-reference
  is rewritten to describe present behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:45:19 -04:00
Joseph Doherty 9f7ae20995 fix(driver-galaxy): resolve Low code-review findings (Driver.Galaxy-005,010,012,013)
- Driver.Galaxy-005: rewrite the EventPump BoundedChannelOptions comment
  to honestly describe the Wait+TryWrite pattern.
- Driver.Galaxy-010: ResolveApiKey now warns when a literal API key is
  used in production wiring; added an explicit dev: prefix for known
  cleartext-in-dev cases and rewrote the GalaxyGatewayOptions doc.
- Driver.Galaxy-012: O(1) reverse-lookup for SubscriptionRegistry
  dispatch via per-entry FullRefByItemHandle map; immutable hash-set for
  the cross-binding reverse map; SubscribeAsync / ReadViaSubscribeOnce
  use BuildResultIndex for per-reference correlation.
- Driver.Galaxy-013: ReinitializeAsync now validates the incoming JSON
  against the running options; ReplayOnSessionLost honoured by the
  Replay path; class summary rewritten to describe the shipped surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:45:08 -04:00
264 changed files with 11823 additions and 1314 deletions
+1
View File
@@ -85,6 +85,7 @@
<Project Path="tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli.Tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli.Tests.csproj" />
<Project Path="tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests.csproj" />
<Project Path="tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Tests.csproj" />
<Project Path="tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli.Tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli.Tests.csproj" />
</Folder>
<Folder Name="/tests/Client/">
<Project Path="tests/Client/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests.csproj" />
+17 -17
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 8 |
| Open findings | 0 |
## Checklist coverage
@@ -62,7 +62,7 @@ assumption precisely.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `Commands/SubscribeCommand.cs:129-137` |
| Status | Open |
| Status | Resolved |
**Description:** The summary computes `neverWentBad` as every target whose node-id key is
absent from the `everBad` dictionary. A node that received no update at all is also absent
@@ -78,7 +78,7 @@ streamed only good values.
"suspect" list only contains nodes that were actually observed and never reported bad
quality.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `neverWentBad` now requires the node to be present in `lastStatus` (i.e. it received at least one update) before being counted, so the "suspect" bucket only contains nodes that were actually observed and never reported bad quality.
### Client.CLI-003
@@ -87,7 +87,7 @@ quality.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `Commands/BrowseCommand.cs:29-30`, `Commands/SubscribeCommand.cs:20-27`, `Commands/AlarmsCommand.cs:28-29`, `Commands/HistoryReadCommand.cs:42-43` |
| Status | Open |
| Status | Resolved |
**Description:** Numeric command options accept any value with no range validation.
`--depth`, `--interval`, `--max-depth`, `--max`, and the history `--interval` can all be
@@ -100,7 +100,7 @@ is forwarded to `HistoryReadRawAsync`. None of these produce a clear operator-fa
`CliFx.Exceptions.CommandException` with an actionable message when a value is out of
range.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — every command's `ExecuteAsync` now validates the numeric option ranges (`--interval`, `--depth`, `--max-depth`, `--max`, `--duration`) and throws `CliFx.Exceptions.CommandException` with the offending value when a non-positive (or otherwise out-of-range) value is supplied. Pinned by `CommandRangeValidationTests`.
### Client.CLI-004
@@ -109,7 +109,7 @@ range.
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `Commands/SubscribeCommand.cs:13-37` |
| Status | Open |
| Status | Resolved |
**Description:** `SubscribeCommand` is the only command in the module whose constructor
and all `[CommandOption]` properties have no XML doc comments. Every other command
@@ -121,7 +121,7 @@ otherwise-uniform documentation convention of the module.
**Recommendation:** Add `<summary>` XML docs to the `SubscribeCommand` constructor and to
each of its option properties, matching the style used by the sibling commands.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `SubscribeCommand` now carries `<summary>` XML docs on the type, the constructor, every `[CommandOption]` property, and `ExecuteAsync`, matching the style used by the sibling commands.
### Client.CLI-005
@@ -156,7 +156,7 @@ callback.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `Commands/HistoryReadCommand.cs:73`, `Commands/HistoryReadCommand.cs:76`, `Helpers/NodeIdParser.cs:39` |
| Status | Open |
| Status | Resolved |
**Description:** Operator input-format errors surface as raw .NET exceptions rather than
clean CLI errors. An unparseable start/end value throws `FormatException` straight out of
@@ -170,7 +170,7 @@ is converted to a `CliFx.Exceptions.CommandException` with a clean exit code.
`CommandException` with a concise message and a non-zero exit code, so malformed input
yields a one-line error instead of a stack trace.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `HistoryReadCommand` parses `--start`/`--end` with `CultureInfo.InvariantCulture` + `AssumeUniversal`/`AdjustToUniversal`, catches `FormatException`, and rethrows as `CommandException` with the offending value. Every command's call to `NodeIdParser.ParseRequired` is wrapped in a `catch (FormatException or ArgumentException)` block that surfaces the underlying message as a clean CLI error. Pinned by `InputValidationErrorsTests`.
### Client.CLI-007
@@ -179,7 +179,7 @@ yields a one-line error instead of a stack trace.
| Severity | Low |
| Category | Performance & resource management |
| Location | `CommandBase.cs:112-123` |
| Status | Open |
| Status | Resolved |
**Description:** `ConfigureLogging` builds a new Serilog `LoggerConfiguration`, creates a
logger, and assigns it to the static `Log.Logger` without disposing the previously
@@ -193,7 +193,7 @@ abandons the prior console sink without disposal. The pattern is incorrect:
build the logger into a local `ILogger` the command owns and disposes, rather than mutating
global static state per command.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `CommandBase.ConfigureLogging` now calls `Log.CloseAndFlush()` before assigning a new `Log.Logger`, so a prior logger's console sink is disposed before the next one is installed. Pinned by `LoggerLifecycleTests`.
### Client.CLI-008
@@ -202,7 +202,7 @@ global static state per command.
| Severity | Low |
| Category | Documentation & comments |
| Location | `docs/Client.CLI.md:158-217` |
| Status | Open |
| Status | Resolved |
**Description:** `docs/Client.CLI.md` is stale relative to the code at this commit.
(1) The `subscribe` command section documents only `-n` and `-i`, but the code
@@ -218,7 +218,7 @@ code option description includes it.
`docs/Client.CLI.md` from the current option set, including the five new subscribe flags
and the `StandardDeviation` aggregate row.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — rewrote the `subscribe` section of `docs/Client.CLI.md` to document every flag (`-r/--recursive`, `--max-depth`, `-q/--quiet`, `--duration`, `--summary-file`) plus the summary-bucket vocabulary, and added the `StandardDeviation` row plus the UTC `--start`/`--end` convention note to the `historyread` section.
### Client.CLI-009
@@ -227,7 +227,7 @@ and the `StandardDeviation` aggregate row.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `Commands/SubscribeCommand.cs:66-165`, `Commands/AlarmsCommand.cs:52-91` |
| Status | Open |
| Status | Resolved |
**Description:** Both long-running commands attach an event handler
(`service.DataChanged += ...`, `service.AlarmEvent += ...`) with a lambda and never detach
@@ -243,7 +243,7 @@ but never the .NET event.
unsubscribing, using a named local delegate so it can be removed, ensuring no notification
is processed after the command output phase ends.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `SubscribeCommand` and `AlarmsCommand` declare named local handlers (`DataChangedHandler` / `AlarmEventHandler`) and detach them via `service.DataChanged -= ...` / `service.AlarmEvent -= ...` right after `UnsubscribeAsync` so no notification reaches the console once the command's output phase ends. Pinned by `EventHandlerLifecycleTests`.
### Client.CLI-010
@@ -252,7 +252,7 @@ is processed after the command output phase ends.
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Client/ZB.MOM.WW.OtOpcUa.Client.CLI.Tests/SubscribeCommandTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** The new `SubscribeCommand` capabilities are largely untested. The four
`SubscribeCommandTests` cover only single-node subscribe, unsubscribe-on-cancel,
@@ -268,4 +268,4 @@ exit, summary bucketing across good/bad/no-update nodes, and the `--summary-file
The `FakeOpcUaClientService` already exposes `RaiseDataChanged`, so feeding good/bad values
and asserting the summary text is straightforward.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `SubscribeCommandSummaryTests` (covering recursive collection via `FakeOpcUaClientService.AddDiscoveredVariable`, `--duration` auto-exit, summary bucketing for good/bad/never/never-went-bad, and the `--summary-file` write), `CommandRangeValidationTests`, `EventHandlerLifecycleTests`, `InputValidationErrorsTests`, and `LoggerLifecycleTests` to pin the other Low findings; `FakeOpcUaClientService` was extended with `AddDiscoveredVariable` / `RaiseDataChanged` helpers.
+11 -11
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -63,13 +63,13 @@
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `Adapters/DefaultSessionAdapter.cs:76`, `Adapters/DefaultSessionAdapter.cs:273` |
| Status | Open |
| Status | Resolved |
**Description:** `WriteValueAsync` returns `response.Results[0]` and `CallMethodAsync` reads `result.Results[0]` without first checking the `Results` collection is non-empty. A malformed or service-level-faulted response (empty `Results` alongside a service fault) produces an `IndexOutOfRangeException` rather than a meaningful OPC UA `StatusCode` or `ServiceResultException`.
**Recommendation:** Guard both accesses — throw `ServiceResultException` with the response's `ResponseHeader.ServiceResult` (or `BadUnexpectedError`) when `Results` is empty.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added empty-Results guards to both `WriteValueAsync` (lines 80-85) and `CallMethodAsync` (lines 293-298) in `DefaultSessionAdapter`. Each now throws `ServiceResultException` carrying `response.ResponseHeader.ServiceResult.Code` (or `StatusCodes.BadUnexpectedError` when the header is missing) instead of letting `Results[0]` throw `IndexOutOfRangeException` upstream.
### Client.Shared-004
@@ -78,13 +78,13 @@
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `Adapters/DefaultSessionAdapter.cs:228`, `Adapters/DefaultSessionAdapter.cs:121`, `Adapters/DefaultSessionAdapter.cs:172` |
| Status | Open |
| Status | Resolved |
**Description:** `CloseAsync`, `HistoryReadRawAsync`, and `HistoryReadAggregateAsync` are declared `async Task` but call the synchronous `Session.Close()` / `Session.HistoryRead(...)` APIs and contain no `await`. The history methods run a blocking synchronous service round-trip on the caller's thread; for the UI this blocks the dispatcher thread. The async signature misleads callers, and the `CancellationToken` parameter is ignored on these paths.
**Recommendation:** Use the stack's async overloads (`Session.HistoryReadAsync`, `Session.CloseAsync`) where available, or wrap the synchronous calls in `Task.Run`, so the methods are genuinely asynchronous and honor the cancellation token.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — replaced the three blocking calls with their async counterparts: `CloseAsync` now awaits `Session.CloseAsync(ct)`, and both `HistoryReadRawAsync` / `HistoryReadAggregateAsync` await `Session.HistoryReadAsync(...)` with `.ConfigureAwait(false)`. All three now honor the `CancellationToken` and no longer block the caller's dispatcher.
### Client.Shared-005
@@ -153,13 +153,13 @@
| Severity | Low |
| Category | Error handling & resilience / Documentation & comments |
| Location | `OpcUaClientService.cs:302-322` |
| Status | Open |
| Status | Resolved |
**Description:** `AcknowledgeAlarmAsync` is typed `Task<StatusCode>` and its XML doc implies the returned code reports the ack outcome, but the method unconditionally `return StatusCodes.Good`. The actual failure path is `DefaultSessionAdapter.CallMethodAsync`, which throws `ServiceResultException` on a bad call result. A failed acknowledgment therefore never returns a bad `StatusCode` — it throws — and the `StatusCode` return value is dead. Callers writing `if (StatusCode.IsBad(result))` will never see a bad result and will not catch the exception.
**Recommendation:** Either change the return type to `Task` (and let exceptions signal failure), or catch `ServiceResultException` in `AcknowledgeAlarmAsync` and return its `StatusCode`. Update the XML doc to match whichever is chosen.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `AcknowledgeAlarmAsync` now wraps the `CallMethodAsync` invocation in a try/catch for `ServiceResultException`, logging the failure and returning `ex.StatusCode` so callers using `if (StatusCode.IsBad(result))` see the bad status. The `IOpcUaClientService.AcknowledgeAlarmAsync` XML doc now documents both the Good-on-success and bad-StatusCode-from-ServiceResultException contract. Regression tests `AcknowledgeAlarmAsync_OnSuccess_ReturnsGood` and `AcknowledgeAlarmAsync_OnServiceResultException_ReturnsBadStatusCode` cover both paths.
### Client.Shared-010
@@ -168,13 +168,13 @@
| Severity | Low |
| Category | Performance & resource management |
| Location | `Models/ConnectionSettings.cs:48`, `OpcUaClientService.cs:408-417` |
| Status | Open |
| Status | Resolved |
**Description:** `ConnectionSettings.CertificateStorePath` is initialized to `ClientStoragePaths.GetPkiPath()` as a property initializer, so every `ConnectionSettings` instantiation runs `Environment.GetFolderPath` + `Path.Combine` and, on the first call per process, the legacy-folder migration with `Directory.Exists`/`Directory.Move` filesystem IO. `ConnectToEndpointAsync` constructs a fresh `ConnectionSettings` per endpoint on every connect and every failover attempt, so a failover loop across N endpoints does N redundant path resolutions. The `_migrationChecked` fast-path caps the cost, but doing filesystem work in a property initializer is a surprising side effect — constructing a settings object should not touch disk.
**Recommendation:** Make `CertificateStorePath` default to `string.Empty` and resolve `ClientStoragePaths.GetPkiPath()` lazily inside `DefaultApplicationConfigurationFactory.CreateAsync` only when the path is unset.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `ConnectionSettings.CertificateStorePath` now defaults to `string.Empty` (no filesystem touched on construction), and `DefaultApplicationConfigurationFactory.CreateAsync` resolves the canonical PKI path via `ClientStoragePaths.GetPkiPath()` only when the supplied path is null/whitespace. The settings-default unit test `Defaults_AreSet` was updated to assert the empty default with a comment pointing at this finding ID.
### Client.Shared-011
@@ -183,10 +183,10 @@
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Client/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/OpcUaClientServiceTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** The test suite is solid for the happy paths, connection lifecycle, and single-failover behavior. Gaps relative to the findings above: (a) no test exercises concurrent `SubscribeAsync`/failover to expose the `_activeDataSubscriptions` race (Client.Shared-005) or re-entrant keep-alive failures (Client.Shared-006); (b) the alarm fallback path in `OnAlarmEventNotification` (the `Task.Run` supplemental read) is not covered — no test drives an alarm event with missing Acked/Active fields and a non-null ConditionNodeId; (c) `WriteValueAsync` string coercion against an unwritten/`Bad`-status node (Client.Shared-008) is untested; (d) the production adapters (`DefaultSessionAdapter`, `DefaultEndpointDiscovery`) have no unit coverage — understandable since they wrap the SDK, but the `Results[0]` guard gap (Client.Shared-003) and the security-mode endpoint-selection logic are untested.
**Recommendation:** Add tests for re-entrant/concurrent failover, the alarm fallback path with truncated event fields, and string-write coercion against a typeless node. Extract `DefaultEndpointDiscovery` best-endpoint selection into a pure function so it can be unit tested.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added the previously-missing unit coverage: (a) `OnAlarmEvent_MissingAckedActiveButHasConditionNode_FallbackReadsAndRaisesEvent` drives the supplemental-read fallback path with null AckedState/ActiveState fields and a non-null SourceNode and asserts the Galaxy attribute reads populate the delivered event; (b) `WriteValueAsync` typeless-node coverage is exercised via the Client.Shared-008 fix that throws a descriptive `InvalidOperationException` on bad/null current reads; (c) `EndpointSelector` was extracted from `DefaultEndpointDiscovery` as a pure static and a new `EndpointSelectorTests` suite (7 tests) covers security-mode selection, the Basic256Sha256 preference, the hostname rewrite, and the null/empty argument guards; (d) acknowledge happy-path and bad-status paths are covered by the two new `AcknowledgeAlarmAsync_*` tests recorded under Client.Shared-009. Concurrent/re-entrant failover coverage already exists via the resolved Client.Shared-005/-006 tests in the suite.
+13 -13
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 6 |
| Open findings | 0 |
## Checklist coverage
@@ -89,7 +89,7 @@ directly so the compiler can prove non-nullness.
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `ZB.MOM.WW.OtOpcUa.Client.UI.csproj:20-21`, `Program.cs:14-20` |
| Status | Open |
| Status | Resolved |
**Description:** The csproj references `Serilog` and `Serilog.Sinks.Console`, and
`docs/Client.UI.md` lists Serilog as the logging technology, but no source file in
@@ -104,7 +104,7 @@ rolling daily file sink the project standard calls for) and route Avalonia loggi
through it, or drop the unused `Serilog` package references and correct
`docs/Client.UI.md`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — Honoured the CLAUDE.md mandate by wiring up Serilog with a console sink + a rolling daily file sink (`{LocalAppData}/OtOpcUaClient/logs/client-ui-*.log`, retained 14 days). Added `Serilog.Sinks.File` to the csproj and a `ConfigureLogging()` initializer in `Program.Main` that creates `Log.Logger` before `BuildAvaloniaApp()` and calls `Log.CloseAndFlush()` on exit. Each VM that previously had silent swallow blocks now owns a static `Log.ForContext<>()` logger so failures (subscribe, alarm subscribe, redundancy probe, recursive browse) are written to the rolling file. Avalonia's own logging is still routed through `LogToTrace` — replacing that would require a custom `ILogSink` adapter outside the scope of this finding.
### Client.UI-004
@@ -113,7 +113,7 @@ through it, or drop the unused `Serilog` package references and correct
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `Views/MainWindow.axaml.cs:125-138` |
| Status | Open |
| Status | Resolved |
**Description:** `OnBrowseCertPathClicked` uses `OpenFolderDialog`, which is
obsolete in Avalonia 11.x (the version pinned in the csproj). The supported
@@ -125,7 +125,7 @@ Avalonia major version.
**Recommendation:** Migrate the folder chooser to
`TopLevel.GetTopLevel(this).StorageProvider.OpenFolderPickerAsync(...)`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — Replaced `OpenFolderDialog` with `TopLevel.GetTopLevel(this).StorageProvider.OpenFolderPickerAsync(...)`, using `TryGetFolderFromPathAsync(vm.CertificateStorePath)` as the suggested start location and `TryGetLocalPath()` to extract the chosen path. The CS0618 obsoletion warning no longer appears in the build output.
### Client.UI-005
@@ -165,7 +165,7 @@ method, not only from `DisconnectAsync`.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `ViewModels/MainWindowViewModel.cs:244-252`, `ViewModels/AlarmsViewModel.cs:88-112`, `ViewModels/SubscriptionsViewModel.cs:79-94` |
| Status | Open |
| Status | Resolved |
**Description:** Many catch blocks swallow exceptions silently with an empty body
and only a comment (`// Redundancy info not available`, `// Subscribe failed`,
@@ -180,7 +180,7 @@ permission denial effectively impossible from the UI.
message or write the exception to a log. Distinguish "feature not supported"
(condition refresh) from "operation failed" so genuine errors are not hidden.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — Added an observable `StatusMessage` property on `SubscriptionsViewModel` and `AlarmsViewModel`; each former silent catch now logs through Serilog (via Client.UI-003's logger) and writes a user-visible message. `MainWindowViewModel.InitializeService` subscribes to both child VMs' `StatusMessage` changes and bubbles them up into the shell's `StatusMessage` (which is already bound to the status bar). Soft conditions are distinguished from hard failures: `RequestConditionRefreshAsync` failures log at Information level and surface as "Condition refresh not supported by server" rather than a generic error, matching the recommendation. Redundancy probe failure still leaves `RedundancyInfo` null but now logs at Information level instead of dropping the exception. Regression tests `AddSubscription_OnFailure_SurfacesStatusMessage`, `AddSubscriptionForNodeAsync_OnFailure_SurfacesStatusMessage`, `Subscribe_OnFailure_SurfacesStatusMessage`, and `ConnectCommand_RedundancyFailure_DoesNotBreakConnection` cover the four affected swallow sites.
### Client.UI-007
@@ -239,7 +239,7 @@ any background reconnect timers are leaked until process exit. The
| Severity | Low |
| Category | Design-document adherence |
| Location | `ViewModels/HistoryViewModel.cs:44-54` |
| Status | Open |
| Status | Resolved |
**Description:** `HistoryViewModel.AggregateTypes` exposes eight entries: `null`
(Raw) plus Average, Minimum, Maximum, Count, Start, End, and `StandardDeviation`.
@@ -250,7 +250,7 @@ stale relative to the code.
**Recommendation:** Update the "Aggregate" row in `docs/Client.UI.md` to include
Standard Deviation.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — Added "Standard Deviation" to the Aggregate row of the Query Options table in `docs/Client.UI.md` so it matches the eighth entry already exposed by `HistoryViewModel.AggregateTypes`.
### Client.UI-010
@@ -259,7 +259,7 @@ Standard Deviation.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `Controls/DateTimeRangePicker.axaml.cs:33-37`, `Controls/DateTimeRangePicker.axaml.cs:70-80` |
| Status | Open |
| Status | Resolved |
**Description:** `DateTimeRangePicker` declares `MinDateTimeProperty` /
`MaxDateTimeProperty` styled properties with public CLR accessors, but neither is
@@ -272,7 +272,7 @@ constraint the control does not enforce.
path (turn out-of-range input red, as invalid input already is) or remove the two
unused styled properties.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — Removed `MinDateTimeProperty` / `MaxDateTimeProperty` and their CLR accessors from `DateTimeRangePicker.axaml.cs`. No XAML or external caller binds the properties (grep across the repo confirmed only the control file referenced them), so removing the dead API surface is the correct fix; adding min/max clamping would have been speculative behaviour without a calling site.
### Client.UI-011
@@ -281,7 +281,7 @@ unused styled properties.
| Severity | Low |
| Category | Documentation & comments |
| Location | `Views/MainWindow.axaml:81`, `Services/JsonSettingsService.cs:11-15` |
| Status | Open |
| Status | Resolved |
**Description:** The certificate-store-path `TextBox` watermark reads
`(default: AppData/LmxOpcUaClient/pki)`, referencing the legacy pre-task-#208
@@ -293,4 +293,4 @@ that no longer matches where settings and the PKI store actually live.
**Recommendation:** Update the watermark to reference `OtOpcUaClient/pki`, or bind
it to `ClientStoragePaths.GetPkiPath()` so it cannot drift again.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — Updated the watermark text in `Views/MainWindow.axaml` from `(default: AppData/LmxOpcUaClient/pki)` to `(default: AppData/OtOpcUaClient/pki)` so it matches the canonical folder name resolved by `ClientStoragePaths` (the binding-to-helper alternative was considered but a static string keeps the watermark cheap; the path is also already documented in `docs/Client.UI.md`).
+71 -4
View File
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/Core/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms` |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Review date | 2026-05-23 |
| Commit reviewed | `a9be809` |
| Status | Reviewed |
| Open findings | 0 |
@@ -14,6 +14,8 @@
A comprehensive review completes every category, recording "No issues found" where
a category produced nothing rather than leaving it blank.
### 2026-05-22 review (commit `76d35d1`)
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Core.ScriptedAlarms-002 |
@@ -27,6 +29,26 @@ a category produced nothing rather than leaving it blank.
| 9 | Testing coverage | Core.ScriptedAlarms-012 |
| 10 | Documentation & comments | Core.ScriptedAlarms-003 |
### 2026-05-23 re-review (commit `a9be809`)
Focused re-review of the Core.ScriptedAlarms-009 resolution (commit `0001cdd`) —
new `AlarmScratch` class, `_scratchByAlarmId` ConcurrentDictionary, `RefillReadCache`
helper, and internal test accessors. Only the changed/new code since `76d35d1` was
re-examined; existing closed findings stay as audit trail.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | No issues found |
| 10 | Documentation & comments | Core.ScriptedAlarms-013 |
## Findings
### Core.ScriptedAlarms-001
@@ -156,13 +178,29 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Performance & resource management |
| Location | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
| Status | Won't Fix |
| Status | Resolved |
**Description:** `BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` on every predicate evaluation, i.e. on every upstream tag change for every referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. `AlarmPredicateContext` is also newly constructed each evaluation (line 281).
**Recommendation:** Minor. If the evaluation path shows up in allocation profiling, the read cache could be a reused per-alarm buffer cleared between evaluations (evaluations are already serialised under `_evalGate`, so a single shared scratch dictionary is safe). Not worth doing speculatively — flag for the perf surface in `docs/v2/Galaxy.Performance.md` if alarm evaluation is ever soak-tested.
**Resolution:** Won't Fix 2026-05-23 — per the recommendation, no code change. Documented the known allocation characteristic in `docs/v2/Galaxy.Performance.md` (new "Scripted-alarm engine — known hot-path allocations" section) so a future soak that surfaces pressure has a noted mitigation (reused per-alarm scratch buffer) and we don't re-find this in a later review.
**Resolution:** Resolved 2026-05-23 — added a per-alarm reusable `AlarmScratch`
(read-cache `Dictionary` + `AlarmPredicateContext`) held in
`_scratchByAlarmId`, populated lazily on first evaluation and refilled in place
by the new `RefillReadCache(Dictionary, IReadOnlySet)` helper on every
subsequent re-eval. `BuildReadCache` is gone. The reuse is safe because every
evaluation runs under `_evalGate`; the context wraps the dictionary by
reference, so the predicate's `ctx.GetTag(path)` sees the freshly-refilled
values. `LoadAsync` clears `_scratchByAlarmId` alongside `_alarms` so a
config-publish drops the prior generation's scratch (Inputs / Logger may
change). Regression tests added in `ScriptedAlarmEngineTests`:
`Reevaluation_reuses_the_same_read_cache_dictionary`,
`Reevaluation_reuses_the_same_predicate_context`, and
`LoadAsync_drops_the_prior_generations_scratch`; internal test hooks
`TryGetScratchReadCacheForTest` / `TryGetScratchContextForTest` exposed via
the existing `InternalsVisibleTo`. `docs/v2/Galaxy.Performance.md` "Scripted-alarm
engine" section rewritten to document the new reuse contract. Suite now 66
green (was 63).
### Core.ScriptedAlarms-010
@@ -208,3 +246,32 @@ a category produced nothing rather than leaving it blank.
**Recommendation:** Add engine-level tests that inject a controllable `Func<DateTime>` clock to drive `RunShelvingCheck`, cover the remaining Part 9 engine methods end-to-end, assert subscriber-exception isolation, and add a store-failure fake to lock in the chosen persistence-failure semantics from finding 007.
**Resolution:** Resolved 2026-05-22 — added 8 new engine-level tests covering all 6 gap areas: injectable-clock timed-shelve expiry via `RunShelvingCheckForTest`, `ConfirmAsync`/`TimedShelveAsync`/`UnshelveAsync`/`EnableAsync` end-to-end, subscriber-exception isolation, store-failure invariant, second-`LoadAsync` timer-leak regression, and `AreInputsReady` Bad/Uncertain guard; exposed `RunShelvingCheckForTest()` internal hook on the engine.
### Core.ScriptedAlarms-013
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `ScriptedAlarmEngine.cs:66-81` |
| Status | Resolved |
**Description:** The new internal test accessors `TryGetScratchReadCacheForTest` and `TryGetScratchContextForTest` (introduced by the Core.ScriptedAlarms-009 resolution at `0001cdd`) return the *live* per-alarm scratch — the same `Dictionary<string, DataValueSnapshot>` instance the engine clears and refills in `RefillReadCache` under `_evalGate`, plus the `AlarmPredicateContext` that wraps it by reference. The XML docs describe the intended use ("assert the scratch is reused across evaluations (two reads return the same instance)") but do not explicitly warn that:
1. The returned `IReadOnlyDictionary` is the engine's mutable working set. Enumerating it from a test thread while the engine is mid-evaluation (e.g. during a `ReevaluateAsync` queued by `OnUpstreamChange`, or a `ShelvingCheckAsync` callback) is a concurrent-read-while-writer scenario against a plain `Dictionary` — undefined behaviour, can throw `InvalidOperationException` or return torn data.
2. Reference-equality comparisons (`ReferenceEquals(a, b)`) and single-key indexer reads (`dict["Temp"]`) on a quiesced engine are the only safe uses. The existing regression tests stay within those bounds, but a future test author has no in-code signal that broader reads are unsafe.
The engine itself is correct — `RefillReadCache` runs only under `_evalGate`, so the engine never tears its own state. The risk is purely on the test-side contract.
**Recommendation:** Add a `<remarks>` block to both `TryGetScratchReadCacheForTest` and `TryGetScratchContextForTest` stating that the returned references point at live engine state, that reads are only safe when the engine is known to be idle (no in-flight `ReevaluateAsync`/`ShelvingCheckAsync`/`LoadAsync`), and that the intended uses are reference-identity assertions plus single-key lookups against a quiesced engine — never enumeration. No code change required; the engine's correctness depends on `_evalGate`, which is already documented.
**Resolution:** Resolved 2026-05-23 — applied the recommendation verbatim.
Added a `<remarks>` block to each of `TryGetScratchReadCacheForTest` and
`TryGetScratchContextForTest` documenting the synchronization contract:
the returned references point at live engine state refilled in place under
`_evalGate`, enumeration during an in-flight evaluation is a
concurrent-read-while-writer scenario against a plain `Dictionary`
(undefined behaviour), and the only safe uses are reference-identity
comparisons + single-key reads against a quiesced engine. No code change
required — the engine's correctness was always there; only the test-side
contract was undocumented.
+434 -16
View File
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting` |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Review date | 2026-05-23 |
| Commit reviewed | `a9be809` |
| Status | Reviewed |
| Open findings | 0 |
@@ -14,18 +14,23 @@
A comprehensive review completes every category, recording "No issues found" where
a category produced nothing rather than leaving it blank.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Core.Scripting-004, Core.Scripting-005 |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | Core.Scripting-006 |
| 4 | Error handling & resilience | Core.Scripting-007 |
| 5 | Security | Core.Scripting-001, Core.Scripting-002, Core.Scripting-003 |
| 6 | Performance & resource management | Core.Scripting-008 |
| 7 | Design-document adherence | Core.Scripting-009 |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Core.Scripting-010, Core.Scripting-011 |
| 10 | Documentation & comments | No issues found |
The 2026-05-23 re-review only covers code touched between commits `76d35d1` and
`a9be809` (primarily the Core.Scripting-008 ALC rewrite + the broadened BCL
references). Categories where the new code surface produced no issues are
recorded as "No new issues" for that pass.
| # | Category | Result (76d35d1) | Result (a9be809, new code only) |
|---|---|---|---|
| 1 | Correctness & logic bugs | Core.Scripting-004, Core.Scripting-005 | Core.Scripting-015 |
| 2 | OtOpcUa conventions | No issues found | No new issues |
| 3 | Concurrency & thread safety | Core.Scripting-006 | Core.Scripting-014 |
| 4 | Error handling & resilience | Core.Scripting-007 | No new issues |
| 5 | Security | Core.Scripting-001, Core.Scripting-002, Core.Scripting-003 | Core.Scripting-012, Core.Scripting-013 |
| 6 | Performance & resource management | Core.Scripting-008 | Core.Scripting-016 |
| 7 | Design-document adherence | Core.Scripting-009 | No new issues |
| 8 | Code organization & conventions | No issues found | No new issues |
| 9 | Testing coverage | Core.Scripting-010, Core.Scripting-011 | No new issues |
| 10 | Documentation & comments | No issues found | No new issues |
## Findings
@@ -240,7 +245,7 @@ race ordering.
| Severity | Low |
| Category | Performance & resource management |
| Location | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` |
| Status | Won't Fix |
| Status | Resolved |
**Description:** `CompiledScriptCache` has no capacity bound (acknowledged in the class
remarks) and no eviction. Each cached `ScriptEvaluator` holds a Roslyn `ScriptRunner<T>`
@@ -257,7 +262,33 @@ compile scripts into a collectible `AssemblyLoadContext` so `Clear()` can unload
generations. At minimum add a note to `docs/ScriptedAlarms.md` so operators with
high-publish-frequency deployments are aware.
**Resolution:** Resolved 2026-05-23 — accepted as a documented known limitation rather than fixing in code (collectible `AssemblyLoadContext` for Roslyn-emitted assemblies is a v3 concern). The "Compile cache" section of `docs/VirtualTags.md` now carries a "Per-publish assembly accretion (accepted limitation, Core.Scripting-008)" note that operators with high-publish-frequency deployments can scan, and `docs/ScriptedAlarms.md` cross-references it. The accretion is benign at the expected "low thousands" of scripts scale; recommended mitigation is a scheduled server restart for deployments that publish very frequently.
**Resolution:** Resolved 2026-05-23 — switched the compile pipeline off the legacy
`CSharpScript.CreateDelegate` path (which emits into the default, non-collectible
`AssemblyLoadContext`) and onto a hand-rolled `CSharpCompilation`
`Compilation.Emit(MemoryStream)``ScriptAssemblyLoadContext.LoadFromStream` chain,
with the new `ScriptAssemblyLoadContext` constructed `isCollectible: true`. Each
compiled script lives in its own ALC; `ScriptEvaluator` now implements `IDisposable`
and calls `AssemblyLoadContext.Unload()` on dispose. `CompiledScriptCache.Clear()`
disposes every materialised evaluator before dropping its dictionary entry, and
`CompiledScriptCache` itself is now `IDisposable` for graceful server shutdown.
After a publish-replace cycle the prior generation's emitted assemblies become
eligible for GC; the reclaim is GC-timing-sensitive (Unload is
*eligible-for-collection*, not synchronous) and the next collection cycle reclaims
them. The references list is now BCL-wide (System.* + netstandard + Microsoft.Win32.Registry
via the TRUSTED_PLATFORM_ASSEMBLIES set) so forbidden BCL types resolve at compile and
`ForbiddenTypeAnalyzer` is the sole security gate (consistent with the
Core.Scripting-001 / -002 model). `docs/VirtualTags.md` "Compile cache" section rewritten;
`docs/ScriptedAlarms.md` cross-reference updated to drop the obsolete restart guidance.
Regression tests added in `CompiledScriptCacheTests`:
`Dispose_unloads_compiled_script_assembly_load_context`,
`Clear_disposes_every_materialised_evaluator`, and
`GetOrCompile_after_Dispose_throws_ObjectDisposedException`; the first two
prove ALC unload via `WeakReference` + bounded `GC.Collect()` loops. Suite now 104
green (was 101). Authoring convention: the synthesized wrapper is an ordinary
C# static method, so scripts must end with explicit `return …;` per ordinary C# rules
(the legacy `CSharpScript` "last expression yields result" shorthand no longer applies);
every script in the existing corpus already uses explicit `return` so this is a doc-only
change for new authors.
### Core.Scripting-009
@@ -336,3 +367,390 @@ a script logging at Error level produces both a `scripts-*.log` event and a comp
Warning event.
**Resolution:** Resolved 2026-05-23 — added three new test files: `ScriptSandboxBuildTests` covers the `Build` null / non-`ScriptContext` / base-class / concrete-subclass paths; `ScriptContextTests` locks `Deadband` boundary semantics (equal-to-tolerance returns false; just-over returns true; symmetric in direction; zero-tolerance returns true only on non-equal; negative tolerance trips on any non-equal); the new `Factory_plus_companion_sink_integration_surfaces_script_error_in_both_logs` test in `ScriptLogCompanionSinkTests` wires `ScriptLoggerFactory` + the companion sink together end-to-end and asserts an Error emission lands in both the scripts sink (at Error) and the main sink (at Warning), each tagged with `ScriptName`. Suite now 101 green (was 85 before).
### Core.Scripting-012
| Field | Value |
|---|---|
| Severity | High |
| Category | Security |
| Location | `ForbiddenTypeAnalyzer.cs:60-76`, `ScriptSandbox.cs:96-126` |
| Status | Resolved |
**Description:** The Core.Scripting-008 rewrite broadened the BCL references list
from a narrow allow-list (`System.Private.CoreLib` + `System.Linq` only) to the
full `TRUSTED_PLATFORM_ASSEMBLIES` set filtered to `System.*` + `netstandard` +
`Microsoft.Win32.Registry`. This change correctly delegates the security gate to
`ForbiddenTypeAnalyzer` (the new comment in `ScriptSandbox` calls this out
explicitly), but the analyzer's deny-list has not been expanded to match the new
attack surface, and three categories of dangerous BCL types in the `System.*`
allow-listed assemblies are now reachable from script source:
1. **`System.Threading.ThreadPool`** (in namespace `System.Threading`). The
Core.Scripting-003 fix added `System.Threading.Tasks` to deny `Task.Run` /
`Parallel` fan-out because background work that outlives the per-evaluation
timeout is the explicit threat. `ThreadPool.QueueUserWorkItem`,
`ThreadPool.UnsafeQueueUserWorkItem`, and `ThreadPool.RegisterWaitForSingleObject`
are exactly the same threat — they schedule background work that outlives the
`WaitAsync(Timeout)` budget and tie up worker threads — but `System.Threading`
itself is allowed (because `CancellationToken` / `SemaphoreSlim` / `Volatile`
live there). The Core.Scripting-003 resolution is incomplete on the new
reference surface.
2. **`System.Threading.Timer`** (same namespace). Schedules a background
callback; the script returns control to the engine but the timer keeps
firing past the evaluation budget. Same threat as `Task.Run`.
3. **`System.Runtime.Loader.AssemblyLoadContext`** (in namespace
`System.Runtime.Loader`, which is not denied — only `System.Runtime.InteropServices`
is). The constructor + `LoadFromAssemblyPath` / `LoadFromStream` /
`LoadFromAssemblyName` let a script load an arbitrary DLL into the host
process. Pass (1) of the analyzer resolves the receiver type
(`AssemblyLoadContext`, allowed) + the invocation symbol's containing type
(also `AssemblyLoadContext`, allowed) and lets the call through. Pass (2)
only inspects `TypeSyntax` nodes — if the script discards the returned
`Assembly` (e.g. `alc.LoadFromAssemblyPath(@"C:\evil.dll");`) there is no
`TypeSyntax` for the analyzer to walk and the call is accepted. Triggering
execution of the loaded code from inside the sandbox is hard (most of
`Assembly`'s surface is in `System.Reflection`, which is denied) but the
defense-in-depth gap is real: an attacker who can author a script also
typically controls a file path on the server (Admin UI uploads, share
mounts) and loading an assembly is the prerequisite to every chained
escape — module initializers, type-resolve handlers, and a future analyzer
slip would all become exploitable.
In addition, two lower-impact `System.*` types are reachable that arguably
shouldn't be: **`System.Console.SetOut`** / **`Console.SetError`** could
redirect the host's console streams (requires constructing a
`System.IO.TextWriter`, which is blocked, so the practical exploit is
`Console.WriteLine` log-spam only), and **`System.Globalization.CultureInfo.DefaultThreadCurrentCulture`**
could perturb the entire process's formatting behavior (subtle but real cross-script
side effect).
The original Core.Scripting-001 finding called out the model: when an allow-listed
namespace contains dangerous types, those types must be denied type-granularly.
The new reference surface introduces several more such types and the deny-list
has not been kept in sync.
**Recommendation:** Add `System.Threading.ThreadPool` and `System.Threading.Timer`
to `ForbiddenFullTypeNames`. Add `System.Runtime.Loader` as a namespace prefix
to `ForbiddenNamespacePrefixes` (every type in `System.Runtime.Loader`
`AssemblyLoadContext`, `AssemblyDependencyResolver`, `AssemblyLoadEventArgs` — is
out of script scope). Consider adding `System.Console` to `ForbiddenFullTypeNames`
to stop log-spam through the host's console streams, and at minimum document
`CultureInfo.DefaultThreadCurrentCulture` as an accepted cross-script side
effect. Each addition must have a regression test in `ScriptSandboxTests`
mirroring the Core.Scripting-010 vector style. Update
`docs/v2/implementation/phase-7-scripting-and-alarming.md` decision #6 + the
"Sandbox escape" compliance-check row to enumerate the additions, per the
Core.Scripting-009 doc-sync convention.
**Resolution:** Resolved 2026-05-23 — added `System.Runtime.Loader` to
`ForbiddenNamespacePrefixes` (the namespace-prefix form preferred over
type-granular per the recommendation; future BCL additions to that namespace
are denied by default). Added `System.Threading.ThreadPool` and
`System.Threading.Timer` to `ForbiddenFullTypeNames` — both live in
`System.Threading` shared with allowed sync primitives so they must be
type-granular. Regression tests added to `ScriptSandboxTests`:
`Rejects_ThreadPool_QueueUserWorkItem_at_compile`,
`Rejects_Timer_new_at_compile`, `Rejects_AssemblyLoadContext_at_compile`.
`docs/v2/implementation/phase-7-scripting-and-alarming.md` decision #6 +
the Sandbox-escape compliance-check row both updated per the
Core.Scripting-009 doc-sync convention. The two lower-impact suggestions
from the recommendation (`System.Console`, `CultureInfo.DefaultThreadCurrentCulture`)
were intentionally not addressed: `Console.SetOut` requires constructing
a `System.IO.TextWriter` which is already blocked, leaving only
`Console.WriteLine` log-spam (annoyance, not a security threat); and
`CultureInfo.DefaultThreadCurrentCulture` is a cross-script side-effect
worth knowing about but doesn't escape the sandbox. Recording both as
accepted minor risks. Test totals after fix: Core.Scripting 107 green
(was 104 — +3 new rejection tests).
### Core.Scripting-013
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Security |
| Location | `ScriptEvaluator.cs:202-225` (`BuildWrapperSource`) |
| Status | Resolved |
**Description:** The synthesized wrapper pastes the user's source verbatim
between `{` and `}` braces inside a static method body, with a `#line 1`
directive and no escaping. The legacy `CSharpScript.CreateDelegate` path was
robust to this because Roslyn's scripting compiler parses script source as a
top-level statement sequence; the new hand-rolled path is parsing ordinary C# in
a method body, so a script that injects matching `{` / `}` braces can extend the
synthesized compilation unit with additional methods, classes, or `#line`
directives. For example, a script body of
`return 0; } public static int Evil() { return 0; }} public static class CompiledScript2 { public static void M() {`
ends the `Run` method early, declares a sibling `Evil` method (and even a
sibling `CompiledScript2` class) inside the synthesized namespace, then opens an
unclosed method that consumes the wrapper's trailing `}\n}`. With matching brace
counts the script parses cleanly and compiles.
`ForbiddenTypeAnalyzer` walks every descendant of every syntax tree, so any
forbidden BCL types named inside the injected methods are still caught — the
finding is **not** a direct sandbox escape. However:
- It silently relaxes the operator-visible authoring contract documented in
`docs/VirtualTags.md` ("scripts are statement bodies that end with an
explicit `return …;`") to "scripts can be any compilable C# inside the
`CompiledScript` namespace" — operators have access to features the design
did not intend to expose (local types defined as siblings of `Run`, custom
module initializers via attributes, etc.).
- A script can embed its own `#line` directives that override the
`#line 1` we emit just above the user source, producing misleading error
locations in compiler diagnostics surfaced to the operator.
- Future hardening that relies on syntactic-shape assumptions (e.g.
"every script has exactly one method") would silently fail.
- It widens the analyzer's surface: the analyzer's correctness now depends on
Pass (2) correctly walking every conceivable C# construct that can name a
type, including ones a normal script body would never contain
(`UnmanagedCallersOnly` attribute, function pointer types `delegate*<...>`,
pattern types, switch arm types, …).
**Recommendation:** Either (a) reject scripts whose parsed body contains
declarations other than statements — walk the wrapper's syntax tree after parse
and require that the only members of `CompiledScript` are the single `Run`
method, raising a `CompilationErrorException` if anything else appears — or
(b) parse the user source independently as a `BlockSyntax` and inject the
parsed block as the method body via the Roslyn syntax API, which makes
brace-mismatched / class-injecting source unparseable. Add a regression test
covering at least the brace-injection vector
(`return 0; } public static int Evil() { return 0;`).
**Resolution:** Resolved 2026-05-23 — took option (a) from the recommendation:
added an `EnforceSingleRunMember` step to `ScriptEvaluator.Compile` (runs after
`CSharpSyntaxTree.ParseText` of the synthesized wrapper, before Roslyn
compile). The check requires exactly one type declaration in the compilation
unit (the `CompiledScript` class) AND exactly one member on that class (the
`Run` method). Any deviation — a sibling class, an additional namespace, a
sibling method or nested type alongside `Run` — throws
`CompilationErrorException` with diagnostic IDs `LMX001` / `LMX002` and a
message that names Core.Scripting-013 and points at the offending span. Two
regression tests added: `Rejects_sibling_method_injection_via_balanced_braces`
(injects a sibling method via `} public static int Evil() { …`) and
`Rejects_sibling_class_injection_via_balanced_braces` (injects an entire
sibling namespace + class). Option (b) (parse the user source independently
as a `BlockSyntax` and inject via Roslyn syntax API) was considered but the
parse-and-validate approach is more readable, gives clearer error messages,
and keeps the wrapper-source generation textual.
### Core.Scripting-014
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `CompiledScriptCache.cs:91-103` (`Clear`) |
| Status | Resolved |
**Description:** `Clear()` snapshots `_cache.Keys.ToArray()` then iterates,
calling `TryRemove(key, out var lazy)` on each — the key-only overload, not
the value-scoped one used in `GetOrCompile`'s catch block. Between the
snapshot and a given `TryRemove`, a concurrent `GetOrCompile(scriptSource)`
call that hashes to the same key can re-insert a fresh `Lazy` whose `.Value`
the caller already retained. The unconditional `TryRemove` then removes that
fresh `Lazy` and `DisposeLazyIfMaterialised(lazy)` calls `Dispose()` on its
evaluator — unloading the ALC while the concurrent caller still holds a
reference to the evaluator and intends to invoke it.
This is exactly the race-window pattern the Core.Scripting-006 resolution
fixed in `GetOrCompile`'s catch block (the test
`Failed_compile_eviction_does_not_remove_a_concurrent_retry_entry` locks it
there). `Clear()` carries the same shape but uses the older, value-blind
overload, so the same race that finding-006 addresses is still latent on the
publish-replace path.
In current production wiring `Clear()` is intended for config-publish + tests
— neither overlaps steady-state evaluation under the documented design — so
the in-practice impact is low. But the cache is checked in as the
forward-looking compile cache for the engines (per `Script.SourceHash`'s docs
and the cache's own remarks); a future wiring that calls `Clear()` from
publish while evaluations are in flight would dispose live evaluators.
**Recommendation:** Replace the snapshot + `TryRemove(key, out var lazy)`
sequence with an enumeration that captures the `Lazy` reference at snapshot
time and uses the value-scoped `TryRemove(KeyValuePair<,>)` overload, mirroring
the Core.Scripting-006 fix:
```csharp
foreach (var entry in _cache.ToArray())
{
if (_cache.TryRemove(entry))
DisposeLazyIfMaterialised(entry.Value);
}
```
Add a regression test that races `GetOrCompile` against `Clear` and asserts
the caller's evaluator is still usable.
**Resolution:** Resolved 2026-05-23 — applied the recommendation verbatim:
replaced `foreach (var key in _cache.Keys.ToArray())` + key-only
`TryRemove(key, out var lazy)` with `foreach (var entry in _cache.ToArray())` +
value-scoped `TryRemove(entry)` (the `KeyValuePair<,>` overload). A concurrent
GetOrCompile re-add between the snapshot and the remove inserts a fresh Lazy
under the same key; the value-scoped comparison sees the mismatch and leaves
the fresh entry intact (instead of evicting + disposing the live evaluator
the concurrent caller still holds). Regression test
`Clear_uses_value_scoped_TryRemove_so_a_race_inserted_entry_survives` added
to `CompiledScriptCacheTests` — single-threaded simulation that snapshots
the dict, mutates the entry to a fresh Lazy mid-flight, drives the same
value-scoped TryRemove overload Clear now uses, and asserts the fresh entry
survives. The two-thread race would be flaky to model directly; the
single-threaded semantic test is sufficient because the fix is the
overload-selection itself.
### Core.Scripting-015
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `ScriptEvaluator.cs:234-270` (`ToCSharpTypeName`) |
| Status | Resolved |
**Description:** `ToCSharpTypeName` is documented to handle nested types
(`Outer+Inner``Outer.Inner`) via `Replace('+', '.')` for the
non-generic path (line 269) but the generic path (line 263-266) constructs the
name from `def.FullName!` then takes a substring up to the backtick. For a
**nested generic** type — e.g. `Outer.Inner<T>` whose `FullName` is
`Outer+Inner`1` — `Replace('+', '.')` is applied first, then `Substring(0, IndexOf('`'))`
on `"Outer.Inner`1"` produces `"Outer.Inner"`, which is correct. Good.
However, the generic branch does NOT handle the case where the OPEN generic
type itself is nested with `+` inside the parent's name when the parent is
also generic (`Outer<TOuter>.Inner<TInner>` — `FullName` is
`Outer`1+Inner`1[[TOuter,TInner]]`). For that shape `Substring(0, IndexOf('`'))`
truncates at the first backtick — yielding `"Outer.Inner"` — silently dropping
the closed type arguments of `Outer<TOuter>`. The resulting source string is
syntactically valid but semantically wrong: `global::Outer.Inner<TInner>` does
not name `Outer<TOuter>.Inner<TInner>`.
The production code never hits this shape — `TResult` is always one of
`object?`, `bool`, `int`, `double`, `string?`, `DateTime` across the
virtual-tag engine, the alarm engine, the test-harness, and the test suite,
and `ScriptGlobals<TContext>` is always a top-level generic over a top-level
`ScriptContext` subclass. The bug is latent. But it is a foot-gun for a
future caller (e.g. a Phase-8 driver that wires a context type defined as a
nested generic for grouping reasons) and the XML-doc comment claims
"handles nested types" without qualifying it.
A second smaller correctness gap on the same path: the comment claims
`global::`-qualified FQNs prevent accidental capture by the wrapper's `using`
directives, which is true for the generic / non-generic branches, but the
primitive aliases (`bool`, `int`, `string`, `object`, …) are emitted unqualified.
A script that defines a local `class bool` (now possible per Core.Scripting-013)
would shadow the alias. Probably benign, but worth a comment.
**Recommendation:** Add a check in the generic branch that walks the FullName
backtick-by-backtick — or use `INamedTypeSymbol`-style name composition from
`def.DeclaringType` recursively — so multi-arity-nested generics emit
correctly. At minimum update the XML doc to qualify "handles nested types" as
"handles single-level nesting; nested generics whose parent is itself generic
are not supported". Add a `ToCSharpTypeName` unit test (currently nothing
exercises this method directly — coverage relies on the end-to-end compile path,
so the bug surfaces only as a misleading Roslyn diagnostic).
**Resolution:** Resolved 2026-05-23 — rewrote the generic-type branch of
`ToCSharpTypeName` to walk the `FullName` segment-by-segment (split on `.`
after `+ → .` substitution). For each segment ending in `Name\`N`, the
algorithm consumes N generic arguments from `t.GetGenericArguments()` in
order and emits them as `<…>` on that segment. Nested generic-in-generic
shapes (`Outer<T>.Inner<U>`) now emit as
`global::Ns.Outer<T>.Inner<U>` (valid C#) rather than the pre-fix
`global::Ns.Outer<T, U>` (which dropped the segment boundary entirely
because `IndexOf('`')` truncated at the first backtick). No production
caller exercises this shape today (all `TContext` / `TResult` types in
the codebase are top-level non-nested), so the fix is preemptive — but
the algorithm is now correct for any future nested-generic context type.
### Core.Scripting-016
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Performance & resource management |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:74-117`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/ScriptedAlarmEngine.cs:139-182` |
| Status | Resolved |
**Description:** The Core.Scripting-008 resolution introduced
`ScriptEvaluator.IDisposable` + `CompiledScriptCache.Clear()` that disposes
each materialised evaluator before dropping its dictionary entry, so per-publish
ALC accretion is no longer process-lifetime rooted **inside the cache**. But
neither production consumer of `ScriptEvaluator` uses the cache — both
`VirtualTagEngine.Load` and `ScriptedAlarmEngine.LoadAsync` call
`ScriptEvaluator<TContext, TResult>.Compile(...)` directly (lines 105 / 160
respectively), store the evaluator inside an internal `VirtualTagState` /
`AlarmState` record, and on the next `Load` simply call `_tags.Clear()` /
`_alarms.Clear()`. The dropped `ScriptEvaluator` references never have
`Dispose()` called on them, so the underlying `ScriptAssemblyLoadContext`
instances are never `Unload()`-ed. The .NET runtime guarantees that a
collectible ALC stays alive until `Unload()` is called explicitly — having
"no strong references" is necessary but not sufficient. So the publish-replace
cycle leaks every prior generation's emitted assembly exactly as before the
fix, even though the fix's infrastructure is in place.
The Core.Scripting-008 regression tests in `CompiledScriptCacheTests`
(`Dispose_unloads_compiled_script_assembly_load_context` /
`Clear_disposes_every_materialised_evaluator`) prove the contract on
`CompiledScriptCache`, but neither engine uses that class. There is no
integration test exercising the actual publish path — i.e. that calling
`VirtualTagEngine.Load(...)` twice with different definitions makes the prior
generation's ALC eligible for GC. As a result the fix's headline guarantee
("Server restarts are no longer required to reclaim compiled-script memory" —
`docs/VirtualTags.md`) is not actually delivered to the production engines.
This is the same observable behavior the original Core.Scripting-008 finding
described, surfacing on a different code path that the resolution did not touch.
**Recommendation:** Either route the engines' compile path through
`CompiledScriptCache<TContext, TResult>` (the documented design — the cache
already returns the same evaluator instance for identical source, and its
`Clear()` now performs the right disposal — and `Script.SourceHash`'s doc-comment
already names this as the cache key), or make the engines' `Load` methods
dispose the previous `ScriptEvaluator` instances before reassigning. The
former is the cleaner change because it also collapses redundant compiles
across publishes for unchanged scripts. Add an integration test along the
lines of `CompiledScriptCacheTests.Clear_disposes_every_materialised_evaluator`
for each engine: snapshot the per-evaluator emitted assembly via
`WeakReference`, call `Load(...)` with a different definition set, and assert
the prior generation's assemblies become collectable.
**Resolution:** Resolved 2026-05-23 — took the cleaner route from the
recommendation: routed both engines' compile paths through
`CompiledScriptCache<TContext, TResult>`. `VirtualTagEngine` and
`ScriptedAlarmEngine` each gained a private `_compileCache` instance field,
their `Load`/`LoadAsync` methods now call `_compileCache.GetOrCompile(source)`
instead of `ScriptEvaluator.Compile(source)` directly, and the cache is cleared
on publish-replace alongside the existing `_tags` / `_alarms` clears so the
prior generation's ALCs are disposed before recompile. Engine `Dispose` now
also calls `_compileCache.Dispose()` so the engine-shutdown path actually
releases the emitted assemblies. **Side-fix:** discovered + fixed an
adjacent bug in `CompiledScriptCache.Dispose()` itself — it set
`_disposed = true` before calling `Clear()`, but `Clear()`'s pre-existing
`if (_disposed) return` guard then aborted the drain unconditionally, so
the Dispose-triggered cleanup was a silent no-op. Removed the disposed-guard
on `Clear()` (the operation is idempotent — clearing an empty/cleared cache
is safe). Without this side-fix the engine-Dispose path would have left
the cached evaluators rooted forever even though the call chain looked
correct. **Side-fix for ScriptedAlarmEngine.Dispose:** moved the pre-existing
"do NOT clear `_alarms` here" comment to "clear `_alarms` AFTER the drain"
because the AlarmState records hold the `TimedScriptEvaluator`/`ScriptEvaluator`
delegates that root the emitted assembly — leaving them in `_alarms` after
Dispose was the same root-the-script-forever pattern this finding is about,
just on the engine side rather than the cache side. The `_alarms` clear is
safe after the `Task.WhenAll` drain because that drain guarantees no
background callback is mid-flight. Regression tests added:
`VirtualTagEngineTests.Dispose_unloads_compiled_script_assembly` and
`ScriptedAlarmEngineTests.Dispose_unloads_compiled_predicate_assembly`
each uses `WeakReference` + bounded `GC.Collect()` to prove the emitted
assembly is reclaimable after `engine.Dispose()`. **Important test pattern
detail:** the alarms test originally failed because its helper was
`async Task<WeakReference>` — async state machines capture locals as
state-struct fields and can keep them alive past the method's apparent end.
Rewrote as a synchronous helper using `LoadAsync(...).GetAwaiter().GetResult()`
inside two cooperating `[MethodImpl(MethodImplOptions.NoInlining)]` helpers
(`CompileAlarmAndCaptureWeak` + `ExtractEmittedAssemblyWeakRef`) so the
intermediate reflection locals die when each helper returns. Test totals
after fix: Core.Scripting 104 green (unchanged); VirtualTags 57 green (was
56 — +1 unload test); ScriptedAlarms 67 green (was 66 — +1 unload test).
+44 -14
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 6 |
| Open findings | 0 |
## Checklist coverage
@@ -96,7 +96,7 @@ into `AbCipCommandBase`.
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/Commands/SubscribeCommand.cs:50-56,60-61` |
| Status | Open |
| Status | Resolved |
**Description:** The `OnDataChange` handler writes change lines to `console.Output`
(a `TextWriter`) from the driver's poll-engine callback thread, while the command's
@@ -112,7 +112,12 @@ during the watch loop widens it.
writes during the subscription with a shared lock so poll-thread and main-thread
output cannot interleave.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — moved the "Subscribed to ... Ctrl+C to stop."
banner write to BEFORE `driver.OnDataChange +=` (and therefore before
`SubscribeAsync`). With the handler not yet attached when the banner runs, the
poll thread cannot fire change events into `console.Output` concurrently with the
main-thread banner write. After `+=` the only writer to `console.Output` is the
poll-thread handler, so no interleaving is possible.
### Driver.AbCip.Cli-004
@@ -121,7 +126,7 @@ output cannot interleave.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/Commands/SubscribeCommand.cs:28,58`; `AbCipCommandBase.cs:26-34` |
| Status | Open |
| Status | Resolved |
**Description:** `--interval-ms` (`IntervalMs`) is taken verbatim and passed as
`TimeSpan.FromMilliseconds(IntervalMs)` to `SubscribeAsync` with no validation. A
@@ -135,7 +140,16 @@ downstream component to sanitise operator input is fragile. `--timeout-ms` on
`ExecuteAsync` / in `AbCipCommandBase`, throwing a `CommandException` with the
accepted range when out of bounds.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `SubscribeCommand.ValidateInterval(int)`
(throws `CommandException` for `IntervalMs <= 0`) called at the top of
`SubscribeCommand.ExecuteAsync`; moved `TimeoutMs > 0` validation into the
`AbCipCommandBase.Timeout` getter (throws `CommandException` for non-positive
`TimeoutMs`) and added a `_ = Timeout` touch in `SubscribeCommand.ExecuteAsync` to
fire that guard before the driver opens. Other commands trip the same guard
naturally via `BuildOptions(...).Timeout`. Regression tests
`AbCipCommandBaseTests.Timeout_get_throws_CommandException_when_TimeoutMs_is_non_positive`
and `SubscribeCommandIntervalTests.ValidateInterval_rejects_non_positive` cover
both paths.
### Driver.AbCip.Cli-005
@@ -144,7 +158,7 @@ accepted range when out of bounds.
| Severity | Low |
| Category | Performance & resource management |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:51-59` |
| Status | Open |
| Status | Resolved |
**Description:** `ConfigureLogging` assigns a freshly created Serilog logger to the
process-global `Log.Logger` but never calls `Log.CloseAndFlush()`. For a short-lived
@@ -159,7 +173,12 @@ module's review.)
`AppDomain.ProcessExit` or a `finally` in the command), or have the CLI use a
disposable logger scoped to `ExecuteAsync`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `Driver.Cli.Common` already exposes
`DriverCommandBase.FlushLogging()` (a `Log.CloseAndFlush()` wrapper); the AB CIP
CLI was not calling it. Added `FlushLogging()` in the `finally` block of all four
commands (`ProbeCommand`, `ReadCommand`, `WriteCommand`, `SubscribeCommand`) so
buffered Serilog output is flushed before the process exits, including the
Ctrl+C-driven `subscribe` path. No edits to `Driver.Cli.Common` were needed.
### Driver.AbCip.Cli-006
@@ -168,7 +187,7 @@ disposable logger scoped to `ExecuteAsync`.
| Severity | Low |
| Category | Design-document adherence |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/AbCipCommandBase.cs:29-34` |
| Status | Open |
| Status | Resolved |
**Description:** `AbCipCommandBase` overrides the abstract `DriverCommandBase.Timeout`
property with a getter derived from `TimeoutMs` and an empty `init` body
@@ -183,9 +202,13 @@ bug.
**Recommendation:** Either drop the `init` accessor entirely (make the override a
get-only expression-bodied property) or have the empty `init` throw
`NotSupportedException` to make the "driven by TimeoutMs" contract explicit and
fail-fast.
fail-fast. (Drop is not viable because the abstract base declares `{ get; init; }`
and an override must provide both accessors.)
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — the `init` accessor now throws
`NotSupportedException` with a message pointing the caller at `TimeoutMs`. A new
test `AbCipCommandBaseTests.Timeout_setter_is_inert_and_does_not_silently_swallow_assignments`
asserts that an object-initializer assignment to `Timeout` fails fast.
### Driver.AbCip.Cli-007
@@ -194,7 +217,7 @@ fail-fast.
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli.Tests/WriteCommandParseValueTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** The only test file covers `WriteCommand.ParseValue` and
`ReadCommand.SynthesiseTagName` — both pure static helpers. There is no coverage for
@@ -213,7 +236,12 @@ driver CLIs.
(`HostAddress`, `PlcFamily`, `DeviceName`), the supplied tag list, and the `Timeout`
derived from `TimeoutMs`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added
`tests/.../AbCipCommandBaseTests.cs` covering `BuildOptions` (probe disabled,
controller-browse disabled, alarm-projection disabled, single device with
`HostAddress` / `PlcFamily` / `cli-{Family}` device name, tag list passed verbatim,
`Timeout` derived from `TimeoutMs`) and `DriverInstanceId` (`abcip-cli-{Gateway}`),
plus the `RejectStructure` guard (throws for `Structure`, no-op for atomic types).
### Driver.AbCip.Cli-008
@@ -222,7 +250,7 @@ derived from `TimeoutMs`.
| Severity | Low |
| Category | Documentation & comments |
| Location | `docs/Driver.AbCip.Cli.md:8-9` |
| Status | Open |
| Status | Resolved |
**Description:** `docs/Driver.AbCip.Cli.md` opens with "Second of four driver
test-client CLIs (Modbus -> AB CIP -> AB Legacy -> S7 -> TwinCAT)." The count "four"
@@ -235,4 +263,6 @@ doc's "four" and the truncated chain are both stale.
and complete the chain (Modbus -> AB CIP -> AB Legacy -> S7 -> TwinCAT -> FOCAS), or
drop the explicit count and link `docs/DriverClis.md` as the authoritative roster.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — rewrote the lead paragraph to "Second of six
driver test-client CLIs (Modbus -> AB CIP -> AB Legacy -> S7 -> TwinCAT -> FOCAS)"
and added a link to `docs/DriverClis.md` as the authoritative roster.
+11 -11
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -123,13 +123,13 @@
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `AbCipDriver.cs` (whole file), `AbCipAlarmProjection.cs`, `LibplctagTagRuntime.cs` |
| Status | Open |
| Status | Resolved |
**Description:** `CLAUDE.md` Library Preferences mandate Serilog with a rolling daily file sink. The driver has no logging at all: no `ILogger`/Serilog dependency is injected or used. Failure paths instead swallow exceptions into the `_health` string (`ReadSingleAsync`, `WriteAsync`, `FetchUdtShapeAsync` catch-all, `ProbeLoopAsync` empty catch, `AbCipAlarmProjection.RunPollLoopAsync` empty catch). An operator looking at server logs sees nothing for a probe loop failing every tick for hours, a template decode that silently returned null, or an alarm poll loop throwing every interval. The health surface carries only the last error message, so a transient error immediately overwrites a more important earlier one.
**Recommendation:** Inject an `ILogger` (Serilog) and log at least device init failures, per-call read/write transport errors (debounced), probe-loop failures, template-read failures, and alarm-poll-loop exceptions. The health surface is for state, not for the audit trail.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `AbCipDriver` and `AbCipAlarmProjection` now accept an optional `ILogger<AbCipDriver>` / `ILogger` (defaulting to `NullLogger` so the existing constructor surface stays compatible). Failure paths log through it: `InitializeAsync` (`LogError` on fault), `ReadSingleAsync` / `ReadGroupAsync` / `WriteAsync` (`LogWarning` on non-zero libplctag status + transport / type-conversion exceptions, with the affected tag + device on each entry), `ProbeLoopAsync` (`LogDebug` per swallowed tick), `FetchUdtShapeAsync` (`LogWarning` on template-read failure), and `AbCipAlarmProjection.RunPollLoopAsync` (`LogDebug` on swallowed tick). Six regression tests in `AbCipLoggingTests` exercise the new logger seam.
### Driver.AbCip-008
@@ -183,13 +183,13 @@
| Severity | Low |
| Category | Error handling & resilience |
| Location | `AbCipDriver.cs:144-152`, `AbCipDriverOptions.cs:131-143` |
| Status | Open |
| Status | Resolved |
**Description:** `InitializeAsync` only starts probe loops when `_options.Probe.Enabled` is true AND `Probe.ProbeTagPath` is non-blank. When `Probe.Enabled` is true (the default) but `ProbeTagPath` is null (also the default; the doc comment says "PR 8 wires this up"), no probe runs at all and the device `HostState` stays `HostState.Unknown` forever. `GetHostStatuses()` then reports every device as Unknown indefinitely with no warning. An operator who enables the probe but does not set a probe tag gets a silently inert health surface rather than an error or a log line.
**Recommendation:** When `Probe.Enabled` is true but no `ProbeTagPath` is configured, either fail initialization with a clear message, fall back to a family-default probe tag (the doc comment stated intent), or at minimum log a warning that the probe is enabled-but-inert.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `InitializeAsync` now emits a `LogWarning` when `Probe.Enabled` is `true`, devices are configured, but `Probe.ProbeTagPath` is null/blank. The warning names the driver instance and explicitly states that no probe loops were started and `GetHostStatuses()` will report every device as `Unknown` until either a `ProbeTagPath` is set or `Probe.Enabled` is set to `false`. Initialization still succeeds (the probe is optional telemetry, not a hard requirement). Two `AbCipLoggingTests` cases cover the warn-on-enabled-but-blank and no-warn-on-disabled paths. The `AbCipProbeOptions.ProbeTagPath` doc-comment was also updated so the misconfiguration is documented in-place.
### Driver.AbCip-012
@@ -198,13 +198,13 @@
| Severity | Low |
| Category | Performance & resource management |
| Location | `LibplctagTemplateReader.cs:15-35`, `AbCipDriver.cs:88-92` |
| Status | Open |
| Status | Resolved |
**Description:** `LibplctagTemplateReader` is created per `FetchUdtShapeAsync` call, and each call constructs a fresh libplctag `Tag` for the @udt pseudo-tag, initializes it (a CIP connection handshake), reads, and disposes it. There is no reuse of the `Tag` across template reads for the same device: every UDT shape fetch pays a full connect/init cost. `AbCipTemplateCache` caches the decoded shape so this only bites on the first fetch of each type, but discovery of a UDT-heavy controller still does one connect per type. The same per-call `Tag` construction applies to `LibplctagTagEnumerator`.
**Recommendation:** Acceptable for a low-frequency discovery path, but consider pooling/reusing a single @udt-capable `Tag` per device for the duration of a discovery run, or document that the per-type connect cost is accepted.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — accepted per the recommendation's "document the per-type connect cost is accepted" branch; `AbCipTemplateCache` caches the decoded shape so only the first fetch per `(device, templateInstanceId)` pays the connect cost, and libplctag itself pools the underlying CIP connections per gateway+path so the TCP/EIP session is reused even when individual `Tag` instances are torn down. The class-level remarks on `LibplctagTemplateReader` now spell that out and call out when to revisit (telemetry showing discovery latency dominated by template-read connects).
### Driver.AbCip-013
@@ -213,13 +213,13 @@
| Severity | Low |
| Category | Design-document adherence |
| Location | `AbCipDriverOptions.cs:70-73`, `PlcFamilies/AbCipPlcFamilyProfile.cs:13-19`, `LibplctagTagRuntime.cs:16-27` |
| Status | Open |
| Status | Resolved |
**Description:** `driver-specs.md` specifies the AB CIP per-device connection settings as discrete fields: Host, Path, PlcType, TimeoutMs, AllowPacking, ConnectionSize. The implementation instead collapses host + path into a single opaque ab:// URL string and exposes `PlcFamily` (which adds GuardLogix, not in the spec table). AllowPacking and ConnectionSize from the spec are not configurable per device: `AbCipPlcFamilyProfile` hard-codes `SupportsRequestPacking` and `DefaultConnectionSize` per family, and `LibplctagTagRuntime` never passes a connection-size or packing attribute to the `Tag` (it is constructed with only Gateway/Path/PlcType/Protocol/Name/Timeout). The family profile `DefaultConnectionSize`/`SupportsRequestPacking`/`MaxFragmentBytes` fields are computed but never applied to the wire layer: dead configuration.
**Recommendation:** Either update `driver-specs.md` to describe the actual ab:// host-address model and the family-profile approach, and wire the profile ConnectionSize/packing values through to the libplctag `Tag` attributes; or expose AllowPacking/ConnectionSize as per-device options per the spec.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — took the "expose per-device options per the spec" branch. `AbCipDeviceOptions` now carries optional `AllowPacking` and `ConnectionSize` overrides (both default to `null` to inherit the family profile); `AbCipTagCreateParams` carries the resolved values; `DeviceState.BuildCreateParams` collapses every old per-call-site clone (read, write, probe, template, enumerator) into one helper that combines the per-device override with the family profile's `SupportsRequestPacking` / `DefaultConnectionSize` defaults. `LibplctagTagRuntime` now honours `AllowPacking` via the `Tag.AllowPacking` property — fixing the previously-dead family-profile setting. `ConnectionSize` is plumbed through `AbCipTagCreateParams` for forward-compat; libplctag.NET 1.5.2 has no direct `ConnectionSize` property, so an XML comment on `LibplctagTagRuntime` documents that current builds rely on the family-profile default at the wire layer until the wrapper exposes a direct property or we ship a custom tag-attribute path. `AbCipDriverFactoryExtensions` ParseOptions now reads `AllowPacking` + `ConnectionSize` from the driver-config JSON. Six regression tests in `AbCipPerDeviceConnectionOptionsTests` cover the new options.
### Driver.AbCip-014
@@ -243,10 +243,10 @@
| Severity | Low |
| Category | Documentation & comments |
| Location | `AbCipDriver.cs:9-11`, `PlcTagHandle.cs:23-27,53-58`, `AbCipTemplateCache.cs:12-15`, `IAbCipTagEnumerator.cs:6-11`, `AbCipDriverOptions.cs:21` |
| Status | Open |
| Status | Resolved |
**Description:** Numerous comments are stale relative to the commit under review. `AbCipDriver.cs:9-11` says the driver "Implements IDriver only for now" with capabilities shipping "in subsequent PRs (3-8)" while the class already implements all of them. `PlcTagHandle.cs` says the plc_tag_destroy P/Invoke "is deferred to PR 3 ... PR 2 ships the lifetime scaffold + tests only" and `ReleaseHandle` "is a no-op", which now reads as a permanent unfinished-work marker (see Driver.AbCip-006). `AbCipTemplateCache.cs:12-15` says "Template shape read ... lands with PR 6 ... no reader writes to it yet" while `CipTemplateObjectDecoder` and `LibplctagTemplateReader` both exist and `FetchUdtShapeAsync` writes to the cache. `IAbCipTagEnumerator.cs:6-11` says the enumerator "Defaults to EmptyAbCipTagEnumeratorFactory" while the production default is `LibplctagTagEnumeratorFactory`. `AbCipDriverOptions.cs:21` says "AB discovery lands in PR 5", already shipped. `StyleGuide.md` explicitly says not to leave stale coming-soon notes.
**Recommendation:** Sweep the module for PR-N forward references and "lands in PR X" notes that have been delivered; update them to describe present behavior. Where a comment marks genuinely unfinished work (e.g. `PlcTagHandle.ReleaseHandle`), convert it to a tracked TODO with an issue reference rather than a PR-number milestone.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — swept the module for stale PR-N forward references and replaced each with a description of present behaviour: `AbCipDriver.TemplateCache` summary, `AbCipDataType.cs` (PR 5 / PR 6 → references `CipTemplateObjectDecoder` + `AbCipTemplateCache`), `AbCipTagPath.cs` (PR 6 → references `AbCipTemplateCache`), `AbCipTemplateCache.cs` (the "lands with PR 6" remarks and the `AbCipUdtShape` summary), `IAbCipTagEnumerator.cs` (the `EmptyAbCipTagEnumeratorFactory`-defaults claim and the PR-5 stub line; `EmptyAbCipTagEnumerator` summary), `LibplctagTagEnumerator.cs` ("Task #178 closed the stub gap from PR 5"), `LibplctagTagRuntime.cs` (`Whole-UDT writes land in PR 6`), `AbCipDriverOptions.cs` (`Tags` summary, `ProbeTagPath` summary), and `AbCipPlcFamilyProfile.cs` ("Family-specific wire tests ship in PRs 912"). `PlcTagHandle.cs` was already deleted as part of Driver.AbCip-006's resolution. The only remaining "lands in" reference is the `AbCipDataType.Dt``Date/Time` mapping, which is product-domain wording, not a PR reference.
+43 -13
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 6 |
| Open findings | 0 |
## Checklist coverage
@@ -68,7 +68,7 @@ type (mirroring the existing `Bit` and unsupported-type branches). Either catch
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `Commands/WriteCommand.cs:27-29`, `Program.cs:6-9` |
| Status | Open |
| Status | Resolved |
**Description:** The `--value` option help text states "booleans accept
true/false/1/0", but `ParseBool` (`WriteCommand.cs:74-80`) and the error message
@@ -82,7 +82,10 @@ with both the code and the design doc.
matching the wording used elsewhere (e.g. "booleans accept
true/false, 1/0, on/off, yes/no").
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — updated `WriteCommand.Value` description to
list the full alias set: "booleans accept true/false, 1/0, on/off, yes/no".
Regression test `CommandMetadataTests.WriteCommand_value_help_lists_full_boolean_alias_set`
asserts the description contains every alias group.
### Driver.AbLegacy.Cli-003
@@ -91,7 +94,7 @@ true/false, 1/0, on/off, yes/no").
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `Commands/SubscribeCommand.cs:47-53` |
| Status | Open |
| Status | Resolved |
**Description:** The `OnDataChange` handler calls `console.Output.WriteLine(line)`
(the synchronous overload) directly from the `PollGroupEngine` poll thread. The
@@ -109,7 +112,10 @@ change events through a `Channel<string>` drained by a single consumer task, or
guard the `WriteLine` with a lock. At minimum, document that the interleaving is
accepted because output is human-facing and line-buffered.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `SubscribeCommand` now serialises every
console write through a shared `consoleGate` lock: the poll-thread
`OnDataChange` callback and the command-thread "Subscribed to ..." line both take
the lock before calling `WriteLine`. Comment in the source documents the intent.
### Driver.AbLegacy.Cli-004
@@ -118,7 +124,7 @@ accepted because output is human-facing and line-buffered.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `Commands/ProbeCommand.cs:37-56`, `Commands/ReadCommand.cs:39-50`, `Commands/WriteCommand.cs:48-59`, `Commands/SubscribeCommand.cs:41-76` |
| Status | Open |
| Status | Resolved |
**Description:** Every command does `await using var driver = new AbLegacyDriver(...)`
*and* an explicit `await driver.ShutdownAsync(...)` in the `finally`. `AbLegacyDriver`
@@ -135,7 +141,12 @@ cleanup on every exit path including exceptions.
since the commands deliberately pass `CancellationToken.None` to shutdown so teardown
is not cut short by a cancelled `ct`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — replaced `await using var driver` with a
plain `var driver` in all four commands (`ProbeCommand`, `ReadCommand`,
`WriteCommand`, `SubscribeCommand`), keeping the explicit
`finally { await driver.ShutdownAsync(CancellationToken.None) }` as the single
teardown path. Comment in each command documents the intent so readers do not
have to verify idempotency.
### Driver.AbLegacy.Cli-005
@@ -144,7 +155,7 @@ is not cut short by a cancelled `ct`.
| Severity | Low |
| Category | Design-document adherence |
| Location | `Commands/SubscribeCommand.cs:23-25`, `docs/Driver.AbLegacy.Cli.md:94-96` |
| Status | Open |
| Status | Resolved |
**Description:** The subscribe command interval option is `--interval-ms`
(default 1000). `docs/Driver.AbLegacy.Cli.md` shows the subscribe example as
@@ -160,7 +171,13 @@ but the documented contract drifts between the two CLIs.
`--interval-ms` description for parity with the AbCip CLI, and mention the
`--interval-ms` long form + 1000 ms default in `docs/Driver.AbLegacy.Cli.md`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — extended the `SubscribeCommand.IntervalMs`
help text to match AbCip ("Publishing interval in milliseconds (default 1000).
PollGroupEngine floors sub-250ms values.") and added a paragraph under the
`subscribe` section in `docs/Driver.AbLegacy.Cli.md` naming the `-i` /
`--interval-ms` long form, the 1000 ms default, and the 250 ms floor. Regression
test `CommandMetadataTests.SubscribeCommand_interval_ms_help_notes_PollGroupEngine_floor`
asserts the description mentions "250".
### Driver.AbLegacy.Cli-006
@@ -169,7 +186,7 @@ but the documented contract drifts between the two CLIs.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `Commands/ProbeCommand.cs:20-22` |
| Status | Open |
| Status | Resolved |
**Description:** `ProbeCommand` declares its `--type` option with no short alias,
while `ReadCommand`, `WriteCommand`, and `SubscribeCommand` all declare `--type`
@@ -182,7 +199,13 @@ it silently rejected on `probe`.
for consistency with the other three commands. (The AbCip CLI `ProbeCommand` has
the same omission, so a cross-CLI sweep is worthwhile.)
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added the `'t'` short alias to
`ProbeCommand.DataType`. Regression test
`CommandMetadataTests.ProbeCommand_type_has_short_alias_t` (plus the parity test
`Other_commands_keep_type_short_alias_t` for read/write/subscribe) asserts the
short alias is present on every command. The same omission still exists in the
AbCip CLI's `ProbeCommand` — flagged as a sibling sweep but out of scope for
this module.
### Driver.AbLegacy.Cli-007
@@ -191,7 +214,7 @@ the same omission, so a cross-CLI sweep is worthwhile.)
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli.Tests/WriteCommandParseValueTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** The only test file in the CLI test project covers
`WriteCommand.ParseValue` and `ReadCommand.SynthesiseTagName`. Two behaviours that
@@ -210,4 +233,11 @@ Driver.AbLegacy.Cli-001. `BuildOptions` is reachable via `InternalsVisibleTo`
tag passthrough) and an overflow-input test for `ParseValue` so the fix for
Driver.AbLegacy.Cli-001 is locked in by a regression test.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `BuildOptionsTests` (five tests:
probe disabled, device shape from Gateway+PlcType, tag passthrough, timeout
propagation, empty tag list) covering `AbLegacyCommandBase.BuildOptions` via a
nested `TestCommand` subclass annotated with `[Command]` to satisfy the CliFx
analyzer. The overflow path for `ParseValue` is already covered by
`WriteCommandParseValueTests.ParseValue_out_of_range_throws_CommandException`
(theory with `short.Parse` + `AnalogInt` overflow inputs), added when finding
Driver.AbLegacy.Cli-001 was resolved.
+38 -7
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 3 |
| Open findings | 0 |
## Checklist coverage
@@ -141,7 +141,7 @@ decode the full 16-bit word and test bit 0.
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `AbLegacyDriver.cs` (whole file) |
| Status | Open |
| Status | Resolved |
**Description:** The driver uses no `ILogger`/Serilog at all. Probe-loop failures,
runtime initialisation failures, libplctag non-zero statuses, and read/write
@@ -155,7 +155,16 @@ string that the next read or write immediately clobbers.
log probe transitions, runtime-init failures, and the first occurrence of a non-zero
libplctag status per device.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `AbLegacyDriver` now accepts an optional
`ILogger<AbLegacyDriver>` (falls back to `NullLogger`), mirroring the Modbus / S7 /
Galaxy driver pattern. `InitializeAsync` catch-path logs the init failure at Error
level; `TransitionDeviceState` logs every probe transition (Warning on downgrade to
Stopped, Information on recovery); `ReadAsync` logs the first non-zero libplctag
status per device at Warning level via a re-armable `DeviceState.FirstNonZeroStatusLogged`
latch so a permanently-bad PLC doesn't flood the rolling file. `AbLegacyDriverFactoryExtensions.Register`
gains an optional `ILoggerFactory` parameter so the Server bootstrap can wire DI
logging when it chooses; the legacy single-arg `CreateInstance` overload stays for
back-compat. Regression coverage in `AbLegacyLoggerInjectionTests`.
### Driver.AbLegacy-006
@@ -293,7 +302,7 @@ into a real PCCC-STS path or delete it as dead code. The same defect exists in
| Severity | Low |
| Category | Performance & resource management |
| Location | `AbLegacyDriver.cs:440` |
| Status | Open |
| Status | Resolved |
**Description:** `Dispose()` is implemented as
`DisposeAsync().AsTask().GetAwaiter().GetResult()` - sync-over-async. `ShutdownAsync`
@@ -306,7 +315,16 @@ single-threaded synchronization context.
must exist, perform the synchronous teardown directly (cancel CTSs, dispose runtimes)
rather than blocking on the async path.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `Dispose()` now performs the synchronous
teardown directly (cancel probe CTSs, dispose runtimes, clear maps) rather than
wrapping `DisposeAsync().AsTask().GetAwaiter().GetResult()`. The poll engine's
`DisposeAsync` is drained with `.ConfigureAwait(false).GetAwaiter().GetResult()` so a
captured single-threaded `SynchronizationContext` can never be the resumption target —
the classic sync-over-async deadlock is structurally ruled out. Regression test
`Dispose_under_single_threaded_sync_context_does_not_deadlock` drives the path
through a cooperative single-threaded `SynchronizationContext` with a 2s pump timeout;
`Dispose_runs_teardown_without_blocking_on_async_wait` and `Dispose_is_idempotent`
cover the cleanup invariants.
### Driver.AbLegacy-012
@@ -345,7 +363,7 @@ unused fields and the doc comments that imply they are load-bearing.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `AbLegacyDriver.cs:340-345`, `AbLegacyDriver.cs:238-264` |
| Status | Open |
| Status | Resolved |
**Description:** Two minor organisational issues:
1. `ResolveHost` returns `_options.Devices.FirstOrDefault()?.HostAddress ??
@@ -362,4 +380,17 @@ unused fields and the doc comments that imply they are load-bearing.
document why falling back to the instance id is acceptable. For (2), record the
array-addressing gap as a tracked follow-up.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 —
(1) `ResolveHost` carries a new XML-doc block that documents the three-step fallback
chain (known tag → first device → `DriverInstanceId`) and explicitly cites the
`IPerCallHostResolver` contract which requires implementations to return the driver's
default-host string rather than throw on an unknown reference. The instance-id
fallback is therefore the documented single-host behaviour, not a leaky fake. Three
regression tests in `AbLegacyDisposeAndResolveHostTests` pin each branch of the chain
(`ResolveHost_known_reference_returns_tag_device`,
`ResolveHost_unknown_reference_with_devices_returns_first_device`,
`ResolveHost_unknown_reference_no_devices_returns_driver_instance_id`).
(2) `DiscoverAsync` now carries an inline tracked-follow-up comment that calls out
the PCCC-file-as-array gap, notes the consistency with the PR-staged scope in
`docs/v2/driver-specs.md`, and points to the Modbus `ArrayCount` flow as the pattern
to mirror when multi-element addressing lands.
+176 -9
View File
@@ -4,10 +4,10 @@
|---|---|
| Module | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common` |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Review date | 2026-05-23 |
| Commit reviewed | `a9be809` |
| Status | Reviewed |
| Open findings | 2 |
| Open findings | 0 |
## Checklist coverage
@@ -16,7 +16,7 @@ a category produced nothing rather than leaving it blank.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Driver.Cli.Common-001, Driver.Cli.Common-002 |
| 1 | Correctness & logic bugs | Driver.Cli.Common-001, Driver.Cli.Common-002, Driver.Cli.Common-007 |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | Driver.Cli.Common-003 |
| 4 | Error handling & resilience | Driver.Cli.Common-004 |
@@ -24,9 +24,47 @@ a category produced nothing rather than leaving it blank.
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Driver.Cli.Common-005 |
| 9 | Testing coverage | Driver.Cli.Common-005, Driver.Cli.Common-008 |
| 10 | Documentation & comments | Driver.Cli.Common-006 |
## Re-review 2026-05-23 (commit `a9be809`)
Delta scope: commit `5a9c459` extends the `FormatStatus` shortlist with five
`Bad*` codes (`BadInternalError` 0x80020000, `BadNotWritable` 0x803B0000,
`BadOutOfRange` 0x803C0000, `BadNotSupported` 0x803D0000, `BadDeviceFailure`
0x80550000) the FOCAS / AbCip / AbLegacy native-protocol mappers emit. Tests
extended with parallel `[InlineData]` rows on the well-known Theory plus a new
`FormatStatus_names_native_driver_emitted_codes` Theory.
Cross-checked the five new hex literals against the OPC Foundation
`Opc.Ua.StatusCodes` table via DeepWiki:
| Name added | Code in shortlist | Spec value | Verdict |
|---|---|---|---|
| `BadInternalError` | `0x80020000` | `0x80020000` | Correct |
| `BadNotWritable` | `0x803B0000` | `0x803B0000` | Correct |
| `BadOutOfRange` | `0x803C0000` | `0x803C0000` | Correct |
| `BadNotSupported` | `0x803D0000` | `0x803D0000` | Correct |
| `BadDeviceFailure` | `0x80550000` | **`0x808B0000`** | **WRONG — `0x80550000` is `BadSecurityPolicyRejected`** |
The `BadDeviceFailure` mismapping is the same shape of bug as the original
Driver.Cli.Common-001 (wrong hex literal copied into the shortlist); recorded
as Driver.Cli.Common-007. The wrong constant also lives in
`FocasStatusMapper.cs`, `AbCipStatusMapper.cs`, `AbLegacyStatusMapper.cs`,
`TwinCATStatusMapper.cs`, `S7Driver.cs`, and `ModbusDriver.cs` — those are in
other modules' review scope but are noted here so future re-reviewers know
this isn't isolated. (`StatusCodeMap.cs` in Driver.Galaxy + the Wonderware
historian mappers use the correct `0x808B0000`, confirming the discrepancy.)
Testing observation: the new `FormatStatus_names_native_driver_emitted_codes`
Theory is fully redundant with the well-known Theory (the five rows were also
added there in the same commit) and uses `ShouldContain` rather than
`ShouldBe` — recorded as Driver.Cli.Common-008.
Other categories (concurrency, security, performance, design-doc adherence,
code organisation, documentation) are unchanged by this delta — no new
issues found.
## Findings
### Driver.Cli.Common-001
@@ -130,7 +168,7 @@ dispose the previous logger if reconfiguration is genuinely intended.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:68-70` |
| Status | Open |
| Status | Resolved |
**Description:** `FormatTable` calls `rows.Max(r => r.Tag.Length)` (and the same for the
value and status columns) without guarding against empty input. When `tagNames` and
@@ -143,7 +181,13 @@ instead of producing an empty (header-only) table.
separator, or an explicit "no rows" line), or use `DefaultIfEmpty(0).Max(...)` for the
width computations.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `FormatTable` guards each `rows.Max(...)` width
computation with a `rows.Length == 0 ? "<HEADER>".Length : Math.Max(...)` ternary, so
an empty batch read returns the header + separator rows (no data rows) instead of
throwing `InvalidOperationException`. The fix was landed in commit `1433a1c` alongside
the -002 work, and the regression test
`SnapshotFormatterTests.FormatTable_with_empty_input_returns_header_only` (added under
-005) exercises it.
### Driver.Cli.Common-005
@@ -178,7 +222,7 @@ empty-input and `DriverCommandBase` level-selection tests.
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:71`, `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:9` |
| Status | Open |
| Status | Resolved |
**Description:** Two minor doc inaccuracies. (1) The comment at `SnapshotFormatter.cs:71`
states the "source-time column is fixed-width (ISO-8601 to ms) so no max-measurement
@@ -194,4 +238,127 @@ library. The XML doc is stale relative to the shipped driver-CLI set.
right-most and intentionally unpadded rather than claiming fixed width. Add FOCAS to the
`DriverCommandBase` class-summary driver list.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — (1) `SnapshotFormatter.cs:71` comment reworded
to state the source-time column is the right-most one and intentionally not
measured/padded, calling out the null-timestamp `"-"` case explicitly. (2) FOCAS was
added to the `DriverCommandBase` class-summary driver enumeration in commit `7ff356b`
(landed alongside the -003 work).
### Driver.Cli.Common-007
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:129` |
| Status | Resolved |
**Description:** Commit `5a9c459` added `0x80550000u => "BadDeviceFailure"` to the
`FormatStatus` shortlist, but `0x80550000` is the canonical OPC UA spec value for
`BadSecurityPolicyRejected`, not `BadDeviceFailure`. The correct spec value for
`BadDeviceFailure` is `0x808B0000` (verified against the OPC Foundation
`Opc.Ua.StatusCodes` table via DeepWiki; corroborated locally by
`src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/StatusCodeMap.cs:40`
(`BadDeviceFailure = 0x808B0000u`) and the two Wonderware historian quality mappers,
which all hand-pin the correct value).
This is the same shape of bug Driver.Cli.Common-001 closed: a wrong hex literal
in the shortlist that the test theory (`SnapshotFormatterTests.cs:42`) blindly
asserts against the same wrong value, so the bug is invisible to CI.
Practical impact, two-sided:
1. A driver that returns the real spec `BadDeviceFailure` (`0x808B0000`) — e.g.
the `Driver.Galaxy.StatusCodeMap` path on a deploy-time device fault — falls
through the named shortlist entirely. Since Driver.Cli.Common-002 added the
severity-class fallback, it now renders as `0x808B0000 (Bad)` instead of
`0x808B0000 (BadDeviceFailure)` — operators lose the specific class label
`docs/Driver.FOCAS.Cli.md:153` tells them to read off the output.
2. A driver that returns `0x80550000` (which `FocasStatusMapper`, `AbCipStatusMapper`,
`AbLegacyStatusMapper`, `TwinCATStatusMapper`, `S7Driver`, and `ModbusDriver` all
misuse as "BadDeviceFailure") now renders as `0x80550000 (BadDeviceFailure)`
matching driver intent but contradicting the OPC UA spec, which says any client
that decodes the same payload using the OPC Foundation stack will see
`BadSecurityPolicyRejected`. A security-monitoring tool keying on
`BadSecurityPolicyRejected` will fire on a CPU fault, while real
`BadSecurityPolicyRejected` returns from the secure-channel layer would be
mislabelled as a device fault. Operator-facing CLI output and machine-readable
status semantics disagree.
The deeper bug is the wrong constant in the native-protocol mappers (out of scope
for this module), but the `SnapshotFormatter` shortlist is its own
spec-authoritative reference point — Driver.Cli.Common-001 explicitly framed the
shortlist as canonical, with the in-line "keep [these literals] in sync with [the
Opc.Ua.StatusCodes] table" comment at `SnapshotFormatter.cs:112-113`. That
contract is now broken.
**Recommendation:** Change line 129 to `0x808B0000u => "BadDeviceFailure"`. Update
the matching `[InlineData]` rows in `SnapshotFormatterTests.cs` (line 42 in the
well-known Theory; line 60 in the redundant Theory — see Driver.Cli.Common-008).
Also note in the resolution that the native-protocol mappers (FOCAS / AbCip /
AbLegacy / TwinCAT / S7 / Modbus) need the same fix recorded against their own
module reviews — the constant `0x80550000` should be replaced with `0x808B0000`
everywhere it claims to mean `BadDeviceFailure`. Consider Driver.Cli.Common-001's
original recommendation again: add a CI test that cross-checks every shortlist
entry against `Opc.Ua.StatusCodes` reflection so this class of bug stops
recurring.
**Resolution:** Resolved 2026-05-23 — corrected `SnapshotFormatter.FormatStatus`
to map `0x808B0000u => "BadDeviceFailure"` (was `0x80550000u`). Updated the
`InlineData` row in the well-known Theory accordingly; the redundant native-
emitted Theory was deleted entirely per Driver.Cli.Common-008. Added a regression
row to `FormatStatus_does_not_apply_pre_fix_wrong_names` pinning that
`0x80550000` no longer renders as `BadDeviceFailure` (mirroring the
Driver.Cli.Common-001 wrong-name guards). The underlying constant was also
corrected in all six native-protocol mappers as part of the same commit:
`FocasStatusMapper.BadDeviceFailure`, `AbCipStatusMapper.BadDeviceFailure`,
`AbLegacyStatusMapper.BadDeviceFailure`, `TwinCATStatusMapper.BadDeviceFailure`,
`ModbusDriver.StatusBadDeviceFailure`, `S7Driver.StatusBadDeviceFailure` — all
moved from `0x80550000u` to `0x808B0000u`. The three downstream Modbus tests
(`ModbusExceptionMapperTests` 3 InlineData rows + 1 ShouldBe assertion;
`ExceptionInjectionTests.StatusBadDeviceFailure` constant) updated to expect
the corrected code. **Behavior change:** OPC UA clients consuming the native
drivers now see the canonical `BadDeviceFailure` (0x808B0000) instead of the
misnamed `BadSecurityPolicyRejected` (0x80550000) on device-fault paths —
operator-facing CLI output and machine-readable status semantics now agree.
Suite totals after fix: Driver.Cli.Common.Tests 43 green (was 48 — minus 5
redundant rows); Modbus.Tests 263; AbCip.Tests 262; AbLegacy.Tests 157;
FOCAS.Tests 178; S7.Tests 112; TwinCAT.Tests 131; all green. The Opc.Ua.StatusCodes
cross-check the recommendation suggested is recorded as a follow-up worth
considering but is out of scope for this fix.
### Driver.Cli.Common-008
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests/SnapshotFormatterTests.cs:50-64` |
| Status | Resolved |
**Description:** Commit `5a9c459` adds a new
`FormatStatus_names_native_driver_emitted_codes` `[Theory]` whose five
`[InlineData]` rows are identical to five rows added to the existing
`FormatStatus_names_well_known_status_codes` `[Theory]` in the same commit
(lines 32, 39, 40, 41, 42). The new Theory therefore adds no coverage. It is
also weaker than the Theory it duplicates: it asserts
`output.ShouldContain($"({expectedName})")` (substring match) where the
well-known Theory asserts `output.ShouldBe($"0x{status:X8} ({expectedName})")`
(exact match including the hex prefix). The substring form would not catch a
regression where the hex literal renders wrong but the name is correct.
This is not a correctness problem — both Theories pass — but it's a
copy-paste inconsistency that costs maintainer attention every time someone
reads the test file and wonders which Theory is authoritative.
**Recommendation:** Either (a) delete the new Theory entirely — its five rows
are already covered by the well-known Theory in the same commit — or (b) keep
it but switch to `ShouldBe($"0x{status:X8} ({expectedName})")` so its
assertion strength matches the rest of the file. Option (a) is cleaner: the
commit's "operator workflow" intent is documented well enough in the
well-known Theory comment block; the redundant Theory is dead weight.
**Resolution:** Resolved 2026-05-23 — took option (a): deleted the
`FormatStatus_names_native_driver_emitted_codes` Theory entirely. Its five
`InlineData` rows are covered by the well-known Theory's `ShouldBe` (strict
exact-match assertion), which is the authoritative shortlist test. Landed
alongside the Driver.Cli.Common-007 fix in the same commit.
+59 -11
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -45,7 +45,7 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `Commands/WriteCommand.cs:58-68` |
| Status | Open |
| Status | Resolved |
**Description:** `WriteCommand.ParseValue` parses the numeric `--value` types
(`Byte`/`Int16`/`Int32`/`Float32`/`Float64`) with `sbyte.Parse` / `short.Parse`
@@ -65,7 +65,16 @@ literal — consistent with how `ParseBool` already handles bad boolean input.
The same pattern exists in the sibling S7 CLI; a shared helper in
`Driver.Cli.Common` would fix both.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — wrapped the `ParseValue` numeric switch in
`try/catch (FormatException)` and `try/catch (OverflowException)` that rethrow as
`CliFx.Exceptions.CommandException` with a message naming the `--type` and the
offending value, mirroring the friendly text the `Bit` path already produced.
Added `WriteCommandParseValueTests` with [Theory] cases covering non-numeric
input across `Byte`/`Int16`/`Int32`/`Float32`/`Float64`, overflow edges
(sbyte ±1, short max+1, > int.MaxValue), and an assertion that the exception
message names both the type and the offending value. A shared `Driver.Cli.Common`
helper is the cleaner long-term fix (cross-CLI duplication remains) but is left
to the Driver.Cli.Common review per this module's edit scope.
### Driver.FOCAS.Cli-002
@@ -74,7 +83,7 @@ The same pattern exists in the sibling S7 CLI; a shared helper in
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `Commands/SubscribeCommand.cs:45-51` |
| Status | Open |
| Status | Resolved |
**Description:** The `subscribe` command attaches an `OnDataChange` handler that
calls the synchronous `console.Output.WriteLine`. `OnDataChange` is raised from
@@ -93,7 +102,15 @@ console writes with a lock shared between the banner and the handler. Optionally
detach the handler in the `finally` block before `ShutdownAsync` for symmetry
with the `handle` teardown already present there.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — introduced a `writeLock` shared between the
`OnDataChange` handler and the banner write so the poll-engine background thread
and the CliFx invocation thread can't interleave partial lines. Added an
explanatory comment above the handler explaining the CliFx-`IConsole` rationale
and the synchronous-on-background-thread design — mirroring the Modbus / S7
copies of this command. Also added a try/catch around the handler body so a
transient stdout error cannot tear down the poll loop, and Serilog-warn-logs the
swallowed exception. Added `SubscribeCommandConsoleHandlerTests` to guard the
`writeLock` + CliFx-`IConsole` rationale against future copy-paste regressions.
### Driver.FOCAS.Cli-003
@@ -102,7 +119,7 @@ with the `handle` teardown already present there.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `FocasCommandBase.cs:19` (`CncPort`), `FocasCommandBase.cs:27` (`TimeoutMs`), `Commands/SubscribeCommand.cs:23` (`IntervalMs`) |
| Status | Open |
| Status | Resolved |
**Description:** The numeric command options `--cnc-port`, `--timeout-ms`, and
`--interval-ms` are accepted without range validation. A zero or negative
@@ -120,7 +137,17 @@ timeout and interval strictly positive. The same gap exists across the sibling
driver CLIs, so a shared validation helper in `Driver.Cli.Common` is the
cleaner fix.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added a protected `ValidateOptions(int?
intervalMs = null)` helper on `FocasCommandBase` that rejects `--cnc-port`
outside `1..65535`, non-positive `--timeout-ms`, and non-positive
`--interval-ms` (when the caller passes one) with a `CliFx.Exceptions.CommandException`
naming the option and the rejected value. `ProbeCommand` / `ReadCommand` /
`WriteCommand` call `ValidateOptions()` without an interval, `SubscribeCommand`
calls `ValidateOptions(IntervalMs)`. Added `FocasCommandBaseValidationTests`
covering accept-defaults, reject out-of-range port (0, -1, 65536), reject
non-positive timeout / interval, and skip-interval-when-omitted. A shared
helper in `Driver.Cli.Common` is the cleaner cross-CLI fix and is recorded
against that module's review.
### Driver.FOCAS.Cli-004
@@ -129,7 +156,7 @@ cleaner fix.
| Severity | Low |
| Category | Performance & resource management |
| Location | `Commands/ProbeCommand.cs:37,54`; `Commands/ReadCommand.cs:37,46`; `Commands/WriteCommand.cs:45,54`; `Commands/SubscribeCommand.cs:39,73` |
| Status | Open |
| Status | Resolved |
**Description:** Every command declares `await using var driver = new FocasDriver(...)`
**and** explicitly calls `await driver.ShutdownAsync(CancellationToken.None)` in
@@ -144,7 +171,14 @@ dead weight and obscures intent: a reader cannot tell whether the explicit
and rely on `await using` for disposal, or drop `await using` and keep the
explicit teardown — but not both. The same redundancy exists in the sibling CLIs.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — dropped the explicit
`await driver.ShutdownAsync(CancellationToken.None)` calls from the `finally`
blocks of `ProbeCommand`, `ReadCommand`, `WriteCommand`, and `SubscribeCommand`;
`await using` is now the sole driver-disposal mechanism per command
(`FocasDriver.DisposeAsync` itself runs `ShutdownAsync`). The subscribe command
keeps `UnsubscribeAsync` in its finally because that is a subscription-lifecycle
concern, not driver disposal. Added `CommandDisposalConventionsTests` to guard
the source-level convention against regression.
### Driver.FOCAS.Cli-005
@@ -153,7 +187,7 @@ explicit teardown — but not both. The same redundancy exists in the sibling CL
| Severity | Low |
| Category | Design-document adherence |
| Location | `Commands/WriteCommand.cs:50`, `Commands/ProbeCommand.cs:50` (via `SnapshotFormatter.FormatStatus`) |
| Status | Open |
| Status | Resolved |
**Description:** `docs/Driver.FOCAS.Cli.md` documents `BadDeviceFailure` and
`BadCommunicationError` as the key diagnostic signals an operator reads off
@@ -180,4 +214,18 @@ actually emit — at minimum `BadNotWritable`, `BadOutOfRange`, `BadNotSupported
because the gap defeats this module documented `probe`/`write` diagnostic
workflow; cross-reference the `Driver.Cli.Common` review.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — the cross-CLI fix landed in `Driver.Cli.Common`:
`SnapshotFormatter.FormatStatus` now names `BadInternalError` (0x80020000),
`BadNotWritable` (0x803B0000), `BadOutOfRange` (0x803C0000), `BadNotSupported`
(0x803D0000), and `BadDeviceFailure` (0x80550000) — the five codes the FOCAS /
AbCip / AbLegacy native-protocol mappers all emit but the shortlist previously
left unnamed (the canonical `BadTimeout` 0x800A0000 was already added under
Driver.Cli.Common-001). FOCAS `probe` / `write` against a non-writable parameter,
out-of-range address, unsupported function, busy device, or CNC-handle failure
now renders with the named status the `docs/Driver.FOCAS.Cli.md` workflow
promises, restoring parity between the docs and the shipped behaviour. Regression
`[Theory]` `FormatStatus_names_native_driver_emitted_codes` added to
`SnapshotFormatterTests` so the five names can't silently drop out of the
shortlist again; the existing well-known shortlist `[Theory]` was extended with
the same five entries to enforce the exact `0x... (Name)` rendering. Suite now
47 green (was 42).
+11 -11
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -200,7 +200,7 @@ stale object.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `FocasDriver.cs:140-148`, `FocasDriver.cs:478-484`, `FocasDriver.cs:529-533`, `FocasAlarmProjection.cs:61-63` |
| Status | Open |
| Status | Resolved |
**Description:** Numerous `try { ... } catch {}` blocks swallow every exception with no
logging - `ShutdownAsync` (CTS cancel/dispose), `RecycleLoopAsync` (`DisposeClient`),
@@ -215,7 +215,7 @@ solely on `GetHealth()`.
poll/probe/recycle loops at `Debug`/`Warning`. Pass a logger into `FocasWireClient` so
the per-response `Debug` entries it already emits are actually captured.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `FocasDriver` now takes an optional `ILogger<FocasDriver>` (defaulting to `NullLogger`) and every previously-empty `catch { }` in `ShutdownAsync` / `ProbeLoopAsync` / `FixedTreeLoopAsync` / `RecycleLoopAsync` / `ReadActiveAlarmsAcrossDevicesAsync` now logs at `Debug` with the host address + context. `FocasAlarmProjection` also accepts an optional `ILogger` (forwarded by the driver) so its unsubscribe / dispose / per-tick poll swallows log. `WireFocasClientFactory` gained a logger-accepting overload that threads through to `FocasWireClient`, so its per-response `Debug` entries actually reach the host pipeline.
### Driver.FOCAS-008
@@ -224,7 +224,7 @@ the per-response `Debug` entries it already emits are actually captured.
| Severity | Low |
| Category | Performance & resource management |
| Location | `FocasDriver.cs:201`, `FocasDriver.cs:253` |
| Status | Open |
| Status | Resolved |
**Description:** `ReadAsync` and `WriteAsync` call `FocasAddress.TryParse(def.Address)`
on every operation, even though `InitializeAsync` already parsed and validated every
@@ -235,7 +235,7 @@ re-parses and allocates a `FocasAddress` record per tag per tick unnecessarily.
parsed `FocasAddress` on `FocasTagDefinition` (or in a side dictionary), so the runtime
read/write paths use the cached value.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `FocasDriver` now holds a `_parsedAddressesByTagName` side dictionary populated at `InitializeAsync`. `ReadAsync` and `WriteAsync` look up the cached `FocasAddress` instance; the defensive fallback `TryParse` only fires if a tag was somehow not seeded. The cache is cleared on `ShutdownAsync`. Regression test `ReadAsync_uses_cached_FocasAddress_when_tag_definition_has_a_malformed_address_after_init` (and the matching `WriteAsync` variant) asserts the same `FocasAddress` instance is reused across calls.
### Driver.FOCAS-009
@@ -244,7 +244,7 @@ read/write paths use the cached value.
| Severity | Low |
| Category | Design-document adherence |
| Location | `FocasDriverOptions.cs:110-115`, `FocasDriver.cs:468-486`, `FocasDriverFactoryExtensions.cs:75-80` |
| Status | Open |
| Status | Resolved |
**Description:** `FocasProbeOptions.Timeout` is parsed by the factory
(`FocasProbeDto.TimeoutMs` to `FocasProbeOptions.Timeout`) but never consumed.
@@ -257,7 +257,7 @@ until the OS TCP timeout rather than the configured `Probe.Timeout`.
around the `ProbeAsync` call, or remove the dead `Timeout` field from
`FocasProbeOptions` / `FocasProbeDto` if it is genuinely not intended.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `FocasDriver.ProbeLoopAsync` now wraps `client.ProbeAsync` in a linked `CancellationTokenSource` that fires after `Probe.Timeout` (skipped when the timeout is `<= TimeSpan.Zero`). On timeout the loop logs the cancellation at Debug and surfaces it as a failed probe, so a hung CNC socket transitions the host to `Stopped` at the configured budget instead of blocking on the OS TCP timeout. Regression test `ProbeLoop_cancels_a_slow_ProbeAsync_at_Probe_Timeout` asserts the cancellation reaches the fake `ProbeAsync` within the configured 100 ms.
### Driver.FOCAS-010
@@ -266,7 +266,7 @@ around the `ProbeAsync` call, or remove the dead `Timeout` field from
| Severity | Low |
| Category | Code organization & conventions |
| Location | `IFocasClient.cs:210-227` (`FocasOpMode`), `FocasConstants.cs:42-78` (`FocasOperationMode`) |
| Status | Open |
| Status | Resolved |
**Description:** There are two parallel operation-mode-to-text mappings with divergent
labels. `FocasOpMode.ToText` (used by the driver fixed-tree `OperationMode/ModeText`
@@ -278,7 +278,7 @@ inconsistent results depending on which path renders it.
**Recommendation:** Consolidate to a single op-mode enum + `ToText` helper shared by
both the wire layer and the driver projection, with one canonical label set.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `FocasOperationModeExtensions.ToText` now delegates to `FocasOpMode.ToText((short)mode)`, so the wire layer and the driver fixed-tree projection render identical labels. `FocasOpMode` keeps its existing labels (`TJOG`, `TEACH_IN_HANDLE`, `Mode{n}` fallback), which are now the single canonical surface. Regression theory `OpMode_ToText_yields_the_same_label_in_both_namespaces` cross-checks every defined code; `OpMode_ToText_fallback_label_is_consistent` covers the unknown-code path.
### Driver.FOCAS-011
@@ -287,7 +287,7 @@ both the wire layer and the driver projection, with one canonical label set.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `IFocasClient.cs:275-287` (`FocasAlarmType`), `FocasAlarmProjection.cs:149-175` |
| Status | Open |
| Status | Resolved |
**Description:** `FocasAlarmType` declares its constants as `public const int`, but the
only consumers - `FocasAlarmProjection.MapAlarmType(short type)` and
@@ -301,7 +301,7 @@ expected by `ReadAlarmsAsync`.
**Recommendation:** Declare the `FocasAlarmType` constants as `short` (or make it an
`enum : short`) so the type matches the wire field width and the projection signatures.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — every `FocasAlarmType` constant (`All`, `Parameter`, `PulseCode`, `Overtravel`, `Overheat`, `Servo`, `DataIo`, `MemoryCheck`, `MacroAlarm`) is now typed `short`, matching the wire field width on `cnc_rdalmmsg2` and the `switch (short type)` arms in `FocasAlarmProjection.MapAlarmType` / `MapSeverity`. Regression test `FocasAlarmType_constants_are_typed_short` uses reflection to guarantee the type is preserved against future drift.
### Driver.FOCAS-012
+165 -15
View File
@@ -4,10 +4,10 @@
|---|---|
| Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy` |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Review date | 2026-05-23 |
| Commit reviewed | `a9be809` |
| Status | Reviewed |
| Open findings | 4 |
| Open findings | 0 |
## Checklist coverage
@@ -17,12 +17,39 @@
| 2 | OtOpcUa conventions | Driver.Galaxy-005 |
| 3 | Concurrency & thread safety | Driver.Galaxy-006, Driver.Galaxy-007 |
| 4 | Error handling & resilience | Driver.Galaxy-001, Driver.Galaxy-008, Driver.Galaxy-009 |
| 5 | Security | Driver.Galaxy-010 |
| 6 | Performance & resource management | Driver.Galaxy-011, Driver.Galaxy-012 |
| 7 | Design-document adherence | Driver.Galaxy-013 |
| 5 | Security | Driver.Galaxy-010, Driver.Galaxy-015 |
| 6 | Performance & resource management | Driver.Galaxy-011, Driver.Galaxy-012, Driver.Galaxy-016 |
| 7 | Design-document adherence | Driver.Galaxy-013, Driver.Galaxy-017 |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Driver.Galaxy-014 |
| 10 | Documentation & comments | Driver.Galaxy-005, Driver.Galaxy-013 |
| 10 | Documentation & comments | Driver.Galaxy-005, Driver.Galaxy-013, Driver.Galaxy-018 |
## Re-review 2026-05-23 (commit `a9be809`)
The only code-affecting change since `76d35d1` was commit `994997b` — the
sibling `mxaccessgw` repo restructured (the `clients/dotnet/MxGateway.Client`
project path and the `MxGateway.Contracts.Proto` namespace both moved), and
the driver's path-based `ProjectReference` started producing 87 build errors
solution-wide. The fix is build-time only: the broken `ProjectReference` was
replaced with `<Reference HintPath="libs\…">` items pointing at vendored
binary copies of `MxGateway.Client.dll` (99 KB, May 2026 known-good build)
and `MxGateway.Contracts.dll` (490 KB), and five `PackageReference`s that
the dropped project was previously providing transitively (`Google.Protobuf`,
`Grpc.Core.Api`, `Grpc.Net.Client`, `Microsoft.Extensions.Logging.Abstractions`,
`Polly`) were declared explicitly. The matching `Tests` csproj got the same
binary `<Reference>` for `MxGateway.Contracts` (replacing its own broken
`ProjectReference`). A `libs/README.md` documents what is vendored and the
two unwinding paths (sibling restores a client library, or driver migrates
to the new `ZB.MOM.WW.MxGateway.Contracts.Proto` namespace + reimplements
the `MxGatewayClient` / `MxGatewaySession` / `GalaxyRepositoryClient`
wrapper, ~2,200 LoC).
No `*.cs` file changed; the re-review walked only the categories that apply
to a build-time/packaging change. Categories with no new findings:
Correctness (1), OtOpcUa conventions (2), Concurrency (3), Error handling
(4), Code organization (8), Testing coverage (9). Four new findings are
recorded below (Driver.Galaxy-015..018) — none Critical, none High; two
Medium, two Low.
## Findings
@@ -93,13 +120,13 @@
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `Runtime/EventPump.cs:81-88` |
| Status | Open |
| Status | Resolved |
**Description:** The `BoundedChannelOptions` comment states "Newest-dropped policy: when full, the producer's TryWrite returns false ... We do this manually rather than relying on `BoundedChannelFullMode.DropWrite`" — but the option is then set to `FullMode = BoundedChannelFullMode.Wait`. With `Wait`, `TryWrite` returning `false` on a full channel is correct behaviour, so the code works, but the comment naming the mode and the actual mode disagree, which is confusing for a maintainer deciding whether the policy is `Wait`, `DropWrite`, or `DropNewest`.
**Recommendation:** Either reword the comment to say "we use `Wait` mode but never call the awaitable `WriteAsync``TryWrite` gives us synchronous newest-dropped semantics", or switch to `BoundedChannelFullMode.DropWrite` and keep the manual drop count. Make the comment and the mode consistent.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — reworded the `BoundedChannelOptions` comment to say "we use FullMode.Wait but never call the awaitable WriteAsync — only synchronous TryWrite, which returns false immediately on a full channel and lets us account for drops on the EventsDropped counter". Also explains why we deliberately do NOT use `BoundedChannelFullMode.DropWrite` (it would silently discard without surfacing on the counter). Comment and `FullMode` value now agree.
### Driver.Galaxy-006
@@ -168,13 +195,13 @@
| Severity | Low |
| Category | Security |
| Location | `GalaxyDriver.cs:311-341` |
| Status | Open |
| Status | Resolved |
**Description:** `ResolveApiKey` supports an `env:`/`file:` indirection and otherwise treats the config string as the literal API key ("Anything else — used as the literal API key. Convenient for dev"). `GalaxyGatewayOptions`' own XML doc claims "the API key never appears in cleartext config". The literal-key fallback silently permits a plaintext API key in the `DriverConfig` JSON column of the central config DB, contradicting the documented contract. There is no warning logged when the literal path is taken.
**Recommendation:** Log a startup warning when `ResolveApiKey` falls through to the literal arm so an operator who accidentally committed a cleartext key sees it, and update the `GalaxyGatewayOptions` doc comment so it no longer over-promises. Consider gating the literal arm behind an explicit `dev:`-style prefix so a cleartext key cannot be used by accident.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — (a) added a logger-aware `ResolveApiKey(string, ILogger?)` overload that emits a `Warning` when the back-compat literal arm is taken, and wired the `BuildClientOptions` call site to pass `_logger`; (b) added an explicit `dev:KEY` prefix that returns the literal value without warning, so dev rigs / parity tests can opt-in deliberately; (c) rewrote the `GalaxyGatewayOptions.ApiKeySecretRef` XML doc so it no longer claims "the API key never appears in cleartext config" — it now documents all four supported forms (`env:`, `file:`, `dev:`, and the warning-on-literal back-compat path). Regression coverage in `GalaxyDriverApiKeyResolverTests` (`Literal_string_emits_warning_when_logger_supplied`, `Dev_prefix_returns_literal_without_warning`, `Env_prefix_does_not_emit_literal_warning`).
### Driver.Galaxy-011
@@ -198,13 +225,13 @@
| Severity | Low |
| Category | Performance & resource management |
| Location | `Runtime/SubscriptionRegistry.cs:65-67`, `GalaxyDriver.cs:538`, `GalaxyDriver.cs:675` |
| Status | Open |
| Status | Resolved |
**Description:** Several hot paths are O(n^2) per call. `SubscriptionRegistry.ResolveSubscribers` does `entry.Bindings.FirstOrDefault(b => b.ItemHandle == itemHandle)` — a linear scan of the whole binding list for every event dispatch; at 50k tags this is 50k-element scans on the 1Hz fan-out path. `GalaxyDriver.SubscribeAsync` and `ReadViaSubscribeOnceAsync` correlate results to references with `results.FirstOrDefault(r => string.Equals(...))` inside a `for` loop over all references — O(n^2) over the subscribe batch. `SubscriptionRegistry.Remove` rebuilds a `ConcurrentBag` from a LINQ filter on every unsubscribe.
**Recommendation:** Index `SubscriptionEntry` bindings by item handle (a `Dictionary<int, string>` per entry) so `ResolveSubscribers` is O(1) per subscriber. Project the `SubscribeResult` list into a `Dictionary<string, SubscribeResult>` (OrdinalIgnoreCase) once before the correlation loop. These matter on the documented 50k-tag soak path.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — three changes: (a) `SubscriptionEntry` now carries a `FullRefByItemHandle` `Dictionary<int, string>` built once at construction; `ResolveSubscribers` does O(1) lookups per subscriber instead of a `FirstOrDefault` linear scan of the binding list. (b) Reverse map `_subscribersByItemHandle` swapped from `ConcurrentBag<long>` to `ImmutableHashSet<long>``Remove`/`Rebind` use `set.Remove(id)` (O(log n)) instead of "rebuild a new bag from a LINQ filter on every unsubscribe", and reads remain lock-free via atomic publication through `ConcurrentDictionary.AddOrUpdate`. (c) `GalaxyDriver.SubscribeAsync` + `ReadViaSubscribeOnceAsync` now index the `SubscribeResult` list once via the existing `BuildResultIndex` helper (already used by `ReplayAsync`) so per-reference correlation is O(1). Regression coverage in `SubscriptionRegistryTests.ResolveSubscribers_LargeBindingSet_DispatchesCorrectly`.
### Driver.Galaxy-013
@@ -213,13 +240,13 @@
| Severity | Low |
| Category | Design-document adherence |
| Location | `GalaxyDriver.cs:14-27`, `GalaxyDriver.cs:374-382`, `Config/GalaxyDriverOptions.cs:84-86` |
| Status | Open |
| Status | Resolved |
**Description:** Multiple doc comments are stale relative to the shipped code. `GalaxyDriver`'s class summary still describes the file as "the project skeleton with `IDriver` bodies that wire to a future `IGalaxyGatewayClient` abstraction. Capability interfaces ... land in PRs 4.1-4.7" and references the legacy `GalaxyProxyDriver` coexisting "until PR 7.2" — but PR 7.2 already deleted the legacy Galaxy projects and the capability interfaces are all implemented. `ReinitializeAsync` is still a stub ("for the skeleton we just refresh health") that ignores `driverConfigJson` entirely — a config reapply silently does nothing. `GalaxyReconnectOptions.ReplayOnSessionLost` is defined and documented but never read anywhere in the driver (`ReplayAsync` always replays).
**Recommendation:** Refresh the `GalaxyDriver` class and `ReinitializeAsync` doc comments to describe the shipped state, implement or explicitly reject `ReinitializeAsync` config reapply, and either honour `ReplayOnSessionLost` or remove it from `GalaxyReconnectOptions`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — three fixes: (a) rewrote the `GalaxyDriver` class summary to describe the shipped capability surface (`ITagDiscovery`, `IReadable`, `IWritable`, `ISubscribable`, `IRediscoverable`, `IHostConnectivityProbe`, `IAlarmSource`) and removed the stale "PR 4.0 skeleton" / "legacy `GalaxyProxyDriver` coexists until PR 7.2" wording — PR 7.2 already retired the legacy projects. (b) `ReinitializeAsync` now parses the incoming `driverConfigJson` through the factory pipeline and compares the result to `_options`; an equivalent reapply refreshes health, a non-equivalent change throws `NotSupportedException` so a config swap never silently no-ops. (c) `ReplayAsync` now honours `_options.Reconnect.ReplayOnSessionLost` — when false it restarts the EventPump but skips the per-tag SubscribeBulk fan-out, delegating to gateway session-level replay. Regression coverage in `GalaxyDriverInfrastructureTests` (`ReinitializeAsync_RejectsNonEquivalentConfigChange`, `ReinitializeAsync_AcceptsEquivalentConfig`, `ReplayOnSessionLost_False_SkipsResubscribeBulk`, `ReplayOnSessionLost_True_RunsResubscribeBulk`). Updated `GalaxyDriverFactoryTests.ReinitializeAsync_RefreshesHealth_WhenConfigIsEquivalent` to use an equivalent config JSON.
### Driver.Galaxy-014
@@ -235,3 +262,126 @@
**Recommendation:** Add unit/parity tests covering: (a) stream fault -> supervisor reopen -> EventPump restart -> `OnDataChange` resumes; (b) `ReplayAsync` updates `SubscriptionRegistry` with new handles; (c) `StatusCodeMap.FromMxStatus` for both success and failure `MxStatusProxy` rows; (d) `DataTypeMap` for every Galaxy `mx_data_type` code including 64-bit integer.
**Resolution:** Resolved 2026-05-22 — added `GalaxyDriverInfrastructureTests` covering `GetMemoryFootprint` (Driver.Galaxy-011) and `IAsyncDisposable` (Driver.Galaxy-007); (a) stream-fault → supervisor reopen → EventPump restart → `OnDataChange` resumes is covered by `EventPumpStreamFaultTests.StreamFault_DrivesReconnectSupervisorReopenReplay` and `FaultedPump_IsNotRestartableInPlace_ButAFreshPumpResumesDispatch` (landed with Driver.Galaxy-001/008 resolution); (b) post-reconnect `ReplayAsync` rebinds handles is covered by `SubscriptionRegistryTests.Rebind_*` suite; (c) `StatusCodeMap.FromMxStatus` success/failure rows are covered by `StatusCodeMapTests.FromMxStatus_SuccessNonZeroAndCategoryOk_IsGood` and `FromMxStatus_SuccessNonZeroButCategoryNotOk_IsNotGood` (landed with Driver.Galaxy-003); (d) `DataTypeMap` for all seven mx_data_type codes including Int64 is covered by `DataTypeMapTests` (landed with Driver.Galaxy-002).
### Driver.Galaxy-015
| Field | Value |
|---|---|
| Severity | ~~Medium~~ Low (re-triaged 2026-05-23) |
| Category | ~~Security~~ Documentation & comments (re-triaged 2026-05-23) |
| Location | `libs/MxGateway.Client.dll`, `libs/MxGateway.Contracts.dll`, `libs/README.md` |
| Status | Resolved |
**Description:** Commit `994997b` checks in two binary DLLs (`MxGateway.Client.dll`, 99 840 bytes; `MxGateway.Contracts.dll`, 489 984 bytes) under `src/Drivers/.../Driver.Galaxy/libs/` and references them via `<Reference HintPath="…" />`. These are the only checked-in binary build artefacts in the entire repo (a repo-wide `find` for non-`bin/`/`obj/` `*.dll` under `libs/` returns only these two), so the change sets a precedent. The accompanying `libs/README.md` states the DLLs are "byte-for-byte the build output" of the OtOpcUa team's own code against the gateway's open proto contracts, but there is no recorded provenance — no source-commit SHA from the sibling `mxaccessgw` repo that produced the build, no SHA-256/SHA-512 checksum, no `.gitattributes` rule marking these paths as binary (so a future churn-in-place will balloon the pack file). Without a recorded source commit + checksum it is impossible for a future reviewer/auditor to verify the binaries match a specific revision of the sibling repo — the assertion "we built them, not external" is unverifiable after the fact. Tampering or accidental swap (e.g. someone drops in a different DLL of the same name under the same path) would not be detectable.
**Recommendation:** (a) Pin the source provenance: add the sibling `mxaccessgw` commit SHA used to build each DLL to `libs/README.md`. (b) Record a SHA-256 of each `.dll` in `libs/README.md` so a future tamper or accidental update is detectable by running `Get-FileHash`/`sha256sum`. (c) Add a `.gitattributes` rule under `libs/` declaring `*.dll binary` (and consider `filter=lfs diff=lfs merge=lfs -text` if/when these need to be updated, to avoid bloating the pack file on every refresh). (d) Optional: a `dotnet test` time-check that compares the on-disk hash to the recorded hash, so a CI run notices if the file drifts from what the README claims.
**Resolution:** Resolved 2026-05-23. **Severity re-triage:** the original
finding framed this as a security concern about "tampering or accidental
swap by an unknown third party"; the user clarified that the DLLs are
their own code, built from their own `mxaccessgw` project — not third-party
binaries. That moves the concern from security (untrusted provenance) to
documentation (audit trail). Re-classified as Low Documentation &
Comments. Fix: `libs/README.md` now carries a Provenance section that
records the source-commit SHA (`dd7ca1634e2d2b8a866c81f0009bf87ee9427750`,
extracted from the `AssemblyInformationalVersion` baked into both DLLs by
the original build) and SHA-256 checksums of both binaries, plus a
re-verification recipe (`sha256sum libs/*.dll` + `ilspycmd <dll> | grep
AssemblyInformationalVersion`). Recommendations (c) `.gitattributes` and
(d) CI hash-check deferred — the DLLs are essentially frozen until one
of the two unwinding paths is taken, so adding LFS or a CI guard would
add infrastructure that the unwinding step would then have to remove.
Re-open if the vendoring becomes a recurring update target.
### Driver.Galaxy-016
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Performance & resource management |
| Location | `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj:43-47`, `libs/README.md:32-37` |
| Status | Resolved |
**Description:** The five new `PackageReference` versions declared in the csproj (`Google.Protobuf` 3.34.1, `Grpc.Core.Api` 2.76.0, `Grpc.Net.Client` 2.71.0, `Microsoft.Extensions.Logging.Abstractions` 10.0.0, `Polly` 8.5.2) do not all match what the vendored `MxGateway.Client.dll` was built against. The DLL's PE metadata (extracted via `System.Reflection.Metadata`) shows references to `Grpc.Net.Client v2.0.0.0`, `Microsoft.Extensions.Logging.Abstractions v10.0.0.0`, and notably `Polly.Core v8.0.0.0` — and the source csproj just before the sibling-repo rename (commit `bd4a09a` from 2026-04-27) declared `Grpc.Net.Client` 2.76.0, `Microsoft.Extensions.Logging.Abstractions` 10.0.7, and `Polly.Core` 8.6.6 — *not* the meta-package `Polly`. Our driver pulls `Polly` 8.5.2 (which transitively pins `Polly.Core` 8.5.2 per its nuspec dependency), so the vendored client actually loads `Polly.Core` 8.5.2 at runtime against code compiled against 8.6.6. Across an 8.5 ↔ 8.6 minor delta this is usually safe (assembly-version is `v8.0.0.0` for both), but it is exactly the skew shape that surfaces as `MissingMethodException` if a 8.6-only API was used in the client. `libs/README.md` claims "versions match what the sibling repo's `ZB.MOM.WW.MxGateway.Contracts.csproj` uses so the gRPC + proto runtime stays binary-compatible" — that statement is correct only for `Google.Protobuf` and `Grpc.Core.Api`; the other three packages do not match.
**Recommendation:** Reconcile the declared package versions with what the vendored DLLs were built against — bump to `Grpc.Net.Client` 2.76.0, `Microsoft.Extensions.Logging.Abstractions` 10.0.7, swap `Polly` for `Polly.Core` 8.6.6 (the driver does not import the `Polly` legacy v7 surface, only Polly.Core via the client). Alternatively, rebuild the vendored DLLs against the same versions the csproj declares and refresh the binaries. Update `libs/README.md` to record the exact versions the DLLs were built against, so the next vendoring refresh has an authoritative reference.
**Resolution:** Resolved 2026-05-23 — took the first option (reconcile
declared packages with what the DLL was built against, verified by
reflecting `Assembly.GetReferencedAssemblies()` on `MxGateway.Client.dll`).
Changes to the csproj: **`Polly` 8.5.2 → `Polly.Core` 8.6.6** (the most
consequential — `Polly` (v7 fluent API) and `Polly.Core` (v8 resilience-
pipeline API) are different packages, and the DLL was built against
`Polly.Core`; the prior `Polly` reference would have failed at runtime
with `MissingMethodException` the first time the gateway client's retry
pipeline ran). Also bumped `Grpc.Net.Client` 2.71.0 → 2.76.0 and
`Microsoft.Extensions.Logging.Abstractions` 10.0.0 → 10.0.7 to match the
sibling Server/Worker projects' current versions. `Google.Protobuf`
3.34.1 and `Grpc.Core.Api` 2.76.0 already matched; left unchanged.
`libs/README.md` rewritten to record what was actually verified
(`Assembly.GetReferencedAssemblies()` output + the resolved package
versions, including the sibling Server/Worker csproj as the version
source-of-truth — the deleted MxGateway.Client.csproj would have been
the original source but no longer exists). Verification: solution-wide
`dotnet build` clean, Driver.Galaxy.Tests 245/245 pass against the
corrected package set.
### Driver.Galaxy-017
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/` (no source change), gateway proto contract |
| Status | Deferred |
**Description:** The vendored `MxGateway.Contracts.dll` only carries the OLD `MxGateway.Contracts.Proto[.Galaxy]` namespace (PE-namespace dump confirms — `MxGateway.Client`, `MxGateway.Contracts`, `MxGateway.Contracts.Proto`, `MxGateway.Contracts.Proto.Galaxy` only). The sibling `mxaccessgw` repo's live `Protos/mxaccess_gateway.proto`, `mxaccess_worker.proto`, and `galaxy_repository.proto` files now generate into `ZB.MOM.WW.MxGateway.Contracts.Proto.*`. The proto wire format itself can still evolve (new RPCs, renamed fields, removed fields) and the driver has no contract-version handshake (a repo-wide search for `ContractVersion|ProtocolVersion|ApiVersion|WireVersion` in the driver returns nothing) — so a gateway service that evolves its proto past what the vendored client knows will fail silently at runtime: gRPC `UNIMPLEMENTED` for a renamed RPC, default-value reads for a removed scalar field, or worse, a wire-tag collision if a field number is reused. The risk surface grew with vendoring: previously the `ProjectReference` would have hard-failed at build time if the proto changed shape; now the driver builds green against a frozen contract that may not match the running gateway.
**Recommendation:** (a) Add a single `Ping`/`GetVersion` RPC call at gateway-session open, comparing the gateway's reported contract version against a string baked into `libs/README.md` (or a `GatewayContractVersion` const) and refusing the session on mismatch with a clear log. (b) Document in `libs/README.md` the exact mxaccessgw commit SHA (and proto-file SHA-256s) the vendored DLLs were built from, so a parity-rig operator can grep the live gateway for the matching commit. (c) Add a soak/parity test that asserts the live gateway's proto descriptor still matches what the vendored DLL expects — fail loud rather than degrade.
**Resolution:** Deferred 2026-05-23 — the recommendation's part (b)
(record the mxaccessgw source-commit SHA in `libs/README.md`) is satisfied
by the Driver.Galaxy-015 resolution, which records both DLLs were built
from mxaccessgw commit `dd7ca1634e2d2b8a866c81f0009bf87ee9427750`. Parts
(a) and (c) — adding a `GetVersion` RPC at session-open and a parity
test against the live gateway's proto descriptor — are substantial new
RPC + plumbing work that is not in scope for this code-review-resolution
sweep. The risk surface is bounded because either of the two unwinding
paths in `libs/README.md` (sibling repo restores `MxGateway.Client.csproj`,
or this driver migrates to the new namespace) will move the codebase
past the vendoring + close this concern naturally. Re-open if neither
unwinding path is taken within the next quarter and the live gateway
service does evolve its proto under the driver.
### Driver.Galaxy-018
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `libs/README.md:32-37`, `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj:40-47` |
| Status | Resolved |
**Description:** Several small documentation issues in the vendoring artefacts:
1. `libs/README.md` says "Versions match what the sibling repo's `ZB.MOM.WW.MxGateway.Contracts.csproj` uses" — but `ZB.MOM.WW.MxGateway.Contracts.csproj` only declares `Google.Protobuf` 3.34.1 and `Grpc.Core.Api` 2.76.0; the other three packages (`Grpc.Net.Client`, `Microsoft.Extensions.Logging.Abstractions`, `Polly`) come from the (now-deleted) `MxGateway.Client.csproj`, not the contracts csproj. The README points at the wrong source-of-truth file. See Driver.Galaxy-016 for the related version-skew issue.
2. `libs/README.md` says the DLLs "are built against net10.0" — accurate, but the README should also pin the source-commit SHA from `mxaccessgw` that produced the build (currently no such reference). Without it, "May 2026" is the only locator and a future refresh has no fixed point to roll back to.
3. The two `<Reference>` items in the csproj omit `<SpecificVersion>false</SpecificVersion>`. The vendored DLLs carry `AssemblyVersion 1.0.0.0`; MSBuild's default for `<Reference HintPath>` items is `SpecificVersion=true` only when the `Include` attribute contains version info, which it does not here, so this is benign — but spelling it out (`<SpecificVersion>false</SpecificVersion>`) would make a future refresh that bumps the AssemblyVersion robust without csproj edits.
4. The csproj `<Reference Include="MxGateway.Client">` value relies on the bare assembly simple-name; an explicit `<Reference Include="MxGateway.Client, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null">` plus `<SpecificVersion>false</SpecificVersion>` would document the contract surface inside the csproj where a reviewer reads it.
**Recommendation:** (a) Update `libs/README.md` to (i) point at `MxGateway.Client.csproj` for the `Grpc.Net.Client`/`Microsoft.Extensions.Logging.Abstractions`/`Polly` version source, (ii) record the mxaccessgw commit SHA the vendored binaries were built from, and (iii) record SHA-256 hashes (see Driver.Galaxy-015). (b) Add `<SpecificVersion>false</SpecificVersion>` to both `<Reference>` items in the csproj to make the intent explicit and refresh-robust.
**Resolution:** Resolved 2026-05-23 — most of (a) was addressed alongside
Driver.Galaxy-015 + -016: `libs/README.md` rewritten to (i) point at the
sibling Server/Worker csproj as the live version source-of-truth (the
`MxGateway.Client.csproj` cited in the recommendation no longer exists —
the deleted-csproj reference would not have been actionable for a
future reader), (ii) record source commit
`dd7ca1634e2d2b8a866c81f0009bf87ee9427750`, and (iii) record SHA-256
checksums of both vendored DLLs. (b) `<SpecificVersion>false</SpecificVersion>`
was intentionally NOT added — the vendored DLL's AssemblyVersion is
`1.0.0.0` and MSBuild's default for `<Reference HintPath>` Include="bare-name"
items is already `SpecificVersion=false`, so the spelling-it-out
recommendation would be cosmetic without changing behaviour. If the
vendored DLLs are ever refreshed against a build with a different
`AssemblyVersion` the explicit attribute could be added then; for now
the existing csproj works correctly.
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -92,7 +92,7 @@ dead-lettered. Until then, document explicitly that this writer never produces
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `WonderwareHistorianClient.cs:207`, `WonderwareHistorianClient.cs:132-150` |
| Status | Open |
| Status | Resolved |
**Description:** `_totalQueries` is mutated with `Interlocked.Increment` in `Invoke`, but
read inside `GetHealthSnapshot` under `_healthLock`, and every other counter
@@ -106,7 +106,7 @@ and the counters are advisory, but the mixed model is a latent hazard.
`_healthLock` block (a new `RecordQuery()` helper, or fold it into `RecordSuccess`/
`RecordFailure`) so all six health fields share a single lock.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — replaced the mixed `Interlocked.Increment(ref _totalQueries)` + `_healthLock`-protected outcome counters with a single `RecordOutcome(bool success, string? error)` helper that increments `_totalQueries` and exactly one of `_totalSuccesses` / `_totalFailures` under one `_healthLock` acquisition; `GetHealthSnapshot` documents the invariant that `TotalSuccesses + TotalFailures == TotalQueries` at every observed snapshot. Added the regression test `GetHealthSnapshot_ConcurrentCallsAndReads_CountersAreInternallyConsistent` that runs a polling reader concurrently with 50 calls and asserts the invariant never breaks (fails red against the previous code, passes green now).
### Driver.Historian.Wonderware.Client-004
@@ -115,7 +115,7 @@ and the counters are advisory, but the mixed model is a latent hazard.
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `WonderwareHistorianClient.cs:203-267` |
| Status | Open |
| Status | Resolved |
**Description:** A sidecar-reported failure is recorded in two non-atomic steps under
separate lock acquisitions: `Invoke` calls `RecordSuccess()` (line 211) and then the
@@ -132,7 +132,7 @@ sidecar-level `Success` flag has been checked, or pass the reply success/error i
single `RecordOutcome(bool transportOk, bool sidecarOk, string? error)` that updates all
counters under one lock acquisition.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — eliminated the `RecordSuccess``ReclassifySuccessAsFailure` undo dance. `InvokeAsync` now takes a `Func<TReply, (bool ok, string? error)>` evaluator, evaluates it once when the transport reply lands, and calls `RecordOutcome(bool success, string? error)` exactly once per call under a single `_healthLock` acquisition. A sidecar-reported failure is now classified as a failure on its first and only counter update — no transient "success then undo" state is observable. The read-side `InvokeAndClassifyAsync` wrapper preserves the prior `InvalidOperationException` throw on sidecar failure. Added regression test `GetHealthSnapshot_SidecarFailure_NeverInflatesSuccessCounter` pinning `TotalSuccesses=0`/`TotalFailures=1` after a sidecar-error call.
### Driver.Historian.Wonderware.Client-005
@@ -167,7 +167,7 @@ the reader.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `Internal/PipeChannel.cs:96-107`, `WonderwareHistorianClientOptions.cs:11-12` |
| Status | Open |
| Status | Resolved |
**Description:** `PipeChannel.InvokeAsync` retries exactly once on transport failure and
otherwise propagates. The options expose `ReconnectInitialBackoff` and
@@ -182,7 +182,7 @@ or the options are dead config that misleads operators.
path, or remove the two unused option fields and their XML docs and state plainly that
retry/backoff is owned by the caller (the alarm drain worker / history router).
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — removed the dead `ReconnectInitialBackoff`/`ReconnectMaxBackoff` fields (and their `Effective*` accessors) from `WonderwareHistorianClientOptions` and added a `<remarks>` block stating that retry/backoff is owned by the caller (the alarm drain worker and the read-side history router) and that the channel itself performs exactly one in-place reconnect with no delay. Confirmed no consumer referenced the removed fields (only `code-reviews/` references remain). Solution-level build clean — Server picks up the new options shape without change.
### Driver.Historian.Wonderware.Client-007
@@ -218,7 +218,7 @@ deserializing.
| Severity | Low |
| Category | Security |
| Location | `ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.csproj:29-32` |
| Status | Open |
| Status | Resolved |
**Description:** The csproj suppresses two NuGet audit advisories
(`GHSA-37gx-xxp4-5rgx`, `GHSA-w3x6-4m5h-cxqf`) for the `MessagePack` 2.5.187 dependency
@@ -232,7 +232,7 @@ advisory title, why it does not apply to this module usage, and a revisit trigge
follow-up to upgrade `MessagePack` once a patched version is available so the suppressions
can be dropped.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — the suppression block in the csproj (already added under finding 007) records each advisory title (GHSA-37gx-xxp4-5rgx unsafe-dynamic-codegen, GHSA-w3x6-4m5h-cxqf typeless-resolver gadget chain), why neither applies to this module (default `StandardResolver` only, no `TypelessContractlessStandardResolver` / `DynamicUnion` / `DynamicGenericResolver`, plus the 64 KiB per-sample ValueBytes cap in `DeserializeSampleValue` from finding 007), and the revisit trigger ("Revisit once MessagePack 3.x is available and drop these suppressions at that time"). All three pieces the recommendation asked for are present; the single comment block above both `NuGetAuditSuppress` entries was confirmed to satisfy the audit-trail gap.
### Driver.Historian.Wonderware.Client-009
@@ -272,7 +272,7 @@ silent `[Key]` drift between the two duplicated contract sets is caught at build
| Severity | Low |
| Category | Documentation & comments |
| Location | `WonderwareHistorianClient.cs:355-361`, `WonderwareHistorianClient.cs:132-150` |
| Status | Open |
| Status | Resolved |
**Description:** Two doc/behaviour mismatches.
(1) The `Dispose()` XML comment asserts the underlying channel async cleanup is
@@ -291,4 +291,4 @@ node concept. The collapse is reasonable but undocumented.
short remark on `GetHealthSnapshot` explaining that the single-channel client maps both
connection flags to one transport and does not track per-node health.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — (1) reworded the `Dispose()` XML comment to drop the "non-blocking" claim and instead state that the bridge is **deadlock-safe** because the cleanup never awaits a captured `SynchronizationContext` nor takes any lock the caller could hold, while acknowledging that `NamedPipeClientStream` teardown can block briefly on OS handle release. (2) Added a full `<summary>` + `<remarks>` block to `GetHealthSnapshot` explaining the single-channel collapse — both `ProcessConnectionOpen` and `EventConnectionOpen` report the same channel state, and `ActiveProcessNode`/`ActiveEventNode`/`Nodes` are intentionally null/empty because the client has no per-node telemetry. The remarks also pin the finding-003/004 invariant `TotalSuccesses + TotalFailures == TotalQueries`.
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 7 |
| Open findings | 0 |
## Checklist coverage
@@ -115,7 +115,7 @@ analog/integer tags.
| Severity | Low |
| Category | Correctness and logic bugs |
| Location | `Backend/SdkAlarmHistorianWriteBackend.cs:198-201` |
| Status | Open |
| Status | Resolved |
**Description:** `ToHistorianEvent` only assigns `historianEvent.Id` when
`Guid.TryParse(dto.EventId, ...)` succeeds. If `EventId` is not a parseable GUID
@@ -128,7 +128,7 @@ The non-parseable case is never logged.
the event as `PermanentFail` (malformed input) or synthesize a fresh
`Guid.NewGuid()` so each event still gets a unique id.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `ToHistorianEvent` now synthesizes a fresh `Guid.NewGuid()` when the dto's `EventId` fails `Guid.TryParse`, and logs a warning carrying both the original (unparseable) id and the synthesized id so collisions stop happening silently. Regression tests `ToHistorianEvent_parseable_event_id_is_used_verbatim` and `ToHistorianEvent_unparseable_event_id_synthesizes_unique_non_empty_Guid` in `SdkAlarmHistorianWriteBackendTests`.
### Driver.Historian.Wonderware-005
@@ -137,7 +137,7 @@ the event as `PermanentFail` (malformed input) or synthesize a fresh
| Severity | Low |
| Category | Concurrency and thread safety |
| Location | `Backend/HistorianDataSource.cs:124`, `:126-127` |
| Status | Open |
| Status | Resolved |
**Description:** `GetHealthSnapshot` reads `_activeProcessNode` and
`_activeEventNode` inside `_healthLock`, but those two fields are written under
@@ -152,7 +152,7 @@ a momentarily inconsistent health snapshot.
`_healthLock` on every connection state change, or read them under the connection
lock), so the snapshot is internally consistent.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `GetHealthSnapshot` now derives the `ProcessConnectionOpen` / `EventConnectionOpen` booleans from the active-node strings (`_activeProcessNode != null` / `_activeEventNode != null`) which all live under `_healthLock`, instead of reading `_connection`/`_eventConnection` via `Volatile.Read` outside the lock those fields are published under. The snapshot is now self-consistent by construction: open ↔ active node populated. Regression tests in `HistorianDataSourceHealthSnapshotTests` cover the three half-published states plus the steady-state cases.
### Driver.Historian.Wonderware-006
@@ -184,7 +184,7 @@ restart the sidecar cleanly.
| Severity | Low |
| Category | Error handling and resilience |
| Location | `Ipc/PipeServer.cs:70-75` |
| Status | Open |
| Status | Resolved |
**Description:** When `VerifyCaller` rejects the peer SID, the server logs the
reason and calls `_current.Disconnect()` with no `HelloAck` frame sent. The
@@ -198,7 +198,7 @@ harder to test from the client.
`caller-sid-mismatch` reject reason before disconnecting, consistent with the
other two rejection paths.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — the SID rejection path now writes a `HelloAck { Accepted=false, RejectReason="caller-sid-mismatch: ..." }` before disconnecting, symmetric with the shared-secret-mismatch and major-version-mismatch paths. The caller-verification function was also extracted into a `CallerVerifier` delegate so tests can override it (the pipe ACL would otherwise block the test client itself). End-to-end regression `PipeServerSidRejectTests.Caller_SID_mismatch_sends_HelloAck_with_reject_reason_before_disconnect` connects a real named-pipe client and asserts the rejecting ack frame arrives.
### Driver.Historian.Wonderware-008
@@ -207,7 +207,7 @@ other two rejection paths.
| Severity | Low |
| Category | Error handling and resilience |
| Location | `Backend/HistorianDataSource.cs:301-307`, `:374-380` |
| Status | Open |
| Status | Resolved |
**Description:** When `query.StartQuery` returns `false`, `ReadRawAsync` and
`ReadAggregateAsync` call `HandleConnectionError()` and return an empty result
@@ -226,7 +226,7 @@ connection intact, surface the error). Consider returning a failed reply
(`Success = false`) for query-class `StartQuery` failures so the client does not
treat an SDK error as an empty history.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — extracted a static `ConnectionErrorCodes` set + `IsConnectionClassError` classifier (mirroring the alarm-write side) and centralised the failure handling in a new `HandleStartQueryFailure` helper. Connection-class codes still drop the connection and mark the node failed; query-class codes throw a new `QueryClassStartQueryException` that the outer catch re-throws WITHOUT touching the connection. All four read paths (raw / aggregate / at-time / events) also re-throw caught exceptions so the IPC frame handler surfaces `Success=false` instead of returning an empty list with `Success=true`. Regression tests `HistorianDataSourceStartQueryClassificationTests` pin the connection-class vs query-class classification per error code; the connect-failover suite (`HistorianDataSourceConnectFailoverTests`) verifies the read paths now propagate the exception.
### Driver.Historian.Wonderware-009
@@ -261,7 +261,7 @@ cap with an explicit error reply rather than letting `WriteAsync` throw.
| Severity | Low |
| Category | Performance and resource management |
| Location | `Backend/HistorianConfiguration.cs:32-36`, `Backend/HistorianDataSource.cs` (all read methods) |
| Status | Open |
| Status | Resolved |
**Description:** `HistorianConfiguration.RequestTimeoutSeconds` is documented as
the "outer safety timeout applied to sync-over-async Historian operations" and is
@@ -278,7 +278,7 @@ timeout, but the query path does not). The documented safety net does not exist.
worker with a bounded wait), or remove the property and its XML doc so the code
does not advertise a guarantee it does not provide.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added an internal `BuildRequestCts` helper that returns a `CancellationTokenSource` linked into the caller's `ct` with `CancelAfter(RequestTimeoutSeconds)` applied when positive. Each read method (`ReadRawAsync`, `ReadAggregateAsync`, `ReadAtTimeAsync`, `ReadEventsAsync`) now wraps its work with the linked CTS and feeds the linked token into the `ThrowIfCancellationRequested` checks between `MoveNext` iterations, so a hung SDK call cancels at the configured deadline instead of blocking the connection thread indefinitely. Regression tests `HistorianDataSourceRequestTimeoutTests` pin the helper: positive value enforces `CancelAfter`, zero/negative means no timeout, caller cancellation propagates, default is 60s.
### Driver.Historian.Wonderware-011
@@ -287,7 +287,7 @@ does not advertise a guarantee it does not provide.
| Severity | Low |
| Category | Design-document adherence |
| Location | `Backend/HistorianDataSource.cs:9-12`, `Backend/IHistorianDataSource.cs:9-11`, `Backend/HistorianSample.cs:7-9`, `Backend/HistorianConfiguration.cs:7-9` |
| Status | Open |
| Status | Resolved |
**Description:** Several XML doc comments reference the retired v1 architecture as
if it were current: "inside Galaxy.Host", "the Proxy maps returned samples", "the
@@ -303,7 +303,7 @@ review checklist.
architecture (sidecar talking to `WonderwareHistorianClient` over the named pipe),
dropping the `Galaxy.Host` / `Proxy` / `GalaxyDataValue` references.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — refreshed the XML doc comments on `HistorianDataSource`, `IHistorianDataSource`, `HistorianSample` / `HistorianAggregateSample`, and `HistorianConfiguration` to describe the current sidecar / named-pipe / .NET 10 `WonderwareHistorianClient` architecture. References to `Galaxy.Host` / `Galaxy.Proxy` / `GalaxyDataValue` are now framed as historical context tied to the PR 7.2 retirement rather than as current behaviour.
### Driver.Historian.Wonderware-012
@@ -312,7 +312,7 @@ dropping the `Galaxy.Host` / `Proxy` / `GalaxyDataValue` references.
| Severity | Low |
| Category | Testing coverage |
| Location | `Backend/HistorianDataSource.cs`, `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/` |
| Status | Open |
| Status | Resolved |
**Description:** The unit-test suite covers `HistorianQualityMapper`,
`HistorianClusterEndpointPicker`, `SdkAlarmHistorianWriteBackend`,
@@ -334,4 +334,4 @@ removed to avoid confusion.
cancellation, and the value-type selection — and delete the stale empty
`tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/` directory.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added four new `HistorianDataSource`-targeted test files: `HistorianDataSourceHealthSnapshotTests` (snapshot consistency under half-published state, see also -005), `HistorianDataSourceStartQueryClassificationTests` (connection-class vs query-class error-code table, see also -008), `HistorianDataSourceRequestTimeoutTests` (the request-timeout helper, see also -010), `HistorianDataSourceConnectFailoverTests` (cluster failover order + cooldown via the `IHistorianConnectionFactory` fake), and `HistorianDataSourceValueAndAggregateTests` (the string-vs-numeric heuristic via the new SDK-independent `SelectValueFromPair` overload + the `ExtractAggregateValue` column dispatch). Stale empty `tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/` directory deleted. Unit count rose from 80 to 125 (+45 new tests).
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 3 |
| Open findings | 0 |
## Checklist coverage
@@ -157,7 +157,7 @@ overwrite it.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `ModbusAddressParser.cs:297-301` |
| Status | Open |
| Status | Resolved |
**Description:** `TryParseFamilyNative` catches only `ArgumentException` and `OverflowException`.
The current helpers throw only those (including `ArgumentOutOfRangeException`, which derives from
@@ -171,7 +171,13 @@ depend on.
narrow catch, or broaden to a general catch-all that records the message — a try-parse method
should never throw.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — broadened the `catch` filter in
`ModbusAddressParser.TryParseFamilyNative` from `ArgumentException or OverflowException` to a
general `catch (Exception ex)` so any future helper exception type is converted to a structured
`(false, error)` rather than escaping the `TryParse` method. Added `DL205_TryParse_NeverThrows`
and `MELSEC_TryParse_NeverThrows` parameterised regression tests in
`ModbusAddressEdgeCaseTests` covering ~20 pathological inputs (empty prefixes, octal/hex digit
violations, overflow inputs, unknown prefixes) to pin the defensive contract.
### Driver.Modbus.Addressing-007
@@ -180,7 +186,7 @@ should never throw.
| Severity | Low |
| Category | Design-document adherence |
| Location | `ModbusDataType.cs:91-95`, `docs/v2/dl205.md` section Strings |
| Status | Open |
| Status | Resolved |
**Description:** `ModbusStringByteOrder` (HighByteFirst / LowByteFirst) is defined in this
assembly and documented as the DL205 low-byte-first string-packing knob, but `ParsedModbusAddress`
@@ -193,7 +199,18 @@ unreachable from the parser, so the grammar cannot represent a known, documented
token for it, or document explicitly that DL205 string byte order is only configurable via the
structured tag form and is intentionally out of grammar scope.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — chose the "document the limitation" branch of the
recommendation rather than adding a grammar token: the 3rd field slot is the multi-register
word/byte order and the 4th is the array count, so a 5th `:<order>` suffix would conflict with
the existing count-shape disambiguation; `ModbusStringByteOrder` is already plumbed through the
structured tag form (`ModbusDriverFactoryExtensions.ModbusTagDto.StringByteOrder`
`ModbusTagDefinition.StringByteOrder`) which is the canonical config path. Added an explicit
"Grammar scope" remarks block to `ModbusStringByteOrder` and to the `ModbusAddressParser`
`<remarks>` block stating that string byte order is configurable only via the structured tag
form. Added a corresponding bullet to `docs/v2/dl205.md` §Strings. Added two regression tests
(`Parser_STR_grammar_does_not_carry_StringByteOrder` reflecting on `ParsedModbusAddress`, and
`Parser_rejects_unknown_string_byte_order_token_in_grammar`) pinning the contract so a future
grammar change can't quietly add a conflicting token.
### Driver.Modbus.Addressing-008
@@ -226,7 +243,7 @@ finding -001.
| Severity | Low |
| Category | Documentation & comments |
| Location | `ModbusModiconAddress.cs:55-64`, `ModbusModiconAddress.cs:104-110` |
| Status | Open |
| Status | Resolved |
**Description:** The comments on `ModbusModiconAddress.TryParse` are slightly inaccurate. The
remark that 5-digit Modicon is always exactly 5 chars (40001..49999) and 6-digit is exactly 6
@@ -238,4 +255,11 @@ says the 5-digit form caps at 9999 by construction while the adjacent code path
**Recommendation:** Reword the range examples to cover all four region digits and drop the
caps-at-9999 aside or restate it as a precise statement about trailing-digit count.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — reworded the up-front range-check comment to describe all
four region digits (0/1/3/4) and give examples covering each region (coils 00001..09999 /
000001..065536, holding registers 40001..49999 / 400001..465536). Reworded the lower
`> 65536` comment to drop the misleading "5-digit form caps at 9999 by construction" framing and
state precisely that the check is reached only by the 6-digit form in practice, but applied to
both for safety rather than relying on the digit-count invariant. Pure documentation change —
no behavioural change; the existing `ModbusModiconAddressTests` already pin the cross-region
5-digit ranges (00001..09999 / 10001..19999 / 30001..39999 / 40001..49999).
+59 -13
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 6 |
| Open findings | 0 |
## Checklist coverage
@@ -87,7 +87,7 @@ message explaining coils carry a single bit.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/ModbusCommandBase.cs:14-24` |
| Status | Open |
| Status | Resolved |
**Description:** `Port` (`int`) and `TimeoutMs` (`int`) accept any 32-bit value,
including negatives and ports above 65535. `UnitId` is a `byte`, so it accepts
@@ -103,7 +103,13 @@ error. None of these are validated at parse time.
message — consistent with how `WriteCommand` already rejects bad regions and
boolean strings.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `ModbusCommandBase.ValidateEndpoint()`
that throws `CliFx.Exceptions.CommandException` for `--port` outside 1..65535,
non-positive `--timeout-ms`, and `--unit-id` outside 1..247 (Modbus spec unicast
range — rejects the broadcast 0 and reserved 248-255). Each of `ProbeCommand`,
`ReadCommand`, `WriteCommand`, and `SubscribeCommand` now calls it at the top of
`ExecuteAsync` after `ConfigureLogging()`. Regression tests live in
`ModbusCommandBaseTests` (range + boundary cases for all three knobs).
### Driver.Modbus.Cli-004
@@ -113,7 +119,7 @@ boolean strings.
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/SubscribeCommand.cs:61-67` |
| Status | Open |
| Status | Resolved |
**Description:** The `OnDataChange` handler is invoked from the driver's
`PollGroupEngine` background thread and calls `console.Output.WriteLine`
@@ -127,7 +133,11 @@ any synchronization, so overlapping poll ticks could interleave partial lines.
write failures so a transient console-write error cannot tear down the poll loop.
A single `lock` around the write also removes the interleave risk.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — wrapped the `OnDataChange` handler in
`try/catch (Exception)` that logs to Serilog at Warning and swallows the failure
so a transient console-write error cannot fault the `PollGroupEngine` background
loop. The console write is also serialized through a local `lock` object, removing
the partial-line interleave risk when multiple poll ticks overlap.
### Driver.Modbus.Cli-005
@@ -136,7 +146,7 @@ A single `lock` around the write also removes the interleave risk.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ProbeCommand.cs:21-54`; `Commands/ReadCommand.cs:46-75`; `Commands/WriteCommand.cs:54-89` |
| Status | Open |
| Status | Resolved |
**Description:** All three commands call `ConfigureLogging()` then
`console.RegisterCancellationHandler()`, but if the operator presses Ctrl+C
@@ -152,7 +162,14 @@ commands do not catch it around their driver calls.
the noisy trace on Ctrl+C-during-connect is acceptable. Consistency with
`SubscribeCommand`'s handling is the cleaner choice.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added a
`catch (OperationCanceledException) when (ct.IsCancellationRequested)` clause
around the driver-call bodies in `ProbeCommand`, `ReadCommand`, and `WriteCommand`
that prints `Cancelled.` and falls through to the existing `finally`-block
shutdown. Matches `SubscribeCommand`'s existing handling so all four commands now
exit quietly on Ctrl+C. Regression tests in `CommandCancellationTests` pre-cancel
the CliFx `FakeInMemoryConsole` before calling `ExecuteAsync` and assert no
exception escapes.
### Driver.Modbus.Cli-006
@@ -161,7 +178,7 @@ the noisy trace on Ctrl+C-during-connect is acceptable. Consistency with
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ProbeCommand.cs:35-53` |
| Status | Open |
| Status | Resolved |
**Description:** `probe` reports `Health: {health.State}` from `GetHealth()`.
After a successful `InitializeAsync` the driver sets state to `Healthy`
@@ -179,7 +196,14 @@ snapshot's `StatusCode` (Good vs Bad) rather than — or in addition to — the
`State`, or print a single combined verdict line so the two cannot contradict each
other.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `ProbeCommand` now prints a single combined
`Verdict:` line above the bare driver-state line; the headline is computed by a
new `ProbeCommand.ComputeVerdict(DriverState, statusCode)` helper that combines
the driver state with the probe snapshot's OPC UA quality class (top 2 bits of
the StatusCode). Verdict is `FAIL` whenever the driver is not Healthy OR the
snapshot is Bad, `DEGRADED` when the driver is Healthy but the snapshot is
Uncertain, and `OK` only when both are Good. Regression tests in
`ProbeCommandTests` pin the verdict-grid behaviour.
### Driver.Modbus.Cli-007
@@ -188,7 +212,7 @@ other.
| Severity | Low |
| Category | Design-document adherence |
| Location | `docs/Driver.Modbus.Cli.md:124-156`; `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ReadCommand.cs` |
| Status | Open |
| Status | Resolved |
**Description:** `docs/Driver.Modbus.Cli.md` devotes a whole "v2 addressing
grammar" section to the industry-standard tag-address strings (`40001:F:CDAB`,
@@ -205,7 +229,14 @@ driver's address-string parser (and `--family` for the DL205/MELSEC native
syntax), or scope the "v2 addressing grammar" section of the doc to note it
applies to `DriverConfig` JSON and is not a CLI flag.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — chose the doc-scoping fix (an
`--address-string` flag is a bigger feature that warrants its own design
discussion). Added a "CLI scope" callout to the top of the
`docs/Driver.Modbus.Cli.md` "v2 addressing grammar" section stating the CLI
accepts only the structured `--region` + `--address` + `--type` triple and that
the address-string grammar is a `DriverConfig` JSON feature. The pre-existing
closing paragraph already said the same thing; the new callout makes it visible
before the grammar examples instead of after them. Code surface left unchanged.
### Driver.Modbus.Cli-008
@@ -214,7 +245,7 @@ applies to `DriverConfig` JSON and is not a CLI flag.
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli.Tests/` |
| Status | Open |
| Status | Resolved |
**Description:** The test project covers only the two pure-function seams:
`ReadCommand.SynthesiseTagName` and `WriteCommand.ParseValue`. There is no coverage
@@ -231,4 +262,19 @@ likely to regress and is currently untested. The validation gaps in findings
setters and assert the produced `ModbusDriverOptions`). Once findings 002/003 are
fixed, add tests for the new validation paths.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added four new test classes covering the
previously untested branch logic:
- `ModbusCommandBaseTests` — six tests over `BuildOptions` (probe disabled,
`TimeoutMs``Timeout` mapping, `AutoReconnect` tracking
`--disable-reconnect` in both directions, host/port/unit flow-through, tag
pass-through) plus the new `ValidateEndpoint` range-check tests for finding
003 (port 1..65535, timeout-ms > 0, unit-id 1..247).
- `WriteCommandRegionValidationTests` — read-only region rejection
(DiscreteInputs / InputRegisters) and the Coils-non-Bool guard for finding
002.
- `ProbeCommandTests` — the new `ComputeVerdict` helper for finding 006
(OK / DEGRADED / FAIL grid).
- `CommandCancellationTests` — Ctrl+C-during-initialize for `ProbeCommand` /
`ReadCommand` / `WriteCommand` for finding 005.
Total test count grew from 18 to 64; all pass.
+15 -15
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 7 |
| Open findings | 0 |
## Checklist coverage
@@ -63,13 +63,13 @@
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `ModbusDriver.cs:59,188,241,259,266,726,745,759` |
| Status | Open |
| Status | Resolved |
**Description:** `_health` is a non-`volatile` reference field written from multiple threads (concurrent `ReadAsync` callers, the coalesced-read path, `WriteAsync` indirectly, and `ProbeLoopAsync`) and read by `GetHealth()`. Reference assignment is atomic on .NET so a torn read cannot occur, but there is no happens-before ordering: a stale `DriverHealth` can be observed on another core, and concurrent writers race so "last write wins" is non-deterministic (a `Degraded` write from a failed read can clobber a just-published `Healthy`, or vice versa).
**Recommendation:** Mark `_health` `volatile`, or assign via `Volatile.Write` and read with `Volatile.Read`, to give `GetHealth()` a defined ordering guarantee.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — routed every `_health` access through new `ReadHealth()` (`Volatile.Read`) and `WriteHealth()` (`Volatile.Write`) helpers, giving `GetHealth()` a defined ordering guarantee on every core. Stress-test (`ModbusLifecycleHygieneTests.GetHealth_under_concurrent_pressure_always_returns_a_complete_snapshot`) confirms the read path never sees a torn / half-constructed snapshot under concurrent reader + writer pressure.
### Driver.Modbus-004
@@ -123,13 +123,13 @@
| Severity | Low |
| Category | Design-document adherence |
| Location | `ModbusDriver.cs:1392`, `ModbusDriverOptions.cs:74-80` |
| Status | Open |
| Status | Resolved |
**Description:** Two design-vs-code drifts. (1) `MapDataType` maps `Int64`/`UInt64` to `DriverDataType.Int32` with the inline comment "widening to Int32 loses precision; PR 25 adds Int64 to DriverDataType". The address-space node for a 64-bit Modbus tag is declared `Int32`, misrepresenting the OPC UA variable's `DataType` even though `DecodeRegister` produces a correct `long`/`ulong` value — clients see a type/value mismatch. (2) `DisableFC23` is documented and bound from JSON but is a confirmed no-op ("The driver does not currently emit FC23"). Both are acknowledged-but-unfinished items worth tracking.
**Recommendation:** Track the PR 25 `DriverDataType.Int64` follow-up; until then document the Int32 surfacing limitation in `docs/v2/modbus-addressing.md` so operators configuring `I_64`/`UI_64` tags understand the node type. Mark `DisableFC23` clearly as reserved/unimplemented or gate it once FC23 ships.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — promoted the inline Int64/UInt64 caveat into a full `<remarks>` block on `MapDataType` calling out the surfacing limitation and the tracked follow-up, and rewrote the `DisableFC23` XML doc to flag the option as "Reserved / no-op" with a Driver.Modbus-007 tracking reference. (The cross-module doc update in `docs/v2/modbus-addressing.md` is out of scope for this module's edits — code is now self-documenting.)
### Driver.Modbus-008
@@ -138,13 +138,13 @@
| Severity | Low |
| Category | Documentation & comments |
| Location | `ModbusDriver.cs:411-417,700-703,737-744` |
| Status | Open |
| Status | Resolved |
**Description:** Stale/misleading comments. (1) The `<summary>` block at `ModbusDriver.cs:411-417` says auto-prohibited ranges are "Cleared by ReinitializeAsync ... or by an explicit re-probe API (not yet shipped)" — the re-probe loop has shipped (#151, `ReprobeLoopAsync`), so the parenthetical is wrong. (2) The comment at `ModbusDriver.cs:700-703` ("On block-level failure mark every member Bad — caller's per-tag fallback won't re-try since handled-set already includes them; auto-split-on-failure is a follow-up") contradicts the actual `catch (ModbusException)` arm below it, which deliberately does not add members to `handled` and does defer to per-tag fallback (and auto-split has shipped via bisection). The empty `foreach (var (idx, _) in block.Members) { }` loop at `ModbusDriver.cs:737-744`, with only a comment body, is dead code from that superseded design.
**Recommendation:** Update the two comments to match the shipped #148/#150/#151 behaviour and delete the empty `foreach` loop in the `catch (ModbusException)` arm.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — deleted the duplicate `<summary>` orphaned at the top of the prohibition block, rewrote the surviving one to credit the shipped #151 re-probe loop, replaced the "auto-split-on-failure is a follow-up" comment above the block loop with the actual #148/#150 behaviour (per-tag fallback + bisection), and removed the empty `foreach (var (idx, _) in block.Members) { }` plus its unused `status` local from the `catch (ModbusException)` arm.
### Driver.Modbus-009
@@ -153,13 +153,13 @@
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `ModbusDriver.cs:1160-1167`, `ModbusTcpTransport.cs:94-95` |
| Status | Open |
| Status | Resolved |
**Description:** Two edge cases. (1) `RegisterCount` for `ModbusDataType.String` computes `(tag.StringLength + 1) / 2`; a tag configured with `StringLength = 0` yields a register count of 0, flowing into `ReadOneAsync` as `totalRegs = 0` and producing an FC03/FC04 with quantity 0 — a spec-illegal request the PLC rejects with exception 03. The factory does not reject `StringLength = 0` for String tags. (2) `EnableKeepAlive` casts `opts.Time.TotalSeconds`/`opts.Interval.TotalSeconds` to `int`; a sub-second configured `TimeSpan` (e.g. 500 ms) truncates to 0, which most OSes reject or interpret as "use default", silently defeating the configured keep-alive timing.
**Recommendation:** Validate `StringLength >= 1` for `String` tags in `ModbusDriverFactoryExtensions.BuildTag`. For keep-alive, round up to a minimum of 1 second or validate the configured `TimeSpan` is a whole number of seconds.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `ValidateStringLength` in `ModbusDriverFactoryExtensions.BuildTag` so String-typed tags with `StringLength < 1` throw at bind time with a clear diagnostic (both AddressString + structured DTO paths), and introduced `ModbusTcpTransport.ClampToWholeSeconds` which rounds the configured keep-alive `TimeSpan` up to a minimum of 1 second so sub-second values no longer truncate to 0. Regression coverage in `ModbusEdgeCaseValidationTests`.
### Driver.Modbus-010
@@ -168,13 +168,13 @@
| Severity | Low |
| Category | Error handling & resilience |
| Location | `ModbusDriver.cs:864-868`, `ModbusDriverOptions.cs:116-125` |
| Status | Open |
| Status | Resolved |
**Description:** When `WriteOnChangeOnly` is enabled and `IsRedundantWrite` returns true, `WriteAsync` returns `WriteResult(0u)` (Good) without touching the wire. The suppression baseline (`_lastWrittenByRef`) is only invalidated by a *read* that returns a divergent value. If a driver instance has `WriteOnChangeOnly = true` but a tag is never subscribed/read (write-only setpoint), a value the operator believes was re-asserted is silently suppressed forever after the first write — no time- or count-based expiry exists. The option XML doc describes the read-invalidation path but does not warn about write-only tags.
**Recommendation:** Document the write-only-tag caveat on the `WriteOnChangeOnly` option, or add an optional TTL to the suppression cache so a periodic re-write still reaches the PLC.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added a `<remarks>` block on `ModbusDriverOptions.WriteOnChangeOnly` that calls out the write-only-tag caveat explicitly: the cache is only invalidated by reads, so a tag that is never subscribed/polled stays suppressed forever after the first write. Operators choosing this option are directed to either subscribe every affected tag or leave `WriteOnChangeOnly = false`. Adding a TTL was considered but the safer option for an OPC UA driver is to make the limitation discoverable in the documentation surface (no behaviour change for existing deployments).
### Driver.Modbus-011
@@ -183,13 +183,13 @@
| Severity | Low |
| Category | Code organization & conventions |
| Location | `ModbusDriver.cs:23-43,89-97,408-432` |
| Status | Open |
| Status | Resolved |
**Description:** Field and member declarations are interleaved with methods throughout `ModbusDriver`. `ResolveHost` (a public method) is the first member of the class, followed by `BuildSlaveHostName`, then a block of fields; `_lastPublishedByRef`/`_lastWrittenByRef` are declared after the constructor; `ProhibitionState`, `_autoProhibited`, and `_reprobeCts` are declared mid-file between `DecodeRegisterArray` and `RangeIsAutoProhibited`. There are also two near-identical `<summary>` blocks stacked back-to-back at `ModbusDriver.cs:411-423`. This hurts readability of a 1400-line file and makes the field inventory hard to audit (relevant to the thread-safety findings above).
**Recommendation:** Group all instance fields at the top of the class, move nested types together, and remove the orphaned first `<summary>` at lines 411-417 that no longer precedes a member.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — reorganized `ModbusDriver` so every instance field (including the `_autoProhibited` / `_autoProhibitedLock` / `_reprobeCts` / `_rmwLocks` / `_lastPublishedByRef` / `_lastWrittenByRef` fields that were declared mid-file) lives in a single contiguous block at the top of the class, followed by the `ProhibitionState` nested type, the constructor, and then methods. Removed the duplicate orphan `<summary>` and the now-redundant field declarations that had been scattered through the file. The full 263-test suite passes with no behavioural change.
### Driver.Modbus-012
@@ -198,10 +198,10 @@
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/` |
| Status | Open |
| Status | Resolved |
**Description:** The unit suite is broad (coalescing, bisection, auto-recovery, byte order, arrays, BCD, RMW, caps, multi-unit, probe, reconnect, subscription). Gaps relative to the findings above: (1) no test exercises concurrent multi-subscription publishing, so the `_lastPublishedByRef` race (Driver.Modbus-001) is uncaught; (2) no test covers `ReinitializeAsync` state hygiene for stale `_tagsByName`/caches (Driver.Modbus-002); (3) no test feeds a malformed/short response PDU through `ReadRegisterBlockAsync`/`DecodeBitArray` to confirm a clean `BadCommunicationError` rather than an index-range crash (Driver.Modbus-005); (4) no test asserts `DisposeAsync` (vs `ShutdownAsync`) tears down the probe/re-probe loops and `_poll` (Driver.Modbus-004).
**Recommendation:** Add unit tests for concurrent deadband publishing across two subscriptions, `ReinitializeAsync` state hygiene, malformed-response handling in the register/bit block readers, and `DisposeAsync` loop teardown.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — gap (1) was already covered by `ModbusSubscriptionTests.Concurrent_deadband_subscriptions_do_not_corrupt_the_publish_cache` from the Driver.Modbus-001 fix. Added the remaining three in a new `ModbusLifecycleHygieneTests` file: `Reinitialize_clears_stale_tagsByName_entries` + `Reinitialize_clears_lastPublished_and_lastWritten_caches` (gap 2), `Short_response_PDU_surfaces_as_BadCommunicationError_not_an_IndexOutOfRangeException` + `Response_payload_truncated_below_declared_byteCount_surfaces_as_BadCommunicationError` + `DecodeBitArray_rejects_an_empty_bitmap_with_InvalidDataException` (gap 3), `DisposeAsync_without_explicit_Shutdown_tears_down_probe_loop_and_transport` + `DisposeAsync_disposes_the_pollEngine_so_subscriptions_stop` (gap 4). All 12 new tests pass (full suite: 263/263 green).
+9 -9
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 2 |
| Open findings | 0 |
## Checklist coverage
@@ -182,14 +182,14 @@
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `OpcUaClientDriver.cs:783-784` |
| Status | Open |
| Location | `OpcUaClientDriver.cs:1007-1015` |
| Status | Resolved |
**Description:** The comment on the isArray computation states "-1 = scalar; 1+ = array dimensions; 0 = one-dimensional array". This is inaccurate against OPC UA ValueRank semantics: -3 is ScalarOrOneDimension, -2 is Any, -1 is Scalar, and 0 is OneOrMoreDimensions (not specifically one-dimensional). The code `valueRank >= 0` treats -2 (Any) and -3 (ScalarOrOneDimension) as scalar, which is a defensible default, but the comment misdescribes the constants and would mislead a maintainer.
**Description:** The comment on the isArray computation stated "-1 = scalar; 1+ = array dimensions; 0 = one-dimensional array". This is inaccurate against OPC UA ValueRank semantics: -3 is ScalarOrOneDimension, -2 is Any, -1 is Scalar, and 0 is OneOrMoreDimensions (not specifically one-dimensional). The code `valueRank >= 0` treats -2 (Any) and -3 (ScalarOrOneDimension) as scalar, which is a defensible default, but the comment misdescribed the constants and would mislead a maintainer.
**Recommendation:** Correct the comment to the actual ValueRank constants (-3 ScalarOrOneDimension, -2 Any, -1 Scalar, 0 OneOrMoreDimensions, 1 OneDimension, >1 multi-dim) and state the deliberate choice that anything >= 0 is treated as an array.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `EnrichAndRegisterVariablesAsync` now carries the correct OPC UA Part 3 ValueRank legend (`-3 ScalarOrOneDimension`, `-2 Any`, `-1 Scalar`, `0 OneOrMoreDimensions`, `1 OneDimension`, `>1` specific N-dimensions) and explicitly states the deliberate choice that anything `>= 0` is treated as an array, with `-3`/`-2` conservatively folded into the scalar bucket. Regression tests `ValueRank_constants_have_the_OPCUA_Part3_spec_values` (anchors the SDK constants) and `IsArray_decision_matches_valueRank_greater_or_equal_zero` (theory across -3..2) pin the logic in `OpcUaClientLowFindingsRegressionTests.cs`.
### Driver.OpcUaClient-012
@@ -227,14 +227,14 @@
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | `OpcUaClientDriver.cs:904`, `:1035` |
| Status | Open |
| Location | `OpcUaClientDriver.cs:1138`, `:1314` |
| Status | Resolved |
**Description:** `MonitoredItem.Notification += (mi, args) => ...` (and the alarm-event equivalent) attaches a closure-capturing lambda to each monitored item's event. The lambda is never detached. When UnsubscribeAsync removes a subscription it calls Subscription.DeleteAsync but does not clear the MonitoredItem.Notification handlers; if the SDK retains the MonitoredItem/Subscription graph anywhere (the session keeps a reference until its own disposal, or during transfer-on-reconnect), the driver instance is kept alive by the closure longer than necessary.
**Description:** `MonitoredItem.Notification += (mi, args) => ...` (and the alarm-event equivalent) attached a closure-capturing lambda to each monitored item's event. The lambda was never detached. When UnsubscribeAsync removed a subscription it called Subscription.DeleteAsync but did not clear the MonitoredItem.Notification handlers; if the SDK retains the MonitoredItem/Subscription graph anywhere (the session keeps a reference until its own disposal, or during transfer-on-reconnect), the driver instance was kept alive by the closure longer than necessary.
**Recommendation:** Detach the Notification handlers when deleting a subscription, or hold the handler delegate so it can be explicitly removed in UnsubscribeAsync/ShutdownAsync.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `SubscribeAsync` now stores each `(MonitoredItem, MonitoredItemNotificationEventHandler)` pair in a new `MonitoredItemNotificationHandle` record carried inside `RemoteSubscription`. `SubscribeAlarmsAsync` similarly stores the event-MonitoredItem and its handler delegate on `RemoteAlarmSubscription`. `UnsubscribeAsync`, `UnsubscribeAlarmsAsync`, and the subscription-teardown loops in `ShutdownAsync` now invoke `DetachNotificationHandlers` (or the alarm-equivalent inline `Notification -= rs.Handler`) BEFORE calling `Subscription.DeleteAsync`, so the SDK's invocation list no longer pins the driver through the captured lambda. Reflection-based regression tests `RemoteSubscription_record_carries_handler_delegates_so_they_can_be_detached` and `RemoteAlarmSubscription_record_carries_handler_delegate_so_it_can_be_detached` pin the contract that the handler reference is reachable from the bookkeeping record (`OpcUaClientLowFindingsRegressionTests.cs`).
### Driver.OpcUaClient-015
+9 -9
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 4 |
| Open findings | 0 |
## Checklist coverage
@@ -120,7 +120,7 @@ unreachable device, not crash on it.
| Severity | Low |
| Category | Performance & resource management |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli/Commands/ProbeCommand.cs:36,53`, `Commands/ReadCommand.cs:45,54`, `Commands/WriteCommand.cs:51,60`, `Commands/SubscribeCommand.cs:39,73` |
| Status | Open |
| Status | Resolved |
**Description:** Every command declares the driver with `await using var driver = new
S7Driver(...)` and *also* calls `await driver.ShutdownAsync(...)` in a `finally` block.
@@ -136,7 +136,7 @@ not actually disposing.
`subscribe` command `UnsubscribeAsync`. Alternatively drop `await using`
and keep the explicit `finally`. Pick one disposal mechanism per command.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — dropped the explicit `await driver.ShutdownAsync(CancellationToken.None)` calls from the `finally` blocks of `ProbeCommand`, `ReadCommand`, `WriteCommand`, and `SubscribeCommand`; `await using` is now the sole driver-disposal mechanism per command (DisposeAsync itself runs ShutdownAsync), and the subscribe command keeps `UnsubscribeAsync` in its finally because that is a subscription-lifecycle concern, not driver disposal. Added `CommandDisposalConventionsTests` to guard the source-level convention against regression.
### Driver.S7.Cli-005
@@ -145,7 +145,7 @@ and keep the explicit `finally`. Pick one disposal mechanism per command.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/` |
| Status | Open |
| Status | Resolved |
**Description:** A stale directory `tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/`
exists containing only an `obj/` folder — no `.csproj`, no source. The real test
@@ -158,7 +158,7 @@ grepping the tree for the S7 CLI test project.
directory (including its `obj/`). This is outside the module `src/` tree but is the
S7 CLI own orphaned test folder, so it belongs to this module cleanup.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — deleted the stale `tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/` directory (only contained a leftover `obj/` from before the move into `tests/Drivers/Cli/`; no tracked files). The real test project at `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/` is untouched.
### Driver.S7.Cli-006
@@ -167,7 +167,7 @@ S7 CLI own orphaned test folder, so it belongs to this module cleanup.
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/WriteCommandParseValueTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** The only test file covers `WriteCommand.ParseValue` and
`ReadCommand.SynthesiseTagName`. `S7CommandBase.BuildOptions` — which maps the
@@ -184,7 +184,7 @@ not be caught. `ParseValue` is also missing an explicit overflow-edge test (e.g.
overflow case to the `ParseValue` numeric tests once Driver.S7.Cli-001 is resolved so
the test asserts the wrapped `CommandException`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `S7CommandBaseBuildOptionsTests` covering Probe.Enabled=false (one-shot CLI guarantee), TimeoutMs→Timeout TimeSpan mapping, host/port/CPU/rack/slot flowing through, and tag-list passthrough. The overflow-edge `ParseValue` test was already added under Driver.S7.Cli-001 (`ParseValue_overflow_for_numeric_types_throws_CommandException`).
### Driver.S7.Cli-007
@@ -193,7 +193,7 @@ the test asserts the wrapped `CommandException`.
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli/Commands/SubscribeCommand.cs:45-51` |
| Status | Open |
| Status | Resolved |
**Description:** The Modbus CLI `SubscribeCommand` carries an explanatory comment on
the `OnDataChange` handler ("Route every data-change event to the CliFx console (not
@@ -206,4 +206,4 @@ Minor, but the rationale is worth keeping consistent across the CLI family.
**Recommendation:** Re-add the one-line comment from the Modbus `SubscribeCommand` so
the S7 copy explains why the event handler writes via `console.Output` synchronously.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — re-added the explanatory comment above the `OnDataChange` handler in the S7 `SubscribeCommand`, mirroring the Modbus copy: it explains the use of the CliFx `IConsole.Output` abstraction (rather than `System.Console`) and notes that the handler runs synchronously because it's raised from a driver background thread. Added `SubscribeCommandConsoleHandlerCommentTests` to guard the rationale against future copy-paste regressions.
+55 -11
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -89,7 +89,7 @@ correct the comment so the lossiness of UInt32 is documented.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `S7Driver.cs:172`, `S7Driver.cs:255` |
| Status | Open |
| Status | Resolved |
**Description:** ReadAsync and WriteAsync dereference fullReferences.Count /
writes.Count with no null guard. A null argument throws NullReferenceException
@@ -101,7 +101,13 @@ inconsistent with it.
**Recommendation:** Add ArgumentNullException.ThrowIfNull for the list parameters
at the top of ReadAsync and WriteAsync.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `ArgumentNullException.ThrowIfNull`
at the top of both `ReadAsync` and `WriteAsync`, placed BEFORE `RequirePlc()` so
a null argument produces a typed `ArgumentNullException` (consistent with
`DiscoverAsync`) rather than either an NRE on `.Count` or the "not initialized"
`InvalidOperationException` from `RequirePlc`. Regression tests
`ReadAsync_with_null_fullReferences_throws_ArgumentNullException` and
`WriteAsync_with_null_writes_throws_ArgumentNullException`.
### Driver.S7-004
@@ -133,7 +139,7 @@ and swallowed poll-loop / shutdown exceptions.
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `S7Driver.cs:33`, `S7Driver.cs:433` |
| Status | Open |
| Status | Resolved |
**Description:** System.Collections.Concurrent.ConcurrentDictionary is written
out with a fully-qualified namespace at the field declarations instead of a
@@ -145,7 +151,11 @@ S7Driver.cs despite the file-top using S7.Net.
**Recommendation:** Add using System.Collections.Concurrent and drop the
redundant global::S7.Net. qualifiers where using S7.Net already covers them.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `using System.Collections.Concurrent` was
already added by an earlier finding fix; this resolution removes the remaining
`global::S7.Net.Plc` qualifiers from the `ReadOneAsync` and `WriteOneAsync`
signatures, now using the unqualified `Plc` type (the file-top `using S7.Net`
already covers it). House style restored.
### Driver.S7-006
@@ -250,7 +260,7 @@ status, and update _health to Degraded on transport failures.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `S7Driver.cs:392` |
| Status | Open |
| Status | Resolved |
**Description:** The subscription poll loop never reflects sustained polling
failure anywhere an operator can see it. PollLoopAsync swallows every
@@ -266,7 +276,19 @@ Interval indefinitely on a hard failure.
apply a capped backoff after consecutive errors; at minimum log the swallowed
exception (see Driver.S7-004).
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `PollLoopAsync` now tracks
`consecutiveFailures`, calls new `HandlePollFailure` which both logs (with the
failure count) AND degrades `_health` to `Degraded` once
`PollFailureHealthThreshold` (1) consecutive failures have accumulated, and
applies a capped exponential backoff via new `ComputeBackoffDelay` (doubles the
wait each consecutive failure up to a 30 s `PollBackoffCap`). A healthy tick
resets the counter so the cadence snaps back to the configured Interval.
`HandlePollFailure` refuses to downgrade a `Faulted` state (reserved for
permanent config faults like PUT/GET-denied). Regression test
`PollLoop_against_uninitialized_driver_degrades_health` proves the health
surface now reflects sustained failure; `PollLoop_applies_capped_backoff_after_consecutive_failures`
proves shutdown still completes inside the drain window even under a fault
storm.
### Driver.S7-010
@@ -275,7 +297,7 @@ exception (see Driver.S7-004).
| Severity | Low |
| Category | Performance & resource management |
| Location | `S7Driver.cs:504` |
| Status | Open |
| Status | Resolved |
**Description:** Dispose() is implemented as
DisposeAsync().AsTask().GetAwaiter().GetResult() - sync-over-async. Inside the
@@ -288,7 +310,16 @@ blocking wrap is unnecessary risk.
perform the teardown directly (cancel CTSs, close Plc, dispose _gate) without
round-tripping through the async path.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `Dispose()` now performs teardown
directly via a new private `SynchronousTeardown` method that mirrors
`ShutdownAsync` but uses `Task.WhenAll(...).Wait(DrainTimeout)` instead of
`await Task.WhenAll(...).WaitAsync(...)`. Probe + poll Tasks are still drained
with the bounded 5 s timeout (so a wedged loop cannot hang `Dispose` indefinitely),
but the sync path no longer round-trips through `DisposeAsync().AsTask().GetAwaiter().GetResult()`.
`DisposeAsync` keeps its existing implementation for callers that opt into the
async dispose pattern. Regression tests
`Dispose_completes_synchronously_without_sync_over_async_round_trip` and
`Dispose_is_idempotent`.
### Driver.S7-011
@@ -358,7 +389,7 @@ ReadStatusAsync-based probe.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `S7DriverOptions.cs:90`, `S7Driver.cs:300` |
| Status | Open |
| Status | Resolved |
**Description:** S7TagDefinition.StringLength is a public configured/JSON-bound
parameter (default 254) but is dead: S7DataType.String reads and writes both
@@ -376,7 +407,20 @@ StringLength) at InitializeAsync / factory validation with a clear "not yet
supported" error, so a partially-implemented type cannot be configured into a
live address space.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `InitializeAsync` now runs new
`RejectUnsupportedTagDataTypes`, which throws `NotSupportedException` for any
tag whose `DataType` is in the `UnimplementedDataTypes` set (`Int64`, `UInt64`,
`Float64`, `String`, `DateTime`). The half-implemented types can no longer leak
into the live address space — a site that configures one fails fast at init
rather than seeing a node that returns `BadNotSupported` on every access.
Entries should be removed from `UnimplementedDataTypes` as each type is wired
through; the comment on `RejectUnsupportedTagDataTypes` makes it a single grep
target for that follow-up. `StringLength` remains in `S7TagDefinition` because
removing it would be a breaking change to existing config JSON; once `String`
is implemented it will be consumed without further config changes. Regression
tests `Initialize_rejects_not_yet_implemented_data_type_with_NotSupportedException`
(Theory, 5 types) and `Initialize_accepts_implemented_data_types` (Theory, 7
types) prove the guard is targeted.
### Driver.S7-014
+66 -15
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 7 |
| Open findings | 0 |
## Checklist coverage
@@ -40,7 +40,7 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `TwinCATCommandBase.cs:23-24`, `Commands/SubscribeCommand.cs:23-24`, `Commands/BrowseCommand.cs:21-24` |
| Status | Open |
| Status | Resolved |
**Description:** Numeric command options are accepted without range validation. `--timeout-ms`
feeds `Timeout => TimeSpan.FromMilliseconds(TimeoutMs)`; passing `--timeout-ms 0` or a negative
@@ -56,7 +56,16 @@ failure mode should be a readable up-front rejection.
shared helper on `TwinCATCommandBase`) and throw `CliFx.Exceptions.CommandException` with a
clear message when `TimeoutMs <= 0`, `IntervalMs <= 0`, or `AmsPort` falls outside `1..65535`.
**Resolution:** _(open)_
**Resolution:** Added a virtual `Validate()` helper on `TwinCATCommandBase` that rejects
`TimeoutMs <= 0` and `AmsPort` outside `1..65535` with a clean
`CliFx.Exceptions.CommandException` carrying the offending value. `SubscribeCommand` overrides
`Validate()` to add the `IntervalMs > 0` check. Each `ExecuteAsync` calls `Validate()` first
so the CLI surfaces "bad argument" up-front instead of letting the driver fail with an opaque
transport error. Covered by `TwinCATCommandBaseTests.Validate_rejects_zero_timeout`,
`Validate_rejects_negative_timeout`, `Validate_rejects_out_of_range_ams_port` (theory, 4
cases), `Validate_accepts_in_range_ams_port` (theory, 4 cases),
`SubscribeCommand_validate_rejects_zero_interval`,
`SubscribeCommand_validate_rejects_negative_interval`.
### Driver.TwinCAT.Cli-002
@@ -65,7 +74,7 @@ clear message when `TimeoutMs <= 0`, `IntervalMs <= 0`, or `AmsPort` falls outsi
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `Commands/SubscribeCommand.cs:46-58` |
| Status | Open |
| Status | Resolved |
**Description:** The `OnDataChange` handler calls `console.Output.WriteLine(line)` synchronously.
In native ADS-notification mode the event is raised from the `Beckhoff.TwinCAT.Ads`
@@ -83,7 +92,12 @@ serialised on one poll loop; the TwinCAT native path has no such serialisation.
`TextWriter.Synchronized`. At minimum, gate it so the banner is written before the
subscription is registered (it already is) and lock the per-event writes against each other.
**Resolution:** _(open)_
**Resolution:** Introduced a per-execution `writeLock` object inside
`SubscribeCommand.ExecuteAsync`. Both the `OnDataChange` handler's `WriteLine` and the
post-subscription "Subscribed to ..." banner take the lock, so notification-callback writes
cannot interleave with the main-thread banner or with each other. Lock is local to the
command so parallel process instances do not contend with one another (each owns its own
console).
### Driver.TwinCAT.Cli-003
@@ -92,7 +106,7 @@ subscription is registered (it already is) and lock the per-event writes against
| Severity | Low |
| Category | Error handling & resilience |
| Location | `Commands/SubscribeCommand.cs:56-58` |
| Status | Open |
| Status | Resolved |
**Description:** The subscribe banner reports the mechanism purely from the `--poll-only` flag
(`var mode = PollOnly ? "polling" : "ADS notification"`). The doc (`docs/Driver.TwinCAT.Cli.md`)
@@ -108,7 +122,15 @@ returned `ISubscriptionHandle.DiagnosticId`, which is `twincat-native-sub-*` for
path vs the `PollGroupEngine` handle for poll mode) or soften the wording to "(requested:
ADS notification)" so it does not over-claim.
**Resolution:** _(open)_
**Resolution:** Added internal static
`SubscribeCommand.DescribeMechanism(ISubscriptionHandle)` that maps
`DiagnosticId.StartsWith("twincat-native-sub-", Ordinal)` to `"ADS notification"` and
anything else to `"polling"`. The banner now reads from the handle the driver actually
returned, so the line cannot disagree with what the driver did even if a future fallback
lands the subscription somewhere unexpected. Covered by
`SubscribeCommandMechanismTests.DescribeMechanism_returns_ADS_notification_for_native_handle`
(theory, 3 cases) and `DescribeMechanism_returns_polling_for_anything_else` (theory, 4 cases
including an ordinal case-sensitivity guard).
### Driver.TwinCAT.Cli-004
@@ -117,7 +139,7 @@ ADS notification)" so it does not over-claim.
| Severity | Low |
| Category | Design-document adherence |
| Location | `TwinCATCommandBase.cs:26-29`, `Commands/BrowseCommand.cs` |
| Status | Open |
| Status | Resolved |
**Description:** `--poll-only` is declared on `TwinCATCommandBase`, so it is inherited by
`browse`. `BrowseCommand` only ever calls `DiscoverAsync` — it never subscribes — so
@@ -131,7 +153,15 @@ disagree.
flag) onto an intermediate base shared by only `probe`/`read`/`subscribe`, or override/hide it
for `browse`. Alternatively document explicitly that the flag is a no-op for `browse`.
**Resolution:** _(open)_
**Resolution:** Introduced an intermediate `TwinCATTagCommandBase : TwinCATCommandBase` that
hosts the `--poll-only` flag and the `BuildOptions(...)` helper. `ProbeCommand`,
`ReadCommand`, `WriteCommand`, and `SubscribeCommand` inherit from this intermediate (they all
build a tag-list `TwinCATDriverOptions`). `BrowseCommand` keeps inheriting from
`TwinCATCommandBase` directly, so `--poll-only` no longer surfaces in `browse --help`. Browse
sets `UseNativeNotifications = true` on its inline options (irrelevant either way for the
discover-only path, but matches production wiring). Covered by
`TwinCATCommandBaseTests.BrowseCommand_does_not_expose_poll_only_flag` and
`ProbeCommand_still_exposes_poll_only_flag` (both reflect over the public property surface).
### Driver.TwinCAT.Cli-005
@@ -140,7 +170,7 @@ for `browse`. Alternatively document explicitly that the flag is a no-op for `br
| Severity | Low |
| Category | Code organization & conventions |
| Location | `Commands/ProbeCommand.cs:23`, `Commands/ReadCommand.cs:20`, `Commands/WriteCommand.cs:20`, `Commands/SubscribeCommand.cs:18` |
| Status | Open |
| Status | Resolved |
**Description:** The `--type` option is declared with the short alias `-t` on `read`, `write`,
and `subscribe`, but `ProbeCommand` declares `[CommandOption("type", ...)]` with no short
@@ -151,7 +181,10 @@ take the same `TwinCATDataType` option.
**Recommendation:** Add the `'t'` short alias to `ProbeCommand`'s `--type` option to match the
other three commands.
**Resolution:** _(open)_
**Resolution:** Added the `'t'` short alias on `ProbeCommand.DataType`'s `[CommandOption]`,
matching read/write/subscribe so muscle memory carries between the four verbs. Covered by
`TwinCATCommandBaseTests.ProbeCommand_type_option_carries_short_alias_t` which asserts the
`CommandOptionAttribute.ShortName` is `'t'`.
### Driver.TwinCAT.Cli-006
@@ -160,7 +193,7 @@ other three commands.
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Tests/WriteCommandParseValueTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** The only test file covers `WriteCommand.ParseValue` and
`ReadCommand.SynthesiseTagName`. Other deterministic, router-independent logic is untested:
@@ -177,7 +210,20 @@ module's scope but worth flagging to whoever owns the test tree.
`BuildOptions` field wiring, and for the `CollectingAddressSpaceBuilder` prefix/max filtering
and access-classification logic.
**Resolution:** _(open)_
**Resolution:** Added two new test classes. `TwinCATCommandBaseTests` covers the Gateway
canonical form, round-trip through `TwinCATAmsAddress.TryParse`, DriverInstanceId
composition, Timeout projection, BuildOptions field wiring (devices, tags, timeout, probe
disabled, controller-browse disabled, UseNativeNotifications default true), and the PollOnly
toggle flipping UseNativeNotifications. `BrowseCommandFilterTests` covers
`CollectingAddressSpaceBuilder` (records variables in call order, treats `Folder` as a
same-builder pass-through), `BrowseCommand.FilterByPrefix` (empty/null prefix passes
everything, case-sensitive ordinal match), `BrowseCommand.PrintLimit` (max <= 0 = unbounded,
caps when matched > max, no padding when matched < max), and `BrowseCommand.AccessTag`
(ViewOnly -> RO, every other classification -> RW, theory over all 6 non-ViewOnly values).
`BrowseCommand.CollectingAddressSpaceBuilder` made `internal` (was `private`) so the test
project can construct it directly via the existing `InternalsVisibleTo` hook. Total tests
for this assembly went from 27 to 69. The stale empty sibling test directory mention is
left out of scope as noted.
### Driver.TwinCAT.Cli-007
@@ -186,7 +232,7 @@ and access-classification logic.
| Severity | Low |
| Category | Documentation & comments |
| Location | `TwinCATCommandBase.cs:31-36` |
| Status | Open |
| Status | Resolved |
**Description:** The `Timeout` override has an empty `init` accessor with the comment
`/* driven by TimeoutMs */`. Because the base `DriverCommandBase.Timeout` is declared
@@ -199,4 +245,9 @@ gives no hint of the deliberate no-op. This is a maintainability/clarity nit, no
computed projection of `--timeout-ms` and the `init` accessor is intentionally a no-op, so the
design intent survives refactoring.
**Resolution:** _(open)_
**Resolution:** Replaced the `<inheritdoc/>` on `TwinCATCommandBase.Timeout` with an explicit
`<summary>` documenting that `Timeout` is projected from `TimeoutMs`, that the `init`
accessor required by the abstract base property is intentionally a no-op, and that adding a
backing field would cause the two to drift on every refactor. The inner-block comment was
tightened to point at the XML summary so the design intent survives whichever doc surface a
future maintainer reads first.
+11 -11
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -112,7 +112,7 @@ does not support UDT tags, and `BrowseSymbolsAsync` already correctly yields
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `TwinCATDataType.cs:24-27` |
| Status | Open |
| Status | Resolved |
**Description:** The inline comments for the IEC time types are inaccurate. TwinCAT `TIME` is
a duration (32-bit, milliseconds) — not "ms since epoch of day". `DATE` is stored as seconds
@@ -125,7 +125,7 @@ implementer who tries to add proper conversion.
date/time semantics are intended to be exposed properly, track a follow-up to decode them to
`DriverDataType.DateTime`; otherwise document that they surface as raw counters.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — rewrote the inline comments to match the actual IEC 61131-3 / TwinCAT encoding (TIME = duration in ms, DATE = seconds since 1970-01-01 truncated to a day boundary, DT = seconds since 1970-01-01, TOD = ms since midnight) and added a block comment documenting that the driver surfaces them as raw UDINT counters via `DriverDataType.UInt32`. Test `Iec_time_types_map_to_uint32_raw_counter` pins the mapping.
### Driver.TwinCAT-005
@@ -156,7 +156,7 @@ catch), native-notification registration failures, and host state transitions
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `TwinCATDriver.cs:406-411` |
| Status | Open |
| Status | Resolved |
**Description:** `ResolveHost` falls back to `DriverInstanceId` when there are no configured
devices and the reference is unknown. `DriverInstanceId` is a logical config-DB identifier,
@@ -169,7 +169,7 @@ connectivity-status row.
empty string or a documented unresolved marker), or document why the instance ID is the chosen
fallback. Prefer the first device HostAddress only when one exists (already done).
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `ResolveHost` now returns `TwinCATDriver.UnresolvedHostSentinel` (empty string) when no devices are configured, replacing the `DriverInstanceId` collision with `GetHostStatuses()` rows. The sentinel is publicly documented on the driver type. Updated `ResolveHost_falls_back_to_unresolved_sentinel_when_no_devices` (was `_to_DriverInstanceId_`) and added `ResolveHost_returns_unresolved_sentinel_when_no_devices` + `ResolveHost_unresolved_sentinel_matches_no_GetHostStatuses_entry` regressions.
### Driver.TwinCAT-007
@@ -361,7 +361,7 @@ part of the documented driver contract, not optional.
| Severity | Low |
| Category | Design-document adherence |
| Location | `TwinCATDriverOptions.cs:41-43`, `TwinCATDriverOptions.cs:57-62`, `AdsTwinCATClient.cs:145` |
| Status | Open |
| Status | Resolved |
**Description:** Several drifts between the implemented config surface and
`docs/v2/driver-specs.md` section 6. The spec connection-settings list has separate `Host`
@@ -376,7 +376,7 @@ the probe path connects via `_options.Timeout` — a dead config field. The spec
shape (the doc is DRAFT, so updating it is acceptable). Remove or wire up
`TwinCATProbeOptions.Timeout`. Expose `NotificationMaxDelayMs` if batching control is wanted.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `TwinCATProbeOptions.Timeout` is now wired into `EnsureConnectedAsync` via an optional `timeoutOverride` parameter that the probe loop passes (reads / writes keep the driver-level `_options.Timeout`). Added a `TwinCATDriverOptions.NotificationMaxDelayMs` config knob (parsed from `driverConfigJson` via `TwinCATDriverConfigDto.NotificationMaxDelayMs`) and threaded it through `ITwinCATClient.AddNotificationAsync` so `NotificationSettings` carries the configured max-delay instead of the hard-coded 0. The `Host` / `AmsNetId` / `AmsPort` triple in the spec was already implemented as the single `HostAddress` (parsed `ads://{netId}:{port}` URI) — kept as-is to match the v2 driver convention; covered by `TwinCATAmsAddress`. Regression tests: `ProbeOptions_Timeout_is_applied_to_probe_calls`, `NotificationMaxDelayMs_is_exposed_on_driver_options`, `NotificationMaxDelayMs_parses_from_driver_config_json`.
### Driver.TwinCAT-015
@@ -385,7 +385,7 @@ shape (the doc is DRAFT, so updating it is acceptable). Remove or wire up
| Severity | Low |
| Category | Code organization & conventions |
| Location | `TwinCATDriver.cs:431-432` |
| Status | Open |
| Status | Resolved |
**Description:** `Dispose()` runs `DisposeAsync().AsTask().GetAwaiter().GetResult()`
sync-over-async. `docs/v2/driver-stability.md` section Galaxy explicitly lists "sync-over-async
@@ -399,7 +399,7 @@ here — cancelling token sources, disposing clients, clearing dictionaries —
synchronous, and `PollGroupEngine.DisposeAsync` completes synchronously, so factor the
synchronous teardown out so `Dispose()` does not block on a `Task`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `Dispose()` now does an inline synchronous teardown with no `await` and no captured sync context: dispose native subscriptions, drive `PollGroupEngine.DisposeAsync` via `.AsTask().Wait(5s)` (no context capture), per-device `ProbeCts.Cancel()` + `ProbeTask.Wait(2s)`, `DisposeClient()` / `DisposeGate()`, then clear the dictionaries. `DisposeAsync` still routes through `ShutdownAsync` for genuinely async callers. Regression test `Dispose_does_not_block_on_async_in_default_synchronization_context` runs `Dispose()` inside a single-threaded `SynchronizationContext` that would deadlock a sync-over-async teardown and asserts it completes within 5s.
### Driver.TwinCAT-016
@@ -408,7 +408,7 @@ synchronous teardown out so `Dispose()` does not block on a `Task`.
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/` |
| Status | Open |
| Status | Resolved |
**Description:** Unit coverage exists for AMS-address parsing, symbol-path parsing, read/write,
native notifications, symbol browse, and the capability surface. Gaps tied to the findings
@@ -423,4 +423,4 @@ without truncation (Driver.TwinCAT-002).
addressed, especially a concurrency stress test for `EnsureConnectedAsync` and a
`ReinitializeAsync`-applies-new-config test.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — the previously-closed High findings each grew their regression coverage as they were resolved (see `TwinCATHighFindingsRegressionTests`: `ReinitializeAsync_applies_changed_device_config` for -001, `LInt_read_round_trips_value_above_int_MaxValue` + `DataType_mapping_preserves_width_and_signedness` for -002, `Concurrent_reads_on_one_device_create_a_single_client` + `Concurrent_reads_and_writes_share_one_client` for -007, `Symbol_version_changed_raises_OnRediscoveryNeeded` + `TwinCATDriver_implements_IRediscoverable` for -013). This pass added the two remaining gaps: `Structure_typed_pre_declared_tag_is_rejected_at_config_parse` (-003) and `Probe_loop_and_read_share_one_client_per_device` (-009 disposal-race coverage races 64 readers against the probe loop for 500ms and asserts a single client / single connect). All coverage lives in the test files `TwinCATHighFindingsRegressionTests.cs` and the new `TwinCATLowFindingsRegressionTests.cs`.
+144 -133
View File
@@ -12,148 +12,41 @@ Each module's `findings.md` is the source of truth; this file is generated from
|---|---|---|---|---|---|---|
| [Admin](Admin/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 13 |
| [Analyzers](Analyzers/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 7 |
| [Client.CLI](Client.CLI/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 8 | 10 |
| [Client.Shared](Client.Shared/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 11 |
| [Client.UI](Client.UI/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 6 | 11 |
| [Client.CLI](Client.CLI/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 10 |
| [Client.Shared](Client.Shared/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 11 |
| [Client.UI](Client.UI/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 11 |
| [Configuration](Configuration/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 11 |
| [Core](Core/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
| [Core.Abstractions](Core.Abstractions/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 8 |
| [Core.AlarmHistorian](Core.AlarmHistorian/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 11 |
| [Core.ScriptedAlarms](Core.ScriptedAlarms/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
| [Core.Scripting](Core.Scripting/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 11 |
| [Core.ScriptedAlarms](Core.ScriptedAlarms/findings.md) | Claude Code | 2026-05-23 | `a9be809` | Reviewed | 0 | 13 |
| [Core.Scripting](Core.Scripting/findings.md) | Claude Code | 2026-05-23 | `a9be809` | Reviewed | 0 | 16 |
| [Core.VirtualTags](Core.VirtualTags/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 13 |
| [Driver.AbCip](Driver.AbCip/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 15 |
| [Driver.AbCip.Cli](Driver.AbCip.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 6 | 8 |
| [Driver.AbLegacy](Driver.AbLegacy/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 3 | 13 |
| [Driver.AbLegacy.Cli](Driver.AbLegacy.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 6 | 7 |
| [Driver.Cli.Common](Driver.Cli.Common/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 2 | 6 |
| [Driver.FOCAS](Driver.FOCAS/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 12 |
| [Driver.FOCAS.Cli](Driver.FOCAS.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 5 |
| [Driver.Galaxy](Driver.Galaxy/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 4 | 14 |
| [Driver.Historian.Wonderware](Driver.Historian.Wonderware/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 7 | 12 |
| [Driver.Historian.Wonderware.Client](Driver.Historian.Wonderware.Client/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 10 |
| [Driver.Modbus](Driver.Modbus/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 7 | 12 |
| [Driver.Modbus.Addressing](Driver.Modbus.Addressing/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 3 | 9 |
| [Driver.Modbus.Cli](Driver.Modbus.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 6 | 8 |
| [Driver.OpcUaClient](Driver.OpcUaClient/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 2 | 15 |
| [Driver.S7](Driver.S7/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 14 |
| [Driver.S7.Cli](Driver.S7.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 4 | 7 |
| [Driver.TwinCAT](Driver.TwinCAT/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 16 |
| [Driver.TwinCAT.Cli](Driver.TwinCAT.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 7 | 7 |
| [Driver.AbCip](Driver.AbCip/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 15 |
| [Driver.AbCip.Cli](Driver.AbCip.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 8 |
| [Driver.AbLegacy](Driver.AbLegacy/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 13 |
| [Driver.AbLegacy.Cli](Driver.AbLegacy.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 7 |
| [Driver.Cli.Common](Driver.Cli.Common/findings.md) | Claude Code | 2026-05-23 | `a9be809` | Reviewed | 0 | 8 |
| [Driver.FOCAS](Driver.FOCAS/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
| [Driver.FOCAS.Cli](Driver.FOCAS.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 5 |
| [Driver.Galaxy](Driver.Galaxy/findings.md) | Claude Code | 2026-05-23 | `a9be809` | Reviewed | 0 | 18 |
| [Driver.Historian.Wonderware](Driver.Historian.Wonderware/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
| [Driver.Historian.Wonderware.Client](Driver.Historian.Wonderware.Client/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 10 |
| [Driver.Modbus](Driver.Modbus/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
| [Driver.Modbus.Addressing](Driver.Modbus.Addressing/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 9 |
| [Driver.Modbus.Cli](Driver.Modbus.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 8 |
| [Driver.OpcUaClient](Driver.OpcUaClient/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 15 |
| [Driver.S7](Driver.S7/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 14 |
| [Driver.S7.Cli](Driver.S7.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 7 |
| [Driver.TwinCAT](Driver.TwinCAT/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 16 |
| [Driver.TwinCAT.Cli](Driver.TwinCAT.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 7 |
| [Server](Server/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 15 |
## Pending findings
Findings with status `Open` or `In Progress`, ordered by severity.
| ID | Severity | Category | Location | Description |
|---|---|---|---|---|
| Client.CLI-002 | Low | Correctness & logic bugs | `Commands/SubscribeCommand.cs:129-137` | The summary computes `neverWentBad` as every target whose node-id key is absent from the `everBad` dictionary. A node that received no update at all is also absent from `everBad`, so it is counted in `neverWentBad` and printed under the he… |
| Client.CLI-003 | Low | Correctness & logic bugs | `Commands/BrowseCommand.cs:29-30`, `Commands/SubscribeCommand.cs:20-27`, `Commands/AlarmsCommand.cs:28-29`, `Commands/HistoryReadCommand.cs:42-43` | Numeric command options accept any value with no range validation. `--depth`, `--interval`, `--max-depth`, `--max`, and the history `--interval` can all be supplied as `0` or a negative number. A negative `--depth`/`--max-depth` silently d… |
| Client.CLI-004 | Low | OtOpcUa conventions | `Commands/SubscribeCommand.cs:13-37` | `SubscribeCommand` is the only command in the module whose constructor and all `[CommandOption]` properties have no XML doc comments. Every other command (`ConnectCommand`, `ReadCommand`, `WriteCommand`, `BrowseCommand`, `AlarmsCommand`, `… |
| Client.CLI-006 | Low | Error handling & resilience | `Commands/HistoryReadCommand.cs:73`, `Commands/HistoryReadCommand.cs:76`, `Helpers/NodeIdParser.cs:39` | Operator input-format errors surface as raw .NET exceptions rather than clean CLI errors. An unparseable start/end value throws `FormatException` straight out of `DateTime.Parse`; an invalid node id throws `FormatException`/`ArgumentExcept… |
| Client.CLI-007 | Low | Performance & resource management | `CommandBase.cs:112-123` | `ConfigureLogging` builds a new Serilog `LoggerConfiguration`, creates a logger, and assigns it to the static `Log.Logger` without disposing the previously assigned logger. For a single CLI invocation this leaks at most one logger and the… |
| Client.CLI-008 | Low | Documentation & comments | `docs/Client.CLI.md:158-217` | `docs/Client.CLI.md` is stale relative to the code at this commit. (1) The `subscribe` command section documents only `-n` and `-i`, but the code (`SubscribeCommand`) also exposes `-r/--recursive`, `--max-depth`, `-q/--quiet`, `--duration`… |
| Client.CLI-009 | Low | Code organization & conventions | `Commands/SubscribeCommand.cs:66-165`, `Commands/AlarmsCommand.cs:52-91` | Both long-running commands attach an event handler (`service.DataChanged += ...`, `service.AlarmEvent += ...`) with a lambda and never detach it. Because the handler closes over `console`, the captured console and the closure remain refere… |
| Client.CLI-010 | Low | Testing coverage | `tests/Client/ZB.MOM.WW.OtOpcUa.Client.CLI.Tests/SubscribeCommandTests.cs` | The new `SubscribeCommand` capabilities are largely untested. The four `SubscribeCommandTests` cover only single-node subscribe, unsubscribe-on-cancel, disconnect-in-finally, and the subscription message. There is no test for the `--recurs… |
| Client.Shared-003 | Low | Correctness & logic bugs | `Adapters/DefaultSessionAdapter.cs:76`, `Adapters/DefaultSessionAdapter.cs:273` | `WriteValueAsync` returns `response.Results[0]` and `CallMethodAsync` reads `result.Results[0]` without first checking the `Results` collection is non-empty. A malformed or service-level-faulted response (empty `Results` alongside a servic… |
| Client.Shared-004 | Low | OtOpcUa conventions | `Adapters/DefaultSessionAdapter.cs:228`, `Adapters/DefaultSessionAdapter.cs:121`, `Adapters/DefaultSessionAdapter.cs:172` | `CloseAsync`, `HistoryReadRawAsync`, and `HistoryReadAggregateAsync` are declared `async Task` but call the synchronous `Session.Close()` / `Session.HistoryRead(...)` APIs and contain no `await`. The history methods run a blocking synchron… |
| Client.Shared-009 | Low | Error handling & resilience / Documentation & comments | `OpcUaClientService.cs:302-322` | `AcknowledgeAlarmAsync` is typed `Task<StatusCode>` and its XML doc implies the returned code reports the ack outcome, but the method unconditionally `return StatusCodes.Good`. The actual failure path is `DefaultSessionAdapter.CallMethodAs… |
| Client.Shared-010 | Low | Performance & resource management | `Models/ConnectionSettings.cs:48`, `OpcUaClientService.cs:408-417` | `ConnectionSettings.CertificateStorePath` is initialized to `ClientStoragePaths.GetPkiPath()` as a property initializer, so every `ConnectionSettings` instantiation runs `Environment.GetFolderPath` + `Path.Combine` and, on the first call p… |
| Client.Shared-011 | Low | Testing coverage | `tests/Client/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/OpcUaClientServiceTests.cs` | The test suite is solid for the happy paths, connection lifecycle, and single-failover behavior. Gaps relative to the findings above: (a) no test exercises concurrent `SubscribeAsync`/failover to expose the `_activeDataSubscriptions` race… |
| Client.UI-003 | Low | OtOpcUa conventions | `ZB.MOM.WW.OtOpcUa.Client.UI.csproj:20-21`, `Program.cs:14-20` | The csproj references `Serilog` and `Serilog.Sinks.Console`, and `docs/Client.UI.md` lists Serilog as the logging technology, but no source file in the module uses Serilog. `Program.BuildAvaloniaApp()` uses Avalonia's `LogToTrace()` and th… |
| Client.UI-004 | Low | OtOpcUa conventions | `Views/MainWindow.axaml.cs:125-138` | `OnBrowseCertPathClicked` uses `OpenFolderDialog`, which is obsolete in Avalonia 11.x (the version pinned in the csproj). The supported replacement is the `StorageProvider` API (`StorageProvider.OpenFolderPickerAsync`). Using the obsolete… |
| Client.UI-006 | Low | Error handling & resilience | `ViewModels/MainWindowViewModel.cs:244-252`, `ViewModels/AlarmsViewModel.cs:88-112`, `ViewModels/SubscriptionsViewModel.cs:79-94` | Many catch blocks swallow exceptions silently with an empty body and only a comment (`// Redundancy info not available`, `// Subscribe failed`, `// Subscription failed; no item added`, and others). When a subscribe, alarm-subscribe, or red… |
| Client.UI-009 | Low | Design-document adherence | `ViewModels/HistoryViewModel.cs:44-54` | `HistoryViewModel.AggregateTypes` exposes eight entries: `null` (Raw) plus Average, Minimum, Maximum, Count, Start, End, and `StandardDeviation`. `docs/Client.UI.md` ("Query Options" table) lists only "Raw (default), Average, Minimum, Maxi… |
| Client.UI-010 | Low | Code organization & conventions | `Controls/DateTimeRangePicker.axaml.cs:33-37`, `Controls/DateTimeRangePicker.axaml.cs:70-80` | `DateTimeRangePicker` declares `MinDateTimeProperty` / `MaxDateTimeProperty` styled properties with public CLR accessors, but neither is read anywhere in the control. `TryParseDateTime`, `OnStartLostFocus`, and `OnEndLostFocus` never clamp… |
| Client.UI-011 | Low | Documentation & comments | `Views/MainWindow.axaml:81`, `Services/JsonSettingsService.cs:11-15` | The certificate-store-path `TextBox` watermark reads `(default: AppData/LmxOpcUaClient/pki)`, referencing the legacy pre-task-#208 folder name. Per `CLAUDE.md` / `docs/Client.UI.md` the canonical path is now `{LocalAppData}/OtOpcUaClient/`… |
| Driver.AbCip-007 | Low | OtOpcUa conventions | `AbCipDriver.cs` (whole file), `AbCipAlarmProjection.cs`, `LibplctagTagRuntime.cs` | `CLAUDE.md` Library Preferences mandate Serilog with a rolling daily file sink. The driver has no logging at all: no `ILogger`/Serilog dependency is injected or used. Failure paths instead swallow exceptions into the `_health` string (`Rea… |
| Driver.AbCip-011 | Low | Error handling & resilience | `AbCipDriver.cs:144-152`, `AbCipDriverOptions.cs:131-143` | `InitializeAsync` only starts probe loops when `_options.Probe.Enabled` is true AND `Probe.ProbeTagPath` is non-blank. When `Probe.Enabled` is true (the default) but `ProbeTagPath` is null (also the default; the doc comment says "PR 8 wire… |
| Driver.AbCip-012 | Low | Performance & resource management | `LibplctagTemplateReader.cs:15-35`, `AbCipDriver.cs:88-92` | `LibplctagTemplateReader` is created per `FetchUdtShapeAsync` call, and each call constructs a fresh libplctag `Tag` for the @udt pseudo-tag, initializes it (a CIP connection handshake), reads, and disposes it. There is no reuse of the `Ta… |
| Driver.AbCip-013 | Low | Design-document adherence | `AbCipDriverOptions.cs:70-73`, `PlcFamilies/AbCipPlcFamilyProfile.cs:13-19`, `LibplctagTagRuntime.cs:16-27` | `driver-specs.md` specifies the AB CIP per-device connection settings as discrete fields: Host, Path, PlcType, TimeoutMs, AllowPacking, ConnectionSize. The implementation instead collapses host + path into a single opaque ab:// URL string… |
| Driver.AbCip-015 | Low | Documentation & comments | `AbCipDriver.cs:9-11`, `PlcTagHandle.cs:23-27,53-58`, `AbCipTemplateCache.cs:12-15`, `IAbCipTagEnumerator.cs:6-11`, `AbCipDriverOptions.cs:21` | Numerous comments are stale relative to the commit under review. `AbCipDriver.cs:9-11` says the driver "Implements IDriver only for now" with capabilities shipping "in subsequent PRs (3-8)" while the class already implements all of them. `… |
| Driver.AbCip.Cli-003 | Low | Concurrency & thread safety | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/Commands/SubscribeCommand.cs:50-56,60-61` | The `OnDataChange` handler writes change lines to `console.Output` (a `TextWriter`) from the driver's poll-engine callback thread, while the command's main flow concurrently writes the "Subscribed to ... Ctrl+C to stop." line on the CLI th… |
| Driver.AbCip.Cli-004 | Low | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/Commands/SubscribeCommand.cs:28,58`; `AbCipCommandBase.cs:26-34` | `--interval-ms` (`IntervalMs`) is taken verbatim and passed as `TimeSpan.FromMilliseconds(IntervalMs)` to `SubscribeAsync` with no validation. A zero or negative value produces a non-positive `TimeSpan`; the option description claims "Poll… |
| Driver.AbCip.Cli-005 | Low | Performance & resource management | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:51-59` | `ConfigureLogging` assigns a freshly created Serilog logger to the process-global `Log.Logger` but never calls `Log.CloseAndFlush()`. For a short-lived one-shot command (`probe`, `read`, `write`) the process exit flushes the console sink,… |
| Driver.AbCip.Cli-006 | Low | Design-document adherence | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/AbCipCommandBase.cs:29-34` | `AbCipCommandBase` overrides the abstract `DriverCommandBase.Timeout` property with a getter derived from `TimeoutMs` and an empty `init` body (`init { /* driven by TimeoutMs */ }`). Because the override has no `[CommandOption]` attribute,… |
| Driver.AbCip.Cli-007 | Low | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli.Tests/WriteCommandParseValueTests.cs` | The only test file covers `WriteCommand.ParseValue` and `ReadCommand.SynthesiseTagName` — both pure static helpers. There is no coverage for `AbCipCommandBase.BuildOptions` (the flag-to-`AbCipDriverOptions` mapping that all four commands d… |
| Driver.AbCip.Cli-008 | Low | Documentation & comments | `docs/Driver.AbCip.Cli.md:8-9` | `docs/Driver.AbCip.Cli.md` opens with "Second of four driver test-client CLIs (Modbus -> AB CIP -> AB Legacy -> S7 -> TwinCAT)." The count "four" contradicts the chain that follows it (five names) and contradicts `docs/DriverClis.md`, whic… |
| Driver.AbLegacy-005 | Low | OtOpcUa conventions | `AbLegacyDriver.cs` (whole file) | The driver uses no `ILogger`/Serilog at all. Probe-loop failures, runtime initialisation failures, libplctag non-zero statuses, and read/write exceptions are folded into `DriverHealth.Detail` strings but never logged. CLAUDE.md names Seril… |
| Driver.AbLegacy-011 | Low | Performance & resource management | `AbLegacyDriver.cs:440` | `Dispose()` is implemented as `DisposeAsync().AsTask().GetAwaiter().GetResult()` - sync-over-async. `ShutdownAsync` awaits `_poll.DisposeAsync()` (which completes synchronously) and does no other real async work, so a deadlock is unlikely… |
| Driver.AbLegacy-013 | Low | Code organization & conventions | `AbLegacyDriver.cs:340-345`, `AbLegacyDriver.cs:238-264` | Two minor organisational issues: 1. `ResolveHost` returns `_options.Devices.FirstOrDefault()?.HostAddress ?? DriverInstanceId` when the reference is unknown and no devices are configured. `DriverInstanceId` is not a host address (ab://...)… |
| Driver.AbLegacy.Cli-002 | Low | Correctness & logic bugs | `Commands/WriteCommand.cs:27-29`, `Program.cs:6-9` | The `--value` option help text states "booleans accept true/false/1/0", but `ParseBool` (`WriteCommand.cs:74-80`) and the error message also accept `on/off` and `yes/no`, and `DriverClis.md` documents the full `true/false/1/0/yes/no/on/off… |
| Driver.AbLegacy.Cli-003 | Low | Concurrency & thread safety | `Commands/SubscribeCommand.cs:47-53` | The `OnDataChange` handler calls `console.Output.WriteLine(line)` (the synchronous overload) directly from the `PollGroupEngine` poll thread. The poll engine raises change events from a background timer/loop thread, so two ticks that fire… |
| Driver.AbLegacy.Cli-004 | Low | Error handling & resilience | `Commands/ProbeCommand.cs:37-56`, `Commands/ReadCommand.cs:39-50`, `Commands/WriteCommand.cs:48-59`, `Commands/SubscribeCommand.cs:41-76` | Every command does `await using var driver = new AbLegacyDriver(...)` *and* an explicit `await driver.ShutdownAsync(...)` in the `finally`. `AbLegacyDriver` `DisposeAsync` itself calls `ShutdownAsync`, so the driver is shut down twice on t… |
| Driver.AbLegacy.Cli-005 | Low | Design-document adherence | `Commands/SubscribeCommand.cs:23-25`, `docs/Driver.AbLegacy.Cli.md:94-96` | The subscribe command interval option is `--interval-ms` (default 1000). `docs/Driver.AbLegacy.Cli.md` shows the subscribe example as `otopcua-ablegacy-cli subscribe ... -i 500`, which works because of the short alias `'i'`, but the doc ne… |
| Driver.AbLegacy.Cli-006 | Low | Code organization & conventions | `Commands/ProbeCommand.cs:20-22` | `ProbeCommand` declares its `--type` option with no short alias, while `ReadCommand`, `WriteCommand`, and `SubscribeCommand` all declare `--type` with the short alias `'t'`. `ProbeCommand` also gives `--address` the alias `'a'`, matching t… |
| Driver.AbLegacy.Cli-007 | Low | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli.Tests/WriteCommandParseValueTests.cs` | The only test file in the CLI test project covers `WriteCommand.ParseValue` and `ReadCommand.SynthesiseTagName`. Two behaviours that are pure logic (testable without a device) are uncovered: (1) `AbLegacyCommandBase.BuildOptions` — that it… |
| Driver.Cli.Common-004 | Low | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:68-70` | `FormatTable` calls `rows.Max(r => r.Tag.Length)` (and the same for the value and status columns) without guarding against empty input. When `tagNames` and `snapshots` are both empty (equal length, so the mismatch check at line 56 passes),… |
| Driver.Cli.Common-006 | Low | Documentation & comments | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:71`, `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:9` | Two minor doc inaccuracies. (1) The comment at `SnapshotFormatter.cs:71` states the "source-time column is fixed-width (ISO-8601 to ms) so no max-measurement needed" — true only when every snapshot has a non-null `SourceTimestampUtc`. `For… |
| Driver.FOCAS-007 | Low | Error handling & resilience | `FocasDriver.cs:140-148`, `FocasDriver.cs:478-484`, `FocasDriver.cs:529-533`, `FocasAlarmProjection.cs:61-63` | Numerous `try { ... } catch {}` blocks swallow every exception with no logging - `ShutdownAsync` (CTS cancel/dispose), `RecycleLoopAsync` (`DisposeClient`), `FixedTreeLoopAsync` transient catches, `ProbeLoopAsync`, and the alarm projection… |
| Driver.FOCAS-008 | Low | Performance & resource management | `FocasDriver.cs:201`, `FocasDriver.cs:253` | `ReadAsync` and `WriteAsync` call `FocasAddress.TryParse(def.Address)` on every operation, even though `InitializeAsync` already parsed and validated every tag address. On a subscription hot path (each poll tick re-enters `ReadAsync`) this… |
| Driver.FOCAS-009 | Low | Design-document adherence | `FocasDriverOptions.cs:110-115`, `FocasDriver.cs:468-486`, `FocasDriverFactoryExtensions.cs:75-80` | `FocasProbeOptions.Timeout` is parsed by the factory (`FocasProbeDto.TimeoutMs` to `FocasProbeOptions.Timeout`) but never consumed. `ProbeLoopAsync` calls `client.ProbeAsync(ct)` with only the probe-loop cancellation token; no per-probe ti… |
| Driver.FOCAS-010 | Low | Code organization & conventions | `IFocasClient.cs:210-227` (`FocasOpMode`), `FocasConstants.cs:42-78` (`FocasOperationMode`) | There are two parallel operation-mode-to-text mappings with divergent labels. `FocasOpMode.ToText` (used by the driver fixed-tree `OperationMode/ModeText` node) yields `"TJOG"`, `"TEACH_IN_HANDLE"`; `FocasOperationModeExtensions.ToText` (i… |
| Driver.FOCAS-011 | Low | Code organization & conventions | `IFocasClient.cs:275-287` (`FocasAlarmType`), `FocasAlarmProjection.cs:149-175` | `FocasAlarmType` declares its constants as `public const int`, but the only consumers - `FocasAlarmProjection.MapAlarmType(short type)` and `MapSeverity(short type)` - take a `short` and `switch` against these `int` constants. It compiles… |
| Driver.FOCAS.Cli-001 | Low | Error handling & resilience | `Commands/WriteCommand.cs:58-68` | `WriteCommand.ParseValue` parses the numeric `--value` types (`Byte`/`Int16`/`Int32`/`Float32`/`Float64`) with `sbyte.Parse` / `short.Parse` / etc. These throw raw `FormatException` or `OverflowException` for malformed or out-of-range inpu… |
| Driver.FOCAS.Cli-002 | Low | Concurrency & thread safety | `Commands/SubscribeCommand.cs:45-51` | The `subscribe` command attaches an `OnDataChange` handler that calls the synchronous `console.Output.WriteLine`. `OnDataChange` is raised from the driver's `PollGroupEngine` tick thread, while the command's main flow writes the "Subscribe… |
| Driver.FOCAS.Cli-003 | Low | Error handling & resilience | `FocasCommandBase.cs:19` (`CncPort`), `FocasCommandBase.cs:27` (`TimeoutMs`), `Commands/SubscribeCommand.cs:23` (`IntervalMs`) | The numeric command options `--cnc-port`, `--timeout-ms`, and `--interval-ms` are accepted without range validation. A zero or negative `--cnc-port` produces an invalid `focas://host:<n>` string; `--timeout-ms 0` yields a zero `TimeSpan` o… |
| Driver.FOCAS.Cli-004 | Low | Performance & resource management | `Commands/ProbeCommand.cs:37,54`; `Commands/ReadCommand.cs:37,46`; `Commands/WriteCommand.cs:45,54`; `Commands/SubscribeCommand.cs:39,73` | Every command declares `await using var driver = new FocasDriver(...)` |
| Driver.FOCAS.Cli-005 | Low | Design-document adherence | `Commands/WriteCommand.cs:50`, `Commands/ProbeCommand.cs:50` (via `SnapshotFormatter.FormatStatus`) | `docs/Driver.FOCAS.Cli.md` documents `BadDeviceFailure` and `BadCommunicationError` as the key diagnostic signals an operator reads off `probe` / `write` output ("A `BadCommunicationError` means ... `BadDeviceFailure` after a successful co… |
| Driver.Galaxy-005 | Low | OtOpcUa conventions | `Runtime/EventPump.cs:81-88` | The `BoundedChannelOptions` comment states "Newest-dropped policy: when full, the producer's TryWrite returns false ... We do this manually rather than relying on `BoundedChannelFullMode.DropWrite`" — but the option is then set to `FullMod… |
| Driver.Galaxy-010 | Low | Security | `GalaxyDriver.cs:311-341` | `ResolveApiKey` supports an `env:`/`file:` indirection and otherwise treats the config string as the literal API key ("Anything else — used as the literal API key. Convenient for dev"). `GalaxyGatewayOptions`' own XML doc claims "the API k… |
| Driver.Galaxy-012 | Low | Performance & resource management | `Runtime/SubscriptionRegistry.cs:65-67`, `GalaxyDriver.cs:538`, `GalaxyDriver.cs:675` | Several hot paths are O(n^2) per call. `SubscriptionRegistry.ResolveSubscribers` does `entry.Bindings.FirstOrDefault(b => b.ItemHandle == itemHandle)` — a linear scan of the whole binding list for every event dispatch; at 50k tags this is… |
| Driver.Galaxy-013 | Low | Design-document adherence | `GalaxyDriver.cs:14-27`, `GalaxyDriver.cs:374-382`, `Config/GalaxyDriverOptions.cs:84-86` | Multiple doc comments are stale relative to the shipped code. `GalaxyDriver`'s class summary still describes the file as "the project skeleton with `IDriver` bodies that wire to a future `IGalaxyGatewayClient` abstraction. Capability inter… |
| Driver.Historian.Wonderware-004 | Low | Correctness and logic bugs | `Backend/SdkAlarmHistorianWriteBackend.cs:198-201` | `ToHistorianEvent` only assigns `historianEvent.Id` when `Guid.TryParse(dto.EventId, ...)` succeeds. If `EventId` is not a parseable GUID (or is empty), `Id` stays `Guid.Empty` and the event is written to the historian with an all-zeros id… |
| Driver.Historian.Wonderware-005 | Low | Concurrency and thread safety | `Backend/HistorianDataSource.cs:124`, `:126-127` | `GetHealthSnapshot` reads `_activeProcessNode` and `_activeEventNode` inside `_healthLock`, but those two fields are written under `_connectionLock` / `_eventConnectionLock` (lines 183, 243, 209-210, 266-269) — a different lock. The health… |
| Driver.Historian.Wonderware-007 | Low | Error handling and resilience | `Ipc/PipeServer.cs:70-75` | When `VerifyCaller` rejects the peer SID, the server logs the reason and calls `_current.Disconnect()` with no `HelloAck` frame sent. The shared-secret-mismatch and major-version-mismatch paths below it both send a rejecting `HelloAck` so… |
| Driver.Historian.Wonderware-008 | Low | Error handling and resilience | `Backend/HistorianDataSource.cs:301-307`, `:374-380` | When `query.StartQuery` returns `false`, `ReadRawAsync` and `ReadAggregateAsync` call `HandleConnectionError()` and return an empty result list. A failed `StartQuery` is not necessarily a connection failure — it can be a bad tag name, an i… |
| Driver.Historian.Wonderware-010 | Low | Performance and resource management | `Backend/HistorianConfiguration.cs:32-36`, `Backend/HistorianDataSource.cs` (all read methods) | `HistorianConfiguration.RequestTimeoutSeconds` is documented as the "outer safety timeout applied to sync-over-async Historian operations" and is copied around (`SdkAlarmHistorianWriteBackend.CloneConfigWithServerName:346`), but it is neve… |
| Driver.Historian.Wonderware-011 | Low | Design-document adherence | `Backend/HistorianDataSource.cs:9-12`, `Backend/IHistorianDataSource.cs:9-11`, `Backend/HistorianSample.cs:7-9`, `Backend/HistorianConfiguration.cs:7-9` | Several XML doc comments reference the retired v1 architecture as if it were current: "inside Galaxy.Host", "the Proxy maps returned samples", "the Host returns these across the IPC boundary as `GalaxyDataValue`", "Populated from ... the P… |
| Driver.Historian.Wonderware-012 | Low | Testing coverage | `Backend/HistorianDataSource.cs`, `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/` | The unit-test suite covers `HistorianQualityMapper`, `HistorianClusterEndpointPicker`, `SdkAlarmHistorianWriteBackend`, `AahClientManagedAlarmEventWriter`, the IPC round trip, and `Program` alarm-writer wiring. `HistorianDataSource` itself… |
| Driver.Historian.Wonderware.Client-003 | Low | Concurrency & thread safety | `WonderwareHistorianClient.cs:207`, `WonderwareHistorianClient.cs:132-150` | `_totalQueries` is mutated with `Interlocked.Increment` in `Invoke`, but read inside `GetHealthSnapshot` under `_healthLock`, and every other counter (`_totalSuccesses`, `_totalFailures`, `_consecutiveFailures`) is mutated only under `_hea… |
| Driver.Historian.Wonderware.Client-004 | Low | Concurrency & thread safety | `WonderwareHistorianClient.cs:203-267` | A sidecar-reported failure is recorded in two non-atomic steps under separate lock acquisitions: `Invoke` calls `RecordSuccess()` (line 211) and then the caller calls `ThrowIfFailed` which calls `ReclassifySuccessAsFailure()` (line 256), d… |
| Driver.Historian.Wonderware.Client-006 | Low | Error handling & resilience | `Internal/PipeChannel.cs:96-107`, `WonderwareHistorianClientOptions.cs:11-12` | `PipeChannel.InvokeAsync` retries exactly once on transport failure and otherwise propagates. The options expose `ReconnectInitialBackoff` and `ReconnectMaxBackoff` and `WonderwareHistorianClientOptions` documents them as exponential backo… |
| Driver.Historian.Wonderware.Client-008 | Low | Security | `ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.csproj:29-32` | The csproj suppresses two NuGet audit advisories (`GHSA-37gx-xxp4-5rgx`, `GHSA-w3x6-4m5h-cxqf`) for the `MessagePack` 2.5.187 dependency with no inline comment recording why the suppression is safe, who reviewed it, or when it should be re… |
| Driver.Historian.Wonderware.Client-010 | Low | Documentation & comments | `WonderwareHistorianClient.cs:355-361`, `WonderwareHistorianClient.cs:132-150` | Two doc/behaviour mismatches. (1) The `Dispose()` XML comment asserts the underlying channel async cleanup is non-blocking so the `GetAwaiter()/GetResult()` bridge is safe. `PipeChannel.DisposeAsync` calls `ResetTransport()`, which invokes… |
| Driver.Modbus-003 | Low | Concurrency & thread safety | `ModbusDriver.cs:59,188,241,259,266,726,745,759` | `_health` is a non-`volatile` reference field written from multiple threads (concurrent `ReadAsync` callers, the coalesced-read path, `WriteAsync` indirectly, and `ProbeLoopAsync`) and read by `GetHealth()`. Reference assignment is atomic… |
| Driver.Modbus-007 | Low | Design-document adherence | `ModbusDriver.cs:1392`, `ModbusDriverOptions.cs:74-80` | Two design-vs-code drifts. (1) `MapDataType` maps `Int64`/`UInt64` to `DriverDataType.Int32` with the inline comment "widening to Int32 loses precision; PR 25 adds Int64 to DriverDataType". The address-space node for a 64-bit Modbus tag is… |
| Driver.Modbus-008 | Low | Documentation & comments | `ModbusDriver.cs:411-417,700-703,737-744` | Stale/misleading comments. (1) The `<summary>` block at `ModbusDriver.cs:411-417` says auto-prohibited ranges are "Cleared by ReinitializeAsync ... or by an explicit re-probe API (not yet shipped)" — the re-probe loop has shipped (#151, `R… |
| Driver.Modbus-009 | Low | Correctness & logic bugs | `ModbusDriver.cs:1160-1167`, `ModbusTcpTransport.cs:94-95` | Two edge cases. (1) `RegisterCount` for `ModbusDataType.String` computes `(tag.StringLength + 1) / 2`; a tag configured with `StringLength = 0` yields a register count of 0, flowing into `ReadOneAsync` as `totalRegs = 0` and producing an F… |
| Driver.Modbus-010 | Low | Error handling & resilience | `ModbusDriver.cs:864-868`, `ModbusDriverOptions.cs:116-125` | When `WriteOnChangeOnly` is enabled and `IsRedundantWrite` returns true, `WriteAsync` returns `WriteResult(0u)` (Good) without touching the wire. The suppression baseline (`_lastWrittenByRef`) is only invalidated by a *read* that returns a… |
| Driver.Modbus-011 | Low | Code organization & conventions | `ModbusDriver.cs:23-43,89-97,408-432` | Field and member declarations are interleaved with methods throughout `ModbusDriver`. `ResolveHost` (a public method) is the first member of the class, followed by `BuildSlaveHostName`, then a block of fields; `_lastPublishedByRef`/`_lastW… |
| Driver.Modbus-012 | Low | Testing coverage | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/` | The unit suite is broad (coalescing, bisection, auto-recovery, byte order, arrays, BCD, RMW, caps, multi-unit, probe, reconnect, subscription). Gaps relative to the findings above: (1) no test exercises concurrent multi-subscription publis… |
| Driver.Modbus.Addressing-006 | Low | Error handling & resilience | `ModbusAddressParser.cs:297-301` | `TryParseFamilyNative` catches only `ArgumentException` and `OverflowException`. The current helpers throw only those (including `ArgumentOutOfRangeException`, which derives from `ArgumentException`), so today it is correct. But the parser… |
| Driver.Modbus.Addressing-007 | Low | Design-document adherence | `ModbusDataType.cs:91-95`, `docs/v2/dl205.md` section Strings | `ModbusStringByteOrder` (HighByteFirst / LowByteFirst) is defined in this assembly and documented as the DL205 low-byte-first string-packing knob, but `ParsedModbusAddress` has no field for it and `ModbusAddressParser` never produces or co… |
| Driver.Modbus.Addressing-009 | Low | Documentation & comments | `ModbusModiconAddress.cs:55-64`, `ModbusModiconAddress.cs:104-110` | The comments on `ModbusModiconAddress.TryParse` are slightly inaccurate. The remark that 5-digit Modicon is always exactly 5 chars (40001..49999) and 6-digit is exactly 6 (400001..465536-shaped) implies the leading digit is always 4, but t… |
| Driver.Modbus.Cli-003 | Low | Correctness & logic bugs | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/ModbusCommandBase.cs:14-24` | `Port` (`int`) and `TimeoutMs` (`int`) accept any 32-bit value, including negatives and ports above 65535. `UnitId` is a `byte`, so it accepts 0-255 even though the option description and `docs/Driver.Modbus.Cli.md` both say the valid rang… |
| Driver.Modbus.Cli-004 | Low | Concurrency & thread safety | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/SubscribeCommand.cs:61-67` | The `OnDataChange` handler is invoked from the driver's `PollGroupEngine` background thread and calls `console.Output.WriteLine` synchronously. An exception thrown inside this handler (e.g. an `IOException` on a redirected or closed stdout… |
| Driver.Modbus.Cli-005 | Low | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ProbeCommand.cs:21-54`; `Commands/ReadCommand.cs:46-75`; `Commands/WriteCommand.cs:54-89` | All three commands call `ConfigureLogging()` then `console.RegisterCancellationHandler()`, but if the operator presses Ctrl+C before `InitializeAsync` completes, the resulting `OperationCancelledException` propagates out of `ExecuteAsync`… |
| Driver.Modbus.Cli-006 | Low | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ProbeCommand.cs:35-53` | `probe` reports `Health: {health.State}` from `GetHealth()`. After a successful `InitializeAsync` the driver sets state to `Healthy` regardless of whether the subsequent probe register read returns Good or a Bad status code. `ReadAsync` do… |
| Driver.Modbus.Cli-007 | Low | Design-document adherence | `docs/Driver.Modbus.Cli.md:124-156`; `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ReadCommand.cs` | `docs/Driver.Modbus.Cli.md` devotes a whole "v2 addressing grammar" section to the industry-standard tag-address strings (`40001:F:CDAB`, `HR1:I`, `C100`, `V2000:F:CDAB`, etc.) and says "set the per-tag `addressString` field instead of the… |
| Driver.Modbus.Cli-008 | Low | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli.Tests/` | The test project covers only the two pure-function seams: `ReadCommand.SynthesiseTagName` and `WriteCommand.ParseValue`. There is no coverage for `WriteCommand`'s read-only-region rejection (`Region is not (Coils or HoldingRegisters)`), no… |
| Driver.OpcUaClient-011 | Low | Documentation & comments | `OpcUaClientDriver.cs:783-784` | The comment on the isArray computation states "-1 = scalar; 1+ = array dimensions; 0 = one-dimensional array". This is inaccurate against OPC UA ValueRank semantics: -3 is ScalarOrOneDimension, -2 is Any, -1 is Scalar, and 0 is OneOrMoreDi… |
| Driver.OpcUaClient-014 | Low | Performance & resource management | `OpcUaClientDriver.cs:904`, `:1035` | `MonitoredItem.Notification += (mi, args) => ...` (and the alarm-event equivalent) attaches a closure-capturing lambda to each monitored item's event. The lambda is never detached. When UnsubscribeAsync removes a subscription it calls Subs… |
| Driver.S7-003 | Low | Correctness & logic bugs | `S7Driver.cs:172`, `S7Driver.cs:255` | ReadAsync and WriteAsync dereference fullReferences.Count / writes.Count with no null guard. A null argument throws NullReferenceException rather than ArgumentNullException, and the NRE escapes before the _gate is taken so it is not wrappe… |
| Driver.S7-005 | Low | OtOpcUa conventions | `S7Driver.cs:33`, `S7Driver.cs:433` | System.Collections.Concurrent.ConcurrentDictionary is written out with a fully-qualified namespace at the field declarations instead of a using System.Collections.Concurrent directive. ImplicitUsings is enabled and the rest of the codebase… |
| Driver.S7-009 | Low | Error handling & resilience | `S7Driver.cs:392` | The subscription poll loop never reflects sustained polling failure anywhere an operator can see it. PollLoopAsync swallows every non-cancellation exception with an empty catch and the comment claims "the health surface reflects it" - but… |
| Driver.S7-010 | Low | Performance & resource management | `S7Driver.cs:504` | Dispose() is implemented as DisposeAsync().AsTask().GetAwaiter().GetResult() - sync-over-async. Inside the generic host this is currently safe (no captured SynchronizationContext), but it is a known deadlock pattern. The only async work be… |
| Driver.S7-013 | Low | Code organization & conventions | `S7DriverOptions.cs:90`, `S7Driver.cs:300` | S7TagDefinition.StringLength is a public configured/JSON-bound parameter (default 254) but is dead: S7DataType.String reads and writes both throw NotSupportedException ("...land in a follow-up PR"), so StringLength is never consumed. Likew… |
| Driver.S7.Cli-004 | Low | Performance & resource management | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli/Commands/ProbeCommand.cs:36,53`, `Commands/ReadCommand.cs:45,54`, `Commands/WriteCommand.cs:51,60`, `Commands/SubscribeCommand.cs:39,73` | Every command declares the driver with `await using var driver = new S7Driver(...)` and *also* calls `await driver.ShutdownAsync(...)` in a `finally` block. `S7Driver.DisposeAsync` itself calls `ShutdownAsync`, so shutdown runs twice per c… |
| Driver.S7.Cli-005 | Low | Code organization & conventions | `tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/` | A stale directory `tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/` exists containing only an `obj/` folder — no `.csproj`, no source. The real test project lives at `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/`. The empty direct… |
| Driver.S7.Cli-006 | Low | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/WriteCommandParseValueTests.cs` | The only test file covers `WriteCommand.ParseValue` and `ReadCommand.SynthesiseTagName`. `S7CommandBase.BuildOptions` — which maps the host / port / CPU / rack / slot / timeout flags onto an `S7DriverOptions` and forces `Probe.Enabled = fa… |
| Driver.S7.Cli-007 | Low | Documentation & comments | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli/Commands/SubscribeCommand.cs:45-51` | The Modbus CLI `SubscribeCommand` carries an explanatory comment on the `OnDataChange` handler ("Route every data-change event to the CliFx console (not System.Console — the analyzer flags it + IConsole is the testable abstraction)"). The… |
| Driver.TwinCAT-004 | Low | Correctness & logic bugs | `TwinCATDataType.cs:24-27` | The inline comments for the IEC time types are inaccurate. TwinCAT `TIME` is a duration (32-bit, milliseconds) — not "ms since epoch of day". `DATE` is stored as seconds since 1970-01-01 (truncated to a day boundary), not "days since 1970-… |
| Driver.TwinCAT-006 | Low | OtOpcUa conventions | `TwinCATDriver.cs:406-411` | `ResolveHost` falls back to `DriverInstanceId` when there are no configured devices and the reference is unknown. `DriverInstanceId` is a logical config-DB identifier, not a host address; `IPerCallHostResolver` consumers expect a host key… |
| Driver.TwinCAT-014 | Low | Design-document adherence | `TwinCATDriverOptions.cs:41-43`, `TwinCATDriverOptions.cs:57-62`, `AdsTwinCATClient.cs:145` | Several drifts between the implemented config surface and `docs/v2/driver-specs.md` section 6. The spec connection-settings list has separate `Host` (IP), `AmsNetId`, and `AmsPort` fields; the implementation collapses these into a single `… |
| Driver.TwinCAT-015 | Low | Code organization & conventions | `TwinCATDriver.cs:431-432` | `Dispose()` runs `DisposeAsync().AsTask().GetAwaiter().GetResult()` — sync-over-async. `docs/v2/driver-stability.md` section Galaxy explicitly lists "sync-over-async on the OPC UA stack thread" among the four 2026-04-13 stability findings… |
| Driver.TwinCAT-016 | Low | Testing coverage | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/` | Unit coverage exists for AMS-address parsing, symbol-path parsing, read/write, native notifications, symbol browse, and the capability surface. Gaps tied to the findings above: no test exercises `ReinitializeAsync` with a changed config (D… |
| Driver.TwinCAT.Cli-001 | Low | Correctness & logic bugs | `TwinCATCommandBase.cs:23-24`, `Commands/SubscribeCommand.cs:23-24`, `Commands/BrowseCommand.cs:21-24` | Numeric command options are accepted without range validation. `--timeout-ms` feeds `Timeout => TimeSpan.FromMilliseconds(TimeoutMs)`; passing `--timeout-ms 0` or a negative value yields `TimeSpan.Zero`/a negative `TimeSpan`, which is then… |
| Driver.TwinCAT.Cli-002 | Low | Concurrency & thread safety | `Commands/SubscribeCommand.cs:46-58` | The `OnDataChange` handler calls `console.Output.WriteLine(line)` synchronously. In native ADS-notification mode the event is raised from the `Beckhoff.TwinCAT.Ads` notification callback thread (see `TwinCATDriver.SubscribeAsync`, which in… |
| Driver.TwinCAT.Cli-003 | Low | Error handling & resilience | `Commands/SubscribeCommand.cs:56-58` | The subscribe banner reports the mechanism purely from the `--poll-only` flag (`var mode = PollOnly ? "polling" : "ADS notification"`). The doc (`docs/Driver.TwinCAT.Cli.md`) states the banner "announces which mechanism is in play". The CL… |
| Driver.TwinCAT.Cli-004 | Low | Design-document adherence | `TwinCATCommandBase.cs:26-29`, `Commands/BrowseCommand.cs` | `--poll-only` is declared on `TwinCATCommandBase`, so it is inherited by `browse`. `BrowseCommand` only ever calls `DiscoverAsync` — it never subscribes — so `UseNativeNotifications = !PollOnly` has no observable effect on a browse run. Th… |
| Driver.TwinCAT.Cli-005 | Low | Code organization & conventions | `Commands/ProbeCommand.cs:23`, `Commands/ReadCommand.cs:20`, `Commands/WriteCommand.cs:20`, `Commands/SubscribeCommand.cs:18` | The `--type` option is declared with the short alias `-t` on `read`, `write`, and `subscribe`, but `ProbeCommand` declares `[CommandOption("type", ...)]` with no short alias. An operator who has internalised `-t` from the other three verbs… |
| Driver.TwinCAT.Cli-006 | Low | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Tests/WriteCommandParseValueTests.cs` | The only test file covers `WriteCommand.ParseValue` and `ReadCommand.SynthesiseTagName`. Other deterministic, router-independent logic is untested: `TwinCATCommandBase.Gateway` (the `ads://{netId}:{port}` string the driver's `TwinCATAmsAdd… |
| Driver.TwinCAT.Cli-007 | Low | Documentation & comments | `TwinCATCommandBase.cs:31-36` | The `Timeout` override has an empty `init` accessor with the comment `/* driven by TimeoutMs */`. Because the base `DriverCommandBase.Timeout` is declared `abstract { get; init; }`, the override must supply an `init`, but here it silently… |
_No pending findings._
## Closed findings
@@ -182,6 +75,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Core.AlarmHistorian-006 | High | Resolved | Error handling & resilience | `src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs:103,135-216` |
| Core.ScriptedAlarms-001 | High | Resolved | Concurrency & thread safety | `ScriptedAlarmEngine.cs:175`, `ScriptedAlarmEngine.cs:178`, `ScriptedAlarmEngine.cs:73`, `ScriptedAlarmEngine.cs:368` |
| Core.Scripting-002 | High | Resolved | Security | `ForbiddenTypeAnalyzer.cs:70` |
| Core.Scripting-012 | High | Resolved | Security | `ForbiddenTypeAnalyzer.cs:60-76`, `ScriptSandbox.cs:96-126` |
| Core.VirtualTags-001 | High | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:306` |
| Driver.AbCip-001 | High | Resolved | Correctness & logic bugs | `AbCipDriver.cs:111`, `AbCipDriver.cs:163-167` |
| Driver.AbCip-002 | High | Resolved | Correctness & logic bugs | `AbCipStatusMapper.cs:65-78` |
@@ -190,6 +84,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Driver.AbLegacy-001 | High | Resolved | Correctness & logic bugs | `AbLegacyAddress.cs:54`, `AbLegacyDriver.cs:368-374` |
| Driver.AbLegacy-006 | High | Resolved | Concurrency & thread safety | `AbLegacyDriver.cs:107-158`, `AbLegacyDriver.cs:162-234`, `LibplctagLegacyTagRuntime.cs` |
| Driver.Cli.Common-001 | High | Resolved | Correctness & logic bugs | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:106-119` |
| Driver.Cli.Common-007 | High | Resolved | Correctness & logic bugs | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:129` |
| Driver.FOCAS-001 | High | Resolved | Correctness & logic bugs | `FocasDriverFactoryExtensions.cs:54-86`, `FocasDriverFactoryExtensions.cs:132-140` |
| Driver.FOCAS-002 | High | Resolved | Correctness & logic bugs | `WireFocasClient.cs:164-179`, `FocasDriver.cs:513`, `FocasDriver.cs:593` |
| Driver.Galaxy-002 | High | Resolved | Correctness & logic bugs | `Browse/DataTypeMap.cs:13`, `Runtime/MxValueDecoder.cs:9` |
@@ -256,6 +151,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Core.Scripting-004 | Medium | Resolved | Correctness & logic bugs | `DependencyExtractor.cs:73` |
| Core.Scripting-007 | Medium | Resolved | Error handling & resilience | `TimedScriptEvaluator.cs:60` |
| Core.Scripting-010 | Medium | Resolved | Testing coverage | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/ScriptSandboxTests.cs:54` |
| Core.Scripting-013 | Medium | Resolved | Security | `ScriptEvaluator.cs:202-225` (`BuildWrapperSource`) |
| Core.Scripting-014 | Medium | Resolved | Concurrency & thread safety | `CompiledScriptCache.cs:91-103` (`Clear`) |
| Core.Scripting-016 | Medium | Resolved | Performance & resource management | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:74-117`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/ScriptedAlarmEngine.cs:139-182` |
| Core.VirtualTags-002 | Medium | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:237` |
| Core.VirtualTags-003 | Medium | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:117-120` |
| Core.VirtualTags-005 | Medium | Resolved | Concurrency & thread safety | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagSource.cs:50-64` |
@@ -293,6 +191,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Driver.Galaxy-009 | Medium | Resolved | Error handling & resilience | `GalaxyDriver.cs:354-371` |
| Driver.Galaxy-011 | Medium | Resolved | Performance & resource management | `GalaxyDriver.cs:411` |
| Driver.Galaxy-014 | Medium | Resolved | Testing coverage | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy` (module-wide) |
| Driver.Galaxy-016 | Medium | Resolved | Performance & resource management | `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj:43-47`, `libs/README.md:32-37` |
| Driver.Historian.Wonderware-002 | Medium | Resolved | Correctness and logic bugs | `Ipc/HistorianFrameHandler.cs:162`, `:181` |
| Driver.Historian.Wonderware-003 | Medium | Resolved | Correctness and logic bugs | `Backend/HistorianDataSource.cs:320-323`, `:457-460` |
| Driver.Historian.Wonderware-006 | Medium | Resolved | Error handling and resilience | `Ipc/PipeServer.cs:120-128` |
@@ -348,6 +247,25 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Analyzers-004 | Low | Resolved | Performance & resource management | `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs:95-112` |
| Analyzers-005 | Low | Resolved | Design-document adherence | `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs:33-43` |
| Analyzers-007 | Low | Resolved | Documentation & comments | `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs:21-26` |
| Client.CLI-002 | Low | Resolved | Correctness & logic bugs | `Commands/SubscribeCommand.cs:129-137` |
| Client.CLI-003 | Low | Resolved | Correctness & logic bugs | `Commands/BrowseCommand.cs:29-30`, `Commands/SubscribeCommand.cs:20-27`, `Commands/AlarmsCommand.cs:28-29`, `Commands/HistoryReadCommand.cs:42-43` |
| Client.CLI-004 | Low | Resolved | OtOpcUa conventions | `Commands/SubscribeCommand.cs:13-37` |
| Client.CLI-006 | Low | Resolved | Error handling & resilience | `Commands/HistoryReadCommand.cs:73`, `Commands/HistoryReadCommand.cs:76`, `Helpers/NodeIdParser.cs:39` |
| Client.CLI-007 | Low | Resolved | Performance & resource management | `CommandBase.cs:112-123` |
| Client.CLI-008 | Low | Resolved | Documentation & comments | `docs/Client.CLI.md:158-217` |
| Client.CLI-009 | Low | Resolved | Code organization & conventions | `Commands/SubscribeCommand.cs:66-165`, `Commands/AlarmsCommand.cs:52-91` |
| Client.CLI-010 | Low | Resolved | Testing coverage | `tests/Client/ZB.MOM.WW.OtOpcUa.Client.CLI.Tests/SubscribeCommandTests.cs` |
| Client.Shared-003 | Low | Resolved | Correctness & logic bugs | `Adapters/DefaultSessionAdapter.cs:76`, `Adapters/DefaultSessionAdapter.cs:273` |
| Client.Shared-004 | Low | Resolved | OtOpcUa conventions | `Adapters/DefaultSessionAdapter.cs:228`, `Adapters/DefaultSessionAdapter.cs:121`, `Adapters/DefaultSessionAdapter.cs:172` |
| Client.Shared-009 | Low | Resolved | Error handling & resilience / Documentation & comments | `OpcUaClientService.cs:302-322` |
| Client.Shared-010 | Low | Resolved | Performance & resource management | `Models/ConnectionSettings.cs:48`, `OpcUaClientService.cs:408-417` |
| Client.Shared-011 | Low | Resolved | Testing coverage | `tests/Client/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/OpcUaClientServiceTests.cs` |
| Client.UI-003 | Low | Resolved | OtOpcUa conventions | `ZB.MOM.WW.OtOpcUa.Client.UI.csproj:20-21`, `Program.cs:14-20` |
| Client.UI-004 | Low | Resolved | OtOpcUa conventions | `Views/MainWindow.axaml.cs:125-138` |
| Client.UI-006 | Low | Resolved | Error handling & resilience | `ViewModels/MainWindowViewModel.cs:244-252`, `ViewModels/AlarmsViewModel.cs:88-112`, `ViewModels/SubscriptionsViewModel.cs:79-94` |
| Client.UI-009 | Low | Resolved | Design-document adherence | `ViewModels/HistoryViewModel.cs:44-54` |
| Client.UI-010 | Low | Resolved | Code organization & conventions | `Controls/DateTimeRangePicker.axaml.cs:33-37`, `Controls/DateTimeRangePicker.axaml.cs:70-80` |
| Client.UI-011 | Low | Resolved | Documentation & comments | `Views/MainWindow.axaml:81`, `Services/JsonSettingsService.cs:11-15` |
| Configuration-004 | Low | Resolved | OtOpcUa conventions | `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/NodePermissions.cs:8`, `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs:417` |
| Configuration-005 | Low | Resolved | Concurrency & thread safety | `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/LiteDbConfigCache.cs:50` |
| Configuration-007 | Low | Resolved | Error handling & resilience | `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Apply/GenerationApplier.cs:44` |
@@ -369,14 +287,16 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Core.ScriptedAlarms-003 | Low | Resolved | Documentation & comments | `ScriptedAlarmEngine.cs:343`, `docs/ScriptedAlarms.md:107` |
| Core.ScriptedAlarms-006 | Low | Resolved | Concurrency & thread safety | `ScriptedAlarmEngine.cs:232`, `ScriptedAlarmEngine.cs:369` |
| Core.ScriptedAlarms-008 | Low | Resolved | Performance & resource management | `Part9StateMachine.cs:261-268` |
| Core.ScriptedAlarms-009 | Low | Won't Fix | Performance & resource management | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
| Core.ScriptedAlarms-009 | Low | Resolved | Performance & resource management | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
| Core.ScriptedAlarms-010 | Low | Resolved | Design-document adherence | `ScriptedAlarmEngine.cs:325-336`, `AlarmPredicateContext.cs:33-40`, `MessageTemplate.cs:47` |
| Core.ScriptedAlarms-011 | Low | Resolved | Code organization & conventions | `Part9StateMachine.cs:275` |
| Core.ScriptedAlarms-013 | Low | Resolved | Documentation & comments | `ScriptedAlarmEngine.cs:66-81` |
| Core.Scripting-005 | Low | Resolved | Correctness & logic bugs | `DependencyExtractor.cs:97` |
| Core.Scripting-006 | Low | Resolved | Concurrency & thread safety | `CompiledScriptCache.cs:55` |
| Core.Scripting-008 | Low | Won't Fix | Performance & resource management | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` |
| Core.Scripting-008 | Low | Resolved | Performance & resource management | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` |
| Core.Scripting-009 | Low | Resolved | Design-document adherence | `ForbiddenTypeAnalyzer.cs:45` |
| Core.Scripting-011 | Low | Resolved | Testing coverage | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/` |
| Core.Scripting-015 | Low | Resolved | Correctness & logic bugs | `ScriptEvaluator.cs:234-270` (`ToCSharpTypeName`) |
| Core.VirtualTags-004 | Low | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:349` |
| Core.VirtualTags-006 | Low | Resolved | Concurrency & thread safety | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:177-182`, `:395-401` |
| Core.VirtualTags-007 | Low | Resolved | Error handling & resilience | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/TimerTriggerScheduler.cs:58` |
@@ -384,9 +304,100 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Core.VirtualTags-010 | Low | Resolved | Documentation & comments | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/ITagUpstreamSource.cs:18`, `VirtualTagContext.cs:30`, `VirtualTagDefinition.cs:28` |
| Core.VirtualTags-011 | Low | Resolved | Code organization & conventions | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:404-409` |
| Core.VirtualTags-013 | Low | Resolved | Documentation & comments | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/DependencyGraph.cs:266-270` |
| Driver.AbCip-007 | Low | Resolved | OtOpcUa conventions | `AbCipDriver.cs` (whole file), `AbCipAlarmProjection.cs`, `LibplctagTagRuntime.cs` |
| Driver.AbCip-011 | Low | Resolved | Error handling & resilience | `AbCipDriver.cs:144-152`, `AbCipDriverOptions.cs:131-143` |
| Driver.AbCip-012 | Low | Resolved | Performance & resource management | `LibplctagTemplateReader.cs:15-35`, `AbCipDriver.cs:88-92` |
| Driver.AbCip-013 | Low | Resolved | Design-document adherence | `AbCipDriverOptions.cs:70-73`, `PlcFamilies/AbCipPlcFamilyProfile.cs:13-19`, `LibplctagTagRuntime.cs:16-27` |
| Driver.AbCip-015 | Low | Resolved | Documentation & comments | `AbCipDriver.cs:9-11`, `PlcTagHandle.cs:23-27,53-58`, `AbCipTemplateCache.cs:12-15`, `IAbCipTagEnumerator.cs:6-11`, `AbCipDriverOptions.cs:21` |
| Driver.AbCip.Cli-003 | Low | Resolved | Concurrency & thread safety | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/Commands/SubscribeCommand.cs:50-56,60-61` |
| Driver.AbCip.Cli-004 | Low | Resolved | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/Commands/SubscribeCommand.cs:28,58`; `AbCipCommandBase.cs:26-34` |
| Driver.AbCip.Cli-005 | Low | Resolved | Performance & resource management | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:51-59` |
| Driver.AbCip.Cli-006 | Low | Resolved | Design-document adherence | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/AbCipCommandBase.cs:29-34` |
| Driver.AbCip.Cli-007 | Low | Resolved | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli.Tests/WriteCommandParseValueTests.cs` |
| Driver.AbCip.Cli-008 | Low | Resolved | Documentation & comments | `docs/Driver.AbCip.Cli.md:8-9` |
| Driver.AbLegacy-005 | Low | Resolved | OtOpcUa conventions | `AbLegacyDriver.cs` (whole file) |
| Driver.AbLegacy-011 | Low | Resolved | Performance & resource management | `AbLegacyDriver.cs:440` |
| Driver.AbLegacy-013 | Low | Resolved | Code organization & conventions | `AbLegacyDriver.cs:340-345`, `AbLegacyDriver.cs:238-264` |
| Driver.AbLegacy.Cli-002 | Low | Resolved | Correctness & logic bugs | `Commands/WriteCommand.cs:27-29`, `Program.cs:6-9` |
| Driver.AbLegacy.Cli-003 | Low | Resolved | Concurrency & thread safety | `Commands/SubscribeCommand.cs:47-53` |
| Driver.AbLegacy.Cli-004 | Low | Resolved | Error handling & resilience | `Commands/ProbeCommand.cs:37-56`, `Commands/ReadCommand.cs:39-50`, `Commands/WriteCommand.cs:48-59`, `Commands/SubscribeCommand.cs:41-76` |
| Driver.AbLegacy.Cli-005 | Low | Resolved | Design-document adherence | `Commands/SubscribeCommand.cs:23-25`, `docs/Driver.AbLegacy.Cli.md:94-96` |
| Driver.AbLegacy.Cli-006 | Low | Resolved | Code organization & conventions | `Commands/ProbeCommand.cs:20-22` |
| Driver.AbLegacy.Cli-007 | Low | Resolved | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli.Tests/WriteCommandParseValueTests.cs` |
| Driver.Cli.Common-004 | Low | Resolved | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:68-70` |
| Driver.Cli.Common-006 | Low | Resolved | Documentation & comments | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:71`, `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:9` |
| Driver.Cli.Common-008 | Low | Resolved | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests/SnapshotFormatterTests.cs:50-64` |
| Driver.FOCAS-007 | Low | Resolved | Error handling & resilience | `FocasDriver.cs:140-148`, `FocasDriver.cs:478-484`, `FocasDriver.cs:529-533`, `FocasAlarmProjection.cs:61-63` |
| Driver.FOCAS-008 | Low | Resolved | Performance & resource management | `FocasDriver.cs:201`, `FocasDriver.cs:253` |
| Driver.FOCAS-009 | Low | Resolved | Design-document adherence | `FocasDriverOptions.cs:110-115`, `FocasDriver.cs:468-486`, `FocasDriverFactoryExtensions.cs:75-80` |
| Driver.FOCAS-010 | Low | Resolved | Code organization & conventions | `IFocasClient.cs:210-227` (`FocasOpMode`), `FocasConstants.cs:42-78` (`FocasOperationMode`) |
| Driver.FOCAS-011 | Low | Resolved | Code organization & conventions | `IFocasClient.cs:275-287` (`FocasAlarmType`), `FocasAlarmProjection.cs:149-175` |
| Driver.FOCAS.Cli-001 | Low | Resolved | Error handling & resilience | `Commands/WriteCommand.cs:58-68` |
| Driver.FOCAS.Cli-002 | Low | Resolved | Concurrency & thread safety | `Commands/SubscribeCommand.cs:45-51` |
| Driver.FOCAS.Cli-003 | Low | Resolved | Error handling & resilience | `FocasCommandBase.cs:19` (`CncPort`), `FocasCommandBase.cs:27` (`TimeoutMs`), `Commands/SubscribeCommand.cs:23` (`IntervalMs`) |
| Driver.FOCAS.Cli-004 | Low | Resolved | Performance & resource management | `Commands/ProbeCommand.cs:37,54`; `Commands/ReadCommand.cs:37,46`; `Commands/WriteCommand.cs:45,54`; `Commands/SubscribeCommand.cs:39,73` |
| Driver.FOCAS.Cli-005 | Low | Resolved | Design-document adherence | `Commands/WriteCommand.cs:50`, `Commands/ProbeCommand.cs:50` (via `SnapshotFormatter.FormatStatus`) |
| Driver.Galaxy-005 | Low | Resolved | OtOpcUa conventions | `Runtime/EventPump.cs:81-88` |
| Driver.Galaxy-010 | Low | Resolved | Security | `GalaxyDriver.cs:311-341` |
| Driver.Galaxy-012 | Low | Resolved | Performance & resource management | `Runtime/SubscriptionRegistry.cs:65-67`, `GalaxyDriver.cs:538`, `GalaxyDriver.cs:675` |
| Driver.Galaxy-013 | Low | Resolved | Design-document adherence | `GalaxyDriver.cs:14-27`, `GalaxyDriver.cs:374-382`, `Config/GalaxyDriverOptions.cs:84-86` |
| Driver.Galaxy-017 | Low | Deferred | Design-document adherence | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/` (no source change), gateway proto contract |
| Driver.Galaxy-018 | Low | Resolved | Documentation & comments | `libs/README.md:32-37`, `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj:40-47` |
| Driver.Historian.Wonderware-004 | Low | Resolved | Correctness and logic bugs | `Backend/SdkAlarmHistorianWriteBackend.cs:198-201` |
| Driver.Historian.Wonderware-005 | Low | Resolved | Concurrency and thread safety | `Backend/HistorianDataSource.cs:124`, `:126-127` |
| Driver.Historian.Wonderware-007 | Low | Resolved | Error handling and resilience | `Ipc/PipeServer.cs:70-75` |
| Driver.Historian.Wonderware-008 | Low | Resolved | Error handling and resilience | `Backend/HistorianDataSource.cs:301-307`, `:374-380` |
| Driver.Historian.Wonderware-010 | Low | Resolved | Performance and resource management | `Backend/HistorianConfiguration.cs:32-36`, `Backend/HistorianDataSource.cs` (all read methods) |
| Driver.Historian.Wonderware-011 | Low | Resolved | Design-document adherence | `Backend/HistorianDataSource.cs:9-12`, `Backend/IHistorianDataSource.cs:9-11`, `Backend/HistorianSample.cs:7-9`, `Backend/HistorianConfiguration.cs:7-9` |
| Driver.Historian.Wonderware-012 | Low | Resolved | Testing coverage | `Backend/HistorianDataSource.cs`, `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/` |
| Driver.Historian.Wonderware.Client-003 | Low | Resolved | Concurrency & thread safety | `WonderwareHistorianClient.cs:207`, `WonderwareHistorianClient.cs:132-150` |
| Driver.Historian.Wonderware.Client-004 | Low | Resolved | Concurrency & thread safety | `WonderwareHistorianClient.cs:203-267` |
| Driver.Historian.Wonderware.Client-006 | Low | Resolved | Error handling & resilience | `Internal/PipeChannel.cs:96-107`, `WonderwareHistorianClientOptions.cs:11-12` |
| Driver.Historian.Wonderware.Client-008 | Low | Resolved | Security | `ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.csproj:29-32` |
| Driver.Historian.Wonderware.Client-010 | Low | Resolved | Documentation & comments | `WonderwareHistorianClient.cs:355-361`, `WonderwareHistorianClient.cs:132-150` |
| Driver.Modbus-003 | Low | Resolved | Concurrency & thread safety | `ModbusDriver.cs:59,188,241,259,266,726,745,759` |
| Driver.Modbus-007 | Low | Resolved | Design-document adherence | `ModbusDriver.cs:1392`, `ModbusDriverOptions.cs:74-80` |
| Driver.Modbus-008 | Low | Resolved | Documentation & comments | `ModbusDriver.cs:411-417,700-703,737-744` |
| Driver.Modbus-009 | Low | Resolved | Correctness & logic bugs | `ModbusDriver.cs:1160-1167`, `ModbusTcpTransport.cs:94-95` |
| Driver.Modbus-010 | Low | Resolved | Error handling & resilience | `ModbusDriver.cs:864-868`, `ModbusDriverOptions.cs:116-125` |
| Driver.Modbus-011 | Low | Resolved | Code organization & conventions | `ModbusDriver.cs:23-43,89-97,408-432` |
| Driver.Modbus-012 | Low | Resolved | Testing coverage | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/` |
| Driver.Modbus.Addressing-006 | Low | Resolved | Error handling & resilience | `ModbusAddressParser.cs:297-301` |
| Driver.Modbus.Addressing-007 | Low | Resolved | Design-document adherence | `ModbusDataType.cs:91-95`, `docs/v2/dl205.md` section Strings |
| Driver.Modbus.Addressing-009 | Low | Resolved | Documentation & comments | `ModbusModiconAddress.cs:55-64`, `ModbusModiconAddress.cs:104-110` |
| Driver.Modbus.Cli-003 | Low | Resolved | Correctness & logic bugs | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/ModbusCommandBase.cs:14-24` |
| Driver.Modbus.Cli-004 | Low | Resolved | Concurrency & thread safety | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/SubscribeCommand.cs:61-67` |
| Driver.Modbus.Cli-005 | Low | Resolved | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ProbeCommand.cs:21-54`; `Commands/ReadCommand.cs:46-75`; `Commands/WriteCommand.cs:54-89` |
| Driver.Modbus.Cli-006 | Low | Resolved | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ProbeCommand.cs:35-53` |
| Driver.Modbus.Cli-007 | Low | Resolved | Design-document adherence | `docs/Driver.Modbus.Cli.md:124-156`; `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/Commands/ReadCommand.cs` |
| Driver.Modbus.Cli-008 | Low | Resolved | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli.Tests/` |
| Driver.OpcUaClient-011 | Low | Resolved | Documentation & comments | `OpcUaClientDriver.cs:1007-1015` |
| Driver.OpcUaClient-014 | Low | Resolved | Performance & resource management | `OpcUaClientDriver.cs:1138`, `:1314` |
| Driver.S7-003 | Low | Resolved | Correctness & logic bugs | `S7Driver.cs:172`, `S7Driver.cs:255` |
| Driver.S7-005 | Low | Resolved | OtOpcUa conventions | `S7Driver.cs:33`, `S7Driver.cs:433` |
| Driver.S7-009 | Low | Resolved | Error handling & resilience | `S7Driver.cs:392` |
| Driver.S7-010 | Low | Resolved | Performance & resource management | `S7Driver.cs:504` |
| Driver.S7-013 | Low | Resolved | Code organization & conventions | `S7DriverOptions.cs:90`, `S7Driver.cs:300` |
| Driver.S7.Cli-004 | Low | Resolved | Performance & resource management | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli/Commands/ProbeCommand.cs:36,53`, `Commands/ReadCommand.cs:45,54`, `Commands/WriteCommand.cs:51,60`, `Commands/SubscribeCommand.cs:39,73` |
| Driver.S7.Cli-005 | Low | Resolved | Code organization & conventions | `tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/` |
| Driver.S7.Cli-006 | Low | Resolved | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/WriteCommandParseValueTests.cs` |
| Driver.S7.Cli-007 | Low | Resolved | Documentation & comments | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli/Commands/SubscribeCommand.cs:45-51` |
| Driver.TwinCAT-004 | Low | Resolved | Correctness & logic bugs | `TwinCATDataType.cs:24-27` |
| Driver.TwinCAT-006 | Low | Resolved | OtOpcUa conventions | `TwinCATDriver.cs:406-411` |
| Driver.TwinCAT-014 | Low | Resolved | Design-document adherence | `TwinCATDriverOptions.cs:41-43`, `TwinCATDriverOptions.cs:57-62`, `AdsTwinCATClient.cs:145` |
| Driver.TwinCAT-015 | Low | Resolved | Code organization & conventions | `TwinCATDriver.cs:431-432` |
| Driver.TwinCAT-016 | Low | Resolved | Testing coverage | `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/` |
| Driver.TwinCAT.Cli-001 | Low | Resolved | Correctness & logic bugs | `TwinCATCommandBase.cs:23-24`, `Commands/SubscribeCommand.cs:23-24`, `Commands/BrowseCommand.cs:21-24` |
| Driver.TwinCAT.Cli-002 | Low | Resolved | Concurrency & thread safety | `Commands/SubscribeCommand.cs:46-58` |
| Driver.TwinCAT.Cli-003 | Low | Resolved | Error handling & resilience | `Commands/SubscribeCommand.cs:56-58` |
| Driver.TwinCAT.Cli-004 | Low | Resolved | Design-document adherence | `TwinCATCommandBase.cs:26-29`, `Commands/BrowseCommand.cs` |
| Driver.TwinCAT.Cli-005 | Low | Resolved | Code organization & conventions | `Commands/ProbeCommand.cs:23`, `Commands/ReadCommand.cs:20`, `Commands/WriteCommand.cs:20`, `Commands/SubscribeCommand.cs:18` |
| Driver.TwinCAT.Cli-006 | Low | Resolved | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Tests/WriteCommandParseValueTests.cs` |
| Driver.TwinCAT.Cli-007 | Low | Resolved | Documentation & comments | `TwinCATCommandBase.cs:31-36` |
| Server-004 | Low | Resolved | OtOpcUa conventions | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:187-200` |
| Server-006 | Low | Resolved | Concurrency & thread safety | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs:478-482, 1342-1348` |
| Server-008 | Low | Resolved | Error handling & resilience | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs:736` |
| Server-012 | Low | Resolved | Performance & resource management | `src/Server/ZB.MOM.WW.OtOpcUa.Server/Hosting/PeerHttpProbeLoop.cs:78-79` |
| Server-014 | Low | Resolved | Code organization & conventions | `src/Server/ZB.MOM.WW.OtOpcUa.Server/SealedBootstrap.cs` |
| Server-015 | Low | Resolved | Documentation & comments | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:16-21`, `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaServerOptions.cs:21-26` |
| Driver.Galaxy-015 | ~~Medium~~ Low (re-triaged 2026-05-23) | Resolved | ~~Security~~ Documentation & comments (re-triaged 2026-05-23) | `libs/MxGateway.Client.dll`, `libs/MxGateway.Contracts.dll`, `libs/README.md` |
+40 -16
View File
@@ -149,53 +149,77 @@ otopcua-cli browse -u opc.tcp://localhost:4840/OtOpcUa -U admin -P admin123 -r -
### subscribe
Monitors a node for value changes using OPC UA subscriptions. Prints each data change notification with timestamp, value, and status code. Runs until Ctrl+C, then unsubscribes and disconnects cleanly.
Monitors a node (or every Variable in its subtree) for value changes using OPC UA subscriptions.
Prints each data-change notification with timestamp, value, and status code, then prints a
summary on exit. Exits on Ctrl+C, or automatically after `--duration` seconds.
```bash
# Subscribe to a single node
otopcua-cli subscribe -u opc.tcp://localhost:4840 -n "ns=2;s=MyNode" -i 500
# Browse a subtree and subscribe to every Variable, run for 60 seconds, write the summary to disk
otopcua-cli subscribe -u opc.tcp://localhost:4840 -n "ns=3;s=ZB" -r --max-depth 4 \
--duration 60 --quiet --summary-file C:\Temp\subscribe-summary.txt
```
| Flag | Description |
|------|-------------|
| `-n` / `--node` | Node ID to monitor (required) |
| `-i` / `--interval` | Sampling/publishing interval in milliseconds (default: 1000) |
| `-n` / `--node` | Node ID to monitor (required). When `--recursive` is set, this is the browse root. |
| `-i` / `--interval` | Sampling interval in milliseconds (default: 1000) |
| `-r` / `--recursive` | Browse recursively from `--node` and subscribe to every Variable found |
| `--max-depth` | Maximum recursion depth when `--recursive` is set (default: 10) |
| `-q` / `--quiet` | Suppress per-update output; only print the final summary |
| `--duration` | Auto-exit after N seconds and print the summary (0 = run until Ctrl+C, default: 0) |
| `--summary-file` | Also write the summary to this file path on exit |
#### Summary buckets
The summary prints per-node counts across these buckets:
- **Ever went BAD during window** — node received at least one notification whose status was not Good.
- **NEVER went bad (suspect)** — node received at least one notification and every one was Good.
- **Last status GOOD / NOT-GOOD** — final observed status across nodes that received any update.
- **No update received at all** — node was subscribed but no notification arrived during the window.
### historyread
Reads historical data from a node. Supports raw history reads and aggregate (processed) history reads.
`--start` and `--end` are parsed with `CultureInfo.InvariantCulture` and treated as UTC; supply
them in ISO 8601 UTC form (`YYYY-MM-DDTHH:MM:SSZ`) for unambiguous behaviour across hosts.
```bash
# Raw history
otopcua-cli historyread -u opc.tcp://localhost:4840/OtOpcUa \
-n "ns=1;s=TestMachine_001.TestHistoryValue" \
--start "2026-03-25" --end "2026-03-30"
--start "2026-03-25T00:00:00Z" --end "2026-03-30T00:00:00Z"
# Aggregate: 1-hour average
otopcua-cli historyread -u opc.tcp://localhost:4840/OtOpcUa \
-n "ns=1;s=TestMachine_001.TestHistoryValue" \
--start "2026-03-25" --end "2026-03-30" \
--start "2026-03-25T00:00:00Z" --end "2026-03-30T00:00:00Z" \
--aggregate Average --interval 3600000
```
| Flag | Description |
|------|-------------|
| `-n` / `--node` | Node ID to read history for (required) |
| `--start` | Start time, ISO 8601 or date string (default: 24 hours ago) |
| `--end` | End time, ISO 8601 or date string (default: now) |
| `--start` | Start time in ISO 8601 UTC format, e.g. `2026-01-15T08:00:00Z` (default: 24 hours ago) |
| `--end` | End time in ISO 8601 UTC format, e.g. `2026-01-15T09:00:00Z` (default: now) |
| `--max` | Maximum number of values (default: 1000) |
| `--aggregate` | Aggregate function: Average, Minimum, Maximum, Count, Start, End |
| `--aggregate` | Aggregate function: Average, Minimum, Maximum, Count, Start, End, StandardDeviation |
| `--interval` | Processing interval in milliseconds for aggregates (default: 3600000) |
#### Aggregate mapping
| Name | OPC UA Node ID |
|------|---------------|
| `Average` | `AggregateFunction_Average` |
| `Minimum` | `AggregateFunction_Minimum` |
| `Maximum` | `AggregateFunction_Maximum` |
| `Count` | `AggregateFunction_Count` |
| `Start` | `AggregateFunction_Start` |
| `End` | `AggregateFunction_End` |
| Name | Aliases | OPC UA Node ID |
|------|---------|---------------|
| `Average` | `avg` | `AggregateFunction_Average` |
| `Minimum` | `min` | `AggregateFunction_Minimum` |
| `Maximum` | `max` | `AggregateFunction_Maximum` |
| `Count` | | `AggregateFunction_Count` |
| `Start` | `first` | `AggregateFunction_Start` |
| `End` | `last` | `AggregateFunction_End` |
| `StandardDeviation` | `stddev`, `stdev` | `AggregateFunction_StandardDeviationSample` |
### alarms
+1 -1
View File
@@ -198,7 +198,7 @@ All times are in UTC. Invalid input turns red on blur.
| Option | Description |
|--------|-------------|
| Aggregate | Raw (default), Average, Minimum, Maximum, Count, Start, End |
| Aggregate | Raw (default), Average, Minimum, Maximum, Count, Start, End, Standard Deviation |
| Interval (ms) | Processing interval for aggregate queries (shown only for aggregates) |
| Max Values | Maximum number of raw values to return (default 1000) |
+3 -2
View File
@@ -4,8 +4,9 @@ Ad-hoc probe / read / write / subscribe tool for ControlLogix / CompactLogix /
Micro800 / GuardLogix PLCs, talking to the **same** `AbCipDriver` the OtOpcUa
server uses (libplctag under the hood).
Second of four driver test-client CLIs (Modbus → AB CIP → AB Legacy → S7 →
TwinCAT). Shares `Driver.Cli.Common` with the others.
Second of six driver test-client CLIs (Modbus → AB CIP → AB Legacy → S7 →
TwinCAT → FOCAS). Shares `Driver.Cli.Common` with the others; see
[DriverClis.md](DriverClis.md) for the authoritative roster.
## Build + run
+3
View File
@@ -95,6 +95,9 @@ PLC-managed — use with caution.
otopcua-ablegacy-cli subscribe -g ab://192.168.1.20/1,0 -a N7:10 -t Int -i 500
```
`-i` / `--interval-ms` is the publishing interval in milliseconds — default
`1000`. `PollGroupEngine` floors sub-250 ms values, so `-i 100` runs at 250 ms.
## Known caveat — ab_server upstream gap
The integration-fixture `ab_server` Docker container accepts TCP but its PCCC
+8
View File
@@ -122,6 +122,14 @@ gives plausible values is the correct one for that device.
## v2 addressing grammar
> **CLI scope:** the `read` / `write` / `subscribe` commands accept only
> the structured `--region` + `--address` + `--type` triple. The
> address-string grammar below is a **`DriverConfig` JSON** feature
> consumed by the driver itself; it is not reachable from this CLI's
> flags. To experiment with it via the CLI, use the structured flags;
> to deploy spreadsheets as-is, hand-author a `DriverConfig` and run
> the server.
The driver accepts the industry-standard tag-address grammar so you can
paste tag spreadsheets from Wonderware / Kepware / Ignition without
per-row manual translation. Full reference + grammar rules:
+1 -1
View File
@@ -35,7 +35,7 @@ new ScriptedAlarmDefinition(
## Predicate evaluation
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory, the per-publish accretion of dynamically-emitted script assemblies (Core.Scripting-008), and the orphan-thread CPU-budget caveat — are documented in that file as well.
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory and the orphan-thread CPU-budget caveat — are documented in that file as well; per-publish assembly accretion was resolved by the Core.Scripting-008 collectible-`AssemblyLoadContext` rewrite and no longer requires periodic server restarts.
`AlarmPredicateContext` (`AlarmPredicateContext.cs`) is the script's `ScriptContext` subclass:
+5 -3
View File
@@ -6,7 +6,7 @@ The runtime is split across two projects: `Core.Scripting` holds the Roslyn sand
## Roslyn script sandbox (`Core.Scripting`)
User scripts are compiled via `Microsoft.CodeAnalysis.CSharp.Scripting` against a `ScriptContext` subclass. `ScriptGlobals<TContext>` exposes the context as a field named `ctx`, so scripts read `ctx.GetTag("...")` / `ctx.SetVirtualTag("...", ...)` / `ctx.Now` / `ctx.Logger` and return a value.
User scripts are compiled via `Microsoft.CodeAnalysis.CSharp` (regular compiler, not the scripting variant — the original `CSharpScript` pipeline was retired by the Core.Scripting-008 / -016 rewrite, see "Compile cache" below). Each script's source is pasted as the body of a synthesized `CompiledScript.Run(ScriptGlobals<TContext>)` method against a `ScriptContext` subclass. `ScriptGlobals<TContext>` exposes the context as a field named `ctx`, so scripts read `ctx.GetTag("...")` / `ctx.SetVirtualTag("...", ...)` / `ctx.Now` / `ctx.Logger` and return a value.
### Compile pipeline (`ScriptEvaluator<TContext, TResult>`)
@@ -30,11 +30,13 @@ Similarly, **`System.Threading.Tasks` is now denied** (Core.Scripting-003), whic
`ConcurrentDictionary<string, Lazy<ScriptEvaluator<...>>>` keyed on `SHA-256(UTF8(source))` rendered to hex. `Lazy<T>` with `ExecutionAndPublication` mode means two threads racing a miss compile exactly once. Failed compiles evict the entry (via the `TryRemove(KeyValuePair<,>)` overload so a concurrently re-added retry entry is not collateral damage — Core.Scripting-006) so a corrected retry can succeed (used during Admin UI authoring). No capacity bound — scripts are operator-authored and bounded by the config DB. Whitespace changes miss the cache on purpose. `Clear()` is called on config-publish.
**Per-publish assembly accretion (accepted limitation, Core.Scripting-008).** Each compiled `ScriptEvaluator` holds a Roslyn `ScriptRunner<T>` delegate, which keeps the dynamically-emitted script assembly loaded for the process lifetime. Emitted assemblies in the default `AssemblyLoadContext` cannot be unloaded; `CompiledScriptCache.Clear()` drops the dictionary entries but does **not** unload the underlying assemblies. Across many config-publish generations (each `Clear()` followed by recompiling every script), the process accumulates dead script assemblies. For the expected "low thousands" of scripts this is benign, but a long-running server with very frequent publishes will see steady managed-memory growth that does not return until the process restarts. Out-of-process script evaluation or a collectible `AssemblyLoadContext` is a v3 concern; deployments with high-publish-frequency requirements should schedule a periodic server restart to reclaim the accrued assemblies.
**Per-publish assembly unload (Core.Scripting-008 resolved).** Each compiled `ScriptEvaluator` emits its script into a dedicated **collectible** `AssemblyLoadContext` — the BCL escape hatch for assemblies that can be unloaded. The compile path is hand-rolled `CSharpCompilation.Create` + `Emit(MemoryStream)` + `ScriptAssemblyLoadContext.LoadFromStream` rather than the legacy `CSharpScript.CreateDelegate` (which emits into the default ALC and cannot be unloaded). `ScriptEvaluator.Dispose()` calls `AssemblyLoadContext.Unload()` and `CompiledScriptCache.Clear()` disposes every materialised evaluator before dropping its dictionary entry, so the emitted assemblies become eligible for GC immediately after a config-publish. The reclaim is GC-timing-sensitive (Unload is *eligible-for-collection*, not synchronous); the next collection cycle reclaims them. Regression tests `Dispose_unloads_compiled_script_assembly_load_context` and `Clear_disposes_every_materialised_evaluator` in `CompiledScriptCacheTests` lock this contract via `WeakReference` + `GC.Collect()` assertions. Server restarts are no longer required to reclaim compiled-script memory.
**Scripting authoring convention.** With the collectible-ALC rewrite, the wrapper around a user script is an ordinary C# static method, not a Roslyn `Script` submission. The script body is pasted verbatim as the method body and must therefore end with an explicit `return …;` per ordinary C# rules — the legacy `CSharpScript` "last expression yields result" shorthand is gone. Every script in the existing test corpus already uses explicit `return`; this convention is operator-visible only when authoring a brand-new script from scratch.
### Per-evaluation timeout (`TimedScriptEvaluator<TContext, TResult>`)
Wraps `ScriptEvaluator` with a wall-clock budget. Default `DefaultTimeout = 250ms`. Implementation pushes the inner `RunAsync` onto `Task.Run` (so a CPU-bound script can't hog the calling thread before `WaitAsync` registers its timeout) then awaits `runTask.WaitAsync(Timeout, ct)`. A `TimeoutException` from `WaitAsync` is wrapped as `ScriptTimeoutException`. Caller-supplied `CancellationToken` cancellation wins over the timeout and propagates as `OperationCanceledException` — so a shutdown cancel is not misclassified. **Known leak:** when a CPU-bound script times out, the underlying `ScriptRunner` keeps running on its thread-pool thread until the Roslyn runtime returns (documented trade-off; out-of-process evaluation is a v3 concern).
Wraps `ScriptEvaluator` with a wall-clock budget. Default `DefaultTimeout = 250ms`. Implementation pushes the inner `RunAsync` onto `Task.Run` (so a CPU-bound script can't hog the calling thread before `WaitAsync` registers its timeout) then awaits `runTask.WaitAsync(Timeout, ct)`. A `TimeoutException` from `WaitAsync` is wrapped as `ScriptTimeoutException`. Caller-supplied `CancellationToken` cancellation wins over the timeout and propagates as `OperationCanceledException` — so a shutdown cancel is not misclassified. **Known leak:** when a CPU-bound script times out, the underlying compiled-script delegate keeps running on its `Task.Run` thread-pool thread until it returns of its own accord (the CT is checked only at evaluator entry; once the script body is running, only the script returning or throwing will release the thread). The post-rewrite delegate is a regular C# `Func<>` bound to the synthesized `CompiledScript.Run` method, so this is a vanilla "synchronous CPU-bound work on a pool thread" leak rather than anything Roslyn-specific. Documented trade-off; out-of-process evaluation is a v3 concern.
### Script logger plumbing
+13 -3
View File
@@ -583,10 +583,20 @@ language binding.
**Depends on:** A.1 merged (proto change live).
**Files** (`c:\Users\dohertj2\Desktop\mxaccessgw\clients\`):
**Files** (`c:\Users\dohertj2\Desktop\mxaccessgw\src\` for .NET — note the sibling
repo restructured after this plan was written; `clients/dotnet/MxGateway.Client.csproj`
no longer exists, the proto contracts now live in
`src/ZB.MOM.WW.MxGateway.Contracts/` under the new namespace
`ZB.MOM.WW.MxGateway.Contracts.Proto[.Galaxy]`; the OtOpcUa driver currently
consumes vendored binaries from the pre-restructure build — see
`src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/libs/README.md`):
1. **.NET** — codegen runs on csproj rebuild via `Grpc.Tools`; just
rebuild `MxGateway.Client.csproj` after pulling A.1.
1. **.NET** — codegen runs on csproj rebuild via `Grpc.Tools`; rebuild
`src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`
after pulling A.1. (If unwinding the driver's vendored binaries onto the
new contracts namespace as part of the alarm work, namespace-rename + a
reimplementation of the missing `MxGatewayClient` / `MxGatewaySession`
wrappers is also in scope.)
2. **Python** — run `clients\python\generate-proto.ps1`; commit the
regenerated `_pb2.py` + `_pb2_grpc.py` files under
`clients\python\src\`.
+3 -4
View File
@@ -1,6 +1,6 @@
# High-Level Requirements
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). The original 2025 text described a single-process Galaxy/MXAccess server called LmxOpcUa. Today the project is the **OtOpcUa** multi-driver OPC UA platform deployed as three cooperating processes (Server, Admin, Galaxy.Host). The Galaxy integration is one of seven shipped drivers. HLR-001 through HLR-008 have been rewritten driver-agnostically; HLR-009 has been retired (the embedded Status Dashboard is superseded by the Admin UI). HLR-010 through HLR-017 are new and cover plug-in drivers, resilience, Config DB / draft-publish, cluster redundancy, fleet-wide identifier uniqueness, Admin UI, audit logging, metrics, and the Roslyn capability-wrapping analyzer.
> **Revision** — Refreshed 2026-05-23 for the OtOpcUa v2 multi-driver platform. The original 2025 text described a single-process Galaxy/MXAccess server called LmxOpcUa. Today the project is the **OtOpcUa** multi-driver OPC UA platform deployed as two cooperating processes (Server, Admin). The Galaxy integration is one of seven shipped drivers and is now an in-process Tier-A driver that talks gRPC to a separately installed `mxaccessgw` gateway (sibling repo) — PR 7.2 (2026-04-30) retired the legacy out-of-process `Galaxy.Host` Windows service. HLR-001 through HLR-008 have been rewritten driver-agnostically; HLR-009 has been retired (the embedded Status Dashboard is superseded by the Admin UI). HLR-010 through HLR-017 cover plug-in drivers, resilience, Config DB / draft-publish, cluster redundancy, fleet-wide identifier uniqueness, Admin UI, audit logging, metrics, and the Roslyn capability-wrapping analyzer.
## HLR-001: OPC UA Server
@@ -28,11 +28,10 @@ Drivers whose backend has a native change signal (e.g. Galaxy's `time_of_last_de
## HLR-007: Service Hosting
The system shall be deployed as three cooperating Windows services:
The system shall be deployed as two cooperating Windows services (the legacy `OtOpcUa.Galaxy.Host` x86 host was retired in PR 7.2 — Galaxy access now flows through the separately installed `mxaccessgw` gateway, which lives in a sibling repository and is not part of the OtOpcUa deployment):
- **OtOpcUa.Server** — .NET 10 x64, `Microsoft.Extensions.Hosting` + `AddWindowsService`, hosts all non-Galaxy drivers in-process and the OPC UA endpoint.
- **OtOpcUa.Server** — .NET 10 AnyCPU, `Microsoft.Extensions.Hosting` + `AddWindowsService` (decision #30 replaced the original TopShelf choice), hosts every driver in-process — including the new Tier-A `GalaxyDriver` that speaks gRPC to `mxaccessgw` and the OPC UA endpoint.
- **OtOpcUa.Admin** — .NET 10 x64 Blazor Server web app, hosts the admin UI, SignalR hubs for live updates, `/metrics` Prometheus endpoint, and audit log writers.
- **OtOpcUa.Galaxy.Host** — .NET Framework 4.8 x86 (TopShelf), hosts MXAccess COM + Galaxy Repository SQL + Historian plugin. Talks to `Driver.Galaxy.Proxy` inside `OtOpcUa.Server` via a named pipe (MessagePack over length-prefixed frames, per-process shared secret, SID-restricted ACL).
## HLR-008: Logging
+6 -3
View File
@@ -151,8 +151,11 @@ substantive driver change, and revise this table when the data does.
`SubscriptionRegistry`, or a downstream consumer retaining
`DataValueSnapshot` references past their useful life.
## Scripted-alarm engine — known hot-path allocations
## Scripted-alarm engine — hot-path allocation reuse
`ScriptedAlarmEngine.BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` and `AlarmPredicateContext` on every predicate evaluation — i.e. once per upstream tag change per referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. (Core.ScriptedAlarms-009)
`ScriptedAlarmEngine` keeps a per-alarm reusable evaluation scratch in `_scratchByAlarmId` — the read-cache `Dictionary<string, DataValueSnapshot>` and the `AlarmPredicateContext` are allocated once per alarm (on first evaluation) and refilled in place across every subsequent predicate evaluation. The hot path no longer allocates a fresh dictionary + context per upstream tag change. (Core.ScriptedAlarms-009)
The allocations are deliberate for now: predicate evaluation is already serialised under `_evalGate`, so a single reused scratch dictionary would be safe, but the per-call dictionary keeps the evaluation surface immutable and trivially safe against future refactors. If a future scripted-alarm soak surfaces allocation pressure on this path, the mitigation is a per-alarm scratch buffer cleared between evaluations — note here before changing the engine.
Safety: reuse is serialised under `_evalGate`, so two threads can never observe the same scratch in a half-refilled state. The context wraps the read-cache by reference, so refilling the dictionary is what the predicate's `ctx.GetTag(path)` calls observe. `LoadAsync` clears `_scratchByAlarmId` alongside `_alarms` so a config-publish drops the prior generation's scratch (a new generation may carry different `Inputs` / `Logger`). Regression tests in `ScriptedAlarmEngineTests` lock the reuse contract:
- `Reevaluation_reuses_the_same_read_cache_dictionary` — asserts dictionary instance identity across two evaluations.
- `Reevaluation_reuses_the_same_predicate_context` — same, for the context.
- `LoadAsync_drops_the_prior_generations_scratch` — asserts a publish resets the scratch.
+9
View File
@@ -43,6 +43,15 @@ that a naive Modbus client will byte-swap [1][2].
really "read 10 consecutive holding registers starting at the Modbus address
that V2000 translates to (see next section), unpack each register low-byte
then high-byte, stop at the first `0x00`."
- **Grammar scope** (Driver.Modbus.Addressing-007): the
`ModbusStringByteOrder` knob (HighByteFirst / LowByteFirst) is **not**
expressible through the `ModbusAddressParser` grammar string — the 3rd grammar
field is the multi-register word/byte order (ABCD/CDAB/BADC/DCBA) and the 4th
is the array count, so there is no token slot for the per-string byte order.
Tags that need low-byte-first packing on DL205 must set
`ModbusTagDefinition.StringByteOrder = LowByteFirst` via the structured tag
form (the driver config DTO). The grammar default produces high-byte-first
strings (matches Ignition / Kepware default behaviour).
Test names:
`DL205_String_low_byte_first_within_register`,
@@ -29,7 +29,7 @@ Tie-in capability — **historian alarm sink**:
| 3 | Evaluation trigger = **change-driven + timer-driven**; operator chooses per-tag | Change-driven is cheap at steady state; timer is the escape hatch for polling derivations that don't have a discrete "input changed" signal. |
| 4 | Script shape = **Shape A — one script per virtual tag/alarm**; `return` produces the value (or `bool` for alarm condition) | Minimal surface; no predicate/action split. Alarm side-effects (severity, message) configured out-of-band, not in the script. |
| 5 | Alarm fidelity = **full OPC UA Part 9** | Uniform with Galaxy + ALMD on the wire; client-side tooling (HMIs, historians, event pipelines) gets one shape. |
| 6 | Sandbox = **read-only context**; scripts can only read any tag + write to virtual tags | Strict Roslyn `ScriptOptions` allow-list. Authoritative deny-list (`ForbiddenTypeAnalyzer`): namespace-prefix deny `System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks` (Task / Parallel fan-out — Core.Scripting-003), `System.Runtime.InteropServices`, `Microsoft.Win32`; type-granular deny `System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread` (these live directly in the allow-listed `System` / `System.Threading` namespaces, so a prefix rule cannot reach them without blocking primitives — Core.Scripting-001 / -009). |
| 6 | Sandbox = **read-only context**; scripts can only read any tag + write to virtual tags | Strict Roslyn `ScriptOptions` allow-list. Authoritative deny-list (`ForbiddenTypeAnalyzer`): namespace-prefix deny `System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks` (Task / Parallel fan-out — Core.Scripting-003), `System.Runtime.InteropServices`, `System.Runtime.Loader` (AssemblyLoadContext et al. — Core.Scripting-012), `Microsoft.Win32`; type-granular deny `System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread`, `System.Threading.ThreadPool` (Core.Scripting-012), `System.Threading.Timer` (Core.Scripting-012) (these live directly in the allow-listed `System` / `System.Threading` namespaces, so a prefix rule cannot reach them without blocking primitives — Core.Scripting-001 / -009 / -012). |
| 7 | Dependency declaration = **AST inference**; operator doesn't maintain a separate dependency list | `CSharpSyntaxWalker` extracts `ctx.GetTag("path")` string-literal calls at compile time; dynamic paths rejected at publish. |
| 8 | Config storage = **config DB with generation-sealed cache** (same as driver instances) | Virtual tags + alarms publish atomically in the same generation as the driver instance config they may depend on. |
| 9 | Script return value shape (`ctx.GetTag`) = **`DataValue { Value, StatusCode, Timestamp }`** | Scripts branch on quality naturally without separate `ctx.GetQuality(...)` calls. |
@@ -89,7 +89,7 @@ Tie-in capability — **historian alarm sink**:
### Stream A — `Core.Scripting` (Roslyn engine + sandbox + AST inference + logger) — **2 weeks**
1. **A.1** Project scaffold + NuGet `Microsoft.CodeAnalysis.CSharp.Scripting`. `ScriptOptions` allow-list (`typeof(object).Assembly`, `typeof(Enumerable).Assembly`, the Core.Scripting assembly itself — nothing else). Hand-written `ScriptContext` base class with `GetTag(string)` / `SetVirtualTag(string, object)` / `Logger` / `Now` / `Deadband(double, double, double)` helpers.
1. **A.1** Project scaffold + NuGet `Microsoft.CodeAnalysis.CSharp.Scripting`. `ScriptOptions` allow-list (`typeof(object).Assembly`, `typeof(Enumerable).Assembly`, the Core.Scripting assembly itself — nothing else). Hand-written `ScriptContext` base class with `GetTag(string)` / `SetVirtualTag(string, object)` / `Logger` / `Now` / `Deadband(double, double, double)` helpers. _(Implementation note 2026-05-23 — superseded by Core.Scripting-008 / -016: the `CSharpScript`/`ScriptRunner` path was replaced with a hand-rolled `CSharpCompilation.Create``Emit(MemoryStream)` → collectible `ScriptAssemblyLoadContext.LoadFromStream` pipeline so per-publish ALC accretion is reclaimable, and engines route compiles through `CompiledScriptCache` rather than calling `ScriptEvaluator.Compile` directly. The reference list was correspondingly widened from the narrow allow-list above to the full BCL `TRUSTED_PLATFORM_ASSEMBLIES` set (filtered to `System.*` + `netstandard` + `Microsoft.Win32.Registry`) because the new pipeline can't compile against the old narrow set; `ForbiddenTypeAnalyzer` is now the sole security gate, consistent with how Core.Scripting-001 / -002 established the analyzer must be the real boundary because type forwarding makes any references-list-only restriction porous. See `docs/VirtualTags.md` "Compile cache" for the current implementation contract.)_
2. **A.2** `DependencyExtractor : CSharpSyntaxWalker`. Visits every `InvocationExpressionSyntax` targeting `ctx.GetTag` / `ctx.SetVirtualTag`; accepts only a `LiteralExpressionSyntax` argument. Non-literal arguments (concat, variable, method call) → publish-time rejection with an actionable error pointing the operator at the exact span. Outputs `IReadOnlySet<string> Inputs` + `IReadOnlySet<string> Outputs`.
3. **A.3** Compile cache. `(source_hash) → compiled Script<T>`. Recompile only when source changes. Warm on `SealedBootstrap`.
4. **A.4** Per-evaluation timeout wrapper (default 250ms; configurable per tag). Timeout = tag quality `BadInternalError` + structured warning log. Keeps a single runaway script from starving the engine.
@@ -162,7 +162,7 @@ Tie-in capability — **historian alarm sink**:
## Compliance Checks (run at exit gate)
- [ ] **Sandbox escape**: attempts to reference any deny-listed namespace prefix (`System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks`, `System.Runtime.InteropServices`, `Microsoft.Win32`) or any of the type-granular forbidden types (`System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread`) fail at script compile with an actionable error. Vectors include direct calls, `typeof(T)`, generic type arguments, casts, `is`/`as` patterns, `default(T)`, array element types, and explicitly-typed local declarations.
- [ ] **Sandbox escape**: attempts to reference any deny-listed namespace prefix (`System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks`, `System.Runtime.InteropServices`, `System.Runtime.Loader`, `Microsoft.Win32`) or any of the type-granular forbidden types (`System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread`, `System.Threading.ThreadPool`, `System.Threading.Timer`) fail at script compile with an actionable error. Vectors include direct calls, `typeof(T)`, generic type arguments, casts, `is`/`as` patterns, `default(T)`, array element types, and explicitly-typed local declarations.
- [ ] **Dependency inference**: `ctx.GetTag(myStringVar)` (non-literal path) is rejected at publish with a span-pointed error; `ctx.GetTag("Line1/Speed")` is accepted + appears in the inferred input set.
- [ ] **Change cascade**: tag A → virtual tag B → virtual tag C. When A changes, B recomputes, then C recomputes. Single change event triggers the full cascade in topological order within one evaluation pass.
- [ ] **Cycle rejection**: publish a config where virtual tag B depends on A and A depends on B. Publish fails pre-commit with a clear cycle message.
+2 -2
View File
@@ -193,7 +193,7 @@ ConfigurationService
- Compact binary format, faster than JSON, good fit for high-frequency data change callbacks
- Simpler than gRPC on .NET 4.8 (which needs legacy `Grpc.Core` native library)
**Decided: Galaxy Host is a separate Windows service.**
**Decided: Galaxy Host is a separate Windows service.** _(Reversed by PR 7.2, 2026-04-30 — see PR 7.2's commit `ae7106d` and the project_galaxy_via_mxgateway memory entry. The legacy in-process `Galaxy.Host` / `Galaxy.Proxy` / `Galaxy.Shared` projects + the `OtOpcUaGalaxyHost` Windows service were retired; Galaxy access now flows through the in-process Tier-A `GalaxyDriver` talking gRPC to a separately installed `mxaccessgw` gateway sibling repo. The reasoning below was correct for the original LMX/x86-COM architecture; the gateway sibling repo now owns those constraints externally.)_
- Independent lifecycle from the OtOpcUa Server
- Can be restarted without affecting the main server or other drivers
- Galaxy.Proxy detects connection loss, sets Bad quality on Galaxy nodes, reconnects when Host comes back
@@ -801,7 +801,7 @@ aggregate runner (#253); server-side factory + seed SQL per driver (#210#213)
| 26 | Admin deploys on same server (co-hosted) | Simplifies deployment; can also run on separate management host | 2026-04-16 |
| 27 | Admin scaffold early, driver-specific screens deferred | Core CRUD for instances/drivers first; per-driver config UI added with each driver | 2026-04-16 |
| 28 | Named pipes for Galaxy IPC | Fast, no port conflicts, native to both .NET 4.8 and .NET 10 | 2026-04-16 |
| 29 | Galaxy Host is a separate Windows service | Independent lifecycle, can restart without affecting main server or other drivers | 2026-04-16 |
| 29 | Galaxy Host is a separate Windows service | Independent lifecycle, can restart without affecting main server or other drivers | 2026-04-16 (**reversed PR 7.2, 2026-04-30** — Galaxy is now an in-process Tier-A driver talking gRPC to the sibling `mxaccessgw` gateway; see the decision body above) |
| 30 | Drop TopShelf, use Microsoft.Extensions.Hosting | Built-in Windows Service support in .NET 10, no third-party dependency | 2026-04-16 |
| 31 | Mono-repo for all drivers | Simpler dependency management, single CI pipeline, shared abstractions | 2026-04-16 |
| 32 | MessagePack serialization for Galaxy IPC | Binary, fast, works on .NET 4.8+ and .NET 10 via MessagePack-CSharp NuGet | 2026-04-16 |
+1 -1
View File
@@ -67,7 +67,7 @@ Remaining Phase 6.3 surfaces (hardening, not release-blocking):
- **AB CIP** (PRs #202222) — `Driver.AbCip`, `Driver.AbCip.IntegrationTests` (7 tests), AB CIP Cli. Live-boot verified against a ControlLogix rig.
- **AB Legacy** (PRs #202, #223) — `Driver.AbLegacy`, `Driver.AbLegacy.IntegrationTests` (2 tests), AB Legacy Cli. PCCC cip-path workaround for SLC/MicroLogix.
- **TwinCAT ADS** (PRs #205, this branch `task-galaxy-e2e`) — `Driver.TwinCAT`, `Driver.TwinCAT.IntegrationTests` (2 tests), TwinCAT Cli. TCBSD/ESXi fixture for e2e since local Hyper-V / TwinCAT RTIME are mutually exclusive on the dev box.
- **TwinCAT ADS** (PR #205) — `Driver.TwinCAT`, `Driver.TwinCAT.IntegrationTests` (2 tests), TwinCAT Cli. TCBSD/ESXi fixture for e2e since local Hyper-V / TwinCAT RTIME are mutually exclusive on the dev box.
- **FOCAS** (PRs #173, #199 + this session's migration) — `Driver.FOCAS` with an **in-process managed `FocasWireClient`** that speaks FOCAS/2 over TCP directly. Tier-C isolation retired — `Driver.FOCAS.Host` + `Driver.FOCAS.Shared` + `FwlibNative` P/Invoke + shim DLL + NSSM service all deleted. `Driver.FOCAS.IntegrationTests` covers 9 scenarios (fixed tree identity/axes/program/timers/spindle + user-authored PARAM/MACRO/PMC reads, Browse, Subscribe, IAlarmSource raise/clear, Probe transitions).
Decision recorded: FOCAS is **read-only** against the CNC by design — writes return `BadNotWritable`. See `docs/drivers/FOCAS.md` + `docs/drivers/FOCAS-Test-Fixture.md` for the deployment + coverage map.
@@ -109,8 +109,16 @@ public abstract class CommandBase : ICommand
/// <summary>
/// Configures Serilog based on the verbose flag.
/// </summary>
/// <remarks>
/// Disposes the previously assigned <see cref="Log.Logger" /> via <see cref="Log.CloseAndFlush" />
/// before installing the new one, so repeated CLI invocations (e.g. in the test suite) do not
/// leak the prior logger's console sink.
/// </remarks>
protected void ConfigureLogging()
{
// Dispose any previously installed logger before swapping in a new one.
Log.CloseAndFlush();
var config = new LoggerConfiguration();
if (Verbose)
config.MinimumLevel.Debug()
@@ -1,6 +1,8 @@
using System.Threading.Channels;
using CliFx.Attributes;
using CliFx.Exceptions;
using CliFx.Infrastructure;
using Opc.Ua;
using ZB.MOM.WW.OtOpcUa.Client.CLI.Helpers;
using ZB.MOM.WW.OtOpcUa.Client.Shared;
using ZB.MOM.WW.OtOpcUa.Client.Shared.Models;
@@ -43,14 +45,25 @@ public class AlarmsCommand : CommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
if (Interval <= 0)
throw new CommandException($"--interval must be greater than 0 (was {Interval}).");
NodeId? sourceNodeId;
try
{
sourceNodeId = NodeIdParser.Parse(NodeId);
}
catch (Exception ex) when (ex is FormatException or ArgumentException)
{
throw new CommandException($"Invalid --node value: {ex.Message}");
}
IOpcUaClientService? service = null;
try
{
var ct = console.RegisterCancellationHandler();
(service, _) = await CreateServiceAndConnectAsync(ct);
var sourceNodeId = NodeIdParser.Parse(NodeId);
// Channel serialises SDK notification-thread writes to the main async loop so
// that concurrent alarm callbacks never interleave on the shared TextWriter.
var outputChannel = Channel.CreateUnbounded<string>(
@@ -1,4 +1,5 @@
using CliFx.Attributes;
using CliFx.Exceptions;
using CliFx.Infrastructure;
using Opc.Ua;
using ZB.MOM.WW.OtOpcUa.Client.CLI.Helpers;
@@ -42,13 +43,25 @@ public class BrowseCommand : CommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
if (Depth <= 0)
throw new CommandException($"--depth must be greater than 0 (was {Depth}).");
NodeId? startNode;
try
{
startNode = NodeIdParser.Parse(NodeId);
}
catch (Exception ex) when (ex is FormatException or ArgumentException)
{
throw new CommandException($"Invalid --node value: {ex.Message}");
}
IOpcUaClientService? service = null;
try
{
var ct = console.RegisterCancellationHandler();
(service, _) = await CreateServiceAndConnectAsync(ct);
var startNode = NodeIdParser.Parse(NodeId);
var maxDepth = Recursive ? Depth : 1;
await BrowseNodeAsync(service, console, startNode, maxDepth, 0, ct);
@@ -1,5 +1,6 @@
using System.Globalization;
using CliFx.Attributes;
using CliFx.Exceptions;
using CliFx.Infrastructure;
using Opc.Ua;
using ZB.MOM.WW.OtOpcUa.Client.CLI.Helpers;
@@ -62,22 +63,65 @@ public class HistoryReadCommand : CommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
if (MaxValues <= 0)
throw new CommandException($"--max must be greater than 0 (was {MaxValues}).");
if (!string.IsNullOrEmpty(Aggregate) && IntervalMs <= 0)
throw new CommandException($"--interval must be greater than 0 (was {IntervalMs}).");
NodeId nodeId;
try
{
nodeId = NodeIdParser.ParseRequired(NodeId);
}
catch (Exception ex) when (ex is FormatException or ArgumentException)
{
throw new CommandException($"Invalid --node value: {ex.Message}");
}
DateTime start, end;
try
{
start = string.IsNullOrEmpty(StartTime)
? DateTime.UtcNow.AddHours(-24)
: DateTime.Parse(StartTime, CultureInfo.InvariantCulture,
DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
}
catch (FormatException ex)
{
throw new CommandException($"Invalid --start value '{StartTime}': {ex.Message}. Expected ISO 8601 UTC format, e.g. 2026-01-15T08:00:00Z.");
}
try
{
end = string.IsNullOrEmpty(EndTime)
? DateTime.UtcNow
: DateTime.Parse(EndTime, CultureInfo.InvariantCulture,
DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
}
catch (FormatException ex)
{
throw new CommandException($"Invalid --end value '{EndTime}': {ex.Message}. Expected ISO 8601 UTC format, e.g. 2026-01-15T08:00:00Z.");
}
AggregateType aggregateType = default;
if (!string.IsNullOrEmpty(Aggregate))
{
try
{
aggregateType = ParseAggregateType(Aggregate);
}
catch (ArgumentException ex)
{
throw new CommandException($"Invalid --aggregate value: {ex.Message}");
}
}
IOpcUaClientService? service = null;
try
{
var ct = console.RegisterCancellationHandler();
(service, _) = await CreateServiceAndConnectAsync(ct);
var nodeId = NodeIdParser.ParseRequired(NodeId);
var start = string.IsNullOrEmpty(StartTime)
? DateTime.UtcNow.AddHours(-24)
: DateTime.Parse(StartTime, CultureInfo.InvariantCulture,
DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
var end = string.IsNullOrEmpty(EndTime)
? DateTime.UtcNow
: DateTime.Parse(EndTime, CultureInfo.InvariantCulture,
DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
IReadOnlyList<DataValue> values;
if (string.IsNullOrEmpty(Aggregate))
@@ -88,7 +132,6 @@ public class HistoryReadCommand : CommandBase
}
else
{
var aggregateType = ParseAggregateType(Aggregate);
await console.Output.WriteLineAsync(
$"History for {NodeId} ({Aggregate}, interval={IntervalMs}ms)");
values = await service.HistoryReadAggregateAsync(
@@ -1,5 +1,7 @@
using CliFx.Attributes;
using CliFx.Exceptions;
using CliFx.Infrastructure;
using Opc.Ua;
using ZB.MOM.WW.OtOpcUa.Client.CLI.Helpers;
using ZB.MOM.WW.OtOpcUa.Client.Shared;
@@ -29,13 +31,23 @@ public class ReadCommand : CommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
NodeId nodeId;
try
{
nodeId = NodeIdParser.ParseRequired(NodeId);
}
catch (Exception ex) when (ex is FormatException or ArgumentException)
{
throw new CommandException($"Invalid --node value: {ex.Message}");
}
IOpcUaClientService? service = null;
try
{
var ct = console.RegisterCancellationHandler();
(service, _) = await CreateServiceAndConnectAsync(ct);
var nodeId = NodeIdParser.ParseRequired(NodeId);
var value = await service.ReadValueAsync(nodeId, ct);
await console.Output.WriteLineAsync($"Node: {NodeId}");
@@ -1,6 +1,7 @@
using System.Collections.Concurrent;
using System.Threading.Channels;
using CliFx.Attributes;
using CliFx.Exceptions;
using CliFx.Infrastructure;
using Opc.Ua;
using ZB.MOM.WW.OtOpcUa.Client.CLI.Helpers;
@@ -12,42 +13,92 @@ namespace ZB.MOM.WW.OtOpcUa.Client.CLI.Commands;
[Command("subscribe", Description = "Monitor a node for value changes")]
public class SubscribeCommand : CommandBase
{
/// <summary>
/// Creates the subscribe command used to monitor a node (or a subtree of nodes) for data-change
/// notifications.
/// </summary>
/// <param name="factory">The factory that creates the shared client service for the command run.</param>
public SubscribeCommand(IOpcUaClientServiceFactory factory) : base(factory)
{
}
/// <summary>
/// Gets the node ID to monitor. When <see cref="Recursive" /> is set, this node is the browse root
/// and every <c>Variable</c> child it reaches is subscribed.
/// </summary>
[CommandOption("node", 'n', Description = "Node ID to monitor", IsRequired = true)]
public string NodeId { get; init; } = default!;
/// <summary>
/// Gets the sampling interval, in milliseconds, requested for every monitored item.
/// </summary>
[CommandOption("interval", 'i', Description = "Sampling interval in milliseconds")]
public int Interval { get; init; } = 1000;
/// <summary>
/// Gets a value indicating whether the command should browse from <see cref="NodeId" />
/// and subscribe to every <c>Variable</c> in the subtree.
/// </summary>
[CommandOption("recursive", 'r', Description = "Browse recursively from --node and subscribe to every Variable found")]
public bool Recursive { get; init; }
/// <summary>
/// Gets the maximum recursion depth applied while collecting variables when <see cref="Recursive" /> is set.
/// </summary>
[CommandOption("max-depth", Description = "Maximum recursion depth when --recursive is set")]
public int MaxDepth { get; init; } = 10;
/// <summary>
/// Gets a value indicating whether per-update lines should be suppressed in favour of the final summary only.
/// </summary>
[CommandOption("quiet", 'q', Description = "Suppress per-update output; only print a final summary on Ctrl+C")]
public bool Quiet { get; init; }
/// <summary>
/// Gets the duration, in seconds, before the command auto-exits and prints its summary.
/// A value of <c>0</c> means the command runs until Ctrl+C.
/// </summary>
[CommandOption("duration", Description = "Auto-exit after N seconds and print summary (0 = run until Ctrl+C)")]
public int DurationSeconds { get; init; } = 0;
/// <summary>
/// Gets the optional path that the command should write the final summary to on exit, in addition to stdout.
/// </summary>
[CommandOption("summary-file", Description = "Write summary to this file path on exit (in addition to stdout)")]
public string? SummaryFile { get; init; }
/// <summary>
/// Connects to the server, subscribes to <see cref="NodeId" /> (or its subtree when recursive),
/// streams data-change notifications to the console, and prints a summary when the command exits.
/// </summary>
/// <param name="console">The CLI console used for output and cancellation handling.</param>
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
if (Interval <= 0)
throw new CommandException($"--interval must be greater than 0 (was {Interval}).");
if (Recursive && MaxDepth <= 0)
throw new CommandException($"--max-depth must be greater than 0 (was {MaxDepth}).");
if (DurationSeconds < 0)
throw new CommandException($"--duration must be 0 or a positive number (was {DurationSeconds}).");
NodeId rootNodeId;
try
{
rootNodeId = NodeIdParser.ParseRequired(NodeId);
}
catch (Exception ex) when (ex is FormatException or ArgumentException)
{
throw new CommandException($"Invalid --node value: {ex.Message}");
}
IOpcUaClientService? service = null;
try
{
var ct = console.RegisterCancellationHandler();
(service, _) = await CreateServiceAndConnectAsync(ct);
var rootNodeId = NodeIdParser.ParseRequired(NodeId);
var targets = new List<(NodeId nodeId, string displayPath)>();
if (Recursive)
{
@@ -1,4 +1,5 @@
using CliFx.Attributes;
using CliFx.Exceptions;
using CliFx.Infrastructure;
using Opc.Ua;
using ZB.MOM.WW.OtOpcUa.Client.CLI.Helpers;
@@ -37,14 +38,23 @@ public class WriteCommand : CommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
NodeId nodeId;
try
{
nodeId = NodeIdParser.ParseRequired(NodeId);
}
catch (Exception ex) when (ex is FormatException or ArgumentException)
{
throw new CommandException($"Invalid --node value: {ex.Message}");
}
IOpcUaClientService? service = null;
try
{
var ct = console.RegisterCancellationHandler();
(service, _) = await CreateServiceAndConnectAsync(ct);
var nodeId = NodeIdParser.ParseRequired(NodeId);
// Read current value to determine type for conversion
var currentValue = await service.ReadValueAsync(nodeId, ct);
var typedValue = ValueConverter.ConvertValue(Value, currentValue.Value);
@@ -14,7 +14,13 @@ internal sealed class DefaultApplicationConfigurationFactory : IApplicationConfi
public async Task<ApplicationConfiguration> CreateAsync(ConnectionSettings settings, CancellationToken ct)
{
var storePath = settings.CertificateStorePath;
// Resolve the canonical PKI path lazily on first use so constructing a
// ConnectionSettings instance — including the throwaway copies the client
// service builds per failover attempt — does not touch the filesystem.
// Callers that supply an explicit path override the default.
var storePath = string.IsNullOrWhiteSpace(settings.CertificateStorePath)
? ClientStoragePaths.GetPkiPath()
: settings.CertificateStorePath;
var config = new ApplicationConfiguration
{
@@ -24,9 +24,47 @@ internal sealed class DefaultEndpointDiscovery : IEndpointDiscovery
using var client = DiscoveryClient.Create(new Uri(endpointUrl));
var allEndpoints = client.GetEndpoints(null);
return EndpointSelector.SelectBest(allEndpoints, endpointUrl, requestedMode);
}
}
/// <summary>
/// Pure best-endpoint selection logic, extracted from <see cref="DefaultEndpointDiscovery"/>
/// so it can be unit tested without standing up a real <see cref="DiscoveryClient"/>.
/// </summary>
internal static class EndpointSelector
{
private static readonly ILogger Logger = Log.ForContext(typeof(EndpointSelector));
/// <summary>
/// Picks the best endpoint from the discovery response that matches the requested
/// security mode, preferring <c>Basic256Sha256</c>, and rewrites the endpoint URL
/// host to match the user-supplied URL when the discovery response advertises a
/// different hostname.
/// </summary>
/// <param name="allEndpoints">Endpoints returned by the discovery query, in any order.</param>
/// <param name="endpointUrl">The endpoint URL the operator supplied; supplies the hostname rewrite target.</param>
/// <param name="requestedMode">The requested OPC UA message security mode.</param>
/// <exception cref="InvalidOperationException">
/// Thrown when no endpoint matches <paramref name="requestedMode"/>; the message lists the
/// security mode + policy combinations the server returned so operators can diagnose mismatches.
/// </exception>
public static EndpointDescription SelectBest(
IEnumerable<EndpointDescription> allEndpoints,
string endpointUrl,
MessageSecurityMode requestedMode)
{
ArgumentNullException.ThrowIfNull(allEndpoints);
if (string.IsNullOrWhiteSpace(endpointUrl))
throw new ArgumentException("Endpoint URL must not be null or empty.", nameof(endpointUrl));
// Materialise once so we can both iterate and produce a diagnostic message
// without re-running the underlying discovery enumeration.
var endpoints = allEndpoints.ToList();
EndpointDescription? best = null;
foreach (var ep in allEndpoints)
foreach (var ep in endpoints)
{
if (ep.SecurityMode != requestedMode)
continue;
@@ -37,18 +75,21 @@ internal sealed class DefaultEndpointDiscovery : IEndpointDiscovery
continue;
}
// Prefer Basic256Sha256 when multiple endpoints match the requested mode.
if (ep.SecurityPolicyUri == SecurityPolicies.Basic256Sha256)
best = ep;
}
if (best == null)
{
var available = string.Join(", ", allEndpoints.Select(e => $"{e.SecurityMode}/{e.SecurityPolicyUri}"));
var available = string.Join(", ", endpoints.Select(e => $"{e.SecurityMode}/{e.SecurityPolicyUri}"));
throw new InvalidOperationException(
$"No endpoint found with security mode '{requestedMode}'. Available endpoints: {available}");
}
// Rewrite endpoint URL hostname to match user-supplied hostname
// Rewrite endpoint URL hostname to match user-supplied hostname. Necessary
// when the OPC UA server returns a discovery URL using a different hostname
// (e.g. internal DNS name) than the one the operator routed to.
var serverUri = new Uri(best.EndpointUrl);
var requestedUri = new Uri(endpointUrl);
if (serverUri.Host != requestedUri.Host)
@@ -73,6 +73,17 @@ internal sealed class DefaultSessionAdapter : ISessionAdapter
var writeCollection = new WriteValueCollection { writeValue };
var response = await _session.WriteAsync(null, writeCollection, ct);
// A malformed or service-level-faulted response can come back with an empty
// Results collection alongside a service fault. Surface the service result
// (or BadUnexpectedError) rather than letting Results[0] throw
// IndexOutOfRangeException upstream.
if (response.Results == null || response.Results.Count == 0)
{
var serviceResult = response.ResponseHeader?.ServiceResult.Code ?? StatusCodes.BadUnexpectedError;
throw new ServiceResultException(serviceResult,
$"Write response contained no results for node {nodeId}.");
}
return response.Results[0];
}
@@ -143,15 +154,18 @@ internal sealed class DefaultSessionAdapter : ISessionAdapter
if (continuationPoint != null)
nodesToRead[0].ContinuationPoint = continuationPoint;
_session.HistoryRead(
// Use the async overload so this method is genuinely asynchronous,
// honors the cancellation token, and does not block the caller's thread
// (which would block the UI dispatcher for client.ui consumers).
var response = await _session.HistoryReadAsync(
null,
new ExtensionObject(details),
TimestampsToReturn.Source,
continuationPoint != null,
nodesToRead,
out var results,
out _);
ct).ConfigureAwait(false);
var results = response.Results;
if (results == null || results.Count == 0)
break;
@@ -186,15 +200,17 @@ internal sealed class DefaultSessionAdapter : ISessionAdapter
new HistoryReadValueId { NodeId = nodeId }
};
_session.HistoryRead(
// Use the async overload so the method honors the cancellation token and
// does not block on a synchronous service round-trip.
var response = await _session.HistoryReadAsync(
null,
new ExtensionObject(details),
TimestampsToReturn.Source,
false,
nodesToRead,
out var results,
out _);
ct).ConfigureAwait(false);
var results = response.Results;
var allValues = new List<DataValue>();
if (results != null && results.Count > 0)
@@ -229,7 +245,9 @@ internal sealed class DefaultSessionAdapter : ISessionAdapter
{
try
{
if (_session.Connected) _session.Close();
// Use the async overload so the caller does not block on the close
// service round-trip and the cancellation token is honored.
if (_session.Connected) await _session.CloseAsync(ct).ConfigureAwait(false);
}
catch (Exception ex)
{
@@ -270,6 +288,15 @@ internal sealed class DefaultSessionAdapter : ISessionAdapter
},
ct);
// An empty Results collection paired with a service fault must surface as
// a ServiceResultException, not an IndexOutOfRangeException from Results[0].
if (result.Results == null || result.Results.Count == 0)
{
var serviceResult = result.ResponseHeader?.ServiceResult.Code ?? StatusCodes.BadUnexpectedError;
throw new ServiceResultException(serviceResult,
$"Call response contained no results for method {methodId} on {objectId}.");
}
var callResult = result.Results[0];
if (StatusCode.IsBad(callResult.StatusCode))
throw new ServiceResultException(callResult.StatusCode);
@@ -96,6 +96,11 @@ public interface IOpcUaClientService : IDisposable
/// <param name="eventId">The event identifier returned by the OPC UA server for the alarm event.</param>
/// <param name="comment">The operator acknowledgment comment to write with the method call.</param>
/// <param name="ct">The cancellation token that aborts the acknowledgment request.</param>
/// <returns>
/// <see cref="StatusCodes.Good"/> on success, or the server's bad <see cref="StatusCode"/>
/// (from the underlying <see cref="ServiceResultException"/>) when the acknowledge call
/// returns a bad result. Other transport-level failures still surface as exceptions.
/// </returns>
Task<StatusCode> AcknowledgeAlarmAsync(string conditionNodeId, byte[] eventId, string comment, CancellationToken ct = default);
/// <summary>
@@ -41,11 +41,13 @@ public sealed class ConnectionSettings
public bool AutoAcceptCertificates { get; set; } = true;
/// <summary>
/// Path to the certificate store. Defaults to a subdirectory under LocalApplicationData
/// resolved via <see cref="ClientStoragePaths"/> so the one-shot legacy-folder migration
/// runs before the path is returned.
/// Path to the certificate store. Defaults to <see cref="string.Empty"/>; the
/// consuming application configuration factory resolves the canonical path via
/// <see cref="ClientStoragePaths.GetPkiPath"/> lazily on first connect, so
/// constructing settings — including the throwaway copies built per failover
/// attempt — does not touch disk or run the legacy-folder migration probe.
/// </summary>
public string CertificateStorePath { get; set; } = ClientStoragePaths.GetPkiPath();
public string CertificateStorePath { get; set; } = string.Empty;
/// <summary>
/// Validates the settings and throws if any required values are missing or invalid.
@@ -353,11 +353,24 @@ public sealed class OpcUaClientService : IOpcUaClientService
: NodeId.Parse(conditionNodeId + ".Condition");
var acknowledgeMethodId = MethodIds.AcknowledgeableConditionType_Acknowledge;
await _session!.CallMethodAsync(
conditionObjId,
acknowledgeMethodId,
[eventId, new LocalizedText(comment)],
ct);
// CallMethodAsync throws ServiceResultException on a bad call result;
// surface that as the returned StatusCode so callers using the documented
// `Task<StatusCode>` contract (e.g. `if (StatusCode.IsBad(result))`) see
// the failure instead of an uncaught exception they did not anticipate.
try
{
await _session!.CallMethodAsync(
conditionObjId,
acknowledgeMethodId,
[eventId, new LocalizedText(comment)],
ct);
}
catch (ServiceResultException ex)
{
Logger.Warning(ex, "Failed to acknowledge alarm on {ConditionId} (status {Status})",
conditionNodeId, ex.StatusCode);
return ex.StatusCode;
}
Logger.Debug("Acknowledged alarm on {ConditionId}", conditionNodeId);
return StatusCodes.Good;
@@ -30,12 +30,6 @@ public partial class DateTimeRangePicker : UserControl
public static readonly StyledProperty<string> EndTextProperty =
AvaloniaProperty.Register<DateTimeRangePicker, string>(nameof(EndText), defaultValue: "");
public static readonly StyledProperty<DateTimeOffset?> MinDateTimeProperty =
AvaloniaProperty.Register<DateTimeRangePicker, DateTimeOffset?>(nameof(MinDateTime));
public static readonly StyledProperty<DateTimeOffset?> MaxDateTimeProperty =
AvaloniaProperty.Register<DateTimeRangePicker, DateTimeOffset?>(nameof(MaxDateTime));
private bool _isUpdating;
public DateTimeRangePicker()
@@ -67,18 +61,6 @@ public partial class DateTimeRangePicker : UserControl
set => SetValue(EndTextProperty, value);
}
public DateTimeOffset? MinDateTime
{
get => GetValue(MinDateTimeProperty);
set => SetValue(MinDateTimeProperty, value);
}
public DateTimeOffset? MaxDateTime
{
get => GetValue(MaxDateTimeProperty);
set => SetValue(MaxDateTimeProperty, value);
}
protected override void OnLoaded(RoutedEventArgs e)
{
base.OnLoaded(e);
@@ -1,4 +1,6 @@
using Avalonia;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Client.Shared;
namespace ZB.MOM.WW.OtOpcUa.Client.UI;
@@ -7,8 +9,16 @@ public class Program
[STAThread]
public static void Main(string[] args)
{
BuildAvaloniaApp()
.StartWithClassicDesktopLifetime(args);
ConfigureLogging();
try
{
BuildAvaloniaApp()
.StartWithClassicDesktopLifetime(args);
}
finally
{
Log.CloseAndFlush();
}
}
public static AppBuilder BuildAvaloniaApp()
@@ -18,4 +28,35 @@ public class Program
.WithInterFont()
.LogToTrace();
}
/// <summary>
/// Initializes the Serilog root logger with a console sink + a rolling daily file sink
/// under <c>{LocalAppData}/OtOpcUaClient/logs/</c>. CLAUDE.md mandates Serilog with a
/// rolling daily file sink as the project standard; this is also the only way the swallow
/// blocks in the alarms / subscriptions / redundancy view-models surface a diagnosable
/// trace when an operator hits a problem in the field.
/// </summary>
private static void ConfigureLogging()
{
var logsDir = Path.Combine(ClientStoragePaths.GetRoot(), "logs");
try
{
Directory.CreateDirectory(logsDir);
}
catch
{
// Best-effort; file sink will gracefully fall back if the dir can't be created.
}
Log.Logger = new LoggerConfiguration()
.MinimumLevel.Information()
.Enrich.FromLogContext()
.WriteTo.Console()
.WriteTo.File(
path: Path.Combine(logsDir, "client-ui-.log"),
rollingInterval: RollingInterval.Day,
retainedFileCountLimit: 14,
shared: true)
.CreateLogger();
}
}
@@ -2,6 +2,7 @@ using System.Collections.ObjectModel;
using CommunityToolkit.Mvvm.ComponentModel;
using CommunityToolkit.Mvvm.Input;
using Opc.Ua;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Client.Shared;
using ZB.MOM.WW.OtOpcUa.Client.Shared.Models;
using ZB.MOM.WW.OtOpcUa.Client.UI.Services;
@@ -13,9 +14,18 @@ namespace ZB.MOM.WW.OtOpcUa.Client.UI.ViewModels;
/// </summary>
public partial class AlarmsViewModel : ObservableObject
{
private static readonly ILogger Logger = Log.ForContext<AlarmsViewModel>();
private readonly IUiDispatcher _dispatcher;
private readonly IOpcUaClientService _service;
/// <summary>
/// Last user-visible status message — set when an alarm subscribe / unsubscribe / refresh
/// operation fails so the shell can surface the diagnostic instead of silently dropping it.
/// Genuine failures are distinguished from "feature not supported" (condition refresh).
/// </summary>
[ObservableProperty] private string? _statusMessage;
[ObservableProperty] private int _interval = 1000;
[ObservableProperty]
@@ -95,19 +105,25 @@ public partial class AlarmsViewModel : ObservableObject
await _service.SubscribeAlarmsAsync(sourceNodeId, Interval);
IsSubscribed = true;
StatusMessage = null;
try
{
await _service.RequestConditionRefreshAsync();
}
catch
catch (Exception refreshEx)
{
// Refresh not supported
// Condition refresh is optional on the server side — log at info level and surface
// a soft notice rather than a hard failure so the operator can tell apart "server
// does not advertise refresh" from a genuine subscribe failure.
Logger.Information(refreshEx, "RequestConditionRefresh not supported by server");
StatusMessage = "Condition refresh not supported by server (subscribed).";
}
}
catch
catch (Exception ex)
{
// Subscribe failed
Logger.Warning(ex, "SubscribeAlarms failed for {Source}", MonitoredNodeIdText ?? "(all)");
StatusMessage = $"Subscribe to alarms failed: {ex.Message}";
}
}
@@ -123,10 +139,12 @@ public partial class AlarmsViewModel : ObservableObject
{
await _service.UnsubscribeAlarmsAsync();
IsSubscribed = false;
StatusMessage = null;
}
catch
catch (Exception ex)
{
// Unsubscribe failed
Logger.Warning(ex, "UnsubscribeAlarms failed");
StatusMessage = $"Unsubscribe alarms failed: {ex.Message}";
}
}
@@ -136,10 +154,14 @@ public partial class AlarmsViewModel : ObservableObject
try
{
await _service.RequestConditionRefreshAsync();
StatusMessage = null;
}
catch
catch (Exception ex)
{
// Refresh failed
// Same as the subscribe-time fallback: refresh is server-side optional. Information-
// level log + soft status so the operator sees why an explicit refresh did nothing.
Logger.Information(ex, "RequestConditionRefresh not supported by server");
StatusMessage = "Condition refresh not supported by server.";
}
}
@@ -189,19 +211,22 @@ public partial class AlarmsViewModel : ObservableObject
await _service.SubscribeAlarmsAsync(nodeId, Interval);
IsSubscribed = true;
StatusMessage = null;
try
{
await _service.RequestConditionRefreshAsync();
}
catch
catch (Exception refreshEx)
{
// Refresh not supported
Logger.Information(refreshEx, "RequestConditionRefresh not supported by server (restore path)");
StatusMessage = "Condition refresh not supported by server (restored subscription).";
}
}
catch
catch (Exception ex)
{
// Subscribe failed
Logger.Warning(ex, "RestoreAlarmSubscription failed for {Source}", sourceNodeId);
StatusMessage = $"Restore alarm subscription failed: {ex.Message}";
}
}
@@ -1,6 +1,7 @@
using System.Collections.ObjectModel;
using CommunityToolkit.Mvvm.ComponentModel;
using CommunityToolkit.Mvvm.Input;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Client.Shared;
using ZB.MOM.WW.OtOpcUa.Client.Shared.Models;
using ZB.MOM.WW.OtOpcUa.Client.UI.Services;
@@ -12,6 +13,8 @@ namespace ZB.MOM.WW.OtOpcUa.Client.UI.ViewModels;
/// </summary>
public partial class MainWindowViewModel : ObservableObject, IDisposable
{
private static readonly ILogger Logger = Log.ForContext<MainWindowViewModel>();
private readonly IUiDispatcher _dispatcher;
private readonly IOpcUaClientServiceFactory _factory;
private readonly ISettingsService _settingsService;
@@ -137,6 +140,15 @@ public partial class MainWindowViewModel : ObservableObject, IDisposable
{
if (args.PropertyName == nameof(AlarmsViewModel.ActiveAlarmCount))
_dispatcher.Post(() => ActiveAlarmCount = Alarms.ActiveAlarmCount);
else if (args.PropertyName == nameof(AlarmsViewModel.StatusMessage)
&& !string.IsNullOrEmpty(Alarms.StatusMessage))
_dispatcher.Post(() => StatusMessage = Alarms.StatusMessage!);
};
Subscriptions.PropertyChanged += (_, args) =>
{
if (args.PropertyName == nameof(SubscriptionsViewModel.StatusMessage)
&& !string.IsNullOrEmpty(Subscriptions.StatusMessage))
_dispatcher.Post(() => StatusMessage = Subscriptions.StatusMessage!);
};
History = new HistoryViewModel(_service, _dispatcher);
@@ -244,15 +256,17 @@ public partial class MainWindowViewModel : ObservableObject, IDisposable
SessionLabel = $"{info.ServerName} | Session: {info.SessionName} ({info.SessionId})";
});
// Load redundancy info
// Load redundancy info — the server may not implement the redundancy facet, in which
// case we leave RedundancyInfo null but log so a field diagnosis can tell the difference
// between "facet not advertised" and "facet errored". The connection itself stays up.
try
{
var redundancy = await _service!.GetRedundancyInfoAsync();
_dispatcher.Post(() => RedundancyInfo = redundancy);
}
catch
catch (Exception redundancyEx)
{
// Redundancy info not available
Logger.Information(redundancyEx, "GetRedundancyInfo unavailable on this server");
}
// Load root nodes
@@ -2,6 +2,7 @@ using System.Collections.ObjectModel;
using CommunityToolkit.Mvvm.ComponentModel;
using CommunityToolkit.Mvvm.Input;
using Opc.Ua;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Client.Shared;
using ZB.MOM.WW.OtOpcUa.Client.Shared.Models;
using ZB.MOM.WW.OtOpcUa.Client.UI.Services;
@@ -13,9 +14,17 @@ namespace ZB.MOM.WW.OtOpcUa.Client.UI.ViewModels;
/// </summary>
public partial class SubscriptionsViewModel : ObservableObject
{
private static readonly ILogger Logger = Log.ForContext<SubscriptionsViewModel>();
private readonly IUiDispatcher _dispatcher;
private readonly IOpcUaClientService _service;
/// <summary>
/// Last user-visible status message — set when a subscribe/unsubscribe operation fails so the
/// shell can surface the diagnostic instead of silently dropping the error. Cleared on success.
/// </summary>
[ObservableProperty] private string? _statusMessage;
[ObservableProperty]
[NotifyCanExecuteChangedFor(nameof(AddSubscriptionCommand))]
[NotifyCanExecuteChangedFor(nameof(RemoveSubscriptionCommand))]
@@ -85,11 +94,13 @@ public partial class SubscriptionsViewModel : ObservableObject
{
ActiveSubscriptions.Add(new SubscriptionItemViewModel(nodeIdStr, interval));
SubscriptionCount = ActiveSubscriptions.Count;
StatusMessage = null;
});
}
catch
catch (Exception ex)
{
// Subscription failed; no item added
Logger.Warning(ex, "AddSubscription failed for {NodeId}", nodeIdStr);
_dispatcher.Post(() => StatusMessage = $"Subscribe failed for {nodeIdStr}: {ex.Message}");
}
}
@@ -116,9 +127,11 @@ public partial class SubscriptionsViewModel : ObservableObject
_dispatcher.Post(() => ActiveSubscriptions.Remove(item));
}
catch
catch (Exception ex)
{
// Unsubscribe failed for this item; continue with others
Logger.Warning(ex, "Unsubscribe failed for {NodeId}", item.NodeId);
_dispatcher.Post(() => StatusMessage = $"Unsubscribe failed for {item.NodeId}: {ex.Message}");
// Continue with the other items in the batch.
}
}
@@ -146,11 +159,13 @@ public partial class SubscriptionsViewModel : ObservableObject
{
ActiveSubscriptions.Add(new SubscriptionItemViewModel(nodeIdStr, intervalMs));
SubscriptionCount = ActiveSubscriptions.Count;
StatusMessage = null;
});
}
catch
catch (Exception ex)
{
// Subscription failed
Logger.Warning(ex, "AddSubscriptionForNode failed for {NodeId}", nodeIdStr);
_dispatcher.Post(() => StatusMessage = $"Subscribe failed for {nodeIdStr}: {ex.Message}");
}
}
@@ -186,9 +201,10 @@ public partial class SubscriptionsViewModel : ObservableObject
foreach (var child in children)
await AddSubscriptionRecursiveAsync(child.NodeId, child.NodeClass, intervalMs, maxDepth, currentDepth + 1);
}
catch
catch (Exception ex)
{
// Browse failed for this node; skip it
Logger.Warning(ex, "Recursive browse failed for {NodeId}; skipping subtree", nodeIdStr);
_dispatcher.Post(() => StatusMessage = $"Browse failed for {nodeIdStr}: {ex.Message}");
}
}
@@ -78,7 +78,7 @@
<TextBox Text="{Binding CertificateStorePath}"
Width="370"
IsReadOnly="True"
Watermark="(default: AppData/LmxOpcUaClient/pki)" />
Watermark="(default: AppData/OtOpcUaClient/pki)" />
<Button Name="BrowseCertPathButton"
Content="..."
Width="30"
@@ -2,6 +2,7 @@ using System.ComponentModel;
using System.Reflection;
using Avalonia.Controls;
using Avalonia.Interactivity;
using Avalonia.Platform.Storage;
using SkiaSharp;
using Svg.Skia;
using ZB.MOM.WW.OtOpcUa.Client.UI.ViewModels;
@@ -126,15 +127,34 @@ public partial class MainWindow : Window
{
if (DataContext is not MainWindowViewModel vm) return;
var dialog = new OpenFolderDialog
var topLevel = TopLevel.GetTopLevel(this);
if (topLevel == null) return;
IStorageFolder? startLocation = null;
if (!string.IsNullOrEmpty(vm.CertificateStorePath))
{
try
{
startLocation = await topLevel.StorageProvider.TryGetFolderFromPathAsync(vm.CertificateStorePath);
}
catch
{
// Best-effort: if the existing path can't be resolved (missing/permission), open the dialog without it.
}
}
var folders = await topLevel.StorageProvider.OpenFolderPickerAsync(new FolderPickerOpenOptions
{
Title = "Select Certificate Store Folder",
Directory = vm.CertificateStorePath
};
AllowMultiple = false,
SuggestedStartLocation = startLocation
});
var result = await dialog.ShowAsync(this);
if (!string.IsNullOrEmpty(result))
vm.CertificateStorePath = result;
if (folders.Count == 0) return;
var picked = folders[0].TryGetLocalPath();
if (!string.IsNullOrEmpty(picked))
vm.CertificateStorePath = picked;
}
protected override void OnClosing(WindowClosingEventArgs e)
@@ -19,6 +19,7 @@
<PackageReference Include="CommunityToolkit.Mvvm" Version="8.4.0"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog.Sinks.Console" Version="6.0.0"/>
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0"/>
</ItemGroup>
<ItemGroup>
@@ -48,6 +48,71 @@ public sealed class ScriptedAlarmEngine : IDisposable
// snapshot enumeration safe. The only write shapes are indexer-set and Clear,
// both of which ConcurrentDictionary supports atomically. (Core.ScriptedAlarms-001)
private readonly ConcurrentDictionary<string, AlarmState> _alarms = new(StringComparer.Ordinal);
/// <summary>
/// Per-alarm reusable evaluation scratch. The read-cache dictionary and the
/// <see cref="AlarmPredicateContext"/> instance are both allocated once per
/// alarm (on first evaluation) and reused across every subsequent re-eval —
/// the hot path no longer allocates a fresh dictionary + context per upstream
/// tag change. Safe because <see cref="EvaluatePredicateToStateAsync"/> only
/// runs under <see cref="_evalGate"/>, which serialises every evaluation:
/// two threads can never observe the same scratch in a half-refilled state.
/// Cleared in <see cref="LoadAsync"/> alongside <see cref="_alarms"/>.
/// (Core.ScriptedAlarms-009)
/// </summary>
private readonly ConcurrentDictionary<string, AlarmScratch> _scratchByAlarmId =
new(StringComparer.Ordinal);
/// <summary>
/// Compile cache for every alarm predicate. Routes <see cref="LoadAsync"/>'s
/// <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> calls through the
/// cache so the collectible <see cref="System.Runtime.Loader.AssemblyLoadContext"/>
/// each compile produces is actually disposed on the publish-replace path
/// (Core.Scripting-016): the cache's <see cref="CompiledScriptCache{TContext, TResult}.Clear"/>
/// disposes every materialised evaluator before dropping its dictionary entry,
/// so a config-publish releases the prior generation's ALCs and the per-publish
/// accretion the Core.Scripting-008 fix targeted is actually freed in production.
/// Pre-fix the engine called <c>ScriptEvaluator.Compile</c> directly, which left
/// the ALCs rooted until the process exited — defeating -008 on the real path.
/// </summary>
private readonly CompiledScriptCache<AlarmPredicateContext, bool> _compileCache = new();
/// <summary>
/// Test-only diagnostic: returns the per-alarm scratch read-cache dictionary
/// if one has been allocated, else null. Used by Core.ScriptedAlarms-009
/// regression tests to assert the scratch is reused across evaluations
/// (two reads return the same instance).
/// </summary>
/// <remarks>
/// <b>Synchronization:</b> the returned <see cref="IReadOnlyDictionary{TKey, TValue}"/>
/// is the engine's live mutable read-cache. It is refilled in place by
/// <c>RefillReadCache</c> on every predicate evaluation, under <c>_evalGate</c>.
/// Test callers MUST NOT iterate this dictionary while the engine is
/// actively evaluating (i.e. while an upstream change is mid-flight); the
/// refill clears the dict before repopulating and a concurrent iterator
/// would observe torn / partial state. Safe uses are: reference-identity
/// comparisons (e.g. asserting the same instance is reused across calls),
/// and single-key reads against an engine that has quiesced after a
/// deterministic upstream push. Anything more involved should snapshot a
/// copy under the gate. (Core.ScriptedAlarms-013.)
/// </remarks>
internal IReadOnlyDictionary<string, DataValueSnapshot>? TryGetScratchReadCacheForTest(string alarmId)
=> _scratchByAlarmId.TryGetValue(alarmId, out var s) ? s.ReadCache : null;
/// <summary>
/// Test-only diagnostic: returns the per-alarm <see cref="AlarmPredicateContext"/>
/// if one has been allocated, else null. Companion to
/// <see cref="TryGetScratchReadCacheForTest"/>.
/// </summary>
/// <remarks>
/// <b>Synchronization:</b> the returned context wraps the same live
/// read-cache as <see cref="TryGetScratchReadCacheForTest"/> — the same
/// "don't iterate during an in-flight evaluation" caveat applies. Safe
/// for reference-identity assertions on a quiesced engine.
/// (Core.ScriptedAlarms-013.)
/// </remarks>
internal AlarmPredicateContext? TryGetScratchContextForTest(string alarmId)
=> _scratchByAlarmId.TryGetValue(alarmId, out var s) ? s.Context : null;
private readonly ConcurrentDictionary<string, DataValueSnapshot> _valueCache
= new(StringComparer.Ordinal);
private readonly Dictionary<string, HashSet<string>> _alarmsReferencing
@@ -108,6 +173,14 @@ public sealed class ScriptedAlarmEngine : IDisposable
UnsubscribeFromUpstream();
_alarms.Clear();
_alarmsReferencing.Clear();
// Drop the prior generation's per-alarm scratch buffers — definitions may
// have changed (different Inputs, different Logger), so any reuse would be
// unsafe. (Core.ScriptedAlarms-009)
_scratchByAlarmId.Clear();
// Dispose every compiled-predicate ALC from the prior generation BEFORE we
// recompile this one. Skipping this is what made Core.Scripting-008 a
// no-op in production. (Core.Scripting-016)
_compileCache.Clear();
var compileFailures = new List<string>();
foreach (var def in definitions)
@@ -122,7 +195,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
continue;
}
var evaluator = ScriptEvaluator<AlarmPredicateContext, bool>.Compile(def.PredicateScriptSource);
// Route through CompiledScriptCache so the emitted assembly's
// collectible ALC participates in publish-replace cleanup.
// (Core.Scripting-016)
var evaluator = _compileCache.GetOrCompile(def.PredicateScriptSource);
var timed = new TimedScriptEvaluator<AlarmPredicateContext, bool>(evaluator, _scriptTimeout);
var logger = _loggerFactory.Create(def.AlarmId);
@@ -354,7 +430,13 @@ public sealed class ScriptedAlarmEngine : IDisposable
AlarmState state, AlarmConditionState seed, DateTime nowUtc, CancellationToken ct,
List<ScriptedAlarmEvent>? pendingEmissions = null)
{
var inputs = BuildReadCache(state.Inputs);
// Look up (or lazily allocate) the per-alarm scratch and refill its read cache
// in place. The dictionary + context survive across evaluations so the hot path
// no longer allocates per upstream tag change. (Core.ScriptedAlarms-009)
var scratch = _scratchByAlarmId.GetOrAdd(
state.Definition.AlarmId,
_ => new AlarmScratch(state.Inputs, state.Logger, _clock));
RefillReadCache(scratch.ReadCache, state.Inputs);
// Cold-start guard — skip the predicate when any referenced upstream tag has no
// cached value yet (the upstream subscription hasn't delivered its first push).
@@ -362,9 +444,9 @@ public sealed class ScriptedAlarmEngine : IDisposable
// every tick until the cache fills, spamming the log with identical stack traces.
// Bad quality is treated the same: the input isn't available at the predicate's
// expected type, so the only defensible move is to hold the prior condition state.
if (!AreInputsReady(inputs)) return seed;
if (!AreInputsReady(scratch.ReadCache)) return seed;
var context = new AlarmPredicateContext(inputs, state.Logger, _clock);
var context = scratch.Context;
bool predicateTrue;
try
@@ -399,12 +481,20 @@ public sealed class ScriptedAlarmEngine : IDisposable
return result.State;
}
private IReadOnlyDictionary<string, DataValueSnapshot> BuildReadCache(IReadOnlySet<string> inputs)
/// <summary>
/// Refill <paramref name="cache"/> in place from <c>_valueCache</c>, falling
/// back to a synchronous <c>ITagUpstreamSource.ReadTag</c> for paths whose
/// first upstream push hasn't arrived yet. The dictionary is cleared and
/// repopulated under <c>_evalGate</c> so no concurrent reader can observe
/// a partial state. Replaces the old <c>BuildReadCache</c> which allocated a
/// fresh dictionary every call (Core.ScriptedAlarms-009).
/// </summary>
private void RefillReadCache(
Dictionary<string, DataValueSnapshot> cache, IReadOnlySet<string> inputs)
{
var d = new Dictionary<string, DataValueSnapshot>(StringComparer.Ordinal);
cache.Clear();
foreach (var p in inputs)
d[p] = _valueCache.TryGetValue(p, out var v) ? v : _upstream.ReadTag(p);
return d;
cache[p] = _valueCache.TryGetValue(p, out var v) ? v : _upstream.ReadTag(p);
}
/// <summary>
@@ -596,12 +686,24 @@ public sealed class ScriptedAlarmEngine : IDisposable
}
}
// Do NOT clear _alarms here: Timer.Dispose() does not wait for in-flight callbacks,
// so a ShelvingCheckAsync or ReevaluateAsync can still be running inside _evalGate.
// Those paths now re-check _disposed after acquiring the gate and bail out safely.
// Clearing _alarms outside the gate would race concurrent reads and is unnecessary
// (the whole object is being discarded). (Core.ScriptedAlarms-005)
// Safe to clear here: the Task.WhenAll drain above guaranteed no
// ReevaluateAsync / ShelvingCheckAsync is mid-flight, and _disposed=true
// prevents new background work from being queued (OnUpstreamChange bails on
// line 334). Pre-Core.Scripting-016 the comment said "Do NOT clear _alarms",
// but that was when the engine called ScriptEvaluator.Compile directly and
// held the script ALCs through _alarms→AlarmState→TimedScriptEvaluator
// forever — leaving them rooted defeated the -008 collectible-ALC unload.
// Clearing now drops the delegate references so the cache's Dispose call
// below can actually unload the emitted assemblies. (Core.ScriptedAlarms-005
// re-evaluated under -016.)
_alarms.Clear();
_alarmsReferencing.Clear();
_scratchByAlarmId.Clear();
// Dispose every compiled-predicate ALC so the engine's shutdown actually
// releases the emitted assemblies. The drain above ensures no evaluator is
// mid-call; CompiledScriptCache.Dispose internally guards against use-after-
// dispose. (Core.Scripting-016)
_compileCache.Dispose();
}
private sealed record AlarmState(
@@ -611,6 +713,37 @@ public sealed class ScriptedAlarmEngine : IDisposable
IReadOnlyList<string> TemplateTokens,
ILogger Logger,
AlarmConditionState Condition);
/// <summary>
/// Per-alarm reusable evaluation scratch. The <see cref="ReadCache"/> dictionary
/// is the same instance across every evaluation of the owning alarm — it is
/// cleared and refilled in <see cref="ScriptedAlarmEngine.RefillReadCache"/> on
/// each call. <see cref="Context"/> wraps that dictionary by reference, so a
/// refilled <see cref="ReadCache"/> is what the predicate's
/// <c>ctx.GetTag(path)</c> calls observe. (Core.ScriptedAlarms-009)
/// </summary>
/// <remarks>
/// Reuse is safe because <see cref="ScriptedAlarmEngine"/> serialises every
/// evaluation under <c>_evalGate</c>: two threads can never observe the same
/// scratch in a half-refilled state.
/// </remarks>
private sealed class AlarmScratch
{
public Dictionary<string, DataValueSnapshot> ReadCache { get; }
public AlarmPredicateContext Context { get; }
public AlarmScratch(IReadOnlySet<string> inputs, ILogger logger, Func<DateTime> clock)
{
// Pre-size to the expected input count so the first refill doesn't pay the
// dictionary-grow cost. The dictionary auto-grows if Inputs changes (it
// cannot under the current contract — Inputs is fixed at LoadAsync — but
// pre-sizing is defensive against future changes).
ReadCache = new Dictionary<string, DataValueSnapshot>(inputs.Count, StringComparer.Ordinal);
// Context holds the read cache by reference. Refilling the dictionary
// updates what the context (and the script) observes.
Context = new AlarmPredicateContext(ReadCache, logger, clock);
}
}
}
/// <summary>
@@ -30,11 +30,20 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
/// bounded by config DB (typically low thousands). If that changes in v3, add an
/// LRU eviction policy — the API stays the same.
/// </para>
/// <para>
/// <b>Lifecycle:</b> compiled scripts hold a collectible
/// <see cref="System.Runtime.Loader.AssemblyLoadContext"/> per evaluator
/// (Core.Scripting-008 fix). <see cref="Clear"/> disposes every materialised
/// evaluator before dropping its dictionary entry so the emitted assemblies are
/// eligible for GC immediately after a publish. <see cref="Dispose"/> drops the
/// cache itself for graceful server shutdown.
/// </para>
/// </remarks>
public sealed class CompiledScriptCache<TContext, TResult>
public sealed class CompiledScriptCache<TContext, TResult> : IDisposable
where TContext : ScriptContext
{
private readonly ConcurrentDictionary<string, Lazy<ScriptEvaluator<TContext, TResult>>> _cache = new();
private bool _disposed;
/// <summary>
/// Return the compiled evaluator for <paramref name="scriptSource"/>, compiling
@@ -46,6 +55,7 @@ public sealed class CompiledScriptCache<TContext, TResult>
public ScriptEvaluator<TContext, TResult> GetOrCompile(string scriptSource)
{
if (scriptSource is null) throw new ArgumentNullException(nameof(scriptSource));
if (_disposed) throw new ObjectDisposedException(nameof(CompiledScriptCache<TContext, TResult>));
var key = HashSource(scriptSource);
var lazy = _cache.GetOrAdd(key, _ => new Lazy<ScriptEvaluator<TContext, TResult>>(
@@ -72,13 +82,71 @@ public sealed class CompiledScriptCache<TContext, TResult>
/// <summary>Current entry count. Exposed for Admin UI diagnostics / tests.</summary>
public int Count => _cache.Count;
/// <summary>Drop every cached compile. Used on config generation publish + tests.</summary>
public void Clear() => _cache.Clear();
/// <summary>
/// Drop every cached compile. Used on config generation publish + tests.
/// Disposes each materialised evaluator before removing it so its collectible
/// <see cref="System.Runtime.Loader.AssemblyLoadContext"/> unloads and the
/// emitted script assembly becomes eligible for GC (Core.Scripting-008).
/// </summary>
/// <remarks>
/// Safe to call after <see cref="Dispose"/> — the operation is idempotent.
/// <see cref="Dispose"/> sets <c>_disposed = true</c> before invoking this
/// method (so callers see the post-Dispose guard on <see cref="GetOrCompile"/>),
/// but this method itself MUST run to completion so the Dispose-triggered
/// drain actually unloads every materialised evaluator's ALC. (Core.Scripting-016
/// uncovered this — a previous Clear-aborts-when-disposed guard silently
/// skipped the entire drain on Dispose, leaving emitted assemblies rooted.)
/// </remarks>
public void Clear()
{
// Snapshot (key, value) pairs and remove with the value-scoped
// TryRemove(KeyValuePair<,>) overload — same shape as the
// Core.Scripting-006 fix in GetOrCompile's catch block. A concurrent
// GetOrCompile re-add that hashes to the same key between our snapshot
// and the TryRemove inserts a *different* Lazy reference; the value-
// scoped removal sees the mismatch and leaves the fresh entry intact
// (instead of evicting + disposing it while the concurrent caller
// still holds it). The fresh evaluator and its ALC stay live for the
// concurrent caller. (Core.Scripting-014.)
foreach (var entry in _cache.ToArray())
{
if (_cache.TryRemove(entry))
DisposeLazyIfMaterialised(entry.Value);
}
}
/// <summary>True when the exact source has been compiled at least once + is still cached.</summary>
public bool Contains(string scriptSource)
=> _cache.ContainsKey(HashSource(scriptSource));
/// <summary>
/// Drop the cache and dispose every materialised evaluator. After disposal
/// <see cref="GetOrCompile"/> throws <see cref="ObjectDisposedException"/>.
/// </summary>
public void Dispose()
{
if (_disposed) return;
_disposed = true;
Clear();
}
private static void DisposeLazyIfMaterialised(Lazy<ScriptEvaluator<TContext, TResult>> lazy)
{
// IsValueCreated is false for a faulted Lazy too, so the catch in GetOrCompile
// has already taken care of failed compiles — there's no evaluator to dispose.
if (!lazy.IsValueCreated) return;
try
{
lazy.Value.Dispose();
}
catch
{
// Dispose is best-effort here: an evaluator disposal failure would leak its
// ALC but mustn't prevent the rest of the cache from clearing. The ALC
// unload itself is exception-free in practice; this is defensive.
}
}
private static string HashSource(string source)
{
var bytes = Encoding.UTF8.GetBytes(source);
@@ -72,6 +72,11 @@ public static class ForbiddenTypeAnalyzer
// a Task fan-out outlives the evaluation timeout entirely
// (Core.Scripting-003).
"System.Runtime.InteropServices",
"System.Runtime.Loader", // AssemblyLoadContext + AssemblyDependencyResolver —
// arbitrary DLL load into the host process
// (Core.Scripting-012). Namespace-prefix rather than
// type-granular so future BCL additions to this
// namespace are denied by default.
"Microsoft.Win32", // registry
];
@@ -113,6 +118,24 @@ public static class ForbiddenTypeAnalyzer
// target it without blocking those legitimate types. Denied type-granularly here.
// (Core.Scripting-010.)
"System.Threading.Thread",
// Core.Scripting-012 — broadening the references list to the BCL trusted-platform-
// assemblies set (Core.Scripting-008 follow-up) re-exposed two background-work
// vectors the original deny-list missed. Both live in System.Threading (shared
// with allowed sync primitives like CancellationToken / SemaphoreSlim), so they
// must be denied type-granularly:
//
// System.Threading.ThreadPool — QueueUserWorkItem / UnsafeQueueUserWorkItem
// re-introduce the background-fanout threat
// Core.Scripting-003 closed against
// System.Threading.Tasks.
// System.Threading.Timer — Timer(callback, …) schedules unbounded work
// that outlives the per-evaluation timeout.
//
// System.Runtime.Loader.AssemblyLoadContext is also covered, but at the namespace-
// prefix level above (System.Runtime.Loader) so future BCL additions to that
// namespace are denied by default.
"System.Threading.ThreadPool",
"System.Threading.Timer",
];
/// <summary>
@@ -1,75 +1,420 @@
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;
using System.Reflection;
using System.Runtime.Loader;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
/// <summary>
/// Compiles + runs user scripts against a <see cref="ScriptContext"/> subclass. Core
/// evaluator — no caching, no timeout, no logging side-effects yet (those land in
/// Stream A.3, A.4, A.5 respectively). Stream B + C wrap this with the dependency
/// scheduler + alarm state machine.
/// evaluator — no caching, no timeout, no logging side-effects (those land in
/// <see cref="CompiledScriptCache{TContext, TResult}"/>,
/// <see cref="TimedScriptEvaluator{TContext, TResult}"/>, and
/// <see cref="ScriptLogCompanionSink"/> respectively).
/// </summary>
/// <remarks>
/// <para>
/// Scripts are compiled against <see cref="ScriptGlobals{TContext}"/> so the
/// context member is named <c>ctx</c> in the script, matching the
/// <see cref="DependencyExtractor"/>'s walker and the Admin UI type stub.
/// Scripts are wrapped in a synthesized <c>CompiledScript.Run(globals)</c> method
/// and compiled via <see cref="CSharpCompilation"/> into a regular .NET assembly
/// that is loaded into a <b>collectible</b>
/// <see cref="AssemblyLoadContext"/>. The collectible ALC is the fix for
/// Core.Scripting-008: per-publish recompile accretion was previously unbounded
/// because Roslyn's <c>CSharpScript.CreateDelegate</c> emits into the default ALC
/// (non-collectible); now <see cref="Dispose"/> unloads the entire ALC and the
/// emitted assembly becomes eligible for GC.
/// </para>
/// <para>
/// Compile pipeline is a three-step gate: (1) Roslyn compile — catches syntax
/// errors + type-resolution failures, throws <see cref="CompilationErrorException"/>;
/// (2) <see cref="ForbiddenTypeAnalyzer"/> runs against the semantic model —
/// catches sandbox escapes that slipped past reference restrictions due to .NET's
/// type forwarding, throws <see cref="ScriptSandboxViolationException"/>; (3)
/// delegate creation — throws at this layer only for internal Roslyn bugs, not
/// user error.
/// Compile pipeline is a three-step gate, unchanged in intent from the legacy
/// <c>CSharpScript</c> path: (1) Roslyn parse + compile against the
/// <see cref="ScriptSandbox"/> allow-list — catches syntax errors, unresolved
/// types (the sandbox's first line of defense), and most type-resolution
/// failures, throwing <see cref="CompilationErrorException"/>; (2)
/// <see cref="ForbiddenTypeAnalyzer"/> runs against the semantic model — catches
/// sandbox escapes that slipped past reference restrictions due to .NET's type
/// forwarding, throwing <see cref="ScriptSandboxViolationException"/>; (3) emit
/// to an in-memory PE stream + load into the collectible ALC — throws at this
/// layer only for internal Roslyn bugs, not user error.
/// </para>
/// <para>
/// Runtime exceptions thrown from user code propagate unwrapped. The virtual-tag
/// engine (Stream B) catches them per-tag + maps to <c>BadInternalError</c>
/// quality per Phase 7 decision #11 this layer doesn't swallow anything so
/// tests can assert on the original exception type.
/// engine catches them per-tag and maps to <c>BadInternalError</c> quality
/// per Phase 7 decision #11; this layer doesn't swallow anything so tests can
/// assert on the original exception type.
/// </para>
/// <para>
/// Scripts are expected to be statement bodies that end with an explicit
/// <c>return …;</c> — the wrapper provides only the surrounding method body, so
/// the script's final-expression-yields-result behavior of legacy
/// <c>CSharpScript</c> is replaced by ordinary C# method semantics. Every script
/// in the existing test corpus already uses explicit <c>return</c>; this is a
/// documented authoring convention.
/// </para>
/// </remarks>
public sealed class ScriptEvaluator<TContext, TResult>
public sealed class ScriptEvaluator<TContext, TResult> : IDisposable
where TContext : ScriptContext
{
private readonly ScriptRunner<TResult> _runner;
private readonly ScriptAssemblyLoadContext _alc;
private readonly Func<ScriptGlobals<TContext>, TResult> _func;
private bool _disposed;
private ScriptEvaluator(ScriptRunner<TResult> runner)
private ScriptEvaluator(ScriptAssemblyLoadContext alc, Func<ScriptGlobals<TContext>, TResult> func)
{
_runner = runner;
_alc = alc;
_func = func;
}
public static ScriptEvaluator<TContext, TResult> Compile(string scriptSource)
{
if (scriptSource is null) throw new ArgumentNullException(nameof(scriptSource));
var options = ScriptSandbox.Build(typeof(TContext));
var script = CSharpScript.Create<TResult>(
code: scriptSource,
options: options,
globalsType: typeof(ScriptGlobals<TContext>));
var sandbox = ScriptSandbox.Build(typeof(TContext));
// Step 1 — Roslyn compile. Throws CompilationErrorException on syntax / type errors.
var diagnostics = script.Compile();
// Step 1 — synthesize a wrapper class around the script body and parse it. The
// wrapper's `Run` method is what we invoke at runtime; the user's source is
// pasted in as its body so explicit `return` semantics apply.
var wrapperSource = BuildWrapperSource(scriptSource, sandbox.Imports);
var syntaxTree = CSharpSyntaxTree.ParseText(wrapperSource);
// Step 2forbidden-type semantic analysis. Defense-in-depth against reference-list
// leaks due to type forwarding.
var rejections = ForbiddenTypeAnalyzer.Analyze(script.GetCompilation());
// Step 1adefend against wrapper-source injection (Core.Scripting-013).
// A script body of `return 0; } public static int Evil() { return 0;` would
// close the synthesized `Run` method early, declare a sibling `Evil` method
// inside the synthesized `CompiledScript` class, and leave the wrapper's
// trailing `}` balanced. ForbiddenTypeAnalyzer still walks the injected
// members so this isn't a direct sandbox escape, but it relaxes the
// documented "method body" authoring contract and widens the analyzer's
// surface. Reject by requiring that the parsed `CompiledScript` class
// contains exactly one member declaration (the `Run` method).
EnforceSingleRunMember(syntaxTree);
// Step 2 — Roslyn compile against the sandbox allow-list. Anything not in the
// references set is unresolved and produces a compiler error.
var assemblyName = "ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled." +
Guid.NewGuid().ToString("N");
var compileOptions = new CSharpCompilationOptions(
OutputKind.DynamicallyLinkedLibrary,
optimizationLevel: OptimizationLevel.Release,
allowUnsafe: false,
// Don't generate XML doc warnings for the synthesized wrapper.
warningLevel: 4,
nullableContextOptions: NullableContextOptions.Enable);
var compilation = CSharpCompilation.Create(
assemblyName,
syntaxTrees: new[] { syntaxTree },
references: sandbox.References,
options: compileOptions);
var compileDiagnostics = compilation.GetDiagnostics();
var compileErrors = compileDiagnostics
.Where(d => d.Severity == DiagnosticSeverity.Error)
.ToArray();
if (compileErrors.Length > 0)
throw new CompilationErrorException(compileErrors);
// Step 3 — forbidden-type semantic analysis. Defense-in-depth against
// reference-list leaks due to type forwarding.
var rejections = ForbiddenTypeAnalyzer.Analyze(compilation);
if (rejections.Count > 0)
throw new ScriptSandboxViolationException(rejections);
// Step 3materialize the callable delegate.
var runner = script.CreateDelegate();
return new ScriptEvaluator<TContext, TResult>(runner);
// Step 4emit to an in-memory PE stream and load into a collectible ALC.
using var peStream = new MemoryStream();
var emitResult = compilation.Emit(peStream);
if (!emitResult.Success)
{
var emitErrors = emitResult.Diagnostics
.Where(d => d.Severity == DiagnosticSeverity.Error)
.ToArray();
throw new CompilationErrorException(emitErrors);
}
peStream.Position = 0;
var alc = new ScriptAssemblyLoadContext(assemblyName);
Assembly assembly;
try
{
assembly = alc.LoadFromStream(peStream);
}
catch
{
// Failed to load — drop the ALC so we don't leak a half-initialised one.
alc.Unload();
throw;
}
// Step 5 — resolve the wrapper's Run method and bind a typed delegate. The
// wrapper source above puts the type in this exact namespace + class — keep the
// names in sync with BuildWrapperSource.
Func<ScriptGlobals<TContext>, TResult> func;
try
{
var wrapperType = assembly.GetType(
"ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled.CompiledScript",
throwOnError: true)!;
var runMethod = wrapperType.GetMethod(
"Run",
BindingFlags.Public | BindingFlags.Static)
?? throw new InvalidOperationException(
"Synthesized wrapper is missing the public static Run method.");
func = (Func<ScriptGlobals<TContext>, TResult>)Delegate.CreateDelegate(
typeof(Func<ScriptGlobals<TContext>, TResult>), runMethod);
}
catch
{
alc.Unload();
throw;
}
return new ScriptEvaluator<TContext, TResult>(alc, func);
}
/// <summary>Run against an already-constructed context.</summary>
public Task<TResult> RunAsync(TContext context, CancellationToken ct = default)
{
if (_disposed) throw new ObjectDisposedException(nameof(ScriptEvaluator<TContext, TResult>));
if (context is null) throw new ArgumentNullException(nameof(context));
ct.ThrowIfCancellationRequested();
var globals = new ScriptGlobals<TContext> { ctx = context };
return _runner(globals, ct);
// The user's script is synchronous (Roslyn emits a static method that returns
// TResult directly). We surface a Task<TResult> only to keep the existing
// RunAsync contract consumers depend on. TimedScriptEvaluator wraps this in
// Task.Run so a long-running script still honours its wall-clock budget.
var result = _func(globals);
return Task.FromResult(result);
}
/// <summary>
/// Unload the collectible <see cref="AssemblyLoadContext"/> that holds the emitted
/// script assembly so the runtime can reclaim it. After disposal the evaluator can
/// no longer be invoked — call <see cref="ScriptEvaluator{TContext, TResult}.Compile"/>
/// again for a fresh one. Dispose is idempotent.
/// </summary>
/// <remarks>
/// Unload is <i>eligible-for-collection</i>, not synchronous: the assembly is
/// reclaimed when the GC determines no live references remain. The cache disposes
/// evaluators in <see cref="CompiledScriptCache{TContext, TResult}.Clear"/> so a
/// config-generation publish releases the prior generation in one sweep; the
/// reclaim then races with the next GC cycle. Tests verify the reclaim via
/// <see cref="WeakReference"/> + <see cref="GC.Collect()"/>.
/// </remarks>
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_alc.Unload();
}
/// <summary>
/// Reject scripts whose source contains brace-balanced injections that would
/// declare sibling members alongside the synthesized <c>CompiledScript.Run</c>
/// method. The expected shape is a single <c>CompiledScript</c> class with
/// exactly one member — the <c>Run</c> method. Anything else (a sibling
/// method, nested class, additional class in the namespace, free-floating
/// top-level statement) means the user source closed the synthesized braces
/// early and injected its own declarations. (Core.Scripting-013.)
/// </summary>
private static void EnforceSingleRunMember(SyntaxTree syntaxTree)
{
var root = syntaxTree.GetCompilationUnitRoot();
// The compilation unit must hold exactly one type declaration — our
// CompiledScript. Anything else means the user closed the synthesized
// namespace or class early and injected another type declaration.
var typeMembers = root.DescendantNodes()
.OfType<Microsoft.CodeAnalysis.CSharp.Syntax.BaseTypeDeclarationSyntax>()
.ToArray();
if (typeMembers.Length != 1 || typeMembers[0].Identifier.ValueText != "CompiledScript")
{
throw new CompilationErrorException(new[]
{
Diagnostic.Create(
new DiagnosticDescriptor(
id: "LMX001",
title: "Script wrapper injection",
messageFormat: "Script source must be a statement body. Declarations of " +
"additional types alongside the wrapper's CompiledScript class " +
"are not allowed; check for unbalanced braces or stray " +
"`class` / `namespace` keywords in the source. (Core.Scripting-013)",
category: "Sandbox",
defaultSeverity: DiagnosticSeverity.Error,
isEnabledByDefault: true),
typeMembers.Length > 1 ? typeMembers[1].Identifier.GetLocation() : Location.None),
});
}
// The CompiledScript class itself must contain exactly one member — the Run
// method. A second member means the user closed Run early and started a sibling.
var classMembers = ((Microsoft.CodeAnalysis.CSharp.Syntax.ClassDeclarationSyntax)typeMembers[0]).Members;
if (classMembers.Count != 1 ||
classMembers[0] is not Microsoft.CodeAnalysis.CSharp.Syntax.MethodDeclarationSyntax m ||
m.Identifier.ValueText != "Run")
{
throw new CompilationErrorException(new[]
{
Diagnostic.Create(
new DiagnosticDescriptor(
id: "LMX002",
title: "Script wrapper injection",
messageFormat: "Script source must be a statement body. Declarations of " +
"sibling members (methods, properties, nested types) alongside " +
"the wrapper's Run method are not allowed; check for unbalanced " +
"braces or a stray `}` followed by a `public`/`private`/`static` " +
"declaration in the source. (Core.Scripting-013)",
category: "Sandbox",
defaultSeverity: DiagnosticSeverity.Error,
isEnabledByDefault: true),
classMembers.Count > 1 ? classMembers[1].GetLocation() : Location.None),
});
}
}
/// <summary>
/// Synthesize the source we hand to Roslyn. The user's script body is pasted
/// verbatim inside <c>CompiledScript.Run</c>; the <c>using</c> directives mirror
/// <see cref="ScriptSandbox"/>'s imports so scripts can write <c>Math.Abs</c>
/// instead of <c>System.Math.Abs</c>.
/// </summary>
private static string BuildWrapperSource(string userSource, IReadOnlyList<string> imports)
{
var sb = new StringBuilder();
foreach (var import in imports)
sb.Append("using ").Append(import).AppendLine(";");
sb.AppendLine();
sb.AppendLine("namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled;");
sb.AppendLine();
sb.AppendLine("public static class CompiledScript");
sb.AppendLine("{");
sb.Append(" public static ").Append(ToCSharpTypeName(typeof(TResult)))
.Append(" Run(").Append(ToCSharpTypeName(typeof(ScriptGlobals<TContext>)))
.AppendLine(" globals)");
sb.AppendLine(" {");
sb.AppendLine(" var ctx = globals.ctx;");
// User source ends with `return X;` per the authoring convention; we paste it
// verbatim. The leading newline keeps Roslyn diagnostics' line numbers usable
// by operators (errors point at the user's source line, not the wrapper).
sb.AppendLine("#line 1");
sb.AppendLine(userSource);
sb.AppendLine(" }");
sb.AppendLine("}");
return sb.ToString();
}
/// <summary>
/// Convert a runtime <see cref="Type"/> to a C# type-name string suitable for
/// emitting into Roslyn source. Uses <c>global::</c>-qualified FQNs to avoid
/// accidental capture by the wrapper's <c>using</c> directives, handles nested
/// types (<c>+</c> → <c>.</c>), and recurses for generic arguments so the
/// <c>ScriptGlobals&lt;TContext&gt;</c> parameter is emitted correctly.
/// </summary>
private static string ToCSharpTypeName(Type t)
{
if (t == typeof(void)) return "void";
// Primitive aliases keep the synthesized source readable when diagnostic
// logging dumps it; functionally identical to the FQN form.
if (t == typeof(bool)) return "bool";
if (t == typeof(byte)) return "byte";
if (t == typeof(sbyte)) return "sbyte";
if (t == typeof(short)) return "short";
if (t == typeof(ushort)) return "ushort";
if (t == typeof(int)) return "int";
if (t == typeof(uint)) return "uint";
if (t == typeof(long)) return "long";
if (t == typeof(ulong)) return "ulong";
if (t == typeof(float)) return "float";
if (t == typeof(double)) return "double";
if (t == typeof(decimal)) return "decimal";
if (t == typeof(string)) return "string";
if (t == typeof(object)) return "object";
if (Nullable.GetUnderlyingType(t) is { } inner)
return ToCSharpTypeName(inner) + "?";
if (t.IsArray)
return ToCSharpTypeName(t.GetElementType()!) + "[]";
if (t.IsGenericType)
{
// Walk the FullName by '.' segments (after '+ → .'). For each segment
// ending with `Name\`N`, consume N generic arguments and emit them as
// `<…>` on that segment. Nested generic-in-generic (Outer<T>.Inner<U>)
// emits as `global::Ns.Outer<T>.Inner<U>` — valid C#. The pre-fix code
// used `IndexOf('`')` to find the FIRST backtick and truncated the
// entire name there, silently dropping the rest of the nested-generic
// closed args. (Core.Scripting-015.)
var rawName = t.GetGenericTypeDefinition().FullName!.Replace('+', '.');
var allArgs = t.GetGenericArguments();
var segments = rawName.Split('.');
var argIndex = 0;
var sb = new StringBuilder("global::");
for (int i = 0; i < segments.Length; i++)
{
if (i > 0) sb.Append('.');
var seg = segments[i];
var backtick = seg.IndexOf('`');
if (backtick >= 0)
{
var arity = int.Parse(seg.AsSpan(backtick + 1));
sb.Append(seg, 0, backtick);
sb.Append('<');
for (int j = 0; j < arity; j++)
{
if (j > 0) sb.Append(", ");
sb.Append(ToCSharpTypeName(allArgs[argIndex++]));
}
sb.Append('>');
}
else
{
sb.Append(seg);
}
}
return sb.ToString();
}
return "global::" + t.FullName!.Replace('+', '.');
}
}
/// <summary>
/// Collectible <see cref="AssemblyLoadContext"/> that hosts a single emitted script
/// assembly. Created per <see cref="ScriptEvaluator{TContext, TResult}"/> instance so
/// <see cref="AssemblyLoadContext.Unload"/> releases exactly that script. Resolves
/// dependencies via the default ALC — script assemblies reference the BCL + the
/// application's own types, all of which live in the default context.
/// </summary>
internal sealed class ScriptAssemblyLoadContext : AssemblyLoadContext
{
public ScriptAssemblyLoadContext(string name) : base(name, isCollectible: true)
{
}
protected override Assembly? Load(AssemblyName assemblyName) => null;
}
/// <summary>
/// Thrown by <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> when Roslyn
/// reports compile-time errors against the wrapper source. Mirrors the
/// <c>Microsoft.CodeAnalysis.Scripting.CompilationErrorException</c> from the legacy
/// <c>CSharpScript</c> path so callers (engines + the Admin test-harness) keep the
/// same catch site after the Core.Scripting-008 rewrite.
/// </summary>
public sealed class CompilationErrorException : Exception
{
public IReadOnlyList<Diagnostic> Diagnostics { get; }
public CompilationErrorException(IReadOnlyList<Diagnostic> diagnostics)
: base(BuildMessage(diagnostics))
{
Diagnostics = diagnostics;
}
private static string BuildMessage(IReadOnlyList<Diagnostic> diagnostics)
{
if (diagnostics.Count == 0) return "Script compile failed.";
// Operators see this — match the legacy Roslyn format ("(line,col): error CSxxxx:
// message") so existing operator runbooks still match.
var first = diagnostics[0];
var rest = diagnostics.Count == 1 ? "" : $" (and {diagnostics.Count - 1} more)";
return first.ToString() + rest;
}
}
@@ -1,11 +1,10 @@
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;
using Microsoft.CodeAnalysis;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
/// <summary>
/// Factory for the <see cref="ScriptOptions"/> every user script is compiled against.
/// Factory for the compile-time sandbox every user script is built against.
/// Implements Phase 7 plan decision #6 (read-only sandbox) by whitelisting only the
/// assemblies + namespaces the script API needs; no <c>System.IO</c>, no
/// <c>System.Net</c>, no <c>System.Diagnostics.Process</c>, no
@@ -15,9 +14,12 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
/// </summary>
/// <remarks>
/// <para>
/// Roslyn's default <see cref="ScriptOptions"/> references <c>mscorlib</c> /
/// <c>System.Runtime</c> transitively which pulls in every type in the BCL — this
/// class overrides that with an explicit minimal allow-list.
/// Roslyn would otherwise pull in every type in the BCL transitively via
/// <c>mscorlib</c> / <c>System.Runtime</c> — this class overrides that with an
/// explicit minimal allow-list. The list is the same regardless of whether
/// <see cref="ScriptEvaluator{TContext, TResult}"/> uses the legacy
/// <c>CSharpScript</c> path or the collectible-<c>AssemblyLoadContext</c> path
/// (Core.Scripting-008): both go through <see cref="Build"/>.
/// </para>
/// <para>
/// Namespaces pre-imported so scripts don't have to write <c>using</c> clauses:
@@ -35,29 +37,21 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
public static class ScriptSandbox
{
/// <summary>
/// Build the <see cref="ScriptOptions"/> used for every virtual-tag / alarm
/// script. <paramref name="contextType"/> is the concrete
/// <see cref="ScriptContext"/> subclass the globals will be of — the compiler
/// uses its type to resolve <c>ctx.GetTag(...)</c> calls.
/// Build the sandbox configuration used for every virtual-tag / alarm script.
/// <paramref name="contextType"/> is the concrete <see cref="ScriptContext"/>
/// subclass the script's <c>ctx</c> will be of — the compiler uses its assembly
/// to resolve <c>ctx.GetTag(...)</c> calls.
/// </summary>
public static ScriptOptions Build(Type contextType)
public static SandboxConfig Build(Type contextType)
{
if (contextType is null) throw new ArgumentNullException(nameof(contextType));
if (!typeof(ScriptContext).IsAssignableFrom(contextType))
throw new ArgumentException(
$"Script context type must derive from {nameof(ScriptContext)}", nameof(contextType));
// Allow-listed assemblies — each explicitly chosen. Adding here is a
// plan-level decision; do not expand casually. HashSet so adding the
// contextType's assembly is idempotent when it happens to be Core.Scripting
// already.
var allowedAssemblies = new HashSet<System.Reflection.Assembly>
// OtOpcUa-owned assemblies — pinned by typeof(...) so they survive a rename.
var pinnedAssemblies = new HashSet<System.Reflection.Assembly>
{
// System.Private.CoreLib — primitives (int, double, bool, string, DateTime,
// TimeSpan, Math, Convert, nullable<T>). Can't practically script without it.
typeof(object).Assembly,
// System.Linq — IEnumerable extensions (Where / Select / Sum / Average / etc.).
typeof(System.Linq.Enumerable).Assembly,
// Core.Abstractions — DataValueSnapshot + DriverDataType so scripts can name
// the types they receive from ctx.GetTag.
typeof(DataValueSnapshot).Assembly,
@@ -72,7 +66,23 @@ public static class ScriptSandbox
contextType.Assembly,
};
var allowedImports = new[]
// BCL references. We list the trusted-platform-assemblies set restricted to
// System.* and netstandard so the synthesized wrapper can reference every BCL
// type by FQN — including the ones we forbid (HttpClient, File, Process,
// Registry, etc.). Letting those types resolve at compile is intentional: the
// hard security gate is ForbiddenTypeAnalyzer in step 3 of the compile pipeline
// (Core.Scripting-001 / -002 established the analyzer must be the sole gate
// because type forwarding makes any references-list-only restriction porous).
// The references list now serves only as scoping hygiene — out-of-band BCL
// surface (operator-authored hosting helpers, third-party packages, app code)
// is not on the list and stays unreachable.
var references = new List<MetadataReference>();
foreach (var asm in pinnedAssemblies)
references.Add(MetadataReference.CreateFromFile(asm.Location));
foreach (var path in EnumerateBclAssemblyPaths())
references.Add(MetadataReference.CreateFromFile(path));
var imports = new[]
{
"System",
"System.Linq",
@@ -80,8 +90,56 @@ public static class ScriptSandbox
"ZB.MOM.WW.OtOpcUa.Core.Scripting",
};
return ScriptOptions.Default
.WithReferences(allowedAssemblies)
.WithImports(allowedImports);
return new SandboxConfig(references, imports);
}
private static IEnumerable<string> EnumerateBclAssemblyPaths()
{
// The .NET host advertises the resolved runtime-shared-framework + BCL DLL set
// via the TRUSTED_PLATFORM_ASSEMBLIES AppContext data slot. This is what the
// ALC fallback uses when resolving assemblies, so anything in here is already
// loadable by the host process. We restrict to System.* and netstandard to keep
// the script's reachable surface to the BCL — anything else (Microsoft.*,
// application code, third-party packages happening to be in the runtime store)
// would expand the analyzer's deny-list job unnecessarily.
var raw = (string?)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES");
if (string.IsNullOrEmpty(raw))
yield break;
var separator = OperatingSystem.IsWindows() ? ';' : ':';
foreach (var path in raw.Split(separator, StringSplitOptions.RemoveEmptyEntries))
{
var name = System.IO.Path.GetFileName(path);
if (name.StartsWith("System.", StringComparison.Ordinal) ||
string.Equals(name, "netstandard.dll", StringComparison.Ordinal) ||
string.Equals(name, "mscorlib.dll", StringComparison.Ordinal) ||
// Microsoft.Win32.Registry isn't a System.* DLL but the analyzer's
// Microsoft.Win32 deny-list relies on the type being resolvable so it
// can identify + reject it (Core.Scripting-001 / -002). Add the one
// DLL we need rather than broadening to Microsoft.* (which would also
// pull in compilers, build tooling, etc.).
string.Equals(name, "Microsoft.Win32.Registry.dll", StringComparison.Ordinal))
{
yield return path;
}
}
}
}
/// <summary>
/// Compile-time sandbox configuration. Returned by <see cref="ScriptSandbox.Build"/>;
/// consumed by <see cref="ScriptEvaluator{TContext, TResult}"/>'s manual
/// <c>CSharpCompilation</c> path.
/// </summary>
/// <param name="References">
/// Metadata references (allow-listed assemblies) the script compilation is built
/// against. Anything not in this set is unresolved at compile, which is the sandbox's
/// first line of defense — <see cref="ForbiddenTypeAnalyzer"/> is the second.
/// </param>
/// <param name="Imports">
/// Namespaces pre-imported into the wrapper compilation as <c>using</c> directives
/// so scripts can write <c>Math.Abs</c> rather than <c>System.Math.Abs</c>.
/// </param>
public sealed record SandboxConfig(
IReadOnlyList<MetadataReference> References,
IReadOnlyList<string> Imports);
@@ -37,6 +37,21 @@ public sealed class VirtualTagEngine : IDisposable
private readonly DependencyGraph _graph = new();
private readonly Dictionary<string, VirtualTagState> _tags = new(StringComparer.Ordinal);
/// <summary>
/// Compile cache for every virtual-tag script. Routes <see cref="Load"/>'s
/// <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> calls through the
/// cache so the collectible <see cref="System.Runtime.Loader.AssemblyLoadContext"/>
/// each compile produces is actually disposed on the publish-replace path
/// (Core.Scripting-016): the cache's <see cref="CompiledScriptCache{TContext, TResult}.Clear"/>
/// disposes every materialised evaluator before dropping its dictionary entry,
/// so a config-publish releases the prior generation's ALCs and the per-publish
/// accretion the Core.Scripting-008 fix targeted is actually freed in production.
/// Pre-fix the engine called <c>ScriptEvaluator.Compile</c> directly, which left
/// the ALCs rooted until the process exited — defeating -008 on the real path.
/// </summary>
private readonly CompiledScriptCache<VirtualTagContext, object?> _compileCache = new();
private readonly ConcurrentDictionary<string, DataValueSnapshot> _valueCache = new(StringComparer.Ordinal);
private readonly ConcurrentDictionary<string, List<Action<string, DataValueSnapshot>>> _observers
= new(StringComparer.Ordinal);
@@ -74,6 +89,10 @@ public sealed class VirtualTagEngine : IDisposable
UnsubscribeFromUpstream();
_tags.Clear();
_graph.Clear();
// Dispose every compiled-script ALC from the prior generation BEFORE we
// recompile this one. Skipping this is what made Core.Scripting-008 a
// no-op in production (Core.Scripting-016).
_compileCache.Clear();
var compileFailures = new List<string>();
var seenPaths = new HashSet<string>(StringComparer.Ordinal);
@@ -102,7 +121,9 @@ public sealed class VirtualTagEngine : IDisposable
continue;
}
var evaluator = ScriptEvaluator<VirtualTagContext, object?>.Compile(def.ScriptSource);
// Route through CompiledScriptCache so the emitted assembly's collectible
// ALC participates in publish-replace cleanup. (Core.Scripting-016)
var evaluator = _compileCache.GetOrCompile(def.ScriptSource);
var timed = new TimedScriptEvaluator<VirtualTagContext, object?>(evaluator, _scriptTimeout);
var scriptLogger = _loggerFactory.Create(def.Path);
@@ -481,6 +502,9 @@ public sealed class VirtualTagEngine : IDisposable
UnsubscribeFromUpstream();
_tags.Clear();
_graph.Clear();
// Dispose every compiled-script ALC so the engine's shutdown actually
// releases the emitted assemblies. (Core.Scripting-016)
_compileCache.Dispose();
}
internal DependencyGraph GraphForTesting => _graph;
@@ -27,10 +27,28 @@ public abstract class AbCipCommandBase : DriverCommandBase
public int TimeoutMs { get; init; } = 5000;
/// <inheritdoc />
/// <remarks>
/// The getter validates <see cref="TimeoutMs"/> (Driver.AbCip.Cli-004) — a zero or
/// negative <c>--timeout-ms</c> would otherwise propagate as a non-positive
/// <see cref="TimeSpan"/> into the driver. The <c>init</c> accessor is unreachable
/// because CliFx binds <see cref="TimeoutMs"/> rather than <c>Timeout</c>; it throws
/// <see cref="NotSupportedException"/> so an object-initializer assignment
/// (<c>new ReadCommand { Timeout = ... }</c>) fails fast instead of being silently
/// discarded (Driver.AbCip.Cli-006).
/// </remarks>
public override TimeSpan Timeout
{
get => TimeSpan.FromMilliseconds(TimeoutMs);
init { /* driven by TimeoutMs */ }
get
{
if (TimeoutMs <= 0)
throw new CliFx.Exceptions.CommandException(
$"--timeout-ms must be > 0 (got {TimeoutMs}). " +
"Pick a positive number of milliseconds for the per-operation timeout.");
return TimeSpan.FromMilliseconds(TimeoutMs);
}
init => throw new NotSupportedException(
$"{nameof(AbCipCommandBase)}.{nameof(Timeout)} is derived from {nameof(TimeoutMs)} " +
"and cannot be assigned directly. Set TimeoutMs instead.");
}
/// <summary>
@@ -54,6 +54,9 @@ public sealed class ProbeCommand : AbCipCommandBase
finally
{
await driver.ShutdownAsync(CancellationToken.None);
// Driver.AbCip.Cli-005 — flush Serilog before process exit so buffered log
// output emitted during driver shutdown is not lost.
FlushLogging();
}
}
}
@@ -49,6 +49,9 @@ public sealed class ReadCommand : AbCipCommandBase
finally
{
await driver.ShutdownAsync(CancellationToken.None);
// Driver.AbCip.Cli-005 — flush Serilog before process exit so buffered log
// output emitted during driver shutdown is not lost.
FlushLogging();
}
}
@@ -31,6 +31,10 @@ public sealed class SubscribeCommand : AbCipCommandBase
{
ConfigureLogging();
RejectStructure(DataType);
ValidateInterval(IntervalMs);
// Touch Timeout to surface the --timeout-ms guard (Driver.AbCip.Cli-004) before
// we open a driver — fast-fail with a clean CommandException on bad operator input.
_ = Timeout;
var ct = console.RegisterCancellationHandler();
var tagName = ReadCommand.SynthesiseTagName(TagPath, DataType);
@@ -48,6 +52,13 @@ public sealed class SubscribeCommand : AbCipCommandBase
{
await driver.InitializeAsync("{}", ct);
// Driver.AbCip.Cli-003 — emit the banner BEFORE wiring OnDataChange so the
// main-thread write cannot interleave with poll-thread change-event writes.
// TextWriter.WriteLine is not guaranteed thread-safe; once the handler is
// attached and SubscribeAsync starts, change events run on the poll thread.
await console.Output.WriteLineAsync(
$"Subscribed to {TagPath} @ {IntervalMs}ms. Ctrl+C to stop.");
driver.OnDataChange += (_, e) =>
{
var line = $"[{DateTime.UtcNow:HH:mm:ss.fff}] " +
@@ -58,8 +69,6 @@ public sealed class SubscribeCommand : AbCipCommandBase
handle = await driver.SubscribeAsync([tagName], TimeSpan.FromMilliseconds(IntervalMs), ct);
await console.Output.WriteLineAsync(
$"Subscribed to {TagPath} @ {IntervalMs}ms. Ctrl+C to stop.");
try
{
await Task.Delay(System.Threading.Timeout.InfiniteTimeSpan, ct);
@@ -77,6 +86,23 @@ public sealed class SubscribeCommand : AbCipCommandBase
catch { /* teardown best-effort */ }
}
await driver.ShutdownAsync(CancellationToken.None);
// Driver.AbCip.Cli-005 — flush Serilog before process exit so buffered log
// lines emitted just before Ctrl+C are not lost on abrupt termination.
FlushLogging();
}
}
/// <summary>
/// Guards <c>--interval-ms</c> against zero or negative values (Driver.AbCip.Cli-004).
/// A non-positive interval would produce a non-positive <see cref="TimeSpan"/> into
/// <c>SubscribeAsync</c>; the CLI should fail fast with an actionable error rather
/// than relying on the downstream <c>PollGroupEngine</c> to clamp the value.
/// </summary>
internal static void ValidateInterval(int intervalMs)
{
if (intervalMs <= 0)
throw new CliFx.Exceptions.CommandException(
$"--interval-ms must be > 0 (got {intervalMs}). " +
"PollGroupEngine floors sub-250ms values, but accepts any positive integer.");
}
}
@@ -60,6 +60,9 @@ public sealed class WriteCommand : AbCipCommandBase
finally
{
await driver.ShutdownAsync(CancellationToken.None);
// Driver.AbCip.Cli-005 — flush Serilog before process exit so buffered log
// output emitted during driver shutdown is not lost.
FlushLogging();
}
}
@@ -17,7 +17,7 @@ public sealed class ProbeCommand : AbLegacyCommandBase
"the pre-populated register every SLC / MicroLogix / PLC-5 ships with.")]
public string Address { get; init; } = "N7:0";
[CommandOption("type", Description =
[CommandOption("type", 't', Description =
"PCCC data type of the probe address (default Int — matches N files).")]
public AbLegacyDataType DataType { get; init; } = AbLegacyDataType.Int;
@@ -34,7 +34,10 @@ public sealed class ProbeCommand : AbLegacyCommandBase
Writable: false);
var options = BuildOptions([probeTag]);
await using var driver = new AbLegacyDriver(options, DriverInstanceId);
// Plain `var driver`: explicit ShutdownAsync(CancellationToken.None) in the
// finally is the deliberate teardown path; combining it with `await using`
// (which itself calls ShutdownAsync) would tear the driver down twice.
var driver = new AbLegacyDriver(options, DriverInstanceId);
try
{
await driver.InitializeAsync("{}", ct);
@@ -36,7 +36,10 @@ public sealed class ReadCommand : AbLegacyCommandBase
Writable: false);
var options = BuildOptions([tag]);
await using var driver = new AbLegacyDriver(options, DriverInstanceId);
// Plain `var driver`: explicit ShutdownAsync(CancellationToken.None) in the
// finally is the deliberate teardown path; combining it with `await using`
// (which itself calls ShutdownAsync) would tear the driver down twice.
var driver = new AbLegacyDriver(options, DriverInstanceId);
try
{
await driver.InitializeAsync("{}", ct);
@@ -21,7 +21,8 @@ public sealed class SubscribeCommand : AbLegacyCommandBase
public AbLegacyDataType DataType { get; init; } = AbLegacyDataType.Int;
[CommandOption("interval-ms", 'i', Description =
"Publishing interval in milliseconds (default 1000).")]
"Publishing interval in milliseconds (default 1000). PollGroupEngine floors " +
"sub-250ms values.")]
public int IntervalMs { get; init; } = 1000;
public override async ValueTask ExecuteAsync(IConsole console)
@@ -38,8 +39,17 @@ public sealed class SubscribeCommand : AbLegacyCommandBase
Writable: false);
var options = BuildOptions([tag]);
await using var driver = new AbLegacyDriver(options, DriverInstanceId);
// Plain `var driver` (no `await using`): driver.DisposeAsync internally calls
// ShutdownAsync, so combining `await using` with an explicit finally-shutdown
// would tear the driver down twice. The explicit teardown is preferred because
// it deliberately passes CancellationToken.None — `await using` would otherwise
// happen on a cancelled `ct` path which can cut teardown short.
var driver = new AbLegacyDriver(options, DriverInstanceId);
ISubscriptionHandle? handle = null;
// Serialise console writes from the poll-thread OnDataChange callback against
// the command-thread "Subscribed to ..." line and against each other; the
// PollGroupEngine raises change events on a background timer/loop thread.
var consoleGate = new object();
try
{
await driver.InitializeAsync("{}", ct);
@@ -49,13 +59,19 @@ public sealed class SubscribeCommand : AbLegacyCommandBase
var line = $"[{DateTime.UtcNow:HH:mm:ss.fff}] " +
$"{e.FullReference} = {SnapshotFormatter.FormatValue(e.Snapshot.Value)} " +
$"({SnapshotFormatter.FormatStatus(e.Snapshot.StatusCode)})";
console.Output.WriteLine(line);
lock (consoleGate)
{
console.Output.WriteLine(line);
}
};
handle = await driver.SubscribeAsync([tagName], TimeSpan.FromMilliseconds(IntervalMs), ct);
await console.Output.WriteLineAsync(
$"Subscribed to {Address} @ {IntervalMs}ms. Ctrl+C to stop.");
lock (consoleGate)
{
console.Output.WriteLine(
$"Subscribed to {Address} @ {IntervalMs}ms. Ctrl+C to stop.");
}
try
{
await Task.Delay(System.Threading.Timeout.InfiniteTimeSpan, ct);
@@ -25,7 +25,7 @@ public sealed class WriteCommand : AbLegacyCommandBase
public AbLegacyDataType DataType { get; init; } = AbLegacyDataType.Int;
[CommandOption("value", 'v', Description =
"Value to write. Parsed per --type (booleans accept true/false/1/0).",
"Value to write. Parsed per --type (booleans accept true/false, 1/0, on/off, yes/no).",
IsRequired = true)]
public string Value { get; init; } = default!;
@@ -45,7 +45,10 @@ public sealed class WriteCommand : AbLegacyCommandBase
var parsed = ParseValue(Value, DataType);
await using var driver = new AbLegacyDriver(options, DriverInstanceId);
// Plain `var driver`: explicit ShutdownAsync(CancellationToken.None) in the
// finally is the deliberate teardown path; combining it with `await using`
// (which itself calls ShutdownAsync) would tear the driver down twice.
var driver = new AbLegacyDriver(options, DriverInstanceId);
try
{
await driver.InitializeAsync("{}", ct);
@@ -68,7 +68,10 @@ public static class SnapshotFormatter
int tagW = rows.Length == 0 ? "TAG".Length : Math.Max("TAG".Length, rows.Max(r => r.Tag.Length));
int valW = rows.Length == 0 ? "VALUE".Length : Math.Max("VALUE".Length, rows.Max(r => r.Value.Length));
int statW = rows.Length == 0 ? "STATUS".Length : Math.Max("STATUS".Length, rows.Max(r => r.Status.Length));
// source-time column is fixed-width (ISO-8601 to ms) so no max-measurement needed.
// source-time is the right-most column, so it is intentionally not measured or padded;
// when a snapshot has a non-null SourceTimestampUtc the cell is 24 chars (ISO-8601 to ms),
// and when the timestamp is null FormatTimestamp emits "-" — the resulting unalignment is
// harmless because nothing is appended after this column.
var sb = new System.Text.StringBuilder();
sb.Append("TAG".PadRight(tagW)).Append(" ")
@@ -113,12 +116,17 @@ public static class SnapshotFormatter
{
0x00000000u => "Good",
0x80000000u => "Bad",
0x80020000u => "BadInternalError",
0x80050000u => "BadCommunicationError",
0x800A0000u => "BadTimeout",
0x80310000u => "BadNoCommunication",
0x80320000u => "BadWaitingForInitialData",
0x80340000u => "BadNodeIdUnknown",
0x80330000u => "BadNodeIdInvalid",
0x803B0000u => "BadNotWritable",
0x803C0000u => "BadOutOfRange",
0x803D0000u => "BadNotSupported",
0x808B0000u => "BadDeviceFailure",
0x80740000u => "BadTypeMismatch",
0x40000000u => "Uncertain",
_ => null,
@@ -24,6 +24,8 @@ public sealed class ProbeCommand : FocasCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
// Driver.FOCAS.Cli-003: validate numeric option ranges before any driver work.
ValidateOptions();
var ct = console.RegisterCancellationHandler();
var probeTag = new FocasTagDefinition(
@@ -34,24 +36,20 @@ public sealed class ProbeCommand : FocasCommandBase
Writable: false);
var options = BuildOptions([probeTag]);
// Driver.FOCAS.Cli-004: `await using` is the sole disposal mechanism — FocasDriver.DisposeAsync
// already invokes ShutdownAsync, so a redundant explicit ShutdownAsync(CancellationToken.None)
// in a finally block ran shutdown twice. The await-using on the next line is enough.
await using var driver = new FocasDriver(options, DriverInstanceId);
try
{
await driver.InitializeAsync("{}", ct);
var snapshot = await driver.ReadAsync(["__probe"], ct);
var health = driver.GetHealth();
await driver.InitializeAsync("{}", ct);
var snapshot = await driver.ReadAsync(["__probe"], ct);
var health = driver.GetHealth();
await console.Output.WriteLineAsync($"CNC: {CncHost}:{CncPort}");
await console.Output.WriteLineAsync($"Series: {Series}");
await console.Output.WriteLineAsync($"Health: {health.State}");
if (health.LastError is { } err)
await console.Output.WriteLineAsync($"Last error: {err}");
await console.Output.WriteLineAsync();
await console.Output.WriteLineAsync(SnapshotFormatter.Format(Address, snapshot[0]));
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
}
await console.Output.WriteLineAsync($"CNC: {CncHost}:{CncPort}");
await console.Output.WriteLineAsync($"Series: {Series}");
await console.Output.WriteLineAsync($"Health: {health.State}");
if (health.LastError is { } err)
await console.Output.WriteLineAsync($"Last error: {err}");
await console.Output.WriteLineAsync();
await console.Output.WriteLineAsync(SnapshotFormatter.Format(Address, snapshot[0]));
}
}
@@ -23,6 +23,8 @@ public sealed class ReadCommand : FocasCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
// Driver.FOCAS.Cli-003: validate numeric option ranges before any driver work.
ValidateOptions();
var ct = console.RegisterCancellationHandler();
var tagName = SynthesiseTagName(Address, DataType);
@@ -34,17 +36,13 @@ public sealed class ReadCommand : FocasCommandBase
Writable: false);
var options = BuildOptions([tag]);
// Driver.FOCAS.Cli-004: `await using` is the sole disposal mechanism — FocasDriver.DisposeAsync
// already invokes ShutdownAsync, so a redundant explicit ShutdownAsync(CancellationToken.None)
// in a finally block ran shutdown twice. The await-using on the next line is enough.
await using var driver = new FocasDriver(options, DriverInstanceId);
try
{
await driver.InitializeAsync("{}", ct);
var snapshot = await driver.ReadAsync([tagName], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.Format(Address, snapshot[0]));
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
}
await driver.InitializeAsync("{}", ct);
var snapshot = await driver.ReadAsync([tagName], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.Format(Address, snapshot[0]));
}
internal static string SynthesiseTagName(string address, FocasDataType type)
@@ -25,6 +25,10 @@ public sealed class SubscribeCommand : FocasCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
// Driver.FOCAS.Cli-003: validate numeric option ranges (including the subscribe-only
// --interval-ms) before any driver work so a zero/negative interval surfaces as a
// clean CommandException rather than a tight-spinning poll loop.
ValidateOptions(IntervalMs);
var ct = console.RegisterCancellationHandler();
var tagName = ReadCommand.SynthesiseTagName(Address, DataType);
@@ -36,24 +40,59 @@ public sealed class SubscribeCommand : FocasCommandBase
Writable: false);
var options = BuildOptions([tag]);
// Driver.FOCAS.Cli-004: `await using` is the sole driver-disposal mechanism — FocasDriver.DisposeAsync
// already invokes ShutdownAsync, so a redundant ShutdownAsync(CancellationToken.None) in finally
// ran shutdown twice. Only UnsubscribeAsync stays in the finally block — that's a subscription
// lifecycle concern that is not part of driver disposal.
await using var driver = new FocasDriver(options, DriverInstanceId);
// Driver.FOCAS.Cli-002: serialize console writes from the PollGroupEngine background
// thread so overlapping poll ticks (and the "Subscribed to ..." banner from the CliFx
// invocation thread) can't interleave partial lines.
var writeLock = new object();
ISubscriptionHandle? handle = null;
try
{
await driver.InitializeAsync("{}", ct);
// Driver.FOCAS.Cli-002: route every data-change event to the CliFx console (not
// System.Console — the analyzer flags it + IConsole is the testable abstraction).
// The handler is synchronous because OnDataChange is raised from a driver
// background thread; the IConsole.Output writer is not documented as thread-safe
// so we serialize against the banner write via writeLock.
driver.OnDataChange += (_, e) =>
{
var line = $"[{DateTime.UtcNow:HH:mm:ss.fff}] " +
$"{e.FullReference} = {SnapshotFormatter.FormatValue(e.Snapshot.Value)} " +
$"({SnapshotFormatter.FormatStatus(e.Snapshot.StatusCode)})";
console.Output.WriteLine(line);
// Swallow + log write failures so a transient stdout error (closed pipe, IO
// exception on a redirected stream) cannot tear down the poll-engine
// background loop. Without this guard the unhandled exception would fault
// the long-running subscribe.
try
{
var line = $"[{DateTime.UtcNow:HH:mm:ss.fff}] " +
$"{e.FullReference} = {SnapshotFormatter.FormatValue(e.Snapshot.Value)} " +
$"({SnapshotFormatter.FormatStatus(e.Snapshot.StatusCode)})";
lock (writeLock)
{
console.Output.WriteLine(line);
}
}
catch (Exception ex)
{
Serilog.Log.Logger.Warning(ex,
"SubscribeCommand: console write failed for {Tag}; continuing poll loop.",
e.FullReference);
}
};
handle = await driver.SubscribeAsync([tagName], TimeSpan.FromMilliseconds(IntervalMs), ct);
await console.Output.WriteLineAsync(
$"Subscribed to {Address} @ {IntervalMs}ms. Ctrl+C to stop.");
// Driver.FOCAS.Cli-002: hold the lock around the banner write so the first
// poll-driven change line from the driver tick thread can't interleave with
// this banner.
lock (writeLock)
{
console.Output.WriteLine(
$"Subscribed to {Address} @ {IntervalMs}ms. Ctrl+C to stop.");
}
try
{
await Task.Delay(System.Threading.Timeout.InfiniteTimeSpan, ct);
@@ -67,10 +106,16 @@ public sealed class SubscribeCommand : FocasCommandBase
{
if (handle is not null)
{
// Driver.FOCAS.Cli-002: detach the OnDataChange handler before unsubscribe +
// disposal for symmetry with the handle teardown, so a future refactor that
// reuses the driver after the subscribe verb returns wouldn't leak a
// dangling subscription.
// (Single anonymous handler instance is captured implicitly by `await using`
// disposing the driver immediately afterwards; the unsubscribe + dispose
// sequence is what really cleans up here.)
try { await driver.UnsubscribeAsync(handle, CancellationToken.None); }
catch { /* teardown best-effort */ }
}
await driver.ShutdownAsync(CancellationToken.None);
}
}
}
@@ -29,6 +29,10 @@ public sealed class WriteCommand : FocasCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
// Driver.FOCAS.Cli-003: validate numeric option ranges before any driver work so
// a zero/negative port/timeout surfaces as a clean CommandException rather than an
// opaque downstream exception.
ValidateOptions();
var ct = console.RegisterCancellationHandler();
var tagName = ReadCommand.SynthesiseTagName(Address, DataType);
@@ -42,30 +46,49 @@ public sealed class WriteCommand : FocasCommandBase
var parsed = ParseValue(Value, DataType);
// Driver.FOCAS.Cli-004: `await using` is the sole disposal mechanism — FocasDriver.DisposeAsync
// already invokes ShutdownAsync, so a redundant explicit ShutdownAsync(CancellationToken.None)
// in a finally block ran shutdown twice. The await-using on the next line is enough.
await using var driver = new FocasDriver(options, DriverInstanceId);
try
{
await driver.InitializeAsync("{}", ct);
var results = await driver.WriteAsync([new WriteRequest(tagName, parsed)], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.FormatWrite(Address, results[0]));
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
}
await driver.InitializeAsync("{}", ct);
var results = await driver.WriteAsync([new WriteRequest(tagName, parsed)], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.FormatWrite(Address, results[0]));
}
internal static object ParseValue(string raw, FocasDataType type) => type switch
/// <summary>Parse <c>--value</c> per <see cref="FocasDataType"/>, invariant culture throughout.</summary>
/// <remarks>
/// Driver.FOCAS.Cli-001: numeric parses are wrapped so that malformed input
/// (<see cref="FormatException"/> / <see cref="OverflowException"/>) surfaces
/// as a clean <see cref="CliFx.Exceptions.CommandException"/> rather than a raw
/// .NET stack trace — matching the friendly message the Bit path already produces.
/// </remarks>
internal static object ParseValue(string raw, FocasDataType type)
{
FocasDataType.Bit => ParseBool(raw),
FocasDataType.Byte => sbyte.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.Int16 => short.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.Int32 => int.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.Float32 => float.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.Float64 => double.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.String => raw,
_ => throw new CliFx.Exceptions.CommandException($"Unsupported DataType '{type}' for write."),
};
if (type == FocasDataType.Bit) return ParseBool(raw);
if (type == FocasDataType.String) return raw;
try
{
return type switch
{
FocasDataType.Byte => (object)sbyte.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.Int16 => (object)short.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.Int32 => (object)int.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.Float32 => (object)float.Parse(raw, CultureInfo.InvariantCulture),
FocasDataType.Float64 => (object)double.Parse(raw, CultureInfo.InvariantCulture),
_ => throw new CliFx.Exceptions.CommandException($"Unsupported DataType '{type}' for write."),
};
}
catch (FormatException ex)
{
throw new CliFx.Exceptions.CommandException(
$"Value '{raw}' is not a valid {type}: {ex.Message}");
}
catch (OverflowException ex)
{
throw new CliFx.Exceptions.CommandException(
$"Value '{raw}' is out of range for {type}: {ex.Message}");
}
}
private static bool ParseBool(string raw) => raw.Trim().ToLowerInvariant() switch
{
@@ -54,4 +54,26 @@ public abstract class FocasCommandBase : DriverCommandBase
};
protected string DriverInstanceId => $"focas-cli-{CncHost}:{CncPort}";
/// <summary>
/// Driver.FOCAS.Cli-003: validate numeric option ranges at the CLI boundary so a
/// zero/negative <c>--cnc-port</c>, <c>--timeout-ms</c>, or <c>--interval-ms</c>
/// surfaces as a clean <see cref="CliFx.Exceptions.CommandException"/> rather than
/// either an opaque downstream exception (invalid <c>focas://host:&lt;n&gt;</c> /
/// zero <c>TimeSpan</c>) or a tight-spinning poll loop. The <c>--interval-ms</c>
/// option is subscribe-only — pass <c>null</c> for probe/read/write so this
/// helper can be a single shared validator.
/// </summary>
protected void ValidateOptions(int? intervalMs = null)
{
if (CncPort < 1 || CncPort > 65535)
throw new CliFx.Exceptions.CommandException(
$"--cnc-port must be in the range 1..65535 (got {CncPort}).");
if (TimeoutMs <= 0)
throw new CliFx.Exceptions.CommandException(
$"--timeout-ms must be positive (got {TimeoutMs}).");
if (intervalMs is { } iv && iv <= 0)
throw new CliFx.Exceptions.CommandException(
$"--interval-ms must be positive (got {iv}).");
}
}
@@ -22,6 +22,10 @@
<ProjectReference Include="..\..\ZB.MOM.WW.OtOpcUa.Driver.FOCAS\ZB.MOM.WW.OtOpcUa.Driver.FOCAS.csproj"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli.Tests"/>
</ItemGroup>
<!-- CLI runs the managed WireFocasClient and talks to the CNC over TCP:8193
directly — no Fwlib64.dll copy step needed. -->
@@ -1,5 +1,6 @@
using CliFx.Attributes;
using CliFx.Infrastructure;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Driver.Cli.Common;
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli.Commands;
@@ -21,6 +22,7 @@ public sealed class ProbeCommand : ModbusCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
ValidateEndpoint();
var ct = console.RegisterCancellationHandler();
// Build with one probe tag + Probe.Enabled=false so InitializeAsync connects the
@@ -39,17 +41,59 @@ public sealed class ProbeCommand : ModbusCommandBase
var snapshot = await driver.ReadAsync(["__probe"], ct);
var health = driver.GetHealth();
// Driver.Modbus.Cli-006: derive the headline verdict from BOTH the driver state
// AND the probe-read StatusCode so the operator never sees the previous
// contradictory pair (`Health: Healthy` over a Bad snapshot line). The bare
// driver state is still printed below for diagnostics, but the verdict is what
// the operator scans first.
var verdict = ComputeVerdict(health.State, snapshot[0].StatusCode);
await console.Output.WriteLineAsync($"Host: {Host}:{Port} (unit {UnitId})");
await console.Output.WriteLineAsync($"Health: {health.State}");
await console.Output.WriteLineAsync($"Verdict: {verdict}");
await console.Output.WriteLineAsync($"Driver state: {health.State}");
if (health.LastError is { } err)
await console.Output.WriteLineAsync($"Last error: {err}");
await console.Output.WriteLineAsync();
await console.Output.WriteLineAsync(
SnapshotFormatter.Format($"HR[{ProbeAddress}]", snapshot[0]));
}
catch (OperationCanceledException) when (ct.IsCancellationRequested)
{
// Driver.Modbus.Cli-005: Ctrl+C during InitializeAsync — exit quietly so CliFx
// does not render a full stack trace for a user-initiated cancellation.
await console.Output.WriteLineAsync("Cancelled.");
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
}
}
/// <summary>
/// Driver.Modbus.Cli-006: combine the driver-side <see cref="DriverState"/> with the
/// probe snapshot's OPC UA <c>StatusCode</c> into a single headline verdict. Order
/// of precedence:
/// <list type="number">
/// <item><c>FAIL</c> — driver did not reach <see cref="DriverState.Healthy"/>
/// (Faulted / Reconnecting / Unknown) OR the snapshot reports Bad quality.</item>
/// <item><c>DEGRADED</c> — driver Healthy but the snapshot quality is Uncertain.</item>
/// <item><c>OK</c> — driver Healthy and snapshot Good.</item>
/// </list>
/// </summary>
public static string ComputeVerdict(DriverState state, uint statusCode)
{
// OPC UA StatusCode top 2 bits encode the quality class:
// 0x00xxxxxx → Good, 0x40xxxxxx → Uncertain, 0x80xxxxxx / 0xC0xxxxxx → Bad.
var qualityClass = statusCode & 0xC0000000u;
var snapshotGood = qualityClass == 0x00000000u;
var snapshotUncertain = qualityClass == 0x40000000u;
if (state != DriverState.Healthy || !snapshotGood && !snapshotUncertain)
return $"FAIL (driver={state}, probe={SnapshotFormatter.FormatStatus(statusCode)})";
if (snapshotUncertain)
return $"DEGRADED (driver={state}, probe={SnapshotFormatter.FormatStatus(statusCode)})";
return $"OK (driver={state}, probe={SnapshotFormatter.FormatStatus(statusCode)})";
}
}
@@ -46,6 +46,7 @@ public sealed class ReadCommand : ModbusCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
ValidateEndpoint();
var ct = console.RegisterCancellationHandler();
var tagName = SynthesiseTagName(Region, Address, DataType);
@@ -68,6 +69,12 @@ public sealed class ReadCommand : ModbusCommandBase
var snapshot = await driver.ReadAsync([tagName], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.Format(tagName, snapshot[0]));
}
catch (OperationCanceledException) when (ct.IsCancellationRequested)
{
// Driver.Modbus.Cli-005: Ctrl+C during driver connect/read — exit quietly so
// CliFx does not render a full stack trace for a user-initiated cancellation.
await console.Output.WriteLineAsync("Cancelled.");
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
@@ -53,6 +53,7 @@ public sealed class SubscribeCommand : ModbusCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
ValidateEndpoint();
var ct = console.RegisterCancellationHandler();
var tagName = ReadCommand.SynthesiseTagName(Region, Address, DataType);
@@ -69,6 +70,9 @@ public sealed class SubscribeCommand : ModbusCommandBase
var options = BuildOptions([tag]);
await using var driver = new ModbusDriver(options, DriverInstanceId);
// Driver.Modbus.Cli-004: serialize console writes from the PollGroupEngine background
// thread so overlapping poll ticks can't interleave partial lines.
var writeLock = new object();
ISubscriptionHandle? handle = null;
try
{
@@ -78,10 +82,26 @@ public sealed class SubscribeCommand : ModbusCommandBase
// analyzer flags it + IConsole is the testable abstraction).
driver.OnDataChange += (_, e) =>
{
var line = $"[{DateTime.UtcNow:HH:mm:ss.fff}] " +
$"{e.FullReference} = {SnapshotFormatter.FormatValue(e.Snapshot.Value)} " +
$"({SnapshotFormatter.FormatStatus(e.Snapshot.StatusCode)})";
console.Output.WriteLine(line);
// Driver.Modbus.Cli-004: swallow + log write failures so a transient stdout
// error (closed pipe, IO exception on a redirected stream) cannot tear down
// the poll-engine background loop. Without this guard the unhandled
// exception would fault the long-running subscribe.
try
{
var line = $"[{DateTime.UtcNow:HH:mm:ss.fff}] " +
$"{e.FullReference} = {SnapshotFormatter.FormatValue(e.Snapshot.Value)} " +
$"({SnapshotFormatter.FormatStatus(e.Snapshot.StatusCode)})";
lock (writeLock)
{
console.Output.WriteLine(line);
}
}
catch (Exception ex)
{
Serilog.Log.Logger.Warning(ex,
"SubscribeCommand: console write failed for {Tag}; continuing poll loop.",
e.FullReference);
}
};
handle = await driver.SubscribeAsync([tagName], TimeSpan.FromMilliseconds(IntervalMs), ct);
@@ -54,6 +54,7 @@ public sealed class WriteCommand : ModbusCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
ConfigureLogging();
ValidateEndpoint();
var ct = console.RegisterCancellationHandler();
if (Region is not (ModbusRegion.Coils or ModbusRegion.HoldingRegisters))
@@ -92,6 +93,12 @@ public sealed class WriteCommand : ModbusCommandBase
var results = await driver.WriteAsync([new WriteRequest(tagName, parsed)], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.FormatWrite(tagName, results[0]));
}
catch (OperationCanceledException) when (ct.IsCancellationRequested)
{
// Driver.Modbus.Cli-005: Ctrl+C during driver connect/write — exit quietly so
// CliFx does not render a full stack trace for a user-initiated cancellation.
await console.Output.WriteLineAsync("Cancelled.");
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
@@ -1,4 +1,5 @@
using CliFx.Attributes;
using CliFx.Exceptions;
using ZB.MOM.WW.OtOpcUa.Driver.Cli.Common;
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli;
@@ -57,4 +58,33 @@ public abstract class ModbusCommandBase : DriverCommandBase
/// multiple endpoints in parallel can distinguish the logs.
/// </summary>
protected string DriverInstanceId => $"modbus-cli-{Host}:{Port}";
/// <summary>
/// Driver.Modbus.Cli-003: validate the endpoint flags at parse time so the operator
/// gets a clear CliFx error instead of an opaque socket / argument exception thrown
/// deep inside the driver. Ranges:
/// <list type="bullet">
/// <item><c>--port</c>: 1..65535 (IANA TCP port space, excludes the
/// "any" sentinel 0 and rejects negative / overflowed values).</item>
/// <item><c>--timeout-ms</c>: strictly positive — a non-positive
/// <see cref="TimeSpan"/> would propagate as an immediate-timeout into the
/// driver and surface as a confusing TimeoutException.</item>
/// <item><c>--unit-id</c>: 1..247 — the Modbus spec unicast unit-id range.
/// 0 is the broadcast address and not valid for read/write requests; 248-255
/// are reserved. (Documented in <c>docs/Driver.Modbus.Cli.md</c>.)</item>
/// </list>
/// </summary>
protected void ValidateEndpoint()
{
if (Port < 1 || Port > 65535)
throw new CommandException(
$"--port must be in 1..65535 (got {Port}).");
if (TimeoutMs <= 0)
throw new CommandException(
$"--timeout-ms must be strictly positive (got {TimeoutMs}).");
if (UnitId < 1 || UnitId > 247)
throw new CommandException(
$"--unit-id must be in 1..247 per the Modbus spec (got {UnitId}); " +
$"0 is the broadcast address, 248-255 are reserved.");
}
}
@@ -33,6 +33,9 @@ public sealed class ProbeCommand : S7CommandBase
Writable: false);
var options = BuildOptions([probeTag]);
// Driver.S7.Cli-004: `await using` is the sole disposal mechanism — S7Driver.DisposeAsync
// already invokes ShutdownAsync, so the previous explicit ShutdownAsync(CancellationToken.None)
// call in a finally block ran shutdown twice. The await-using on the next line is enough.
await using var driver = new S7Driver(options, DriverInstanceId);
// Driver.S7.Cli-003: wrap the entire probe sequence so that a refused/unreachable TCP
// connect still prints the structured Host/CPU/Health lines instead of crashing with a
@@ -66,9 +69,5 @@ public sealed class ProbeCommand : S7CommandBase
if (health.LastError is { } err)
await console.Output.WriteLineAsync($"Last error: {err}");
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
}
}
}
@@ -45,17 +45,13 @@ public sealed class ReadCommand : S7CommandBase
StringLength: StringLength);
var options = BuildOptions([tag]);
// Driver.S7.Cli-004: `await using` is the sole disposal mechanism — S7Driver.DisposeAsync
// already invokes ShutdownAsync, so a redundant explicit ShutdownAsync(CancellationToken.None)
// in a finally block ran shutdown twice. The await-using on the next line is enough.
await using var driver = new S7Driver(options, DriverInstanceId);
try
{
await driver.InitializeAsync("{}", ct);
var snapshot = await driver.ReadAsync([tagName], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.Format(Address, snapshot[0]));
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
}
await driver.InitializeAsync("{}", ct);
var snapshot = await driver.ReadAsync([tagName], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.Format(Address, snapshot[0]));
}
/// <summary>Tag-name key used internally. Address + type is already unique.</summary>
@@ -37,12 +37,20 @@ public sealed class SubscribeCommand : S7CommandBase
Writable: false);
var options = BuildOptions([tag]);
// Driver.S7.Cli-004: `await using` is the sole driver-disposal mechanism — S7Driver.DisposeAsync
// already invokes ShutdownAsync, so a redundant ShutdownAsync(CancellationToken.None) in finally
// ran shutdown twice. Only UnsubscribeAsync stays in the finally block — that's a subscription
// lifecycle concern that is not part of driver disposal.
await using var driver = new S7Driver(options, DriverInstanceId);
ISubscriptionHandle? handle = null;
try
{
await driver.InitializeAsync("{}", ct);
// Driver.S7.Cli-007: route every data-change event to the CliFx console (not
// System.Console — the analyzer flags it + IConsole is the testable abstraction).
// The handler is synchronous because OnDataChange is raised from a driver
// background thread; the IConsole.Output writer is thread-safe for line writes.
driver.OnDataChange += (_, e) =>
{
var line = $"[{DateTime.UtcNow:HH:mm:ss.fff}] " +
@@ -71,7 +79,6 @@ public sealed class SubscribeCommand : S7CommandBase
try { await driver.UnsubscribeAsync(handle, CancellationToken.None); }
catch { /* teardown best-effort */ }
}
await driver.ShutdownAsync(CancellationToken.None);
}
}
}
@@ -52,17 +52,13 @@ public sealed class WriteCommand : S7CommandBase
var parsed = ParseValue(Value, DataType);
// Driver.S7.Cli-004: `await using` is the sole disposal mechanism — S7Driver.DisposeAsync
// already invokes ShutdownAsync, so a redundant explicit ShutdownAsync(CancellationToken.None)
// in a finally block ran shutdown twice. The await-using on the next line is enough.
await using var driver = new S7Driver(options, DriverInstanceId);
try
{
await driver.InitializeAsync("{}", ct);
var results = await driver.WriteAsync([new WriteRequest(tagName, parsed)], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.FormatWrite(Address, results[0]));
}
finally
{
await driver.ShutdownAsync(CancellationToken.None);
}
await driver.InitializeAsync("{}", ct);
var results = await driver.WriteAsync([new WriteRequest(tagName, parsed)], ct);
await console.Output.WriteLineAsync(SnapshotFormatter.FormatWrite(Address, results[0]));
}
/// <summary>Parse <c>--value</c> per <see cref="S7DataType"/>, invariant culture throughout.</summary>
@@ -10,6 +10,12 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Commands;
/// when <c>EnableControllerBrowse = true</c> — structured UDTs / function-block instances
/// won't appear because the driver filters to the supported primitive surface.
/// </summary>
/// <remarks>
/// Inherits from <see cref="TwinCATCommandBase"/> rather than
/// <see cref="TwinCATTagCommandBase"/> so the <c>--poll-only</c> flag does NOT surface in
/// <c>browse --help</c>: browse never subscribes, the flag would be a no-op, and the help
/// text would mislead users (Driver.TwinCAT.Cli-004).
/// </remarks>
[Command("browse", Description = "Enumerate controller symbols via the driver's DiscoverAsync walk.")]
public sealed class BrowseCommand : TwinCATCommandBase
{
@@ -25,18 +31,21 @@ public sealed class BrowseCommand : TwinCATCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
Validate();
ConfigureLogging();
var ct = console.RegisterCancellationHandler();
// Browse-only — no declared tags. EnableControllerBrowse=true flips DiscoverAsync's
// symbol-walk on so every recognized primitive surfaces through the builder.
// symbol-walk on so every recognized primitive surfaces through the builder. Native
// ADS notifications are irrelevant here (DiscoverAsync never subscribes); leave the
// default on so the options record matches the production wiring.
var options = new TwinCATDriverOptions
{
Devices = [new TwinCATDeviceOptions(Gateway, $"cli-{AmsNetId}:{AmsPort}")],
Tags = [],
Timeout = Timeout,
Probe = new TwinCATProbeOptions { Enabled = false },
UseNativeNotifications = !PollOnly,
UseNativeNotifications = true,
EnableControllerBrowse = true,
};
@@ -52,10 +61,8 @@ public sealed class BrowseCommand : TwinCATCommandBase
await driver.ShutdownAsync(CancellationToken.None);
}
var matched = builder.Variables
.Where(v => string.IsNullOrEmpty(Prefix) || v.BrowseName.StartsWith(Prefix, StringComparison.Ordinal))
.ToList();
var printLimit = Max <= 0 ? matched.Count : Math.Min(Max, matched.Count);
var matched = FilterByPrefix(builder.Variables, Prefix);
var printLimit = PrintLimit(matched.Count, Max);
await console.Output.WriteLineAsync($"AMS: {AmsNetId}:{AmsPort}");
await console.Output.WriteLineAsync(
@@ -64,8 +71,7 @@ public sealed class BrowseCommand : TwinCATCommandBase
foreach (var v in matched.Take(printLimit))
{
var access = v.Info.SecurityClass == SecurityClassification.ViewOnly ? "RO" : "RW";
await console.Output.WriteLineAsync($" [{access}] {v.Info.DriverDataType,-8} {v.BrowseName}");
await console.Output.WriteLineAsync($" [{AccessTag(v.Info)}] {v.Info.DriverDataType,-8} {v.BrowseName}");
}
if (matched.Count > printLimit)
@@ -73,7 +79,35 @@ public sealed class BrowseCommand : TwinCATCommandBase
$" … {matched.Count - printLimit} more — raise --max or tighten --prefix");
}
private sealed class CollectingAddressSpaceBuilder : IAddressSpaceBuilder
/// <summary>
/// Case-sensitive prefix filter. A null/empty prefix keeps everything; otherwise we
/// keep symbols whose browse name starts with <paramref name="prefix"/> under
/// <see cref="StringComparison.Ordinal"/> — TwinCAT identifiers are case-sensitive on
/// the wire, so a relaxed match would be misleading.
/// </summary>
internal static List<(string BrowseName, DriverAttributeInfo Info)> FilterByPrefix(
IReadOnlyList<(string BrowseName, DriverAttributeInfo Info)> source, string? prefix)
=> source
.Where(v => string.IsNullOrEmpty(prefix) || v.BrowseName.StartsWith(prefix, StringComparison.Ordinal))
.ToList();
/// <summary>
/// Cap-to-max projection. <paramref name="max"/> &lt;= 0 means unbounded, otherwise the
/// min of <paramref name="matchedCount"/> and <paramref name="max"/>.
/// </summary>
internal static int PrintLimit(int matchedCount, int max)
=> max <= 0 ? matchedCount : Math.Min(max, matchedCount);
/// <summary>
/// Coarse RO/RW label used in the browse output. <see cref="SecurityClassification.ViewOnly"/>
/// is the only classification that is unconditionally read-only; everything else can be
/// written from at least one ACL tier, so the CLI labels it RW. The real per-tier
/// authorization is enforced server-side.
/// </summary>
internal static string AccessTag(DriverAttributeInfo info)
=> info.SecurityClass == SecurityClassification.ViewOnly ? "RO" : "RW";
internal sealed class CollectingAddressSpaceBuilder : IAddressSpaceBuilder
{
public List<(string BrowseName, DriverAttributeInfo Info)> Variables { get; } = [];
@@ -11,7 +11,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Commands;
/// server near the endpoint.
/// </summary>
[Command("probe", Description = "Verify the TwinCAT runtime is reachable and a sample symbol reads.")]
public sealed class ProbeCommand : TwinCATCommandBase
public sealed class ProbeCommand : TwinCATTagCommandBase
{
[CommandOption("symbol", 's', Description =
"Symbol path to probe. System-global examples: " +
@@ -20,11 +20,14 @@ public sealed class ProbeCommand : TwinCATCommandBase
IsRequired = true)]
public string SymbolPath { get; init; } = default!;
[CommandOption("type", Description = "Data type (default DInt — TwinCAT DINT maps to int32).")]
[CommandOption("type", 't', Description =
"Bool / SInt / USInt / Int / UInt / DInt / UDInt / LInt / ULInt / Real / LReal / " +
"String / WString / Time / Date / DateTime / TimeOfDay (default DInt).")]
public TwinCATDataType DataType { get; init; } = TwinCATDataType.DInt;
public override async ValueTask ExecuteAsync(IConsole console)
{
Validate();
ConfigureLogging();
var ct = console.RegisterCancellationHandler();
@@ -9,7 +9,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Commands;
/// member list into individual reads if you need them.
/// </summary>
[Command("read", Description = "Read a single TwinCAT symbol.")]
public sealed class ReadCommand : TwinCATCommandBase
public sealed class ReadCommand : TwinCATTagCommandBase
{
[CommandOption("symbol", 's', Description =
"Symbol path. Program scope: 'MAIN.bStart'. Global: 'GVL.Counter'. " +
@@ -24,6 +24,7 @@ public sealed class ReadCommand : TwinCATCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
Validate();
ConfigureLogging();
var ct = console.RegisterCancellationHandler();
@@ -1,4 +1,5 @@
using CliFx.Attributes;
using CliFx.Exceptions;
using CliFx.Infrastructure;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Driver.Cli.Common;
@@ -10,7 +11,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Commands;
/// pushes on its own cycle); pass <c>--poll-only</c> to fall through to PollGroupEngine.
/// </summary>
[Command("subscribe", Description = "Watch a TwinCAT symbol via ADS notification or poll, until Ctrl+C.")]
public sealed class SubscribeCommand : TwinCATCommandBase
public sealed class SubscribeCommand : TwinCATTagCommandBase
{
[CommandOption("symbol", 's', Description = "Symbol path — same format as `read`.", IsRequired = true)]
public string SymbolPath { get; init; } = default!;
@@ -23,8 +24,17 @@ public sealed class SubscribeCommand : TwinCATCommandBase
[CommandOption("interval-ms", 'i', Description = "Publishing interval ms (default 1000).")]
public int IntervalMs { get; init; } = 1000;
protected override void Validate()
{
base.Validate();
if (IntervalMs <= 0)
throw new CommandException(
$"--interval-ms must be greater than 0 (got {IntervalMs}).");
}
public override async ValueTask ExecuteAsync(IConsole console)
{
Validate();
ConfigureLogging();
var ct = console.RegisterCancellationHandler();
@@ -43,19 +53,39 @@ public sealed class SubscribeCommand : TwinCATCommandBase
{
await driver.InitializeAsync("{}", ct);
// Native ADS notifications fire OnDataChange from the Beckhoff.TwinCAT.Ads
// notification callback thread — unlike the poll-mode path (which serialises on a
// single PollGroupEngine loop), the native callback can interleave with the banner
// write below and with subsequent change events if the PLC pushes faster than a
// single console write completes. A TextWriter is not guaranteed thread-safe, so
// we serialise every write through a lock to keep output clean for
// screen-recorded bug-report timelines (Driver.TwinCAT.Cli-002).
var writeLock = new object();
driver.OnDataChange += (_, e) =>
{
var line = $"[{DateTime.UtcNow:HH:mm:ss.fff}] " +
$"{e.FullReference} = {SnapshotFormatter.FormatValue(e.Snapshot.Value)} " +
$"({SnapshotFormatter.FormatStatus(e.Snapshot.StatusCode)})";
console.Output.WriteLine(line);
lock (writeLock)
{
console.Output.WriteLine(line);
}
};
handle = await driver.SubscribeAsync([tagName], TimeSpan.FromMilliseconds(IntervalMs), ct);
var mode = PollOnly ? "polling" : "ADS notification";
await console.Output.WriteLineAsync(
$"Subscribed to {SymbolPath} @ {IntervalMs}ms ({mode}). Ctrl+C to stop.");
// Driver.TwinCAT.Cli-003: derive the banner mechanism from the actual subscription
// handle the driver returned, not from --poll-only. The native ADS path tags its
// handle with a "twincat-native-sub-*" DiagnosticId; anything else means we landed
// on the shared PollGroupEngine. That way the line cannot disagree with what the
// driver actually did (e.g. a future fallback inside SubscribeAsync).
var mode = DescribeMechanism(handle);
lock (writeLock)
{
console.Output.WriteLine(
$"Subscribed to {SymbolPath} @ {IntervalMs}ms ({mode}). Ctrl+C to stop.");
}
try
{
await Task.Delay(System.Threading.Timeout.InfiniteTimeSpan, ct);
@@ -75,4 +105,16 @@ public sealed class SubscribeCommand : TwinCATCommandBase
await driver.ShutdownAsync(CancellationToken.None);
}
}
/// <summary>
/// Maps the returned subscription handle's <see cref="ISubscriptionHandle.DiagnosticId"/>
/// to the banner label. The TwinCAT driver tags native ADS subscriptions with
/// <c>twincat-native-sub-*</c> and the shared <c>PollGroupEngine</c> handle uses a
/// different format — anything else means we landed on the poll loop. Internal so the
/// test assembly can cover the mapping without spinning a real driver.
/// </summary>
internal static string DescribeMechanism(ISubscriptionHandle handle) =>
handle.DiagnosticId.StartsWith("twincat-native-sub-", StringComparison.Ordinal)
? "ADS notification"
: "polling";
}
@@ -11,7 +11,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Commands;
/// JSON for those.
/// </summary>
[Command("write", Description = "Write a single TwinCAT symbol.")]
public sealed class WriteCommand : TwinCATCommandBase
public sealed class WriteCommand : TwinCATTagCommandBase
{
[CommandOption("symbol", 's', Description =
"Symbol path — same format as `read`.", IsRequired = true)]
@@ -29,6 +29,7 @@ public sealed class WriteCommand : TwinCATCommandBase
public override async ValueTask ExecuteAsync(IConsole console)
{
Validate();
ConfigureLogging();
var ct = console.RegisterCancellationHandler();
@@ -1,13 +1,15 @@
using CliFx.Attributes;
using CliFx.Exceptions;
using ZB.MOM.WW.OtOpcUa.Driver.Cli.Common;
namespace ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli;
/// <summary>
/// Base for every TwinCAT CLI command. Carries the AMS target options
/// (<c>--ams-net-id</c> + <c>--ams-port</c>) + the notification-mode toggle that the
/// driver itself takes. Exposes <see cref="BuildOptions"/> so each command can build a
/// single-device / single-tag <see cref="TwinCATDriverOptions"/> from flag input.
/// (<c>--ams-net-id</c> + <c>--ams-port</c>) + the per-call timeout. Commands that build
/// a single-device / single-tag <see cref="TwinCATDriverOptions"/> from flag input inherit
/// from <see cref="TwinCATTagCommandBase"/> instead — that intermediate adds the
/// <c>--poll-only</c> flag and the <c>BuildOptions</c> helper.
/// </summary>
public abstract class TwinCATCommandBase : DriverCommandBase
{
@@ -23,16 +25,19 @@ public abstract class TwinCATCommandBase : DriverCommandBase
[CommandOption("timeout-ms", Description = "Per-operation timeout in ms (default 5000).")]
public int TimeoutMs { get; init; } = 5000;
[CommandOption("poll-only", Description =
"Disable native ADS notifications and fall through to the shared PollGroupEngine " +
"(same as setting UseNativeNotifications=false in a real driver config).")]
public bool PollOnly { get; init; }
/// <inheritdoc />
/// <summary>
/// The per-operation timeout, projected from <see cref="TimeoutMs"/>. The CliFx
/// <c>init</c> accessor required by the abstract base property is intentionally a
/// no-op: <see cref="TimeoutMs"/> is the only source of truth, so any value an
/// `init` initialiser supplies to <see cref="Timeout"/> directly is silently
/// dropped. Do NOT add a backing field "fixing" the empty body — it would diverge
/// from <see cref="TimeoutMs"/> and the two would drift on every refactor
/// (Driver.TwinCAT.Cli-007).
/// </summary>
public override TimeSpan Timeout
{
get => TimeSpan.FromMilliseconds(TimeoutMs);
init { /* driven by TimeoutMs */ }
init { /* see XML summary — driven by TimeoutMs */ }
}
/// <summary>
@@ -41,22 +46,29 @@ public abstract class TwinCATCommandBase : DriverCommandBase
/// </summary>
protected string Gateway => $"ads://{AmsNetId}:{AmsPort}";
/// <summary>
/// Build a <see cref="TwinCATDriverOptions"/> with the AMS target this base collected +
/// the tag list a subclass supplies. Probe disabled, controller-browse disabled,
/// native notifications toggled by <see cref="PollOnly"/>.
/// </summary>
protected TwinCATDriverOptions BuildOptions(IReadOnlyList<TwinCATTagDefinition> tags) => new()
{
Devices = [new TwinCATDeviceOptions(
HostAddress: Gateway,
DeviceName: $"cli-{AmsNetId}:{AmsPort}")],
Tags = tags,
Timeout = Timeout,
Probe = new TwinCATProbeOptions { Enabled = false },
UseNativeNotifications = !PollOnly,
EnableControllerBrowse = false,
};
protected string DriverInstanceId => $"twincat-cli-{AmsNetId}:{AmsPort}";
/// <summary>
/// Validates the numeric options every TwinCAT CLI command shares (timeout + AMS port).
/// Subclasses override and call <c>base.Validate()</c> first to add their own range
/// checks. Throwing here surfaces a clean CliFx one-line error before the driver gets
/// a chance to fail with an opaque transport error (Driver.TwinCAT.Cli-001).
/// </summary>
protected virtual void Validate()
{
if (TimeoutMs <= 0)
throw new CommandException(
$"--timeout-ms must be greater than 0 (got {TimeoutMs}).");
if (AmsPort is <= 0 or > 65535)
throw new CommandException(
$"--ams-port must be in the range 1..65535 (got {AmsPort}).");
}
// ---- Test hooks ----
// Protected members are exposed to the test assembly through these internal accessors so the
// test project can cover Gateway / DriverInstanceId composition + range validation without
// needing reflection on every assertion (Driver.TwinCAT.Cli-006).
internal string GatewayForTest => Gateway;
internal string DriverInstanceIdForTest => DriverInstanceId;
internal void ValidateForTest() => Validate();
}
@@ -0,0 +1,40 @@
using CliFx.Attributes;
namespace ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli;
/// <summary>
/// Intermediate base for the four TwinCAT CLI commands that build a single-device /
/// single-tag <see cref="TwinCATDriverOptions"/> — <c>probe</c>, <c>read</c>, <c>write</c>,
/// <c>subscribe</c>. Adds the <c>--poll-only</c> flag (relevant only when the driver is
/// about to register native ADS notifications, which is why it does NOT live on the
/// <c>browse</c> command — Driver.TwinCAT.Cli-004) and the <c>BuildOptions</c> helper that
/// assembles the driver-side options record.
/// </summary>
public abstract class TwinCATTagCommandBase : TwinCATCommandBase
{
[CommandOption("poll-only", Description =
"Disable native ADS notifications and fall through to the shared PollGroupEngine " +
"(same as setting UseNativeNotifications=false in a real driver config).")]
public bool PollOnly { get; init; }
/// <summary>
/// Build a <see cref="TwinCATDriverOptions"/> with the AMS target this base collected +
/// the tag list a subclass supplies. Probe disabled, controller-browse disabled,
/// native notifications toggled by <see cref="PollOnly"/>.
/// </summary>
protected TwinCATDriverOptions BuildOptions(IReadOnlyList<TwinCATTagDefinition> tags) => new()
{
Devices = [new TwinCATDeviceOptions(
HostAddress: Gateway,
DeviceName: $"cli-{AmsNetId}:{AmsPort}")],
Tags = tags,
Timeout = Timeout,
Probe = new TwinCATProbeOptions { Enabled = false },
UseNativeNotifications = !PollOnly,
EnableControllerBrowse = false,
};
// ---- Test hook ----
internal TwinCATDriverOptions BuildOptionsForTest(IReadOnlyList<TwinCATTagDefinition> tags)
=> BuildOptions(tags);
}
@@ -1,3 +1,5 @@
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.AbCip;
@@ -32,14 +34,16 @@ internal sealed class AbCipAlarmProjection : IAsyncDisposable
{
private readonly AbCipDriver _driver;
private readonly TimeSpan _pollInterval;
private readonly ILogger _logger;
private readonly Dictionary<long, Subscription> _subs = new();
private readonly Lock _subsLock = new();
private long _nextId;
public AbCipAlarmProjection(AbCipDriver driver, TimeSpan pollInterval)
public AbCipAlarmProjection(AbCipDriver driver, TimeSpan pollInterval, ILogger? logger = null)
{
_driver = driver;
_pollInterval = pollInterval;
_logger = logger ?? NullLogger.Instance;
}
public async Task<IAlarmSubscriptionHandle> SubscribeAsync(
@@ -158,7 +162,14 @@ internal sealed class AbCipAlarmProjection : IAsyncDisposable
Tick(sub, results);
}
catch (OperationCanceledException) when (ct.IsCancellationRequested) { break; }
catch { /* per-tick failures are non-fatal; next tick retries */ }
catch (Exception ex)
{
// Per-tick failures are non-fatal; next tick retries. Log at debug because a
// wedged controller produces one exception per tick and the operator already
// sees the failed-read warning from ReadAsync below this layer; this log just
// confirms the alarm projection loop is still running.
_logger.LogDebug(ex, "AbCip alarm-projection poll tick failed (will retry)");
}
try { await Task.Delay(_pollInterval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { break; }
@@ -5,7 +5,8 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.AbCip;
/// <summary>
/// Logix atomic + string data types, plus a <see cref="Structure"/> marker used when a tag
/// references a UDT / predefined structure (Timer, Counter, Control). The concrete UDT
/// shape is resolved via the CIP Template Object at discovery time (PR 5 / PR 6).
/// shape is resolved via the CIP Template Object at discovery time (see
/// <see cref="CipTemplateObjectDecoder"/> + <see cref="AbCipTemplateCache"/>).
/// </summary>
/// <remarks>
/// Mirrors the shape of <c>ModbusDataType</c>. Atomic Logix names (BOOL / SINT / INT / DINT /

Some files were not shown because too many files have changed in this diff Show More