Two real bugs uncovered by re-running with the new fixture defaults
pointing at the shared docker host. Both are test-side, not driver-side.
AbCip — Driver_reads_seeded_DInt_from_ab_server (4 parametrized rows):
Hardcoded 'ab://127.0.0.1:{port}/1,0' in the deviceUri instead of
the resolved fixture.Host. The new 10.100.0.35 default (and any
AB_SERVER_ENDPOINT override) silently couldn't reach this test —
the driver tried to connect to a non-existent localhost:44818 and
returned BadCommunicationError on all 4 profile rows. The sibling
Emulate tests already use the fixture's resolved endpoint; this
smoke test was missed in the original migration.
Fix: deviceUri = $"ab://{fixture.Host}:{fixture.Port}/1,0".
Modbus — Float32_With_CDAB_Roundtrips_Through_Wire:
Test wrote a Float32 to HR 100 (2 consecutive registers: 100+101).
standard.json's writable HR range declares [100,100] only — a
single-cell auto-incrementing register, not a 2-register pair. The
write to register 101 was rejected with Illegal Data Address
(BadOutOfRange).
Fix: moved the tag from HR 100 to HR 200 (in standard.json's
'[200, 209]' scratch range — 10 consecutive writable HRs). The
Float32+CDAB semantic the test exercises is unchanged.
Modbus — Block_Read_Coalescing_Reduces_PDU_Count_End_To_End:
Test read HR 300, 302, 304 — outside both the writable ranges and
the uint16 seed list. pymodbus rejects reads to unseeded HRs even
though 'hr size' is 2048. BadOutOfRange on every read.
Fix: moved the tags from 300/302/304 to 200/202/204 (within the
scratch range). The non-contiguous coalescing semantic (3 tags
inside a 5-register window with MaxReadGap=5) is preserved.
After this commit:
- Modbus.IntegrationTests: 6/38 pass / 32 skip / 0 fail
(was 4 pass / 32 skip / 2 fail; 32 skips are profile-gated
ExceptionInjectionTests — they need MODBUS_SIM_PROFILE=
exception_injection and a different container, intentional gating)
- AbCip.IntegrationTests: 10/12 pass / 2 skip / 0 fail
(was 6 pass / 2 skip / 4 fail; 2 skips are Emulate tests that
need the fixture for separate scenarios)
No driver code changed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLAUDE.md "Docker Workflow" claims (per the 2026-04-28 migration note)
that all fixture-class default endpoints were rewritten to target the
shared Docker host at 10.100.0.35. Audit during today's e2e run showed
the claim was incomplete — five fixture classes still defaulted to
localhost / 127.0.0.1, causing every fixture-touching integration test
to skip with "endpoint unreachable" on a fresh box that hadn't set
the override env vars.
Files corrected:
- tests/.../Modbus.IntegrationTests/ModbusSimulatorFixture.cs
DefaultEndpoint: localhost:5020 → 10.100.0.35:5020
- tests/.../S7.IntegrationTests/Snap7ServerFixture.cs
DefaultEndpoint: localhost:1102 → 10.100.0.35:1102
- tests/.../OpcUaClient.IntegrationTests/OpcPlcFixture.cs
DefaultEndpoint: opc.tcp://localhost:50000 → opc.tcp://10.100.0.35:50000
- tests/.../AbCip.IntegrationTests/AbServerFixture.cs
Host default + ResolveHost fallback: 127.0.0.1 → 10.100.0.35
- tests/.../AbLegacy.IntegrationTests/AbLegacyServerFixture.cs
Host default + ResolveEndpoint fallback: 127.0.0.1 → 10.100.0.35
XML doc comments referencing the old localhost defaults were updated in
the same pass so the class-summary documentation matches the actual
default. The override-via-env-var mechanism (MODBUS_SIM_ENDPOINT,
AB_SERVER_ENDPOINT, AB_LEGACY_ENDPOINT, S7_SIM_ENDPOINT,
OPCUA_SIM_ENDPOINT) is unchanged — pointing at a real PLC or a
locally-running container still works exactly as before.
Verification:
- Solution-wide dotnet build: 0 errors.
- S7.IntegrationTests: 3/3 pass without env-var override.
- OpcUaClient.IntegrationTests: 3/3 pass without env-var override.
- Modbus.IntegrationTests: 4/38 (same as the env-var-override run —
the 2 failures + 32 skips are pre-existing fixture-profile
mismatches unrelated to this fix).
- AbCip.IntegrationTests / AbLegacy.IntegrationTests: same results
as the env-var-override run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five doc-content updates after this session's code-review resolution
sweep. No code touched; pure documentation drift correction.
1. docs/reqs/HighLevelReqs.md (HLR-007 — Service Hosting):
Refreshed the deployment description from "three cooperating
processes (Server, Admin, Galaxy.Host)" to "two cooperating
Windows services (Server, Admin)". The legacy x86 TopShelf
Galaxy.Host process was retired in PR 7.2 (2026-04-30); Galaxy
access now flows through the in-process Tier-A GalaxyDriver
talking gRPC to the sibling mxaccessgw gateway. Also called out
decision #30 (AddWindowsService replacing TopShelf) inline.
2. docs/VirtualTags.md:
- Line 9: "compiled via Microsoft.CodeAnalysis.CSharp.Scripting"
replaced with the current pipeline (Microsoft.CodeAnalysis.CSharp
regular compiler — Core.Scripting-008 / -016 retired the
CSharpScript/ScriptRunner path).
- Line 39: orphan-thread leak description rewritten. The
CSharp.Scripting-era "underlying ScriptRunner keeps running on
its thread-pool thread until the Roslyn runtime returns" is no
longer accurate — the new pipeline binds the script as a
regular C# Func<> delegate, so the leak is now "synchronous
CPU-bound work on a pool thread" (same operator-visible
effect, different mechanism).
3. docs/v2/plan.md decision #29 ("Galaxy Host is a separate Windows
service"):
Annotated both the decision body and the decision-log table row
with "Reversed PR 7.2, 2026-04-30" + a one-line summary of the
replacement architecture. The original reasoning is preserved as
audit trail per the decision-log convention.
4. docs/v2/implementation/phase-7-scripting-and-alarming.md A.1:
Added an Implementation note describing the
Core.Scripting-008 / -016 supersession of the original
CSharpScript pipeline. The historical record stays; the note
points future readers at docs/VirtualTags.md "Compile cache"
for the current contract.
5. docs/plans/alarms-over-gateway.md "Files" section under client
regeneration:
Updated the .NET regeneration instructions to point at the new
ZB.MOM.WW.MxGateway.Contracts.csproj path. The old
clients/dotnet/MxGateway.Client.csproj no longer exists in the
sibling repo (restructure after this plan was written) and the
vendored-binaries situation in
src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/libs/ is called out
so a reader following the plan won't chase a deleted path.
Verification: grep against docs/ for the pre-fix wordings ("three
cooperating processes", "Galaxy.Host (TopShelf)", "ScriptRunner",
the wrong BadDeviceFailure hex code 0x80550000) returns no hits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single commit covering the four small/medium fixes from the updated
code review.
Core.Scripting-014 (Medium, Concurrency):
CompiledScriptCache.Clear() used the key-only TryRemove(key, out var
lazy) overload — same race shape Core.Scripting-006 closed in
GetOrCompile's catch block. A concurrent re-add between snapshot and
TryRemove was evicted + disposed while the new caller still held it.
Replaced with the value-scoped TryRemove(KeyValuePair<,>) overload.
Regression test
Clear_uses_value_scoped_TryRemove_so_a_race_inserted_entry_survives
added.
Core.Scripting-013 (Medium, Security):
Hand-rolled BuildWrapperSource pastes user source between literal
braces; brace-balanced source could inject sibling methods/classes
alongside CompiledScript.Run. Analyzer still walked the injected
members so it wasn't a direct escape, but it relaxed the documented
'method body' authoring contract. Added EnforceSingleRunMember:
after ParseText, the compilation unit must hold exactly one type
(CompiledScript) and that type must hold exactly one member (the Run
method). Any deviation throws CompilationErrorException with LMX001/
LMX002 diagnostic IDs and a Core.Scripting-013 reference in the
message. Two regression tests added covering the sibling-method and
sibling-class injection vectors.
Core.Scripting-015 (Low, Correctness, latent):
ToCSharpTypeName's generic branch truncated at the first backtick via
IndexOf, silently dropping closed args of nested-generic shapes
(Outer<T>.Inner<U>). No production caller exercises this shape today
(all TContext/TResult are top-level non-nested), so the bug was
latent. Rewrote the generic branch to walk the FullName segment-by-
segment, consuming generic args per segment so nested shapes emit
valid C# (global::Ns.Outer<T>.Inner<U> rather than the broken
Outer<T,U>).
Core.ScriptedAlarms-013 (Low, Documentation):
The internal test accessors TryGetScratchReadCacheForTest /
TryGetScratchContextForTest return live mutable scratch refilled in
place under _evalGate. XML docs didn't warn future test authors about
the synchronization contract. Added a <remarks> block to each
documenting the only-safe-on-quiesced-engine + identity-or-single-key
contract.
Verification (suites green):
Core.Scripting.Tests: 110/110 (was 107 — +3 new rejection/race tests)
Core.ScriptedAlarms.Tests: 67/67 (unchanged — doc-only fix)
Core.VirtualTags.Tests: 57/57 (unchanged)
After this commit, all 12 findings from the updated re-review are
closed (10 Resolved, 1 Won't Fix none, 1 Deferred — Driver.Galaxy-017).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.Galaxy-015, -016, -017, -018 resolution (one logical change set).
Driver.Galaxy-016 (Medium, Perf/Resource):
Reconciled the csproj PackageReferences with what the vendored
MxGateway.Client.dll was actually built against, verified by
reflecting Assembly.GetReferencedAssemblies() on the DLL:
- Polly 8.5.2 → Polly.Core 8.6.6
(most consequential — Polly v7 fluent API vs Polly.Core v8
resilience-pipeline API are DIFFERENT packages; the DLL was
built against Polly.Core so the prior Polly reference would
have failed at runtime with MissingMethodException the first
time the gateway client's retry pipeline ran)
- Grpc.Net.Client 2.71.0 → 2.76.0 (matches sibling Server/Worker)
- Microsoft.Extensions.Logging.Abstractions 10.0.0 → 10.0.7
Google.Protobuf 3.34.1 and Grpc.Core.Api 2.76.0 already matched —
left unchanged.
Driver.Galaxy-015 (re-triaged from Medium-Security → Low-Documentation):
Original framing was a security concern about unknown-provenance
binaries. User clarified the DLLs are their own code, built from
their own mxaccessgw project, not third-party. Re-triaged to a
documentation / audit-trail concern. Fix:
- Added a Provenance section to libs/README.md recording the
source-commit SHA (dd7ca1634e2d2b8a866c81f0009bf87ee9427750,
extracted from the AssemblyInformationalVersion baked into
both DLLs by the original build) and SHA-256 checksums.
- Documented the re-verification recipe (sha256sum + ilspycmd
| grep AssemblyInformationalVersion).
Recommendations about .gitattributes and CI hash-check deferred —
the DLLs are frozen until an unwinding path is taken, so adding
LFS or CI infrastructure now would need removal at unwinding.
Driver.Galaxy-018 (Low, Documentation):
Most of the recommendation folded into the libs/README.md rewrite
(pointed at sibling Server/Worker csproj as the live version source
rather than the deleted MxGateway.Client.csproj; recorded source
commit + SHA-256). <SpecificVersion>false</SpecificVersion> on the
<Reference> items intentionally not added — MSBuild's default for
HintPath references with bare-name Include attributes is already
SpecificVersion=false, so explicitly setting it would be cosmetic
without changing behaviour.
Driver.Galaxy-017 (Low, Design) — Deferred:
Recommendation part (b) (record mxaccessgw source-commit SHA in
libs/README.md) is satisfied by Driver.Galaxy-015's resolution.
Parts (a) and (c) — a GetVersion RPC at session-open and a parity
test against the live gateway's proto descriptor — are substantial
new RPC + plumbing work not in scope for this code-review sweep.
The risk surface is bounded because either of the libs/README.md
unwinding paths closes the vendoring + this concern naturally.
Re-open if neither path is taken within the next quarter and the
live gateway evolves its proto under the driver.
Verification:
- Build clean (Driver.Galaxy.csproj 0 errors, 0 warnings).
- Driver.Galaxy.Tests: 245/245 pass against the corrected
package set.
- Solution-wide build remains clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.Scripting-012 (High, Security) resolution.
The Core.Scripting-008 rewrite broadened the BCL references list from a
narrow allow-list to the full System.* + netstandard +
Microsoft.Win32.Registry set, delegating the security gate entirely to
ForbiddenTypeAnalyzer. Three categories of dangerous BCL types were
reachable from script source without a deny-list entry:
- System.Threading.ThreadPool — QueueUserWorkItem re-introduces the
background-fanout threat Core.Scripting-003 closed against
System.Threading.Tasks.
- System.Threading.Timer — schedules unbounded callback work that
outlives the per-evaluation timeout.
- System.Runtime.Loader.AssemblyLoadContext — loads arbitrary DLLs.
Defense-in-depth gap; invocation needs reflection (already denied)
but the load itself was reachable.
Fix:
- Added 'System.Runtime.Loader' to ForbiddenNamespacePrefixes
(preferred over type-granular per the recommendation so future BCL
additions to that namespace are denied by default).
- Added 'System.Threading.ThreadPool' and 'System.Threading.Timer'
to ForbiddenFullTypeNames — both live in System.Threading shared
with allowed primitives so they must be type-granular.
Regression tests added to ScriptSandboxTests:
Rejects_ThreadPool_QueueUserWorkItem_at_compile
Rejects_Timer_new_at_compile
Rejects_AssemblyLoadContext_at_compile
Docs:
docs/v2/implementation/phase-7-scripting-and-alarming.md decision #6
and the Sandbox-escape compliance-check row both updated to enumerate
the new entries per the Core.Scripting-009 doc-sync convention.
Two lower-impact suggestions from the finding's recommendation
(System.Console, CultureInfo.DefaultThreadCurrentCulture) were
intentionally not addressed and are recorded as accepted minor risks
in the resolution.
Verification: Core.Scripting.Tests 107/107 (was 104 + 3 new rejection
tests).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both VirtualTagEngine.Load and ScriptedAlarmEngine.LoadAsync were calling
ScriptEvaluator.Compile directly, bypassing CompiledScriptCache. The
Core.Scripting-008 collectible-ALC fix wired Dispose only through the cache's
Clear()/Dispose(), so the per-publish accretion the -008 fix was meant to
eliminate was still in effect on the actual production path — the headline
'no more restarts needed' guarantee wasn't delivered.
Resolution:
- VirtualTagEngine + ScriptedAlarmEngine each gained a private
CompiledScriptCache<TContext, TResult> instance.
- Both Load methods now call _compileCache.GetOrCompile(source).
- Publish-replace path: _compileCache.Clear() runs alongside the existing
_tags / _alarms clears so the prior generation's ALCs are disposed
before recompile.
- Engine Dispose now calls _compileCache.Dispose() so shutdown actually
releases the emitted assemblies.
Side-fix in CompiledScriptCache: Dispose() set _disposed=true then called
Clear(), but Clear() had a pre-existing 'if (_disposed) return' guard that
aborted the drain unconditionally — making the Dispose-triggered cleanup a
silent no-op. Removed the disposed-guard on Clear() (clearing an empty/
cleared cache is idempotent).
Side-fix in ScriptedAlarmEngine.Dispose: cleared _alarms AFTER the
Task.WhenAll drain. The drain guarantees no background callback is mid-
flight, so clearing is safe. Previously _alarms was deliberately NOT
cleared on Dispose (per Core.ScriptedAlarms-005), but that left the
AlarmState records holding TimedScriptEvaluator → ScriptEvaluator → delegate
references that rooted the emitted assemblies, defeating the cache's
Dispose work on the engine side.
Regression tests:
- VirtualTagEngineTests.Dispose_unloads_compiled_script_assembly
- ScriptedAlarmEngineTests.Dispose_unloads_compiled_predicate_assembly
Both use WeakReference + bounded GC.Collect() to prove the emitted
assembly is reclaimable after engine.Dispose(). The alarms test had to
be synchronous (not 'async Task<WeakReference>') because async state
machines capture locals as state-struct fields, keeping them alive past
the method's apparent end and defeating GC.
Verification:
- Core.Scripting.Tests: 104/104 (unchanged).
- VirtualTags.Tests: 57/57 (was 56 — +1 unload test).
- ScriptedAlarms.Tests: 67/67 (was 66 — +1 unload test).
- All other consumer suites still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.Cli.Common-007 + Driver.Cli.Common-008 resolution.
Driver.Cli.Common-007 (High, Correctness):
0x80550000 is the canonical OPC UA spec value for BadSecurityPolicyRejected,
not BadDeviceFailure. The correct spec value for BadDeviceFailure is
0x808B0000 (verified against OPC Foundation Opc.Ua.StatusCodes;
corroborated locally by Driver.Galaxy.Runtime.StatusCodeMap and both
Wonderware historian quality mappers which all hand-pin the correct
value).
The bug was duplicated across six driver modules:
- FocasStatusMapper.BadDeviceFailure
- AbCipStatusMapper.BadDeviceFailure
- AbLegacyStatusMapper.BadDeviceFailure
- TwinCATStatusMapper.BadDeviceFailure
- ModbusDriver.StatusBadDeviceFailure
- S7Driver.StatusBadDeviceFailure
Plus the SnapshotFormatter shortlist that named 0x80550000 as
BadDeviceFailure, and three downstream Modbus tests that asserted
against the wrong value (so CI was blind).
This commit fixes all six native-mapper constants, the formatter
shortlist, and the three Modbus tests in one pass. Added a regression
guard to FormatStatus_does_not_apply_pre_fix_wrong_names that pins
0x80550000 never renders as BadDeviceFailure (mirroring the existing
-001 wrong-name guards).
Behavior change: OPC UA clients consuming the native drivers now see
the canonical BadDeviceFailure (0x808B0000) on device-fault paths
instead of the misnamed BadSecurityPolicyRejected (0x80550000). Wire-
level status semantics now match operator-facing CLI labels.
Driver.Cli.Common-008 (Low, Testing):
Deleted the redundant FormatStatus_names_native_driver_emitted_codes
Theory — its five InlineData rows were already covered by the
well-known Theory in the same commit (5a9c459), and used a weaker
ShouldContain vs the well-known Theory's ShouldBe (exact match).
Verification:
- Driver.Cli.Common.Tests: 43/43 pass (was 48 after the -008 deletion).
- Driver.Modbus.Tests: 263/263 pass.
- Driver.AbCip.Tests: 262/262.
- Driver.AbLegacy.Tests: 157/157.
- Driver.FOCAS.Tests: 178/178.
- Driver.S7.Tests: 112/112.
- Driver.TwinCAT.Tests: 131/131.
Total: 1146 tests across the affected modules, all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-reviewed the four modules with source changes since the previous review
commit 76d35d1, per REVIEW-PROCESS.md section 6. Updated each findings.md
header (date 2026-05-23, commit a9be809) and appended new findings under
continued numbering. Regenerated README.md.
## New findings — 12 total across 4 modules
### Core.Scripting (5 new, IDs -012 to -016)
- **-012 High Security** — broadened BCL references (System.* + netstandard)
re-expose System.Threading.ThreadPool / Timer / AssemblyLoadContext, which
the analyzer's deny-list doesn't cover. Re-introduces the background-work
threat Core.Scripting-003 closed via System.Threading.Tasks deny.
- **-013 Medium Security** — hand-rolled wrapper-source generation lets
brace-balanced user source inject sibling methods/classes alongside
CompiledScript.Run. Analyzer still gates forbidden types, but the
documented 'method body' authoring contract is silently relaxed.
- **-014 Medium Concurrency** — CompiledScriptCache.Clear() uses key-only
TryRemove(key, out _) — the same race the -006 resolution fixed in
GetOrCompile's catch is latent here on publish-replace.
- **-015 Low Correctness** — ToCSharpTypeName truncates at first backtick;
silently drops closed type arguments of nested-generic shapes (Outer<>.Inner<>).
Latent — no production caller uses this shape today.
- **-016 Medium Performance** — VirtualTagEngine + ScriptedAlarmEngine call
ScriptEvaluator.Compile directly without going through CompiledScriptCache,
so the headline -008 collectible-ALC fix doesn't run on the actual
production path — the per-publish leak is still in effect.
### Core.ScriptedAlarms (1 new, ID -013)
- **-013 Low Documentation** — new internal test accessors return the live
mutable scratch dictionary; XML docs don't warn future test authors about
the synchronisation contract.
### Driver.Cli.Common (2 new, IDs -007, -008)
- **-007 High Correctness** — 0x80550000 was added as BadDeviceFailure but
the real OPC UA spec value for BadDeviceFailure is 0x808B0000 (verified
against Driver.Galaxy.Runtime.StatusCodeMap and HistorianQualityMapper,
both of which use the correct 0x808B0000). 0x80550000 is actually
BadSecurityPolicyRejected. The native mappers (FOCAS / AbCip / AbLegacy)
all use the wrong 0x80550000; this session's SnapshotFormatter extension
propagated the wrong name and the test asserts against the same wrong
value so CI is blind — same shape of bug as Driver.Cli.Common-001.
- **-008 Low Testing** — new FormatStatus_names_native_driver_emitted_codes
Theory is redundant with the existing well-known Theory (same five
InlineData rows added to both) and uses weaker ShouldContain assertion
than the well-known Theory's ShouldBe.
### Driver.Galaxy (4 new, IDs -015 to -018)
- **-015 Medium Security** — vendored DLLs (libs/) have no recorded
provenance: no source-commit SHA from the mxaccessgw repo, no SHA-256
checksum in libs/README.md. Tampering / accidental swap undetectable.
- **-016 Medium Performance** — version skew between declared
PackageReferences (Polly 8.5.2 / Grpc.Net.Client 2.71.0 /
Microsoft.Extensions.Logging.Abstractions 10.0.0) and what the vendored
DLL was actually built against (Polly.Core 8.6.6 / Grpc.Net.Client
2.76.0 / Microsoft.Extensions.Logging.Abstractions 10.0.7). Latent now
(assembly-version refs are loose) but precise shape that produces a
runtime MissingMethodException.
- **-017 Low Design** — no contract-version handshake between the driver
and the gateway; proto could evolve under the gateway without the
driver noticing.
- **-018 Low Documentation** — libs/README.md points at the wrong sibling
csproj as the version source-of-truth; missing SpecificVersion=false
on the Reference items; missing mxaccessgw source-commit SHA.
## Particularly notable
Two findings undercut commits from this session:
- Driver.Cli.Common-007 invalidates commit 5a9c459 (which named 0x80550000
as BadDeviceFailure across the cross-CLI shortlist).
- Core.Scripting-016 invalidates the production effect of commit 7b6ab2e
(the collectible-ALC fix wired Dispose only via CompiledScriptCache,
which the engines don't use).
The wider native-mapper miscoding behind -007 also affects three driver
modules outside this session's edit scope (FocasStatusMapper,
AbCipStatusMapper, AbLegacyStatusMapper all carry the wrong code).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The task-galaxy-e2e branch was merged + deleted; the durable reference
is PR #205 alone. Tidies a dangling pointer that future readers might
chase looking for a branch that no longer exists.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The sibling mxaccessgw repo restructured: clients/dotnet/MxGateway.Client
no longer exists, and the proto contracts moved to a new namespace
(ZB.MOM.WW.MxGateway.Contracts.Proto, was MxGateway.Contracts.Proto). The
driver's source still expects the pre-restructure namespace, so the
broken ProjectReference produced 86 build errors in src/ + 1 in tests/
on master.
Resolution: vendor the last known-good build of MxGateway.Client.dll
(99 KB, May 22) and MxGateway.Contracts.dll (490 KB, May 23) under
src/Drivers/.../Driver.Galaxy/libs/, reference them via <Reference
HintPath=...> in both the driver and its test csproj, and declare the
NuGet packages the dropped ProjectReference was supplying transitively
(Google.Protobuf, Grpc.Core.Api, Grpc.Net.Client,
Microsoft.Extensions.Logging.Abstractions, Polly) at versions matching
the sibling repo's ZB.MOM.WW.MxGateway.Contracts.csproj so binary
compatibility is preserved.
Why this over a source migration:
Source migration would require namespace renames across ~19 driver
files PLUS reimplementing MxGatewayClient / MxGatewaySession /
GalaxyRepositoryClient (~2,200 LoC) — the sibling repo dropped the
client library entirely, keeping only the proto contracts. Vendoring
the last known-good binaries unblocks the build in minutes, freezes
the gateway contract surface at a known-good version, and preserves
the option to migrate properly once the sibling repo decides whether
to restore a client library or hand the work back to us.
libs/README.md documents the unwinding plan (either path closes the
debt: sibling restores a client library, or driver migrates to the new
contracts namespace + reimplements the client wrapper).
Verification:
- dotnet build ZB.MOM.WW.OtOpcUa.slnx: 0 errors (was 87).
- Driver.Galaxy unit tests: 245/245 pass.
- Integration tests not run here (require a live mxaccessgw gateway).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.ScriptedAlarms-009 resolution: replace the per-call Dictionary +
AlarmPredicateContext allocation with a per-alarm reusable AlarmScratch
held in _scratchByAlarmId, refilled in place under _evalGate on each
evaluation. The hot path no longer allocates per upstream tag change.
Why this matters:
On a busy line where many tags feeding many alarms change frequently,
the old BuildReadCache allocated a fresh dictionary + context on every
predicate evaluation — a steady stream of short-lived allocations the
GC eventually has to reclaim. With the reuse, the dictionary and
context are allocated once per alarm (on first evaluation) and refilled
in place across every subsequent re-eval.
Implementation:
- New private AlarmScratch class holds the reusable
Dictionary<string, DataValueSnapshot> read cache (pre-sized to the
alarm's Inputs.Count) and the AlarmPredicateContext that wraps it by
reference. The context observes refilled values without being
re-created.
- ConcurrentDictionary<string, AlarmScratch> _scratchByAlarmId on the
engine, cleared in LoadAsync alongside _alarms so a config-publish
drops the prior generation's scratch (Inputs / Logger may change).
- EvaluatePredicateToStateAsync looks up scratch via GetOrAdd, calls
the new RefillReadCache(Dictionary, IReadOnlySet) helper to clear +
repopulate the dictionary in place, then runs the predicate against
the reused context.
- BuildReadCache removed.
Safety:
Reuse is serialised under _evalGate which guarantees no two threads
ever observe the same scratch in a half-refilled state. The
AlarmPredicateContext is bound to the scratch dictionary by reference,
so the predicate's ctx.GetTag(path) sees the freshly-refilled values
rather than a stale snapshot.
Verification:
- All 66 ScriptedAlarms tests pass (was 63 — three new regression tests
locking the reuse contract).
- All 56 VirtualTags tests still pass (unchanged).
- All 104 Core.Scripting tests still pass (unchanged).
New tests in ScriptedAlarmEngineTests:
- Reevaluation_reuses_the_same_read_cache_dictionary — asserts
ReferenceEquals(scratch_before, scratch_after) across two
evaluations of the same alarm.
- Reevaluation_reuses_the_same_predicate_context — same, for the
context.
- LoadAsync_drops_the_prior_generations_scratch — asserts a config
publish wipes the prior scratch (so a stale Logger / Inputs can't
leak into the new generation).
Internal test hooks TryGetScratchReadCacheForTest /
TryGetScratchContextForTest added via the existing
InternalsVisibleTo for the tests project. Kept internal — not part of
the public engine surface.
Docs:
- docs/v2/Galaxy.Performance.md "Scripted-alarm engine" section
rewritten as "hot-path allocation reuse" documenting the new
contract + reuse safety reasoning + the three regression tests.
- code-reviews/Core.ScriptedAlarms/findings.md -009 flipped
Won't Fix → Resolved.
- code-reviews/README.md regenerated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.Scripting-008 resolution: replace the legacy CSharpScript.CreateDelegate
path with hand-rolled CSharpCompilation + Emit + collectible AssemblyLoadContext,
so per-publish compile accretion no longer requires a server restart to reclaim.
Why this was needed:
Roslyn's CSharpScript path emits dynamically-compiled script assemblies into
the default AssemblyLoadContext, which is non-collectible. Across config-
publish generations each Clear() drops dictionary entries but the emitted
assemblies stay loaded for process lifetime, so memory grows steadily on
long-running servers with frequent publishes. The accepted-limitation note
in docs/VirtualTags.md recommended scheduled restarts as the workaround;
operator feedback was that restarts are difficult, so the underlying
limitation was the right thing to fix.
Implementation:
- New ScriptAssemblyLoadContext(name, isCollectible: true) hosts one emitted
script assembly per evaluator.
- ScriptEvaluator.Compile synthesises a wrapper class around the user source
(CompiledScript.Run(globals) — explicit return required per ordinary C#
semantics, which every existing script already uses), builds a
CSharpCompilation against the sandbox references, runs the
ForbiddenTypeAnalyzer over the semantic model unchanged, emits to an
in-memory PE stream, loads via ScriptAssemblyLoadContext.LoadFromStream,
and binds a strongly-typed Func<ScriptGlobals<TContext>, TResult> delegate
via reflection.
- ScriptEvaluator now implements IDisposable — Dispose calls
AssemblyLoadContext.Unload(), which makes the emitted assembly eligible
for GC at the next collection cycle.
- CompiledScriptCache.Clear() disposes every materialised evaluator before
dropping its dictionary entry; CompiledScriptCache itself is now
IDisposable for graceful server shutdown.
- ScriptSandbox.Build returns a new SandboxConfig (References + Imports)
instead of a Roslyn ScriptOptions; references now span BCL via the
TRUSTED_PLATFORM_ASSEMBLIES set filtered to System.* + netstandard +
Microsoft.Win32.Registry, so forbidden BCL types resolve at compile and
ForbiddenTypeAnalyzer is the sole security gate (consistent with the
Core.Scripting-001 / -002 model — references-list-only restriction is
porous against type forwarding, so the analyzer must be the real gate).
Verification:
- All 104 Core.Scripting tests pass (was 101 — three new regression tests
locking the unload contract).
- All 56 VirtualTags tests pass (unchanged).
- All 63 ScriptedAlarms tests pass (unchanged).
- New CompiledScriptCacheTests:
- Dispose_unloads_compiled_script_assembly_load_context — proves single-
evaluator ALC unload via WeakReference + bounded GC.Collect() loop.
- Clear_disposes_every_materialised_evaluator — proves publish-replace
releases every prior generation's ALC.
- GetOrCompile_after_Dispose_throws_ObjectDisposedException — locks the
post-dispose contract.
Docs:
- docs/VirtualTags.md "Compile cache" section rewritten: the accepted-
limitation note replaced with the unload contract + the new authoring
convention (explicit return).
- docs/ScriptedAlarms.md cross-reference updated to drop the obsolete
restart guidance.
- code-reviews/Core.Scripting/findings.md Core.Scripting-008 flipped
Won't Fix → Resolved with the implementation summary.
- code-reviews/README.md regenerated.
Pre-existing breakage note: Driver.Galaxy fails the solution-wide build on
master because its ProjectReference to the sibling mxaccessgw repo's
MxGateway.Client targets a path that the sibling repo no longer has after a
recent restructuring. This is unrelated to Core.Scripting-008 and was
verified to exist on master before this branch was cut.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.FOCAS.Cli-005 follow-up: extend the SnapshotFormatter.FormatStatus
shortlist with the five Bad* codes the native-protocol mappers (FOCAS,
AbCip, AbLegacy) emit but which the shortlist previously left unnamed,
so they rendered only as severity-class 'Bad' instead of the documented
'BadDeviceFailure' / 'BadNotWritable' / ... names operators are told to
read off probe/write output.
Added entries:
0x80020000 BadInternalError
0x803B0000 BadNotWritable
0x803C0000 BadOutOfRange
0x803D0000 BadNotSupported
0x80550000 BadDeviceFailure
(BadTimeout 0x800A0000 was already added under Driver.Cli.Common-001.)
Tests: SnapshotFormatterTests gains a new [Theory]
FormatStatus_names_native_driver_emitted_codes covering the five names,
and the existing well-known [Theory] is extended with the same entries
to enforce exact '0x... (Name)' rendering. Suite now 47 green (was 42).
Flips Driver.FOCAS.Cli-005 from Deferred to Resolved; README regenerated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 7 closed the last Open findings in Client.UI. The review backlog
is now empty: 0 Open findings across all 31 modules.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Client.UI-003: wire Serilog properly per CLAUDE.md — console sink +
rolling daily file sink in Program.Main, Log.CloseAndFlush in finally,
per-VM Log.ForContext<> loggers.
- Client.UI-004: migrate the cert-store folder picker from the obsolete
OpenFolderDialog to StorageProvider.OpenFolderPickerAsync (with
TryGetFolderFromPathAsync seed + TryGetLocalPath extraction).
- Client.UI-006: surface formerly silent catch blocks via an observable
StatusMessage on the Subscriptions / Alarms VMs that bubbles up into
the shell's status bar; soft fallbacks log at Information level so
hard failures stay distinguishable.
- Client.UI-009: docs/Client.UI.md now lists Standard Deviation in the
Aggregate row of the Query Options table.
- Client.UI-010: removed the unused MinDateTimeProperty /
MaxDateTimeProperty styled properties from DateTimeRangePicker.
- Client.UI-011: updated the cert-store TextBox watermark from the
legacy AppData/LmxOpcUaClient/pki to the canonical
AppData/OtOpcUaClient/pki.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 6 cleared Open findings in Driver.FOCAS.Cli (1 deferred to
Driver.Cli.Common), Driver.Cli.Common, Driver.Historian.Wonderware.Client,
Client.CLI, and Client.Shared.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Client.Shared-003: DefaultSessionAdapter.WriteValueAsync / CallMethodAsync
guard against null/empty Results and throw ServiceResultException with
the response's ServiceResult code instead of indexing into a missing
list.
- Client.Shared-004: DefaultSessionAdapter.CloseAsync / HistoryReadRawAsync
/ HistoryReadAggregateAsync use the Session.*Async overloads and honour
the caller's CancellationToken.
- Client.Shared-009: AcknowledgeAlarmAsync returns the underlying
ServiceResultException.StatusCode on failure instead of always Good;
IOpcUaClientService doc updated to describe the new contract.
- Client.Shared-010: ConnectionSettings.CertificateStorePath defaults to
empty; DefaultApplicationConfigurationFactory resolves the canonical
PKI path lazily, so per-failover ConnectionSettings copies don't hit
the filesystem.
- Client.Shared-011: added the alarm-fallback regression test, extracted
EndpointSelector as a pure static, and added EndpointSelectorTests
covering security-mode match, Basic256Sha256 preference, fallback,
diagnostics, hostname rewrite, and null/empty guards.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Client.CLI-002: SubscribeCommand's neverWentBad list now requires the
node to be present in lastStatus (i.e. received at least one update)
so the 'suspect' bucket only contains observed nodes.
- Client.CLI-003: every long-running command validates numeric option
ranges (Interval / Depth / MaxDepth / Duration / Max) and throws
CliFx CommandException on out-of-range values.
- Client.CLI-004: SubscribeCommand carries XML summary docs on the
type, ctor, every [CommandOption] property, and ExecuteAsync —
matching the sibling commands' style.
- Client.CLI-006: HistoryReadCommand parses --start / --end with
InvariantCulture+UTC and surfaces FormatException as CommandException;
every NodeIdParser.ParseRequired call wraps FormatException /
ArgumentException as CommandException.
- Client.CLI-007: CommandBase.ConfigureLogging calls Log.CloseAndFlush()
before assigning a new Log.Logger so prior sinks are disposed.
- Client.CLI-008: rewrote the subscribe and historyread sections of
docs/Client.CLI.md (every flag documented, summary-bucket vocabulary,
StandardDeviation aggregate, UTC --start/--end convention).
- Client.CLI-009: SubscribeCommand / AlarmsCommand use named local
handlers and detach them via -= after UnsubscribeAsync so no
notification reaches the console after the command's output phase
ends.
- Client.CLI-010: added CommandRangeValidationTests,
EventHandlerLifecycleTests, InputValidationErrorsTests,
LoggerLifecycleTests, and SubscribeCommandSummaryTests pinning every
Low fix; FakeOpcUaClientService gained AddDiscoveredVariable +
RaiseDataChanged + BrowseResultsByParent helpers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Historian.Wonderware.Client-003: replaced the mixed Interlocked
+ healthLock counters with RecordOutcome that touches _totalQueries
and exactly one of _totalSuccesses / _totalFailures under one
acquisition.
- Driver.Historian.Wonderware.Client-004: InvokeAndClassifyAsync routes
transport + sidecar classification through a single RecordOutcome
call; the legacy ReclassifySuccessAsFailure two-step is gone.
- Driver.Historian.Wonderware.Client-006: removed the dead
ReconnectInitialBackoff / ReconnectMaxBackoff options and added a
doc <remarks> stating the channel performs a single in-place
reconnect; retry/backoff stays with the caller.
- Driver.Historian.Wonderware.Client-008: the audit-suppression comment
block now records advisory titles, why neither applies, and the
revisit trigger.
- Driver.Historian.Wonderware.Client-010: reworded Dispose() to claim
deadlock-safety and added a GetHealthSnapshot summary documenting the
single-channel collapse + counter invariant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Cli.Common-004: confirm the FormatTable empty-input guard
landed earlier (commit 1433a1c); flip status to Resolved with a
cross-reference.
- Driver.Cli.Common-006: reword the SnapshotFormatter source-time
column comment to describe the actual behaviour (right-most column,
unmeasured, '-' for null timestamps) and confirm the
DriverCommandBase summary now enumerates FOCAS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.FOCAS.Cli-001: WriteCommand.ParseValue now wraps numeric
FormatException / OverflowException as CliFx CommandException with
the offending value.
- Driver.FOCAS.Cli-002: SubscribeCommand's OnDataChange handler and the
banner both take a writeLock so notification-callback and main-thread
writes can't interleave; handler exceptions are warn-and-swallow.
- Driver.FOCAS.Cli-003: FocasCommandBase.ValidateOptions rejects
--cnc-port outside 1..65535, non-positive --timeout-ms, and
non-positive --interval-ms; ExecuteAsync calls it first.
- Driver.FOCAS.Cli-004: 'await using var driver' is the sole driver
disposal path; dropped the redundant explicit await ShutdownAsync.
- Driver.FOCAS.Cli-005 (Deferred): the fix lives in
Driver.Cli.Common.SnapshotFormatter — explicitly naming the
status-code shortlist there benefits every driver CLI. Left as a
Driver.Cli.Common follow-up.
- Registered the new tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli.Tests
project in ZB.MOM.WW.OtOpcUa.slnx.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 5 cleared Open findings in Driver.AbCip.Cli, Driver.AbLegacy.Cli,
Driver.S7.Cli, Driver.TwinCAT.Cli, and Driver.Modbus.Cli.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Modbus.Cli-003: ModbusCommandBase.ValidateEndpoint rejects
--port outside 1..65535, non-positive --timeout-ms, and --unit-id
outside 1..247.
- Driver.Modbus.Cli-004: wrapped SubscribeCommand's OnDataChange handler
body in a try/catch (warn-and-swallow) and serialised the console
write through a lock.
- Driver.Modbus.Cli-005: Probe / Read / Write now catch the
cancellation-during-init OperationCanceledException and print
'Cancelled.' instead of dumping a stack trace.
- Driver.Modbus.Cli-006: ProbeCommand.ComputeVerdict derives the headline
from BOTH the driver state and the probe snapshot's OPC UA quality
class so the headline can't disagree with the wire result.
- Driver.Modbus.Cli-007: docs/Driver.Modbus.Cli.md carries an explicit
'CLI scope' callout — the address-string grammar is a DriverConfig
JSON feature; the CLI takes the structured triple only.
- Driver.Modbus.Cli-008: pinned BuildOptions, ValidateEndpoint, the
region-validation guards, ComputeVerdict, and the cancellation-during-
initialize paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.TwinCAT.Cli-001: TwinCATCommandBase.Validate rejects
non-positive TimeoutMs / IntervalMs and AmsPort outside 1..65535;
ExecuteAsync calls it first.
- Driver.TwinCAT.Cli-002: SubscribeCommand serialises every WriteLine
through a writeLock to remove the notification-callback vs banner
interleave risk.
- Driver.TwinCAT.Cli-003: SubscribeCommand.DescribeMechanism derives
the banner label from the returned ISubscriptionHandle.DiagnosticId
so it can't disagree with what the driver actually did.
- Driver.TwinCAT.Cli-004: introduced TwinCATTagCommandBase carrying
--poll-only + BuildOptions; BrowseCommand stays on the slimmer
TwinCATCommandBase so --poll-only no longer surfaces in browse --help.
- Driver.TwinCAT.Cli-005: ProbeCommand --type now carries the 't' short
alias to match the other commands.
- Driver.TwinCAT.Cli-006: 35 new tests covering Gateway / AmsAddress
parse / BuildOptions / PollOnly / browse-helpers / probe-alias /
mechanism derivation.
- Driver.TwinCAT.Cli-007: replaced the empty-init <inheritdoc/> with an
explicit summary warning future maintainers about the no-op init.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.S7.Cli-004: 'await using var driver' is the sole driver
disposal path; dropped the redundant explicit await ShutdownAsync from
each command's finally.
- Driver.S7.Cli-005: deleted the stale empty
tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/ directory (the real test
project lives under tests/Drivers/Cli/).
- Driver.S7.Cli-006: S7CommandBaseBuildOptionsTests cover the probe
toggle, timeout mapping, host/port/CPU/rack/slot wiring, and tag list
passthrough.
- Driver.S7.Cli-007: re-added the SubscribeCommand handler comment
explaining the CliFx IConsole.Output usage and that the poll-thread
raises events.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.AbLegacy.Cli-002: WriteCommand.Value description lists the full
true/false, 1/0, on/off, yes/no alias set.
- Driver.AbLegacy.Cli-003: SubscribeCommand serialises every WriteLine
via a per-execution consoleGate lock so the poll-thread OnDataChange
handler can't interleave with the banner.
- Driver.AbLegacy.Cli-004: dropped 'await using var driver' in favour of
a plain 'var driver' + explicit await ShutdownAsync in finally; the
driver is no longer shut down twice.
- Driver.AbLegacy.Cli-005: SubscribeCommand.IntervalMs description
carries the PollGroupEngine 250ms-floor caveat; docs/Driver.AbLegacy.Cli.md
spells out the same.
- Driver.AbLegacy.Cli-006: ProbeCommand --type now carries the short
alias 't' to match the other commands.
- Driver.AbLegacy.Cli-007: BuildOptionsTests cover the probe-disabled,
device-shape, tag-passthrough, timeout-propagation, and empty-tag-list
paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.AbCip.Cli-003: SubscribeCommand prints the 'Subscribed' banner
BEFORE wiring OnDataChange so the main thread can't interleave its
write with the poll-thread handler.
- Driver.AbCip.Cli-004: AbCipCommandBase.Timeout and SubscribeCommand
validate TimeoutMs / IntervalMs and throw CommandException on
non-positive values.
- Driver.AbCip.Cli-005: every command now calls FlushLogging() in its
finally block.
- Driver.AbCip.Cli-006: Timeout init throws NotSupportedException with a
pointer at TimeoutMs instead of silently swallowing assignments.
- Driver.AbCip.Cli-007: added AbCipCommandBaseTests covering BuildOptions
shape, probe / controller-browse / alarm toggles, host address, family
selection, tag list passthrough.
- Driver.AbCip.Cli-008: rewrote the opening paragraph in
docs/Driver.AbCip.Cli.md to credit the six-CLI roster with a pointer
at docs/DriverClis.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 4 cleared Open findings in Driver.TwinCAT, Driver.Modbus,
Driver.OpcUaClient, Driver.Historian.Wonderware, and Driver.Modbus.Addressing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Modbus.Addressing-006: broaden the catch in TryParseFamilyNative
so a future helper throwing a non-Argument/Overflow type still satisfies
the try-parse contract.
- Driver.Modbus.Addressing-007: document that the address grammar does
not carry ModbusStringByteOrder (the structured-tag path does);
add a 'Grammar scope' bullet to docs/v2/dl205.md.
- Driver.Modbus.Addressing-009: reword the ModbusModiconAddress comments
so they don't imply a leading-digit invariant the parser doesn't
enforce.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Historian.Wonderware-004: ToHistorianEvent synthesises a fresh
Guid when the upstream EventId is unparseable and logs the substitution
instead of writing the historian with Guid.Empty.
- Driver.Historian.Wonderware-005: GetHealthSnapshot derives the
connection-open booleans from the active-node fields so the snapshot
is self-consistent without depending on the secondary lock.
- Driver.Historian.Wonderware-007: SID-mismatch branch in PipeServer now
sends a HelloAck { Accepted=false, RejectReason } so the client sees a
symmetric rejection.
- Driver.Historian.Wonderware-008: classify StartQuery failures —
connection-class codes drop the connection, query-class codes throw
QueryClassStartQueryException so the IPC layer surfaces Success=false.
- Driver.Historian.Wonderware-010: RequestTimeoutSeconds now enforced
via BuildRequestCts linked to the caller's CancellationToken.
- Driver.Historian.Wonderware-011: refreshed XML docs to describe the
current sidecar / named-pipe architecture (Galaxy.Host / Proxy
references reframed as historical context).
- Driver.Historian.Wonderware-012: pinned the previously-uncovered
HistorianDataSource behaviours with five new test files; also removed
the stale empty tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests
directory.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.OpcUaClient-011: rewrote the ValueRank comment with the OPC UA
Part 3 constants and an explicit scalar/array boundary at
valueRank >= 0.
- Driver.OpcUaClient-014: track every MonitoredItem.Notification handler
in a MonitoredItemNotificationHandle record; UnsubscribeAsync /
UnsubscribeAlarmsAsync / ShutdownAsync detach the handler before
Subscription.DeleteAsync so the SDK's invocation list no longer keeps
the driver alive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Modbus-003: route every _health access through ReadHealth /
WriteHealth helpers backed by Volatile.Read / Volatile.Write so a
burst of concurrent ReadAsync callers always sees a complete snapshot.
- Driver.Modbus-007: promoted the Int64 / UInt64 → Int32 surfacing
caveat to a full <remarks> block; rewrote DisableFC23's doc to flag it
as reserved / no-op.
- Driver.Modbus-008: deleted stale duplicate doc, rewrote the
prohibition-block summaries to credit the shipped re-probe loop, and
removed the unused 'status' local in the ModbusException catch arm.
- Driver.Modbus-009: bind-time validation rejects StringLength < 1 for
String tags; ModbusTcpTransport clamps keep-alive intervals to whole
seconds (>=1).
- Driver.Modbus-010: documented WriteOnChangeOnly's cache-invalidation
policy (reads-only) and the write-only-tag caveat.
- Driver.Modbus-011: collected the scattered instance fields into a
single contiguous block at the top of ModbusDriver.
- Driver.Modbus-012: covered the previously-uncovered Reinitialize
state-hygiene, malformed/truncated/empty-bitmap response, and
DisposeAsync teardown paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.TwinCAT-004: corrected the IEC time-type inline comments;
documented that the driver currently surfaces them as raw UInt32
counters.
- Driver.TwinCAT-006: ResolveHost returns a documented UnresolvedHost
sentinel when no devices are configured instead of returning the
logical DriverInstanceId (which never matches GetHostStatuses).
- Driver.TwinCAT-014: wired Probe.Timeout into the probe-loop call and
added a NotificationMaxDelayMs config knob threaded through
AddNotificationAsync.
- Driver.TwinCAT-015: Dispose() runs a genuinely synchronous teardown
with bounded waits (no sync-over-async deadlock pattern).
- Driver.TwinCAT-016: pinned the Structure-tag rejection and the
probe-loop vs read disposal race with regression tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 3 cleared Open findings in Driver.Galaxy, Driver.AbCip,
Driver.AbLegacy, Driver.FOCAS, and Driver.S7.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.S7-003: ArgumentNullException.ThrowIfNull on the references
argument at the top of ReadAsync / WriteAsync (was reaching .Count
before any null check).
- Driver.S7-005: drop the redundant global::S7.Net.Plc qualifiers in
ReadOneAsync / WriteOneAsync — using S7.Net already covers Plc.
- Driver.S7-009: PollLoopAsync degrades _health to Degraded after
sustained failure and backs off exponentially up to PollBackoffCap;
resets on a healthy tick so an operator can see the loop wedge.
- Driver.S7-010: Dispose runs the synchronous teardown directly with a
bounded WhenAll Wait drain instead of bridging via DisposeAsync().
- Driver.S7-013: reject unsupported S7DataType values (Int64 / UInt64 /
Float64 / String / DateTime) at InitializeAsync so half-implemented
types no longer leak BadNotSupported live nodes into the address space.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.FOCAS-007: optional ILogger<FocasDriver> + alarm-projection
logger; log Debug around every formerly-empty catch (probe / shutdown
/ fixed-tree / recycle / alarms-read / projection).
- Driver.FOCAS-008: cache the parsed FocasAddress per tag at
InitializeAsync; Read/WriteAsync look it up instead of re-parsing on
every call.
- Driver.FOCAS-009: ProbeLoopAsync now wraps client.ProbeAsync in a
linked CTS honouring Probe.Timeout so a hung CNC socket can't block
past the configured limit.
- Driver.FOCAS-010: FocasOperationModeExtensions.ToText delegates to
FocasOpMode.ToText — single canonical op-mode label surface.
- Driver.FOCAS-011: FocasAlarmType constants are typed short to match
the cnc_rdalmmsg2 wire field and the projection switch arms.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.AbLegacy-005: optional ILogger<AbLegacyDriver> ctor parameter,
logged init failure / probe transitions / first non-zero libplctag
status per device.
- Driver.AbLegacy-011: Dispose() runs the synchronous teardown directly
instead of bridging via DisposeAsync().AsTask().GetAwaiter().GetResult()
to remove the documented sync-over-async deadlock pattern.
- Driver.AbLegacy-013: documented the ResolveHost three-tier fallback
chain in XML and pointed DiscoverAsync's IsArray=false comment at the
Modbus ArrayCount pattern for the eventual multi-element follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.AbCip-007: inject an optional ILogger<AbCipDriver> /
ILogger<AbCipAlarmProjection> (default NullLogger) and log around
every read / write / template-fetch / probe / alarm-poll failure path.
- Driver.AbCip-011: LogWarning when InitializeAsync is configured with
Probe.Enabled=true but ProbeTagPath is blank — operators now see why
GetHostStatuses keeps reporting Unknown.
- Driver.AbCip-012: documented the LibplctagTemplateReader per-call
Tag cost as accepted given libplctag's own connection pool and the
low-frequency discovery use-case.
- Driver.AbCip-013: per-device AllowPacking + ConnectionSize overrides
on AbCipDeviceOptions, threaded through AbCipTagCreateParams; central
BuildCreateParams helper replaces five ad-hoc clones; AllowPacking
now reaches Tag.AllowPacking at runtime.
- Driver.AbCip-015: stale-comment sweep — every PR-N forward-reference
is rewritten to describe present behaviour.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Galaxy-005: rewrite the EventPump BoundedChannelOptions comment
to honestly describe the Wait+TryWrite pattern.
- Driver.Galaxy-010: ResolveApiKey now warns when a literal API key is
used in production wiring; added an explicit dev: prefix for known
cleartext-in-dev cases and rewrote the GalaxyGatewayOptions doc.
- Driver.Galaxy-012: O(1) reverse-lookup for SubscriptionRegistry
dispatch via per-entry FullRefByItemHandle map; immutable hash-set for
the cross-binding reverse map; SubscribeAsync / ReadViaSubscribeOnce
use BuildResultIndex for per-reference correlation.
- Driver.Galaxy-013: ReinitializeAsync now validates the incoming JSON
against the running options; ReplayOnSessionLost honoured by the
Replay path; class summary rewritten to describe the shipped surface.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Resolution prose was already recorded under Core.Scripting commit
(0454822); status was left as Open. Flip to Won't Fix to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 2 cleared Open findings in Core.ScriptedAlarms, Core.Scripting,
Core.VirtualTags, Admin, and Server (Core.ScriptedAlarms-009 documented
under Won't Fix per the recommendation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Server-004: pass the role-derived display name to UserIdentity's base
ctor (the SDK's DisplayName has no public setter) and drop the dead
Display property; make RoleBasedIdentity internal sealed.
- Server-006: derive a bounded CancellationToken from the SDK's
OperationContext.OperationDeadline in OnReadValue / OnWriteValue so a
stalled driver call can no longer pin the request thread.
- Server-008: mark handled slots via CallMethodRequest.Processed = true
in RouteScriptedAlarmMethodCalls (the SDK skips on Processed, not on a
Good error slot).
- Server-012: PeerHttpProbeLoop.ProbeAsync stops mutating client.Timeout
per call; uses a per-request CancellationTokenSource linked to the
shutdown token instead.
- Server-014: wire SealedBootstrap into Program.cs via AddSealedBootstrap
+ OpcUaServerService so the generation-sealed cache + stale-config flag
+ resilient reader actually run; /healthz now reflects cache-fallback
state.
- Server-015: replace the stale 'PR 16 / PR 17 minimum-viable scope'
class summaries on OtOpcUaServer and OpcUaServerOptions with the
shipped LDAP + anonymous-role + configurable security-profile prose.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Admin-010: vendor Bootstrap 5.3.3 (CSS + JS bundle + maps + provenance
README) under wwwroot/lib/bootstrap and reference local paths from
App.razor — Admin no longer pulls Bootstrap from jsDelivr.
- Admin-011: swap FleetStatusPoller's three plain dictionaries for
ConcurrentDictionary so ResetCache can't race a poll tick.
- Admin-012: drop the EquipmentId column from EquipmentCsvImporter (per
admin-ui.md — equipment id is system-derived from EquipmentUuid);
EquipmentImportBatchService and the textarea placeholder updated to
match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.VirtualTags-004: CoerceResult now covers every scalar
DriverDataType and throws on the default arm; Load rejects unsupported
declared types.
- Core.VirtualTags-006: Subscribe/Unsub prune empty observer-list
entries from _observers under the same lock with a reconfirm-on-add
race guard.
- Core.VirtualTags-007: rewrote TimerTriggerScheduler so each TickGroup
tracks an InFlight flag (Interlocked CAS); ticks that overlap a still-
running tick for the same group are skipped + counted.
- Core.VirtualTags-009: DirectDependencies / DirectDependents return a
shared static empty set on miss instead of allocating per call.
- Core.VirtualTags-010: corrected XML docs to reference the real engine
symbols (OnUpstreamChange, CascadeAsync, etc.) instead of phantom types.
- Core.VirtualTags-011: Load now rejects scripts whose declared Writes
target a non-registered virtual-tag path.
- Core.VirtualTags-013: DependencyCycleException renders SCC members as
a set rather than a fabricated arrow-traversal edge path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.Scripting-005: DependencyExtractor.HandleTagCall now recognises
raw-string literal paths by checking the StringLiteralExpression node
kind instead of the legacy StringLiteralToken kind.
- Core.Scripting-006: scope CompiledScriptCache failed-compile eviction
with TryRemove(KeyValuePair) so a racing retry entry is not evicted.
- Core.Scripting-008: document the per-publish assembly accretion as an
accepted limitation in docs/VirtualTags.md.
- Core.Scripting-009: enumerate the authoritative deny-list (namespace
prefixes + type-granular denies) in the Phase 7 decision-#6 entry to
match ForbiddenTypeAnalyzer.
- Core.Scripting-011: pin ScriptSandbox.Build, ScriptContext.Deadband
boundary semantics, and end-to-end factory + companion-sink
integration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.ScriptedAlarms-003: emit OnEvent OUTSIDE _evalGate by collecting
pending emissions during the gate-held section and flushing them after
release; eliminates re-entrancy deadlock the docs already promised.
- Core.ScriptedAlarms-006: track every fire-and-forget Reevaluate /
ShelvingCheck task in _inFlight; Dispose drains the set so the engine
no longer races store writes against teardown.
- Core.ScriptedAlarms-008: store comments as ImmutableList<AlarmComment>
so AppendComment is O(log n) instead of O(n).
- Core.ScriptedAlarms-010: document the deliberate input-quality
asymmetry (Uncertain drives the predicate, renders {?} in the message)
in docs/ScriptedAlarms.md and on MessageTemplate.Resolve remarks.
- Core.ScriptedAlarms-011: propagate the no-op reason through
TransitionResult.NoOp(state, reason) and log it from
ScriptedAlarmEngine.ApplyAsync.
- Core.ScriptedAlarms-009 (Won't Fix per recommendation): documented the
per-evaluation dictionary allocation in docs/v2/Galaxy.Performance.md
with a mitigation path if a future soak surfaces pressure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 1 cleared Open findings in Core, Core.Abstractions, Core.AlarmHistorian,
Configuration, and Analyzers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Analyzers-002: drop the three dead AlarmSurfaceInvoker entries from
the wrapper-method allow-list and from the diagnostic message.
- Analyzers-003: bail out of AnalyzeInvocation when the semantic model
is null (was previously emitting a false positive).
- Analyzers-004: resolve guarded-interface + wrapper-method symbols
once via CompilationStartAction and compare with SymbolEqualityComparer
instead of formatting fully-qualified names on every invocation.
- Analyzers-005: add regression tests for default-interface-method
reads (ReadAtTimeAsync / ReadEventsAsync on a concrete driver), with
+ without an override, and inside a CapabilityInvoker.ExecuteAsync
lambda.
- Analyzers-007: rewrite the analyzer remarks to accurately describe
the symbol-identity guarded-call detection, DIM handling, and the
wrapper-lambda match heuristic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.AlarmHistorian-008: cache queue depth in an Interlocked counter so
EnqueueAsync no longer runs COUNT(*) on every alarm; consolidate
DrainOnceAsync onto a single SqliteConnection per tick (purge, batch
read, dead-letter, and outcome transaction all share it).
- Core.AlarmHistorian-011: confirm the stale Galaxy.Host XML doc
references were already fixed under earlier commits; flip to Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>