Compare commits
18 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| f23e368a74 | |||
| c8de58d6d3 | |||
| 8fe7c8bea6 | |||
| c6082aa0b9 | |||
| b1f3e09661 | |||
| 49644fc7fd | |||
| 3d982d9a65 | |||
| 23d59d73f2 | |||
| c2abbf45bd | |||
| 3a53d03d23 | |||
| fb7c6c7046 | |||
| a6ae4e22d1 | |||
| 41e62b2663 | |||
| a9be80923c | |||
| 994997ba7b | |||
| 0001cdd579 | |||
| 7b6ab2ec6f | |||
| 5a9c4591b9 |
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `src/Core/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-22 |
|
||||
| Commit reviewed | `76d35d1` |
|
||||
| Review date | 2026-05-23 |
|
||||
| Commit reviewed | `a9be809` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -14,6 +14,8 @@
|
||||
A comprehensive review completes every category, recording "No issues found" where
|
||||
a category produced nothing rather than leaving it blank.
|
||||
|
||||
### 2026-05-22 review (commit `76d35d1`)
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | Core.ScriptedAlarms-002 |
|
||||
@@ -27,6 +29,26 @@ a category produced nothing rather than leaving it blank.
|
||||
| 9 | Testing coverage | Core.ScriptedAlarms-012 |
|
||||
| 10 | Documentation & comments | Core.ScriptedAlarms-003 |
|
||||
|
||||
### 2026-05-23 re-review (commit `a9be809`)
|
||||
|
||||
Focused re-review of the Core.ScriptedAlarms-009 resolution (commit `0001cdd`) —
|
||||
new `AlarmScratch` class, `_scratchByAlarmId` ConcurrentDictionary, `RefillReadCache`
|
||||
helper, and internal test accessors. Only the changed/new code since `76d35d1` was
|
||||
re-examined; existing closed findings stay as audit trail.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | No issues found |
|
||||
| 2 | OtOpcUa conventions | No issues found |
|
||||
| 3 | Concurrency & thread safety | No issues found |
|
||||
| 4 | Error handling & resilience | No issues found |
|
||||
| 5 | Security | No issues found |
|
||||
| 6 | Performance & resource management | No issues found |
|
||||
| 7 | Design-document adherence | No issues found |
|
||||
| 8 | Code organization & conventions | No issues found |
|
||||
| 9 | Testing coverage | No issues found |
|
||||
| 10 | Documentation & comments | Core.ScriptedAlarms-013 |
|
||||
|
||||
## Findings
|
||||
|
||||
### Core.ScriptedAlarms-001
|
||||
@@ -156,13 +178,29 @@ a category produced nothing rather than leaving it blank.
|
||||
| Severity | Low |
|
||||
| Category | Performance & resource management |
|
||||
| Location | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
|
||||
| Status | Won't Fix |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` on every predicate evaluation, i.e. on every upstream tag change for every referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. `AlarmPredicateContext` is also newly constructed each evaluation (line 281).
|
||||
|
||||
**Recommendation:** Minor. If the evaluation path shows up in allocation profiling, the read cache could be a reused per-alarm buffer cleared between evaluations (evaluations are already serialised under `_evalGate`, so a single shared scratch dictionary is safe). Not worth doing speculatively — flag for the perf surface in `docs/v2/Galaxy.Performance.md` if alarm evaluation is ever soak-tested.
|
||||
|
||||
**Resolution:** Won't Fix 2026-05-23 — per the recommendation, no code change. Documented the known allocation characteristic in `docs/v2/Galaxy.Performance.md` (new "Scripted-alarm engine — known hot-path allocations" section) so a future soak that surfaces pressure has a noted mitigation (reused per-alarm scratch buffer) and we don't re-find this in a later review.
|
||||
**Resolution:** Resolved 2026-05-23 — added a per-alarm reusable `AlarmScratch`
|
||||
(read-cache `Dictionary` + `AlarmPredicateContext`) held in
|
||||
`_scratchByAlarmId`, populated lazily on first evaluation and refilled in place
|
||||
by the new `RefillReadCache(Dictionary, IReadOnlySet)` helper on every
|
||||
subsequent re-eval. `BuildReadCache` is gone. The reuse is safe because every
|
||||
evaluation runs under `_evalGate`; the context wraps the dictionary by
|
||||
reference, so the predicate's `ctx.GetTag(path)` sees the freshly-refilled
|
||||
values. `LoadAsync` clears `_scratchByAlarmId` alongside `_alarms` so a
|
||||
config-publish drops the prior generation's scratch (Inputs / Logger may
|
||||
change). Regression tests added in `ScriptedAlarmEngineTests`:
|
||||
`Reevaluation_reuses_the_same_read_cache_dictionary`,
|
||||
`Reevaluation_reuses_the_same_predicate_context`, and
|
||||
`LoadAsync_drops_the_prior_generations_scratch`; internal test hooks
|
||||
`TryGetScratchReadCacheForTest` / `TryGetScratchContextForTest` exposed via
|
||||
the existing `InternalsVisibleTo`. `docs/v2/Galaxy.Performance.md` "Scripted-alarm
|
||||
engine" section rewritten to document the new reuse contract. Suite now 66
|
||||
green (was 63).
|
||||
|
||||
### Core.ScriptedAlarms-010
|
||||
|
||||
@@ -208,3 +246,32 @@ a category produced nothing rather than leaving it blank.
|
||||
**Recommendation:** Add engine-level tests that inject a controllable `Func<DateTime>` clock to drive `RunShelvingCheck`, cover the remaining Part 9 engine methods end-to-end, assert subscriber-exception isolation, and add a store-failure fake to lock in the chosen persistence-failure semantics from finding 007.
|
||||
|
||||
**Resolution:** Resolved 2026-05-22 — added 8 new engine-level tests covering all 6 gap areas: injectable-clock timed-shelve expiry via `RunShelvingCheckForTest`, `ConfirmAsync`/`TimedShelveAsync`/`UnshelveAsync`/`EnableAsync` end-to-end, subscriber-exception isolation, store-failure invariant, second-`LoadAsync` timer-leak regression, and `AreInputsReady` Bad/Uncertain guard; exposed `RunShelvingCheckForTest()` internal hook on the engine.
|
||||
|
||||
### Core.ScriptedAlarms-013
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & comments |
|
||||
| Location | `ScriptedAlarmEngine.cs:66-81` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The new internal test accessors `TryGetScratchReadCacheForTest` and `TryGetScratchContextForTest` (introduced by the Core.ScriptedAlarms-009 resolution at `0001cdd`) return the *live* per-alarm scratch — the same `Dictionary<string, DataValueSnapshot>` instance the engine clears and refills in `RefillReadCache` under `_evalGate`, plus the `AlarmPredicateContext` that wraps it by reference. The XML docs describe the intended use ("assert the scratch is reused across evaluations (two reads return the same instance)") but do not explicitly warn that:
|
||||
|
||||
1. The returned `IReadOnlyDictionary` is the engine's mutable working set. Enumerating it from a test thread while the engine is mid-evaluation (e.g. during a `ReevaluateAsync` queued by `OnUpstreamChange`, or a `ShelvingCheckAsync` callback) is a concurrent-read-while-writer scenario against a plain `Dictionary` — undefined behaviour, can throw `InvalidOperationException` or return torn data.
|
||||
2. Reference-equality comparisons (`ReferenceEquals(a, b)`) and single-key indexer reads (`dict["Temp"]`) on a quiesced engine are the only safe uses. The existing regression tests stay within those bounds, but a future test author has no in-code signal that broader reads are unsafe.
|
||||
|
||||
The engine itself is correct — `RefillReadCache` runs only under `_evalGate`, so the engine never tears its own state. The risk is purely on the test-side contract.
|
||||
|
||||
**Recommendation:** Add a `<remarks>` block to both `TryGetScratchReadCacheForTest` and `TryGetScratchContextForTest` stating that the returned references point at live engine state, that reads are only safe when the engine is known to be idle (no in-flight `ReevaluateAsync`/`ShelvingCheckAsync`/`LoadAsync`), and that the intended uses are reference-identity assertions plus single-key lookups against a quiesced engine — never enumeration. No code change required; the engine's correctness depends on `_evalGate`, which is already documented.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — applied the recommendation verbatim.
|
||||
Added a `<remarks>` block to each of `TryGetScratchReadCacheForTest` and
|
||||
`TryGetScratchContextForTest` documenting the synchronization contract:
|
||||
the returned references point at live engine state refilled in place under
|
||||
`_evalGate`, enumeration during an in-flight evaluation is a
|
||||
concurrent-read-while-writer scenario against a plain `Dictionary`
|
||||
(undefined behaviour), and the only safe uses are reference-identity
|
||||
comparisons + single-key reads against a quiesced engine. No code change
|
||||
required — the engine's correctness was always there; only the test-side
|
||||
contract was undocumented.
|
||||
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-22 |
|
||||
| Commit reviewed | `76d35d1` |
|
||||
| Review date | 2026-05-23 |
|
||||
| Commit reviewed | `a9be809` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -14,18 +14,23 @@
|
||||
A comprehensive review completes every category, recording "No issues found" where
|
||||
a category produced nothing rather than leaving it blank.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | Core.Scripting-004, Core.Scripting-005 |
|
||||
| 2 | OtOpcUa conventions | No issues found |
|
||||
| 3 | Concurrency & thread safety | Core.Scripting-006 |
|
||||
| 4 | Error handling & resilience | Core.Scripting-007 |
|
||||
| 5 | Security | Core.Scripting-001, Core.Scripting-002, Core.Scripting-003 |
|
||||
| 6 | Performance & resource management | Core.Scripting-008 |
|
||||
| 7 | Design-document adherence | Core.Scripting-009 |
|
||||
| 8 | Code organization & conventions | No issues found |
|
||||
| 9 | Testing coverage | Core.Scripting-010, Core.Scripting-011 |
|
||||
| 10 | Documentation & comments | No issues found |
|
||||
The 2026-05-23 re-review only covers code touched between commits `76d35d1` and
|
||||
`a9be809` (primarily the Core.Scripting-008 ALC rewrite + the broadened BCL
|
||||
references). Categories where the new code surface produced no issues are
|
||||
recorded as "No new issues" for that pass.
|
||||
|
||||
| # | Category | Result (76d35d1) | Result (a9be809, new code only) |
|
||||
|---|---|---|---|
|
||||
| 1 | Correctness & logic bugs | Core.Scripting-004, Core.Scripting-005 | Core.Scripting-015 |
|
||||
| 2 | OtOpcUa conventions | No issues found | No new issues |
|
||||
| 3 | Concurrency & thread safety | Core.Scripting-006 | Core.Scripting-014 |
|
||||
| 4 | Error handling & resilience | Core.Scripting-007 | No new issues |
|
||||
| 5 | Security | Core.Scripting-001, Core.Scripting-002, Core.Scripting-003 | Core.Scripting-012, Core.Scripting-013 |
|
||||
| 6 | Performance & resource management | Core.Scripting-008 | Core.Scripting-016 |
|
||||
| 7 | Design-document adherence | Core.Scripting-009 | No new issues |
|
||||
| 8 | Code organization & conventions | No issues found | No new issues |
|
||||
| 9 | Testing coverage | Core.Scripting-010, Core.Scripting-011 | No new issues |
|
||||
| 10 | Documentation & comments | No issues found | No new issues |
|
||||
|
||||
## Findings
|
||||
|
||||
@@ -240,7 +245,7 @@ race ordering.
|
||||
| Severity | Low |
|
||||
| Category | Performance & resource management |
|
||||
| Location | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` |
|
||||
| Status | Won't Fix |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `CompiledScriptCache` has no capacity bound (acknowledged in the class
|
||||
remarks) and no eviction. Each cached `ScriptEvaluator` holds a Roslyn `ScriptRunner<T>`
|
||||
@@ -257,7 +262,33 @@ compile scripts into a collectible `AssemblyLoadContext` so `Clear()` can unload
|
||||
generations. At minimum add a note to `docs/ScriptedAlarms.md` so operators with
|
||||
high-publish-frequency deployments are aware.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — accepted as a documented known limitation rather than fixing in code (collectible `AssemblyLoadContext` for Roslyn-emitted assemblies is a v3 concern). The "Compile cache" section of `docs/VirtualTags.md` now carries a "Per-publish assembly accretion (accepted limitation, Core.Scripting-008)" note that operators with high-publish-frequency deployments can scan, and `docs/ScriptedAlarms.md` cross-references it. The accretion is benign at the expected "low thousands" of scripts scale; recommended mitigation is a scheduled server restart for deployments that publish very frequently.
|
||||
**Resolution:** Resolved 2026-05-23 — switched the compile pipeline off the legacy
|
||||
`CSharpScript.CreateDelegate` path (which emits into the default, non-collectible
|
||||
`AssemblyLoadContext`) and onto a hand-rolled `CSharpCompilation` →
|
||||
`Compilation.Emit(MemoryStream)` → `ScriptAssemblyLoadContext.LoadFromStream` chain,
|
||||
with the new `ScriptAssemblyLoadContext` constructed `isCollectible: true`. Each
|
||||
compiled script lives in its own ALC; `ScriptEvaluator` now implements `IDisposable`
|
||||
and calls `AssemblyLoadContext.Unload()` on dispose. `CompiledScriptCache.Clear()`
|
||||
disposes every materialised evaluator before dropping its dictionary entry, and
|
||||
`CompiledScriptCache` itself is now `IDisposable` for graceful server shutdown.
|
||||
After a publish-replace cycle the prior generation's emitted assemblies become
|
||||
eligible for GC; the reclaim is GC-timing-sensitive (Unload is
|
||||
*eligible-for-collection*, not synchronous) and the next collection cycle reclaims
|
||||
them. The references list is now BCL-wide (System.* + netstandard + Microsoft.Win32.Registry
|
||||
via the TRUSTED_PLATFORM_ASSEMBLIES set) so forbidden BCL types resolve at compile and
|
||||
`ForbiddenTypeAnalyzer` is the sole security gate (consistent with the
|
||||
Core.Scripting-001 / -002 model). `docs/VirtualTags.md` "Compile cache" section rewritten;
|
||||
`docs/ScriptedAlarms.md` cross-reference updated to drop the obsolete restart guidance.
|
||||
Regression tests added in `CompiledScriptCacheTests`:
|
||||
`Dispose_unloads_compiled_script_assembly_load_context`,
|
||||
`Clear_disposes_every_materialised_evaluator`, and
|
||||
`GetOrCompile_after_Dispose_throws_ObjectDisposedException`; the first two
|
||||
prove ALC unload via `WeakReference` + bounded `GC.Collect()` loops. Suite now 104
|
||||
green (was 101). Authoring convention: the synthesized wrapper is an ordinary
|
||||
C# static method, so scripts must end with explicit `return …;` per ordinary C# rules
|
||||
(the legacy `CSharpScript` "last expression yields result" shorthand no longer applies);
|
||||
every script in the existing corpus already uses explicit `return` so this is a doc-only
|
||||
change for new authors.
|
||||
|
||||
### Core.Scripting-009
|
||||
|
||||
@@ -336,3 +367,390 @@ a script logging at Error level produces both a `scripts-*.log` event and a comp
|
||||
Warning event.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — added three new test files: `ScriptSandboxBuildTests` covers the `Build` null / non-`ScriptContext` / base-class / concrete-subclass paths; `ScriptContextTests` locks `Deadband` boundary semantics (equal-to-tolerance returns false; just-over returns true; symmetric in direction; zero-tolerance returns true only on non-equal; negative tolerance trips on any non-equal); the new `Factory_plus_companion_sink_integration_surfaces_script_error_in_both_logs` test in `ScriptLogCompanionSinkTests` wires `ScriptLoggerFactory` + the companion sink together end-to-end and asserts an Error emission lands in both the scripts sink (at Error) and the main sink (at Warning), each tagged with `ScriptName`. Suite now 101 green (was 85 before).
|
||||
|
||||
### Core.Scripting-012
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | High |
|
||||
| Category | Security |
|
||||
| Location | `ForbiddenTypeAnalyzer.cs:60-76`, `ScriptSandbox.cs:96-126` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The Core.Scripting-008 rewrite broadened the BCL references list
|
||||
from a narrow allow-list (`System.Private.CoreLib` + `System.Linq` only) to the
|
||||
full `TRUSTED_PLATFORM_ASSEMBLIES` set filtered to `System.*` + `netstandard` +
|
||||
`Microsoft.Win32.Registry`. This change correctly delegates the security gate to
|
||||
`ForbiddenTypeAnalyzer` (the new comment in `ScriptSandbox` calls this out
|
||||
explicitly), but the analyzer's deny-list has not been expanded to match the new
|
||||
attack surface, and three categories of dangerous BCL types in the `System.*`
|
||||
allow-listed assemblies are now reachable from script source:
|
||||
|
||||
1. **`System.Threading.ThreadPool`** (in namespace `System.Threading`). The
|
||||
Core.Scripting-003 fix added `System.Threading.Tasks` to deny `Task.Run` /
|
||||
`Parallel` fan-out because background work that outlives the per-evaluation
|
||||
timeout is the explicit threat. `ThreadPool.QueueUserWorkItem`,
|
||||
`ThreadPool.UnsafeQueueUserWorkItem`, and `ThreadPool.RegisterWaitForSingleObject`
|
||||
are exactly the same threat — they schedule background work that outlives the
|
||||
`WaitAsync(Timeout)` budget and tie up worker threads — but `System.Threading`
|
||||
itself is allowed (because `CancellationToken` / `SemaphoreSlim` / `Volatile`
|
||||
live there). The Core.Scripting-003 resolution is incomplete on the new
|
||||
reference surface.
|
||||
2. **`System.Threading.Timer`** (same namespace). Schedules a background
|
||||
callback; the script returns control to the engine but the timer keeps
|
||||
firing past the evaluation budget. Same threat as `Task.Run`.
|
||||
3. **`System.Runtime.Loader.AssemblyLoadContext`** (in namespace
|
||||
`System.Runtime.Loader`, which is not denied — only `System.Runtime.InteropServices`
|
||||
is). The constructor + `LoadFromAssemblyPath` / `LoadFromStream` /
|
||||
`LoadFromAssemblyName` let a script load an arbitrary DLL into the host
|
||||
process. Pass (1) of the analyzer resolves the receiver type
|
||||
(`AssemblyLoadContext`, allowed) + the invocation symbol's containing type
|
||||
(also `AssemblyLoadContext`, allowed) and lets the call through. Pass (2)
|
||||
only inspects `TypeSyntax` nodes — if the script discards the returned
|
||||
`Assembly` (e.g. `alc.LoadFromAssemblyPath(@"C:\evil.dll");`) there is no
|
||||
`TypeSyntax` for the analyzer to walk and the call is accepted. Triggering
|
||||
execution of the loaded code from inside the sandbox is hard (most of
|
||||
`Assembly`'s surface is in `System.Reflection`, which is denied) but the
|
||||
defense-in-depth gap is real: an attacker who can author a script also
|
||||
typically controls a file path on the server (Admin UI uploads, share
|
||||
mounts) and loading an assembly is the prerequisite to every chained
|
||||
escape — module initializers, type-resolve handlers, and a future analyzer
|
||||
slip would all become exploitable.
|
||||
|
||||
In addition, two lower-impact `System.*` types are reachable that arguably
|
||||
shouldn't be: **`System.Console.SetOut`** / **`Console.SetError`** could
|
||||
redirect the host's console streams (requires constructing a
|
||||
`System.IO.TextWriter`, which is blocked, so the practical exploit is
|
||||
`Console.WriteLine` log-spam only), and **`System.Globalization.CultureInfo.DefaultThreadCurrentCulture`**
|
||||
could perturb the entire process's formatting behavior (subtle but real cross-script
|
||||
side effect).
|
||||
|
||||
The original Core.Scripting-001 finding called out the model: when an allow-listed
|
||||
namespace contains dangerous types, those types must be denied type-granularly.
|
||||
The new reference surface introduces several more such types and the deny-list
|
||||
has not been kept in sync.
|
||||
|
||||
**Recommendation:** Add `System.Threading.ThreadPool` and `System.Threading.Timer`
|
||||
to `ForbiddenFullTypeNames`. Add `System.Runtime.Loader` as a namespace prefix
|
||||
to `ForbiddenNamespacePrefixes` (every type in `System.Runtime.Loader` —
|
||||
`AssemblyLoadContext`, `AssemblyDependencyResolver`, `AssemblyLoadEventArgs` — is
|
||||
out of script scope). Consider adding `System.Console` to `ForbiddenFullTypeNames`
|
||||
to stop log-spam through the host's console streams, and at minimum document
|
||||
`CultureInfo.DefaultThreadCurrentCulture` as an accepted cross-script side
|
||||
effect. Each addition must have a regression test in `ScriptSandboxTests`
|
||||
mirroring the Core.Scripting-010 vector style. Update
|
||||
`docs/v2/implementation/phase-7-scripting-and-alarming.md` decision #6 + the
|
||||
"Sandbox escape" compliance-check row to enumerate the additions, per the
|
||||
Core.Scripting-009 doc-sync convention.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — added `System.Runtime.Loader` to
|
||||
`ForbiddenNamespacePrefixes` (the namespace-prefix form preferred over
|
||||
type-granular per the recommendation; future BCL additions to that namespace
|
||||
are denied by default). Added `System.Threading.ThreadPool` and
|
||||
`System.Threading.Timer` to `ForbiddenFullTypeNames` — both live in
|
||||
`System.Threading` shared with allowed sync primitives so they must be
|
||||
type-granular. Regression tests added to `ScriptSandboxTests`:
|
||||
`Rejects_ThreadPool_QueueUserWorkItem_at_compile`,
|
||||
`Rejects_Timer_new_at_compile`, `Rejects_AssemblyLoadContext_at_compile`.
|
||||
`docs/v2/implementation/phase-7-scripting-and-alarming.md` decision #6 +
|
||||
the Sandbox-escape compliance-check row both updated per the
|
||||
Core.Scripting-009 doc-sync convention. The two lower-impact suggestions
|
||||
from the recommendation (`System.Console`, `CultureInfo.DefaultThreadCurrentCulture`)
|
||||
were intentionally not addressed: `Console.SetOut` requires constructing
|
||||
a `System.IO.TextWriter` which is already blocked, leaving only
|
||||
`Console.WriteLine` log-spam (annoyance, not a security threat); and
|
||||
`CultureInfo.DefaultThreadCurrentCulture` is a cross-script side-effect
|
||||
worth knowing about but doesn't escape the sandbox. Recording both as
|
||||
accepted minor risks. Test totals after fix: Core.Scripting 107 green
|
||||
(was 104 — +3 new rejection tests).
|
||||
|
||||
### Core.Scripting-013
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Security |
|
||||
| Location | `ScriptEvaluator.cs:202-225` (`BuildWrapperSource`) |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The synthesized wrapper pastes the user's source verbatim
|
||||
between `{` and `}` braces inside a static method body, with a `#line 1`
|
||||
directive and no escaping. The legacy `CSharpScript.CreateDelegate` path was
|
||||
robust to this because Roslyn's scripting compiler parses script source as a
|
||||
top-level statement sequence; the new hand-rolled path is parsing ordinary C# in
|
||||
a method body, so a script that injects matching `{` / `}` braces can extend the
|
||||
synthesized compilation unit with additional methods, classes, or `#line`
|
||||
directives. For example, a script body of
|
||||
`return 0; } public static int Evil() { return 0; }} public static class CompiledScript2 { public static void M() {`
|
||||
ends the `Run` method early, declares a sibling `Evil` method (and even a
|
||||
sibling `CompiledScript2` class) inside the synthesized namespace, then opens an
|
||||
unclosed method that consumes the wrapper's trailing `}\n}`. With matching brace
|
||||
counts the script parses cleanly and compiles.
|
||||
|
||||
`ForbiddenTypeAnalyzer` walks every descendant of every syntax tree, so any
|
||||
forbidden BCL types named inside the injected methods are still caught — the
|
||||
finding is **not** a direct sandbox escape. However:
|
||||
|
||||
- It silently relaxes the operator-visible authoring contract documented in
|
||||
`docs/VirtualTags.md` ("scripts are statement bodies that end with an
|
||||
explicit `return …;`") to "scripts can be any compilable C# inside the
|
||||
`CompiledScript` namespace" — operators have access to features the design
|
||||
did not intend to expose (local types defined as siblings of `Run`, custom
|
||||
module initializers via attributes, etc.).
|
||||
- A script can embed its own `#line` directives that override the
|
||||
`#line 1` we emit just above the user source, producing misleading error
|
||||
locations in compiler diagnostics surfaced to the operator.
|
||||
- Future hardening that relies on syntactic-shape assumptions (e.g.
|
||||
"every script has exactly one method") would silently fail.
|
||||
- It widens the analyzer's surface: the analyzer's correctness now depends on
|
||||
Pass (2) correctly walking every conceivable C# construct that can name a
|
||||
type, including ones a normal script body would never contain
|
||||
(`UnmanagedCallersOnly` attribute, function pointer types `delegate*<...>`,
|
||||
pattern types, switch arm types, …).
|
||||
|
||||
**Recommendation:** Either (a) reject scripts whose parsed body contains
|
||||
declarations other than statements — walk the wrapper's syntax tree after parse
|
||||
and require that the only members of `CompiledScript` are the single `Run`
|
||||
method, raising a `CompilationErrorException` if anything else appears — or
|
||||
(b) parse the user source independently as a `BlockSyntax` and inject the
|
||||
parsed block as the method body via the Roslyn syntax API, which makes
|
||||
brace-mismatched / class-injecting source unparseable. Add a regression test
|
||||
covering at least the brace-injection vector
|
||||
(`return 0; } public static int Evil() { return 0;`).
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — took option (a) from the recommendation:
|
||||
added an `EnforceSingleRunMember` step to `ScriptEvaluator.Compile` (runs after
|
||||
`CSharpSyntaxTree.ParseText` of the synthesized wrapper, before Roslyn
|
||||
compile). The check requires exactly one type declaration in the compilation
|
||||
unit (the `CompiledScript` class) AND exactly one member on that class (the
|
||||
`Run` method). Any deviation — a sibling class, an additional namespace, a
|
||||
sibling method or nested type alongside `Run` — throws
|
||||
`CompilationErrorException` with diagnostic IDs `LMX001` / `LMX002` and a
|
||||
message that names Core.Scripting-013 and points at the offending span. Two
|
||||
regression tests added: `Rejects_sibling_method_injection_via_balanced_braces`
|
||||
(injects a sibling method via `} public static int Evil() { …`) and
|
||||
`Rejects_sibling_class_injection_via_balanced_braces` (injects an entire
|
||||
sibling namespace + class). Option (b) (parse the user source independently
|
||||
as a `BlockSyntax` and inject via Roslyn syntax API) was considered but the
|
||||
parse-and-validate approach is more readable, gives clearer error messages,
|
||||
and keeps the wrapper-source generation textual.
|
||||
|
||||
### Core.Scripting-014
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Concurrency & thread safety |
|
||||
| Location | `CompiledScriptCache.cs:91-103` (`Clear`) |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `Clear()` snapshots `_cache.Keys.ToArray()` then iterates,
|
||||
calling `TryRemove(key, out var lazy)` on each — the key-only overload, not
|
||||
the value-scoped one used in `GetOrCompile`'s catch block. Between the
|
||||
snapshot and a given `TryRemove`, a concurrent `GetOrCompile(scriptSource)`
|
||||
call that hashes to the same key can re-insert a fresh `Lazy` whose `.Value`
|
||||
the caller already retained. The unconditional `TryRemove` then removes that
|
||||
fresh `Lazy` and `DisposeLazyIfMaterialised(lazy)` calls `Dispose()` on its
|
||||
evaluator — unloading the ALC while the concurrent caller still holds a
|
||||
reference to the evaluator and intends to invoke it.
|
||||
|
||||
This is exactly the race-window pattern the Core.Scripting-006 resolution
|
||||
fixed in `GetOrCompile`'s catch block (the test
|
||||
`Failed_compile_eviction_does_not_remove_a_concurrent_retry_entry` locks it
|
||||
there). `Clear()` carries the same shape but uses the older, value-blind
|
||||
overload, so the same race that finding-006 addresses is still latent on the
|
||||
publish-replace path.
|
||||
|
||||
In current production wiring `Clear()` is intended for config-publish + tests
|
||||
— neither overlaps steady-state evaluation under the documented design — so
|
||||
the in-practice impact is low. But the cache is checked in as the
|
||||
forward-looking compile cache for the engines (per `Script.SourceHash`'s docs
|
||||
and the cache's own remarks); a future wiring that calls `Clear()` from
|
||||
publish while evaluations are in flight would dispose live evaluators.
|
||||
|
||||
**Recommendation:** Replace the snapshot + `TryRemove(key, out var lazy)`
|
||||
sequence with an enumeration that captures the `Lazy` reference at snapshot
|
||||
time and uses the value-scoped `TryRemove(KeyValuePair<,>)` overload, mirroring
|
||||
the Core.Scripting-006 fix:
|
||||
|
||||
```csharp
|
||||
foreach (var entry in _cache.ToArray())
|
||||
{
|
||||
if (_cache.TryRemove(entry))
|
||||
DisposeLazyIfMaterialised(entry.Value);
|
||||
}
|
||||
```
|
||||
|
||||
Add a regression test that races `GetOrCompile` against `Clear` and asserts
|
||||
the caller's evaluator is still usable.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — applied the recommendation verbatim:
|
||||
replaced `foreach (var key in _cache.Keys.ToArray())` + key-only
|
||||
`TryRemove(key, out var lazy)` with `foreach (var entry in _cache.ToArray())` +
|
||||
value-scoped `TryRemove(entry)` (the `KeyValuePair<,>` overload). A concurrent
|
||||
GetOrCompile re-add between the snapshot and the remove inserts a fresh Lazy
|
||||
under the same key; the value-scoped comparison sees the mismatch and leaves
|
||||
the fresh entry intact (instead of evicting + disposing the live evaluator
|
||||
the concurrent caller still holds). Regression test
|
||||
`Clear_uses_value_scoped_TryRemove_so_a_race_inserted_entry_survives` added
|
||||
to `CompiledScriptCacheTests` — single-threaded simulation that snapshots
|
||||
the dict, mutates the entry to a fresh Lazy mid-flight, drives the same
|
||||
value-scoped TryRemove overload Clear now uses, and asserts the fresh entry
|
||||
survives. The two-thread race would be flaky to model directly; the
|
||||
single-threaded semantic test is sufficient because the fix is the
|
||||
overload-selection itself.
|
||||
|
||||
### Core.Scripting-015
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `ScriptEvaluator.cs:234-270` (`ToCSharpTypeName`) |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `ToCSharpTypeName` is documented to handle nested types
|
||||
(`Outer+Inner` → `Outer.Inner`) via `Replace('+', '.')` for the
|
||||
non-generic path (line 269) but the generic path (line 263-266) constructs the
|
||||
name from `def.FullName!` then takes a substring up to the backtick. For a
|
||||
**nested generic** type — e.g. `Outer.Inner<T>` whose `FullName` is
|
||||
`Outer+Inner`1` — `Replace('+', '.')` is applied first, then `Substring(0, IndexOf('`'))`
|
||||
on `"Outer.Inner`1"` produces `"Outer.Inner"`, which is correct. Good.
|
||||
|
||||
However, the generic branch does NOT handle the case where the OPEN generic
|
||||
type itself is nested with `+` inside the parent's name when the parent is
|
||||
also generic (`Outer<TOuter>.Inner<TInner>` — `FullName` is
|
||||
`Outer`1+Inner`1[[TOuter,TInner]]`). For that shape `Substring(0, IndexOf('`'))`
|
||||
truncates at the first backtick — yielding `"Outer.Inner"` — silently dropping
|
||||
the closed type arguments of `Outer<TOuter>`. The resulting source string is
|
||||
syntactically valid but semantically wrong: `global::Outer.Inner<TInner>` does
|
||||
not name `Outer<TOuter>.Inner<TInner>`.
|
||||
|
||||
The production code never hits this shape — `TResult` is always one of
|
||||
`object?`, `bool`, `int`, `double`, `string?`, `DateTime` across the
|
||||
virtual-tag engine, the alarm engine, the test-harness, and the test suite,
|
||||
and `ScriptGlobals<TContext>` is always a top-level generic over a top-level
|
||||
`ScriptContext` subclass. The bug is latent. But it is a foot-gun for a
|
||||
future caller (e.g. a Phase-8 driver that wires a context type defined as a
|
||||
nested generic for grouping reasons) and the XML-doc comment claims
|
||||
"handles nested types" without qualifying it.
|
||||
|
||||
A second smaller correctness gap on the same path: the comment claims
|
||||
`global::`-qualified FQNs prevent accidental capture by the wrapper's `using`
|
||||
directives, which is true for the generic / non-generic branches, but the
|
||||
primitive aliases (`bool`, `int`, `string`, `object`, …) are emitted unqualified.
|
||||
A script that defines a local `class bool` (now possible per Core.Scripting-013)
|
||||
would shadow the alias. Probably benign, but worth a comment.
|
||||
|
||||
**Recommendation:** Add a check in the generic branch that walks the FullName
|
||||
backtick-by-backtick — or use `INamedTypeSymbol`-style name composition from
|
||||
`def.DeclaringType` recursively — so multi-arity-nested generics emit
|
||||
correctly. At minimum update the XML doc to qualify "handles nested types" as
|
||||
"handles single-level nesting; nested generics whose parent is itself generic
|
||||
are not supported". Add a `ToCSharpTypeName` unit test (currently nothing
|
||||
exercises this method directly — coverage relies on the end-to-end compile path,
|
||||
so the bug surfaces only as a misleading Roslyn diagnostic).
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — rewrote the generic-type branch of
|
||||
`ToCSharpTypeName` to walk the `FullName` segment-by-segment (split on `.`
|
||||
after `+ → .` substitution). For each segment ending in `Name\`N`, the
|
||||
algorithm consumes N generic arguments from `t.GetGenericArguments()` in
|
||||
order and emits them as `<…>` on that segment. Nested generic-in-generic
|
||||
shapes (`Outer<T>.Inner<U>`) now emit as
|
||||
`global::Ns.Outer<T>.Inner<U>` (valid C#) rather than the pre-fix
|
||||
`global::Ns.Outer<T, U>` (which dropped the segment boundary entirely
|
||||
because `IndexOf('`')` truncated at the first backtick). No production
|
||||
caller exercises this shape today (all `TContext` / `TResult` types in
|
||||
the codebase are top-level non-nested), so the fix is preemptive — but
|
||||
the algorithm is now correct for any future nested-generic context type.
|
||||
|
||||
### Core.Scripting-016
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Performance & resource management |
|
||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:74-117`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/ScriptedAlarmEngine.cs:139-182` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The Core.Scripting-008 resolution introduced
|
||||
`ScriptEvaluator.IDisposable` + `CompiledScriptCache.Clear()` that disposes
|
||||
each materialised evaluator before dropping its dictionary entry, so per-publish
|
||||
ALC accretion is no longer process-lifetime rooted **inside the cache**. But
|
||||
neither production consumer of `ScriptEvaluator` uses the cache — both
|
||||
`VirtualTagEngine.Load` and `ScriptedAlarmEngine.LoadAsync` call
|
||||
`ScriptEvaluator<TContext, TResult>.Compile(...)` directly (lines 105 / 160
|
||||
respectively), store the evaluator inside an internal `VirtualTagState` /
|
||||
`AlarmState` record, and on the next `Load` simply call `_tags.Clear()` /
|
||||
`_alarms.Clear()`. The dropped `ScriptEvaluator` references never have
|
||||
`Dispose()` called on them, so the underlying `ScriptAssemblyLoadContext`
|
||||
instances are never `Unload()`-ed. The .NET runtime guarantees that a
|
||||
collectible ALC stays alive until `Unload()` is called explicitly — having
|
||||
"no strong references" is necessary but not sufficient. So the publish-replace
|
||||
cycle leaks every prior generation's emitted assembly exactly as before the
|
||||
fix, even though the fix's infrastructure is in place.
|
||||
|
||||
The Core.Scripting-008 regression tests in `CompiledScriptCacheTests`
|
||||
(`Dispose_unloads_compiled_script_assembly_load_context` /
|
||||
`Clear_disposes_every_materialised_evaluator`) prove the contract on
|
||||
`CompiledScriptCache`, but neither engine uses that class. There is no
|
||||
integration test exercising the actual publish path — i.e. that calling
|
||||
`VirtualTagEngine.Load(...)` twice with different definitions makes the prior
|
||||
generation's ALC eligible for GC. As a result the fix's headline guarantee
|
||||
("Server restarts are no longer required to reclaim compiled-script memory" —
|
||||
`docs/VirtualTags.md`) is not actually delivered to the production engines.
|
||||
|
||||
This is the same observable behavior the original Core.Scripting-008 finding
|
||||
described, surfacing on a different code path that the resolution did not touch.
|
||||
|
||||
**Recommendation:** Either route the engines' compile path through
|
||||
`CompiledScriptCache<TContext, TResult>` (the documented design — the cache
|
||||
already returns the same evaluator instance for identical source, and its
|
||||
`Clear()` now performs the right disposal — and `Script.SourceHash`'s doc-comment
|
||||
already names this as the cache key), or make the engines' `Load` methods
|
||||
dispose the previous `ScriptEvaluator` instances before reassigning. The
|
||||
former is the cleaner change because it also collapses redundant compiles
|
||||
across publishes for unchanged scripts. Add an integration test along the
|
||||
lines of `CompiledScriptCacheTests.Clear_disposes_every_materialised_evaluator`
|
||||
for each engine: snapshot the per-evaluator emitted assembly via
|
||||
`WeakReference`, call `Load(...)` with a different definition set, and assert
|
||||
the prior generation's assemblies become collectable.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — took the cleaner route from the
|
||||
recommendation: routed both engines' compile paths through
|
||||
`CompiledScriptCache<TContext, TResult>`. `VirtualTagEngine` and
|
||||
`ScriptedAlarmEngine` each gained a private `_compileCache` instance field,
|
||||
their `Load`/`LoadAsync` methods now call `_compileCache.GetOrCompile(source)`
|
||||
instead of `ScriptEvaluator.Compile(source)` directly, and the cache is cleared
|
||||
on publish-replace alongside the existing `_tags` / `_alarms` clears so the
|
||||
prior generation's ALCs are disposed before recompile. Engine `Dispose` now
|
||||
also calls `_compileCache.Dispose()` so the engine-shutdown path actually
|
||||
releases the emitted assemblies. **Side-fix:** discovered + fixed an
|
||||
adjacent bug in `CompiledScriptCache.Dispose()` itself — it set
|
||||
`_disposed = true` before calling `Clear()`, but `Clear()`'s pre-existing
|
||||
`if (_disposed) return` guard then aborted the drain unconditionally, so
|
||||
the Dispose-triggered cleanup was a silent no-op. Removed the disposed-guard
|
||||
on `Clear()` (the operation is idempotent — clearing an empty/cleared cache
|
||||
is safe). Without this side-fix the engine-Dispose path would have left
|
||||
the cached evaluators rooted forever even though the call chain looked
|
||||
correct. **Side-fix for ScriptedAlarmEngine.Dispose:** moved the pre-existing
|
||||
"do NOT clear `_alarms` here" comment to "clear `_alarms` AFTER the drain"
|
||||
because the AlarmState records hold the `TimedScriptEvaluator`/`ScriptEvaluator`
|
||||
delegates that root the emitted assembly — leaving them in `_alarms` after
|
||||
Dispose was the same root-the-script-forever pattern this finding is about,
|
||||
just on the engine side rather than the cache side. The `_alarms` clear is
|
||||
safe after the `Task.WhenAll` drain because that drain guarantees no
|
||||
background callback is mid-flight. Regression tests added:
|
||||
`VirtualTagEngineTests.Dispose_unloads_compiled_script_assembly` and
|
||||
`ScriptedAlarmEngineTests.Dispose_unloads_compiled_predicate_assembly` —
|
||||
each uses `WeakReference` + bounded `GC.Collect()` to prove the emitted
|
||||
assembly is reclaimable after `engine.Dispose()`. **Important test pattern
|
||||
detail:** the alarms test originally failed because its helper was
|
||||
`async Task<WeakReference>` — async state machines capture locals as
|
||||
state-struct fields and can keep them alive past the method's apparent end.
|
||||
Rewrote as a synchronous helper using `LoadAsync(...).GetAwaiter().GetResult()`
|
||||
inside two cooperating `[MethodImpl(MethodImplOptions.NoInlining)]` helpers
|
||||
(`CompileAlarmAndCaptureWeak` + `ExtractEmittedAssemblyWeakRef`) so the
|
||||
intermediate reflection locals die when each helper returns. Test totals
|
||||
after fix: Core.Scripting 104 green (unchanged); VirtualTags 57 green (was
|
||||
56 — +1 unload test); ScriptedAlarms 67 green (was 66 — +1 unload test).
|
||||
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-22 |
|
||||
| Commit reviewed | `76d35d1` |
|
||||
| Review date | 2026-05-23 |
|
||||
| Commit reviewed | `a9be809` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -16,7 +16,7 @@ a category produced nothing rather than leaving it blank.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | Driver.Cli.Common-001, Driver.Cli.Common-002 |
|
||||
| 1 | Correctness & logic bugs | Driver.Cli.Common-001, Driver.Cli.Common-002, Driver.Cli.Common-007 |
|
||||
| 2 | OtOpcUa conventions | No issues found |
|
||||
| 3 | Concurrency & thread safety | Driver.Cli.Common-003 |
|
||||
| 4 | Error handling & resilience | Driver.Cli.Common-004 |
|
||||
@@ -24,9 +24,47 @@ a category produced nothing rather than leaving it blank.
|
||||
| 6 | Performance & resource management | No issues found |
|
||||
| 7 | Design-document adherence | No issues found |
|
||||
| 8 | Code organization & conventions | No issues found |
|
||||
| 9 | Testing coverage | Driver.Cli.Common-005 |
|
||||
| 9 | Testing coverage | Driver.Cli.Common-005, Driver.Cli.Common-008 |
|
||||
| 10 | Documentation & comments | Driver.Cli.Common-006 |
|
||||
|
||||
## Re-review 2026-05-23 (commit `a9be809`)
|
||||
|
||||
Delta scope: commit `5a9c459` extends the `FormatStatus` shortlist with five
|
||||
`Bad*` codes (`BadInternalError` 0x80020000, `BadNotWritable` 0x803B0000,
|
||||
`BadOutOfRange` 0x803C0000, `BadNotSupported` 0x803D0000, `BadDeviceFailure`
|
||||
0x80550000) the FOCAS / AbCip / AbLegacy native-protocol mappers emit. Tests
|
||||
extended with parallel `[InlineData]` rows on the well-known Theory plus a new
|
||||
`FormatStatus_names_native_driver_emitted_codes` Theory.
|
||||
|
||||
Cross-checked the five new hex literals against the OPC Foundation
|
||||
`Opc.Ua.StatusCodes` table via DeepWiki:
|
||||
|
||||
| Name added | Code in shortlist | Spec value | Verdict |
|
||||
|---|---|---|---|
|
||||
| `BadInternalError` | `0x80020000` | `0x80020000` | Correct |
|
||||
| `BadNotWritable` | `0x803B0000` | `0x803B0000` | Correct |
|
||||
| `BadOutOfRange` | `0x803C0000` | `0x803C0000` | Correct |
|
||||
| `BadNotSupported` | `0x803D0000` | `0x803D0000` | Correct |
|
||||
| `BadDeviceFailure` | `0x80550000` | **`0x808B0000`** | **WRONG — `0x80550000` is `BadSecurityPolicyRejected`** |
|
||||
|
||||
The `BadDeviceFailure` mismapping is the same shape of bug as the original
|
||||
Driver.Cli.Common-001 (wrong hex literal copied into the shortlist); recorded
|
||||
as Driver.Cli.Common-007. The wrong constant also lives in
|
||||
`FocasStatusMapper.cs`, `AbCipStatusMapper.cs`, `AbLegacyStatusMapper.cs`,
|
||||
`TwinCATStatusMapper.cs`, `S7Driver.cs`, and `ModbusDriver.cs` — those are in
|
||||
other modules' review scope but are noted here so future re-reviewers know
|
||||
this isn't isolated. (`StatusCodeMap.cs` in Driver.Galaxy + the Wonderware
|
||||
historian mappers use the correct `0x808B0000`, confirming the discrepancy.)
|
||||
|
||||
Testing observation: the new `FormatStatus_names_native_driver_emitted_codes`
|
||||
Theory is fully redundant with the well-known Theory (the five rows were also
|
||||
added there in the same commit) and uses `ShouldContain` rather than
|
||||
`ShouldBe` — recorded as Driver.Cli.Common-008.
|
||||
|
||||
Other categories (concurrency, security, performance, design-doc adherence,
|
||||
code organisation, documentation) are unchanged by this delta — no new
|
||||
issues found.
|
||||
|
||||
## Findings
|
||||
|
||||
### Driver.Cli.Common-001
|
||||
@@ -205,3 +243,122 @@ to state the source-time column is the right-most one and intentionally not
|
||||
measured/padded, calling out the null-timestamp `"-"` case explicitly. (2) FOCAS was
|
||||
added to the `DriverCommandBase` class-summary driver enumeration in commit `7ff356b`
|
||||
(landed alongside the -003 work).
|
||||
|
||||
### Driver.Cli.Common-007
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | High |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:129` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** Commit `5a9c459` added `0x80550000u => "BadDeviceFailure"` to the
|
||||
`FormatStatus` shortlist, but `0x80550000` is the canonical OPC UA spec value for
|
||||
`BadSecurityPolicyRejected`, not `BadDeviceFailure`. The correct spec value for
|
||||
`BadDeviceFailure` is `0x808B0000` (verified against the OPC Foundation
|
||||
`Opc.Ua.StatusCodes` table via DeepWiki; corroborated locally by
|
||||
`src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/StatusCodeMap.cs:40`
|
||||
(`BadDeviceFailure = 0x808B0000u`) and the two Wonderware historian quality mappers,
|
||||
which all hand-pin the correct value).
|
||||
|
||||
This is the same shape of bug Driver.Cli.Common-001 closed: a wrong hex literal
|
||||
in the shortlist that the test theory (`SnapshotFormatterTests.cs:42`) blindly
|
||||
asserts against the same wrong value, so the bug is invisible to CI.
|
||||
|
||||
Practical impact, two-sided:
|
||||
1. A driver that returns the real spec `BadDeviceFailure` (`0x808B0000`) — e.g.
|
||||
the `Driver.Galaxy.StatusCodeMap` path on a deploy-time device fault — falls
|
||||
through the named shortlist entirely. Since Driver.Cli.Common-002 added the
|
||||
severity-class fallback, it now renders as `0x808B0000 (Bad)` instead of
|
||||
`0x808B0000 (BadDeviceFailure)` — operators lose the specific class label
|
||||
`docs/Driver.FOCAS.Cli.md:153` tells them to read off the output.
|
||||
2. A driver that returns `0x80550000` (which `FocasStatusMapper`, `AbCipStatusMapper`,
|
||||
`AbLegacyStatusMapper`, `TwinCATStatusMapper`, `S7Driver`, and `ModbusDriver` all
|
||||
misuse as "BadDeviceFailure") now renders as `0x80550000 (BadDeviceFailure)` —
|
||||
matching driver intent but contradicting the OPC UA spec, which says any client
|
||||
that decodes the same payload using the OPC Foundation stack will see
|
||||
`BadSecurityPolicyRejected`. A security-monitoring tool keying on
|
||||
`BadSecurityPolicyRejected` will fire on a CPU fault, while real
|
||||
`BadSecurityPolicyRejected` returns from the secure-channel layer would be
|
||||
mislabelled as a device fault. Operator-facing CLI output and machine-readable
|
||||
status semantics disagree.
|
||||
|
||||
The deeper bug is the wrong constant in the native-protocol mappers (out of scope
|
||||
for this module), but the `SnapshotFormatter` shortlist is its own
|
||||
spec-authoritative reference point — Driver.Cli.Common-001 explicitly framed the
|
||||
shortlist as canonical, with the in-line "keep [these literals] in sync with [the
|
||||
Opc.Ua.StatusCodes] table" comment at `SnapshotFormatter.cs:112-113`. That
|
||||
contract is now broken.
|
||||
|
||||
**Recommendation:** Change line 129 to `0x808B0000u => "BadDeviceFailure"`. Update
|
||||
the matching `[InlineData]` rows in `SnapshotFormatterTests.cs` (line 42 in the
|
||||
well-known Theory; line 60 in the redundant Theory — see Driver.Cli.Common-008).
|
||||
Also note in the resolution that the native-protocol mappers (FOCAS / AbCip /
|
||||
AbLegacy / TwinCAT / S7 / Modbus) need the same fix recorded against their own
|
||||
module reviews — the constant `0x80550000` should be replaced with `0x808B0000`
|
||||
everywhere it claims to mean `BadDeviceFailure`. Consider Driver.Cli.Common-001's
|
||||
original recommendation again: add a CI test that cross-checks every shortlist
|
||||
entry against `Opc.Ua.StatusCodes` reflection so this class of bug stops
|
||||
recurring.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — corrected `SnapshotFormatter.FormatStatus`
|
||||
to map `0x808B0000u => "BadDeviceFailure"` (was `0x80550000u`). Updated the
|
||||
`InlineData` row in the well-known Theory accordingly; the redundant native-
|
||||
emitted Theory was deleted entirely per Driver.Cli.Common-008. Added a regression
|
||||
row to `FormatStatus_does_not_apply_pre_fix_wrong_names` pinning that
|
||||
`0x80550000` no longer renders as `BadDeviceFailure` (mirroring the
|
||||
Driver.Cli.Common-001 wrong-name guards). The underlying constant was also
|
||||
corrected in all six native-protocol mappers as part of the same commit:
|
||||
`FocasStatusMapper.BadDeviceFailure`, `AbCipStatusMapper.BadDeviceFailure`,
|
||||
`AbLegacyStatusMapper.BadDeviceFailure`, `TwinCATStatusMapper.BadDeviceFailure`,
|
||||
`ModbusDriver.StatusBadDeviceFailure`, `S7Driver.StatusBadDeviceFailure` — all
|
||||
moved from `0x80550000u` to `0x808B0000u`. The three downstream Modbus tests
|
||||
(`ModbusExceptionMapperTests` 3 InlineData rows + 1 ShouldBe assertion;
|
||||
`ExceptionInjectionTests.StatusBadDeviceFailure` constant) updated to expect
|
||||
the corrected code. **Behavior change:** OPC UA clients consuming the native
|
||||
drivers now see the canonical `BadDeviceFailure` (0x808B0000) instead of the
|
||||
misnamed `BadSecurityPolicyRejected` (0x80550000) on device-fault paths —
|
||||
operator-facing CLI output and machine-readable status semantics now agree.
|
||||
Suite totals after fix: Driver.Cli.Common.Tests 43 green (was 48 — minus 5
|
||||
redundant rows); Modbus.Tests 263; AbCip.Tests 262; AbLegacy.Tests 157;
|
||||
FOCAS.Tests 178; S7.Tests 112; TwinCAT.Tests 131; all green. The Opc.Ua.StatusCodes
|
||||
cross-check the recommendation suggested is recorded as a follow-up worth
|
||||
considering but is out of scope for this fix.
|
||||
|
||||
### Driver.Cli.Common-008
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Testing coverage |
|
||||
| Location | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests/SnapshotFormatterTests.cs:50-64` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** Commit `5a9c459` adds a new
|
||||
`FormatStatus_names_native_driver_emitted_codes` `[Theory]` whose five
|
||||
`[InlineData]` rows are identical to five rows added to the existing
|
||||
`FormatStatus_names_well_known_status_codes` `[Theory]` in the same commit
|
||||
(lines 32, 39, 40, 41, 42). The new Theory therefore adds no coverage. It is
|
||||
also weaker than the Theory it duplicates: it asserts
|
||||
`output.ShouldContain($"({expectedName})")` (substring match) where the
|
||||
well-known Theory asserts `output.ShouldBe($"0x{status:X8} ({expectedName})")`
|
||||
(exact match including the hex prefix). The substring form would not catch a
|
||||
regression where the hex literal renders wrong but the name is correct.
|
||||
|
||||
This is not a correctness problem — both Theories pass — but it's a
|
||||
copy-paste inconsistency that costs maintainer attention every time someone
|
||||
reads the test file and wonders which Theory is authoritative.
|
||||
|
||||
**Recommendation:** Either (a) delete the new Theory entirely — its five rows
|
||||
are already covered by the well-known Theory in the same commit — or (b) keep
|
||||
it but switch to `ShouldBe($"0x{status:X8} ({expectedName})")` so its
|
||||
assertion strength matches the rest of the file. Option (a) is cleaner: the
|
||||
commit's "operator workflow" intent is documented well enough in the
|
||||
well-known Theory comment block; the redundant Theory is dead weight.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — took option (a): deleted the
|
||||
`FormatStatus_names_native_driver_emitted_codes` Theory entirely. Its five
|
||||
`InlineData` rows are covered by the well-known Theory's `ShouldBe` (strict
|
||||
exact-match assertion), which is the authoritative shortlist test. Landed
|
||||
alongside the Driver.Cli.Common-007 fix in the same commit.
|
||||
|
||||
@@ -187,7 +187,7 @@ the source-level convention against regression.
|
||||
| Severity | Low |
|
||||
| Category | Design-document adherence |
|
||||
| Location | `Commands/WriteCommand.cs:50`, `Commands/ProbeCommand.cs:50` (via `SnapshotFormatter.FormatStatus`) |
|
||||
| Status | Deferred |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `docs/Driver.FOCAS.Cli.md` documents `BadDeviceFailure` and
|
||||
`BadCommunicationError` as the key diagnostic signals an operator reads off
|
||||
@@ -214,14 +214,18 @@ actually emit — at minimum `BadNotWritable`, `BadOutOfRange`, `BadNotSupported
|
||||
because the gap defeats this module documented `probe`/`write` diagnostic
|
||||
workflow; cross-reference the `Driver.Cli.Common` review.
|
||||
|
||||
**Resolution:** Deferred 2026-05-23 — the recommended fix lives in
|
||||
`SnapshotFormatter.FormatStatus` inside the `Driver.Cli.Common` shared module,
|
||||
which is outside this module's edit scope. Driver.Cli.Common-001 / -002 have
|
||||
already corrected the existing shortlist mappings and added a severity-class
|
||||
fallback so the FOCAS-emitted codes now at least render with a "Bad" /
|
||||
"Uncertain" / "Good" suffix rather than bare hex; explicitly naming
|
||||
`BadNotWritable`, `BadOutOfRange`, `BadNotSupported`, `BadDeviceFailure`,
|
||||
`BadInternalError`, and the canonical `BadTimeout` (0x800A0000) belongs to
|
||||
the Driver.Cli.Common review's follow-up (and benefits every driver CLI, not
|
||||
just FOCAS). Re-open here only if Driver.Cli.Common declines to extend the
|
||||
shortlist.
|
||||
**Resolution:** Resolved 2026-05-23 — the cross-CLI fix landed in `Driver.Cli.Common`:
|
||||
`SnapshotFormatter.FormatStatus` now names `BadInternalError` (0x80020000),
|
||||
`BadNotWritable` (0x803B0000), `BadOutOfRange` (0x803C0000), `BadNotSupported`
|
||||
(0x803D0000), and `BadDeviceFailure` (0x80550000) — the five codes the FOCAS /
|
||||
AbCip / AbLegacy native-protocol mappers all emit but the shortlist previously
|
||||
left unnamed (the canonical `BadTimeout` 0x800A0000 was already added under
|
||||
Driver.Cli.Common-001). FOCAS `probe` / `write` against a non-writable parameter,
|
||||
out-of-range address, unsupported function, busy device, or CNC-handle failure
|
||||
now renders with the named status the `docs/Driver.FOCAS.Cli.md` workflow
|
||||
promises, restoring parity between the docs and the shipped behaviour. Regression
|
||||
`[Theory]` `FormatStatus_names_native_driver_emitted_codes` added to
|
||||
`SnapshotFormatterTests` so the five names can't silently drop out of the
|
||||
shortlist again; the existing well-known shortlist `[Theory]` was extended with
|
||||
the same five entries to enforce the exact `0x... (Name)` rendering. Suite now
|
||||
47 green (was 42).
|
||||
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-22 |
|
||||
| Commit reviewed | `76d35d1` |
|
||||
| Review date | 2026-05-23 |
|
||||
| Commit reviewed | `a9be809` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -17,12 +17,39 @@
|
||||
| 2 | OtOpcUa conventions | Driver.Galaxy-005 |
|
||||
| 3 | Concurrency & thread safety | Driver.Galaxy-006, Driver.Galaxy-007 |
|
||||
| 4 | Error handling & resilience | Driver.Galaxy-001, Driver.Galaxy-008, Driver.Galaxy-009 |
|
||||
| 5 | Security | Driver.Galaxy-010 |
|
||||
| 6 | Performance & resource management | Driver.Galaxy-011, Driver.Galaxy-012 |
|
||||
| 7 | Design-document adherence | Driver.Galaxy-013 |
|
||||
| 5 | Security | Driver.Galaxy-010, Driver.Galaxy-015 |
|
||||
| 6 | Performance & resource management | Driver.Galaxy-011, Driver.Galaxy-012, Driver.Galaxy-016 |
|
||||
| 7 | Design-document adherence | Driver.Galaxy-013, Driver.Galaxy-017 |
|
||||
| 8 | Code organization & conventions | No issues found |
|
||||
| 9 | Testing coverage | Driver.Galaxy-014 |
|
||||
| 10 | Documentation & comments | Driver.Galaxy-005, Driver.Galaxy-013 |
|
||||
| 10 | Documentation & comments | Driver.Galaxy-005, Driver.Galaxy-013, Driver.Galaxy-018 |
|
||||
|
||||
## Re-review 2026-05-23 (commit `a9be809`)
|
||||
|
||||
The only code-affecting change since `76d35d1` was commit `994997b` — the
|
||||
sibling `mxaccessgw` repo restructured (the `clients/dotnet/MxGateway.Client`
|
||||
project path and the `MxGateway.Contracts.Proto` namespace both moved), and
|
||||
the driver's path-based `ProjectReference` started producing 87 build errors
|
||||
solution-wide. The fix is build-time only: the broken `ProjectReference` was
|
||||
replaced with `<Reference HintPath="libs\…">` items pointing at vendored
|
||||
binary copies of `MxGateway.Client.dll` (99 KB, May 2026 known-good build)
|
||||
and `MxGateway.Contracts.dll` (490 KB), and five `PackageReference`s that
|
||||
the dropped project was previously providing transitively (`Google.Protobuf`,
|
||||
`Grpc.Core.Api`, `Grpc.Net.Client`, `Microsoft.Extensions.Logging.Abstractions`,
|
||||
`Polly`) were declared explicitly. The matching `Tests` csproj got the same
|
||||
binary `<Reference>` for `MxGateway.Contracts` (replacing its own broken
|
||||
`ProjectReference`). A `libs/README.md` documents what is vendored and the
|
||||
two unwinding paths (sibling restores a client library, or driver migrates
|
||||
to the new `ZB.MOM.WW.MxGateway.Contracts.Proto` namespace + reimplements
|
||||
the `MxGatewayClient` / `MxGatewaySession` / `GalaxyRepositoryClient`
|
||||
wrapper, ~2,200 LoC).
|
||||
|
||||
No `*.cs` file changed; the re-review walked only the categories that apply
|
||||
to a build-time/packaging change. Categories with no new findings:
|
||||
Correctness (1), OtOpcUa conventions (2), Concurrency (3), Error handling
|
||||
(4), Code organization (8), Testing coverage (9). Four new findings are
|
||||
recorded below (Driver.Galaxy-015..018) — none Critical, none High; two
|
||||
Medium, two Low.
|
||||
|
||||
## Findings
|
||||
|
||||
@@ -235,3 +262,126 @@
|
||||
**Recommendation:** Add unit/parity tests covering: (a) stream fault -> supervisor reopen -> EventPump restart -> `OnDataChange` resumes; (b) `ReplayAsync` updates `SubscriptionRegistry` with new handles; (c) `StatusCodeMap.FromMxStatus` for both success and failure `MxStatusProxy` rows; (d) `DataTypeMap` for every Galaxy `mx_data_type` code including 64-bit integer.
|
||||
|
||||
**Resolution:** Resolved 2026-05-22 — added `GalaxyDriverInfrastructureTests` covering `GetMemoryFootprint` (Driver.Galaxy-011) and `IAsyncDisposable` (Driver.Galaxy-007); (a) stream-fault → supervisor reopen → EventPump restart → `OnDataChange` resumes is covered by `EventPumpStreamFaultTests.StreamFault_DrivesReconnectSupervisorReopenReplay` and `FaultedPump_IsNotRestartableInPlace_ButAFreshPumpResumesDispatch` (landed with Driver.Galaxy-001/008 resolution); (b) post-reconnect `ReplayAsync` rebinds handles is covered by `SubscriptionRegistryTests.Rebind_*` suite; (c) `StatusCodeMap.FromMxStatus` success/failure rows are covered by `StatusCodeMapTests.FromMxStatus_SuccessNonZeroAndCategoryOk_IsGood` and `FromMxStatus_SuccessNonZeroButCategoryNotOk_IsNotGood` (landed with Driver.Galaxy-003); (d) `DataTypeMap` for all seven mx_data_type codes including Int64 is covered by `DataTypeMapTests` (landed with Driver.Galaxy-002).
|
||||
|
||||
### Driver.Galaxy-015
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | ~~Medium~~ Low (re-triaged 2026-05-23) |
|
||||
| Category | ~~Security~~ Documentation & comments (re-triaged 2026-05-23) |
|
||||
| Location | `libs/MxGateway.Client.dll`, `libs/MxGateway.Contracts.dll`, `libs/README.md` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** Commit `994997b` checks in two binary DLLs (`MxGateway.Client.dll`, 99 840 bytes; `MxGateway.Contracts.dll`, 489 984 bytes) under `src/Drivers/.../Driver.Galaxy/libs/` and references them via `<Reference HintPath="…" />`. These are the only checked-in binary build artefacts in the entire repo (a repo-wide `find` for non-`bin/`/`obj/` `*.dll` under `libs/` returns only these two), so the change sets a precedent. The accompanying `libs/README.md` states the DLLs are "byte-for-byte the build output" of the OtOpcUa team's own code against the gateway's open proto contracts, but there is no recorded provenance — no source-commit SHA from the sibling `mxaccessgw` repo that produced the build, no SHA-256/SHA-512 checksum, no `.gitattributes` rule marking these paths as binary (so a future churn-in-place will balloon the pack file). Without a recorded source commit + checksum it is impossible for a future reviewer/auditor to verify the binaries match a specific revision of the sibling repo — the assertion "we built them, not external" is unverifiable after the fact. Tampering or accidental swap (e.g. someone drops in a different DLL of the same name under the same path) would not be detectable.
|
||||
|
||||
**Recommendation:** (a) Pin the source provenance: add the sibling `mxaccessgw` commit SHA used to build each DLL to `libs/README.md`. (b) Record a SHA-256 of each `.dll` in `libs/README.md` so a future tamper or accidental update is detectable by running `Get-FileHash`/`sha256sum`. (c) Add a `.gitattributes` rule under `libs/` declaring `*.dll binary` (and consider `filter=lfs diff=lfs merge=lfs -text` if/when these need to be updated, to avoid bloating the pack file on every refresh). (d) Optional: a `dotnet test` time-check that compares the on-disk hash to the recorded hash, so a CI run notices if the file drifts from what the README claims.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23. **Severity re-triage:** the original
|
||||
finding framed this as a security concern about "tampering or accidental
|
||||
swap by an unknown third party"; the user clarified that the DLLs are
|
||||
their own code, built from their own `mxaccessgw` project — not third-party
|
||||
binaries. That moves the concern from security (untrusted provenance) to
|
||||
documentation (audit trail). Re-classified as Low Documentation &
|
||||
Comments. Fix: `libs/README.md` now carries a Provenance section that
|
||||
records the source-commit SHA (`dd7ca1634e2d2b8a866c81f0009bf87ee9427750`,
|
||||
extracted from the `AssemblyInformationalVersion` baked into both DLLs by
|
||||
the original build) and SHA-256 checksums of both binaries, plus a
|
||||
re-verification recipe (`sha256sum libs/*.dll` + `ilspycmd <dll> | grep
|
||||
AssemblyInformationalVersion`). Recommendations (c) `.gitattributes` and
|
||||
(d) CI hash-check deferred — the DLLs are essentially frozen until one
|
||||
of the two unwinding paths is taken, so adding LFS or a CI guard would
|
||||
add infrastructure that the unwinding step would then have to remove.
|
||||
Re-open if the vendoring becomes a recurring update target.
|
||||
|
||||
### Driver.Galaxy-016
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Performance & resource management |
|
||||
| Location | `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj:43-47`, `libs/README.md:32-37` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The five new `PackageReference` versions declared in the csproj (`Google.Protobuf` 3.34.1, `Grpc.Core.Api` 2.76.0, `Grpc.Net.Client` 2.71.0, `Microsoft.Extensions.Logging.Abstractions` 10.0.0, `Polly` 8.5.2) do not all match what the vendored `MxGateway.Client.dll` was built against. The DLL's PE metadata (extracted via `System.Reflection.Metadata`) shows references to `Grpc.Net.Client v2.0.0.0`, `Microsoft.Extensions.Logging.Abstractions v10.0.0.0`, and notably `Polly.Core v8.0.0.0` — and the source csproj just before the sibling-repo rename (commit `bd4a09a` from 2026-04-27) declared `Grpc.Net.Client` 2.76.0, `Microsoft.Extensions.Logging.Abstractions` 10.0.7, and `Polly.Core` 8.6.6 — *not* the meta-package `Polly`. Our driver pulls `Polly` 8.5.2 (which transitively pins `Polly.Core` 8.5.2 per its nuspec dependency), so the vendored client actually loads `Polly.Core` 8.5.2 at runtime against code compiled against 8.6.6. Across an 8.5 ↔ 8.6 minor delta this is usually safe (assembly-version is `v8.0.0.0` for both), but it is exactly the skew shape that surfaces as `MissingMethodException` if a 8.6-only API was used in the client. `libs/README.md` claims "versions match what the sibling repo's `ZB.MOM.WW.MxGateway.Contracts.csproj` uses so the gRPC + proto runtime stays binary-compatible" — that statement is correct only for `Google.Protobuf` and `Grpc.Core.Api`; the other three packages do not match.
|
||||
|
||||
**Recommendation:** Reconcile the declared package versions with what the vendored DLLs were built against — bump to `Grpc.Net.Client` 2.76.0, `Microsoft.Extensions.Logging.Abstractions` 10.0.7, swap `Polly` for `Polly.Core` 8.6.6 (the driver does not import the `Polly` legacy v7 surface, only Polly.Core via the client). Alternatively, rebuild the vendored DLLs against the same versions the csproj declares and refresh the binaries. Update `libs/README.md` to record the exact versions the DLLs were built against, so the next vendoring refresh has an authoritative reference.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — took the first option (reconcile
|
||||
declared packages with what the DLL was built against, verified by
|
||||
reflecting `Assembly.GetReferencedAssemblies()` on `MxGateway.Client.dll`).
|
||||
Changes to the csproj: **`Polly` 8.5.2 → `Polly.Core` 8.6.6** (the most
|
||||
consequential — `Polly` (v7 fluent API) and `Polly.Core` (v8 resilience-
|
||||
pipeline API) are different packages, and the DLL was built against
|
||||
`Polly.Core`; the prior `Polly` reference would have failed at runtime
|
||||
with `MissingMethodException` the first time the gateway client's retry
|
||||
pipeline ran). Also bumped `Grpc.Net.Client` 2.71.0 → 2.76.0 and
|
||||
`Microsoft.Extensions.Logging.Abstractions` 10.0.0 → 10.0.7 to match the
|
||||
sibling Server/Worker projects' current versions. `Google.Protobuf`
|
||||
3.34.1 and `Grpc.Core.Api` 2.76.0 already matched; left unchanged.
|
||||
`libs/README.md` rewritten to record what was actually verified
|
||||
(`Assembly.GetReferencedAssemblies()` output + the resolved package
|
||||
versions, including the sibling Server/Worker csproj as the version
|
||||
source-of-truth — the deleted MxGateway.Client.csproj would have been
|
||||
the original source but no longer exists). Verification: solution-wide
|
||||
`dotnet build` clean, Driver.Galaxy.Tests 245/245 pass against the
|
||||
corrected package set.
|
||||
|
||||
### Driver.Galaxy-017
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Design-document adherence |
|
||||
| Location | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/` (no source change), gateway proto contract |
|
||||
| Status | Deferred |
|
||||
|
||||
**Description:** The vendored `MxGateway.Contracts.dll` only carries the OLD `MxGateway.Contracts.Proto[.Galaxy]` namespace (PE-namespace dump confirms — `MxGateway.Client`, `MxGateway.Contracts`, `MxGateway.Contracts.Proto`, `MxGateway.Contracts.Proto.Galaxy` only). The sibling `mxaccessgw` repo's live `Protos/mxaccess_gateway.proto`, `mxaccess_worker.proto`, and `galaxy_repository.proto` files now generate into `ZB.MOM.WW.MxGateway.Contracts.Proto.*`. The proto wire format itself can still evolve (new RPCs, renamed fields, removed fields) and the driver has no contract-version handshake (a repo-wide search for `ContractVersion|ProtocolVersion|ApiVersion|WireVersion` in the driver returns nothing) — so a gateway service that evolves its proto past what the vendored client knows will fail silently at runtime: gRPC `UNIMPLEMENTED` for a renamed RPC, default-value reads for a removed scalar field, or worse, a wire-tag collision if a field number is reused. The risk surface grew with vendoring: previously the `ProjectReference` would have hard-failed at build time if the proto changed shape; now the driver builds green against a frozen contract that may not match the running gateway.
|
||||
|
||||
**Recommendation:** (a) Add a single `Ping`/`GetVersion` RPC call at gateway-session open, comparing the gateway's reported contract version against a string baked into `libs/README.md` (or a `GatewayContractVersion` const) and refusing the session on mismatch with a clear log. (b) Document in `libs/README.md` the exact mxaccessgw commit SHA (and proto-file SHA-256s) the vendored DLLs were built from, so a parity-rig operator can grep the live gateway for the matching commit. (c) Add a soak/parity test that asserts the live gateway's proto descriptor still matches what the vendored DLL expects — fail loud rather than degrade.
|
||||
|
||||
**Resolution:** Deferred 2026-05-23 — the recommendation's part (b)
|
||||
(record the mxaccessgw source-commit SHA in `libs/README.md`) is satisfied
|
||||
by the Driver.Galaxy-015 resolution, which records both DLLs were built
|
||||
from mxaccessgw commit `dd7ca1634e2d2b8a866c81f0009bf87ee9427750`. Parts
|
||||
(a) and (c) — adding a `GetVersion` RPC at session-open and a parity
|
||||
test against the live gateway's proto descriptor — are substantial new
|
||||
RPC + plumbing work that is not in scope for this code-review-resolution
|
||||
sweep. The risk surface is bounded because either of the two unwinding
|
||||
paths in `libs/README.md` (sibling repo restores `MxGateway.Client.csproj`,
|
||||
or this driver migrates to the new namespace) will move the codebase
|
||||
past the vendoring + close this concern naturally. Re-open if neither
|
||||
unwinding path is taken within the next quarter and the live gateway
|
||||
service does evolve its proto under the driver.
|
||||
|
||||
### Driver.Galaxy-018
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & comments |
|
||||
| Location | `libs/README.md:32-37`, `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj:40-47` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** Several small documentation issues in the vendoring artefacts:
|
||||
1. `libs/README.md` says "Versions match what the sibling repo's `ZB.MOM.WW.MxGateway.Contracts.csproj` uses" — but `ZB.MOM.WW.MxGateway.Contracts.csproj` only declares `Google.Protobuf` 3.34.1 and `Grpc.Core.Api` 2.76.0; the other three packages (`Grpc.Net.Client`, `Microsoft.Extensions.Logging.Abstractions`, `Polly`) come from the (now-deleted) `MxGateway.Client.csproj`, not the contracts csproj. The README points at the wrong source-of-truth file. See Driver.Galaxy-016 for the related version-skew issue.
|
||||
2. `libs/README.md` says the DLLs "are built against net10.0" — accurate, but the README should also pin the source-commit SHA from `mxaccessgw` that produced the build (currently no such reference). Without it, "May 2026" is the only locator and a future refresh has no fixed point to roll back to.
|
||||
3. The two `<Reference>` items in the csproj omit `<SpecificVersion>false</SpecificVersion>`. The vendored DLLs carry `AssemblyVersion 1.0.0.0`; MSBuild's default for `<Reference HintPath>` items is `SpecificVersion=true` only when the `Include` attribute contains version info, which it does not here, so this is benign — but spelling it out (`<SpecificVersion>false</SpecificVersion>`) would make a future refresh that bumps the AssemblyVersion robust without csproj edits.
|
||||
4. The csproj `<Reference Include="MxGateway.Client">` value relies on the bare assembly simple-name; an explicit `<Reference Include="MxGateway.Client, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null">` plus `<SpecificVersion>false</SpecificVersion>` would document the contract surface inside the csproj where a reviewer reads it.
|
||||
|
||||
**Recommendation:** (a) Update `libs/README.md` to (i) point at `MxGateway.Client.csproj` for the `Grpc.Net.Client`/`Microsoft.Extensions.Logging.Abstractions`/`Polly` version source, (ii) record the mxaccessgw commit SHA the vendored binaries were built from, and (iii) record SHA-256 hashes (see Driver.Galaxy-015). (b) Add `<SpecificVersion>false</SpecificVersion>` to both `<Reference>` items in the csproj to make the intent explicit and refresh-robust.
|
||||
|
||||
**Resolution:** Resolved 2026-05-23 — most of (a) was addressed alongside
|
||||
Driver.Galaxy-015 + -016: `libs/README.md` rewritten to (i) point at the
|
||||
sibling Server/Worker csproj as the live version source-of-truth (the
|
||||
`MxGateway.Client.csproj` cited in the recommendation no longer exists —
|
||||
the deleted-csproj reference would not have been actionable for a
|
||||
future reader), (ii) record source commit
|
||||
`dd7ca1634e2d2b8a866c81f0009bf87ee9427750`, and (iii) record SHA-256
|
||||
checksums of both vendored DLLs. (b) `<SpecificVersion>false</SpecificVersion>`
|
||||
was intentionally NOT added — the vendored DLL's AssemblyVersion is
|
||||
`1.0.0.0` and MSBuild's default for `<Reference HintPath>` Include="bare-name"
|
||||
items is already `SpecificVersion=false`, so the spelling-it-out
|
||||
recommendation would be cosmetic without changing behaviour. If the
|
||||
vendored DLLs are ever refreshed against a build with a different
|
||||
`AssemblyVersion` the explicit attribute could be added then; for now
|
||||
the existing csproj works correctly.
|
||||
|
||||
+19
-7
@@ -19,17 +19,17 @@ Each module's `findings.md` is the source of truth; this file is generated from
|
||||
| [Core](Core/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
|
||||
| [Core.Abstractions](Core.Abstractions/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 8 |
|
||||
| [Core.AlarmHistorian](Core.AlarmHistorian/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 11 |
|
||||
| [Core.ScriptedAlarms](Core.ScriptedAlarms/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
|
||||
| [Core.Scripting](Core.Scripting/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 11 |
|
||||
| [Core.ScriptedAlarms](Core.ScriptedAlarms/findings.md) | Claude Code | 2026-05-23 | `a9be809` | Reviewed | 0 | 13 |
|
||||
| [Core.Scripting](Core.Scripting/findings.md) | Claude Code | 2026-05-23 | `a9be809` | Reviewed | 0 | 16 |
|
||||
| [Core.VirtualTags](Core.VirtualTags/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 13 |
|
||||
| [Driver.AbCip](Driver.AbCip/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 15 |
|
||||
| [Driver.AbCip.Cli](Driver.AbCip.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 8 |
|
||||
| [Driver.AbLegacy](Driver.AbLegacy/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 13 |
|
||||
| [Driver.AbLegacy.Cli](Driver.AbLegacy.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 7 |
|
||||
| [Driver.Cli.Common](Driver.Cli.Common/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 6 |
|
||||
| [Driver.Cli.Common](Driver.Cli.Common/findings.md) | Claude Code | 2026-05-23 | `a9be809` | Reviewed | 0 | 8 |
|
||||
| [Driver.FOCAS](Driver.FOCAS/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
|
||||
| [Driver.FOCAS.Cli](Driver.FOCAS.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 5 |
|
||||
| [Driver.Galaxy](Driver.Galaxy/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 14 |
|
||||
| [Driver.Galaxy](Driver.Galaxy/findings.md) | Claude Code | 2026-05-23 | `a9be809` | Reviewed | 0 | 18 |
|
||||
| [Driver.Historian.Wonderware](Driver.Historian.Wonderware/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
|
||||
| [Driver.Historian.Wonderware.Client](Driver.Historian.Wonderware.Client/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 10 |
|
||||
| [Driver.Modbus](Driver.Modbus/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
|
||||
@@ -75,6 +75,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
|
||||
| Core.AlarmHistorian-006 | High | Resolved | Error handling & resilience | `src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs:103,135-216` |
|
||||
| Core.ScriptedAlarms-001 | High | Resolved | Concurrency & thread safety | `ScriptedAlarmEngine.cs:175`, `ScriptedAlarmEngine.cs:178`, `ScriptedAlarmEngine.cs:73`, `ScriptedAlarmEngine.cs:368` |
|
||||
| Core.Scripting-002 | High | Resolved | Security | `ForbiddenTypeAnalyzer.cs:70` |
|
||||
| Core.Scripting-012 | High | Resolved | Security | `ForbiddenTypeAnalyzer.cs:60-76`, `ScriptSandbox.cs:96-126` |
|
||||
| Core.VirtualTags-001 | High | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:306` |
|
||||
| Driver.AbCip-001 | High | Resolved | Correctness & logic bugs | `AbCipDriver.cs:111`, `AbCipDriver.cs:163-167` |
|
||||
| Driver.AbCip-002 | High | Resolved | Correctness & logic bugs | `AbCipStatusMapper.cs:65-78` |
|
||||
@@ -83,6 +84,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
|
||||
| Driver.AbLegacy-001 | High | Resolved | Correctness & logic bugs | `AbLegacyAddress.cs:54`, `AbLegacyDriver.cs:368-374` |
|
||||
| Driver.AbLegacy-006 | High | Resolved | Concurrency & thread safety | `AbLegacyDriver.cs:107-158`, `AbLegacyDriver.cs:162-234`, `LibplctagLegacyTagRuntime.cs` |
|
||||
| Driver.Cli.Common-001 | High | Resolved | Correctness & logic bugs | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:106-119` |
|
||||
| Driver.Cli.Common-007 | High | Resolved | Correctness & logic bugs | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:129` |
|
||||
| Driver.FOCAS-001 | High | Resolved | Correctness & logic bugs | `FocasDriverFactoryExtensions.cs:54-86`, `FocasDriverFactoryExtensions.cs:132-140` |
|
||||
| Driver.FOCAS-002 | High | Resolved | Correctness & logic bugs | `WireFocasClient.cs:164-179`, `FocasDriver.cs:513`, `FocasDriver.cs:593` |
|
||||
| Driver.Galaxy-002 | High | Resolved | Correctness & logic bugs | `Browse/DataTypeMap.cs:13`, `Runtime/MxValueDecoder.cs:9` |
|
||||
@@ -149,6 +151,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
|
||||
| Core.Scripting-004 | Medium | Resolved | Correctness & logic bugs | `DependencyExtractor.cs:73` |
|
||||
| Core.Scripting-007 | Medium | Resolved | Error handling & resilience | `TimedScriptEvaluator.cs:60` |
|
||||
| Core.Scripting-010 | Medium | Resolved | Testing coverage | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/ScriptSandboxTests.cs:54` |
|
||||
| Core.Scripting-013 | Medium | Resolved | Security | `ScriptEvaluator.cs:202-225` (`BuildWrapperSource`) |
|
||||
| Core.Scripting-014 | Medium | Resolved | Concurrency & thread safety | `CompiledScriptCache.cs:91-103` (`Clear`) |
|
||||
| Core.Scripting-016 | Medium | Resolved | Performance & resource management | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:74-117`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/ScriptedAlarmEngine.cs:139-182` |
|
||||
| Core.VirtualTags-002 | Medium | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:237` |
|
||||
| Core.VirtualTags-003 | Medium | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:117-120` |
|
||||
| Core.VirtualTags-005 | Medium | Resolved | Concurrency & thread safety | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagSource.cs:50-64` |
|
||||
@@ -186,6 +191,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
|
||||
| Driver.Galaxy-009 | Medium | Resolved | Error handling & resilience | `GalaxyDriver.cs:354-371` |
|
||||
| Driver.Galaxy-011 | Medium | Resolved | Performance & resource management | `GalaxyDriver.cs:411` |
|
||||
| Driver.Galaxy-014 | Medium | Resolved | Testing coverage | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy` (module-wide) |
|
||||
| Driver.Galaxy-016 | Medium | Resolved | Performance & resource management | `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj:43-47`, `libs/README.md:32-37` |
|
||||
| Driver.Historian.Wonderware-002 | Medium | Resolved | Correctness and logic bugs | `Ipc/HistorianFrameHandler.cs:162`, `:181` |
|
||||
| Driver.Historian.Wonderware-003 | Medium | Resolved | Correctness and logic bugs | `Backend/HistorianDataSource.cs:320-323`, `:457-460` |
|
||||
| Driver.Historian.Wonderware-006 | Medium | Resolved | Error handling and resilience | `Ipc/PipeServer.cs:120-128` |
|
||||
@@ -281,14 +287,16 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
|
||||
| Core.ScriptedAlarms-003 | Low | Resolved | Documentation & comments | `ScriptedAlarmEngine.cs:343`, `docs/ScriptedAlarms.md:107` |
|
||||
| Core.ScriptedAlarms-006 | Low | Resolved | Concurrency & thread safety | `ScriptedAlarmEngine.cs:232`, `ScriptedAlarmEngine.cs:369` |
|
||||
| Core.ScriptedAlarms-008 | Low | Resolved | Performance & resource management | `Part9StateMachine.cs:261-268` |
|
||||
| Core.ScriptedAlarms-009 | Low | Won't Fix | Performance & resource management | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
|
||||
| Core.ScriptedAlarms-009 | Low | Resolved | Performance & resource management | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
|
||||
| Core.ScriptedAlarms-010 | Low | Resolved | Design-document adherence | `ScriptedAlarmEngine.cs:325-336`, `AlarmPredicateContext.cs:33-40`, `MessageTemplate.cs:47` |
|
||||
| Core.ScriptedAlarms-011 | Low | Resolved | Code organization & conventions | `Part9StateMachine.cs:275` |
|
||||
| Core.ScriptedAlarms-013 | Low | Resolved | Documentation & comments | `ScriptedAlarmEngine.cs:66-81` |
|
||||
| Core.Scripting-005 | Low | Resolved | Correctness & logic bugs | `DependencyExtractor.cs:97` |
|
||||
| Core.Scripting-006 | Low | Resolved | Concurrency & thread safety | `CompiledScriptCache.cs:55` |
|
||||
| Core.Scripting-008 | Low | Won't Fix | Performance & resource management | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` |
|
||||
| Core.Scripting-008 | Low | Resolved | Performance & resource management | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` |
|
||||
| Core.Scripting-009 | Low | Resolved | Design-document adherence | `ForbiddenTypeAnalyzer.cs:45` |
|
||||
| Core.Scripting-011 | Low | Resolved | Testing coverage | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/` |
|
||||
| Core.Scripting-015 | Low | Resolved | Correctness & logic bugs | `ScriptEvaluator.cs:234-270` (`ToCSharpTypeName`) |
|
||||
| Core.VirtualTags-004 | Low | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:349` |
|
||||
| Core.VirtualTags-006 | Low | Resolved | Concurrency & thread safety | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:177-182`, `:395-401` |
|
||||
| Core.VirtualTags-007 | Low | Resolved | Error handling & resilience | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/TimerTriggerScheduler.cs:58` |
|
||||
@@ -318,6 +326,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
|
||||
| Driver.AbLegacy.Cli-007 | Low | Resolved | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli.Tests/WriteCommandParseValueTests.cs` |
|
||||
| Driver.Cli.Common-004 | Low | Resolved | Error handling & resilience | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:68-70` |
|
||||
| Driver.Cli.Common-006 | Low | Resolved | Documentation & comments | `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/SnapshotFormatter.cs:71`, `src/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/DriverCommandBase.cs:9` |
|
||||
| Driver.Cli.Common-008 | Low | Resolved | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests/SnapshotFormatterTests.cs:50-64` |
|
||||
| Driver.FOCAS-007 | Low | Resolved | Error handling & resilience | `FocasDriver.cs:140-148`, `FocasDriver.cs:478-484`, `FocasDriver.cs:529-533`, `FocasAlarmProjection.cs:61-63` |
|
||||
| Driver.FOCAS-008 | Low | Resolved | Performance & resource management | `FocasDriver.cs:201`, `FocasDriver.cs:253` |
|
||||
| Driver.FOCAS-009 | Low | Resolved | Design-document adherence | `FocasDriverOptions.cs:110-115`, `FocasDriver.cs:468-486`, `FocasDriverFactoryExtensions.cs:75-80` |
|
||||
@@ -327,11 +336,13 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
|
||||
| Driver.FOCAS.Cli-002 | Low | Resolved | Concurrency & thread safety | `Commands/SubscribeCommand.cs:45-51` |
|
||||
| Driver.FOCAS.Cli-003 | Low | Resolved | Error handling & resilience | `FocasCommandBase.cs:19` (`CncPort`), `FocasCommandBase.cs:27` (`TimeoutMs`), `Commands/SubscribeCommand.cs:23` (`IntervalMs`) |
|
||||
| Driver.FOCAS.Cli-004 | Low | Resolved | Performance & resource management | `Commands/ProbeCommand.cs:37,54`; `Commands/ReadCommand.cs:37,46`; `Commands/WriteCommand.cs:45,54`; `Commands/SubscribeCommand.cs:39,73` |
|
||||
| Driver.FOCAS.Cli-005 | Low | Deferred | Design-document adherence | `Commands/WriteCommand.cs:50`, `Commands/ProbeCommand.cs:50` (via `SnapshotFormatter.FormatStatus`) |
|
||||
| Driver.FOCAS.Cli-005 | Low | Resolved | Design-document adherence | `Commands/WriteCommand.cs:50`, `Commands/ProbeCommand.cs:50` (via `SnapshotFormatter.FormatStatus`) |
|
||||
| Driver.Galaxy-005 | Low | Resolved | OtOpcUa conventions | `Runtime/EventPump.cs:81-88` |
|
||||
| Driver.Galaxy-010 | Low | Resolved | Security | `GalaxyDriver.cs:311-341` |
|
||||
| Driver.Galaxy-012 | Low | Resolved | Performance & resource management | `Runtime/SubscriptionRegistry.cs:65-67`, `GalaxyDriver.cs:538`, `GalaxyDriver.cs:675` |
|
||||
| Driver.Galaxy-013 | Low | Resolved | Design-document adherence | `GalaxyDriver.cs:14-27`, `GalaxyDriver.cs:374-382`, `Config/GalaxyDriverOptions.cs:84-86` |
|
||||
| Driver.Galaxy-017 | Low | Deferred | Design-document adherence | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/` (no source change), gateway proto contract |
|
||||
| Driver.Galaxy-018 | Low | Resolved | Documentation & comments | `libs/README.md:32-37`, `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj:40-47` |
|
||||
| Driver.Historian.Wonderware-004 | Low | Resolved | Correctness and logic bugs | `Backend/SdkAlarmHistorianWriteBackend.cs:198-201` |
|
||||
| Driver.Historian.Wonderware-005 | Low | Resolved | Concurrency and thread safety | `Backend/HistorianDataSource.cs:124`, `:126-127` |
|
||||
| Driver.Historian.Wonderware-007 | Low | Resolved | Error handling and resilience | `Ipc/PipeServer.cs:70-75` |
|
||||
@@ -389,3 +400,4 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
|
||||
| Server-012 | Low | Resolved | Performance & resource management | `src/Server/ZB.MOM.WW.OtOpcUa.Server/Hosting/PeerHttpProbeLoop.cs:78-79` |
|
||||
| Server-014 | Low | Resolved | Code organization & conventions | `src/Server/ZB.MOM.WW.OtOpcUa.Server/SealedBootstrap.cs` |
|
||||
| Server-015 | Low | Resolved | Documentation & comments | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:16-21`, `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaServerOptions.cs:21-26` |
|
||||
| Driver.Galaxy-015 | ~~Medium~~ Low (re-triaged 2026-05-23) | Resolved | ~~Security~~ Documentation & comments (re-triaged 2026-05-23) | `libs/MxGateway.Client.dll`, `libs/MxGateway.Contracts.dll`, `libs/README.md` |
|
||||
|
||||
@@ -35,7 +35,7 @@ new ScriptedAlarmDefinition(
|
||||
|
||||
## Predicate evaluation
|
||||
|
||||
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory, the per-publish accretion of dynamically-emitted script assemblies (Core.Scripting-008), and the orphan-thread CPU-budget caveat — are documented in that file as well.
|
||||
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory and the orphan-thread CPU-budget caveat — are documented in that file as well; per-publish assembly accretion was resolved by the Core.Scripting-008 collectible-`AssemblyLoadContext` rewrite and no longer requires periodic server restarts.
|
||||
|
||||
`AlarmPredicateContext` (`AlarmPredicateContext.cs`) is the script's `ScriptContext` subclass:
|
||||
|
||||
|
||||
+5
-3
@@ -6,7 +6,7 @@ The runtime is split across two projects: `Core.Scripting` holds the Roslyn sand
|
||||
|
||||
## Roslyn script sandbox (`Core.Scripting`)
|
||||
|
||||
User scripts are compiled via `Microsoft.CodeAnalysis.CSharp.Scripting` against a `ScriptContext` subclass. `ScriptGlobals<TContext>` exposes the context as a field named `ctx`, so scripts read `ctx.GetTag("...")` / `ctx.SetVirtualTag("...", ...)` / `ctx.Now` / `ctx.Logger` and return a value.
|
||||
User scripts are compiled via `Microsoft.CodeAnalysis.CSharp` (regular compiler, not the scripting variant — the original `CSharpScript` pipeline was retired by the Core.Scripting-008 / -016 rewrite, see "Compile cache" below). Each script's source is pasted as the body of a synthesized `CompiledScript.Run(ScriptGlobals<TContext>)` method against a `ScriptContext` subclass. `ScriptGlobals<TContext>` exposes the context as a field named `ctx`, so scripts read `ctx.GetTag("...")` / `ctx.SetVirtualTag("...", ...)` / `ctx.Now` / `ctx.Logger` and return a value.
|
||||
|
||||
### Compile pipeline (`ScriptEvaluator<TContext, TResult>`)
|
||||
|
||||
@@ -30,11 +30,13 @@ Similarly, **`System.Threading.Tasks` is now denied** (Core.Scripting-003), whic
|
||||
|
||||
`ConcurrentDictionary<string, Lazy<ScriptEvaluator<...>>>` keyed on `SHA-256(UTF8(source))` rendered to hex. `Lazy<T>` with `ExecutionAndPublication` mode means two threads racing a miss compile exactly once. Failed compiles evict the entry (via the `TryRemove(KeyValuePair<,>)` overload so a concurrently re-added retry entry is not collateral damage — Core.Scripting-006) so a corrected retry can succeed (used during Admin UI authoring). No capacity bound — scripts are operator-authored and bounded by the config DB. Whitespace changes miss the cache on purpose. `Clear()` is called on config-publish.
|
||||
|
||||
**Per-publish assembly accretion (accepted limitation, Core.Scripting-008).** Each compiled `ScriptEvaluator` holds a Roslyn `ScriptRunner<T>` delegate, which keeps the dynamically-emitted script assembly loaded for the process lifetime. Emitted assemblies in the default `AssemblyLoadContext` cannot be unloaded; `CompiledScriptCache.Clear()` drops the dictionary entries but does **not** unload the underlying assemblies. Across many config-publish generations (each `Clear()` followed by recompiling every script), the process accumulates dead script assemblies. For the expected "low thousands" of scripts this is benign, but a long-running server with very frequent publishes will see steady managed-memory growth that does not return until the process restarts. Out-of-process script evaluation or a collectible `AssemblyLoadContext` is a v3 concern; deployments with high-publish-frequency requirements should schedule a periodic server restart to reclaim the accrued assemblies.
|
||||
**Per-publish assembly unload (Core.Scripting-008 resolved).** Each compiled `ScriptEvaluator` emits its script into a dedicated **collectible** `AssemblyLoadContext` — the BCL escape hatch for assemblies that can be unloaded. The compile path is hand-rolled `CSharpCompilation.Create` + `Emit(MemoryStream)` + `ScriptAssemblyLoadContext.LoadFromStream` rather than the legacy `CSharpScript.CreateDelegate` (which emits into the default ALC and cannot be unloaded). `ScriptEvaluator.Dispose()` calls `AssemblyLoadContext.Unload()` and `CompiledScriptCache.Clear()` disposes every materialised evaluator before dropping its dictionary entry, so the emitted assemblies become eligible for GC immediately after a config-publish. The reclaim is GC-timing-sensitive (Unload is *eligible-for-collection*, not synchronous); the next collection cycle reclaims them. Regression tests `Dispose_unloads_compiled_script_assembly_load_context` and `Clear_disposes_every_materialised_evaluator` in `CompiledScriptCacheTests` lock this contract via `WeakReference` + `GC.Collect()` assertions. Server restarts are no longer required to reclaim compiled-script memory.
|
||||
|
||||
**Scripting authoring convention.** With the collectible-ALC rewrite, the wrapper around a user script is an ordinary C# static method, not a Roslyn `Script` submission. The script body is pasted verbatim as the method body and must therefore end with an explicit `return …;` per ordinary C# rules — the legacy `CSharpScript` "last expression yields result" shorthand is gone. Every script in the existing test corpus already uses explicit `return`; this convention is operator-visible only when authoring a brand-new script from scratch.
|
||||
|
||||
### Per-evaluation timeout (`TimedScriptEvaluator<TContext, TResult>`)
|
||||
|
||||
Wraps `ScriptEvaluator` with a wall-clock budget. Default `DefaultTimeout = 250ms`. Implementation pushes the inner `RunAsync` onto `Task.Run` (so a CPU-bound script can't hog the calling thread before `WaitAsync` registers its timeout) then awaits `runTask.WaitAsync(Timeout, ct)`. A `TimeoutException` from `WaitAsync` is wrapped as `ScriptTimeoutException`. Caller-supplied `CancellationToken` cancellation wins over the timeout and propagates as `OperationCanceledException` — so a shutdown cancel is not misclassified. **Known leak:** when a CPU-bound script times out, the underlying `ScriptRunner` keeps running on its thread-pool thread until the Roslyn runtime returns (documented trade-off; out-of-process evaluation is a v3 concern).
|
||||
Wraps `ScriptEvaluator` with a wall-clock budget. Default `DefaultTimeout = 250ms`. Implementation pushes the inner `RunAsync` onto `Task.Run` (so a CPU-bound script can't hog the calling thread before `WaitAsync` registers its timeout) then awaits `runTask.WaitAsync(Timeout, ct)`. A `TimeoutException` from `WaitAsync` is wrapped as `ScriptTimeoutException`. Caller-supplied `CancellationToken` cancellation wins over the timeout and propagates as `OperationCanceledException` — so a shutdown cancel is not misclassified. **Known leak:** when a CPU-bound script times out, the underlying compiled-script delegate keeps running on its `Task.Run` thread-pool thread until it returns of its own accord (the CT is checked only at evaluator entry; once the script body is running, only the script returning or throwing will release the thread). The post-rewrite delegate is a regular C# `Func<>` bound to the synthesized `CompiledScript.Run` method, so this is a vanilla "synchronous CPU-bound work on a pool thread" leak rather than anything Roslyn-specific. Documented trade-off; out-of-process evaluation is a v3 concern.
|
||||
|
||||
### Script logger plumbing
|
||||
|
||||
|
||||
@@ -583,10 +583,20 @@ language binding.
|
||||
|
||||
**Depends on:** A.1 merged (proto change live).
|
||||
|
||||
**Files** (`c:\Users\dohertj2\Desktop\mxaccessgw\clients\`):
|
||||
**Files** (`c:\Users\dohertj2\Desktop\mxaccessgw\src\` for .NET — note the sibling
|
||||
repo restructured after this plan was written; `clients/dotnet/MxGateway.Client.csproj`
|
||||
no longer exists, the proto contracts now live in
|
||||
`src/ZB.MOM.WW.MxGateway.Contracts/` under the new namespace
|
||||
`ZB.MOM.WW.MxGateway.Contracts.Proto[.Galaxy]`; the OtOpcUa driver currently
|
||||
consumes vendored binaries from the pre-restructure build — see
|
||||
`src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/libs/README.md`):
|
||||
|
||||
1. **.NET** — codegen runs on csproj rebuild via `Grpc.Tools`; just
|
||||
rebuild `MxGateway.Client.csproj` after pulling A.1.
|
||||
1. **.NET** — codegen runs on csproj rebuild via `Grpc.Tools`; rebuild
|
||||
`src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`
|
||||
after pulling A.1. (If unwinding the driver's vendored binaries onto the
|
||||
new contracts namespace as part of the alarm work, namespace-rename + a
|
||||
reimplementation of the missing `MxGatewayClient` / `MxGatewaySession`
|
||||
wrappers is also in scope.)
|
||||
2. **Python** — run `clients\python\generate-proto.ps1`; commit the
|
||||
regenerated `_pb2.py` + `_pb2_grpc.py` files under
|
||||
`clients\python\src\`.
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# High-Level Requirements
|
||||
|
||||
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). The original 2025 text described a single-process Galaxy/MXAccess server called LmxOpcUa. Today the project is the **OtOpcUa** multi-driver OPC UA platform deployed as three cooperating processes (Server, Admin, Galaxy.Host). The Galaxy integration is one of seven shipped drivers. HLR-001 through HLR-008 have been rewritten driver-agnostically; HLR-009 has been retired (the embedded Status Dashboard is superseded by the Admin UI). HLR-010 through HLR-017 are new and cover plug-in drivers, resilience, Config DB / draft-publish, cluster redundancy, fleet-wide identifier uniqueness, Admin UI, audit logging, metrics, and the Roslyn capability-wrapping analyzer.
|
||||
> **Revision** — Refreshed 2026-05-23 for the OtOpcUa v2 multi-driver platform. The original 2025 text described a single-process Galaxy/MXAccess server called LmxOpcUa. Today the project is the **OtOpcUa** multi-driver OPC UA platform deployed as two cooperating processes (Server, Admin). The Galaxy integration is one of seven shipped drivers and is now an in-process Tier-A driver that talks gRPC to a separately installed `mxaccessgw` gateway (sibling repo) — PR 7.2 (2026-04-30) retired the legacy out-of-process `Galaxy.Host` Windows service. HLR-001 through HLR-008 have been rewritten driver-agnostically; HLR-009 has been retired (the embedded Status Dashboard is superseded by the Admin UI). HLR-010 through HLR-017 cover plug-in drivers, resilience, Config DB / draft-publish, cluster redundancy, fleet-wide identifier uniqueness, Admin UI, audit logging, metrics, and the Roslyn capability-wrapping analyzer.
|
||||
|
||||
## HLR-001: OPC UA Server
|
||||
|
||||
@@ -28,11 +28,10 @@ Drivers whose backend has a native change signal (e.g. Galaxy's `time_of_last_de
|
||||
|
||||
## HLR-007: Service Hosting
|
||||
|
||||
The system shall be deployed as three cooperating Windows services:
|
||||
The system shall be deployed as two cooperating Windows services (the legacy `OtOpcUa.Galaxy.Host` x86 host was retired in PR 7.2 — Galaxy access now flows through the separately installed `mxaccessgw` gateway, which lives in a sibling repository and is not part of the OtOpcUa deployment):
|
||||
|
||||
- **OtOpcUa.Server** — .NET 10 x64, `Microsoft.Extensions.Hosting` + `AddWindowsService`, hosts all non-Galaxy drivers in-process and the OPC UA endpoint.
|
||||
- **OtOpcUa.Server** — .NET 10 AnyCPU, `Microsoft.Extensions.Hosting` + `AddWindowsService` (decision #30 replaced the original TopShelf choice), hosts every driver in-process — including the new Tier-A `GalaxyDriver` that speaks gRPC to `mxaccessgw` — and the OPC UA endpoint.
|
||||
- **OtOpcUa.Admin** — .NET 10 x64 Blazor Server web app, hosts the admin UI, SignalR hubs for live updates, `/metrics` Prometheus endpoint, and audit log writers.
|
||||
- **OtOpcUa.Galaxy.Host** — .NET Framework 4.8 x86 (TopShelf), hosts MXAccess COM + Galaxy Repository SQL + Historian plugin. Talks to `Driver.Galaxy.Proxy` inside `OtOpcUa.Server` via a named pipe (MessagePack over length-prefixed frames, per-process shared secret, SID-restricted ACL).
|
||||
|
||||
## HLR-008: Logging
|
||||
|
||||
|
||||
@@ -151,8 +151,11 @@ substantive driver change, and revise this table when the data does.
|
||||
`SubscriptionRegistry`, or a downstream consumer retaining
|
||||
`DataValueSnapshot` references past their useful life.
|
||||
|
||||
## Scripted-alarm engine — known hot-path allocations
|
||||
## Scripted-alarm engine — hot-path allocation reuse
|
||||
|
||||
`ScriptedAlarmEngine.BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` and `AlarmPredicateContext` on every predicate evaluation — i.e. once per upstream tag change per referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. (Core.ScriptedAlarms-009)
|
||||
`ScriptedAlarmEngine` keeps a per-alarm reusable evaluation scratch in `_scratchByAlarmId` — the read-cache `Dictionary<string, DataValueSnapshot>` and the `AlarmPredicateContext` are allocated once per alarm (on first evaluation) and refilled in place across every subsequent predicate evaluation. The hot path no longer allocates a fresh dictionary + context per upstream tag change. (Core.ScriptedAlarms-009)
|
||||
|
||||
The allocations are deliberate for now: predicate evaluation is already serialised under `_evalGate`, so a single reused scratch dictionary would be safe, but the per-call dictionary keeps the evaluation surface immutable and trivially safe against future refactors. If a future scripted-alarm soak surfaces allocation pressure on this path, the mitigation is a per-alarm scratch buffer cleared between evaluations — note here before changing the engine.
|
||||
Safety: reuse is serialised under `_evalGate`, so two threads can never observe the same scratch in a half-refilled state. The context wraps the read-cache by reference, so refilling the dictionary is what the predicate's `ctx.GetTag(path)` calls observe. `LoadAsync` clears `_scratchByAlarmId` alongside `_alarms` so a config-publish drops the prior generation's scratch (a new generation may carry different `Inputs` / `Logger`). Regression tests in `ScriptedAlarmEngineTests` lock the reuse contract:
|
||||
- `Reevaluation_reuses_the_same_read_cache_dictionary` — asserts dictionary instance identity across two evaluations.
|
||||
- `Reevaluation_reuses_the_same_predicate_context` — same, for the context.
|
||||
- `LoadAsync_drops_the_prior_generations_scratch` — asserts a publish resets the scratch.
|
||||
|
||||
@@ -29,7 +29,7 @@ Tie-in capability — **historian alarm sink**:
|
||||
| 3 | Evaluation trigger = **change-driven + timer-driven**; operator chooses per-tag | Change-driven is cheap at steady state; timer is the escape hatch for polling derivations that don't have a discrete "input changed" signal. |
|
||||
| 4 | Script shape = **Shape A — one script per virtual tag/alarm**; `return` produces the value (or `bool` for alarm condition) | Minimal surface; no predicate/action split. Alarm side-effects (severity, message) configured out-of-band, not in the script. |
|
||||
| 5 | Alarm fidelity = **full OPC UA Part 9** | Uniform with Galaxy + ALMD on the wire; client-side tooling (HMIs, historians, event pipelines) gets one shape. |
|
||||
| 6 | Sandbox = **read-only context**; scripts can only read any tag + write to virtual tags | Strict Roslyn `ScriptOptions` allow-list. Authoritative deny-list (`ForbiddenTypeAnalyzer`): namespace-prefix deny `System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks` (Task / Parallel fan-out — Core.Scripting-003), `System.Runtime.InteropServices`, `Microsoft.Win32`; type-granular deny `System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread` (these live directly in the allow-listed `System` / `System.Threading` namespaces, so a prefix rule cannot reach them without blocking primitives — Core.Scripting-001 / -009). |
|
||||
| 6 | Sandbox = **read-only context**; scripts can only read any tag + write to virtual tags | Strict Roslyn `ScriptOptions` allow-list. Authoritative deny-list (`ForbiddenTypeAnalyzer`): namespace-prefix deny `System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks` (Task / Parallel fan-out — Core.Scripting-003), `System.Runtime.InteropServices`, `System.Runtime.Loader` (AssemblyLoadContext et al. — Core.Scripting-012), `Microsoft.Win32`; type-granular deny `System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread`, `System.Threading.ThreadPool` (Core.Scripting-012), `System.Threading.Timer` (Core.Scripting-012) (these live directly in the allow-listed `System` / `System.Threading` namespaces, so a prefix rule cannot reach them without blocking primitives — Core.Scripting-001 / -009 / -012). |
|
||||
| 7 | Dependency declaration = **AST inference**; operator doesn't maintain a separate dependency list | `CSharpSyntaxWalker` extracts `ctx.GetTag("path")` string-literal calls at compile time; dynamic paths rejected at publish. |
|
||||
| 8 | Config storage = **config DB with generation-sealed cache** (same as driver instances) | Virtual tags + alarms publish atomically in the same generation as the driver instance config they may depend on. |
|
||||
| 9 | Script return value shape (`ctx.GetTag`) = **`DataValue { Value, StatusCode, Timestamp }`** | Scripts branch on quality naturally without separate `ctx.GetQuality(...)` calls. |
|
||||
@@ -89,7 +89,7 @@ Tie-in capability — **historian alarm sink**:
|
||||
|
||||
### Stream A — `Core.Scripting` (Roslyn engine + sandbox + AST inference + logger) — **2 weeks**
|
||||
|
||||
1. **A.1** Project scaffold + NuGet `Microsoft.CodeAnalysis.CSharp.Scripting`. `ScriptOptions` allow-list (`typeof(object).Assembly`, `typeof(Enumerable).Assembly`, the Core.Scripting assembly itself — nothing else). Hand-written `ScriptContext` base class with `GetTag(string)` / `SetVirtualTag(string, object)` / `Logger` / `Now` / `Deadband(double, double, double)` helpers.
|
||||
1. **A.1** Project scaffold + NuGet `Microsoft.CodeAnalysis.CSharp.Scripting`. `ScriptOptions` allow-list (`typeof(object).Assembly`, `typeof(Enumerable).Assembly`, the Core.Scripting assembly itself — nothing else). Hand-written `ScriptContext` base class with `GetTag(string)` / `SetVirtualTag(string, object)` / `Logger` / `Now` / `Deadband(double, double, double)` helpers. _(Implementation note 2026-05-23 — superseded by Core.Scripting-008 / -016: the `CSharpScript`/`ScriptRunner` path was replaced with a hand-rolled `CSharpCompilation.Create` → `Emit(MemoryStream)` → collectible `ScriptAssemblyLoadContext.LoadFromStream` pipeline so per-publish ALC accretion is reclaimable, and engines route compiles through `CompiledScriptCache` rather than calling `ScriptEvaluator.Compile` directly. The reference list was correspondingly widened from the narrow allow-list above to the full BCL `TRUSTED_PLATFORM_ASSEMBLIES` set (filtered to `System.*` + `netstandard` + `Microsoft.Win32.Registry`) because the new pipeline can't compile against the old narrow set; `ForbiddenTypeAnalyzer` is now the sole security gate, consistent with how Core.Scripting-001 / -002 established the analyzer must be the real boundary because type forwarding makes any references-list-only restriction porous. See `docs/VirtualTags.md` "Compile cache" for the current implementation contract.)_
|
||||
2. **A.2** `DependencyExtractor : CSharpSyntaxWalker`. Visits every `InvocationExpressionSyntax` targeting `ctx.GetTag` / `ctx.SetVirtualTag`; accepts only a `LiteralExpressionSyntax` argument. Non-literal arguments (concat, variable, method call) → publish-time rejection with an actionable error pointing the operator at the exact span. Outputs `IReadOnlySet<string> Inputs` + `IReadOnlySet<string> Outputs`.
|
||||
3. **A.3** Compile cache. `(source_hash) → compiled Script<T>`. Recompile only when source changes. Warm on `SealedBootstrap`.
|
||||
4. **A.4** Per-evaluation timeout wrapper (default 250ms; configurable per tag). Timeout = tag quality `BadInternalError` + structured warning log. Keeps a single runaway script from starving the engine.
|
||||
@@ -162,7 +162,7 @@ Tie-in capability — **historian alarm sink**:
|
||||
|
||||
## Compliance Checks (run at exit gate)
|
||||
|
||||
- [ ] **Sandbox escape**: attempts to reference any deny-listed namespace prefix (`System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks`, `System.Runtime.InteropServices`, `Microsoft.Win32`) or any of the type-granular forbidden types (`System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread`) fail at script compile with an actionable error. Vectors include direct calls, `typeof(T)`, generic type arguments, casts, `is`/`as` patterns, `default(T)`, array element types, and explicitly-typed local declarations.
|
||||
- [ ] **Sandbox escape**: attempts to reference any deny-listed namespace prefix (`System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks`, `System.Runtime.InteropServices`, `System.Runtime.Loader`, `Microsoft.Win32`) or any of the type-granular forbidden types (`System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread`, `System.Threading.ThreadPool`, `System.Threading.Timer`) fail at script compile with an actionable error. Vectors include direct calls, `typeof(T)`, generic type arguments, casts, `is`/`as` patterns, `default(T)`, array element types, and explicitly-typed local declarations.
|
||||
- [ ] **Dependency inference**: `ctx.GetTag(myStringVar)` (non-literal path) is rejected at publish with a span-pointed error; `ctx.GetTag("Line1/Speed")` is accepted + appears in the inferred input set.
|
||||
- [ ] **Change cascade**: tag A → virtual tag B → virtual tag C. When A changes, B recomputes, then C recomputes. Single change event triggers the full cascade in topological order within one evaluation pass.
|
||||
- [ ] **Cycle rejection**: publish a config where virtual tag B depends on A and A depends on B. Publish fails pre-commit with a clear cycle message.
|
||||
|
||||
+2
-2
@@ -193,7 +193,7 @@ ConfigurationService
|
||||
- Compact binary format, faster than JSON, good fit for high-frequency data change callbacks
|
||||
- Simpler than gRPC on .NET 4.8 (which needs legacy `Grpc.Core` native library)
|
||||
|
||||
**Decided: Galaxy Host is a separate Windows service.**
|
||||
**Decided: Galaxy Host is a separate Windows service.** _(Reversed by PR 7.2, 2026-04-30 — see PR 7.2's commit `ae7106d` and the project_galaxy_via_mxgateway memory entry. The legacy in-process `Galaxy.Host` / `Galaxy.Proxy` / `Galaxy.Shared` projects + the `OtOpcUaGalaxyHost` Windows service were retired; Galaxy access now flows through the in-process Tier-A `GalaxyDriver` talking gRPC to a separately installed `mxaccessgw` gateway sibling repo. The reasoning below was correct for the original LMX/x86-COM architecture; the gateway sibling repo now owns those constraints externally.)_
|
||||
- Independent lifecycle from the OtOpcUa Server
|
||||
- Can be restarted without affecting the main server or other drivers
|
||||
- Galaxy.Proxy detects connection loss, sets Bad quality on Galaxy nodes, reconnects when Host comes back
|
||||
@@ -801,7 +801,7 @@ aggregate runner (#253); server-side factory + seed SQL per driver (#210–#213)
|
||||
| 26 | Admin deploys on same server (co-hosted) | Simplifies deployment; can also run on separate management host | 2026-04-16 |
|
||||
| 27 | Admin scaffold early, driver-specific screens deferred | Core CRUD for instances/drivers first; per-driver config UI added with each driver | 2026-04-16 |
|
||||
| 28 | Named pipes for Galaxy IPC | Fast, no port conflicts, native to both .NET 4.8 and .NET 10 | 2026-04-16 |
|
||||
| 29 | Galaxy Host is a separate Windows service | Independent lifecycle, can restart without affecting main server or other drivers | 2026-04-16 |
|
||||
| 29 | Galaxy Host is a separate Windows service | Independent lifecycle, can restart without affecting main server or other drivers | 2026-04-16 (**reversed PR 7.2, 2026-04-30** — Galaxy is now an in-process Tier-A driver talking gRPC to the sibling `mxaccessgw` gateway; see the decision body above) |
|
||||
| 30 | Drop TopShelf, use Microsoft.Extensions.Hosting | Built-in Windows Service support in .NET 10, no third-party dependency | 2026-04-16 |
|
||||
| 31 | Mono-repo for all drivers | Simpler dependency management, single CI pipeline, shared abstractions | 2026-04-16 |
|
||||
| 32 | MessagePack serialization for Galaxy IPC | Binary, fast, works on .NET 4.8+ and .NET 10 via MessagePack-CSharp NuGet | 2026-04-16 |
|
||||
|
||||
@@ -67,7 +67,7 @@ Remaining Phase 6.3 surfaces (hardening, not release-blocking):
|
||||
|
||||
- **AB CIP** (PRs #202–222) — `Driver.AbCip`, `Driver.AbCip.IntegrationTests` (7 tests), AB CIP Cli. Live-boot verified against a ControlLogix rig.
|
||||
- **AB Legacy** (PRs #202, #223) — `Driver.AbLegacy`, `Driver.AbLegacy.IntegrationTests` (2 tests), AB Legacy Cli. PCCC cip-path workaround for SLC/MicroLogix.
|
||||
- **TwinCAT ADS** (PRs #205, this branch `task-galaxy-e2e`) — `Driver.TwinCAT`, `Driver.TwinCAT.IntegrationTests` (2 tests), TwinCAT Cli. TCBSD/ESXi fixture for e2e since local Hyper-V / TwinCAT RTIME are mutually exclusive on the dev box.
|
||||
- **TwinCAT ADS** (PR #205) — `Driver.TwinCAT`, `Driver.TwinCAT.IntegrationTests` (2 tests), TwinCAT Cli. TCBSD/ESXi fixture for e2e since local Hyper-V / TwinCAT RTIME are mutually exclusive on the dev box.
|
||||
- **FOCAS** (PRs #173, #199 + this session's migration) — `Driver.FOCAS` with an **in-process managed `FocasWireClient`** that speaks FOCAS/2 over TCP directly. Tier-C isolation retired — `Driver.FOCAS.Host` + `Driver.FOCAS.Shared` + `FwlibNative` P/Invoke + shim DLL + NSSM service all deleted. `Driver.FOCAS.IntegrationTests` covers 9 scenarios (fixed tree identity/axes/program/timers/spindle + user-authored PARAM/MACRO/PMC reads, Browse, Subscribe, IAlarmSource raise/clear, Probe transitions).
|
||||
|
||||
Decision recorded: FOCAS is **read-only** against the CNC by design — writes return `BadNotWritable`. See `docs/drivers/FOCAS.md` + `docs/drivers/FOCAS-Test-Fixture.md` for the deployment + coverage map.
|
||||
|
||||
@@ -48,6 +48,71 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
||||
// snapshot enumeration safe. The only write shapes are indexer-set and Clear,
|
||||
// both of which ConcurrentDictionary supports atomically. (Core.ScriptedAlarms-001)
|
||||
private readonly ConcurrentDictionary<string, AlarmState> _alarms = new(StringComparer.Ordinal);
|
||||
|
||||
/// <summary>
|
||||
/// Per-alarm reusable evaluation scratch. The read-cache dictionary and the
|
||||
/// <see cref="AlarmPredicateContext"/> instance are both allocated once per
|
||||
/// alarm (on first evaluation) and reused across every subsequent re-eval —
|
||||
/// the hot path no longer allocates a fresh dictionary + context per upstream
|
||||
/// tag change. Safe because <see cref="EvaluatePredicateToStateAsync"/> only
|
||||
/// runs under <see cref="_evalGate"/>, which serialises every evaluation:
|
||||
/// two threads can never observe the same scratch in a half-refilled state.
|
||||
/// Cleared in <see cref="LoadAsync"/> alongside <see cref="_alarms"/>.
|
||||
/// (Core.ScriptedAlarms-009)
|
||||
/// </summary>
|
||||
private readonly ConcurrentDictionary<string, AlarmScratch> _scratchByAlarmId =
|
||||
new(StringComparer.Ordinal);
|
||||
|
||||
/// <summary>
|
||||
/// Compile cache for every alarm predicate. Routes <see cref="LoadAsync"/>'s
|
||||
/// <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> calls through the
|
||||
/// cache so the collectible <see cref="System.Runtime.Loader.AssemblyLoadContext"/>
|
||||
/// each compile produces is actually disposed on the publish-replace path
|
||||
/// (Core.Scripting-016): the cache's <see cref="CompiledScriptCache{TContext, TResult}.Clear"/>
|
||||
/// disposes every materialised evaluator before dropping its dictionary entry,
|
||||
/// so a config-publish releases the prior generation's ALCs and the per-publish
|
||||
/// accretion the Core.Scripting-008 fix targeted is actually freed in production.
|
||||
/// Pre-fix the engine called <c>ScriptEvaluator.Compile</c> directly, which left
|
||||
/// the ALCs rooted until the process exited — defeating -008 on the real path.
|
||||
/// </summary>
|
||||
private readonly CompiledScriptCache<AlarmPredicateContext, bool> _compileCache = new();
|
||||
|
||||
/// <summary>
|
||||
/// Test-only diagnostic: returns the per-alarm scratch read-cache dictionary
|
||||
/// if one has been allocated, else null. Used by Core.ScriptedAlarms-009
|
||||
/// regression tests to assert the scratch is reused across evaluations
|
||||
/// (two reads return the same instance).
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <b>Synchronization:</b> the returned <see cref="IReadOnlyDictionary{TKey, TValue}"/>
|
||||
/// is the engine's live mutable read-cache. It is refilled in place by
|
||||
/// <c>RefillReadCache</c> on every predicate evaluation, under <c>_evalGate</c>.
|
||||
/// Test callers MUST NOT iterate this dictionary while the engine is
|
||||
/// actively evaluating (i.e. while an upstream change is mid-flight); the
|
||||
/// refill clears the dict before repopulating and a concurrent iterator
|
||||
/// would observe torn / partial state. Safe uses are: reference-identity
|
||||
/// comparisons (e.g. asserting the same instance is reused across calls),
|
||||
/// and single-key reads against an engine that has quiesced after a
|
||||
/// deterministic upstream push. Anything more involved should snapshot a
|
||||
/// copy under the gate. (Core.ScriptedAlarms-013.)
|
||||
/// </remarks>
|
||||
internal IReadOnlyDictionary<string, DataValueSnapshot>? TryGetScratchReadCacheForTest(string alarmId)
|
||||
=> _scratchByAlarmId.TryGetValue(alarmId, out var s) ? s.ReadCache : null;
|
||||
|
||||
/// <summary>
|
||||
/// Test-only diagnostic: returns the per-alarm <see cref="AlarmPredicateContext"/>
|
||||
/// if one has been allocated, else null. Companion to
|
||||
/// <see cref="TryGetScratchReadCacheForTest"/>.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <b>Synchronization:</b> the returned context wraps the same live
|
||||
/// read-cache as <see cref="TryGetScratchReadCacheForTest"/> — the same
|
||||
/// "don't iterate during an in-flight evaluation" caveat applies. Safe
|
||||
/// for reference-identity assertions on a quiesced engine.
|
||||
/// (Core.ScriptedAlarms-013.)
|
||||
/// </remarks>
|
||||
internal AlarmPredicateContext? TryGetScratchContextForTest(string alarmId)
|
||||
=> _scratchByAlarmId.TryGetValue(alarmId, out var s) ? s.Context : null;
|
||||
private readonly ConcurrentDictionary<string, DataValueSnapshot> _valueCache
|
||||
= new(StringComparer.Ordinal);
|
||||
private readonly Dictionary<string, HashSet<string>> _alarmsReferencing
|
||||
@@ -108,6 +173,14 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
||||
UnsubscribeFromUpstream();
|
||||
_alarms.Clear();
|
||||
_alarmsReferencing.Clear();
|
||||
// Drop the prior generation's per-alarm scratch buffers — definitions may
|
||||
// have changed (different Inputs, different Logger), so any reuse would be
|
||||
// unsafe. (Core.ScriptedAlarms-009)
|
||||
_scratchByAlarmId.Clear();
|
||||
// Dispose every compiled-predicate ALC from the prior generation BEFORE we
|
||||
// recompile this one. Skipping this is what made Core.Scripting-008 a
|
||||
// no-op in production. (Core.Scripting-016)
|
||||
_compileCache.Clear();
|
||||
|
||||
var compileFailures = new List<string>();
|
||||
foreach (var def in definitions)
|
||||
@@ -122,7 +195,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
||||
continue;
|
||||
}
|
||||
|
||||
var evaluator = ScriptEvaluator<AlarmPredicateContext, bool>.Compile(def.PredicateScriptSource);
|
||||
// Route through CompiledScriptCache so the emitted assembly's
|
||||
// collectible ALC participates in publish-replace cleanup.
|
||||
// (Core.Scripting-016)
|
||||
var evaluator = _compileCache.GetOrCompile(def.PredicateScriptSource);
|
||||
var timed = new TimedScriptEvaluator<AlarmPredicateContext, bool>(evaluator, _scriptTimeout);
|
||||
var logger = _loggerFactory.Create(def.AlarmId);
|
||||
|
||||
@@ -354,7 +430,13 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
||||
AlarmState state, AlarmConditionState seed, DateTime nowUtc, CancellationToken ct,
|
||||
List<ScriptedAlarmEvent>? pendingEmissions = null)
|
||||
{
|
||||
var inputs = BuildReadCache(state.Inputs);
|
||||
// Look up (or lazily allocate) the per-alarm scratch and refill its read cache
|
||||
// in place. The dictionary + context survive across evaluations so the hot path
|
||||
// no longer allocates per upstream tag change. (Core.ScriptedAlarms-009)
|
||||
var scratch = _scratchByAlarmId.GetOrAdd(
|
||||
state.Definition.AlarmId,
|
||||
_ => new AlarmScratch(state.Inputs, state.Logger, _clock));
|
||||
RefillReadCache(scratch.ReadCache, state.Inputs);
|
||||
|
||||
// Cold-start guard — skip the predicate when any referenced upstream tag has no
|
||||
// cached value yet (the upstream subscription hasn't delivered its first push).
|
||||
@@ -362,9 +444,9 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
||||
// every tick until the cache fills, spamming the log with identical stack traces.
|
||||
// Bad quality is treated the same: the input isn't available at the predicate's
|
||||
// expected type, so the only defensible move is to hold the prior condition state.
|
||||
if (!AreInputsReady(inputs)) return seed;
|
||||
if (!AreInputsReady(scratch.ReadCache)) return seed;
|
||||
|
||||
var context = new AlarmPredicateContext(inputs, state.Logger, _clock);
|
||||
var context = scratch.Context;
|
||||
|
||||
bool predicateTrue;
|
||||
try
|
||||
@@ -399,12 +481,20 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
||||
return result.State;
|
||||
}
|
||||
|
||||
private IReadOnlyDictionary<string, DataValueSnapshot> BuildReadCache(IReadOnlySet<string> inputs)
|
||||
/// <summary>
|
||||
/// Refill <paramref name="cache"/> in place from <c>_valueCache</c>, falling
|
||||
/// back to a synchronous <c>ITagUpstreamSource.ReadTag</c> for paths whose
|
||||
/// first upstream push hasn't arrived yet. The dictionary is cleared and
|
||||
/// repopulated under <c>_evalGate</c> so no concurrent reader can observe
|
||||
/// a partial state. Replaces the old <c>BuildReadCache</c> which allocated a
|
||||
/// fresh dictionary every call (Core.ScriptedAlarms-009).
|
||||
/// </summary>
|
||||
private void RefillReadCache(
|
||||
Dictionary<string, DataValueSnapshot> cache, IReadOnlySet<string> inputs)
|
||||
{
|
||||
var d = new Dictionary<string, DataValueSnapshot>(StringComparer.Ordinal);
|
||||
cache.Clear();
|
||||
foreach (var p in inputs)
|
||||
d[p] = _valueCache.TryGetValue(p, out var v) ? v : _upstream.ReadTag(p);
|
||||
return d;
|
||||
cache[p] = _valueCache.TryGetValue(p, out var v) ? v : _upstream.ReadTag(p);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
@@ -596,12 +686,24 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
||||
}
|
||||
}
|
||||
|
||||
// Do NOT clear _alarms here: Timer.Dispose() does not wait for in-flight callbacks,
|
||||
// so a ShelvingCheckAsync or ReevaluateAsync can still be running inside _evalGate.
|
||||
// Those paths now re-check _disposed after acquiring the gate and bail out safely.
|
||||
// Clearing _alarms outside the gate would race concurrent reads and is unnecessary
|
||||
// (the whole object is being discarded). (Core.ScriptedAlarms-005)
|
||||
// Safe to clear here: the Task.WhenAll drain above guaranteed no
|
||||
// ReevaluateAsync / ShelvingCheckAsync is mid-flight, and _disposed=true
|
||||
// prevents new background work from being queued (OnUpstreamChange bails on
|
||||
// line 334). Pre-Core.Scripting-016 the comment said "Do NOT clear _alarms",
|
||||
// but that was when the engine called ScriptEvaluator.Compile directly and
|
||||
// held the script ALCs through _alarms→AlarmState→TimedScriptEvaluator
|
||||
// forever — leaving them rooted defeated the -008 collectible-ALC unload.
|
||||
// Clearing now drops the delegate references so the cache's Dispose call
|
||||
// below can actually unload the emitted assemblies. (Core.ScriptedAlarms-005
|
||||
// re-evaluated under -016.)
|
||||
_alarms.Clear();
|
||||
_alarmsReferencing.Clear();
|
||||
_scratchByAlarmId.Clear();
|
||||
// Dispose every compiled-predicate ALC so the engine's shutdown actually
|
||||
// releases the emitted assemblies. The drain above ensures no evaluator is
|
||||
// mid-call; CompiledScriptCache.Dispose internally guards against use-after-
|
||||
// dispose. (Core.Scripting-016)
|
||||
_compileCache.Dispose();
|
||||
}
|
||||
|
||||
private sealed record AlarmState(
|
||||
@@ -611,6 +713,37 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
||||
IReadOnlyList<string> TemplateTokens,
|
||||
ILogger Logger,
|
||||
AlarmConditionState Condition);
|
||||
|
||||
/// <summary>
|
||||
/// Per-alarm reusable evaluation scratch. The <see cref="ReadCache"/> dictionary
|
||||
/// is the same instance across every evaluation of the owning alarm — it is
|
||||
/// cleared and refilled in <see cref="ScriptedAlarmEngine.RefillReadCache"/> on
|
||||
/// each call. <see cref="Context"/> wraps that dictionary by reference, so a
|
||||
/// refilled <see cref="ReadCache"/> is what the predicate's
|
||||
/// <c>ctx.GetTag(path)</c> calls observe. (Core.ScriptedAlarms-009)
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Reuse is safe because <see cref="ScriptedAlarmEngine"/> serialises every
|
||||
/// evaluation under <c>_evalGate</c>: two threads can never observe the same
|
||||
/// scratch in a half-refilled state.
|
||||
/// </remarks>
|
||||
private sealed class AlarmScratch
|
||||
{
|
||||
public Dictionary<string, DataValueSnapshot> ReadCache { get; }
|
||||
public AlarmPredicateContext Context { get; }
|
||||
|
||||
public AlarmScratch(IReadOnlySet<string> inputs, ILogger logger, Func<DateTime> clock)
|
||||
{
|
||||
// Pre-size to the expected input count so the first refill doesn't pay the
|
||||
// dictionary-grow cost. The dictionary auto-grows if Inputs changes (it
|
||||
// cannot under the current contract — Inputs is fixed at LoadAsync — but
|
||||
// pre-sizing is defensive against future changes).
|
||||
ReadCache = new Dictionary<string, DataValueSnapshot>(inputs.Count, StringComparer.Ordinal);
|
||||
// Context holds the read cache by reference. Refilling the dictionary
|
||||
// updates what the context (and the script) observes.
|
||||
Context = new AlarmPredicateContext(ReadCache, logger, clock);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
|
||||
@@ -30,11 +30,20 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
|
||||
/// bounded by config DB (typically low thousands). If that changes in v3, add an
|
||||
/// LRU eviction policy — the API stays the same.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Lifecycle:</b> compiled scripts hold a collectible
|
||||
/// <see cref="System.Runtime.Loader.AssemblyLoadContext"/> per evaluator
|
||||
/// (Core.Scripting-008 fix). <see cref="Clear"/> disposes every materialised
|
||||
/// evaluator before dropping its dictionary entry so the emitted assemblies are
|
||||
/// eligible for GC immediately after a publish. <see cref="Dispose"/> drops the
|
||||
/// cache itself for graceful server shutdown.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public sealed class CompiledScriptCache<TContext, TResult>
|
||||
public sealed class CompiledScriptCache<TContext, TResult> : IDisposable
|
||||
where TContext : ScriptContext
|
||||
{
|
||||
private readonly ConcurrentDictionary<string, Lazy<ScriptEvaluator<TContext, TResult>>> _cache = new();
|
||||
private bool _disposed;
|
||||
|
||||
/// <summary>
|
||||
/// Return the compiled evaluator for <paramref name="scriptSource"/>, compiling
|
||||
@@ -46,6 +55,7 @@ public sealed class CompiledScriptCache<TContext, TResult>
|
||||
public ScriptEvaluator<TContext, TResult> GetOrCompile(string scriptSource)
|
||||
{
|
||||
if (scriptSource is null) throw new ArgumentNullException(nameof(scriptSource));
|
||||
if (_disposed) throw new ObjectDisposedException(nameof(CompiledScriptCache<TContext, TResult>));
|
||||
|
||||
var key = HashSource(scriptSource);
|
||||
var lazy = _cache.GetOrAdd(key, _ => new Lazy<ScriptEvaluator<TContext, TResult>>(
|
||||
@@ -72,13 +82,71 @@ public sealed class CompiledScriptCache<TContext, TResult>
|
||||
/// <summary>Current entry count. Exposed for Admin UI diagnostics / tests.</summary>
|
||||
public int Count => _cache.Count;
|
||||
|
||||
/// <summary>Drop every cached compile. Used on config generation publish + tests.</summary>
|
||||
public void Clear() => _cache.Clear();
|
||||
/// <summary>
|
||||
/// Drop every cached compile. Used on config generation publish + tests.
|
||||
/// Disposes each materialised evaluator before removing it so its collectible
|
||||
/// <see cref="System.Runtime.Loader.AssemblyLoadContext"/> unloads and the
|
||||
/// emitted script assembly becomes eligible for GC (Core.Scripting-008).
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Safe to call after <see cref="Dispose"/> — the operation is idempotent.
|
||||
/// <see cref="Dispose"/> sets <c>_disposed = true</c> before invoking this
|
||||
/// method (so callers see the post-Dispose guard on <see cref="GetOrCompile"/>),
|
||||
/// but this method itself MUST run to completion so the Dispose-triggered
|
||||
/// drain actually unloads every materialised evaluator's ALC. (Core.Scripting-016
|
||||
/// uncovered this — a previous Clear-aborts-when-disposed guard silently
|
||||
/// skipped the entire drain on Dispose, leaving emitted assemblies rooted.)
|
||||
/// </remarks>
|
||||
public void Clear()
|
||||
{
|
||||
// Snapshot (key, value) pairs and remove with the value-scoped
|
||||
// TryRemove(KeyValuePair<,>) overload — same shape as the
|
||||
// Core.Scripting-006 fix in GetOrCompile's catch block. A concurrent
|
||||
// GetOrCompile re-add that hashes to the same key between our snapshot
|
||||
// and the TryRemove inserts a *different* Lazy reference; the value-
|
||||
// scoped removal sees the mismatch and leaves the fresh entry intact
|
||||
// (instead of evicting + disposing it while the concurrent caller
|
||||
// still holds it). The fresh evaluator and its ALC stay live for the
|
||||
// concurrent caller. (Core.Scripting-014.)
|
||||
foreach (var entry in _cache.ToArray())
|
||||
{
|
||||
if (_cache.TryRemove(entry))
|
||||
DisposeLazyIfMaterialised(entry.Value);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>True when the exact source has been compiled at least once + is still cached.</summary>
|
||||
public bool Contains(string scriptSource)
|
||||
=> _cache.ContainsKey(HashSource(scriptSource));
|
||||
|
||||
/// <summary>
|
||||
/// Drop the cache and dispose every materialised evaluator. After disposal
|
||||
/// <see cref="GetOrCompile"/> throws <see cref="ObjectDisposedException"/>.
|
||||
/// </summary>
|
||||
public void Dispose()
|
||||
{
|
||||
if (_disposed) return;
|
||||
_disposed = true;
|
||||
Clear();
|
||||
}
|
||||
|
||||
private static void DisposeLazyIfMaterialised(Lazy<ScriptEvaluator<TContext, TResult>> lazy)
|
||||
{
|
||||
// IsValueCreated is false for a faulted Lazy too, so the catch in GetOrCompile
|
||||
// has already taken care of failed compiles — there's no evaluator to dispose.
|
||||
if (!lazy.IsValueCreated) return;
|
||||
try
|
||||
{
|
||||
lazy.Value.Dispose();
|
||||
}
|
||||
catch
|
||||
{
|
||||
// Dispose is best-effort here: an evaluator disposal failure would leak its
|
||||
// ALC but mustn't prevent the rest of the cache from clearing. The ALC
|
||||
// unload itself is exception-free in practice; this is defensive.
|
||||
}
|
||||
}
|
||||
|
||||
private static string HashSource(string source)
|
||||
{
|
||||
var bytes = Encoding.UTF8.GetBytes(source);
|
||||
|
||||
@@ -72,6 +72,11 @@ public static class ForbiddenTypeAnalyzer
|
||||
// a Task fan-out outlives the evaluation timeout entirely
|
||||
// (Core.Scripting-003).
|
||||
"System.Runtime.InteropServices",
|
||||
"System.Runtime.Loader", // AssemblyLoadContext + AssemblyDependencyResolver —
|
||||
// arbitrary DLL load into the host process
|
||||
// (Core.Scripting-012). Namespace-prefix rather than
|
||||
// type-granular so future BCL additions to this
|
||||
// namespace are denied by default.
|
||||
"Microsoft.Win32", // registry
|
||||
];
|
||||
|
||||
@@ -113,6 +118,24 @@ public static class ForbiddenTypeAnalyzer
|
||||
// target it without blocking those legitimate types. Denied type-granularly here.
|
||||
// (Core.Scripting-010.)
|
||||
"System.Threading.Thread",
|
||||
// Core.Scripting-012 — broadening the references list to the BCL trusted-platform-
|
||||
// assemblies set (Core.Scripting-008 follow-up) re-exposed two background-work
|
||||
// vectors the original deny-list missed. Both live in System.Threading (shared
|
||||
// with allowed sync primitives like CancellationToken / SemaphoreSlim), so they
|
||||
// must be denied type-granularly:
|
||||
//
|
||||
// System.Threading.ThreadPool — QueueUserWorkItem / UnsafeQueueUserWorkItem
|
||||
// re-introduce the background-fanout threat
|
||||
// Core.Scripting-003 closed against
|
||||
// System.Threading.Tasks.
|
||||
// System.Threading.Timer — Timer(callback, …) schedules unbounded work
|
||||
// that outlives the per-evaluation timeout.
|
||||
//
|
||||
// System.Runtime.Loader.AssemblyLoadContext is also covered, but at the namespace-
|
||||
// prefix level above (System.Runtime.Loader) so future BCL additions to that
|
||||
// namespace are denied by default.
|
||||
"System.Threading.ThreadPool",
|
||||
"System.Threading.Timer",
|
||||
];
|
||||
|
||||
/// <summary>
|
||||
|
||||
@@ -1,75 +1,420 @@
|
||||
using Microsoft.CodeAnalysis.CSharp.Scripting;
|
||||
using Microsoft.CodeAnalysis.Scripting;
|
||||
using System.Reflection;
|
||||
using System.Runtime.Loader;
|
||||
using System.Text;
|
||||
using Microsoft.CodeAnalysis;
|
||||
using Microsoft.CodeAnalysis.CSharp;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
|
||||
|
||||
/// <summary>
|
||||
/// Compiles + runs user scripts against a <see cref="ScriptContext"/> subclass. Core
|
||||
/// evaluator — no caching, no timeout, no logging side-effects yet (those land in
|
||||
/// Stream A.3, A.4, A.5 respectively). Stream B + C wrap this with the dependency
|
||||
/// scheduler + alarm state machine.
|
||||
/// evaluator — no caching, no timeout, no logging side-effects (those land in
|
||||
/// <see cref="CompiledScriptCache{TContext, TResult}"/>,
|
||||
/// <see cref="TimedScriptEvaluator{TContext, TResult}"/>, and
|
||||
/// <see cref="ScriptLogCompanionSink"/> respectively).
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// Scripts are compiled against <see cref="ScriptGlobals{TContext}"/> so the
|
||||
/// context member is named <c>ctx</c> in the script, matching the
|
||||
/// <see cref="DependencyExtractor"/>'s walker and the Admin UI type stub.
|
||||
/// Scripts are wrapped in a synthesized <c>CompiledScript.Run(globals)</c> method
|
||||
/// and compiled via <see cref="CSharpCompilation"/> into a regular .NET assembly
|
||||
/// that is loaded into a <b>collectible</b>
|
||||
/// <see cref="AssemblyLoadContext"/>. The collectible ALC is the fix for
|
||||
/// Core.Scripting-008: per-publish recompile accretion was previously unbounded
|
||||
/// because Roslyn's <c>CSharpScript.CreateDelegate</c> emits into the default ALC
|
||||
/// (non-collectible); now <see cref="Dispose"/> unloads the entire ALC and the
|
||||
/// emitted assembly becomes eligible for GC.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// Compile pipeline is a three-step gate: (1) Roslyn compile — catches syntax
|
||||
/// errors + type-resolution failures, throws <see cref="CompilationErrorException"/>;
|
||||
/// (2) <see cref="ForbiddenTypeAnalyzer"/> runs against the semantic model —
|
||||
/// catches sandbox escapes that slipped past reference restrictions due to .NET's
|
||||
/// type forwarding, throws <see cref="ScriptSandboxViolationException"/>; (3)
|
||||
/// delegate creation — throws at this layer only for internal Roslyn bugs, not
|
||||
/// user error.
|
||||
/// Compile pipeline is a three-step gate, unchanged in intent from the legacy
|
||||
/// <c>CSharpScript</c> path: (1) Roslyn parse + compile against the
|
||||
/// <see cref="ScriptSandbox"/> allow-list — catches syntax errors, unresolved
|
||||
/// types (the sandbox's first line of defense), and most type-resolution
|
||||
/// failures, throwing <see cref="CompilationErrorException"/>; (2)
|
||||
/// <see cref="ForbiddenTypeAnalyzer"/> runs against the semantic model — catches
|
||||
/// sandbox escapes that slipped past reference restrictions due to .NET's type
|
||||
/// forwarding, throwing <see cref="ScriptSandboxViolationException"/>; (3) emit
|
||||
/// to an in-memory PE stream + load into the collectible ALC — throws at this
|
||||
/// layer only for internal Roslyn bugs, not user error.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// Runtime exceptions thrown from user code propagate unwrapped. The virtual-tag
|
||||
/// engine (Stream B) catches them per-tag + maps to <c>BadInternalError</c>
|
||||
/// quality per Phase 7 decision #11 — this layer doesn't swallow anything so
|
||||
/// tests can assert on the original exception type.
|
||||
/// engine catches them per-tag and maps to <c>BadInternalError</c> quality
|
||||
/// per Phase 7 decision #11; this layer doesn't swallow anything so tests can
|
||||
/// assert on the original exception type.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// Scripts are expected to be statement bodies that end with an explicit
|
||||
/// <c>return …;</c> — the wrapper provides only the surrounding method body, so
|
||||
/// the script's final-expression-yields-result behavior of legacy
|
||||
/// <c>CSharpScript</c> is replaced by ordinary C# method semantics. Every script
|
||||
/// in the existing test corpus already uses explicit <c>return</c>; this is a
|
||||
/// documented authoring convention.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public sealed class ScriptEvaluator<TContext, TResult>
|
||||
public sealed class ScriptEvaluator<TContext, TResult> : IDisposable
|
||||
where TContext : ScriptContext
|
||||
{
|
||||
private readonly ScriptRunner<TResult> _runner;
|
||||
private readonly ScriptAssemblyLoadContext _alc;
|
||||
private readonly Func<ScriptGlobals<TContext>, TResult> _func;
|
||||
private bool _disposed;
|
||||
|
||||
private ScriptEvaluator(ScriptRunner<TResult> runner)
|
||||
private ScriptEvaluator(ScriptAssemblyLoadContext alc, Func<ScriptGlobals<TContext>, TResult> func)
|
||||
{
|
||||
_runner = runner;
|
||||
_alc = alc;
|
||||
_func = func;
|
||||
}
|
||||
|
||||
public static ScriptEvaluator<TContext, TResult> Compile(string scriptSource)
|
||||
{
|
||||
if (scriptSource is null) throw new ArgumentNullException(nameof(scriptSource));
|
||||
|
||||
var options = ScriptSandbox.Build(typeof(TContext));
|
||||
var script = CSharpScript.Create<TResult>(
|
||||
code: scriptSource,
|
||||
options: options,
|
||||
globalsType: typeof(ScriptGlobals<TContext>));
|
||||
var sandbox = ScriptSandbox.Build(typeof(TContext));
|
||||
|
||||
// Step 1 — Roslyn compile. Throws CompilationErrorException on syntax / type errors.
|
||||
var diagnostics = script.Compile();
|
||||
// Step 1 — synthesize a wrapper class around the script body and parse it. The
|
||||
// wrapper's `Run` method is what we invoke at runtime; the user's source is
|
||||
// pasted in as its body so explicit `return` semantics apply.
|
||||
var wrapperSource = BuildWrapperSource(scriptSource, sandbox.Imports);
|
||||
var syntaxTree = CSharpSyntaxTree.ParseText(wrapperSource);
|
||||
|
||||
// Step 2 — forbidden-type semantic analysis. Defense-in-depth against reference-list
|
||||
// leaks due to type forwarding.
|
||||
var rejections = ForbiddenTypeAnalyzer.Analyze(script.GetCompilation());
|
||||
// Step 1a — defend against wrapper-source injection (Core.Scripting-013).
|
||||
// A script body of `return 0; } public static int Evil() { return 0;` would
|
||||
// close the synthesized `Run` method early, declare a sibling `Evil` method
|
||||
// inside the synthesized `CompiledScript` class, and leave the wrapper's
|
||||
// trailing `}` balanced. ForbiddenTypeAnalyzer still walks the injected
|
||||
// members so this isn't a direct sandbox escape, but it relaxes the
|
||||
// documented "method body" authoring contract and widens the analyzer's
|
||||
// surface. Reject by requiring that the parsed `CompiledScript` class
|
||||
// contains exactly one member declaration (the `Run` method).
|
||||
EnforceSingleRunMember(syntaxTree);
|
||||
|
||||
// Step 2 — Roslyn compile against the sandbox allow-list. Anything not in the
|
||||
// references set is unresolved and produces a compiler error.
|
||||
var assemblyName = "ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled." +
|
||||
Guid.NewGuid().ToString("N");
|
||||
var compileOptions = new CSharpCompilationOptions(
|
||||
OutputKind.DynamicallyLinkedLibrary,
|
||||
optimizationLevel: OptimizationLevel.Release,
|
||||
allowUnsafe: false,
|
||||
// Don't generate XML doc warnings for the synthesized wrapper.
|
||||
warningLevel: 4,
|
||||
nullableContextOptions: NullableContextOptions.Enable);
|
||||
var compilation = CSharpCompilation.Create(
|
||||
assemblyName,
|
||||
syntaxTrees: new[] { syntaxTree },
|
||||
references: sandbox.References,
|
||||
options: compileOptions);
|
||||
|
||||
var compileDiagnostics = compilation.GetDiagnostics();
|
||||
var compileErrors = compileDiagnostics
|
||||
.Where(d => d.Severity == DiagnosticSeverity.Error)
|
||||
.ToArray();
|
||||
if (compileErrors.Length > 0)
|
||||
throw new CompilationErrorException(compileErrors);
|
||||
|
||||
// Step 3 — forbidden-type semantic analysis. Defense-in-depth against
|
||||
// reference-list leaks due to type forwarding.
|
||||
var rejections = ForbiddenTypeAnalyzer.Analyze(compilation);
|
||||
if (rejections.Count > 0)
|
||||
throw new ScriptSandboxViolationException(rejections);
|
||||
|
||||
// Step 3 — materialize the callable delegate.
|
||||
var runner = script.CreateDelegate();
|
||||
return new ScriptEvaluator<TContext, TResult>(runner);
|
||||
// Step 4 — emit to an in-memory PE stream and load into a collectible ALC.
|
||||
using var peStream = new MemoryStream();
|
||||
var emitResult = compilation.Emit(peStream);
|
||||
if (!emitResult.Success)
|
||||
{
|
||||
var emitErrors = emitResult.Diagnostics
|
||||
.Where(d => d.Severity == DiagnosticSeverity.Error)
|
||||
.ToArray();
|
||||
throw new CompilationErrorException(emitErrors);
|
||||
}
|
||||
|
||||
peStream.Position = 0;
|
||||
var alc = new ScriptAssemblyLoadContext(assemblyName);
|
||||
Assembly assembly;
|
||||
try
|
||||
{
|
||||
assembly = alc.LoadFromStream(peStream);
|
||||
}
|
||||
catch
|
||||
{
|
||||
// Failed to load — drop the ALC so we don't leak a half-initialised one.
|
||||
alc.Unload();
|
||||
throw;
|
||||
}
|
||||
|
||||
// Step 5 — resolve the wrapper's Run method and bind a typed delegate. The
|
||||
// wrapper source above puts the type in this exact namespace + class — keep the
|
||||
// names in sync with BuildWrapperSource.
|
||||
Func<ScriptGlobals<TContext>, TResult> func;
|
||||
try
|
||||
{
|
||||
var wrapperType = assembly.GetType(
|
||||
"ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled.CompiledScript",
|
||||
throwOnError: true)!;
|
||||
var runMethod = wrapperType.GetMethod(
|
||||
"Run",
|
||||
BindingFlags.Public | BindingFlags.Static)
|
||||
?? throw new InvalidOperationException(
|
||||
"Synthesized wrapper is missing the public static Run method.");
|
||||
func = (Func<ScriptGlobals<TContext>, TResult>)Delegate.CreateDelegate(
|
||||
typeof(Func<ScriptGlobals<TContext>, TResult>), runMethod);
|
||||
}
|
||||
catch
|
||||
{
|
||||
alc.Unload();
|
||||
throw;
|
||||
}
|
||||
|
||||
return new ScriptEvaluator<TContext, TResult>(alc, func);
|
||||
}
|
||||
|
||||
/// <summary>Run against an already-constructed context.</summary>
|
||||
public Task<TResult> RunAsync(TContext context, CancellationToken ct = default)
|
||||
{
|
||||
if (_disposed) throw new ObjectDisposedException(nameof(ScriptEvaluator<TContext, TResult>));
|
||||
if (context is null) throw new ArgumentNullException(nameof(context));
|
||||
ct.ThrowIfCancellationRequested();
|
||||
var globals = new ScriptGlobals<TContext> { ctx = context };
|
||||
return _runner(globals, ct);
|
||||
// The user's script is synchronous (Roslyn emits a static method that returns
|
||||
// TResult directly). We surface a Task<TResult> only to keep the existing
|
||||
// RunAsync contract consumers depend on. TimedScriptEvaluator wraps this in
|
||||
// Task.Run so a long-running script still honours its wall-clock budget.
|
||||
var result = _func(globals);
|
||||
return Task.FromResult(result);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Unload the collectible <see cref="AssemblyLoadContext"/> that holds the emitted
|
||||
/// script assembly so the runtime can reclaim it. After disposal the evaluator can
|
||||
/// no longer be invoked — call <see cref="ScriptEvaluator{TContext, TResult}.Compile"/>
|
||||
/// again for a fresh one. Dispose is idempotent.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Unload is <i>eligible-for-collection</i>, not synchronous: the assembly is
|
||||
/// reclaimed when the GC determines no live references remain. The cache disposes
|
||||
/// evaluators in <see cref="CompiledScriptCache{TContext, TResult}.Clear"/> so a
|
||||
/// config-generation publish releases the prior generation in one sweep; the
|
||||
/// reclaim then races with the next GC cycle. Tests verify the reclaim via
|
||||
/// <see cref="WeakReference"/> + <see cref="GC.Collect()"/>.
|
||||
/// </remarks>
|
||||
public void Dispose()
|
||||
{
|
||||
if (_disposed) return;
|
||||
_disposed = true;
|
||||
_alc.Unload();
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Reject scripts whose source contains brace-balanced injections that would
|
||||
/// declare sibling members alongside the synthesized <c>CompiledScript.Run</c>
|
||||
/// method. The expected shape is a single <c>CompiledScript</c> class with
|
||||
/// exactly one member — the <c>Run</c> method. Anything else (a sibling
|
||||
/// method, nested class, additional class in the namespace, free-floating
|
||||
/// top-level statement) means the user source closed the synthesized braces
|
||||
/// early and injected its own declarations. (Core.Scripting-013.)
|
||||
/// </summary>
|
||||
private static void EnforceSingleRunMember(SyntaxTree syntaxTree)
|
||||
{
|
||||
var root = syntaxTree.GetCompilationUnitRoot();
|
||||
|
||||
// The compilation unit must hold exactly one type declaration — our
|
||||
// CompiledScript. Anything else means the user closed the synthesized
|
||||
// namespace or class early and injected another type declaration.
|
||||
var typeMembers = root.DescendantNodes()
|
||||
.OfType<Microsoft.CodeAnalysis.CSharp.Syntax.BaseTypeDeclarationSyntax>()
|
||||
.ToArray();
|
||||
if (typeMembers.Length != 1 || typeMembers[0].Identifier.ValueText != "CompiledScript")
|
||||
{
|
||||
throw new CompilationErrorException(new[]
|
||||
{
|
||||
Diagnostic.Create(
|
||||
new DiagnosticDescriptor(
|
||||
id: "LMX001",
|
||||
title: "Script wrapper injection",
|
||||
messageFormat: "Script source must be a statement body. Declarations of " +
|
||||
"additional types alongside the wrapper's CompiledScript class " +
|
||||
"are not allowed; check for unbalanced braces or stray " +
|
||||
"`class` / `namespace` keywords in the source. (Core.Scripting-013)",
|
||||
category: "Sandbox",
|
||||
defaultSeverity: DiagnosticSeverity.Error,
|
||||
isEnabledByDefault: true),
|
||||
typeMembers.Length > 1 ? typeMembers[1].Identifier.GetLocation() : Location.None),
|
||||
});
|
||||
}
|
||||
|
||||
// The CompiledScript class itself must contain exactly one member — the Run
|
||||
// method. A second member means the user closed Run early and started a sibling.
|
||||
var classMembers = ((Microsoft.CodeAnalysis.CSharp.Syntax.ClassDeclarationSyntax)typeMembers[0]).Members;
|
||||
if (classMembers.Count != 1 ||
|
||||
classMembers[0] is not Microsoft.CodeAnalysis.CSharp.Syntax.MethodDeclarationSyntax m ||
|
||||
m.Identifier.ValueText != "Run")
|
||||
{
|
||||
throw new CompilationErrorException(new[]
|
||||
{
|
||||
Diagnostic.Create(
|
||||
new DiagnosticDescriptor(
|
||||
id: "LMX002",
|
||||
title: "Script wrapper injection",
|
||||
messageFormat: "Script source must be a statement body. Declarations of " +
|
||||
"sibling members (methods, properties, nested types) alongside " +
|
||||
"the wrapper's Run method are not allowed; check for unbalanced " +
|
||||
"braces or a stray `}` followed by a `public`/`private`/`static` " +
|
||||
"declaration in the source. (Core.Scripting-013)",
|
||||
category: "Sandbox",
|
||||
defaultSeverity: DiagnosticSeverity.Error,
|
||||
isEnabledByDefault: true),
|
||||
classMembers.Count > 1 ? classMembers[1].GetLocation() : Location.None),
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Synthesize the source we hand to Roslyn. The user's script body is pasted
|
||||
/// verbatim inside <c>CompiledScript.Run</c>; the <c>using</c> directives mirror
|
||||
/// <see cref="ScriptSandbox"/>'s imports so scripts can write <c>Math.Abs</c>
|
||||
/// instead of <c>System.Math.Abs</c>.
|
||||
/// </summary>
|
||||
private static string BuildWrapperSource(string userSource, IReadOnlyList<string> imports)
|
||||
{
|
||||
var sb = new StringBuilder();
|
||||
foreach (var import in imports)
|
||||
sb.Append("using ").Append(import).AppendLine(";");
|
||||
sb.AppendLine();
|
||||
sb.AppendLine("namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled;");
|
||||
sb.AppendLine();
|
||||
sb.AppendLine("public static class CompiledScript");
|
||||
sb.AppendLine("{");
|
||||
sb.Append(" public static ").Append(ToCSharpTypeName(typeof(TResult)))
|
||||
.Append(" Run(").Append(ToCSharpTypeName(typeof(ScriptGlobals<TContext>)))
|
||||
.AppendLine(" globals)");
|
||||
sb.AppendLine(" {");
|
||||
sb.AppendLine(" var ctx = globals.ctx;");
|
||||
// User source ends with `return X;` per the authoring convention; we paste it
|
||||
// verbatim. The leading newline keeps Roslyn diagnostics' line numbers usable
|
||||
// by operators (errors point at the user's source line, not the wrapper).
|
||||
sb.AppendLine("#line 1");
|
||||
sb.AppendLine(userSource);
|
||||
sb.AppendLine(" }");
|
||||
sb.AppendLine("}");
|
||||
return sb.ToString();
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Convert a runtime <see cref="Type"/> to a C# type-name string suitable for
|
||||
/// emitting into Roslyn source. Uses <c>global::</c>-qualified FQNs to avoid
|
||||
/// accidental capture by the wrapper's <c>using</c> directives, handles nested
|
||||
/// types (<c>+</c> → <c>.</c>), and recurses for generic arguments so the
|
||||
/// <c>ScriptGlobals<TContext></c> parameter is emitted correctly.
|
||||
/// </summary>
|
||||
private static string ToCSharpTypeName(Type t)
|
||||
{
|
||||
if (t == typeof(void)) return "void";
|
||||
// Primitive aliases keep the synthesized source readable when diagnostic
|
||||
// logging dumps it; functionally identical to the FQN form.
|
||||
if (t == typeof(bool)) return "bool";
|
||||
if (t == typeof(byte)) return "byte";
|
||||
if (t == typeof(sbyte)) return "sbyte";
|
||||
if (t == typeof(short)) return "short";
|
||||
if (t == typeof(ushort)) return "ushort";
|
||||
if (t == typeof(int)) return "int";
|
||||
if (t == typeof(uint)) return "uint";
|
||||
if (t == typeof(long)) return "long";
|
||||
if (t == typeof(ulong)) return "ulong";
|
||||
if (t == typeof(float)) return "float";
|
||||
if (t == typeof(double)) return "double";
|
||||
if (t == typeof(decimal)) return "decimal";
|
||||
if (t == typeof(string)) return "string";
|
||||
if (t == typeof(object)) return "object";
|
||||
|
||||
if (Nullable.GetUnderlyingType(t) is { } inner)
|
||||
return ToCSharpTypeName(inner) + "?";
|
||||
|
||||
if (t.IsArray)
|
||||
return ToCSharpTypeName(t.GetElementType()!) + "[]";
|
||||
|
||||
if (t.IsGenericType)
|
||||
{
|
||||
// Walk the FullName by '.' segments (after '+ → .'). For each segment
|
||||
// ending with `Name\`N`, consume N generic arguments and emit them as
|
||||
// `<…>` on that segment. Nested generic-in-generic (Outer<T>.Inner<U>)
|
||||
// emits as `global::Ns.Outer<T>.Inner<U>` — valid C#. The pre-fix code
|
||||
// used `IndexOf('`')` to find the FIRST backtick and truncated the
|
||||
// entire name there, silently dropping the rest of the nested-generic
|
||||
// closed args. (Core.Scripting-015.)
|
||||
var rawName = t.GetGenericTypeDefinition().FullName!.Replace('+', '.');
|
||||
var allArgs = t.GetGenericArguments();
|
||||
var segments = rawName.Split('.');
|
||||
var argIndex = 0;
|
||||
var sb = new StringBuilder("global::");
|
||||
for (int i = 0; i < segments.Length; i++)
|
||||
{
|
||||
if (i > 0) sb.Append('.');
|
||||
var seg = segments[i];
|
||||
var backtick = seg.IndexOf('`');
|
||||
if (backtick >= 0)
|
||||
{
|
||||
var arity = int.Parse(seg.AsSpan(backtick + 1));
|
||||
sb.Append(seg, 0, backtick);
|
||||
sb.Append('<');
|
||||
for (int j = 0; j < arity; j++)
|
||||
{
|
||||
if (j > 0) sb.Append(", ");
|
||||
sb.Append(ToCSharpTypeName(allArgs[argIndex++]));
|
||||
}
|
||||
sb.Append('>');
|
||||
}
|
||||
else
|
||||
{
|
||||
sb.Append(seg);
|
||||
}
|
||||
}
|
||||
return sb.ToString();
|
||||
}
|
||||
|
||||
return "global::" + t.FullName!.Replace('+', '.');
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Collectible <see cref="AssemblyLoadContext"/> that hosts a single emitted script
|
||||
/// assembly. Created per <see cref="ScriptEvaluator{TContext, TResult}"/> instance so
|
||||
/// <see cref="AssemblyLoadContext.Unload"/> releases exactly that script. Resolves
|
||||
/// dependencies via the default ALC — script assemblies reference the BCL + the
|
||||
/// application's own types, all of which live in the default context.
|
||||
/// </summary>
|
||||
internal sealed class ScriptAssemblyLoadContext : AssemblyLoadContext
|
||||
{
|
||||
public ScriptAssemblyLoadContext(string name) : base(name, isCollectible: true)
|
||||
{
|
||||
}
|
||||
|
||||
protected override Assembly? Load(AssemblyName assemblyName) => null;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Thrown by <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> when Roslyn
|
||||
/// reports compile-time errors against the wrapper source. Mirrors the
|
||||
/// <c>Microsoft.CodeAnalysis.Scripting.CompilationErrorException</c> from the legacy
|
||||
/// <c>CSharpScript</c> path so callers (engines + the Admin test-harness) keep the
|
||||
/// same catch site after the Core.Scripting-008 rewrite.
|
||||
/// </summary>
|
||||
public sealed class CompilationErrorException : Exception
|
||||
{
|
||||
public IReadOnlyList<Diagnostic> Diagnostics { get; }
|
||||
|
||||
public CompilationErrorException(IReadOnlyList<Diagnostic> diagnostics)
|
||||
: base(BuildMessage(diagnostics))
|
||||
{
|
||||
Diagnostics = diagnostics;
|
||||
}
|
||||
|
||||
private static string BuildMessage(IReadOnlyList<Diagnostic> diagnostics)
|
||||
{
|
||||
if (diagnostics.Count == 0) return "Script compile failed.";
|
||||
// Operators see this — match the legacy Roslyn format ("(line,col): error CSxxxx:
|
||||
// message") so existing operator runbooks still match.
|
||||
var first = diagnostics[0];
|
||||
var rest = diagnostics.Count == 1 ? "" : $" (and {diagnostics.Count - 1} more)";
|
||||
return first.ToString() + rest;
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,11 +1,10 @@
|
||||
using Microsoft.CodeAnalysis.CSharp.Scripting;
|
||||
using Microsoft.CodeAnalysis.Scripting;
|
||||
using Microsoft.CodeAnalysis;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
|
||||
|
||||
/// <summary>
|
||||
/// Factory for the <see cref="ScriptOptions"/> every user script is compiled against.
|
||||
/// Factory for the compile-time sandbox every user script is built against.
|
||||
/// Implements Phase 7 plan decision #6 (read-only sandbox) by whitelisting only the
|
||||
/// assemblies + namespaces the script API needs; no <c>System.IO</c>, no
|
||||
/// <c>System.Net</c>, no <c>System.Diagnostics.Process</c>, no
|
||||
@@ -15,9 +14,12 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// Roslyn's default <see cref="ScriptOptions"/> references <c>mscorlib</c> /
|
||||
/// <c>System.Runtime</c> transitively which pulls in every type in the BCL — this
|
||||
/// class overrides that with an explicit minimal allow-list.
|
||||
/// Roslyn would otherwise pull in every type in the BCL transitively via
|
||||
/// <c>mscorlib</c> / <c>System.Runtime</c> — this class overrides that with an
|
||||
/// explicit minimal allow-list. The list is the same regardless of whether
|
||||
/// <see cref="ScriptEvaluator{TContext, TResult}"/> uses the legacy
|
||||
/// <c>CSharpScript</c> path or the collectible-<c>AssemblyLoadContext</c> path
|
||||
/// (Core.Scripting-008): both go through <see cref="Build"/>.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// Namespaces pre-imported so scripts don't have to write <c>using</c> clauses:
|
||||
@@ -35,29 +37,21 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
|
||||
public static class ScriptSandbox
|
||||
{
|
||||
/// <summary>
|
||||
/// Build the <see cref="ScriptOptions"/> used for every virtual-tag / alarm
|
||||
/// script. <paramref name="contextType"/> is the concrete
|
||||
/// <see cref="ScriptContext"/> subclass the globals will be of — the compiler
|
||||
/// uses its type to resolve <c>ctx.GetTag(...)</c> calls.
|
||||
/// Build the sandbox configuration used for every virtual-tag / alarm script.
|
||||
/// <paramref name="contextType"/> is the concrete <see cref="ScriptContext"/>
|
||||
/// subclass the script's <c>ctx</c> will be of — the compiler uses its assembly
|
||||
/// to resolve <c>ctx.GetTag(...)</c> calls.
|
||||
/// </summary>
|
||||
public static ScriptOptions Build(Type contextType)
|
||||
public static SandboxConfig Build(Type contextType)
|
||||
{
|
||||
if (contextType is null) throw new ArgumentNullException(nameof(contextType));
|
||||
if (!typeof(ScriptContext).IsAssignableFrom(contextType))
|
||||
throw new ArgumentException(
|
||||
$"Script context type must derive from {nameof(ScriptContext)}", nameof(contextType));
|
||||
|
||||
// Allow-listed assemblies — each explicitly chosen. Adding here is a
|
||||
// plan-level decision; do not expand casually. HashSet so adding the
|
||||
// contextType's assembly is idempotent when it happens to be Core.Scripting
|
||||
// already.
|
||||
var allowedAssemblies = new HashSet<System.Reflection.Assembly>
|
||||
// OtOpcUa-owned assemblies — pinned by typeof(...) so they survive a rename.
|
||||
var pinnedAssemblies = new HashSet<System.Reflection.Assembly>
|
||||
{
|
||||
// System.Private.CoreLib — primitives (int, double, bool, string, DateTime,
|
||||
// TimeSpan, Math, Convert, nullable<T>). Can't practically script without it.
|
||||
typeof(object).Assembly,
|
||||
// System.Linq — IEnumerable extensions (Where / Select / Sum / Average / etc.).
|
||||
typeof(System.Linq.Enumerable).Assembly,
|
||||
// Core.Abstractions — DataValueSnapshot + DriverDataType so scripts can name
|
||||
// the types they receive from ctx.GetTag.
|
||||
typeof(DataValueSnapshot).Assembly,
|
||||
@@ -72,7 +66,23 @@ public static class ScriptSandbox
|
||||
contextType.Assembly,
|
||||
};
|
||||
|
||||
var allowedImports = new[]
|
||||
// BCL references. We list the trusted-platform-assemblies set restricted to
|
||||
// System.* and netstandard so the synthesized wrapper can reference every BCL
|
||||
// type by FQN — including the ones we forbid (HttpClient, File, Process,
|
||||
// Registry, etc.). Letting those types resolve at compile is intentional: the
|
||||
// hard security gate is ForbiddenTypeAnalyzer in step 3 of the compile pipeline
|
||||
// (Core.Scripting-001 / -002 established the analyzer must be the sole gate
|
||||
// because type forwarding makes any references-list-only restriction porous).
|
||||
// The references list now serves only as scoping hygiene — out-of-band BCL
|
||||
// surface (operator-authored hosting helpers, third-party packages, app code)
|
||||
// is not on the list and stays unreachable.
|
||||
var references = new List<MetadataReference>();
|
||||
foreach (var asm in pinnedAssemblies)
|
||||
references.Add(MetadataReference.CreateFromFile(asm.Location));
|
||||
foreach (var path in EnumerateBclAssemblyPaths())
|
||||
references.Add(MetadataReference.CreateFromFile(path));
|
||||
|
||||
var imports = new[]
|
||||
{
|
||||
"System",
|
||||
"System.Linq",
|
||||
@@ -80,8 +90,56 @@ public static class ScriptSandbox
|
||||
"ZB.MOM.WW.OtOpcUa.Core.Scripting",
|
||||
};
|
||||
|
||||
return ScriptOptions.Default
|
||||
.WithReferences(allowedAssemblies)
|
||||
.WithImports(allowedImports);
|
||||
return new SandboxConfig(references, imports);
|
||||
}
|
||||
|
||||
private static IEnumerable<string> EnumerateBclAssemblyPaths()
|
||||
{
|
||||
// The .NET host advertises the resolved runtime-shared-framework + BCL DLL set
|
||||
// via the TRUSTED_PLATFORM_ASSEMBLIES AppContext data slot. This is what the
|
||||
// ALC fallback uses when resolving assemblies, so anything in here is already
|
||||
// loadable by the host process. We restrict to System.* and netstandard to keep
|
||||
// the script's reachable surface to the BCL — anything else (Microsoft.*,
|
||||
// application code, third-party packages happening to be in the runtime store)
|
||||
// would expand the analyzer's deny-list job unnecessarily.
|
||||
var raw = (string?)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES");
|
||||
if (string.IsNullOrEmpty(raw))
|
||||
yield break;
|
||||
|
||||
var separator = OperatingSystem.IsWindows() ? ';' : ':';
|
||||
foreach (var path in raw.Split(separator, StringSplitOptions.RemoveEmptyEntries))
|
||||
{
|
||||
var name = System.IO.Path.GetFileName(path);
|
||||
if (name.StartsWith("System.", StringComparison.Ordinal) ||
|
||||
string.Equals(name, "netstandard.dll", StringComparison.Ordinal) ||
|
||||
string.Equals(name, "mscorlib.dll", StringComparison.Ordinal) ||
|
||||
// Microsoft.Win32.Registry isn't a System.* DLL but the analyzer's
|
||||
// Microsoft.Win32 deny-list relies on the type being resolvable so it
|
||||
// can identify + reject it (Core.Scripting-001 / -002). Add the one
|
||||
// DLL we need rather than broadening to Microsoft.* (which would also
|
||||
// pull in compilers, build tooling, etc.).
|
||||
string.Equals(name, "Microsoft.Win32.Registry.dll", StringComparison.Ordinal))
|
||||
{
|
||||
yield return path;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Compile-time sandbox configuration. Returned by <see cref="ScriptSandbox.Build"/>;
|
||||
/// consumed by <see cref="ScriptEvaluator{TContext, TResult}"/>'s manual
|
||||
/// <c>CSharpCompilation</c> path.
|
||||
/// </summary>
|
||||
/// <param name="References">
|
||||
/// Metadata references (allow-listed assemblies) the script compilation is built
|
||||
/// against. Anything not in this set is unresolved at compile, which is the sandbox's
|
||||
/// first line of defense — <see cref="ForbiddenTypeAnalyzer"/> is the second.
|
||||
/// </param>
|
||||
/// <param name="Imports">
|
||||
/// Namespaces pre-imported into the wrapper compilation as <c>using</c> directives
|
||||
/// so scripts can write <c>Math.Abs</c> rather than <c>System.Math.Abs</c>.
|
||||
/// </param>
|
||||
public sealed record SandboxConfig(
|
||||
IReadOnlyList<MetadataReference> References,
|
||||
IReadOnlyList<string> Imports);
|
||||
|
||||
@@ -37,6 +37,21 @@ public sealed class VirtualTagEngine : IDisposable
|
||||
|
||||
private readonly DependencyGraph _graph = new();
|
||||
private readonly Dictionary<string, VirtualTagState> _tags = new(StringComparer.Ordinal);
|
||||
|
||||
/// <summary>
|
||||
/// Compile cache for every virtual-tag script. Routes <see cref="Load"/>'s
|
||||
/// <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> calls through the
|
||||
/// cache so the collectible <see cref="System.Runtime.Loader.AssemblyLoadContext"/>
|
||||
/// each compile produces is actually disposed on the publish-replace path
|
||||
/// (Core.Scripting-016): the cache's <see cref="CompiledScriptCache{TContext, TResult}.Clear"/>
|
||||
/// disposes every materialised evaluator before dropping its dictionary entry,
|
||||
/// so a config-publish releases the prior generation's ALCs and the per-publish
|
||||
/// accretion the Core.Scripting-008 fix targeted is actually freed in production.
|
||||
/// Pre-fix the engine called <c>ScriptEvaluator.Compile</c> directly, which left
|
||||
/// the ALCs rooted until the process exited — defeating -008 on the real path.
|
||||
/// </summary>
|
||||
private readonly CompiledScriptCache<VirtualTagContext, object?> _compileCache = new();
|
||||
|
||||
private readonly ConcurrentDictionary<string, DataValueSnapshot> _valueCache = new(StringComparer.Ordinal);
|
||||
private readonly ConcurrentDictionary<string, List<Action<string, DataValueSnapshot>>> _observers
|
||||
= new(StringComparer.Ordinal);
|
||||
@@ -74,6 +89,10 @@ public sealed class VirtualTagEngine : IDisposable
|
||||
UnsubscribeFromUpstream();
|
||||
_tags.Clear();
|
||||
_graph.Clear();
|
||||
// Dispose every compiled-script ALC from the prior generation BEFORE we
|
||||
// recompile this one. Skipping this is what made Core.Scripting-008 a
|
||||
// no-op in production (Core.Scripting-016).
|
||||
_compileCache.Clear();
|
||||
|
||||
var compileFailures = new List<string>();
|
||||
var seenPaths = new HashSet<string>(StringComparer.Ordinal);
|
||||
@@ -102,7 +121,9 @@ public sealed class VirtualTagEngine : IDisposable
|
||||
continue;
|
||||
}
|
||||
|
||||
var evaluator = ScriptEvaluator<VirtualTagContext, object?>.Compile(def.ScriptSource);
|
||||
// Route through CompiledScriptCache so the emitted assembly's collectible
|
||||
// ALC participates in publish-replace cleanup. (Core.Scripting-016)
|
||||
var evaluator = _compileCache.GetOrCompile(def.ScriptSource);
|
||||
var timed = new TimedScriptEvaluator<VirtualTagContext, object?>(evaluator, _scriptTimeout);
|
||||
var scriptLogger = _loggerFactory.Create(def.Path);
|
||||
|
||||
@@ -481,6 +502,9 @@ public sealed class VirtualTagEngine : IDisposable
|
||||
UnsubscribeFromUpstream();
|
||||
_tags.Clear();
|
||||
_graph.Clear();
|
||||
// Dispose every compiled-script ALC so the engine's shutdown actually
|
||||
// releases the emitted assemblies. (Core.Scripting-016)
|
||||
_compileCache.Dispose();
|
||||
}
|
||||
|
||||
internal DependencyGraph GraphForTesting => _graph;
|
||||
|
||||
@@ -116,12 +116,17 @@ public static class SnapshotFormatter
|
||||
{
|
||||
0x00000000u => "Good",
|
||||
0x80000000u => "Bad",
|
||||
0x80020000u => "BadInternalError",
|
||||
0x80050000u => "BadCommunicationError",
|
||||
0x800A0000u => "BadTimeout",
|
||||
0x80310000u => "BadNoCommunication",
|
||||
0x80320000u => "BadWaitingForInitialData",
|
||||
0x80340000u => "BadNodeIdUnknown",
|
||||
0x80330000u => "BadNodeIdInvalid",
|
||||
0x803B0000u => "BadNotWritable",
|
||||
0x803C0000u => "BadOutOfRange",
|
||||
0x803D0000u => "BadNotSupported",
|
||||
0x808B0000u => "BadDeviceFailure",
|
||||
0x80740000u => "BadTypeMismatch",
|
||||
0x40000000u => "Uncertain",
|
||||
_ => null,
|
||||
|
||||
@@ -41,7 +41,7 @@ public static class AbCipStatusMapper
|
||||
public const uint BadNotWritable = 0x803B0000u;
|
||||
public const uint BadOutOfRange = 0x803C0000u;
|
||||
public const uint BadNotSupported = 0x803D0000u;
|
||||
public const uint BadDeviceFailure = 0x80550000u;
|
||||
public const uint BadDeviceFailure = 0x808B0000u;
|
||||
public const uint BadCommunicationError = 0x80050000u;
|
||||
public const uint BadTimeout = 0x800A0000u;
|
||||
public const uint BadTypeMismatch = 0x80730000u;
|
||||
|
||||
@@ -16,7 +16,7 @@ public static class AbLegacyStatusMapper
|
||||
public const uint BadNotWritable = 0x803B0000u;
|
||||
public const uint BadOutOfRange = 0x803C0000u;
|
||||
public const uint BadNotSupported = 0x803D0000u;
|
||||
public const uint BadDeviceFailure = 0x80550000u;
|
||||
public const uint BadDeviceFailure = 0x808B0000u;
|
||||
public const uint BadCommunicationError = 0x80050000u;
|
||||
public const uint BadTimeout = 0x800A0000u;
|
||||
public const uint BadTypeMismatch = 0x80730000u;
|
||||
|
||||
@@ -14,7 +14,7 @@ public static class FocasStatusMapper
|
||||
public const uint BadNotWritable = 0x803B0000u;
|
||||
public const uint BadOutOfRange = 0x803C0000u;
|
||||
public const uint BadNotSupported = 0x803D0000u;
|
||||
public const uint BadDeviceFailure = 0x80550000u;
|
||||
public const uint BadDeviceFailure = 0x808B0000u;
|
||||
public const uint BadCommunicationError = 0x80050000u;
|
||||
public const uint BadTimeout = 0x800A0000u;
|
||||
public const uint BadTypeMismatch = 0x80730000u;
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
|
||||
|
||||
+2
-2
@@ -1,5 +1,5 @@
|
||||
using MxGateway.Client;
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
using MxGateway.Client;
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using MxGateway.Client;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Config;
|
||||
|
||||
@@ -2,7 +2,7 @@ using System.Diagnostics.Metrics;
|
||||
using System.Threading.Channels;
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Config;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
+2
-2
@@ -1,6 +1,6 @@
|
||||
using Microsoft.Extensions.Logging;
|
||||
using MxGateway.Client;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
using System.Diagnostics.Metrics;
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
|
||||
@@ -1,8 +1,8 @@
|
||||
using System.Collections.Concurrent;
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using MxGateway.Client;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
using MxGateway.Client;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
// Use the generated nested status enum for the SetBufferedUpdateInterval reply check.
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
using Microsoft.Extensions.Logging;
|
||||
using MxGateway.Client;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
using System.Runtime.CompilerServices;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
|
||||
@@ -15,10 +15,18 @@
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Core.Abstractions\ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj"/>
|
||||
<ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Core\ZB.MOM.WW.OtOpcUa.Core.csproj"/>
|
||||
<!-- mxaccessgw .NET client. Path-based ProjectReference because both repos sit
|
||||
side-by-side on the dev box; long-term we'll consume MxGateway.Client as a
|
||||
NuGet package. PR 4.W revisits the dependency shape before parity gating. -->
|
||||
<ProjectReference Include="..\..\..\..\mxaccessgw\clients\dotnet\MxGateway.Client\MxGateway.Client.csproj"/>
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!-- Sibling mxaccessgw repo's .NET client + contracts. The sibling restored
|
||||
a proper client library under clients/dotnet/ (May 2026), so this is
|
||||
back on a path-based ProjectReference per the libs/README unwind plan #1.
|
||||
Both projects target net10.0; the Contracts project transitively pulls
|
||||
Google.Protobuf + Grpc.Core.Api, the Client project transitively pulls
|
||||
Grpc.Net.Client + Polly.Core + Microsoft.Extensions.Logging.Abstractions,
|
||||
so the explicit PackageReference shims that backfilled the vendored
|
||||
binary references are no longer needed. -->
|
||||
<ProjectReference Include="..\..\..\..\mxaccessgw\clients\dotnet\ZB.MOM.WW.MxGateway.Client\ZB.MOM.WW.MxGateway.Client.csproj"/>
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
|
||||
@@ -1511,7 +1511,7 @@ public sealed class ModbusDriver
|
||||
private const uint StatusBadNotWritable = 0x803B0000u;
|
||||
private const uint StatusBadOutOfRange = 0x803C0000u;
|
||||
private const uint StatusBadNotSupported = 0x803D0000u;
|
||||
private const uint StatusBadDeviceFailure = 0x80550000u;
|
||||
private const uint StatusBadDeviceFailure = 0x808B0000u;
|
||||
private const uint StatusBadCommunicationError = 0x80050000u;
|
||||
|
||||
/// <summary>
|
||||
|
||||
@@ -69,7 +69,7 @@ public sealed class S7Driver(S7DriverOptions options, string driverInstanceId, I
|
||||
/// <summary>OPC UA StatusCode used for socket / timeout / protocol-layer faults.</summary>
|
||||
private const uint StatusBadCommunicationError = 0x80050000u;
|
||||
/// <summary>OPC UA StatusCode used for a genuine device fault (CPU error, hardware fault).</summary>
|
||||
private const uint StatusBadDeviceFailure = 0x80550000u;
|
||||
private const uint StatusBadDeviceFailure = 0x808B0000u;
|
||||
|
||||
private readonly Dictionary<string, S7TagDefinition> _tagsByName = new(StringComparer.OrdinalIgnoreCase);
|
||||
private readonly Dictionary<string, S7ParsedAddress> _parsedByName = new(StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
@@ -16,7 +16,7 @@ public static class TwinCATStatusMapper
|
||||
public const uint BadNotWritable = 0x803B0000u;
|
||||
public const uint BadOutOfRange = 0x803C0000u;
|
||||
public const uint BadNotSupported = 0x803D0000u;
|
||||
public const uint BadDeviceFailure = 0x80550000u;
|
||||
public const uint BadDeviceFailure = 0x808B0000u;
|
||||
public const uint BadCommunicationError = 0x80050000u;
|
||||
public const uint BadTimeout = 0x800A0000u;
|
||||
public const uint BadTypeMismatch = 0x80730000u;
|
||||
|
||||
@@ -101,29 +101,44 @@ else
|
||||
{
|
||||
<Generations ClusterId="@ClusterId"/>
|
||||
}
|
||||
else if (_tab == "equipment" && _currentDraft is not null)
|
||||
else if (_tab is "equipment" or "uns" or "namespaces" or "drivers" or "tags" or "acls")
|
||||
{
|
||||
<EquipmentTab GenerationId="@_currentDraft.GenerationId"/>
|
||||
}
|
||||
else if (_tab == "uns" && _currentDraft is not null)
|
||||
{
|
||||
<UnsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
|
||||
}
|
||||
else if (_tab == "namespaces" && _currentDraft is not null)
|
||||
{
|
||||
<NamespacesTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
|
||||
}
|
||||
else if (_tab == "drivers" && _currentDraft is not null)
|
||||
{
|
||||
<DriversTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
|
||||
}
|
||||
else if (_tab == "tags" && _currentDraft is not null)
|
||||
{
|
||||
<TagsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
|
||||
}
|
||||
else if (_tab == "acls" && _currentDraft is not null)
|
||||
{
|
||||
<AclsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
|
||||
@* Bug #10 fix — these six tabs are scoped to a generation. Per docs/v2/admin-ui.md the
|
||||
design intent is a read-only view of the published generation when no draft is open
|
||||
("Edit in draft" affordance), and the editable view of the draft when one is open.
|
||||
The earlier implementation rendered nothing in the no-draft case, leaving operators
|
||||
with just the "Open a draft to edit" placeholder. We now route both states through
|
||||
the same tab components, gating edits via <fieldset disabled> so a button click in
|
||||
the read-only state cannot silently mutate the published rows even though the tab
|
||||
components themselves haven't been refactored to honor an IsReadOnly flag yet. *@
|
||||
var genId = _currentDraft?.GenerationId ?? _currentPublished?.GenerationId;
|
||||
var isReadOnly = _currentDraft is null;
|
||||
if (genId is null)
|
||||
{
|
||||
<section class="panel notice rise" style="animation-delay:.02s">
|
||||
No published generation yet. Click <strong>New draft</strong> above to author this cluster's first generation.
|
||||
</section>
|
||||
}
|
||||
else
|
||||
{
|
||||
if (isReadOnly)
|
||||
{
|
||||
<section class="panel notice rise mb-3" style="animation-delay:.02s">
|
||||
<strong>Read-only view</strong> of published generation @genId. Click <strong>New draft</strong> above to make changes.
|
||||
</section>
|
||||
}
|
||||
<fieldset disabled="@isReadOnly" style="border:0;padding:0;margin:0;min-width:0;">
|
||||
@switch (_tab)
|
||||
{
|
||||
case "equipment": <EquipmentTab GenerationId="@genId.Value"/> break;
|
||||
case "uns": <UnsTab GenerationId="@genId.Value" ClusterId="@ClusterId"/> break;
|
||||
case "namespaces": <NamespacesTab GenerationId="@genId.Value" ClusterId="@ClusterId"/> break;
|
||||
case "drivers": <DriversTab GenerationId="@genId.Value" ClusterId="@ClusterId"/> break;
|
||||
case "tags": <TagsTab GenerationId="@genId.Value" ClusterId="@ClusterId"/> break;
|
||||
case "acls": <AclsTab GenerationId="@genId.Value" ClusterId="@ClusterId"/> break;
|
||||
}
|
||||
</fieldset>
|
||||
}
|
||||
}
|
||||
else if (_tab == "redundancy")
|
||||
{
|
||||
@@ -133,10 +148,6 @@ else
|
||||
{
|
||||
<AuditTab ClusterId="@ClusterId"/>
|
||||
}
|
||||
else
|
||||
{
|
||||
<section class="panel notice rise" style="animation-delay:.02s">Open a draft to edit this cluster's content.</section>
|
||||
}
|
||||
}
|
||||
|
||||
@code {
|
||||
|
||||
@@ -16,12 +16,40 @@ public sealed class ClusterNodeService(OtOpcUaConfigDbContext db)
|
||||
/// tolerance covers a missed heartbeat plus publisher GC pauses.</summary>
|
||||
public static readonly TimeSpan StaleThreshold = TimeSpan.FromSeconds(30);
|
||||
|
||||
public Task<List<ClusterNode>> ListByClusterAsync(string clusterId, CancellationToken ct) =>
|
||||
db.ClusterNodes.AsNoTracking()
|
||||
public async Task<List<ClusterNode>> ListByClusterAsync(string clusterId, CancellationToken ct)
|
||||
{
|
||||
var nodes = await db.ClusterNodes.AsNoTracking()
|
||||
.Where(n => n.ClusterId == clusterId)
|
||||
.OrderByDescending(n => n.ServiceLevelBase)
|
||||
.ThenBy(n => n.NodeId)
|
||||
.ToListAsync(ct);
|
||||
.ToListAsync(ct).ConfigureAwait(false);
|
||||
|
||||
// Bug #12 fix follow-up — the live-node heartbeat lands on
|
||||
// ClusterNodeGenerationState.LastSeenAt (written by sp_RegisterNodeGenerationApplied
|
||||
// on every generation poll). The ClusterNode.LastSeenAt column is a legacy slot that
|
||||
// no current writer maintains, so reading it directly would show "never STALE"
|
||||
// forever for every running node. Overlay the GenerationState heartbeat onto the
|
||||
// returned ClusterNode rows when it's more recent so the Redundancy tab + IsStale
|
||||
// predicate reflect actual liveness without needing a new write path or schema change.
|
||||
var nodeIds = nodes.Select(n => n.NodeId).ToList();
|
||||
if (nodeIds.Count > 0)
|
||||
{
|
||||
var heartbeats = await db.ClusterNodeGenerationStates.AsNoTracking()
|
||||
.Where(s => nodeIds.Contains(s.NodeId))
|
||||
.Select(s => new { s.NodeId, s.LastSeenAt })
|
||||
.ToListAsync(ct).ConfigureAwait(false);
|
||||
var beatByNode = heartbeats.ToDictionary(s => s.NodeId, s => s.LastSeenAt);
|
||||
foreach (var n in nodes)
|
||||
{
|
||||
if (beatByNode.TryGetValue(n.NodeId, out var hb) && hb is not null
|
||||
&& (n.LastSeenAt is null || hb > n.LastSeenAt))
|
||||
{
|
||||
n.LastSeenAt = hb;
|
||||
}
|
||||
}
|
||||
}
|
||||
return nodes;
|
||||
}
|
||||
|
||||
public static bool IsStale(ClusterNode node) =>
|
||||
node.LastSeenAt is null || DateTime.UtcNow - node.LastSeenAt.Value > StaleThreshold;
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
using Microsoft.Data.SqlClient;
|
||||
using Microsoft.Extensions.Hosting;
|
||||
using Microsoft.Extensions.Logging;
|
||||
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
|
||||
using ZB.MOM.WW.OtOpcUa.Server.Redundancy;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Server.Hosting;
|
||||
@@ -42,10 +43,20 @@ public sealed class GenerationRefreshHostedService(
|
||||
RedundancyCoordinator coordinator,
|
||||
ILogger<GenerationRefreshHostedService> logger,
|
||||
TimeSpan? tickInterval = null,
|
||||
Func<CancellationToken, Task<long?>>? currentGenerationQuery = null) : BackgroundService
|
||||
Func<CancellationToken, Task<long?>>? currentGenerationQuery = null,
|
||||
Func<long, NodeApplyStatus, string?, CancellationToken, Task>? registerAppliedAsync = null) : BackgroundService
|
||||
{
|
||||
private readonly Func<CancellationToken, Task<long?>> _generationQuery = currentGenerationQuery
|
||||
?? new Func<CancellationToken, Task<long?>>(ct => DefaultQueryCurrentGenerationAsync(options, logger, ct));
|
||||
|
||||
// Bug #12 fix — the server now reports applied-generation state + heartbeat back to the
|
||||
// central DB via sp_RegisterNodeGenerationApplied. Before this wiring the proc had zero
|
||||
// callers, so dbo.ClusterNodeGenerationState stayed empty for every node and the Admin UI
|
||||
// Fleet status page + cluster-detail Redundancy LastSeenAt both showed "no node state /
|
||||
// never STALE" indefinitely. Tests inject a stub via the registerAppliedAsync parameter.
|
||||
private readonly Func<long, NodeApplyStatus, string?, CancellationToken, Task> _registerApplied = registerAppliedAsync
|
||||
?? new Func<long, NodeApplyStatus, string?, CancellationToken, Task>(
|
||||
(gen, status, err, ct) => DefaultRegisterAppliedAsync(options, logger, gen, status, err, ct));
|
||||
/// <summary>
|
||||
/// How often the service polls <c>sp_GetCurrentGenerationForCluster</c>. Default 5 s —
|
||||
/// low enough that operator publishes take effect promptly, high enough that the
|
||||
@@ -97,6 +108,18 @@ public sealed class GenerationRefreshHostedService(
|
||||
|
||||
if (LastAppliedGenerationId is long last && current == last)
|
||||
{
|
||||
// Heartbeat — re-stamps LastSeenAt on dbo.ClusterNodeGenerationState so the Admin
|
||||
// Fleet status page + cluster Redundancy tab can detect the node is alive without
|
||||
// a generation change. Best-effort: a transient DB error here must not throw out of
|
||||
// the tick (the next tick will retry) and must not block applies.
|
||||
try
|
||||
{
|
||||
await _registerApplied(current.Value, NodeApplyStatus.Applied, null, cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
catch (Exception hbEx) when (hbEx is not OperationCanceledException)
|
||||
{
|
||||
logger.LogDebug(hbEx, "Heartbeat to sp_RegisterNodeGenerationApplied failed; will retry next tick");
|
||||
}
|
||||
return; // no change
|
||||
}
|
||||
|
||||
@@ -109,14 +132,44 @@ public sealed class GenerationRefreshHostedService(
|
||||
// lease is open. Publisher ticks in parallel (1s cadence) will observe the band
|
||||
// transition and push it onto the OPC UA Server.ServiceLevel node.
|
||||
var publishRequestId = Guid.NewGuid();
|
||||
await using (leases.BeginApplyLease(current.Value, publishRequestId))
|
||||
NodeApplyStatus applyStatus;
|
||||
string? applyError = null;
|
||||
try
|
||||
{
|
||||
await coordinator.RefreshAsync(cancellationToken).ConfigureAwait(false);
|
||||
// Future: fire a domain event that driver hosts / virtual-tag engine /
|
||||
// scripted-alarm engine subscribe to. For now the topology refresh is the
|
||||
// only thing we rewire — everything else still requires a process restart.
|
||||
await using (leases.BeginApplyLease(current.Value, publishRequestId))
|
||||
{
|
||||
await coordinator.RefreshAsync(cancellationToken).ConfigureAwait(false);
|
||||
// Future: fire a domain event that driver hosts / virtual-tag engine /
|
||||
// scripted-alarm engine subscribe to. For now the topology refresh is the
|
||||
// only thing we rewire — everything else still requires a process restart.
|
||||
}
|
||||
applyStatus = NodeApplyStatus.Applied;
|
||||
}
|
||||
catch (Exception applyEx) when (applyEx is not OperationCanceledException)
|
||||
{
|
||||
applyStatus = NodeApplyStatus.Failed;
|
||||
applyError = applyEx.Message;
|
||||
logger.LogError(applyEx, "Apply of generation {Generation} failed; will report Failed status to central DB", current);
|
||||
// fall through to register so operators see the failed apply in /fleet
|
||||
}
|
||||
|
||||
// Always tell the central DB what happened with this apply attempt — success or
|
||||
// failure. The proc upserts dbo.ClusterNodeGenerationState (CurrentGenerationId +
|
||||
// LastAppliedAt + LastAppliedStatus + LastAppliedError + LastSeenAt). Failure here
|
||||
// mustn't prevent us from advancing LastAppliedGenerationId — the apply already
|
||||
// happened (or already failed); the publish is purely observability.
|
||||
try
|
||||
{
|
||||
await _registerApplied(current.Value, applyStatus, applyError, cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
catch (Exception regEx) when (regEx is not OperationCanceledException)
|
||||
{
|
||||
logger.LogWarning(regEx, "sp_RegisterNodeGenerationApplied call failed for gen {Generation} status {Status}", current, applyStatus);
|
||||
}
|
||||
|
||||
// Advance the cursor even on Failed — the proc has been told; next tick will heartbeat
|
||||
// and a future generation will trigger a fresh apply attempt. Pinning the cursor on
|
||||
// failure would loop us through the same broken apply every 5s.
|
||||
LastAppliedGenerationId = current;
|
||||
RefreshCount++;
|
||||
}
|
||||
@@ -157,4 +210,35 @@ public sealed class GenerationRefreshHostedService(
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Default register-applied implementation — calls <c>sp_RegisterNodeGenerationApplied</c>
|
||||
/// to MERGE-upsert <see cref="ZB.MOM.WW.OtOpcUa.Configuration.Entities.ClusterNodeGenerationState"/>
|
||||
/// for this node. Called both at apply completion (success or failure) and on every
|
||||
/// no-change heartbeat tick so <c>LastSeenAt</c> stays fresh in the central DB and the
|
||||
/// Admin UI Fleet status page + Redundancy LastSeenAt indicator can detect a healthy node.
|
||||
/// Bug #12 fix — wires the previously-orphaned proc into the apply loop.
|
||||
/// </summary>
|
||||
private static async Task DefaultRegisterAppliedAsync(
|
||||
NodeOptions options,
|
||||
ILogger logger,
|
||||
long generationId,
|
||||
NodeApplyStatus status,
|
||||
string? error,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
await using var conn = new SqlConnection(options.ConfigDbConnectionString);
|
||||
await conn.OpenAsync(cancellationToken).ConfigureAwait(false);
|
||||
|
||||
await using var cmd = conn.CreateCommand();
|
||||
cmd.CommandText = "EXEC dbo.sp_RegisterNodeGenerationApplied @NodeId=@n, @GenerationId=@g, @Status=@s, @Error=@e";
|
||||
cmd.Parameters.AddWithValue("@n", options.NodeId);
|
||||
cmd.Parameters.AddWithValue("@g", generationId);
|
||||
cmd.Parameters.AddWithValue("@s", status.ToString());
|
||||
cmd.Parameters.AddWithValue("@e", (object?)error ?? DBNull.Value);
|
||||
|
||||
await cmd.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(false);
|
||||
// Single-line trace so soak runs can see heartbeat ticks without flooding at Info.
|
||||
logger.LogTrace("Reported gen {Generation} status {Status} to central DB", generationId, status);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -925,4 +925,165 @@ public sealed class ScriptedAlarmEngineTests
|
||||
public Task RemoveAsync(string alarmId, CancellationToken ct)
|
||||
=> _inner.RemoveAsync(alarmId, ct);
|
||||
}
|
||||
|
||||
// --- Core.ScriptedAlarms-009: per-alarm evaluation-scratch reuse ---
|
||||
|
||||
[Fact]
|
||||
public async Task Reevaluation_reuses_the_same_read_cache_dictionary()
|
||||
{
|
||||
// Pre-009 the engine allocated a fresh Dictionary<string, DataValueSnapshot>
|
||||
// on every upstream-change tick; on a busy line this was a steady allocation
|
||||
// stream on the hot path. The fix: one dictionary per alarm, refilled in place
|
||||
// under _evalGate. Test asserts the dictionary instance is identical across
|
||||
// two consecutive evaluations of the same alarm.
|
||||
var up = new FakeUpstream();
|
||||
up.Set("Temp", 50);
|
||||
using var eng = Build(up, out _);
|
||||
await eng.LoadAsync(
|
||||
[Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
|
||||
TestContext.Current.CancellationToken);
|
||||
|
||||
// First evaluation runs during LoadAsync. Capture the scratch reference now.
|
||||
var scratchAfterLoad = eng.TryGetScratchReadCacheForTest("HighTemp");
|
||||
scratchAfterLoad.ShouldNotBeNull(
|
||||
"the scratch should have been allocated during LoadAsync's initial evaluation");
|
||||
|
||||
// Trigger a re-evaluation by pushing an upstream change.
|
||||
up.Push("Temp", 150);
|
||||
await WaitForAsync(() =>
|
||||
eng.GetState("HighTemp")!.Active == AlarmActiveState.Active);
|
||||
|
||||
var scratchAfterPush = eng.TryGetScratchReadCacheForTest("HighTemp");
|
||||
ReferenceEquals(scratchAfterLoad, scratchAfterPush).ShouldBeTrue(
|
||||
"the read-cache dictionary must be the *same* instance across evaluations " +
|
||||
"(Core.ScriptedAlarms-009) — a per-call allocation would defeat the fix.");
|
||||
scratchAfterPush!["Temp"].Value.ShouldBe(150, "refill must update the existing dictionary in place");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task Reevaluation_reuses_the_same_predicate_context()
|
||||
{
|
||||
// The context wraps the read-cache by reference; refilling the dictionary
|
||||
// updates what the script sees. Reusing the context spares a per-call object
|
||||
// allocation as well as the dictionary one.
|
||||
var up = new FakeUpstream();
|
||||
up.Set("Temp", 50);
|
||||
using var eng = Build(up, out _);
|
||||
await eng.LoadAsync(
|
||||
[Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
|
||||
TestContext.Current.CancellationToken);
|
||||
|
||||
var ctxAfterLoad = eng.TryGetScratchContextForTest("HighTemp");
|
||||
ctxAfterLoad.ShouldNotBeNull();
|
||||
|
||||
up.Push("Temp", 150);
|
||||
await WaitForAsync(() =>
|
||||
eng.GetState("HighTemp")!.Active == AlarmActiveState.Active);
|
||||
|
||||
var ctxAfterPush = eng.TryGetScratchContextForTest("HighTemp");
|
||||
ReferenceEquals(ctxAfterLoad, ctxAfterPush).ShouldBeTrue(
|
||||
"the AlarmPredicateContext must be reused across evaluations (Core.ScriptedAlarms-009).");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task LoadAsync_drops_the_prior_generations_scratch()
|
||||
{
|
||||
// A config-publish recreates AlarmStates with potentially different Inputs +
|
||||
// Loggers; reusing the prior generation's scratch would attach an outdated
|
||||
// logger to the new alarm. LoadAsync must clear _scratchByAlarmId so the
|
||||
// next evaluation lazily re-allocates against the fresh AlarmState.
|
||||
var up = new FakeUpstream();
|
||||
up.Set("Temp", 50);
|
||||
using var eng = Build(up, out _);
|
||||
await eng.LoadAsync(
|
||||
[Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
|
||||
TestContext.Current.CancellationToken);
|
||||
|
||||
var scratchAfterFirstLoad = eng.TryGetScratchReadCacheForTest("HighTemp");
|
||||
scratchAfterFirstLoad.ShouldNotBeNull();
|
||||
|
||||
// Second LoadAsync — same alarm id, same predicate, but the scratch should be
|
||||
// wiped and re-allocated on the next evaluation. (LoadAsync itself triggers a
|
||||
// first evaluation, so the scratch is reborn before we look.)
|
||||
await eng.LoadAsync(
|
||||
[Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
|
||||
TestContext.Current.CancellationToken);
|
||||
|
||||
var scratchAfterSecondLoad = eng.TryGetScratchReadCacheForTest("HighTemp");
|
||||
scratchAfterSecondLoad.ShouldNotBeNull();
|
||||
ReferenceEquals(scratchAfterFirstLoad, scratchAfterSecondLoad).ShouldBeFalse(
|
||||
"LoadAsync must drop the prior generation's scratch — reuse across a publish " +
|
||||
"would attach a stale Logger / Inputs to the new alarm definition.");
|
||||
}
|
||||
|
||||
// --- Core.Scripting-016: engine routes compiles through CompiledScriptCache ---
|
||||
|
||||
[Fact]
|
||||
public void Dispose_unloads_compiled_predicate_assembly()
|
||||
{
|
||||
// Pre-fix the engine called ScriptEvaluator.Compile directly, so the
|
||||
// emitted predicate assembly's ALC stayed loaded for the process lifetime.
|
||||
// After the fix the engine routes through CompiledScriptCache; engine
|
||||
// Dispose triggers cache Dispose which unloads every cached evaluator's ALC.
|
||||
// Assert via WeakReference + GC that the assembly is actually reclaimed.
|
||||
// Helper is sync + [NoInlining] so its locals can't be kept alive by an
|
||||
// async state machine (an earlier async version of this test failed because
|
||||
// the state-machine struct held the evaluator past the method-end).
|
||||
var weak = CompileAlarmAndCaptureWeak();
|
||||
for (int i = 0; i < 10 && weak.IsAlive; i++)
|
||||
{
|
||||
GC.Collect();
|
||||
GC.WaitForPendingFinalizers();
|
||||
GC.Collect();
|
||||
}
|
||||
weak.IsAlive.ShouldBeFalse(
|
||||
"engine Dispose must release the compiled-predicate assembly via " +
|
||||
"CompiledScriptCache (Core.Scripting-016). If this fails the engine is " +
|
||||
"back to calling ScriptEvaluator.Compile directly and -008's headline " +
|
||||
"fix doesn't run in production.");
|
||||
}
|
||||
|
||||
[System.Runtime.CompilerServices.MethodImpl(
|
||||
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
|
||||
private static WeakReference CompileAlarmAndCaptureWeak()
|
||||
{
|
||||
var up = new FakeUpstream();
|
||||
up.Set("Temp", 50);
|
||||
var eng = Build(up, out _);
|
||||
// Block on LoadAsync so this helper stays synchronous — an `async Task`
|
||||
// wrapper would compile to a state machine whose generated struct keeps the
|
||||
// local `eng` reference alive past the method's apparent end, defeating GC.
|
||||
eng.LoadAsync(
|
||||
[new ScriptedAlarmDefinition(
|
||||
"HighTemp", "Plant/Line1", "HighTemp",
|
||||
AlarmKind.AlarmCondition, AlarmSeverity.High,
|
||||
"x",
|
||||
"""return (int)ctx.GetTag("Temp").Value > 100;""")],
|
||||
default).GetAwaiter().GetResult();
|
||||
|
||||
// Reach into the engine's compile cache via reflection — the field is
|
||||
// private; we only need the Assembly reference, scoped to this NoInlining
|
||||
// helper so the locals die when it returns.
|
||||
var weak = ExtractEmittedAssemblyWeakRef(eng);
|
||||
eng.Dispose();
|
||||
return weak;
|
||||
}
|
||||
|
||||
[System.Runtime.CompilerServices.MethodImpl(
|
||||
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
|
||||
private static WeakReference ExtractEmittedAssemblyWeakRef(ScriptedAlarmEngine eng)
|
||||
{
|
||||
var cacheField = typeof(ScriptedAlarmEngine).GetField(
|
||||
"_compileCache", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
|
||||
var cache = cacheField.GetValue(eng)!;
|
||||
var cacheDictField = cache.GetType().GetField(
|
||||
"_cache", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
|
||||
var cacheDict = (System.Collections.IDictionary)cacheDictField.GetValue(cache)!;
|
||||
var lazy = cacheDict.Values.Cast<object>().Single();
|
||||
var evaluator = lazy.GetType().GetProperty("Value")!.GetValue(lazy)!;
|
||||
var del = (Delegate)evaluator.GetType().GetField(
|
||||
"_func", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!
|
||||
.GetValue(evaluator)!;
|
||||
return new WeakReference(del.Method.Module.Assembly);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -221,4 +221,173 @@ public sealed class CompiledScriptCacheTests
|
||||
Should.Throw<Exception>(() => cache.GetOrCompile("""return unknownIdentifier + 1;"""));
|
||||
cache.Count.ShouldBe(0, "faulted Lazy must still be evicted after compile failure");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Clear_uses_value_scoped_TryRemove_so_a_race_inserted_entry_survives()
|
||||
{
|
||||
// Regression for Core.Scripting-014: Clear() previously used the key-only
|
||||
// TryRemove(key, out var lazy) overload, which is value-blind. If a
|
||||
// concurrent GetOrCompile re-inserts a fresh Lazy under the same key
|
||||
// between Clear's snapshot and the TryRemove, the buggy overload evicts
|
||||
// the fresh entry and disposes its evaluator while the concurrent
|
||||
// caller still holds + intends to invoke it. The fixed
|
||||
// TryRemove(KeyValuePair<,>) overload compares value identity.
|
||||
//
|
||||
// Single-threaded test that models the race window: snapshot the
|
||||
// dictionary like Clear does, mutate the dictionary mid-flight (modelling
|
||||
// Thread B's GetOrCompile), then drive the same TryRemove(KeyValuePair<,>)
|
||||
// overload Clear now uses. The buggy key-only overload would evict the
|
||||
// fresh entry; the fixed value-scoped overload leaves it alone. This
|
||||
// verifies the *semantic* of Clear's code path — actually intercepting
|
||||
// a real Clear() invocation mid-loop would require either two threads
|
||||
// (flaky) or a Dispose side-effect that re-adds within the loop
|
||||
// iteration (only works with multiple snapshotted entries).
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
var cacheField = typeof(CompiledScriptCache<FakeScriptContext, int>)
|
||||
.GetField("_cache", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic);
|
||||
var dict = (System.Collections.Concurrent.ConcurrentDictionary<
|
||||
string, Lazy<ScriptEvaluator<FakeScriptContext, int>>>)cacheField!.GetValue(cache)!;
|
||||
var hashSourceMethod = typeof(CompiledScriptCache<FakeScriptContext, int>)
|
||||
.GetMethod("HashSource", System.Reflection.BindingFlags.Static | System.Reflection.BindingFlags.NonPublic);
|
||||
var key = (string)hashSourceMethod!.Invoke(null, ["""return 1;"""])!;
|
||||
|
||||
var firstGen = new Lazy<ScriptEvaluator<FakeScriptContext, int>>(
|
||||
() => ScriptEvaluator<FakeScriptContext, int>.Compile("""return 1;"""),
|
||||
LazyThreadSafetyMode.ExecutionAndPublication);
|
||||
dict[key] = firstGen;
|
||||
|
||||
// Snapshot like Clear's first line does.
|
||||
var snapshot = dict.ToArray();
|
||||
snapshot.Length.ShouldBe(1);
|
||||
|
||||
// Race window: a concurrent GetOrCompile inserts a fresh Lazy under the
|
||||
// same key. The dict now holds `fresh`, but the snapshot still references
|
||||
// `firstGen`.
|
||||
var fresh = new Lazy<ScriptEvaluator<FakeScriptContext, int>>(
|
||||
() => ScriptEvaluator<FakeScriptContext, int>.Compile("""return 1;"""),
|
||||
LazyThreadSafetyMode.ExecutionAndPublication);
|
||||
dict[key] = fresh;
|
||||
|
||||
// The fixed Clear uses the value-scoped TryRemove(KeyValuePair<,>) overload.
|
||||
// Drive it directly with our snapshot entry — the value comparison sees
|
||||
// firstGen ≠ fresh and leaves the entry intact.
|
||||
foreach (var entry in snapshot)
|
||||
{
|
||||
var removed = ((System.Collections.Generic.ICollection<
|
||||
System.Collections.Generic.KeyValuePair<string, Lazy<ScriptEvaluator<FakeScriptContext, int>>>>)dict)
|
||||
.Remove(entry);
|
||||
removed.ShouldBeFalse(
|
||||
"value-scoped removal must reject the stale snapshot entry — the dict " +
|
||||
"now holds `fresh`, not `firstGen`. The buggy key-only TryRemove would " +
|
||||
"have returned true and evicted the fresh entry (Core.Scripting-014).");
|
||||
}
|
||||
|
||||
dict.ContainsKey(key).ShouldBeTrue(
|
||||
"the fresh entry inserted during the race window must survive Clear().");
|
||||
ReferenceEquals(dict[key], fresh).ShouldBeTrue(
|
||||
"the surviving entry must be the fresh Lazy, not the one Clear() snapshotted.");
|
||||
}
|
||||
|
||||
// --- Core.Scripting-008: collectible AssemblyLoadContext unload ---
|
||||
|
||||
[Fact]
|
||||
public void Dispose_unloads_compiled_script_assembly_load_context()
|
||||
{
|
||||
// The whole point of switching the emit path off CSharpScript.CreateDelegate
|
||||
// and onto a collectible ALC: after Dispose, the runtime can reclaim the
|
||||
// emitted assembly. We assert this via a WeakReference to the compiled
|
||||
// assembly itself — if the ALC unloads correctly the reference is dead after
|
||||
// a forced GC; if the assembly stayed rooted (the pre-fix behaviour) the
|
||||
// reference survives. The exact reclaim is GC-timing-sensitive, so we loop a
|
||||
// bounded number of times to absorb GC scheduling noise.
|
||||
var weak = CompileAndCaptureWeakAssembly();
|
||||
// Help the runtime — ALC.Unload is *eligible-for-collection*, not synchronous.
|
||||
for (int i = 0; i < 10 && weak.IsAlive; i++)
|
||||
{
|
||||
GC.Collect();
|
||||
GC.WaitForPendingFinalizers();
|
||||
GC.Collect();
|
||||
}
|
||||
weak.IsAlive.ShouldBeFalse(
|
||||
"the collectible ALC must release the emitted script assembly after Dispose " +
|
||||
"(Core.Scripting-008). If this fails, either the cache held a reference past " +
|
||||
"Dispose, or a delegate/closure rooted the assembly in the default ALC.");
|
||||
}
|
||||
|
||||
[System.Runtime.CompilerServices.MethodImpl(
|
||||
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
|
||||
private static WeakReference CompileAndCaptureWeakAssembly()
|
||||
{
|
||||
// Isolation method so the JIT cannot keep the local references rooted past
|
||||
// its return — without [NoInlining] the GC may decide the locals are still
|
||||
// live and the WeakReference test becomes flaky.
|
||||
var evaluator = ScriptEvaluator<FakeScriptContext, int>.Compile("""return 42;""");
|
||||
var weak = new WeakReference(evaluator.GetType().Assembly is var asm &&
|
||||
asm is not null ? GetEmittedAssembly(evaluator) : null);
|
||||
evaluator.Dispose();
|
||||
return weak;
|
||||
}
|
||||
|
||||
[System.Runtime.CompilerServices.MethodImpl(
|
||||
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
|
||||
private static object GetEmittedAssembly(ScriptEvaluator<FakeScriptContext, int> evaluator)
|
||||
{
|
||||
// The evaluator's delegate Method lives on the synthesized wrapper class in
|
||||
// the emitted assembly. The delegate field is private; we reach it via a
|
||||
// reflection probe rather than expose internals — keeps the public surface
|
||||
// unchanged.
|
||||
var funcField = typeof(ScriptEvaluator<FakeScriptContext, int>).GetField(
|
||||
"_func", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
|
||||
var del = (Delegate)funcField.GetValue(evaluator)!;
|
||||
return del.Method.Module.Assembly;
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Clear_disposes_every_materialised_evaluator()
|
||||
{
|
||||
// Locks the contract that Clear() is publish-safe: after a config-generation
|
||||
// publish drops the cache, every prior script's ALC should unload so the
|
||||
// process memory plateaus rather than growing across publishes.
|
||||
var weaks = CompileFiveAndCaptureWeakAssemblies();
|
||||
for (int i = 0; i < 10 && weaks.Any(w => w.IsAlive); i++)
|
||||
{
|
||||
GC.Collect();
|
||||
GC.WaitForPendingFinalizers();
|
||||
GC.Collect();
|
||||
}
|
||||
weaks.ShouldAllBe(w => !w.IsAlive,
|
||||
"after Clear() every compiled-script ALC must be unloadable " +
|
||||
"(Core.Scripting-008). If this fails the publish-replace pattern leaks " +
|
||||
"emitted assemblies, which is exactly the v3 concern this rewrite fixes.");
|
||||
}
|
||||
|
||||
[System.Runtime.CompilerServices.MethodImpl(
|
||||
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
|
||||
private static List<WeakReference> CompileFiveAndCaptureWeakAssemblies()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
var weaks = new List<WeakReference>();
|
||||
for (int i = 0; i < 5; i++)
|
||||
{
|
||||
// Distinct source per iteration so each compiles to its own assembly.
|
||||
var e = cache.GetOrCompile($"""return {i};""");
|
||||
var funcField = typeof(ScriptEvaluator<FakeScriptContext, int>).GetField(
|
||||
"_func", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
|
||||
var del = (Delegate)funcField.GetValue(e)!;
|
||||
weaks.Add(new WeakReference(del.Method.Module.Assembly));
|
||||
}
|
||||
cache.Count.ShouldBe(5);
|
||||
cache.Clear();
|
||||
cache.Count.ShouldBe(0);
|
||||
return weaks;
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void GetOrCompile_after_Dispose_throws_ObjectDisposedException()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
cache.GetOrCompile("""return 1;""");
|
||||
cache.Dispose();
|
||||
Should.Throw<ObjectDisposedException>(() => cache.GetOrCompile("""return 2;"""));
|
||||
}
|
||||
}
|
||||
|
||||
@@ -446,4 +446,88 @@ public sealed class ScriptSandboxTests
|
||||
return 0;
|
||||
"""));
|
||||
}
|
||||
|
||||
// --- Core.Scripting-012: BCL refs broadened by -008/-016 re-exposed three vectors
|
||||
// the original deny-list missed. Each is denied type-granularly in
|
||||
// ForbiddenTypeAnalyzer.ForbiddenFullTypeNames; these tests pin the rejection.
|
||||
|
||||
[Fact]
|
||||
public void Rejects_ThreadPool_QueueUserWorkItem_at_compile()
|
||||
{
|
||||
// System.Threading.ThreadPool.QueueUserWorkItem re-introduces the background-
|
||||
// fanout threat Core.Scripting-003 closed against System.Threading.Tasks. The
|
||||
// ThreadPool type lives in System.Threading (shared with allowed sync primitives
|
||||
// like CancellationToken), so it must be denied type-granularly. (Core.Scripting-012.)
|
||||
Should.Throw<ScriptSandboxViolationException>(() =>
|
||||
ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""
|
||||
System.Threading.ThreadPool.QueueUserWorkItem(_ => { });
|
||||
return 0;
|
||||
"""));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Rejects_Timer_new_at_compile()
|
||||
{
|
||||
// System.Threading.Timer schedules unbounded callback work that outlives the
|
||||
// per-evaluation timeout. Same namespace as CancellationToken so type-granular.
|
||||
// (Core.Scripting-012.)
|
||||
Should.Throw<ScriptSandboxViolationException>(() =>
|
||||
ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""
|
||||
var t = new System.Threading.Timer(_ => { });
|
||||
return 0;
|
||||
"""));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Rejects_AssemblyLoadContext_at_compile()
|
||||
{
|
||||
// System.Runtime.Loader.AssemblyLoadContext loads arbitrary DLLs into the
|
||||
// process — defense-in-depth gap that the BCL-wide reference set re-opened.
|
||||
// System.Reflection already denies the typical invocation path, but the
|
||||
// sandbox boundary stays type-explicit. (Core.Scripting-012.)
|
||||
Should.Throw<ScriptSandboxViolationException>(() =>
|
||||
ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""
|
||||
var alc = System.Runtime.Loader.AssemblyLoadContext.Default;
|
||||
return 0;
|
||||
"""));
|
||||
}
|
||||
|
||||
// --- Core.Scripting-013: wrapper-source injection ---
|
||||
|
||||
[Fact]
|
||||
public void Rejects_sibling_method_injection_via_balanced_braces()
|
||||
{
|
||||
// Core.Scripting-013 — a brace-balanced source that closes Run early and
|
||||
// declares a sibling method inside CompiledScript. The analyzer would still
|
||||
// walk the injected member, but the documented "method body" contract
|
||||
// forbids the shape outright now.
|
||||
var ex = Should.Throw<CompilationErrorException>(() =>
|
||||
ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""
|
||||
return 0;
|
||||
} public static int Evil() { return 1;
|
||||
"""));
|
||||
ex.Message.ShouldContain("Core.Scripting-013");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Rejects_sibling_class_injection_via_balanced_braces()
|
||||
{
|
||||
// Same shape, but injecting an entire sibling class inside the synthesized
|
||||
// namespace. Reject at the type-count check.
|
||||
var ex = Should.Throw<CompilationErrorException>(() =>
|
||||
ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
public static class CompiledScript2 { public static int M() { return 0; } }
|
||||
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled { public static class CompiledScript3 { public static int M() {
|
||||
return 0;
|
||||
"""));
|
||||
ex.Message.ShouldContain("Core.Scripting-013");
|
||||
}
|
||||
}
|
||||
|
||||
@@ -655,4 +655,60 @@ public sealed class VirtualTagEngineTests
|
||||
lock (_buf) { _buf.Add((path, value)); }
|
||||
}
|
||||
}
|
||||
|
||||
// --- Core.Scripting-016: engine routes compiles through CompiledScriptCache ---
|
||||
|
||||
[Fact]
|
||||
public void Dispose_unloads_compiled_script_assembly()
|
||||
{
|
||||
// Pre-fix the engine called ScriptEvaluator.Compile directly, so the emitted
|
||||
// script assembly's ALC stayed loaded for the process lifetime. After the fix
|
||||
// the engine routes through CompiledScriptCache; engine Dispose triggers cache
|
||||
// Dispose which unloads every cached evaluator's ALC. Assert via WeakReference
|
||||
// + GC that the assembly is actually reclaimed.
|
||||
var weak = CompileTagAndCaptureWeak();
|
||||
for (int i = 0; i < 10 && weak.IsAlive; i++)
|
||||
{
|
||||
GC.Collect();
|
||||
GC.WaitForPendingFinalizers();
|
||||
GC.Collect();
|
||||
}
|
||||
weak.IsAlive.ShouldBeFalse(
|
||||
"engine Dispose must release the compiled-tag assembly via " +
|
||||
"CompiledScriptCache (Core.Scripting-016). If this fails the engine is " +
|
||||
"back to calling ScriptEvaluator.Compile directly and -008's headline " +
|
||||
"fix doesn't run in production.");
|
||||
}
|
||||
|
||||
[System.Runtime.CompilerServices.MethodImpl(
|
||||
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
|
||||
private static WeakReference CompileTagAndCaptureWeak()
|
||||
{
|
||||
var up = new FakeUpstream();
|
||||
up.Set("X", 10);
|
||||
var engine = Build(up);
|
||||
engine.Load([new VirtualTagDefinition(
|
||||
Path: "Doubled",
|
||||
DataType: DriverDataType.Float64,
|
||||
ScriptSource: """return (double)ctx.GetTag("X").Value * 2.0;""")]);
|
||||
|
||||
// Reach into the engine's compile cache via reflection — internals scoped to
|
||||
// Tests-via-InternalsVisibleTo and the field is private.
|
||||
var cacheField = typeof(VirtualTagEngine).GetField(
|
||||
"_compileCache", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
|
||||
var cache = cacheField.GetValue(engine)!;
|
||||
var cacheDictField = cache.GetType().GetField(
|
||||
"_cache", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
|
||||
var cacheDict = (System.Collections.IDictionary)cacheDictField.GetValue(cache)!;
|
||||
var lazy = cacheDict.Values.Cast<object>().Single();
|
||||
var lazyValueProp = lazy.GetType().GetProperty("Value")!;
|
||||
var evaluator = lazyValueProp.GetValue(lazy)!;
|
||||
var funcField = evaluator.GetType().GetField(
|
||||
"_func", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
|
||||
var del = (Delegate)funcField.GetValue(evaluator)!;
|
||||
var weak = new WeakReference(del.Method.Module.Assembly);
|
||||
|
||||
engine.Dispose();
|
||||
return weak;
|
||||
}
|
||||
}
|
||||
|
||||
@@ -29,12 +29,19 @@ public sealed class SnapshotFormatterTests
|
||||
// Numeric codes are the canonical OPC Foundation Opc.Ua.StatusCodes values.
|
||||
[InlineData(0x00000000u, "Good")]
|
||||
[InlineData(0x80000000u, "Bad")]
|
||||
[InlineData(0x80020000u, "BadInternalError")]
|
||||
[InlineData(0x80050000u, "BadCommunicationError")]
|
||||
[InlineData(0x800A0000u, "BadTimeout")]
|
||||
[InlineData(0x80310000u, "BadNoCommunication")]
|
||||
[InlineData(0x80320000u, "BadWaitingForInitialData")]
|
||||
[InlineData(0x80340000u, "BadNodeIdUnknown")]
|
||||
[InlineData(0x80330000u, "BadNodeIdInvalid")]
|
||||
[InlineData(0x803B0000u, "BadNotWritable")]
|
||||
[InlineData(0x803C0000u, "BadOutOfRange")]
|
||||
[InlineData(0x803D0000u, "BadNotSupported")]
|
||||
// Driver.Cli.Common-007: corrected from 0x80550000 (which is actually
|
||||
// BadSecurityPolicyRejected) to the canonical OPC UA spec value 0x808B0000.
|
||||
[InlineData(0x808B0000u, "BadDeviceFailure")]
|
||||
[InlineData(0x80740000u, "BadTypeMismatch")]
|
||||
[InlineData(0x40000000u, "Uncertain")]
|
||||
public void FormatStatus_names_well_known_status_codes(uint status, string expectedName)
|
||||
@@ -50,6 +57,10 @@ public sealed class SnapshotFormatterTests
|
||||
[InlineData(0x80070000u, "BadNoCommunication")] // was mislabelled BadNoCommunication
|
||||
[InlineData(0x80080000u, "BadWaitingForInitialData")] // was mislabelled BadWaitingForInitialData
|
||||
[InlineData(0x80350000u, "BadNodeIdInvalid")] // was mislabelled BadNodeIdInvalid
|
||||
// Driver.Cli.Common-007: 0x80550000 is BadSecurityPolicyRejected per spec, not
|
||||
// BadDeviceFailure (which is 0x808B0000). The buggy shortlist + the six native-
|
||||
// protocol mappers all had it wrong; this row pins it as a regression guard.
|
||||
[InlineData(0x80550000u, "BadDeviceFailure")]
|
||||
public void FormatStatus_does_not_apply_pre_fix_wrong_names(uint status, string wrongName)
|
||||
{
|
||||
SnapshotFormatter.FormatStatus(status).ShouldNotContain(wrongName);
|
||||
|
||||
+5
-1
@@ -26,7 +26,11 @@ public sealed class AbCipReadSmokeTests
|
||||
await fixture.InitializeAsync();
|
||||
try
|
||||
{
|
||||
var deviceUri = $"ab://127.0.0.1:{fixture.Port}/1,0";
|
||||
// Use fixture.Host (not hardcoded 127.0.0.1) so the docker-host migration
|
||||
// (10.100.0.35) and AB_SERVER_ENDPOINT overrides reach this test too. The
|
||||
// sibling Emulate tests already use the fixture's resolved endpoint; this
|
||||
// smoke test was missed.
|
||||
var deviceUri = $"ab://{fixture.Host}:{fixture.Port}/1,0";
|
||||
var drv = new AbCipDriver(new AbCipDriverOptions
|
||||
{
|
||||
Devices = [new AbCipDeviceOptions(deviceUri, profile.Family)],
|
||||
|
||||
@@ -8,7 +8,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests;
|
||||
/// Reachability probe for the <c>ab_server</c> Docker container (libplctag's CIP
|
||||
/// simulator built via <c>Docker/Dockerfile</c>) or any real AB PLC the
|
||||
/// <c>AB_SERVER_ENDPOINT</c> env var points at. Parses
|
||||
/// <c>AB_SERVER_ENDPOINT</c> (default <c>localhost:44818</c>) + TCP-connects
|
||||
/// <c>AB_SERVER_ENDPOINT</c> (default <c>10.100.0.35:44818</c> — the shared Docker host) + TCP-connects
|
||||
/// once at fixture construction. Tests skip via <see cref="AbServerFactAttribute"/>
|
||||
/// / <see cref="AbServerTheoryAttribute"/> when the port isn't live, so
|
||||
/// <c>dotnet test</c> stays green on a fresh clone without Docker running.
|
||||
@@ -28,7 +28,10 @@ public sealed class AbServerFixture : IAsyncLifetime
|
||||
/// instantiate the fixture with the profile matching their compose-file service.</summary>
|
||||
public AbServerProfile Profile { get; }
|
||||
|
||||
public string Host { get; } = "127.0.0.1";
|
||||
// 10.100.0.35 = the shared Docker host (see CLAUDE.md "Docker Workflow"). Migrated
|
||||
// off this VM's 127.0.0.1 on 2026-04-28 alongside the rest of the Docker-host move.
|
||||
// Override via AB_SERVER_ENDPOINT to point at a real PLC or a locally-running container.
|
||||
public string Host { get; } = "10.100.0.35";
|
||||
public int Port { get; } = AbServerProfile.DefaultPort;
|
||||
|
||||
public AbServerFixture() : this(KnownProfiles.ControlLogix) { }
|
||||
@@ -59,7 +62,7 @@ public sealed class AbServerFixture : IAsyncLifetime
|
||||
TcpProbe(ResolveHost(), ResolvePort());
|
||||
|
||||
private static string ResolveHost() =>
|
||||
Environment.GetEnvironmentVariable(EndpointEnvVar)?.Split(':', 2)[0] ?? "127.0.0.1";
|
||||
Environment.GetEnvironmentVariable(EndpointEnvVar)?.Split(':', 2)[0] ?? "10.100.0.35";
|
||||
|
||||
private static int ResolvePort()
|
||||
{
|
||||
@@ -84,7 +87,7 @@ public sealed class AbServerFixture : IAsyncLifetime
|
||||
|
||||
/// <summary>
|
||||
/// <c>[Fact]</c>-equivalent that skips when ab_server isn't reachable — accepts a
|
||||
/// live Docker listener on <c>localhost:44818</c> or an <c>AB_SERVER_ENDPOINT</c>
|
||||
/// live Docker listener on <c>10.100.0.35:44818</c> or an <c>AB_SERVER_ENDPOINT</c>
|
||||
/// override pointing at a real PLC.
|
||||
/// </summary>
|
||||
public sealed class AbServerFactAttribute : FactAttribute
|
||||
|
||||
+4
@@ -29,6 +29,10 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests.Emulate;
|
||||
/// <para>Runs only when <c>AB_SERVER_PROFILE=emulate</c>. ab_server has no ALMD
|
||||
/// instruction + no alarm subsystem, so this tier-gated class couldn't produce a
|
||||
/// meaningful result against the default simulator.</para>
|
||||
/// <para>The Emulate tier is hardware-gated (Rockwell per-seat license, Windows-only,
|
||||
/// conflicts with Docker Desktop's WSL 2 backend) so a permanent skip in CI is expected
|
||||
/// — see <c>docs/drivers/AbServer-Test-Fixture.md</c> §"Logix Emulate golden-box tier"
|
||||
/// for the full rationale + the gap matrix this test closes.</para>
|
||||
/// </remarks>
|
||||
[Collection("AbServerEmulate")]
|
||||
[Trait("Category", "Integration")]
|
||||
|
||||
+4
@@ -27,6 +27,10 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests.Emulate;
|
||||
/// <para>Runs only when <c>AB_SERVER_PROFILE=emulate</c>. With ab_server
|
||||
/// (the default), skips cleanly — ab_server lacks UDT / Template Object emulation
|
||||
/// so this wire-level test couldn't pass against it regardless.</para>
|
||||
/// <para>The Emulate tier is hardware-gated (Rockwell per-seat license, Windows-only,
|
||||
/// conflicts with Docker Desktop's WSL 2 backend) so a permanent skip in CI is expected
|
||||
/// — see <c>docs/drivers/AbServer-Test-Fixture.md</c> §"Logix Emulate golden-box tier"
|
||||
/// for the full rationale + the gap matrix this test closes.</para>
|
||||
/// </remarks>
|
||||
[Collection("AbServerEmulate")]
|
||||
[Trait("Category", "Integration")]
|
||||
|
||||
+6
-3
@@ -18,7 +18,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests;
|
||||
/// Env-var overrides:
|
||||
/// <list type="bullet">
|
||||
/// <item><c>AB_LEGACY_ENDPOINT</c> — <c>host:port</c> of the PCCC-mode simulator.
|
||||
/// Defaults to <c>localhost:44818</c> (EtherNet/IP port; ab_server's PCCC
|
||||
/// Defaults to <c>10.100.0.35:44818</c> — the shared Docker host (EtherNet/IP port; ab_server's PCCC
|
||||
/// emulation exposes PCCC-over-CIP on the same port as CIP itself).</item>
|
||||
/// <item><c>AB_LEGACY_CIP_PATH</c> — routing path appended to the <c>ab://host:port/</c>
|
||||
/// URI. Defaults to <c>1,0</c> (port-1/slot-0 backplane), required by ab_server
|
||||
@@ -50,7 +50,10 @@ public sealed class AbLegacyServerFixture : IAsyncLifetime
|
||||
/// </summary>
|
||||
public const string DefaultCipPath = "1,0";
|
||||
|
||||
public string Host { get; } = "127.0.0.1";
|
||||
// 10.100.0.35 = the shared Docker host (see CLAUDE.md "Docker Workflow"). Migrated
|
||||
// off this VM's 127.0.0.1 on 2026-04-28 alongside the rest of the Docker-host move.
|
||||
// Override via AB_LEGACY_ENDPOINT to point at a real PLC or a locally-running container.
|
||||
public string Host { get; } = "10.100.0.35";
|
||||
public int Port { get; } = DefaultPort;
|
||||
|
||||
/// <summary>CIP routing path portion of the device URI (after the <c>/</c> separator).
|
||||
@@ -105,7 +108,7 @@ public sealed class AbLegacyServerFixture : IAsyncLifetime
|
||||
private static (string Host, int Port) ResolveEndpoint()
|
||||
{
|
||||
var raw = Environment.GetEnvironmentVariable(EndpointEnvVar);
|
||||
if (raw is null) return ("127.0.0.1", DefaultPort);
|
||||
if (raw is null) return ("10.100.0.35", DefaultPort);
|
||||
var parts = raw.Split(':', 2);
|
||||
var port = parts.Length == 2 && int.TryParse(parts[1], out var p) ? p : DefaultPort;
|
||||
return (parts[0], port);
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
using System.Runtime.CompilerServices;
|
||||
using System.Threading.Channels;
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
+1
-1
@@ -1,6 +1,6 @@
|
||||
using System.Threading.Channels;
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
+1
-1
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
+1
-1
@@ -1,7 +1,7 @@
|
||||
using System.Diagnostics.Metrics;
|
||||
using System.Threading.Channels;
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
+1
-1
@@ -1,6 +1,6 @@
|
||||
using System.Threading.Channels;
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
+1
-1
@@ -1,6 +1,6 @@
|
||||
using System.Threading.Channels;
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
+1
-1
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
using System.Diagnostics;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
+1
-1
@@ -1,6 +1,6 @@
|
||||
using System.Runtime.CompilerServices;
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
using Google.Protobuf;
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
using Google.Protobuf.WellKnownTypes;
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
using MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
|
||||
|
||||
+13
-3
@@ -21,9 +21,19 @@
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\..\..\src\Drivers\ZB.MOM.WW.OtOpcUa.Driver.Galaxy\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj"/>
|
||||
<!-- Pulled in transitively via Driver.Galaxy → MxGateway.Client → MxGateway.Contracts;
|
||||
explicit reference lets tests construct GalaxyObject / GalaxyAttribute fixtures. -->
|
||||
<ProjectReference Include="..\..\..\..\mxaccessgw\src\MxGateway.Contracts\MxGateway.Contracts.csproj"/>
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!-- Vendored mxaccessgw contracts DLL. The driver under test holds the same
|
||||
binary reference; this explicit duplicate lets tests construct
|
||||
GalaxyObject / GalaxyAttribute / MxCommand / MxEvent fixtures directly
|
||||
rather than only via the driver's public surface. See
|
||||
..\..\..\src\Drivers\ZB.MOM.WW.OtOpcUa.Driver.Galaxy\libs\README.md for
|
||||
the unwinding plan. -->
|
||||
<Reference Include="MxGateway.Contracts">
|
||||
<HintPath>..\..\..\src\Drivers\ZB.MOM.WW.OtOpcUa.Driver.Galaxy\libs\MxGateway.Contracts.dll</HintPath>
|
||||
<Private>true</Private>
|
||||
</Reference>
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
|
||||
+13
-4
@@ -54,7 +54,12 @@ public sealed class AddressingGrammarTests
|
||||
{
|
||||
if (_sim.SkipReason is not null) Assert.Skip(_sim.SkipReason);
|
||||
|
||||
var tag = new ModbusTagDefinition("Tank", ModbusRegion.HoldingRegisters, 100, ModbusDataType.Float32,
|
||||
// HR 200 lives in pymodbus's scratch range (`write: [200, 209]` per standard.json),
|
||||
// which gives us two consecutive writable HRs for the Float32 round-trip. The earlier
|
||||
// version used HR 100 — but standard.json declares HR 100 as a single-cell auto-
|
||||
// incrementing register (`write: [100, 100]`) so the second register of the Float32
|
||||
// write was rejected with Illegal Data Address.
|
||||
var tag = new ModbusTagDefinition("Tank", ModbusRegion.HoldingRegisters, 200, ModbusDataType.Float32,
|
||||
ByteOrder: ModbusByteOrder.WordSwap);
|
||||
var drv = await NewDriverAsync(tag);
|
||||
|
||||
@@ -87,9 +92,13 @@ public sealed class AddressingGrammarTests
|
||||
if (_sim.SkipReason is not null) Assert.Skip(_sim.SkipReason);
|
||||
|
||||
// Sanity check that the simulator accepts the larger PDU coalescing produces.
|
||||
var t1 = new ModbusTagDefinition("T1", ModbusRegion.HoldingRegisters, 300, ModbusDataType.Int16);
|
||||
var t2 = new ModbusTagDefinition("T2", ModbusRegion.HoldingRegisters, 302, ModbusDataType.Int16);
|
||||
var t3 = new ModbusTagDefinition("T3", ModbusRegion.HoldingRegisters, 304, ModbusDataType.Int16);
|
||||
// Using HR 200/202/204 in the scratch range (standard.json's `uint16: 200..209`),
|
||||
// not 300/302/304 — pymodbus rejects reads outside the seeded uint16 list with
|
||||
// Illegal Data Address (= BadOutOfRange). The coalescing semantic the test
|
||||
// exercises is identical with the scratch addresses.
|
||||
var t1 = new ModbusTagDefinition("T1", ModbusRegion.HoldingRegisters, 200, ModbusDataType.Int16);
|
||||
var t2 = new ModbusTagDefinition("T2", ModbusRegion.HoldingRegisters, 202, ModbusDataType.Int16);
|
||||
var t3 = new ModbusTagDefinition("T3", ModbusRegion.HoldingRegisters, 204, ModbusDataType.Int16);
|
||||
var opts = new ModbusDriverOptions
|
||||
{
|
||||
Host = _sim.Host, Port = _sim.Port, Tags = [t1, t2, t3], MaxReadGap = 5,
|
||||
|
||||
@@ -48,6 +48,23 @@ service + starting another. The integration tests discriminate by a
|
||||
separate `MODBUS_SIM_PROFILE` env var so they skip correctly when the
|
||||
wrong profile is live.
|
||||
|
||||
### Profile coverage matrix
|
||||
|
||||
The two general-purpose profiles cover disjoint test sets. A full pass
|
||||
of the integration suite requires running both — serially on a single
|
||||
docker host (the `:5020` collision), or in parallel on two hosts.
|
||||
|
||||
| Job | Bring up | Env to set | Expected outcome |
|
||||
|---|---|---|---|
|
||||
| `modbus-standard` | `lmxopcua-fix up modbus standard` | unset `MODBUS_SIM_PROFILE` (or set to `standard`) | Standard round-trip + AddressingGrammar suites pass; `ExceptionInjectionTests` (32 rows) skip with `MODBUS_SIM_PROFILE != exception_injection`. |
|
||||
| `modbus-exception` | `lmxopcua-fix up modbus exception_injection` | `MODBUS_SIM_PROFILE=exception_injection` | `ExceptionInjectionTests` (32 rows) pass against the per-`(fc,address)` rule set; standard-profile suites (round-trip, AddressingGrammar) skip. |
|
||||
|
||||
The DL205 / Mitsubishi / S7-1500 profiles are similar — each gates its
|
||||
own quirks suite via `MODBUS_SIM_PROFILE=<profile>`. Tests that don't
|
||||
need a specific profile (the basic round-trip set) run under any of
|
||||
the three pymodbus-based profiles. The `exception_injection` profile
|
||||
is the only one that runs `exception_injector.py` instead of pymodbus.
|
||||
|
||||
## Endpoint
|
||||
|
||||
- Default: `localhost:5020`
|
||||
|
||||
+1
-1
@@ -27,7 +27,7 @@ public sealed class ExceptionInjectionTests(ModbusSimulatorFixture sim)
|
||||
private const uint StatusGood = 0u;
|
||||
private const uint StatusBadOutOfRange = 0x803C0000u;
|
||||
private const uint StatusBadNotSupported = 0x803D0000u;
|
||||
private const uint StatusBadDeviceFailure = 0x80550000u;
|
||||
private const uint StatusBadDeviceFailure = 0x808B0000u;
|
||||
private const uint StatusBadCommunicationError = 0x80050000u;
|
||||
|
||||
private void SkipUnlessInjectorLive()
|
||||
|
||||
+6
-3
@@ -5,7 +5,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests;
|
||||
/// <summary>
|
||||
/// Reachability probe for a Modbus TCP simulator (pymodbus in Docker, see
|
||||
/// <c>Docker/docker-compose.yml</c>) or a real PLC. Parses
|
||||
/// <c>MODBUS_SIM_ENDPOINT</c> (default <c>localhost:5020</c> per PR 43) and TCP-connects once at
|
||||
/// <c>MODBUS_SIM_ENDPOINT</c> (default <c>10.100.0.35:5020</c> — the shared Docker host) and TCP-connects once at
|
||||
/// fixture construction. Each test checks <see cref="SkipReason"/> and calls
|
||||
/// <c>Assert.Skip</c> when the endpoint was unreachable, so a dev box without a running
|
||||
/// simulator still passes `dotnet test` cleanly — matches the Galaxy live-smoke pattern in
|
||||
@@ -29,8 +29,11 @@ public sealed class ModbusSimulatorFixture : IAsyncDisposable
|
||||
// PR 43: default port is 5020 (pymodbus convention) instead of 502 (Modbus standard).
|
||||
// Picking 5020 sidesteps the privileged-port admin requirement on Windows + matches the
|
||||
// port baked into the pymodbus simulator JSON profiles in Docker/profiles/. Override with
|
||||
// MODBUS_SIM_ENDPOINT to point at a real PLC on its native port 502.
|
||||
private const string DefaultEndpoint = "localhost:5020";
|
||||
// MODBUS_SIM_ENDPOINT to point at a real PLC on its native port 502, or to a
|
||||
// locally-running container if the shared host is unavailable.
|
||||
// 10.100.0.35 = the shared Docker host (see CLAUDE.md "Docker Workflow"). Migrated
|
||||
// off this VM's localhost on 2026-04-28 alongside the rest of the Docker-host move.
|
||||
private const string DefaultEndpoint = "10.100.0.35:5020";
|
||||
private const string EndpointEnvVar = "MODBUS_SIM_ENDPOINT";
|
||||
|
||||
public string Host { get; }
|
||||
|
||||
@@ -18,9 +18,9 @@ public sealed class ModbusExceptionMapperTests
|
||||
[InlineData((byte)0x01, 0x803D0000u)] // Illegal Function → BadNotSupported
|
||||
[InlineData((byte)0x02, 0x803C0000u)] // Illegal Data Address → BadOutOfRange
|
||||
[InlineData((byte)0x03, 0x803C0000u)] // Illegal Data Value → BadOutOfRange
|
||||
[InlineData((byte)0x04, 0x80550000u)] // Server Failure → BadDeviceFailure
|
||||
[InlineData((byte)0x05, 0x80550000u)] // Acknowledge (long op) → BadDeviceFailure
|
||||
[InlineData((byte)0x06, 0x80550000u)] // Server Busy → BadDeviceFailure
|
||||
[InlineData((byte)0x04, 0x808B0000u)] // Server Failure → BadDeviceFailure (Driver.Cli.Common-007)
|
||||
[InlineData((byte)0x05, 0x808B0000u)] // Acknowledge (long op) → BadDeviceFailure
|
||||
[InlineData((byte)0x06, 0x808B0000u)] // Server Busy → BadDeviceFailure
|
||||
[InlineData((byte)0x0A, 0x80050000u)] // Gateway path unavailable → BadCommunicationError
|
||||
[InlineData((byte)0x0B, 0x80050000u)] // Gateway target failed to respond → BadCommunicationError
|
||||
[InlineData((byte)0xFF, 0x80020000u)] // Unknown code → BadInternalError fallback
|
||||
@@ -61,7 +61,7 @@ public sealed class ModbusExceptionMapperTests
|
||||
[new WriteRequest("T", (short)42)],
|
||||
TestContext.Current.CancellationToken);
|
||||
|
||||
writes[0].StatusCode.ShouldBe(0x80550000u, "FC06 returning exception 04 (CPU in PROGRAM mode) maps to BadDeviceFailure");
|
||||
writes[0].StatusCode.ShouldBe(0x808B0000u, "FC06 returning exception 04 (CPU in PROGRAM mode) maps to BadDeviceFailure");
|
||||
}
|
||||
|
||||
private sealed class NonModbusFailureTransport : IModbusTransport
|
||||
|
||||
+6
-2
@@ -6,7 +6,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests;
|
||||
/// Reachability probe for an <c>opc-plc</c> simulator (Microsoft Industrial IoT's
|
||||
/// OPC UA PLC from <c>mcr.microsoft.com/iotedge/opc-plc</c>) or any real OPC UA
|
||||
/// server the <c>OPCUA_SIM_ENDPOINT</c> env var points at. Parses
|
||||
/// <c>OPCUA_SIM_ENDPOINT</c> (default <c>opc.tcp://localhost:50000</c>),
|
||||
/// <c>OPCUA_SIM_ENDPOINT</c> (default <c>opc.tcp://10.100.0.35:50000</c> — the shared Docker host),
|
||||
/// TCP-connects to the resolved host:port at collection init, and records a
|
||||
/// <see cref="SkipReason"/> on failure. Tests call <c>Assert.Skip</c> on that, so
|
||||
/// `dotnet test` stays green when Docker isn't running the simulator — mirrors the
|
||||
@@ -32,7 +32,11 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests;
|
||||
/// </remarks>
|
||||
public sealed class OpcPlcFixture : IAsyncDisposable
|
||||
{
|
||||
private const string DefaultEndpoint = "opc.tcp://localhost:50000";
|
||||
// 10.100.0.35 = the shared Docker host (see CLAUDE.md "Docker Workflow"). Migrated
|
||||
// off this VM's localhost on 2026-04-28 alongside the rest of the Docker-host move.
|
||||
// Override via OPCUA_SIM_ENDPOINT to point at a different host or a locally-running
|
||||
// opc-plc instance.
|
||||
private const string DefaultEndpoint = "opc.tcp://10.100.0.35:50000";
|
||||
private const string EndpointEnvVar = "OPCUA_SIM_ENDPOINT";
|
||||
|
||||
/// <summary>Full <c>opc.tcp://host:port</c> URL the driver session should connect to.</summary>
|
||||
|
||||
@@ -5,7 +5,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.S7.IntegrationTests;
|
||||
/// <summary>
|
||||
/// Reachability probe for the python-snap7 simulator Docker container (see
|
||||
/// <c>Docker/docker-compose.yml</c>) or a real S7 PLC. Parses <c>S7_SIM_ENDPOINT</c>
|
||||
/// (default <c>localhost:1102</c>) + TCP-connects once at fixture construction.
|
||||
/// (default <c>10.100.0.35:1102</c> — the shared Docker host) + TCP-connects once at fixture construction.
|
||||
/// Tests check <see cref="SkipReason"/> + call <c>Assert.Skip</c> when unreachable, so
|
||||
/// `dotnet test` stays green on a fresh box without the simulator installed —
|
||||
/// mirrors the <c>ModbusSimulatorFixture</c> pattern.
|
||||
@@ -35,7 +35,9 @@ public sealed class Snap7ServerFixture : IAsyncDisposable
|
||||
{
|
||||
// Default 1102 (non-privileged) matches Docker/server.py. Override with
|
||||
// S7_SIM_ENDPOINT to point at a real PLC on its native 102.
|
||||
private const string DefaultEndpoint = "localhost:1102";
|
||||
// 10.100.0.35 = the shared Docker host (see CLAUDE.md "Docker Workflow"). Migrated
|
||||
// off this VM's localhost on 2026-04-28 alongside the rest of the Docker-host move.
|
||||
private const string DefaultEndpoint = "10.100.0.35:1102";
|
||||
private const string EndpointEnvVar = "S7_SIM_ENDPOINT";
|
||||
|
||||
public string Host { get; }
|
||||
|
||||
@@ -6,6 +6,7 @@ using Microsoft.AspNetCore.Hosting;
|
||||
using Microsoft.EntityFrameworkCore;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using ZB.MOM.WW.OtOpcUa.Admin.Hubs;
|
||||
using ZB.MOM.WW.OtOpcUa.Admin.Security;
|
||||
using ZB.MOM.WW.OtOpcUa.Configuration;
|
||||
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
|
||||
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
|
||||
@@ -101,6 +102,15 @@ public sealed class AdminWebAppFactory : IAsyncDisposable
|
||||
builder.Services.AddScoped<Admin.Services.DriverInstanceService>();
|
||||
builder.Services.AddScoped<Admin.Services.DraftValidationService>();
|
||||
|
||||
// ClusterDetail.razor injects AdminHubConnectionFactory to drive the live-banner hub
|
||||
// connection; the factory depends on HubTokenService, which in turn needs Data Protection.
|
||||
// Without these the InteractiveServer circuit fails to instantiate the component and the
|
||||
// page never advances past the "Loading…" placeholder — Playwright then times out
|
||||
// waiting for any tab nav-link to appear. Mirrors Program.cs:35-36.
|
||||
builder.Services.AddDataProtection();
|
||||
builder.Services.AddSingleton<HubTokenService>();
|
||||
builder.Services.AddScoped<Admin.Services.AdminHubConnectionFactory>();
|
||||
|
||||
_app = builder.Build();
|
||||
_app.UseStaticFiles();
|
||||
_app.UseRouting();
|
||||
|
||||
@@ -110,6 +110,66 @@ public sealed class GenerationRefreshHostedServiceTests : IDisposable
|
||||
leases.OpenLeaseCount.ShouldBe(0, "IAsyncDisposable dispose must fire regardless of outcome");
|
||||
}
|
||||
|
||||
// Bug #12 fix — verifies the previously-missing wiring: applies and heartbeats both
|
||||
// emit sp_RegisterNodeGenerationApplied so Admin UI Fleet status + Redundancy LastSeenAt
|
||||
// surface live state.
|
||||
|
||||
[Fact]
|
||||
public async Task First_apply_reports_Applied_status_to_central_db()
|
||||
{
|
||||
var coordinator = await SeedCoordinatorAsync();
|
||||
var leases = new ApplyLeaseRegistry();
|
||||
var calls = new List<(long Gen, NodeApplyStatus Status, string? Error)>();
|
||||
var service = NewService(coordinator, leases, currentGeneration: () => 42, registerCalls: calls);
|
||||
|
||||
await service.TickAsync(CancellationToken.None);
|
||||
|
||||
calls.Count.ShouldBe(1, "exactly one register call per apply window");
|
||||
calls[0].Gen.ShouldBe(42);
|
||||
calls[0].Status.ShouldBe(NodeApplyStatus.Applied);
|
||||
calls[0].Error.ShouldBeNull();
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task No_change_tick_heartbeats_with_Applied_status()
|
||||
{
|
||||
var coordinator = await SeedCoordinatorAsync();
|
||||
var leases = new ApplyLeaseRegistry();
|
||||
var calls = new List<(long Gen, NodeApplyStatus Status, string? Error)>();
|
||||
var service = NewService(coordinator, leases, currentGeneration: () => 42, registerCalls: calls);
|
||||
|
||||
await service.TickAsync(CancellationToken.None); // initial apply
|
||||
await service.TickAsync(CancellationToken.None); // no-change heartbeat
|
||||
await service.TickAsync(CancellationToken.None); // no-change heartbeat
|
||||
|
||||
calls.Count.ShouldBe(3, "one apply call + two heartbeat calls");
|
||||
calls.ShouldAllBe(c => c.Gen == 42 && c.Status == NodeApplyStatus.Applied && c.Error == null);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task Register_call_failure_does_not_break_apply_or_block_subsequent_ticks()
|
||||
{
|
||||
var coordinator = await SeedCoordinatorAsync();
|
||||
var leases = new ApplyLeaseRegistry();
|
||||
var registerCallCount = 0;
|
||||
var service = new GenerationRefreshHostedService(
|
||||
new NodeOptions { NodeId = "A", ClusterId = "c1", ConfigDbConnectionString = "unused" },
|
||||
leases, coordinator, NullLogger<GenerationRefreshHostedService>.Instance,
|
||||
tickInterval: TimeSpan.FromSeconds(1),
|
||||
currentGenerationQuery: _ => Task.FromResult<long?>(42),
|
||||
registerAppliedAsync: (gen, status, err, ct) =>
|
||||
{
|
||||
registerCallCount++;
|
||||
throw new InvalidOperationException("simulated DB outage during register");
|
||||
});
|
||||
|
||||
await service.TickAsync(CancellationToken.None); // apply succeeds, register throws
|
||||
await service.TickAsync(CancellationToken.None); // heartbeat throws
|
||||
|
||||
registerCallCount.ShouldBe(2, "both register attempts must run");
|
||||
service.LastAppliedGenerationId.ShouldBe(42, "register failure must not roll back the cursor");
|
||||
}
|
||||
|
||||
// ---- fixture helpers ---------------------------------------------------
|
||||
|
||||
private async Task<RedundancyCoordinator> SeedCoordinatorAsync()
|
||||
@@ -136,11 +196,15 @@ public sealed class GenerationRefreshHostedServiceTests : IDisposable
|
||||
private static GenerationRefreshHostedService NewService(
|
||||
RedundancyCoordinator coordinator,
|
||||
ApplyLeaseRegistry leases,
|
||||
Func<long?> currentGeneration) =>
|
||||
Func<long?> currentGeneration,
|
||||
List<(long Gen, NodeApplyStatus Status, string? Error)>? registerCalls = null) =>
|
||||
new(new NodeOptions { NodeId = "A", ClusterId = "c1", ConfigDbConnectionString = "unused" },
|
||||
leases, coordinator, NullLogger<GenerationRefreshHostedService>.Instance,
|
||||
tickInterval: TimeSpan.FromSeconds(1),
|
||||
currentGenerationQuery: _ => Task.FromResult(currentGeneration()));
|
||||
currentGenerationQuery: _ => Task.FromResult(currentGeneration()),
|
||||
registerAppliedAsync: registerCalls is null
|
||||
? (_, _, _, _) => Task.CompletedTask
|
||||
: (gen, status, err, _) => { registerCalls.Add((gen, status, err)); return Task.CompletedTask; });
|
||||
|
||||
private sealed class DbContextFactory(DbContextOptions<OtOpcUaConfigDbContext> options)
|
||||
: IDbContextFactory<OtOpcUaConfigDbContext>
|
||||
|
||||
Reference in New Issue
Block a user