From 7b6ab2ec6f6ca8c53553bbf007fc75d8e694cdf0 Mon Sep 17 00:00:00 2001 From: Joseph Doherty Date: Sat, 23 May 2026 15:55:04 -0400 Subject: [PATCH] fix(scripting): unload compiled-script assemblies via collectible ALC MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Core.Scripting-008 resolution: replace the legacy CSharpScript.CreateDelegate path with hand-rolled CSharpCompilation + Emit + collectible AssemblyLoadContext, so per-publish compile accretion no longer requires a server restart to reclaim. Why this was needed: Roslyn's CSharpScript path emits dynamically-compiled script assemblies into the default AssemblyLoadContext, which is non-collectible. Across config- publish generations each Clear() drops dictionary entries but the emitted assemblies stay loaded for process lifetime, so memory grows steadily on long-running servers with frequent publishes. The accepted-limitation note in docs/VirtualTags.md recommended scheduled restarts as the workaround; operator feedback was that restarts are difficult, so the underlying limitation was the right thing to fix. Implementation: - New ScriptAssemblyLoadContext(name, isCollectible: true) hosts one emitted script assembly per evaluator. - ScriptEvaluator.Compile synthesises a wrapper class around the user source (CompiledScript.Run(globals) — explicit return required per ordinary C# semantics, which every existing script already uses), builds a CSharpCompilation against the sandbox references, runs the ForbiddenTypeAnalyzer over the semantic model unchanged, emits to an in-memory PE stream, loads via ScriptAssemblyLoadContext.LoadFromStream, and binds a strongly-typed Func, TResult> delegate via reflection. - ScriptEvaluator now implements IDisposable — Dispose calls AssemblyLoadContext.Unload(), which makes the emitted assembly eligible for GC at the next collection cycle. - CompiledScriptCache.Clear() disposes every materialised evaluator before dropping its dictionary entry; CompiledScriptCache itself is now IDisposable for graceful server shutdown. - ScriptSandbox.Build returns a new SandboxConfig (References + Imports) instead of a Roslyn ScriptOptions; references now span BCL via the TRUSTED_PLATFORM_ASSEMBLIES set filtered to System.* + netstandard + Microsoft.Win32.Registry, so forbidden BCL types resolve at compile and ForbiddenTypeAnalyzer is the sole security gate (consistent with the Core.Scripting-001 / -002 model — references-list-only restriction is porous against type forwarding, so the analyzer must be the real gate). Verification: - All 104 Core.Scripting tests pass (was 101 — three new regression tests locking the unload contract). - All 56 VirtualTags tests pass (unchanged). - All 63 ScriptedAlarms tests pass (unchanged). - New CompiledScriptCacheTests: - Dispose_unloads_compiled_script_assembly_load_context — proves single- evaluator ALC unload via WeakReference + bounded GC.Collect() loop. - Clear_disposes_every_materialised_evaluator — proves publish-replace releases every prior generation's ALC. - GetOrCompile_after_Dispose_throws_ObjectDisposedException — locks the post-dispose contract. Docs: - docs/VirtualTags.md "Compile cache" section rewritten: the accepted- limitation note replaced with the unload contract + the new authoring convention (explicit return). - docs/ScriptedAlarms.md cross-reference updated to drop the obsolete restart guidance. - code-reviews/Core.Scripting/findings.md Core.Scripting-008 flipped Won't Fix → Resolved with the implementation summary. - code-reviews/README.md regenerated. Pre-existing breakage note: Driver.Galaxy fails the solution-wide build on master because its ProjectReference to the sibling mxaccessgw repo's MxGateway.Client targets a path that the sibling repo no longer has after a recent restructuring. This is unrelated to Core.Scripting-008 and was verified to exist on master before this branch was cut. Co-Authored-By: Claude Opus 4.7 (1M context) --- code-reviews/Core.Scripting/findings.md | 30 +- code-reviews/README.md | 2 +- docs/ScriptedAlarms.md | 2 +- docs/VirtualTags.md | 4 +- .../CompiledScriptCache.cs | 61 +++- .../ScriptEvaluator.cs | 312 ++++++++++++++++-- .../ScriptSandbox.cs | 108 ++++-- .../CompiledScriptCacheTests.cs | 103 ++++++ 8 files changed, 553 insertions(+), 69 deletions(-) diff --git a/code-reviews/Core.Scripting/findings.md b/code-reviews/Core.Scripting/findings.md index 9101cce..19cf939 100644 --- a/code-reviews/Core.Scripting/findings.md +++ b/code-reviews/Core.Scripting/findings.md @@ -240,7 +240,7 @@ race ordering. | Severity | Low | | Category | Performance & resource management | | Location | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` | -| Status | Won't Fix | +| Status | Resolved | **Description:** `CompiledScriptCache` has no capacity bound (acknowledged in the class remarks) and no eviction. Each cached `ScriptEvaluator` holds a Roslyn `ScriptRunner` @@ -257,7 +257,33 @@ compile scripts into a collectible `AssemblyLoadContext` so `Clear()` can unload generations. At minimum add a note to `docs/ScriptedAlarms.md` so operators with high-publish-frequency deployments are aware. -**Resolution:** Resolved 2026-05-23 — accepted as a documented known limitation rather than fixing in code (collectible `AssemblyLoadContext` for Roslyn-emitted assemblies is a v3 concern). The "Compile cache" section of `docs/VirtualTags.md` now carries a "Per-publish assembly accretion (accepted limitation, Core.Scripting-008)" note that operators with high-publish-frequency deployments can scan, and `docs/ScriptedAlarms.md` cross-references it. The accretion is benign at the expected "low thousands" of scripts scale; recommended mitigation is a scheduled server restart for deployments that publish very frequently. +**Resolution:** Resolved 2026-05-23 — switched the compile pipeline off the legacy +`CSharpScript.CreateDelegate` path (which emits into the default, non-collectible +`AssemblyLoadContext`) and onto a hand-rolled `CSharpCompilation` → +`Compilation.Emit(MemoryStream)` → `ScriptAssemblyLoadContext.LoadFromStream` chain, +with the new `ScriptAssemblyLoadContext` constructed `isCollectible: true`. Each +compiled script lives in its own ALC; `ScriptEvaluator` now implements `IDisposable` +and calls `AssemblyLoadContext.Unload()` on dispose. `CompiledScriptCache.Clear()` +disposes every materialised evaluator before dropping its dictionary entry, and +`CompiledScriptCache` itself is now `IDisposable` for graceful server shutdown. +After a publish-replace cycle the prior generation's emitted assemblies become +eligible for GC; the reclaim is GC-timing-sensitive (Unload is +*eligible-for-collection*, not synchronous) and the next collection cycle reclaims +them. The references list is now BCL-wide (System.* + netstandard + Microsoft.Win32.Registry +via the TRUSTED_PLATFORM_ASSEMBLIES set) so forbidden BCL types resolve at compile and +`ForbiddenTypeAnalyzer` is the sole security gate (consistent with the +Core.Scripting-001 / -002 model). `docs/VirtualTags.md` "Compile cache" section rewritten; +`docs/ScriptedAlarms.md` cross-reference updated to drop the obsolete restart guidance. +Regression tests added in `CompiledScriptCacheTests`: +`Dispose_unloads_compiled_script_assembly_load_context`, +`Clear_disposes_every_materialised_evaluator`, and +`GetOrCompile_after_Dispose_throws_ObjectDisposedException`; the first two +prove ALC unload via `WeakReference` + bounded `GC.Collect()` loops. Suite now 104 +green (was 101). Authoring convention: the synthesized wrapper is an ordinary +C# static method, so scripts must end with explicit `return …;` per ordinary C# rules +(the legacy `CSharpScript` "last expression yields result" shorthand no longer applies); +every script in the existing corpus already uses explicit `return` so this is a doc-only +change for new authors. ### Core.Scripting-009 diff --git a/code-reviews/README.md b/code-reviews/README.md index 4711e69..dd01f44 100644 --- a/code-reviews/README.md +++ b/code-reviews/README.md @@ -286,7 +286,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`. | Core.ScriptedAlarms-011 | Low | Resolved | Code organization & conventions | `Part9StateMachine.cs:275` | | Core.Scripting-005 | Low | Resolved | Correctness & logic bugs | `DependencyExtractor.cs:97` | | Core.Scripting-006 | Low | Resolved | Concurrency & thread safety | `CompiledScriptCache.cs:55` | -| Core.Scripting-008 | Low | Won't Fix | Performance & resource management | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` | +| Core.Scripting-008 | Low | Resolved | Performance & resource management | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` | | Core.Scripting-009 | Low | Resolved | Design-document adherence | `ForbiddenTypeAnalyzer.cs:45` | | Core.Scripting-011 | Low | Resolved | Testing coverage | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/` | | Core.VirtualTags-004 | Low | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:349` | diff --git a/docs/ScriptedAlarms.md b/docs/ScriptedAlarms.md index 0a2bea9..1277ba6 100644 --- a/docs/ScriptedAlarms.md +++ b/docs/ScriptedAlarms.md @@ -35,7 +35,7 @@ new ScriptedAlarmDefinition( ## Predicate evaluation -Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory, the per-publish accretion of dynamically-emitted script assemblies (Core.Scripting-008), and the orphan-thread CPU-budget caveat — are documented in that file as well. +Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory and the orphan-thread CPU-budget caveat — are documented in that file as well; per-publish assembly accretion was resolved by the Core.Scripting-008 collectible-`AssemblyLoadContext` rewrite and no longer requires periodic server restarts. `AlarmPredicateContext` (`AlarmPredicateContext.cs`) is the script's `ScriptContext` subclass: diff --git a/docs/VirtualTags.md b/docs/VirtualTags.md index 5f134b5..9ab1827 100644 --- a/docs/VirtualTags.md +++ b/docs/VirtualTags.md @@ -30,7 +30,9 @@ Similarly, **`System.Threading.Tasks` is now denied** (Core.Scripting-003), whic `ConcurrentDictionary>>` keyed on `SHA-256(UTF8(source))` rendered to hex. `Lazy` with `ExecutionAndPublication` mode means two threads racing a miss compile exactly once. Failed compiles evict the entry (via the `TryRemove(KeyValuePair<,>)` overload so a concurrently re-added retry entry is not collateral damage — Core.Scripting-006) so a corrected retry can succeed (used during Admin UI authoring). No capacity bound — scripts are operator-authored and bounded by the config DB. Whitespace changes miss the cache on purpose. `Clear()` is called on config-publish. -**Per-publish assembly accretion (accepted limitation, Core.Scripting-008).** Each compiled `ScriptEvaluator` holds a Roslyn `ScriptRunner` delegate, which keeps the dynamically-emitted script assembly loaded for the process lifetime. Emitted assemblies in the default `AssemblyLoadContext` cannot be unloaded; `CompiledScriptCache.Clear()` drops the dictionary entries but does **not** unload the underlying assemblies. Across many config-publish generations (each `Clear()` followed by recompiling every script), the process accumulates dead script assemblies. For the expected "low thousands" of scripts this is benign, but a long-running server with very frequent publishes will see steady managed-memory growth that does not return until the process restarts. Out-of-process script evaluation or a collectible `AssemblyLoadContext` is a v3 concern; deployments with high-publish-frequency requirements should schedule a periodic server restart to reclaim the accrued assemblies. +**Per-publish assembly unload (Core.Scripting-008 resolved).** Each compiled `ScriptEvaluator` emits its script into a dedicated **collectible** `AssemblyLoadContext` — the BCL escape hatch for assemblies that can be unloaded. The compile path is hand-rolled `CSharpCompilation.Create` + `Emit(MemoryStream)` + `ScriptAssemblyLoadContext.LoadFromStream` rather than the legacy `CSharpScript.CreateDelegate` (which emits into the default ALC and cannot be unloaded). `ScriptEvaluator.Dispose()` calls `AssemblyLoadContext.Unload()` and `CompiledScriptCache.Clear()` disposes every materialised evaluator before dropping its dictionary entry, so the emitted assemblies become eligible for GC immediately after a config-publish. The reclaim is GC-timing-sensitive (Unload is *eligible-for-collection*, not synchronous); the next collection cycle reclaims them. Regression tests `Dispose_unloads_compiled_script_assembly_load_context` and `Clear_disposes_every_materialised_evaluator` in `CompiledScriptCacheTests` lock this contract via `WeakReference` + `GC.Collect()` assertions. Server restarts are no longer required to reclaim compiled-script memory. + +**Scripting authoring convention.** With the collectible-ALC rewrite, the wrapper around a user script is an ordinary C# static method, not a Roslyn `Script` submission. The script body is pasted verbatim as the method body and must therefore end with an explicit `return …;` per ordinary C# rules — the legacy `CSharpScript` "last expression yields result" shorthand is gone. Every script in the existing test corpus already uses explicit `return`; this convention is operator-visible only when authoring a brand-new script from scratch. ### Per-evaluation timeout (`TimedScriptEvaluator`) diff --git a/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/CompiledScriptCache.cs b/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/CompiledScriptCache.cs index 2caaef6..ea35097 100644 --- a/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/CompiledScriptCache.cs +++ b/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/CompiledScriptCache.cs @@ -30,11 +30,20 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting; /// bounded by config DB (typically low thousands). If that changes in v3, add an /// LRU eviction policy — the API stays the same. /// +/// +/// Lifecycle: compiled scripts hold a collectible +/// per evaluator +/// (Core.Scripting-008 fix). disposes every materialised +/// evaluator before dropping its dictionary entry so the emitted assemblies are +/// eligible for GC immediately after a publish. drops the +/// cache itself for graceful server shutdown. +/// /// -public sealed class CompiledScriptCache +public sealed class CompiledScriptCache : IDisposable where TContext : ScriptContext { private readonly ConcurrentDictionary>> _cache = new(); + private bool _disposed; /// /// Return the compiled evaluator for , compiling @@ -46,6 +55,7 @@ public sealed class CompiledScriptCache public ScriptEvaluator GetOrCompile(string scriptSource) { if (scriptSource is null) throw new ArgumentNullException(nameof(scriptSource)); + if (_disposed) throw new ObjectDisposedException(nameof(CompiledScriptCache)); var key = HashSource(scriptSource); var lazy = _cache.GetOrAdd(key, _ => new Lazy>( @@ -72,13 +82,58 @@ public sealed class CompiledScriptCache /// Current entry count. Exposed for Admin UI diagnostics / tests. public int Count => _cache.Count; - /// Drop every cached compile. Used on config generation publish + tests. - public void Clear() => _cache.Clear(); + /// + /// Drop every cached compile. Used on config generation publish + tests. + /// Disposes each materialised evaluator before removing it so its collectible + /// unloads and the + /// emitted script assembly becomes eligible for GC (Core.Scripting-008). + /// + public void Clear() + { + if (_disposed) return; + // Snapshot the entries, swap them out, then dispose. We use TryRemove rather + // than _cache.Clear() so a concurrent GetOrCompile re-add after our snapshot + // is not silently lost — a new compile starts a fresh cache entry, the old + // evaluator is still disposed. + foreach (var key in _cache.Keys.ToArray()) + { + if (_cache.TryRemove(key, out var lazy)) + DisposeLazyIfMaterialised(lazy); + } + } /// True when the exact source has been compiled at least once + is still cached. public bool Contains(string scriptSource) => _cache.ContainsKey(HashSource(scriptSource)); + /// + /// Drop the cache and dispose every materialised evaluator. After disposal + /// throws . + /// + public void Dispose() + { + if (_disposed) return; + _disposed = true; + Clear(); + } + + private static void DisposeLazyIfMaterialised(Lazy> lazy) + { + // IsValueCreated is false for a faulted Lazy too, so the catch in GetOrCompile + // has already taken care of failed compiles — there's no evaluator to dispose. + if (!lazy.IsValueCreated) return; + try + { + lazy.Value.Dispose(); + } + catch + { + // Dispose is best-effort here: an evaluator disposal failure would leak its + // ALC but mustn't prevent the rest of the cache from clearing. The ALC + // unload itself is exception-free in practice; this is defensive. + } + } + private static string HashSource(string source) { var bytes = Encoding.UTF8.GetBytes(source); diff --git a/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptEvaluator.cs b/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptEvaluator.cs index 4f406ea..812cee9 100644 --- a/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptEvaluator.cs +++ b/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptEvaluator.cs @@ -1,75 +1,315 @@ -using Microsoft.CodeAnalysis.CSharp.Scripting; -using Microsoft.CodeAnalysis.Scripting; +using System.Reflection; +using System.Runtime.Loader; +using System.Text; +using Microsoft.CodeAnalysis; +using Microsoft.CodeAnalysis.CSharp; namespace ZB.MOM.WW.OtOpcUa.Core.Scripting; /// /// Compiles + runs user scripts against a subclass. Core -/// evaluator — no caching, no timeout, no logging side-effects yet (those land in -/// Stream A.3, A.4, A.5 respectively). Stream B + C wrap this with the dependency -/// scheduler + alarm state machine. +/// evaluator — no caching, no timeout, no logging side-effects (those land in +/// , +/// , and +/// respectively). /// /// /// -/// Scripts are compiled against so the -/// context member is named ctx in the script, matching the -/// 's walker and the Admin UI type stub. +/// Scripts are wrapped in a synthesized CompiledScript.Run(globals) method +/// and compiled via into a regular .NET assembly +/// that is loaded into a collectible +/// . The collectible ALC is the fix for +/// Core.Scripting-008: per-publish recompile accretion was previously unbounded +/// because Roslyn's CSharpScript.CreateDelegate emits into the default ALC +/// (non-collectible); now unloads the entire ALC and the +/// emitted assembly becomes eligible for GC. /// /// -/// Compile pipeline is a three-step gate: (1) Roslyn compile — catches syntax -/// errors + type-resolution failures, throws ; -/// (2) runs against the semantic model — -/// catches sandbox escapes that slipped past reference restrictions due to .NET's -/// type forwarding, throws ; (3) -/// delegate creation — throws at this layer only for internal Roslyn bugs, not -/// user error. +/// Compile pipeline is a three-step gate, unchanged in intent from the legacy +/// CSharpScript path: (1) Roslyn parse + compile against the +/// allow-list — catches syntax errors, unresolved +/// types (the sandbox's first line of defense), and most type-resolution +/// failures, throwing ; (2) +/// runs against the semantic model — catches +/// sandbox escapes that slipped past reference restrictions due to .NET's type +/// forwarding, throwing ; (3) emit +/// to an in-memory PE stream + load into the collectible ALC — throws at this +/// layer only for internal Roslyn bugs, not user error. /// /// /// Runtime exceptions thrown from user code propagate unwrapped. The virtual-tag -/// engine (Stream B) catches them per-tag + maps to BadInternalError -/// quality per Phase 7 decision #11 — this layer doesn't swallow anything so -/// tests can assert on the original exception type. +/// engine catches them per-tag and maps to BadInternalError quality +/// per Phase 7 decision #11; this layer doesn't swallow anything so tests can +/// assert on the original exception type. +/// +/// +/// Scripts are expected to be statement bodies that end with an explicit +/// return …; — the wrapper provides only the surrounding method body, so +/// the script's final-expression-yields-result behavior of legacy +/// CSharpScript is replaced by ordinary C# method semantics. Every script +/// in the existing test corpus already uses explicit return; this is a +/// documented authoring convention. /// /// -public sealed class ScriptEvaluator +public sealed class ScriptEvaluator : IDisposable where TContext : ScriptContext { - private readonly ScriptRunner _runner; + private readonly ScriptAssemblyLoadContext _alc; + private readonly Func, TResult> _func; + private bool _disposed; - private ScriptEvaluator(ScriptRunner runner) + private ScriptEvaluator(ScriptAssemblyLoadContext alc, Func, TResult> func) { - _runner = runner; + _alc = alc; + _func = func; } public static ScriptEvaluator Compile(string scriptSource) { if (scriptSource is null) throw new ArgumentNullException(nameof(scriptSource)); - var options = ScriptSandbox.Build(typeof(TContext)); - var script = CSharpScript.Create( - code: scriptSource, - options: options, - globalsType: typeof(ScriptGlobals)); + var sandbox = ScriptSandbox.Build(typeof(TContext)); - // Step 1 — Roslyn compile. Throws CompilationErrorException on syntax / type errors. - var diagnostics = script.Compile(); + // Step 1 — synthesize a wrapper class around the script body and parse it. The + // wrapper's `Run` method is what we invoke at runtime; the user's source is + // pasted in as its body so explicit `return` semantics apply. + var wrapperSource = BuildWrapperSource(scriptSource, sandbox.Imports); + var syntaxTree = CSharpSyntaxTree.ParseText(wrapperSource); - // Step 2 — forbidden-type semantic analysis. Defense-in-depth against reference-list - // leaks due to type forwarding. - var rejections = ForbiddenTypeAnalyzer.Analyze(script.GetCompilation()); + // Step 2 — Roslyn compile against the sandbox allow-list. Anything not in the + // references set is unresolved and produces a compiler error. + var assemblyName = "ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled." + + Guid.NewGuid().ToString("N"); + var compileOptions = new CSharpCompilationOptions( + OutputKind.DynamicallyLinkedLibrary, + optimizationLevel: OptimizationLevel.Release, + allowUnsafe: false, + // Don't generate XML doc warnings for the synthesized wrapper. + warningLevel: 4, + nullableContextOptions: NullableContextOptions.Enable); + var compilation = CSharpCompilation.Create( + assemblyName, + syntaxTrees: new[] { syntaxTree }, + references: sandbox.References, + options: compileOptions); + + var compileDiagnostics = compilation.GetDiagnostics(); + var compileErrors = compileDiagnostics + .Where(d => d.Severity == DiagnosticSeverity.Error) + .ToArray(); + if (compileErrors.Length > 0) + throw new CompilationErrorException(compileErrors); + + // Step 3 — forbidden-type semantic analysis. Defense-in-depth against + // reference-list leaks due to type forwarding. + var rejections = ForbiddenTypeAnalyzer.Analyze(compilation); if (rejections.Count > 0) throw new ScriptSandboxViolationException(rejections); - // Step 3 — materialize the callable delegate. - var runner = script.CreateDelegate(); - return new ScriptEvaluator(runner); + // Step 4 — emit to an in-memory PE stream and load into a collectible ALC. + using var peStream = new MemoryStream(); + var emitResult = compilation.Emit(peStream); + if (!emitResult.Success) + { + var emitErrors = emitResult.Diagnostics + .Where(d => d.Severity == DiagnosticSeverity.Error) + .ToArray(); + throw new CompilationErrorException(emitErrors); + } + + peStream.Position = 0; + var alc = new ScriptAssemblyLoadContext(assemblyName); + Assembly assembly; + try + { + assembly = alc.LoadFromStream(peStream); + } + catch + { + // Failed to load — drop the ALC so we don't leak a half-initialised one. + alc.Unload(); + throw; + } + + // Step 5 — resolve the wrapper's Run method and bind a typed delegate. The + // wrapper source above puts the type in this exact namespace + class — keep the + // names in sync with BuildWrapperSource. + Func, TResult> func; + try + { + var wrapperType = assembly.GetType( + "ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled.CompiledScript", + throwOnError: true)!; + var runMethod = wrapperType.GetMethod( + "Run", + BindingFlags.Public | BindingFlags.Static) + ?? throw new InvalidOperationException( + "Synthesized wrapper is missing the public static Run method."); + func = (Func, TResult>)Delegate.CreateDelegate( + typeof(Func, TResult>), runMethod); + } + catch + { + alc.Unload(); + throw; + } + + return new ScriptEvaluator(alc, func); } /// Run against an already-constructed context. public Task RunAsync(TContext context, CancellationToken ct = default) { + if (_disposed) throw new ObjectDisposedException(nameof(ScriptEvaluator)); if (context is null) throw new ArgumentNullException(nameof(context)); + ct.ThrowIfCancellationRequested(); var globals = new ScriptGlobals { ctx = context }; - return _runner(globals, ct); + // The user's script is synchronous (Roslyn emits a static method that returns + // TResult directly). We surface a Task only to keep the existing + // RunAsync contract consumers depend on. TimedScriptEvaluator wraps this in + // Task.Run so a long-running script still honours its wall-clock budget. + var result = _func(globals); + return Task.FromResult(result); + } + + /// + /// Unload the collectible that holds the emitted + /// script assembly so the runtime can reclaim it. After disposal the evaluator can + /// no longer be invoked — call + /// again for a fresh one. Dispose is idempotent. + /// + /// + /// Unload is eligible-for-collection, not synchronous: the assembly is + /// reclaimed when the GC determines no live references remain. The cache disposes + /// evaluators in so a + /// config-generation publish releases the prior generation in one sweep; the + /// reclaim then races with the next GC cycle. Tests verify the reclaim via + /// + . + /// + public void Dispose() + { + if (_disposed) return; + _disposed = true; + _alc.Unload(); + } + + /// + /// Synthesize the source we hand to Roslyn. The user's script body is pasted + /// verbatim inside CompiledScript.Run; the using directives mirror + /// 's imports so scripts can write Math.Abs + /// instead of System.Math.Abs. + /// + private static string BuildWrapperSource(string userSource, IReadOnlyList imports) + { + var sb = new StringBuilder(); + foreach (var import in imports) + sb.Append("using ").Append(import).AppendLine(";"); + sb.AppendLine(); + sb.AppendLine("namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled;"); + sb.AppendLine(); + sb.AppendLine("public static class CompiledScript"); + sb.AppendLine("{"); + sb.Append(" public static ").Append(ToCSharpTypeName(typeof(TResult))) + .Append(" Run(").Append(ToCSharpTypeName(typeof(ScriptGlobals))) + .AppendLine(" globals)"); + sb.AppendLine(" {"); + sb.AppendLine(" var ctx = globals.ctx;"); + // User source ends with `return X;` per the authoring convention; we paste it + // verbatim. The leading newline keeps Roslyn diagnostics' line numbers usable + // by operators (errors point at the user's source line, not the wrapper). + sb.AppendLine("#line 1"); + sb.AppendLine(userSource); + sb.AppendLine(" }"); + sb.AppendLine("}"); + return sb.ToString(); + } + + /// + /// Convert a runtime to a C# type-name string suitable for + /// emitting into Roslyn source. Uses global::-qualified FQNs to avoid + /// accidental capture by the wrapper's using directives, handles nested + /// types (+.), and recurses for generic arguments so the + /// ScriptGlobals<TContext> parameter is emitted correctly. + /// + private static string ToCSharpTypeName(Type t) + { + if (t == typeof(void)) return "void"; + // Primitive aliases keep the synthesized source readable when diagnostic + // logging dumps it; functionally identical to the FQN form. + if (t == typeof(bool)) return "bool"; + if (t == typeof(byte)) return "byte"; + if (t == typeof(sbyte)) return "sbyte"; + if (t == typeof(short)) return "short"; + if (t == typeof(ushort)) return "ushort"; + if (t == typeof(int)) return "int"; + if (t == typeof(uint)) return "uint"; + if (t == typeof(long)) return "long"; + if (t == typeof(ulong)) return "ulong"; + if (t == typeof(float)) return "float"; + if (t == typeof(double)) return "double"; + if (t == typeof(decimal)) return "decimal"; + if (t == typeof(string)) return "string"; + if (t == typeof(object)) return "object"; + + if (Nullable.GetUnderlyingType(t) is { } inner) + return ToCSharpTypeName(inner) + "?"; + + if (t.IsArray) + return ToCSharpTypeName(t.GetElementType()!) + "[]"; + + if (t.IsGenericType) + { + var def = t.GetGenericTypeDefinition(); + var rawName = def.FullName!.Replace('+', '.'); + var nameNoArity = rawName.Substring(0, rawName.IndexOf('`')); + var args = string.Join(", ", t.GetGenericArguments().Select(ToCSharpTypeName)); + return "global::" + nameNoArity + "<" + args + ">"; + } + + return "global::" + t.FullName!.Replace('+', '.'); + } +} + +/// +/// Collectible that hosts a single emitted script +/// assembly. Created per instance so +/// releases exactly that script. Resolves +/// dependencies via the default ALC — script assemblies reference the BCL + the +/// application's own types, all of which live in the default context. +/// +internal sealed class ScriptAssemblyLoadContext : AssemblyLoadContext +{ + public ScriptAssemblyLoadContext(string name) : base(name, isCollectible: true) + { + } + + protected override Assembly? Load(AssemblyName assemblyName) => null; +} + +/// +/// Thrown by when Roslyn +/// reports compile-time errors against the wrapper source. Mirrors the +/// Microsoft.CodeAnalysis.Scripting.CompilationErrorException from the legacy +/// CSharpScript path so callers (engines + the Admin test-harness) keep the +/// same catch site after the Core.Scripting-008 rewrite. +/// +public sealed class CompilationErrorException : Exception +{ + public IReadOnlyList Diagnostics { get; } + + public CompilationErrorException(IReadOnlyList diagnostics) + : base(BuildMessage(diagnostics)) + { + Diagnostics = diagnostics; + } + + private static string BuildMessage(IReadOnlyList diagnostics) + { + if (diagnostics.Count == 0) return "Script compile failed."; + // Operators see this — match the legacy Roslyn format ("(line,col): error CSxxxx: + // message") so existing operator runbooks still match. + var first = diagnostics[0]; + var rest = diagnostics.Count == 1 ? "" : $" (and {diagnostics.Count - 1} more)"; + return first.ToString() + rest; } } diff --git a/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptSandbox.cs b/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptSandbox.cs index d4a207e..eb5335a 100644 --- a/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptSandbox.cs +++ b/src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptSandbox.cs @@ -1,11 +1,10 @@ -using Microsoft.CodeAnalysis.CSharp.Scripting; -using Microsoft.CodeAnalysis.Scripting; +using Microsoft.CodeAnalysis; using ZB.MOM.WW.OtOpcUa.Core.Abstractions; namespace ZB.MOM.WW.OtOpcUa.Core.Scripting; /// -/// Factory for the every user script is compiled against. +/// Factory for the compile-time sandbox every user script is built against. /// Implements Phase 7 plan decision #6 (read-only sandbox) by whitelisting only the /// assemblies + namespaces the script API needs; no System.IO, no /// System.Net, no System.Diagnostics.Process, no @@ -15,9 +14,12 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting; /// /// /// -/// Roslyn's default references mscorlib / -/// System.Runtime transitively which pulls in every type in the BCL — this -/// class overrides that with an explicit minimal allow-list. +/// Roslyn would otherwise pull in every type in the BCL transitively via +/// mscorlib / System.Runtime — this class overrides that with an +/// explicit minimal allow-list. The list is the same regardless of whether +/// uses the legacy +/// CSharpScript path or the collectible-AssemblyLoadContext path +/// (Core.Scripting-008): both go through . /// /// /// Namespaces pre-imported so scripts don't have to write using clauses: @@ -35,29 +37,21 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting; public static class ScriptSandbox { /// - /// Build the used for every virtual-tag / alarm - /// script. is the concrete - /// subclass the globals will be of — the compiler - /// uses its type to resolve ctx.GetTag(...) calls. + /// Build the sandbox configuration used for every virtual-tag / alarm script. + /// is the concrete + /// subclass the script's ctx will be of — the compiler uses its assembly + /// to resolve ctx.GetTag(...) calls. /// - public static ScriptOptions Build(Type contextType) + public static SandboxConfig Build(Type contextType) { if (contextType is null) throw new ArgumentNullException(nameof(contextType)); if (!typeof(ScriptContext).IsAssignableFrom(contextType)) throw new ArgumentException( $"Script context type must derive from {nameof(ScriptContext)}", nameof(contextType)); - // Allow-listed assemblies — each explicitly chosen. Adding here is a - // plan-level decision; do not expand casually. HashSet so adding the - // contextType's assembly is idempotent when it happens to be Core.Scripting - // already. - var allowedAssemblies = new HashSet + // OtOpcUa-owned assemblies — pinned by typeof(...) so they survive a rename. + var pinnedAssemblies = new HashSet { - // System.Private.CoreLib — primitives (int, double, bool, string, DateTime, - // TimeSpan, Math, Convert, nullable). Can't practically script without it. - typeof(object).Assembly, - // System.Linq — IEnumerable extensions (Where / Select / Sum / Average / etc.). - typeof(System.Linq.Enumerable).Assembly, // Core.Abstractions — DataValueSnapshot + DriverDataType so scripts can name // the types they receive from ctx.GetTag. typeof(DataValueSnapshot).Assembly, @@ -72,7 +66,23 @@ public static class ScriptSandbox contextType.Assembly, }; - var allowedImports = new[] + // BCL references. We list the trusted-platform-assemblies set restricted to + // System.* and netstandard so the synthesized wrapper can reference every BCL + // type by FQN — including the ones we forbid (HttpClient, File, Process, + // Registry, etc.). Letting those types resolve at compile is intentional: the + // hard security gate is ForbiddenTypeAnalyzer in step 3 of the compile pipeline + // (Core.Scripting-001 / -002 established the analyzer must be the sole gate + // because type forwarding makes any references-list-only restriction porous). + // The references list now serves only as scoping hygiene — out-of-band BCL + // surface (operator-authored hosting helpers, third-party packages, app code) + // is not on the list and stays unreachable. + var references = new List(); + foreach (var asm in pinnedAssemblies) + references.Add(MetadataReference.CreateFromFile(asm.Location)); + foreach (var path in EnumerateBclAssemblyPaths()) + references.Add(MetadataReference.CreateFromFile(path)); + + var imports = new[] { "System", "System.Linq", @@ -80,8 +90,56 @@ public static class ScriptSandbox "ZB.MOM.WW.OtOpcUa.Core.Scripting", }; - return ScriptOptions.Default - .WithReferences(allowedAssemblies) - .WithImports(allowedImports); + return new SandboxConfig(references, imports); + } + + private static IEnumerable EnumerateBclAssemblyPaths() + { + // The .NET host advertises the resolved runtime-shared-framework + BCL DLL set + // via the TRUSTED_PLATFORM_ASSEMBLIES AppContext data slot. This is what the + // ALC fallback uses when resolving assemblies, so anything in here is already + // loadable by the host process. We restrict to System.* and netstandard to keep + // the script's reachable surface to the BCL — anything else (Microsoft.*, + // application code, third-party packages happening to be in the runtime store) + // would expand the analyzer's deny-list job unnecessarily. + var raw = (string?)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES"); + if (string.IsNullOrEmpty(raw)) + yield break; + + var separator = OperatingSystem.IsWindows() ? ';' : ':'; + foreach (var path in raw.Split(separator, StringSplitOptions.RemoveEmptyEntries)) + { + var name = System.IO.Path.GetFileName(path); + if (name.StartsWith("System.", StringComparison.Ordinal) || + string.Equals(name, "netstandard.dll", StringComparison.Ordinal) || + string.Equals(name, "mscorlib.dll", StringComparison.Ordinal) || + // Microsoft.Win32.Registry isn't a System.* DLL but the analyzer's + // Microsoft.Win32 deny-list relies on the type being resolvable so it + // can identify + reject it (Core.Scripting-001 / -002). Add the one + // DLL we need rather than broadening to Microsoft.* (which would also + // pull in compilers, build tooling, etc.). + string.Equals(name, "Microsoft.Win32.Registry.dll", StringComparison.Ordinal)) + { + yield return path; + } + } } } + +/// +/// Compile-time sandbox configuration. Returned by ; +/// consumed by 's manual +/// CSharpCompilation path. +/// +/// +/// Metadata references (allow-listed assemblies) the script compilation is built +/// against. Anything not in this set is unresolved at compile, which is the sandbox's +/// first line of defense — is the second. +/// +/// +/// Namespaces pre-imported into the wrapper compilation as using directives +/// so scripts can write Math.Abs rather than System.Math.Abs. +/// +public sealed record SandboxConfig( + IReadOnlyList References, + IReadOnlyList Imports); diff --git a/tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/CompiledScriptCacheTests.cs b/tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/CompiledScriptCacheTests.cs index 46e5ab8..77e1973 100644 --- a/tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/CompiledScriptCacheTests.cs +++ b/tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/CompiledScriptCacheTests.cs @@ -221,4 +221,107 @@ public sealed class CompiledScriptCacheTests Should.Throw(() => cache.GetOrCompile("""return unknownIdentifier + 1;""")); cache.Count.ShouldBe(0, "faulted Lazy must still be evicted after compile failure"); } + + // --- Core.Scripting-008: collectible AssemblyLoadContext unload --- + + [Fact] + public void Dispose_unloads_compiled_script_assembly_load_context() + { + // The whole point of switching the emit path off CSharpScript.CreateDelegate + // and onto a collectible ALC: after Dispose, the runtime can reclaim the + // emitted assembly. We assert this via a WeakReference to the compiled + // assembly itself — if the ALC unloads correctly the reference is dead after + // a forced GC; if the assembly stayed rooted (the pre-fix behaviour) the + // reference survives. The exact reclaim is GC-timing-sensitive, so we loop a + // bounded number of times to absorb GC scheduling noise. + var weak = CompileAndCaptureWeakAssembly(); + // Help the runtime — ALC.Unload is *eligible-for-collection*, not synchronous. + for (int i = 0; i < 10 && weak.IsAlive; i++) + { + GC.Collect(); + GC.WaitForPendingFinalizers(); + GC.Collect(); + } + weak.IsAlive.ShouldBeFalse( + "the collectible ALC must release the emitted script assembly after Dispose " + + "(Core.Scripting-008). If this fails, either the cache held a reference past " + + "Dispose, or a delegate/closure rooted the assembly in the default ALC."); + } + + [System.Runtime.CompilerServices.MethodImpl( + System.Runtime.CompilerServices.MethodImplOptions.NoInlining)] + private static WeakReference CompileAndCaptureWeakAssembly() + { + // Isolation method so the JIT cannot keep the local references rooted past + // its return — without [NoInlining] the GC may decide the locals are still + // live and the WeakReference test becomes flaky. + var evaluator = ScriptEvaluator.Compile("""return 42;"""); + var weak = new WeakReference(evaluator.GetType().Assembly is var asm && + asm is not null ? GetEmittedAssembly(evaluator) : null); + evaluator.Dispose(); + return weak; + } + + [System.Runtime.CompilerServices.MethodImpl( + System.Runtime.CompilerServices.MethodImplOptions.NoInlining)] + private static object GetEmittedAssembly(ScriptEvaluator evaluator) + { + // The evaluator's delegate Method lives on the synthesized wrapper class in + // the emitted assembly. The delegate field is private; we reach it via a + // reflection probe rather than expose internals — keeps the public surface + // unchanged. + var funcField = typeof(ScriptEvaluator).GetField( + "_func", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!; + var del = (Delegate)funcField.GetValue(evaluator)!; + return del.Method.Module.Assembly; + } + + [Fact] + public void Clear_disposes_every_materialised_evaluator() + { + // Locks the contract that Clear() is publish-safe: after a config-generation + // publish drops the cache, every prior script's ALC should unload so the + // process memory plateaus rather than growing across publishes. + var weaks = CompileFiveAndCaptureWeakAssemblies(); + for (int i = 0; i < 10 && weaks.Any(w => w.IsAlive); i++) + { + GC.Collect(); + GC.WaitForPendingFinalizers(); + GC.Collect(); + } + weaks.ShouldAllBe(w => !w.IsAlive, + "after Clear() every compiled-script ALC must be unloadable " + + "(Core.Scripting-008). If this fails the publish-replace pattern leaks " + + "emitted assemblies, which is exactly the v3 concern this rewrite fixes."); + } + + [System.Runtime.CompilerServices.MethodImpl( + System.Runtime.CompilerServices.MethodImplOptions.NoInlining)] + private static List CompileFiveAndCaptureWeakAssemblies() + { + var cache = new CompiledScriptCache(); + var weaks = new List(); + for (int i = 0; i < 5; i++) + { + // Distinct source per iteration so each compiles to its own assembly. + var e = cache.GetOrCompile($"""return {i};"""); + var funcField = typeof(ScriptEvaluator).GetField( + "_func", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!; + var del = (Delegate)funcField.GetValue(e)!; + weaks.Add(new WeakReference(del.Method.Module.Assembly)); + } + cache.Count.ShouldBe(5); + cache.Clear(); + cache.Count.ShouldBe(0); + return weaks; + } + + [Fact] + public void GetOrCompile_after_Dispose_throws_ObjectDisposedException() + { + var cache = new CompiledScriptCache(); + cache.GetOrCompile("""return 1;"""); + cache.Dispose(); + Should.Throw(() => cache.GetOrCompile("""return 2;""")); + } }