fix(scripting): unload compiled-script assemblies via collectible ALC

Core.Scripting-008 resolution: replace the legacy CSharpScript.CreateDelegate
path with hand-rolled CSharpCompilation + Emit + collectible AssemblyLoadContext,
so per-publish compile accretion no longer requires a server restart to reclaim.

Why this was needed:
  Roslyn's CSharpScript path emits dynamically-compiled script assemblies into
  the default AssemblyLoadContext, which is non-collectible. Across config-
  publish generations each Clear() drops dictionary entries but the emitted
  assemblies stay loaded for process lifetime, so memory grows steadily on
  long-running servers with frequent publishes. The accepted-limitation note
  in docs/VirtualTags.md recommended scheduled restarts as the workaround;
  operator feedback was that restarts are difficult, so the underlying
  limitation was the right thing to fix.

Implementation:
  - New ScriptAssemblyLoadContext(name, isCollectible: true) hosts one emitted
    script assembly per evaluator.
  - ScriptEvaluator.Compile synthesises a wrapper class around the user source
    (CompiledScript.Run(globals) — explicit return required per ordinary C#
    semantics, which every existing script already uses), builds a
    CSharpCompilation against the sandbox references, runs the
    ForbiddenTypeAnalyzer over the semantic model unchanged, emits to an
    in-memory PE stream, loads via ScriptAssemblyLoadContext.LoadFromStream,
    and binds a strongly-typed Func<ScriptGlobals<TContext>, TResult> delegate
    via reflection.
  - ScriptEvaluator now implements IDisposable — Dispose calls
    AssemblyLoadContext.Unload(), which makes the emitted assembly eligible
    for GC at the next collection cycle.
  - CompiledScriptCache.Clear() disposes every materialised evaluator before
    dropping its dictionary entry; CompiledScriptCache itself is now
    IDisposable for graceful server shutdown.
  - ScriptSandbox.Build returns a new SandboxConfig (References + Imports)
    instead of a Roslyn ScriptOptions; references now span BCL via the
    TRUSTED_PLATFORM_ASSEMBLIES set filtered to System.* + netstandard +
    Microsoft.Win32.Registry, so forbidden BCL types resolve at compile and
    ForbiddenTypeAnalyzer is the sole security gate (consistent with the
    Core.Scripting-001 / -002 model — references-list-only restriction is
    porous against type forwarding, so the analyzer must be the real gate).

Verification:
  - All 104 Core.Scripting tests pass (was 101 — three new regression tests
    locking the unload contract).
  - All 56 VirtualTags tests pass (unchanged).
  - All 63 ScriptedAlarms tests pass (unchanged).
  - New CompiledScriptCacheTests:
    - Dispose_unloads_compiled_script_assembly_load_context — proves single-
      evaluator ALC unload via WeakReference + bounded GC.Collect() loop.
    - Clear_disposes_every_materialised_evaluator — proves publish-replace
      releases every prior generation's ALC.
    - GetOrCompile_after_Dispose_throws_ObjectDisposedException — locks the
      post-dispose contract.

Docs:
  - docs/VirtualTags.md "Compile cache" section rewritten: the accepted-
    limitation note replaced with the unload contract + the new authoring
    convention (explicit return).
  - docs/ScriptedAlarms.md cross-reference updated to drop the obsolete
    restart guidance.
  - code-reviews/Core.Scripting/findings.md Core.Scripting-008 flipped
    Won't Fix → Resolved with the implementation summary.
  - code-reviews/README.md regenerated.

Pre-existing breakage note: Driver.Galaxy fails the solution-wide build on
master because its ProjectReference to the sibling mxaccessgw repo's
MxGateway.Client targets a path that the sibling repo no longer has after a
recent restructuring. This is unrelated to Core.Scripting-008 and was
verified to exist on master before this branch was cut.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-23 15:55:04 -04:00
parent 5a9c4591b9
commit 7b6ab2ec6f
8 changed files with 553 additions and 69 deletions
@@ -30,11 +30,20 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
/// bounded by config DB (typically low thousands). If that changes in v3, add an
/// LRU eviction policy — the API stays the same.
/// </para>
/// <para>
/// <b>Lifecycle:</b> compiled scripts hold a collectible
/// <see cref="System.Runtime.Loader.AssemblyLoadContext"/> per evaluator
/// (Core.Scripting-008 fix). <see cref="Clear"/> disposes every materialised
/// evaluator before dropping its dictionary entry so the emitted assemblies are
/// eligible for GC immediately after a publish. <see cref="Dispose"/> drops the
/// cache itself for graceful server shutdown.
/// </para>
/// </remarks>
public sealed class CompiledScriptCache<TContext, TResult>
public sealed class CompiledScriptCache<TContext, TResult> : IDisposable
where TContext : ScriptContext
{
private readonly ConcurrentDictionary<string, Lazy<ScriptEvaluator<TContext, TResult>>> _cache = new();
private bool _disposed;
/// <summary>
/// Return the compiled evaluator for <paramref name="scriptSource"/>, compiling
@@ -46,6 +55,7 @@ public sealed class CompiledScriptCache<TContext, TResult>
public ScriptEvaluator<TContext, TResult> GetOrCompile(string scriptSource)
{
if (scriptSource is null) throw new ArgumentNullException(nameof(scriptSource));
if (_disposed) throw new ObjectDisposedException(nameof(CompiledScriptCache<TContext, TResult>));
var key = HashSource(scriptSource);
var lazy = _cache.GetOrAdd(key, _ => new Lazy<ScriptEvaluator<TContext, TResult>>(
@@ -72,13 +82,58 @@ public sealed class CompiledScriptCache<TContext, TResult>
/// <summary>Current entry count. Exposed for Admin UI diagnostics / tests.</summary>
public int Count => _cache.Count;
/// <summary>Drop every cached compile. Used on config generation publish + tests.</summary>
public void Clear() => _cache.Clear();
/// <summary>
/// Drop every cached compile. Used on config generation publish + tests.
/// Disposes each materialised evaluator before removing it so its collectible
/// <see cref="System.Runtime.Loader.AssemblyLoadContext"/> unloads and the
/// emitted script assembly becomes eligible for GC (Core.Scripting-008).
/// </summary>
public void Clear()
{
if (_disposed) return;
// Snapshot the entries, swap them out, then dispose. We use TryRemove rather
// than _cache.Clear() so a concurrent GetOrCompile re-add after our snapshot
// is not silently lost — a new compile starts a fresh cache entry, the old
// evaluator is still disposed.
foreach (var key in _cache.Keys.ToArray())
{
if (_cache.TryRemove(key, out var lazy))
DisposeLazyIfMaterialised(lazy);
}
}
/// <summary>True when the exact source has been compiled at least once + is still cached.</summary>
public bool Contains(string scriptSource)
=> _cache.ContainsKey(HashSource(scriptSource));
/// <summary>
/// Drop the cache and dispose every materialised evaluator. After disposal
/// <see cref="GetOrCompile"/> throws <see cref="ObjectDisposedException"/>.
/// </summary>
public void Dispose()
{
if (_disposed) return;
_disposed = true;
Clear();
}
private static void DisposeLazyIfMaterialised(Lazy<ScriptEvaluator<TContext, TResult>> lazy)
{
// IsValueCreated is false for a faulted Lazy too, so the catch in GetOrCompile
// has already taken care of failed compiles — there's no evaluator to dispose.
if (!lazy.IsValueCreated) return;
try
{
lazy.Value.Dispose();
}
catch
{
// Dispose is best-effort here: an evaluator disposal failure would leak its
// ALC but mustn't prevent the rest of the cache from clearing. The ALC
// unload itself is exception-free in practice; this is defensive.
}
}
private static string HashSource(string source)
{
var bytes = Encoding.UTF8.GetBytes(source);