fix(scripting): unload compiled-script assemblies via collectible ALC

Core.Scripting-008 resolution: replace the legacy CSharpScript.CreateDelegate
path with hand-rolled CSharpCompilation + Emit + collectible AssemblyLoadContext,
so per-publish compile accretion no longer requires a server restart to reclaim.

Why this was needed:
  Roslyn's CSharpScript path emits dynamically-compiled script assemblies into
  the default AssemblyLoadContext, which is non-collectible. Across config-
  publish generations each Clear() drops dictionary entries but the emitted
  assemblies stay loaded for process lifetime, so memory grows steadily on
  long-running servers with frequent publishes. The accepted-limitation note
  in docs/VirtualTags.md recommended scheduled restarts as the workaround;
  operator feedback was that restarts are difficult, so the underlying
  limitation was the right thing to fix.

Implementation:
  - New ScriptAssemblyLoadContext(name, isCollectible: true) hosts one emitted
    script assembly per evaluator.
  - ScriptEvaluator.Compile synthesises a wrapper class around the user source
    (CompiledScript.Run(globals) — explicit return required per ordinary C#
    semantics, which every existing script already uses), builds a
    CSharpCompilation against the sandbox references, runs the
    ForbiddenTypeAnalyzer over the semantic model unchanged, emits to an
    in-memory PE stream, loads via ScriptAssemblyLoadContext.LoadFromStream,
    and binds a strongly-typed Func<ScriptGlobals<TContext>, TResult> delegate
    via reflection.
  - ScriptEvaluator now implements IDisposable — Dispose calls
    AssemblyLoadContext.Unload(), which makes the emitted assembly eligible
    for GC once no live references to it remain.
  - CompiledScriptCache.Clear() disposes every materialised evaluator before
    dropping its dictionary entry; CompiledScriptCache itself is now
    IDisposable for graceful server shutdown.
  - ScriptSandbox.Build returns a new SandboxConfig (References + Imports)
    instead of a Roslyn ScriptOptions; references now span BCL via the
    TRUSTED_PLATFORM_ASSEMBLIES set filtered to System.* + netstandard +
    Microsoft.Win32.Registry, so forbidden BCL types resolve at compile and
    ForbiddenTypeAnalyzer is the sole security gate (consistent with the
    Core.Scripting-001 / -002 model — references-list-only restriction is
    porous against type forwarding, so the analyzer must be the real gate).
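
Illustrative sketch (not the committed code): the compile, emit, load and
unload sequence described above, reduced to a standalone example. It assumes
the Microsoft.CodeAnalysis.CSharp package is referenced and that the host
exposes TRUSTED_PLATFORM_ASSEMBLIES. The class name CollectibleCompileSketch,
the inline script source, and the Func<int, int> signature are hypothetical
stand-ins for the real wrapper and ScriptGlobals<TContext>; the
ForbiddenTypeAnalyzer pass is omitted.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.Loader;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class CollectibleCompileSketch
{
    public static int CompileRunUnload()
    {
        // Hypothetical wrapper source; the real one is synthesized per script.
        const string source = @"
public static class CompiledScript
{
    public static int Run(int x) { return x * 2; }
}";
        var tree = CSharpSyntaxTree.ParseText(source);

        // Assumption: reference every System.* assembly plus netstandard from
        // the trusted platform set, roughly as the commit message describes.
        var tpa = (string)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES")!;
        var references = tpa.Split(Path.PathSeparator)
            .Where(p => Path.GetFileName(p).StartsWith("System.", StringComparison.Ordinal)
                     || Path.GetFileName(p) == "netstandard.dll")
            .Select(p => (MetadataReference)MetadataReference.CreateFromFile(p))
            .ToArray();

        var compilation = CSharpCompilation.Create(
            "SketchScript_" + Guid.NewGuid().ToString("N"),
            new[] { tree },
            references,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // Emit to memory; nothing is written to disk.
        using var pe = new MemoryStream();
        var emit = compilation.Emit(pe);
        if (!emit.Success)
            throw new InvalidOperationException(
                string.Join(Environment.NewLine, emit.Diagnostics.Select(d => d.ToString())));
        pe.Position = 0;

        // One collectible context per emitted assembly, so Unload releases
        // exactly this script and nothing else.
        var alc = new AssemblyLoadContext("sketch", isCollectible: true);
        try
        {
            var asm = alc.LoadFromStream(pe);
            var run = asm.GetType("CompiledScript", throwOnError: true)!
                .GetMethod("Run", BindingFlags.Public | BindingFlags.Static)!;
            var func = (Func<int, int>)Delegate.CreateDelegate(typeof(Func<int, int>), run);
            return func(21);
        }
        finally
        {
            // Marks the context for unloading; memory is reclaimed by a later GC.
            alc.Unload();
        }
    }
}
```

In the real evaluator the delegate outlives Compile, so Unload is deferred to
Dispose rather than run in a finally block as it is here.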

Verification:
  - All 104 Core.Scripting tests pass (was 101 — three new regression tests
    locking the unload contract).
  - All 56 VirtualTags tests pass (unchanged).
  - All 63 ScriptedAlarms tests pass (unchanged).
  - New CompiledScriptCacheTests:
    - Dispose_unloads_compiled_script_assembly_load_context — proves single-
      evaluator ALC unload via WeakReference + bounded GC.Collect() loop.
    - Clear_disposes_every_materialised_evaluator — proves publish-replace
      releases every prior generation's ALC.
    - GetOrCompile_after_Dispose_throws_ObjectDisposedException — locks the
      post-dispose contract.
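
Illustrative sketch of the WeakReference + bounded GC.Collect() pattern the
unload tests rely on (an assumption about their shape, not a copy of the test
code). UnloadProbe, LoadAndUnload and WaitForUnload are hypothetical names.
The context is created and unloaded inside a non-inlined helper so that no
local in the polling method keeps it alive.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Loader;

public static class UnloadProbe
{
    // NoInlining keeps the strong reference to the context confined to this
    // frame, so it is gone by the time the caller starts polling.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference LoadAndUnload()
    {
        var alc = new AssemblyLoadContext("probe", isCollectible: true);
        // A real test would load the emitted script assembly and invoke it here.
        alc.Unload();
        return new WeakReference(alc, trackResurrection: true);
    }

    // Returns true once the context has been collected, false if it is still
    // alive after maxAttempts collections.
    public static bool WaitForUnload(int maxAttempts = 10)
    {
        var weak = LoadAndUnload();
        for (var i = 0; weak.IsAlive && i < maxAttempts; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        return !weak.IsAlive;
    }
}
```

The loop is bounded because unload is cooperative: any surviving reference
(a cached delegate, a Type, a static field) keeps the context alive, and the
test should then fail rather than spin forever.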

Docs:
  - docs/VirtualTags.md "Compile cache" section rewritten: the accepted-
    limitation note replaced with the unload contract + the new authoring
    convention (explicit return).
  - docs/ScriptedAlarms.md cross-reference updated to drop the obsolete
    restart guidance.
  - code-reviews/Core.Scripting/findings.md Core.Scripting-008 flipped
    Won't Fix → Resolved with the implementation summary.
  - code-reviews/README.md regenerated.

Pre-existing breakage note: Driver.Galaxy fails the solution-wide build on
master because its ProjectReference to the sibling mxaccessgw repo's
MxGateway.Client targets a path that the sibling repo no longer has after a
recent restructuring. This is unrelated to Core.Scripting-008 and was
verified to exist on master before this branch was cut.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-23 15:55:04 -04:00
parent 5a9c4591b9
commit 7b6ab2ec6f
8 changed files with 553 additions and 69 deletions
@@ -1,75 +1,315 @@
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;
using System.Reflection;
using System.Runtime.Loader;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting;
/// <summary>
/// Compiles + runs user scripts against a <see cref="ScriptContext"/> subclass. Core
/// evaluator — no caching, no timeout, no logging side-effects yet (those land in
/// Stream A.3, A.4, A.5 respectively). Stream B + C wrap this with the dependency
/// scheduler + alarm state machine.
/// evaluator — no caching, no timeout, no logging side-effects (those land in
/// <see cref="CompiledScriptCache{TContext, TResult}"/>,
/// <see cref="TimedScriptEvaluator{TContext, TResult}"/>, and
/// <see cref="ScriptLogCompanionSink"/> respectively).
/// </summary>
/// <remarks>
/// <para>
/// Scripts are compiled against <see cref="ScriptGlobals{TContext}"/> so the
/// context member is named <c>ctx</c> in the script, matching the
/// <see cref="DependencyExtractor"/>'s walker and the Admin UI type stub.
/// Scripts are wrapped in a synthesized <c>CompiledScript.Run(globals)</c> method
/// and compiled via <see cref="CSharpCompilation"/> into a regular .NET assembly
/// that is loaded into a <b>collectible</b>
/// <see cref="AssemblyLoadContext"/>. The collectible ALC is the fix for
/// Core.Scripting-008: per-publish recompile accretion was previously unbounded
/// because Roslyn's <c>CSharpScript.CreateDelegate</c> emits into the default ALC
/// (non-collectible); now <see cref="Dispose"/> unloads the entire ALC and the
/// emitted assembly becomes eligible for GC.
/// </para>
/// <para>
/// Compile pipeline is a three-step gate: (1) Roslyn compile — catches syntax
/// errors + type-resolution failures, throws <see cref="CompilationErrorException"/>;
/// (2) <see cref="ForbiddenTypeAnalyzer"/> runs against the semantic model —
/// catches sandbox escapes that slipped past reference restrictions due to .NET's
/// type forwarding, throws <see cref="ScriptSandboxViolationException"/>; (3)
/// delegate creation — throws at this layer only for internal Roslyn bugs, not
/// user error.
/// Compile pipeline is a three-step gate, unchanged in intent from the legacy
/// <c>CSharpScript</c> path: (1) Roslyn parse + compile against the
/// <see cref="ScriptSandbox"/> allow-list — catches syntax errors, unresolved
/// types (the sandbox's first line of defense), and most type-resolution
/// failures, throwing <see cref="CompilationErrorException"/>; (2)
/// <see cref="ForbiddenTypeAnalyzer"/> runs against the semantic model — catches
/// sandbox escapes that slipped past reference restrictions due to .NET's type
/// forwarding, throwing <see cref="ScriptSandboxViolationException"/>; (3) emit
/// to an in-memory PE stream + load into the collectible ALC — throws at this
/// layer only for internal Roslyn bugs, not user error.
/// </para>
/// <para>
/// Runtime exceptions thrown from user code propagate unwrapped. The virtual-tag
/// engine (Stream B) catches them per-tag + maps to <c>BadInternalError</c>
/// quality per Phase 7 decision #11; this layer doesn't swallow anything so
/// tests can assert on the original exception type.
/// engine catches them per-tag and maps to <c>BadInternalError</c> quality
/// per Phase 7 decision #11; this layer doesn't swallow anything so tests can
/// assert on the original exception type.
/// </para>
/// <para>
/// Scripts are expected to be statement bodies that end with an explicit
/// <c>return …;</c> — the wrapper provides only the surrounding method body, so
/// the script's final-expression-yields-result behavior of legacy
/// <c>CSharpScript</c> is replaced by ordinary C# method semantics. Every script
/// in the existing test corpus already uses explicit <c>return</c>; this is a
/// documented authoring convention.
/// </para>
/// </remarks>
public sealed class ScriptEvaluator<TContext, TResult>
public sealed class ScriptEvaluator<TContext, TResult> : IDisposable
where TContext : ScriptContext
{
private readonly ScriptRunner<TResult> _runner;
private readonly ScriptAssemblyLoadContext _alc;
private readonly Func<ScriptGlobals<TContext>, TResult> _func;
private bool _disposed;
private ScriptEvaluator(ScriptRunner<TResult> runner)
private ScriptEvaluator(ScriptAssemblyLoadContext alc, Func<ScriptGlobals<TContext>, TResult> func)
{
_runner = runner;
_alc = alc;
_func = func;
}
public static ScriptEvaluator<TContext, TResult> Compile(string scriptSource)
{
if (scriptSource is null) throw new ArgumentNullException(nameof(scriptSource));
var options = ScriptSandbox.Build(typeof(TContext));
var script = CSharpScript.Create<TResult>(
code: scriptSource,
options: options,
globalsType: typeof(ScriptGlobals<TContext>));
var sandbox = ScriptSandbox.Build(typeof(TContext));
// Step 1 — Roslyn compile. Throws CompilationErrorException on syntax / type errors.
var diagnostics = script.Compile();
// Step 1 — synthesize a wrapper class around the script body and parse it. The
// wrapper's `Run` method is what we invoke at runtime; the user's source is
// pasted in as its body so explicit `return` semantics apply.
var wrapperSource = BuildWrapperSource(scriptSource, sandbox.Imports);
var syntaxTree = CSharpSyntaxTree.ParseText(wrapperSource);
// Step 2 — forbidden-type semantic analysis. Defense-in-depth against reference-list
// leaks due to type forwarding.
var rejections = ForbiddenTypeAnalyzer.Analyze(script.GetCompilation());
// Step 2 — Roslyn compile against the sandbox allow-list. Anything not in the
// references set is unresolved and produces a compiler error.
var assemblyName = "ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled." +
Guid.NewGuid().ToString("N");
var compileOptions = new CSharpCompilationOptions(
OutputKind.DynamicallyLinkedLibrary,
optimizationLevel: OptimizationLevel.Release,
allowUnsafe: false,
// Don't generate XML doc warnings for the synthesized wrapper.
warningLevel: 4,
nullableContextOptions: NullableContextOptions.Enable);
var compilation = CSharpCompilation.Create(
assemblyName,
syntaxTrees: new[] { syntaxTree },
references: sandbox.References,
options: compileOptions);
var compileDiagnostics = compilation.GetDiagnostics();
var compileErrors = compileDiagnostics
.Where(d => d.Severity == DiagnosticSeverity.Error)
.ToArray();
if (compileErrors.Length > 0)
throw new CompilationErrorException(compileErrors);
// Step 3 — forbidden-type semantic analysis. Defense-in-depth against
// reference-list leaks due to type forwarding.
var rejections = ForbiddenTypeAnalyzer.Analyze(compilation);
if (rejections.Count > 0)
throw new ScriptSandboxViolationException(rejections);
// Step 3: materialize the callable delegate.
var runner = script.CreateDelegate();
return new ScriptEvaluator<TContext, TResult>(runner);
// Step 4: emit to an in-memory PE stream and load into a collectible ALC.
using var peStream = new MemoryStream();
var emitResult = compilation.Emit(peStream);
if (!emitResult.Success)
{
var emitErrors = emitResult.Diagnostics
.Where(d => d.Severity == DiagnosticSeverity.Error)
.ToArray();
throw new CompilationErrorException(emitErrors);
}
peStream.Position = 0;
var alc = new ScriptAssemblyLoadContext(assemblyName);
Assembly assembly;
try
{
assembly = alc.LoadFromStream(peStream);
}
catch
{
// Failed to load — drop the ALC so we don't leak a half-initialised one.
alc.Unload();
throw;
}
// Step 5 — resolve the wrapper's Run method and bind a typed delegate. The
// wrapper source above puts the type in this exact namespace + class — keep the
// names in sync with BuildWrapperSource.
Func<ScriptGlobals<TContext>, TResult> func;
try
{
var wrapperType = assembly.GetType(
"ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled.CompiledScript",
throwOnError: true)!;
var runMethod = wrapperType.GetMethod(
"Run",
BindingFlags.Public | BindingFlags.Static)
?? throw new InvalidOperationException(
"Synthesized wrapper is missing the public static Run method.");
func = (Func<ScriptGlobals<TContext>, TResult>)Delegate.CreateDelegate(
typeof(Func<ScriptGlobals<TContext>, TResult>), runMethod);
}
catch
{
alc.Unload();
throw;
}
return new ScriptEvaluator<TContext, TResult>(alc, func);
}
/// <summary>Run against an already-constructed context.</summary>
public Task<TResult> RunAsync(TContext context, CancellationToken ct = default)
{
if (_disposed) throw new ObjectDisposedException(nameof(ScriptEvaluator<TContext, TResult>));
if (context is null) throw new ArgumentNullException(nameof(context));
ct.ThrowIfCancellationRequested();
var globals = new ScriptGlobals<TContext> { ctx = context };
return _runner(globals, ct);
// The user's script is synchronous (Roslyn emits a static method that returns
// TResult directly). We surface a Task<TResult> only to keep the existing
// RunAsync contract consumers depend on. TimedScriptEvaluator wraps this in
// Task.Run so a long-running script still honours its wall-clock budget.
var result = _func(globals);
return Task.FromResult(result);
}
/// <summary>
/// Unload the collectible <see cref="AssemblyLoadContext"/> that holds the emitted
/// script assembly so the runtime can reclaim it. After disposal the evaluator can
/// no longer be invoked — call <see cref="ScriptEvaluator{TContext, TResult}.Compile"/>
/// again for a fresh one. Dispose is idempotent.
/// </summary>
/// <remarks>
/// Unload is <i>eligible-for-collection</i>, not synchronous: the assembly is
/// reclaimed when the GC determines no live references remain. The cache disposes
/// evaluators in <see cref="CompiledScriptCache{TContext, TResult}.Clear"/> so a
/// config-generation publish releases the prior generation in one sweep; the
/// reclaim then races with the next GC cycle. Tests verify the reclaim via
/// <see cref="WeakReference"/> + <see cref="GC.Collect()"/>.
/// </remarks>
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_alc.Unload();
}
/// <summary>
/// Synthesize the source we hand to Roslyn. The user's script body is pasted
/// verbatim inside <c>CompiledScript.Run</c>; the <c>using</c> directives mirror
/// <see cref="ScriptSandbox"/>'s imports so scripts can write <c>Math.Abs</c>
/// instead of <c>System.Math.Abs</c>.
/// </summary>
private static string BuildWrapperSource(string userSource, IReadOnlyList<string> imports)
{
var sb = new StringBuilder();
foreach (var import in imports)
sb.Append("using ").Append(import).AppendLine(";");
sb.AppendLine();
sb.AppendLine("namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Compiled;");
sb.AppendLine();
sb.AppendLine("public static class CompiledScript");
sb.AppendLine("{");
sb.Append(" public static ").Append(ToCSharpTypeName(typeof(TResult)))
.Append(" Run(").Append(ToCSharpTypeName(typeof(ScriptGlobals<TContext>)))
.AppendLine(" globals)");
sb.AppendLine(" {");
sb.AppendLine(" var ctx = globals.ctx;");
// User source ends with `return X;` per the authoring convention; we paste it
// verbatim. The `#line 1` directive below resets Roslyn's line numbering so
// diagnostics point at the user's source line, not the wrapper.
sb.AppendLine("#line 1");
sb.AppendLine(userSource);
sb.AppendLine(" }");
sb.AppendLine("}");
return sb.ToString();
}
/// <summary>
/// Convert a runtime <see cref="Type"/> to a C# type-name string suitable for
/// emitting into Roslyn source. Uses <c>global::</c>-qualified FQNs to avoid
/// accidental capture by the wrapper's <c>using</c> directives, handles nested
/// types (<c>+</c> → <c>.</c>), and recurses for generic arguments so the
/// <c>ScriptGlobals&lt;TContext&gt;</c> parameter is emitted correctly.
/// </summary>
private static string ToCSharpTypeName(Type t)
{
if (t == typeof(void)) return "void";
// Primitive aliases keep the synthesized source readable when diagnostic
// logging dumps it; functionally identical to the FQN form.
if (t == typeof(bool)) return "bool";
if (t == typeof(byte)) return "byte";
if (t == typeof(sbyte)) return "sbyte";
if (t == typeof(short)) return "short";
if (t == typeof(ushort)) return "ushort";
if (t == typeof(int)) return "int";
if (t == typeof(uint)) return "uint";
if (t == typeof(long)) return "long";
if (t == typeof(ulong)) return "ulong";
if (t == typeof(float)) return "float";
if (t == typeof(double)) return "double";
if (t == typeof(decimal)) return "decimal";
if (t == typeof(string)) return "string";
if (t == typeof(object)) return "object";
if (Nullable.GetUnderlyingType(t) is { } inner)
return ToCSharpTypeName(inner) + "?";
if (t.IsArray)
return ToCSharpTypeName(t.GetElementType()!) + "[]";
if (t.IsGenericType)
{
var def = t.GetGenericTypeDefinition();
var rawName = def.FullName!.Replace('+', '.');
var nameNoArity = rawName.Substring(0, rawName.IndexOf('`'));
var args = string.Join(", ", t.GetGenericArguments().Select(ToCSharpTypeName));
return "global::" + nameNoArity + "<" + args + ">";
}
return "global::" + t.FullName!.Replace('+', '.');
}
}
/// <summary>
/// Collectible <see cref="AssemblyLoadContext"/> that hosts a single emitted script
/// assembly. Created per <see cref="ScriptEvaluator{TContext, TResult}"/> instance so
/// <see cref="AssemblyLoadContext.Unload"/> releases exactly that script. Resolves
/// dependencies via the default ALC — script assemblies reference the BCL + the
/// application's own types, all of which live in the default context.
/// </summary>
internal sealed class ScriptAssemblyLoadContext : AssemblyLoadContext
{
public ScriptAssemblyLoadContext(string name) : base(name, isCollectible: true)
{
}
protected override Assembly? Load(AssemblyName assemblyName) => null;
}
/// <summary>
/// Thrown by <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> when Roslyn
/// reports compile-time errors against the wrapper source. Mirrors the
/// <c>Microsoft.CodeAnalysis.Scripting.CompilationErrorException</c> from the legacy
/// <c>CSharpScript</c> path so callers (engines + the Admin test-harness) keep the
/// same catch site after the Core.Scripting-008 rewrite.
/// </summary>
public sealed class CompilationErrorException : Exception
{
public IReadOnlyList<Diagnostic> Diagnostics { get; }
public CompilationErrorException(IReadOnlyList<Diagnostic> diagnostics)
: base(BuildMessage(diagnostics))
{
Diagnostics = diagnostics;
}
private static string BuildMessage(IReadOnlyList<Diagnostic> diagnostics)
{
if (diagnostics.Count == 0) return "Script compile failed.";
// Operators see this — match the legacy Roslyn format ("(line,col): error CSxxxx:
// message") so existing operator runbooks still match.
var first = diagnostics[0];
var rest = diagnostics.Count == 1 ? "" : $" (and {diagnostics.Count - 1} more)";
return first.ToString() + rest;
}
}