Files
lmxopcua/tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/CompiledScriptCacheTests.cs
Joseph Doherty 7b6ab2ec6f fix(scripting): unload compiled-script assemblies via collectible ALC
Core.Scripting-008 resolution: replace the legacy CSharpScript.CreateDelegate
path with hand-rolled CSharpCompilation + Emit + collectible AssemblyLoadContext,
so per-publish compile accretion no longer requires a server restart to reclaim.

Why this was needed:
  Roslyn's CSharpScript path emits dynamically-compiled script assemblies into
  the default AssemblyLoadContext, which is non-collectible. Across config-
  publish generations each Clear() drops dictionary entries but the emitted
  assemblies stay loaded for process lifetime, so memory grows steadily on
  long-running servers with frequent publishes. The accepted-limitation note
  in docs/VirtualTags.md recommended scheduled restarts as the workaround;
  operator feedback was that restarts are difficult, so the underlying
  limitation was the right thing to fix.

Implementation:
  - New ScriptAssemblyLoadContext(name, isCollectible: true) hosts one emitted
    script assembly per evaluator.
  - ScriptEvaluator.Compile synthesises a wrapper class around the user source
    (CompiledScript.Run(globals) — explicit return required per ordinary C#
    semantics, which every existing script already uses), builds a
    CSharpCompilation against the sandbox references, runs the
    ForbiddenTypeAnalyzer over the semantic model unchanged, emits to an
    in-memory PE stream, loads via ScriptAssemblyLoadContext.LoadFromStream,
    and binds a strongly-typed Func<ScriptGlobals<TContext>, TResult> delegate
    via reflection.
  - ScriptEvaluator now implements IDisposable — Dispose calls
    AssemblyLoadContext.Unload(), which makes the emitted assembly eligible
    for GC at the next collection cycle.
  - CompiledScriptCache.Clear() disposes every materialised evaluator before
    dropping its dictionary entry; CompiledScriptCache itself is now
    IDisposable for graceful server shutdown.
  - ScriptSandbox.Build returns a new SandboxConfig (References + Imports)
    instead of a Roslyn ScriptOptions; references now span BCL via the
    TRUSTED_PLATFORM_ASSEMBLIES set filtered to System.* + netstandard +
    Microsoft.Win32.Registry, so forbidden BCL types resolve at compile and
    ForbiddenTypeAnalyzer is the sole security gate (consistent with the
    Core.Scripting-001 / -002 model — references-list-only restriction is
    porous against type forwarding, so the analyzer must be the real gate).

Verification:
  - All 104 Core.Scripting tests pass (was 101 — three new regression tests
    locking the unload contract).
  - All 56 VirtualTags tests pass (unchanged).
  - All 63 ScriptedAlarms tests pass (unchanged).
  - New CompiledScriptCacheTests:
    - Dispose_unloads_compiled_script_assembly_load_context — proves single-
      evaluator ALC unload via WeakReference + bounded GC.Collect() loop.
    - Clear_disposes_every_materialised_evaluator — proves publish-replace
      releases every prior generation's ALC.
    - GetOrCompile_after_Dispose_throws_ObjectDisposedException — locks the
      post-dispose contract.

Docs:
  - docs/VirtualTags.md "Compile cache" section rewritten: the accepted-
    limitation note replaced with the unload contract + the new authoring
    convention (explicit return).
  - docs/ScriptedAlarms.md cross-reference updated to drop the obsolete
    restart guidance.
  - code-reviews/Core.Scripting/findings.md Core.Scripting-008 flipped
    Won't Fix → Resolved with the implementation summary.
  - code-reviews/README.md regenerated.

Pre-existing breakage note: Driver.Galaxy fails the solution-wide build on
master because its ProjectReference to the sibling mxaccessgw repo's
MxGateway.Client targets a path that the sibling repo no longer has after a
recent restructuring. This is unrelated to Core.Scripting-008 and was
verified to exist on master before this branch was cut.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 15:55:04 -04:00

328 lines
15 KiB
C#

using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Scripting;
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests;
/// <summary>
/// Exercises the source-hash keyed compile cache. Roslyn compilation is the most
/// expensive step in the evaluator pipeline; this cache collapses redundant
/// compiles of unchanged scripts to zero-cost lookups + makes sure concurrent
/// callers never double-compile.
/// </summary>
[Trait("Category", "Unit")]
public sealed class CompiledScriptCacheTests
{
private sealed class CompileCountingGate
{
public int Count;
}
[Fact]
public void First_call_compiles_and_caches()
{
var cache = new CompiledScriptCache<FakeScriptContext, int>();
cache.Count.ShouldBe(0);
var e = cache.GetOrCompile("""return 42;""");
e.ShouldNotBeNull();
cache.Count.ShouldBe(1);
cache.Contains("""return 42;""").ShouldBeTrue();
}
[Fact]
public void Identical_source_returns_the_same_compiled_evaluator()
{
var cache = new CompiledScriptCache<FakeScriptContext, int>();
var first = cache.GetOrCompile("""return 1;""");
var second = cache.GetOrCompile("""return 1;""");
ReferenceEquals(first, second).ShouldBeTrue();
cache.Count.ShouldBe(1);
}
[Fact]
public void Different_source_produces_different_evaluator()
{
var cache = new CompiledScriptCache<FakeScriptContext, int>();
var a = cache.GetOrCompile("""return 1;""");
var b = cache.GetOrCompile("""return 2;""");
ReferenceEquals(a, b).ShouldBeFalse();
cache.Count.ShouldBe(2);
}
[Fact]
public void Whitespace_difference_misses_cache()
{
// Documented behavior: reformatting a script recompiles. Simpler + cheaper
// than the alternative (AST-canonicalize then hash) and doesn't happen often.
var cache = new CompiledScriptCache<FakeScriptContext, int>();
cache.GetOrCompile("""return 1;""");
cache.GetOrCompile("return 1; "); // trailing whitespace — different hash
cache.Count.ShouldBe(2);
}
[Fact]
public async Task Cached_evaluator_still_runs_correctly()
{
var cache = new CompiledScriptCache<FakeScriptContext, double>();
var e = cache.GetOrCompile("""return (double)ctx.GetTag("In").Value * 3.0;""");
var ctx = new FakeScriptContext().Seed("In", 7.0);
// Run twice through the cache — both must return the same correct value.
var first = await e.RunAsync(ctx, TestContext.Current.CancellationToken);
var second = await cache.GetOrCompile("""return (double)ctx.GetTag("In").Value * 3.0;""")
.RunAsync(ctx, TestContext.Current.CancellationToken);
first.ShouldBe(21.0);
second.ShouldBe(21.0);
}
[Fact]
public void Failed_compile_is_evicted_so_retry_with_corrected_source_works()
{
var cache = new CompiledScriptCache<FakeScriptContext, int>();
// First attempt — undefined identifier, compile throws.
Should.Throw<Exception>(() => cache.GetOrCompile("""return unknownIdentifier + 1;"""));
cache.Count.ShouldBe(0, "failed compile must be evicted so retry can re-attempt");
// Retry with corrected source succeeds + caches.
cache.GetOrCompile("""return 42;""").ShouldNotBeNull();
cache.Count.ShouldBe(1);
}
[Fact]
public void Clear_drops_every_entry()
{
var cache = new CompiledScriptCache<FakeScriptContext, int>();
cache.GetOrCompile("""return 1;""");
cache.GetOrCompile("""return 2;""");
cache.Count.ShouldBe(2);
cache.Clear();
cache.Count.ShouldBe(0);
cache.Contains("""return 1;""").ShouldBeFalse();
}
[Fact]
public void Concurrent_compiles_of_the_same_source_deduplicate()
{
// LazyThreadSafetyMode.ExecutionAndPublication guarantees only one compile
// even when multiple threads race GetOrCompile against an empty cache.
// We can't directly count Roslyn compilations — but we can assert all
// concurrent callers see the same evaluator instance.
var cache = new CompiledScriptCache<FakeScriptContext, int>();
const string src = """return 99;""";
var tasks = Enumerable.Range(0, 20)
.Select(_ => Task.Run(() => cache.GetOrCompile(src)))
.ToArray();
Task.WhenAll(tasks).GetAwaiter().GetResult();
var firstInstance = tasks[0].Result;
foreach (var t in tasks)
ReferenceEquals(t.Result, firstInstance).ShouldBeTrue();
cache.Count.ShouldBe(1);
}
[Fact]
public void Different_TContext_TResult_pairs_use_separate_cache_instances()
{
// Documented: each engine (virtual-tag / alarm-predicate / alarm-action) owns
// its own cache. The type-parametric design makes this the default without
// cross-contamination at the dictionary level.
var intCache = new CompiledScriptCache<FakeScriptContext, int>();
var boolCache = new CompiledScriptCache<FakeScriptContext, bool>();
intCache.GetOrCompile("""return 1;""");
boolCache.GetOrCompile("""return true;""");
intCache.Count.ShouldBe(1);
boolCache.Count.ShouldBe(1);
intCache.Contains("""return true;""").ShouldBeFalse();
boolCache.Contains("""return 1;""").ShouldBeFalse();
}
[Fact]
public void Null_source_throws_ArgumentNullException()
{
var cache = new CompiledScriptCache<FakeScriptContext, int>();
Should.Throw<ArgumentNullException>(() => cache.GetOrCompile(null!));
}
[Fact]
public void Failed_compile_eviction_does_not_remove_a_concurrent_retry_entry()
{
// Regression for Core.Scripting-006: when a faulted Lazy is observed by a thread,
// the eviction must scope to that specific Lazy instance, not the key. If a
// concurrent retry has already inserted a fresh Lazy under the same key between
// the throw and the catch-block removal, the buggy TryRemove(key, out _) overload
// evicts the retry entry. The fixed TryRemove(KeyValuePair<,>) overload compares
// value identity, so only the faulted Lazy is removed.
//
// Deterministic setup: pre-populate the cache's internal dictionary with a
// faulted Lazy whose factory itself swaps the entry to a fresh Lazy as a side
// effect during the throw. By the time GetOrCompile reaches its catch block, the
// dictionary holds the fresh entry under the same key — exactly the race window
// the finding describes. The fix must leave the fresh entry in place.
var cache = new CompiledScriptCache<FakeScriptContext, int>();
// Reach the private _cache + HashSource via reflection — they're private, so
// InternalsVisibleTo doesn't help.
var cacheField = typeof(CompiledScriptCache<FakeScriptContext, int>)
.GetField("_cache", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic);
cacheField.ShouldNotBeNull();
var dict = (System.Collections.Concurrent.ConcurrentDictionary<
string, Lazy<ScriptEvaluator<FakeScriptContext, int>>>)cacheField!.GetValue(cache)!;
const string source = """return 7;""";
var hashSourceMethod = typeof(CompiledScriptCache<FakeScriptContext, int>)
.GetMethod("HashSource", System.Reflection.BindingFlags.Static | System.Reflection.BindingFlags.NonPublic);
hashSourceMethod.ShouldNotBeNull();
var key = (string)hashSourceMethod!.Invoke(null, [source])!;
// The fresh Lazy is what a concurrent retry would have inserted between the
// faulted-throw and the catch's removal. Materialise it eagerly so we have a
// stable reference to assert identity against.
var fresh = new Lazy<ScriptEvaluator<FakeScriptContext, int>>(
() => ScriptEvaluator<FakeScriptContext, int>.Compile(source),
LazyThreadSafetyMode.ExecutionAndPublication);
// The faulted Lazy throws — but only after swapping its own dictionary entry
// for the fresh Lazy, modelling the race window between the throw and the
// catch-block eviction.
var faulted = new Lazy<ScriptEvaluator<FakeScriptContext, int>>(
() =>
{
dict[key] = fresh;
throw new InvalidOperationException("bad compile");
},
LazyThreadSafetyMode.ExecutionAndPublication);
dict[key] = faulted;
// Drive GetOrCompile through the public API. It observes the faulted Lazy
// currently under `key`, invokes .Value (which swaps in the fresh Lazy then
// throws), and runs the catch block's eviction. The fix removes only the
// specific faulted Lazy instance; the fresh entry survives.
Should.Throw<InvalidOperationException>(() => cache.GetOrCompile(source));
dict.ContainsKey(key).ShouldBeTrue(
"the fresh retry entry that won the race must survive the faulted Lazy eviction (Core.Scripting-006)");
ReferenceEquals(dict[key], fresh).ShouldBeTrue(
"the entry under the key must still be the fresh Lazy — an unconditional TryRemove(key) would have evicted it");
}
[Fact]
public void Failed_compile_path_still_evicts_its_own_faulted_entry()
{
// Companion to the race test above: confirm the fix's value-scoped eviction
// still removes the actual faulted Lazy (so retries with corrected source can
// succeed — the original Core.Scripting test that locked the contract).
var cache = new CompiledScriptCache<FakeScriptContext, int>();
Should.Throw<Exception>(() => cache.GetOrCompile("""return unknownIdentifier + 1;"""));
cache.Count.ShouldBe(0, "faulted Lazy must still be evicted after compile failure");
}
// --- Core.Scripting-008: collectible AssemblyLoadContext unload ---
[Fact]
public void Dispose_unloads_compiled_script_assembly_load_context()
{
// The whole point of switching the emit path off CSharpScript.CreateDelegate
// and onto a collectible ALC: after Dispose, the runtime can reclaim the
// emitted assembly. We assert this via a WeakReference to the compiled
// assembly itself — if the ALC unloads correctly the reference is dead after
// a forced GC; if the assembly stayed rooted (the pre-fix behaviour) the
// reference survives. The exact reclaim is GC-timing-sensitive, so we loop a
// bounded number of times to absorb GC scheduling noise.
var weak = CompileAndCaptureWeakAssembly();
// Help the runtime — ALC.Unload is *eligible-for-collection*, not synchronous.
for (int i = 0; i < 10 && weak.IsAlive; i++)
{
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
}
weak.IsAlive.ShouldBeFalse(
"the collectible ALC must release the emitted script assembly after Dispose " +
"(Core.Scripting-008). If this fails, either the cache held a reference past " +
"Dispose, or a delegate/closure rooted the assembly in the default ALC.");
}
[System.Runtime.CompilerServices.MethodImpl(
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
private static WeakReference CompileAndCaptureWeakAssembly()
{
// Isolation method so the JIT cannot keep the local references rooted past
// its return — without [NoInlining] the GC may decide the locals are still
// live and the WeakReference test becomes flaky.
var evaluator = ScriptEvaluator<FakeScriptContext, int>.Compile("""return 42;""");
var weak = new WeakReference(evaluator.GetType().Assembly is var asm &&
asm is not null ? GetEmittedAssembly(evaluator) : null);
evaluator.Dispose();
return weak;
}
[System.Runtime.CompilerServices.MethodImpl(
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
private static object GetEmittedAssembly(ScriptEvaluator<FakeScriptContext, int> evaluator)
{
// The evaluator's delegate Method lives on the synthesized wrapper class in
// the emitted assembly. The delegate field is private; we reach it via a
// reflection probe rather than expose internals — keeps the public surface
// unchanged.
var funcField = typeof(ScriptEvaluator<FakeScriptContext, int>).GetField(
"_func", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
var del = (Delegate)funcField.GetValue(evaluator)!;
return del.Method.Module.Assembly;
}
[Fact]
public void Clear_disposes_every_materialised_evaluator()
{
// Locks the contract that Clear() is publish-safe: after a config-generation
// publish drops the cache, every prior script's ALC should unload so the
// process memory plateaus rather than growing across publishes.
var weaks = CompileFiveAndCaptureWeakAssemblies();
for (int i = 0; i < 10 && weaks.Any(w => w.IsAlive); i++)
{
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
}
weaks.ShouldAllBe(w => !w.IsAlive,
"after Clear() every compiled-script ALC must be unloadable " +
"(Core.Scripting-008). If this fails the publish-replace pattern leaks " +
"emitted assemblies, which is exactly the v3 concern this rewrite fixes.");
}
[System.Runtime.CompilerServices.MethodImpl(
System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
private static List<WeakReference> CompileFiveAndCaptureWeakAssemblies()
{
var cache = new CompiledScriptCache<FakeScriptContext, int>();
var weaks = new List<WeakReference>();
for (int i = 0; i < 5; i++)
{
// Distinct source per iteration so each compiles to its own assembly.
var e = cache.GetOrCompile($"""return {i};""");
var funcField = typeof(ScriptEvaluator<FakeScriptContext, int>).GetField(
"_func", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
var del = (Delegate)funcField.GetValue(e)!;
weaks.Add(new WeakReference(del.Method.Module.Assembly));
}
cache.Count.ShouldBe(5);
cache.Clear();
cache.Count.ShouldBe(0);
return weaks;
}
[Fact]
public void GetOrCompile_after_Dispose_throws_ObjectDisposedException()
{
var cache = new CompiledScriptCache<FakeScriptContext, int>();
cache.GetOrCompile("""return 1;""");
cache.Dispose();
Should.Throw<ObjectDisposedException>(() => cache.GetOrCompile("""return 2;"""));
}
}