fix(scripting): route engines through CompiledScriptCache (Core.Scripting-016)

Both VirtualTagEngine.Load and ScriptedAlarmEngine.LoadAsync were calling
ScriptEvaluator.Compile directly, bypassing CompiledScriptCache. The
Core.Scripting-008 collectible-ALC fix wired Dispose only through the cache's
Clear()/Dispose(), so the per-publish ALC accretion that -008 was meant to
eliminate persisted on the actual production path, and the headline
'no more restarts needed' guarantee wasn't delivered.

Resolution:
  - VirtualTagEngine + ScriptedAlarmEngine each gained a private
    CompiledScriptCache<TContext, TResult> instance.
  - Both Load methods now call _compileCache.GetOrCompile(source).
  - Publish-replace path: _compileCache.Clear() runs alongside the existing
    _tags / _alarms clears so the prior generation's ALCs are disposed
    before recompile.
  - Engine Dispose now calls _compileCache.Dispose() so shutdown actually
    releases the emitted assemblies.

Side-fix in CompiledScriptCache: Dispose() set _disposed=true then called
Clear(), but Clear() had a pre-existing 'if (_disposed) return' guard, so
the Dispose-triggered drain always returned early and the cleanup was a
silent no-op. Removed the disposed-guard on Clear(); clearing an empty or
already-cleared cache is idempotent.
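
A minimal sketch of the Dispose/Clear interaction described above, not the
real CompiledScriptCache. SketchCache, Probe, Add and Count are hypothetical
names introduced for illustration; only Clear(), Dispose() and _disposed
mirror the commit.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical disposable used to observe whether the drain ran.
public sealed class Probe : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() => Disposed = true;
}

// Hypothetical stand-in for the cache; models only the Dispose/Clear ordering.
public sealed class SketchCache : IDisposable
{
    private readonly ConcurrentDictionary<string, IDisposable> _cache = new();
    private bool _disposed;

    public int Count => _cache.Count;

    public void Add(string key, IDisposable value) => _cache[key] = value;

    public void Clear()
    {
        // Pre-fix, an `if (_disposed) return;` guard sat here. Because
        // Dispose() sets the flag before calling Clear(), that guard made
        // the Dispose-triggered drain return before removing anything.
        foreach (var key in _cache.Keys) // Keys returns a snapshot copy
        {
            if (_cache.TryRemove(key, out var entry))
            {
                entry.Dispose();
            }
        }
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true; // set first so other entry points can reject new work
        Clear();          // must still run to completion
    }
}
```

With the guard removed, calling Dispose() on a SketchCache holding a Probe
disposes the probe and leaves the cache empty; calling Clear() again
afterwards finds nothing to remove.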

Side-fix in ScriptedAlarmEngine.Dispose: _alarms is now cleared AFTER the
Task.WhenAll drain. The drain guarantees no background callback is
mid-flight, so clearing is safe. Previously _alarms was deliberately NOT
cleared on Dispose (per Core.ScriptedAlarms-005), but that left the
AlarmState records holding TimedScriptEvaluator → ScriptEvaluator → delegate
references that rooted the emitted assemblies, defeating the cache's
Dispose work on the engine side.

Regression tests:
  - VirtualTagEngineTests.Dispose_unloads_compiled_script_assembly
  - ScriptedAlarmEngineTests.Dispose_unloads_compiled_predicate_assembly
  Both use WeakReference + bounded GC.Collect() to prove the emitted
  assembly is reclaimable after engine.Dispose(). The alarms test had to
  be synchronous (not 'async Task<WeakReference>') because an async state
  machine hoists locals into its own fields, keeping them alive past the
  method's apparent end and preventing collection.
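
The WeakReference + bounded GC.Collect() pattern can be sketched roughly as
follows. This is an illustration only, not the actual test code: UnloadProbe
and its members are hypothetical, and a bare collectible AssemblyLoadContext
stands in for the engine and its emitted script assembly (the real tests
call engine.Dispose() where this sketch calls Unload()).

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Loader;

public static class UnloadProbe
{
    // Synchronous and non-inlined: the strong reference to the context
    // lives only in this frame, so nothing roots it after the return.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference CreateAndUnload()
    {
        var alc = new AssemblyLoadContext("sketch", isCollectible: true);
        var weak = new WeakReference(alc);
        alc.Unload(); // stand-in for engine.Dispose() in the real tests
        return weak;
    }

    // Bounded retry: unloading is cooperative, so collect a limited number
    // of times instead of spinning forever if something still roots it.
    public static bool IsReclaimed(int maxAttempts = 10)
    {
        var weak = CreateAndUnload();
        for (var i = 0; i < maxAttempts && weak.IsAlive; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        return !weak.IsAlive;
    }
}
```

A test built this way would assert that IsReclaimed() returns true. Keeping
the setup in a separate non-async method is the point: an async helper would
hold the same local in its state machine.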

Verification:
  - Core.Scripting.Tests: 104/104 (unchanged).
  - VirtualTags.Tests: 57/57 (was 56 — +1 unload test).
  - ScriptedAlarms.Tests: 67/67 (was 66 — +1 unload test).
  - All other consumer suites still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-23 17:33:34 -04:00
parent a6ae4e22d1
commit fb7c6c7046
7 changed files with 242 additions and 13 deletions

@@ -63,6 +63,20 @@ public sealed class ScriptedAlarmEngine : IDisposable
private readonly ConcurrentDictionary<string, AlarmScratch> _scratchByAlarmId =
new(StringComparer.Ordinal);
/// <summary>
/// Compile cache for every alarm predicate. Routes <see cref="LoadAsync"/>'s
/// <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> calls through the
/// cache so the collectible <see cref="System.Runtime.Loader.AssemblyLoadContext"/>
/// each compile produces is actually disposed on the publish-replace path
/// (Core.Scripting-016): the cache's <see cref="CompiledScriptCache{TContext, TResult}.Clear"/>
/// disposes every materialised evaluator before dropping its dictionary entry,
/// so a config-publish releases the prior generation's ALCs and the per-publish
/// accretion the Core.Scripting-008 fix targeted is actually freed in production.
/// Pre-fix the engine called <c>ScriptEvaluator.Compile</c> directly, which left
/// the ALCs rooted until the process exited — defeating -008 on the real path.
/// </summary>
private readonly CompiledScriptCache<AlarmPredicateContext, bool> _compileCache = new();
/// <summary>
/// Test-only diagnostic: returns the per-alarm scratch read-cache dictionary
/// if one has been allocated, else null. Used by Core.ScriptedAlarms-009
@@ -143,6 +157,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
// have changed (different Inputs, different Logger), so any reuse would be
// unsafe. (Core.ScriptedAlarms-009)
_scratchByAlarmId.Clear();
// Dispose every compiled-predicate ALC from the prior generation BEFORE we
// recompile this one. Skipping this is what made Core.Scripting-008 a
// no-op in production. (Core.Scripting-016)
_compileCache.Clear();
var compileFailures = new List<string>();
foreach (var def in definitions)
@@ -157,7 +175,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
continue;
}
- var evaluator = ScriptEvaluator<AlarmPredicateContext, bool>.Compile(def.PredicateScriptSource);
+ // Route through CompiledScriptCache so the emitted assembly's
+ // collectible ALC participates in publish-replace cleanup.
+ // (Core.Scripting-016)
+ var evaluator = _compileCache.GetOrCompile(def.PredicateScriptSource);
var timed = new TimedScriptEvaluator<AlarmPredicateContext, bool>(evaluator, _scriptTimeout);
var logger = _loggerFactory.Create(def.AlarmId);
@@ -645,12 +666,24 @@ public sealed class ScriptedAlarmEngine : IDisposable
}
}
- // Do NOT clear _alarms here: Timer.Dispose() does not wait for in-flight callbacks,
- // so a ShelvingCheckAsync or ReevaluateAsync can still be running inside _evalGate.
- // Those paths now re-check _disposed after acquiring the gate and bail out safely.
- // Clearing _alarms outside the gate would race concurrent reads and is unnecessary
- // (the whole object is being discarded). (Core.ScriptedAlarms-005)
+ // Safe to clear here: the Task.WhenAll drain above guaranteed no
+ // ReevaluateAsync / ShelvingCheckAsync is mid-flight, and _disposed=true
+ // prevents new background work from being queued (OnUpstreamChange bails on
+ // line 334). Pre-Core.Scripting-016 the comment said "Do NOT clear _alarms",
+ // but that was when the engine called ScriptEvaluator.Compile directly and
+ // held the script ALCs through _alarms→AlarmState→TimedScriptEvaluator
+ // forever — leaving them rooted defeated the -008 collectible-ALC unload.
+ // Clearing now drops the delegate references so the cache's Dispose call
+ // below can actually unload the emitted assemblies. (Core.ScriptedAlarms-005
+ // re-evaluated under -016.)
+ _alarms.Clear();
_alarmsReferencing.Clear();
_scratchByAlarmId.Clear();
// Dispose every compiled-predicate ALC so the engine's shutdown actually
// releases the emitted assemblies. The drain above ensures no evaluator is
// mid-call; CompiledScriptCache.Dispose internally guards against use-after-
// dispose. (Core.Scripting-016)
_compileCache.Dispose();
}
private sealed record AlarmState(

@@ -88,9 +88,17 @@ public sealed class CompiledScriptCache<TContext, TResult> : IDisposable
/// <see cref="System.Runtime.Loader.AssemblyLoadContext"/> unloads and the
/// emitted script assembly becomes eligible for GC (Core.Scripting-008).
/// </summary>
/// <remarks>
/// Safe to call after <see cref="Dispose"/> — the operation is idempotent.
/// <see cref="Dispose"/> sets <c>_disposed = true</c> before invoking this
/// method (so callers see the post-Dispose guard on <see cref="GetOrCompile"/>),
/// but this method itself MUST run to completion so the Dispose-triggered
/// drain actually unloads every materialised evaluator's ALC. (Core.Scripting-016
/// uncovered this — a previous Clear-aborts-when-disposed guard silently
/// skipped the entire drain on Dispose, leaving emitted assemblies rooted.)
/// </remarks>
public void Clear()
{
- if (_disposed) return;
// Snapshot the entries, swap them out, then dispose. We use TryRemove rather
// than _cache.Clear() so a concurrent GetOrCompile re-add after our snapshot
// is not silently lost — a new compile starts a fresh cache entry, the old

@@ -37,6 +37,21 @@ public sealed class VirtualTagEngine : IDisposable
private readonly DependencyGraph _graph = new();
private readonly Dictionary<string, VirtualTagState> _tags = new(StringComparer.Ordinal);
/// <summary>
/// Compile cache for every virtual-tag script. Routes <see cref="Load"/>'s
/// <see cref="ScriptEvaluator{TContext, TResult}.Compile"/> calls through the
/// cache so the collectible <see cref="System.Runtime.Loader.AssemblyLoadContext"/>
/// each compile produces is actually disposed on the publish-replace path
/// (Core.Scripting-016): the cache's <see cref="CompiledScriptCache{TContext, TResult}.Clear"/>
/// disposes every materialised evaluator before dropping its dictionary entry,
/// so a config-publish releases the prior generation's ALCs and the per-publish
/// accretion the Core.Scripting-008 fix targeted is actually freed in production.
/// Pre-fix the engine called <c>ScriptEvaluator.Compile</c> directly, which left
/// the ALCs rooted until the process exited — defeating -008 on the real path.
/// </summary>
private readonly CompiledScriptCache<VirtualTagContext, object?> _compileCache = new();
private readonly ConcurrentDictionary<string, DataValueSnapshot> _valueCache = new(StringComparer.Ordinal);
private readonly ConcurrentDictionary<string, List<Action<string, DataValueSnapshot>>> _observers
= new(StringComparer.Ordinal);
@@ -74,6 +89,10 @@ public sealed class VirtualTagEngine : IDisposable
UnsubscribeFromUpstream();
_tags.Clear();
_graph.Clear();
// Dispose every compiled-script ALC from the prior generation BEFORE we
// recompile this one. Skipping this is what made Core.Scripting-008 a
// no-op in production (Core.Scripting-016).
_compileCache.Clear();
var compileFailures = new List<string>();
var seenPaths = new HashSet<string>(StringComparer.Ordinal);
@@ -102,7 +121,9 @@ public sealed class VirtualTagEngine : IDisposable
continue;
}
- var evaluator = ScriptEvaluator<VirtualTagContext, object?>.Compile(def.ScriptSource);
+ // Route through CompiledScriptCache so the emitted assembly's collectible
+ // ALC participates in publish-replace cleanup. (Core.Scripting-016)
+ var evaluator = _compileCache.GetOrCompile(def.ScriptSource);
var timed = new TimedScriptEvaluator<VirtualTagContext, object?>(evaluator, _scriptTimeout);
var scriptLogger = _loggerFactory.Create(def.Path);
@@ -481,6 +502,9 @@ public sealed class VirtualTagEngine : IDisposable
UnsubscribeFromUpstream();
_tags.Clear();
_graph.Clear();
// Dispose every compiled-script ALC so the engine's shutdown actually
// releases the emitted assemblies. (Core.Scripting-016)
_compileCache.Dispose();
}
internal DependencyGraph GraphForTesting => _graph;