Phase 7 Stream A.2 — compile cache + per-evaluation timeout wrapper. Second of 3 increments within Stream A. Adds two independent resilience primitives that the virtual-tag engine (Stream B) and scripted-alarm engine (Stream C) will compose with the base ScriptEvaluator. Both are generic on (TContext, TResult) so different engines get their own instances without cross-contamination.
CompiledScriptCache<TContext, TResult> — source-hash-keyed cache of compiled evaluators. Roslyn compilation is the most expensive step in the evaluator pipeline (5-20ms per script depending on size); re-compiling on every value-change event would starve the engine. ConcurrentDictionary of Lazy<ScriptEvaluator> with ExecutionAndPublication mode ensures concurrent callers never double-compile even on a cold cache race. Failed compiles evict the cache entry so an Admin UI retry with corrected source actually recompiles (otherwise the cached exception would persist). Whitespace-sensitive hash — reformatting a script misses the cache on purpose, simpler than AST-canonicalize and happens rarely. No capacity bound because virtual-tag + alarm scripts are config-DB bounded (thousands, not millions); if scale pushes past that in v3 an LRU eviction slots in behind the same API. TimedScriptEvaluator<TContext, TResult> — wraps a ScriptEvaluator with a per-evaluation wall-clock timeout (default 250ms per Phase 7 plan Stream A.4, configurable per tag so slower backends can widen). Critical implementation detail: the underlying Roslyn ScriptRunner executes synchronously on the calling thread for CPU-bound user scripts, returning an already-completed Task before the caller can register a timeout. Naive `Task.WaitAsync(timeout)` would see the completed task and never fire. Fix: push evaluation to a thread-pool thread via Task.Run, so the caller's thread is free to wait and the timeout reliably fires after the configured budget. Known trade-off (documented in the class summary): when a script times out, the underlying evaluation task continues running on the thread-pool thread until Roslyn returns; in the CPU-bound-infinite-loop case it's effectively leaked until the runtime decides to unwind. Tighter CPU budgeting would require an out-of-process script runner (v3 concern). In practice the timeout + structured warning log surfaces the offending script so the operator fixes it, and the orphan thread is rare. Caller-supplied CancellationToken is honored and takes precedence over the timeout, so driver-shutdown paths see a clean OperationCanceledException rather than a misclassified ScriptTimeoutException. ScriptTimeoutException carries the configured Timeout and a diagnostic message pointing the operator at ctx.Logger output around the failure plus suggesting widening the timeout, simplifying the script, or moving heavy work out of the evaluation path. The virtual-tag engine (Stream B) will catch this and map the owning tag's quality to BadInternalError per Phase 7 decision #11, logging a structured warning with the offending script name. Tests: CompiledScriptCacheTests (10) — first-call compile, identical-source dedupe to same instance, different-source produces different evaluator, whitespace-sensitivity documented, cached evaluator still runs correctly, failed compile evicted for retry, Clear drops entries, concurrent GetOrCompile of the same source deduplicates to one instance, different TContext/TResult use separate cache instances, null source rejected. TimedScriptEvaluatorTests (9) — fast script completes under timeout, CPU-bound script throws ScriptTimeoutException, caller cancellation takes precedence over timeout (shutdown path correctness), default 250ms per plan, zero/negative timeout rejected at construction, null inner rejected, null context rejected, user-thrown exceptions propagate unwrapped (not conflated with timeout), timeout exception message contains diagnostic guidance. Full suite: 48/48 green (29 from A.1 + 19 new). Next: Stream A.3 wires the dedicated scripts-*.log Serilog rolling sink + structured-property filtering + companion-WARN enricher to the main log, closing out Stream A. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,151 @@
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Scripting;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Exercises the source-hash keyed compile cache. Roslyn compilation is the most
|
||||
/// expensive step in the evaluator pipeline; this cache collapses redundant
|
||||
/// compiles of unchanged scripts to zero-cost lookups + makes sure concurrent
|
||||
/// callers never double-compile.
|
||||
/// </summary>
|
||||
[Trait("Category", "Unit")]
|
||||
public sealed class CompiledScriptCacheTests
|
||||
{
|
||||
private sealed class CompileCountingGate
|
||||
{
|
||||
public int Count;
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void First_call_compiles_and_caches()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
cache.Count.ShouldBe(0);
|
||||
|
||||
var e = cache.GetOrCompile("""return 42;""");
|
||||
e.ShouldNotBeNull();
|
||||
cache.Count.ShouldBe(1);
|
||||
cache.Contains("""return 42;""").ShouldBeTrue();
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Identical_source_returns_the_same_compiled_evaluator()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
var first = cache.GetOrCompile("""return 1;""");
|
||||
var second = cache.GetOrCompile("""return 1;""");
|
||||
ReferenceEquals(first, second).ShouldBeTrue();
|
||||
cache.Count.ShouldBe(1);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Different_source_produces_different_evaluator()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
var a = cache.GetOrCompile("""return 1;""");
|
||||
var b = cache.GetOrCompile("""return 2;""");
|
||||
ReferenceEquals(a, b).ShouldBeFalse();
|
||||
cache.Count.ShouldBe(2);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Whitespace_difference_misses_cache()
|
||||
{
|
||||
// Documented behavior: reformatting a script recompiles. Simpler + cheaper
|
||||
// than the alternative (AST-canonicalize then hash) and doesn't happen often.
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
cache.GetOrCompile("""return 1;""");
|
||||
cache.GetOrCompile("return 1; "); // trailing whitespace — different hash
|
||||
cache.Count.ShouldBe(2);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task Cached_evaluator_still_runs_correctly()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, double>();
|
||||
var e = cache.GetOrCompile("""return (double)ctx.GetTag("In").Value * 3.0;""");
|
||||
var ctx = new FakeScriptContext().Seed("In", 7.0);
|
||||
|
||||
// Run twice through the cache — both must return the same correct value.
|
||||
var first = await e.RunAsync(ctx, TestContext.Current.CancellationToken);
|
||||
var second = await cache.GetOrCompile("""return (double)ctx.GetTag("In").Value * 3.0;""")
|
||||
.RunAsync(ctx, TestContext.Current.CancellationToken);
|
||||
first.ShouldBe(21.0);
|
||||
second.ShouldBe(21.0);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Failed_compile_is_evicted_so_retry_with_corrected_source_works()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
|
||||
// First attempt — undefined identifier, compile throws.
|
||||
Should.Throw<Exception>(() => cache.GetOrCompile("""return unknownIdentifier + 1;"""));
|
||||
cache.Count.ShouldBe(0, "failed compile must be evicted so retry can re-attempt");
|
||||
|
||||
// Retry with corrected source succeeds + caches.
|
||||
cache.GetOrCompile("""return 42;""").ShouldNotBeNull();
|
||||
cache.Count.ShouldBe(1);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Clear_drops_every_entry()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
cache.GetOrCompile("""return 1;""");
|
||||
cache.GetOrCompile("""return 2;""");
|
||||
cache.Count.ShouldBe(2);
|
||||
|
||||
cache.Clear();
|
||||
cache.Count.ShouldBe(0);
|
||||
cache.Contains("""return 1;""").ShouldBeFalse();
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Concurrent_compiles_of_the_same_source_deduplicate()
|
||||
{
|
||||
// LazyThreadSafetyMode.ExecutionAndPublication guarantees only one compile
|
||||
// even when multiple threads race GetOrCompile against an empty cache.
|
||||
// We can't directly count Roslyn compilations — but we can assert all
|
||||
// concurrent callers see the same evaluator instance.
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
const string src = """return 99;""";
|
||||
|
||||
var tasks = Enumerable.Range(0, 20)
|
||||
.Select(_ => Task.Run(() => cache.GetOrCompile(src)))
|
||||
.ToArray();
|
||||
Task.WhenAll(tasks).GetAwaiter().GetResult();
|
||||
|
||||
var firstInstance = tasks[0].Result;
|
||||
foreach (var t in tasks)
|
||||
ReferenceEquals(t.Result, firstInstance).ShouldBeTrue();
|
||||
cache.Count.ShouldBe(1);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Different_TContext_TResult_pairs_use_separate_cache_instances()
|
||||
{
|
||||
// Documented: each engine (virtual-tag / alarm-predicate / alarm-action) owns
|
||||
// its own cache. The type-parametric design makes this the default without
|
||||
// cross-contamination at the dictionary level.
|
||||
var intCache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
var boolCache = new CompiledScriptCache<FakeScriptContext, bool>();
|
||||
|
||||
intCache.GetOrCompile("""return 1;""");
|
||||
boolCache.GetOrCompile("""return true;""");
|
||||
|
||||
intCache.Count.ShouldBe(1);
|
||||
boolCache.Count.ShouldBe(1);
|
||||
intCache.Contains("""return true;""").ShouldBeFalse();
|
||||
boolCache.Contains("""return 1;""").ShouldBeFalse();
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Null_source_throws_ArgumentNullException()
|
||||
{
|
||||
var cache = new CompiledScriptCache<FakeScriptContext, int>();
|
||||
Should.Throw<ArgumentNullException>(() => cache.GetOrCompile(null!));
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,134 @@
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Scripting;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Verifies the per-evaluation timeout wrapper. Fast scripts complete normally;
|
||||
/// CPU-bound or hung scripts throw <see cref="ScriptTimeoutException"/> instead of
|
||||
/// starving the engine. Caller-supplied cancellation tokens take precedence over the
|
||||
/// timeout so driver-shutdown paths see a clean cancel rather than a timeout.
|
||||
/// </summary>
|
||||
[Trait("Category", "Unit")]
|
||||
public sealed class TimedScriptEvaluatorTests
|
||||
{
|
||||
[Fact]
|
||||
public async Task Fast_script_completes_under_timeout_and_returns_value()
|
||||
{
|
||||
var inner = ScriptEvaluator<FakeScriptContext, double>.Compile(
|
||||
"""return (double)ctx.GetTag("In").Value + 1.0;""");
|
||||
var timed = new TimedScriptEvaluator<FakeScriptContext, double>(
|
||||
inner, TimeSpan.FromSeconds(1));
|
||||
|
||||
var ctx = new FakeScriptContext().Seed("In", 41.0);
|
||||
var result = await timed.RunAsync(ctx, TestContext.Current.CancellationToken);
|
||||
result.ShouldBe(42.0);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task Script_longer_than_timeout_throws_ScriptTimeoutException()
|
||||
{
|
||||
// Scripts can't easily do Thread.Sleep in the sandbox (System.Threading.Thread
|
||||
// is denied). But a tight CPU loop exceeds any short timeout.
|
||||
var inner = ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""
|
||||
var end = Environment.TickCount64 + 5000;
|
||||
while (Environment.TickCount64 < end) { }
|
||||
return 1;
|
||||
""");
|
||||
var timed = new TimedScriptEvaluator<FakeScriptContext, int>(
|
||||
inner, TimeSpan.FromMilliseconds(50));
|
||||
|
||||
var ex = await Should.ThrowAsync<ScriptTimeoutException>(async () =>
|
||||
await timed.RunAsync(new FakeScriptContext(), TestContext.Current.CancellationToken));
|
||||
ex.Timeout.ShouldBe(TimeSpan.FromMilliseconds(50));
|
||||
ex.Message.ShouldContain("50.0");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task Caller_cancellation_takes_precedence_over_timeout()
|
||||
{
|
||||
// A CPU-bound script that would otherwise timeout; external ct fires first.
|
||||
// Expected: OperationCanceledException (not ScriptTimeoutException) so shutdown
|
||||
// paths aren't misclassified as timeouts.
|
||||
var inner = ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""
|
||||
var end = Environment.TickCount64 + 10000;
|
||||
while (Environment.TickCount64 < end) { }
|
||||
return 1;
|
||||
""");
|
||||
var timed = new TimedScriptEvaluator<FakeScriptContext, int>(
|
||||
inner, TimeSpan.FromSeconds(5));
|
||||
|
||||
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(80));
|
||||
await Should.ThrowAsync<OperationCanceledException>(async () =>
|
||||
await timed.RunAsync(new FakeScriptContext(), cts.Token));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Default_timeout_is_250ms_per_plan()
|
||||
{
|
||||
TimedScriptEvaluator<FakeScriptContext, int>.DefaultTimeout
|
||||
.ShouldBe(TimeSpan.FromMilliseconds(250));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Zero_or_negative_timeout_is_rejected_at_construction()
|
||||
{
|
||||
var inner = ScriptEvaluator<FakeScriptContext, int>.Compile("""return 1;""");
|
||||
Should.Throw<ArgumentOutOfRangeException>(() =>
|
||||
new TimedScriptEvaluator<FakeScriptContext, int>(inner, TimeSpan.Zero));
|
||||
Should.Throw<ArgumentOutOfRangeException>(() =>
|
||||
new TimedScriptEvaluator<FakeScriptContext, int>(inner, TimeSpan.FromMilliseconds(-1)));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Null_inner_is_rejected()
|
||||
{
|
||||
Should.Throw<ArgumentNullException>(() =>
|
||||
new TimedScriptEvaluator<FakeScriptContext, int>(null!));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Null_context_is_rejected()
|
||||
{
|
||||
var inner = ScriptEvaluator<FakeScriptContext, int>.Compile("""return 1;""");
|
||||
var timed = new TimedScriptEvaluator<FakeScriptContext, int>(inner);
|
||||
Should.ThrowAsync<ArgumentNullException>(async () =>
|
||||
await timed.RunAsync(null!, TestContext.Current.CancellationToken));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task Script_exception_propagates_unwrapped()
|
||||
{
|
||||
// User-thrown exceptions must come through as-is — NOT wrapped in
|
||||
// ScriptTimeoutException. The virtual-tag engine catches them per-tag and
|
||||
// maps to BadInternalError; conflating with timeout would lose that info.
|
||||
var inner = ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""throw new InvalidOperationException("script boom");""");
|
||||
var timed = new TimedScriptEvaluator<FakeScriptContext, int>(inner, TimeSpan.FromSeconds(1));
|
||||
|
||||
var ex = await Should.ThrowAsync<InvalidOperationException>(async () =>
|
||||
await timed.RunAsync(new FakeScriptContext(), TestContext.Current.CancellationToken));
|
||||
ex.Message.ShouldBe("script boom");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ScriptTimeoutException_message_points_at_diagnostic_path()
|
||||
{
|
||||
var inner = ScriptEvaluator<FakeScriptContext, int>.Compile(
|
||||
"""
|
||||
var end = Environment.TickCount64 + 5000;
|
||||
while (Environment.TickCount64 < end) { }
|
||||
return 1;
|
||||
""");
|
||||
var timed = new TimedScriptEvaluator<FakeScriptContext, int>(
|
||||
inner, TimeSpan.FromMilliseconds(30));
|
||||
|
||||
var ex = await Should.ThrowAsync<ScriptTimeoutException>(async () =>
|
||||
await timed.RunAsync(new FakeScriptContext(), TestContext.Current.CancellationToken));
|
||||
ex.Message.ShouldContain("ctx.Logger");
|
||||
ex.Message.ShouldContain("widening the timeout");
|
||||
}
|
||||
}
|
||||
Reference in New Issue
Block a user