Phase 7 Stream A.2 — compile cache + per-evaluation timeout wrapper #178

Merged
dohertj2 merged 1 commit from phase-7-stream-a2-cache-timeout into v2 2026-04-20 16:41:09 -04:00

Second of 3 increments within Stream A. Two independent resilience primitives that Streams B + C will compose with the base ScriptEvaluator.

CompiledScriptCache<TContext, TResult>

  • Source-hash-keyed cache of compiled evaluators (SHA-256 of UTF-8 source bytes)
  • ConcurrentDictionary<string, Lazy<ScriptEvaluator>> with ExecutionAndPublication mode — concurrent callers never double-compile
  • Failed compiles evict the cache entry so an Admin UI retry with corrected source actually recompiles
  • Count / Clear / Contains exposed for diagnostics + tests
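The bullets above can be sketched roughly as follows. This is a minimal illustration, not the merged implementation: `SketchScriptCache`, `FakeEvaluator`, and the injected `compile` delegate are hypothetical stand-ins for `CompiledScriptCache<TContext, TResult>`, `ScriptEvaluator`, and the Roslyn compile step, and the hex-string key format is an assumption.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Threading;

// Hypothetical stand-in for a compiled ScriptEvaluator.
public sealed class FakeEvaluator
{
    public string Source { get; }
    public FakeEvaluator(string source) => Source = source;
}

// Hypothetical sketch of a source-hash-keyed compile cache.
public sealed class SketchScriptCache
{
    private readonly ConcurrentDictionary<string, Lazy<FakeEvaluator>> _cache = new();
    private readonly Func<string, FakeEvaluator> _compile;

    public SketchScriptCache(Func<string, FakeEvaluator> compile) =>
        _compile = compile ?? throw new ArgumentNullException(nameof(compile));

    public int Count => _cache.Count;
    public void Clear() => _cache.Clear();
    public bool Contains(string source) => _cache.ContainsKey(HashSource(source));

    public FakeEvaluator GetOrCompile(string source)
    {
        ArgumentNullException.ThrowIfNull(source);
        var key = HashSource(source);

        // GetOrAdd may build more than one Lazy under a race, but only one is
        // published, and ExecutionAndPublication runs its factory exactly once.
        var lazy = _cache.GetOrAdd(key, _ => new Lazy<FakeEvaluator>(
            () => _compile(source), LazyThreadSafetyMode.ExecutionAndPublication));
        try
        {
            return lazy.Value;
        }
        catch
        {
            // In this mode Lazy<T> caches the factory exception, so the failed
            // entry is evicted (only if it is still this same Lazy instance).
            _cache.TryRemove(new KeyValuePair<string, Lazy<FakeEvaluator>>(key, lazy));
            throw;
        }
    }

    // SHA-256 over the UTF-8 bytes of the source; any whitespace change alters the key.
    private static string HashSource(string source) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(source)));
}
```

Removing by key and value together, rather than by key alone, avoids evicting a fresh entry that another caller may have added after the failure.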

TimedScriptEvaluator<TContext, TResult>

  • Default timeout = 250ms per plan Stream A.4; constructor-configurable per tag
  • Critical implementation detail: Roslyn's ScriptRunner executes synchronously on the calling thread for CPU-bound scripts, returning an already-completed Task before the caller can register a timeout. Fix: push evaluation through Task.Run so the caller's thread is free to wait and the timeout reliably fires.
  • Known trade-off: the orphaned eval task keeps running on its thread-pool thread until the script returns (documented in the class summary; tighter CPU budgeting would require an out-of-process runner, deferred to v3)
  • Caller-supplied CancellationToken takes precedence over timeout so shutdown paths see OperationCanceledException rather than a misclassified ScriptTimeoutException
  • ScriptTimeoutException carries the configured timeout and diagnostic message pointing the operator at ctx.Logger output + tuning paths
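A rough sketch of the timeout pattern described in the bullets above, assuming .NET 6+ for `Task.WaitAsync`. `TimedRunner`, `SketchTimeoutException`, and the delegate-based `evaluate` parameter are hypothetical; the real `TimedScriptEvaluator` wraps a `ScriptEvaluator`, and its message text and members may differ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for ScriptTimeoutException; the message is illustrative only.
public sealed class SketchTimeoutException : Exception
{
    public TimeSpan Timeout { get; }

    public SketchTimeoutException(TimeSpan timeout)
        : base($"Script exceeded its {timeout.TotalMilliseconds} ms budget. " +
               "Check the script log output, then widen the timeout or simplify the script.")
        => Timeout = timeout;
}

public static class TimedRunner
{
    public static readonly TimeSpan DefaultTimeout = TimeSpan.FromMilliseconds(250);

    public static async Task<TResult> RunAsync<TResult>(
        Func<CancellationToken, Task<TResult>> evaluate,
        TimeSpan timeout,
        CancellationToken ct = default)
    {
        ArgumentNullException.ThrowIfNull(evaluate);
        if (timeout <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(timeout));

        // Task.Run moves a possibly synchronous, CPU-bound evaluation onto a
        // thread-pool thread, so this method regains control and can time it out.
        var evalTask = Task.Run(() => evaluate(ct), ct);
        try
        {
            return await evalTask.WaitAsync(timeout, ct).ConfigureAwait(false);
        }
        catch (TimeoutException)
        {
            // Caller cancellation wins over the timeout classification.
            ct.ThrowIfCancellationRequested();

            // evalTask is not aborted here; it keeps running until the script returns.
            throw new SketchTimeoutException(timeout);
        }
    }
}
```

In this sketch an exception thrown by the script itself surfaces from the `await` unchanged, with one caveat: a script that throws `TimeoutException` would be reclassified, which a production wrapper would need to guard against.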

Tests — 48/48 green (29 from A.1 + 19 new)

  • CompiledScriptCacheTests (10) — first compile, dedupe, different-source, whitespace sensitivity, cached evaluator still runs, failed compile eviction, Clear, concurrent compile dedupe, separate TContext/TResult cache isolation, null rejection
  • TimedScriptEvaluatorTests (9) — fast completes, CPU-bound throws ScriptTimeoutException, caller cancel > timeout precedence, default 250ms per plan, zero/negative timeout rejected, null rejections, user exceptions unwrapped, diagnostic message shape

Next

Stream A.3 closes out Stream A: dedicated scripts-*.log Serilog rolling sink with structured-property filtering + companion-WARN enricher to the main log.

dohertj2 added 1 commit 2026-04-20 16:40:58 -04:00
CompiledScriptCache<TContext, TResult>: source-hash-keyed cache of compiled evaluators. Roslyn compilation is the most expensive step in the evaluator pipeline (5-20ms per script depending on size); re-compiling on every value-change event would starve the engine. A ConcurrentDictionary of Lazy<ScriptEvaluator> with ExecutionAndPublication mode ensures concurrent callers never double-compile, even on a cold-cache race. Failed compiles evict the cache entry so an Admin UI retry with corrected source actually recompiles (otherwise the cached exception would persist). The hash is whitespace-sensitive: reformatting a script misses the cache on purpose, which is simpler than canonicalizing the AST, and reformatting happens rarely. There is no capacity bound because virtual-tag + alarm scripts are config-DB bounded (thousands, not millions); if scale pushes past that in v3, an LRU eviction can slot in behind the same API.

TimedScriptEvaluator<TContext, TResult>: wraps a ScriptEvaluator with a per-evaluation wall-clock timeout (default 250ms per Phase 7 plan Stream A.4, configurable per tag so slower backends can widen it). Critical implementation detail: the underlying Roslyn ScriptRunner executes synchronously on the calling thread for CPU-bound user scripts, returning an already-completed Task before the caller can register a timeout. A naive `Task.WaitAsync(timeout)` would see the completed task and never fire. Fix: push evaluation to a thread-pool thread via Task.Run, so the caller's thread is free to wait and the timeout reliably fires after the configured budget. Known trade-off (documented in the class summary): when a script times out, the underlying evaluation task keeps running on the thread-pool thread until the script returns; in the CPU-bound infinite-loop case that thread is effectively leaked. Tighter CPU budgeting would require an out-of-process script runner (v3 concern). In practice the timeout + structured warning log surfaces the offending script so the operator fixes it, and orphaned threads are rare. A caller-supplied CancellationToken is honored and takes precedence over the timeout, so driver-shutdown paths see a clean OperationCanceledException rather than a misclassified ScriptTimeoutException.

ScriptTimeoutException carries the configured Timeout and a diagnostic message pointing the operator at ctx.Logger output around the failure plus suggesting widening the timeout, simplifying the script, or moving heavy work out of the evaluation path. The virtual-tag engine (Stream B) will catch this and map the owning tag's quality to BadInternalError per Phase 7 decision #11, logging a structured warning with the offending script name.

Tests: CompiledScriptCacheTests (10) — first-call compile, identical-source dedupe to same instance, different-source produces different evaluator, whitespace-sensitivity documented, cached evaluator still runs correctly, failed compile evicted for retry, Clear drops entries, concurrent GetOrCompile of the same source deduplicates to one instance, different TContext/TResult use separate cache instances, null source rejected. TimedScriptEvaluatorTests (9) — fast script completes under timeout, CPU-bound script throws ScriptTimeoutException, caller cancellation takes precedence over timeout (shutdown path correctness), default 250ms per plan, zero/negative timeout rejected at construction, null inner rejected, null context rejected, user-thrown exceptions propagate unwrapped (not conflated with timeout), timeout exception message contains diagnostic guidance. Full suite: 48/48 green (29 from A.1 + 19 new).

Next: Stream A.3 wires the dedicated scripts-*.log Serilog rolling sink + structured-property filtering + companion-WARN enricher to the main log, closing out Stream A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 merged commit cb5d7b2d58 into v2 2026-04-20 16:41:09 -04:00
dohertj2 referenced this issue from a commit 2026-04-30 08:21:25 -04:00
AB CIP @tags walker: CIP Symbol Object decoder + LibplctagTagEnumerator. Closes task #178.

CipSymbolObjectDecoder (pure-managed, no libplctag dep) parses the raw Symbol Object (class 0x6B) blob returned by reading the @tags pseudo-tag into an enumerable sequence of AbCipDiscoveredTag records. Entry layout per Rockwell CIP Vol 1 + Logix 5000 CIP Programming Manual 1756-PM019, cross-checked against libplctag's ab/cip.c handle_listed_tags_reply: u32 instance-id + u16 symbol-type + u16 element-length + 3×u32 array-dims + u16 name-length + name[len] + even-pad. Symbol-type lower 12 bits carry the CIP type code (0xC1 BOOL, 0xC2 SINT, …, 0xD0 STRING), bit 12 is the system-tag flag, bit 15 is the struct flag (when set, the lower 12 bits become the template instance id). Truncated tails stop decoding gracefully: the caller keeps whatever parsed cleanly rather than getting an exception mid-walk. Program:-scope names (Program:MainProgram.StepIndex) are split via SplitProgramScope so the enumerator surfaces scope + simple name separately. 12 atomic type codes mapped (BOOL/SINT/INT/DINT/LINT/USINT/UINT/UDINT/ULINT/REAL/LREAL/STRING + DT/DATE_AND_TIME under Dt); unknown codes return null so the caller treats them as opaque Structure.

LibplctagTagEnumerator is the real production walker: it creates a libplctag Tag with name=@tags against the device's gateway/port/path, runs InitializeAsync + ReadAsync + GetBuffer, and hands the bytes to the decoder. Factory LibplctagTagEnumeratorFactory replaces EmptyAbCipTagEnumeratorFactory as the AbCipDriver default.

AbCipDriverOptions gains EnableControllerBrowse (default false) matching the TwinCAT pattern, which keeps the strict-config path for deployments where only declared tags should appear. When true, DiscoverAsync walks each device's @tags + emits surviving symbols under the Discovered/ sub-folder. System-tag filter (AbCipSystemTagFilter shipped in PR 5) runs alongside the wire-layer system-flag hint.
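To make the entry layout above concrete, here is a small sketch of a decoder loop under stated assumptions: fields are read little-endian (as CIP specifies), names are treated as ASCII, and "even-pad" is interpreted as one pad byte after an odd-length name. `SketchSymbol` and `SketchSymbolDecoder` are hypothetical names, not the shipped `CipSymbolObjectDecoder`, and type-code mapping and Program: scope splitting are left out.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Text;

// Hypothetical record; the real decoder emits AbCipDiscoveredTag.
public sealed record SketchSymbol(
    uint InstanceId,
    ushort TypeCodeOrTemplateId, // lower 12 bits of symbol-type
    bool IsSystem,               // bit 12
    bool IsStruct,               // bit 15
    ushort ElementLength,
    uint[] ArrayDims,
    string Name);

public static class SketchSymbolDecoder
{
    // u32 id + u16 type + u16 element length + 3 x u32 dims + u16 name length
    private const int HeaderSize = 4 + 2 + 2 + 12 + 2;

    public static List<SketchSymbol> Decode(ReadOnlySpan<byte> buffer)
    {
        var result = new List<SketchSymbol>();
        var pos = 0;

        // Stop quietly when fewer bytes remain than a full header (truncated tail).
        while (buffer.Length - pos >= HeaderSize)
        {
            var entry = buffer.Slice(pos);
            uint instanceId = BinaryPrimitives.ReadUInt32LittleEndian(entry);
            ushort symbolType = BinaryPrimitives.ReadUInt16LittleEndian(entry.Slice(4));
            ushort elementLength = BinaryPrimitives.ReadUInt16LittleEndian(entry.Slice(6));

            var dims = new uint[3];
            for (var i = 0; i < 3; i++)
                dims[i] = BinaryPrimitives.ReadUInt32LittleEndian(entry.Slice(8 + 4 * i));

            int nameLength = BinaryPrimitives.ReadUInt16LittleEndian(entry.Slice(20));
            if (entry.Length < HeaderSize + nameLength)
                break; // name is cut off: keep what parsed cleanly so far

            string name = Encoding.ASCII.GetString(entry.Slice(HeaderSize, nameLength));

            result.Add(new SketchSymbol(
                instanceId,
                (ushort)(symbolType & 0x0FFF),
                (symbolType & 0x1000) != 0,
                (symbolType & 0x8000) != 0,
                elementLength,
                dims,
                name));

            // Assumed even-pad rule: skip one byte after an odd-length name.
            pos += HeaderSize + nameLength + (nameLength & 1);
        }

        return result;
    }
}
```

Returning a list instead of using an iterator keeps the span-based parsing legal on older C# versions, where ref struct locals are not allowed in iterator blocks.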
Tests: 18 new CipSymbolObjectDecoderTests with crafted byte arrays matching the documented layout: single-entry DInt, theory across 12 atomic type codes, unknown→null, struct flag override, system flag surface, Program:-scope split, multi-entry wire-order with even-pad, truncated-buffer graceful stop, empty buffer, SplitProgramScope theory across 6 shapes. 4 pre-existing AbCipDriverDiscoveryTests that tested controller-enumeration behavior updated with EnableControllerBrowse=true so they continue exercising the walker path (behavior unchanged from their perspective). Total AbCip unit tests now 192/192 passing (+26 from the RMW merge's 166); full solution builds 0 errors; other drivers untouched.

Field validation note: the decoder layout matches published Rockwell docs + libplctag C source, but actual @tags responses vary slightly by controller firmware (some ship an older entry format with u16 array dims instead of u32). Any layout drift surfaces as gibberish names in the Discovered/ folder; field testing will flag that for a decoder patch if it occurs.