fix(scripted-alarms): reuse per-alarm evaluation scratch on the hot path
Core.ScriptedAlarms-009 resolution: replace the per-call Dictionary +
AlarmPredicateContext allocation with a per-alarm reusable AlarmScratch
held in _scratchByAlarmId, refilled in place under _evalGate on each
evaluation. The hot path no longer allocates per upstream tag change.
Why this matters:
On a busy line where many tags feeding many alarms change frequently,
the old BuildReadCache allocated a fresh dictionary + context on every
predicate evaluation — a steady stream of short-lived allocations the
GC eventually has to reclaim. With the reuse, the dictionary and
context are allocated once per alarm (on first evaluation) and refilled
in place across every subsequent re-eval.
Implementation:
- New private AlarmScratch class holds the reusable
Dictionary<string, DataValueSnapshot> read cache (pre-sized to the
alarm's Inputs.Count) and the AlarmPredicateContext that wraps it by
reference. The context observes refilled values without being
re-created.
- ConcurrentDictionary<string, AlarmScratch> _scratchByAlarmId on the
engine, cleared in LoadAsync alongside _alarms so a config-publish
drops the prior generation's scratch (Inputs / Logger may change).
- EvaluatePredicateToStateAsync looks up scratch via GetOrAdd, calls
the new RefillReadCache(Dictionary, IReadOnlySet) helper to clear +
repopulate the dictionary in place, then runs the predicate against
the reused context.
- BuildReadCache removed.
Safety:
Reuse is serialised under _evalGate which guarantees no two threads
ever observe the same scratch in a half-refilled state. The
AlarmPredicateContext is bound to the scratch dictionary by reference,
so the predicate's ctx.GetTag(path) sees the freshly-refilled values
rather than a stale snapshot.
Verification:
- All 66 ScriptedAlarms tests pass (was 63 — three new regression tests
locking the reuse contract).
- All 56 VirtualTags tests still pass (unchanged).
- All 104 Core.Scripting tests still pass (unchanged).
New tests in ScriptedAlarmEngineTests:
- Reevaluation_reuses_the_same_read_cache_dictionary — asserts
ReferenceEquals(scratch_before, scratch_after) across two
evaluations of the same alarm.
- Reevaluation_reuses_the_same_predicate_context — same, for the
context.
- LoadAsync_drops_the_prior_generations_scratch — asserts a config
publish wipes the prior scratch (so a stale Logger / Inputs can't
leak into the new generation).
Internal test hooks TryGetScratchReadCacheForTest /
TryGetScratchContextForTest added via the existing
InternalsVisibleTo for the tests project. Kept internal — not part of
the public engine surface.
Docs:
- docs/v2/Galaxy.Performance.md "Scripted-alarm engine" section
rewritten as "hot-path allocation reuse" documenting the new
contract + reuse safety reasoning + the three regression tests.
- code-reviews/Core.ScriptedAlarms/findings.md -009 flipped
Won't Fix → Resolved.
- code-reviews/README.md regenerated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -156,13 +156,29 @@ a category produced nothing rather than leaving it blank.
|
||||
| Severity | Low |
|
||||
| Category | Performance & resource management |
|
||||
| Location | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
|
||||
| Status | Won't Fix |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` on every predicate evaluation, i.e. on every upstream tag change for every referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. `AlarmPredicateContext` is also newly constructed each evaluation (line 281).
|
||||
|
||||
**Recommendation:** Minor. If the evaluation path shows up in allocation profiling, the read cache could be a reused per-alarm buffer cleared between evaluations (evaluations are already serialised under `_evalGate`, so a single shared scratch dictionary is safe). Not worth doing speculatively — flag for the perf surface in `docs/v2/Galaxy.Performance.md` if alarm evaluation is ever soak-tested.
|
||||
|
||||
**Resolution:** Won't Fix 2026-05-23 — per the recommendation, no code change. Documented the known allocation characteristic in `docs/v2/Galaxy.Performance.md` (new "Scripted-alarm engine — known hot-path allocations" section) so a future soak that surfaces pressure has a noted mitigation (reused per-alarm scratch buffer) and we don't re-find this in a later review.
|
||||
**Resolution:** Resolved 2026-05-23 — added a per-alarm reusable `AlarmScratch`
|
||||
(read-cache `Dictionary` + `AlarmPredicateContext`) held in
|
||||
`_scratchByAlarmId`, populated lazily on first evaluation and refilled in place
|
||||
by the new `RefillReadCache(Dictionary, IReadOnlySet)` helper on every
|
||||
subsequent re-eval. `BuildReadCache` is gone. The reuse is safe because every
|
||||
evaluation runs under `_evalGate`; the context wraps the dictionary by
|
||||
reference, so the predicate's `ctx.GetTag(path)` sees the freshly-refilled
|
||||
values. `LoadAsync` clears `_scratchByAlarmId` alongside `_alarms` so a
|
||||
config-publish drops the prior generation's scratch (Inputs / Logger may
|
||||
change). Regression tests added in `ScriptedAlarmEngineTests`:
|
||||
`Reevaluation_reuses_the_same_read_cache_dictionary`,
|
||||
`Reevaluation_reuses_the_same_predicate_context`, and
|
||||
`LoadAsync_drops_the_prior_generations_scratch`; internal test hooks
|
||||
`TryGetScratchReadCacheForTest` / `TryGetScratchContextForTest` exposed via
|
||||
the existing `InternalsVisibleTo`. `docs/v2/Galaxy.Performance.md` "Scripted-alarm
|
||||
engine" section rewritten to document the new reuse contract. Suite now 66
|
||||
green (was 63).
|
||||
|
||||
### Core.ScriptedAlarms-010
|
||||
|
||||
|
||||
Reference in New Issue
Block a user