fix(core-scripted-alarms): resolve Low code-review findings (Core.ScriptedAlarms-003,006,008,010,011; -009 documented)
- Core.ScriptedAlarms-003: emit OnEvent OUTSIDE _evalGate by collecting
pending emissions during the gate-held section and flushing them after
release; eliminates re-entrancy deadlock the docs already promised.
- Core.ScriptedAlarms-006: track every fire-and-forget Reevaluate /
ShelvingCheck task in _inFlight; Dispose drains the set so the engine
no longer races store writes against teardown.
- Core.ScriptedAlarms-008: store comments as ImmutableList<AlarmComment>
so AppendComment is O(log n) instead of O(n).
- Core.ScriptedAlarms-010: document the deliberate input-quality
asymmetry (Uncertain drives the predicate, renders {?} in the message)
in docs/ScriptedAlarms.md and on MessageTemplate.Resolve remarks.
- Core.ScriptedAlarms-011: propagate the no-op reason through
TransitionResult.NoOp(state, reason) and log it from
ScriptedAlarmEngine.ApplyAsync.
- Core.ScriptedAlarms-009 (Won't Fix per recommendation): documented the
per-evaluation dictionary allocation in docs/v2/Galaxy.Performance.md
with a mitigation path if a future soak surfaces pressure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -35,7 +35,7 @@ new ScriptedAlarmDefinition(
|
||||
|
||||
## Predicate evaluation
|
||||
|
||||
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known memory / CPU resource limits are documented there as well.
|
||||
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory, the per-publish accretion of dynamically-emitted script assemblies (Core.Scripting-008), and the orphan-thread CPU-budget caveat — are documented in that file as well.
|
||||
|
||||
`AlarmPredicateContext` (`AlarmPredicateContext.cs`) is the script's `ScriptContext` subclass:
|
||||
|
||||
@@ -79,6 +79,17 @@ Two invariants the machine enforces:
|
||||
|
||||
Fallback rules: a resolved `DataValueSnapshot` with a non-zero `StatusCode`, a `null` `Value`, or an unknown path becomes `{?}`. The event still fires — the operator sees where the reference broke rather than having the alarm swallowed.
|
||||
|
||||
## Input-quality policy
|
||||
|
||||
Predicate evaluation and message-template resolution deliberately treat tag-input quality differently:
|
||||
|
||||
| Surface | Quality bar | Rationale |
|
||||
|---|---|---|
|
||||
| `ScriptedAlarmEngine.AreInputsReady` (predicate gate) | **Bad rejected** (`StatusCode` bit 31 set). `Good` and `Uncertain` are both accepted. | Uncertain quality still carries a value the predicate can inspect; rejecting it would mask a transitional alarm condition. Predicate evaluation is a state-machine input — operators want it to track reality as closely as the quality allows. |
|
||||
| `MessageTemplate.Resolve` (operator-facing message) | **Any non-zero `StatusCode` rejected** — only `Good` substitutes; `Uncertain` / Bad / unknown all render as `{?}`. | The message is a human-readable signal; substituting an Uncertain value would let operators act on a questionable reading without seeing the qualifier. Rendering `{?}` makes the doubt explicit. |
|
||||
|
||||
`AlarmPredicateContext.GetTag` returns a `BadNodeIdUnknown` (`0x80340000`) snapshot for missing or empty paths, so a typo in the predicate flows through `AreInputsReady` (Bad → predicate skipped, prior state held) and `MessageTemplate.Resolve` (non-Good → `{?}`) without crashing the engine. (Core.ScriptedAlarms-010)
|
||||
|
||||
## State persistence
|
||||
|
||||
`IAlarmStateStore` (`IAlarmStateStore.cs`) is the persistence contract: `LoadAsync(alarmId)`, `LoadAllAsync`, `SaveAsync(state)`, `RemoveAsync(alarmId)`. `InMemoryAlarmStateStore` in the same file is the default for tests and dev deployments without a SQL backend. Stream E wires the production implementation against the `ScriptedAlarmState` config-DB table with audit logging through `Core.Abstractions.IAuditLogger`.
|
||||
|
||||
@@ -150,3 +150,9 @@ substantive driver change, and revise this table when the data does.
|
||||
leak guard. Likely culprits: lingering subscription handles in
|
||||
`SubscriptionRegistry`, or a downstream consumer retaining
|
||||
`DataValueSnapshot` references past their useful life.
|
||||
|
||||
## Scripted-alarm engine — known hot-path allocations
|
||||
|
||||
`ScriptedAlarmEngine.BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` and `AlarmPredicateContext` on every predicate evaluation — i.e. once per upstream tag change per referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. (Core.ScriptedAlarms-009)
|
||||
|
||||
The allocations are deliberate for now: predicate evaluation is already serialised under `_evalGate`, so a single reused scratch dictionary would be safe, but the per-call dictionary keeps the evaluation surface immutable and trivially safe against future refactors. If a future scripted-alarm soak surfaces allocation pressure on this path, the mitigation is a per-alarm scratch buffer cleared between evaluations — note here before changing the engine.
|
||||
|
||||
Reference in New Issue
Block a user