fix(core-scripted-alarms): resolve Low code-review findings (Core.ScriptedAlarms-003,006,008,010,011; -009 documented)
- Core.ScriptedAlarms-003: emit OnEvent OUTSIDE _evalGate by collecting
pending emissions during the gate-held section and flushing them after
release; eliminates re-entrancy deadlock the docs already promised.
- Core.ScriptedAlarms-006: track every fire-and-forget Reevaluate /
ShelvingCheck task in _inFlight; Dispose drains the set so the engine
no longer races store writes against teardown.
- Core.ScriptedAlarms-008: store comments as ImmutableList<AlarmComment>
so AppendComment is O(log n) instead of O(n).
- Core.ScriptedAlarms-010: document the deliberate input-quality
asymmetry (Uncertain drives the predicate, renders {?} in the message)
in docs/ScriptedAlarms.md and on MessageTemplate.Resolve remarks.
- Core.ScriptedAlarms-011: propagate the no-op reason through
TransitionResult.NoOp(state, reason) and log it from
ScriptedAlarmEngine.ApplyAsync.
- Core.ScriptedAlarms-009 (Won't Fix per recommendation): documented the
per-evaluation dictionary allocation in docs/v2/Galaxy.Performance.md
with a mitigation path if a future soak surfaces pressure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -7,7 +7,7 @@
|
|||||||
| Review date | 2026-05-22 |
|
| Review date | 2026-05-22 |
|
||||||
| Commit reviewed | `76d35d1` |
|
| Commit reviewed | `76d35d1` |
|
||||||
| Status | Reviewed |
|
| Status | Reviewed |
|
||||||
| Open findings | 6 |
|
| Open findings | 0 |
|
||||||
|
|
||||||
## Checklist coverage
|
## Checklist coverage
|
||||||
|
|
||||||
@@ -66,13 +66,13 @@ a category produced nothing rather than leaving it blank.
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Documentation & comments |
|
| Category | Documentation & comments |
|
||||||
| Location | `ScriptedAlarmEngine.cs:343`, `docs/ScriptedAlarms.md:107` |
|
| Location | `ScriptedAlarmEngine.cs:343`, `docs/ScriptedAlarms.md:107` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** `docs/ScriptedAlarms.md` (Composition step 3) and the `OnUpstreamChange` comment ("Fire-and-forget so driver-side dispatch isn't blocked", line 225-226) describe the `OnEvent` emission path as non-blocking / fire-and-forget. In the code, `EmitEvent` invokes `OnEvent?.Invoke(this, evt)` **synchronously while `_evalGate` is held** (called from `EvaluatePredicateToStateAsync` line 305 and `ApplyAsync` line 217, both inside the gate). A slow subscriber blocks the single evaluation gate for all alarms; a subscriber that re-enters the engine (e.g. calls `AcknowledgeAsync`) deadlocks because `_evalGate` is a non-reentrant `SemaphoreSlim(1,1)`. The behaviour is defensible (the historian sink is non-blocking, per the doc), but the comments/doc are misleading about where the work happens and the re-entrancy hazard is undocumented.
|
**Description:** `docs/ScriptedAlarms.md` (Composition step 3) and the `OnUpstreamChange` comment ("Fire-and-forget so driver-side dispatch isn't blocked", line 225-226) describe the `OnEvent` emission path as non-blocking / fire-and-forget. In the code, `EmitEvent` invokes `OnEvent?.Invoke(this, evt)` **synchronously while `_evalGate` is held** (called from `EvaluatePredicateToStateAsync` line 305 and `ApplyAsync` line 217, both inside the gate). A slow subscriber blocks the single evaluation gate for all alarms; a subscriber that re-enters the engine (e.g. calls `AcknowledgeAsync`) deadlocks because `_evalGate` is a non-reentrant `SemaphoreSlim(1,1)`. The behaviour is defensible (the historian sink is non-blocking, per the doc), but the comments/doc are misleading about where the work happens and the re-entrancy hazard is undocumented.
|
||||||
|
|
||||||
**Recommendation:** Either move `EmitEvent` outside the `_evalGate` critical section (collect emissions during the locked section and raise them after `Release()`), or document explicitly on `OnEvent` that handlers run under the engine lock, must be fast, and must never call back into the engine.
|
**Recommendation:** Either move `EmitEvent` outside the `_evalGate` critical section (collect emissions during the locked section and raise them after `Release()`), or document explicitly on `OnEvent` that handlers run under the engine lock, must be fast, and must never call back into the engine.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — split `EmitEvent` into `BuildEmission` (called under the gate to capture a coherent value-cache snapshot for message-template resolution) and `FireEvent` (called after `_evalGate.Release()` so subscribers can re-enter the engine without deadlocking and a slow subscriber no longer blocks concurrent engine operations). Updated `ApplyAsync`, `ReevaluateAsync`, `ShelvingCheckAsync`, and `LoadAsync` (startup-recovery path) to collect emissions in a pending list and flush after the gate is released; added regression tests for both the re-entry path and a white-box gate-acquirable-from-subscriber check.
|
||||||
|
|
||||||
### Core.ScriptedAlarms-004
|
### Core.ScriptedAlarms-004
|
||||||
|
|
||||||
@@ -111,13 +111,13 @@ a category produced nothing rather than leaving it blank.
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Concurrency & thread safety |
|
| Category | Concurrency & thread safety |
|
||||||
| Location | `ScriptedAlarmEngine.cs:232`, `ScriptedAlarmEngine.cs:369` |
|
| Location | `ScriptedAlarmEngine.cs:232`, `ScriptedAlarmEngine.cs:369` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** `OnUpstreamChange` and `RunShelvingCheck` both launch fire-and-forget tasks (`_ = ReevaluateAsync(...)`, `_ = ShelvingCheckAsync(...)`) with `CancellationToken.None`. There is no tracking of these in-flight tasks, so `Dispose` cannot await them and a server shutdown can race a still-running re-evaluation that writes to the (possibly disposed) store. Combined with finding 005, an upstream push arriving during shutdown produces an unobserved background task touching torn state.
|
**Description:** `OnUpstreamChange` and `RunShelvingCheck` both launch fire-and-forget tasks (`_ = ReevaluateAsync(...)`, `_ = ShelvingCheckAsync(...)`) with `CancellationToken.None`. There is no tracking of these in-flight tasks, so `Dispose` cannot await them and a server shutdown can race a still-running re-evaluation that writes to the (possibly disposed) store. Combined with finding 005, an upstream push arriving during shutdown produces an unobserved background task touching torn state.
|
||||||
|
|
||||||
**Recommendation:** Track outstanding background tasks (or use a single serialised worker / `Channel`), and link them to a `CancellationTokenSource` that `Dispose` cancels and drains. At minimum, await the in-flight work in `Dispose`.
|
**Recommendation:** Track outstanding background tasks (or use a single serialised worker / `Channel`), and link them to a `CancellationTokenSource` that `Dispose` cancels and drains. At minimum, await the in-flight work in `Dispose`.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — added `_inFlight` HashSet + `TrackBackgroundTask(...)` helper to register every fire-and-forget `ReevaluateAsync`/`ShelvingCheckAsync` task, with a sync `ContinueWith` continuation that auto-removes on completion. `Dispose` snapshots the set under its own lock and `Task.WhenAll(...).GetAwaiter().GetResult()` drains them before returning; `OnUpstreamChange` also short-circuits when `_disposed` is set so no new work is queued during shutdown. Regression test exercises the slow-store path: Dispose blocks until the in-flight `SaveAsync` completes.
|
||||||
|
|
||||||
### Core.ScriptedAlarms-007
|
### Core.ScriptedAlarms-007
|
||||||
|
|
||||||
@@ -141,13 +141,13 @@ a category produced nothing rather than leaving it blank.
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Performance & resource management |
|
| Category | Performance & resource management |
|
||||||
| Location | `Part9StateMachine.cs:261-268` |
|
| Location | `Part9StateMachine.cs:261-268` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** `AppendComment` copies the entire existing comment list into a new `List` on every audit-producing transition (ack, confirm, shelve, unshelve, enable, disable, add-comment, auto-unshelve). The `Comments` list is append-only and unbounded — for a long-lived alarm that is acknowledged/commented hundreds of times, every transition is an O(n) copy and the full history is also re-serialised to the store on every `SaveAsync`. Over a multi-month uptime this is a slowly growing per-transition cost.
|
**Description:** `AppendComment` copies the entire existing comment list into a new `List` on every audit-producing transition (ack, confirm, shelve, unshelve, enable, disable, add-comment, auto-unshelve). The `Comments` list is append-only and unbounded — for a long-lived alarm that is acknowledged/commented hundreds of times, every transition is an O(n) copy and the full history is also re-serialised to the store on every `SaveAsync`. Over a multi-month uptime this is a slowly growing per-transition cost.
|
||||||
|
|
||||||
**Recommendation:** Acceptable for now given audit requirements, but consider an immutable persistent list / `ImmutableList<AlarmComment>` to make append O(log n), or have the store persist comments incrementally (append-only audit table) rather than rewriting the whole collection each save. At minimum, note the unbounded-growth characteristic in the design doc.
|
**Recommendation:** Acceptable for now given audit requirements, but consider an immutable persistent list / `ImmutableList<AlarmComment>` to make append O(log n), or have the store persist comments incrementally (append-only audit table) rather than rewriting the whole collection each save. At minimum, note the unbounded-growth characteristic in the design doc.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — switched `AlarmConditionState.Comments` from `IReadOnlyList<AlarmComment>` to `ImmutableList<AlarmComment>` and rewrote `AppendComment` as `existing.Add(...)` so each append is O(log n) instead of the prior O(n) copy. `ImmutableList<T>` still implements `IReadOnlyList<T>` so existing consumers compile unchanged; the persistence layer continues to store comments as JSON so wire-format is unaffected. Regression test asserts the runtime type is `ImmutableList<AlarmComment>`.
|
||||||
|
|
||||||
### Core.ScriptedAlarms-009
|
### Core.ScriptedAlarms-009
|
||||||
|
|
||||||
@@ -156,13 +156,13 @@ a category produced nothing rather than leaving it blank.
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Performance & resource management |
|
| Category | Performance & resource management |
|
||||||
| Location | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
|
| Location | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
|
||||||
| Status | Open |
|
| Status | Won't Fix |
|
||||||
|
|
||||||
**Description:** `BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` on every predicate evaluation, i.e. on every upstream tag change for every referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. `AlarmPredicateContext` is also newly constructed each evaluation (line 281).
|
**Description:** `BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` on every predicate evaluation, i.e. on every upstream tag change for every referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. `AlarmPredicateContext` is also newly constructed each evaluation (line 281).
|
||||||
|
|
||||||
**Recommendation:** Minor. If the evaluation path shows up in allocation profiling, the read cache could be a reused per-alarm buffer cleared between evaluations (evaluations are already serialised under `_evalGate`, so a single shared scratch dictionary is safe). Not worth doing speculatively — flag for the perf surface in `docs/v2/Galaxy.Performance.md` if alarm evaluation is ever soak-tested.
|
**Recommendation:** Minor. If the evaluation path shows up in allocation profiling, the read cache could be a reused per-alarm buffer cleared between evaluations (evaluations are already serialised under `_evalGate`, so a single shared scratch dictionary is safe). Not worth doing speculatively — flag for the perf surface in `docs/v2/Galaxy.Performance.md` if alarm evaluation is ever soak-tested.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Won't Fix 2026-05-23 — per the recommendation, no code change. Documented the known allocation characteristic in `docs/v2/Galaxy.Performance.md` (new "Scripted-alarm engine — known hot-path allocations" section) so a future soak that surfaces pressure has a noted mitigation (reused per-alarm scratch buffer) and we don't re-find this in a later review.
|
||||||
|
|
||||||
### Core.ScriptedAlarms-010
|
### Core.ScriptedAlarms-010
|
||||||
|
|
||||||
@@ -171,13 +171,13 @@ a category produced nothing rather than leaving it blank.
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Design-document adherence |
|
| Category | Design-document adherence |
|
||||||
| Location | `ScriptedAlarmEngine.cs:325-336`, `AlarmPredicateContext.cs:33-40`, `MessageTemplate.cs:47` |
|
| Location | `ScriptedAlarmEngine.cs:325-336`, `AlarmPredicateContext.cs:33-40`, `MessageTemplate.cs:47` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** Quality handling is inconsistent across the three places that inspect a `DataValueSnapshot.StatusCode`. `AreInputsReady` (engine, line 333) treats only outright Bad (bit 31) as not-ready, so an Uncertain-quality input is fed to the predicate. `MessageTemplate.Resolve` (line 47) rejects *any* non-zero status code — including Uncertain — and renders `{?}`. `AlarmPredicateContext.GetTag` returns `BadNodeIdUnknown` (`0x80340000`) for a missing path. The net effect: an Uncertain-quality tag is considered good enough to drive an alarm *activation* decision but not good enough to print in the alarm *message*. `docs/ScriptedAlarms.md` ("Fallback rules") only documents the message-template behaviour and does not mention that predicate evaluation accepts Uncertain. The two policies should be reconciled and documented.
|
**Description:** Quality handling is inconsistent across the three places that inspect a `DataValueSnapshot.StatusCode`. `AreInputsReady` (engine, line 333) treats only outright Bad (bit 31) as not-ready, so an Uncertain-quality input is fed to the predicate. `MessageTemplate.Resolve` (line 47) rejects *any* non-zero status code — including Uncertain — and renders `{?}`. `AlarmPredicateContext.GetTag` returns `BadNodeIdUnknown` (`0x80340000`) for a missing path. The net effect: an Uncertain-quality tag is considered good enough to drive an alarm *activation* decision but not good enough to print in the alarm *message*. `docs/ScriptedAlarms.md` ("Fallback rules") only documents the message-template behaviour and does not mention that predicate evaluation accepts Uncertain. The two policies should be reconciled and documented.
|
||||||
|
|
||||||
**Recommendation:** Decide one quality policy for "is this input usable" and apply it in both `AreInputsReady` and the message resolver, or explicitly document why predicate evaluation and message rendering treat Uncertain differently. Add the predicate-side Uncertain rule to `docs/ScriptedAlarms.md`.
|
**Recommendation:** Decide one quality policy for "is this input usable" and apply it in both `AreInputsReady` and the message resolver, or explicitly document why predicate evaluation and message rendering treat Uncertain differently. Add the predicate-side Uncertain rule to `docs/ScriptedAlarms.md`.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — documented the deliberate asymmetry. Added an "Input-quality policy" section to `docs/ScriptedAlarms.md` (table contrasting `AreInputsReady`'s Bad-only rejection with `MessageTemplate.Resolve`'s Good-only acceptance, plus the rationale) and a cross-referencing remarks block on `MessageTemplate.Resolve`. The two policies are kept distinct on purpose: predicate evaluation accepts Uncertain because the value is still inspectable, while the operator-facing message must render `{?}` to make the qualifier visible. Regression test locks in both behaviours with a single Uncertain-quality input that activates the alarm and surfaces `{?}` in the emission message.
|
||||||
|
|
||||||
### Core.ScriptedAlarms-011
|
### Core.ScriptedAlarms-011
|
||||||
|
|
||||||
@@ -186,13 +186,13 @@ a category produced nothing rather than leaving it blank.
|
|||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Code organization & conventions |
|
| Category | Code organization & conventions |
|
||||||
| Location | `Part9StateMachine.cs:275` |
|
| Location | `Part9StateMachine.cs:275` |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
|
|
||||||
**Description:** `TransitionResult.NoOp(state, reason)` takes a `reason` string parameter that is documented in the calling code as a diagnostic ("disabled — predicate result ignored", "already acknowledged", etc.) but the factory method silently discards it — it just returns `new(state, EmissionKind.None)`, identical to `None(state)`. Every call site that passes a carefully-worded reason string is doing dead work, and the comments in `Part9StateMachine` and the class-level remarks claim disabled/no-op transitions "produce ... a diagnostic log line", which they do not.
|
**Description:** `TransitionResult.NoOp(state, reason)` takes a `reason` string parameter that is documented in the calling code as a diagnostic ("disabled — predicate result ignored", "already acknowledged", etc.) but the factory method silently discards it — it just returns `new(state, EmissionKind.None)`, identical to `None(state)`. Every call site that passes a carefully-worded reason string is doing dead work, and the comments in `Part9StateMachine` and the class-level remarks claim disabled/no-op transitions "produce ... a diagnostic log line", which they do not.
|
||||||
|
|
||||||
**Recommendation:** Either propagate the reason (add it to `TransitionResult` and have the engine log it at debug level when emission is `None` for a no-op), or remove the unused `reason` parameter and collapse `NoOp` into `None`. Update the `Part9StateMachine` remarks that promise a diagnostic log line.
|
**Recommendation:** Either propagate the reason (add it to `TransitionResult` and have the engine log it at debug level when emission is `None` for a no-op), or remove the unused `reason` parameter and collapse `NoOp` into `None`. Update the `Part9StateMachine` remarks that promise a diagnostic log line.
|
||||||
|
|
||||||
**Resolution:** _(open)_
|
**Resolution:** Resolved 2026-05-23 — added a nullable `NoOpReason` property to `TransitionResult` (defaulted on the primary constructor so existing positional `new TransitionResult(state, kind)` call sites remain valid) and propagated it from `TransitionResult.NoOp(state, reason)`. `ScriptedAlarmEngine.ApplyAsync` now logs the reason at debug level via the alarm's script logger when the transition is a no-op, fulfilling the class-level remarks. Two regression tests assert that `NoOp` carries the reason and `None` does not.
|
||||||
|
|
||||||
### Core.ScriptedAlarms-012
|
### Core.ScriptedAlarms-012
|
||||||
|
|
||||||
|
|||||||
@@ -35,7 +35,7 @@ new ScriptedAlarmDefinition(
|
|||||||
|
|
||||||
## Predicate evaluation
|
## Predicate evaluation
|
||||||
|
|
||||||
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known memory / CPU resource limits are documented there as well.
|
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory, the per-publish accretion of dynamically-emitted script assemblies (Core.Scripting-008), and the orphan-thread CPU-budget caveat — are documented in that file as well.
|
||||||
|
|
||||||
`AlarmPredicateContext` (`AlarmPredicateContext.cs`) is the script's `ScriptContext` subclass:
|
`AlarmPredicateContext` (`AlarmPredicateContext.cs`) is the script's `ScriptContext` subclass:
|
||||||
|
|
||||||
@@ -79,6 +79,17 @@ Two invariants the machine enforces:
|
|||||||
|
|
||||||
Fallback rules: a resolved `DataValueSnapshot` with a non-zero `StatusCode`, a `null` `Value`, or an unknown path becomes `{?}`. The event still fires — the operator sees where the reference broke rather than having the alarm swallowed.
|
Fallback rules: a resolved `DataValueSnapshot` with a non-zero `StatusCode`, a `null` `Value`, or an unknown path becomes `{?}`. The event still fires — the operator sees where the reference broke rather than having the alarm swallowed.
|
||||||
|
|
||||||
|
## Input-quality policy
|
||||||
|
|
||||||
|
Predicate evaluation and message-template resolution deliberately treat tag-input quality differently:
|
||||||
|
|
||||||
|
| Surface | Quality bar | Rationale |
|
||||||
|
|---|---|---|
|
||||||
|
| `ScriptedAlarmEngine.AreInputsReady` (predicate gate) | **Bad rejected** (`StatusCode` bit 31 set). `Good` and `Uncertain` are both accepted. | Uncertain quality still carries a value the predicate can inspect; rejecting it would mask a transitional alarm condition. Predicate evaluation is a state-machine input — operators want it to track reality as closely as the quality allows. |
|
||||||
|
| `MessageTemplate.Resolve` (operator-facing message) | **Any non-zero `StatusCode` rejected** — only `Good` substitutes; `Uncertain` / Bad / unknown all render as `{?}`. | The message is a human-readable signal; substituting an Uncertain value would let operators act on a questionable reading without seeing the qualifier. Rendering `{?}` makes the doubt explicit. |
|
||||||
|
|
||||||
|
`AlarmPredicateContext.GetTag` returns a `BadNodeIdUnknown` (`0x80340000`) snapshot for missing or empty paths, so a typo in the predicate flows through `AreInputsReady` (Bad → predicate skipped, prior state held) and `MessageTemplate.Resolve` (non-Good → `{?}`) without crashing the engine. (Core.ScriptedAlarms-010)
|
||||||
|
|
||||||
## State persistence
|
## State persistence
|
||||||
|
|
||||||
`IAlarmStateStore` (`IAlarmStateStore.cs`) is the persistence contract: `LoadAsync(alarmId)`, `LoadAllAsync`, `SaveAsync(state)`, `RemoveAsync(alarmId)`. `InMemoryAlarmStateStore` in the same file is the default for tests and dev deployments without a SQL backend. Stream E wires the production implementation against the `ScriptedAlarmState` config-DB table with audit logging through `Core.Abstractions.IAuditLogger`.
|
`IAlarmStateStore` (`IAlarmStateStore.cs`) is the persistence contract: `LoadAsync(alarmId)`, `LoadAllAsync`, `SaveAsync(state)`, `RemoveAsync(alarmId)`. `InMemoryAlarmStateStore` in the same file is the default for tests and dev deployments without a SQL backend. Stream E wires the production implementation against the `ScriptedAlarmState` config-DB table with audit logging through `Core.Abstractions.IAuditLogger`.
|
||||||
|
|||||||
@@ -150,3 +150,9 @@ substantive driver change, and revise this table when the data does.
|
|||||||
leak guard. Likely culprits: lingering subscription handles in
|
leak guard. Likely culprits: lingering subscription handles in
|
||||||
`SubscriptionRegistry`, or a downstream consumer retaining
|
`SubscriptionRegistry`, or a downstream consumer retaining
|
||||||
`DataValueSnapshot` references past their useful life.
|
`DataValueSnapshot` references past their useful life.
|
||||||
|
|
||||||
|
## Scripted-alarm engine — known hot-path allocations
|
||||||
|
|
||||||
|
`ScriptedAlarmEngine.BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` and `AlarmPredicateContext` on every predicate evaluation — i.e. once per upstream tag change per referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. (Core.ScriptedAlarms-009)
|
||||||
|
|
||||||
|
The allocations are deliberate for now: predicate evaluation is already serialised under `_evalGate`, so a single reused scratch dictionary would be safe, but the per-call dictionary keeps the evaluation surface immutable and trivially safe against future refactors. If a future scripted-alarm soak surfaces allocation pressure on this path, the mitigation is a per-alarm scratch buffer cleared between evaluations — note here before changing the engine.
|
||||||
|
|||||||
@@ -1,3 +1,5 @@
|
|||||||
|
using System.Collections.Immutable;
|
||||||
|
|
||||||
namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
|
namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
@@ -17,7 +19,10 @@ namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
|
|||||||
/// <para>
|
/// <para>
|
||||||
/// <see cref="Comments"/> is append-only; comments + ack/confirm user identities
|
/// <see cref="Comments"/> is append-only; comments + ack/confirm user identities
|
||||||
/// are the audit surface regulators consume. The engine never rewrites past
|
/// are the audit surface regulators consume. The engine never rewrites past
|
||||||
/// entries.
|
/// entries. The runtime type is <see cref="ImmutableList{AlarmComment}"/> so
|
||||||
|
/// each append is O(log n) rather than the O(n) copy a plain
|
||||||
|
/// <c>IReadOnlyList<AlarmComment></c> would force on every audit-producing
|
||||||
|
/// transition. (Core.ScriptedAlarms-008)
|
||||||
/// </para>
|
/// </para>
|
||||||
/// </remarks>
|
/// </remarks>
|
||||||
public sealed record AlarmConditionState(
|
public sealed record AlarmConditionState(
|
||||||
@@ -36,7 +41,7 @@ public sealed record AlarmConditionState(
|
|||||||
DateTime? LastConfirmUtc,
|
DateTime? LastConfirmUtc,
|
||||||
string? LastConfirmUser,
|
string? LastConfirmUser,
|
||||||
string? LastConfirmComment,
|
string? LastConfirmComment,
|
||||||
IReadOnlyList<AlarmComment> Comments)
|
ImmutableList<AlarmComment> Comments)
|
||||||
{
|
{
|
||||||
/// <summary>Initial-load state for a newly registered alarm — everything in the "no-event" position.</summary>
|
/// <summary>Initial-load state for a newly registered alarm — everything in the "no-event" position.</summary>
|
||||||
public static AlarmConditionState Fresh(string alarmId, DateTime nowUtc) => new(
|
public static AlarmConditionState Fresh(string alarmId, DateTime nowUtc) => new(
|
||||||
@@ -55,7 +60,7 @@ public sealed record AlarmConditionState(
|
|||||||
LastConfirmUtc: null,
|
LastConfirmUtc: null,
|
||||||
LastConfirmUser: null,
|
LastConfirmUser: null,
|
||||||
LastConfirmComment: null,
|
LastConfirmComment: null,
|
||||||
Comments: []);
|
Comments: ImmutableList<AlarmComment>.Empty);
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
|
|||||||
@@ -33,6 +33,16 @@ public static class MessageTemplate
|
|||||||
/// has a non-Good <see cref="DataValueSnapshot.StatusCode"/> or a null
|
/// has a non-Good <see cref="DataValueSnapshot.StatusCode"/> or a null
|
||||||
/// <see cref="DataValueSnapshot.Value"/> resolve to <c>{?}</c>.
|
/// <see cref="DataValueSnapshot.Value"/> resolve to <c>{?}</c>.
|
||||||
/// </summary>
|
/// </summary>
|
||||||
|
/// <remarks>
|
||||||
|
/// Quality bar is intentionally <em>stricter</em> than predicate evaluation:
|
||||||
|
/// only Good (StatusCode == 0) is substituted; Uncertain renders as
|
||||||
|
/// <c>{?}</c>. The predicate gate (<c>ScriptedAlarmEngine.AreInputsReady</c>)
|
||||||
|
/// accepts Uncertain because it still carries a value the predicate can
|
||||||
|
/// inspect, but the operator-facing message must make doubt explicit rather
|
||||||
|
/// than substituting a value an operator might act on. See the
|
||||||
|
/// "Input-quality policy" section in <c>docs/ScriptedAlarms.md</c>.
|
||||||
|
/// (Core.ScriptedAlarms-010)
|
||||||
|
/// </remarks>
|
||||||
public static string Resolve(string template, Func<string, DataValueSnapshot?> resolveTag)
|
public static string Resolve(string template, Func<string, DataValueSnapshot?> resolveTag)
|
||||||
{
|
{
|
||||||
if (string.IsNullOrEmpty(template)) return template ?? string.Empty;
|
if (string.IsNullOrEmpty(template)) return template ?? string.Empty;
|
||||||
|
|||||||
@@ -1,3 +1,5 @@
|
|||||||
|
using System.Collections.Immutable;
|
||||||
|
|
||||||
namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
|
namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
@@ -258,21 +260,33 @@ public static class Part9StateMachine
|
|||||||
return s.UnshelveAtUtc is DateTime t && nowUtc >= t ? ShelvingState.Unshelved : s;
|
return s.UnshelveAtUtc is DateTime t && nowUtc >= t ? ShelvingState.Unshelved : s;
|
||||||
}
|
}
|
||||||
|
|
||||||
private static IReadOnlyList<AlarmComment> AppendComment(
|
private static ImmutableList<AlarmComment> AppendComment(
|
||||||
IReadOnlyList<AlarmComment> existing, DateTime ts, string user, string kind, string? text)
|
ImmutableList<AlarmComment> existing, DateTime ts, string user, string kind, string? text)
|
||||||
{
|
=> existing.Add(new AlarmComment(ts, user, kind, text ?? string.Empty));
|
||||||
var list = new List<AlarmComment>(existing.Count + 1);
|
|
||||||
list.AddRange(existing);
|
|
||||||
list.Add(new AlarmComment(ts, user, kind, text ?? string.Empty));
|
|
||||||
return list;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>Result of a state-machine operation — new state + what to emit (if anything).</summary>
|
/// <summary>Result of a state-machine operation — new state + what to emit (if anything).</summary>
|
||||||
public sealed record TransitionResult(AlarmConditionState State, EmissionKind Emission)
|
/// <remarks>
|
||||||
|
/// <para>
|
||||||
|
/// <see cref="NoOpReason"/> carries a short diagnostic string for the
|
||||||
|
/// <see cref="NoOp(AlarmConditionState, string)"/> case (e.g.
|
||||||
|
/// "disabled — predicate result ignored", "already acknowledged"). The
|
||||||
|
/// engine logs this at debug level when a no-op result is observed, so
|
||||||
|
/// the class-level remarks on <see cref="Part9StateMachine"/> hold:
|
||||||
|
/// disabled-alarm and idempotent ack/confirm/shelve/unshelve
|
||||||
|
/// transitions do produce a diagnostic log line. Plain
|
||||||
|
/// <see cref="None(AlarmConditionState)"/> results (state unchanged,
|
||||||
|
/// no operator intent recorded — e.g. a predicate re-evaluation that
|
||||||
|
/// confirms the existing active state) leave <see cref="NoOpReason"/>
|
||||||
|
/// null because there is nothing to surface to an operator.
|
||||||
|
/// (Core.ScriptedAlarms-011)
|
||||||
|
/// </para>
|
||||||
|
/// </remarks>
|
||||||
|
public sealed record TransitionResult(AlarmConditionState State, EmissionKind Emission, string? NoOpReason = null)
|
||||||
{
|
{
|
||||||
public static TransitionResult None(AlarmConditionState state) => new(state, EmissionKind.None);
|
public static TransitionResult None(AlarmConditionState state) => new(state, EmissionKind.None);
|
||||||
public static TransitionResult NoOp(AlarmConditionState state, string reason) => new(state, EmissionKind.None);
|
public static TransitionResult NoOp(AlarmConditionState state, string reason)
|
||||||
|
=> new(state, EmissionKind.None, reason);
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>What kind of event, if any, the engine should emit after a transition.</summary>
|
/// <summary>What kind of event, if any, the engine should emit after a transition.</summary>
|
||||||
|
|||||||
@@ -59,6 +59,15 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
private bool _loaded;
|
private bool _loaded;
|
||||||
private bool _disposed;
|
private bool _disposed;
|
||||||
|
|
||||||
|
// Tracks fire-and-forget background work launched by OnUpstreamChange
|
||||||
|
// (ReevaluateAsync) and RunShelvingCheck (ShelvingCheckAsync). Dispose drains
|
||||||
|
// these so a re-evaluation in flight when shutdown begins finishes its
|
||||||
|
// SaveAsync before the engine returns control to the caller. The HashSet is
|
||||||
|
// accessed under its own lock — never under _evalGate — so registration /
|
||||||
|
// unregistration cannot deadlock against the gate. (Core.ScriptedAlarms-006)
|
||||||
|
private readonly HashSet<Task> _inFlight = [];
|
||||||
|
private readonly object _inFlightLock = new();
|
||||||
|
|
||||||
public ScriptedAlarmEngine(
|
public ScriptedAlarmEngine(
|
||||||
ITagUpstreamSource upstream,
|
ITagUpstreamSource upstream,
|
||||||
IAlarmStateStore store,
|
IAlarmStateStore store,
|
||||||
@@ -92,6 +101,7 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
if (_disposed) throw new ObjectDisposedException(nameof(ScriptedAlarmEngine));
|
if (_disposed) throw new ObjectDisposedException(nameof(ScriptedAlarmEngine));
|
||||||
if (definitions is null) throw new ArgumentNullException(nameof(definitions));
|
if (definitions is null) throw new ArgumentNullException(nameof(definitions));
|
||||||
|
|
||||||
|
var pending = new List<ScriptedAlarmEvent>(0);
|
||||||
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
|
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
|
||||||
try
|
try
|
||||||
{
|
{
|
||||||
@@ -157,11 +167,14 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
|
|
||||||
// Restore persisted state, falling back to Fresh where nothing was saved,
|
// Restore persisted state, falling back to Fresh where nothing was saved,
|
||||||
// then re-derive ActiveState from the current predicate per decision #14.
|
// then re-derive ActiveState from the current predicate per decision #14.
|
||||||
|
// Any predicate emissions queue into `pending` and fire after the gate
|
||||||
|
// is released — so a startup-recovery activation event can call back into
|
||||||
|
// the engine without deadlocking. (Core.ScriptedAlarms-003)
|
||||||
foreach (var (alarmId, state) in _alarms)
|
foreach (var (alarmId, state) in _alarms)
|
||||||
{
|
{
|
||||||
var persisted = await _store.LoadAsync(alarmId, ct).ConfigureAwait(false);
|
var persisted = await _store.LoadAsync(alarmId, ct).ConfigureAwait(false);
|
||||||
var seed = persisted ?? state.Condition;
|
var seed = persisted ?? state.Condition;
|
||||||
var afterPredicate = await EvaluatePredicateToStateAsync(state, seed, nowUtc: _clock(), ct)
|
var afterPredicate = await EvaluatePredicateToStateAsync(state, seed, nowUtc: _clock(), ct, pending)
|
||||||
.ConfigureAwait(false);
|
.ConfigureAwait(false);
|
||||||
_alarms[alarmId] = state with { Condition = afterPredicate };
|
_alarms[alarmId] = state with { Condition = afterPredicate };
|
||||||
await _store.SaveAsync(afterPredicate, ct).ConfigureAwait(false);
|
await _store.SaveAsync(afterPredicate, ct).ConfigureAwait(false);
|
||||||
@@ -192,6 +205,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
{
|
{
|
||||||
_evalGate.Release();
|
_evalGate.Release();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Fire any emissions collected during startup recovery OUTSIDE the gate so
|
||||||
|
// subscribers can re-enter the engine safely. (Core.ScriptedAlarms-003)
|
||||||
|
foreach (var evt in pending) FireEvent(evt);
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
@@ -234,6 +251,7 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
if (!_alarms.TryGetValue(alarmId, out var state))
|
if (!_alarms.TryGetValue(alarmId, out var state))
|
||||||
throw new ArgumentException($"Unknown alarm {alarmId}", nameof(alarmId));
|
throw new ArgumentException($"Unknown alarm {alarmId}", nameof(alarmId));
|
||||||
|
|
||||||
|
ScriptedAlarmEvent? pending = null;
|
||||||
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
|
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
|
||||||
try
|
try
|
||||||
{
|
{
|
||||||
@@ -244,27 +262,50 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
// the exception propagates to the caller. (Core.ScriptedAlarms-007)
|
// the exception propagates to the caller. (Core.ScriptedAlarms-007)
|
||||||
await _store.SaveAsync(result.State, ct).ConfigureAwait(false);
|
await _store.SaveAsync(result.State, ct).ConfigureAwait(false);
|
||||||
_alarms[alarmId] = state with { Condition = result.State };
|
_alarms[alarmId] = state with { Condition = result.State };
|
||||||
if (result.Emission != EmissionKind.None) EmitEvent(state, result.State, result.Emission);
|
// Build the emission event under the gate (it captures a coherent
|
||||||
|
// snapshot of state + message-template values) but defer the actual
|
||||||
|
// OnEvent dispatch until after Release() so a slow subscriber or a
|
||||||
|
// subscriber that re-enters the engine doesn't block / deadlock.
|
||||||
|
// (Core.ScriptedAlarms-003)
|
||||||
|
if (result.Emission != EmissionKind.None)
|
||||||
|
pending = BuildEmission(state, result.State, result.Emission);
|
||||||
|
else if (result.NoOpReason is { } reason)
|
||||||
|
{
|
||||||
|
// The Part9StateMachine remarks promise a diagnostic log line for
|
||||||
|
// disabled-alarm no-ops + idempotent ack/confirm/shelve/unshelve
|
||||||
|
// calls. We surface them at debug so they're available when
|
||||||
|
// investigating "why didn't my ack take effect?" without spamming
|
||||||
|
// the main info log. (Core.ScriptedAlarms-011)
|
||||||
|
state.Logger.Debug("Alarm {AlarmId} no-op transition: {Reason}", alarmId, reason);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
finally { _evalGate.Release(); }
|
finally { _evalGate.Release(); }
|
||||||
|
|
||||||
|
// OnEvent dispatch happens OUTSIDE _evalGate so subscribers can call back
|
||||||
|
// into the engine (e.g. AcknowledgeAsync from inside an Activated handler)
|
||||||
|
// without deadlocking against the non-reentrant SemaphoreSlim.
|
||||||
|
if (pending is not null) FireEvent(pending);
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
/// Upstream-change callback. Updates the value cache + enqueues predicate
|
/// Upstream-change callback. Updates the value cache + enqueues predicate
|
||||||
/// re-evaluation for every alarm referencing the changed path. Fire-and-forget
|
/// re-evaluation for every alarm referencing the changed path. Fire-and-forget
|
||||||
/// so driver-side dispatch isn't blocked.
|
/// so driver-side dispatch isn't blocked; the background task is tracked so
|
||||||
|
/// <see cref="Dispose"/> can drain it. (Core.ScriptedAlarms-006)
|
||||||
/// </summary>
|
/// </summary>
|
||||||
internal void OnUpstreamChange(string path, DataValueSnapshot value)
|
internal void OnUpstreamChange(string path, DataValueSnapshot value)
|
||||||
{
|
{
|
||||||
_valueCache[path] = value;
|
_valueCache[path] = value;
|
||||||
|
if (_disposed) return; // don't queue new work against a disposing engine
|
||||||
if (_alarmsReferencing.TryGetValue(path, out var alarmIds))
|
if (_alarmsReferencing.TryGetValue(path, out var alarmIds))
|
||||||
{
|
{
|
||||||
_ = ReevaluateAsync(alarmIds.ToArray(), CancellationToken.None);
|
TrackBackgroundTask(ReevaluateAsync(alarmIds.ToArray(), CancellationToken.None));
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
private async Task ReevaluateAsync(IReadOnlyList<string> alarmIds, CancellationToken ct)
|
private async Task ReevaluateAsync(IReadOnlyList<string> alarmIds, CancellationToken ct)
|
||||||
{
|
{
|
||||||
|
var pending = new List<ScriptedAlarmEvent>(0);
|
||||||
try
|
try
|
||||||
{
|
{
|
||||||
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
|
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
|
||||||
@@ -280,7 +321,7 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
{
|
{
|
||||||
if (!_alarms.TryGetValue(id, out var state)) continue;
|
if (!_alarms.TryGetValue(id, out var state)) continue;
|
||||||
var newState = await EvaluatePredicateToStateAsync(
|
var newState = await EvaluatePredicateToStateAsync(
|
||||||
state, state.Condition, _clock(), ct).ConfigureAwait(false);
|
state, state.Condition, _clock(), ct, pending).ConfigureAwait(false);
|
||||||
if (!ReferenceEquals(newState, state.Condition))
|
if (!ReferenceEquals(newState, state.Condition))
|
||||||
{
|
{
|
||||||
// Persist before updating in-memory so a store failure leaves
|
// Persist before updating in-memory so a store failure leaves
|
||||||
@@ -295,16 +336,23 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
catch (Exception ex)
|
catch (Exception ex)
|
||||||
{
|
{
|
||||||
_engineLogger.Error(ex, "ScriptedAlarmEngine reevaluate failed");
|
_engineLogger.Error(ex, "ScriptedAlarmEngine reevaluate failed");
|
||||||
|
return;
|
||||||
}
|
}
|
||||||
|
// Fire emissions OUTSIDE _evalGate so subscriber callbacks can re-enter
|
||||||
|
// the engine without deadlocking. (Core.ScriptedAlarms-003)
|
||||||
|
foreach (var evt in pending) FireEvent(evt);
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
/// Evaluate the predicate + apply the resulting state-machine transition.
|
/// Evaluate the predicate + apply the resulting state-machine transition.
|
||||||
/// Returns the new condition state. Emits the appropriate event if the
|
/// Returns the new condition state. If the transition produces an emission,
|
||||||
/// transition produces one.
|
/// appends it to <paramref name="pendingEmissions"/> so the caller can fire
|
||||||
|
/// them after releasing <c>_evalGate</c> — keeping subscriber callbacks
|
||||||
|
/// outside the gate. (Core.ScriptedAlarms-003)
|
||||||
/// </summary>
|
/// </summary>
|
||||||
private async Task<AlarmConditionState> EvaluatePredicateToStateAsync(
|
private async Task<AlarmConditionState> EvaluatePredicateToStateAsync(
|
||||||
AlarmState state, AlarmConditionState seed, DateTime nowUtc, CancellationToken ct)
|
AlarmState state, AlarmConditionState seed, DateTime nowUtc, CancellationToken ct,
|
||||||
|
List<ScriptedAlarmEvent>? pendingEmissions = null)
|
||||||
{
|
{
|
||||||
var inputs = BuildReadCache(state.Inputs);
|
var inputs = BuildReadCache(state.Inputs);
|
||||||
|
|
||||||
@@ -340,7 +388,14 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
|
|
||||||
var result = Part9StateMachine.ApplyPredicate(seed, predicateTrue, nowUtc);
|
var result = Part9StateMachine.ApplyPredicate(seed, predicateTrue, nowUtc);
|
||||||
if (result.Emission != EmissionKind.None)
|
if (result.Emission != EmissionKind.None)
|
||||||
EmitEvent(state, result.State, result.Emission);
|
{
|
||||||
|
var evt = BuildEmission(state, result.State, result.Emission);
|
||||||
|
if (evt is not null)
|
||||||
|
{
|
||||||
|
if (pendingEmissions is not null) pendingEmissions.Add(evt);
|
||||||
|
else FireEvent(evt); // LoadAsync path: no caller-supplied list, fire here.
|
||||||
|
}
|
||||||
|
}
|
||||||
return result.State;
|
return result.State;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -373,14 +428,24 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
private void EmitEvent(AlarmState state, AlarmConditionState condition, EmissionKind kind)
|
/// <summary>
|
||||||
|
/// Build (but do not fire) the <see cref="ScriptedAlarmEvent"/> for a
|
||||||
|
/// transition. Returns null for kinds that should not be published
|
||||||
|
/// (<see cref="EmissionKind.Suppressed"/> and
|
||||||
|
/// <see cref="EmissionKind.None"/>). Pure construction — called under
|
||||||
|
/// <c>_evalGate</c> so the message-template resolution uses a coherent
|
||||||
|
/// value-cache snapshot. The actual <see cref="OnEvent"/> dispatch is
|
||||||
|
/// done by <see cref="FireEvent(ScriptedAlarmEvent)"/> AFTER the gate is
|
||||||
|
/// released. (Core.ScriptedAlarms-003)
|
||||||
|
/// </summary>
|
||||||
|
private ScriptedAlarmEvent? BuildEmission(AlarmState state, AlarmConditionState condition, EmissionKind kind)
|
||||||
{
|
{
|
||||||
// Suppressed kind means shelving ate the emission — we don't fire for subscribers
|
// Suppressed kind means shelving ate the emission — we don't fire for subscribers
|
||||||
// but the state record still advanced so startup recovery reflects reality.
|
// but the state record still advanced so startup recovery reflects reality.
|
||||||
if (kind == EmissionKind.Suppressed || kind == EmissionKind.None) return;
|
if (kind == EmissionKind.Suppressed || kind == EmissionKind.None) return null;
|
||||||
|
|
||||||
var message = MessageTemplate.Resolve(state.Definition.MessageTemplate, TryLookup);
|
var message = MessageTemplate.Resolve(state.Definition.MessageTemplate, TryLookup);
|
||||||
var evt = new ScriptedAlarmEvent(
|
return new ScriptedAlarmEvent(
|
||||||
AlarmId: state.Definition.AlarmId,
|
AlarmId: state.Definition.AlarmId,
|
||||||
EquipmentPath: state.Definition.EquipmentPath,
|
EquipmentPath: state.Definition.EquipmentPath,
|
||||||
AlarmName: state.Definition.AlarmName,
|
AlarmName: state.Definition.AlarmName,
|
||||||
@@ -390,10 +455,22 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
Condition: condition,
|
Condition: condition,
|
||||||
Emission: kind,
|
Emission: kind,
|
||||||
TimestampUtc: _clock());
|
TimestampUtc: _clock());
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Invoke the <see cref="OnEvent"/> handler for a built emission. Must be
|
||||||
|
/// called OUTSIDE <c>_evalGate</c>: a slow subscriber would otherwise
|
||||||
|
/// block the gate for every other engine operation, and a subscriber
|
||||||
|
/// that re-enters the engine (e.g. calls AcknowledgeAsync) would
|
||||||
|
/// deadlock against the non-reentrant SemaphoreSlim.
|
||||||
|
/// (Core.ScriptedAlarms-003)
|
||||||
|
/// </summary>
|
||||||
|
private void FireEvent(ScriptedAlarmEvent evt)
|
||||||
|
{
|
||||||
try { OnEvent?.Invoke(this, evt); }
|
try { OnEvent?.Invoke(this, evt); }
|
||||||
catch (Exception ex)
|
catch (Exception ex)
|
||||||
{
|
{
|
||||||
_engineLogger.Warning(ex, "ScriptedAlarmEngine OnEvent subscriber threw for {AlarmId}", state.Definition.AlarmId);
|
_engineLogger.Warning(ex, "ScriptedAlarmEngine OnEvent subscriber threw for {AlarmId}", evt.AlarmId);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -404,7 +481,24 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
{
|
{
|
||||||
if (_disposed) return;
|
if (_disposed) return;
|
||||||
var ids = _alarms.Keys.ToArray();
|
var ids = _alarms.Keys.ToArray();
|
||||||
_ = ShelvingCheckAsync(ids, CancellationToken.None);
|
TrackBackgroundTask(ShelvingCheckAsync(ids, CancellationToken.None));
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Register a fire-and-forget task so <see cref="Dispose"/> can await it.
|
||||||
|
/// The task removes itself from the set on completion via a continuation.
|
||||||
|
/// (Core.ScriptedAlarms-006)
|
||||||
|
/// </summary>
|
||||||
|
private void TrackBackgroundTask(Task task)
|
||||||
|
{
|
||||||
|
lock (_inFlightLock) { _inFlight.Add(task); }
|
||||||
|
// Use ContinueWith with ExecuteSynchronously so the removal runs on the
|
||||||
|
// completing thread — avoids scheduler delay between completion and
|
||||||
|
// unregistration that would otherwise let Dispose see a stale set.
|
||||||
|
task.ContinueWith(t =>
|
||||||
|
{
|
||||||
|
lock (_inFlightLock) { _inFlight.Remove(t); }
|
||||||
|
}, CancellationToken.None, TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
@@ -416,6 +510,7 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
|
|
||||||
private async Task ShelvingCheckAsync(IReadOnlyList<string> alarmIds, CancellationToken ct)
|
private async Task ShelvingCheckAsync(IReadOnlyList<string> alarmIds, CancellationToken ct)
|
||||||
{
|
{
|
||||||
|
var pending = new List<ScriptedAlarmEvent>(0);
|
||||||
try
|
try
|
||||||
{
|
{
|
||||||
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
|
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
|
||||||
@@ -440,7 +535,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
await _store.SaveAsync(result.State, ct).ConfigureAwait(false);
|
await _store.SaveAsync(result.State, ct).ConfigureAwait(false);
|
||||||
_alarms[id] = state with { Condition = result.State };
|
_alarms[id] = state with { Condition = result.State };
|
||||||
if (result.Emission != EmissionKind.None)
|
if (result.Emission != EmissionKind.None)
|
||||||
EmitEvent(state, result.State, result.Emission);
|
{
|
||||||
|
var evt = BuildEmission(state, result.State, result.Emission);
|
||||||
|
if (evt is not null) pending.Add(evt);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -449,7 +547,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
catch (Exception ex)
|
catch (Exception ex)
|
||||||
{
|
{
|
||||||
_engineLogger.Warning(ex, "ScriptedAlarmEngine shelving-check failed");
|
_engineLogger.Warning(ex, "ScriptedAlarmEngine shelving-check failed");
|
||||||
|
return;
|
||||||
}
|
}
|
||||||
|
// Fire emissions OUTSIDE _evalGate. (Core.ScriptedAlarms-003)
|
||||||
|
foreach (var evt in pending) FireEvent(evt);
|
||||||
}
|
}
|
||||||
|
|
||||||
private void UnsubscribeFromUpstream()
|
private void UnsubscribeFromUpstream()
|
||||||
@@ -473,6 +574,28 @@ public sealed class ScriptedAlarmEngine : IDisposable
|
|||||||
_disposed = true;
|
_disposed = true;
|
||||||
_shelvingTimer?.Dispose();
|
_shelvingTimer?.Dispose();
|
||||||
UnsubscribeFromUpstream();
|
UnsubscribeFromUpstream();
|
||||||
|
|
||||||
|
// Drain any fire-and-forget background work (ReevaluateAsync from
|
||||||
|
// OnUpstreamChange + ShelvingCheckAsync from the 5s timer) that started
|
||||||
|
// before _disposed = true was visible. Without this, a SaveAsync in
|
||||||
|
// flight can outlive the engine and write to a (possibly disposed) store
|
||||||
|
// after Dispose() has returned. The tasks re-check _disposed after
|
||||||
|
// acquiring the gate and bail out, but the await still has to complete.
|
||||||
|
// (Core.ScriptedAlarms-006)
|
||||||
|
Task[] toAwait;
|
||||||
|
lock (_inFlightLock) { toAwait = [.. _inFlight]; }
|
||||||
|
if (toAwait.Length > 0)
|
||||||
|
{
|
||||||
|
try { Task.WhenAll(toAwait).GetAwaiter().GetResult(); }
|
||||||
|
catch (Exception ex)
|
||||||
|
{
|
||||||
|
// Background task failures already logged inside ReevaluateAsync /
|
||||||
|
// ShelvingCheckAsync; surface here at debug so a parent shutdown is
|
||||||
|
// not noisy. The key invariant is that the tasks have COMPLETED.
|
||||||
|
_engineLogger.Debug(ex, "ScriptedAlarmEngine background task threw during shutdown drain");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
// Do NOT clear _alarms here: Timer.Dispose() does not wait for in-flight callbacks,
|
// Do NOT clear _alarms here: Timer.Dispose() does not wait for in-flight callbacks,
|
||||||
// so a ShelvingCheckAsync or ReevaluateAsync can still be running inside _evalGate.
|
// so a ShelvingCheckAsync or ReevaluateAsync can still be running inside _evalGate.
|
||||||
// Those paths now re-check _disposed after acquiring the gate and bail out safely.
|
// Those paths now re-check _disposed after acquiring the gate and bail out safely.
|
||||||
|
|||||||
@@ -606,6 +606,253 @@ public sealed class ScriptedAlarmEngineTests
|
|||||||
"Uncertain-quality inputs are treated as ready — predicate evaluates");
|
"Uncertain-quality inputs are treated as ready — predicate evaluates");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
// Core.ScriptedAlarms-003: OnEvent emission must not block under _evalGate.
|
||||||
|
// (1) A slow subscriber must not block the gate for other alarms.
|
||||||
|
// (2) A subscriber that re-enters the engine (e.g. AcknowledgeAsync) must
|
||||||
|
// not deadlock against _evalGate. Both regressions are covered here.
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
[Fact]
|
||||||
|
public async Task OnEvent_subscriber_can_call_back_into_engine_without_deadlock(/* -003 */)
|
||||||
|
{
|
||||||
|
// Re-entrancy regression. When OnEvent emission was inside _evalGate, a
|
||||||
|
// subscriber that called an engine method (e.g. AcknowledgeAsync) hung
|
||||||
|
// forever because the non-reentrant SemaphoreSlim refused to re-grant
|
||||||
|
// the gate the dispatch path was still holding. After the fix, emission
|
||||||
|
// happens AFTER Release() so the subscriber's call acquires the gate
|
||||||
|
// cleanly and the operator-driven action completes.
|
||||||
|
var up = new FakeUpstream();
|
||||||
|
up.Set("Temp", 50);
|
||||||
|
var eng = Build(up, out _);
|
||||||
|
try
|
||||||
|
{
|
||||||
|
await eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
|
||||||
|
TestContext.Current.CancellationToken);
|
||||||
|
|
||||||
|
// Subscriber re-enters the engine via Task.Run so the OnEvent
|
||||||
|
// dispatch thread is not blocked while waiting. Either way, with
|
||||||
|
// the fix in place AcknowledgeAsync must acquire _evalGate (the
|
||||||
|
// dispatch path released it before invoking the subscriber) and
|
||||||
|
// complete in well under the timeout.
|
||||||
|
var ackDone = new TaskCompletionSource();
|
||||||
|
eng.OnEvent += (_, e) =>
|
||||||
|
{
|
||||||
|
if (e.Emission != EmissionKind.Activated) return;
|
||||||
|
_ = Task.Run(async () =>
|
||||||
|
{
|
||||||
|
try
|
||||||
|
{
|
||||||
|
await eng.AcknowledgeAsync(e.AlarmId, "sub", null, CancellationToken.None);
|
||||||
|
ackDone.TrySetResult();
|
||||||
|
}
|
||||||
|
catch (Exception ex) { ackDone.TrySetException(ex); }
|
||||||
|
});
|
||||||
|
};
|
||||||
|
|
||||||
|
up.Push("Temp", 150);
|
||||||
|
|
||||||
|
var winner = await Task.WhenAny(ackDone.Task, Task.Delay(TimeSpan.FromSeconds(3)));
|
||||||
|
winner.ShouldBe(ackDone.Task,
|
||||||
|
"subscriber re-entering the engine must not deadlock against _evalGate");
|
||||||
|
await ackDone.Task; // surface any inner exception
|
||||||
|
eng.GetState("HighTemp")!.Acked.ShouldBe(AlarmAckedState.Acknowledged);
|
||||||
|
}
|
||||||
|
finally
|
||||||
|
{
|
||||||
|
eng.Dispose();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void OnEvent_emission_happens_outside_evalGate(/* -003 */)
|
||||||
|
{
|
||||||
|
// Direct white-box check on the gate-release ordering: AcknowledgeAsync
|
||||||
|
// emits the Acknowledged event AFTER releasing the gate. We assert that
|
||||||
|
// by observing the gate is acquirable from inside the subscriber.
|
||||||
|
// SemaphoreSlim.Wait(0) returns true only if the count > 0 (gate free).
|
||||||
|
var up = new FakeUpstream();
|
||||||
|
up.Set("Temp", 50);
|
||||||
|
var eng = Build(up, out _);
|
||||||
|
try
|
||||||
|
{
|
||||||
|
eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
|
||||||
|
TestContext.Current.CancellationToken).GetAwaiter().GetResult();
|
||||||
|
// Drive to Active so Acknowledge has something to ack.
|
||||||
|
up.Push("Temp", 150);
|
||||||
|
// Use the same WaitForAsync that other tests use — synchronously
|
||||||
|
// here since this is a non-async test.
|
||||||
|
for (var i = 0; i < 80 && eng.GetState("HighTemp")!.Active != AlarmActiveState.Active; i++)
|
||||||
|
Thread.Sleep(25);
|
||||||
|
eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Active);
|
||||||
|
|
||||||
|
// Use reflection to peek at _evalGate so the subscriber can probe it.
|
||||||
|
var gateField = typeof(ScriptedAlarmEngine).GetField(
|
||||||
|
"_evalGate", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
|
||||||
|
gateField.ShouldNotBeNull();
|
||||||
|
var gate = (SemaphoreSlim)gateField.GetValue(eng)!;
|
||||||
|
|
||||||
|
var gateFreeInsideEmission = false;
|
||||||
|
eng.OnEvent += (_, e) =>
|
||||||
|
{
|
||||||
|
if (e.Emission != EmissionKind.Acknowledged) return;
|
||||||
|
// SemaphoreSlim.Wait(0) — non-blocking try-take. If the gate is
|
||||||
|
// free we acquire it (count back to 0); release immediately.
|
||||||
|
if (gate.Wait(0))
|
||||||
|
{
|
||||||
|
gateFreeInsideEmission = true;
|
||||||
|
gate.Release();
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
eng.AcknowledgeAsync("HighTemp", "alice", null, CancellationToken.None)
|
||||||
|
.GetAwaiter().GetResult();
|
||||||
|
|
||||||
|
gateFreeInsideEmission.ShouldBeTrue(
|
||||||
|
"_evalGate must be released before OnEvent fires so subscribers " +
|
||||||
|
"can call back into the engine without deadlocking");
|
||||||
|
}
|
||||||
|
finally
|
||||||
|
{
|
||||||
|
eng.Dispose();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
// Core.ScriptedAlarms-006: Dispose must drain in-flight background tasks
|
||||||
|
// launched by OnUpstreamChange / RunShelvingCheck. Otherwise a re-evaluation
|
||||||
|
// or shelving check started just before Dispose can keep running and write
|
||||||
|
// to a (possibly disposed) store after the engine has returned.
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
[Fact]
|
||||||
|
public async Task Dispose_drains_in_flight_reevaluation_tasks(/* -006 */)
|
||||||
|
{
|
||||||
|
var up = new FakeUpstream();
|
||||||
|
up.Set("Temp", 50);
|
||||||
|
var logger = new LoggerConfiguration().CreateLogger();
|
||||||
|
var slowStore = new BlockingSaveAlarmStateStore();
|
||||||
|
var eng = new ScriptedAlarmEngine(up, slowStore, new ScriptLoggerFactory(logger), logger);
|
||||||
|
await eng.LoadAsync([Alarm("A", """return (int)ctx.GetTag("Temp").Value > 100;""")],
|
||||||
|
TestContext.Current.CancellationToken);
|
||||||
|
|
||||||
|
// Block the NEXT save (the one triggered by the push below).
|
||||||
|
var saveGate = new TaskCompletionSource();
|
||||||
|
slowStore.BlockNextSave = saveGate;
|
||||||
|
|
||||||
|
// Trigger a re-evaluation that will go inside _evalGate and call SaveAsync.
|
||||||
|
up.Push("Temp", 150);
|
||||||
|
|
||||||
|
// Wait until the store's SaveAsync is actually blocked.
|
||||||
|
await WaitForAsync(() => slowStore.SaveInProgress, timeoutMs: 1000);
|
||||||
|
|
||||||
|
// Dispose must wait for the in-flight reevaluation to complete rather
|
||||||
|
// than returning while a background task still runs.
|
||||||
|
var disposeTask = Task.Run(() => eng.Dispose());
|
||||||
|
|
||||||
|
// Verify Dispose does NOT complete immediately — it should block waiting
|
||||||
|
// for the in-flight task. Without the -006 fix Dispose returns straight
|
||||||
|
// away and the background reevaluation can outlive the engine.
|
||||||
|
var prematureFinish = await Task.WhenAny(disposeTask, Task.Delay(200));
|
||||||
|
prematureFinish.ShouldNotBe(disposeTask,
|
||||||
|
"Dispose must block until in-flight background tasks complete");
|
||||||
|
|
||||||
|
// Let the save complete and verify Dispose then returns.
|
||||||
|
saveGate.SetResult();
|
||||||
|
await disposeTask.WaitAsync(TimeSpan.FromSeconds(3), TestContext.Current.CancellationToken);
|
||||||
|
slowStore.SaveInProgress.ShouldBeFalse("background task drained before Dispose returned");
|
||||||
|
}
|
||||||
|
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
// Core.ScriptedAlarms-010: predicate evaluation and message-template
|
||||||
|
// resolution apply different quality bars on purpose. Predicate evaluation
|
||||||
|
// accepts Uncertain (the predicate can still inspect the value); message
|
||||||
|
// resolution renders Uncertain as "{?}" so the operator sees the doubt
|
||||||
|
// explicitly. The two policies are documented in docs/ScriptedAlarms.md.
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
[Fact]
|
||||||
|
public async Task Uncertain_quality_drives_predicate_but_renders_question_mark_in_message(/* -010 */)
|
||||||
|
{
|
||||||
|
var up = new FakeUpstream();
|
||||||
|
// Seed with Uncertain quality (severity bit 30 set, bit 31 clear).
|
||||||
|
up.Set("Temp", 150, statusCode: 0x40000000u);
|
||||||
|
using var eng = Build(up, out _);
|
||||||
|
await eng.LoadAsync([
|
||||||
|
new ScriptedAlarmDefinition(
|
||||||
|
"HighTemp", "Plant/Line1", "HighTemp",
|
||||||
|
AlarmKind.LimitAlarm, AlarmSeverity.High,
|
||||||
|
"Temp {Temp} exceeded limit",
|
||||||
|
"""return (int)ctx.GetTag("Temp").Value > 100;"""),
|
||||||
|
], TestContext.Current.CancellationToken);
|
||||||
|
|
||||||
|
// Predicate evaluated (Uncertain treated as ready) → alarm Active.
|
||||||
|
eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Active,
|
||||||
|
"AreInputsReady accepts Uncertain so the predicate runs");
|
||||||
|
|
||||||
|
// But the resolved emission message must show "{?}" for the Uncertain
|
||||||
|
// tag — only Good substitutes into the operator-facing message.
|
||||||
|
var events = new List<ScriptedAlarmEvent>();
|
||||||
|
eng.OnEvent += (_, e) => events.Add(e);
|
||||||
|
up.Push("Temp", 200, statusCode: 0x40000000u); // still Uncertain
|
||||||
|
// Trigger another evaluation to get an emission (already active, so
|
||||||
|
// we need a clear → re-activate cycle). Easier: force the same path
|
||||||
|
// through a comment which emits a CommentAdded message. But comments
|
||||||
|
// don't run the template. Instead clear it then re-activate.
|
||||||
|
up.Push("Temp", 50, statusCode: 0u); // Good, predicate becomes false
|
||||||
|
await WaitForAsync(() => events.Any(e => e.Emission == EmissionKind.Cleared));
|
||||||
|
events.Clear();
|
||||||
|
up.Push("Temp", 200, statusCode: 0x40000000u); // Uncertain, predicate true
|
||||||
|
await WaitForAsync(() => events.Any(e => e.Emission == EmissionKind.Activated));
|
||||||
|
|
||||||
|
// The Activated message must show {?} for the Uncertain input.
|
||||||
|
events.Single(e => e.Emission == EmissionKind.Activated).Message
|
||||||
|
.ShouldBe("Temp {?} exceeded limit",
|
||||||
|
"MessageTemplate.Resolve renders non-Good StatusCode as {?} " +
|
||||||
|
"even though predicate evaluation accepted the Uncertain value");
|
||||||
|
}
|
||||||
|
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
// Core.ScriptedAlarms-008: switch Comments to ImmutableList for O(log n)
|
||||||
|
// append. The persisted runtime type must be ImmutableList<AlarmComment>
|
||||||
|
// (which still satisfies IReadOnlyList<AlarmComment> for existing
|
||||||
|
// consumers).
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
[Fact]
|
||||||
|
public async Task Comments_collection_uses_ImmutableList_for_efficient_append(/* -008 */)
|
||||||
|
{
|
||||||
|
var up = new FakeUpstream();
|
||||||
|
up.Set("Temp", 50);
|
||||||
|
using var eng = Build(up, out _);
|
||||||
|
await eng.LoadAsync([Alarm("A", "return false;")], TestContext.Current.CancellationToken);
|
||||||
|
|
||||||
|
// Add a comment so AppendComment runs.
|
||||||
|
await eng.AddCommentAsync("A", "alice", "note", TestContext.Current.CancellationToken);
|
||||||
|
|
||||||
|
var s = eng.GetState("A")!;
|
||||||
|
s.Comments.ShouldBeOfType<System.Collections.Immutable.ImmutableList<AlarmComment>>(
|
||||||
|
"Comments should be an ImmutableList so append is O(log n), not O(n)");
|
||||||
|
}
|
||||||
|
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
// Core.ScriptedAlarms-011: TransitionResult.NoOp's reason parameter must be
|
||||||
|
// propagated, not silently discarded. The class-level remarks promise a
|
||||||
|
// diagnostic log line for no-op disabled-alarm evaluations.
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
[Fact]
|
||||||
|
public void TransitionResult_NoOp_propagates_reason(/* -011 */)
|
||||||
|
{
|
||||||
|
var fresh = AlarmConditionState.Fresh("a-1", DateTime.UtcNow);
|
||||||
|
var r = TransitionResult.NoOp(fresh, "disabled — predicate result ignored");
|
||||||
|
r.NoOpReason.ShouldBe("disabled — predicate result ignored",
|
||||||
|
"NoOp reason must be preserved on the TransitionResult so callers can log it");
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void TransitionResult_None_carries_no_reason(/* -011 */)
|
||||||
|
{
|
||||||
|
var fresh = AlarmConditionState.Fresh("a-1", DateTime.UtcNow);
|
||||||
|
var r = TransitionResult.None(fresh);
|
||||||
|
r.NoOpReason.ShouldBeNull("None() factory has no reason — only NoOp() carries one");
|
||||||
|
}
|
||||||
|
|
||||||
private static async Task WaitForAsync(Func<bool> cond, int timeoutMs = 2000)
|
private static async Task WaitForAsync(Func<bool> cond, int timeoutMs = 2000)
|
||||||
{
|
{
|
||||||
var deadline = DateTime.UtcNow.AddMilliseconds(timeoutMs);
|
var deadline = DateTime.UtcNow.AddMilliseconds(timeoutMs);
|
||||||
@@ -645,4 +892,37 @@ public sealed class ScriptedAlarmEngineTests
|
|||||||
public Task RemoveAsync(string alarmId, CancellationToken ct)
|
public Task RemoveAsync(string alarmId, CancellationToken ct)
|
||||||
=> _inner.RemoveAsync(alarmId, ct);
|
=> _inner.RemoveAsync(alarmId, ct);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// A store whose SaveAsync can be made to block until the test signals it.
|
||||||
|
/// Used to verify Dispose drains in-flight background tasks (finding -006).
|
||||||
|
/// </summary>
|
||||||
|
private sealed class BlockingSaveAlarmStateStore : IAlarmStateStore
|
||||||
|
{
|
||||||
|
private readonly InMemoryAlarmStateStore _inner = new();
|
||||||
|
public TaskCompletionSource? BlockNextSave { get; set; }
|
||||||
|
public bool SaveInProgress { get; private set; }
|
||||||
|
|
||||||
|
public Task<AlarmConditionState?> LoadAsync(string alarmId, CancellationToken ct)
|
||||||
|
=> _inner.LoadAsync(alarmId, ct);
|
||||||
|
|
||||||
|
public Task<IReadOnlyList<AlarmConditionState>> LoadAllAsync(CancellationToken ct)
|
||||||
|
=> _inner.LoadAllAsync(ct);
|
||||||
|
|
||||||
|
public async Task SaveAsync(AlarmConditionState state, CancellationToken ct)
|
||||||
|
{
|
||||||
|
var gate = BlockNextSave;
|
||||||
|
if (gate is not null)
|
||||||
|
{
|
||||||
|
BlockNextSave = null;
|
||||||
|
SaveInProgress = true;
|
||||||
|
try { await gate.Task.WaitAsync(ct).ConfigureAwait(false); }
|
||||||
|
finally { SaveInProgress = false; }
|
||||||
|
}
|
||||||
|
await _inner.SaveAsync(state, ct).ConfigureAwait(false);
|
||||||
|
}
|
||||||
|
|
||||||
|
public Task RemoveAsync(string alarmId, CancellationToken ct)
|
||||||
|
=> _inner.RemoveAsync(alarmId, ct);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
Reference in New Issue
Block a user