feat(alarms): HiLo trigger type with per-band level, hysteresis, messages, overrides

Adds a new HiLo alarm trigger type with four configurable setpoints
(LoLo / Lo / Hi / HiHi). Each setpoint carries an optional priority,
deadband (for hysteresis), and operator message. The site runtime emits
AlarmStateChanged with an AlarmLevel field so consumers can differentiate
warning vs critical bands.

Plumbing:
  - new AlarmLevel enum + AlarmStateChanged.Level/Message init properties
  - AlarmTriggerEditor (Blazor) gets a HiLo render with severity tinting
  - AlarmTriggerConfigCodec extracted from the editor for testability
  - sitestream.proto carries level + message over gRPC
  - SemanticValidator enforces numeric attribute, setpoint ordering,
    non-negative deadband
  - on-trigger scripts get an Alarm global (Name/Level/Priority/Message)
    so notification routing can branch by severity
  - per-instance InstanceAlarmOverride entity + EF migration + flattening
    step + CLI commands; HiLo overrides merge setpoint-by-setpoint, binary
    types whole-replace
  - DebugView shows a Level badge + per-band message tooltip
  - App.razor auto-reloads on permanent Blazor circuit failure
  - docker/regen-proto.sh automates the proto regen workflow (the linux/arm64
    protoc segfault means generated files are checked in for now)
This commit is contained in:
Joseph Doherty
2026-05-13 03:23:32 -04:00
parent 783da8e21a
commit 751248feb6
46 changed files with 4693 additions and 204 deletions

View File

@@ -32,7 +32,8 @@ Commons must define shared primitive and utility types used across multiple comp
- **`InstanceState` enum**: Enabled, Disabled.
- **`DeploymentStatus` enum**: Pending, InProgress, Success, Failed.
- **`AlarmState` enum**: Active, Normal.
- **`AlarmTriggerType` enum**: ValueMatch, RangeViolation, RateOfChange.
- **`AlarmLevel` enum**: None, Low, LowLow, High, HighHigh. Severity level for an active alarm; always `None` for binary trigger types, set by `HiLo` triggers.
- **`AlarmTriggerType` enum**: ValueMatch, RangeViolation, RateOfChange, HiLo.
- **`ConnectionHealth` enum**: Connected, Disconnected, Connecting, Error.
Types defined here must be immutable and thread-safe.

View File

@@ -176,19 +176,20 @@ When the Instance Actor is stopped (due to disable, delete, or redeployment), Ak
### Alarm Evaluation
- Subscribes to attribute change notifications from its parent Instance Actor for the attribute(s) referenced by its trigger definition.
- On each value update, evaluates the trigger condition:
- **Value Match**: Incoming value equals the predefined target.
- **Value Match**: Incoming value equals the predefined target. Supports `"!=X"` prefix for not-equals semantics.
- **Range Violation**: Value is outside the allowed min/max range.
- **Rate of Change**: Value change rate exceeds the defined threshold over time.
- When the condition is met and the alarm is currently in **normal** state, the alarm transitions to **active**:
- **Rate of Change**: Value change rate exceeds the defined threshold over a configurable time window. Direction filter (rising / falling / either) restricts which side of the rate triggers.
- **HiLo**: Multi-setpoint level alarm with up to four configurable setpoints (LoLo, Lo, Hi, HiHi). Any subset may be configured. Each setpoint may carry its own priority that overrides the alarm-level priority for that band.
- For binary trigger types (ValueMatch / RangeViolation / RateOfChange), when the condition is met and the alarm is currently in **normal** state, the alarm transitions to **active**:
- Updates the alarm state on the parent Instance Actor (which publishes to the Akka stream).
- If an on-trigger script is defined, spawns an Alarm Execution Actor to execute it.
- When the condition clears and the alarm is in **active** state, the alarm transitions to **normal**:
- Updates the alarm state on the parent Instance Actor.
- No script execution on clear.
- When the condition clears and the alarm is in **active** state, the alarm transitions to **normal**.
- For HiLo triggers, the actor tracks the current `AlarmLevel` (None / Low / LowLow / High / HighHigh). Each level transition emits a fresh `AlarmStateChanged` with the new level and its priority; level escalations (e.g., High → HighHigh) and de-escalations (HighHigh → High) both produce events. The on-trigger script fires only on the Normal → non-None edge, not on escalations between alarm bands.
- No script execution on clear in any trigger type.
### Alarm State
- Held **in memory** only — not persisted to SQLite.
- On restart (or failover), alarm states are re-evaluated from incoming values. All alarms start in normal state and transition to active when conditions are detected.
- Held **in memory** only — not persisted to SQLite. State comprises `AlarmState` (Active / Normal) and `AlarmLevel` (None for binary triggers; the active band for HiLo).
- On restart (or failover), alarm states are re-evaluated from incoming values. All alarms start in normal state with level None and transition based on incoming values.
### Alarm Execution Actor
- **Short-lived** child actor created when an on-trigger script needs to execute.

View File

@@ -106,16 +106,18 @@ Each alarm has:
- **Priority Level**: Numeric value from 01000.
- **Lock Flag**: Controls whether the alarm can be overridden downstream.
- **Trigger Definition**: One of the following trigger types:
- **Value Match**: Triggers when a monitored attribute equals a predefined value.
- **Value Match**: Triggers when a monitored attribute equals a predefined value. Supports a `!=X` prefix on the match value for not-equals semantics.
- **Range Violation**: Triggers when a monitored attribute value falls outside an allowed range.
- **Rate of Change**: Triggers when a monitored attribute value changes faster than a defined threshold.
- **Rate of Change**: Triggers when a monitored attribute value changes faster than a defined threshold over a configurable time window. A direction filter (rising / falling / either) restricts which side of the rate triggers.
- **HiLo**: Multi-setpoint level alarm. Any subset of four setpoints (LoLo, Lo, Hi, HiHi) may be configured. The most severe matching band wins (LoLo/HiHi outrank Lo/Hi). Each setpoint may carry its own priority that overrides the alarm-level priority for that band.
- **On-Trigger Script** *(optional)*: A script to execute when the alarm triggers. The alarm on-trigger script executes in the context of the instance and can call instance scripts, but instance scripts **cannot** call alarm on-trigger scripts. The call direction is one-way.
### 3.4.1 Alarm State
- Alarm state (active/normal) is **managed at the site level** per instance, held **in memory** by the Alarm Actor.
- Active alarms additionally carry an **alarm level**: `None` for binary trigger types (ValueMatch, RangeViolation, RateOfChange); one of `Low`, `LowLow`, `High`, `HighHigh` for HiLo triggers based on which setpoint the monitored attribute has crossed. Level transitions within an active HiLo alarm (e.g., High → HighHigh) emit fresh state-change events without re-running the on-trigger script — the script only fires on the Normal → non-None edge.
- When the alarm condition clears, the alarm **automatically returns to normal state** — no acknowledgment workflow is required.
- Alarm state is **not persisted** — on restart, alarm states are re-evaluated from incoming values.
- Alarm state changes are published to the site-wide Akka stream as `[InstanceUniqueName].[AlarmName]`, alarm state (active/normal), priority, timestamp.
- Alarm state changes are published to the site-wide Akka stream as `[InstanceUniqueName].[AlarmName]`, alarm state (active/normal), alarm level, priority, timestamp.
### 3.5 Template Relationships