docs: native alarm ingestion across component docs + CLAUDE.md

Joseph Doherty
2026-05-31 02:55:00 -04:00
parent 2b7c765a58
commit 003e54c1fb
9 changed files with 265 additions and 6 deletions
+1
@@ -99,6 +99,7 @@ Related repos cloned as sibling directories under `~/Desktop/` — referenced fo
- All timestamps are UTC throughout the system.
- Inter-cluster communication uses two transports: ClusterClient for command/control (deployments, lifecycle, subscribe/unsubscribe handshake, snapshots) and gRPC server-streaming for real-time data (attribute values, alarm states). Both CentralCommunicationActor and SiteCommunicationActor registered with receptionist. Central creates one ClusterClient per site using NodeA/NodeB as contact points. Sites configure multiple central contact points for failover. Addresses cached in CentralCommunicationActor, refreshed periodically (60s) and on admin changes. Heartbeats serve health monitoring only.
- gRPC streaming channel: SiteStreamGrpcServer on each site node (Kestrel HTTP/2, port 8083); central creates per-site SiteStreamGrpcClient via SiteStreamGrpcClientFactory. Site entity has GrpcNodeAAddress/GrpcNodeBAddress fields. Proto: sitestream.proto with SiteStreamService, SiteStreamEvent (oneof: AttributeValueUpdate, AlarmStateUpdate). DebugStreamEvent message removed (no longer flows through ClusterClient).
- Native alarms: a read-only mirror of native alarms from OPC UA Alarms & Conditions servers and the MxAccess Gateway, unified onto an A&C-style condition model (`AlarmConditionState`: orthogonal Active/Acked/Confirmed/Shelved/Suppressed + 0-1000 severity) plus an `AlarmKind` discriminator (Computed/NativeOpcUa/NativeMxAccess). New DCL capability seam `IAlarmSubscribableConnection` (implemented by the OPC UA and MxGateway adapters); the `DataConnectionActor` opens ONE alarm feed per connection and routes transitions to instances by source-object reference. A `NativeAlarmActor` (peer to the computed `AlarmActor` under `InstanceActor`) mirrors one source binding: snapshot atomic-swap on (re)subscribe, retention (drops once inactive+acked), per-source cap, and site SQLite persistence (`native_alarm_state`; survives failover, cleared on redeploy/undeploy, mirroring static overrides). State streams to central over the gRPC `AlarmStateUpdate` (the existing computed `AlarmStateChanged` was enriched additively) and seeds via the DebugView snapshot. Authoring: `TemplateNativeAlarmSource` / `InstanceNativeAlarmSourceOverride` entities flatten to `ResolvedNativeAlarmSource` (inherit/compose/override); management commands + ManagementActor handlers + CLI (`template/instance native-alarm-source`) + Central UI (template editor tab + instance override panel) + enriched DebugView alarm table. Read-only: no ack-back; no central tables.
### External Integrations
- External System Gateway: HTTP/REST only, JSON serialization, API key + Basic Auth.
+1 -1
@@ -42,7 +42,7 @@ Both stacks share the infrastructure services in [`infra/`](infra/) (MS SQL, LDA
| # | Component | Document | Description |
|---|-----------|----------|-------------|
| 1 | Template Engine | [docs/requirements/Component-TemplateEngine.md](docs/requirements/Component-TemplateEngine.md) | Template modeling, inheritance, composition, path-qualified member addressing, override granularity, locking, alarms, flattening, semantic validation, revision hashing, diff calculation, and folder organization (nested folders, drag-drop). |
| 1 | Template Engine | [docs/requirements/Component-TemplateEngine.md](docs/requirements/Component-TemplateEngine.md) | Template modeling, inheritance, composition, path-qualified member addressing, override granularity, locking, alarms, native alarm source bindings, flattening, semantic validation, revision hashing, diff calculation, and folder organization (nested folders, drag-drop). |
| 2 | Deployment Manager | [docs/requirements/Component-DeploymentManager.md](docs/requirements/Component-DeploymentManager.md) | Central-side deployment pipeline with deployment ID/idempotency, per-instance operation lock, state transition matrix, all-or-nothing site apply, system-wide artifact deployment with per-site status. |
| 3 | Site Runtime | [docs/requirements/Component-SiteRuntime.md](docs/requirements/Component-SiteRuntime.md) | Site-side actor hierarchy with explicit supervision strategies, staggered startup, script trust model (constrained APIs), Tell/Ask conventions, concurrency serialization, and site-wide Akka stream with per-subscriber backpressure. |
| 4 | Data Connection Layer | [docs/requirements/Component-DataConnectionLayer.md](docs/requirements/Component-DataConnectionLayer.md) | Common data connection interface (OPC UA, MxGateway, custom), Become/Stash connection actor model, auto-reconnect, immediate bad quality on disconnect, transparent re-subscribe, synchronous write failures, tag path resolution retry, protocol-agnostic address-space browse. |
+43
@@ -73,6 +73,9 @@ scadabridge template script update --id <id> [--name <name>] [--code <code>] [--
scadabridge template script delete --id <id>
scadabridge template composition add --template-id <id> --instance-name <name> --composed-template-id <id>
scadabridge template composition delete --template-id <id> --instance-name <name>
scadabridge template native-alarm-source add --template-id <id> --name <name> --connection <name> --source-ref <ref> [--filter <expr>] [--description <desc>] [--locked]
scadabridge template native-alarm-source list --template-id <id>
scadabridge template native-alarm-source remove --id <id>
```
### Instance Commands
@@ -85,6 +88,8 @@ scadabridge instance set-overrides --id <id> --overrides <json>
scadabridge instance alarm-override set --instance-id <id> --alarm <name> [--trigger-config <json>] [--priority <n>]
scadabridge instance alarm-override delete --instance-id <id> --alarm <name>
scadabridge instance alarm-override list --instance-id <id>
scadabridge instance native-alarm-source set --instance-id <id> --source <name> [--connection <name>] [--source-ref <ref>] [--filter <expr>]
scadabridge instance native-alarm-source clear --instance-id <id> --source <name>
scadabridge instance set-area --id <id> [--area-id <id>]
scadabridge instance diff --id <id>
scadabridge instance deploy --id <id>
@@ -97,6 +102,44 @@ scadabridge instance delete --id <id>
`[["Speed", 5], ["Mode", 7]]`. `--overrides` is a JSON object of attribute name to
value, e.g. `{"Speed": "100", "Mode": null}`.
### Native Alarm Source Commands
The `native-alarm-source` subcommands manage the **read-only native alarm mirror**:
alarms surfaced from an alarm-capable data connection rather than evaluated by the
ScadaBridge alarm engine. Native alarm sources are declared on a template and may be
overridden per instance. The subcommands map to management commands that resolve via
`ManagementCommandRegistry`:
- `--connection` names an alarm-capable data connection (**OPC UA** or **MxGateway**).
- `--source-ref` is the connection-specific reference: an **OPC UA `SourceNode` nodeId**
or an **MxAccess object/area**.
- `--filter` is an optional connection-specific filter expression that narrows the
mirrored alarm set.
**Template-level** (defines the inherited native alarm sources):
| CLI command | Management command | Required role |
|-------------|--------------------|---------------|
| `template native-alarm-source add` | `AddTemplateNativeAlarmSourceCommand` | Design |
| `template native-alarm-source list` | `ListTemplateNativeAlarmSourcesCommand` | — |
| `template native-alarm-source remove` | `DeleteTemplateNativeAlarmSourceCommand` | Design |
`add` takes `--name`, `--connection`, and `--source-ref` (required), plus optional
`--filter`, `--description`, and `--locked` (a flag that prevents instance-level
override). `remove` targets a single native alarm source by its own `--id`.
**Instance-level** (per-instance overrides of an inherited source; upsert semantics):
| CLI command | Management command | Required role |
|-------------|--------------------|---------------|
| `instance native-alarm-source set` | `SetInstanceNativeAlarmSourceOverrideCommand` | Deployment |
| `instance native-alarm-source clear` | `DeleteInstanceNativeAlarmSourceOverrideCommand` | Deployment |
`set` is an **upsert** keyed by `--instance-id` and `--source` (the inherited source
name): a blank/omitted `--connection`, `--source-ref`, or `--filter` keeps the
inherited value, so only the supplied options are overridden. `clear` removes the
override and **reverts the instance to the inherited template value**.
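
The upsert and clear semantics above can be illustrated with a small sketch. It is written in Python purely for illustration and is not the project's implementation; the class names `NativeAlarmSource` and `SourceOverride`, the `resolve` helper, and the sample values are hypothetical, and the only behavior taken from the text is that a blank or omitted override field keeps the inherited template value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NativeAlarmSource:
    """Hypothetical template-level binding; fields follow the CLI options."""
    name: str
    connection: str
    source_ref: str
    filter: Optional[str] = None


@dataclass(frozen=True)
class SourceOverride:
    """Hypothetical instance-level override; None or blank means 'inherit'."""
    connection: Optional[str] = None
    source_ref: Optional[str] = None
    filter: Optional[str] = None


def _pick(override_value: Optional[str], inherited: Optional[str]) -> Optional[str]:
    # A blank or omitted override keeps the inherited template value.
    if override_value is None or override_value.strip() == "":
        return inherited
    return override_value


def resolve(template: NativeAlarmSource,
            override: Optional[SourceOverride]) -> NativeAlarmSource:
    # No override row (for example after `clear`) means the template binding applies as-is.
    if override is None:
        return template
    return NativeAlarmSource(
        name=template.name,
        connection=_pick(override.connection, template.connection),
        source_ref=_pick(override.source_ref, template.source_ref),
        filter=_pick(override.filter, template.filter),
    )
```

In this sketch, `resolve(template, SourceOverride(source_ref="Tank01"))` changes only the source reference, and `resolve(template, None)` returns the template binding unchanged.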
### Site Commands
```
scadabridge site list
+24
@@ -45,6 +45,13 @@ Central cluster only. Sites have no user interface.
- Manage template hierarchy (inheritance) — visual tree of parent/child relationships.
- Manage composition — add/remove feature module instances within templates. **Naming collision detection** provides immediate feedback if composed modules introduce duplicate attribute, alarm, or script names.
- Define and edit attributes, alarms, and scripts on templates.
- **Native Alarms tab** (`TemplateEdit`): a tab alongside Attributes / Alarms / Scripts / Compositions that lists the template's **native alarm source bindings** — the OPC UA Alarms & Conditions / MxAccess Gateway sources whose alarm state the instance mirrors. Each binding carries Name, Connection, Source Reference, optional Condition Filter, Description, and a Lock flag. Add / edit / delete go through a **modal**:
- **Name** — unique within the template (lock/inherit bookkeeping mirrors `TemplateAlarm`).
- **Connection** — a dropdown filtered to **alarm-capable connections only** (OPC UA and MxGateway protocols).
- **Source Reference** — the native key (OPC UA SourceNode / notifier nodeId, or MxAccess object/area).
- **Condition Filter** (optional) — blank mirrors *all* conditions under the source.
- **Description** (optional) and **Lock** (prevents instance-level override, like locked alarms/attributes).
- CRUD is **repository-direct** (Blazor Server runs in-process against `ICentralUiRepository`); no Akka round-trip is needed for design-time authoring.
- Set lock flags on attributes, alarms, and scripts.
- Visual indicator showing inherited vs. locally defined vs. overridden members.
- **On-demand validation**: A "Validate" action allows Design users to run comprehensive pre-deployment validation (flattening, naming collisions, script compilation, trigger references) without triggering a deployment. Provides early feedback during authoring.
@@ -97,6 +104,11 @@ Central cluster only. Sites have no user interface.
- **Override** — optional per-attribute OPC UA node id (or other protocol address). When set, replaces the template's `DataSourceReference` at flattening time; when blank, the template default is used. The greyed placeholder shows the template default for context.
- **Browse…** — opens the OPC UA Tag Browser dialog, populated live from the site's OPC UA server via `BrowseOpcUaNodeCommand`. Visible only when the row's connection uses the OPC UA protocol; disabled until a connection is picked on that row. The dialog lazy-loads the address space, supports manual node-id entry as a fallback, and remains usable when the site or its OPC UA session is offline (the manual-paste field stays active even on error).
- Set instance-level attribute overrides (non-locked attributes only).
- **Native Alarm Source Overrides card** (`InstanceConfigure`): a card placed **after the Alarm Overrides card**, listing the template's native alarm sources for per-instance binding. Each row offers **inline override** of the three fields that typically vary per physical instance:
- **Connection** — a dropdown (same alarm-capable filtering as the template editor).
- **Source Reference** — the concrete native key for this instance.
- **Filter** — the per-instance condition filter.
- A **blank field inherits** the template default (the greyed placeholder shows the inherited value for context, mirroring the per-attribute Override field). **Save** and **Clear** act per row — Save persists the row's overrides, Clear reverts the row to the template-inherited binding. Locked template sources are not overridable.
- Filter/search instances by site, area, template, or status.
- **Disable** instances — stops data collection, script triggers, and alarm evaluation at the site while retaining the deployed configuration.
- **Enable** instances — re-activates a disabled instance.
@@ -127,6 +139,18 @@ Central cluster only. Sites have no user interface.
- Stream includes attribute values formatted as `[InstanceUniqueName].[AttributePath].[AttributeName]` and alarm states formatted as `[InstanceUniqueName].[AlarmName]`.
- Subscribe-on-demand — stream starts when opened, stops when closed.
#### Alarm Table (Computed + Native)
The DebugView alarm table is the **only** runtime surface for native OPC UA Alarms & Conditions and MxAccess Gateway alarms (no dedicated operator/alarm-summary page). Native alarms are a **read-only mirror** of source-reported state — the source system owns the alarm lifecycle (ack / shelve / suppress), so the table never offers ack-back or any command action. Both enriched `AlarmStateChanged` events (live, via the gRPC stream) and the initial `DebugViewSnapshot` (via ClusterClient) carry the unified alarm shape, so native alarms appear on the first paint and update in place. The table is a custom Blazor + Bootstrap component (no third-party grid).
- **Kind column** — a badge distinguishing **Computed** alarms from native ones (an **OPC UA** or **MxAccess** badge), driven by the event's `AlarmKind` discriminator.
- **Sev column**: the unified **0-1000 severity** (`AlarmConditionState.Severity`) shown for every row. Computed rows surface their integer priority on the same scale.
- **Source reference subtitle** — for native rows, the `SourceReference` (e.g. `Tank01.Level.HiHi`) renders as a **monospace subtitle under the alarm name**. Computed rows have no subtitle and render exactly as before this change.
- **State cell composite badges** — the orthogonal condition sub-states roll up into badges shown beside the active/normal state: **Unacked**, **Shelved**, and **Suppressed** appear only when the corresponding `AlarmConditionState` flag is set. Computed alarms are auto-acked and never shelved/suppressed, so they show none of these.
- **Row tooltip** — hovering a row surfaces the native metadata that does not warrant its own column: alarm type (`AlarmTypeName`), category, operator user and comment (source-supplied ack metadata, display-only), original raise time, and the current/limit value.
- **Filter** — the existing alarm filter additionally matches the native `SourceReference` (in addition to the alarm name), so operators can find a mirrored condition by its source path.
- **Computed alarms render unchanged** — no Kind badge styling change, no subtitle, no new state badges beyond what the unified model implies; the enrichment is purely additive for native rows.
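
As a rough illustration of the state-cell roll-up described above, the following Python sketch derives the badge list from the condition flags. It is not the Blazor component; `ConditionFlags` and `state_badges` are hypothetical names, the "Active"/"Normal" labels are assumed, and shelving is simplified to a single boolean.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConditionFlags:
    """Simplified stand-in for AlarmConditionState (hypothetical shape)."""
    active: bool
    acknowledged: bool
    shelved: bool
    suppressed: bool


def state_badges(flags: ConditionFlags) -> List[str]:
    # The primary badge reflects the active/normal state (labels assumed).
    badges = ["Active" if flags.active else "Normal"]
    # Sub-state badges appear only when the corresponding condition applies.
    if not flags.acknowledged:
        badges.append("Unacked")
    if flags.shelved:
        badges.append("Shelved")
    if flags.suppressed:
        badges.append("Suppressed")
    return badges
```

A computed alarm, which the text describes as auto-acked and never shelved or suppressed, yields only the primary badge in this sketch.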
### Parked Message Management (Deployment Role)
- Query sites for parked messages (external system calls, cached DB writes). (Parked notifications are managed centrally on the Notification Outbox page, not here.)
- View message details (target, payload, retry count, timestamps).
@@ -87,6 +87,32 @@ The streaming protocol is defined in `sitestream.proto` (`src/ZB.MOM.WW.ScadaBri
- The `oneof event` pattern is extensible — future event types (health metrics, connection state changes) are added as new fields without breaking existing consumers.
- Proto field numbers are never reused. Old clients ignore unknown `oneof` variants.
#### Enriched AlarmStateUpdate (Native Alarm Mirror)
`AlarmStateUpdate` carries alarm state for all three alarm kinds (Computed, native OPC UA, and native MxAccess Gateway) to central over the **existing gRPC real-time stream**, with no new transport and no command/control round-trip. The message was extended **additively**: existing fields 1-7 are unchanged, and fields 8-21 carry the enriched native-alarm state. Old clients that only read fields 1-7 continue to work; new fields are populated only where the source provides them.
| Field | # | Type | Meaning |
|-------|---|------|---------|
| `kind` | 8 | string | Alarm origin: `Computed`, `NativeOpcUa`, or `NativeMxAccess`. |
| `active` | 9 | bool | Alarm condition is active. |
| `acknowledged` | 10 | bool | Alarm has been acknowledged. |
| `confirmed` | 11 | bool | Alarm has been confirmed. The domain `Confirmed` (`bool?`) collapses to a definite bool on the wire. |
| `shelve_state` | 12 | string | `Unshelved`, `OneShotShelved`, `TimedShelved`, or `PermanentShelved`. |
| `suppressed` | 13 | bool | Alarm is suppressed by the source system. |
| `source_reference` | 14 | string | Source node / tag reference. |
| `alarm_type_name` | 15 | string | Native alarm type name. |
| `category` | 16 | string | Alarm category. |
| `operator_user` | 17 | string | User who last acted on the alarm. |
| `operator_comment` | 18 | string | Operator comment from the last action. |
| `original_raise_time` | 19 | Timestamp | First-raise time of the underlying condition (nullable on the wire). |
| `current_value` | 20 | string | Current process value associated with the alarm. |
| `limit_value` | 21 | string | Limit / setpoint value that the alarm evaluates against. |
- **Server-side mapping (`StreamRelayActor.HandleAlarmStateChanged`)**: maps the enriched domain `AlarmStateChanged` event — `Kind` + `AlarmConditionState` + native metadata — out to the proto `AlarmStateUpdate`. The nullable `original_raise_time` is emitted only when present, and `shelve_state` is mapped from the domain shelve enum to its wire string via a new **`AlarmShelveStateCodec`** (string↔enum, defaulting to `Unshelved`). The domain `Confirmed` (`bool?`) is collapsed to a definite bool for field 11.
- **Client-side mapping (`SiteStreamGrpcClient.ConvertToDomainEvent`)**: reconstructs the domain `AlarmStateChanged` from the proto: `Kind` is parsed via `ParseAlarmKind`, the `Condition` is rebuilt with `severity` taken from the existing wire `priority`, and native metadata is repopulated from fields 8-21, so central-side consumers receive the same domain event the site emitted.
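
The string-to-enum mapping with an `Unshelved` default can be sketched as follows. This is an illustrative Python model, not the actual `AlarmShelveStateCodec`; the function names are hypothetical. The `collapse_confirmed` helper additionally assumes that a null `Confirmed` maps to `false` on the wire, which the text does not state.

```python
from enum import Enum
from typing import Optional


class ShelveState(Enum):
    """Wire strings as listed in the field table for `shelve_state`."""
    UNSHELVED = "Unshelved"
    ONE_SHOT_SHELVED = "OneShotShelved"
    TIMED_SHELVED = "TimedShelved"
    PERMANENT_SHELVED = "PermanentShelved"


def encode_shelve(state: ShelveState) -> str:
    # Enum to wire string.
    return state.value


def decode_shelve(wire: Optional[str]) -> ShelveState:
    # Unknown, empty, or missing strings fall back to Unshelved.
    try:
        return ShelveState(wire)
    except ValueError:
        return ShelveState.UNSHELVED


def collapse_confirmed(confirmed: Optional[bool]) -> bool:
    # ASSUMPTION: a null (not confirmable) value is sent as False.
    return bool(confirmed)
```

Defaulting on decode means an older site that never populates `shelve_state` still produces a valid domain value at central.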
> **Regeneration is manual (macOS-only).** `sitestream.proto` is **not** auto-compiled: the `<Protobuf>` include is commented out in the `.csproj`, and the generated C# is **vendored** under `SiteStreamGrpc/`. To regenerate after editing the proto: toggle the `<Protobuf>` include on, build so `Grpc.Tools` regenerates the C#, copy the generated files into `SiteStreamGrpc/`, then re-comment the include. Adding fields 8-21 followed this process.
#### gRPC Connection Keepalive
Three layers of dead-client detection prevent orphan streams on site nodes:
@@ -32,10 +32,12 @@ The configuration database stores all central system data, organized by domain a
- **TemplateFolders**: Hierarchical organizational folders for templates (`Id`, `Name`, nullable `ParentFolderId` self-reference, `SortOrder`). Unique index on `(ParentFolderId, Name)` enforces case-insensitive sibling uniqueness. Folders are UI-only — they have no effect on template resolution or flattening.
- **Template Attributes**: Attribute definitions per template (name, value, data type, lock flag, description, data source reference).
- **Template Alarms**: Alarm definitions per template (name, description, priority, lock flag, trigger type, trigger configuration, on-trigger script reference).
- **Native Alarm Sources** (`NativeAlarmSources`): Native alarm source bindings per template: alarms produced by the underlying data source rather than evaluated by the Site Runtime. Columns: `Name`, `Description`, `ConnectionName` (the data connection that surfaces the native alarms), `SourceReference` (the source-side address/path the native alarms are read from), and `ConditionFilter` (optional filter narrowing which native alarm conditions are subscribed). FK `TemplateId` → `Templates` with cascade delete; unique index on `(TemplateId, Name)`.
- **Template Scripts**: Script definitions per template (name, lock flag, C# source code, trigger type, trigger configuration, minimum time between runs, parameter definitions, return value definitions).
- **Template Compositions**: Feature module composition relationships (composing template, composed template, module instance name).
- **Instances**: Instance definitions (template reference, site reference, area reference, enabled/disabled state).
- **Instance Attribute Overrides**: Per-instance attribute value overrides.
- **Instance Native Alarm Source Overrides** (`InstanceNativeAlarmSourceOverrides`): Per-instance overrides for native alarm sources, keyed by the source's path-qualified canonical name. Columns: `SourceCanonicalName` (required, sized to fit composed `[ModuleInstanceName].[SourceName]` paths) and the nullable override fields `ConnectionNameOverride`, `SourceReferenceOverride`, and `ConditionFilterOverride` (a null override leaves the template value in effect). FK `InstanceId` → `Instances` with cascade delete; unique index on `(InstanceId, SourceCanonicalName)`.
- **Instance Connection Bindings**: Per-attribute data connection binding for each instance.
- **Areas**: Hierarchical area definitions per site (name, parent area reference, site reference).
@@ -90,13 +92,24 @@ A single `ScadaBridgeDbContext` (or a small number of bounded DbContexts if warr
- Configures relationships, indexes, constraints, and value conversions.
- Provides `SaveChangesAsync()` as the unit-of-work commit mechanism.
Each entity's Fluent mapping lives in its own `IEntityTypeConfiguration<T>` class under `Configurations/`, and `OnModelCreating` registers them all with `modelBuilder.ApplyConfigurationsFromAssembly(...)` — so a new mapping is picked up simply by adding its configuration class to the assembly.
#### Native Alarm Source Mappings
The native alarm source feature adds two EF-mapped entities (POCOs in Commons, Fluent mappings here), each exposed as a `DbSet` on `ScadaBridgeDbContext`:
- **`TemplateNativeAlarmSource`** → table `NativeAlarmSources` (`DbSet<TemplateNativeAlarmSource> TemplateNativeAlarmSources`). `Name` required (≤200), `Description` (≤2000), `ConnectionName` required (≤200), `SourceReference` required (≤1000), `ConditionFilter` (≤1000). Unique index `(TemplateId, Name)`. Owned by `Template` via `TemplateConfiguration` (`HasMany(t => t.NativeAlarmSources)` on FK `TemplateId`, `OnDelete: Cascade`) — deleting a template removes its native alarm sources.
- **`InstanceNativeAlarmSourceOverride`** → table `InstanceNativeAlarmSourceOverrides` (`DbSet<InstanceNativeAlarmSourceOverride> InstanceNativeAlarmSourceOverrides`). `SourceCanonicalName` required (≤400, wider than plain names so it can hold composed paths), `ConnectionNameOverride` (≤200), `SourceReferenceOverride` (≤1000), `ConditionFilterOverride` (≤1000). Unique index `(InstanceId, SourceCanonicalName)`. Owned by `Instance` via `InstanceConfiguration` (`HasMany(i => i.NativeAlarmSourceOverrides)` on FK `InstanceId`, `OnDelete: Cascade`).
Both mappings follow the same shape as the existing alarm definitions and alarm overrides: dedicated configuration classes that auto-register through `ApplyConfigurationsFromAssembly`, cascade-delete from their owning aggregate root, and a composite unique index that enforces name uniqueness within the owner.
### Per-Component Repository Implementations
Repository interfaces are defined in **Commons** alongside the POCO entity classes (see Component-Commons.md, REQ-COM-4). This component provides the **EF Core implementations** of those interfaces.
| Repository Interface (in Commons) | Consuming Component | Scope |
|---|---|---|
| `ITemplateEngineRepository` | Template Engine | Templates, attributes, alarms, scripts, compositions, instances, overrides, connection bindings, areas |
| `ITemplateEngineRepository` | Template Engine | Templates, attributes, alarms, native alarm sources, scripts, compositions, instances, overrides (including native alarm source overrides), connection bindings, areas |
| `IDeploymentManagerRepository` | Deployment Manager | Current deployment status per instance, deployed configuration snapshots, system-wide artifact deployment status per site (no deployment history — audit log provides historical traceability) |
| `ISecurityRepository` | Security & Auth | LDAP group mappings, site scoping rules |
| `IInboundApiRepository` | Inbound API | API keys, API method definitions |
@@ -109,6 +122,15 @@ Repository interfaces are defined in **Commons** alongside the POCO entity class
Each implementation class uses the DbContext internally and works with the POCO entity classes from Commons. Consuming components depend only on Commons (for interfaces and entities) — they never reference this component or EF Core directly. The DI container in the Host wires the implementations to the interfaces.
#### Native Alarm Source Repository Methods
`ITemplateEngineRepository` (implemented by `TemplateEngineRepository`) gains CRUD for both native alarm source entities, mirroring the existing alarm-override methods one-for-one:
- **Template side**: `GetTemplateNativeAlarmSourceByIdAsync`, `GetNativeAlarmSourcesByTemplateIdAsync`, `AddTemplateNativeAlarmSourceAsync`, `UpdateTemplateNativeAlarmSourceAsync`, `DeleteTemplateNativeAlarmSourceAsync`.
- **Instance side**: `GetNativeAlarmSourceOverridesByInstanceIdAsync`, `GetNativeAlarmSourceOverrideAsync(instanceId, sourceCanonicalName)`, `AddInstanceNativeAlarmSourceOverrideAsync`, `UpdateInstanceNativeAlarmSourceOverrideAsync`, `DeleteInstanceNativeAlarmSourceOverrideAsync`.
The aggregate loaders are extended to eager-load the new children so a template or instance is returned fully populated: `GetTemplateWithChildrenAsync` (and the other template loaders) `.Include(t => t.NativeAlarmSources)`, and the instance-with-children loader `.Include(i => i.NativeAlarmSourceOverrides)` alongside its existing attribute, alarm, and connection-binding includes. Consistent with every other repository method here, the `Add`/`Update`/`Delete` operations only **stage** changes on the DbContext — the caller commits them by invoking `SaveChangesAsync()` (typically together with the matching `IAuditService.LogAsync()` call in one transaction).
### Unit of Work
EF Core's DbContext naturally provides unit-of-work semantics:
@@ -248,6 +270,7 @@ A CI grep guard fails the build on any occurrence of `UPDATE … AuditLog` or `D
- Schema changes are managed via EF Core Migrations (`dotnet ef migrations add`, `dotnet ef migrations script`).
- Each migration is a versioned, incremental schema change.
- New tables are introduced as their own migration — for example, the `Notifications` table for the Notification Outbox ships as a dedicated EF Core migration that creates the table, its `Type`/`Status` value conversions, and its dispatcher and KPI indexes.
- The native alarm source tables ship in a dedicated `AddNativeAlarmSources` migration, parallel in shape to the existing `AddInstanceAlarmOverrides` migration: it creates `NativeAlarmSources` and `InstanceNativeAlarmSourceOverrides` with their columns, the `TemplateId` → `Templates` and `InstanceId` → `Instances` cascade-delete foreign keys, and the `(TemplateId, Name)` / `(InstanceId, SourceCanonicalName)` unique indexes.
- The initial `AuditLog` migration creates the monthly partition function `pf_AuditLog_Month` and partition scheme `ps_AuditLog_Month`, then creates the `AuditLog` table aligned to that scheme on `OccurredAtUtc`, along with the indexes listed under Database Schema. The migration also creates the `scadabridge_audit_writer` and `scadabridge_audit_purger` DB roles with the grants described in Database Roles. The ongoing **partition-maintenance job** that rolls the scheme forward each month (creating the next month's partition ahead of time) and switches out expired partitions is owned by the **Audit Log component** (`AuditLogPurgeActor` and its monthly roll-forward step), not by the Configuration Database component — this component is responsible only for the initial schema, roles, and any EF migrations against the table going forward.
### Development Environment
@@ -173,6 +173,58 @@ DCL is a clean data pipe on the hot path. Browse is an **opt-in capability** for
- Browse runs against the live session; no caching at DCL.
- **Frame-size guard**: the reply crosses the site→central Akka frame (default 128 KB) on a temp Ask actor; an oversized reply is silently discarded by remoting, hanging the picker. The child handler caps each `BrowseNodeResult` to a byte budget (~100 KB) before replying, OR-ing the adapter's own truncation signal into `Truncated`. This is protocol-agnostic (every adapter's reply funnels through it). Per-protocol upstream caps narrow the window first: OPC UA requests at most 500 references per node (continuation point → `Truncated`); MxGateway relies on the gateway's `BrowseChildren` page cap. A `Truncated` level prompts manual node-id entry in the picker rather than auto-paging.
## Native Alarm Mirroring
Some data sources publish their own alarms — OPC UA **Alarms & Conditions** servers and the **MxAccess Gateway**. The DCL can mirror these native alarms into the Site Runtime as a **read-only** feed: ScadaBridge reflects source alarm state but never acknowledges, confirms, shelves, or otherwise writes back to the source. This complements (does not replace) ScadaBridge's own computed alarms; it feeds the Site Runtime's `NativeAlarmActor` peer subsystem.
Like browse, this is an **opt-in capability** for protocols that support it. It does not touch the hot value path — alarm transitions flow over a separate per-connection feed.
### Capability Seam
Mirroring is exposed via the optional `IAlarmSubscribableConnection` capability interface (in Commons), which an `IDataConnection` implementation **may also** implement (mirroring the `IBrowsableDataConnection` pattern; consumed by the `DataConnectionActor` only):
```
IAlarmSubscribableConnection
├── SubscribeAlarmsAsync(sourceReference, conditionFilter?, callback, ct) → subscriptionId
└── UnsubscribeAlarmsAsync(subscriptionId, ct) → void
```
The `AlarmTransitionCallback` delivers a protocol-neutral `NativeAlarmTransition` per transition. On every (re)subscribe the adapter replays a **snapshot** of currently-active conditions (`Snapshot…` records terminated by a `SnapshotComplete` sentinel) so consumers can reconcile state after a reconnect.
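
One way a consumer might reconcile against the replayed snapshot is to buffer `Snapshot` records and swap them in when `SnapshotComplete` arrives. The Python sketch below is illustrative only: `MirrorState` is a hypothetical name, a transition is reduced to a `(kind, source_reference, state)` tuple, and applying live transitions to both the visible and the pending set is an assumption rather than documented behavior.

```python
from typing import Dict, Optional, Tuple

# Simplified stand-in for NativeAlarmTransition: (kind, source_reference, state).
Transition = Tuple[str, str, str]


class MirrorState:
    """Hypothetical consumer-side mirror for one source binding."""

    def __init__(self) -> None:
        self.alarms: Dict[str, str] = {}                  # visible state
        self._pending: Optional[Dict[str, str]] = None    # snapshot being built

    def apply(self, transition: Transition) -> None:
        kind, ref, state = transition
        if kind == "Snapshot":
            # Buffer replayed conditions without touching the visible state.
            if self._pending is None:
                self._pending = {}
            self._pending[ref] = state
        elif kind == "SnapshotComplete":
            # Swap in one step; conditions absent from the snapshot disappear.
            self.alarms = self._pending or {}
            self._pending = None
        else:
            # Live transition (Raise, Acknowledge, Clear, ...).
            self.alarms[ref] = state
            if self._pending is not None:
                # ASSUMPTION: keep an in-progress snapshot current as well.
                self._pending[ref] = state
```

Buffering until the sentinel means readers never observe a half-replayed snapshot, and stale conditions that cleared while the connection was down are dropped at the swap.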
### Protocol Adapters
- **OPC UA** (`OpcUaDataConnection` + `RealOpcUaClient`): a single **event MonitoredItem** (`AttributeId = EventNotifier`) on the Server object, with an `EventFilter` selecting `EventType` / `SourceNode` / `Severity` plus the `ConditionType` / `AcknowledgeableConditionType` / `AlarmConditionType` state fields. `ConditionRefresh` is invoked on subscribe to replay active conditions as the snapshot. The OPC UA field → `NativeAlarmTransition` mapping is isolated in the pure helper `OpcUaAlarmMapper`, unit-testable without a live server.
- **MxGateway** (`MxGatewayDataConnection` + `RealMxGatewayClient`): mirrors over the gateway package's `StreamAlarmsAsync` — a resumable background stream whose reconnect re-sends a snapshot. The field mapping lives in `MxGatewayAlarmMapper`.
Other/custom protocols do not implement the capability; a subscribe request against such a connection is replied to with a failure (`SubscribeAlarmsResponse.Success = false`).
### Connection Actor Behavior
The `DataConnectionActor` opens **one alarm feed per connection** (not per subscriber) and routes incoming transitions to instance subscribers by **source-object reference** — a prefix match of the transition's `SourceObjectReference` (falling back to `SourceReference`) against each subscriber's registered `SourceReference`. Subscribers (the Site Runtime's `NativeAlarmActor` instances) are **ref-counted per source**, so the underlying feed is opened once and torn down only when the last subscriber for that source unsubscribes.
- **State gating**: `SubscribeAlarmsRequest` is handled only in the **Connected** state; requests arriving while **Connecting**/**Reconnecting** are stashed (standard Become/Stash) and processed on entering Connected.
- **Capability check**: if `_adapter is not IAlarmSubscribableConnection`, the actor replies `SubscribeAlarmsResponse(Success = false, ...)`.
- **Reconnect handling**: on entering **Reconnecting**, the actor pushes a `NativeAlarmSourceUnavailable` to every alarm subscriber (consumers mark mirrored alarms uncertain rather than clearing them). On successful reconnection it re-subscribes the feed; the adapter re-emits a snapshot, reconciling state.
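The routing and ref-counting rules above can be sketched in isolation. This is a minimal illustration, not the `DataConnectionActor` itself: the `AlarmSubscriberTable` class, its method names, and the use of plain string subscriber ids are all hypothetical, and the match direction (the subscriber's registered `SourceReference` is the prefix of the transition's reference) is an assumption read from the description.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: names and shapes are illustrative, not the real actor code.
public sealed class AlarmSubscriberTable
{
    // Registered SourceReference -> subscriber ids (the ref count is the set size).
    private readonly Dictionary<string, HashSet<string>> _bySource =
        new(StringComparer.Ordinal);

    // Returns true when this is the first subscriber for the source (open the feed).
    public bool Add(string sourceReference, string subscriberId)
    {
        if (!_bySource.TryGetValue(sourceReference, out var subs))
        {
            subs = new HashSet<string>(StringComparer.Ordinal);
            _bySource[sourceReference] = subs;
        }
        var wasEmpty = subs.Count == 0;
        subs.Add(subscriberId);
        return wasEmpty;
    }

    // Returns true when the last subscriber for the source left (tear the feed down).
    public bool Remove(string sourceReference, string subscriberId)
    {
        if (!_bySource.TryGetValue(sourceReference, out var subs)) return false;
        if (!subs.Remove(subscriberId)) return false;
        if (subs.Count > 0) return false;
        _bySource.Remove(sourceReference);
        return true;
    }

    // Prefix-match routing: SourceObjectReference wins, SourceReference is the fallback.
    public IReadOnlyList<string> Route(string? sourceObjectReference, string sourceReference)
    {
        var key = string.IsNullOrEmpty(sourceObjectReference)
            ? sourceReference
            : sourceObjectReference;
        return _bySource
            .Where(kv => key.StartsWith(kv.Key, StringComparison.Ordinal))
            .SelectMany(kv => kv.Value)
            .ToList();
    }
}
```

With `Add("Area1", "a")` and `Add("Area1", "b")` registered, a transition whose `SourceObjectReference` is `Area1.Pump3` would be routed to both subscribers, and only the second `Remove` would report that the feed can be torn down.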
### Protocol-Neutral Types & Messages
All defined in Commons so the feed is identical across protocols:
| Type | Shape |
|------|-------|
| `NativeAlarmTransition` | `SourceReference`, `SourceObjectReference`, `AlarmTypeName`, `Kind`, `Condition`, `Category`, `Description`, `Message`, `OperatorUser`, `OperatorComment`, `OriginalRaiseTime?`, `TransitionTime`, `CurrentValue`, `LimitValue` |
| `AlarmConditionState` | `Active`, `Acknowledged`, `Confirmed?` (null when not confirmable), `Shelve`, `Suppressed`, `Severity` (0–1000) |
| `AlarmTransitionKind` (enum) | `Snapshot`, `SnapshotComplete`, `Raise`, `Acknowledge`, `Clear`, `Retrigger`, `StateChange` |
`OperatorUser` / `OperatorComment` and `CurrentValue` / `LimitValue` are display-only mirrors from the source.
**Messages:**
- `SubscribeAlarmsRequest` / `SubscribeAlarmsResponse` — instance (via the DCL manager) subscribes a source binding to native alarms; the response carries success + an optional error message.
- `UnsubscribeAlarmsRequest` — cancels a native alarm subscription for an instance + source.
- `NativeAlarmTransitionUpdate(ConnectionName, Transition)` — DCL → instance: one routed transition (including snapshot replay).
- `NativeAlarmSourceUnavailable(ConnectionName, SourceReference, Timestamp)` — DCL → instance: the feed for a source became unavailable (connection lost).
## Value Update Message Format
Each value update delivered to an Instance Actor includes:
@@ -245,11 +297,13 @@ The DCL reports the following metrics to the Health Monitoring component via the
## Dependencies
- **Site Runtime (Instance Actors)**: Receives subscription registrations and delivers value updates. Receives write requests.
- **Site Runtime (NativeAlarmActor)**: For alarm-subscribable connections, receives `SubscribeAlarmsRequest`/`UnsubscribeAlarmsRequest` and delivers `NativeAlarmTransitionUpdate` / `NativeAlarmSourceUnavailable` (read-only native alarm mirroring).
- **Health Monitoring**: Reports connection status.
- **Site Event Logging**: Logs connection status changes.
## Interactions
- **Site Runtime (Instance Actors)**: Bidirectional — delivers value updates, receives subscription registrations and write-back commands.
- **Site Runtime (NativeAlarmActor)**: Bidirectional — receives alarm subscribe/unsubscribe requests, delivers native alarm transitions and source-unavailable notifications (read-only; no ack-back to the source).
- **Health Monitoring**: Reports connection health periodically.
- **Site Event Logging**: Logs connection/disconnection events.
+72 -3
@@ -34,9 +34,10 @@ Deployment Manager Singleton (Cluster Singleton)
│ │ └── Script Execution Actor — short-lived, per invocation
│ ├── Script Actor ("CalculateOEE") — coordinator
│ │ └── Script Execution Actor — short-lived, per invocation
│ ├── Alarm Actor ("OverTemp") — coordinator
│ ├── Alarm Actor ("OverTemp") — coordinator (computed)
│ │ └── Alarm Execution Actor — short-lived, per on-trigger invocation
│ └── Alarm Actor ("LowPressure") — coordinator
│ ├── Alarm Actor ("LowPressure") — coordinator (computed)
│ └── Native Alarm Actor ("OpcUaServer1") — read-only mirror, peer to Alarm Actor
├── Instance Actor ("MachineA-002")
│ └── ...
└── ...
@@ -204,6 +205,74 @@ When the Instance Actor is stopped (due to disable, delete, or redeployment), Ak
---
## Native Alarm Actor
### Role
- **Read-only mirror** of alarms raised natively by an external source — OPC UA Alarms & Conditions (A&C) servers and the MxAccess Gateway — surfaced into the Site Runtime alongside the alarms ScadaBridge computes itself.
- Created as a child of the **Instance Actor** and is a **peer to the computed Alarm Actor** (not a child of it). One `NativeAlarmActor` is spawned per resolved native alarm source binding on the instance.
- Mirrors source-of-truth condition state into the Instance Actor's view and onto the site-wide stream; it **does not** acknowledge, clear, or otherwise write back to the source. There is no ack-back path — the external source remains authoritative.
### Construction
- Constructed with `(ResolvedNativeAlarmSource source, string instanceName, IActorRef instanceActor, IActorRef dclManager, SiteStorageService storage, SiteRuntimeOptions options, ILogger logger, AlarmKind nativeKind = NativeOpcUa)`.
- `nativeKind` distinguishes the two native flavors and stamps the `Kind` on every emitted `AlarmStateChanged`. The Instance Actor selects it from the bound connection's protocol (see **Instance Actor wiring** below).
### Lifecycle & Subscription
- **PreStart**: rehydrates any previously mirrored conditions for this source from the site SQLite `native_alarm_state` table, then subscribes to the source through the Data Connection Layer by sending a `SubscribeAlarmsRequest` to the DCL manager. The DCL routes the subscription to the bound connection's `IAlarmSubscribableConnection` implementation.
- **Failed subscribe**: schedules a retry timer at `NativeAlarmRetryIntervalMs` and re-attempts until the subscription is established. Rehydrated state remains visible in the meantime.
- **`NativeAlarmSourceUnavailable`**: the source connection has dropped. The actor **retains its last-known mirrored conditions** but marks them uncertain rather than purging them, so a transient disconnect does not flap every condition to normal. The set is reconciled against truth by the next reconnect snapshot.
### Transition Handling (`NativeAlarmTransitionUpdate`)
- **Snapshot / SnapshotComplete (reconnect reconciliation)**: `Snapshot` updates buffer into a staging set; `SnapshotComplete` performs an **atomic swap** of the mirrored set with the staged set. Any condition that was previously mirrored but is **not present** in the new snapshot emits a return-to-normal `AlarmStateChanged` and drops out. This is how the mirror self-corrects after an outage.
- **Live transitions** (`Raise` / `Acknowledge` / `Clear` / `Retrigger` / `StateChange`): upsert the condition by `SourceReference`. Updates carrying a `TransitionTime` **older** than the currently held transition are ignored (out-of-order protection). Accepted transitions persist to SQLite and emit an enriched `AlarmStateChanged` upward to the Instance Actor.
- **Retention**: a mirrored condition is dropped once it is both inactive **and** acknowledged (`!Active && Acknowledged`) — the alarm has fully run its course at the source and no longer needs mirroring. The drop emits a final state change and deletes the SQLite row.
- **Per-source cap**: at most `MirroredAlarmCapPerSource` conditions are retained per source. When the cap is exceeded the **oldest** condition is dropped and the eviction is **logged** — there is no silent truncation.
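The four rules above (snapshot swap, out-of-order protection, retention, per-source cap) can be sketched as a plain in-memory class without any Akka plumbing. Everything here is an illustrative stand-in: `NativeAlarmMirror`, `ConditionFlags`, and `Transition` are hypothetical trimmed-down types, persistence and `AlarmStateChanged` emission are omitted, and "oldest" is assumed to mean the smallest `TransitionTime`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum AlarmTransitionKind { Snapshot, SnapshotComplete, Raise, Acknowledge, Clear, Retrigger, StateChange }

// Trimmed-down stand-ins for the Commons types; only the fields this sketch needs.
public sealed record ConditionFlags(bool Active, bool Acknowledged);
public sealed record Transition(string SourceReference, AlarmTransitionKind Kind,
                                ConditionFlags Condition, DateTime TransitionTime);

public sealed class NativeAlarmMirror
{
    private readonly int _cap;
    private Dictionary<string, Transition> _mirrored = new();
    private readonly Dictionary<string, Transition> _staging = new();

    public NativeAlarmMirror(int cap) => _cap = cap;

    public IReadOnlyCollection<string> Mirrored => _mirrored.Keys;

    // Returns the source references that dropped out of the mirror because of this update.
    public IReadOnlyList<string> Apply(Transition t)
    {
        switch (t.Kind)
        {
            case AlarmTransitionKind.Snapshot:
                _staging[t.SourceReference] = t;            // buffer only
                return Array.Empty<string>();

            case AlarmTransitionKind.SnapshotComplete:
                // Atomic swap: anything mirrored but absent from the snapshot returns to normal.
                var missing = _mirrored.Keys.Where(k => !_staging.ContainsKey(k)).ToList();
                _mirrored = new Dictionary<string, Transition>(_staging);
                _staging.Clear();
                return missing;

            default:
                return ApplyLive(t);
        }
    }

    private IReadOnlyList<string> ApplyLive(Transition t)
    {
        // Out-of-order protection: ignore updates older than the held transition.
        if (_mirrored.TryGetValue(t.SourceReference, out var held)
            && t.TransitionTime < held.TransitionTime)
            return Array.Empty<string>();

        var dropped = new List<string>();

        // Retention: inactive and acknowledged conditions leave the mirror.
        if (!t.Condition.Active && t.Condition.Acknowledged)
        {
            if (_mirrored.Remove(t.SourceReference)) dropped.Add(t.SourceReference);
            return dropped;
        }

        _mirrored[t.SourceReference] = t;

        // Per-source cap: evict the oldest entries (assumed: smallest TransitionTime).
        while (_mirrored.Count > _cap)
        {
            var oldest = _mirrored.Values.OrderBy(x => x.TransitionTime).First().SourceReference;
            _mirrored.Remove(oldest);
            dropped.Add(oldest);
        }
        return dropped;
    }
}
```

In the real actor each dropped reference would correspond to a final `AlarmStateChanged` plus a SQLite delete, and cap evictions would additionally be logged; the sketch only reports which references left the set.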
### Persistence
- Mirrored condition state is persisted to the site SQLite `native_alarm_state` table on every accepted transition and removed on drop-out.
- Persistence is **best-effort / fire-and-forget**: a persistence failure is logged but never blocks the actor's mailbox and never aborts the upward `AlarmStateChanged` emit. The in-memory mirror remains authoritative for the running actor; SQLite exists to survive failover.
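One way to get the best-effort behaviour described above is a small wrapper that awaits the write and swallows failures into a log callback. This is only a sketch under assumptions: `BestEffort.RunAsync` is a hypothetical helper, and the actual actor may use a different mechanism (for example piping the result back to itself as a message).

```csharp
using System;
using System.Threading.Tasks;

public static class BestEffort
{
    // Runs the write and reports any failure through `log`; the returned task never faults.
    // A caller that discards the task (`_ = BestEffort.RunAsync(...)`) is not blocked by the
    // write and can emit its state change immediately afterwards.
    public static async Task RunAsync(Func<Task> write, Action<Exception> log)
    {
        try
        {
            await write().ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            log(ex); // logged, never rethrown
        }
    }
}
```

The design point is that a persistence failure changes nothing about the in-memory mirror or the upward emit; it only produces a log entry.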
### Supervision & Restart
- Supervised by the Instance Actor under the same **OneForOneStrategy** as the computed Alarm Actor — a native source fault is isolated to its own actor.
- On site restart or failover, the actor rehydrates its mirror from `native_alarm_state` in PreStart, then reconciles against the source via the reconnect snapshot. Native mirror state therefore **survives failover** (unlike computed alarm state, which is re-evaluated from values).
- Mirrored native state **is cleared on redeploy/undeploy** of the instance (mirroring the static-override reset): the stale rows for the instance are removed and the fresh actor re-subscribes from a clean slate.
---
## Instance Actor — Native Alarm Wiring
The Instance Actor owns native-alarm setup alongside its computed Script and Alarm Actors:
- **Spawning**: for each entry in `_configuration.NativeAlarmSources`, the Instance Actor spawns a `NativeAlarmActor`. Spawning is **skipped when there is no DCL manager** (e.g., debug/test contexts with no data connections), since native alarms require a live source subscription.
- **Kind derivation**: the `AlarmKind` passed to each `NativeAlarmActor` is derived from the bound connection's protocol — `Mx*` protocols → `NativeMxAccess`, otherwise → `NativeOpcUa`.
- **Latest-event retention**: the Instance Actor retains the latest enriched `AlarmStateChanged` per alarm name in `_latestAlarmEvents`. The DebugView snapshot is built from this map so it carries the **unified condition view plus native metadata** for both computed and native alarms. Computed alarms that have not yet produced an event fall back to a **Normal projection** so the snapshot is complete.
- **Reset semantics**: `_latestAlarmEvents` and the mirrored native state are cleared on redeploy/undeploy (same trigger as static-override reset) but rehydrate from SQLite on failover.
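The kind-derivation rule above is small enough to show directly. The helper name `NativeAlarmKind.FromProtocol` is hypothetical, and treating the `Mx*` match as a case-sensitive prefix test is an assumption.

```csharp
using System;

public enum AlarmKind { Computed, NativeOpcUa, NativeMxAccess }

public static class NativeAlarmKind
{
    // `Mx*` protocols map to NativeMxAccess; every other protocol is treated as OPC UA.
    // Assumption: the match is an ordinal (case-sensitive) prefix test.
    public static AlarmKind FromProtocol(string? protocol) =>
        protocol is not null && protocol.StartsWith("Mx", StringComparison.Ordinal)
            ? AlarmKind.NativeMxAccess
            : AlarmKind.NativeOpcUa;
}
```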
---
## Native Alarm State Persistence (Site SQLite)
`SiteStorageService` gains a `native_alarm_state` table backing the native mirror:
- **Primary key**: `(instance_unique_name, source_canonical_name, source_reference)` — one row per mirrored condition.
- **Columns**: `condition_json` (the serialized `AlarmConditionState`) and `last_transition_at` (the accepted `TransitionTime`).
- **Operations**: `Upsert` (on accepted transition), `Delete` (on condition drop-out), `Get` (PreStart rehydrate, scoped to instance + source), and `ClearForInstance` (redeploy/undeploy reset).
- This is a **peer SQLite store** to the existing deployed-configuration, store-and-forward, operation-tracking, and `AuditLog` stores. Unlike computed alarm state, native mirror state is intentionally persisted so it survives failover.
---
## Enriched `AlarmStateChanged` Message
The `AlarmStateChanged` message published by both Alarm Actors and Native Alarm Actors was extended **additively** (existing consumers keep working with computed defaults):
- **`Kind`** (`AlarmKind`): `Computed` for ScadaBridge-evaluated alarms; `NativeOpcUa` / `NativeMxAccess` for mirrored native alarms.
- **`Condition`** (`AlarmConditionState`): the unified condition view. Computed alarms supply a computed default; native alarms carry the mirrored source condition.
- **Native metadata** (populated for native alarms; defaulted/empty for computed): `SourceReference`, `AlarmTypeName`, `Category`, `OperatorUser`, `OperatorComment`, `OriginalRaiseTime`, `CurrentValue`, `LimitValue`.
- **Computed-alarm projection**: computed alarms are surfaced as **auto-acknowledged** with `Severity = Priority`, so a single enriched shape carries both computed and native alarms onto the stream and into the DebugView snapshot.
The enriched message flows Instance Actor → site-wide Akka stream → `SiteStreamManager` → `SiteStreamGrpcServer` and is streamed to central as the gRPC `AlarmStateUpdate` event (see [Component-Communication.md](Component-Communication.md)).
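The computed-alarm projection described above can be sketched as a pure function. The record below is an assumed shape: the field types (in particular a boolean shelve flag and an integer severity) are guesses for illustration, and `ComputedAlarmProjection.Project` is a hypothetical helper name.

```csharp
// Assumed shape of the unified condition view; field types are illustrative guesses.
public sealed record AlarmConditionState(
    bool Active, bool Acknowledged, bool? Confirmed, bool Shelved, bool Suppressed, int Severity);

public static class ComputedAlarmProjection
{
    // Computed alarms surface as auto-acknowledged with Severity = Priority.
    // Confirmed is null because a computed alarm is assumed not to be confirmable.
    public static AlarmConditionState Project(bool isActive, int priority) =>
        new(Active: isActive, Acknowledged: true, Confirmed: null,
            Shelved: false, Suppressed: false, Severity: priority);
}
```

Under these assumptions, `Project(false, priority)` would also serve as the Normal projection used for computed alarms that have not yet produced an event.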
---
## Shared Script Library
- Shared scripts are compiled at the site when received from central.
@@ -361,7 +430,7 @@ Per Akka.NET best practices, internal actor communication uses **Tell** (fire-an
## Dependencies
- **Data Connection Layer**: Provides tag value updates to Instance Actors. Receives write requests from Instance Actors.
- **Data Connection Layer**: Provides tag value updates to Instance Actors. Receives write requests from Instance Actors. Also feeds Native Alarm Actors: connections implementing `IAlarmSubscribableConnection` (OPC UA A&C servers, MxAccess Gateway) deliver `NativeAlarmTransitionUpdate` events in response to a `SubscribeAlarmsRequest`, and signal `NativeAlarmSourceUnavailable` on connection loss.
- **Store-and-Forward Engine**: Handles reliable delivery for external system calls, cached database writes, and notifications submitted by scripts. For the notification category specifically, it forwards to the central cluster for delivery (not directly to SMTP). Owns the site-local operation tracking table that backs `Tracking.Status(id)`.
- **External System Gateway**: Provides external system method invocations for scripts.
- **Communication Layer**: Receives deployments and lifecycle commands from central. Handles debug view requests. Reports deployment results.
+20 -1
@@ -53,6 +53,13 @@ Central cluster only. Sites receive flattened output and have no awareness of te
- Trigger Definition: Value Match, Range Violation, or Rate of Change.
- Optional On-Trigger Script reference.
### Native Alarm Source (`TemplateNativeAlarmSource`)
- A read-only binding that mirrors **native alarms** raised by an upstream system — OPC UA Alarms & Conditions or the MxAccess Gateway — rather than alarms evaluated by the Site Runtime from attribute values.
- Fields: Name, Description *(optional)*, ConnectionName (the data connection that carries the native alarms), SourceReference (a raw connection address — an OPC UA `SourceNode` nodeId, or an MxAccess object/area), ConditionFilter *(optional — when null, mirror **all** conditions under the source)*, and the standard locking flags (`IsLocked`, `IsInherited`, `LockedInDerived`).
- `SourceReference` is a **raw connection address**, not a relative attribute path — the Template Engine does not interpret or rewrite it (contrast with an attribute's `DataSourceReference`).
- Defined on a template as a first-class member via `Template.NativeAlarmSources`.
- Resolved native alarm sources drive the Site Runtime's **NativeAlarmActor** (see Interactions); the Template Engine only models and flattens them.
### Script (Template-Level)
- Name, Lock Flag, C# source code.
- Trigger configuration: Interval, Value Change, Conditional, Expression, or invoked by alarm/other script. Conditional and Expression triggers also carry a fire mode — **OnTrue** (fire as the condition becomes true) or **WhileTrue** (re-fire on a timer while it stays true).
@@ -64,6 +71,7 @@ Central cluster only. Sites receive flattened output and have no awareness of te
- Associated with a specific template and a specific site.
- Assigned to an area within the site.
- Can override non-locked attribute values (no adding/removing attributes).
- Can override non-locked native alarm source bindings via `Instance.NativeAlarmSourceOverrides` (see Override Granularity) — no adding/removing sources.
- Bound to data connections at instance creation — **per-attribute binding** where each attribute with a data source reference individually selects its data connection.
- Can be in **enabled** or **disabled** state.
- Can be **deleted** — deletion is blocked if the site is unreachable.
@@ -99,6 +107,7 @@ Override and lock rules apply per entity type at the following granularity:
- **Attributes**: Value and Description are overridable. Data Type is fixed by the defining level. `DataSourceReference` on a template attribute defines the **default** physical address for that attribute. Instances may override per attribute via `InstanceConnectionBinding.DataSourceReferenceOverride`; the override replaces the template default at flattening time. When the override is null (the default), the template value is used. Lock applies to the entire attribute (when locked, no fields can be overridden).
- **Alarms**: Priority Level, Trigger Definition (thresholds/ranges/rates), Description, and On-Trigger Script reference are overridable. Name and Trigger Type (Value Match vs. Range vs. Rate of Change) are fixed. Lock applies to the entire alarm.
- **Native alarm sources**: An instance overrides a non-locked source via `InstanceNativeAlarmSourceOverride`, keyed by `SourceCanonicalName`. `ConnectionNameOverride`, `SourceReferenceOverride`, and `ConditionFilterOverride` are individually overridable — each is applied only when non-null; a null field **keeps the inherited value**. Name is fixed. Lock applies to the entire source.
- **Scripts**: C# source code, Trigger configuration, minimum time between runs, and parameter/return definitions are overridable. Name is fixed. Lock applies to the entire script.
- **Composed module members**: A composing template or child template can override non-locked members inside a composed module using the canonical path-qualified name.
@@ -122,6 +131,14 @@ When an instance is deployed, the Template Engine resolves the full configuratio
5. Resolve data connection bindings — replace connection name references with concrete connection details from the site.
6. Output a flat structure: list of attributes with resolved values and data source addresses, list of alarms with resolved trigger definitions, list of scripts with resolved code and triggers.
### Native Alarm Source Resolution
The `FlatteningService` resolves native alarm sources alongside alarms, emitting a `ResolvedNativeAlarmSource` (CanonicalName, ConnectionName, SourceReference, ConditionFilter *(optional)*, and `Source`: one of `Template` | `Inherited` | `Composed` | `Override`) for each. The resolved set is attached to `FlattenedConfiguration.NativeAlarmSources`.
- **Inheritance**: resolution walks the chain base → derived; a derived-level source wins over the base unless the base level locked it.
- **Composition**: a composed module's sources are path-qualified to the canonical name `[ModuleInstanceName].[Name]`, subject to the same naming-collision checks as other members. Because `SourceReference` is a raw connection address (not an attribute path), composition performs **no attribute-reference rewriting** on it.
- **Instance overrides**: `InstanceNativeAlarmSourceOverride` applies its non-null fields (`ConnectionNameOverride`, `SourceReferenceOverride`, `ConditionFilterOverride`) over the inherited/composed result and sets `Source = Override`.
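The composition naming and instance-override rules above can be sketched with records and a `with` expression. The record shapes follow the field names in the text, but the property types, the `NativeAlarmSourceOrigin` enum name, and the `NativeAlarmSourceResolution` helper are assumptions; lock checks and the inheritance walk are left out.

```csharp
public enum NativeAlarmSourceOrigin { Template, Inherited, Composed, Override }

public sealed record ResolvedNativeAlarmSource(
    string CanonicalName, string ConnectionName, string SourceReference,
    string? ConditionFilter, NativeAlarmSourceOrigin Source);

public sealed record InstanceNativeAlarmSourceOverride(
    string SourceCanonicalName, string? ConnectionNameOverride,
    string? SourceReferenceOverride, string? ConditionFilterOverride);

public static class NativeAlarmSourceResolution
{
    // Composed members are path-qualified as [ModuleInstanceName].[Name].
    public static string ComposedName(string moduleInstanceName, string name) =>
        $"{moduleInstanceName}.{name}";

    // Each override field applies only when non-null; a null field keeps the inherited value.
    public static ResolvedNativeAlarmSource ApplyOverride(
        ResolvedNativeAlarmSource inherited, InstanceNativeAlarmSourceOverride o) =>
        inherited with
        {
            ConnectionName = o.ConnectionNameOverride ?? inherited.ConnectionName,
            SourceReference = o.SourceReferenceOverride ?? inherited.SourceReference,
            ConditionFilter = o.ConditionFilterOverride ?? inherited.ConditionFilter,
            Source = NativeAlarmSourceOrigin.Override,
        };
}
```

One consequence of null meaning "keep inherited" in this sketch is that an override cannot reset an inherited `ConditionFilter` back to null (mirror all conditions); whether the real model offers another way to do that is not stated here.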
## Diff Calculation
The Template Engine can compare:
@@ -150,6 +167,7 @@ Beyond compilation, the Template Engine performs static semantic checks:
- **Argument compatibility**: Parameter count and data types at call sites must match the target script's parameter definitions.
- **Return type compatibility**: If a script call's return value is used, the return type definition must match the caller's expectations.
- **Trigger operand types**: Alarm triggers and script conditional triggers must reference attributes with compatible data types (e.g., Range Violation requires numeric attributes).
- **Native alarm sources** (`ValidationCategory.NativeAlarmSourceInvalid`): `SemanticValidator.Validate` flags a `ResolvedNativeAlarmSource` when its `SourceReference` is empty, its `ConnectionName` is empty, or — when the caller supplies the alarm-capable connection set — its connection is unknown or not alarm-capable (protocol ∉ {`OpcUa`, `MxGateway`}). The alarm-capable connection set is an **optional, additive third parameter** to `Validate`; the empty-field checks always run, and the connection-binding check runs only when the set is provided.
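The native alarm source check in the last bullet can be illustrated as a standalone function. This is a sketch, not `SemanticValidator.Validate`: the `NativeAlarmSourceChecks` class, the plain-string error messages, and modelling the caller-supplied connection set as a name-to-protocol map are all assumptions.

```csharp
using System.Collections.Generic;

public static class NativeAlarmSourceChecks
{
    // connectionProtocols: connection name -> protocol. Null means the caller did not
    // supply the set, so only the empty-field checks run.
    public static List<string> Validate(
        string connectionName, string sourceReference,
        IReadOnlyDictionary<string, string>? connectionProtocols = null)
    {
        var errors = new List<string>();

        if (string.IsNullOrWhiteSpace(sourceReference))
            errors.Add("SourceReference is empty");

        if (string.IsNullOrWhiteSpace(connectionName))
        {
            errors.Add("ConnectionName is empty");
        }
        else if (connectionProtocols is not null)
        {
            if (!connectionProtocols.TryGetValue(connectionName, out var protocol))
                errors.Add("Connection is unknown");
            else if (protocol is not ("OpcUa" or "MxGateway"))
                errors.Add("Connection is not alarm-capable");
        }

        return errors;
    }
}
```

Keeping the connection map optional mirrors the additive third parameter: existing callers that pass nothing still get the empty-field checks, and only callers that know the site's connections get the binding check.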
### Graph Acyclicity
@@ -185,5 +203,6 @@ For shared scripts, pre-compilation validation is performed before deployment. S
## Interactions
- **Deployment Manager**: Requests flattened configurations, diffs, and validation results from the Template Engine.
- **Central UI**: Provides the data model for template authoring, instance management, and on-demand validation.
- **Central UI**: Provides the data model for template authoring, instance management, and on-demand validation. Native alarm source CRUD (template-level definitions and instance-level overrides) is exposed via the Management Service / CLI / Central UI alongside attributes and alarms.
- **Site Runtime (#3)**: Consumes each `ResolvedNativeAlarmSource` in the flattened configuration to drive its **NativeAlarmActor**, which mirrors the native OPC UA A&C / MxAccess Gateway alarms identified by the resolved connection, source reference, and condition filter.
- **Transport (#24)**: Reads templates, attributes, alarms, scripts, and composition relationships for bundle export; writes the same via repositories during bundle import.