
Joseph Doherty f2c1cc84e9 Phase 7 plan doc — scripting runtime + virtual tags + scripted alarms + historian alarm sink. Draft output from the 2026-04-20 interactive planning session. Phase 7 is the last phase before v2 release readiness; adds two additive runtime capabilities on top of the existing driver + Equipment address-space foundation: (1) virtual (calculated) tags — OPC UA variables whose values are computed by user-authored C# scripts against other tags, evaluated on change and/or timer, living in the existing Equipment tree alongside driver tags, behaving identically to clients; (2) Part 9 scripted alarms — full state machine (EnabledState/ActiveState/AckedState/ConfirmedState/ShelvingState) with persistent operator-supplied state across restarts, complementing (not replacing) the existing Galaxy-native and AB CIP ALMD alarm sources. A third tie-in capability — Aveva Historian as alarm system of record — routes every qualifying alarm transition from any IAlarmSource (scripted + Galaxy + ALMD) through a local SQLite store-and-forward queue to Galaxy.Host, which uses its already-loaded aahClientManaged DLLs to write to the Historian alarm schema; per-alarm HistorizeToAveva toggle gates which sources flow (default off for Galaxy-native to avoid duplicating the direct Galaxy historian path, default on for scripted).

Locks in 22 design decisions from the planning conversation: C# via Roslyn scripting; virtual tags in the Equipment tree (not a separate /Virtual/ namespace); change-driven + timer-driven triggers operator-configurable per tag; Shape A one-script-per-tag-or-alarm (no predicate/action split); full OPC UA Part 9 alarm fidelity; read-only sandbox (scripts read any tag, write only to virtual tags, no File/HttpClient/Process/reflection); AST-inferred dependencies via CSharpSyntaxWalker (non-literal tag paths rejected at publish); config DB storage with generation-sealed cache; ctx.GetTag returns a full DataValue {Value, StatusCode, Timestamp}; per-tag Historize checkbox; per-tag error isolation (throwing script sets tag quality BadInternalError, engine unaffected); dedicated scripts-*.log Serilog sink bound to ctx.Logger; alarm message as template with {TagPath} substitution resolved at event emission; ActiveState recomputed from tags on startup while EnabledState/AckedState/ConfirmedState/ShelvingState + audit persist to config DB; historian sink scope = all IAlarmSource impls with per-alarm toggle; SQLite store-and-forward on the node so operators are never blocked by Historian downtime; IPC to Galaxy.Host for ingestion reusing the already-loaded aahClientManaged DLLs; Monaco editor for Admin code editing; serial cascade evaluation for v1 (parallel as follow-up); shelving UX via OPC UA method calls only with no custom Admin controls (operator drives state transitions from plant HMIs or Client.CLI); 30-day dead-letter retention with manual retry button; test harness accepts only declared-input paths so the harness enforces dependency declaration.

Eight streams totaling ~10-12 weeks, scope-comparable to Phase 6: A - Core.Scripting (Roslyn engine + sandbox + AST inference + logger); B - virtual tag engine (dependency graph + change/timer schedulers + historize); C - scripted alarm engine (Part 9 state machine + template messages + startup recovery + OPC UA method binding); D - historian alarm sink (SQLite store-and-forward + Galaxy.Host IPC contract extension); E - config DB schema (four new tables under sp_PublishGeneration); F - Admin UI scripting tab (Monaco + test harness + dependency preview + script-log viewer + historian diagnostics); G - address-space integration (extend EquipmentNodeWalker for virtual source kind + extend DriverNodeManager dispatch); H - exit gate.

Compliance-check surface covers sandbox escape (typeof/Assembly.Load/File/HttpClient attempts must fail at compile), dependency inference (literal-only paths), change cascade (topological ordering), cycle rejection at publish, startup recovery (ack/confirm/shelve survive restart but ActiveState recomputed), ack audit trail persistence, historian queue durability (Galaxy.Host offline → online drains in-order), per-alarm historian toggle gating, script timeout isolation, log sink isolation, ACL binding (virtual tags inherit Equipment scope grants).

Follow-up artifacts tracked as tasks #231-#238 (stream placeholders). Supporting doc updates (plan.md §6 Migration Strategy, config-db-schema.md §§ for the four new tables, driver-specs.md §Alarm semantics clarification, new ADR-002 for driver-vs-virtual dispatch) will land alongside the streams that touch them, not in this doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-20 16:05:12 -04:00


Phase 7 — Scripting Runtime, Virtual Tags, and Scripted Alarms

Status: DRAFT — planning output from the 2026-04-20 interactive planning session. Pending review before work begins. Task #230 tracks the draft; #231–#238 are the stream placeholders.

Branch: v2/phase-7-scripting-and-alarming
Estimated duration: 10–12 weeks (scope-comparable to Phase 6; largest single phase outside Phase 2 Galaxy split)
Predecessor: Phase 6.4 (Admin UI completion), which supplies the tab-plugin pattern + draft/publish flow reused here
Successor: v2 release-readiness capstone

Phase Objective

Add two additive runtime capabilities on top of the existing driver + Equipment address-space foundation:

Virtual (calculated) tags — OPC UA variables whose values are computed by user-authored C# scripts against other tags (driver or virtual), evaluated on change and/or timer. They live in the existing Equipment/UNS tree alongside driver tags and behave identically to clients (browse, subscribe, historize).
Scripted alarms — OPC UA Part 9 alarms whose condition is a user-authored C# predicate. Full state machine (EnabledState / ActiveState / AckedState / ConfirmedState / ShelvingState) with persistent operator-supplied state across restarts. Complement the existing Galaxy-native and AB CIP ALMD alarm sources — they do not replace them.

Tie-in capability — historian alarm sink:

Aveva Historian as alarm system of record — every qualifying alarm transition (activation, ack, confirm, clear, shelve, disable, comment) from any IAlarmSource (scripted + Galaxy + ALMD) routes through a new local SQLite store-and-forward queue to Galaxy.Host, which uses its already-loaded aahClientManaged DLLs to write to the Historian's alarm schema. Per-alarm HistorizeToAveva toggle gates which sources flow (default off for Galaxy-native since Galaxy itself already historizes them). Plant operators query one uniform historical alarm timeline.

Why it's additive, not a rewrite: every IAlarmSource implementation shipped in Phase 6.x stays unchanged; scripted alarms register as an additional source in the existing fan-out. The Equipment node walker built in ADR-001 gains a "virtual" source kind alongside "driver" without removing anything. Operator-facing semantics for existing driver tags and alarms are unchanged.

Design Decisions (locked in the 2026-04-20 planning session)

#	Decision	Rationale
1	Script language = C# via Roslyn scripting	Developer audience, strong typing, AST walkable for dependency inference, existing .NET 10 runtime in main server.
2	Virtual tags live in the Equipment tree alongside driver tags (not a separate `/Virtual/...` namespace)	Operator mental model stays unified; calculated `LineRate` shows up under the Line1 folder next to the driver-sourced `SpeedSetpoint` it's derived from.
3	Evaluation trigger = change-driven + timer-driven; operator chooses per-tag	Change-driven is cheap at steady state; timer is the escape hatch for polling derivations that don't have a discrete "input changed" signal.
4	Script shape = Shape A — one script per virtual tag/alarm; `return` produces the value (or `bool` for alarm condition)	Minimal surface; no predicate/action split. Alarm side-effects (severity, message) configured out-of-band, not in the script.
5	Alarm fidelity = full OPC UA Part 9	Uniform with Galaxy + ALMD on the wire; client-side tooling (HMIs, historians, event pipelines) gets one shape.
6	Sandbox = read-only context; scripts can read any tag but write only to virtual tags	Strict Roslyn `ScriptOptions` allow-list. No HttpClient / File / Process / reflection.
7	Dependency declaration = AST inference; operator doesn't maintain a separate dependency list	`CSharpSyntaxWalker` extracts `ctx.GetTag("path")` string-literal calls at compile time; dynamic paths rejected at publish.
8	Config storage = config DB with generation-sealed cache (same as driver instances)	Virtual tags + alarms publish atomically in the same generation as the driver instance config they may depend on.
9	`ctx.GetTag` return shape = `DataValue { Value, StatusCode, Timestamp }`	Scripts branch on quality naturally without separate `ctx.GetQuality(...)` calls.
10	Historize virtual tags = per-tag checkbox	Writes flow through the same history-write path as driver tags. Consumed by existing `IHistoryProvider`.
11	Per-tag error isolation — a throwing script sets that tag's quality to `BadInternalError`; engine keeps running for every other tag	Mirrors Phase 6.1 Stream B's per-surface error handling.
12	Dedicated Serilog sink = `scripts-*.log` rolling file; structured-property `ScriptName` for filtering	Keeps noisy script logs out of the main `opcua-*.log`. `ctx.Logger.Info/Warning/Error/Debug` bound in the script context.
13	Alarm message = template with substitution (`"Reactor temp {Reactor/Temp} exceeded {Limit}"`)	Middle ground between static and separate message-script; engine resolves `{path}` tokens at event emission.
14	Alarm state persistence — `ActiveState` recomputed from tag values on startup; `EnabledState / AckedState / ConfirmedState / ShelvingState` + audit trail persist to config DB	Operators don't re-ack after restart; ack history survives for compliance (GxP / 21 CFR Part 11).
15	Historian sink scope = all `IAlarmSource` implementations, not just scripted; per-alarm `HistorizeToAveva` toggle	Plant gets one consolidated alarm timeline; Galaxy-native alarms default off to avoid duplication.
16	Historian failure mode = SQLite store-and-forward queue on the node; config DB is source of truth, Historian is best-effort projection	Operators never blocked by Historian downtime; failed writes queue + retry when Historian recovers.
17	Historian ingestion path = IPC to Galaxy.Host, which calls the already-loaded `aahClientManaged` DLLs	Reuses existing bitness / licensing / Tier-C isolation. No new 32-bit DLL load in the main server.
18	Admin UI code editor = Monaco via the Admin project's asset pipeline	Industry default for C# editing in a browser; ~3 MB bundle acceptable given Admin is operator-facing only, not public. Revisitable if bundle size becomes a deployment constraint.
19	Cascade evaluation order = serial for v1; parallel promoted to a Phase 7 follow-up	Deterministic and easier to reason about; keeps cycle and ordering bugs simple to diagnose during the rollout. Parallel becomes a tuning knob once real 1000+ virtual-tag deployments measure contention.
20	Shelving UX = OPC UA method calls only (`OneShotShelve` / `TimedShelve` / `Unshelve` on the `AlarmConditionType` node); no Admin UI shelve controls	Plant HMIs + OPC UA clients already speak these methods by spec; reinventing the UI adds surface without operator value. Admin still renders current shelve state + audit trail read-only on the alarm detail page.
21	Dead-lettered historian events retained for 30 days in the SQLite queue; Admin `/alarms/historian` exposes a "Retry dead-lettered" button	Long enough for a Historian outage or licensing glitch to be resolved + operator to investigate; short enough that the SQLite file doesn't grow unbounded. Configurable via `AlarmHistorian:DeadLetterRetentionDays` for deployments with stricter compliance windows.
22	Test harness synthetic inputs = declared inputs only (from the AST walker's extracted dependency set)	Enforces the dependency declaration — if a path can't be supplied to the harness, the AST walker didn't see it and the script can't reference it at runtime. Catches dependency-inference drift at test time, not publish time.

Scope — What Changes

Concern	Change
New project `OtOpcUa.Core.Scripting` (.NET 10)	Roslyn-based script engine. Compiles user C# scripts with a sandboxed `ScriptOptions` allow-list (numeric / string / datetime / `ScriptContext` API only — no reflection / File / Process / HttpClient). `DependencyExtractor` uses `CSharpSyntaxWalker` to enumerate `ctx.GetTag("...")` literal-string calls; rejects non-literal paths at publish time. Per-script compile cache keyed by source hash. Per-evaluation timeout. Exception in script → tag goes `BadInternalError`; engine unaffected for other tags. `ctx.Logger` is a Serilog `ILogger` bound to the `scripts-*.log` rolling sink with structured property `ScriptName`.
New project `OtOpcUa.Core.VirtualTags` (.NET 10)	`VirtualTagEngine` consumes the `DependencyExtractor` output, builds a topological dependency graph spanning driver tags + other virtual tags (cycle detection at publish time), schedules re-evaluation on change + on timer, propagates results through an `IVirtualTagSource` that implements `IReadable` + `ISubscribable` so `DriverNodeManager` routes reads / subscriptions uniformly. Per-tag `Historize` flag routes to the same history-write path driver tags use.
New project `OtOpcUa.Core.ScriptedAlarms` (.NET 10)	`ScriptedAlarmEngine` materializes each configured alarm as an OPC UA `AlarmConditionType` (or `LimitAlarmType` / `OffNormalAlarmType`). On startup, re-evaluates every predicate against current tag values to rebuild `ActiveState` — no persistence needed for the active flag. Persistent state: `EnabledState`, `AckedState`, `ConfirmedState`, `ShelvingState`, branch stack, ack audit (user/time/comment). Template message substitution resolves `{TagPath}` tokens at event emission. Ack / Confirm / Shelve method nodes bound to the engine; transitions audit-logged via the existing `IAuditLogger` (Phase 6.2). Registers as an additional `IAlarmSource` — no change to the existing fan-out.
New project `OtOpcUa.Core.AlarmHistorian` (.NET 10)	`IAlarmHistorianSink` abstraction + `SqliteStoreAndForwardSink` default implementation. Every qualifying `IAlarmSource` emission (per-alarm `HistorizeToAveva` toggle) persists to a local SQLite queue (`%ProgramData%\OtOpcUa\alarm-historian-queue.db`). Background drain worker reads unsent rows + forwards over IPC to Galaxy.Host. Failed writes keep the row pending with exponential backoff. Queue capacity bounded (default 1M events, oldest-dropped with a structured warning log).
`Driver.Galaxy.Shared` — new IPC contracts	`HistorianAlarmEventRequest` (activation / ack / confirm / clear / shelve / disable / comment payloads matching the Aveva Historian alarm schema) + `HistorianAlarmEventResponse` (ack / retry-please / permanent-fail). `HistorianConnectivityStatusNotification` so the main server can surface "Historian disconnected" on the Admin `/hosts` page.
`Driver.Galaxy.Host` — new frame handler for alarm writes	Reuses the already-loaded `aahClientManaged.dll` + `aahClientCommon.dll`. Maps the IPC request DTOs to the historian SDK's alarm-event API (exact method TBD during Stream D.2 — needs a live-historian smoke to confirm the right SDK entry point). Errors map to structured response codes so the main server's backoff logic can distinguish "transient" from "permanent".
Config DB schema — new tables	`VirtualTag (Id, EquipmentPath, Name, DataType, IntervalMs?, ChangeTriggerEnabled, Historize, ScriptId)`; `Script (Id, SourceCode, CompiledHash, Language='CSharp')`; `ScriptedAlarm (Id, EquipmentPath, Name, AlarmType, Severity, MessageTemplate, HistorizeToAveva, PredicateScriptId)`; `ScriptedAlarmState (AlarmId, EnabledState, AckedState, ConfirmedState, ShelvingState, ShelvingExpiresUtc?, LastAckUser, LastAckComment, LastAckUtc, BranchStack_JSON)`. Every write goes through `sp_PublishGeneration` + `IAuditLogger`.
Address-space build — Phase 6 `EquipmentNodeWalker` extension	Emits virtual-tag nodes alongside driver-sourced nodes under the same Equipment folder. `NodeScopeResolver` gains a `Virtual` source kind alongside `Driver`. `DriverNodeManager` dispatch routes reads / writes / subscriptions to the `VirtualTagEngine` when the source is virtual.
Admin UI — new tabs	`/virtual-tags` and `/scripted-alarms` tabs under the existing draft/publish flow. Monaco-based C# code editor (syntax highlighting, IntelliSense against a hand-written type stub for `ScriptContext`). Dependency preview panel shows the inferred input list from the AST walker. Test-harness lets operator supply synthetic `DataValue` inputs + see script output + logger emissions without publishing. Per-alarm controls: `AlarmType`, `Severity`, `MessageTemplate`, `HistorizeToAveva`. New `/alarms/historian` diagnostics view: queue depth, drain rate, last-successful-write, per-alarm "last routed to historian" timestamp.
`DriverTypeRegistry` — no change	Scripting is not a driver — it doesn't register as a `DriverType`. The engine hangs off the same `SealedBootstrap` as drivers but through a different composition root.

Scope — What Does NOT Change

Item	Reason
Existing `IAlarmSource` implementations (Galaxy, AB CIP ALMD)	Scripted alarms register as an additional source; existing sources pass through unchanged. Default `HistorizeToAveva=false` for Galaxy alarms avoids duplicating records the Galaxy historian wiring already captures.
Driver capability surface (`IReadable` / `IWritable` / `ISubscribable` / etc.)	Virtual tags implement the same interfaces — drivers and virtual tags are interchangeable from the node manager's perspective. No new capability.
Config DB publication flow (`sp_PublishGeneration` + sealed cache)	Virtual tag + alarm tables plug in as additional rows. Atomic publish semantics unchanged.
Authorization trie (Phase 6.2)	Virtual-tag nodes inherit the Equipment scope's grants — same treatment as the Phase 6.4 Identification sub-folder. No new scope level.
Tier-C isolation topology	Scripting engine runs in the main .NET 10 server process. Roslyn scripts are already sandboxed via `ScriptOptions`; no need for process isolation because they have no unmanaged reach. Galaxy.Host's existing Tier-C boundary already owns the historian SDK writes.
Galaxy alarm ingestion path into the historian	Galaxy writes alarms directly via `aahClientManaged` today; Phase 7 Stream D gives it a second path (via the new sink) when a Galaxy alarm has `HistorizeToAveva=true`, but the direct path stays for the default case.
OPC UA wire protocol / AddressSpace schema	Clients see new nodes under existing folders + new alarm conditions. No new namespaces, no new ObjectTypes beyond what Part 9 already defines.

Entry Gate Checklist

All Phase 6.x exit gates cleared (#133, #142, #151, #158)
Equipment node walker wired into DriverNodeManager (task #212 — done)
IAuditLogger surface live (Phase 6.2 Stream A)
sp_PublishGeneration + sealed-cache flow verified on the existing driver-config tables
Dev Aveva Historian reachable from the dev box (for Stream D.2 smoke)
v2 branch clean + baseline tests green
Blazor editor component library picked (Monaco confirmed over alternatives; see decision #18)
Review this plan: decisions #1–#22 signed off, no open questions

Task Breakdown

Stream A — `Core.Scripting` (Roslyn engine + sandbox + AST inference + logger) — 2 weeks

A.1 Project scaffold + NuGet Microsoft.CodeAnalysis.CSharp.Scripting. ScriptOptions allow-list (typeof(object).Assembly, typeof(Enumerable).Assembly, the Core.Scripting assembly itself — nothing else). Hand-written ScriptContext base class with GetTag(string) / SetVirtualTag(string, object) / Logger / Now / Deadband(double, double, double) helpers.
A.2 DependencyExtractor : CSharpSyntaxWalker. Visits every InvocationExpressionSyntax targeting ctx.GetTag / ctx.SetVirtualTag; accepts only a LiteralExpressionSyntax argument. Non-literal arguments (concat, variable, method call) → publish-time rejection with an actionable error pointing the operator at the exact span. Outputs IReadOnlySet<string> Inputs + IReadOnlySet<string> Outputs.
A.3 Compile cache. (source_hash) → compiled Script<T>. Recompile only when source changes. Warm on SealedBootstrap.
A.4 Per-evaluation timeout wrapper (default 250ms; configurable per tag). Timeout = tag quality BadInternalError + structured warning log. Keeps a single runaway script from starving the engine.
A.5 Serilog sink wiring. New scripts-*.log rolling-file sink; ctx.Logger returns an ILogger with ForContext("ScriptName", ...). Main opcua-*.log gets a companion entry at WARN level if a script logs ERROR, so the operator sees it in the primary log.
A.6 Tests: AST extraction unit tests (30+ cases covering literal / concat / variable / null / method-returned paths); sandbox escape tests (attempt typeof, Assembly.Load, File.OpenRead — all must fail at compile); exception isolation (throwing script doesn't kill the engine); timeout behavior; logger structured-property binding.
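
The literal-only rule from A.2 can be illustrated with a small language-neutral sketch. It is written in Python and uses a regular expression as a stand-in for the Roslyn `CSharpSyntaxWalker` the plan calls for, so it is not the planned implementation; the names `extract_dependencies` and `NonLiteralPathError` are hypothetical, and paths containing commas or parentheses are deliberately out of scope here.

```python
import re

# Finds ctx.GetTag(...) / ctx.SetVirtualTag(...) and captures the first argument text.
# Assumption: a regex stands in for a real syntax walker, so string literals that
# contain ',' or ')' are not handled.
CALL = re.compile(r'ctx\.(GetTag|SetVirtualTag)\(\s*([^,)]*)')
LITERAL = re.compile(r'^"([^"\\]*)"$')


class NonLiteralPathError(Exception):
    """Hypothetical error type: a tag path was not a plain string literal."""


def extract_dependencies(source: str):
    """Return (inputs, outputs) as sets of literal tag paths found in the script."""
    inputs, outputs = set(), set()
    for match in CALL.finditer(source):
        method, arg = match.group(1), match.group(2).strip()
        literal = LITERAL.match(arg)
        if literal is None:
            # Concatenations, variables and method calls all land here.
            raise NonLiteralPathError(
                f"non-literal path at offset {match.start(2)}: {arg!r}")
        target = inputs if method == "GetTag" else outputs
        target.add(literal.group(1))
    return inputs, outputs
```

Calling `extract_dependencies('ctx.GetTag(myVar)')` raises, while a script that only uses quoted paths yields its input and output sets, which is the shape the dependency preview and the test harness would consume.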

Stream B — Virtual tag engine (dependency graph + change/timer schedulers + historize) — 1.5 weeks

B.1 VirtualTagEngine. Ingests the set of compiled scripts + their inputs/outputs; builds a directed dependency graph (driver tag ID → virtual tag ID → virtual tag ID). Cycle detection at publish-time via Tarjan; publish rejects with a clear error message listing the cycle.
B.2 ChangeTriggerDispatcher. Subscribes to every referenced driver tag via the existing ISubscribable fan-out. On a DataValueSnapshot delta (value / status / timestamp — any of the three), enqueues affected virtual tags for re-evaluation in topological order.
B.3 TimerTriggerDispatcher. Per-tag IntervalMs scheduled via a shared timer-wheel. Independent of change triggers — a tag can have both, either, or neither.
B.4 EvaluationPipeline. Serial evaluation per cascade (parallel promoted to a follow-up — avoids cross-tag ordering bugs on first rollout). Exception handling per A.4; propagates results via IVirtualTagSource.
B.5 IVirtualTagSource implementation. Implements IReadable + ISubscribable. Reads return the most recent evaluated value; subscriptions receive OnDataChange events on each re-evaluation.
B.6 History routing. Per-tag Historize flag emits the value + timestamp to the existing history-write path used by drivers.
B.7 Tests: dependency graph (happy + cycle); change cascade through two levels of virtual tags; timer-only tag ignores input changes; change + timer both configured; error propagation; historize on/off.
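
As a rough illustration of B.1, B.2 and B.4, the sketch below orders a cascade topologically and rejects cycles. It is a Python stand-in, not the planned C# engine: it relies on the standard library's `graphlib.TopologicalSorter` rather than the Tarjan pass named in B.1, and the function names `evaluation_order` and `cascade` are hypothetical.

```python
from graphlib import TopologicalSorter, CycleError


def evaluation_order(deps: dict[str, set[str]]) -> list[str]:
    """deps maps each virtual tag to the tags it reads (driver or virtual).

    Returns every tag with dependencies first; raises CycleError on a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


def cascade(deps: dict[str, set[str]], changed: str) -> list[str]:
    """Virtual tags to re-evaluate, in serial order, after `changed` updates."""
    dirty = {changed}
    to_evaluate = []
    for tag in evaluation_order(deps):
        # Only virtual tags (keys of deps) are evaluated; a tag is affected
        # when at least one of its inputs is already dirty.
        if tag in deps and deps[tag] & dirty:
            dirty.add(tag)
            to_evaluate.append(tag)
    return to_evaluate
```

With `deps = {"B": {"A"}, "C": {"B"}}`, a change to `A` yields `["B", "C"]` in a single pass, and a mutual dependency between two tags raises `CycleError`, whose second argument carries the offending cycle for the publish error message.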

Stream C — Scripted alarm engine + Part 9 state machine + template messages — 2.5 weeks

C.1 Alarm config model + ScriptedAlarmEngine skeleton. Alarms materialize as AlarmConditionType (or subtype — LimitAlarm, OffNormal) nodes under their configured Equipment path. Severity loaded from config.
C.2 Part9StateMachine. Tracks EnabledState, ActiveState, AckedState, ConfirmedState, ShelvingState per condition ID. Shelving has OneShotShelving + TimedShelving variants + an UnshelveTime timer.
C.3 Predicate evaluation. On any input change (same trigger mechanism as Stream B), run the bool predicate. On false → true transition, activate (increment branch stack if prior Ack-but-not-Confirmed state exists). On true → false, clear (but keep condition visible if retain flag set).
C.4 Startup recovery. For every configured alarm, run the predicate against current tag values to rebuild ActiveState only. Load EnabledState / AckedState / ConfirmedState / ShelvingState + audit from the ScriptedAlarmState table. No re-acknowledgment required for conditions that were acked before restart.
C.5 Template substitution. Engine resolves {TagPath} tokens in MessageTemplate at event emission time using current tag values. Unresolvable tokens (bad path, missing tag) emit a structured error log + substitute {?} so the event still fires.
C.6 OPC UA method binding. Acknowledge, Confirm, AddComment, OneShotShelve, TimedShelve, Unshelve methods on each condition node route to the engine + persist via audit-logged writes to ScriptedAlarmState.
C.7 IAlarmSource implementation. Emits Part 9-shaped events through the existing fan-out the AlarmTracker composes.
C.8 Tests: every transition (all 32 state combinations the state machine can produce); startup recovery (seed table with varied ack/confirm/shelve state, restart, verify correct recovery); template substitution (literal path, nested path, bad path); shelving timer expiry; OPC UA method calls via Client.CLI.
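
The template substitution in C.5 is simple enough to sketch directly. The Python below is illustrative only; `render_message` and its `on_error` callback are hypothetical names, and the error callback stands in for the structured error log the plan describes.

```python
import re

# A token is any run of characters between braces that contains no braces itself.
TOKEN = re.compile(r"\{([^{}]+)\}")


def render_message(template: str, tags: dict, on_error=print) -> str:
    """Replace each {TagPath} token with the tag's current value.

    Unresolvable tokens are reported through on_error and rendered as {?}
    so the alarm event can still be emitted.
    """
    def resolve(match):
        path = match.group(1)
        if path in tags:
            return str(tags[path])
        on_error(f"unresolvable token {{{path}}}")
        return "{?}"

    return TOKEN.sub(resolve, template)
```

For example, rendering `"Reactor temp {Reactor/Temp} exceeded {Limit}"` with only `Reactor/Temp` available produces a message ending in `exceeded {?}` and one error-callback invocation.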

Stream D — Historian alarm sink (SQLite store-and-forward + Galaxy.Host IPC) — 2 weeks

D.1 Core.AlarmHistorian project. IAlarmHistorianSink interface; SqliteStoreAndForwardSink default implementation using Microsoft.Data.Sqlite. Schema: Queue (RowId, AlarmId, EventType, PayloadJson, EnqueuedUtc, LastAttemptUtc?, AttemptCount, DeadLettered). Queue capacity bounded; oldest-dropped on overflow with structured warning.
D.2 Live-historian smoke against the dev box's Aveva Historian. Identify the exact aahClientManaged alarm-write API entry point (likely IAlarmsDatabase.WriteAlarmEvent or equivalent — verify with a throwaway Galaxy.Host test hook). Document in a short docs/v2/historian-alarm-api.md artifact.
D.3 Driver.Galaxy.Shared contract additions. HistorianAlarmEventRequest / HistorianAlarmEventResponse / HistorianConnectivityStatusNotification. Round-trip tests in Driver.Galaxy.Shared.Tests.
D.4 Driver.Galaxy.Host handler. Translates incoming HistorianAlarmEventRequest to the SDK call identified in D.2. Returns structured response (Ack / RetryPlease / PermanentFail). Connectivity notifications sent proactively when the SDK's session drops.
D.5 Drain worker in the main server. Polls the SQLite queue; batches up to 100 events per IPC round-trip; exponential backoff on RetryPlease (1s → 2s → 5s → 15s → 60s cap); PermanentFail dead-letters the row + structured error log.
D.6 Per-alarm toggle wired through: HistorizeToAveva column on both ScriptedAlarm + a new AlarmHistorizationPolicy projection the Galaxy / ALMD alarm sources consult (default false for Galaxy, true for scripted, operator-adjustable per-alarm).
D.7 /alarms/historian diagnostics view in Admin. Queue depth, drain rate, last-successful-write, last-error, per-alarm last-routed timestamp.
D.8 Tests: SQLite queue round-trip; drain worker with fake IPC (success / retry / perm-fail); overflow eviction; Galaxy.Host handler against a stub historian API; end-to-end with the live historian on the dev box (non-CI — operator-invoked).
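
The queue and drain behaviour from D.1 and D.5 can be sketched with Python's built-in `sqlite3` module. This is a simplified illustration under several assumptions that are not in the plan: rows are deleted once acknowledged, events are sent one at a time rather than in batches of 100 per IPC round-trip, and `send` is a hypothetical callable standing in for the Galaxy.Host IPC call. Column names follow the D.1 schema.

```python
import sqlite3

BACKOFF_SECONDS = [1, 2, 5, 15, 60]  # schedule from D.5; the last value is the cap


def backoff(failed_attempts: int) -> int:
    """Delay before the next try, given how many attempts have failed so far."""
    index = min(max(failed_attempts, 1), len(BACKOFF_SECONDS)) - 1
    return BACKOFF_SECONDS[index]


def open_queue(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS Queue (
        RowId INTEGER PRIMARY KEY AUTOINCREMENT,
        AlarmId TEXT NOT NULL, EventType TEXT NOT NULL, PayloadJson TEXT NOT NULL,
        EnqueuedUtc REAL NOT NULL, LastAttemptUtc REAL,
        AttemptCount INTEGER NOT NULL DEFAULT 0,
        DeadLettered INTEGER NOT NULL DEFAULT 0)""")
    return db


def enqueue(db, alarm_id, event_type, payload_json, now):
    db.execute(
        "INSERT INTO Queue (AlarmId, EventType, PayloadJson, EnqueuedUtc) VALUES (?,?,?,?)",
        (alarm_id, event_type, payload_json, now))
    db.commit()


def drain_once(db, send, now, batch_size=100):
    """Forward pending rows in RowId order; return seconds to wait before the next pass."""
    rows = db.execute(
        "SELECT RowId, PayloadJson, AttemptCount FROM Queue "
        "WHERE DeadLettered = 0 ORDER BY RowId LIMIT ?", (batch_size,)).fetchall()
    delay = 0
    for row_id, payload, attempts in rows:
        outcome = send(payload)  # "Ack" | "RetryPlease" | "PermanentFail"
        if outcome == "Ack":
            db.execute("DELETE FROM Queue WHERE RowId = ?", (row_id,))  # assumption: delete on ack
        elif outcome == "PermanentFail":
            db.execute("UPDATE Queue SET DeadLettered = 1, LastAttemptUtc = ? WHERE RowId = ?",
                       (now, row_id))
        else:
            db.execute("UPDATE Queue SET AttemptCount = AttemptCount + 1, LastAttemptUtc = ? "
                       "WHERE RowId = ?", (now, row_id))
            delay = backoff(attempts + 1)
            break  # stop here so later events cannot overtake the one being retried
    db.commit()
    return delay
```

Stopping at the first `RetryPlease` is one way to satisfy the in-order drain requirement from the compliance checks: nothing behind the failed row is sent until it succeeds or is dead-lettered.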

Stream E — Config DB schema + generation-sealed cache extensions — 1 week

E.1 EF migration for new tables. Foreign keys from VirtualTag.ScriptId / ScriptedAlarm.PredicateScriptId to Script.Id.
E.2 sp_PublishGeneration extension. Sealed-cache snapshot includes virtual tags + scripted alarms + their scripts. Atomic publish guarantees the address-space build sees a consistent view.
E.3 CRUD services. VirtualTagService, ScriptedAlarmService, ScriptService. Each audit-logged; Ack / Confirm / Shelve persist through ScriptedAlarmStateService with full audit trail (who / when / comment / previous state).
E.4 Tests: migration up / down; publish atomicity (concurrent writes to different alarm rows don't leak into an in-flight publish); audit trail on every mutation.

Stream F — Admin UI scripting tab — 2 weeks

F.1 Monaco editor Razor component. CSS-isolated; loads Monaco via NPM + the Admin project's existing asset pipeline. C# syntax highlighting (Monaco ships it). IntelliSense via a hand-written ScriptContext.cs type stub delivered with the editor (not the compiled Core.Scripting DLL — keeps the browser bundle small).
F.2 /virtual-tags tab. List view (Equipment path / Name / DataType / inputs-summary / Historize / actions). Edit pane splits: Monaco editor left, dependency preview panel right (live-updates from a debounced /api/scripting/analyze endpoint that runs the DependencyExtractor). Publish button gated by Phase 6.2 WriteConfigure permission.
F.3 /scripted-alarms tab. Same editor shape + extra controls: AlarmType dropdown, Severity slider, MessageTemplate textbox with live-preview showing {path} token resolution against latest tag values, HistorizeToAveva checkbox. Alarm detail page displays current ShelvingState + LastAckUser / LastAckUtc / LastAckComment read-only — no shelve/unshelve / ack / confirm buttons per decision #20. Operators drive state transitions via OPC UA method calls from plant HMIs or the Client.CLI.
F.4 Test harness. Modal that lets the operator supply synthetic DataValue inputs for the dependency set + see script output + logger emissions (rendered in a virtual terminal). Enables testing without publishing.
F.5 Script log viewer. SignalR stream of the scripts-*.log sink filtered by the script under edit (using the structured ScriptName property). Tail-last-200 + "load more".
F.6 /alarms/historian diagnostics view per Stream D.7.
F.7 Playwright smoke. Author a calc tag, publish, verify it appears in the equipment tree via a probe OPC UA read. Author an alarm, verify it appears in AlarmsAndConditions.

Stream G — Address-space integration — 1 week

G.1 EquipmentNodeWalker extension. Current walker iterates driver tags per equipment; extend to also iterate virtual tags + alarms. NodeScopeResolver returns NodeSource.Virtual for virtual nodes and NodeSource.Driver for existing.
G.2 DriverNodeManager dispatch. Read / Write / Subscribe operations check the resolved source and route to VirtualTagEngine or the driver as appropriate. Writes to virtual tags allowed only from scripts (per decision #6) — OPC UA client writes to a virtual node return BadUserAccessDenied.
G.3 AlarmTracker composition. The ScriptedAlarmEngine registers as an additional IAlarmSource — no new composition code, the existing fan-out already accepts multiple sources.
G.4 Tests: mixed equipment folder (driver tag + virtual tag + driver-native alarm + scripted alarm) browsable via Client.CLI; read / subscribe round-trip for the virtual tag; scripted alarm transitions visible in the alarm event stream.
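
The dispatch rule in G.2 amounts to a source check in front of each operation. The following Python sketch is a hedged illustration rather than the `DriverNodeManager` design: `NodeDispatcher` is a hypothetical name, plain dictionaries stand in for a driver and for the `VirtualTagEngine`, and status codes are represented as strings.

```python
from enum import Enum


class NodeSource(Enum):
    DRIVER = "Driver"
    VIRTUAL = "Virtual"


class NodeDispatcher:
    def __init__(self, resolve_source, driver, virtual_engine):
        self._resolve = resolve_source   # callable: path -> NodeSource
        self._driver = driver            # dict stand-in for a driver
        self._virtual = virtual_engine   # dict stand-in for the virtual tag engine

    def read(self, path):
        """Route a read to whichever backend owns the node."""
        is_virtual = self._resolve(path) is NodeSource.VIRTUAL
        backend = self._virtual if is_virtual else self._driver
        return backend[path]

    def client_write(self, path, value):
        """OPC UA client writes: virtual nodes are script-writable only."""
        if self._resolve(path) is NodeSource.VIRTUAL:
            return "BadUserAccessDenied"
        self._driver[path] = value
        return "Good"
```

Keeping the source check in one place means driver-backed nodes follow exactly the same path as before, which matches the "additive, not a rewrite" intent of the phase.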

Stream H — Exit gate — 1 week

H.1 Compliance script real-checks: schema migrations applied; new tables populated from a draft→publish cycle; sealed-generation snapshot includes virtual tags + alarms; SQLite alarm queue initialized; scripts-*.log sink emitting; AlarmConditionType nodes materialize in the address space; per-alarm HistorizeToAveva toggle enforced end-to-end.
H.2 Full-solution dotnet test baseline. Target: Phase 6 baseline + ~300 new tests across Streams A–G.
H.3 docs/v2/plan.md Migration Strategy §6 update — add Phase 7.
H.4 Phase-status memory update.
H.5 Merge v2/phase-7-scripting-and-alarming → v2.

Compliance Checks (run at exit gate)

Sandbox escape: attempts to reference System.IO.File, System.Net.Http.HttpClient, System.Diagnostics.Process, or reflection entry points (typeof(X).Assembly, Assembly.Load) fail at script compile with an actionable error.
Dependency inference: ctx.GetTag(myStringVar) (non-literal path) is rejected at publish with a span-pointed error; ctx.GetTag("Line1/Speed") is accepted + appears in the inferred input set.
Change cascade: tag A → virtual tag B → virtual tag C. When A changes, B recomputes, then C recomputes. Single change event triggers the full cascade in topological order within one evaluation pass.
Cycle rejection: publish a config where virtual tag B depends on A and A depends on B. Publish fails pre-commit with a clear cycle message.
Startup recovery: seed ScriptedAlarmState with one acked+confirmed alarm + one shelved alarm + one clean alarm, restart, verify operator does NOT see ack prompts for the first two, shelving remains in effect, clean alarm is clear.
Ack audit: acknowledge an alarm; IAuditLogger captures user / timestamp / comment / prior state; row persists through restart.
Historian queue durability: take Galaxy.Host offline, fire 10 alarm transitions, bring Galaxy.Host back; queue drains all 10 in order.
Per-alarm historian toggle: Galaxy-native alarm with HistorizeToAveva=false does NOT enqueue; scripted alarm with HistorizeToAveva=true DOES enqueue.
Script timeout: infinite-loop script times out at 250ms; tag quality BadInternalError; other tags unaffected.
Log isolation: ctx.Logger.Error("test") lands in scripts-*.log with structured property ScriptName=<name>; main opcua-*.log gets a WARN companion entry.
ACL binding: virtual tag under an Equipment scope inherits the Equipment's grants. User without the Equipment grant reads the virtual tag and gets BadUserAccessDenied.
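
The startup-recovery check above combines two sources of truth: operator-supplied state loaded from `ScriptedAlarmState`, and `ActiveState` recomputed from current tag values. A minimal Python sketch of that merge follows; the dataclass and function names are hypothetical, the shelving state is modelled as a plain string, and the defaults chosen for an alarm with no persisted row are an assumption.

```python
from dataclasses import dataclass, asdict


@dataclass
class PersistedState:
    """Operator-supplied state that survives a restart (defaults are assumptions)."""
    enabled: bool = True
    acked: bool = True
    confirmed: bool = True
    shelving: str = "Unshelved"


@dataclass
class RuntimeState(PersistedState):
    """Persisted state plus the active flag, which is never stored."""
    active: bool = False


def recover(predicates, persisted, tags):
    """Rebuild runtime state for every configured alarm.

    predicates: alarm id -> callable(tags) -> bool
    persisted:  alarm id -> PersistedState loaded from the state table
    tags:       current tag values
    """
    states = {}
    for alarm_id, predicate in predicates.items():
        saved = persisted.get(alarm_id, PersistedState())
        # ActiveState comes only from re-running the predicate.
        states[alarm_id] = RuntimeState(**asdict(saved), active=bool(predicate(tags)))
    return states
```

An alarm that was acknowledged and is still in its alarm condition comes back active and acked, so the operator is not prompted again, while a shelved alarm keeps its shelving state regardless of what the predicate returns.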

Decisions Resolved in Plan Review

Every open question from the initial draft was resolved in the 2026-04-20 plan review — see decisions #18–#22 in the decisions table above. No pending questions block Stream A.

References

docs/v2/plan.md §6 Migration Strategy — add Phase 7 as the final additive phase before v2 release readiness.
docs/v2/implementation/overview.md — phase gate conventions.
docs/v2/implementation/phase-6-2-authorization-runtime.md — IAuditLogger surface reused for Ack/Confirm/Shelve + script edits.
docs/v2/implementation/phase-6-4-admin-ui-completion.md — draft/publish flow, diff viewer, tab-plugin pattern reused.
docs/v2/implementation/phase-2-galaxy-out-of-process.md — Galaxy.Host IPC shape + shared-contract conventions reused for Stream D.
docs/v2/driver-specs.md §Alarm semantics — Part 9 fidelity requirements.
docs/v2/driver-stability.md — per-surface error handling, crash-loop breaker patterns Stream A.4 mirrors.
docs/v2/config-db-schema.md — add a Phase 7 §§ for VirtualTag, Script, ScriptedAlarm, ScriptedAlarmState.
