lmxopcua/tests/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms.Tests/ScriptedAlarmEngineTests.cs
Joseph Doherty df39809526 Phase 7 Stream C — Core.ScriptedAlarms project (Part 9 state machine + predicate engine + IAlarmSource adapter)
Ships the Part 9 alarm fidelity layer Phase 7 committed to in plan decision #5. Every scripted alarm gets a full OPC UA AlarmConditionType state machine — EnabledState, ActiveState, AckedState, ConfirmedState, ShelvingState — with persistent operator-supplied state across server restarts per Phase 7 plan decision #14. Runtime shape matches the Galaxy-native + AB CIP ALMD alarm sources: scripted alarms fan out through the existing IAlarmSource surface so Phase 6.1 AlarmTracker composition consumes them without per-source branching.

Part9StateMachine is a pure-functions module — no instance state, no I/O, no mutation. Every transition (ApplyPredicate, ApplyAcknowledge, ApplyConfirm, ApplyOneShotShelve, ApplyTimedShelve, ApplyUnshelve, ApplyEnable, ApplyDisable, ApplyAddComment, ApplyShelvingCheck) takes the current AlarmConditionState record plus the event and returns a fresh state + EmissionKind hint. Two structural invariants enforced: disabled alarms never transition ActiveState / AckedState / ConfirmedState; shelved alarms still advance state (so startup recovery reflects reality) but emit a Suppressed hint so subscribers do not see the transition. OneShot shelving expires on clear; Timed shelving expires via ApplyShelvingCheck against the UnshelveAtUtc timestamp. Comments are append-only — every acknowledge, confirm, shelve, unshelve, enable, disable, explicit add-comment, and auto-unshelve appends an AlarmComment record with user identity + timestamp + kind + text for the GxP / 21 CFR Part 11 audit surface.
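The two invariants above can be illustrated with a minimal sketch of a pure predicate transition. The types below (SketchState, SketchEmission, SketchStateMachine) are hypothetical, trimmed-down stand-ins and not the project's AlarmConditionState or EmissionKind. Resetting Acked on a new activation and reporting Suppressed on a shelved clear are assumptions made for the illustration.

```csharp
using System;

// Hypothetical stand-ins for AlarmConditionState / EmissionKind (illustration only).
public enum SketchEmission { None, Activated, Cleared, Suppressed }

public sealed record SketchState(bool Enabled, bool Active, bool Acked, bool OneShotShelved);

public static class SketchStateMachine
{
    // Pure function: no I/O, no mutation; returns a fresh state plus an emission hint.
    public static (SketchState State, SketchEmission Emission) ApplyPredicate(
        SketchState current, bool predicateTrue)
    {
        // Invariant 1: a disabled alarm never changes ActiveState.
        if (!current.Enabled) return (current, SketchEmission.None);

        // Re-evaluating to the same state is a no-op.
        if (current.Active == predicateTrue) return (current, SketchEmission.None);

        if (predicateTrue)
        {
            // Assumption: a new activation requires a fresh acknowledgement.
            var activated = current with { Active = true, Acked = false };
            // Invariant 2: shelved alarms still advance state but report Suppressed.
            return (activated, current.OneShotShelved
                ? SketchEmission.Suppressed
                : SketchEmission.Activated);
        }

        // Clearing: OneShot shelving expires on clear.
        var cleared = current with { Active = false, OneShotShelved = false };
        return (cleared, current.OneShotShelved
            ? SketchEmission.Suppressed   // assumption: the shelved clear is also hidden
            : SketchEmission.Cleared);
    }
}
```

Because the function only maps (state, event) to (state, hint), it can be unit tested without an engine, a store, or a clock.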

AlarmConditionState is the persistent record the store saves. Fields: AlarmId, Enabled, Active, Acked, Confirmed, Shelving (kind + UnshelveAtUtc), LastTransitionUtc, LastActiveUtc, LastClearedUtc, LastAckUtc + LastAckUser + LastAckComment, LastConfirmUtc + LastConfirmUser + LastConfirmComment, Comments. Fresh factory initializes everything to the no-event position.

IAlarmStateStore is the persistence abstraction: LoadAsync, LoadAllAsync, SaveAsync, RemoveAsync. Stream E wires this to a SQL-backed store with IAuditLogger hooks; tests use InMemoryAlarmStateStore. Startup recovery per Phase 7 plan decision #14: the engine's LoadAsync runs every configured alarm predicate against current tag values to rederive ActiveState, but EnabledState / AckedState / ConfirmedState / ShelvingState + audit history are loaded verbatim from the store so operators do not re-ack after an outage and shelved alarms stay shelved through maintenance windows.
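As a rough sketch of the store shape and the recovery rule described above: the record, class, and method signatures below are hypothetical simplifications, not the real IAlarmStateStore or InMemoryAlarmStateStore. Recover shows the rule only: Active comes from the current predicate result, everything else comes from the store.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, reduced persisted record (the real one carries many more fields).
public sealed record StoredAlarmSketch(string AlarmId, bool Active, bool Acked, string? LastAckUser);

public sealed class InMemoryStoreSketch
{
    private readonly ConcurrentDictionary<string, StoredAlarmSketch> _states = new();

    public Task<StoredAlarmSketch?> LoadAsync(string alarmId, CancellationToken ct) =>
        Task.FromResult<StoredAlarmSketch?>(_states.TryGetValue(alarmId, out var s) ? s : null);

    public Task<IReadOnlyList<StoredAlarmSketch>> LoadAllAsync(CancellationToken ct) =>
        Task.FromResult<IReadOnlyList<StoredAlarmSketch>>(_states.Values.ToList());

    public Task SaveAsync(StoredAlarmSketch state, CancellationToken ct)
    {
        _states[state.AlarmId] = state;
        return Task.CompletedTask;
    }

    public Task RemoveAsync(string alarmId, CancellationToken ct)
    {
        _states.TryRemove(alarmId, out _);
        return Task.CompletedTask;
    }
}

public static class RecoverySketch
{
    // Operator-supplied state is kept verbatim; only Active is rederived.
    public static StoredAlarmSketch Recover(StoredAlarmSketch? persisted, string alarmId, bool predicateNow)
    {
        var baseline = persisted ?? new StoredAlarmSketch(alarmId, false, false, null);
        return baseline with { Active = predicateNow };
    }
}
```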

MessageTemplate implements Phase 7 plan decision #13 — static-with-substitution. {TagPath} tokens resolved at event emission time from the engine value cache. Missing paths, non-Good quality, or null values all resolve to {?} so the event still fires but the operator sees where the reference broke. ExtractTokenPaths enumerates tokens at publish time so the engine knows to subscribe to every template-referenced tag in addition to predicate-referenced tags.
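A minimal sketch of static-with-substitution, assuming tokens are any run of characters between braces and that the value cache can be modelled as a dictionary of (value, quality) pairs. TemplateSketch and its signatures are hypothetical; the real MessageTemplate API may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TemplateSketch
{
    // Assumed token grammar: {anything-without-braces}, whitespace trimmed.
    private static readonly Regex Token = new(@"\{([^{}]+)\}", RegexOptions.Compiled);

    public static string Resolve(
        string? template,
        IReadOnlyDictionary<string, (object? Value, bool IsGood)> cache)
    {
        if (string.IsNullOrEmpty(template)) return string.Empty;

        return Token.Replace(template, match =>
        {
            var path = match.Groups[1].Value.Trim();
            // Missing path, non-Good quality, or null value all degrade to {?}.
            if (cache.TryGetValue(path, out var snapshot) && snapshot.IsGood && snapshot.Value is not null)
                return snapshot.Value.ToString() ?? "{?}";
            return "{?}";
        });
    }

    // Distinct token paths, so a caller knows which tags to subscribe to.
    public static IReadOnlyList<string> ExtractTokenPaths(string? template)
    {
        if (string.IsNullOrEmpty(template)) return Array.Empty<string>();
        return Token.Matches(template)
            .Select(m => m.Groups[1].Value.Trim())
            .Distinct()
            .ToList();
    }
}
```

With Temp cached as a Good 150 and Limit cached with bad quality, resolving "Temp {Temp}C exceeded limit {Limit}C" under this sketch yields "Temp 150C exceeded limit {?}C".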

AlarmPredicateContext is the ScriptContext subclass alarm scripts see. GetTag reads from the engine shared cache; SetVirtualTag is explicitly rejected at runtime with a pointed error message — alarm predicates must be pure so their output does not couple to virtual-tag state in ways that become impossible to reason about. If cross-tag side effects are needed, the operator authors a virtual tag and the alarm predicate reads it.

ScriptedAlarmEngine orchestrates. LoadAsync compiles every predicate through Stream A ScriptSandbox + ForbiddenTypeAnalyzer, runs DependencyExtractor to find the read set, adds template token paths to the input set, reports every compile failure as one aggregated InvalidOperationException (not one-at-a-time), subscribes to each unique referenced upstream path, seeds the value cache, loads persisted state for each alarm (falling back to Fresh for first-load), re-evaluates the predicate, and saves the recovered state.

ChangeTrigger: when an upstream tag changes, the engine looks up every alarm referencing that path in a per-path inverse index and enqueues all of them for re-evaluation via a SemaphoreSlim-gated path. Unlike the virtual-tag engine, scripted alarms are leaves in the evaluation DAG (no alarm drives another alarm), so no topological sort is needed.

Operator actions (AcknowledgeAsync, ConfirmAsync, OneShotShelveAsync, TimedShelveAsync, UnshelveAsync, EnableAsync, DisableAsync, AddCommentAsync) route through the state machine, persist, and emit if there is an emission. A 5-second shelving-check timer auto-expires Timed shelving and emits Unshelved events once UnshelveAtUtc has passed.

Predicate evaluation errors (script throws, timeout, compile-time reads bad tag) leave the state unchanged: the engine does NOT invent a clear transition on predicate failure. Such errors are logged at Error level in scripts-*.log, with a companion WARN in the main log.
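The per-path inverse index mentioned above could look roughly like the following. InverseIndexSketch is a hypothetical illustration, not the engine's actual data structure: each upstream path maps to the set of alarm ids whose predicate or message template reads it.

```csharp
using System;
using System.Collections.Generic;

public sealed class InverseIndexSketch
{
    // upstream tag path -> ids of alarms that read it
    private readonly Dictionary<string, HashSet<string>> _alarmsByPath = new(StringComparer.Ordinal);

    // Called once per alarm at load time with its predicate + template input paths.
    public void Register(string alarmId, IEnumerable<string> inputPaths)
    {
        foreach (var path in inputPaths)
        {
            if (!_alarmsByPath.TryGetValue(path, out var ids))
            {
                ids = new HashSet<string>(StringComparer.Ordinal);
                _alarmsByPath[path] = ids;
            }
            ids.Add(alarmId);
        }
    }

    // Called on an upstream change: which alarms need re-evaluation?
    public IReadOnlyCollection<string> AlarmsFor(string path)
    {
        if (_alarmsByPath.TryGetValue(path, out var ids)) return ids;
        return Array.Empty<string>();
    }

    // One upstream subscription per unique path, regardless of how many alarms read it.
    public int UniquePathCount => _alarmsByPath.Count;
}
```

Since alarms are leaves, a change to one path only requires re-evaluating the alarms returned by AlarmsFor; there is no downstream propagation to order.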

ScriptedAlarmSource implements IAlarmSource. SubscribeAlarmsAsync filter is a set of equipment-path prefixes; empty means all. AcknowledgeAsync from the base interface routes to the engine with user identity "opcua-client" — Stream G will replace this with the authenticated principal from the OPC UA dispatch layer. The adapter implements only the base IAlarmSource methods; richer Part 9 methods (Confirm, Shelve, Unshelve, AddComment) remain on the engine and will bind to OPC UA method nodes in Stream G.
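The subscription filter semantics (a set of equipment-path prefixes, empty meaning all) can be sketched as below. FilterSketch is hypothetical, and the ordinal, case-sensitive comparison is an assumption; the actual adapter may normalise paths differently.

```csharp
using System;
using System.Collections.Generic;

public static class FilterSketch
{
    // Returns true when the event's equipment path should reach the subscriber.
    public static bool Matches(IReadOnlyCollection<string> prefixes, string equipmentPath)
    {
        // Empty filter means "all alarms".
        if (prefixes.Count == 0) return true;

        foreach (var prefix in prefixes)
        {
            // Assumption: plain ordinal prefix match on the path string.
            if (equipmentPath.StartsWith(prefix, StringComparison.Ordinal)) return true;
        }
        return false;
    }
}
```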

47 unit tests across 5 files.

Part9StateMachineTests (16): every transition + noop edge cases: predicate true/false, same-state noop, disabled ignores predicate, acknowledge records user/comment/adds audit, idempotent acknowledge, reject no-user ack, full activate-ack-clear-confirm walk, one-shot shelve suppresses next activation, one-shot expires on clear, timed shelve requires future unshelve time, timed shelve expires via shelving-check, explicit unshelve emits, add-comment appends to audit, comments append-only through multiple operations, full lifecycle walk emits every expected EmissionKind.

MessageTemplateTests (11): no-token passthrough, single+multiple token substitution, bad quality becomes {?}, unknown path becomes {?}, null value becomes {?}, tokens with slashes+dots, empty + null template, ExtractTokenPaths returns every distinct path, whitespace inside tokens trimmed.

ScriptedAlarmEngineTests (13): load compiles+subscribes, compile failures aggregated, upstream change emits Activated, clearing emits Cleared, message template resolves at emission, ack persists to store, startup recovery preserves ack but rederives active, shelved activation state-advances but suppresses emission, runtime exception isolates to owning alarm, disable prevents activation until re-enable, AddComment appends audit without state change, SetVirtualTag from predicate rejected (state unchanged), Dispose releases upstream subscriptions.

ScriptedAlarmSourceTests (5): empty filter matches all, equipment-prefix filter, Unsubscribe stops events, AcknowledgeAsync routes with default user, null arguments rejected.

FakeUpstream fixture gives tests an in-memory driver mock with subscription count tracking.

Full Phase 7 test count after Stream C: 146 green (63 Scripting + 36 VirtualTags + 47 ScriptedAlarms). Stream D (historian alarm sink with SQLite store-and-forward + Galaxy.Host IPC) consumes ScriptedAlarmEvent + similar Galaxy / AB CIP emissions to produce the unified alarm timeline. Stream G wires the OPC UA method calls and AlarmSource into DriverNodeManager dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:49:48 -04:00


using Serilog;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Scripting;
using ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;

namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms.Tests;

/// <summary>
/// End-to-end engine tests: load, predicate evaluation, change-triggered
/// re-evaluation, state persistence, startup recovery, error isolation.
/// </summary>
[Trait("Category", "Unit")]
public sealed class ScriptedAlarmEngineTests
{
    private static ScriptedAlarmEngine Build(FakeUpstream up, out IAlarmStateStore store)
    {
        store = new InMemoryAlarmStateStore();
        var logger = new LoggerConfiguration().CreateLogger();
        return new ScriptedAlarmEngine(up, store, new ScriptLoggerFactory(logger), logger);
    }

    private static ScriptedAlarmDefinition Alarm(string id, string predicate,
        string msg = "condition", AlarmSeverity sev = AlarmSeverity.High) =>
        new(AlarmId: id,
            EquipmentPath: "Plant/Line1/Reactor",
            AlarmName: id,
            Kind: AlarmKind.AlarmCondition,
            Severity: sev,
            MessageTemplate: msg,
            PredicateScriptSource: predicate);

    [Fact]
    public async Task Load_compiles_and_subscribes_to_referenced_upstreams()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 50);
        using var eng = Build(up, out _);

        await eng.LoadAsync([Alarm("a1", """return (int)ctx.GetTag("Temp").Value > 100;""")],
            TestContext.Current.CancellationToken);

        eng.LoadedAlarmIds.ShouldContain("a1");
        up.ActiveSubscriptionCount.ShouldBe(1);
    }

    [Fact]
    public async Task Compile_failures_aggregated_into_one_error()
    {
        var up = new FakeUpstream();
        using var eng = Build(up, out _);

        var ex = await Should.ThrowAsync<InvalidOperationException>(async () =>
            await eng.LoadAsync([
                Alarm("bad1", "return unknownIdentifier;"),
                Alarm("good", "return true;"),
                Alarm("bad2", "var x = alsoUnknown; return x;"),
            ], TestContext.Current.CancellationToken));

        ex.Message.ShouldContain("2 alarm(s) did not compile");
    }

    [Fact]
    public async Task Upstream_change_re_evaluates_predicate_and_emits_Activated()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 50);
        using var eng = Build(up, out _);
        await eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
            TestContext.Current.CancellationToken);

        var events = new List<ScriptedAlarmEvent>();
        eng.OnEvent += (_, e) => events.Add(e);

        up.Push("Temp", 150);
        await WaitForAsync(() => events.Count > 0);

        events[0].AlarmId.ShouldBe("HighTemp");
        events[0].Emission.ShouldBe(EmissionKind.Activated);
        eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Active);
    }

    [Fact]
    public async Task Clearing_upstream_emits_Cleared_event()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 150);
        using var eng = Build(up, out _);
        await eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
            TestContext.Current.CancellationToken);

        // Startup sees 150 → active.
        eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Active);

        var events = new List<ScriptedAlarmEvent>();
        eng.OnEvent += (_, e) => events.Add(e);

        up.Push("Temp", 50);
        await WaitForAsync(() => events.Any(e => e.Emission == EmissionKind.Cleared));

        eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Inactive);
    }

    [Fact]
    public async Task Message_template_resolves_tag_values_at_emission()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 50);
        up.Set("Limit", 100);
        using var eng = Build(up, out _);
        await eng.LoadAsync([
            new ScriptedAlarmDefinition(
                "HighTemp", "Plant/Line1", "HighTemp",
                AlarmKind.LimitAlarm, AlarmSeverity.High,
                "Temp {Temp}C exceeded limit {Limit}C",
                """return (int)ctx.GetTag("Temp").Value > (int)ctx.GetTag("Limit").Value;"""),
        ], TestContext.Current.CancellationToken);

        var events = new List<ScriptedAlarmEvent>();
        eng.OnEvent += (_, e) => events.Add(e);

        up.Push("Temp", 150);
        await WaitForAsync(() => events.Any());

        events[0].Message.ShouldBe("Temp 150C exceeded limit 100C");
    }

    [Fact]
    public async Task Ack_records_user_and_persists_to_store()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 150);
        using var eng = Build(up, out var store);
        await eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
            TestContext.Current.CancellationToken);

        await eng.AcknowledgeAsync("HighTemp", "alice", "checking", TestContext.Current.CancellationToken);

        var persisted = await store.LoadAsync("HighTemp", TestContext.Current.CancellationToken);
        persisted.ShouldNotBeNull();
        persisted!.Acked.ShouldBe(AlarmAckedState.Acknowledged);
        persisted.LastAckUser.ShouldBe("alice");
        persisted.LastAckComment.ShouldBe("checking");
        persisted.Comments.Any(c => c.Kind == "Acknowledge" && c.User == "alice").ShouldBeTrue();
    }

    [Fact]
    public async Task Startup_recovery_preserves_ack_but_rederives_active_from_predicate()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 50); // predicate will go false on second load

        // First run: alarm goes active + operator acks.
        using (var eng1 = Build(up, out var sharedStore))
        {
            up.Set("Temp", 150);
            await eng1.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
                TestContext.Current.CancellationToken);
            eng1.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Active);

            await eng1.AcknowledgeAsync("HighTemp", "alice", null, TestContext.Current.CancellationToken);
            eng1.GetState("HighTemp")!.Acked.ShouldBe(AlarmAckedState.Acknowledged);
        }

        // Simulate restart: temp is back to 50 (below threshold).
        up.Set("Temp", 50);
        var logger = new LoggerConfiguration().CreateLogger();
        var store2 = new InMemoryAlarmStateStore();

        // Seed store2 with the acked state from before restart.
        await store2.SaveAsync(new AlarmConditionState(
            "HighTemp",
            AlarmEnabledState.Enabled,
            AlarmActiveState.Active,      // was active pre-restart
            AlarmAckedState.Acknowledged, // ack persisted
            AlarmConfirmedState.Unconfirmed,
            ShelvingState.Unshelved,
            DateTime.UtcNow,
            DateTime.UtcNow, null,
            DateTime.UtcNow, "alice", null,
            null, null, null,
            [new AlarmComment(DateTime.UtcNow, "alice", "Acknowledge", "")]),
            TestContext.Current.CancellationToken);

        using var eng2 = new ScriptedAlarmEngine(up, store2, new ScriptLoggerFactory(logger), logger);
        await eng2.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
            TestContext.Current.CancellationToken);

        var s = eng2.GetState("HighTemp")!;
        s.Active.ShouldBe(AlarmActiveState.Inactive, "Active recomputed from current tag value");
        s.Acked.ShouldBe(AlarmAckedState.Acknowledged, "Ack persisted across restart");
        s.LastAckUser.ShouldBe("alice");
    }

    [Fact]
    public async Task Shelved_active_transitions_state_but_suppresses_emission()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 50);
        using var eng = Build(up, out _);
        await eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
            TestContext.Current.CancellationToken);

        await eng.OneShotShelveAsync("HighTemp", "alice", TestContext.Current.CancellationToken);

        var events = new List<ScriptedAlarmEvent>();
        eng.OnEvent += (_, e) => events.Add(e);

        up.Push("Temp", 150);
        await Task.Delay(200);

        events.Any(e => e.Emission == EmissionKind.Activated).ShouldBeFalse(
            "OneShot shelve suppresses activation emission");
        eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Active,
            "state still advances so startup recovery is consistent");
    }

    [Fact]
    public async Task Predicate_runtime_exception_does_not_transition_state()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 150);
        using var eng = Build(up, out _);
        await eng.LoadAsync([
            Alarm("BadScript", """throw new InvalidOperationException("boom");"""),
            Alarm("GoodScript", """return (int)ctx.GetTag("Temp").Value > 100;"""),
        ], TestContext.Current.CancellationToken);

        // Bad script doesn't activate + doesn't disable other alarms.
        eng.GetState("BadScript")!.Active.ShouldBe(AlarmActiveState.Inactive);
        eng.GetState("GoodScript")!.Active.ShouldBe(AlarmActiveState.Active);
    }

    [Fact]
    public async Task Disable_prevents_activation_until_re_enabled()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 50);
        using var eng = Build(up, out _);
        await eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
            TestContext.Current.CancellationToken);

        await eng.DisableAsync("HighTemp", "alice", TestContext.Current.CancellationToken);
        up.Push("Temp", 150);
        await Task.Delay(100);
        eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Inactive,
            "disabled alarm ignores predicate");

        await eng.EnableAsync("HighTemp", "alice", TestContext.Current.CancellationToken);
        up.Push("Temp", 160);
        await WaitForAsync(() => eng.GetState("HighTemp")!.Active == AlarmActiveState.Active);
    }

    [Fact]
    public async Task AddComment_appends_to_audit_without_state_change()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 50);
        using var eng = Build(up, out var store);
        await eng.LoadAsync([Alarm("A", """return false;""")], TestContext.Current.CancellationToken);

        await eng.AddCommentAsync("A", "alice", "peeking at this", TestContext.Current.CancellationToken);

        var s = await store.LoadAsync("A", TestContext.Current.CancellationToken);
        s.ShouldNotBeNull();
        s!.Comments.Count.ShouldBe(1);
        s.Comments[0].User.ShouldBe("alice");
        s.Comments[0].Kind.ShouldBe("AddComment");
    }

    [Fact]
    public async Task Predicate_scripts_cannot_SetVirtualTag()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 100);
        using var eng = Build(up, out _);

        // The script compiles fine but throws at runtime when SetVirtualTag is called.
        // The engine swallows the exception + leaves state unchanged.
        await eng.LoadAsync([
            new ScriptedAlarmDefinition(
                "Bad", "Plant/Line1", "Bad",
                AlarmKind.AlarmCondition, AlarmSeverity.High, "bad",
                """
                ctx.SetVirtualTag("NotAllowed", 1);
                return true;
                """),
        ], TestContext.Current.CancellationToken);

        // Bad alarm's predicate threw, so state is unchanged.
        eng.GetState("Bad")!.Active.ShouldBe(AlarmActiveState.Inactive);
    }

    [Fact]
    public async Task Dispose_releases_upstream_subscriptions()
    {
        var up = new FakeUpstream();
        up.Set("Temp", 50);
        var eng = Build(up, out _);
        await eng.LoadAsync([Alarm("A", """return (int)ctx.GetTag("Temp").Value > 100;""")],
            TestContext.Current.CancellationToken);
        up.ActiveSubscriptionCount.ShouldBe(1);

        eng.Dispose();
        up.ActiveSubscriptionCount.ShouldBe(0);
    }

    // Polls cond every 25 ms until it returns true or timeoutMs elapses.
    private static async Task WaitForAsync(Func<bool> cond, int timeoutMs = 2000)
    {
        var deadline = DateTime.UtcNow.AddMilliseconds(timeoutMs);
        while (DateTime.UtcNow < deadline)
        {
            if (cond()) return;
            await Task.Delay(25);
        }
        throw new TimeoutException("Condition did not become true in time");
    }
}