diff --git a/docs/plans/2026-06-14-galaxy-phase-b-native-alarms-plan.md b/docs/plans/2026-06-14-galaxy-phase-b-native-alarms-plan.md new file mode 100644 index 00000000..2e6d299e --- /dev/null +++ b/docs/plans/2026-06-14-galaxy-phase-b-native-alarms-plan.md @@ -0,0 +1,622 @@ +# Galaxy Phase B — Native Alarms on the Equipment-Tag Path — Implementation Plan + +> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task. + +**Goal:** A Galaxy equipment `Tag` marked as a native alarm (via its `TagConfig` JSON) materializes a real OPC UA Part 9 `AlarmConditionState` under its equipment folder, and the driver's live `IAlarmSource.OnAlarmEvent` transitions drive that condition (active/severity/message/ack) and fan out to the `alerts` topic — mirroring the scripted-alarm seam. **No EF/schema migration.** + +**Architecture:** Three reused layers + one new seam. (1) Alarm intent rides in the schemaless `TagConfig` blob, parsed byte-parity in `Phase7Composer` + `DeploymentArtifact` (`EquipmentTagPlan.Alarm`). (2) `Phase7Applier.MaterialiseEquipmentTags` branches: alarm tag → the existing `MaterialiseAlarmCondition` (reused verbatim), else the existing `EnsureVariable`. (3) A new driver→server alarm seam: `DriverInstanceActor` subscribes `IAlarmSource.OnAlarmEvent` (mirroring its `OnDataChange` subscription) and publishes `AttributeAlarmPublished` to `DriverHostActor`, which projects each transition into the existing `AlarmConditionSnapshot` (`NativeAlarmProjector`) and Tells the unchanged `OpcUaPublishActor.AlarmStateUpdate` → `OtOpcUaNodeManager.WriteAlarmCondition`, plus a Primary-gated `AlarmTransitionEvent` to `alerts`. A small additive contract change adds the transition `Kind` to `AlarmEventArgs` (the driver already has it; the record's surviving consumers compile via a default). 
+ +**Tech Stack:** C#/.NET 10, Akka.NET (fused-host actors, Akka.TestKit.Xunit2 — **xUnit v2**, use `CancellationToken.None` not `TestContext.Current`), xUnit + Shouldly, OPC Foundation UA stack. No bUnit — Razor/live paths proven only by the user-driven docker-dev `/run` gate. + +**Design:** `docs/plans/2026-06-14-galaxy-phase-b-native-alarms-design.md` (master `90096e9c`). + +**Hard rules (every task):** stage by path, never `git add .`; never stage `sql_login.txt`, `src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/`, `pending.md`, `current.md`, `docker-dev/docker-compose.yml`; never echo secrets; no force-push, no `--no-verify`; **NO Configuration entity / EF migration change**; no bUnit. Commit per task by path. + +--- + +### Task 0: Feature branch + +**Classification:** trivial +**Estimated implement time:** ~1 min +**Parallelizable with:** none + +**Step 1:** From master at `90096e9c`: +```bash +cd /Users/dohertj2/Desktop/OtOpcUa +git checkout master && git rev-parse --short HEAD # expect 90096e9c +git checkout -b feat/galaxy-phase-b-native-alarms +``` +No code change. Do NOT touch the working-tree `docker-dev/docker-compose.yml` or `pending.md`. 
+ +--- + +### Task 1: Transition-kind contract (`AlarmEventArgs.Kind`) + Galaxy populates it — WS-1 + +**Classification:** standard +**Estimated implement time:** ~4 min +**Parallelizable with:** Task 2 + +**Files:** +- Modify: `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAlarmSource.cs` (add enum `AlarmTransitionKind`; add trailing `Kind` param to `AlarmEventArgs`) +- Modify: `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/GalaxyDriver.cs:~1128-1167` (`OnAlarmFeedTransition` maps `transition.TransitionKind` → `Kind`) +- Test: `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/` (new `AlarmEventArgsTests.cs`) +- Test: `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/` (extend the existing alarm-feed/transition test, or add `GalaxyAlarmTransitionKindTests.cs`) + +**Context:** `GalaxyAlarmTransition` (`Driver.Galaxy/Runtime/GalaxyAlarmTransition.cs`) carries `GalaxyAlarmTransitionKind {Unspecified=0, Raise=1, Acknowledge=2, Clear=3, Retrigger=4}` but `OnAlarmFeedTransition` drops it when building `AlarmEventArgs`. Other `IAlarmSource` implementers (FOCAS/OpcUaClient/AbCip/ScriptedAlarmSource) construct `AlarmEventArgs` without a kind — a record default keeps them compiling untouched. + +**Step 1 (failing test — contract):** In `AlarmEventArgsTests.cs`: +```csharp +using Shouldly; +using ZB.MOM.WW.OtOpcUa.Core.Abstractions; +using Xunit; + +namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests; + +public class AlarmEventArgsTests +{ + private static AlarmEventArgs Make(AlarmTransitionKind? kind = null) => + kind is null + ? 
new AlarmEventArgs(new FakeHandle(), "Tank1.Hi", "c1", "LimitAlarm.Hi", "msg", AlarmSeverity.High, DateTime.UnixEpoch) + : new AlarmEventArgs(new FakeHandle(), "Tank1.Hi", "c1", "LimitAlarm.Hi", "msg", AlarmSeverity.High, DateTime.UnixEpoch, Kind: kind.Value); + + [Fact] + public void Kind_defaults_to_Unspecified_so_existing_callers_compile() + => Make().Kind.ShouldBe(AlarmTransitionKind.Unspecified); + + [Fact] + public void Kind_round_trips_when_supplied() + => Make(AlarmTransitionKind.Raise).Kind.ShouldBe(AlarmTransitionKind.Raise); + + private sealed class FakeHandle : IAlarmSubscriptionHandle { public string DiagnosticId => "t"; } +} +``` +Run: `dotnet test tests/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests --filter AlarmEventArgsTests` → FAIL (no `Kind`, no `AlarmTransitionKind`). + +**Step 2 (implement contract):** In `IAlarmSource.cs`, add the enum (next to `AlarmSeverity`) and the `Kind` param as the **last** parameter of `AlarmEventArgs` (after `AlarmCategory`): +```csharp +/// <summary>The kind of alarm state change. Mirrors the driver-internal transition kinds so a +/// consumer can derive Part 9 active/ack state without a separate value subscription. Defaults to +/// <see cref="Unspecified"/> so existing implementers compile unchanged.</summary> +public enum AlarmTransitionKind { Unspecified = 0, Raise, Acknowledge, Clear, Retrigger } +``` +```csharp +// …existing params… + string? AlarmCategory = null, + AlarmTransitionKind Kind = AlarmTransitionKind.Unspecified); +``` +Add a `/// <param name="Kind">…</param>` doc line to the record's XML doc (the project sets `TreatWarningsAsErrors` — a missing `<param>` for a documented record is a build error). + +**Step 3 (failing test — Galaxy mapping):** Confirm how the existing Galaxy tests feed a `GalaxyAlarmTransition` onto `OnAlarmEvent` (grep `OnAlarmFeedTransition`/`OnAlarmEvent` in `tests/Drivers/.../Driver.Galaxy.Tests`). Add a test that, for each `GalaxyAlarmTransitionKind`, the surfaced `AlarmEventArgs.Kind` equals the matching `AlarmTransitionKind`. 
If the test harness can't reach the private `OnAlarmFeedTransition`, extract a tiny `internal static AlarmTransitionKind MapKind(GalaxyAlarmTransitionKind)` and test that directly (mark `Driver.Galaxy` `InternalsVisibleTo` the test project if not already — grep for it first). +Run → FAIL. + +**Step 4 (implement Galaxy mapping):** In `GalaxyDriver.OnAlarmFeedTransition`, add `Kind:` to the `new AlarmEventArgs(...)`: +```csharp +Kind: transition.TransitionKind switch +{ + GalaxyAlarmTransitionKind.Raise => AlarmTransitionKind.Raise, + GalaxyAlarmTransitionKind.Acknowledge => AlarmTransitionKind.Acknowledge, + GalaxyAlarmTransitionKind.Clear => AlarmTransitionKind.Clear, + GalaxyAlarmTransitionKind.Retrigger => AlarmTransitionKind.Retrigger, + _ => AlarmTransitionKind.Unspecified, +}); +``` + +**Step 5:** `dotnet build src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions` then run both test filters → PASS. Build the full solution (`dotnet build ZB.MOM.WW.OtOpcUa.slnx`) to confirm no other `IAlarmSource` implementer broke. 
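
The per-kind assertion from Step 3 can be sketched as a self-contained check. This assumes the optional `MapKind` helper is extracted as Step 3 suggests. The two enums below are local stand-ins copying the member lists quoted in this task, and `KindMappingSketch` is a hypothetical name, not an existing type in the repository.

```csharp
using System;

// Local stand-ins that copy the member lists quoted in Task 1 (not the real types).
public enum GalaxyAlarmTransitionKind { Unspecified = 0, Raise = 1, Acknowledge = 2, Clear = 3, Retrigger = 4 }
public enum AlarmTransitionKind { Unspecified = 0, Raise, Acknowledge, Clear, Retrigger }

public static class KindMappingSketch
{
    // Same switch as Step 4, extracted so it can be tested without reaching the private handler.
    public static AlarmTransitionKind MapKind(GalaxyAlarmTransitionKind kind) => kind switch
    {
        GalaxyAlarmTransitionKind.Raise => AlarmTransitionKind.Raise,
        GalaxyAlarmTransitionKind.Acknowledge => AlarmTransitionKind.Acknowledge,
        GalaxyAlarmTransitionKind.Clear => AlarmTransitionKind.Clear,
        GalaxyAlarmTransitionKind.Retrigger => AlarmTransitionKind.Retrigger,
        _ => AlarmTransitionKind.Unspecified,
    };

    // True when every driver-side kind maps to the contract kind with the same name.
    public static bool AllKindsMapByName()
    {
        foreach (var g in Enum.GetValues<GalaxyAlarmTransitionKind>())
        {
            if (MapKind(g).ToString() != g.ToString()) return false;
        }
        return true;
    }
}
```

Comparing by name rather than by numeric value keeps the check meaningful if either enum is later renumbered. In the real test project the same loop would be a `[Theory]` over the enum values.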
+ +**Step 6 (commit):** +```bash +git add src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAlarmSource.cs \ + src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/GalaxyDriver.cs \ + tests/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/AlarmEventArgsTests.cs \ + tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/ +git commit -m "feat(alarms): carry transition Kind on AlarmEventArgs; Galaxy populates it (Phase B WS-1)" +``` + +--- + +### Task 2: Alarm intent in TagConfig → `EquipmentTagPlan.Alarm` (byte-parity) — WS-2 + +**Classification:** high-risk +**Estimated implement time:** ~5 min +**Parallelizable with:** Task 1 + +**Files:** +- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs` (`EquipmentTagPlan` record `:80-88`; new `EquipmentTagAlarmInfo`; `Select(...)` `:323-331`; new `ExtractTagAlarm` next to `ExtractTagFullName` `:432`) +- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DeploymentArtifact.cs` (`BuildEquipmentTagPlans` `:440-448`; new `ExtractTagAlarm` mirror next to `ExtractTagFullName` `:637`) +- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/` (new `ExtractTagAlarmTests.cs`; extend the existing composer↔artifact parity test) + +**Context — the byte-parity invariant:** `Phase7Composer` (compose from DB rows) and `DeploymentArtifact.BuildEquipmentTagPlans` (decode the deployment artifact JSON) MUST produce identical `EquipmentTagPlan`s for the same input — there is an existing parity test (grep `EquipmentTagPlan` in `OpcUaServer.Tests` for the parity fixture). The `alarm` object rides in the same `TagConfig` blob `FullName` already does, so both sides parse it the same way. Absent/malformed `alarm` ⇒ `null` (plain variable). TagConfig shape: +```json +{ "FullName": "TestMachine_002.HiAlarm", "alarm": { "alarmType": "OffNormalAlarm", "severity": 700 } } +``` +`alarmType` default `"AlarmCondition"`, `severity` default `500` (mirrors `ScriptedAlarm` defaults). 
Valid `alarmType` strings are whatever `OtOpcUaNodeManager.CreateAlarmConditionOfType` accepts (`OffNormalAlarm`/`DiscreteAlarm`/`LimitAlarm`/`AlarmCondition`); an unknown string falls back to the base type downstream — do not validate here. + +**Step 1 (failing test — `ExtractTagAlarm`):** In `ExtractTagAlarmTests.cs` (test the composer's private helper via `[InternalsVisibleTo]` if present, else assert through `ComposeAsync`/the plan; prefer a thin `internal static` so both helpers can be unit-tested): +```csharp +[Theory] +[InlineData("{\"FullName\":\"X.Y\"}", false, null, 0)] // no alarm ⇒ null +[InlineData("{\"FullName\":\"X.Y\",\"alarm\":{}}", true, "AlarmCondition", 500)] // defaults +[InlineData("{\"FullName\":\"X.Y\",\"alarm\":{\"alarmType\":\"OffNormalAlarm\",\"severity\":700}}", true, "OffNormalAlarm", 700)] +[InlineData("not json", false, null, 0)] // malformed ⇒ null +[InlineData("{\"FullName\":\"X.Y\",\"alarm\":\"oops\"}", false, null, 0)] // wrong kind ⇒ null +public void ExtractTagAlarm_parses_or_returns_null(string cfg, bool present, string? type, int sev) +{ + var info = Phase7Composer.ExtractTagAlarm(cfg); // make it internal static + if (!present) { info.ShouldBeNull(); return; } + info!.AlarmType.ShouldBe(type); + info.Severity.ShouldBe(sev); +} +``` +Run → FAIL. + +**Step 2 (implement `EquipmentTagAlarmInfo` + `EquipmentTagPlan.Alarm`):** In `Phase7Composer.cs`, next to `EquipmentTagPlan`: +```csharp +/// <summary>Native-alarm intent parsed from an equipment tag's TagConfig.alarm object. Null ⇒ +/// the tag is a plain value variable. <paramref name="AlarmType"/> is an OPC UA Part 9 subtype string +/// (OffNormalAlarm/DiscreteAlarm/LimitAlarm/AlarmCondition); <paramref name="Severity"/> is the 1..1000 scale.</summary> +public sealed record EquipmentTagAlarmInfo(string AlarmType, int Severity); +``` +Add `EquipmentTagAlarmInfo? Alarm` as the **last** field of `EquipmentTagPlan` (after `Writable`). Update the record's XML doc with a `<param name="Alarm">` sentence. 
+ +**Step 3 (implement `ExtractTagAlarm` — composer):** Next to `ExtractTagFullName`: +```csharp +/// <summary>Parses the optional alarm object from a tag's TagConfig JSON. Returns null +/// when absent, non-object, or non-JSON (the tag is then a plain variable). Never throws. The +/// artifact-decode side (DeploymentArtifact.ExtractTagAlarm) MUST parse identically (byte-parity).</summary> +internal static EquipmentTagAlarmInfo? ExtractTagAlarm(string? tagConfig) +{ + if (string.IsNullOrWhiteSpace(tagConfig)) return null; + try + { + using var doc = JsonDocument.Parse(tagConfig); + if (doc.RootElement.ValueKind != JsonValueKind.Object) return null; + if (!doc.RootElement.TryGetProperty("alarm", out var a) || a.ValueKind != JsonValueKind.Object) return null; + var type = a.TryGetProperty("alarmType", out var tEl) && tEl.ValueKind == JsonValueKind.String + ? (tEl.GetString() ?? "AlarmCondition") : "AlarmCondition"; + // TryGetInt32 (not GetInt32): a fractional or out-of-range number falls back to 500 instead of throwing FormatException. + var sev = a.TryGetProperty("severity", out var sEl) && sEl.ValueKind == JsonValueKind.Number + && sEl.TryGetInt32(out var n) ? n : 500; + return new EquipmentTagAlarmInfo(type, sev); + } + catch (JsonException) { return null; } +} +``` +Wire it into the `Select(...)` at `:331` (add after `Writable:`): +```csharp + Writable: t.AccessLevel == TagAccessLevel.ReadWrite, + Alarm: ExtractTagAlarm(t.TagConfig))) +``` + +**Step 4 (implement `ExtractTagAlarm` — artifact mirror, byte-identical):** In `DeploymentArtifact.cs`, add a private `ExtractTagAlarm(string? tagConfig)` with the **same body** (it constructs the same `EquipmentTagAlarmInfo` — Runtime already references the assembly defining `EquipmentTagPlan`, so the type is in scope). 
Wire into `BuildEquipmentTagPlans` `:448`: +```csharp + Writable: writable, + Alarm: ExtractTagAlarm(tagConfig))); +``` + +**Step 5 (failing test — parity):** Extend the existing composer↔artifact parity fixture with an alarm-bearing equipment tag (TagConfig carrying the `alarm` object) and assert the composed `EquipmentTagPlan.Alarm` equals the artifact-decoded one (record equality covers it). Run the parity test → it now exercises `Alarm`. + +**Step 6:** `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests --filter "ExtractTagAlarm|Parity"` → PASS. Full build clean. + +**Step 7 (commit):** +```bash +git add src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs \ + src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DeploymentArtifact.cs \ + tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/ExtractTagAlarmTests.cs \ + tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/ +git commit -m "feat(alarms): EquipmentTagPlan.Alarm parsed byte-parity from TagConfig (Phase B WS-2)" +``` + +--- + +### Task 3: Materialize a condition node for an alarm tag — WS-3 + +**Classification:** standard +**Estimated implement time:** ~3 min +**Parallelizable with:** Task 4, Task 5 + +**Files:** +- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Applier.cs:186-193` (`MaterialiseEquipmentTags` variable loop) +- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/` (extend the Phase7Applier materialise tests) + +**Context:** `MaterialiseEquipmentTags` currently calls `SafeEnsureVariable(...)` for every tag (`:192`). `SafeMaterialiseAlarmCondition(alarmNodeId, equipmentNodeId, displayName, alarmType, severity)` already exists (`:309`) — the same one scripted alarms use. An alarm tag becomes a **condition node only** (not also a variable). The condition NodeId = the tag's folder-scoped NodeId (`EquipmentNodeIds.Variable(...)`, same formula). Use the **sub-folder** as the condition's parent when `FolderPath` is set (the variable loop already computes `parent`). 
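
The "condition node only, never also a variable" rule in this context can be sketched with a recording sink. This is a minimal illustration under stated assumptions: `TagSketch`, `AlarmInfoSketch`, `RecordingSink`, and `MaterialiseSketch` are hypothetical stand-ins. The real test should reuse the existing recording `IOpcUaAddressSpaceSink` fake, whose shape is not shown in this plan.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-ins for EquipmentTagAlarmInfo / EquipmentTagPlan (only the fields the branch needs).
public sealed record AlarmInfoSketch(string AlarmType, int Severity);
public sealed record TagSketch(string NodeId, string Name, AlarmInfoSketch? Alarm);

// Records which materialisation call each tag produced.
public sealed class RecordingSink
{
    public List<string> Variables { get; } = new();
    public List<(string NodeId, string AlarmType, int Severity)> Conditions { get; } = new();

    public void EnsureVariable(string nodeId) => Variables.Add(nodeId);

    public void MaterialiseAlarmCondition(string nodeId, string alarmType, int severity)
        => Conditions.Add((nodeId, alarmType, severity));
}

public static class MaterialiseSketch
{
    // Alarm tags become condition nodes only; every other tag becomes a value variable.
    public static void Apply(IEnumerable<TagSketch> tags, RecordingSink sink)
    {
        foreach (var tag in tags)
        {
            if (tag.Alarm is not null)
                sink.MaterialiseAlarmCondition(tag.NodeId, tag.Alarm.AlarmType, tag.Alarm.Severity);
            else
                sink.EnsureVariable(tag.NodeId);
        }
    }
}
```

Feeding one plain tag and one alarm tag through `Apply` should leave exactly one entry in each list, which is the shape of assertion the materialise test needs.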
+ +**Step 1 (failing test):** With the existing test sink (grep the materialise tests for the fake/recording `IOpcUaAddressSpaceSink`), feed a composition whose `EquipmentTags` contains one plain tag and one `Alarm != null` tag; assert the plain one called `EnsureVariable` and the alarm one called `MaterialiseAlarmCondition` (with the matching nodeId/type/severity) and did **not** call `EnsureVariable`. Run → FAIL. + +**Step 2 (implement the branch):** Replace the `SafeEnsureVariable(...)` call at `:192` with: +```csharp + if (tag.Alarm is not null) + { + // Native alarm tag → a real Part 9 condition node (reuses the scripted-alarm path), + // NOT a value variable. Parent is the sub-folder when set, else the equipment folder. + SafeMaterialiseAlarmCondition(nodeId, parent, tag.Name, tag.Alarm.AlarmType, tag.Alarm.Severity); + } + else + { + SafeEnsureVariable(nodeId, parent, tag.Name, tag.DataType, tag.Writable); + } +``` + +**Step 3:** `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests --filter MaterialiseEquipmentTags` → PASS. + +**Step 4 (commit):** +```bash +git add src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Applier.cs \ + tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/ +git commit -m "feat(alarms): materialise a Part 9 condition for an alarm equipment tag (Phase B WS-3)" +``` + +--- + +### Task 4: `NativeAlarmProjector` (transition → snapshot) — WS-4a + +**Classification:** standard +**Estimated implement time:** ~5 min +**Parallelizable with:** Task 3, Task 5 + +**Files:** +- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/NativeAlarmProjector.cs` +- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/NativeAlarmProjectorTests.cs` + +**Context:** `AlarmEventArgs` is a delta (raise/clear/ack). `AlarmConditionSnapshot` (`Commons/OpcUa/AlarmConditionSnapshot.cs`: `Active, Acknowledged, Confirmed, Enabled, Shelving, Severity(ushort), Message`) is the full state `WriteAlarmCondition` wants. 
The projector keeps per-condition-NodeId prior `(Active, Acked, Severity, Message)` and derives the next snapshot from `Kind`. It lives in Runtime (references both `Core.Abstractions` and `Commons`). It is owned by the single-threaded `DriverHostActor` → a plain `Dictionary` is safe (no locking). + +**Severity map** (`AlarmSeverity` 4-bucket → ushort 1..1000): Low→200, Medium→500, High→700, Critical→900. + +**Kind → snapshot:** +| Kind | Active | Acknowledged | Severity/Message | +|---|---|---|---| +| Raise / Retrigger | true | false | from event | +| Acknowledge | prior | true | from event (keep prior severity if event is a pure ack — use event severity, fine) | +| Clear | false | prior | from event | +| Unspecified | prior | prior | from event | +Constant fields: `Enabled=true`, `Confirmed=true`, `Shelving=Unshelved`. + +**Step 1 (failing tests):** +```csharp +using Shouldly; +using ZB.MOM.WW.OtOpcUa.Commons.OpcUa; +using ZB.MOM.WW.OtOpcUa.Core.Abstractions; +using Xunit; + +namespace ZB.MOM.WW.OtOpcUa.Runtime.Tests.Drivers; + +public class NativeAlarmProjectorTests +{ + private static AlarmEventArgs Evt(AlarmTransitionKind kind, AlarmSeverity sev = AlarmSeverity.High, string msg = "m") + => new(new H(), "Tank1.Hi", "c1", "LimitAlarm.Hi", msg, sev, DateTime.UnixEpoch, Kind: kind); + + [Fact] + public void Raise_is_active_and_unacked() + { + var s = new NativeAlarmProjector().Project("n1", Evt(AlarmTransitionKind.Raise)); + s.Active.ShouldBeTrue(); s.Acknowledged.ShouldBeFalse(); + s.Severity.ShouldBe((ushort)700); s.Enabled.ShouldBeTrue(); + s.Shelving.ShouldBe(AlarmShelvingKind.Unshelved); + } + + [Fact] + public void Acknowledge_sets_acked_and_keeps_prior_active() + { + var p = new NativeAlarmProjector(); + p.Project("n1", Evt(AlarmTransitionKind.Raise)); + var s = p.Project("n1", Evt(AlarmTransitionKind.Acknowledge)); + s.Active.ShouldBeTrue(); s.Acknowledged.ShouldBeTrue(); + } + + [Fact] + public void Clear_deactivates_and_keeps_prior_ack() + { + var p = new 
NativeAlarmProjector(); + p.Project("n1", Evt(AlarmTransitionKind.Raise)); + var s = p.Project("n1", Evt(AlarmTransitionKind.Clear)); + s.Active.ShouldBeFalse(); s.Acknowledged.ShouldBeFalse(); + } + + [Fact] + public void State_is_isolated_per_nodeId() + { + var p = new NativeAlarmProjector(); + p.Project("n1", Evt(AlarmTransitionKind.Raise)); + var s2 = p.Project("n2", Evt(AlarmTransitionKind.Clear)); // cold n2: clear from default-inactive + s2.Active.ShouldBeFalse(); + } + + private sealed class H : IAlarmSubscriptionHandle { public string DiagnosticId => "t"; } +} +``` +Run: `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests --filter NativeAlarmProjector` → FAIL. + +**Step 2 (implement):** +```csharp +using ZB.MOM.WW.OtOpcUa.Commons.OpcUa; +using ZB.MOM.WW.OtOpcUa.Core.Abstractions; + +namespace ZB.MOM.WW.OtOpcUa.Runtime.Drivers; + +/// <summary> +/// Derives a full Part 9 <see cref="AlarmConditionSnapshot"/> from each native +/// <see cref="AlarmEventArgs"/> delta, tracking per-condition-NodeId prior state. Owned by the +/// single-threaded DriverHostActor (no locking). Native alarms carry only a transition +/// <see cref="AlarmEventArgs.Kind"/>, not a full state machine, so this is the translation the +/// scripted-alarm engine does internally. +/// </summary> +public sealed class NativeAlarmProjector +{ + private readonly Dictionary<string, (bool Active, bool Acked, ushort Severity, string Message)> _prior = + new(StringComparer.Ordinal); + + /// <summary>Project an alarm transition onto the full condition snapshot for <paramref name="nodeId"/>.</summary> + public AlarmConditionSnapshot Project(string nodeId, AlarmEventArgs e) + { + var prev = _prior.TryGetValue(nodeId, out var p) ? 
p : (Active: false, Acked: true, Severity: (ushort)0, Message: string.Empty); + var sev = MapSeverity(e.Severity); + var (active, acked) = e.Kind switch + { + AlarmTransitionKind.Raise or AlarmTransitionKind.Retrigger => (true, false), + AlarmTransitionKind.Acknowledge => (prev.Active, true), + AlarmTransitionKind.Clear => (false, prev.Acked), + _ => (prev.Active, prev.Acked), + }; + _prior[nodeId] = (active, acked, sev, e.Message); + return new AlarmConditionSnapshot( + Active: active, Acknowledged: acked, Confirmed: true, Enabled: true, + Shelving: AlarmShelvingKind.Unshelved, Severity: sev, Message: e.Message); + } + + /// Clears tracked state (call on address-space rebuild). + public void Clear() => _prior.Clear(); + + private static ushort MapSeverity(AlarmSeverity s) => s switch + { + AlarmSeverity.Low => 200, AlarmSeverity.Medium => 500, + AlarmSeverity.High => 700, AlarmSeverity.Critical => 900, _ => 500, + }; +} +``` + +**Step 3:** Run the filter → PASS. + +**Step 4 (commit):** +```bash +git add src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/NativeAlarmProjector.cs \ + tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/NativeAlarmProjectorTests.cs +git commit -m "feat(alarms): NativeAlarmProjector maps transitions to condition snapshots (Phase B WS-4a)" +``` + +--- + +### Task 5: `DriverInstanceActor` subscribes `OnAlarmEvent` + publishes `AttributeAlarmPublished` — WS-4b + +**Classification:** high-risk +**Estimated implement time:** ~5 min +**Parallelizable with:** Task 3, Task 4 + +**Files:** +- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs` (msg defs `:63-66`; new fields; `AttachAlarmSource`/`DetachAlarmSource`; `Receive`; wire attach on Connected-entry, detach in `DetachSubscription` + `PostStop`) +- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/` (new `DriverInstanceActorNativeAlarmTests.cs`) + +**Context — mirror the `OnDataChange` pattern exactly** (`:408-409` attach, `:452-460` `DetachSubscription`): 
the driver fires `OnAlarmEvent` on its own thread → marshal to the actor thread via `self.Tell(...)`. Galaxy's alarm feed auto-starts in `InitializeAsync` and fires `OnAlarmEvent` independent of `SubscribeAlarmsAsync`, so the actor only subscribes to the C# event (no `SubscribeAlarmsAsync` call — that's a deferred follow-up for drivers that gate on it). The server filters by `SourceNodeId` downstream (unknown refs drop), so forward every transition. + +**Step 1 (failing test, Akka.TestKit — xUnit v2):** +```csharp +// Fake driver implements IDriver + ISubscribable + IAlarmSource; exposes RaiseAlarm(args). +// Spawn the actor, drive it to Connected, then fakeDriver.RaiseAlarm(...) and +// ExpectMsg<AttributeAlarmPublished> on a probe wired as the parent. +// Use CancellationToken.None (NOT TestContext.Current — this is xUnit v2 / Akka.TestKit.Xunit2). +``` +Model it on the existing `DriverInstanceActor` subscribe tests (grep `AttributeValuePublished` in `Runtime.Tests/Drivers`). Assert `AttributeAlarmPublished.DriverInstanceId` + `.Args.SourceNodeId` match. Run → FAIL. + +**Step 2 (implement messages):** After `AttributeValuePublished` (`:65`): +```csharp +/// <summary>Published to the parent whenever the subscribed driver (an <see cref="IAlarmSource"/>) fires +/// <see cref="IAlarmSource.OnAlarmEvent"/>. The parent (<see cref="DriverHostActor"/>) projects + routes it +/// to the materialised Part 9 condition. Parallels <see cref="AttributeValuePublished"/>.</summary> +public sealed record AttributeAlarmPublished(string DriverInstanceId, AlarmEventArgs Args); +private sealed record NativeAlarmRaised(AlarmEventArgs Args); +``` +Add fields near `_dataChangeHandler`: +```csharp +private EventHandler<AlarmEventArgs>? _alarmEventHandler; +``` + +**Step 3 (implement attach/detach + receive):** +```csharp +/// <summary>Subscribe the driver's native alarm event (if it is an <see cref="IAlarmSource"/>), +/// marshaling each transition to the actor thread. Idempotent; mirrors the OnDataChange attach.</summary> 
+private void AttachAlarmSource() +{ + if (_driver is not IAlarmSource src || _alarmEventHandler is not null) return; + var self = Self; + _alarmEventHandler = (_, e) => self.Tell(new NativeAlarmRaised(e)); + src.OnAlarmEvent += _alarmEventHandler; +} + +/// <summary>Symmetric teardown — called from <see cref="DetachSubscription"/> and PostStop so a stale +/// handler never pushes to a disconnected actor.</summary> +private void DetachAlarmSource() +{ + if (_driver is IAlarmSource src && _alarmEventHandler is not null) + src.OnAlarmEvent -= _alarmEventHandler; + _alarmEventHandler = null; +} +``` +Register in the `Connected()` behavior (next to `Receive(OnDataChangeForward)` at `:275`): +```csharp +Receive<NativeAlarmRaised>(m => Context.Parent.Tell(new AttributeAlarmPublished(_driverInstanceId, m.Args))); +``` +Call `AttachAlarmSource()` on every Connected entry — the cleanest single site is right where `ResubscribeDesired()` is called after `Become(Connected)` (both the first-connect `InitializeSucceeded` path **and** the `Reconnecting` `InitializeSucceeded` at `:295-302`). Add `AttachAlarmSource();` immediately after each `Become(Connected);`. Add `DetachAlarmSource();` inside `DetachSubscription()` (`:452`, so it fires on Connected→Reconnecting + Unsubscribe + PostStop, which all route through it — verify PostStop calls `DetachSubscription`; if not, add `DetachAlarmSource()` to `PostStop` too). + +**Step 4:** Run the filter → PASS. Confirm no double-attach (the `_alarmEventHandler is not null` guard) and that a teardown→reconnect re-attaches cleanly (add an assertion that a second Connected entry still delivers exactly one `AttributeAlarmPublished` per raise). 
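
The no-double-attach and clean re-attach properties from Step 4 can be checked in isolation from Akka. The sketch below is an assumption-laden stand-in: `FakeAlarmSource` and `AttachSketch` are hypothetical, the event payload is a plain string rather than `AlarmEventArgs`, and the handler increments a counter where the real actor would call `self.Tell(...)`.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a driver that exposes a native alarm event.
public sealed class FakeAlarmSource
{
    public event EventHandler<string>? OnAlarmEvent;

    public void Raise(string sourceNodeId) => OnAlarmEvent?.Invoke(this, sourceNodeId);
}

// Mirrors the guard-and-null pattern of AttachAlarmSource / DetachAlarmSource.
public sealed class AttachSketch
{
    private readonly FakeAlarmSource _source;
    private EventHandler<string>? _handler;

    public AttachSketch(FakeAlarmSource source) => _source = source;

    public int Delivered { get; private set; }

    public void Attach()
    {
        if (_handler is not null) return;      // idempotent: a second attach is a no-op
        _handler = (_, _) => Delivered++;      // the real actor marshals via self.Tell here
        _source.OnAlarmEvent += _handler;
    }

    public void Detach()
    {
        if (_handler is not null) _source.OnAlarmEvent -= _handler;
        _handler = null;                       // allows a later Attach to subscribe again
    }
}
```

Storing the delegate in a field matters: unsubscribing with a freshly created lambda would not remove the original subscription, so the stored instance is what makes detach reliable.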
+ +**Step 5 (commit):** +```bash +git add src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs \ + tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorNativeAlarmTests.cs +git commit -m "feat(alarms): DriverInstanceActor forwards native OnAlarmEvent to parent (Phase B WS-4b)" +``` + +--- + +### Task 6: `DriverHostActor` alarm map + `ForwardNativeAlarm` → live condition — WS-4c + +**Classification:** high-risk +**Estimated implement time:** ~5 min +**Parallelizable with:** none + +**Files:** +- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs` (new `_alarmNodeIdByDriverRef` map + `_nativeAlarmProjector` field `:99`; build in `PushDesiredSubscriptions` `:748-763`, branching on `Alarm != null`; `Receive<AttributeAlarmPublished>` in the two behaviors that already register the `ForwardToMux` receive; new `ForwardNativeAlarm`) +- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/` (new `DriverHostActorNativeAlarmTests.cs`) + +**Context:** `ForwardToMux` (`:440-467`) is the value analogue. The value maps (`_nodeIdByDriverRef`, `_driverRefByNodeId`) + `refsByDriver` (value subscription) are built in `PushDesiredSubscriptions` (`:734-763`). Alarm tags must be **excluded** from the value maps + value subscription (they're conditions, not variables) and routed via a parallel alarm map. An alarm's `AlarmEventArgs.SourceNodeId` equals the tag's `FullName`, so the alarm map keys on the same `(DriverInstanceId, FullName)`. + +**Step 1 (failing test, Akka.TestKit):** Spawn `DriverHostActor` with a wired `_opcUaPublishActor` probe; apply a composition (or call the SubscribeBulk path) containing one alarm-bearing `EquipmentTag`; send `AttributeAlarmPublished(driverId, raiseEvent)`; `ExpectMsg<AlarmStateUpdate>` at the alarm tag's NodeId with `State.Active == true`. Second test: an `AttributeAlarmPublished` for an unknown ref produces **no** `AlarmStateUpdate`. 
Model it on the existing `DriverHostActor` value-routing tests (grep `AttributeValueUpdate`/`_nodeIdByDriverRef` in `Runtime.Tests`). Run → FAIL. + +**Step 2 (fields):** At `:99` add: +```csharp +/// <summary>(DriverInstanceId, FullName=alarm SourceNodeId) → folder-scoped condition NodeId(s). Built +/// from EquipmentTags whose plan carries Alarm, alongside the value maps; resolves native alarm +/// transitions to their materialised Part 9 condition node(s).</summary> +private readonly Dictionary<(string DriverInstanceId, string FullName), HashSet<string>> _alarmNodeIdByDriverRef = new(); +private readonly NativeAlarmProjector _nativeAlarmProjector = new(); +``` + +**Step 3 (build the map — `PushDesiredSubscriptions`):** In the `EquipmentTags` loop (`:755-763`), branch so alarm tags go ONLY into `_alarmNodeIdByDriverRef` and are kept OUT of the value maps; and exclude alarm refs from `refsByDriver` (the value-subscription set at `:734-741`). Concretely: +- Add `_alarmNodeIdByDriverRef.Clear();` next to the other `Clear()`s. +- In the `foreach (var t in composition.EquipmentTags)` loop: +```csharp + var key = (t.DriverInstanceId, t.FullName); + var nodeId = EquipmentNodeIds.Variable(t.EquipmentId, t.FolderPath, t.Name); + if (t.Alarm is not null) + { + if (!_alarmNodeIdByDriverRef.TryGetValue(key, out var aset)) + _alarmNodeIdByDriverRef[key] = aset = new HashSet<string>(StringComparer.Ordinal); + aset.Add(nodeId); + continue; // alarm tags are conditions, not value variables + } + // …existing value-map population (unchanged)… +``` +- For `refsByDriver` (`:734-741`), add `.Where(t => t.Alarm is null)` before the `GroupBy` so the driver doesn't value-subscribe alarm attributes. +- Reset projector state on rebuild: call `_nativeAlarmProjector.Clear();` alongside the map clears (the condition nodes are torn down + rebuilt each apply, so prior state must not leak). 
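
The partition described in the bullets above (alarm tags into the alarm map only, everything else into the value-subscription set) can be sketched without the actor. `PlanSketch` and `RoutingMapSketch` are hypothetical names, and the record is reduced to the four fields the partition reads; the real loop works on `EquipmentTagPlan` and computes the NodeId via `EquipmentNodeIds.Variable(...)`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced stand-in for EquipmentTagPlan.
public sealed record PlanSketch(string DriverInstanceId, string FullName, string NodeId, bool IsAlarm);

public static class RoutingMapSketch
{
    // (driver, FullName) -> condition NodeIds, built from alarm tags only.
    public static Dictionary<(string, string), HashSet<string>> BuildAlarmMap(IEnumerable<PlanSketch> plans)
    {
        var map = new Dictionary<(string, string), HashSet<string>>();
        foreach (var p in plans.Where(p => p.IsAlarm))
        {
            var key = (p.DriverInstanceId, p.FullName);
            if (!map.TryGetValue(key, out var set))
                map[key] = set = new HashSet<string>(StringComparer.Ordinal);
            set.Add(p.NodeId);
        }
        return map;
    }

    // driver -> FullNames to value-subscribe; alarm tags are filtered out before grouping.
    public static Dictionary<string, List<string>> BuildValueRefs(IEnumerable<PlanSketch> plans) =>
        plans.Where(p => !p.IsAlarm)
             .GroupBy(p => p.DriverInstanceId)
             .ToDictionary(g => g.Key, g => g.Select(p => p.FullName).Distinct().ToList());
}
```

A set per key, rather than a single NodeId, covers the case where the same driver reference is authored under more than one equipment folder, which is why `ForwardNativeAlarm` iterates the resolved NodeIds.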
+ +**Step 4 (`ForwardNativeAlarm` + receives):** Add the handler (model on `ForwardToMux`): +```csharp +private void ForwardNativeAlarm(DriverInstanceActor.AttributeAlarmPublished msg) +{ + if (_opcUaPublishActor is null) return; + if (!_alarmNodeIdByDriverRef.TryGetValue((msg.DriverInstanceId, msg.Args.SourceNodeId), out var nodeIds)) + { + _log.Debug("DriverHost {Node}: no alarm condition for ({Driver},{Ref}) — transition dropped", + _localNode, msg.DriverInstanceId, msg.Args.SourceNodeId); + return; + } + foreach (var nodeId in nodeIds) + { + var snapshot = _nativeAlarmProjector.Project(nodeId, msg.Args); + _opcUaPublishActor.Tell(new ZB.MOM.WW.OtOpcUa.Runtime.OpcUa.OpcUaPublishActor.AlarmStateUpdate( + nodeId, snapshot, msg.Args.SourceTimestampUtc)); + } +} +``` +Register `Receive<DriverInstanceActor.AttributeAlarmPublished>(ForwardNativeAlarm);` immediately after **each** `ForwardToMux` receive registration (there are two — the steady + applying behaviors; grep `ForwardToMux` to find both registration sites). + +**Step 5:** Run the filter → PASS. Full build clean. 
+ +**Step 6 (commit):** +```bash +git add src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs \ + tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverHostActorNativeAlarmTests.cs +git commit -m "feat(alarms): DriverHostActor routes native alarm transitions to Part 9 conditions (Phase B WS-4c)" +``` + +--- + +### Task 7: Primary-gated `AlarmTransitionEvent` fan-out to `alerts` — WS-5 + +**Classification:** high-risk +**Estimated implement time:** ~5 min +**Parallelizable with:** none + +**Files:** +- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs` (alarm-meta map; mediator publish in `ForwardNativeAlarm`, Primary-gated via `_localRole`) +- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverHostActorNativeAlarmTests.cs` (extend) + +**Context:** `ScriptedAlarmHostActor.OnEngineEmission` (`:289-315`) publishes `AlarmTransitionEvent` to the `alerts` topic (consumed by `HistorianAdapterActor` + AdminUI `/alerts`), **Primary-gated** (suppress on `Secondary`/`Detached`). DriverHostActor already holds `_localRole` (`:121`, same gate it uses for `RouteNodeWrite`) and a DistributedPubSub mediator (it subscribes `RedundancyStateTopic` at `:265`). Reuse both. The event needs `EquipmentPath`/`AlarmName`/`AlarmTypeName` per condition node → add a small meta map built in the same `PushDesiredSubscriptions` pass. Grep `AlertsTopic` + `AlarmTransitionEvent` (Commons `Messages/Alerts/`) and `ScriptedAlarmHostActor` for the exact topic constant, the mediator field name, and the `Publish` usage to copy. + +**Step 1 (failing test):** Drive the host to `_localRole = Secondary` (send the `RedundancyStateChanged` it consumes; grep the existing redundancy-gate test for `RouteNodeWrite`/"not primary" to copy the setup) → an `AttributeAlarmPublished` still Tells `AlarmStateUpdate` (condition stays warm) but publishes **no** `AlarmTransitionEvent`. With `_localRole = Primary` (or unset) → it publishes one. Run → FAIL. 
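
The two outcomes Step 1 asserts (the condition always updates, the alert publishes only when not on standby) reduce to a small decision function. This is a sketch under assumptions: `RoleSketch` and `AlertGateSketch` are hypothetical, and a `null` role models the "unset" case that Step 1 treats like Primary.

```csharp
#nullable enable

// Hypothetical stand-in for the redundancy role enum.
public enum RoleSketch { Primary, Secondary, Detached }

public static class AlertGateSketch
{
    // UpdateCondition is unconditional so a standby node keeps its condition state warm.
    // PublishAlert is suppressed on Secondary/Detached so only one node feeds the alerts topic.
    public static (bool UpdateCondition, bool PublishAlert) Decide(RoleSketch? role) =>
        (true, role is not (RoleSketch.Secondary or RoleSketch.Detached));
}
```

Expressing the gate as "not Secondary and not Detached" rather than "is Primary" keeps an unset role publishing, which matches the behaviour the failing test expects.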
+
+**Step 2 (implement meta map):** Add `private readonly Dictionary<string, (string EquipmentId, string Name, string AlarmType)> _alarmMetaByNodeId = new(StringComparer.Ordinal);`; populate it in the alarm branch of the `PushDesiredSubscriptions` loop (Task 6 step 3): `_alarmMetaByNodeId[nodeId] = (t.EquipmentId, t.Name, t.Alarm.AlarmType);` and `Clear()` it wherever the other per-deployment maps are cleared.
+
+**Step 3 (implement gated publish):** In `ForwardNativeAlarm`, after the `AlarmStateUpdate` Tell (inside the `foreach`), append the Primary-gated publish, mirroring `ScriptedAlarmHostActor`:
+```csharp
+        if (_localRole is RedundancyRole.Secondary or RedundancyRole.Detached) continue; // warm-standby dedup
+        var meta = _alarmMetaByNodeId.TryGetValue(nodeId, out var m) ? m : (EquipmentId: nodeId, Name: nodeId, AlarmType: "AlarmCondition");
+        _mediator.Tell(new Publish(AlertsTopic, new AlarmTransitionEvent(
+            AlarmId: nodeId,
+            EquipmentPath: meta.EquipmentId,
+            AlarmName: meta.Name,
+            TransitionKind: msg.Args.Kind.ToString(),
+            Severity: snapshot.Severity,
+            Message: msg.Args.Message,
+            User: msg.Args.OperatorComment is null ? string.Empty : "device",
+            TimestampUtc: msg.Args.SourceTimestampUtc,
+            AlarmTypeName: meta.AlarmType,
+            Comment: msg.Args.OperatorComment,
+            HistorizeToAveva: true)));
+```
+Use the exact `AlarmTransitionEvent` constructor argument list discovered by grep (adjust names/order if they differ; the historian + `/alerts` consumers already handle this shape from scripted alarms). Resolve the mediator handle the same way the existing redundancy subscription does (the field used at `:265`).
+
+**Step 4:** Run the filter → PASS. Full build + `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests` green.
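The gate and the metadata fallback from Step 3 can be sketched in isolation, which may help when writing the Secondary/Primary test cases. This is a minimal sketch under stated assumptions: `RedundancyRole` here is a hypothetical stand-in enum (only the member names come from the plan), and `NativeAlarmGateSketch`, `ShouldPublishAlert`, and `ResolveMeta` are illustrative names, not members of `DriverHostActor`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the repo's RedundancyRole; only the member names come from the plan.
public enum RedundancyRole { Primary, Secondary, Detached }

public static class NativeAlarmGateSketch
{
    // True when this node should publish the transition to the alerts topic.
    // A null (unset) role publishes, matching the "Primary (or unset)" case in Step 1.
    public static bool ShouldPublishAlert(RedundancyRole? localRole) =>
        !(localRole is RedundancyRole.Secondary or RedundancyRole.Detached);

    // Looks up display metadata for a condition node. When no entry was recorded,
    // falls back to the node id and the base "AlarmCondition" type, as in Step 3.
    public static (string EquipmentId, string Name, string AlarmType) ResolveMeta(
        IReadOnlyDictionary<string, (string EquipmentId, string Name, string AlarmType)> metaByNodeId,
        string nodeId) =>
        metaByNodeId.TryGetValue(nodeId, out var m)
            ? m
            : (EquipmentId: nodeId, Name: nodeId, AlarmType: "AlarmCondition");
}
```

Keeping the gate as a pure predicate is only a convenience for reasoning about the test matrix; the plan itself inlines the check inside the `foreach`.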
+
+**Step 5 (commit):**
+```bash
+git add src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs \
+  tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverHostActorNativeAlarmTests.cs
+git commit -m "feat(alarms): Primary-gated AlarmTransitionEvent fan-out for native alarms (Phase B WS-5)"
+```
+
+---
+
+### Task 8: Document the `TagConfig` alarm schema — Docs
+
+**Classification:** small
+**Estimated implement time:** ~3 min
+**Parallelizable with:** Task 3, Task 4, Task 5
+
+**Files:**
+- Modify: `docs/ScriptedAlarms.md` (add a "Native driver alarms (equipment-tag path)" section) — or `docs/AlarmTracking.md` if that's the better home; pick whichever already covers condition materialization.
+
+**Step 1:** Add a section documenting that a native driver alarm (Galaxy) is an authored equipment `Tag` whose `TagConfig` carries an `alarm` object, for example `{"FullName":"tag.attr","alarm":{"alarmType":"OffNormalAlarm","severity":700}}`. When the `alarm` object is absent, the tag is a plain variable. `alarmType` ∈ {AlarmCondition, OffNormalAlarm, DiscreteAlarm, LimitAlarm} (default AlarmCondition); `severity` is 1..1000 (default 500). Note that the tag materializes a Part 9 condition (not a variable) under the equipment folder, fed by the driver's live `IAlarmSource.OnAlarmEvent`, and that transitions fan out to `/alerts` + the historian (Primary-gated). State the two deferred follow-ups: inbound device-ack (client Acknowledge → AVEVA) and the AdminUI Galaxy-picker pre-fill.
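The schema described in Step 1 can be illustrated with a small parsing sketch, useful as a worked example for the new docs section. It is not the byte-parity parser in `Phase7Composer` / `DeploymentArtifact`; `AlarmIntentSketch` and `TagConfigAlarmSketch` are hypothetical names, and two behaviors are assumptions the plan does not state: an unknown `alarmType` falls back to the default, and an out-of-range `severity` is clamped rather than rejected.

```csharp
using System;
using System.Text.Json;

// Hypothetical result type for illustration only.
public sealed record AlarmIntentSketch(string AlarmType, int Severity);

public static class TagConfigAlarmSketch
{
    private static readonly string[] KnownTypes =
        { "AlarmCondition", "OffNormalAlarm", "DiscreteAlarm", "LimitAlarm" };

    // Returns null when the blob carries no "alarm" object (the tag is a plain variable).
    public static AlarmIntentSketch? Parse(string tagConfigJson)
    {
        using var doc = JsonDocument.Parse(tagConfigJson);
        var root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object) return null;
        if (!root.TryGetProperty("alarm", out var alarm) || alarm.ValueKind != JsonValueKind.Object)
            return null;

        var type = "AlarmCondition"; // documented default
        if (alarm.TryGetProperty("alarmType", out var t) && t.ValueKind == JsonValueKind.String)
        {
            var candidate = t.GetString();
            // Assumption: names outside the documented set fall back to the default.
            if (candidate is not null && Array.IndexOf(KnownTypes, candidate) >= 0)
                type = candidate;
        }

        var severity = 500; // documented default
        if (alarm.TryGetProperty("severity", out var s)
            && s.ValueKind == JsonValueKind.Number
            && s.TryGetInt32(out var parsed))
        {
            // Assumption: clamp into the documented 1..1000 range instead of rejecting.
            severity = Math.Clamp(parsed, 1, 1000);
        }

        return new AlarmIntentSketch(type, severity);
    }
}
```

For the example blob in Step 1 this yields `OffNormalAlarm` at severity 700, while `{"FullName":"tag.attr"}` yields `null` and an empty `alarm` object yields the defaults.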
+
+**Step 2 (commit):**
+```bash
+# Stage docs/AlarmTracking.md instead if that file was chosen as the home for the section.
+git add docs/ScriptedAlarms.md
+git commit -m "docs(alarms): native driver-alarm TagConfig schema (Phase B)"
+```
+
+---
+
+### Task 9: Live docker-dev `/run` verification (user-driven)
+
+**Classification:** verification
+**Estimated implement time:** n/a (user drives; the agent does NOT sign in)
+**Parallelizable with:** none
+
+**Gate (the design's live-verify):** On the live-gateway-backed `MAIN-galaxy-eq` (dev rig is LOCAL on this Mac — OrbStack; central-1 @ `localhost:4840`, deploy/AdminUI @ `localhost:9200`, sql @ `localhost:14330`; the Galaxy gateway needs the ephemeral `GALAXY_MXGW_API_KEY` re-exported on container recreate — see `pending.md` Galaxy dev-rig note):
+1. Rebuild central on the branch: `docker compose -f docker-dev/docker-compose.yml up -d --build migrator central-1 central-2` (re-export `GALAXY_MXGW_API_KEY=…` on the recreate).
+2. Author a Galaxy alarm equipment tag on `EQ-55297329838d` whose `TagConfig` carries the `alarm` object pointing at a real galaxy alarm reference (a `TestMachine_002` attribute with an alarm extension, or the seeded `TestMachine_001.TestAlarm00x`). Deploy: `POST http://localhost:9200/api/deployments`, header `X-Api-Key: docker-dev-deploy-key`. **Order:** deploy FIRST, then recreate central if the driver is faulted (a faulted driver ignores `ApplyDelta`).
+3. Trip the Galaxy alarm. Confirm via Client.CLI `alarms`/`read` against `opc.tcp://localhost:4840` that a Part 9 `AlarmConditionState` appears **active** under the equipment NodeId; confirm the AdminUI `/alerts` row appears. Clear → condition goes inactive.
+4. (Device-ack round-trip is the deferred follow-up — NOT part of this gate.)
+
+**On pass:** finish via `superpowers-extended-cc:finishing-a-development-branch` (intent: merge-to-master + push). Update `pending.md` (disk-only) marking Phase B done; refresh the memory.
+
+---
+
+## Execution order / dependency summary
+
+```
+T0 ─┬─ T1 (contract) ─┬─ T4 (projector) ─┐
+    │                 └─ T5 (instance actor) ─┐
+    └─ T2 (plan field) ─┬─ T3 (applier)       ├─ T6 (host live) ── T7 (host alerts) ── T9 (live)
+                        └─ T8 (docs)          ┘
+```
+- **Parallel batch A** (after T0): T1 ∥ T2.
+- **Parallel batch B** (T1+T2 done): T3 ∥ T4 ∥ T5 ∥ T8 (all disjoint files).
+- **Serial tail:** T6 (needs T2+T4+T5) → T7 (same file) → T9.
diff --git a/docs/plans/2026-06-14-galaxy-phase-b-native-alarms-plan.md.tasks.json b/docs/plans/2026-06-14-galaxy-phase-b-native-alarms-plan.md.tasks.json
new file mode 100644
index 00000000..29fd1515
--- /dev/null
+++ b/docs/plans/2026-06-14-galaxy-phase-b-native-alarms-plan.md.tasks.json
@@ -0,0 +1,16 @@
+{
+  "planPath": "docs/plans/2026-06-14-galaxy-phase-b-native-alarms-plan.md",
+  "tasks": [
+    {"id": "367", "subject": "Task 0: Feature branch feat/galaxy-phase-b-native-alarms", "classification": "trivial", "status": "pending"},
+    {"id": "368", "subject": "Task 1 (WS-1): AlarmEventArgs.Kind contract + Galaxy populates it", "classification": "standard", "status": "pending", "blockedBy": ["367"], "parallelizableWith": ["369"]},
+    {"id": "369", "subject": "Task 2 (WS-2): EquipmentTagPlan.Alarm parsed byte-parity from TagConfig", "classification": "high-risk", "status": "pending", "blockedBy": ["367"], "parallelizableWith": ["368"]},
+    {"id": "370", "subject": "Task 3 (WS-3): Materialise a Part 9 condition for an alarm tag", "classification": "standard", "status": "pending", "blockedBy": ["369"], "parallelizableWith": ["371", "372", "375"]},
+    {"id": "371", "subject": "Task 4 (WS-4a): NativeAlarmProjector (transition -> snapshot)", "classification": "standard", "status": "pending", "blockedBy": ["368"], "parallelizableWith": ["370", "372", "375"]},
+    {"id": "372", "subject": "Task 5 (WS-4b): DriverInstanceActor forwards native OnAlarmEvent", "classification": "high-risk", "status": "pending", "blockedBy": ["368"], "parallelizableWith": ["370", "371", "375"]},
+    {"id": "373", "subject": "Task 6 (WS-4c): DriverHostActor routes native alarms to conditions", "classification": "high-risk", "status": "pending", "blockedBy": ["369", "371", "372"]},
+    {"id": "374", "subject": "Task 7 (WS-5): Primary-gated AlarmTransitionEvent fan-out", "classification": "high-risk", "status": "pending", "blockedBy": ["373"]},
+    {"id": "375", "subject": "Task 8 (Docs): Native driver-alarm TagConfig schema", "classification": "small", "status": "pending", "blockedBy": ["369"], "parallelizableWith": ["370", "371", "372"]},
+    {"id": "376", "subject": "Task 9: Live docker-dev /run verification (user-driven)", "classification": "verification", "status": "pending", "blockedBy": ["374", "375"]}
+  ],
+  "lastUpdated": "2026-06-14"
+}