Script-log Engine Emit + Scripted-Alarm Runtime — Implementation Plan
For Claude: REQUIRED SUB-SKILL: use superpowers-extended-cc:subagent-driven-development (or executing-plans) to implement this plan task-by-task.
Goal: Make the Script-log page tail real script output, and stand up scripted alarms end-to-end including real OPC UA Part 9 condition nodes + client ack.
Architecture: Three sequenced layers off one shared seam (a root script logger
fanning to file + companion + a new DPS topic sink). Layer 0 = emit (F8 live). Layer 1
= F9 engine runtime on the Akka equipment-namespace runtime. Layer 2 = F14b real Part 9
nodes + events + inbound ack. Design: docs/plans/2026-06-10-script-log-and-scripted-alarm-runtime-design.md.
Verified gap analysis: pending.md.
Tech: .NET 10, Akka.NET, EF Core (SQL prod / InMemory tests), Serilog, OPC Foundation UA .NET Standard, xUnit + Shouldly, Akka TestKit. No bUnit.
Hard rules (every task): stage by explicit path — never git add .; never stage
sql_login.txt or src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/; never echo the gateway API
key into a new tracked file; no force-push, no --no-verify. No Configuration entity
/ EF migration change (ScriptedAlarmState table already exists). Agent does not
sign in to the AdminUI — the user drives live /run.
Branch: feat/scriptlog-alarm-runtime off master @ df4c2657 (design committed there).
Reference patterns to mirror: VirtualTagHostActor (host actor shape),
EfAlarmActorStateStore (EF store shape), the {{equip}} two-seam parity work
(Phase7Composer ↔ DeploymentArtifact), RoslynVirtualTagEvaluator (evaluator).
LAYER 0 — Shared script-log emit + F8 live
Task 0: Branch + test-project check
Classification: small · ~2 min · Parallelizable with: none
Files: none created (branch + verification only)
Steps:
- git switch -c feat/scriptlog-alarm-runtime (off master @ df4c2657).
- Confirm tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/ exists and is in the .slnx (it does; ScriptLoggerFactoryTests.cs lives there). New Layer-0 tests land here. Confirm tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ for Layer-1 tests.
- dotnet build ZB.MOM.WW.OtOpcUa.slnx: green baseline. Commit nothing.
Task 1: IScriptLogPublisher + ScriptLogTopicSink
Classification: standard · ~4 min · Parallelizable with: none
Files:
- Create: src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/IScriptLogPublisher.cs
- Create: src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptLogTopicSink.cs
- Test: tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/ScriptLogTopicSinkTests.cs
- Maybe modify: Core.Scripting.csproj (add a ProjectReference to Commons for ScriptLogEntry if not already referenced; verify first).
Step 1 — failing tests (ScriptLogTopicSinkTests):
- A LogEvent (Information) with properties ScriptId="S1", VirtualTagId="V1", EquipmentId="EQ1", message "hello" → publisher receives one ScriptLogEntry with those fields, Level=="Information", Message=="hello".
- AlarmId property maps to ScriptLogEntry.AlarmId; absent properties → null fields.
- A Debug event with default minLevel=Information → publisher receives nothing.
- Template message renders ("v={V}" + prop V=3 → "v=3").
Use a fake IScriptLogPublisher capturing entries.
Step 2 — run, expect fail (types don't exist).
Step 3 — implement:
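using Serilog.Core;
using Serilog.Events;
// ScriptLogEntry lives in Commons (see the ProjectReference note under Files).

public interface IScriptLogPublisher { void Publish(ScriptLogEntry entry); }

// Serilog sink that forwards script log events at or above a minimum level to the publisher.
public sealed class ScriptLogTopicSink : ILogEventSink
{
    private readonly IScriptLogPublisher _publisher;
    private readonly LogEventLevel _min;

    public ScriptLogTopicSink(IScriptLogPublisher publisher,
        LogEventLevel min = LogEventLevel.Information) { _publisher = publisher; _min = min; }

    public void Emit(LogEvent e)
    {
        // Events below the minimum level stay off the topic (they still reach the file sink).
        if (e is null || e.Level < _min) return;

        // Reads a string-valued scalar property; absent or non-string properties yield null.
        string? P(string k) => e.Properties.TryGetValue(k, out var v)
            && v is ScalarValue { Value: string s } ? s : null;

        _publisher.Publish(new ScriptLogEntry(
            ScriptId: P("ScriptId") ?? P("ScriptName") ?? "unknown",
            Level: e.Level.ToString(),
            Message: e.RenderMessage(),            // template rendered with its properties
            TimestampUtc: e.Timestamp.UtcDateTime,
            VirtualTagId: P("VirtualTagId"),
            AlarmId: P("AlarmId"),
            EquipmentId: P("EquipmentId")));
    }
}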
(Property-name constants — reuse/extend ScriptLoggerFactory's ScriptNameProperty;
add ScriptIdProperty/VirtualTagIdProperty/AlarmIdProperty/EquipmentIdProperty.)
Step 4 — run tests, expect pass. Step 5 — commit (git add the 3 files by path).
Task 2: Root script logger + DpsScriptLogPublisher + Host wiring
Classification: standard · ~5 min · Parallelizable with: none (depends T1)
Files:
- Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Scripting/DpsScriptLogPublisher.cs (or Host, wherever the ActorSystem/Mediator is reachable at construction).
- Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Logging/ScriptRootLoggerFactory.cs (builds the root ILogger: rolling scripts-*.log + ScriptLogCompanionSink + ScriptLogTopicSink).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs (build + register root logger; register IScriptLogPublisher).
- Test: tests/.../Core.Scripting.Tests/ScriptRootLoggerFanoutTests.cs (or Host.Tests).
Steps (TDD):
- Failing test: a logger built by ScriptRootLoggerFactory with a fake publisher + in-memory companion → an Error event reaches the companion mirror AND the topic publisher; a Debug event reaches neither topic nor companion (file only). (Assert via fakes; don't assert the physical file.)
- Implement DpsScriptLogPublisher: ctor takes the DPS mediator IActorRef (or ActorSystem); Publish → mediator.Tell(new Publish("script-logs", entry)) (topic constant VirtualTagActor.ScriptLogsTopic).
- Implement ScriptRootLoggerFactory.Build(IScriptLogPublisher, config) → LoggerConfiguration().WriteTo.File(...).WriteTo.Sink(new ScriptLogCompanionSink(Log.Logger)).WriteTo.Sink(new ScriptLogTopicSink(publisher, minLevel)).CreateLogger().
- Program.cs: resolve the mediator after the ActorSystem is up; register IScriptLogPublisher (singleton) + the root ILogger (keyed/named for scripts). Min-level from config (Scripting:LogTopicMinLevel, default Information).
- Run + commit by path.
Task 3: Rewire evaluators to the root script logger
Classification: standard · ~5 min · Parallelizable with: none (depends T1, T2)
Files:
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Host/Engines/RoslynVirtualTagEvaluator.cs
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Host/Engines/RoslynScriptedAlarmEvaluator.cs
- Modify: src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptLoggerFactory.cs (bind the full property set, not just ScriptName).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs (inject root logger into the evaluators).
- Test: tests/.../Core.Scripting.Tests/ (evaluator emits via fake publisher).
Steps:
- Failing test: a RoslynVirtualTagEvaluator built with a root logger wired to a fake publisher; evaluate a script ctx.Logger.Information("hi"); return 1; → publisher gets one entry with ScriptId/VirtualTagId bound and Message=="hi".
- Replace the static ScriptLogger field with a ctor-injected root ILogger. Per evaluation, var log = _root.ForContext("ScriptId", id).ForContext("VirtualTagId", virtualTagId) (+ EquipmentId when available) and pass into the VirtualTagContext. Same for the alarm evaluator (binds AlarmId).
- ScriptLoggerFactory: add a Create(scriptId, virtualTagId?, alarmId?, equipmentId?) overload binding the standard properties (keep the old Create(scriptName) for compatibility).
- Program.cs: pass the root logger to both evaluator registrations.
- Run + commit by path.
Note: IVirtualTagEvaluator.Evaluate carries virtualTagId; in the live path scriptId == virtualTagId, so Layer 0 binds both from it. Threading a distinct EquipmentId (nice-to-have on the page) is optional here; if it requires an interface change, defer it to a Layer-1 follow-up rather than expanding T3.
Task 4: Live-verify Layer 0
Classification: verification · Parallelizable with: none (depends T2, T3)
Steps:
- Rebuild docker-dev central nodes (user-driven /run). Author a virtual tag whose script calls ctx.Logger.Information(...).
- Open /script-log; drive the dependency so the script evaluates; confirm the line appears live with the right ScriptId/level. Confirm Debug stays off the page, Information+ shows.
- Agent does not sign in; the user signs in and drives. Record outcome. No code unless a defect surfaces (→ new fix task).
LAYER 1 — F9 engine runtime
Task 5: EquipmentScriptedAlarmPlan + Phase7Composer enrichment
Classification: standard · ~5 min · Parallelizable with: Task 7, Task 8
Files:
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs (new record + build the enriched list from ScriptedAlarm + Script rows).
- Test: tests/Server/ZB.MOM.WW.OtOpcUa.Server.Tests/Phase7/ (or wherever Phase7Composer is tested): new Phase7ComposerScriptedAlarmTests.cs.
Steps:
- Failing test: compose two equipments each with a scripted alarm referencing a script; assert each EquipmentScriptedAlarmPlan carries the resolved PredicateSource, extracted DependencyRefs (via DependencyExtractor), AlarmType, Severity, MessageTemplate, HistorizeToAveva, Retain, Enabled, Name.
- Add public sealed record EquipmentScriptedAlarmPlan(string ScriptedAlarmId, string EquipmentId, string Name, string AlarmType, int Severity, string MessageTemplate, string PredicateScriptId, string PredicateSource, IReadOnlyList<string> DependencyRefs, bool HistorizeToAveva, bool Retain, bool Enabled);
- In Compose: join ScriptedAlarm.PredicateScriptId → Script.SourceCode; run DependencyExtractor.Extract(source).Reads (∪ MessageTemplate token paths) for DependencyRefs; project into the new list on the composition result. Do not skip Enabled=false alarms: carry the flag and let the host decide. Drop alarms whose script is missing with a structured warning (don't throw the whole compose).
- Run + commit by path.
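The join-and-drop behaviour described in the Compose step can be sketched as follows. This is a minimal illustration under stated assumptions, not the Phase7Composer code: ScriptRow, AlarmRow, AlarmPlanSketch, and the extractReads and warn delegates are hypothetical stand-ins (extractReads plays the role of DependencyExtractor.Extract(source).Reads), and the real plan record carries more fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified row shapes; the real ScriptedAlarm / Script entities carry more columns.
public sealed record ScriptRow(string ScriptId, string SourceCode);
public sealed record AlarmRow(string ScriptedAlarmId, string EquipmentId, string PredicateScriptId, bool Enabled);
public sealed record AlarmPlanSketch(string ScriptedAlarmId, string EquipmentId, string PredicateSource,
    IReadOnlyList<string> DependencyRefs, bool Enabled);

public static class AlarmPlanComposerSketch
{
    // extractReads stands in for the dependency extractor; warn stands in for a structured log call.
    public static IReadOnlyList<AlarmPlanSketch> Compose(
        IEnumerable<AlarmRow> alarms,
        IEnumerable<ScriptRow> scripts,
        Func<string, IEnumerable<string>> extractReads,
        Action<string> warn)
    {
        var sourceById = scripts.ToDictionary(s => s.ScriptId, s => s.SourceCode);
        var plans = new List<AlarmPlanSketch>();

        foreach (var alarm in alarms)
        {
            if (!sourceById.TryGetValue(alarm.PredicateScriptId, out var source))
            {
                // Drop only this alarm; the rest of the composition still succeeds.
                warn($"Scripted alarm {alarm.ScriptedAlarmId} references missing script {alarm.PredicateScriptId}");
                continue;
            }

            var refs = extractReads(source).Distinct().ToList();

            // Enabled is carried through unchanged; disabled alarms are not filtered here.
            plans.Add(new AlarmPlanSketch(alarm.ScriptedAlarmId, alarm.EquipmentId, source, refs, alarm.Enabled));
        }

        return plans;
    }
}
```

Passing the extractor and the warning sink as delegates keeps the sketch testable without the real composer dependencies; the production code would presumably call them directly.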
Task 6: DeploymentArtifact parity for the alarm plan
Classification: standard · ~5 min · Parallelizable with: Task 7, Task 8 (depends T5)
Files:
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DeploymentArtifact.cs (encode/decode EquipmentScriptedAlarmPlan; add Phase7CompositionResult.EquipmentScriptedAlarms; filter-by-equipment like EquipmentVirtualTags at :263).
- Test: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ (artifact round-trip + parity with the Composer for the same input).
Steps:
- Failing test: build a composition via Phase7Composer, serialize to artifact, parse back → EquipmentScriptedAlarms is byte-identical (same discipline as the {{equip}} parity tests). Equipment-filter test (only alarms for resident equipment survive).
- Add the field to Phase7CompositionResult; mirror the EquipmentVirtualTags encode/decode/filter exactly (:202, :263).
- Run + commit by path.
Task 7: DependencyMuxTagUpstreamSource
Classification: standard · ~4 min · Parallelizable with: Task 5, Task 6, Task 8
Files:
- Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ScriptedAlarms/DependencyMuxTagUpstreamSource.cs (implements Core.ScriptedAlarms/Core.VirtualTags ITagUpstreamSource).
- Test: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ScriptedAlarms/DependencyMuxTagUpstreamSourceTests.cs
Steps:
- Failing tests: Push(path, snapshot) updates cache so ReadTag(path) returns it; SubscribeTag(path, obs) → obs fires on the next Push; ReadTag for an unknown path returns a Bad-quality snapshot; dispose removes the observer.
- Implement: a thread-safe cache (ConcurrentDictionary<string, DataValueSnapshot>) + per-path observer list; Push updates cache then invokes observers; ReadTag reads cache (Bad if absent); SubscribeTag returns an IDisposable that deregisters. The host actor calls Push from its DependencyValueChanged handler. Value wrap: new DataValueSnapshot(value, StatusCode:0, ts, ts).
- Run + commit by path.
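The cache-plus-observer shape from the Implement step could look roughly like the sketch below. It is an illustration only: SnapshotSketch is a hypothetical stand-in for DataValueSnapshot, the observer is modelled as a plain delegate rather than the real ITagUpstreamSource callback type, and the locking strategy is an assumption.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for DataValueSnapshot; only the fields this sketch needs.
public sealed record SnapshotSketch(object? Value, uint StatusCode, DateTime TimestampUtc);

public sealed class UpstreamSourceSketch
{
    // 0x80000000 is the generic OPC UA "Bad" status code.
    public const uint Bad = 0x80000000;

    private readonly ConcurrentDictionary<string, SnapshotSketch> _cache = new();
    private readonly Dictionary<string, List<Action<string, SnapshotSketch>>> _observers = new();
    private readonly object _gate = new();

    public void Push(string path, SnapshotSketch snapshot)
    {
        // Update the cache first so an observer that re-reads sees the new value.
        _cache[path] = snapshot;

        Action<string, SnapshotSketch>[] targets;
        lock (_gate)
        {
            targets = _observers.TryGetValue(path, out var registered)
                ? registered.ToArray()
                : Array.Empty<Action<string, SnapshotSketch>>();
        }

        // Observers run outside the lock so a slow callback cannot block subscribers.
        foreach (var observer in targets) observer(path, snapshot);
    }

    // Unknown paths return a Bad-quality snapshot instead of throwing.
    public SnapshotSketch ReadTag(string path) =>
        _cache.TryGetValue(path, out var hit) ? hit : new SnapshotSketch(null, Bad, DateTime.UtcNow);

    public IDisposable SubscribeTag(string path, Action<string, SnapshotSketch> observer)
    {
        lock (_gate)
        {
            if (!_observers.TryGetValue(path, out var registered))
            {
                registered = new List<Action<string, SnapshotSketch>>();
                _observers[path] = registered;
            }
            registered.Add(observer);
        }

        return new Unsubscriber(() =>
        {
            lock (_gate)
            {
                if (_observers.TryGetValue(path, out var existing)) existing.Remove(observer);
            }
        });
    }

    private sealed class Unsubscriber : IDisposable
    {
        private Action? _dispose;
        public Unsubscriber(Action dispose) => _dispose = dispose;
        public void Dispose() { _dispose?.Invoke(); _dispose = null; }
    }
}
```

Copying the observer list under the lock and invoking it outside is one way to satisfy the "dispose removes the observer" test without holding a lock during callbacks; the real implementation may choose differently.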
Task 8: EfAlarmConditionStateStore : IAlarmStateStore
Classification: standard · ~5 min · Parallelizable with: Task 5, Task 6, Task 7
Files:
- Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ScriptedAlarms/EfAlarmConditionStateStore.cs
- Test: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ScriptedAlarms/EfAlarmConditionStateStoreTests.cs (in-memory EF).
Steps:
- Failing tests (in-memory OtOpcUaConfigDbContext): SaveAsync(state) then LoadAsync(alarmId) round-trips Enabled/Acked/Confirmed/Shelving (+ UnshelveAtUtc)/LastAck*/LastConfirm*/Comments; LoadAsync of an unknown id → null; ActiveState is not persisted (a saved state's Active is ignored on load: load returns the stored operator state, Active defaults). Comments JSON round-trips.
- Implement mapping AlarmConditionState ↔ ScriptedAlarmState entity (mirror EfAlarmActorStateStore's IDbContextFactory upsert pattern; serialize ImmutableList<AlarmComment> ↔ CommentsJson). Map enum states ↔ the entity's string columns.
- Run + commit by path.
Task 9: ScriptedAlarmHostActor
Classification: high-risk · ~5 min · Parallelizable with: none (depends T6, T7, T8; needs Layer 0 T2/T3 root logger)
Files:
- Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ScriptedAlarms/ScriptedAlarmHostActor.cs
- Test: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ScriptedAlarms/ScriptedAlarmHostActorTests.cs (Akka TestKit + a real engine with the fake upstream, or a fake engine seam).
Design: mirrors VirtualTagHostActor. Owns one ScriptedAlarmEngine (built with the
DependencyMuxTagUpstreamSource, the EfAlarmConditionStateStore, a ScriptLoggerFactory
wrapping the Layer 0 root logger, and the engine's root logger). Message
ApplyScriptedAlarms(IReadOnlyList<EquipmentScriptedAlarmPlan> Plans).
Steps:
- Failing TestKit tests:
  - ApplyScriptedAlarms with one alarm → engine loaded (assert via a probe/seam); registers interest with the (probe) mux for the alarm's dep refs.
  - A DependencyValueChanged that makes the predicate true → the host tells the (probe) OpcUaPublishActor a WriteAlarmState(alarmId, active:true, …), tells the (probe) historian an AlarmHistorianEvent (when HistorizeToAveva), and publishes an AlarmTransitionEvent on alerts.
  - Re-ApplyScriptedAlarms with a different set reloads the engine (LoadAsync replace).
- Implement: on ApplyScriptedAlarms, build ScriptedAlarmDefinitions from the plans (map AlarmType→AlarmKind, Severity→AlarmSeverity, EquipmentId→EquipmentPath), engine.LoadAsync; register mux interest for ⋃ DependencyRefs; on DependencyValueChanged → _upstream.Push(...). Subscribe engine.OnEvent once → map ScriptedAlarmEvent.Condition to (active, acknowledged) → OpcUaPublishActor.WriteAlarmState; map → AlarmHistorianEvent → historian (if Historize); publish AlarmTransitionEvent on alerts. Dispose engine in PostStop.
- Run targeted tests (dotnet test --filter ScriptedAlarmHostActor). Commit by path.
Task 10: Spawn + apply in DriverHostActor
Classification: standard · ~4 min · Parallelizable with: none (depends T9)
Files:
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs (spawn ScriptedAlarmHostActor next to VirtualTagHostActor ~:197; tell ApplyScriptedAlarms(composition.EquipmentScriptedAlarms) next to the vtag apply ~:532; add an override field for tests like _virtualTagHostOverride).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ServiceCollectionExtensions.cs if the host needs new injected deps (EF store factory, root logger, historian ref).
- Test: extend DriverHostActorTests (apply pushes ApplyScriptedAlarms).
Steps: mirror the VirtualTag spawn/apply exactly; thread _opcUaPublishActor,
_dependencyMux, the EF store, the root logger, the historian actor ref. Run + commit.
Task 11: Retire the orphaned actor + F9b evaluator
Classification: small · ~3 min · Parallelizable with: none (depends T9, T10)
Files:
- Delete: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ScriptedAlarms/ScriptedAlarmActor.cs (+ its tests) and src/Server/ZB.MOM.WW.OtOpcUa.Host/Engines/RoslynScriptedAlarmEvaluator.cs.
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs (remove F9b DI registration, lines ~110-114) and any IScriptedAlarmEvaluator references. Keep EfAlarmActorStateStore only if something else still uses it; otherwise delete it with the actor.
Steps: delete, fix build, run full dotnet test for the touched projects, commit by
path. (If something unexpectedly depends on these, stop and surface — don't expand scope.)
Task 12: Live-verify Layer 1
Classification: verification · Parallelizable with: none (depends T10, T11)
Steps: rebuild docker-dev; author a scripted alarm whose predicate references a live
tag; drive the tag; confirm the alarm node flips active/clear, the historian queue
advances (/alarms/historian), the alerts/Alerts page shows it, and predicate
ctx.Logger output appears on /script-log. User drives sign-in. Defects → new tasks.
LAYER 2 — F14b real Part 9 + client ack
Task 13: SDK research spike (DeepWiki)
Classification: small (research) · ~5 min · Parallelizable with: Layer-1 tasks
Steps: Use the DeepWiki MCP (OPCFoundation/UA-.NETStandard) to confirm: how to
create + add an AlarmConditionState (and Limit/OffNormal/Discrete subtypes) under a
parent in a CustomNodeManager2; how to set ActiveState/AckedState/ConfirmedState/
ShelvingState/Severity/Retain; how transitions fire events (ReportEvent); how inbound
Acknowledge/Shelve/Confirm method calls are dispatched + where to hook them. Write
findings to docs/v2/f14b-part9-sdk-notes.md (committed). This de-risks T14-T17.
Task 14: Real condition-node materialisation
Classification: high-risk · ~5 min · Parallelizable with: none (depends T13)
Files: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OtOpcUaNodeManager.cs (replace the
placeholder [active, ack] variable in WriteAlarmState / add a MaterialiseAlarmCondition
path per AlarmType); Phase7Applier.cs (call the new materialiser); tests where the
SDK allows (node existence/type assertions).
Steps: create real condition nodes on materialise; keep WriteAlarmState as a thin
shim during transition or replace its callers. Run + commit. (SDK threading: all via the
pinned OpcUaPublishActor dispatcher.)
Task 15: Richer alarm-state bridge
Classification: standard · ~4 min · Parallelizable with: Task 17 (depends T14, T9)
Files: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/OpcUa/OpcUaPublishActor.cs (new message
carrying the full AlarmConditionState, not 2 bools); ScriptedAlarmHostActor bridge
(send the richer message); OtOpcUaNodeManager (apply full state to the condition).
Tests: message mapping.
Task 16: Event firing on transition
Classification: high-risk · ~5 min · Parallelizable with: none (depends T14, T15)
Files: OtOpcUaNodeManager.cs (condition.ReportEvent(...) on state change). Tests:
mapping/coverage where feasible; behaviour proven in the Layer 2 live-verify (T19 in the original numbering, T23 after the re-scope).
LAYER 2 — inbound client ack/shelve (re-scoped 2026-06-11)
Status: T0–T16 are merged to master (Layers 0+1 live-verified; Layer 2 Part-9 nodes/state/events done). The original single T17 "Inbound method dispatch + ack plumbing" (high-risk · ~5 min) proved to be four separate hard problems, each its own task. After a 2026-06-11 deep dive into the real code, T17 is split into T17–T20 and the old T18–T20 shift to T21–T24 (old T19's Client.CLI work also grew a feature half, T22). This is the deferred "fresh piece": branch off the current master (git switch -c feat/scriptlog-alarm-ack), not the old feat/scriptlog-alarm-runtime base.
Layer 2 design decisions (resolved in the re-scope deep dive)
These are the load-bearing findings the new tasks rest on — verified against the current code, not the original recon's assumptions:
- Topology: same-node co-location, multi-node ownership. The OPC UA SDK server (+ OtOpcUaNodeManager) and the ScriptedAlarmHostActor are both spawned on every driver-role node in the same ActorSystem (OtOpcUaServerHostedService + DriverHostActor.SpawnScriptedAlarmHost). So per node they're co-located and an in-process Tell would reach the local host. But in a multi-driver-node cluster each node owns a disjoint subset of alarms (its resident equipment, via the T6 artifact equipment-filter), and a client connects to one node's server. Whether that node owns the alarm the client acks is not guaranteed. Decision: route inbound commands over a new DPS topic alarm-commands (mirrors alerts / script-logs), and have each ScriptedAlarmHostActor ignore commands for alarmIds its engine doesn't own. This works same-node and cross-node with one mechanism. (Open item to confirm in T18: whether each node's address space is partitioned to its own equipment or replicated; if partitioned, a client can only ever ack local alarms and the DPS broadcast is still correct, just always locally satisfied.)
- The node manager has no Akka handle and must stay that way. OtOpcUaNodeManager(server, configuration) (OtOpcUaSdkServer.CreateMasterNodeManager) holds no IActorRef / ActorSystem / DI. The existing forward seam is OpcUaPublishActor → IOpcUaAddressSpaceSink (DeferredAddressSpaceSink → SdkAddressSpaceSink → node manager). For the reverse path the node manager gets a settable command-router delegate (Action<AlarmCommand>), wired at boot by OtOpcUaServerHostedService (which does have the DPS mediator) to publish onto alarm-commands. The node manager itself never touches Akka.
- No explicit re-projection after an engine op. Every ScriptedAlarmEngine op (AcknowledgeAsync/ConfirmAsync/OneShotShelveAsync/TimedShelveAsync/UnshelveAsync/EnableAsync/DisableAsync/AddCommentAsync; all exist, signatures verified) raises the engine's OnEvent, which the host's existing OnEngineEmission already projects to the node. So the inbound handler just calls the op and awaits; the ack visibly updates the node for free. This makes T19 small.
- Roles are dropped at the impersonation seam. OpcUaApplicationHost.cs:292 does args.Identity = new UserIdentity(token) and discards result.Roles (only logs them at :293). OpcUaUserAuthResult.Roles is IReadOnlyList<string> (ReadOnly/WriteOperate/WriteTune/WriteConfigure/AlarmAck); there is an OpcUaOperation enum (Core.Abstractions) with AlarmAcknowledge/AlarmConfirm/AlarmShelve, but no role is consulted anywhere post-auth today (writes aren't gated either; this is greenfield, not a pattern to copy). Risk (drives T17 being its own task): it is unconfirmed that a custom UserIdentity subclass survives the SDK round-trip back to context.UserIdentity inside a method handler. T17 must prove the round-trip (integration assertion); fallback is populating GrantedRoleIds (NodeIdCollection) by mapping role strings → role NodeIds, which is more work.
- Double-emit is real, and delta-gating resolves it. WriteAlarmCondition calls ReportConditionEvent unconditionally (OtOpcUaNodeManager.cs:156); the node manager keeps no previous snapshot. Once inbound acks route through the engine, the SDK's own OnAcknowledgeCalled auto-fires event E2 (applying acked state to the node) and the engine round-trip re-projects → would fire E3. Because the SDK applies the acked state before the async engine round-trip completes, delta-gating WriteAlarmCondition against the node's current state suppresses E3 (no delta) while still firing on genuine engine-driven transitions. That's T20. (Fallback if it proves racy: the correlation-suppression option already sketched in the :190-198 in-code note, i.e. skip engine re-projection for inbound-originated transitions.)
Task 17: Carry LDAP roles onto the OPC UA session identity
Classification: high-risk · ~5 min · Parallelizable with: Task 22
Files:
- Create: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Security/RoleCarryingUserIdentity.cs (: UserIdentity, adds IReadOnlyList<string> Roles).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs:292 (args.Identity = new RoleCarryingUserIdentity(token, result.Roles)).
- Test: tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/OpcUaApplicationHostImpersonationTests.cs (existing home for HandleImpersonation).
Steps:
- Round-trip spike FIRST (de-risk the whole task). Before building anything, confirm the SDK preserves a custom IUserIdentity instance: in a booted in-process server test (mirror SdkAddressSpaceSinkTests' server fixture), set args.Identity to a sentinel subclass during impersonation and assert a method handler reads it back via (context as ISessionOperationContext)?.UserIdentity as that subclass. If it does NOT survive (SDK wraps/strips it), STOP and switch to the GrantedRoleIds approach; surface this as a scope change, don't silently expand.
- Failing unit test: HandleImpersonation on a successful auth sets args.Identity to a RoleCarryingUserIdentity whose Roles equals result.Roles (and the existing identity/denial/anonymous tests still pass).
- Implement RoleCarryingUserIdentity + the one-line :292 swap.
- Run (OpcUaServer.Tests + the impersonation tests) + commit by path.
Security-path change → high-risk. Touches only the identity construction; no auth-decision logic changes (roles were already resolved, just discarded). Do not change IOpcUaUserAuthenticator or the LDAP bind.
Task 18: Node-manager command router + AlarmAck veto gate + alarm-commands topic
Classification: high-risk · ~5 min · Parallelizable with: none (depends T17)
Files:
- Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/OpcUa/AlarmCommand.cs (record AlarmCommand(string AlarmId, string Operation, string User, string? Comment, DateTime? UnshelveAtUtc); Operation ∈ Acknowledge/Confirm/OneShotShelve/TimedShelve/Unshelve/Enable/Disable/AddComment).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OtOpcUaNodeManager.cs: add a settable Action<AlarmCommand>? AlarmCommandRouter; in MaterialiseAlarmCondition (after Create + initial state, before AddChild) wire alarm.OnAcknowledge/OnConfirm/OnAddComment/OnShelve/OnTimedUnshelve. Each delegate: (a) read principal via (context as ISessionOperationContext)?.UserIdentity as RoleCarryingUserIdentity, gate on AlarmAck → return StatusCodes.BadUserAccessDenied if absent; (b) invoke AlarmCommandRouter with the mapped AlarmCommand (so the engine updates the domain store + audit + alerts historization); (c) return ServiceResult.Good so the SDK applies node state + auto-fires (the engine re-projection is de-duped in T20).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OtOpcUaSdkServer.cs (pass-through to expose the router setter on the node manager).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/OtOpcUaServerHostedService.cs (after the server starts + node manager exists: resolve the DPS mediator, set the router to mediator.Tell(new Publish(ScriptedAlarmHostActor.AlarmCommandsTopic, cmd))).
- Add the topic const AlarmCommandsTopic = "alarm-commands" on ScriptedAlarmHostActor (used here and in T19).
- Test: OpcUaServer.Tests: veto gate allows with AlarmAck / denies without (drive a wired condition's OnAcknowledge with a RoleCarryingUserIdentity context); router invoked with the correctly-mapped AlarmCommand (fake Action).
Steps: TDD the gate + router-mapping in the node manager; then the SDK-server pass-through; then
the hosted-service wiring (no unit test for the boot wiring — exercised by T23 live-verify). Commit
by path. Serialize with T20 (both touch OtOpcUaNodeManager.cs).
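The gate-then-route order of steps (a) to (c) can be illustrated without the SDK. The sketch below is an assumption-laden model, not the node-manager code: IdentitySketch stands in for RoleCarryingUserIdentity, GateResult stands in for the SDK's ServiceResult / StatusCodes values, and the handler signature is simplified (the real delegates receive an SDK context and condition).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the AlarmCommand record this task creates.
public sealed record AlarmCommand(string AlarmId, string Operation, string User, string? Comment, DateTime? UnshelveAtUtc);

// Hypothetical stand-in for the role-carrying session identity.
public sealed record IdentitySketch(string DisplayName, IReadOnlyList<string> Roles);

// Hypothetical stand-in for the SDK result codes the real handler returns.
public enum GateResult { Good, BadUserAccessDenied }

public sealed class AlarmCommandGateSketch
{
    // Set once at boot by the hosting layer; the gate itself has no messaging dependency.
    public Action<AlarmCommand>? AlarmCommandRouter { get; set; }

    public GateResult OnAcknowledge(IdentitySketch? identity, string alarmId, string? comment)
    {
        // (a) Veto: a missing identity or a missing AlarmAck role denies before anything is routed.
        if (identity is null || !identity.Roles.Contains("AlarmAck"))
            return GateResult.BadUserAccessDenied;

        // (b) Forward the mapped command; a null router (not wired yet) is tolerated.
        AlarmCommandRouter?.Invoke(new AlarmCommand(alarmId, "Acknowledge", identity.DisplayName, comment, null));

        // (c) Good lets the caller (the SDK, in the real code) apply node state.
        return GateResult.Good;
    }
}
```

Keeping the router as a plain settable delegate mirrors the constraint that the node manager never touches Akka; at boot the delegate would wrap the mediator publish.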
Task 19: ScriptedAlarmHostActor inbound command handler
Classification: standard · ~4 min · Parallelizable with: Task 20 (depends T18)
Files:
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ScriptedAlarms/ScriptedAlarmHostActor.cs: subscribe to AlarmCommandsTopic in PreStart (alongside the existing _mediator use); Receive<AlarmCommand>(OnAlarmCommand); OnAlarmCommand is async void, switches on Operation → the matching engine.<Op>Async(AlarmId, User, …, CancellationToken.None). Ownership filter: if the engine doesn't own AlarmId, no-op (multi-node broadcast). Catch + log op failures (mirror OnLoadFailed). No explicit re-projection: the engine's OnEvent drives the existing OnEngineEmission → node update.
- Test: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ScriptedAlarms/ScriptedAlarmHostActorTests.cs (extend). TestKit: an AlarmCommand{Operation="Acknowledge"} for a loaded alarm calls the engine's AcknowledgeAsync (fake/probe engine seam); an unknown AlarmId is ignored; TimedShelve without UnshelveAtUtc is rejected/logged, not thrown.
Steps: TDD via the existing host-actor test seam; run dotnet test --filter ScriptedAlarmHostActor;
commit by path.
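The dispatch logic the handler needs (ownership filter, operation switch, rejected TimedShelve, caught failures) can be sketched outside the actor. Everything here is illustrative: IAlarmEngineSketch is a hypothetical seam covering only two of the engine ops, its signatures are assumptions rather than the verified ScriptedAlarmEngine ones, and RecordingEngineSketch is a test double.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed record AlarmCommand(string AlarmId, string Operation, string User, string? Comment, DateTime? UnshelveAtUtc);

// Hypothetical seam over a subset of engine ops; the real engine exposes more.
public interface IAlarmEngineSketch
{
    bool Owns(string alarmId);
    Task AcknowledgeAsync(string alarmId, string user, string? comment, CancellationToken ct);
    Task TimedShelveAsync(string alarmId, string user, DateTime unshelveAtUtc, CancellationToken ct);
}

public static class AlarmCommandDispatchSketch
{
    // Returns true when an engine op ran; false when the command was ignored, rejected, or failed.
    public static async Task<bool> DispatchAsync(IAlarmEngineSketch engine, AlarmCommand cmd, Action<string> log)
    {
        // Ownership filter: commands are broadcast, so non-owners silently no-op.
        if (!engine.Owns(cmd.AlarmId)) return false;

        try
        {
            switch (cmd.Operation)
            {
                case "Acknowledge":
                    await engine.AcknowledgeAsync(cmd.AlarmId, cmd.User, cmd.Comment, CancellationToken.None);
                    return true;
                case "TimedShelve" when cmd.UnshelveAtUtc is { } until:
                    await engine.TimedShelveAsync(cmd.AlarmId, cmd.User, until, CancellationToken.None);
                    return true;
                case "TimedShelve":
                    log($"TimedShelve for {cmd.AlarmId} rejected: UnshelveAtUtc is required");
                    return false;
                default:
                    log($"Unsupported operation '{cmd.Operation}' for {cmd.AlarmId}");
                    return false;
            }
        }
        catch (Exception ex)
        {
            // Op failures are logged, never rethrown into the message loop.
            log($"{cmd.Operation} failed for {cmd.AlarmId}: {ex.Message}");
            return false;
        }
    }
}

// Test double that records which ops were called.
public sealed class RecordingEngineSketch : IAlarmEngineSketch
{
    private readonly HashSet<string> _owned;
    public List<string> Calls { get; } = new();
    public RecordingEngineSketch(params string[] ownedAlarmIds) => _owned = new HashSet<string>(ownedAlarmIds);
    public bool Owns(string alarmId) => _owned.Contains(alarmId);
    public Task AcknowledgeAsync(string alarmId, string user, string? comment, CancellationToken ct)
    { Calls.Add($"Acknowledge:{alarmId}:{user}"); return Task.CompletedTask; }
    public Task TimedShelveAsync(string alarmId, string user, DateTime unshelveAtUtc, CancellationToken ct)
    { Calls.Add($"TimedShelve:{alarmId}"); return Task.CompletedTask; }
}
```

Returning a Task from the dispatch helper keeps it unit-testable; how the actor awaits or pipes the result is left to the host-actor implementation.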
Task 20: Delta-gate event firing (kill the inbound double-emit)
Classification: high-risk · ~4 min · Parallelizable with: Task 19 (depends T18)
Files:
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OtOpcUaNodeManager.cs: keep a ConcurrentDictionary<string, AlarmConditionSnapshot> _lastAlarmState; in WriteAlarmCondition, after projecting, compare the new state to the stored snapshot and only call ReportConditionEvent when it differs (then store it). Replace the now-stale :151-156 "fire exactly one event" comment + tighten the :190-198 double-emit note to "resolved by delta-gate".
- Test: tests/.../OpcUaServer.Tests/SdkAddressSpaceSinkTests.cs (extend): two identical WriteAlarmCondition calls fire the condition event once; a changed call fires again. (Assert via an event-count probe / monitored-item on the booted server fixture.)
Steps: TDD the delta-gate; run OpcUaServer.Tests; commit by path. If the booted-server test
can't cleanly count events, fall back to asserting the gate's decision via a seam and prove
end-to-end in T23. Serialize after T18 (same file).
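The gate's decision, separated from the SDK, might look like the sketch below (the "seam" fallback mentioned above). It is a hedged illustration: the AlarmConditionSnapshot fields are hypothetical, and the class name DeltaGateSketch is invented for this example.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical snapshot fields; the real AlarmConditionSnapshot may carry different state.
public sealed record AlarmConditionSnapshot(bool Active, bool Acked, bool Confirmed, string Shelving, int Severity);

public sealed class DeltaGateSketch
{
    private readonly ConcurrentDictionary<string, AlarmConditionSnapshot> _lastAlarmState = new();

    // Returns true when the projected state differs from the last stored one
    // (or nothing was stored yet), i.e. when the caller should fire the condition event.
    public bool ShouldReport(string alarmId, AlarmConditionSnapshot state)
    {
        var changed = false;
        _lastAlarmState.AddOrUpdate(
            alarmId,
            _ => { changed = true; return state; },                                  // first write always reports
            (_, previous) => { changed = !previous.Equals(state); return state; });  // records compare by value
        return changed;
    }
}
```

One caveat under this sketch's assumptions: a gate keyed on the last written snapshot only suppresses the post-ack re-projection if that snapshot (or the node's live state) already reflects the SDK-applied ack, so the comparison source is worth pinning down when the test is written.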
Task 21: AdminUI ack/shelve control
Classification: standard · ~5 min · Parallelizable with: none (depends T19)
Files:
- Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Admin/AcknowledgeAlarmCommand.cs + ShelveAlarmCommand.cs (control-plane messages, mirror StartDeployment's shape with a CorrelationId).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/AdminOperationsActor.cs (the existing admin-pinned cluster singleton): ReceiveAsync handlers that publish onto alarm-commands (reusing T18's topic + the host's ownership filter → the singleton solves cross-node routing for the AdminUI path too).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Clients/AdminOperationsClient.cs (+ its interface): AcknowledgeAlarmAsync/ShelveAlarmAsync (mirror StartDeploymentAsync).
- Modify: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Alerts.razor: per-row Acknowledge / Shelve buttons → IAdminOperationsClient; operator from AuthState … User.Identity?.Name.
- Test: the control-plane command service + the new AdminOperationsActor handlers (TestKit / Ask). No bUnit; the razor is proven in T23.
Steps: TDD the messages + actor handlers + client; wire the razor; run + commit by path.
Task 22: Client.CLI ack / confirm / shelve commands
Classification: standard · ~5 min · Parallelizable with: Tasks 17–21 (disjoint Client.*)
Files:
- Modify: src/Client/ZB.MOM.WW.OtOpcUa.Client.Shared/IOpcUaClientService.cs + OpcUaClientService.cs: AcknowledgeAlarmAsync already declared (no command wires it yet); add ConfirmAlarmAsync + ShelveAlarmAsync (call the SDK Confirm/OneShotShelve/TimedShelve/Unshelve methods on the condition).
- Create: src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI/Commands/{Acknowledge,Confirm,Shelve}Command.cs (--node, --event-id, --comment; shelve adds --kind OneShot|Timed --unshelve-at).
- Test: unit-test what's pure (arg→request mapping); the live round-trip is T23.
Steps: add the service methods + CLI commands; build; commit by path. This is net-new client feature work (the reason old "T19 verification" couldn't just be a verify pass).
Task 23: Live-verify Layer 2 end-to-end
Classification: verification · Parallelizable with: none (depends T18, T19, T20, T21, T22)
Steps: docker-dev up. Use the already-deployed t12-overheat alarm (rig state, below) as the
live condition. With Client.CLI: alarms --refresh shows the real condition; drive
TestMachine_002.TestChangingInt past the predicate so it fires an event on transition; call the new
acknowledge command → confirm the ack round-trips (node AckedState flips, exactly one ack
event fires — no double-emit, T20 — state persists across a node restart). Repeat the ack from
the AdminUI /alerts buttons (T21) and confirm parity. Verify the AlarmAck gate: a user without
AlarmAck is denied (BadUserAccessDenied). Agent does not sign in — user drives. Defects → new
fix tasks.
Task 24: Docs + cleanup + finish branch
Classification: small · ~5 min · Parallelizable with: none (depends all)
Files: update docs/ScriptedAlarms.md, docs/VirtualTags.md, docs/v2/Runtime.md,
docs/AlarmTracking.md (the inbound-ack + AlarmAck-gate flow is now real); correct the stale
docs/v2/phase-7-status.md alarm-runtime status; CLAUDE.md note. Clean up the deliberately-left
rig artifacts (t12-overheat, script SC-ba675b168a85, the layer0-logcheck vtag, and revert
filler-02's inert cycle-time-s logger line — redeploy). Delete/condense resume.md + pending.md.
Then run superpowers-extended-cc:finishing-a-development-branch (full dotnet test, merge to master).
Execution notes
- Parallel dispatch (Layers 0+1, done): Layer 0 serial (T1→T2→T3→T4). Layer 1: T5→T6 serial; T7, T8 parallel with T5/T6; T9 waits on T6/T7/T8; T10→T11→T12 serial.
- Parallel dispatch (Layer 2 remainder, T17–T24):
  - T17 first (roles): its step-1 round-trip spike is a go/no-go gate for the gate design.
  - T18 after T17 (the veto gate needs the roles). T19 ∥ T20 after T18 (disjoint files: ScriptedAlarmHostActor.cs vs OtOpcUaNodeManager.cs).
  - T22 runs in parallel with the whole T17–T21 server chain (only Client.* files).
  - T21 after T19. T23 after T18/T19/T20/T21/T22. T24 last.
- One writer at a time within a shared file: OtOpcUaNodeManager.cs is touched by T18 and T20; serialize T18 → T20. (Layers 0/1 already merged, so Program.cs/T14-T16 contention is moot.)
- Layer boundaries are natural checkpoints: Layers 0+1 shipped; the T17 round-trip spike is the next gate before committing to the rest of the Layer 2 inbound epic.