HistorianAdapterActor now subscribes to the redundancy-state DPS topic,
caches the local node's RedundancyRole, and SKIPS the durable-sink enqueue
when the local node is Secondary or Detached. Unknown/null role default-writes
so single-node deploys and the boot window never silently drop historization.
GetStatus stays ungated.
PREMISE: verified the actor is registered but FED BY NOTHING in production —
there is no AlarmHistorianEvent producer and nothing resolves its registry key
to Tell it. This is a FORWARD-LOOKING / DEFENSIVE guard, not a fix for a live
double-write: the moment a per-node feeder lands (engine -> historian, expected
as a per-node cluster broadcast like the alerts topic), only the Primary will
write to the durable sink (exactly-once across all alarm sources).
Mirrors the sibling A1 treatment of ScriptedAlarmHostActor (06c4155) and
OpcUaPublishActor's redundancy-state handler. localNode threaded through
HistorianAdapterActor.Props from ServiceCollectionExtensions (roleInfo.LocalNode).
T21: add an AdminUI path for acknowledging/shelving alarms that routes
through the admin-pinned AdminOperationsActor cluster singleton, which
republishes onto the same 'alarm-commands' DPS topic the OPC UA method
path (T18) and the engine subscriber (T19) use. The broadcast + the
ScriptedAlarmHostActor ownership filter handle cross-node routing, so
the singleton needs no knowledge of which node owns the alarm.
- Commons: AcknowledgeAlarmCommand/ShelveAlarmCommand (+ result records)
and a shared AlarmCommandsTopic const; ScriptedAlarmHostActor now
re-exports that const (mirrors the DriverControlTopic pattern).
- AdminOperationsActor: two handlers map the control-plane messages to
AlarmCommand (Acknowledge / OneShotShelve / TimedShelve / Unshelve,
threading User/Comment/UnshelveAtUtc) and publish via the DPS mediator.
- IAdminOperationsClient + AdminOperationsClient: typed Acknowledge/Shelve
ask wrappers mirroring StartDeploymentAsync.
- Alerts.razor: per-row DriverOperator-gated Ack/Shelve/Unshelve controls;
operator name from AuthenticationState. Timed-shelve datetime UI deferred.
- 5 TestKit tests (mediator-probe subscribed to alarm-commands) verifying
each kind's mapping + reply; 56/56 ControlPlane tests green.
Three code-review points on commit 004558c2 were correct behavior
that was under-documented, not bugs:
1. AlarmConditionDelta gains explicit paragraphs explaining why
CommentAdded is absent: it always originates from a client
AddComment call whose T18 OnAddComment handler returns Good →
SDK auto-fires the comment event (E2); the engine re-projection
carries no delta-field change, so the gate correctly suppresses
the duplicate. Force-firing would double-emit.
2. Same doc explains Retain is intentionally absent: Retain is a
pure function of Active/Acknowledged (both compared), so it
cannot flip without a real delta. Notes future risk if that
ever changes.
3. ReportConditionEvent Time/ReceiveTime comment corrected: the
projection was already applied by WriteAlarmCondition above
with identical values; the restamp is a locality repeat, not a
reorder guard.
Also adds one seam unit-test (103 total, was 102) pinning the
null-vs-empty Message normalization boundary so a change to the
?? string.Empty coalescing is caught at the seam level.
Validate AddComment up-front (IsNullOrWhiteSpace guard + Warning log) so
a blank-comment command is cleanly rejected before reaching the engine
rather than faulting inside ApplyAddComment and being silently swallowed
by the outer catch. Mirrors the existing TimedShelve missing-UnshelveAtUtc
pattern.
Also fix two stale inline comments: the "async void crash" note on
TimedShelve now correctly says "fault escaping async Task → supervision
restart", and the ownership-filter now documents the benign race with a
concurrent LoadAsync clearing the loaded set.
Tests: AlarmCommand_add_comment_empty_text_is_rejected_not_driven (Theory
— empty string + whitespace) and AlarmCommand_add_comment_nonempty_drives_engine
(positive path, asserts CommentAdded transition on alerts topic).
Inbound client acks now route through the engine (T18/T19). On a successful
Acknowledge the T18 gate returns Good, so the SDK applies the acked state to the
AlarmConditionState node and auto-fires its own condition event (E2) -- directly
on the node, bypassing WriteAlarmCondition. The engine then re-projects that same
transition through WriteAlarmCondition, which fired again (E3): a double-emit.
Gate WriteAlarmCondition's ReportConditionEvent on a genuine delta computed
against the node's CURRENT live state (read before projecting the snapshot), not
a last-written cache (which would be stale, since the SDK-applied ack never went
through this method). For a re-projected ack the snapshot equals the node's
already-applied state -> no delta -> suppress E3. Genuine engine-driven
transitions still differ -> fire.
Compared fields (value-equality via AlarmConditionDelta record): Active, Acked,
Confirmed, Enabled, Shelving (mapped from the shelving state machine), Severity
(mapped through MapSeverity to match the bucket the node stores), Message.
Optional Confirmed/Shelving fold to the node read-back default when the child is
absent so they can't register a phantom delta.
Tests prove both: suppression of the simulated inbound ack re-projection
(EventId unchanged) and that genuine transitions fire while identical
re-projections suppress; plus a direct unit test of the ShouldFireConditionEvent
seam. 102/102 OpcUaServer.Tests green.
Subscribe the host to the cluster alarm-commands DPS topic in PreStart and
drive the matching ScriptedAlarmEngine op per inbound AlarmCommand. An
ownership filter (engine.LoadedAlarmIds) ignores commands for alarms this
node does not own; TimedShelve without UnshelveAtUtc and unknown operations
are logged + rejected (never thrown); op failures are caught + logged so a
faulting op can't fault the actor. Re-projection is left to the engine's
existing OnEvent -> OnEngineEmission path.
Handler is a Task-returning ReceiveAsync (the project's AK2003 analyzer
forbids an async-void Receive delegate), giving ordered awaited async on the
actor thread. Adds 3 TestKit tests: ack drives the engine with mapped args,
unowned command ignored, missing-UnshelveAtUtc TimedShelve rejected not
thrown.
The SDK fires OnTimedUnshelve with the node manager's system context (no
session, no user identity) when a TimedShelve duration expires. Routing
through the shared HandleAlarmCommand hit the AlarmAck gate and returned
BadUserAccessDenied, leaving the alarm permanently shelved.
Replace the delegated HandleAlarmCommand call with an inline lambda that
bypasses the client gate, extracts the AlarmId the same way, and routes an
Unshelve command so the engine clears its shelve state. The manual-client
Unshelve path via OnShelve(shelving:false) remains gated.
Update the AlarmCommandRouterTests OnTimedUnshelve test to use a real
system context (no UserIdentity) — reproducing the actual SDK invocation
path — and assert Good, AlarmId, Operation==Unshelve, User==empty.
Add a doc note to AlarmCommand.Operation that Enable/Disable are in the
vocabulary but not yet wired at the node-manager seam.
Wire the materialised AlarmConditionState method handlers so a client calling
Acknowledge/Confirm/Shelve/AddComment is gated on the AlarmAck data-plane role
and, when allowed, routed back to the scripted-alarm engine via a new
`alarm-commands` DistributedPubSub topic.
- Commons: new AlarmCommand DTO (AlarmId/Operation/User/Comment/UnshelveAtUtc).
- ScriptedAlarmHostActor: add AlarmCommandsTopic const.
- OtOpcUaNodeManager: settable AlarmCommandRouter + wire OnAcknowledge/OnConfirm/
OnAddComment/OnShelve/OnTimedUnshelve. Each resolves the principal off
ISessionOperationContext.UserIdentity as RoleCarryingUserIdentity, fails closed
(BadUserAccessDenied) when the AlarmAck role is absent or no identity, else maps
+ routes an AlarmCommand and returns Good. OnShelve discriminates OneShotShelve/
TimedShelve/Unshelve from the SDK flags; TimedShelve expiry = UtcNow + ms.
No Akka/IActorRef handle — only the Action<AlarmCommand> delegate. T20 de-dup
note left; WriteAlarmCondition untouched.
- OpcUaServer.Security: OpcUaDataPlaneRoles.AlarmAck shared const (the role was a
bare string everywhere; introduced one symbol for the gate + tests).
- OtOpcUaSdkServer: SetAlarmCommandRouter pass-through.
- Host: boot wiring publishes each command via mediator.Tell(Publish(...)) using a
lazy ActorSystem accessor (mirrors DpsScriptLogPublisher).
- Tests: 11 new gate + mapping tests (OpcUaServer.Tests 88->99, all green).
Important 1: ShelveAlarmAsync Timed branch now multiplies shelvingTimeSeconds × 1000.0
before passing to CallMethodAsync — OPC UA Part 9 TimedShelve ShelvingTime is a Duration
in milliseconds, not seconds. IOpcUaClientService XML doc and ShelveCommand --duration
description updated to document the seconds-in / ms-out contract.
Important 2: ShelveAlarmAsync builds shelvingStateNodeId with the same
EndsWith(".ShelvingState") guard already used by the .Condition suffix in
AcknowledgeAlarmAsync / ConfirmAlarmAsync, preventing double-append.
Important 3: Add 6 service-layer tests to OpcUaClientServiceTests —
ConfirmAlarmAsync_OnSuccess_ReturnsGood
ConfirmAlarmAsync_OnServiceResultException_ReturnsBadStatusCode
ShelveAlarmAsync_OneShot_CallsMethodWithNoArgs
ShelveAlarmAsync_Timed_PassesDurationInMilliseconds (regression guard for Important 1)
ShelveAlarmAsync_Unshelve_CallsMethodWithNoArgs
ShelveAlarmAsync_OnServiceResultException_ReturnsBadStatusCode
FakeSessionAdapter extended with CallMethodInputArgs list to record per-call input
arguments so the Timed test can assert the ms value.
Minor 4: ShelveCommand output changed from "Shelve (OneShot) successful" to
"{shelveKind} successful/failed" so Unshelve reads "Unshelve successful: …".
Minor 6: ShelveCommand --duration description updated to "(must be > 0; in seconds,
converted to milliseconds for the OPC UA call; required for --kind Timed)".
Adds ConfirmAlarmAsync and ShelveAlarmAsync to IOpcUaClientService and
OpcUaClientService (mirroring the AcknowledgeAlarmAsync pattern: same
CallMethodAsync/ServiceResultException/StatusCode contract). Adds ShelveKind
enum (OneShot/Timed/Unshelve). Adds three new CLI commands — ack, confirm,
shelve — with hex EventId input and per-command argument validation.
Updates both fakes (CLI + UI) to implement the new interface members and
record calls. Adds 16 unit tests covering argument mapping, invalid-input
CommandException paths, bad-status output, and disconnect-in-finally.
Stop discarding the authenticator's resolved roles during impersonation.
HandleImpersonation now sets args.Identity to a RoleCarryingUserIdentity
(: UserIdentity) that carries result.Roles, so a downstream method handler
can read them off context.UserIdentity for the inbound AlarmAck gate (T18).
Verified via the decompiled SDK (1.5.378.106) that the instance we assign to
ImpersonateEventArgs.Identity is stored by reference onto Session.Identity /
EffectiveIdentity and surfaced unchanged on OperationContext.UserIdentity --
the custom subclass survives the round-trip. No auth-decision logic changes.
Default HistorizeToAveva/Retain/Enabled to the entity defaults (true) when a
field is absent/null/non-boolean so a partial blob decodes identically to the
composer's view of a default-constructed ScriptedAlarm (byte-parity), and only
call GetBoolean for a genuine true/false token. Add direct ExtractAlarmDependencyRefs
unit tests (overlap dedup + reserved {{equip}} exclusion).
Concrete ITagUpstreamSource the scripted-alarm host actor pushes
DependencyValueChanged values into and ScriptedAlarmEngine reads/subscribes
from. Thread-safe: ConcurrentDictionary value cache + per-path ImmutableList
observer lists with atomic add/remove and capture-then-invoke fan-out.
ReadTag of an unknown path returns a Bad-quality (0x80000000) snapshot stamped
via the injected clock. Adds the Core.ScriptedAlarms project reference Runtime
needs to see the interface.
Composes the one Layer-0 hop existing tests left uncovered together:
ScriptLogSignalRBridge subscribing to the script-logs DPS topic and
fanning a ScriptLogEntry out to the IInProcessBroadcaster<ScriptLogEntry>
singleton resolved from the SAME DI container the /script-log page injects.
Mirrors DriverStatusHubE2eTests. Confirms the server-side topic→page chain
delivers end-to-end (only the live Blazor circuit remains manual).