lmxopcua

Author	SHA1	Message	Date
Joseph Doherty	943c621371	feat(historian): config-gated SqliteStoreAndForward→Wonderware sink (AddAlarmHistorian)	2026-06-11 11:30:31 -04:00
Joseph Doherty	e9355e9514	refactor(historian): gate before translate (no discarded alloc on secondary) + strengthen double-write warning (review)	2026-06-11 11:24:48 -04:00
Joseph Doherty	bb42e5834a	feat(historian): subscribe to alerts topic + translate to AlarmHistorianEvent (Primary-gated, exactly-once)	2026-06-11 11:18:26 -04:00
Joseph Doherty	8ac3ac5be9	feat(alarms): carry AlarmTypeName + operator Comment on AlarmTransitionEvent (historian feed prep)	2026-06-11 11:03:00 -04:00
Joseph Doherty	0742946108	feat(redundancy): gate alarm historization on Primary (A2, defensive — actor currently unfed) HistorianAdapterActor now subscribes to the redundancy-state DPS topic, caches the local node's RedundancyRole, and SKIPS the durable-sink enqueue when the local node is Secondary or Detached. Unknown/null role default-writes so single-node deploys and the boot window never silently drop historization. GetStatus stays ungated. PREMISE: verified the actor is registered but FED BY NOTHING in production — there is no AlarmHistorianEvent producer and nothing resolves its registry key to Tell it. This is a FORWARD-LOOKING / DEFENSIVE guard, not a fix for a live double-write: the moment a per-node feeder lands (engine -> historian, expected as a per-node cluster broadcast like the alerts topic), only the Primary will write to the durable sink (exactly-once across all alarm sources). Mirrors the sibling A1 treatment of ScriptedAlarmHostActor (`06c4155`) and OpcUaPublishActor's redundancy-state handler. localNode threaded through HistorianAdapterActor.Props from ServiceCollectionExtensions (roleInfo.LocalNode).	2026-06-11 08:57:41 -04:00
Joseph Doherty	06c415598c	feat(redundancy): gate scripted-alarm alerts publish on Primary (A1)	2026-06-11 08:44:44 -04:00
Joseph Doherty	370a2b7b48	feat(alerts): AdminUI alarm ack/shelve via AdminOperationsActor singleton T21: add an AdminUI path for acknowledging/shelving alarms that routes through the admin-pinned AdminOperationsActor cluster singleton, which republishes onto the same 'alarm-commands' DPS topic the OPC UA method path (T18) and the engine subscriber (T19) use. The broadcast + the ScriptedAlarmHostActor ownership filter handle cross-node routing, so the singleton needs no knowledge of which node owns the alarm. - Commons: AcknowledgeAlarmCommand/ShelveAlarmCommand (+ result records) and a shared AlarmCommandsTopic const; ScriptedAlarmHostActor now re-exports that const (mirrors the DriverControlTopic pattern). - AdminOperationsActor: two handlers map the control-plane messages to AlarmCommand (Acknowledge / OneShotShelve / TimedShelve / Unshelve, threading User/Comment/UnshelveAtUtc) and publish via the DPS mediator. - IAdminOperationsClient + AdminOperationsClient: typed Acknowledge/Shelve ask wrappers mirroring StartDeploymentAsync. - Alerts.razor: per-row DriverOperator-gated Ack/Shelve/Unshelve controls; operator name from AuthenticationState. Timed-shelve datetime UI deferred. - 5 TestKit tests (mediator-probe subscribed to alarm-commands) verifying each kind's mapping + reply; 56/56 ControlPlane tests green.	2026-06-11 06:44:27 -04:00
Joseph Doherty	1d7e2a0f8b	fix(runtime): reject empty AddComment instead of silently swallowing it Validate AddComment up-front (IsNullOrWhiteSpace guard + Warning log) so a blank-comment command is cleanly rejected before reaching the engine rather than faulting inside ApplyAddComment and being silently swallowed by the outer catch. Mirrors the existing TimedShelve missing-UnshelveAtUtc pattern. Also fix two stale inline comments: the "async void crash" note on TimedShelve now correctly says "fault escaping async Task → supervision restart", and the ownership-filter now documents the benign race with a concurrent LoadAsync clearing the loaded set. Tests: AlarmCommand_add_comment_empty_text_is_rejected_not_driven (Theory — empty string + whitespace) and AlarmCommand_add_comment_nonempty_drives_engine (positive path, asserts CommentAdded transition on alerts topic).	2026-06-11 06:32:53 -04:00
Joseph Doherty	4f7999eac2	feat(alarms): consume alarm-commands topic in ScriptedAlarmHostActor (T19) Subscribe the host to the cluster alarm-commands DPS topic in PreStart and drive the matching ScriptedAlarmEngine op per inbound AlarmCommand. An ownership filter (engine.LoadedAlarmIds) ignores commands for alarms this node does not own; TimedShelve without UnshelveAtUtc and unknown operations are logged + rejected (never thrown); op failures are caught + logged so a faulting op can't fault the actor. Re-projection is left to the engine's existing OnEvent -> OnEngineEmission path. Handler is a Task-returning ReceiveAsync (the project's AK2003 analyzer forbids an async-void Receive delegate), giving ordered awaited async on the actor thread. Adds 3 TestKit tests: ack drives the engine with mapped args, unowned command ignored, missing-UnshelveAtUtc TimedShelve rejected not thrown.	2026-06-11 06:23:08 -04:00
Joseph Doherty	63289d377c	feat(alarms): route inbound Part 9 alarm methods through AlarmAck gate (T18) Wire the materialised AlarmConditionState method handlers so a client calling Acknowledge/Confirm/Shelve/AddComment is gated on the AlarmAck data-plane role and, when allowed, routed back to the scripted-alarm engine via a new `alarm-commands` DistributedPubSub topic. - Commons: new AlarmCommand DTO (AlarmId/Operation/User/Comment/UnshelveAtUtc). - ScriptedAlarmHostActor: add AlarmCommandsTopic const. - OtOpcUaNodeManager: settable AlarmCommandRouter + wire OnAcknowledge/OnConfirm/ OnAddComment/OnShelve/OnTimedUnshelve. Each resolves the principal off ISessionOperationContext.UserIdentity as RoleCarryingUserIdentity, fails closed (BadUserAccessDenied) when the AlarmAck role is absent or no identity, else maps + routes an AlarmCommand and returns Good. OnShelve discriminates OneShotShelve/ TimedShelve/Unshelve from the SDK flags; TimedShelve expiry = UtcNow + ms. No Akka/IActorRef handle — only the Action<AlarmCommand> delegate. T20 de-dup note left; WriteAlarmCondition untouched. - OpcUaServer.Security: OpcUaDataPlaneRoles.AlarmAck shared const (the role was a bare string everywhere; introduced one symbol for the gate + tests). - OtOpcUaSdkServer: SetAlarmCommandRouter pass-through. - Host: boot wiring publishes each command via mediator.Tell(Publish(...)) using a lazy ActorSystem accessor (mirrors DpsScriptLogPublisher). - Tests: 11 new gate + mapping tests (OpcUaServer.Tests 88->99, all green).	2026-06-11 06:05:39 -04:00
Joseph Doherty	4eb1d65e2b	feat(scripted-alarms): richer AlarmConditionState bridge to the OPC UA node (T15)	2026-06-10 19:41:16 -04:00
Joseph Doherty	60d48a2a0a	feat(scripted-alarms): materialise real Part 9 AlarmConditionState nodes (T14)	2026-06-10 19:19:10 -04:00
Joseph Doherty	fc0d43a3dc	refactor(scripted-alarms): retire orphaned ScriptedAlarmActor + F9b evaluator (T11)	2026-06-10 15:22:26 -04:00
Joseph Doherty	5256761368	feat(scripted-alarms): spawn + apply ScriptedAlarmHostActor in DriverHostActor (T10)	2026-06-10 15:17:29 -04:00
Joseph Doherty	dafaf2faec	fix(scripted-alarms): ScriptedAlarmHostActor review fixes — load-gen guard, quiet cancel, parse guard (T9 review)	2026-06-10 15:08:54 -04:00
Joseph Doherty	3b418a54f1	feat(scripted-alarms): ScriptedAlarmHostActor — engine runtime host (T9)	2026-06-10 14:57:42 -04:00
Joseph Doherty	c9590c03d0	fix(scripted-alarms): harden artifact boolean decode + direct helper tests (T6 review) Default HistorizeToAveva/Retain/Enabled to the entity defaults (true) when a field is absent/null/non-boolean so a partial blob decodes identically to the composer's view of a default-constructed ScriptedAlarm (byte-parity), and only call GetBoolean for a genuine true/false token. Add direct ExtractAlarmDependencyRefs unit tests (overlap dedup + reserved {{equip}} exclusion).	2026-06-10 14:47:24 -04:00
Joseph Doherty	8e8ca9efe8	feat(scripted-alarms): DeploymentArtifact byte-parity for the alarm plan (T6)	2026-06-10 14:41:46 -04:00
Joseph Doherty	55101baaa4	refactor(scripted-alarms): review-fix polish for T5/T7/T8 (observer isolation, warning hoist, doc)	2026-06-10 14:32:49 -04:00
Joseph Doherty	1c96fe0be0	feat(scripted-alarms): EfAlarmConditionStateStore (T8)	2026-06-10 14:21:19 -04:00
Joseph Doherty	945ccd0b85	feat(scripted-alarms): DependencyMuxTagUpstreamSource (T7) Concrete ITagUpstreamSource the scripted-alarm host actor pushes DependencyValueChanged values into and ScriptedAlarmEngine reads/subscribes from. Thread-safe: ConcurrentDictionary value cache + per-path ImmutableList observer lists with atomic add/remove and capture-then-invoke fan-out. ReadTag of an unknown path returns a Bad-quality (0x80000000) snapshot stamped via the injected clock. Adds the Core.ScriptedAlarms project reference Runtime needs to see the interface.	2026-06-10 14:20:02 -04:00
Joseph Doherty	73014258ef	feat(scripting): root script logger + DPS publisher wired in Host	2026-06-10 11:50:50 -04:00
Joseph Doherty	66ea9c56f6	feat(runtime): DeploymentArtifact substitutes {{equip}} (parity with composer)	2026-06-10 07:53:20 -04:00
Joseph Doherty	d909a8e4f6	docs+test(deploy): clarify driver-less attribution docs + no-line exclusion test (Task 2 review)	2026-06-08 07:02:25 -04:00
Joseph Doherty	c688899134	fix(deploy): cluster-attribute driver-less equipment via its UNS line area (BuildClusterSets)	2026-06-08 06:53:41 -04:00
Joseph Doherty	6b36eff2d3	refactor(runtime): capture-first in HandleWriteAsync; assert no handler leak on resubscribe; fix stale comment	2026-06-07 10:31:20 -04:00
Joseph Doherty	98259ab026	fix(runtime): capture Sender before await in DriverInstanceActor subscribe (no-ActorContext race)	2026-06-07 10:26:17 -04:00
Joseph Doherty	4c221ce2b3	merge: equipment-namespace live values (VirtualTag route)	2026-06-07 09:33:21 -04:00
Joseph Doherty	1c579410cd	fix(runtime): flag cross-cluster orphan-equipment bindings on rebuild ParseComposition(blob, nodeId, onInconsistency?) detects a kept equipment whose UNS line belongs to another cluster (a same-cluster-invariant violation that would orphan the equipment folder) and reports it via an optional callback, wired to OpcUaPublishActor's logger. Detection-only; the upstream draft validator remains the authority. Adds two unit tests.	2026-06-07 08:24:11 -04:00
Joseph Doherty	2bfe18abcf	chore(runtime): warn on missing VirtualTag evaluator; document Stale-recovery VirtualTag behaviour Log a WARNING on startup when IVirtualTagEvaluator is not registered so a DI misconfig on a driver-role node is visible in logs instead of silently evaluating all VirtualTags to NoChange. Add a comment in PushDesiredSubscriptions noting that TryRecoverFromStale does not call this method, so VirtualTags remain empty after a Stale recovery until the next deployment dispatch (intentional, consistent with driver recovery).	2026-06-07 05:46:24 -04:00
Joseph Doherty	397f9b783a	feat(runtime): spawn+apply VirtualTagHostActor on deploy apply and restore	2026-06-07 05:41:04 -04:00
Joseph Doherty	5e2869bab7	fix(runtime): VirtualTagHost watches children + respawns after unexpected death Context.Watch each spawned child; OnChildTerminated evicts it from _children so the next ApplyVirtualTags (still containing that vtagId) falls through the ContainsKey guard and re-spawns a fresh VirtualTagActor. Adds a spawn-site Debug log, moves the TODO about in-place plan mutation to the skip-existing branch where it belongs, and adds a deterministic TestKit test (Child_is_respawned_after_unexpected_termination) that kills the first child, drains its UnregisterInterest from the mux probe, re-applies, and asserts a second distinct RegisterInterest arrives.	2026-06-07 05:34:50 -04:00
Joseph Doherty	85a36cec54	feat(runtime): VirtualTagHostActor spawns VTag actors + bridges results to OPC UA	2026-06-07 05:28:46 -04:00
Joseph Doherty	695e61dedf	feat(opcua): materialise Equipment VirtualTag variables on rebuild	2026-06-07 05:22:22 -04:00
Joseph Doherty	c7661d0510	feat(opcua): parse Equipment VirtualTag plans from the deployment artifact	2026-06-07 05:09:53 -04:00
Joseph Doherty	8ce57e47a3	feat(runtime): OPC UA rebuild materialises only the node's ClusterId slice	2026-06-07 03:23:02 -04:00
Joseph Doherty	1b7f995aea	feat(runtime): DriverHost spawns + subscribes only its own ClusterId's drivers	2026-06-07 03:19:22 -04:00
Joseph Doherty	4fca4e1aca	feat(runtime): node-scoped ParseComposition filters address space by ClusterId	2026-06-07 03:15:46 -04:00
Joseph Doherty	7b2f64fdb8	refactor(runtime): case-insensitive ClusterId/NodeId match + suppress short-circuit + edge tests (review)	2026-06-07 03:12:09 -04:00
Joseph Doherty	24796f2c12	feat(runtime): ClusterId scope resolution + node-scoped driver-spec parse	2026-06-07 03:05:02 -04:00
Joseph Doherty	aaf869145a	fix(opcua): equipment-tag planner diff + folder-scoped NodeIds (review findings) Two bundle-review fixes + idempotency coverage: - CRITICAL: the planner ignored EquipmentTags, so an incremental deploy changing only equipment tags produced an empty plan and HandleRebuild short-circuited before materialising them. Add TagId to EquipmentTagPlan + Added/Removed/ChangedEquipmentTags to Phase7Plan (diffed by TagId, in IsEmpty, driving Apply's needsRebuild) — mirroring the GalaxyTags treatment. - IMPORTANT: equipment variable NodeId was the raw driver FullName, which collides across identical machines (e.g. two PLCs both exposing register 40001) — the second variable was silently dropped. NodeId is now folder-scoped (parent/Name); FullName stays on EquipmentTagPlan for the later values-routing milestone. - Task 4: SDK-backed idempotency test (double-apply -> single variable); restart-safety confirmed (RestoreApplied reuses the same RebuildAddressSpace -> HandleRebuild path). - Minor: align composer equipment-tag sort with the artifact decoder (coalesce FolderPath).	2026-06-06 15:02:50 -04:00
Joseph Doherty	08cddfe128	fix(opcua): UNS equipment folders browse by friendly Name, NodeId stays the logical Id Equipment folder DisplayName was the colloquial MachineCode; the live rebuild (artifact ReadEquipmentNode) + composer now use the UNS level-5 Name segment, matching Area/Line folders + EquipmentNodeWalker. NodeId stays the logical EquipmentId so browse-path resolution + ACLs are unaffected.	2026-06-06 14:51:12 -04:00
Joseph Doherty	df0dc516c3	feat(opcua): materialise Equipment-namespace tags in the live rebuild Add Phase7Applier.MaterialiseEquipmentTags — a sink-based pass (Task-0 decision A) that ensures each EquipmentTagPlan's Variable (NodeId = FullName) under its existing equipment folder, nesting any FolderPath as a sub-folder. Wire it into OpcUaPublishActor.HandleRebuild after the Galaxy pass. Variables start BadWaitingForInitialData; never re-creates equipment folders (decision #4).	2026-06-06 14:46:38 -04:00
Joseph Doherty	febe462750	feat(opcua): carry Equipment-namespace tags through the deployment composition Add EquipmentTagPlan + an init-only EquipmentTags member on Phase7CompositionResult (mirror of GalaxyTags). Populate it compose-side (Tag.EquipmentId != null AND owning namespace Kind == Equipment) and artifact-decode-side via BuildEquipmentTagPlans, with FullName extracted from Tag.TagConfig. Init-only member (not a 7th positional param) so existing convenience constructors + call sites are untouched.	2026-06-06 14:42:38 -04:00
Joseph Doherty	b1b3f3ff23	fix(runtime): materialise from applied artifact + restore served state on bootstrap Two ordering/lifecycle gaps surfaced once tag values began streaming: 1. OpcUaPublishActor.HandleRebuild loaded the latest Sealed artifact, but the rebuild fires at apply time — before this deployment seals — so it materialised the PREVIOUS revision while SubscribeBulk subscribed to the applied one. The two disagreed (4 variables materialised vs 396 subscribed) and every config needed two deploys. RebuildAddressSpace now carries the applied DeploymentId and the rebuild loads that exact artifact. 2. On restart a node recovered its revision from NodeDeploymentState but left the driver children + address space empty (and an identical-config redeploy no-ops on the unchanged revision), so a rebuilt node served nothing until a config change. Bootstrap now calls RestoreApplied: re-spawn drivers, rebuild from the applied artifact, re-push SubscribeBulk — no re-ack. Verified live: recreating the driver nodes auto-restores all 396 galaxy mirror tags across 40 machines with Good live values, no deploy required.	2026-06-06 12:53:38 -04:00
Joseph Doherty	c1ce5833e9	feat(runtime): wire driver SubscribeBulk pass so tag values stream Materialised SystemPlatform/Galaxy variables previously stayed BadWaitingForInitialData because nothing told the driver to subscribe (OpcUaPublishActor TODO 'on a future SubscribeBulk pass') and published values were only forwarded to the VirtualTag mux, never the OPC UA sink. DriverHostActor now, after each apply, groups the deployment's galaxy tag MXAccess refs by driver and sends DriverInstanceActor.SetDesiredSubscriptions; the actor retains the set and (re)subscribes on every Connected entry, so values resume after reconnects/redeploys (closes the F8b/#113 gap). Published values are also forwarded to OpcUaPublishActor as AttributeValueUpdate (NodeId == galaxy MxAccessRef) so the materialised variable shows live data. Verified live in docker-dev: galaxy TestMachine_001 tags go Good with a changing TestChangingInt. +1 unit test.	2026-06-06 12:31:55 -04:00
Joseph Doherty	662f3f9f5c	refactor(driver-pages): address Phase 6/8 deep-review findings - Topic-name drift fix: DriverHealthChanged.TopicName and DriverControlTopic.Name now live on the message contracts in Commons. AkkaDriverHealthPublisher, DriverStatusSignalRBridge, DriverHostActor, and AdminOperationsActor all delegate to the single constant so a rename can't silently desynchronise publisher and subscriber. - DriverStatusPanel._opResultClearTimer switched from System.Timers.Timer to System.Threading.Timer + awaited DisposeAsync. Prevents an in-flight 8s clear-callback from invoking StateHasChanged on a component whose hub has already been released. - PublishHealthSnapshot deduplicates against the last published (state, lastSuccess, lastError, errorCount) fingerprint. The 30s heartbeat no longer floods the SignalR layer with identical Healthy snapshots — newly-joined clients still warm up via the snapshot store on JoinDriver.	2026-05-28 11:52:20 -04:00
Joseph Doherty	dcd2509548	refactor(driver-pages): address post-review follow-ups - DriverInstanceSpec carries ClusterId from the deployment artifact; DriverHostActor threads the real cluster identity into DriverInstanceActor instead of the local NodeId. Old pre-PR artifacts without a ClusterId field fall back to the NodeId so in-flight deployments keep working. - DriverHostActor.ChildEntry holds the full DriverInstanceSpec (was only carrying DriverType + LastConfigJson). Restart respawns preserve RowId, Name, Enabled, ClusterId — no placeholder values. - Drop the unnecessary _faultLock on DriverInstanceActor — every read/write site runs inside an Akka message handler which is single-threaded per actor instance. - DriverStatusPanel.DisposeAsync awaits Timer.DisposeAsync so an in-flight 5s tick can't invoke StateHasChanged on a component whose hub has already been torn down.	2026-05-28 11:41:46 -04:00
Joseph Doherty	ffcc8d1065	feat(adminui): Reconnect/Restart on DriverStatusPanel (DriverOperator-gated) - RestartDriver / ReconnectDriver messages + AdminOperationsActor handlers (broadcast via driver-control DPS topic; audited via ConfigEdits). - DriverHostActor subscribes to driver-control; locates the matching child DriverInstanceActor and stops+respawns it (Restart) or sends it a ForceReconnect internal message (Reconnect — re-enters Reconnecting state without full stop). DriverInstanceSpec constructor call uses named args to handle the full 6-parameter signature. - New DriverOperator authorization policy mapped to DriverOperator or FleetAdmin role; documented in docs/security.md. Map LDAP group via GroupToRole (e.g. "ot-driver-operator": "DriverOperator"). - DriverStatusPanel renders Reconnect + Restart buttons when the user holds the DriverOperator policy (hidden otherwise). Restart requires an in-page Razor confirm block (no JS confirm, keeps SignalR event loop unblocked). Both buttons show a spinner and are disabled during in-flight; result chip auto-clears after 8s. Username sourced from AuthenticationStateProvider. Reconnect resolves to "ForceReconnect" (re-enter Reconnecting, not full stop+respawn) — transport drops and retries while actor and in-memory state are preserved. All DriverInstanceActor states handle ForceReconnect safely (no-op when already in transition).	2026-05-28 11:14:04 -04:00
Joseph Doherty	4203b84d51	feat(runtime): publish DriverHealthChanged via DriverInstanceActor - IDriverHealthPublisher in Core.Abstractions + NullDriverHealthPublisher no-op for tests/dev-stub paths. - AkkaDriverHealthPublisher in Runtime forwards to the cluster-wide `driver-health` DPS topic. - DriverInstanceActor instrumented to publish snapshots on every observable state change + a periodic 30s heartbeat so the AdminUI snapshot store warms up for newly-joined SignalR clients. - Sliding 5-minute Faulted-count tracked per actor via Queue<DateTime>. - DriverHostActor.SpawnChild threads clusterId (_localNode.Value) and the health publisher down to every DriverInstanceActor child. - ServiceCollectionExtensions.AddOtOpcUaRuntime registers AkkaDriverHealthPublisher as IDriverHealthPublisher singleton.	2026-05-28 10:22:44 -04:00

126 Commits