lmxopcua

Author	SHA1	Message	Date
Joseph Doherty	87dd65b97a	test(alarms): native ack wrong-role deny + tidy NativeAlarmAck doc (code-review)	2026-06-15 14:39:26 -04:00
Joseph Doherty	a6d9de091b	feat(alarms): native condition Acknowledge routes to NativeAlarmAckRouter with principal [H6c]	2026-06-15 14:33:58 -04:00
Joseph Doherty	be6858baa1	fix(alarms): OnEnableDisable native-check via lock-guarded IsNativeAlarmNode + unstale AlarmCommand doc (code-review)	2026-06-15 14:30:17 -04:00
Joseph Doherty	328bd1b9ee	feat(alarms): wire OnEnableDisable over OPC UA (AlarmAck-gated; native→BadNotSupported) [H4]	2026-06-15 14:24:19 -04:00
Joseph Doherty	226587d817	test(alarms): cover isNative rebuild/kind-flip lifecycle + Phase7Applier call-site (code-review)	2026-06-15 14:20:20 -04:00
Joseph Doherty	2423edf232	test(alarms): assert Galaxy ack null-OperatorUser falls back to empty (code-review)	2026-06-15 14:18:57 -04:00
Joseph Doherty	418663b359	feat(alarms): thread isNative through MaterialiseAlarmCondition; node manager tracks native conditions [H6a]	2026-06-15 14:13:30 -04:00
Joseph Doherty	ed941c51da	feat(alarms): AlarmAcknowledgeRequest carries OperatorUser; Galaxy/ScriptedAlarmSource honor it [H6b]	2026-06-15 14:11:40 -04:00
Joseph Doherty	c236263e8d	fix(authz): give HistoryUpdate its own NodePermissions bit (was aliased to HistoryRead) [H2]	2026-06-15 14:09:35 -04:00
Joseph Doherty	6ab3d8630b	docs(alarms): Phase 3 implementation plan + tasks (H4 + H2-bit + H6)	2026-06-15 14:05:00 -04:00
Joseph Doherty	40b883effe	docs(alarms): Phase 3 design — OPC UA standards completeness (H4 Enable/Disable + H2 HistoryUpdate bit + H6 native-ack→AVEVA)	2026-06-15 13:59:28 -04:00
Joseph Doherty	4af8e65af1	fix(redundancy): PeerProbeSupervisor explicitly ignores co-mingled OpcUaProbeResult (integration review)	2026-06-15 13:40:16 -04:00
Joseph Doherty	4c78dcd358	feat(redundancy): wire dbHealth into OpcUaPublishActor + spawn PeerProbeSupervisor per node	2026-06-15 13:33:34 -04:00
Joseph Doherty	393b746d9b	docs(redundancy): sync component table with the wired calculator + PeerProbeSupervisor	2026-06-15 13:30:55 -04:00
Joseph Doherty	5a064e086d	test(redundancy): lock in stale-Terminated guard + clarify OnTerminated (code-review)	2026-06-15 13:29:58 -04:00
Joseph Doherty	70e6d3d2c0	docs(redundancy): ServiceLevelCalculator is wired into the live publish path	2026-06-15 13:26:34 -04:00
Joseph Doherty	f41e957e07	feat(redundancy): PeerProbeSupervisor maintains one peer OPC UA probe per driver peer	2026-06-15 13:22:38 -04:00
Joseph Doherty	37b32a5623	feat(redundancy): periodic HealthTick refreshes DB reachability via Ask/PipeTo	2026-06-15 13:15:26 -04:00
Joseph Doherty	5382eea9b5	test(redundancy): cover stale-probe-not-demoted branch + make _probeFreshnessWindow readonly (code-review)	2026-06-15 13:11:01 -04:00
Joseph Doherty	cf278035d2	feat(redundancy): OpcUaProbeOk from peer-probes-me with freshness debounce	2026-06-15 13:04:41 -04:00
Joseph Doherty	a9ff1a64b2	fix(redundancy): always publish first ServiceLevel (even 0) + log SafeSelfStatus failures (code-review)	2026-06-15 13:00:25 -04:00
Joseph Doherty	3e609a2b19	feat(redundancy): OpcUaPublishActor computes ServiceLevel via calculator (DB+stale+leader; legacy seam)	2026-06-15 12:51:32 -04:00
Joseph Doherty	ff0f62db38	refactor(redundancy): move ServiceLevelCalculator to Core.Cluster (shared, Runtime-reachable)	2026-06-15 12:45:17 -04:00
Joseph Doherty	7605f4d8fd	docs(redundancy): Phase 2 implementation plan + tasks (H3 ServiceLevel wiring)	2026-06-15 12:41:51 -04:00
Joseph Doherty	0528353315	docs(redundancy): Phase 2 design — health-aware ServiceLevel (H3)	2026-06-15 12:33:09 -04:00
Joseph Doherty	4bd7180e7f	fix(docker-dev): stop seeding retired SystemPlatform namespace. The seed re-inserted a Namespace with Kind='SystemPlatform' (+ a GalaxyMxGateway driver + 3 mirror tags), but that NamespaceKind member was removed when Galaxy became Equipment-kind (migration CleanupSystemPlatformNamespaces). cluster-seed runs after the migrator, so a fresh down -v/up re-introduced a Kind the current code can't EF-materialize, 500ing /deployments and failing every publish (ConfigComposer reads db.Namespaces). Remove the obsolete inserts; author an Equipment-kind Galaxy driver via the UI if a fixture is needed.	2026-06-15 12:17:02 -04:00
Joseph Doherty	907005d2d2	docs(claude): note local docker-dev rig has login disabled; run live /run verification directly, don't wait for sign-in	2026-06-15 11:50:55 -04:00
Joseph Doherty	c6a543d1b6	docs(vtags): note rename-respawn transient + write-side-only historize (integration review)	2026-06-15 10:50:08 -04:00
Joseph Doherty	aaa5d8b851	docs(vtags): document runtime Historize honoring + infra-gated durable sink (Phase 1 H5)	2026-06-15 10:43:29 -04:00
Joseph Doherty	4501f12669	feat(vtags): wire IHistoryWriter through DriverHostActor (Null default; durable sink infra-gated) (H5d, stillpending §1)	2026-06-15 10:38:49 -04:00
Joseph Doherty	2f30c54dc1	test(vtags): thread-safe CapturingHistoryWriter + drop redundant wait (H5c review follow-up)	2026-06-15 10:33:14 -04:00
Joseph Doherty	0c6d4c5491	feat(vtags): forward historized vtag results to IHistoryWriter (H5c, stillpending §1)	2026-06-15 10:26:25 -04:00
Joseph Doherty	83d3b9f7be	test(vtags): planner detects Historize-only toggle as a change + doc nit (H5a review follow-up)	2026-06-15 10:21:31 -04:00
Joseph Doherty	9c5a091395	feat(vtags): decode VirtualTag Historize from artifact, byte-parity with composer (H5b, stillpending §1)	2026-06-15 10:17:08 -04:00
Joseph Doherty	fc8121cbf3	feat(vtags): carry VirtualTag.Historize onto EquipmentVirtualTagPlan (H5a, stillpending §1)	2026-06-15 10:17:05 -04:00
Joseph Doherty	ebf2f1dd7a	fix(vtags): prune _planByVtag on child termination + crash-then-change test (H1b review follow-up)	2026-06-15 10:12:11 -04:00
Joseph Doherty	ada01e1af8	fix(vtags): respawn equipment virtualtag child on in-place plan change (H1b, stillpending §1)	2026-06-15 10:05:29 -04:00
Joseph Doherty	1dc713693a	fix(deploy): count removed equipment tags/vtags in RemovedNodes (H1a review follow-up)	2026-06-15 10:01:37 -04:00
Joseph Doherty	1e95856b00	fix(deploy): rebuild address space on changed-only deploys (H1a, stillpending §1)	2026-06-15 09:57:40 -04:00
Joseph Doherty	50a2fdf32d	chore(plans): mark confirmed-shipped .tasks.json completed so audits don't re-flag (stillpending §7)	2026-06-15 09:52:51 -04:00
Joseph Doherty	a9d267c91a	docs(security,core): correct stale write-outcome doc + note benign DraftSnapshot/LeaderChanged residue (stillpending §9/§3)	2026-06-15 09:48:14 -04:00
Joseph Doherty	b4af9e7f37	docs(comments): correct 7 stale 'later task/milestone' comments (stillpending §9)	2026-06-15 09:47:08 -04:00
Joseph Doherty	68a0f759f0	docs(plans): Phase 0+1 implementation plan for the still-pending backlog. 12 tasks (0 branch; 1-3 Phase 0 hygiene; 4-5 H1 changed-only-deploy fix; 6-9 H5 vtag Historize threading + IHistoryWriter seam; 10 docs; 11 verify). Conservative rebuild-on-change; no EF migration (Historize column + artifact already carry it); durable AVEVA sink flagged infra-gated.	2026-06-15 09:40:03 -04:00
Joseph Doherty	f64be52796	docs(plans): phased completion design for the still-pending backlog. Roadmap for closing stillpending.md §1-§5 + §7/§9 cleanup in 9 phases (0 hygiene -> 1 silent-deploy bugs H1/H5 -> 2 ServiceLevel H3 -> 3 OPC UA standards H4/H2-bit/H6 -> 4 driver coverage -> 5 probes -> 6 AdminUI -> 7 Client.UI -> 8 per-cluster scoping). Conservative rebuild-on-change for H1; plan-and-execute phase-by-phase; no EF migration; defer-list flagged (Denied/Simulated/Language/InlayHints/HistoryUpdate-service/Galaxy-gateway-write).	2026-06-15 09:27:06 -04:00
Joseph Doherty	151b7165af	docs(abcip,focas): document RetireAsync one-tick overlap residual + guard Dispose. Code-review follow-ups on the poll-loop collapse: (1) RetireAsync is fire-and-forget and does NOT guarantee zero overlap: the retired loop runs until its in-flight read+tick finish and it observes cancellation, so a device transition landing in that one-tick window can fire once on both loops (at most ONE duplicate raise/clear per reconnect, transient + self-correcting; upstream Part 9 conditions dedupe on ConditionId). Documented in both RetireAsync XML docs so it isn't mistaken for a zero-overlap guarantee. (2) wrap Cts.Dispose() so the fire-and-forget task has no theoretical unobserved-exception path.	2026-06-15 06:14:44 -04:00
Joseph Doherty	6ba59f9d4d	fix(abcip,focas): collapse alarm projection to a single poll loop (no reconnect leak). The owning DriverInstanceActor re-subscribes alarms on every Connected entry (DetachAlarmSource nulls its cached handle on Connected->Reconnecting without calling UnsubscribeAlarmsAsync), and the driver object + its alarm projection are reused across every in-place reconnect. Each SubscribeAsync started a fresh, never-cancelled Task.Run poll loop and added it to _subs, so N reconnects leaked N concurrent loops all polling the device and all firing the same raise/clear transitions => duplicate alarm events + CPU/mem growth. Mirrors the Galaxy #399 fix (Clear-before-Add) but for live poll loops the collapse must also CANCEL the superseded loops, not just drop references. SubscribeAsync now snapshots existing subs under _subsLock, clears _subs, adds the new sub, starts its loop, then retires each stale sub out-of-band (RetireAsync: Cancel + await loop + Dispose CTS, fire-and-forget so the new subscription's return isn't blocked on a poll interval). Snapshot+clear under the same lock DisposeAsync uses guarantees no double-own / double-dispose. There is exactly one consumer per driver instance (factory-per-actor), so retiring all prior subscriptions before starting the new one is faithful. Regression tests (TDD, fail->pass): subscribe twice then drive one device raise; assert OnAlarmEvent fires exactly once (was twice with two leaked loops).	2026-06-15 06:09:38 -04:00
Joseph Doherty	43b3769a1d	docs(plans): add write-outcome self-correction implementation plan. The plan + task list for the write-outcome self-correction work (B1, already shipped via master `1d797c1c`). Its design-doc counterpart is already committed; this adds the matching plan artifacts, consistent with the other docs/plans/.	2026-06-15 05:57:15 -04:00
Joseph Doherty	5a70cd7910	chore(deps): bump vendored MxGateway.Client/Contracts 0.1.0 -> 0.1.1. User-published 0.1.1 of the MxGateway client + contracts packages into the local-mxgw vendored source (nuget-packages/). Bumps Directory.Packages.props to match and adds the 0.1.1 .nupkg artifacts alongside the existing 0.1.0 ones. Full solution builds clean against 0.1.1.	2026-06-15 05:57:09 -04:00
Joseph Doherty	013882262a	fix(galaxy): bound alarm-subscription handles to one (no reconnect leak). GalaxyDriver's StreamAlarms feed is session-less and survives an in-place reconnect, so DriverInstanceActor re-subscribed on every Connected re-entry (after dropping its own cached handle without an Unsubscribe; sync teardown). The re-subscribe was additive: _alarmSubscriptions.Add grew the list by one untracked handle per reconnect cycle, a slow unbounded leak. Functionally harmless (the gate is Count>0 and OnAlarmFeedTransition only reads [0], firing once regardless), but it accumulated forever. Fix: SubscribeAlarmsAsync clears the set before adding, collapsing to a single live handle (under the existing _alarmHandlersLock, atomic w.r.t. the fan-out reader). There is exactly one consumer per driver instance (factory-per-actor lifecycle), so replacing the set with the latest handle is faithful. Chosen over making the actor's sync DetachAlarmSource call UnsubscribeAlarmsAsync async/fire-and-forget, which was judged disproportionate for a minor leak. Regression test Re_subscribe_collapses_to_a_single_handle_no_accumulation (TDD-verified: FAILS without the Clear, since releasing the latest handle leaves the feed open because stale handles remain; PASSES with the fix). Galaxy tests 263 pass / 3 skip; Runtime native-alarm 24 pass. Code-reviewed (approved).	2026-06-15 05:49:07 -04:00
Joseph Doherty	c9643f68ba	fix(runtime): restart driver no longer throws 'actor name is not unique'. HandleRestartDriver stopped + respawned the child within one synchronous message handler, reusing the base actor name drv-<id>. Context.Stop is async (the child processes its own stop on its own mailbox), so the old child was ALWAYS still registered when the respawn ran, and Context.ActorOf threw InvalidActorNameException deterministically on every AdminUI Restart press, crashing + restarting the host. Fix: a monotonic _childSpawnGeneration counter (single-threaded actor) feeds a -g<gen> suffix on every spawned child name, so a respawn can never collide with the still-terminating predecessor. Children are tracked by the _children dict (by IActorRef), never by actor path, so the suffix is invisible to callers. This also closes the same-shaped latent race in the reconcile path (a removed-then-readded instance, and a driver-type-change ToStop+ToSpawn in one plan). Regression test RestartDriver_respawns_the_child_without_an_actor_name_collision (verified: FAILS on the old code with the exact InvalidActorNameException, PASSES with the fix). Runtime.Tests 238/238 green. Code-reviewed (approved).	2026-06-15 05:41:18 -04:00
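
The generation-suffix fix described in c9643f68ba can be illustrated with a small, language-neutral sketch. This is not the project's C# code: `ChildRegistry`, `spawn`, `stop`, `confirm_terminated`, and `restart` are hypothetical names, and the registry only imitates the one property that matters here (a stopping child keeps its name registered until termination is confirmed).

```python
class ChildRegistry:
    """Toy stand-in for an actor parent that rejects duplicate child names."""

    def __init__(self):
        self._live_names = set()    # names still registered, including stopping children
        self._spawn_generation = 0  # monotonic counter, touched only by the single owner

    def spawn(self, base_name):
        # Every spawn gets a fresh -g<gen> suffix, so a name is never reused.
        self._spawn_generation += 1
        name = f"{base_name}-g{self._spawn_generation}"
        if name in self._live_names:
            raise ValueError(f"actor name is not unique: {name}")
        self._live_names.add(name)
        return name

    def stop(self, name):
        # Stop is asynchronous in this model: the name stays registered
        # until confirm_terminated is called later.
        pass

    def confirm_terminated(self, name):
        self._live_names.discard(name)


def restart(registry, old_name, base_name):
    """Stop and respawn inside one synchronous handler."""
    registry.stop(old_name)           # old child is still registered at this point
    return registry.spawn(base_name)  # new generation, so no collision
```

Calling `restart(reg, reg.spawn("drv-42"), "drv-42")` succeeds even though the first child has not yet terminated, because the two names differ only in their generation suffix. Without the suffix, the second spawn would hit the duplicate-name check.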


1721 Commits
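
The poll-loop collapse described in 6ba59f9d4d (snapshot the old subscriptions, clear, add the new one, then retire the stale loops out-of-band) can be sketched roughly as follows. This is an asyncio illustration under stated assumptions, not the driver's C# implementation: `AlarmProjection`, `subscribe`, `_retire`, `close`, and `demo` are hypothetical names, and because the swap contains no `await` it is atomic on the event loop, so the sketch omits the lock the commit message mentions.

```python
import asyncio


class AlarmProjection:
    """Hypothetical poller that keeps at most one live poll loop."""

    def __init__(self, read_device, on_event, interval=0.01):
        self._read_device = read_device  # callable: True while the alarm is active
        self._on_event = on_event        # called with "raise" or "clear" on a transition
        self._interval = interval
        self._subs = []                  # poll-loop tasks currently owned
        self._retiring = set()           # strong refs to fire-and-forget retire tasks

    async def _poll(self):
        last = False
        while True:
            active = self._read_device()
            if active != last:
                self._on_event("raise" if active else "clear")
                last = active
            await asyncio.sleep(self._interval)

    async def _retire(self, task):
        # Cancel the superseded loop and wait until it has actually stopped.
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass

    def subscribe(self):
        # Snapshot + clear + add with no await in between.
        stale, self._subs = self._subs, []
        task = asyncio.create_task(self._poll())
        self._subs.append(task)
        for old in stale:
            # Out-of-band retirement: the caller is not blocked on a poll interval.
            retire = asyncio.create_task(self._retire(old))
            self._retiring.add(retire)
            retire.add_done_callback(self._retiring.discard)
        return task

    async def close(self):
        stale, self._subs = self._subs, []
        for task in stale:
            await self._retire(task)


async def demo():
    state = {"active": False}
    events = []
    proj = AlarmProjection(lambda: state["active"], events.append)
    proj.subscribe()
    proj.subscribe()            # simulated reconnect: the first loop gets retired
    await asyncio.sleep(0.03)   # give the retirement time to complete
    state["active"] = True      # one device raise
    await asyncio.sleep(0.05)
    await proj.close()
    return events
```

Running `asyncio.run(demo())` mirrors the regression test in the commit message: subscribe twice, drive one device raise, and check how many events were recorded. If `subscribe` only appended the new task without retiring the old ones, both loops would observe the transition and the event would be recorded twice.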