Commit Graph

77 Commits

Author SHA1 Message Date
Joseph Doherty 0a747c343d feat(runtime): decode IsArray/ArrayLength byte-parity in DeploymentArtifact 2026-06-16 21:40:22 -04:00
Joseph Doherty 6a8020e7e7 feat(adminui): native-alarm HistorizeToAveva opt-out 2026-06-16 16:27:31 -04:00
Joseph Doherty 93d9160dae feat(alarms): DriverHostActor routes native-condition acks to the owning driver [H6d] 2026-06-15 14:46:00 -04:00
Joseph Doherty 4501f12669 feat(vtags): wire IHistoryWriter through DriverHostActor (Null default; durable sink infra-gated) (H5d, stillpending §1) 2026-06-15 10:38:49 -04:00
Joseph Doherty 9c5a091395 feat(vtags): decode VirtualTag Historize from artifact, byte-parity with composer (H5b, stillpending §1) 2026-06-15 10:17:08 -04:00
Joseph Doherty b4af9e7f37 docs(comments): correct 7 stale 'later task/milestone' comments (stillpending §9) 2026-06-15 09:47:08 -04:00
Joseph Doherty 013882262a fix(galaxy): bound alarm-subscription handles to one (no reconnect leak)
v2-ci / build (push) Failing after 44s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
GalaxyDriver's StreamAlarms feed is session-less and survives an in-place
reconnect, so DriverInstanceActor re-subscribed on every Connected re-entry
(after dropping its own cached handle without an Unsubscribe — sync teardown).
The re-subscribe was additive: _alarmSubscriptions.Add grew the list by one
untracked handle per reconnect cycle — a slow unbounded leak. Functionally
harmless (the gate is Count>0 and OnAlarmFeedTransition only reads [0], firing
once regardless), but it accumulated forever.

Fix: SubscribeAlarmsAsync clears the set before adding, collapsing to a single
live handle (under the existing _alarmHandlersLock, atomic w.r.t. the fan-out
reader). There is exactly one consumer per driver instance (factory-per-actor
lifecycle), so replacing the set with the latest handle is faithful. Chosen
over making the actor's sync DetachAlarmSource call UnsubscribeAlarmsAsync
async/fire-and-forget — disproportionate for a minor leak.

Regression test Re_subscribe_collapses_to_a_single_handle_no_accumulation
(TDD-verified: FAILS without the Clear — releasing the latest handle leaves
the feed open because stale handles remain; PASSES with the fix). Galaxy tests
263 pass / 3 skip; Runtime native-alarm 24 pass. Code-reviewed (approved).
2026-06-15 05:49:07 -04:00
Joseph Doherty c9643f68ba fix(runtime): restart driver no longer throws 'actor name is not unique'
v2-ci / build (push) Failing after 42s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
HandleRestartDriver stopped + respawned the child within one synchronous
message handler, reusing the base actor name drv-<id>. Context.Stop is async
(the child processes its own stop on its own mailbox), so the old child was
ALWAYS still registered when the respawn ran — Context.ActorOf threw
InvalidActorNameException deterministically on every AdminUI Restart press,
crashing + restarting the host.

Fix: a monotonic _childSpawnGeneration counter (single-threaded actor) feeds a
-g<gen> suffix on every spawned child name, so a respawn can never collide with
the still-terminating predecessor. Children are tracked by the _children dict
(by IActorRef), never by actor path, so the suffix is invisible to callers.
This also closes the same-shaped latent race in the reconcile path (a removed-
then-readded instance, and a driver-type-change ToStop+ToSpawn in one plan).

Regression test RestartDriver_respawns_the_child_without_an_actor_name_collision
(verified: FAILS on the old code with the exact InvalidActorNameException,
PASSES with the fix). Runtime.Tests 238/238 green. Code-reviewed (approved).
2026-06-15 05:41:18 -04:00
Joseph Doherty a833d1b4aa fix(alarms): address code review — accurate reconnect comment + SubscribeAlarms drop handlers
- Correct the misleading DetachAlarmSource comment: a session-less feed (Galaxy) is NOT
  torn down on an in-place reconnect, so re-subscribe is additive (harmless; gate reads [0]).
- Add trace-only SubscribeAlarms drop handlers in Connecting/Reconnecting (symmetry with
  NativeAlarmRaised) so a self-tell overtaken by a queued disconnect doesn't dead-letter.
- Document the deliberate no-unsubscribe-on-empty asymmetry vs the value path.
Behavior-neutral for the un-gate path. Minor handle-accumulation leak tracked as follow-up.
2026-06-15 00:49:19 -04:00
Joseph Doherty 7f313df7a6 fix(alarms): subscribe native alarms to un-gate the IAlarmSource feed
Phase B native alarms never fired end-to-end: GalaxyDriver suppresses OnAlarmEvent until
an alarm subscription exists (_alarmSubscriptions.Count > 0), but the runtime only attached
the OnAlarmEvent handler and never called SubscribeAlarmsAsync — so the central feed stayed
gated and no transition reached the Part 9 condition / /alerts. Unit tests passed because
they inject through the IAlarmSource seam directly; the deferred live /run surfaced it.

DriverHostActor computes per-driver alarm refs (alarm-bearing tags' FullNames) and hands them
via SetDesiredSubscriptions; DriverInstanceActor calls SubscribeAlarmsAsync for IAlarmSource
drivers on Connected entry and whenever alarm refs are pushed while Connected (the deploy path),
idempotent via a cached handle reset on detach so reconnect re-subscribes.
2026-06-15 00:42:43 -04:00
Joseph Doherty 50e1141fc2 test(historian): assert Deferred sink forwards historianTagname + doc nits
I1: DeferredAddressSpaceSinkTests.RecordingSink now captures HistorianTagname
per EnsureVariable call (HistorianQueue/HistorianCalls, matching the
Phase7ApplierTests pattern); new test EnsureVariable_forwards_historianTagname_to_inner_sink
asserts the arg is forwarded unchanged through DeferredAddressSpaceSink.

M1: OtOpcUaNodeManager.EnsureVariable doc-comment notes that a changed
historize intent on an already-registered node is silently ignored until
a RebuildAddressSpace (rebuild precondition for Task 3 implementers).

N2: DeploymentArtifact.ExtractTagHistorize doc wording: "The live-edit
side" → "The live-edit composer side".
2026-06-14 19:16:23 -04:00
Joseph Doherty c35c1d3734 refactor(historian): single-parse ExtractTagHistorize + review-nit tests/docs
Stop parsing TagConfig twice per tag on the deploy hot path: Phase7Composer's
equipment-tag Select lambda is now block-bodied (captures isHistorized/historianTagname
once), and DeploymentArtifact.BuildEquipmentTagPlans captures locals before result.Add.
Add wrong-type-historianTagname InlineData to ExtractTagHistorizeTests. Extend the
parity round-trip fixture with a 4th tag (isHistorized:false + JSON-null tagname)
exercising the artifact-side private guard path. Align DeploymentArtifact's
ExtractTagHistorize doc-comment with the composer-side phrasing (ExtractTagFullName /
ExtractTagAlarm cross-reference).
2026-06-14 19:02:02 -04:00
Joseph Doherty 440929c82a feat(historian): carry isHistorized + historianTagname through EquipmentTagPlan (byte-parity) 2026-06-14 18:55:04 -04:00
Joseph Doherty 7ffe8939e7 style(drivers): clarify Reconnecting InitializeFailed no-op is generation-agnostic (code-review nit) 2026-06-14 17:19:34 -04:00
Joseph Doherty 751786ec8c fix(drivers): adopt corrected config via ApplyDelta while (re)connecting (#7)
A DriverInstanceActor stuck Reconnecting/Connecting now adopts a config delivered via ApplyDelta and
re-initialises with it, instead of dead-lettering and retrying the stale config forever. A monotonic
init generation supersedes the in-flight init so the corrected config always wins.
2026-06-14 17:15:28 -04:00
Joseph Doherty f9be38430c fix(alarms): route native alarms by ConditionId (dotted FullName), not bare SourceNodeId (integration review)
v2-ci / build (push) Failing after 46s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
2026-06-14 04:09:01 -04:00
Joseph Doherty 7e86fa7099 fix(alarms): normalise native TransitionKind to canonical EmissionKind vocabulary (review) 2026-06-14 03:58:46 -04:00
Joseph Doherty 8736fcc37c feat(alarms): Primary-gated AlarmTransitionEvent fan-out for native alarms (Phase B WS-5) 2026-06-14 03:48:41 -04:00
Joseph Doherty 4c56a1719b feat(alarms): DriverHostActor routes native alarm transitions to Part 9 conditions (Phase B WS-4c) 2026-06-14 03:34:25 -04:00
Joseph Doherty 422e5b7db2 refactor(alarms): harden ExtractTagAlarm severity parse (TryGetInt32) + trim projector prior-state (review nits) 2026-06-14 03:27:03 -04:00
Joseph Doherty 25c3bd16ba feat(alarms): DriverInstanceActor forwards native OnAlarmEvent to parent (Phase B WS-4b) 2026-06-14 03:24:24 -04:00
Joseph Doherty c1aeafaaf3 feat(alarms): NativeAlarmProjector maps transitions to condition snapshots (Phase B WS-4a) 2026-06-14 03:16:44 -04:00
Joseph Doherty e1ccd99ea2 feat(alarms): EquipmentTagPlan.Alarm parsed byte-parity from TagConfig (Phase B WS-2) 2026-06-14 03:12:48 -04:00
Joseph Doherty 590e497872 fix(runtime): narrow ActorNodeWriteGateway catch + drop vacuous no-actor assertion 2026-06-14 01:32:34 -04:00
Joseph Doherty 526ddb6a57 feat(runtime): ActorNodeWriteGateway — Asks RouteNodeWrite, returns NodeWriteOutcome 2026-06-14 01:23:43 -04:00
Joseph Doherty 7e405e949b fix(runtime): swallow self SubscriptionFailed too (symmetric to SubscriptionEstablished) 2026-06-14 00:42:31 -04:00
Joseph Doherty 42b4a923fd fix(runtime): fast-fail writes in degraded driver states + swallow self SubscriptionEstablished 2026-06-14 00:34:37 -04:00
Joseph Doherty 4cda275b8d fix(runtime): fast-fail RouteNodeWrite while Stale + micro-opts + raw-blob routing test 2026-06-14 00:16:47 -04:00
Joseph Doherty a23fb2b82e feat(server): equipment-tag node writability from Tag.AccessLevel (parity-safe, no migration) 2026-06-13 11:46:00 -04:00
Joseph Doherty f8f1027287 feat(runtime): NodeId->driver reverse routing + primary-gated RouteNodeWrite 2026-06-13 11:44:26 -04:00
Joseph Doherty c4435e4fd6 feat(runtime): route driver values to folder-scoped equipment NodeIds (live-value delivery)
v2-ci / build (push) Failing after 44s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
2026-06-13 06:32:38 -04:00
Joseph Doherty da1accceff feat(runtime): carry DriverInstanceId on AttributeValuePublished (live-value routing key) 2026-06-13 06:27:52 -04:00
Joseph Doherty 7d25480fee docs(galaxy): neutralize remaining stale SystemPlatform/alias terminology in comments + a test name
Replace "SystemPlatform mirror tag", "Galaxy alias", and "SystemPlatform-kind" in doc-comments and
test names with neutral accurate wording ("FolderPath-scoped tag", "EquipmentId == null", etc.).
No code, logic, or test bodies changed — comments and one test method name only.
2026-06-12 22:30:50 -04:00
Joseph Doherty 5edea52bd7 docs(galaxy): fix stale SystemPlatform/alias/Galaxy doc comments (review follow-up)
Resolves the code-review notes on 95be607a + the AdminUI bundle: the
EnsureVariable docs (IOpcUaAddressSpaceSink, OtOpcUaNodeManager) and the Tag
entity doc no longer say 'Galaxy / SystemPlatform / alias'; the DriverHostActor
ForwardToMux comment now states the real equipment-tag value-routing gap (the
FullName→NodeId 'live values' milestone) instead of claiming Galaxy values map
straight through.
2026-06-12 22:00:52 -04:00
Joseph Doherty 95be607a07 feat(opcua): remove SystemPlatform-mirror GalaxyTags contract end-to-end (composer+applier+artifact, byte-parity) 2026-06-12 21:45:19 -04:00
Joseph Doherty bc9e83ed9f feat(composer): admit GalaxyMxGateway-backed equipment alias tags (+byte-parity) 2026-06-11 21:10:21 -04:00
Joseph Doherty 06c415598c feat(redundancy): gate scripted-alarm alerts publish on Primary (A1) 2026-06-11 08:44:44 -04:00
Joseph Doherty 5256761368 feat(scripted-alarms): spawn + apply ScriptedAlarmHostActor in DriverHostActor (T10) 2026-06-10 15:17:29 -04:00
Joseph Doherty c9590c03d0 fix(scripted-alarms): harden artifact boolean decode + direct helper tests (T6 review)
Default HistorizeToAveva/Retain/Enabled to the entity defaults (true) when a
field is absent/null/non-boolean so a partial blob decodes identically to the
composer's view of a default-constructed ScriptedAlarm (byte-parity), and only
call GetBoolean for a genuine true/false token. Add direct ExtractAlarmDependencyRefs
unit tests (overlap dedup + reserved {{equip}} exclusion).
2026-06-10 14:47:24 -04:00
Joseph Doherty 8e8ca9efe8 feat(scripted-alarms): DeploymentArtifact byte-parity for the alarm plan (T6) 2026-06-10 14:41:46 -04:00
Joseph Doherty 66ea9c56f6 feat(runtime): DeploymentArtifact substitutes {{equip}} (parity with composer) 2026-06-10 07:53:20 -04:00
Joseph Doherty d909a8e4f6 docs+test(deploy): clarify driver-less attribution docs + no-line exclusion test (Task 2 review) 2026-06-08 07:02:25 -04:00
Joseph Doherty c688899134 fix(deploy): cluster-attribute driver-less equipment via its UNS line area (BuildClusterSets) 2026-06-08 06:53:41 -04:00
Joseph Doherty 6b36eff2d3 refactor(runtime): capture-first in HandleWriteAsync; assert no handler leak on resubscribe; fix stale comment 2026-06-07 10:31:20 -04:00
Joseph Doherty 98259ab026 fix(runtime): capture Sender before await in DriverInstanceActor subscribe (no-ActorContext race) 2026-06-07 10:26:17 -04:00
Joseph Doherty 4c221ce2b3 merge: equipment-namespace live values (VirtualTag route)
v2-ci / build (push) Failing after 36s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
2026-06-07 09:33:21 -04:00
Joseph Doherty 1c579410cd fix(runtime): flag cross-cluster orphan-equipment bindings on rebuild
v2-ci / build (push) Failing after 42s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
ParseComposition(blob, nodeId, onInconsistency?) detects a kept equipment whose
UNS line belongs to another cluster (a same-cluster-invariant violation that
would orphan the equipment folder) and reports it via an optional callback,
wired to OpcUaPublishActor's logger. Detection-only; the upstream draft
validator remains the authority. Adds two unit tests.
2026-06-07 08:24:11 -04:00
Joseph Doherty 2bfe18abcf chore(runtime): warn on missing VirtualTag evaluator; document Stale-recovery VirtualTag behaviour
Log a WARNING on startup when IVirtualTagEvaluator is not registered so a DI misconfig on a
driver-role node is visible in logs instead of silently evaluating all VirtualTags to NoChange.
Add a comment in PushDesiredSubscriptions noting that TryRecoverFromStale does not call this
method, so VirtualTags remain empty after a Stale recovery until the next deployment dispatch
(intentional, consistent with driver recovery).
2026-06-07 05:46:24 -04:00
Joseph Doherty 397f9b783a feat(runtime): spawn+apply VirtualTagHostActor on deploy apply and restore 2026-06-07 05:41:04 -04:00
Joseph Doherty c7661d0510 feat(opcua): parse Equipment VirtualTag plans from the deployment artifact 2026-06-07 05:09:53 -04:00