fix(galaxy): bound alarm-subscription handles to one (no reconnect leak)

GalaxyDriver's StreamAlarms feed is session-less and survives an in-place
reconnect, so DriverInstanceActor re-subscribed on every Connected re-entry
(after dropping its own cached handle without an Unsubscribe, because its
teardown is synchronous). The re-subscribe was additive:
_alarmSubscriptions.Add grew the list by one untracked handle per reconnect
cycle, a slow unbounded leak. Functionally harmless (the gate is Count>0 and
OnAlarmFeedTransition only reads [0], firing once regardless), but the
handles accumulated forever.

Fix: SubscribeAlarmsAsync clears the set before adding, collapsing to a single
live handle (under the existing _alarmHandlersLock, atomic w.r.t. the fan-out
reader). There is exactly one consumer per driver instance (factory-per-actor
lifecycle), so replacing the set with the latest handle is faithful. This was
chosen over making the actor's sync DetachAlarmSource call
UnsubscribeAlarmsAsync async/fire-and-forget, which would be disproportionate
for a minor leak.
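
A minimal sketch of the clear-before-add pattern described above, not the
actual GalaxyDriver code. The class AlarmSubscriptionSet, its Guid handles,
and the FeedOpen/Count members are hypothetical stand-ins; only the field
names _alarmSubscriptions and _alarmHandlersLock are taken from this message.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the driver's alarm-subscription bookkeeping.
public sealed class AlarmSubscriptionSet
{
    private readonly object _alarmHandlersLock = new object();
    private readonly List<Guid> _alarmSubscriptions = new List<Guid>();

    // Re-subscribing replaces the set instead of appending to it, so at most
    // one live handle exists no matter how many reconnects occur.
    public Guid Subscribe()
    {
        var handle = Guid.NewGuid();
        lock (_alarmHandlersLock)
        {
            _alarmSubscriptions.Clear(); // drop handles the caller no longer tracks
            _alarmSubscriptions.Add(handle);
        }
        return handle;
    }

    // Releasing a handle removes it; unknown (stale) handles are ignored.
    public void Unsubscribe(Guid handle)
    {
        lock (_alarmHandlersLock)
        {
            _alarmSubscriptions.Remove(handle);
        }
    }

    // Assumed gate: the feed stays open while any handle is registered.
    public bool FeedOpen
    {
        get { lock (_alarmHandlersLock) { return _alarmSubscriptions.Count > 0; } }
    }

    public int Count
    {
        get { lock (_alarmHandlersLock) { return _alarmSubscriptions.Count; } }
    }
}

public static class Program
{
    public static void Main()
    {
        var set = new AlarmSubscriptionSet();
        var latest = Guid.Empty;

        // Simulate five reconnect cycles, each re-subscribing without an
        // Unsubscribe of the previous handle.
        for (var i = 0; i < 5; i++)
        {
            latest = set.Subscribe();
        }

        Console.WriteLine(set.Count);    // prints 1
        set.Unsubscribe(latest);
        Console.WriteLine(set.FeedOpen); // prints False
    }
}
```

Without the Clear call, the same loop would leave five handles registered and
releasing only the latest would keep the feed open, which is the condition the
regression test checks for.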

Regression test Re_subscribe_collapses_to_a_single_handle_no_accumulation
added (TDD-verified: FAILS without the Clear, because releasing the latest
handle leaves the feed open while stale handles remain; PASSES with the fix).
Galaxy tests: 263 pass / 3 skip; Runtime native-alarm: 24 pass. Code-reviewed
(approved).
Joseph Doherty
2026-06-15 05:49:07 -04:00
parent c9643f68ba
commit 013882262a
3 changed files with 56 additions and 3 deletions
@@ -571,9 +571,9 @@ public sealed class DriverInstanceActor : ReceiveActor, IWithTimers
 // reconnects. NOTE: this does NOT tear down the driver-side subscription. For a session-bound
 // IAlarmSource the old subscription dies with the session (no accumulation). For a session-less feed
 // (GalaxyDriver's always-on central monitor) it survives an in-place reconnect, so the re-subscribe
-// is additive — harmless because the gate only checks Count > 0 and the feed fans out once
-// regardless of handle count, but it does slowly accumulate handles across many reconnects (a minor
-// leak tracked as a follow-up; the correct cleanup is a driver-side reset on re-init).
+// is additive — but the driver now collapses to a single live handle on each SubscribeAlarmsAsync
+// (GalaxyDriver.SubscribeAlarmsAsync clears the set before adding), so handles no longer accumulate
+// across reconnects. The gate (Count > 0) and the one-shot fan-out are unchanged.
 _alarmSubscriptionHandle = null;
 }