fix(driver-galaxy): resolve Medium code-review finding (Driver.Galaxy-006)

HashSet<T>.First() enumeration order is unspecified and unstable across mutations, so
the "owner" handle attached to alarm events was non-deterministic when multiple alarm
subscriptions were active. Change _alarmSubscriptions from HashSet to List (preserving
insertion order) and pick [0] — the earliest-registered handle — as the deterministic
owner. The server routes transitions by SourceNodeId, not by handle, so the choice of
handle does not affect delivery to active subscribers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-22 09:45:55 -04:00
parent 910a538b19
commit d14564839e
2 changed files with 14 additions and 8 deletions

View File

@@ -108,13 +108,13 @@
| Severity | Medium | | Severity | Medium |
| Category | Concurrency & thread safety | | Category | Concurrency & thread safety |
| Location | `GalaxyDriver.cs:848-861` | | Location | `GalaxyDriver.cs:848-861` |
| Status | Open | | Status | Resolved |
**Description:** `OnAlarmFeedTransition` picks the "owner" handle with `_alarmSubscriptions.First()` under `_alarmHandlersLock`. `HashSet<T>.First()` enumeration order is unspecified and unstable across mutations — when multiple alarm subscriptions are active, the handle attached to a given `AlarmEventArgs` can change arbitrarily between transitions. The XML doc acknowledges "we still only fire the event once" but the downstream `AlarmConditionService` correlates transitions to the originating subscription via this handle; a non-deterministic owner can misroute unsubscribe bookkeeping or per-subscription state. **Description:** `OnAlarmFeedTransition` picks the "owner" handle with `_alarmSubscriptions.First()` under `_alarmHandlersLock`. `HashSet<T>.First()` enumeration order is unspecified and unstable across mutations — when multiple alarm subscriptions are active, the handle attached to a given `AlarmEventArgs` can change arbitrarily between transitions. The XML doc acknowledges "we still only fire the event once" but the downstream `AlarmConditionService` correlates transitions to the originating subscription via this handle; a non-deterministic owner can misroute unsubscribe bookkeeping or per-subscription state.
**Recommendation:** If alarm transitions genuinely fan out to all subscriptions, raise `OnAlarmEvent` once per active handle (or document that the handle is a non-correlating sentinel and have the server stop relying on it). If a single owner is required, make the choice deterministic (e.g. the earliest-created handle) and stable. **Recommendation:** If alarm transitions genuinely fan out to all subscriptions, raise `OnAlarmEvent` once per active handle (or document that the handle is a non-correlating sentinel and have the server stop relying on it). If a single owner is required, make the choice deterministic (e.g. the earliest-created handle) and stable.
**Resolution:** _(open)_ **Resolution:** Resolved 2026-05-22 — changed `_alarmSubscriptions` from `HashSet<GalaxyAlarmSubscriptionHandle>` to `List<GalaxyAlarmSubscriptionHandle>` so insertion order is preserved; `OnAlarmFeedTransition` now picks `[0]` (earliest-registered handle) instead of `First()` on a HashSet, making the owner selection deterministic and stable across mutations. Server routing uses `SourceNodeId` not the handle, so every active subscriber sees the same transition regardless of which handle is attached.
### Driver.Galaxy-007 ### Driver.Galaxy-007

View File

@@ -75,7 +75,10 @@ public sealed class GalaxyDriver
private readonly Lock _alarmHandlersLock = new(); private readonly Lock _alarmHandlersLock = new();
private readonly Lock _alarmFeedLock = new(); private readonly Lock _alarmFeedLock = new();
private bool _alarmFeedWired; private bool _alarmFeedWired;
private readonly HashSet<GalaxyAlarmSubscriptionHandle> _alarmSubscriptions = new(); // List preserves insertion order so OnAlarmFeedTransition always picks the
// earliest-registered handle — a deterministic choice that doesn't vary as
// handles are added/removed (Driver.Galaxy-006 fix: HashSet.First() is unstable).
private readonly List<GalaxyAlarmSubscriptionHandle> _alarmSubscriptions = new();
// PR 4.W — production runtime owned by InitializeAsync. The driver builds these // PR 4.W — production runtime owned by InitializeAsync. The driver builds these
// when it opens a real gw session; tests bypass them by injecting seams via the // when it opens a real gw session; tests bypass them by injecting seams via the
@@ -964,12 +967,15 @@ public sealed class GalaxyDriver
GalaxyAlarmSubscriptionHandle? handle; GalaxyAlarmSubscriptionHandle? handle;
lock (_alarmHandlersLock) lock (_alarmHandlersLock)
{ {
// Pick any active subscription handle as the "owner" of the event. The // Pick the earliest-registered handle as the event owner. The server routes
// server-side state machine doesn't multiplex by handle today; if multiple // by SourceNodeId (not by handle), so every active subscriber sees the same
// alarm subscriptions are active we still only fire the event once and // transition regardless of which handle is attached here. Using the first
// the AlarmConditionService dispatches per-source-node downstream. // insertion-order entry is deterministic and stable as long as at least one
// subscription remains — HashSet.First() was unstable across mutations
// (Driver.Galaxy-006 fix). _alarmSubscriptions is a List, so [0] is always
// the earliest-registered handle.
handle = _alarmSubscriptions.Count > 0 handle = _alarmSubscriptions.Count > 0
? _alarmSubscriptions.First() ? _alarmSubscriptions[0]
: null; : null;
} }
if (handle is null) return; if (handle is null) return;