fix(data-connection-layer): resolve DataConnectionLayer-008,013 — O(1) unsubscribe via reverse index, atomic disconnect guard

Joseph Doherty
2026-05-16 22:14:23 -04:00
parent 7d1cc5cbb4
commit ff4a4bdeb7
6 changed files with 196 additions and 24 deletions

@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 |
| Reviewer | claude-agent |
| Commit reviewed | `9c60592` |
-| Open findings | 2 |
+| Open findings | 0 |
## Summary
@@ -381,7 +381,7 @@ after.
|--|--|
| Severity | Low |
| Category | Performance & resource management |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:540-569` |
**Description**
@@ -404,7 +404,26 @@ prior state captured before removal.
**Resolution**
-_Unresolved._
+Resolved 2026-05-16 (commit pending). A `tagPath → subscriber-count` reverse index
+(`_tagSubscriberCount`) was added: `HandleSubscribeCompleted` increments it whenever a
+tag is newly added to an instance's set, and `HandleUnsubscribe` decrements it,
+releasing a tag at the adapter only when the count reaches zero. The "any other
+subscriber" check is now O(1) per tag instead of an O(instances) `Where(...).Any()`
+scan. The redundant `!_unresolvedTags.Contains(tagPath)` re-check (always true after
+the unconditional `Remove` on the line above) was removed. The surviving branch is
+entered only for tags that have a subscription id, which are by definition resolved,
+so `_resolvedTags--` is now unconditional, with an explanatory comment. The cleanup
+also fixed a latent leak that the original code could not reach: when the last
+subscriber of an unresolved tag unsubscribes, the tag is now removed from
+`_unresolvedTags`/`_resolutionInFlight` and `_totalSubscribed` is decremented
+(previously it lingered in the retry timer and the subscribed total indefinitely).
+Regression test `DCL008_Unsubscribe_OnlyReleasesTagWhenLastSubscriberLeaves`
+subscribes a tag from two instances plus an exclusive tag, then unsubscribes each
+instance and asserts that the shared tag is released at the adapter only after the
+last subscriber leaves and that the health counters track correctly. (This finding
+is a performance refactor, not a correctness bug: the pre-fix `Where(...).Any()`
+logic was functionally correct, so the test passes against both versions and serves
+as a behavioural guard for the refactor.)
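
The reverse-index idea described in the DataConnectionLayer-008 resolution can be sketched roughly as follows. This is a minimal illustration, not the code in `DataConnectionActor.cs`: the class `TagSubscriptionIndex`, its methods, and the `_tagsByInstance` field are hypothetical, and only the `_tagSubscriberCount` name is taken from the resolution text. It leaves out the resolved/unresolved bookkeeping and the health counters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the actor's subscription bookkeeping.
// Only the _tagSubscriberCount name comes from the resolution text.
public sealed class TagSubscriptionIndex
{
    // instance name -> tags that instance is subscribed to
    private readonly Dictionary<string, HashSet<string>> _tagsByInstance = new();

    // tagPath -> number of instances currently subscribed (the reverse index)
    private readonly Dictionary<string, int> _tagSubscriberCount = new();

    // Records a subscription. Returns true only when the tag is newly added
    // to the instance's set, which is the only case that bumps the count.
    public bool Subscribe(string instance, string tagPath)
    {
        if (!_tagsByInstance.TryGetValue(instance, out var tags))
        {
            tags = new HashSet<string>();
            _tagsByInstance[instance] = tags;
        }

        if (!tags.Add(tagPath))
            return false; // already subscribed: do not double-count

        _tagSubscriberCount.TryGetValue(tagPath, out var count); // 0 when absent
        _tagSubscriberCount[tagPath] = count + 1;
        return true;
    }

    // Drops every subscription held by the instance and returns the tags that
    // now have no subscriber left, i.e. the ones to release at the adapter.
    public IReadOnlyList<string> Unsubscribe(string instance)
    {
        var released = new List<string>();
        if (!_tagsByInstance.Remove(instance, out var tags))
            return released;

        foreach (var tagPath in tags)
        {
            // O(1) lookup replaces a scan over every other instance's set.
            var remaining = _tagSubscriberCount[tagPath] - 1;
            if (remaining == 0)
            {
                _tagSubscriberCount.Remove(tagPath);
                released.Add(tagPath);
            }
            else
            {
                _tagSubscriberCount[tagPath] = remaining;
            }
        }

        return released;
    }
}
```

With two instances sharing one tag, unsubscribing the first instance returns only its exclusive tags; the shared tag appears in the result only when the second instance unsubscribes. The cost of this design is that the count must be kept in step with the per-instance sets on every add and remove, which is why the sketch increments only when `HashSet.Add` reports a new entry.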
### DataConnectionLayer-009 — Implemented failover heuristic diverges from the documented state machine
@@ -606,7 +625,7 @@ deliberately not made here because this task is scoped to
|--|--|
| Severity | Low |
| Category | Documentation & comments |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.DataConnectionLayer/Adapters/OpcUaDataConnection.cs:270-281` |
**Description**
@@ -629,4 +648,16 @@ under a race and is tolerated downstream."
**Resolution**
-_Unresolved._
+Resolved 2026-05-16 (commit pending). Rather than weaken the XML comment to match the
+weak guard, the guard was made genuinely atomic so the documented "only the first
+caller fires the event" guarantee becomes true. `OpcUaDataConnection._disconnectFired`
+and `RealOpcUaClient._connectionLostFired` were changed from `volatile bool` to `int`,
+and the check-then-set in `RaiseDisconnected` / `OnSessionKeepAlive` was replaced with a
+single atomic test-and-set, `Interlocked.Exchange(ref flag, 1) != 0`; the reset on
+connect uses `Interlocked.Exchange(ref flag, 0)`. The XML comments on both methods were
+updated to describe the atomic guard explicitly. Regression test
+`DCL013_ConcurrentConnectionLost_RaisesDisconnectedExactlyOnce` runs 25 rounds, each
+fanning out 32 barrier-synchronised threads that raise the client's `ConnectionLost`
+event simultaneously, and asserts that `Disconnected` fires exactly once per round;
+against a non-atomic check-then-set it double-fires (verified by temporarily reverting
+the guard), and it passes against the atomic fix.
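
The guard described in the DataConnectionLayer-013 resolution might look roughly like the sketch below. The class `DisconnectGuard` and the method `ResetOnConnect` are hypothetical names used for illustration; the `_disconnectFired` field, the `RaiseDisconnected` method, the `Disconnected` event, and the two `Interlocked.Exchange` calls follow the resolution text. The real adapter carries more state than is shown here.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the adapter's disconnect path.
public sealed class DisconnectGuard
{
    // 0 = event not yet fired for this connection, 1 = already fired.
    // An int is used because Interlocked.Exchange has no bool overload
    // on older runtimes.
    private int _disconnectFired;

    // Initialised with a no-op handler so it can be invoked without a null check.
    public event Action Disconnected = delegate { };

    public void RaiseDisconnected()
    {
        // Exchange stores 1 and returns the previous value in one atomic step,
        // so exactly one concurrent caller sees 0 and goes on to fire the event.
        if (Interlocked.Exchange(ref _disconnectFired, 1) != 0)
            return;

        Disconnected();
    }

    // Re-arms the guard when a new connection is established.
    public void ResetOnConnect() => Interlocked.Exchange(ref _disconnectFired, 0);
}
```

A plain `if (!fired) { fired = true; ... }` on a `volatile bool` gives visibility but not atomicity: two threads can both read `false` before either writes `true`. Folding the read and the write into one `Interlocked.Exchange` call closes that window.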