fix(concurrency): close 8 race / thread-safety findings across CD, DCL, SR

CD-015: rewrite NotificationOutboxRepository.InsertIfNotExistsAsync as raw-SQL
IF NOT EXISTS … INSERT with SqlException 2601/2627 catch, ending the
at-least-once livelock on the site→central notification handoff.

DCL-018/019/020/021/022: add _subscribesInFlight guard so concurrent
same-tag subscribes don't orphan an adapter handle; delete the latent
dead _subscriptionHandles dictionary; stop double-counting
_totalSubscribed when an unresolved tag is promoted via another instance;
release adapter handles on mid-flight unsubscribe; gate the
tag-resolution retry timer with IsTimerActive so subscribe bursts don't
reset it into starvation.

SR-020: add _terminatingActorsByName shadow so a third deploy arriving
during a pending redeploy doesn't crash on InvalidActorNameException —
displaced senders get a Failed/superseded response and the latest
command wins on Terminated.

SR-024: split OperationTrackingStore reads from writes (fresh
SqliteConnection per GetStatusAsync) so long writes don't block status
queries; rewrite Dispose to drop the sync-over-async bridge that could
deadlock on a non-reentrant SyncContext; Interlocked.Exchange makes the
dispose-once flag race-safe across both paths.
Joseph Doherty
2026-05-28 05:20:13 -04:00
parent 5d2386cc9d
commit f936f55f51
15 changed files with 1152 additions and 170 deletions
+14 -1
@@ -891,9 +891,22 @@ columns) — asserting each column keeps an `EncryptedStringConverter`.
|--|--|
| Severity | High |
| Category | Concurrency & thread safety |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.ConfigurationDatabase/Repositories/NotificationOutboxRepository.cs:33-45` |
**Resolution** — rewrote `InsertIfNotExistsAsync` as a single raw-SQL
`IF NOT EXISTS (...) INSERT` matching the
`AuditLogRepository.InsertIfNotExistsAsync` and
`SiteCallAuditRepository.UpsertAsync` patterns, with a `SqlException`
catch on numbers 2601 (unique-index violation) and 2627
(primary-key/unique-constraint violation) returning `false`. Concurrent
losers are logged at Debug and treated as no-ops, eliminating the
site-retry livelock. Two SQLite-targeted assertions in
`RepositoryCoverageTests` were migrated to a new MS SQL-fixture file
`tests/ScadaLink.ConfigurationDatabase.Tests/Repositories/NotificationOutboxRepositoryIntegrationTests.cs`,
which also adds a 50-way parallel race test verifying exactly one row
lands and no exception bubbles.
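
As a rough illustration of the pattern described above (not the repository's actual code), the sketch below uses plain ADO.NET via `Microsoft.Data.SqlClient`. The table name `dbo.NotificationOutbox`, the `Payload` column, the `Guid` key type, and the helper names are all hypothetical. The catch is needed because `IF NOT EXISTS ... INSERT` is not atomic on its own under the default isolation level: two callers can both pass the existence check, and the unique key then rejects the loser with error 2601 or 2627.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

public static class OutboxInsertSketch
{
    // SQL Server duplicate-key error numbers:
    // 2601 = duplicate row in a unique index,
    // 2627 = PRIMARY KEY / UNIQUE constraint violation.
    public static bool IsDuplicateKeyError(int sqlErrorNumber) =>
        sqlErrorNumber is 2601 or 2627;

    // Returns true if this caller inserted the row; false if the row already
    // existed or a concurrent caller won the race.
    // Table, column, and parameter names are assumptions for illustration.
    public static async Task<bool> InsertIfNotExistsAsync(
        string connectionString,
        Guid notificationId,
        string payload,
        CancellationToken ct = default)
    {
        const string sql = @"
IF NOT EXISTS (SELECT 1 FROM dbo.NotificationOutbox WHERE NotificationId = @id)
BEGIN
    INSERT INTO dbo.NotificationOutbox (NotificationId, Payload)
    VALUES (@id, @payload);
    SELECT 1;
END
ELSE
    SELECT 0;";

        await using var connection = new SqlConnection(connectionString);
        await connection.OpenAsync(ct);

        await using var command = new SqlCommand(sql, connection);
        command.Parameters.AddWithValue("@id", notificationId);
        command.Parameters.AddWithValue("@payload", payload);

        try
        {
            // The batch selects 1 when the INSERT ran and 0 when the row was already there.
            object? result = await command.ExecuteScalarAsync(ct);
            return Convert.ToInt32(result) == 1;
        }
        catch (SqlException ex) when (IsDuplicateKeyError(ex.Number))
        {
            // A concurrent caller inserted the same key between our existence
            // check and our INSERT. Treat it as a no-op rather than a failure.
            return false;
        }
    }
}
```

Returning `false` for the losing caller, rather than letting the exception propagate, is what lets an at-least-once sender treat a duplicate delivery as already handled instead of retrying it indefinitely.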
**Description**
`InsertIfNotExistsAsync` does `AnyAsync(x => x.NotificationId == n.NotificationId)`,