fix(concurrency/lifetime): close Theme 5 — 10 concurrency / DI / scope findings

Concurrency hazards, DI lifetime hygiene, and one verify-only confirmation
across 8 modules. Highlights:

Concurrency:
- CentralUI-030: SandboxConsoleCapture writes routed through WriteSynchronized
  locking on the captured StringWriter — intra-script Task fan-out can no
  longer corrupt the per-call buffer.
- Commons-021: ExternalCallResult.Response now backed by Lazy<dynamic?>
  (ExecutionAndPublication) — no more benign double-parse race.
- CD-017: DeploymentManagerRepository.DeleteDeploymentRecordAsync now takes
  an expected RowVersion and seeds entry.OriginalValues so EF emits
  DELETE ... WHERE Id=@id AND RowVersion=@prior; a stale RowVersion now
  throws DbUpdateConcurrencyException instead of silently deleting the row.
- Transport-009: AuditCorrelationContext.BundleImportId backed by
  AsyncLocal<Guid?> so concurrent imports get per-logical-call isolation
  (was a scoped instance shared via AuditService across runs).

DI / lifetime:
- AuditLog-003: All 3 AuditLog actor handlers switched to CreateAsyncScope
  + await using — async EF disposal no longer swallowed.
- AuditLog-007: INodeIdentityProvider resolution standardised on
  GetRequiredService<>() (was mixed with GetService<>()).
- AuditLog-011: AddAuditLogHealthMetricsBridge guarded by sentinel
  descriptor check — calling twice no longer double-registers the hosted
  service.

Shutdown / supervision:
- SiteCallAudit-002: AkkaHostedService adds a CoordinatedShutdown
  cluster-leave task (drain-site-call-audit-singleton) that issues a
  bounded GracefulStop(10s) so failover waits for in-flight upserts.

Registration safety:
- NS-020: AkkaHostedService now guards NotificationForwarder S&F
  registration with _notificationDeliveryHandlerRegistered + throws
  InvalidOperationException on double-register to make the regression loud.

VERIFY-only closures:
- NotifOutbox-005: Confirmed already closed by CD-015 fix (ac96b83) —
  NotificationOutboxRepository.InsertIfNotExistsAsync uses the same
  raw-SQL IF NOT EXISTS + 2601/2627 swallow pattern; race eliminated.

5+ new regression tests (CentralUI sandbox WhenAll, ExternalCallResult
64-reader Barrier, AuditLog DI idempotency, RowVersion stale-throw,
SiteCallAudit-002 shutdown drain). Build clean; affected suites all green.
README regenerated: 65 open (was 75).
Joseph Doherty
2026-05-28 07:29:41 -04:00
parent 6ae0fea558
commit 2ed5c6c379
25 changed files with 699 additions and 239 deletions
+33 -10
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
-| Open findings | 6 |
+| Open findings | 3 |
## Summary
@@ -158,7 +158,7 @@ override as a children-only forward-compat placeholder, and state the actual
|--|--|
| Severity | Low |
| Category | Concurrency & thread safety |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.AuditLog/Central/AuditLogIngestActor.cs:133`, `src/ScadaLink.AuditLog/Central/AuditLogPurgeActor.cs:139`, `src/ScadaLink.AuditLog/Central/SiteAuditReconciliationActor.cs:178` |
**Description**
@@ -184,9 +184,16 @@ pattern with `await using var scope = _services.CreateAsyncScope();`. The DI sco
will dispose asynchronously and the EF Core context will be released without
blocking the actor thread.
-**Resolution**
+**Resolution (2026-05-28):**
-_Unresolved._
+All three handlers now use `CreateAsyncScope()` + `await using var scope = ...`.
+`AuditLogIngestActor.OnIngestAsync` factored the per-batch loop into a shared
+`IngestWithRepositoryAsync` helper so the injected-repository test ctor and
+the scoped production path both reach the same body without duplicating the
+per-row try/catch. `AuditLogPurgeActor.OnTickAsync` and
+`SiteAuditReconciliationActor.OnTickAsync` dropped the `try/finally { scope.Dispose(); }`
+pattern in favour of the `await using` lexical scope. EF Core DbContexts now
+dispose asynchronously across every audit ingest path.
### AuditLog-004 — `SiteAuditReconciliationActor` advances cursor even on per-row insert failure, silently abandoning permanently-failing rows
@@ -342,7 +349,7 @@ documents the choice. Behaviour for context-free callers is unchanged.
|--|--|
| Severity | Low |
| Category | Code organization & conventions |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.AuditLog/ServiceCollectionExtensions.cs:148-218` |
**Description**
@@ -376,9 +383,16 @@ inside `AddAuditLog` (with a sensible default — null node name returns `<unkno
add an explicit guard at the top of `AddAuditLog` that throws if no provider has been
registered yet (`services.Any(d => d.ServiceType == typeof(INodeIdentityProvider))`).
-**Resolution**
+**Resolution (2026-05-28):**
-_Unresolved._
+Took option (b): standardized all three consumers on `GetRequiredService<INodeIdentityProvider>()`.
+The Host (`SiteServiceRegistration.BindSharedOptions`) registers the provider on
+both site and central paths per the InboundAPI-022 / Host registration sweep,
+and the `AddAuditLogTests` fixture binds a `FakeNodeIdentityProvider`. A silent
+`GetService()` returning null would mask any future composition root that forgets
+the registration; strict resolution surfaces that bug at the first
+`ICachedCallTelemetryForwarder` / `CachedCallLifecycleBridge` / `ICentralAuditWriter`
+resolution instead.
### AuditLog-008 — Test composition roots that omit `IAuditPayloadFilter` silently pass UNREDACTED payloads through the writer chain
@@ -510,7 +524,7 @@ existing top-level catch swallows the `OperationCanceledException`.
|--|--|
| Severity | Low |
| Category | Code organization & conventions |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.AuditLog/ServiceCollectionExtensions.cs:53-55, 263-276, 301-346` |
**Description**
@@ -535,6 +549,15 @@ or (b) explicitly document idempotency on the public surface of every helper and
verify with a unit test in `AddAuditLogTests`. Option (a) matches the pattern other
SDK extensions use and removes a foot-gun.
-**Resolution**
+**Resolution (2026-05-28):**
-_Unresolved._
+Took option (a) for `AddAuditLogHealthMetricsBridge`: the helper is now guarded by
+a sentinel check on the `SiteAuditBacklogReporter` hosted-service descriptor (the
+helper's exclusive contribution to the collection). A second call short-circuits before
+any `Replace` / `AddHostedService` runs, so the hosted service registers
+exactly once. A new `AddAuditLogHealthMetricsBridge_IsIdempotent_DoesNotDoubleRegister_HostedService`
+test in `AddAuditLogTests` calls the helper twice and asserts a single
+`IHostedService` descriptor for `SiteAuditBacklogReporter`. The
+`AddAuditLogCentralMaintenance` helper is left for a follow-up: it is only
+ever called from the central composition root and the unit/integration
+fixtures use disposable `IServiceCollection` instances, so the foot-gun is narrower.