fix(concurrency/lifetime): close Theme 5 — 10 concurrency / DI / scope findings

Concurrency hazards, DI lifetime hygiene, and one verify-only confirmation
across 8 modules. Highlights:

Concurrency:
- CentralUI-030: SandboxConsoleCapture writes routed through WriteSynchronized
  locking on the captured StringWriter — intra-script Task fan-out can no
  longer corrupt the per-call buffer.
- Commons-021: ExternalCallResult.Response now backed by Lazy<dynamic?>
  (ExecutionAndPublication) — no more benign double-parse race.
- CD-017: DeploymentManagerRepository.DeleteDeploymentRecordAsync now takes
  an expected RowVersion and seeds entry.OriginalValues so EF emits
  DELETE ... WHERE Id=@id AND RowVersion=@prior; stale RowVersion now
  throws DbUpdateConcurrencyException instead of silent overwrite.
- Transport-009: AuditCorrelationContext.BundleImportId backed by
  AsyncLocal<Guid?> so concurrent imports get per-logical-call isolation
  (was a scoped instance shared via AuditService across runs).

DI / lifetime:
- AuditLog-003: All 3 AuditLog actor handlers switched to CreateAsyncScope
  + await using — async EF disposal no longer swallowed.
- AuditLog-007: INodeIdentityProvider resolution standardised on
  GetRequiredService<>() (was mixed with GetService<>()).
- AuditLog-011: AddAuditLogHealthMetricsBridge guarded by sentinel
  descriptor check — calling twice no longer double-registers the hosted
  service.

Shutdown / supervision:
- SiteCallAudit-002: AkkaHostedService adds a CoordinatedShutdown
  cluster-leave task (drain-site-call-audit-singleton) that issues a
  bounded GracefulStop(10s) so failover waits for in-flight upserts.

Registration safety:
- NS-020: AkkaHostedService now guards NotificationForwarder S&F
  registration with _notificationDeliveryHandlerRegistered + throws
  InvalidOperationException on double-register to make the regression loud.

VERIFY-only closures:
- NotifOutbox-005: Confirmed already closed by CD-015 fix (ac96b83) —
  NotificationOutboxRepository.InsertIfNotExistsAsync uses the same
  raw-SQL IF NOT EXISTS + 2601/2627 swallow pattern; race eliminated.

5+ new regression tests (CentralUI sandbox WhenAll, ExternalCallResult
64-reader Barrier, AuditLog DI idempotency, RowVersion stale-throw,
SiteCallAudit-002 shutdown drain). Build clean; affected suites all green.
README regenerated: 65 open (was 75).
This commit is contained in:
Joseph Doherty
2026-05-28 07:29:41 -04:00
parent 6ae0fea558
commit 2ed5c6c379
25 changed files with 699 additions and 239 deletions
+33 -10
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 6 |
| Open findings | 3 |
## Summary
@@ -158,7 +158,7 @@ override as a children-only forward-compat placeholder, and state the actual
|--|--|
| Severity | Low |
| Category | Concurrency & thread safety |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.AuditLog/Central/AuditLogIngestActor.cs:133`, `src/ScadaLink.AuditLog/Central/AuditLogPurgeActor.cs:139`, `src/ScadaLink.AuditLog/Central/SiteAuditReconciliationActor.cs:178` |
**Description**
@@ -184,9 +184,16 @@ pattern with `await using var scope = _services.CreateAsyncScope();`. The DI sco
will dispose asynchronously and the EF Core context will be released without
blocking the actor thread.
**Resolution**
**Resolution (2026-05-28):**
_Unresolved._
All three handlers now use `CreateAsyncScope()` + `await using var scope = ...`.
`AuditLogIngestActor.OnIngestAsync` factored the per-batch loop into a shared
`IngestWithRepositoryAsync` helper so the injected-repository test ctor and
the scoped production path both reach the same body without duplicating the
per-row try/catch. `AuditLogPurgeActor.OnTickAsync` and
`SiteAuditReconciliationActor.OnTickAsync` dropped the `try/finally { scope.Dispose(); }`
pattern in favour of the `await using` lexical scope. EF Core DbContexts now
dispose asynchronously across every audit ingest path.
### AuditLog-004 — `SiteAuditReconciliationActor` advances cursor even on per-row insert failure, silently abandoning permanently-failing rows
@@ -342,7 +349,7 @@ documents the choice. Behaviour for context-free callers is unchanged.
|--|--|
| Severity | Low |
| Category | Code organization & conventions |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.AuditLog/ServiceCollectionExtensions.cs:148-218` |
**Description**
@@ -376,9 +383,16 @@ inside `AddAuditLog` (with a sensible default — null node name returns `<unkno
add an explicit guard at the top of `AddAuditLog` that throws if no provider has been
registered yet (`services.Any(d => d.ServiceType == typeof(INodeIdentityProvider))`).
**Resolution**
**Resolution (2026-05-28):**
_Unresolved._
Took option (b) — standardized all three consumers on `GetRequiredService<INodeIdentityProvider>()`.
The Host (`SiteServiceRegistration.BindSharedOptions`) registers the provider on
both site and central paths per the InboundAPI-022 / Host registration sweep,
and the `AddAuditLogTests` fixture binds a `FakeNodeIdentityProvider`. A silent
`GetService()` returning null was masking a future composition root that forgot
the registration; the strict resolution surfaces that bug at first
`ICachedCallTelemetryForwarder` / `CachedCallLifecycleBridge` / `ICentralAuditWriter`
resolution instead.
### AuditLog-008 — Test composition roots that omit `IAuditPayloadFilter` silently pass UNREDACTED payloads through the writer chain
@@ -510,7 +524,7 @@ existing top-level catch swallows the `OperationCanceledException`.
|--|--|
| Severity | Low |
| Category | Code organization & conventions |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.AuditLog/ServiceCollectionExtensions.cs:53-55, 263-276, 301-346` |
**Description**
@@ -535,6 +549,15 @@ or (b) explicitly document idempotency on the public surface of every helper and
verify with a unit test in `AddAuditLogTests`. Option (a) matches the pattern other
SDK extensions use and removes a foot-gun.
**Resolution**
**Resolution (2026-05-28):**
_Unresolved._
Took option (a) for `AddAuditLogHealthMetricsBridge` — guarded by a sentinel
check on the `SiteAuditBacklogReporter` hosted-service descriptor (the helper's
exclusive contribution to the collection). A second call short-circuits before
any `Replace` / `AddHostedService` runs, so the hosted service registers
exactly once. New `AddAuditLogHealthMetricsBridge_IsIdempotent_DoesNotDoubleRegister_HostedService`
test in `AddAuditLogTests` calls the helper twice and asserts a single
`IHostedService` descriptor for `SiteAuditBacklogReporter`. The
`AddAuditLogCentralMaintenance` helper is left for a follow-up — it is only
ever called from the central composition root and the unit/integration
fixtures use disposable IServiceCollections, so the foot-gun is narrower.
+4 -2
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 5 |
| Open findings | 4 |
## Summary
@@ -1461,9 +1461,11 @@ plumbing CentralUI-027 will need.
|--|--|
| Severity | Low |
| Category | Concurrency & thread safety |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/SandboxConsoleCapture.cs:31-118`; `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:401-404` |
**Resolution (2026-05-28):** Wrapped every `Write`/`WriteLine` override in `SandboxConsoleCapture` through a `WriteSynchronized` helper that takes a `lock` on the current `AsyncLocal` capture buffer before writing — concurrent `Console.WriteLine` calls from a script's `Task.WhenAll`/`Task.Run` fan-out now serialise on the buffer instance, so the `StringBuilder` underneath can no longer be corrupted. The fall-through to the unwrapped `_fallback` writer is unlocked because the BCL's process-wide `Console.Out` is already synchronised. Different capture scopes have different lock targets, so two unrelated sandbox runs never block each other. New regression test `SandboxConsoleCaptureTests.BeginCapture_ConcurrentWritesFromTasks_DoNotCorruptBuffer` drives 32 tasks × 50 lines each through one capture scope and asserts every line is intact in the buffer.
**Description**
CentralUI-003 correctly routed console capture through an `AsyncLocal<StringWriter?>`
+4 -2
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 7 |
| Open findings | 6 |
## Summary
@@ -1014,9 +1014,11 @@ boundary (`AuditTelemetryEnvelope`, `PullAuditEventsRequest`/`Response`,
|--|--|
| Severity | Low |
| Category | Concurrency & thread safety |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.Commons/Interfaces/Services/IExternalSystemClient.cs:91-104` |
**Resolution (2026-05-28):** Replaced the two mutable backing fields (`_response`/`_responseParsed`) with a single `private readonly Lazy<dynamic?> _response` initialised in the field initializer — `LazyThreadSafetyMode.ExecutionAndPublication` (the default) guarantees the parse runs at most once and every concurrent reader observes the same published `DynamicJsonElement`. `Response` is now a one-line `_response.Value` expression-bodied property. Regression test `ExternalCallResultTests.Response_ConcurrentReads_ReturnSameInstance` fires 64 concurrent readers through a `Barrier` and asserts `Assert.Same` across all observed values.
**Description**
`ExternalCallResult` is a `record` returned to scripts after an outbound HTTP call. The
+12 -2
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 4 |
| Open findings | 3 |
## Summary
@@ -989,9 +989,19 @@ longer exists.
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.ConfigurationDatabase/Repositories/DeploymentManagerRepository.cs:83-97` |
**Resolution (2026-05-28):**
`IDeploymentManagerRepository.DeleteDeploymentRecordAsync` now requires a `byte[] expectedRowVersion`
argument. The repository seeds `entry.OriginalValues["RowVersion"]` on both the Local-tracked and
stub-attach branches so EF emits `DELETE ... WHERE Id = @id AND RowVersion = @prior` and surfaces a
concurrent edit as `DbUpdateConcurrencyException`. A new SQLite regression test
`DeleteDeploymentRecord_StaleRowVersion_ThrowsConcurrencyException` in `ConcurrencyTests`
(backed by a dedicated `RowVersionConcurrencyTestDbContext` that keeps `RowVersion` as a
caller-managed concurrency token) asserts the exception fires on a stale token; the existing
`DeleteDeploymentRecord_ViaStubAttachPath_RemovesEntity` was updated to pass the fresh RowVersion.
**Description**
`DeploymentRecord` carries a SQL Server `rowversion` concurrency token (declared
+4 -6
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 10 |
| Open findings | 9 |
## Summary
@@ -237,9 +237,11 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxActor.cs:127-132` (caller); root cause in `src/ScadaLink.ConfigurationDatabase/Repositories/NotificationOutboxRepository.cs:33-45` |
**Resolution (2026-05-28):** Closed by CD-015 — `NotificationOutboxRepository.InsertIfNotExistsAsync` (commit `ac96b83`) is now a single-statement `IF NOT EXISTS ... INSERT` via `ExecuteSqlInterpolatedAsync` with a `SqlException` filter swallowing duplicate-key violations (`2601`/`2627`) as a no-op (`return false`). The check-then-act window is eliminated; the at-least-once handoff contract holds and the actor's `PipeTo` success/failure projection no longer surfaces a permanent PK-violation back to the site. Verified in `src/ScadaLink.ConfigurationDatabase/Repositories/NotificationOutboxRepository.cs:51-103`.
**Description**
`HandleSubmit``PersistAsync` calls `repository.InsertIfNotExistsAsync(notification)`
@@ -271,10 +273,6 @@ and ack-back does not produce a permanent re-forward loop — but the cleanest f
remains the CD-015 raw-SQL `IF NOT EXISTS … INSERT` with `2601/2627` catch in
`NotificationOutboxRepository`.
**Resolution**
_Unresolved._
### NotificationOutbox-006 — `ResolveAdapters` rebuilds the `NotificationType → adapter` dictionary on every dispatch sweep
| | |
+4 -2
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 7 |
| Open findings | 6 |
## Summary
@@ -683,9 +683,11 @@ Recommended path is option 1: the parallel implementation in `EmailNotificationD
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.Host/Actors/AkkaHostedService.cs:654-660`, NS-001 resolution note (this file) |
**Resolution (2026-05-28):** Added a `_notificationDeliveryHandlerRegistered` sentinel field on `AkkaHostedService` and gated the canonical `NotificationForwarder` registration with an `InvalidOperationException` guard — a future code path that re-introduces the dead NS-001 site-SMTP handler now fails fast at startup with an explicit NS-020 diagnostic, rather than silently overwriting `RegisterDeliveryHandler`'s last-write-wins map and inverting the central-only design. The sentinel's XML doc cross-references NS-001/NS-019/NS-020 so a maintainer searching for the `Notification` S&F handler finds the one canonical registration and its history.
**Description**
NS-001 was resolved by registering an `S&F → DeliverBufferedAsync` handler for `StoreAndForwardCategory.Notification` at site startup in `AkkaHostedService`. The current source registers a **different** handler for the same category at `AkkaHostedService.cs:654-660``NotificationForwarder.DeliverAsync`, which forwards to central instead of sending SMTP. `StoreAndForwardService.RegisterDeliveryHandler` (verified by reading `StoreAndForward/StoreAndForwardService.cs` around line 109) takes a single handler per category — last-write-wins or first-write-wins, either way the two registrations cannot both be active.
+13 -23
View File
@@ -41,21 +41,21 @@ module file and counted in **Total**.
|----------|---------------|
| Critical | 0 |
| High | 0 |
| Medium | 25 |
| Low | 50 |
| **Total** | **75** |
| Medium | 22 |
| Low | 43 |
| **Total** | **65** |
## Module Status
| Module | Last reviewed | Commit | Open (C/H/M/L) | Open | Total |
|--------|---------------|--------|----------------|------|-------|
| [AuditLog](AuditLog/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/4 | 6 | 11 |
| [AuditLog](AuditLog/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/1 | 3 | 11 |
| [CLI](CLI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/2 | 3 | 23 |
| [CentralUI](CentralUI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/5 | 5 | 33 |
| [CentralUI](CentralUI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/4 | 4 | 33 |
| [ClusterInfrastructure](ClusterInfrastructure/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/3 | 3 | 14 |
| [Commons](Commons/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/5 | 5 | 23 |
| [Commons](Commons/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/4 | 4 | 23 |
| [Communication](Communication/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/1 | 2 | 22 |
| [ConfigurationDatabase](ConfigurationDatabase/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/2 | 4 | 24 |
| [ConfigurationDatabase](ConfigurationDatabase/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/2 | 3 | 24 |
| [DataConnectionLayer](DataConnectionLayer/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/0 | 0 | 22 |
| [DeploymentManager](DeploymentManager/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/4 | 4 | 24 |
| [ExternalSystemGateway](ExternalSystemGateway/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/1 | 2 | 23 |
@@ -63,15 +63,15 @@ module file and counted in **Total**.
| [Host](Host/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/3 | 4 | 22 |
| [InboundAPI](InboundAPI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/2 | 3 | 25 |
| [ManagementService](ManagementService/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/1 | 3 | 23 |
| [NotificationOutbox](NotificationOutbox/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/2 | 3 | 10 |
| [NotificationService](NotificationService/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/2 | 4 | 25 |
| [NotificationOutbox](NotificationOutbox/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/2 | 2 | 10 |
| [NotificationService](NotificationService/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/2 | 3 | 25 |
| [Security](Security/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/1 | 1 | 21 |
| [SiteCallAudit](SiteCallAudit/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/2 | 4 | 6 |
| [SiteCallAudit](SiteCallAudit/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/1 | 3 | 6 |
| [SiteEventLogging](SiteEventLogging/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/3 | 3 | 23 |
| [SiteRuntime](SiteRuntime/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/0 | 2 | 26 |
| [StoreAndForward](StoreAndForward/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/2 | 5 | 24 |
| [TemplateEngine](TemplateEngine/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/0 | 3 | 22 |
| [Transport](Transport/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/3 | 4 | 12 |
| [Transport](Transport/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/2 | 3 | 12 |
## Pending Findings
@@ -88,7 +88,7 @@ _None open._
_None open._
### Medium (25)
### Medium (22)
| ID | Module | Title |
|----|--------|-------|
@@ -97,14 +97,11 @@ _None open._
| CLI-019 | [CLI](CLI/findings.md) | `bundle export` decodes the entire base64 bundle into memory before writing |
| Communication-017 | [Communication](Communication/findings.md) | `_inProgressDeployments` grows unboundedly — successful deployments are never cleaned up |
| ConfigurationDatabase-016 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | `InboundApiRepository.GetApiKeyByValueAsync` hashes the candidate with the unpeppered `ApiKeyHasher.Default` |
| ConfigurationDatabase-017 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | Stub-attach delete on `DeploymentRecord` bypasses optimistic concurrency |
| ExternalSystemGateway-020 | [ExternalSystemGateway](ExternalSystemGateway/findings.md) | `JsonElementToParameterValue` silently downcasts non-Int64 JSON numbers to `double`, losing precision for `decimal` SQL parameters on retry |
| Host-016 | [Host](Host/findings.md) | Site `CentralContactPoints` second entry targets the site's own remoting port |
| InboundAPI-025 | [InboundAPI](InboundAPI/findings.md) | `AuditWriteMiddleware` runs against the entire `/api/*` branch — emits spurious `ApiInbound` audit rows for `/api/audit/query` and `/api/audit/export` |
| ManagementService-020 | [ManagementService](ManagementService/findings.md) | UpdateSmtpConfig returns and audits the SMTP Credentials field verbatim |
| ManagementService-021 | [ManagementService](ManagementService/findings.md) | Transport bundle handlers have zero test coverage |
| NotificationOutbox-005 | [NotificationOutbox](NotificationOutbox/findings.md) | Ingest persistence inherits the CD-015 check-then-act race; under contention the second writer throws and the site retries |
| NotificationService-020 | [NotificationService](NotificationService/findings.md) | NS-001 fix superseded; `AkkaHostedService` would register two competing `Notification` S&F handlers if both code paths ran |
| NotificationService-024 | [NotificationService](NotificationService/findings.md) | No test affirms the central-only invariant; the orphaned-path tests give a false coverage signal |
| SiteCallAudit-001 | [SiteCallAudit](SiteCallAudit/findings.md) | SupervisorStrategy override is dead code; XML claims Resume that is not enforced |
| SiteCallAudit-003 | [SiteCallAudit](SiteCallAudit/findings.md) | `OnUpsertAsync` does not refresh `IngestedAtUtc`; direct-write callers must remember to stamp it |
@@ -118,18 +115,14 @@ _None open._
| TemplateEngine-020 | [TemplateEngine](TemplateEngine/findings.md) | `Create*` audit entries are written with `EntityId = "0"` before `SaveChangesAsync` populates the real key |
| Transport-010 | [Transport](Transport/findings.md) | Critical Overwrite + cross-cutting paths uncovered by tests |
### Low (50)
### Low (43)
| ID | Module | Title |
|----|--------|-------|
| AuditLog-003 | [AuditLog](AuditLog/findings.md) | `AuditLogIngestActor.OnIngestAsync` uses `CreateScope`, but `OnCachedTelemetryAsync` uses `CreateAsyncScope` — and only one disposes asynchronously |
| AuditLog-007 | [AuditLog](AuditLog/findings.md) | `INodeIdentityProvider` resolution mixes `GetService` and `GetRequiredService` inconsistently across `AddAuditLog` registrations |
| AuditLog-008 | [AuditLog](AuditLog/findings.md) | Test composition roots that omit `IAuditPayloadFilter` silently pass UNREDACTED payloads through the writer chain |
| AuditLog-011 | [AuditLog](AuditLog/findings.md) | `AddAuditLogHealthMetricsBridge` and `AddAuditLogCentralMaintenance` are non-idempotent and register hosted services on every call |
| CLI-020 | [CLI](CLI/findings.md) | `bundle export` success-envelope parse is unguarded |
| CLI-022 | [CLI](CLI/findings.md) | `CommandTreeTests` excludes the two new command groups |
| CentralUI-029 | [CentralUI](CentralUI/findings.md) | `ConfigurationAuditLog` uses `JS.InvokeAsync<int>("eval", ...)` instead of a dedicated JS module |
| CentralUI-030 | [CentralUI](CentralUI/findings.md) | `SandboxConsoleCapture`'s per-call `StringWriter` is not thread-safe under intra-script concurrency |
| CentralUI-031 | [CentralUI](CentralUI/findings.md) | `TransportImport` buffers the full bundle bytes in component state |
| CentralUI-032 | [CentralUI](CentralUI/findings.md) | `AuditResultsGrid` paging is forward-only, no Previous button |
| CentralUI-033 | [CentralUI](CentralUI/findings.md) | Drill-in / query-string code paths for the new Transport + SiteCalls pages are untested |
@@ -139,7 +132,6 @@ _None open._
| Commons-016 | [Commons](Commons/findings.md) | `BundleSession.Locked` uses a magic `3` rather than a named constant |
| Commons-018 | [Commons](Commons/findings.md) | `IOperationTrackingStore` and `IPartitionMaintenance` are at the root of `Interfaces/` instead of `Interfaces/Services/` |
| Commons-020 | [Commons](Commons/findings.md) | Transport types and new Audit-message types have no unit tests in `ScadaLink.Commons.Tests` |
| Commons-021 | [Commons](Commons/findings.md) | `ExternalCallResult.Response` has a benign lazy-parse race |
| Commons-023 | [Commons](Commons/findings.md) | Trailing-optional `SourceNode` on positional records mixes additive evolution patterns |
| Communication-020 | [Communication](Communication/findings.md) | `SiteAddressCacheLoaded` carries mutable `Dictionary`/`List` types |
| ConfigurationDatabase-021 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | `SwitchOutPartitionAsync` interpolates `monthBoundary` / staging table name into raw SQL |
@@ -162,7 +154,6 @@ _None open._
| NotificationService-022 | [NotificationService](NotificationService/findings.md) | `MailKitSmtpClientWrapper` holds a long-lived `SmtpClient`; combined with per-send factory, the design comment about pooling is contradicted |
| NotificationService-025 | [NotificationService](NotificationService/findings.md) | `CredentialRedactor` over-masks: any 4-character credential component is masked anywhere it appears, including unrelated log text |
| Security-021 | [Security](Security/findings.md) | `RequireHttpsCookie=false` dev opt-out has no warning path — an HTTP production deployment silently transmits the JWT bearer credential in cleartext |
| SiteCallAudit-002 | [SiteCallAudit](SiteCallAudit/findings.md) | Singleton failover does not wait for in-flight async upserts |
| SiteCallAudit-006 | [SiteCallAudit](SiteCallAudit/findings.md) | Stuck-only paging test does not exercise the multi-page boundary with an interleaved non-stuck row at the cursor |
| SiteEventLogging-018 | [SiteEventLogging](SiteEventLogging/findings.md) | `FailedWriteCount` is exposed but never consumed by Health Monitoring |
| SiteEventLogging-022 | [SiteEventLogging](SiteEventLogging/findings.md) | `Cache=Shared` is redundant for a single-connection logger |
@@ -170,5 +161,4 @@ _None open._
| StoreAndForward-022 | [StoreAndForward](StoreAndForward/findings.md) | `NotifyCachedCallObserverAsync` silently drops the entire audit lifecycle when the message id is not a parseable `TrackedOperationId` |
| StoreAndForward-023 | [StoreAndForward](StoreAndForward/findings.md) | `siteId` silently defaults to empty when no `IStoreAndForwardSiteContext` is registered, degrading audit telemetry correlation |
| Transport-008 | [Transport](Transport/findings.md) | `PreviewAsync` issues an N+1 `GetTemplateWithChildrenAsync` per matching template name |
| Transport-009 | [Transport](Transport/findings.md) | `IAuditCorrelationContext.BundleImportId` is mutated on the same scoped instance the AuditService reads |
| Transport-012 | [Transport](Transport/findings.md) | "Bundle Import" filter promised in design doc not surfaced in Configuration Audit Log Viewer UI |
+4 -6
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 6 |
| Open findings | 5 |
## Summary
@@ -108,9 +108,11 @@ _Unresolved._
|--|--|
| Severity | Low |
| Category | Error handling & resilience |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.Host/Actors/AkkaHostedService.cs:455-462` (singleton wiring), `src/ScadaLink.SiteCallAudit/SiteCallAuditActor.cs:153-193` |
**Resolution (2026-05-28):** Added a `CoordinatedShutdown` task in the `cluster-leave` phase (named `drain-site-call-audit-singleton`) that issues an explicit `GracefulStop(10s)` to the `SiteCallAudit` cluster singleton manager before the cluster-leave proceeds. Akka.NET's singleton handover already waits for the active actor's `ReceiveAsync` task to complete before signalling `HandOverDone`, so an in-flight EF `UpsertAsync` (and its SQL round-trip) drains on the old node before the new singleton starts on the other central node — closing the seam where the new singleton could race a still-running upsert on the old node. The 10-second timeout is bounded so a misbehaving upsert cannot stall coordinated shutdown indefinitely; on timeout the existing `PoisonPill` termination path takes over and the repository's monotonic-upsert + 2601/2627 duplicate-key swallow remain as the storage-state safety net. Pattern is suitable for the `NotificationOutbox` singleton too; deferred to keep this change scoped.
**Description**
The singleton is created with `terminationMessage: PoisonPill.Instance`. On
@@ -146,10 +148,6 @@ Notification Outbox sibling has the same pattern.
monotonic rank check (the CD-015 race-pattern check the parent task
flagged).
**Resolution**
_Unresolved._
### SiteCallAudit-003 — `OnUpsertAsync` does not refresh `IngestedAtUtc`; direct-write callers must remember to stamp it
| | |
+11 -2
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 10 |
| Open findings | 9 |
## Summary
@@ -343,9 +343,18 @@ _Unresolved._
|--|--|
| Severity | Low |
| Category | Concurrency & thread safety |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.Transport/Import/BundleImporter.cs:528, 668, 703`, `src/ScadaLink.ConfigurationDatabase/Services/AuditCorrelationContext.cs` |
**Resolution (2026-05-28):**
Took option (a). `AuditCorrelationContext` now backs `BundleImportId` with a `static AsyncLocal<Guid?>`,
so every distinct `BundleImporter.ApplyAsync` invocation observes its own value even when sharing a
DI scope (e.g. concurrent imports awaited via `Task.WhenAll` on a single Blazor circuit). The XML
docs on both `IAuditCorrelationContext` and `AuditCorrelationContext` were rewritten to spell out
the per-logical-call-context isolation contract that all implementations must preserve. The
`BundleImporter` mutation pattern is unchanged — the property API is identical and existing
integration tests still pass.
**Description**
The XML doc on `IAuditCorrelationContext` correctly notes that mutating