perf: close Theme 6 — 11 allocation / N+1 / lock-contention findings

Well-localised perf fixes across 8 modules.

Lock decoupling / SQL streaming:
- AuditLog-005: SqliteAuditWriter gains dedicated read-only _readConnection
  (+ _readLock) backed by WAL journal mode. GetBacklogStatsAsync,
  ReadPendingAsync, ReadPendingSinceAsync, ReadForwardedAsync no longer
  contend with the hot-path INSERT lock — backlog probes on a 30s timer
  can't stall the writer under multi-hundred-K Pending backlog.
- SEL-022: dropped Cache=Shared from SiteEventLogger's default connection
  string (single-connection logger; mode was dormant config).

Memory / streaming:
- CLI-019: bundle export streams base64 in 1 MB-aligned chunks via
  Convert.TryFromBase64Chars straight into the FileStream — no more
  full-bundle byte[] allocation.
- CentralUI-031: TransportImport now stages the upload to a per-session
  temp file under Path.GetTempPath() (replaces in-memory byte[] field);
  page implements IDisposable to delete the temp file on reset / new
  upload / dispose. Per-circuit working set drops from ~100 MB to ~80 KB.

N+1 hoisting:
- Transport-008: added ITemplateEngineRepository.GetTemplatesWithChildrenAsync
  bulk method; BundleImporter.PreviewAsync calls it once instead of per-
  template-name. Single query with .Include(...).AsSplitQuery().
- DM-023: BuildDeployArtifactsCommandAsync's per-site loop now references
  a pre-fetched GlobalArtifactSnapshot (shared scripts, external systems,
  DB connections, notification lists, SMTP) instead of re-querying per site.
- MgmtSvc-023: HandleQueryDeployments unfiltered branch uses one
  GetAllInstancesAsync bulk load + Dictionary<int,int?> lookup (was a
  GetInstanceByIdAsync per record).

Small allocations / per-tick rebuilds:
- InboundAPI-019: AuditWriteMiddleware gates EnableBuffering() on
  RequestHasBody() so GET/HEAD/DELETE/TRACE/OPTIONS and Content-Length:0
  requests skip the FileBufferingReadStream allocation.
- NotifOutbox-006: ResolveAdapters dictionary now cached on
  _adaptersCache (built lazily on first sweep) + actor-lifetime
  _adaptersScope; ResolveAdapters no longer rebuilds per dispatch tick.

Verify-only:
- Comm-017: Confirmed _inProgressDeployments was deleted by Comm-016 in
  commit ac96b83 — marked Resolved with that attribution. No code change.

Doc-correction:
- NS-022: Updated MailKitSmtpClientWrapper XML doc to spell out single-
  connection / per-delivery-factory contract (option (b) — transient
  client per Send — rejected because it re-handshakes TLS per email).

10+ new regression tests across 8 test projects. Build clean; affected
suites all green. README regenerated: 54 open (was 65).
This commit is contained in:
Joseph Doherty
2026-05-28 07:47:24 -04:00
parent 2ed5c6c379
commit 55f46e7c92
34 changed files with 1131 additions and 149 deletions
@@ -140,6 +140,63 @@ public class ArtifactDeploymentServiceTests : TestKit
Assert.Contains(result.Value.SiteResults, r => r.SiteId == "fail-site" && !r.Success);
}
// ── DeploymentManager-023: global artifact queries hoisted out of the per-site loop ──
[Fact]
public async Task DeployToAllSitesAsync_HoistsGlobalArtifactQueriesOutOfPerSiteLoop()
{
// DeploymentManager-023: previously each per-site iteration of the deploy-many
// loop re-issued the global artifact queries (shared scripts, external systems,
// DB connections, notification lists, SMTP configs) — a textbook N+1 over the
// global sets. With three sites the queries must now be issued ONCE in total,
// regardless of site count.
var sites = new List<Site>
{
new("Site One", "site-1") { Id = 1 },
new("Site Two", "site-2") { Id = 2 },
new("Site Three", "site-3") { Id = 3 },
};
_siteRepo.GetAllSitesAsync(Arg.Any<CancellationToken>()).Returns(sites);
var probe = Sys.ActorOf(Props.Create(() => new ArtifactProbeActor()));
var service = CreateServiceWithCommActor(probe);
var result = await service.DeployToAllSitesAsync("admin");
Assert.True(result.IsSuccess);
// Each global query must be called EXACTLY ONCE for the whole multi-site sweep.
await _templateRepo.Received(1).GetAllSharedScriptsAsync(Arg.Any<CancellationToken>());
await _externalSystemRepo.Received(1).GetAllExternalSystemsAsync(Arg.Any<CancellationToken>());
await _externalSystemRepo.Received(1).GetAllDatabaseConnectionsAsync(Arg.Any<CancellationToken>());
await _notificationRepo.Received(1).GetAllNotificationListsAsync(Arg.Any<CancellationToken>());
await _notificationRepo.Received(1).GetAllSmtpConfigurationsAsync(Arg.Any<CancellationToken>());
// The per-site query (data connections) DOES vary per site and must still run
// once per site.
await _siteRepo.Received(1).GetDataConnectionsBySiteIdAsync(1, Arg.Any<CancellationToken>());
await _siteRepo.Received(1).GetDataConnectionsBySiteIdAsync(2, Arg.Any<CancellationToken>());
await _siteRepo.Received(1).GetDataConnectionsBySiteIdAsync(3, Arg.Any<CancellationToken>());
}
[Fact]
public async Task RetryForSiteAsync_SingleSitePath_StillRunsTheGlobalQueriesOnce()
{
// DeploymentManager-023: the single-site convenience overload still owns its
// own global-fetch (it cannot inherit from a sweep), so for one site every
// global query is issued exactly once. Pin this so a future refactor cannot
// accidentally route RetryForSiteAsync through the multi-site loop and lose
// the audit row's deploymentId guarantee.
var probe = Sys.ActorOf(Props.Create(() => new ArtifactProbeActor()));
var service = CreateServiceWithCommActor(probe);
var result = await service.RetryForSiteAsync(1, "retry-site", "admin");
Assert.True(result.IsSuccess);
await _templateRepo.Received(1).GetAllSharedScriptsAsync(Arg.Any<CancellationToken>());
await _externalSystemRepo.Received(1).GetAllExternalSystemsAsync(Arg.Any<CancellationToken>());
await _externalSystemRepo.Received(1).GetAllDatabaseConnectionsAsync(Arg.Any<CancellationToken>());
await _notificationRepo.Received(1).GetAllNotificationListsAsync(Arg.Any<CancellationToken>());
await _notificationRepo.Received(1).GetAllSmtpConfigurationsAsync(Arg.Any<CancellationToken>());
}
[Fact]
public async Task RetryForSiteAsync_SiteSucceeds_ReturnsSuccessAndAudits()
{