13 well-bounded test-coverage gaps closed across 11 test projects.
Net +35 regression tests; no production code changes except the
SiteEventLogger src reference unchanged (W3 redacted only test code).
Test additions:
- CLI-022: CommandTreeTests pinned-count assertion bumped 14→16 and
3 InlineData rows added for the audit + bundle command groups.
- Commons-020: new TransportRecordsTests covers BundleManifest /
ExportSelection / ImportPreview / ImportResolution / ImportResult —
ctor + System.Text.Json round-trip + record-equality (14 tests).
- CD-024: SPLIT-RANGE failure-continuation now under
EnsureLookahead_SecondSplitThrows_LoopAborts_FirstBoundaryStillCommitted
(Skippable MS-SQL fixture); production-shape rowversion delete
asserted by DeleteDeploymentRecord_CurrentRowVersion_StubAttachPath_DeleteSucceeds.
- CentralUI-033: new QueryStringDrillInTests with 4 bUnit cases for
Transport + SiteCalls drill-in / query-string handling.
- DM-024: probe actors (ReconcileProbeActor, SerializationProbeActor,
ArtifactProbeActor) refactored from static fields to per-test instances
(Interlocked on counter) — all 31 callers updated; no production
changes required.
- HM-022: real-time PeriodicTimer test flake fixed by replacing
fixed-budget Task.Delay with a RunLoopUntil poll-until-condition
helper (5s/25ms). Production loop untouched.
- InboundAPI-023: new EndpointExtensionsTests covers the
POST /api/{methodName} composition wiring via TestServer (7 cases:
happy path, missing key 401, unknown method 403, invalid JSON 400,
missing param 400, script-throws 500 sanitised, AuditActorItemKey
stash invariant).
- MgmtSvc-021: 6 new ManagementActorTests cover the Transport bundle
handlers (role gate for Export/Preview/Import, unknown-name
ManagementCommandException, blocker-rejection, dedupe last-write-wins).
- SCA-006: SiteCallQueryRequest_StuckOnly_CursorAtNonStuckBoundary_SkipsToNextStuckRow
pins the missing boundary case.
- SEL-023: stress-test `bool stop` promoted to `volatile bool` for
cross-thread visibility under release/JIT.
Verify-only resolutions:
- NS-024: closed by NS-019 (commit ac96b83 deletion of
NotificationDeliveryService + its test file). No edits needed.
- NotifOutbox-008: FallbackMaxRetries/FallbackRetryDelay are private
forward-compat constants returned only when no SMTP-config row exists
(in which case EmailNotificationDeliveryAdapter returns Permanent,
bypassing the values entirely). Marked Resolved with note.
- Transport-010: Overwrite child-collection sync covered by the T-001/
T-002 tests added in commit e3ca9af; per-IP throttle by
BundleUnlockRateLimiterTests; failed-session retention by
BundleSessionStoreTests; T-009 closed structurally via AsyncLocal.
Marked Resolved by reference.
Build clean; all 11 affected test suites green. README regenerated:
33 open (was 46).
Security-sensitive batch, handled main-thread for careful judgment on
secret-leak and pepper-bypass paths.
Secret leak / pepper bypass:
- CD-016 (pepper bypass): InboundApiRepository's GetApiKeyByValueAsync no
longer hashes the candidate with the unpeppered ApiKeyHasher.Default —
ctor takes a lazy Func<IApiKeyHasher> accessor (lazy so test composition
roots without a pepper still bring up the repository), and the DI
registration wires sp.GetService<IApiKeyHasher>() so the production
peppered hasher matches the stored KeyHash. Regression test asserts
positive (peppered roundtrip) AND negative (Default hasher misses the
same key — proving the lookup uses the injected hasher).
- MgmtSvc-020 (SMTP credential leak): UpdateSmtpConfig/ListSmtpConfigs
now project through SmtpConfigPublicShape so the response payload and
audit-row afterState never carry the Credentials field — only a
HasCredentials bool. The SMTP password / OAuth2 client secret no
longer leaves the Admin-only UpdateSmtpConfig boundary the caller
already supplied it to.
Redaction:
- AuditLog-008 (test-fixture under-redact): new
SafeDefaultAuditPayloadFilter (stateless singleton) does HTTP header
redaction for the always-sensitive defaults (Authorization, X-Api-Key,
Cookie, Set-Cookie). FallbackAuditWriter, CentralAuditWriter, and
AuditLogIngestActor (both ingest paths) default to it instead of null
— composition roots that bypass AddAuditLog can no longer write
unredacted auth headers to the audit store.
- NotifService-025 (over-mask): CredentialRedactor.Scrub now only masks
the last colon-separated component (password / clientSecret) AND only
if it's >= 12 chars (typical password heuristic). Short user names
like "root" no longer become global redaction tokens that eat unrelated
diagnostic text. The full packed string is always masked regardless of
length. 3 new negative tests pin the no-over-mask contract.
Audit-row correctness / fail-loud:
- InboundAPI-025: Program.cs UseWhen predicate now excludes /api/audit,
/api/management, /api/centralui, /api/script-analysis AND requires POST
— the AuditWriteMiddleware no longer emits spurious ApiInbound rows
for audit-log query/export endpoints (write-on-read recursion broken).
- ESG-021: ApplyAuth now logs Warning (not silent) on empty
AuthConfiguration for apikey/basic, unknown AuthType, and malformed
Basic config. AuthConfiguration value NEVER logged. AuthType=none
remains silent (documented unauthenticated sentinel).
- Security-021: AddSecurity now logs a startup Warning when
RequireHttpsCookie=false — an HTTP-only deployment that previously
transmitted the cookie-embedded JWT silently in cleartext is now
audible in the log.
Defensive:
- CD-021: SwitchOutPartitionAsync's monthBoundary format string now
yyyy-MM-dd HH:mm:ss.fffffff (datetime2(7) precision) so a future
sub-second / non-midnight boundary doesn't silently round to the
wrong partition.
Plus reconciled stale per-module Open-findings counters that had drifted
from earlier sessions (AuditLog, CD, ESG, IAPI, MgmtSvc, NotifService,
Security).
Build clean; all affected test projects green (Host 208, ConfigDB 242,
ESG 69, IAPI 151, MgmtSvc 100, NotifService 55, Security 85, AuditLog
247/248 — 1 pre-existing date-sensitive integration test flake on
PartitionPurgeTests, unrelated). README regenerated: 46 open (was 54).
Well-localised perf fixes across 8 modules.
Lock decoupling / SQL streaming:
- AuditLog-005: SqliteAuditWriter gains dedicated read-only _readConnection
(+ _readLock) backed by WAL journal mode. GetBacklogStatsAsync,
ReadPendingAsync, ReadPendingSinceAsync, ReadForwardedAsync no longer
contend with the hot-path INSERT lock — backlog probes on a 30s timer
can't stall the writer under multi-hundred-K Pending backlog.
- SEL-022: dropped Cache=Shared from SiteEventLogger's default connection
string (single-connection logger; mode was dormant config).
Memory / streaming:
- CLI-019: bundle export streams base64 in 1 MB-aligned chunks via
Convert.TryFromBase64Chars straight into the FileStream — no more
full-bundle byte[] allocation.
- CentralUI-031: TransportImport now stages the upload to a per-session
temp file under Path.GetTempPath() (replaces in-memory byte[] field);
page implements IDisposable to delete the temp file on reset / new
upload / dispose. Per-circuit working set drops from ~100 MB to ~80 KB.
N+1 hoisting:
- Transport-008: added ITemplateEngineRepository.GetTemplatesWithChildrenAsync
bulk method; BundleImporter.PreviewAsync calls it once instead of per-
template-name. Single query with .Include(...).AsSplitQuery().
- DM-023: BuildDeployArtifactsCommandAsync's per-site loop now references
a pre-fetched GlobalArtifactSnapshot (shared scripts, external systems,
DB connections, notification lists, SMTP) instead of re-querying per site.
- MgmtSvc-023: HandleQueryDeployments unfiltered branch uses one
GetAllInstancesAsync bulk load + Dictionary<int,int?> lookup (was a
GetInstanceByIdAsync per record).
Small allocations / per-tick rebuilds:
- InboundAPI-019: AuditWriteMiddleware gates EnableBuffering() on
RequestHasBody() so GET/HEAD/DELETE/TRACE/OPTIONS and Content-Length:0
requests skip the FileBufferingReadStream allocation.
- NotifOutbox-006: ResolveAdapters dictionary now cached on
_adaptersCache (built lazily on first sweep) + actor-lifetime
_adaptersScope; ResolveAdapters no longer rebuilds per dispatch tick.
Verify-only:
- Comm-017: Confirmed _inProgressDeployments was deleted by Comm-016 in
commit ac96b83 — marked Resolved with that attribution. No code change.
Doc-correction:
- NS-022: Updated MailKitSmtpClientWrapper XML doc to spell out single-
connection / per-delivery-factory contract (option (b) — transient
client per Send — rejected because it re-handshakes TLS per email).
10+ new regression tests across 8 test projects. Build clean; affected
suites all green. README regenerated: 54 open (was 65).
Async cancellation hygiene, fire-and-forget observability, retry/shutdown
semantics, and audit-row coverage across 9 modules. Highlights:
Cancellation & lifecycle:
- AuditLog-006: SqliteAuditWriter.Dispose hops to thread pool, escaping the
captured SyncContext that risked sync-over-async deadlock.
- AuditLog-010: SiteAuditTelemetryActor owns a private lifecycle CTS,
threaded through drain paths instead of CancellationToken.None.
- Comm-019: CentralCommunicationActor adds lifecycle CTS for repo calls.
- Host-019: Migration StartupRetry forwards ApplicationStopping so SIGTERM
during the bounded-retry window aborts cleanly.
Cursor / retry / counter correctness:
- AuditLog-004: SiteAuditReconciliationActor's cursor now holds at `since`
when any row's idempotent insert is still being retried (per-EventId
retry counter, MaxPermanentInsertAttempts=5 escape valve with LogCritical
abandon). No more silent abandonment of permanently-failing rows.
- ConfigDB-019: Dropped the catch-and-continue on EnsureLookaheadAsync's
SPLIT loop — by class-doc construction the catch could only mask real
failures and let the next iteration create permanent partition holes.
- HM-017/018: HealthReportSender + CentralHealthReportLoop snapshot
per-interval counters before sending, restore via new
ISiteHealthCollector.AddIntervalCounters on transport failure so counts
aren't silently lost.
Fire-and-forget / shutdown waits:
- InboundAPI-018: AuditWriteMiddleware observes faulted audit-write tasks
via OnlyOnFaulted continuation (Warning log; response unchanged).
- SnF-024: StoreAndForwardService.StopAsync awaits in-flight retry sweep
with a bounded SweepShutdownWaitTimeout (10s).
Leak / refactor:
- Comm-021: SiteStreamGrpcServer.SubscribeInstance wraps Subscribe in its
own try/catch so a throw doesn't leak the relay actor or _activeStreams
entry.
- Comm-022: VERIFIED already-closed by Comm-016's dead-code purge.
- CLI-017: BundleCommands' three subcommands delegate to ExecuteCommandAsync
(auth-failure exit-code contract unified).
Defensive / validation:
- CLI-021: CliConfig.Load wraps file-read/JSON parse so malformed config
prints a warning and returns defaults instead of crashing the CLI.
- Host-022: ParseLevel emits stderr one-shot warning for unrecognised
MinimumLevel instead of silently coercing to Information.
- ESG-019: ExternalSystemClient sets HttpClient.Timeout=Infinite so the
per-call CTS is the sole timeout source (was clipped to 100s by .NET).
- Security-020: New SecurityOptionsValidator (IValidateOptions) rejects
empty LdapServer/LdapSearchBase with ValidateOnStart.
- DM-019: Lifecycle command timeouts now emit DisableTimedOut/EnableTimedOut/
DeleteTimedOut audit entries (mirrors DeployFailed pattern).
Plus reconciled stale per-module Open-findings counters that had drifted
from prior sessions.
20+ new regression tests across 11 test projects; build clean; affected
suites all green. README regenerated: 75 open (was 93).
Each finding is a focused validation guard or upper bound at a trust boundary.
Highlights:
- Commons-015: EncryptionMetadata ctor now validates Algorithm (AES-256-GCM
only), Kdf (PBKDF2-SHA256 only), Iterations ([100k, 10M]), non-null Salt/IV.
- Transport-004: new BundleUnlockRateLimiter (sliding-window, per-key,
singleton) wired into BundleImporter.LoadAsync; over-budget callers see
BundleUnlockRateLimitedException. Per-bundle 3-strike + per-window cap.
- ESG-022: ExternalSystemClient.InvokeHttpAsync allow-lists the documented
GET/POST/PUT/PATCH/DELETE set (case-insensitive); unknown verbs throw.
- SEL-015: SiteEventLogger queue now bounded (10k cap, DropOldest); dropped
events fault their Task and increment FailedWriteCount so the drop is
observable instead of an unbounded memory growth.
- SEL-017: EventLogQueryService clamps caller-supplied PageSize to a new
MaxQueryPageSize cap (default 500) so int.MaxValue can't OOM the host.
- SEL-020: LogEventAsync rejects severities outside {Info, Warning, Error}
(matches SQLite BINARY-collation query filter).
- InboundAPI-020: ContentType "json" check now case-insensitive
(application/JSON no longer slips through as not-json).
- InboundAPI-024: _knownBadMethods capped at 1000 entries (drops new entries
once full); per-request DB lookup remains the correctness path.
- SR-025: HandleSetStaticAttribute validates the attribute name against the
deployed config; unknown names now return Success=false instead of
leaking orphan override rows into the SQLite store.
- TE-021: MoveTemplateAsync runs the sibling-name-collision check at the
destination, mirroring TemplateFolderService.MoveFolderAsync.
- TE-022: LockEnforcer's once-locked-stays-locked rule now also covers
LockedInDerived (was previously only IsLocked).
New regression tests across 8 test projects (EncryptionMetadata, rate
limiter, ESG client allow-list, SEL bounded channel / PageSize clamp /
severity validation, InboundAPI ContentType + bad-methods cap, SiteRT
unknown-attribute, TemplateEngine MoveTemplate + LockedInDerived).
Build clean; affected suites all green. README regenerated: 93 open (was 104).
Note: a separate manual re-run was needed for the SiteEventLogging hunk
because its initial subagent's source edits never landed on disk despite
reporting success (file-collision-style failure mode).
Comm-016: delete dead HandleConnectionStateChanged + _debugSubscriptions /
_inProgressDeployments tracking + ConnectionStateChanged message record.
Disconnect detection is owned by the transport layers (gRPC keepalive PING
~25s; Ask-timeout at CommunicationService). Updates the
Component-Communication.md design doc to make that explicit.
SnF-018: NotificationForwarder.DeliverAsync now discards a corrupt buffered
payload (Warning log + return true) instead of returning false and parking
the row — honoring the design's "notifications do not park" invariant.
DM-018: reconciliation no longer force-sets Enabled, preserving an
intentional Disabled state after central failover.
ESG-018: DeliverBufferedAsync (both ExternalSystemClient + DatabaseGateway)
catches JsonException and returns false, turning a corrupt buffered row
into a parked operation instead of a retry-forever poison message.
InboundAPI-022: register ActiveNodeGate as IActiveNodeGate in the Central
DI branch so standby-node gating is actually wired up in production.
NS-019: remove orphaned NotificationDeliveryService /
INotificationDeliveryService / NotificationResult; central notification
delivery now lives entirely in NotificationOutbox.
SEL-016: normalise From/To filters to UTC before ISO-string compare so
non-UTC DateTimeOffset clients no longer get spuriously excluded events.
TE-017: include Description on attributes/alarms and a HashableConnections
projection (protocol, endpoint JSON, failover count) in the revision hash
and DiffService; staleness detection now catches description-only and
connection-endpoint edits.
Transport-001 and Transport-002 (also High) remain Open — they're being
handled in a follow-up batch because both touch BundleImporter.cs and
must serialise.
Re-applies the full 10-category checklist to every src/ project — including
first-time reviews of the four newer components (AuditLog, NotificationOutbox,
SiteCallAudit, Transport) — so the code-reviews/ index reflects today's
codebase rather than the 2026-05-16 baseline. 172 new Open findings (0
Critical, 18 High, 62 Medium, 92 Low); 481 findings total across 23 modules.
regen-readme.py now derives each module's Last reviewed + Commit from its
findings.md header instead of hard-coding 2026-05-16 / 9c60592, so future
single-module re-reviews show their own date in the Module Status table.
Establishes a per-module code review workflow under code-reviews/ and
records the 2026-05-16 baseline review (commit 9c60592): 241 findings
across all src/ modules (6 Critical, 46 High, 100 Medium, 89 Low).
This is the clean starting point for remediation work.