The largest themed batch — small mechanical fixes across 11 modules.
API / message hygiene:
- Comm-020: SiteAddressCacheLoaded now carries IReadOnlyDictionary /
IReadOnlyList — Akka messages must be immutable.
- Commons-016: BundleSession.MaxUnlockAttempts named constant replaces
magic 3.
- Commons-018: IOperationTrackingStore + IPartitionMaintenance moved from
Interfaces/ root to Interfaces/Services/ (namespace preserved — 9
consumers exceeded the in-prompt move threshold).
- Commons-023: TrackingStatusSnapshot.SourceNode now consistent with the
trailing-optional-with-default pattern used elsewhere.
- SR-022: AuditingDbCommand.DbConnection.set no longer uses reflection —
exposes AuditingDbConnection.Inner via internal API surface.
Dead code / config cleanup:
- ClusterInfra-011: decorative SectionName constant deleted.
- ClusterInfra-014: dead AddClusterInfrastructureActors method + its
"throws-when-called" test deleted.
- Host-021: Microsoft Logging:LogLevel block deleted from appsettings.json
(dead under Serilog).
Fail-loud over fail-silent:
- DM-021: ResolveSiteIdentifierAsync throws on missing site (was silently
substituting a DB id).
- DM-022: dropped transient Pending write — record now lands directly in
InProgress (no UI flicker, one fewer DB write).
- Host-020: LoggerConfigurationFactory emits a Console.Error warning when
both Serilog:MinimumLevel and ScadaLink:Logging:MinimumLevel are set
(ScadaLink remains truth per Host-011).
- SnF-022: NotifyCachedCallObserverAsync logs Warning on unparseable
TrackedOperationId (was silently dropping).
- SnF-023: empty siteId default replaced with $unknown-site sentinel
+ constructor normalisation.
Correctness:
- SCA-001: SupervisorStrategy XML rewritten to match actual
DefaultDecider/Restart semantics (was claiming Resume).
- SCA-003: OnUpsertAsync now restamps IngestedAtUtc on every upsert.
- SR-021: HandleDeployArtifacts now dispatches an internal
ApplyArtifactDataConnectionsToDcl message after the SQLite write so
system-wide artifact-deploy data-connection changes go live
immediately (was requiring a site restart).
- SnF-020: RetryParkedMessageAsync captures the parked row BEFORE the
local write so a concurrent delete can't skip standby replication.
Sentinels / naming collisions:
- HM-021: CentralSiteId changed from "central" to "$central"
(uncollideable — leading $ is forbidden in real SiteIdentifiers).
Doc / surface cleanups:
- SEL-018: FailedWriteCount promoted to ISiteEventLogger; XML softened
to "Available for future Health Monitoring integration".
- SnF-019: VERIFY outcome — documented parking-after-DefaultMaxRetries
in Component-StoreAndForward.md + DefaultMaxRetries XML (uniform
cap; maxRetries:0 is the unbounded escape hatch).
- SnF-021: Component-StoreAndForward.md no longer claims the tracking
table lives in SnF — it's in SiteRuntime, the interface is in Commons.
- CLI-020: bundle export response parse guarded with try/catch on
JsonException / KeyNotFoundException / FormatException — emits a
clean INVALID_RESPONSE exit instead of a stack trace.
Config:
- ClusterInfra-013: intent comment added to "catastrophic config" test.
- Host-016: appsettings.Site.json second CentralContactPoints entry
removed (was pointing at the SITE's own port); doc-key explains how
to extend.
- Host-018: NodeName added to both shipped per-role configs (was
causing SourceNode to be null on audit rows).
UI:
- CentralUI-029: replaced JS.InvokeAsync<int>("eval", …) with an ES
module import (new wwwroot/js/browser-time.js).
- CentralUI-032: AuditResultsGrid gains a Previous button backed by a
cursor stack.
10+ new regression tests across the affected projects. Build clean;
all suites green. README regenerated: 6 open (was 33).
Session-to-date: 130 of 136 originally-open Theme findings closed.
13 well-bounded test-coverage gaps closed across 11 test projects.
Net +35 regression tests; no production code changes except the
SiteEventLogger src reference unchanged (W3 redacted only test code).
Test additions:
- CLI-022: CommandTreeTests pinned-count assertion bumped 14→16 and
3 InlineData rows added for the audit + bundle command groups.
- Commons-020: new TransportRecordsTests covers BundleManifest /
ExportSelection / ImportPreview / ImportResolution / ImportResult —
ctor + System.Text.Json round-trip + record-equality (14 tests).
- CD-024: SPLIT-RANGE failure-continuation now under
EnsureLookahead_SecondSplitThrows_LoopAborts_FirstBoundaryStillCommitted
(Skippable MS-SQL fixture); production-shape rowversion delete
asserted by DeleteDeploymentRecord_CurrentRowVersion_StubAttachPath_DeleteSucceeds.
- CentralUI-033: new QueryStringDrillInTests with 4 bUnit cases for
Transport + SiteCalls drill-in / query-string handling.
- DM-024: probe actors (ReconcileProbeActor, SerializationProbeActor,
ArtifactProbeActor) refactored from static fields to per-test instances
(Interlocked on counter) — all 31 callers updated; no production
changes required.
- HM-022: real-time PeriodicTimer test flake fixed by replacing
fixed-budget Task.Delay with a RunLoopUntil poll-until-condition
helper (5s/25ms). Production loop untouched.
- InboundAPI-023: new EndpointExtensionsTests covers the
POST /api/{methodName} composition wiring via TestServer (7 cases:
happy path, missing key 401, unknown method 403, invalid JSON 400,
missing param 400, script-throws 500 sanitised, AuditActorItemKey
stash invariant).
- MgmtSvc-021: 6 new ManagementActorTests cover the Transport bundle
handlers (role gate for Export/Preview/Import, unknown-name
ManagementCommandException, blocker-rejection, dedupe last-write-wins).
- SCA-006: SiteCallQueryRequest_StuckOnly_CursorAtNonStuckBoundary_SkipsToNextStuckRow
pins the missing boundary case.
- SEL-023: stress-test `bool stop` promoted to `volatile bool` for
cross-thread visibility under release/JIT.
Verify-only resolutions:
- NS-024: closed by NS-019 (commit ac96b83 deletion of
NotificationDeliveryService + its test file). No edits needed.
- NotifOutbox-008: FallbackMaxRetries/FallbackRetryDelay are private
forward-compat constants returned only when no SMTP-config row exists
(in which case EmailNotificationDeliveryAdapter returns Permanent,
bypassing the values entirely). Marked Resolved with note.
- Transport-010: Overwrite child-collection sync covered by the T-001/
T-002 tests added in commit e3ca9af; per-IP throttle by
BundleUnlockRateLimiterTests; failed-session retention by
BundleSessionStoreTests; T-009 closed structurally via AsyncLocal.
Marked Resolved by reference.
Build clean; all 11 affected test suites green. README regenerated:
33 open (was 46).
Well-localised perf fixes across 8 modules.
Lock decoupling / SQL streaming:
- AuditLog-005: SqliteAuditWriter gains dedicated read-only _readConnection
(+ _readLock) backed by WAL journal mode. GetBacklogStatsAsync,
ReadPendingAsync, ReadPendingSinceAsync, ReadForwardedAsync no longer
contend with the hot-path INSERT lock — backlog probes on a 30s timer
can't stall the writer under multi-hundred-K Pending backlog.
- SEL-022: dropped Cache=Shared from SiteEventLogger's default connection
string (single-connection logger; mode was dormant config).
Memory / streaming:
- CLI-019: bundle export streams base64 in 1 MB-aligned chunks via
Convert.TryFromBase64Chars straight into the FileStream — no more
full-bundle byte[] allocation.
- CentralUI-031: TransportImport now stages the upload to a per-session
temp file under Path.GetTempPath() (replaces in-memory byte[] field);
page implements IDisposable to delete the temp file on reset / new
upload / dispose. Per-circuit working set drops from ~100 MB to ~80 KB.
N+1 hoisting:
- Transport-008: added ITemplateEngineRepository.GetTemplatesWithChildrenAsync
bulk method; BundleImporter.PreviewAsync calls it once instead of per-
template-name. Single query with .Include(...).AsSplitQuery().
- DM-023: BuildDeployArtifactsCommandAsync's per-site loop now references
a pre-fetched GlobalArtifactSnapshot (shared scripts, external systems,
DB connections, notification lists, SMTP) instead of re-querying per site.
- MgmtSvc-023: HandleQueryDeployments unfiltered branch uses one
GetAllInstancesAsync bulk load + Dictionary<int,int?> lookup (was a
GetInstanceByIdAsync per record).
Small allocations / per-tick rebuilds:
- InboundAPI-019: AuditWriteMiddleware gates EnableBuffering() on
RequestHasBody() so GET/HEAD/DELETE/TRACE/OPTIONS and Content-Length:0
requests skip the FileBufferingReadStream allocation.
- NotifOutbox-006: ResolveAdapters dictionary now cached on
_adaptersCache (built lazily on first sweep) + actor-lifetime
_adaptersScope; ResolveAdapters no longer rebuilds per dispatch tick.
Verify-only:
- Comm-017: Confirmed _inProgressDeployments was deleted by Comm-016 in
commit ac96b83 — marked Resolved with that attribution. No code change.
Doc-correction:
- NS-022: Updated MailKitSmtpClientWrapper XML doc to spell out single-
connection / per-delivery-factory contract (option (b) — transient
client per Send — rejected because it re-handshakes TLS per email).
10+ new regression tests across 8 test projects. Build clean; affected
suites all green. README regenerated: 54 open (was 65).
Each finding is a focused validation guard or upper bound at a trust boundary.
Highlights:
- Commons-015: EncryptionMetadata ctor now validates Algorithm (AES-256-GCM
only), Kdf (PBKDF2-SHA256 only), Iterations ([100k, 10M]), non-null Salt/IV.
- Transport-004: new BundleUnlockRateLimiter (sliding-window, per-key,
singleton) wired into BundleImporter.LoadAsync; over-budget callers see
BundleUnlockRateLimitedException. Per-bundle 3-strike + per-window cap.
- ESG-022: ExternalSystemClient.InvokeHttpAsync allow-lists the documented
GET/POST/PUT/PATCH/DELETE set (case-insensitive); unknown verbs throw.
- SEL-015: SiteEventLogger queue now bounded (10k cap, DropOldest); dropped
events fault their Task and increment FailedWriteCount so the drop is
observable instead of an unbounded memory growth.
- SEL-017: EventLogQueryService clamps caller-supplied PageSize to a new
MaxQueryPageSize cap (default 500) so int.MaxValue can't OOM the host.
- SEL-020: LogEventAsync rejects severities outside {Info, Warning, Error}
(matches SQLite BINARY-collation query filter).
- InboundAPI-020: ContentType "json" check now case-insensitive
(application/JSON no longer slips through as not-json).
- InboundAPI-024: _knownBadMethods capped at 1000 entries (drops new entries
once full); per-request DB lookup remains the correctness path.
- SR-025: HandleSetStaticAttribute validates the attribute name against the
deployed config; unknown names now return Success=false instead of
leaking orphan override rows into the SQLite store.
- TE-021: MoveTemplateAsync runs the sibling-name-collision check at the
destination, mirroring TemplateFolderService.MoveFolderAsync.
- TE-022: LockEnforcer's once-locked-stays-locked rule now also covers
LockedInDerived (was previously only IsLocked).
New regression tests across 8 test projects (EncryptionMetadata, rate
limiter, ESG client allow-list, SEL bounded channel / PageSize clamp /
severity validation, InboundAPI ContentType + bad-methods cap, SiteRT
unknown-attribute, TemplateEngine MoveTemplate + LockedInDerived).
Build clean; affected suites all green. README regenerated: 93 open (was 104).
Note: a separate manual re-run was needed for the SiteEventLogging hunk
because its initial subagent's source edits never landed on disk despite
reporting success (file-collision-style failure mode).
UTC invariant + culture-safety fixes across UI form binding, audit entity
hydrate, and locale-dependent parses. Highlights:
- CentralUI-026/027: AuditFilterBar / SiteCallsReport / NotificationReport /
EventLogs now apply SpecifyKind(Local) + ToUniversalTime() at form submit
so browser-local datetime-local inputs aren't silently treated as UTC.
- Commons-019: AuditEvent.OccurredAtUtc / IngestedAtUtc init-setters
re-tag any incoming DateTime as Kind=Utc, documenting the invariant.
- CD-018: AuditLogEntityTypeConfiguration adds UTC ValueConverters on the
*Utc DateTime columns so EF hydrate yields Kind=Utc (SQL Server's
datetime2 has no Kind metadata, so reads were returning Unspecified).
- CD-020: GetPartitionBoundariesOlderThanAsync now SpecifyKind(Utc) on the
raw-ADO read, matching the existing defence in AuditLogPartitionMaintenance.
- SEL-021: EventLogQueryService.DateTimeOffset.Parse now uses
InvariantCulture + AssumeUniversal | AdjustToUniversal.
- SR-023: Convert.ToDouble in ScriptActor + AlarmActor (4 sites) now
passes InvariantCulture so non-US locales don't mis-parse string values.
- HM-020: CentralHealthAggregator.MarkHeartbeat anchors LastHeartbeatAt to
max(receivedAt, now) on offline→online so a stale receivedAt can't
leave a recovered site one tick from re-going-offline.
3 new tests added (AuditLog UTC converter, AuditFilterBar/EventLogs/
NotificationReport-touching CentralUI tests already cover Apply paths,
heartbeat offline→online). Build clean; ConfigurationDatabase 236,
Commons 330, HealthMonitoring 71, SiteRuntime 301, SiteEventLogging 50,
CentralUI 50 — all green. README regenerated: 104 open (was 112).
Comm-016: delete dead HandleConnectionStateChanged + _debugSubscriptions /
_inProgressDeployments tracking + ConnectionStateChanged message record.
Disconnect detection is owned by the transport layers (gRPC keepalive PING
~25s; Ask-timeout at CommunicationService). Updates the
Component-Communication.md design doc to make that explicit.
SnF-018: NotificationForwarder.DeliverAsync now discards a corrupt buffered
payload (Warning log + return true) instead of returning false and parking
the row — honoring the design's "notifications do not park" invariant.
DM-018: reconciliation no longer force-sets Enabled, preserving an
intentional Disabled state after central failover.
ESG-018: DeliverBufferedAsync (both ExternalSystemClient + DatabaseGateway)
catches JsonException and returns false, turning a corrupt buffered row
into a parked operation instead of a retry-forever poison message.
InboundAPI-022: register ActiveNodeGate as IActiveNodeGate in the Central
DI branch so standby-node gating is actually wired up in production.
NS-019: remove orphaned NotificationDeliveryService /
INotificationDeliveryService / NotificationResult; central notification
delivery now lives entirely in NotificationOutbox.
SEL-016: normalise From/To filters to UTC before ISO-string compare so
non-UTC DateTimeOffset clients no longer get spuriously excluded events.
TE-017: include Description on attributes/alarms and a HashableConnections
projection (protocol, endpoint JSON, failover count) in the revision hash
and DiffService; staleness detection now catches description-only and
connection-endpoint edits.
Transport-001 and Transport-002 (also High) remain Open — they're being
handled in a follow-up batch because both touch BundleImporter.cs and
must serialise.
Re-applies the full 10-category checklist to every src/ project — including
first-time reviews of the four newer components (AuditLog, NotificationOutbox,
SiteCallAudit, Transport) — so the code-reviews/ index reflects today's
codebase rather than the 2026-05-16 baseline. 172 new Open findings (0
Critical, 18 High, 62 Medium, 92 Low); 481 findings total across 23 modules.
regen-readme.py now derives each module's Last reviewed + Commit from its
findings.md header instead of hard-coding 2026-05-16 / 9c60592, so future
single-module re-reviews show their own date in the Module Status table.
Establishes a per-module code review workflow under code-reviews/ and
records the 2026-05-16 baseline review (commit 9c60592): 241 findings
across all src/ modules (6 Critical, 46 High, 100 Medium, 89 Low).
This is the clean starting point for remediation work.