ScadaBridge/code-reviews/README.md
Joseph Doherty 2ed5c6c379 fix(concurrency/lifetime): close Theme 5 — 10 concurrency / DI / scope findings
Concurrency hazards, DI lifetime hygiene, and one verify-only confirmation
across 8 modules. Highlights:

Concurrency:
- CentralUI-030: SandboxConsoleCapture writes routed through WriteSynchronized
  locking on the captured StringWriter — intra-script Task fan-out can no
  longer corrupt the per-call buffer.
- Commons-021: ExternalCallResult.Response now backed by Lazy<dynamic?>
  (ExecutionAndPublication) — no more benign double-parse race.
- CD-017: DeploymentManagerRepository.DeleteDeploymentRecordAsync now takes
  an expected RowVersion and seeds entry.OriginalValues so EF emits
  DELETE ... WHERE Id=@id AND RowVersion=@prior; stale RowVersion now
  throws DbUpdateConcurrencyException instead of silent overwrite.
- Transport-009: AuditCorrelationContext.BundleImportId backed by
  AsyncLocal<Guid?> so concurrent imports get per-logical-call isolation
  (was a scoped instance shared via AuditService across runs).

DI / lifetime:
- AuditLog-003: All 3 AuditLog actor handlers switched to CreateAsyncScope
  + await using — async EF disposal no longer swallowed.
- AuditLog-007: INodeIdentityProvider resolution standardised on
  GetRequiredService<>() (was mixed with GetService<>()).
- AuditLog-011: AddAuditLogHealthMetricsBridge guarded by sentinel
  descriptor check — calling twice no longer double-registers the hosted
  service.

Shutdown / supervision:
- SiteCallAudit-002: AkkaHostedService adds a CoordinatedShutdown
  cluster-leave task (drain-site-call-audit-singleton) that issues a
  bounded GracefulStop(10s) so failover waits for in-flight upserts.

Registration safety:
- NS-020: AkkaHostedService now guards NotificationForwarder S&F
  registration with _notificationDeliveryHandlerRegistered + throws
  InvalidOperationException on double-register to make the regression loud.

VERIFY-only closures:
- NotifOutbox-005: Confirmed already closed by CD-015 fix (ac96b83) —
  NotificationOutboxRepository.InsertIfNotExistsAsync uses the same
  raw-SQL IF NOT EXISTS + 2601/2627 swallow pattern; race eliminated.

5+ new regression tests (CentralUI sandbox WhenAll, ExternalCallResult
64-reader Barrier, AuditLog DI idempotency, RowVersion stale-throw,
SiteCallAudit-002 shutdown drain). Build clean; affected suites all green.
README regenerated: 65 open (was 75).
2026-05-28 07:29:41 -04:00


Code Reviews

Comprehensive, per-module code reviews of the ScadaLink codebase. Each module (one buildable project under src/) has its own folder containing a findings.md. This README is the aggregated index — the single place to see all outstanding work.

Generated by regen-readme.py from the per-module findings.md files. Do not edit by hand — edit the findings files and re-run the script.

How it works

  • Reviews are performed one module at a time against a fixed checklist.
  • Every finding is recorded in the module's findings.md with a severity and status.
  • Findings are never deleted — they are closed by changing their status, keeping a full audit trail.
  • This README aggregates every pending finding (Open / In Progress) across all modules.

See REVIEW-PROCESS.md for the full procedure: the review checklist, severity definitions, finding format, and how to mark items resolved.
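The aggregation rule described above (keep every finding, list only the pending ones, count them per severity) can be sketched as follows. This is a minimal illustration, not the logic of regen-readme.py: the `Finding` record, its field names, and the `Resolved` status used in the test data are assumptions, and the real findings.md format is defined in REVIEW-PROCESS.md.

```python
from collections import Counter
from dataclasses import dataclass

# Assumption: status names follow this README's wording ("Open / In Progress").
PENDING_STATUSES = {"Open", "In Progress"}
SEVERITIES = ("Critical", "High", "Medium", "Low")


@dataclass(frozen=True)
class Finding:
    """Hypothetical in-memory form of one findings.md entry."""
    finding_id: str
    module: str
    severity: str
    status: str
    title: str


def pending(findings):
    """Return findings that still need work; closed ones are filtered, never deleted."""
    return [f for f in findings if f.status in PENDING_STATUSES]


def severity_counts(findings):
    """Count pending findings per severity, reporting zero for empty severities."""
    counts = Counter(f.severity for f in pending(findings))
    return {severity: counts.get(severity, 0) for severity in SEVERITIES}
```

Filtering at read time rather than deleting entries is what keeps the audit trail intact: the input list still holds every closed finding after the counts are taken.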

Layout

code-reviews/
├── README.md            # this file — process overview + pending findings
├── REVIEW-PROCESS.md     # how to perform a review and track findings
├── regen-readme.py       # regenerates this README from the findings files
├── _template/findings.md # copy-this template for a module review
└── <Module>/findings.md  # one folder per src/ project
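Given the layout above, one plausible way to discover the per-module findings files is a one-level glob that skips the template folder. The function name and the rule "skip any folder starting with an underscore" are assumptions made for this sketch; regen-readme.py may select modules differently.

```python
from pathlib import Path


def find_findings_files(reviews_root):
    """Map module name to its findings.md path for each <Module>/ folder."""
    root = Path(reviews_root)
    result = {}
    for path in sorted(root.glob("*/findings.md")):
        module = path.parent.name
        # Assumption: underscore-prefixed folders (such as _template) are not modules.
        if module.startswith("_"):
            continue
        result[module] = path
    return result
```

Calling `find_findings_files("code-reviews")` from the repository root would return one entry per module folder and nothing for `_template/`.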

Baseline review — 2026-05-16

All 19 modules were reviewed at commit 9c60592 (241 findings: 6 Critical, 46 High, 100 Medium, 89 Low). The tables below track what remains open as findings are resolved and re-triaged; findings discovered after the baseline are appended to their module file and counted in Total.

| Severity | Open findings |
|----------|---------------|
| Critical | 0 |
| High | 0 |
| Medium | 22 |
| Low | 43 |
| Total | 65 |

Module Status

| Module | Last reviewed | Commit | Open (C/H/M/L) | Open | Total |
|--------|---------------|--------|----------------|------|-------|
| AuditLog | 2026-05-28 | 1eb6e97 | 0/0/2/1 | 3 | 11 |
| CLI | 2026-05-28 | 1eb6e97 | 0/0/1/2 | 3 | 23 |
| CentralUI | 2026-05-28 | 1eb6e97 | 0/0/0/4 | 4 | 33 |
| ClusterInfrastructure | 2026-05-28 | 1eb6e97 | 0/0/0/3 | 3 | 14 |
| Commons | 2026-05-28 | 1eb6e97 | 0/0/0/4 | 4 | 23 |
| Communication | 2026-05-28 | 1eb6e97 | 0/0/1/1 | 2 | 22 |
| ConfigurationDatabase | 2026-05-28 | 1eb6e97 | 0/0/1/2 | 3 | 24 |
| DataConnectionLayer | 2026-05-28 | 1eb6e97 | 0/0/0/0 | 0 | 22 |
| DeploymentManager | 2026-05-28 | 1eb6e97 | 0/0/0/4 | 4 | 24 |
| ExternalSystemGateway | 2026-05-28 | 1eb6e97 | 0/0/1/1 | 2 | 23 |
| HealthMonitoring | 2026-05-28 | 1eb6e97 | 0/0/0/2 | 2 | 23 |
| Host | 2026-05-28 | 1eb6e97 | 0/0/1/3 | 4 | 22 |
| InboundAPI | 2026-05-28 | 1eb6e97 | 0/0/1/2 | 3 | 25 |
| ManagementService | 2026-05-28 | 1eb6e97 | 0/0/2/1 | 3 | 23 |
| NotificationOutbox | 2026-05-28 | 1eb6e97 | 0/0/0/2 | 2 | 10 |
| NotificationService | 2026-05-28 | 1eb6e97 | 0/0/1/2 | 3 | 25 |
| Security | 2026-05-28 | 1eb6e97 | 0/0/0/1 | 1 | 21 |
| SiteCallAudit | 2026-05-28 | 1eb6e97 | 0/0/2/1 | 3 | 6 |
| SiteEventLogging | 2026-05-28 | 1eb6e97 | 0/0/0/3 | 3 | 23 |
| SiteRuntime | 2026-05-28 | 1eb6e97 | 0/0/2/0 | 2 | 26 |
| StoreAndForward | 2026-05-28 | 1eb6e97 | 0/0/3/2 | 5 | 24 |
| TemplateEngine | 2026-05-28 | 1eb6e97 | 0/0/3/0 | 3 | 22 |
| Transport | 2026-05-28 | 1eb6e97 | 0/0/1/2 | 3 | 12 |
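The Module Status figures are internally redundant: each Open (C/H/M/L) cell should sum to that row's Open count, and the per-severity columns should add up to the severity summary. A small consistency check along those lines might look like the sketch below. The function names are hypothetical and nothing here is taken from regen-readme.py.

```python
def parse_open_breakdown(cell):
    """Split an 'Open (C/H/M/L)' cell such as '0/0/2/1' into four integers."""
    parts = cell.split("/")
    if len(parts) != 4:
        raise ValueError(f"expected C/H/M/L, got {cell!r}")
    return tuple(int(part) for part in parts)


def row_is_consistent(breakdown_cell, open_count, total):
    """A row is consistent when the breakdown sums to Open and Open <= Total."""
    return sum(parse_open_breakdown(breakdown_cell)) == open_count <= total


def column_totals(breakdown_cells):
    """Sum each severity column across rows, returning (C, H, M, L)."""
    parsed = [parse_open_breakdown(cell) for cell in breakdown_cells]
    if not parsed:
        return (0, 0, 0, 0)
    return tuple(sum(column) for column in zip(*parsed))
```

For example, the AuditLog row passes as `row_is_consistent("0/0/2/1", 3, 11)`, and running `column_totals` over all 23 rows should reproduce the 0 / 0 / 22 / 43 summary.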

Pending Findings

Every Open / In Progress finding across all modules, highest severity first. Resolved findings drop off this list but remain recorded in their module's findings.md (see REVIEW-PROCESS.md §4–§5). Full detail — description, location, recommendation — lives in the module's findings.md.
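The "highest severity first" ordering can be sketched as a sort on a composite key. The tie-breakers used here (module name compared as a plain case-sensitive string, then the numeric suffix of the ID) are inferred from the order of the rows listed in this README, not taken from regen-readme.py, so treat them as assumptions.

```python
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def sort_key(severity, finding_id):
    """Order by severity rank, then module name, then finding number."""
    module, _, number = finding_id.rpartition("-")
    return (SEVERITY_RANK[severity], module, int(number))


def order_pending(rows):
    """Sort (severity, finding_id) pairs the way the pending lists are laid out."""
    return sorted(rows, key=lambda row: sort_key(*row))
```

Because the module comparison is case-sensitive, `CLI` sorts ahead of `CentralUI` (uppercase `L` precedes lowercase `e`), which matches the row order in the Module Status table.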

Critical (0)

None open.

High (0)

None open.

Medium (22)

| ID | Module | Title |
|----|--------|-------|
| AuditLog-001 | AuditLog | Combined-telemetry transport is plumbed end-to-end but never invoked in production |
| AuditLog-005 | AuditLog | GetBacklogStatsAsync holds the SQLite hot-path write lock for the full COUNT+MIN scan |
| CLI-019 | CLI | bundle export decodes the entire base64 bundle into memory before writing |
| Communication-017 | Communication | _inProgressDeployments grows unboundedly — successful deployments are never cleaned up |
| ConfigurationDatabase-016 | ConfigurationDatabase | InboundApiRepository.GetApiKeyByValueAsync hashes the candidate with the unpeppered ApiKeyHasher.Default |
| ExternalSystemGateway-020 | ExternalSystemGateway | JsonElementToParameterValue silently downcasts non-Int64 JSON numbers to double, losing precision for decimal SQL parameters on retry |
| Host-016 | Host | Site CentralContactPoints second entry targets the site's own remoting port |
| InboundAPI-025 | InboundAPI | AuditWriteMiddleware runs against the entire /api/* branch — emits spurious ApiInbound audit rows for /api/audit/query and /api/audit/export |
| ManagementService-020 | ManagementService | UpdateSmtpConfig returns and audits the SMTP Credentials field verbatim |
| ManagementService-021 | ManagementService | Transport bundle handlers have zero test coverage |
| NotificationService-024 | NotificationService | No test affirms the central-only invariant; the orphaned-path tests give a false coverage signal |
| SiteCallAudit-001 | SiteCallAudit | SupervisorStrategy override is dead code; XML claims Resume that is not enforced |
| SiteCallAudit-003 | SiteCallAudit | OnUpsertAsync does not refresh IngestedAtUtc; direct-write callers must remember to stamp it |
| SiteRuntime-021 | SiteRuntime | HandleDeployArtifacts updates DataConnections in SQLite but never sends CreateConnectionCommand to the DCL |
| SiteRuntime-022 | SiteRuntime | AuditingDbCommand.DbConnection.set uses reflection to read AuditingDbConnection._inner |
| StoreAndForward-019 | StoreAndForward | Notifications park after DefaultMaxRetries exhaustion, contradicting "retried until central acks" |
| StoreAndForward-020 | StoreAndForward | RetryParkedMessageAsync skips standby replication when the message is deleted between local update and re-load |
| StoreAndForward-021 | StoreAndForward | Design doc claims the Operation Tracking Table lives in StoreAndForward but the implementation is in SiteRuntime |
| TemplateEngine-018 | TemplateEngine | DiffService reports no entries for added/removed/changed connections |
| TemplateEngine-019 | TemplateEngine | TemplateResolver.BuildInheritanceChain still uses the 0-as-no-parent sentinel that was removed from CycleDetector |
| TemplateEngine-020 | TemplateEngine | Create* audit entries are written with EntityId = "0" before SaveChangesAsync populates the real key |
| Transport-010 | Transport | Critical Overwrite + cross-cutting paths uncovered by tests |

Low (43)

| ID | Module | Title |
|----|--------|-------|
| AuditLog-008 | AuditLog | Test composition roots that omit IAuditPayloadFilter silently pass UNREDACTED payloads through the writer chain |
| CLI-020 | CLI | bundle export success-envelope parse is unguarded |
| CLI-022 | CLI | CommandTreeTests excludes the two new command groups |
| CentralUI-029 | CentralUI | ConfigurationAuditLog uses JS.InvokeAsync<int>("eval", ...) instead of a dedicated JS module |
| CentralUI-031 | CentralUI | TransportImport buffers the full bundle bytes in component state |
| CentralUI-032 | CentralUI | AuditResultsGrid paging is forward-only, no Previous button |
| CentralUI-033 | CentralUI | Drill-in / query-string code paths for the new Transport + SiteCalls pages are untested |
| ClusterInfrastructure-011 | ClusterInfrastructure | SectionName constant is decorative — no binding site references it |
| ClusterInfrastructure-013 | ClusterInfrastructure | Test uses catastrophic config values without an inline-intent comment |
| ClusterInfrastructure-014 | ClusterInfrastructure | AddClusterInfrastructureActors is dead surface — no caller, no behaviour |
| Commons-016 | Commons | BundleSession.Locked uses a magic 3 rather than a named constant |
| Commons-018 | Commons | IOperationTrackingStore and IPartitionMaintenance are at the root of Interfaces/ instead of Interfaces/Services/ |
| Commons-020 | Commons | Transport types and new Audit-message types have no unit tests in ScadaLink.Commons.Tests |
| Commons-023 | Commons | Trailing-optional SourceNode on positional records mixes additive evolution patterns |
| Communication-020 | Communication | SiteAddressCacheLoaded carries mutable Dictionary/List types |
| ConfigurationDatabase-021 | ConfigurationDatabase | SwitchOutPartitionAsync interpolates monthBoundary / staging table name into raw SQL |
| ConfigurationDatabase-024 | ConfigurationDatabase | Missing test coverage for SPLIT-RANGE failure-continuation and production-shape rowversion delete |
| DeploymentManager-021 | DeploymentManager | ResolveSiteIdentifierAsync silently substitutes the DB id when the site row is missing |
| DeploymentManager-022 | DeploymentManager | Pending and InProgress are written back-to-back with no intervening work |
| DeploymentManager-023 | DeploymentManager | BuildDeployArtifactsCommandAsync re-queries system-wide artifacts once per site |
| DeploymentManager-024 | DeploymentManager | Test probe actors hold mutable static state across tests |
| ExternalSystemGateway-021 | ExternalSystemGateway | ApplyAuth silently sends an unauthenticated request on unknown AuthType, empty AuthConfiguration, or malformed Basic config |
| HealthMonitoring-021 | HealthMonitoring | CentralSiteId = "central" reserved constant silently collides with a real site named "central" |
| HealthMonitoring-022 | HealthMonitoring | CentralHealthReportLoopTests uses real-time PeriodicTimer + Task.Delay; flake-prone on slow CI |
| Host-018 | Host | Shipped per-role configs omit NodeOptions.NodeName, leaving SourceNode null |
| Host-020 | Host | MinimumLevel.Is silently overrides any operator-set Serilog:MinimumLevel |
| Host-021 | Host | Microsoft Logging:LogLevel section in appsettings.json is dead config under Serilog |
| InboundAPI-019 | InboundAPI | EnableBuffering() called unconditionally on every request, including bodyless requests |
| InboundAPI-023 | InboundAPI | EndpointExtensions.HandleInboundApiRequest composition wiring has no test coverage |
| ManagementService-023 | ManagementService | HandleQueryDeployments unfiltered branch is N+1 on instance lookup |
| NotificationOutbox-006 | NotificationOutbox | ResolveAdapters rebuilds the NotificationType → adapter dictionary on every dispatch sweep |
| NotificationOutbox-008 | NotificationOutbox | FallbackMaxRetries / FallbackRetryDelay path is unreachable in production AND untested |
| NotificationService-022 | NotificationService | MailKitSmtpClientWrapper holds a long-lived SmtpClient; combined with per-send factory, the design comment about pooling is contradicted |
| NotificationService-025 | NotificationService | CredentialRedactor over-masks: any 4-character credential component is masked anywhere it appears, including unrelated log text |
| Security-021 | Security | RequireHttpsCookie=false dev opt-out has no warning path — an HTTP production deployment silently transmits the JWT bearer credential in cleartext |
| SiteCallAudit-006 | SiteCallAudit | Stuck-only paging test does not exercise the multi-page boundary with an interleaved non-stuck row at the cursor |
| SiteEventLogging-018 | SiteEventLogging | FailedWriteCount is exposed but never consumed by Health Monitoring |
| SiteEventLogging-022 | SiteEventLogging | Cache=Shared is redundant for a single-connection logger |
| SiteEventLogging-023 | SiteEventLogging | Concurrent-stress test uses a non-volatile stop flag |
| StoreAndForward-022 | StoreAndForward | NotifyCachedCallObserverAsync silently drops the entire audit lifecycle when the message id is not a parseable TrackedOperationId |
| StoreAndForward-023 | StoreAndForward | siteId silently defaults to empty when no IStoreAndForwardSiteContext is registered, degrading audit telemetry correlation |
| Transport-008 | Transport | PreviewAsync issues an N+1 GetTemplateWithChildrenAsync per matching template name |
| Transport-012 | Transport | "Bundle Import" filter promised in design doc not surfaced in Configuration Audit Log Viewer UI |