ScadaBridge/code-reviews
Joseph Doherty 5d2386cc9d fix(transport): close bundle security + plaintext-retention gaps (4 findings)
T-003: move the unlock lockout server-side. The 3-strike counter used to
live in the Razor page only — a second tab / CLI caller could re-upload
the same bytes and grind PBKDF2 indefinitely. The counter now lives in
IBundleSessionStore, keyed by ContentHash, so retries against identical
bundle bytes are throttled regardless of client. BundleLockedException
surfaces the new typed error path.
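
As an illustration of the T-003 idea, here is a minimal Python sketch of a failure counter held by the store and keyed by a hash of the bundle bytes. It is not the project's C# code: `SessionStore`, `try_unlock`, `verify`, and `BundleLockedError` are hypothetical stand-ins, and resetting the counter on success is an assumption the commit message does not confirm.

```python
import hashlib

MAX_ATTEMPTS = 3  # the 3-strike limit named in the commit message


class BundleLockedError(Exception):
    """Hypothetical stand-in for the typed BundleLockedException."""


class SessionStore:
    """Hypothetical in-memory store that counts failed unlocks per content hash."""

    def __init__(self):
        self._failures = {}

    def try_unlock(self, bundle_bytes, passphrase, verify):
        # Key on the bundle content, not on the client, so a second tab or a
        # CLI caller re-uploading the same bytes hits the same counter.
        key = hashlib.sha256(bundle_bytes).hexdigest()
        if self._failures.get(key, 0) >= MAX_ATTEMPTS:
            # Refuse before running the expensive key derivation at all.
            raise BundleLockedError(key)
        if verify(bundle_bytes, passphrase):
            self._failures.pop(key, None)  # assumption: success clears the count
            return True
        self._failures[key] = self._failures.get(key, 0) + 1
        return False
```

The lock check runs before `verify`, so a locked bundle costs no further key-derivation work regardless of which client retries.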

T-005: bind the manifest's non-derivative fields into AES-GCM AAD. A
SHA-256 of the manifest (with ContentHash + Encryption normalised to
sentinels) is now passed to AesGcm.Encrypt / .Decrypt, so a tampered
SourceEnvironment / ExportedBy / CreatedAtUtc on a stolen bundle yields
an authentication-tag mismatch instead of slipping past the Step-4
typo-resistant confirmation gate.
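
The AAD computation described for T-005 can be sketched in Python as follows. This is an illustration only: the sentinel value, the JSON canonicalisation, and the `manifest_aad` name are assumptions, since the commit message names the hashed fields but not the serialisation.

```python
import hashlib
import json

SENTINEL = "<normalised>"  # hypothetical placeholder value


def manifest_aad(manifest: dict) -> bytes:
    """Return a SHA-256 digest of the manifest with two fields neutralised."""
    normalised = dict(manifest)
    # ContentHash and Encryption are replaced by sentinels, as in the commit
    # message, so only the remaining fields are bound by the digest.
    normalised["ContentHash"] = SENTINEL
    normalised["Encryption"] = SENTINEL
    # Assumed canonical form: sorted keys and compact separators, so equal
    # manifests always hash to equal bytes.
    canonical = json.dumps(normalised, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()
```

The digest would be supplied as the associated data on both encrypt and decrypt. Editing SourceEnvironment, ExportedBy, or CreatedAtUtc then changes the digest, and decryption fails with a tag mismatch.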

T-006: cap zip entry count, decompressed length, and compression ratio
in LoadAsync's envelope validator BEFORE any payload is decompressed,
using ZipArchiveEntry.Length / .CompressedLength. New TransportOptions
fields default to 4 entries / 200 MB / 50x ratio.
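
A rough Python analogue of the T-006 checks, using the sizes declared in the zip directory before any entry is read. `validate_envelope` and the constant names are hypothetical, and applying the 200 MB cap to the summed length of all entries is an assumption; the commit message does not say whether the cap is per entry or total.

```python
import io
import zipfile

MAX_ENTRIES = 4                     # defaults quoted in the commit message
MAX_TOTAL_BYTES = 200 * 1024 * 1024
MAX_RATIO = 50


def validate_envelope(zip_bytes: bytes) -> None:
    """Reject a zip from its declared metadata alone; nothing is decompressed."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        entries = archive.infolist()
        if len(entries) > MAX_ENTRIES:
            raise ValueError("too many entries")
        total = 0
        for info in entries:
            total += info.file_size  # declared decompressed length
            if total > MAX_TOTAL_BYTES:
                raise ValueError("declared decompressed length too large")
            if info.compress_size == 0:
                if info.file_size > 0:
                    raise ValueError("non-empty entry with zero compressed size")
                continue
            if info.file_size > MAX_RATIO * info.compress_size:
                raise ValueError("compression ratio too high")
```

`ZipInfo.file_size` and `ZipInfo.compress_size` play the role that `ZipArchiveEntry.Length` and `.CompressedLength` play in .NET.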

T-007: clear decrypted plaintext on the ApplyAsync failure path and zero
the buffer on success before removing the session, so a 100 MB
DecryptedContent doesn't sit in memory for the 30-min TTL after a failed
apply. A BundleSessionEvictionService BackgroundService now also drives
EvictExpired periodically so abandoned sessions clear without needing a
fresh Get() call to trigger lazy eviction.
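
The T-007 behaviour can be sketched in Python as below. All names are hypothetical, and the sketch only illustrates the control flow: Python gives weaker guarantees than a zeroed .NET byte array, because copies of the data may exist elsewhere in the interpreter.

```python
TTL_SECONDS = 30 * 60  # the 30-minute TTL from the commit message


class Session:
    def __init__(self, plaintext: bytes, now: float):
        self.content = bytearray(plaintext)  # mutable, so it can be overwritten
        self.expires_at = now + TTL_SECONDS

    def wipe(self):
        self.content[:] = bytes(len(self.content))  # overwrite in place with zeros
        self.content = bytearray()                  # then drop the buffer


class SessionStore:
    def __init__(self):
        self._sessions = {}

    def add(self, session_id, plaintext, now):
        self._sessions[session_id] = Session(plaintext, now)

    def apply(self, session_id, apply_fn):
        session = self._sessions[session_id]
        try:
            apply_fn(session.content)
        except Exception:
            session.wipe()  # failure path: plaintext must not wait for the TTL
            raise
        session.wipe()      # success path: zero first, then remove the session
        del self._sessions[session_id]

    def evict_expired(self, now):
        """Wipe and remove expired sessions; return how many were evicted."""
        expired = [sid for sid, s in self._sessions.items() if s.expires_at <= now]
        for sid in expired:
            self._sessions.pop(sid).wipe()
        return len(expired)
```

A background timer would call `evict_expired` on a fixed interval, which is the role the commit message gives to BundleSessionEvictionService.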

Also resolves NO-010 — the misleading "writer never throws" XML doc was
the same code+comment my prior NO-004 await-the-writer fix already
rewrote.
2026-05-28 04:14:07 -04:00

Code Reviews

Comprehensive, per-module code reviews of the ScadaLink codebase. Each module (one buildable project under src/) has its own folder containing a findings.md. This README is the aggregated index — the single place to see all outstanding work.

Generated by regen-readme.py from the per-module findings.md files. Do not edit by hand — edit the findings files and re-run the script.

How it works

  • Reviews are performed one module at a time against a fixed checklist.
  • Every finding is recorded in the module's findings.md with a severity and status.
  • Findings are never deleted — they are closed by changing their status, keeping a full audit trail.
  • This README aggregates every pending finding (Open / In Progress) across all modules.

See REVIEW-PROCESS.md for the full procedure: the review checklist, severity definitions, finding format, and how to mark items resolved.
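
The aggregation rule above (keep every finding, list only the pending ones, highest severity first) can be sketched in Python. This is not the contents of regen-readme.py: the `Finding` record, the "Resolved" status label, and the tie-break on ID are assumptions made for illustration.

```python
from dataclasses import dataclass

SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]
PENDING_STATUSES = {"Open", "In Progress"}


@dataclass
class Finding:
    id: str
    module: str
    severity: str
    status: str
    title: str


def pending_findings(findings):
    """Return Open / In Progress findings, highest severity first."""
    pending = [f for f in findings if f.status in PENDING_STATUSES]
    # Assumed tie-break: plain string order on the finding ID.
    return sorted(pending, key=lambda f: (SEVERITY_ORDER.index(f.severity), f.id))


def open_counts(findings):
    """Count pending findings per severity, plus a Total entry."""
    counts = {severity: 0 for severity in SEVERITY_ORDER}
    for f in pending_findings(findings):
        counts[f.severity] += 1
    counts["Total"] = sum(counts[s] for s in SEVERITY_ORDER)
    return counts
```

Closed findings stay in the input list and are only filtered out of the view, which matches the rule that findings are never deleted.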

Layout

code-reviews/
├── README.md            # this file — process overview + pending findings
├── REVIEW-PROCESS.md     # how to perform a review and track findings
├── regen-readme.py       # regenerates this README from the findings files
├── _template/findings.md # copy-this template for a module review
└── <Module>/findings.md  # one folder per src/ project

Baseline review — 2026-05-16

All 19 modules were reviewed at commit 9c60592 (241 findings: 6 Critical, 46 High, 100 Medium, 89 Low). The tables below track what remains open as findings are resolved and re-triaged; findings discovered after the baseline are appended to their module file and counted in Total.

| Severity | Open findings |
|----------|---------------|
| Critical | 0 |
| High | 12 |
| Medium | 52 |
| Low | 90 |
| Total | 154 |

Module Status

| Module | Last reviewed | Commit | Open (C/H/M/L) | Open | Total |
|--------|---------------|--------|----------------|------|-------|
| AuditLog | 2026-05-28 | 1eb6e97 | 0/0/3/8 | 11 | 11 |
| CLI | 2026-05-28 | 1eb6e97 | 0/0/2/4 | 6 | 23 |
| CentralUI | 2026-05-28 | 1eb6e97 | 0/0/2/5 | 7 | 33 |
| ClusterInfrastructure | 2026-05-28 | 1eb6e97 | 0/0/0/4 | 4 | 14 |
| Commons | 2026-05-28 | 1eb6e97 | 0/0/3/6 | 9 | 23 |
| Communication | 2026-05-28 | 1eb6e97 | 0/1/1/5 | 7 | 22 |
| ConfigurationDatabase | 2026-05-28 | 1eb6e97 | 0/1/4/5 | 10 | 24 |
| DataConnectionLayer | 2026-05-28 | 1eb6e97 | 0/1/4/0 | 5 | 22 |
| DeploymentManager | 2026-05-28 | 1eb6e97 | 0/1/1/5 | 7 | 24 |
| ExternalSystemGateway | 2026-05-28 | 1eb6e97 | 0/1/2/3 | 6 | 23 |
| HealthMonitoring | 2026-05-28 | 1eb6e97 | 0/0/2/5 | 7 | 23 |
| Host | 2026-05-28 | 1eb6e97 | 0/0/2/5 | 7 | 22 |
| InboundAPI | 2026-05-28 | 1eb6e97 | 0/1/3/4 | 8 | 25 |
| ManagementService | 2026-05-28 | 1eb6e97 | 0/0/2/2 | 4 | 23 |
| NotificationOutbox | 2026-05-28 | 1eb6e97 | 0/0/2/3 | 5 | 10 |
| NotificationService | 2026-05-28 | 1eb6e97 | 0/1/2/3 | 6 | 25 |
| Security | 2026-05-28 | 1eb6e97 | 0/0/0/2 | 2 | 21 |
| SiteCallAudit | 2026-05-28 | 1eb6e97 | 0/0/2/4 | 6 | 6 |
| SiteEventLogging | 2026-05-28 | 1eb6e97 | 0/1/2/6 | 9 | 23 |
| SiteRuntime | 2026-05-28 | 1eb6e97 | 0/0/4/3 | 7 | 26 |
| StoreAndForward | 2026-05-28 | 1eb6e97 | 0/1/3/3 | 7 | 24 |
| TemplateEngine | 2026-05-28 | 1eb6e97 | 0/1/4/1 | 6 | 22 |
| Transport | 2026-05-28 | 1eb6e97 | 0/2/2/4 | 8 | 12 |
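
Each row's Open (C/H/M/L) breakdown should add up to its Open count, and Open can never exceed Total. A small Python check of that arithmetic is sketched below; `row_is_consistent` is a hypothetical helper, not part of regen-readme.py.

```python
def row_is_consistent(breakdown: str, open_count: int, total: int) -> bool:
    """Check one Module Status row: C/H/M/L sums to Open, and Open <= Total."""
    parts = [int(p) for p in breakdown.split("/")]
    if len(parts) != 4:
        raise ValueError("expected four counts in C/H/M/L order")
    return sum(parts) == open_count and open_count <= total


# Two rows copied from the table above.
rows = [
    ("AuditLog", "0/0/3/8", 11, 11),
    ("Transport", "0/2/2/4", 8, 12),
]
for module, breakdown, open_count, total in rows:
    assert row_is_consistent(breakdown, open_count, total), module
```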

Pending Findings

Every Open / In Progress finding across all modules, highest severity first. Resolved findings drop off this list but remain recorded in their module's findings.md (see REVIEW-PROCESS.md §4–§5). Full detail — description, location, recommendation — lives in the module's findings.md.

Critical (0)

None open.

High (12)

| ID | Module | Title |
|----|--------|-------|
| Communication-016 | Communication | HandleConnectionStateChanged is dead code — the documented disconnect-cleanup workflow never fires |
| ConfigurationDatabase-015 | ConfigurationDatabase | NotificationOutboxRepository.InsertIfNotExistsAsync is a check-then-act race with no duplicate-key catch |
| DataConnectionLayer-018 | DataConnectionLayer | Concurrent subscribes for the same tag from different instances orphan an adapter subscription handle |
| DeploymentManager-018 | DeploymentManager | Reconciliation force-sets Enabled, overwriting an intentional Disabled after central failover |
| ExternalSystemGateway-018 | ExternalSystemGateway | DeliverBufferedAsync lets JsonException propagate, turning a corrupt buffered row into a permanent retry-forever poison message |
| InboundAPI-022 | InboundAPI | IActiveNodeGate has no production registration in Host — standby-node gating is silently disabled in production |
| NotificationService-019 | NotificationService | NotificationDeliveryService and INotificationDeliveryService are orphaned by the central-only redesign |
| SiteEventLogging-016 | SiteEventLogging | From/To filters compare non-normalised ISO 8601 strings against UTC-stored timestamps |
| StoreAndForward-018 | StoreAndForward | Notification corrupt-payload parks the buffered message, contradicting the "notifications do not park" design invariant |
| TemplateEngine-017 | TemplateEngine | Revision hash and diff both ignore Description and Connections, defeating staleness detection for real deployment changes |
| Transport-001 | Transport | Template Overwrite never syncs attributes / alarms / scripts |
| Transport-002 | Transport | ExternalSystem Overwrite never syncs methods |

Medium (52)

| ID | Module | Title |
|----|--------|-------|
| AuditLog-001 | AuditLog | Combined-telemetry transport is plumbed end-to-end but never invoked in production |
| AuditLog-004 | AuditLog | SiteAuditReconciliationActor advances cursor even on per-row insert failure, silently abandoning permanently-failing rows |
| AuditLog-005 | AuditLog | GetBacklogStatsAsync holds the SQLite hot-path write lock for the full COUNT+MIN scan |
| CLI-017 | CLI | BundleCommands.RunBundleCommandAsync duplicates ExecuteCommandAsync and breaks the auth exit-code contract |
| CLI-019 | CLI | bundle export decodes the entire base64 bundle into memory before writing |
| CentralUI-026 | CentralUI | AuditFilterBar From/To filters treat browser-local datetimes as UTC |
| CentralUI-027 | CentralUI | Same UTC misinterpretation in SiteCallsReport, NotificationReport, and EventLogs |
| Commons-015 | Commons | EncryptionMetadata accepts any algorithm string and any iteration count |
| Commons-017 | Commons | Component-Commons.md is significantly stale (audit enums, new entities, new repositories, new service interfaces, new folders) |
| Commons-019 | Commons | New *Utc-suffixed DateTime columns on AuditEvent / SiteCall are not enforced as UTC; inconsistent with Notification's DateTimeOffset |
| Communication-017 | Communication | _inProgressDeployments grows unboundedly — successful deployments are never cleaned up |
| ConfigurationDatabase-016 | ConfigurationDatabase | InboundApiRepository.GetApiKeyByValueAsync hashes the candidate with the unpeppered ApiKeyHasher.Default |
| ConfigurationDatabase-017 | ConfigurationDatabase | Stub-attach delete on DeploymentRecord bypasses optimistic concurrency |
| ConfigurationDatabase-018 | ConfigurationDatabase | DateTime-typed *Utc columns on AuditEvent / SiteCall carry no DateTimeKind enforcement |
| ConfigurationDatabase-019 | ConfigurationDatabase | EnsureLookaheadAsync swallows non-idempotent SPLIT failures and continues, creating partition holes |
| DataConnectionLayer-019 | DataConnectionLayer | OpcUaDataConnection._subscriptionHandles is a plain Dictionary<,> mutated from concurrent thread-pool continuations |
| DataConnectionLayer-020 | DataConnectionLayer | HandleSubscribeCompleted double-counts _totalSubscribed when a previously-unresolved tag is resolved by a different instance's subscribe |
| DataConnectionLayer-021 | DataConnectionLayer | HandleSubscribeCompleted re-creates and leaks _subscriptionsByInstance entry when the instance unsubscribed mid-flight |
| DataConnectionLayer-022 | DataConnectionLayer | HandleSubscribeCompleted and HandleTagResolutionFailed reset the tag-resolution retry timer on every call via StartPeriodicTimer, starving the retry under subscribe bursts |
| DeploymentManager-019 | DeploymentManager | Lifecycle command timeout writes no audit entry |
| ExternalSystemGateway-019 | ExternalSystemGateway | HttpClient.Timeout is not set; DefaultHttpTimeout > 100s is silently clipped by the framework default |
| ExternalSystemGateway-020 | ExternalSystemGateway | JsonElementToParameterValue silently downcasts non-Int64 JSON numbers to double, losing precision for decimal SQL parameters on retry |
| HealthMonitoring-017 | HealthMonitoring | HealthReportSender resets interval counters before Send; transport failures silently drop the interval's error counts |
| HealthMonitoring-019 | HealthMonitoring | SiteAuditTelemetryStalled and CentralAuditWriteFailures design-doc metrics have no HealthMonitoring-side surface |
| Host-016 | Host | Site CentralContactPoints second entry targets the site's own remoting port |
| Host-017 | Host | Site-shutdown ordering from REQ-HOST-7 is not wired |
| InboundAPI-018 | InboundAPI | AuditWriteMiddleware fires WriteAsync as _ = task — faulted async writes are unobserved |
| InboundAPI-021 | InboundAPI | ParentExecutionId correlation flows only through Call; attribute reads/writes lose the inbound→site execution-tree link |
| InboundAPI-025 | InboundAPI | AuditWriteMiddleware runs against the entire /api/* branch — emits spurious ApiInbound audit rows for /api/audit/query and /api/audit/export |
| ManagementService-020 | ManagementService | UpdateSmtpConfig returns and audits the SMTP Credentials field verbatim |
| ManagementService-021 | ManagementService | Transport bundle handlers have zero test coverage |
| NotificationOutbox-005 | NotificationOutbox | Ingest persistence inherits the CD-015 check-then-act race; under contention the second writer throws and the site retries |
| NotificationOutbox-007 | NotificationOutbox | NotificationOutboxOptions.DispatchBatchSize, DeliveredKpiWindow, and PurgeInterval are not in the design document |
| NotificationService-020 | NotificationService | NS-001 fix superseded; AkkaHostedService would register two competing Notification S&F handlers if both code paths ran |
| NotificationService-024 | NotificationService | No test affirms the central-only invariant; the orphaned-path tests give a false coverage signal |
| SiteCallAudit-001 | SiteCallAudit | SupervisorStrategy override is dead code; XML claims Resume that is not enforced |
| SiteCallAudit-003 | SiteCallAudit | OnUpsertAsync does not refresh IngestedAtUtc; direct-write callers must remember to stamp it |
| SiteEventLogging-015 | SiteEventLogging | Background write queue is unbounded; can grow without limit under sustained writer slowness |
| SiteEventLogging-017 | SiteEventLogging | Central client's PageSize is unbounded; defeats the "configurable page size" design rationale |
| SiteRuntime-020 | SiteRuntime | Second DeployInstanceCommand arriving during a pending redeploy races the still-terminating actor on its name |
| SiteRuntime-021 | SiteRuntime | HandleDeployArtifacts updates DataConnections in SQLite but never sends CreateConnectionCommand to the DCL |
| SiteRuntime-022 | SiteRuntime | AuditingDbCommand.DbConnection.set uses reflection to read AuditingDbConnection._inner |
| SiteRuntime-024 | SiteRuntime | OperationTrackingStore serialises all writes through one connection + SemaphoreSlim, and Dispose() does sync-over-async |
| StoreAndForward-019 | StoreAndForward | Notifications park after DefaultMaxRetries exhaustion, contradicting "retried until central acks" |
| StoreAndForward-020 | StoreAndForward | RetryParkedMessageAsync skips standby replication when the message is deleted between local update and re-load |
| StoreAndForward-021 | StoreAndForward | Design doc claims the Operation Tracking Table lives in StoreAndForward but the implementation is in SiteRuntime |
| TemplateEngine-018 | TemplateEngine | DiffService reports no entries for added/removed/changed connections |
| TemplateEngine-019 | TemplateEngine | TemplateResolver.BuildInheritanceChain still uses the 0-as-no-parent sentinel that was removed from CycleDetector |
| TemplateEngine-020 | TemplateEngine | Create* audit entries are written with EntityId = "0" before SaveChangesAsync populates the real key |
| TemplateEngine-021 | TemplateEngine | MoveTemplateAsync skips folder cycle and sibling-name-collision validation |
| Transport-004 | Transport | MaxUnlockAttemptsPerIpPerHour option is declared but never enforced |
| Transport-010 | Transport | Critical Overwrite + cross-cutting paths uncovered by tests |

Low (90)

| ID | Module | Title |
|----|--------|-------|
| AuditLog-002 | AuditLog | SupervisorStrategy comments claim Resume semantics but code returns the default Restart decider |
| AuditLog-003 | AuditLog | AuditLogIngestActor.OnIngestAsync uses CreateScope, but OnCachedTelemetryAsync uses CreateAsyncScope — and only one disposes asynchronously |
| AuditLog-006 | AuditLog | SqliteAuditWriter.Dispose() does sync-over-async and may deadlock |
| AuditLog-007 | AuditLog | INodeIdentityProvider resolution mixes GetService and GetRequiredService inconsistently across AddAuditLog registrations |
| AuditLog-008 | AuditLog | Test composition roots that omit IAuditPayloadFilter silently pass UNREDACTED payloads through the writer chain |
| AuditLog-009 | AuditLog | SqliteAuditWriter.DisposeAsync comment claims _disposed is set early, but it isn't |
| AuditLog-010 | AuditLog | Actor drain paths accept a CancellationToken parameter but always pass CancellationToken.None downstream |
| AuditLog-011 | AuditLog | AddAuditLogHealthMetricsBridge and AddAuditLogCentralMaintenance are non-idempotent and register hosted services on every call |
| CLI-020 | CLI | bundle export success-envelope parse is unguarded |
| CLI-021 | CLI | CliConfig.Load crashes the CLI on a malformed config file |
| CLI-022 | CLI | CommandTreeTests excludes the two new command groups |
| CLI-023 | CLI | Component-CLI.md claims audit commands ride POST /management; implementation uses REST endpoints |
| CentralUI-029 | CentralUI | ConfigurationAuditLog uses JS.InvokeAsync<int>("eval", ...) instead of a dedicated JS module |
| CentralUI-030 | CentralUI | SandboxConsoleCapture's per-call StringWriter is not thread-safe under intra-script concurrency |
| CentralUI-031 | CentralUI | TransportImport buffers the full bundle bytes in component state |
| CentralUI-032 | CentralUI | AuditResultsGrid paging is forward-only, no Previous button |
| CentralUI-033 | CentralUI | Drill-in / query-string code paths for the new Transport + SiteCalls pages are untested |
| ClusterInfrastructure-011 | ClusterInfrastructure | SectionName constant is decorative — no binding site references it |
| ClusterInfrastructure-012 | ClusterInfrastructure | Validator accepts SeedNodes.Count == 1 despite design requiring both nodes as seeds |
| ClusterInfrastructure-013 | ClusterInfrastructure | Test uses catastrophic config values without an inline-intent comment |
| ClusterInfrastructure-014 | ClusterInfrastructure | AddClusterInfrastructureActors is dead surface — no caller, no behaviour |
| Commons-016 | Commons | BundleSession.Locked uses a magic 3 rather than a named constant |
| Commons-018 | Commons | IOperationTrackingStore and IPartitionMaintenance are at the root of Interfaces/ instead of Interfaces/Services/ |
| Commons-020 | Commons | Transport types and new Audit-message types have no unit tests in ScadaLink.Commons.Tests |
| Commons-021 | Commons | ExternalCallResult.Response has a benign lazy-parse race |
| Commons-022 | Commons | IAuditCorrelationContext references an unresolvable BundleImporter.ApplyAsync cref; JSON-blob columns have no documented shape |
| Commons-023 | Commons | Trailing-optional SourceNode on positional records mixes additive evolution patterns |
| Communication-018 | Communication | Site heartbeats hard-code IsActive: true regardless of node role |
| Communication-019 | Communication | LoadSiteAddressesFromDb does not pass a CancellationToken to the repository |
| Communication-020 | Communication | SiteAddressCacheLoaded carries mutable Dictionary/List types |
| Communication-021 | Communication | SiteStreamGrpcServer.SubscribeInstance leaks the StreamRelayActor if Subscribe throws pre-try |
| Communication-022 | Communication | _debugSubscriptions keyed by caller-supplied correlation ID; reuse silently orphans the prior subscriber |
| ConfigurationDatabase-020 | ConfigurationDatabase | GetPartitionBoundariesOlderThanAsync returns DateTime with Kind=Unspecified |
| ConfigurationDatabase-021 | ConfigurationDatabase | SwitchOutPartitionAsync interpolates monthBoundary / staging table name into raw SQL |
| ConfigurationDatabase-022 | ConfigurationDatabase | Stale "WP-24 Stub level sufficient for diff/staleness support" XML comment on DeploymentManagerRepository |
| ConfigurationDatabase-023 | ConfigurationDatabase | AuditLog correlation-index name drifts from design doc (IX_AuditLog_CorrelationId vs IX_AuditLog_Correlation) |
| ConfigurationDatabase-024 | ConfigurationDatabase | Missing test coverage for SPLIT-RANGE failure-continuation and production-shape rowversion delete |
| DeploymentManager-020 | DeploymentManager | DeployReconciled audit attributes the action to the prior deployer, not the current user |
| DeploymentManager-021 | DeploymentManager | ResolveSiteIdentifierAsync silently substitutes the DB id when the site row is missing |
| DeploymentManager-022 | DeploymentManager | Pending and InProgress are written back-to-back with no intervening work |
| DeploymentManager-023 | DeploymentManager | BuildDeployArtifactsCommandAsync re-queries system-wide artifacts once per site |
| DeploymentManager-024 | DeploymentManager | Test probe actors hold mutable static state across tests |
| ExternalSystemGateway-021 | ExternalSystemGateway | ApplyAuth silently sends an unauthenticated request on unknown AuthType, empty AuthConfiguration, or malformed Basic config |
| ExternalSystemGateway-022 | ExternalSystemGateway | new HttpMethod(method.HttpMethod) accepts any string at runtime; an invalid HTTP verb fails only at call time |
| ExternalSystemGateway-023 | ExternalSystemGateway | PATCH HTTP method is supported by code but absent from the design doc; body-vs-query decision drifts from the documented set |
| HealthMonitoring-018 | HealthMonitoring | Same counter-reset-before-publish hazard in CentralHealthReportLoop |
| HealthMonitoring-020 | HealthMonitoring | MarkHeartbeat brings offline site back online with a stale LastHeartbeatAt when receivedAt <= existing.LastHeartbeatAt |
| HealthMonitoring-021 | HealthMonitoring | CentralSiteId = "central" reserved constant silently collides with a real site named "central" |
| HealthMonitoring-022 | HealthMonitoring | CentralHealthReportLoopTests uses real-time PeriodicTimer + Task.Delay; flake-prone on slow CI |
| HealthMonitoring-023 | HealthMonitoring | StoreAndForwardBufferDepths_IsEmptyPlaceholder test name is stale; it now covers the default-state contract, not a placeholder |
| Host-018 | Host | Shipped per-role configs omit NodeOptions.NodeName, leaving SourceNode null |
| Host-019 | Host | Migration StartupRetry call drops the host CancellationToken |
| Host-020 | Host | MinimumLevel.Is silently overrides any operator-set Serilog:MinimumLevel |
| Host-021 | Host | Microsoft Logging:LogLevel section in appsettings.json is dead config under Serilog |
| Host-022 | Host | ParseLevel silently coerces unrecognised MinimumLevel to Information |
| InboundAPI-019 | InboundAPI | EnableBuffering() called unconditionally on every request, including bodyless requests |
| InboundAPI-020 | InboundAPI | ContentType.Contains("json") is case-sensitive; application/JSON with no Content-Length skips body parsing |
| InboundAPI-023 | InboundAPI | EndpointExtensions.HandleInboundApiRequest composition wiring has no test coverage |
| InboundAPI-024 | InboundAPI | _knownBadMethods is unbounded — an attacker can grow the cache by spamming distinct method names against the audit middleware path |
| ManagementService-022 | ManagementService | Design doc is stale on Transport bundle commands, /api/audit/* endpoints, and CommandTimeout |
| ManagementService-023 | ManagementService | HandleQueryDeployments unfiltered branch is N+1 on instance lookup |
| NotificationOutbox-006 | NotificationOutbox | ResolveAdapters rebuilds the NotificationType → adapter dictionary on every dispatch sweep |
| NotificationOutbox-008 | NotificationOutbox | FallbackMaxRetries / FallbackRetryDelay path is unreachable in production AND untested |
| NotificationOutbox-009 | NotificationOutbox | StuckAgeThreshold XML-doc says "in-progress notification is re-claimed" — contradicts the design's display-only stuck detection |
| NotificationService-022 | NotificationService | MailKitSmtpClientWrapper holds a long-lived SmtpClient; combined with per-send factory, the design comment about pooling is contradicted |
| NotificationService-023 | NotificationService | XML docs on the orphaned classes still describe the removed site-delivery flow; misleading to maintainers |
| NotificationService-025 | NotificationService | CredentialRedactor over-masks: any 4-character credential component is masked anywhere it appears, including unrelated log text |
| Security-020 | Security | SecurityOptions has no startup validation for required fields (LdapServer, LdapSearchBase) |
| Security-021 | Security | RequireHttpsCookie=false dev opt-out has no warning path — an HTTP production deployment silently transmits the JWT bearer credential in cleartext |
| SiteCallAudit-002 | SiteCallAudit | Singleton failover does not wait for in-flight async upserts |
| SiteCallAudit-004 | SiteCallAudit | Reconciliation puller and daily terminal-purge scheduler still deferred; design-doc drift |
| SiteCallAudit-005 | SiteCallAudit | AckErrorMessage switch arm for SiteUnreachable returns ack message instead of throwing |
| SiteCallAudit-006 | SiteCallAudit | Stuck-only paging test does not exercise the multi-page boundary with an interleaved non-stuck row at the cursor |
| SiteEventLogging-018 | SiteEventLogging | FailedWriteCount is exposed but never consumed by Health Monitoring |
| SiteEventLogging-019 | SiteEventLogging | EventLogPurgeService runs on every host node; design says "active node" |
| SiteEventLogging-020 | SiteEventLogging | severity and eventType are unvalidated free-form strings; doc enumerates a set that is not enforced |
| SiteEventLogging-021 | SiteEventLogging | DateTimeOffset.Parse uses the current culture; can throw on non-default locales |
| SiteEventLogging-022 | SiteEventLogging | Cache=Shared is redundant for a single-connection logger |
| SiteEventLogging-023 | SiteEventLogging | Concurrent-stress test uses a non-volatile stop flag |
| SiteRuntime-023 | SiteRuntime | Convert.ToDouble(value) in trigger and alarm evaluation is locale-sensitive |
| SiteRuntime-025 | SiteRuntime | HandleSetStaticAttribute persists unknown attribute names as static overrides |
| SiteRuntime-026 | SiteRuntime | ReplicationMessages.cs public record types have no XML documentation |
| StoreAndForward-022 | StoreAndForward | NotifyCachedCallObserverAsync silently drops the entire audit lifecycle when the message id is not a parseable TrackedOperationId |
| StoreAndForward-023 | StoreAndForward | siteId silently defaults to empty when no IStoreAndForwardSiteContext is registered, degrading audit telemetry correlation |
| StoreAndForward-024 | StoreAndForward | StopAsync does not wait for an in-flight retry sweep, so disposed dependencies can be touched after shutdown |
| TemplateEngine-022 | TemplateEngine | LockEnforcer.ValidateLockChange enforces "once-locked-stays-locked" for IsLocked but not for LockedInDerived |
| Transport-008 | Transport | PreviewAsync issues an N+1 GetTemplateWithChildrenAsync per matching template name |
| Transport-009 | Transport | IAuditCorrelationContext.BundleImportId is mutated on the same scoped instance the AuditService reads |
| Transport-011 | Transport | Design doc's Step-1 manifest preview promises decryption-free preview, but LoadAsync reads and validates content before passphrase |
| Transport-012 | Transport | "Bundle Import" filter promised in design doc not surfaced in Configuration Audit Log Viewer UI |