Files
ScadaBridge/code-reviews/README.md
T
Joseph Doherty 819f1b4665 fix(validation): close Theme 3 — 11 input-validation / unbounded-input findings
Each finding is a focused validation guard or upper bound at a trust boundary.
Highlights:
- Commons-015: EncryptionMetadata ctor now validates Algorithm (AES-256-GCM
  only), Kdf (PBKDF2-SHA256 only), Iterations ([100k, 10M]), non-null Salt/IV.
- Transport-004: new BundleUnlockRateLimiter (sliding-window, per-key,
  singleton) wired into BundleImporter.LoadAsync; over-budget callers see
  BundleUnlockRateLimitedException. Per-bundle 3-strike + per-window cap.
- ESG-022: ExternalSystemClient.InvokeHttpAsync allow-lists the documented
  GET/POST/PUT/PATCH/DELETE set (case-insensitive); unknown verbs throw.
- SEL-015: SiteEventLogger queue now bounded (10k cap, DropOldest); dropped
  events fault their Task and increment FailedWriteCount so the drop is
  observable instead of an unbounded memory growth.
- SEL-017: EventLogQueryService clamps caller-supplied PageSize to a new
  MaxQueryPageSize cap (default 500) so int.MaxValue can't OOM the host.
- SEL-020: LogEventAsync rejects severities outside {Info, Warning, Error}
  (matches SQLite BINARY-collation query filter).
- InboundAPI-020: ContentType "json" check now case-insensitive
  (application/JSON no longer slips through as not-json).
- InboundAPI-024: _knownBadMethods capped at 1000 entries (drops new entries
  once full); per-request DB lookup remains the correctness path.
- SR-025: HandleSetStaticAttribute validates the attribute name against the
  deployed config; unknown names now return Success=false instead of
  leaking orphan override rows into the SQLite store.
- TE-021: MoveTemplateAsync runs the sibling-name-collision check at the
  destination, mirroring TemplateFolderService.MoveFolderAsync.
- TE-022: LockEnforcer's once-locked-stays-locked rule now also covers
  LockedInDerived (was previously only IsLocked).

New regression tests across 8 test projects (EncryptionMetadata, rate
limiter, ESG client allow-list, SEL bounded channel / PageSize clamp /
severity validation, InboundAPI ContentType + bad-methods cap, SiteRT
unknown-attribute, TemplateEngine MoveTemplate + LockedInDerived).
Build clean; affected suites all green. README regenerated: 93 open (was 104).

Note: a separate manual re-run was needed for the SiteEventLogging hunk
because its initial subagent's source edits never landed on disk despite
reporting success (file-collision-style failure mode).
2026-05-28 06:58:25 -04:00

193 lines
19 KiB
Markdown

# Code Reviews
Comprehensive, per-module code reviews of the ScadaLink codebase. Each module (one
buildable project under `src/`) has its own folder containing a `findings.md`. This
README is the aggregated index — the single place to see all outstanding work.
> Generated by `regen-readme.py` from the per-module `findings.md` files. Do not
> edit by hand — edit the findings files and re-run the script.
## How it works
- Reviews are performed one module at a time against a fixed checklist.
- Every finding is recorded in the module's `findings.md` with a severity and status.
- Findings are **never deleted** — they are closed by changing their status, keeping
a full audit trail.
- This README aggregates every **pending** finding (`Open` / `In Progress`) across all
modules.
See **[REVIEW-PROCESS.md](REVIEW-PROCESS.md)** for the full procedure: the review
checklist, severity definitions, finding format, and how to mark items resolved.
## Layout
```
code-reviews/
├── README.md # this file — process overview + pending findings
├── REVIEW-PROCESS.md # how to perform a review and track findings
├── regen-readme.py # regenerates this README from the findings files
├── _template/findings.md # copy-this template for a module review
└── <Module>/findings.md # one folder per src/ project
```
## Baseline review — 2026-05-16
All 19 modules were reviewed at commit `9c60592` (241 findings: 6 Critical, 46 High,
100 Medium, 89 Low). The tables below track what remains **open** as findings are
resolved and re-triaged; findings discovered after the baseline are appended to their
module file and counted in **Total**.
| Severity | Open findings |
|----------|---------------|
| Critical | 0 |
| High | 0 |
| Medium | 32 |
| Low | 61 |
| **Total** | **93** |
## Module Status
| Module | Last reviewed | Commit | Open (C/H/M/L) | Open | Total |
|--------|---------------|--------|----------------|------|-------|
| [AuditLog](AuditLog/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/6 | 9 | 11 |
| [CLI](CLI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/3 | 5 | 23 |
| [CentralUI](CentralUI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/5 | 5 | 33 |
| [ClusterInfrastructure](ClusterInfrastructure/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/3 | 3 | 14 |
| [Commons](Commons/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/5 | 5 | 23 |
| [Communication](Communication/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/4 | 5 | 22 |
| [ConfigurationDatabase](ConfigurationDatabase/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/2 | 5 | 24 |
| [DataConnectionLayer](DataConnectionLayer/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/0 | 0 | 22 |
| [DeploymentManager](DeploymentManager/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/4 | 5 | 24 |
| [ExternalSystemGateway](ExternalSystemGateway/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/1 | 3 | 23 |
| [HealthMonitoring](HealthMonitoring/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/3 | 4 | 23 |
| [Host](Host/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/5 | 6 | 22 |
| [InboundAPI](InboundAPI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/2 | 4 | 25 |
| [ManagementService](ManagementService/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/1 | 3 | 23 |
| [NotificationOutbox](NotificationOutbox/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/2 | 3 | 10 |
| [NotificationService](NotificationService/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/2 | 4 | 25 |
| [Security](Security/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/2 | 2 | 21 |
| [SiteCallAudit](SiteCallAudit/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/2 | 4 | 6 |
| [SiteEventLogging](SiteEventLogging/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/3 | 3 | 23 |
| [SiteRuntime](SiteRuntime/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/0 | 2 | 26 |
| [StoreAndForward](StoreAndForward/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/3 | 6 | 24 |
| [TemplateEngine](TemplateEngine/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/0 | 3 | 22 |
| [Transport](Transport/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/3 | 4 | 12 |
## Pending Findings
Every `Open` / `In Progress` finding across all modules, highest severity first.
Resolved findings drop off this list but remain recorded in their module's
`findings.md` (see [REVIEW-PROCESS.md](REVIEW-PROCESS.md) §4–§5). Full detail —
description, location, recommendation — lives in the module's `findings.md`.
### Critical (0)
_None open._
### High (0)
_None open._
### Medium (32)
| ID | Module | Title |
|----|--------|-------|
| AuditLog-001 | [AuditLog](AuditLog/findings.md) | Combined-telemetry transport is plumbed end-to-end but never invoked in production |
| AuditLog-004 | [AuditLog](AuditLog/findings.md) | `SiteAuditReconciliationActor` advances cursor even on per-row insert failure, silently abandoning permanently-failing rows |
| AuditLog-005 | [AuditLog](AuditLog/findings.md) | `GetBacklogStatsAsync` holds the SQLite hot-path write lock for the full COUNT+MIN scan |
| CLI-017 | [CLI](CLI/findings.md) | `BundleCommands.RunBundleCommandAsync` duplicates `ExecuteCommandAsync` and breaks the auth exit-code contract |
| CLI-019 | [CLI](CLI/findings.md) | `bundle export` decodes the entire base64 bundle into memory before writing |
| Communication-017 | [Communication](Communication/findings.md) | `_inProgressDeployments` grows unboundedly — successful deployments are never cleaned up |
| ConfigurationDatabase-016 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | `InboundApiRepository.GetApiKeyByValueAsync` hashes the candidate with the unpeppered `ApiKeyHasher.Default` |
| ConfigurationDatabase-017 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | Stub-attach delete on `DeploymentRecord` bypasses optimistic concurrency |
| ConfigurationDatabase-019 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | `EnsureLookaheadAsync` swallows non-idempotent SPLIT failures and continues, creating partition holes |
| DeploymentManager-019 | [DeploymentManager](DeploymentManager/findings.md) | Lifecycle command timeout writes no audit entry |
| ExternalSystemGateway-019 | [ExternalSystemGateway](ExternalSystemGateway/findings.md) | `HttpClient.Timeout` is not set; `DefaultHttpTimeout` > 100s is silently clipped by the framework default |
| ExternalSystemGateway-020 | [ExternalSystemGateway](ExternalSystemGateway/findings.md) | `JsonElementToParameterValue` silently downcasts non-Int64 JSON numbers to `double`, losing precision for `decimal` SQL parameters on retry |
| HealthMonitoring-017 | [HealthMonitoring](HealthMonitoring/findings.md) | `HealthReportSender` resets interval counters before `Send`; transport failures silently drop the interval's error counts |
| Host-016 | [Host](Host/findings.md) | Site `CentralContactPoints` second entry targets the site's own remoting port |
| InboundAPI-018 | [InboundAPI](InboundAPI/findings.md) | `AuditWriteMiddleware` fires `WriteAsync` as `_ = task` — faulted async writes are unobserved |
| InboundAPI-025 | [InboundAPI](InboundAPI/findings.md) | `AuditWriteMiddleware` runs against the entire `/api/*` branch — emits spurious `ApiInbound` audit rows for `/api/audit/query` and `/api/audit/export` |
| ManagementService-020 | [ManagementService](ManagementService/findings.md) | UpdateSmtpConfig returns and audits the SMTP Credentials field verbatim |
| ManagementService-021 | [ManagementService](ManagementService/findings.md) | Transport bundle handlers have zero test coverage |
| NotificationOutbox-005 | [NotificationOutbox](NotificationOutbox/findings.md) | Ingest persistence inherits the CD-015 check-then-act race; under contention the second writer throws and the site retries |
| NotificationService-020 | [NotificationService](NotificationService/findings.md) | NS-001 fix superseded; `AkkaHostedService` would register two competing `Notification` S&F handlers if both code paths ran |
| NotificationService-024 | [NotificationService](NotificationService/findings.md) | No test affirms the central-only invariant; the orphaned-path tests give a false coverage signal |
| SiteCallAudit-001 | [SiteCallAudit](SiteCallAudit/findings.md) | SupervisorStrategy override is dead code; XML claims Resume that is not enforced |
| SiteCallAudit-003 | [SiteCallAudit](SiteCallAudit/findings.md) | `OnUpsertAsync` does not refresh `IngestedAtUtc`; direct-write callers must remember to stamp it |
| SiteRuntime-021 | [SiteRuntime](SiteRuntime/findings.md) | `HandleDeployArtifacts` updates `DataConnections` in SQLite but never sends `CreateConnectionCommand` to the DCL |
| SiteRuntime-022 | [SiteRuntime](SiteRuntime/findings.md) | `AuditingDbCommand.DbConnection.set` uses reflection to read `AuditingDbConnection._inner` |
| StoreAndForward-019 | [StoreAndForward](StoreAndForward/findings.md) | Notifications park after `DefaultMaxRetries` exhaustion, contradicting "retried until central acks" |
| StoreAndForward-020 | [StoreAndForward](StoreAndForward/findings.md) | `RetryParkedMessageAsync` skips standby replication when the message is deleted between local update and re-load |
| StoreAndForward-021 | [StoreAndForward](StoreAndForward/findings.md) | Design doc claims the Operation Tracking Table lives in StoreAndForward but the implementation is in SiteRuntime |
| TemplateEngine-018 | [TemplateEngine](TemplateEngine/findings.md) | `DiffService` reports no entries for added/removed/changed connections |
| TemplateEngine-019 | [TemplateEngine](TemplateEngine/findings.md) | `TemplateResolver.BuildInheritanceChain` still uses the `0`-as-no-parent sentinel that was removed from `CycleDetector` |
| TemplateEngine-020 | [TemplateEngine](TemplateEngine/findings.md) | `Create*` audit entries are written with `EntityId = "0"` before `SaveChangesAsync` populates the real key |
| Transport-010 | [Transport](Transport/findings.md) | Critical Overwrite + cross-cutting paths uncovered by tests |
### Low (61)
| ID | Module | Title |
|----|--------|-------|
| AuditLog-003 | [AuditLog](AuditLog/findings.md) | `AuditLogIngestActor.OnIngestAsync` uses `CreateScope`, but `OnCachedTelemetryAsync` uses `CreateAsyncScope` — and only one disposes asynchronously |
| AuditLog-006 | [AuditLog](AuditLog/findings.md) | `SqliteAuditWriter.Dispose()` does sync-over-async and may deadlock |
| AuditLog-007 | [AuditLog](AuditLog/findings.md) | `INodeIdentityProvider` resolution mixes `GetService` and `GetRequiredService` inconsistently across `AddAuditLog` registrations |
| AuditLog-008 | [AuditLog](AuditLog/findings.md) | Test composition roots that omit `IAuditPayloadFilter` silently pass UNREDACTED payloads through the writer chain |
| AuditLog-010 | [AuditLog](AuditLog/findings.md) | Actor drain paths accept a `CancellationToken` parameter but always pass `CancellationToken.None` downstream |
| AuditLog-011 | [AuditLog](AuditLog/findings.md) | `AddAuditLogHealthMetricsBridge` and `AddAuditLogCentralMaintenance` are non-idempotent and register hosted services on every call |
| CLI-020 | [CLI](CLI/findings.md) | `bundle export` success-envelope parse is unguarded |
| CLI-021 | [CLI](CLI/findings.md) | `CliConfig.Load` crashes the CLI on a malformed config file |
| CLI-022 | [CLI](CLI/findings.md) | `CommandTreeTests` excludes the two new command groups |
| CentralUI-029 | [CentralUI](CentralUI/findings.md) | `ConfigurationAuditLog` uses `JS.InvokeAsync<int>("eval", ...)` instead of a dedicated JS module |
| CentralUI-030 | [CentralUI](CentralUI/findings.md) | `SandboxConsoleCapture`'s per-call `StringWriter` is not thread-safe under intra-script concurrency |
| CentralUI-031 | [CentralUI](CentralUI/findings.md) | `TransportImport` buffers the full bundle bytes in component state |
| CentralUI-032 | [CentralUI](CentralUI/findings.md) | `AuditResultsGrid` paging is forward-only, no Previous button |
| CentralUI-033 | [CentralUI](CentralUI/findings.md) | Drill-in / query-string code paths for the new Transport + SiteCalls pages are untested |
| ClusterInfrastructure-011 | [ClusterInfrastructure](ClusterInfrastructure/findings.md) | `SectionName` constant is decorative — no binding site references it |
| ClusterInfrastructure-013 | [ClusterInfrastructure](ClusterInfrastructure/findings.md) | Test uses catastrophic config values without an inline-intent comment |
| ClusterInfrastructure-014 | [ClusterInfrastructure](ClusterInfrastructure/findings.md) | `AddClusterInfrastructureActors` is dead surface — no caller, no behaviour |
| Commons-016 | [Commons](Commons/findings.md) | `BundleSession.Locked` uses a magic `3` rather than a named constant |
| Commons-018 | [Commons](Commons/findings.md) | `IOperationTrackingStore` and `IPartitionMaintenance` are at the root of `Interfaces/` instead of `Interfaces/Services/` |
| Commons-020 | [Commons](Commons/findings.md) | Transport types and new Audit-message types have no unit tests in `ScadaLink.Commons.Tests` |
| Commons-021 | [Commons](Commons/findings.md) | `ExternalCallResult.Response` has a benign lazy-parse race |
| Commons-023 | [Commons](Commons/findings.md) | Trailing-optional `SourceNode` on positional records mixes additive evolution patterns |
| Communication-019 | [Communication](Communication/findings.md) | `LoadSiteAddressesFromDb` does not pass a `CancellationToken` to the repository |
| Communication-020 | [Communication](Communication/findings.md) | `SiteAddressCacheLoaded` carries mutable `Dictionary`/`List` types |
| Communication-021 | [Communication](Communication/findings.md) | `SiteStreamGrpcServer.SubscribeInstance` leaks the `StreamRelayActor` if `Subscribe` throws pre-try |
| Communication-022 | [Communication](Communication/findings.md) | `_debugSubscriptions` keyed by caller-supplied correlation ID; reuse silently orphans the prior subscriber |
| ConfigurationDatabase-021 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | `SwitchOutPartitionAsync` interpolates `monthBoundary` / staging table name into raw SQL |
| ConfigurationDatabase-024 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | Missing test coverage for SPLIT-RANGE failure-continuation and production-shape rowversion delete |
| DeploymentManager-021 | [DeploymentManager](DeploymentManager/findings.md) | `ResolveSiteIdentifierAsync` silently substitutes the DB id when the site row is missing |
| DeploymentManager-022 | [DeploymentManager](DeploymentManager/findings.md) | `Pending` and `InProgress` are written back-to-back with no intervening work |
| DeploymentManager-023 | [DeploymentManager](DeploymentManager/findings.md) | `BuildDeployArtifactsCommandAsync` re-queries system-wide artifacts once per site |
| DeploymentManager-024 | [DeploymentManager](DeploymentManager/findings.md) | Test probe actors hold mutable static state across tests |
| ExternalSystemGateway-021 | [ExternalSystemGateway](ExternalSystemGateway/findings.md) | `ApplyAuth` silently sends an unauthenticated request on unknown `AuthType`, empty `AuthConfiguration`, or malformed Basic config |
| HealthMonitoring-018 | [HealthMonitoring](HealthMonitoring/findings.md) | Same counter-reset-before-publish hazard in `CentralHealthReportLoop` |
| HealthMonitoring-021 | [HealthMonitoring](HealthMonitoring/findings.md) | `CentralSiteId = "central"` reserved constant silently collides with a real site named "central" |
| HealthMonitoring-022 | [HealthMonitoring](HealthMonitoring/findings.md) | `CentralHealthReportLoopTests` uses real-time `PeriodicTimer` + `Task.Delay`; flake-prone on slow CI |
| Host-018 | [Host](Host/findings.md) | Shipped per-role configs omit `NodeOptions.NodeName`, leaving `SourceNode` null |
| Host-019 | [Host](Host/findings.md) | Migration `StartupRetry` call drops the host `CancellationToken` |
| Host-020 | [Host](Host/findings.md) | `MinimumLevel.Is` silently overrides any operator-set `Serilog:MinimumLevel` |
| Host-021 | [Host](Host/findings.md) | Microsoft `Logging:LogLevel` section in `appsettings.json` is dead config under Serilog |
| Host-022 | [Host](Host/findings.md) | `ParseLevel` silently coerces unrecognised `MinimumLevel` to `Information` |
| InboundAPI-019 | [InboundAPI](InboundAPI/findings.md) | `EnableBuffering()` called unconditionally on every request, including bodyless requests |
| InboundAPI-023 | [InboundAPI](InboundAPI/findings.md) | `EndpointExtensions.HandleInboundApiRequest` composition wiring has no test coverage |
| ManagementService-023 | [ManagementService](ManagementService/findings.md) | HandleQueryDeployments unfiltered branch is N+1 on instance lookup |
| NotificationOutbox-006 | [NotificationOutbox](NotificationOutbox/findings.md) | `ResolveAdapters` rebuilds the `NotificationType → adapter` dictionary on every dispatch sweep |
| NotificationOutbox-008 | [NotificationOutbox](NotificationOutbox/findings.md) | `FallbackMaxRetries` / `FallbackRetryDelay` path is unreachable in production AND untested |
| NotificationService-022 | [NotificationService](NotificationService/findings.md) | `MailKitSmtpClientWrapper` holds a long-lived `SmtpClient`; combined with per-send factory, the design comment about pooling is contradicted |
| NotificationService-025 | [NotificationService](NotificationService/findings.md) | `CredentialRedactor` over-masks: any 4-character credential component is masked anywhere it appears, including unrelated log text |
| Security-020 | [Security](Security/findings.md) | `SecurityOptions` has no startup validation for required fields (`LdapServer`, `LdapSearchBase`) |
| Security-021 | [Security](Security/findings.md) | `RequireHttpsCookie=false` dev opt-out has no warning path — an HTTP production deployment silently transmits the JWT bearer credential in cleartext |
| SiteCallAudit-002 | [SiteCallAudit](SiteCallAudit/findings.md) | Singleton failover does not wait for in-flight async upserts |
| SiteCallAudit-006 | [SiteCallAudit](SiteCallAudit/findings.md) | Stuck-only paging test does not exercise the multi-page boundary with an interleaved non-stuck row at the cursor |
| SiteEventLogging-018 | [SiteEventLogging](SiteEventLogging/findings.md) | `FailedWriteCount` is exposed but never consumed by Health Monitoring |
| SiteEventLogging-022 | [SiteEventLogging](SiteEventLogging/findings.md) | `Cache=Shared` is redundant for a single-connection logger |
| SiteEventLogging-023 | [SiteEventLogging](SiteEventLogging/findings.md) | Concurrent-stress test uses a non-volatile `stop` flag |
| StoreAndForward-022 | [StoreAndForward](StoreAndForward/findings.md) | `NotifyCachedCallObserverAsync` silently drops the entire audit lifecycle when the message id is not a parseable `TrackedOperationId` |
| StoreAndForward-023 | [StoreAndForward](StoreAndForward/findings.md) | `siteId` silently defaults to empty when no `IStoreAndForwardSiteContext` is registered, degrading audit telemetry correlation |
| StoreAndForward-024 | [StoreAndForward](StoreAndForward/findings.md) | `StopAsync` does not wait for an in-flight retry sweep, so disposed dependencies can be touched after shutdown |
| Transport-008 | [Transport](Transport/findings.md) | `PreviewAsync` issues an N+1 `GetTemplateWithChildrenAsync` per matching template name |
| Transport-009 | [Transport](Transport/findings.md) | `IAuditCorrelationContext.BundleImportId` is mutated on the same scoped instance the AuditService reads |
| Transport-012 | [Transport](Transport/findings.md) | "Bundle Import" filter promised in design doc not surfaced in Configuration Audit Log Viewer UI |