Files
ScadaBridge/code-reviews/Transport/findings.md
T
Joseph Doherty 5d2386cc9d fix(transport): close bundle security + plaintext-retention gaps (4 findings)
T-003: move the unlock lockout server-side. The 3-strike counter used to
live in the Razor page only — a second tab / CLI caller could re-upload
the same bytes and grind PBKDF2 indefinitely. The counter now lives in
IBundleSessionStore, keyed by ContentHash, so retries against identical
bundle bytes are throttled regardless of client. BundleLockedException
surfaces the new typed error path.

T-005: bind the manifest's non-derivative fields into AES-GCM AAD. A
SHA-256 of the manifest (with ContentHash + Encryption normalised to
sentinels) is now passed to AesGcm.Encrypt / .Decrypt, so a tampered
SourceEnvironment / ExportedBy / CreatedAtUtc on a stolen bundle yields
an authentication-tag mismatch instead of slipping past the Step-4
typo-resistant confirmation gate.

T-006: cap zip entry count, decompressed length, and compression ratio
in LoadAsync's envelope validator BEFORE any payload is decompressed,
using ZipArchiveEntry.Length / .CompressedLength. New TransportOptions
fields default to 4 entries / 200 MB / 50x ratio.

T-007: clear decrypted plaintext on the ApplyAsync failure path and zero
the buffer on success before removing the session, so a 100 MB
DecryptedContent doesn't sit in memory for the 30-min TTL after a failed
apply. A BundleSessionEvictionService BackgroundService now also drives
EvictExpired periodically so abandoned sessions clear without needing a
fresh Get() call to trigger lazy eviction.

Also resolves NO-010 — the misleading "writer never throws" XML doc was
the same code+comment my prior NO-004 await-the-writer fix already
rewrote.
2026-05-28 04:14:07 -04:00

21 KiB

Code Review — Transport

Field Value
Module src/ScadaLink.Transport
Design doc docs/requirements/Component-Transport.md
Status Reviewed
Last reviewed 2026-05-28
Reviewer claude-agent
Commit reviewed 1eb6e97
Open findings 12

Summary

The Transport module is structurally clean, follows the design doc's pipeline layout (Encryption → Serialization → Export / Import), and has solid lower-tier coverage (encryptor round-trips, manifest validator, dependency resolver, session store, diff engine). The big surface-area concerns cluster around two themes. First, the Overwrite resolution path is structurally incomplete: it updates only the parent entity's scalar fields (e.g. Template.Description / FolderId, ExternalSystem.EndpointUrl / AuthType / AuthConfiguration) and never replaces child collections (attributes, alarms, scripts, external-system methods), silently diverging from both the design doc's audit-row table and operator intent. Second, the 3-strike / per-IP unlock-rate-limit story declared in TransportOptions and the design doc isn't wired into the import service — the only counter is a local field on TransportImport.razor.cs, and MaxUnlockAttemptsPerIpPerHour is referenced nowhere. There are also some smaller integrity-and-resource issues (manifest fields outside ContentHash aren't bound to the encryption envelope, decrypted plaintext lives in the in-memory session for the full TTL on the failure path, and ZIP reads have no entry-count / per-entry decompression cap).

Checklist coverage

# Category Examined Notes
1 Correctness & logic bugs Yes Overwrite paths miss child sync (Transport-001, Transport-002); composition Overwrite intentionally clears (good).
2 Akka.NET conventions Yes No issues found — Transport is service-only, no actors / messages.
3 Concurrency & thread safety Yes IAuditCorrelationContext mutation is documented as not thread-safe (Transport-009); singleton BundleSessionStore w/ ConcurrentDictionary is fine.
4 Error handling & resilience Yes Rollback-failure path is well-considered, but failed sessions are never evicted (Transport-007).
5 Security Yes Unlock lockout + per-IP cap not enforced server-side (Transport-003, Transport-004); manifest fields outside ContentHash are unauthenticated (Transport-005); zip-bomb / per-entry decompression cap missing (Transport-006); secrets travel in plaintext in unencrypted bundles by design but UI-only warning (acceptable per doc).
6 Performance & resource management Yes BundleSession.DecryptedContent retained in memory for 30 min even on failure (Transport-007); PreviewAsync issues N+1 calls to GetTemplateWithChildrenAsync (Transport-008); BundleSerializer.Pack serializes content twice.
7 Design-document adherence Yes Overwrite-doesn't-sync-children contradicts the design doc's audit row table (Transport-001); per-IP-per-hour lockout in §11 not implemented (Transport-004); design says "bundles are not retained server-side after ApplyAsync commits" — but failed bundles are retained until TTL (Transport-007).
8 Code organization & conventions Yes No major issues found — clean separation, POCO DTOs in Serialization/, scoped vs singleton service lifetimes appropriate.
9 Testing coverage Yes Critical gap: no Overwrite-with-modified-children test for Templates or ExternalSystems (Transport-010); no test exercising failed-bundle session retention or per-IP lockout.
10 Documentation & comments Yes XML comments are extensive and accurate; design doc has some staleness (Transport-011, Transport-012).

Findings

Transport-001 — Template Overwrite never syncs attributes / alarms / scripts

Severity High
Category Correctness & logic bugs
Status Open
Location src/ScadaLink.Transport/Import/BundleImporter.cs:844-851

Description

The ResolutionAction.Overwrite branch in ApplyTemplatesAsync only writes Description and FolderId on the existing template and calls UpdateTemplateAsync(ex, …). The bundle DTO's Attributes, Alarms, and Scripts collections are never copied onto the existing entity, so an Overwrite of a template whose child collections changed silently leaves the target's existing children in place. ResolveAlarmScriptLinksAsync then runs against the unmodified existing alarms/scripts and does nothing useful for the Overwrite case. This contradicts the design doc's Configuration Audit Trail table ("Template overwritten → TemplateUpdated + per-field rows (TemplateAttributeAdded, TemplateScriptUpdated, …)") and the operator's mental model — an Overwrite that produces no diff is a footgun. The only integration test (ConflictResolutionTests.Overwrite_replaces_existing_template_description) asserts only on Description, so the regression is not caught.

Recommendation

For the Overwrite branch, replace the existing template's children to match the bundle DTO (delete-then-add or diff-and-merge), then re-run the alarm-script and composition rewire passes against the post-merge state. Emit the per-field audit rows the design doc enumerates. Add an integration test that overwrites a template whose Scripts / Attributes / Alarms differ.

Resolution

Unresolved.

Transport-002 — ExternalSystem Overwrite never syncs methods

Severity High
Category Correctness & logic bugs
Status Open
Location src/ScadaLink.Transport/Import/BundleImporter.cs:1213-1221

Description

ApplyExternalSystemsAsync Overwrite path writes EndpointUrl, AuthType, and AuthConfiguration on the existing ExternalSystemDefinition and calls UpdateExternalSystemAsync. The DTO's Methods collection is never written — any added, removed, or modified method on the incoming bundle silently does not land. Same shape of bug as Transport-001 but on a different entity. The design doc's audit-row table says "External system overwritten → ExternalSystemDefinitionUpdated + per-method rows", confirming methods are expected to round-trip.

Recommendation

Sync Methods on Overwrite via add / update / delete by name (mirroring the diff classification in ArtifactDiff.CompareExternalSystem) and emit the per-method audit rows. Add a test that overwrites an external system whose methods differ.

Resolution

Unresolved.

Transport-003 — Unlock lockout is enforced only client-side; server session is never marked Locked

Severity High
Category Security
Status Resolved
Location src/ScadaLink.Transport/Import/BundleImporter.cs:184-203, src/ScadaLink.CentralUI/Components/Pages/Design/TransportImport.razor.cs:267-309, src/ScadaLink.Commons/Types/Transport/BundleSession.cs:14-16

Description

BundleSession exposes FailedUnlockAttempts and a Locked computed property, and PreviewAsync / ApplyAsync correctly refuse to proceed when session.Locked == true. But for an encrypted bundle, LoadAsync throws CryptographicException before any session is opened, so no session ever holds a non-zero FailedUnlockAttempts. The 3-strike counter lives only in the Blazor page's local _failedUnlockAttempts field; a second tab / circuit / CLI caller bypassing the UI can retry the same uploaded bytes indefinitely because the importer accepts a passphrase against a stream and runs PBKDF2 each call (600 000 iterations / call). The Locked invariant on BundleSession is effectively unreachable — the field is dead code.

Recommendation

Move the lockout into IBundleImporter. Two viable shapes: (a) open a session on the first LoadAsync call (skip the decryption step until a separate UnlockAsync is called) and increment / lock there; (b) keep a per-content-hash counter in the session store, scoped by bundle SHA, so retries against the same bundle bytes are throttled regardless of the UI client. Either way, emit BundleImportUnlockFailed from the service, not from the Razor page. Test that a second concurrent caller cannot side-step the lockout.

Resolution

Unresolved.

Transport-004 — MaxUnlockAttemptsPerIpPerHour option is declared but never enforced

Severity Medium
Category Security
Status Open
Location src/ScadaLink.Transport/TransportOptions.cs:12, docs/requirements/Component-Transport.md §11

Description

TransportOptions.MaxUnlockAttemptsPerIpPerHour defaults to 10 and is documented in the design doc (§11, "Failed-unlock rate limit: per-session 3-strike lockout; per-IP-per-hour cap (default 10, configurable) to deter brute force against a stolen bundle"). A repo-wide grep finds zero readers of the field. There is no IP-keyed rate limiter, no IHttpContextAccessor in the importer, no middleware in Central UI guarding the import endpoints. The documented brute-force defence does not exist in code.

Recommendation

Either implement the per-IP cap (e.g. via Microsoft.AspNetCore.RateLimiting on the TransportImport page and the ManagementActor import command path, keyed on remote-IP for the UI and on authenticated principal for the CLI), or drop the option and the design-doc paragraph if the project is intentionally deferring this. Don't leave a dead-letter option that promises a security control that isn't there.

Resolution

Unresolved.

Transport-005 — Manifest fields outside ContentHash are not bound to the encrypted payload

Severity Medium
Category Security
Status Resolved
Location src/ScadaLink.Transport/Encryption/BundleSecretEncryptor.cs:31-49, src/ScadaLink.Transport/Serialization/ManifestValidator.cs:29-53

Description

AES-GCM is called with no Associated Authenticated Data (AAD). The manifest fields — SourceEnvironment, ExportedBy, ScadaLinkVersion, Summary, Contents, CreatedAtUtc, etc. — are plaintext and only the ContentHash field is checked against the content bytes. An attacker who obtains a bundle can edit any non-ContentHash manifest field (e.g. rewrite the SourceEnvironment displayed in the Step-4 typo-resistant confirmation gate, forge a more recent CreatedAtUtc, lie about ExportedBy) without breaking decryption. The Step-4 confirmation gate the design doc relies on ("User types the source environment name to confirm — typo-resistant gate at the prod boundary") is therefore tamperable.

Recommendation

Pass the SHA-256 of the manifest's canonical bytes (excluding ContentHash and Encryption, or simply the whole manifest minus those two fields) as the associatedData argument to AesGcm.Encrypt / AesGcm.Decrypt. Any tampering of the manifest's other fields then yields an authentication-tag mismatch on decrypt. Same change in the plaintext path can be approximated by extending the hash domain (compute a manifest-and-content hash, or sign the manifest, depending on how far you want to go).

Resolution

Unresolved.

Transport-006 — Bundle ZIP read has no per-entry size cap or entry-count cap (zip-bomb / decompression-bomb)

Severity Medium
Category Security
Status Resolved
Location src/ScadaLink.Transport/Serialization/BundleSerializer.cs:121-156, src/ScadaLink.Transport/Import/BundleImporter.cs:132-143

Description

LoadAsync caps the raw bundle bytes at MaxBundleSizeMb (default 100 MB) before opening the ZIP. But ReadContentBytes calls entry.Open() and CopyTo(MemoryStream) with no per-entry size limit and no defence against compression ratios — a 100 MB DEFLATE-compressed bundle can decompress to gigabytes. There is also no cap on the number of entries iterated; only two known entries are read (manifest.json + content.json/content.enc), but ReadContentBytes does not validate that no extra entries exist or that the expected entry's Length is bounded. A malicious importer-with-RequireAdmin (or a stolen bundle delivered to an admin) can OOM the central node.

Recommendation

Cap each entry's decompressed length explicitly (compare ZipArchiveEntry.Length against a configurable max, or copy into a length-limited stream). Reject bundles whose entry list contains anything other than the known manifest + content entries. Consider also rejecting any compression ratio over ~50x as a defence-in-depth measure.

Resolution

Unresolved.

Transport-007 — Failed import sessions retain decrypted plaintext for the full 30-minute TTL

Severity Medium
Category Performance & resource management
Status Resolved
Location src/ScadaLink.Transport/Import/BundleImporter.cs:614-696, src/ScadaLink.Transport/Import/BundleSessionStore.cs:67-93

Description

ApplyAsync calls _sessionStore.Remove(sessionId) only on the success path (line 614). The catch block re-throws without removing the session, so a failed apply leaves the BundleSession (with DecryptedContent up to ~100 MB) in the in-memory dictionary until the TTL elapses 30 min later (or Get lazily evicts on a separate lookup). Decrypted secrets — DB connection strings, SMTP credentials, external-system auth configs — sit in process memory for that window, accessible to anyone holding the session id. Multiplied across repeated import attempts on the same circuit, this can produce significant memory pressure (10 failed 100 MB imports = 1 GB) and contradicts the design doc's "Bundles are not retained server-side after ApplyAsync commits" statement.

Recommendation

In the ApplyAsync catch block, call _sessionStore.Remove(sessionId) (or at least zero out session.DecryptedContent) before re-throwing. Also clear DecryptedContent from the session on the success path before removing — the buffer is potentially still rooted by a caller-held reference. Consider shortening the TTL when a session is in a known-stuck state. The session store's EvictExpired exists but is only called on demand — wire it to a periodic timer so abandoned sessions clear even without traffic.

Resolution

Unresolved.

Transport-008 — PreviewAsync issues an N+1 GetTemplateWithChildrenAsync per matching template name

Severity Low
Category Performance & resource management
Status Open
Location src/ScadaLink.Transport/Import/BundleImporter.cs:252-272

Description

Building the per-template diff loops over every existing stub returned by GetAllTemplatesAsync and, for any name that matches an incoming DTO, calls GetTemplateWithChildrenAsync(stub.Id) to re-fetch with children. On a target DB with many templates that overlap the bundle this is one round-trip per matching template (often the whole bundle), each query carrying the full attributes/alarms/scripts/compositions joins. The diff itself is read-only and fits a single eager-loaded GetAllTemplatesWithChildrenAsync query.

Recommendation

Add a GetAllTemplatesWithChildrenAsync (or extend GetAllTemplatesAsync with an includeChildren flag) on ITemplateEngineRepository and use it here. The same N+1 appears in ResolveCompositionEdgesAsync (line 1093) for the just-imported templates, but that loop is bounded by the bundle's size and is less of a concern.

Resolution

Unresolved.

Transport-009 — IAuditCorrelationContext.BundleImportId is mutated on the same scoped instance the AuditService reads

Severity Low
Category Concurrency & thread safety
Status Open
Location src/ScadaLink.Transport/Import/BundleImporter.cs:528, 668, 703, src/ScadaLink.ConfigurationDatabase/Services/AuditCorrelationContext.cs

Description

The XML doc on IAuditCorrelationContext correctly notes that mutating BundleImportId is not thread-safe and that concurrent imports inside a single scope would cross-contaminate audit rows. The contract is "Blazor circuit / API request — sequential await chain — single writer". The risk is that this invariant is documentation-only — there is no enforcement (e.g. a mutex on set, or an AsyncLocal<Guid?> impl) and no test exercising a concurrent-callers scenario. A future change that schedules audit writes on a different synchronization context inside the apply transaction (e.g. Task.WhenAll over the Apply helpers) would silently start leaking the id across rows.

Recommendation

Either (a) back BundleImportId with an AsyncLocal<Guid?> so each logical call chain inherits the value and concurrent chains can't trample it, or (b) wrap the apply in a try/finally that snapshots and restores. (b) is closer to the current design. Either way, add an integration test that fires two overlapping ApplyAsync calls and asserts each bundle's rows carry only that bundle's id.

Resolution

Unresolved.

Transport-010 — Critical Overwrite + cross-cutting paths uncovered by tests

Severity Medium
Category Testing coverage
Status Open
Location tests/ScadaLink.Transport.IntegrationTests/ConflictResolutionTests.cs, tests/ScadaLink.Transport.IntegrationTests/Import/BundleImporterApplyTests.cs

Description

The existing tests cover the happy path well (round-trip, semantic-validator gating, rollback even when RollbackAsync itself throws, composition imports), but the per-entity Overwrite resolutions are only spot-tested: ConflictResolutionTests.Overwrite_replaces_existing_template_description asserts on Description only. Specifically missing:

  • Overwrite of a Template whose Attributes / Alarms / Scripts / Compositions diverged from the existing row (would catch Transport-001).
  • Overwrite of an ExternalSystem whose Methods diverged (would catch Transport-002).
  • Overwrite of a NotificationList whose Recipients collection diverged (NotificationList Overwrite does sync recipients via clear+add — needs an asserting test).
  • Concurrent ApplyAsync calls on a shared scope to exercise the IAuditCorrelationContext mutation contract (would catch Transport-009).
  • Per-IP unlock-throttle behaviour (would catch Transport-004).
  • A session that survives a failed Apply (would catch Transport-007).

Recommendation

Add the missing integration tests above. Most can be modelled after ConflictResolutionTests' export-then-mutate-target-then-apply pattern.

Resolution

Unresolved.

Transport-011 — Design doc's Step-1 manifest preview promises decryption-free preview, but LoadAsync reads and validates content before passphrase

Severity Low
Category Documentation & comments
Status Open
Location docs/requirements/Component-Transport.md Import Flow Step 1, src/ScadaLink.Transport/Import/BundleImporter.cs:124-203

Description

The design doc says: "The manifest is plaintext so the import wizard can preview bundle contents and source provenance before the user supplies a passphrase." LoadAsync honours that — but does so by ALWAYS reading and hashing the content blob (encrypted or not) on the first call, regardless of whether the caller has a passphrase. For an encrypted bundle with no passphrase, the code path that surfaces the encrypted-bundle prompt is the ArgumentException thrown at line 195, which has already performed the full manifest parse + content-hash check + read of the encrypted blob. That's fine, but it means there is no cheap "manifest peek" — the UI's "let the user see the manifest before deciding whether to type a passphrase" is at least O(bundle-size) and consumes the full upload buffer each call. The design doc gives a misleading impression of cost.

Recommendation

Either (a) add an explicit ReadManifestAsync(Stream) interface method that skips the content read for the pure preview case, or (b) update the design doc to clarify the full envelope is read on every LoadAsync and the cheap "peek" is conceptual rather than runtime.

Resolution

Unresolved.

Transport-012 — "Bundle Import" filter promised in design doc not surfaced in Configuration Audit Log Viewer UI

Severity Low
Category Documentation & comments
Status Open
Location docs/requirements/Component-Transport.md §Audit Trail, src/ScadaLink.ConfigurationDatabase/Repositories/CentralUiRepository.cs:148

Description

The design doc says: "The existing Configuration Audit Log Viewer gains a Bundle Import filter that surfaces all rows for a given import. The BundleImported summary row links to the filtered view." A repository filter on BundleImportId is wired into CentralUiRepository (line 148), but no UI filter control surfaces it and the BundleImported summary row does not carry a hyperlink in Configuration Audit Log Viewer. This is a documentation-vs-code gap, not a bug in Transport itself, but the spec lives in the Transport doc so it's reasonable to flag.

Recommendation

Either implement the filter dropdown + summary-row link in the Configuration Audit Log Viewer, or note the deferral in the design doc.

Resolution

Unresolved.