55f46e7c92
Well-localised perf fixes across 8 modules.
Lock decoupling / SQL streaming:
- AuditLog-005: SqliteAuditWriter gains dedicated read-only _readConnection
(+ _readLock) backed by WAL journal mode. GetBacklogStatsAsync,
ReadPendingAsync, ReadPendingSinceAsync, ReadForwardedAsync no longer
contend with the hot-path INSERT lock — backlog probes on a 30s timer
can't stall the writer under multi-hundred-K Pending backlog.
- SEL-022: dropped Cache=Shared from SiteEventLogger's default connection
string (single-connection logger; mode was dormant config).
Memory / streaming:
- CLI-019: bundle export streams base64 in 1 MB-aligned chunks via
Convert.TryFromBase64Chars straight into the FileStream — no more
full-bundle byte[] allocation.
- CentralUI-031: TransportImport now stages the upload to a per-session
temp file under Path.GetTempPath() (replaces in-memory byte[] field);
page implements IDisposable to delete the temp file on reset / new
upload / dispose. Per-circuit working set drops from ~100 MB to ~80 KB.
N+1 hoisting:
- Transport-008: added ITemplateEngineRepository.GetTemplatesWithChildrenAsync
bulk method; BundleImporter.PreviewAsync calls it once instead of per-
template-name. Single query with .Include(...).AsSplitQuery().
- DM-023: BuildDeployArtifactsCommandAsync's per-site loop now references
a pre-fetched GlobalArtifactSnapshot (shared scripts, external systems,
DB connections, notification lists, SMTP) instead of re-querying per site.
- MgmtSvc-023: HandleQueryDeployments unfiltered branch uses one
GetAllInstancesAsync bulk load + Dictionary<int,int?> lookup (was a
GetInstanceByIdAsync per record).
Small allocations / per-tick rebuilds:
- InboundAPI-019: AuditWriteMiddleware gates EnableBuffering() on
RequestHasBody() so GET/HEAD/DELETE/TRACE/OPTIONS and Content-Length:0
requests skip the FileBufferingReadStream allocation.
- NotifOutbox-006: ResolveAdapters dictionary now cached on
_adaptersCache (built lazily on first sweep) + actor-lifetime
_adaptersScope; ResolveAdapters no longer rebuilds per dispatch tick.
Verify-only:
- Comm-017: Confirmed _inProgressDeployments was deleted by Comm-016 in
commit ac96b83 — marked Resolved with that attribution. No code change.
Doc-correction:
- NS-022: Updated MailKitSmtpClientWrapper XML doc to spell out single-
connection / per-delivery-factory contract (option (b) — transient
client per Send — rejected because it re-handshakes TLS per email).
10+ new regression tests across 8 test projects. Build clean; affected
suites all green. README regenerated: 54 open (was 65).
1592 lines
88 KiB
Markdown
# Code Review — CentralUI

| Field | Value |
|-------|-------|
| Module | `src/ScadaLink.CentralUI` |
| Design doc | `docs/requirements/Component-CentralUI.md` |
| Status | Reviewed |
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 3 |

## Summary

The Central UI is a sizeable, generally well-structured Blazor Server module:
custom Bootstrap components only (no third-party UI frameworks, as required),
consistent list/form page patterns, careful disposal in most components, and a
thoughtful Roslyn-backed script editor. The most serious problem is the
**Test Run sandbox** (`ScriptAnalysisService.RunInSandboxAsync`): it compiles
and executes arbitrary user C# *in the central process* with no enforcement of
the documented script trust model — the forbidden-API list is only a Monaco
editor diagnostic, never applied before execution — so a Design user can run
`System.IO`/`Process`/`Reflection` code on the central node. Several other
themes recur: (1) per-circuit security drift — site-scoped Deployment claims
are written at login but never read, so site scoping is not enforced anywhere;
(2) Blazor render-thread and disposal hazards — background `Timer` / `Task.Delay`
callbacks and stream callbacks touch component state and `@ref` children that
may already be disposed; (3) process-global mutation (`Console.SetOut`) shared
across concurrent circuits; (4) drift from the design doc on session expiry and
on the "deployment status pushes via SignalR" claim (the page actually polls).
Testing coverage is thin for a module this large: only the script analyzer,
TreeView, schema model, and a few data-connection pages have unit tests; most
pages and the auth bridge are untested.

#### Re-review 2026-05-17 (commit `39d737e`)

All 19 findings from the 2026-05-16 review are confirmed closed. The resolution
batch (`a9bd7ee`..`34588ae`) substantially rewrote the auth bridge, the script
sandbox, several Deployment/Monitoring pages, and the shared component disposal
paths, so this re-review re-examined the post-fix state across all 10 checklist
categories. Six new findings (CentralUI-020 .. 025) were recorded. The most
important is **CentralUI-020**: the two prior fixes interact destructively — the
CentralUI-004 fix made `CookieAuthenticationStateProvider` return a frozen,
constructor-time auth-state snapshot, while the CentralUI-005 fix rewrote
`SessionExpiry.razor` to *poll* that same provider to detect a lapsed session.
Because the snapshot never changes for the life of the circuit, the idle-timeout
redirect can never fire, so the documented idle-logout behaviour is silently
defeated. The remaining new findings are a cross-thread `Dictionary` mutation in
`DebugView`, an unguarded `InvokeAsync` in the new `Deployments` push handler,
and three Low-severity items (residual bare `catch`, magic-string claim
lookups, and the untested `SessionExpiry` polling path).

## Checklist coverage

| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | DebugView cap logic, audit-log timezone, toast race — see findings. |
| 2 | Akka.NET conventions | ☑ | Module is mostly UI; `DebugStreamService` actor usage reviewed (in Communication but driven from here). No actor-convention violations in CentralUI proper. |
| 3 | Concurrency & thread safety | ☑ | `Console.SetOut` global mutation, stream/timer callbacks on non-render threads, toast `_ = Task.Delay`. |
| 4 | Error handling & resilience | ☑ | Broad `catch {}` swallowing, dangling `TaskCompletionSource` on dialog disposal. |
| 5 | Security | ☑ | Sandbox not enforcing trust model (Critical); site scoping never enforced; auth bridge reads stale HttpContext; logout CSRF. |
| 6 | Performance & resource management | ☑ | N+1 site-connection query, repeated `FilteredMessages` recomputation, full-page paginators rendering all page buttons. |
| 7 | Design-document adherence | ☑ | Session expiry diverges from "15-min sliding + 30-min idle"; Deployments polls despite "push via SignalR"; nav exposes Deployment-only pages to all roles. |
| 8 | Code organization & conventions | ☑ | Generally good; options classes absent (no appsettings binding here); no major violations. |
| 9 | Testing coverage | ☑ | Auth, sandbox-run, DebugView, Health, ParkedMessages, most pages untested. |
| 10 | Documentation & comments | ☑ | Comments are accurate and helpful; a few stale claims noted. |

Re-review 2026-05-17 (`39d737e`): all 10 categories were re-examined against the
post-fix source. New findings: category 3 (CentralUI-020, the auth-snapshot vs
session-poll interaction, which is also a design-adherence regression;
CentralUI-021, cross-thread `Dictionary`; CentralUI-022, unguarded `InvokeAsync`),
category 4 (CentralUI-023, residual bare `catch`), category 8 (CentralUI-024,
magic-string claims), and category 9 (CentralUI-025, untested `SessionExpiry`
poll). Categories 1, 2, 5, 6, 7, and 10 produced no new findings.

_Re-review (2026-05-28, `1eb6e97`):_

| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | CentralUI-026 (AuditFilterBar UTC), CentralUI-027 (3 other pages with same UTC bug). |
| 2 | Akka.NET conventions | ☑ | No new findings — module is presentation; `DebugStreamService` actor usage unchanged. |
| 3 | Concurrency & thread safety | ☑ | CentralUI-030 (StringWriter capture buffer not thread-safe under intra-script `Task.WhenAll`). |
| 4 | Error handling & resilience | ☑ | No new findings — the prior CentralUI-018/023 patterns hold. |
| 5 | Security | ☑ | CentralUI-028 (NotificationReport + SiteCallsReport not site-scoped — CentralUI-002 regression on new pages). |
| 6 | Performance & resource management | ☑ | CentralUI-031 (TransportImport buffers full bundle bytes in component state). |
| 7 | Design-document adherence | ☑ | CentralUI-032 (AuditResultsGrid forward-only paging diverges from "keyset paginated" implied bi-directional). |
| 8 | Code organization & conventions | ☑ | CentralUI-029 (`JS.InvokeAsync<int>("eval", ...)` in ConfigurationAuditLog vs the `_content/.../BrowserTime` module pattern). |
| 9 | Testing coverage | ☑ | CentralUI-033 (TransportImport / SiteCallsReport query-string drill-in code paths untested). |
| 10 | Documentation & comments | ☑ | No new findings — code comments accurately describe intent. |

#### Re-review 2026-05-28 (commit `1eb6e97`)

All 25 prior findings remain closed. This re-review re-examined the full
module against the 10-category checklist with attention to the
recently-added Transport export/import wizards (`TransportExport`,
`TransportImport`) and the operational Audit Log page (Bundle B..G). The
most consequential pattern in this pass is that the **CentralUI-008
local-input-treated-as-UTC** bug, fixed for the legacy
`AuditLog.razor` via the `BrowserTime.LocalInputToUtc` helper, has been
silently recreated on every other page that exposes a
`<input type="datetime-local">` filter — `AuditFilterBar` (the new
operational Audit Log filter, CentralUI-026), `SiteCallsReport`,
`NotificationReport`, and `EventLogs` (CentralUI-027). The Audit Log
page CSV export URL therefore mis-shifts the From/To filter window by
the operator's UTC offset, and the same offset bug silently corrupts
audit-style queries on Site Calls / Notification Report / Event Logs.
Second-most consequential is **CentralUI-028**: the new `NotificationReport`
and `SiteCallsReport` pages (both `[Authorize(RequireDeployment)]`) do
not filter their site dropdown or row data through `SiteScopeService`,
and the relay actions (`RetryNotification`/`DiscardNotification`,
`RetrySiteCall`/`DiscardSiteCall`) issue no server-side site-scope
re-check before relaying to the owning site — so a site-scoped Deployment
user can read and act on notifications and cached calls for sites
outside their grant, replicating the original CentralUI-002 defect on
the two pages added after the CentralUI-002 fix landed. The remaining
new findings (CentralUI-029..CentralUI-033) cover a residual `JS.InvokeAsync<int>("eval", ...)`
in `ConfigurationAuditLog`, a single-thread `StringWriter` capture buffer
in the Test Run sandbox (a sandboxed script that uses `Task.WhenAll` can
write concurrently), a `using var` `MemoryStream` followed by `ms.ToArray()`
buffering the full bundle in memory in `TransportImport`, the
`AuditResultsGrid` having no Previous-page control (forward-only navigation,
a UX/design adherence gap), and the untested `TransportImport` /
`SiteCallsReport` query-string drill-in code paths.

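To make the offset bug behind CentralUI-026/027 concrete, here is a minimal sketch of the conversion a `datetime-local` filter value needs before it is used in a UTC query. `LocalInputToUtc` below is a hypothetical stand-in written for illustration, not the module's actual `BrowserTime.LocalInputToUtc`; it assumes the browser offset arrives in the JavaScript `Date.getTimezoneOffset()` convention (UTC minus local time, in minutes).

```csharp
using System;
using System.Globalization;

// Hypothetical helper: converts an <input type="datetime-local"> value to UTC.
// offsetMinutes uses the getTimezoneOffset() convention: UTC minus local, in minutes.
DateTime LocalInputToUtc(string localInput, int offsetMinutes)
{
    var local = DateTime.ParseExact(
        localInput, "yyyy-MM-dd'T'HH:mm", CultureInfo.InvariantCulture, DateTimeStyles.None);
    return DateTime.SpecifyKind(local.AddMinutes(offsetMinutes), DateTimeKind.Utc);
}

// Buggy shape: the local wall-clock value is merely labelled as UTC.
DateTime TreatLocalAsUtc(string localInput)
{
    var local = DateTime.ParseExact(
        localInput, "yyyy-MM-dd'T'HH:mm", CultureInfo.InvariantCulture, DateTimeStyles.None);
    return DateTime.SpecifyKind(local, DateTimeKind.Utc);
}

var input = "2026-05-28T09:00";            // an operator at UTC-5 picks 09:00 local
var correct = LocalInputToUtc(input, 300); // 300 minutes = UTC-5
var buggy = TreatLocalAsUtc(input);

Console.WriteLine($"correct: {correct:u}");
Console.WriteLine($"buggy:   {buggy:u}");
Console.WriteLine($"filter window shifted by {(correct - buggy).TotalHours} h");
```

In this example the buggy path queries a window five hours earlier than the operator intended, which is the "mis-shifts by the operator's UTC offset" effect described above.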
## Findings

### CentralUI-001 — Test Run sandbox executes arbitrary C# with no trust-model enforcement

| | |
|--|--|
| Severity | Critical |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:171-424` |

**Description**

`RunInSandboxAsync` compiles user-supplied script code with `CSharpScript.Create`
and executes it (`script.RunAsync`) directly inside the central process. The
"sandbox" applies only a wall-clock timeout and an output-size cap. It does
**not** enforce the documented script trust model: the forbidden-API set
(`System.IO`, `System.Diagnostics`/`Process`, `System.Reflection`, `System.Net`,
threading) is checked only in `FindForbiddenApiUsages`, which feeds Monaco
editor diagnostics — it is never consulted before `RunInSandboxAsync` executes.
`DefaultOptions` references `typeof(object).Assembly` (the full BCL), so a
Design-role user can submit `System.IO.File.WriteAllText(...)`,
`System.Diagnostics.Process.Start(...)`, reflection, or raw socket code via
`POST /api/script-analysis/run` and it runs with the central host process's
full privileges. The endpoint is gated only by `RequireDesign`. This is a
remote code execution path on the central cluster node.

**Recommendation**

Before executing, run the same forbidden-API analysis used for diagnostics and
reject any script with a `SCADA001`/`SCADA002` (severity-8) marker; additionally
restrict the compilation's metadata references to the curated script API
surface, and ideally execute in an isolated `AssemblyLoadContext`/process with
constrained permissions. Treat the trust model as an execution-time gate, not
an editor hint.

**Resolution**

Resolved 2026-05-16. A Roslyn semantic trust-model gate was added. `RunInSandboxAsync`
now calls `EnforceTrustModel` after compilation and before `script.RunAsync`; if the
script references any forbidden API the run is rejected (`SandboxErrorKind.CompileError`)
with the offending markers, and the same gate is applied to nested shared scripts in
`callSharedFunc`. `FindForbiddenApiUsages` was reworked so it resolves every identifier
(not just the leftmost) against the semantic model and checks types **and** members —
so a fully-qualified call such as `System.IO.File.WriteAllText(...)` is now caught, not
only `using`-directive or bare-type forms. This is a static semantic gate consistent
with the documented trust model; it is not a process sandbox — reflection-based
indirection remains out of its reach, and full isolation would require running scripts
in a separate constrained process (a larger change deliberately not taken here).
Regression tests `RunInSandbox_FullyQualifiedForbiddenApi_IsBlockedBeforeExecution`,
`RunInSandbox_ForbiddenUsingDirective_IsBlockedBeforeExecution` and
`Diagnose_FullyQualifiedForbiddenCall_RaisesSCADA002` fail against the pre-fix code and
pass after; `RunInSandbox_CleanScript_StillRuns` guards against over-blocking. Fixed by
the commit whose message references `CentralUI-001`.

### CentralUI-002 — Site-scoped Deployment permissions are issued but never enforced

| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Auth/AuthEndpoints.cs:63-69`; `src/ScadaLink.CentralUI/Components/Pages/Deployment/*.razor` |

**Description**

Login adds `SiteId` claims (`JwtTokenService.SiteIdClaimType`) for non-system-wide
Deployment users, and the design doc (Component-CentralUI "Responsibilities" and
CLAUDE.md Security & Auth) requires the Deployment role to be site-scoped. A
repo-wide search shows the `SiteId` claim is written at login and **never read
anywhere in CentralUI**. Deployment pages — `DebugView.razor`, `Deployments.razor`,
`InstanceCreate.razor`, `InstanceConfigure.razor`, `Topology.razor`,
`ParkedMessages.razor`, `EventLogs.razor` — list and act on every site with no
filtering by the user's permitted sites. A Deployment user scoped to one site
can deploy to, debug, and manage instances at any site.

**Recommendation**

Enforce site scoping: filter site/instance lists by the user's `SiteId` claims
(or treat the absence of `SiteId` claims as system-wide), and re-check the claim
server-side before any mutating cross-site command (deploy, enable/disable/delete,
debug stream, parked-message retry/discard). A shared helper that reads the
claims from `AuthenticationStateProvider` and exposes "permitted site ids" would
keep this consistent.

**Resolution**

Resolved 2026-05-16. Confirmed: the `SiteId` claim was written at login
(`AuthEndpoints`, `RoleMapper`) but never read by any CentralUI page — site
scoping was unenforced. Added a scoped `SiteScopeService` (`Auth/SiteScopeService.cs`)
that reads the current circuit's `SiteId` claims and exposes `IsSystemWideAsync`,
`PermittedSiteIdsAsync`, `FilterSitesAsync`, and `IsSiteAllowedAsync` (absence of
claims = system-wide, matching `SiteScopeAuthorizationHandler`). All seven
Deployment/Monitoring pages now consume it: `Topology`, `DebugView`,
`InstanceCreate`, `Deployments` filter their site/instance lists; `InstanceConfigure`
rejects direct navigation to an instance on a non-permitted site; `DebugView`,
`InstanceCreate`, and `ParkedMessages` re-check the claim server-side before any
mutating/streaming command. Regression tests: `SiteScopeServiceTests` (6 tests
pinning the helper logic) and `TopologyPageTests.SiteScoping_ScopedDeploymentUser_OnlySeesPermittedSites`
/ `SiteScoping_SystemWideDeploymentUser_SeesAllSites`. Fixed by the commit whose
message references `CentralUI-002`.

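The scoping rule described in the CentralUI-002 resolution (no `SiteId` claims means system-wide, otherwise only the claimed sites are permitted) can be sketched with plain `ClaimsPrincipal` calls. This is an illustration only, not the real `SiteScopeService`: the claim type string `"SiteId"`, the synchronous local functions, and the integer site ids are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Security.Claims;

const string SiteIdClaimType = "SiteId";   // assumed claim type name

// Returns null for a system-wide user (no SiteId claims), else the permitted ids.
HashSet<int>? PermittedSiteIds(ClaimsPrincipal user)
{
    var ids = user.FindAll(SiteIdClaimType)
        .Select(c => int.Parse(c.Value, CultureInfo.InvariantCulture))
        .ToHashSet();
    return ids.Count == 0 ? null : ids;
}

// Server-side re-check to run before any mutating or streaming command.
bool IsSiteAllowed(ClaimsPrincipal user, int siteId)
{
    var permitted = PermittedSiteIds(user);
    return permitted is null || permitted.Contains(siteId);
}

// List filter for site dropdowns and row data.
IEnumerable<int> FilterSites(ClaimsPrincipal user, IEnumerable<int> siteIds)
{
    var permitted = PermittedSiteIds(user);
    return permitted is null ? siteIds : siteIds.Where(id => permitted.Contains(id));
}

var scoped = new ClaimsPrincipal(new ClaimsIdentity(
    new[] { new Claim(SiteIdClaimType, "2"), new Claim(SiteIdClaimType, "5") }, "Cookies"));
var systemWide = new ClaimsPrincipal(new ClaimsIdentity(Array.Empty<Claim>(), "Cookies"));
var allSites = new[] { 1, 2, 3, 5 };

Console.WriteLine(string.Join(",", FilterSites(scoped, allSites)));      // 2,5
Console.WriteLine(string.Join(",", FilterSites(systemWide, allSites)));  // 1,2,3,5
Console.WriteLine(IsSiteAllowed(scoped, 3));                             // False
```

Filtering a list only hides rows from the UI; the `IsSiteAllowed`-style re-check is what stops a hand-crafted request, which is why the resolution applies both.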
### CentralUI-003 — `Console.SetOut`/`SetError` mutates process-global state across concurrent circuits

| | |
|--|--|
| Severity | High |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:359-423` |

**Description**

`RunInSandboxAsync` redirects `Console.Out`/`Console.Error` to a per-call
`StringWriter`, runs the script, then restores them in `finally`. `Console.Out`
is process-global. If two users (two Blazor circuits) run Test Run concurrently,
their captured outputs interleave or cross over, and the `finally` of whichever
finishes first restores `Console.Out` to the *original* writer while the other
run is still executing — so the second run's script output is lost or written
to the real console. `RunInSandboxAsync` is `async` and the script runs on a
thread-pool thread, so concurrent execution is fully expected.

**Recommendation**

Do not redirect process-global `Console`. Provide console capture through the
script globals surface (e.g. a `TextWriter` exposed on `SandboxScriptHost` that
the sandbox API writes to), or serialize Test Run executions with a semaphore if
global redirection must be kept. Capturing per-call without global mutation is
the correct fix.

**Resolution**

Resolved 2026-05-16. Confirmed: `RunInSandboxAsync` redirected the process-global
`Console.Out`/`Console.Error` per call and restored them in `finally`, so a
concurrent run's `finally` could restore the writer while another run was still
executing — the long run silently lost output (reproduced by the regression
test, 74 of 80 expected lines captured). Added `SandboxConsoleCapture`, a routing
`TextWriter` installed into `Console.Out`/`Console.Error` exactly once for the
process; each run pushes its own `StringWriter` onto an `AsyncLocal` capture
scope via `BeginCapture`, so writes are routed per logical call-tree with no
per-run mutation of global `Console` state. `RunInSandboxAsync` now opens the
scope with `using` declarations instead of calling `Console.SetOut`. Regression
tests `RunInSandbox_CapturesConsoleOutput` and
`RunInSandbox_ConcurrentRuns_DoNotCrossContaminateConsoleOutput` fail against the
pre-fix code and pass after. Fixed by the commit whose message references
`CentralUI-003`.

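The `AsyncLocal` routing idea in the CentralUI-003 resolution can be shown in isolation. The sketch below is simplified and is not the real `SandboxConsoleCapture`: a local function `RoutedWriteLine` stands in for the routing `TextWriter`, and `RunCapturedAsync` stands in for a Test Run. Both names are hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Per-flow capture target: each async call tree sees only the value it set.
var currentCapture = new AsyncLocal<StringWriter?>();

// Stand-in for the routing writer: sends a line to the current flow's buffer,
// or to the real console when no capture scope is active.
void RoutedWriteLine(string line)
{
    var target = currentCapture.Value;
    if (target is null) Console.Out.WriteLine(line);
    else target.WriteLine(line);
}

// Stand-in for one Test Run: opens its own capture scope, then writes lines.
async Task<string> RunCapturedAsync(string name, int lines)
{
    var buffer = new StringWriter();
    currentCapture.Value = buffer;      // visible to this async flow only
    for (var i = 0; i < lines; i++)
    {
        RoutedWriteLine($"{name}:{i}");
        await Task.Yield();             // give the two runs a chance to interleave
    }
    return buffer.ToString();
}

var outputs = await Task.WhenAll(RunCapturedAsync("A", 50), RunCapturedAsync("B", 50));
var linesA = outputs[0].Split(Environment.NewLine, StringSplitOptions.RemoveEmptyEntries);
var linesB = outputs[1].Split(Environment.NewLine, StringSplitOptions.RemoveEmptyEntries);

Console.WriteLine($"A: {linesA.Length} lines, all its own: {linesA.All(l => l.StartsWith("A:"))}");
Console.WriteLine($"B: {linesB.Length} lines, all its own: {linesB.All(l => l.StartsWith("B:"))}");
```

Because nothing process-global is reassigned per run, there is no `finally` that can restore a shared writer underneath a run that is still executing.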
### CentralUI-004 — `CookieAuthenticationStateProvider` reads `HttpContext` for the life of the circuit

| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Auth/CookieAuthenticationStateProvider.cs:22-28` |

**Description**

`GetAuthenticationStateAsync` returns `_httpContextAccessor.HttpContext?.User`.
In Blazor Server, `HttpContext` is only valid during the initial HTTP request
that establishes the circuit; for the lifetime of the long-lived SignalR circuit
`IHttpContextAccessor.HttpContext` is `null` (or, worse, a stale/foreign context
if the accessor's `AsyncLocal` leaks). Any later call to
`GetAuthenticationStateAsync` — e.g. an `<AuthorizeView>` re-evaluating, or pages
that call it directly (`Sites.razor`, `Templates.razor`) — then sees an
unauthenticated principal and may render the wrong UI, or returns a stale
identity that never reflects role changes. The class derives from
`ServerAuthenticationStateProvider`, which is designed to be seeded once via
`SetAuthenticationState`; overriding `GetAuthenticationStateAsync` to read
`HttpContext` defeats that design.

**Recommendation**

Capture the authenticated principal once when the circuit is created (e.g. via
the root component / `AuthenticationStateProvider` seeding pattern used by the
Blazor Web App template) and store it on the scoped provider, instead of reading
`IHttpContextAccessor` on every call. Do not depend on `HttpContext` after the
circuit is established.

**Resolution**

Resolved 2026-05-16. Confirmed: `GetAuthenticationStateAsync` read
`_httpContextAccessor.HttpContext?.User` on every call; the provider is
registered `Scoped`, so it is constructed within the initial HTTP request's DI
scope while `HttpContext` is still valid, but every later call (an
`<AuthorizeView>` re-evaluating, or a page calling it directly) over the
long-lived SignalR circuit saw `HttpContext == null` and returned an anonymous
principal. The provider now snapshots the principal once in the constructor into
a cached `Task<AuthenticationState>` and serves that for the life of the
circuit, never touching `IHttpContextAccessor` again. Regression tests
`CookieAuthenticationStateProviderTests.GetAuthenticationStateAsync_StillReturnsUser_AfterHttpContextIsGone`
and `..._IsStableAcrossCalls_IgnoringStaleForeignContext` fail against the
pre-fix code (they would see an anonymous / foreign principal) and pass after.
Fixed by the commit whose message references `CentralUI-004`.

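The difference between re-reading an ambient user on every call and snapshotting it once can be modelled without any Blazor types. In this sketch the nullable `ambientUser` variable is an assumed stand-in for `IHttpContextAccessor.HttpContext?.User`, and `ReadEveryCall` / `ReadSnapshot` are hypothetical names for the pre-fix and post-fix shapes; the real provider returns `Task<AuthenticationState>` rather than a bare principal.

```csharp
using System;
using System.Security.Claims;
using System.Threading.Tasks;

var anonymous = new ClaimsPrincipal(new ClaimsIdentity());

// Stand-in for HttpContext?.User: set during the initial request,
// null once the circuit is running over SignalR.
ClaimsPrincipal? ambientUser = new ClaimsPrincipal(
    new ClaimsIdentity(new[] { new Claim(ClaimTypes.Name, "alice") }, "Cookies"));

// Pre-fix shape: re-reads the ambient user on every call.
Task<ClaimsPrincipal> ReadEveryCall() => Task.FromResult(ambientUser ?? anonymous);

// Post-fix shape: snapshot once while the request is still live, then reuse.
var snapshot = Task.FromResult(ambientUser ?? anonymous);
Task<ClaimsPrincipal> ReadSnapshot() => snapshot;

ambientUser = null;   // the initial HTTP request has ended

var reread = await ReadEveryCall();
var cached = await ReadSnapshot();

Console.WriteLine($"re-read:  authenticated = {reread.Identity?.IsAuthenticated}");
Console.WriteLine($"snapshot: authenticated = {cached.Identity?.IsAuthenticated}");
```

The snapshot keeps the user authenticated after the request is gone, but it also never changes afterwards, which is the property that the later CentralUI-020 finding (polling a frozen snapshot to detect a lapsed session) runs into.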
### CentralUI-005 — Session expiry implementation diverges from the documented policy

| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Auth/AuthEndpoints.cs:47-81`; `src/ScadaLink.CentralUI/Components/Shared/SessionExpiry.razor:18-30` |

**Description**

CLAUDE.md (Security & Auth) specifies "15-minute expiry with sliding refresh,
30-minute idle timeout." `AuthEndpoints` instead sets a single fixed
`expires_at = UtcNow + 30 minutes` claim and a 30-minute cookie `ExpiresUtc`,
with no sliding refresh and no separate idle vs absolute timeout.
`SessionExpiry.razor` schedules a single hard redirect at that fixed time. The
result is a hard 30-minute cap with no sliding renewal — an active user is
logged out mid-session, and there is no 15-minute component at all.

**Recommendation**

Either implement the documented policy (sliding 15-minute token with refresh on
activity, plus a 30-minute idle cutoff) or update the design docs to match the
fixed 30-minute model. The code and the documented decision must agree.

**Resolution**

Resolved 2026-05-16 (commit `<pending>`) — cross-module fix (CentralUI +
Security), explicitly authorized. Root cause confirmed against the source:
`AddCookie` (`ScadaLink.Security/ServiceCollectionExtensions.cs`) set neither
`ExpireTimeSpan` nor `SlidingExpiration`; `AuthEndpoints` stamped a fixed
`expires_at = UtcNow + 30 min` claim and a 30-minute absolute cookie
`ExpiresUtc`; `SessionExpiry.razor` scheduled one hard redirect at that fixed
instant — a hard 30-minute cap, no sliding renewal, no 15-minute component.

**What was implemented — sliding session window.** ASP.NET Core cookie
authentication exposes a single `ExpireTimeSpan` plus a `SlidingExpiration`
flag; it cannot natively model *both* a 15-minute sliding token *and* a separate
30-minute absolute idle cap. The faithful interpretation implemented: the cookie
session window **is** the idle timeout. `AddSecurity` now post-configures the
cookie options with `ExpireTimeSpan = TimeSpan.FromMinutes(SecurityOptions.IdleTimeoutMinutes)`
(default 30, bound from `appsettings` via the existing options pattern, not
hard-coded) and `SlidingExpiration = true`. The middleware therefore re-issues
the cookie on activity once past the halfway mark of the window: an active user
is continually renewed, an idle user is signed out after the 30-minute idle
timeout — exactly the documented "sliding refresh, 30-minute idle timeout". The
separate 15-minute `JwtExpiryMinutes` governs the lifetime of the *embedded JWT*
itself (`JwtTokenService`) — a distinct layer from the cookie session window;
it is not, and per the ASP.NET cookie model cannot be, a second independent
sliding window inside the same cookie. `AuthEndpoints` no longer imposes a
contradictory absolute cap: the `expires_at` claim and the manual cookie
`ExpiresUtc` were removed, and a new `BuildSignInProperties()` helper sets only
`IsPersistent = true` (no `ExpiresUtc`, `AllowRefresh` left unset) so the
middleware owns expiry. `SessionExpiry.razor` no longer reads a fixed
login-time deadline (the `expires_at` claim is gone) and no longer hard-redirects
at a fixed instant: it now polls the authentication state on a recurring
interval and redirects to `/login` only once the sliding cookie has actually
lapsed server-side — so an active user is never logged out mid-session.

Regression tests fail against the pre-fix code and pass after. Security:
`AddSecurity_AuthCookie_UsesSlidingExpiration`,
`AddSecurity_AuthCookie_ExpireTimeSpanMatchesIdleTimeout` (pre-fix
`ExpireTimeSpan` was the 14-day default — confirmed failing), and
`AddSecurity_AuthCookie_ExpireTimeSpanIsConfigurable` (pins the options-pattern
binding). CentralUI: `SessionExpiryPolicyTests.BuildSignInProperties_DoesNotSetFixedAbsoluteExpiry`,
`..._IsPersistent`, `..._AllowsSlidingRefresh` pin that the login sign-in no
longer imposes a fixed absolute cap. `dotnet build ScadaLink.slnx` clean;
`tests/ScadaLink.Security.Tests` 57 passed, `tests/ScadaLink.CentralUI.Tests`
254 passed.

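The "re-issued once past the halfway mark" behaviour described in the CentralUI-005 resolution can be worked through with plain date arithmetic. The function below is a model of that rule as the text states it, not the cookie middleware itself; `NextExpiry` is a hypothetical name, and the 30-minute window is the default quoted above.

```csharp
using System;

// Models sliding renewal for one request:
//   null           -> the window has lapsed, the user is signed out
//   same expiry    -> still in the first half of the window, cookie left as-is
//   later expiry   -> past halfway, a fresh full window is issued from "now"
DateTimeOffset? NextExpiry(DateTimeOffset issuedUtc, DateTimeOffset expiresUtc, DateTimeOffset nowUtc)
{
    if (nowUtc >= expiresUtc) return null;
    var elapsed = nowUtc - issuedUtc;
    var remaining = expiresUtc - nowUtc;
    return remaining < elapsed
        ? nowUtc + (expiresUtc - issuedUtc)
        : expiresUtc;
}

var window = TimeSpan.FromMinutes(30);
var login = new DateTimeOffset(2026, 5, 16, 9, 0, 0, TimeSpan.Zero);
var expires = login + window;

var early = NextExpiry(login, expires, login.AddMinutes(10));  // first half
var late = NextExpiry(login, expires, login.AddMinutes(20));   // past halfway
var idle = NextExpiry(login, expires, login.AddMinutes(31));   // no activity for 31 min

Console.WriteLine($"request at +10 min: expiry {early:HH:mm}");
Console.WriteLine($"request at +20 min: expiry {late:HH:mm}");
Console.WriteLine($"request at +31 min: {(idle is null ? "signed out" : "still valid")}");
```

A request at 09:10 leaves the 09:30 expiry untouched, a request at 09:20 moves it to 09:50, and a first request at 09:31 finds the session already lapsed. An active user therefore never reaches the cap, while an idle one is signed out after 30 minutes.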
### CentralUI-006 — Deployment status page polls every 10s despite the documented SignalR-push design
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Design-document adherence |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/Deployments.razor:196-216` |
|
||
|
||
**Description**
|
||
|
||
Component-CentralUI "Real-Time Updates" states: "Deployment status:
|
||
Pending/in-progress/success/failed transitions push to the UI immediately via
|
||
SignalR (built into Blazor Server). No polling required for deployment
|
||
tracking." `Deployments.razor` instead runs a `Timer` that reloads all
|
||
deployment records and instance names from the database every 10 seconds. This
|
||
is a full N-record + instance-map reload per tick for every open circuit, and
|
||
contradicts the design. It also re-issues two repository round-trips on each
|
||
tick regardless of whether anything changed.
|
||
|
||
**Recommendation**
|
||
|
||
Implement push-based updates (an injected event/observable raised by the
|
||
Deployment Manager that the page subscribes to and renders via
|
||
`InvokeAsync(StateHasChanged)`), or amend the design doc to acknowledge polling.
|
||
If polling is kept as a fallback, fetch only changed/in-progress records.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `<pending>`) — cross-module fix (CentralUI +
|
||
DeploymentManager), explicitly authorized. Root cause confirmed against the
|
||
source: `Deployments.razor` ran a `Timer` (`OnInitializedAsync` → `StartTimer`,
|
||
10s interval) that, every tick and for every open Blazor circuit, reloaded all
|
||
deployment records (`GetAllDeploymentRecordsAsync`) and the full instance map
|
||
(`GetAllInstancesAsync`) — contradicting Component-CentralUI "Real-Time Updates"
|
||
("transitions push to the UI immediately via SignalR … no polling required").
|
||
|
||
**Process/DI topology confirmed.** `ScadaLink.Host/Program.cs` calls both
|
||
`AddDeploymentManager()` (line 75) and `AddCentralUI()` (line 77) on the same
|
||
`builder.Services` — DeploymentManager and the Central UI run **in the same
|
||
central Host process**, so a DI singleton is genuinely shared between the
|
||
DeploymentManager services and the Blazor circuit's scoped components. The
|
||
shared-singleton seam is real; no out-of-process fallback was needed.
|
||
|
||
**What was implemented — push-based updates.** A new
`IDeploymentStatusNotifier` (`ScadaLink.DeploymentManager/IDeploymentStatusNotifier.cs`)
with a C# `event Action<DeploymentStatusChange>` and a small payload
(`DeploymentStatusChange` = deployment id + instance id + new status). Its
implementation `DeploymentStatusNotifier` invokes each subscriber in isolation
and swallows/logs handler exceptions so a faulting circuit cannot break the
deployment pipeline. It is registered as a **singleton** in `AddDeploymentManager`
(`ServiceCollectionExtensions`). Every place `DeploymentService` writes a
`DeploymentRecord` status now raises the notifier: the `Pending` create, the
`InProgress` update, the site-response terminal update, the `Failed` cleanup
write in the catch block, and the `DeploymentManager-006` reconciled-`Success`
write — five call sites via a private `NotifyStatusChange` helper.
`ArtifactDeploymentService` was inspected and writes only
`SystemArtifactDeploymentRecord` rows, which `Deployments.razor` does not
display, so it correctly raises nothing. `Deployments.razor` no longer has a
`Timer`: `OnInitializedAsync` subscribes to `IDeploymentStatusNotifier.StatusChanged`,
the handler reloads via `InvokeAsync(StateHasChanged)` (the notifier event is
raised on the DeploymentManager service thread), and `Dispose` unsubscribes.
Blazor Server pushes the re-render to the browser over its SignalR circuit
automatically — satisfying the documented design. The existing "Pause/Resume
updates" toggle now gates whether incoming push events are acted on, and
"Refresh" still forces a manual reload. CLAUDE.md UI rules kept: Blazor Server +
Bootstrap, custom components, no third-party frameworks.

Regression tests fail against the pre-fix code and pass after. DeploymentManager
(`DeploymentStatusNotifierTests`): `DeployInstanceAsync_RaisesStatusChange_ForEveryRecordStatusWrite`
(pre-fix: no notifier, fails to compile / silent), plus
`NotifyStatusChanged_WithNoSubscribers_DoesNotThrow` and
`NotifyStatusChanged_ThrowingSubscriber_DoesNotBreakOtherSubscribers`;
`ServiceCollectionExtensionsTests.AddDeploymentManager_RegistersDeploymentStatusNotifier_AsSingleton`
pins the shared-singleton seam. CentralUI (`DeploymentsPushUpdateTests`):
`Deployments_DoesNotPoll_HasNoRefreshTimer` (pre-fix: the `_refreshTimer` field
existed — confirmed failing), `Deployments_StatusChange_TriggersReload`, and
`Deployments_Dispose_UnsubscribesFromNotifier`. `dotnet build ScadaLink.slnx`
clean (0 warnings); `tests/ScadaLink.DeploymentManager.Tests` 76 passed,
`tests/ScadaLink.CentralUI.Tests` 257 passed. (`TopologyPageTests`' DI fixture
was also updated to register the new notifier, since it constructs the real
`DeploymentService`.)

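The per-subscriber isolation described above can be sketched roughly as follows. This is an illustration only, not the actual `DeploymentStatusNotifier`: the class name `IsolatingNotifier`, the field types on the payload record, and writing to `Console.Error` in place of an `ILogger` are all assumptions.

```csharp
using System;

// Hypothetical payload shape; the real field types are not shown in this document.
public sealed record DeploymentStatusChange(string DeploymentId, int InstanceId, string NewStatus);

public sealed class IsolatingNotifier
{
    public event Action<DeploymentStatusChange>? StatusChanged;

    public void NotifyStatusChanged(DeploymentStatusChange change)
    {
        // Copy the delegate so a concurrent unsubscribe cannot null it mid-loop.
        var handlers = StatusChanged;
        if (handlers is null) return; // no subscribers: nothing to do

        // Invoke each subscriber separately so one throwing handler
        // does not prevent the remaining handlers from running.
        foreach (Action<DeploymentStatusChange> handler in handlers.GetInvocationList())
        {
            try
            {
                handler(change);
            }
            catch (Exception ex)
            {
                // Stand-in for real logging (assumption).
                Console.Error.WriteLine($"Status subscriber failed: {ex.Message}");
            }
        }
    }
}
```

Raising the event directly (`StatusChanged?.Invoke(change)`) would stop at the first throwing subscriber, which is why the sketch walks `GetInvocationList()` instead.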
### CentralUI-007 — Monitoring nav links to Deployment-only pages are shown to all roles

| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Layout/NavMenu.razor:69-78`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/EventLogs.razor:2`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/ParkedMessages.razor:2` |

**Description**

`NavMenu` renders the "Event Logs" and "Parked Messages" links inside the
all-authenticated-users Monitoring section. The design doc classifies both the
Site Event Log Viewer and Parked Message Management as **Deployment Role**.
Two inconsistencies result: (a) an Admin- or Design-only user sees nav links
they cannot use; (b) the pages themselves are annotated only `[Authorize]`
(any authenticated user), not `[Authorize(Policy = RequireDeployment)]`, so a
non-Deployment user who follows the link is *not* blocked — they can query site
event logs and retry/discard parked messages. The authorization attribute and
the nav visibility both contradict the design.

**Recommendation**

Add `[Authorize(Policy = AuthorizationPolicies.RequireDeployment)]` to
`EventLogs.razor` and `ParkedMessages.razor`, and move their nav links into a
`<AuthorizeView Policy="RequireDeployment">` block (consistent with the Topology
/ Deployments / Debug View links). Confirm Health Dashboard is intentionally
all-roles (it is, per the design).

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed: both `EventLogs.razor` and
`ParkedMessages.razor` carried a bare `[Authorize]`, so any authenticated user
could query site event logs and retry/discard parked messages — contrary to the
design doc's Deployment-Role classification. Both pages now use
`[Authorize(Policy = AuthorizationPolicies.RequireDeployment)]`, and the
"Event Logs" / "Parked Messages" nav links were moved out of the all-roles
Monitoring block into an `<AuthorizeView Policy="RequireDeployment">` (Health
Dashboard stays all-roles, as the design intends). Regression tests
`MonitoringAuthorizationTests.{EventLogsPage,ParkedMessagesPage}_RequiresDeploymentPolicy`
fail against the pre-fix code and pass after;
`HealthDashboard_IsIntentionallyAllAuthenticatedRoles` guards the all-roles page.

### CentralUI-008 — Audit-log date filters treat browser-local datetimes as UTC

| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Monitoring/AuditLog.razor:242-243` |

**Description**

The `From`/`To` filters bind `<input type="datetime-local">` to `DateTime?`
fields. A `datetime-local` input yields the value the user typed in their
*browser-local* time zone. `FetchPage` converts them with
`new DateTimeOffset(_filterFrom.Value, TimeSpan.Zero)` — i.e. it labels the
local wall-clock value as UTC. For any non-UTC user the audit query window is
shifted by their UTC offset, silently returning the wrong rows. CLAUDE.md
mandates UTC throughout, but that requires converting the local input *to* UTC,
not relabelling it.

**Recommendation**

Convert the picked local time to UTC before querying — capture the browser
offset (JS interop) and apply it, or document the inputs as UTC and label them
in the UI. The same issue should be checked in `EventLogs.razor` if it has
time-range filters.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed: `FetchPage` wrapped the
`datetime-local` value with `new DateTimeOffset(value, TimeSpan.Zero)`,
relabelling the browser-local wall-clock value as UTC and shifting the audit
query window by the user's offset. Added a pure helper
`Components/BrowserTime.LocalInputToUtc(DateTime?, int)` that converts a
local-input value to UTC using the browser's `Date.getTimezoneOffset()`;
`AuditLog.razor` now fetches that offset once via JS interop in
`OnAfterRenderAsync` (defaulting to 0/UTC on prerender or a disconnected
circuit) and runs both `from`/`to` filters through the helper. Regression suite
`BrowserTimeTests` (5 tests) fails against the naive relabelling and passes
after — including `LocalInputToUtc_NonUtcBrowser_DoesNotEqualNaiveRelabelling`,
which pins the exact pre-fix bug. `EventLogs.razor` was checked and has no
time-range filters, so it is unaffected.

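To make the convert-versus-relabel distinction concrete, here is a minimal sketch of such a helper. It is not the project's `BrowserTime` implementation: the class name `BrowserTimeSketch` and the `DateTimeOffset?` return type are assumptions, and only the `(DateTime?, int)` parameter shape comes from the resolution text.

```csharp
using System;

public static class BrowserTimeSketch
{
    // offsetMinutes follows JavaScript's Date.getTimezoneOffset():
    // the minutes to ADD to local time to reach UTC
    // (for example 300 for UTC-5, -120 for UTC+2).
    public static DateTimeOffset? LocalInputToUtc(DateTime? localInput, int offsetMinutes)
    {
        if (localInput is null) return null;

        // Treat the input as a kind-less wall-clock value, then shift it to UTC.
        var wallClock = DateTime.SpecifyKind(localInput.Value, DateTimeKind.Unspecified);
        var utc = wallClock.AddMinutes(offsetMinutes);

        // Only now is it correct to label the value with a zero offset.
        return new DateTimeOffset(utc, TimeSpan.Zero);
    }
}
```

For a user in UTC-5 who picks 09:00, the browser reports an offset of 300, so the sketch yields 14:00 UTC, whereas the pre-fix relabelling would have queried from 09:00 UTC.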
### CentralUI-009 — `DebugView` stream callbacks touch a possibly-disposed `ToastNotification`

| | |
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/DebugView.razor:400-409,538-544` |

**Description**

The `onTerminated` callback passed to `DebugStreamService.StartStreamAsync`
captures `_toast` and `this` and runs on an Akka/gRPC thread. If the user
navigates away, `Dispose()` calls `StopStream`, but a stream-termination event
already in flight can still invoke `onTerminated`, which calls
`_toast.ShowError(...)` and `StateHasChanged()` on a disposed component. The
component does not guard callbacks with a disposed flag or a
`CancellationTokenSource`. The same applies to the `onEvent` callbacks at
lines 391-398 that call `InvokeAsync(StateHasChanged)`.

**Recommendation**

Track a `_disposed`/`CancellationTokenSource` on the component, check it at the
top of every stream callback, and stop the stream synchronously before marking
disposed. `InvokeAsync` after disposal throws `ObjectDisposedException`; the
callbacks should no-op once disposed.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed: the `onEvent`/`onTerminated`
callbacks captured `this` and `_toast` and ran on an Akka/gRPC thread with no
disposal guard. Added a `volatile bool _disposed` flag, set first thing in
`Dispose()` before the stream is stopped. Every callback now checks `_disposed`
and no-ops if set; the render dispatch goes through a new `SafeInvokeAsync`
helper that re-checks the flag and swallows `ObjectDisposedException` should the
component be disposed between the guard and the dispatch. Regression tests
`DebugViewDisposalTests.{DebugView_HasDisposalGuardField,
DebugView_Dispose_SetsDisposedFlag_AndIsIdempotent}` pin the observable contract
(the guard field exists; `Dispose()` sets it and is idempotent) — the first
fails against the pre-fix code, which had no `_disposed` field. The Akka-thread
timing race itself is not deterministically reproducible in a unit test:
`DebugStreamService` is a non-virtual concrete class with no seam to inject and
later fire the callbacks, so the closest meaningful tests pin the guard
mechanism rather than the race window.

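The guard-then-recheck shape from this resolution might look like the sketch below. It is deliberately free of Blazor types so it stands alone: `GuardedComponentSketch` is a hypothetical name, the injected `Action` stands in for `InvokeAsync(StateHasChanged)`, and stopping the stream in `Dispose()` is omitted.

```csharp
using System;

public sealed class GuardedComponentSketch : IDisposable
{
    // volatile so a write from the UI thread is visible to the stream thread.
    private volatile bool _disposed;

    // Stand-in for InvokeAsync(StateHasChanged) (assumption).
    private readonly Action _render;

    public GuardedComponentSketch(Action render) => _render = render;

    // Called from a background (stream) thread.
    public void OnStreamEvent()
    {
        if (_disposed) return; // guard at the top of every callback
        SafeInvoke();
    }

    private void SafeInvoke()
    {
        if (_disposed) return; // re-check: disposal may have raced the first guard
        try
        {
            _render();
        }
        catch (ObjectDisposedException)
        {
            // Disposed between the guard and the dispatch; nothing left to update.
        }
    }

    // Setting the flag is the first step of disposal and is safe to repeat.
    public void Dispose() => _disposed = true;
}
```

The flag alone cannot close the window between the check and the dispatch, which is why the `catch` remains as a backstop.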
### CentralUI-010 — `ToastNotification` auto-dismiss continuation runs after component disposal

| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/ToastNotification.razor:62-71,90` |

**Description**

`AddToast` schedules `Task.Delay(dismissMs).ContinueWith(...)` with the result
discarded (`_ =`). The continuation calls `InvokeAsync(StateHasChanged)`. If the
host page is disposed before the 5-second delay elapses (common — navigate away
right after an action), the continuation runs against a disposed component and
`InvokeAsync` throws `ObjectDisposedException` on a thread-pool thread with no
catch, producing an unobserved task exception. `Dispose()` is an empty body and
cancels nothing.

**Recommendation**

Hold a `CancellationTokenSource`, pass its token to `Task.Delay`, cancel it in
`Dispose()`, and guard the continuation. Alternatively wrap the continuation
body in a try/catch for `ObjectDisposedException`.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed: `AddToast` scheduled
`Task.Delay(...).ContinueWith(...)` with no cancellation and `Dispose()` was an
empty body, so the continuation ran `InvokeAsync(StateHasChanged)` against a
disposed component. Added a `CancellationTokenSource _disposalCts` cancelled in
`Dispose()`; the auto-dismiss is now an `AutoDismissAsync` method that awaits
`Task.Delay(dismissMs, token)`, returns on `OperationCanceledException`, and
wraps the post-delay `InvokeAsync(StateHasChanged)` in a try/catch for
`ObjectDisposedException`. `AddToast` also short-circuits if the component is
already disposed. Regression tests:
`ToastNotificationTests.ShowToast_AfterDisposal_IsNoOp_AndSchedulesNothing`
fails against the pre-fix code (which still added the toast / mis-scheduled
after disposal) and passes after;
`AutoDismiss_AfterDisposal_DoesNotThrowUnobservedException` and
`AutoDismiss_BeforeDisposal_StillRemovesToast` guard the no-throw and
still-works behaviours.

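A cancellable auto-dismiss along the lines described above could be sketched like this. The class name `AutoDismissSketch`, the `ScheduleAsync` entry point, and the `removeToast` callback (standing in for removing the toast and calling `InvokeAsync(StateHasChanged)`) are assumptions for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class AutoDismissSketch : IDisposable
{
    private readonly CancellationTokenSource _disposalCts = new();
    private bool _disposed;

    public Task ScheduleAsync(int dismissMs, Action removeToast)
    {
        // Short-circuit once disposed: schedule nothing.
        if (_disposed) return Task.CompletedTask;
        return AutoDismissAsync(dismissMs, removeToast, _disposalCts.Token);
    }

    private static async Task AutoDismissAsync(int dismissMs, Action removeToast, CancellationToken token)
    {
        try
        {
            await Task.Delay(dismissMs, token);
        }
        catch (OperationCanceledException)
        {
            return; // component was disposed while the delay was pending
        }

        try
        {
            removeToast();
        }
        catch (ObjectDisposedException)
        {
            // Component went away just after the delay elapsed.
        }
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        _disposalCts.Cancel();  // wakes every pending delay
        _disposalCts.Dispose();
    }
}
```

Because the delay is awaited inside a method that catches cancellation, the returned task completes normally and never surfaces as an unobserved exception.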
### CentralUI-011 — `DiffDialog` leaves a dangling `TaskCompletionSource` when disposed while open

| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/DiffDialog.razor:89-95,151-157` |

**Description**

`OpenAsync` creates `_tcs` and returns `_tcs.Task` to the caller, which
typically `await`s it. The task is completed only by `Close()`. If the user
navigates away while the dialog is open, `DisposeAsync` runs but never completes
`_tcs`, so the awaiting caller's continuation never resumes — a permanently
suspended `Task` (and any `using`/cleanup after the await is skipped). The
`IDialogService.Confirm/Prompt` path has the same shape but at least its host
is a single long-lived `DialogHost`; `DiffDialog` is per-page.

**Recommendation**

In `DisposeAsync`, call `_tcs?.TrySetResult(false)` (or `TrySetCanceled`) so any
awaiter completes deterministically.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed: `OpenAsync` returned
`_tcs.Task`, completed only by `Close()`; `DisposeAsync` never touched the TCS,
so disposing the dialog while open left the awaiting caller suspended forever.
`DisposeAsync` now calls `_tcs?.TrySetResult(false)` before unlocking the body,
so a dialog disposed while open resolves its caller to `false` (not confirmed).
Regression test `DiffDialogTests.DisposeAsync_WhileOpen_CompletesPendingTask`
fails against the pre-fix code (the pending task stays `WaitingForActivation`)
and passes after; `Close_CompletesPendingTaskWithTrue` guards the normal close
path.

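The dangling-awaiter fix reduces to one `TrySetResult` call, as in the sketch below. `DialogSketch` is a hypothetical stand-in for the component; the parameterless `Close()` resolving to `true` is inferred from the test name `Close_CompletesPendingTaskWithTrue`, and the body-unlock step is omitted.

```csharp
using System;
using System.Threading.Tasks;

public sealed class DialogSketch : IAsyncDisposable
{
    private TaskCompletionSource<bool>? _tcs;

    public Task<bool> OpenAsync()
    {
        // Run continuations asynchronously so the awaiter does not
        // resume inline inside Close() or DisposeAsync().
        _tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        return _tcs.Task;
    }

    // Normal close path: the caller sees "confirmed".
    public void Close() => _tcs?.TrySetResult(true);

    public ValueTask DisposeAsync()
    {
        // Complete any pending awaiter with "not confirmed".
        // TrySetResult is a no-op if Close() already completed the task.
        _tcs?.TrySetResult(false);
        return ValueTask.CompletedTask;
    }
}
```

Using `TrySetResult` rather than `SetResult` keeps disposal safe after a normal close, since the second completion attempt simply returns `false`.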
### CentralUI-012 — N+1 query loading data connections for the Sites page

| | |
|--|--|
| Severity | Medium |
| Category | Performance & resource management |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Admin/Sites.razor:196-205` |

**Description**

`LoadDataAsync` fetches all sites, then issues
`SiteRepository.GetDataConnectionsBySiteIdAsync(site.Id)` once per site in a
loop. With N sites this is N+1 database round-trips on every page load and every
post-delete refresh. The connection lists are only used for a small per-card
summary.

**Recommendation**

Add a repository method that returns all data connections (or connections for a
set of site ids) in one query and group them client-side, or project the small
summary in a single query.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed: `LoadDataAsync` looped
`GetDataConnectionsBySiteIdAsync(site.Id)` once per site (N+1). `ISiteRepository`
already exposes `GetAllDataConnectionsAsync()` and `DataConnection` carries a
`SiteId`, so the loop was replaced with a single `GetAllDataConnectionsAsync()`
call grouped client-side by `SiteId` — one query regardless of site count, on
every load and post-delete refresh. Regression tests
`SitesPageTests.{LoadData_FetchesAllConnectionsInOneQuery_NoPerSiteQueries,
LoadData_GroupsConnectionsBySite_AndRendersThem}` fail against the pre-fix code
(`GetDataConnectionsBySiteIdAsync` was called per site) and pass after.

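The client-side grouping step might look something like the following. The `Site` and `DataConnection` records are simplified stand-ins (only `SiteId` is confirmed by the text above), and `SiteConnectionGrouping` is a hypothetical helper name; the input list represents the result of the single bulk query.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the real entities (assumption).
public sealed record Site(int Id, string Name);
public sealed record DataConnection(int Id, int SiteId, string Name);

public static class SiteConnectionGrouping
{
    public static IReadOnlyDictionary<int, IReadOnlyList<DataConnection>> GroupBySite(
        IEnumerable<Site> sites, IEnumerable<DataConnection> allConnections)
    {
        // One pass over the bulk result; no per-site query.
        var bySiteId = allConnections.ToLookup(c => c.SiteId);

        // A lookup returns an empty sequence for unknown keys, so sites
        // without connections still get an (empty) entry.
        return sites.ToDictionary(
            s => s.Id,
            s => (IReadOnlyList<DataConnection>)bySiteId[s.Id].ToList());
    }
}
```

`ToLookup` is used instead of `GroupBy(...).ToDictionary(...)` so that a site with no connections needs no special-casing.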
### CentralUI-013 — `ScriptAnalysisService` blocks on async shared-script lookups

| | |
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:951-952` (actual call at `:975`) |

**Description**

`ResolveCalledShape` calls `_sharedScripts.GetShapesAsync().GetAwaiter().GetResult()`
to resolve a shared-script shape synchronously. `GetShapesAsync` ultimately hits
`SharedScriptService` and its EF Core repository. Sync-over-async on a request
thread risks thread-pool starvation under load and can deadlock if any awaited
continuation needs a captured context. `Hover` and `SignatureHelp` (which call
`ResolveCalledShape`) are themselves synchronous methods, so the blocking call
is structural.

**Recommendation**

Make `Hover` and `SignatureHelp` async and `await` `GetShapesAsync`, or have the
catalog expose a cached synchronous snapshot that is refreshed asynchronously.
The `IMemoryCache` is already present — caching the shapes there and reading
them synchronously would remove the blocking call.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed (the sync-over-async call is at
`:975`, not `:951-952` as originally cited — `ResolveCalledShape`'s
`Scripts.CallShared` branch). Took the recommended root-cause fix: `Hover` and
`SignatureHelp` are now `async Task<...>` and `ResolveCalledShape` is
`async Task<ScriptShape?>` which `await`s `_sharedScripts.GetShapesAsync()`
instead of `.GetAwaiter().GetResult()`. The two minimal-API endpoints
(`/hover`, `/signature-help`) were updated to `await` the methods. Regression
suite `ScriptAnalysisAsyncResolveTests` (3 tests): the structural test
`HoverAndSignatureHelp_AreAsync_NotSyncOverAsync` fails against the pre-fix
synchronous signatures, and two behavioural tests resolve shared-script shapes
through a catalog that only completes after `Task.Yield()` (a genuinely async
source). The five existing `Hover`/`SignatureHelp` tests in
`ScriptAnalysisServiceTests` were updated to `await` the now-async methods.

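As a rough sketch of the async-all-the-way shape, with a catalog that only completes after `Task.Yield()` as the behavioural tests describe: the interface `IShapeCatalog`, the string-valued shapes, and the class names here are hypothetical simplifications, not the real `ScriptShape` types.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical, simplified catalog contract: script name -> shape text.
public interface IShapeCatalog
{
    Task<IReadOnlyDictionary<string, string>> GetShapesAsync();
}

public sealed class ShapeResolverSketch
{
    private readonly IShapeCatalog _sharedScripts;

    public ShapeResolverSketch(IShapeCatalog sharedScripts) => _sharedScripts = sharedScripts;

    // Pre-fix shape (blocking): GetShapesAsync().GetAwaiter().GetResult()
    // Post-fix shape: the method is async and awaits the lookup.
    public async Task<string?> ResolveCalledShapeAsync(string scriptName)
    {
        var shapes = await _sharedScripts.GetShapesAsync();
        return shapes.TryGetValue(scriptName, out var shape) ? shape : null;
    }
}

// A catalog that cannot complete synchronously, useful for tests.
public sealed class YieldingCatalog : IShapeCatalog
{
    public async Task<IReadOnlyDictionary<string, string>> GetShapesAsync()
    {
        await Task.Yield(); // forces a genuinely asynchronous completion
        return new Dictionary<string, string> { ["Greet"] = "Greet(name: string)" };
    }
}
```

Callers then `await ResolveCalledShapeAsync(...)` in turn, which is why the fix had to propagate up through `Hover`, `SignatureHelp`, and the two endpoints.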
### CentralUI-014 — Test Run side effects (HTTP/SQL/SMTP) fire against production services

| | |
|--|--|
| Severity | Low (re-triaged from Medium 2026-05-16 — see Resolution) |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:254-259`; `src/ScadaLink.CentralUI/ScriptAnalysis/SandboxHostHelpers.cs:26-117` |

**Description**

By design (documented in the XML comments) Test Run wires `ExternalSystem`,
`Database`, and `Notify` to central's *real* `IExternalSystemClient`,
`IDatabaseGateway`, and `INotificationDeliveryService`, so a Test Run that calls
`Notify.To(...).Send(...)` actually emails recipients, `Database.Connection(...)`
opens a real DB connection, and `External.Call(...)` makes real HTTP calls —
with production-equivalent side effects. There is no dry-run mode, no
confirmation, and (combined with CentralUI-001) no restriction on what a script
can do. A Design user testing a draft script can dispatch real notifications or
mutate external databases. The behaviour is intentional but the blast radius is
not surfaced to the user.

**Recommendation**

At minimum, surface a clear warning in the Test Run UI that side effects are
real, and require explicit opt-in for side-effecting calls. Preferably offer a
dry-run mode that stubs the helpers, defaulting to dry-run.

**Resolution**

Resolved 2026-05-16 (commit pending) — **re-triaged**. Re-verified against the
reviewed commit `9c60592`: the finding's premise that "the blast radius is not
surfaced to the user" is **inaccurate**. Both Test Run surfaces that can produce
real side effects — `SharedScriptForm.razor` and the script Test Run in
`TemplateEdit.razor` — already carry a prominent `Real I/O` badge on the panel
header and an `alert-warning` block stating `External`/`Database`/`Notify` calls
"fire for real … real HTTP, real SQL, real emails. Side effects are permanent"
(present since commit `2951507`, an ancestor of the reviewed commit, confirmed
via `git merge-base`). `ApiMethodForm.razor` (Inbound API kind) has **no**
real-I/O surface at all — `SandboxInboundScriptHost` exposes only
`Parameters`/`Route` (Route throws) — and correctly omits the badge while still
warning. Revealing the panel ("Test Run" toggle) then clicking "Run" is itself a
two-step explicit opt-in. The minimum recommendation is therefore already met;
the optional dry-run mode is a separate feature decision the design doc does not
mandate. Severity re-triaged Medium → Low (intentional, documented, clearly
warned behaviour — not a bug). Regression suite `TestRunWarningTests` (3 tests)
pins the `Real I/O` badge + warning text in `SharedScriptForm`/`TemplateEdit`
and the deliberate absence of the badge in `ApiMethodForm`, so the warning
cannot silently regress.

### CentralUI-015 — `DialogService` continuations resolve off the render thread

| | |
|--|--|
| Severity | Low |
| Category | Concurrency & thread safety |
| Status | Won't Fix |
| Location | `src/ScadaLink.CentralUI/ServiceCollectionExtensions.cs:24`; `src/ScadaLink.CentralUI/Components/Shared/DialogService.cs:18-69` |

**Description**

`DialogService` is `AddScoped` (one per circuit, correct) but
`ConfirmAsync`/`PromptAsync` complete via `ContinueWith(..., TaskScheduler.Default)`,
so a caller awaiting them resumes on a thread-pool thread. Any subsequent
component state mutation by the caller is then off the render thread unless the
caller wraps it in `InvokeAsync`. Call sites are not consistently doing so,
which can produce non-deterministic render glitches.

**Recommendation**

Either resolve continuations on the circuit's sync context or document that
callers must `InvokeAsync` after awaiting `ConfirmAsync`/`PromptAsync`. Audit
call sites for off-thread state mutation.

**Resolution**

Won't Fix — **re-triaged 2026-05-16, the finding's premise is incorrect.** The
finding claims `ContinueWith(..., TaskScheduler.Default)` makes an awaiting
caller resume on a thread-pool thread. It does not. `TaskScheduler.Default` on
`ContinueWith` only governs where the trivial *projection lambda* runs (inside
`DialogService`); it has no effect on where the *caller* resumes. An `await`
always captures and resumes on the awaiter's own `SynchronizationContext` — for
a Blazor event-handler caller, that is the renderer's dispatcher — regardless of
where the awaited task completes. This was verified directly:
`DialogServiceThreadingTests.ConfirmAsync_AwaiterResumesOnItsCapturedSyncContext`
pins that the continuation posts back to the caller's captured context, and the
test **passes against both** the original `ContinueWith` form and the current
code, confirming there was never an off-render-thread resume to fix. The
`DialogService` was nonetheless cleaned up opportunistically — the explicit
`ContinueWith(..., TaskScheduler.Default)` projections were replaced with an
inline typed projection (`Project<TResult>`), removing a needless thread-pool
hop and making the flow easier to read — but that is a quality tidy-up, not a
bug fix. Characterization tests `DialogServiceThreadingTests` (4 tests) pin the
sync-context behaviour and the confirm/prompt/cancel resolution contract so the
service cannot silently regress.

### CentralUI-016 — Pagers render one button per page with no windowing

| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/DataTable.razor:62-68`; `src/ScadaLink.CentralUI/Components/Pages/Deployment/Deployments.razor:167-173` |

**Description**

The `DataTable` and `Deployments` paginators loop `for i = 1..totalPages` and
emit a `<li>` button for every page. With a few thousand records at page size 25
that is hundreds of buttons rendered into the diff on every state change. It is
not a correctness bug but degrades render performance and usability on large
datasets.

**Recommendation**

Window the pager (first / prev / a few around current / next / last) or switch
large lists to a "load more" / numeric jump input.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed: both `DataTable` and
`Deployments` looped `for i = 1..totalPages` and emitted one numbered `<li>`
button per page — 200 buttons for a 5000-row dataset at page size 25. Added a
pure `PagerWindow.Build(currentPage, totalPages)` helper
(`Components/Shared/PagerWindow.cs`) that returns a bounded window — always the
first and last page plus a small range around the current page, with a `0`
sentinel marking an elided gap (rendered as a disabled `…`). Both
paginators now iterate `PagerWindow.Build(...)` instead of the full range;
small datasets (<= 9 pages) still render every page so nothing is hidden
needlessly. Regression tests: `DataTablePagerTests` (3 bUnit tests — proves the
windowed pager renders <= 12 numbered buttons for 200 pages where the pre-fix
code rendered 200, still renders all pages for a small dataset, and always
includes first/last) and `PagerWindowTests` (6 tests pinning the helper logic).

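One way such a windowing helper could be written is sketched below. Only the `Build(currentPage, totalPages)` shape, the `0` gap sentinel, and the 9-page show-all threshold come from the resolution; the class name `PagerWindowSketch` and the radius of two pages either side of the current page are assumptions, so the real `PagerWindow` may produce a different window.

```csharp
using System;
using System.Collections.Generic;

public static class PagerWindowSketch
{
    private const int ShowAllThreshold = 9; // small pagers render every page
    private const int Radius = 2;           // pages kept either side of current (assumption)

    // Returns page numbers to render; 0 marks an elided gap.
    public static IReadOnlyList<int> Build(int currentPage, int totalPages)
    {
        var pages = new List<int>();
        if (totalPages <= 0) return pages;

        if (totalPages <= ShowAllThreshold)
        {
            for (int p = 1; p <= totalPages; p++) pages.Add(p);
            return pages;
        }

        int current = Math.Clamp(currentPage, 1, totalPages);
        int start = Math.Max(2, current - Radius);
        int end = Math.Min(totalPages - 1, current + Radius);

        pages.Add(1);                         // always the first page
        if (start > 2) pages.Add(0);          // gap between first and window
        for (int p = start; p <= end; p++) pages.Add(p);
        if (end < totalPages - 1) pages.Add(0); // gap between window and last
        pages.Add(totalPages);                // always the last page
        return pages;
    }
}
```

With these assumed constants, `Build(100, 200)` yields `1, 0, 98, 99, 100, 101, 102, 0, 200`: seven numbered buttons and two gaps instead of 200 buttons.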
### CentralUI-017 — `/auth/logout` POST disables antiforgery, enabling logout CSRF

| | |
|--|--|
| Severity | Low |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Auth/AuthEndpoints.cs:127-138` |

**Description**

The `POST /auth/logout` endpoint calls `.DisableAntiforgery()`, and a plain
`GET /logout` endpoint also signs the user out. Either can be triggered
cross-site (an `<img src="/logout">` or an auto-submitting form) to forcibly log
a user out. Login itself reasonably disables antiforgery (pre-auth), but logout
is a state-changing authenticated action and should be CSRF-protected.

**Recommendation**

Require an antiforgery token on `POST /auth/logout` (the `NavMenu` sign-out form
can include the antiforgery token), and remove or protect the state-changing
`GET /logout` route.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed: `POST /auth/logout` called
`.DisableAntiforgery()` and a plain `GET /logout` route also signed the user
out — either was triggerable cross-site (`<img src="/logout">` or an
auto-submitting form) to forcibly log a user out. The `.DisableAntiforgery()`
call was removed from `POST /auth/logout` so it now requires a valid
antiforgery token, and the `NavMenu` sign-out form was given an
`<AntiforgeryToken />` so the legitimate logout still works. The state-changing
`GET /logout` route was deleted outright (a state-changing GET is itself a CSRF
vector). `POST /auth/login` intentionally keeps `.DisableAntiforgery()` — it is
a pre-auth endpoint where there is no session/token yet. Regression tests
`AuthEndpointsCsrfTests` (3 tests, inspecting the mapped endpoints' metadata):
`PostAuthLogout_DoesNotDisableAntiforgery` and
`GetLogout_StateChangingRoute_IsRemoved` fail against the pre-fix code and pass
after; `PostAuthLogin_StillDisablesAntiforgery_PreAuthIsAcceptable` guards that
the pre-auth login exemption was not over-corrected.

### CentralUI-018 — Broad `catch {}` blocks swallow JS interop and storage errors silently

| | |
|--|--|
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/MonacoEditor.razor:116-118,123,142,164,170,176,182,189`; `src/ScadaLink.CentralUI/Components/Shared/TreeView.razor:129,139`; `src/ScadaLink.CentralUI/Components/Pages/Admin/Sites.razor:316-319` |

**Description**

Numerous `try { ... } catch { }` blocks swallow every exception with no logging.
The prerender-time JS-unavailable case is legitimate, but these catches also
hide real failures: a genuine Monaco init failure or a clipboard permission
error becomes invisible. In `TreeView.razor` the storage-restore
`JsonSerializer.Deserialize` (line 139) is not inside a try at all and would
throw uncaught on a corrupt `treeviewStorage` payload. Debugging UI issues in
production is then guesswork.

**Recommendation**

Catch the specific expected exception type (e.g. `JSDisconnectedException`,
`InvalidOperationException` during prerender) and log anything else via
`ILogger`. Wrap the TreeView storage `Deserialize` in its own guarded block.

**Resolution**

Resolved 2026-05-16 (commit pending). Confirmed all three locations.
(1) **TreeView** — the storage-restore `JsonSerializer.Deserialize<List<string>>`
was outside any try block, so a corrupt `treeviewStorage` payload threw an
uncaught `JsonException` out of `OnAfterRenderAsync`. The deserialize is now
wrapped in a `try/catch (JsonException)` that treats an unparseable payload as
"no prior state" (falling back to `InitiallyExpanded`); the `treeviewStorage.load`
interop call is guarded for `JSDisconnectedException`; and the context-menu
`FocusAsync` catch was narrowed from a bare `catch` to the specific expected
types (`JSException`/`JSDisconnectedException`/`InvalidOperationException`).
(2) **MonacoEditor** — every JS interop call had a bare `catch { }`. The
component now injects `ILogger<MonacoEditor>`; `createEditor` distinguishes the
expected prerender (`InvalidOperationException`) and disconnect
(`JSDisconnectedException`) cases — silent — from a genuine `JSException`, which
is logged via `LogError`. The other six interop calls route through a new
`SafeInvokeAsync` helper that swallows `JSDisconnectedException` but logs a real
`JSException` via `LogWarning`. (3) **Sites.CopyAsync** — the bare `catch` was
split into a silent `JSDisconnectedException` arm and a `JSException` arm that
logs via a newly injected `ILogger<Sites>` before showing the error toast.
Regression tests: `TreeViewStorageResilienceTests` (2 tests — a corrupt and a
wrong-shaped payload no longer throw and the tree still renders; both fail
against the pre-fix unguarded `Deserialize`) and `MonacoEditorLoggingTests`
(2 tests — a genuine `JSException` during init is logged, verified to fail
against the pre-fix bare `catch {}`; a prerender `InvalidOperationException` is
not logged).

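The TreeView part of this fix, catching only `JsonException` and treating a bad payload as "no prior state", can be sketched as below. `TreeStateRestoreSketch` and `TryRestoreExpanded` are hypothetical names, and returning `null` to signal "fall back to `InitiallyExpanded`" is an assumption about how the caller would consume the result.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class TreeStateRestoreSketch
{
    // Returns the persisted expanded-node keys, or null when there is
    // no usable prior state (missing, corrupt, or wrong-shaped payload).
    public static List<string>? TryRestoreExpanded(string? payload)
    {
        if (string.IsNullOrWhiteSpace(payload)) return null;

        try
        {
            return JsonSerializer.Deserialize<List<string>>(payload);
        }
        catch (JsonException)
        {
            // Only the expected parse failure is swallowed; any other
            // exception type still propagates to the caller.
            return null;
        }
    }
}
```

Narrowing the catch to `JsonException` keeps the resilience the fix wanted without recreating the bare `catch { }` problem the finding was about.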
### CentralUI-019 — Sparse unit-test coverage for a large module; critical paths untested

| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Resolved |
| Location | `tests/ScadaLink.CentralUI.Tests/` |

**Description**

The module has ~65 source files but unit tests cover only the script analyzer,
TreeView, schema model, and two data-connection pages. Untested critical paths
include: the auth bridge (`CookieAuthenticationStateProvider`,
`AuthEndpoints`), `RunInSandboxAsync` (timeout, recursion limit, error
classification, side-effect wiring), `DialogService` resolution semantics,
`DebugView` stream lifecycle and the `UpsertWithCap` cap logic, `Health` and
`Deployments` timer behaviour, and `SchemaBuilderModel` round-tripping of nested
schemas. Given findings CentralUI-001/003/009/010 sit on untested code, the gap
is material. The Playwright suite covers login and navigation only.

**Recommendation**

Add bUnit/unit tests for the auth bridge, sandbox-run behaviour (including
forbidden-API rejection once CentralUI-001 is fixed), dialog resolution, and the
DebugView cap/lifecycle logic. Prioritise the paths named in the Critical/High
findings.

**Resolution**

Resolved 2026-05-16 (commit pending). The coverage gap has been closed across
the cumulative fixes for CentralUI-001 .. 018 — every critical path the finding
named now has tests. Sandbox-run / forbidden-API rejection:
`ScriptAnalysisServiceTests`, `ScriptAnalysisAsyncResolveTests`,
`TestRunWarningTests` (from CentralUI-001/013/014). Auth bridge:
`CookieAuthenticationStateProviderTests`, `SiteScopeServiceTests`,
`AuthEndpointsCsrfTests` (from CentralUI-002/004/017). Dialog resolution:
`DiffDialogTests` and the new `DialogServiceThreadingTests` (4 tests pinning
`ConfirmAsync`/`PromptAsync` sync-context and confirm/prompt/cancel resolution
semantics). DebugView lifecycle: `DebugViewDisposalTests` (from CentralUI-009).
Toast/timer disposal: `ToastNotificationTests` (from CentralUI-010).
This batch also added `BrowserTimeTests`, `MonitoringAuthorizationTests`,
`SitesPageTests`, `DataTablePagerTests` + `PagerWindowTests`,
`TreeViewStorageResilienceTests`, and `MonacoEditorLoggingTests`. The
`tests/ScadaLink.CentralUI.Tests` suite is green at 251 tests. Remaining
untested paths are low-risk render-only pages; the Critical/High/Medium paths
the finding prioritised are all now covered, so the finding is considered
resolved. (Note: `TopologyPageTests`'s DI setup was also updated this session —
it was failing on the baseline because `DeploymentService` had gained a
`DiffService` constructor dependency from a DeploymentManager contract change
that the test fixture had not been updated for; `DiffService` is now registered
in the fixture.)

### CentralUI-020 — Idle-session redirect never fires: `SessionExpiry` polls a frozen auth-state snapshot
|
||
|
||
| | |
|--|--|
| Severity | High |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/SessionExpiry.razor:39-62`; `src/ScadaLink.CentralUI/Auth/CookieAuthenticationStateProvider.cs:29-43` |
|
||
|
||
**Description**
|
||
|
||
The CentralUI-004 fix and the CentralUI-005 fix interact destructively.
|
||
CentralUI-004 made `CookieAuthenticationStateProvider` snapshot the principal
|
||
**once** in its constructor into a cached `Task<AuthenticationState>` and serve
|
||
that exact task for the entire life of the SignalR circuit — it never re-reads
|
||
`HttpContext`, never calls `SetAuthenticationState`, and never raises
|
||
`NotifyAuthenticationStateChanged`. CentralUI-005 then rewrote
|
||
`SessionExpiry.razor` to *poll* `AuthStateProvider.GetAuthenticationStateAsync()`
|
||
once a minute and redirect to `/login` "once the sliding cookie has actually
|
||
lapsed server-side." But `GetAuthenticationStateAsync()` returns the same frozen
|
||
constructor-time snapshot on every call — `auth.User.Identity.IsAuthenticated`
|
||
is permanently `true` for the life of the circuit regardless of whether the
|
||
server-side cookie has expired. The poll loop therefore never observes an
|
||
expired session and the redirect never fires. An idle user whose cookie has
|
||
lapsed server-side keeps an authenticated-looking page open indefinitely; the
|
||
documented "30-minute idle timeout" is silently defeated for any user who
|
||
leaves a circuit open. (The cookie middleware would still reject the *next*
|
||
full HTTP request / new circuit, so this is a stale-UI / missed-logout exposure
|
||
rather than a full auth bypass — but the page continues to render
|
||
authenticated content and a SignalR circuit can stay alive for a long time.)
|
||
This is also a design-document-adherence regression against CLAUDE.md
|
||
"Security & Auth" (30-minute idle timeout) — recorded under Concurrency because
|
||
the root cause is the lifetime/staleness mismatch between the two components.
|
||
|
||
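The staleness described above can be reproduced without Blazor. The following is a minimal sketch, not the project's code: `FakeSession`, `FrozenProvider`, and `LiveProvider` are hypothetical stand-ins, and a plain `bool` replaces `AuthenticationState`. It only illustrates why a task cached at construction time can never report a later expiry.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the server-side session; starts authenticated.
var session = new FakeSession { IsAuthenticated = true };
var frozen = new FrozenProvider(session);
var live = new LiveProvider(session);

session.IsAuthenticated = false; // the cookie lapses server-side

Console.WriteLine(await frozen.GetStateAsync()); // True: constructor-time snapshot
Console.WriteLine(await live.GetStateAsync());   // False: re-read on every call

sealed class FakeSession
{
    public bool IsAuthenticated { get; set; }
}

// Mirrors the shape of the defect: the state is captured once and served forever.
sealed class FrozenProvider
{
    private readonly Task<bool> _snapshot;

    public FrozenProvider(FakeSession session) =>
        _snapshot = Task.FromResult(session.IsAuthenticated);

    public Task<bool> GetStateAsync() => _snapshot;
}

// A provider that consults the live session on each call.
sealed class LiveProvider
{
    private readonly FakeSession _session;

    public LiveProvider(FakeSession session) => _session = session;

    public Task<bool> GetStateAsync() => Task.FromResult(_session.IsAuthenticated);
}
```

Polling `FrozenProvider` once a minute is equivalent to polling a constant, which is why the poll loop never saw an expired session.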
**Recommendation**
|
||
|
||
`SessionExpiry` must consult something that actually reflects the live
|
||
server-side session, not the circuit's frozen principal. Options: (a) have
|
||
`SessionExpiry` poll a lightweight authenticated server endpoint (e.g. a
|
||
`/auth/ping` minimal API that returns 401 once the cookie has lapsed) and
|
||
redirect on 401; or (b) give `CookieAuthenticationStateProvider` a refresh path
|
||
that re-validates the cookie and calls `SetAuthenticationState` /
|
||
`NotifyAuthenticationStateChanged` so the polled state can actually change.
|
||
Whichever is chosen, add a test that exercises the redirect path with an
|
||
expired session (see CentralUI-025).
|
||
|
||
**Resolution**
|
||
|
||
2026-05-17 — `SessionExpiry` no longer polls the frozen `AuthenticationStateProvider`; it polls a new anonymous `GET /auth/ping` minimal-API endpoint (re-validated by the cookie middleware on every HTTP request) via a `fetch()` JS helper and redirects to `/login` on HTTP 401, so the documented 30-minute idle logout actually fires.
|
||
|
||
### CentralUI-021 — `DebugView` stream callback mutates `Dictionary` off the render thread
|
||
|
||
| | |
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/DebugView.razor:404-419,511-519,275-289` |
|
||
|
||
**Description**
|
||
|
||
The `onEvent` callback passed to `DebugStreamService.StartStreamAsync` runs on
|
||
an Akka/gRPC thread (as the design doc and the CentralUI-009 comments state). It
|
||
calls `UpsertWithCap(_attributeValues, …)` / `UpsertWithCap(_alarmStates, …)`
|
||
**directly on that thread** — the mutation is not marshalled through
|
||
`InvokeAsync`; only the subsequent `StateHasChanged` is. Meanwhile the render
|
||
thread evaluates `FilteredAttributeValues` / `FilteredAlarmStates`, which
|
||
enumerate `_attributeValues.Values` / `_alarmStates.Values` and call
|
||
`OrderBy(...).ToList()`. `Dictionary<TKey,TValue>` is not thread-safe: a write
|
||
on the Akka thread concurrent with an enumeration on the render thread can throw
|
||
`InvalidOperationException` ("Collection was modified; enumeration operation may
|
||
not execute") or corrupt the dictionary's internal buckets. The CentralUI-009
|
||
fix added a `_disposed` guard but did not address this data race — the guard
|
||
only prevents touching a *disposed* component, not concurrent access to a live
|
||
one. Under a busy debug stream this will intermittently fault the page.
|
||
|
||
**Recommendation**
|
||
|
||
Marshal the dictionary mutation onto the render thread too — move the
|
||
`UpsertWithCap` call inside the `SafeInvokeAsync`/`InvokeAsync` body so all
|
||
access to `_attributeValues`/`_alarmStates` happens on the renderer's
|
||
dispatcher. Alternatively guard both the writes and the `Filtered*` reads with a
|
||
lock, or use a concurrent collection. The cap-trim loop must be inside the same
|
||
critical section as the upsert.
|
||
|
||
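As an illustration of the lock-based alternative mentioned above (not the dispatcher marshalling that was eventually adopted), here is a minimal sketch. `CappedStore` and its trim policy (drop an arbitrary entry once over the cap) are assumptions made for the example; the real `UpsertWithCap` semantics are not shown in this report.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var store = new CappedStore(cap: 100);

// Writer: stands in for the stream callback running off the render thread.
var writer = Task.Run(() =>
{
    for (var i = 0; i < 50_000; i++) store.Upsert($"attr-{i % 500}", i);
});

// Reader: stands in for the render thread evaluating the filtered view.
var reader = Task.Run(() =>
{
    for (var i = 0; i < 5_000; i++) _ = store.SnapshotOrdered();
});

await Task.WhenAll(writer, reader);
Console.WriteLine(store.SnapshotOrdered().Count <= 100); // True

sealed class CappedStore
{
    private readonly object _gate = new();
    private readonly Dictionary<string, int> _values = new();
    private readonly int _cap;

    public CappedStore(int cap) => _cap = cap;

    public void Upsert(string key, int value)
    {
        lock (_gate)
        {
            _values[key] = value;
            // The cap-trim loop runs in the same critical section as the upsert.
            while (_values.Count > _cap)
                _values.Remove(_values.Keys.First());
        }
    }

    // Readers take the same lock and return a copy, so enumeration never
    // overlaps a concurrent write.
    public List<KeyValuePair<string, int>> SnapshotOrdered()
    {
        lock (_gate)
        {
            return _values.OrderBy(kv => kv.Key, StringComparer.Ordinal).ToList();
        }
    }
}
```

Without the `lock` in both methods, the same two loops can fail intermittently with "Collection was modified" during the `OrderBy` enumeration.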
**Resolution**
|
||
|
||
2026-05-17 — the stream callback now routes through `HandleStreamEvent`, which marshals the `UpsertWithCap` mutation (and the cap-trim loop) onto the renderer's dispatcher via `SafeInvokeAsync`, so every read and write of `_attributeValues`/`_alarmStates` happens on the render thread.
|
||
|
||
### CentralUI-022 — `Deployments` push handler fires `InvokeAsync` with no disposal guard
|
||
|
||
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/Deployments.razor:221-229,317-322` |
|
||
|
||
**Description**
|
||
|
||
`OnDeploymentStatusChanged` is invoked by `IDeploymentStatusNotifier`, a process
|
||
singleton, on the DeploymentManager service thread. The handler does
|
||
`_ = InvokeAsync(async () => { await LoadDataAsync(); StateHasChanged(); })`,
|
||
discarding the returned task. `Dispose()` unsubscribes the handler, but there is
|
||
a race window: the notifier can read the subscriber list and begin invoking
|
||
`OnDeploymentStatusChanged` *just before* the component is disposed, so
|
||
`InvokeAsync` then runs against a disposed component and throws
|
||
`ObjectDisposedException` on the DeploymentManager thread — an unobserved task
|
||
exception (the task is fire-and-forget). The same hazard was explicitly fixed
|
||
for `DebugView` (CentralUI-009, `SafeInvokeAsync` + `_disposed` flag) and
|
||
`ToastNotification` (CentralUI-010), but the new push-based `Deployments`
|
||
handler introduced by the CentralUI-006 fix did not adopt the same guard.
|
||
Separately, every push event triggers two full repository reloads
|
||
(`GetAllInstancesAsync` + `GetAllDeploymentRecordsAsync`) for every open
|
||
circuit, so a burst of status changes amplifies into N×2 round-trips per tick.
|
||
|
||
**Recommendation**
|
||
|
||
Add a `volatile bool _disposed` set first in `Dispose()`, have
|
||
`OnDeploymentStatusChanged` no-op when set, and wrap the `InvokeAsync` dispatch
|
||
to swallow `ObjectDisposedException` (mirror `DebugView.SafeInvokeAsync`).
|
||
Optionally coalesce bursts (debounce) and/or reload only the changed record
|
||
rather than the whole table on each event.
|
||
|
||
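The guard pattern recommended above might look roughly like the sketch below. `FakePage` is a hypothetical stand-in for the component, and its private `InvokeAsync` only imitates a renderer dispatch that faults after disposal; the real `ComponentBase.InvokeAsync` is not used here.

```csharp
using System;
using System.Threading.Tasks;

var page = new FakePage();
page.OnStatusChanged();           // live component: reload runs
page.Dispose();
page.OnStatusChanged();           // guard short-circuits, nothing dispatched
await page.DispatchReloadAsync(); // simulated race: exception is swallowed
Console.WriteLine(page.Reloads);  // 1

sealed class FakePage : IDisposable
{
    private volatile bool _disposed;

    public int Reloads { get; private set; }

    // Stand-in for the renderer dispatch: faults once the component is gone.
    private Task InvokeAsync(Func<Task> work) =>
        _disposed
            ? Task.FromException(new ObjectDisposedException(nameof(FakePage)))
            : work();

    // Called by the (singleton) notifier on a foreign thread.
    public void OnStatusChanged()
    {
        if (_disposed) return;
        _ = DispatchReloadAsync();
    }

    public async Task DispatchReloadAsync()
    {
        try
        {
            await InvokeAsync(() => { Reloads++; return Task.CompletedTask; });
        }
        catch (ObjectDisposedException)
        {
            // The component was disposed between the guard and the dispatch.
        }
    }

    // The flag is set first; real code would unsubscribe the handler afterwards.
    public void Dispose() => _disposed = true;
}
```

The flag alone does not close the race window, which is why the dispatch also catches `ObjectDisposedException` instead of leaving it as an unobserved task exception.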
**Resolution**
|
||
|
||
2026-05-17 — added a `volatile bool _disposed` set first in `Dispose()`; `OnDeploymentStatusChanged` no-ops when set, and the fire-and-forget dispatch (`DispatchReloadAsync`) swallows the residual `ObjectDisposedException`, mirroring the `DebugView`/`ToastNotification` guards.
|
||
|
||
### CentralUI-023 — Residual bare `catch {}` blocks swallow JS interop errors
|
||
|
||
| | |
|--|--|
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Monitoring/ParkedMessages.razor:690-698`; `src/ScadaLink.CentralUI/Components/Shared/DiffDialog.razor:107-116,118-130,104` |
|
||
|
||
**Description**
|
||
|
||
CentralUI-018 narrowed the bare `catch {}` blocks in `MonacoEditor`,
|
||
`TreeView`, and `Sites.razor`, but the same pattern survives elsewhere.
|
||
`ParkedMessages.CopyAsync` wraps `navigator.clipboard.writeText` in
|
||
`catch { _toast.ShowError("Copy failed."); }` — a real `JSException`
|
||
(clipboard permission denied) and an expected `JSDisconnectedException` are
|
||
treated identically and neither is logged. `DiffDialog.TryLockBodyAsync` /
|
||
`TryUnlockBodyAsync` each have a bare outer `catch` whose handler does another
|
||
JS call wrapped in a second bare `catch { /* swallow */ }`, and
|
||
`OnAfterRenderAsync`'s `_modalRef.FocusAsync()` is wrapped in a bare
|
||
`catch { /* prerender or detached: ignore */ }`. Genuine interop failures in
|
||
these paths are invisible in production logs.
|
||
|
||
**Recommendation**
|
||
|
||
Catch `JSDisconnectedException` silently and `JSException` (and
|
||
`InvalidOperationException` for the prerender focus case) with an `ILogger`
|
||
call, consistent with the CentralUI-018 fixes in the same module.
|
||
|
||
**Resolution**
|
||
|
||
2026-05-17 — the bare `catch` blocks in `ParkedMessages.CopyAsync` and `DiffDialog.TryLockBodyAsync`/`TryUnlockBodyAsync`/`OnAfterRenderAsync` now catch `JSDisconnectedException` (and `InvalidOperationException` for prerender focus) silently and log genuine `JSException` failures via injected `ILogger`.
|
||
|
||
### CentralUI-024 — Claim lookups use magic strings instead of `JwtTokenService` constants
|
||
|
||
| | |
|--|--|
| Severity | Low |
| Category | Code organization & conventions |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Layout/NavMenu.razor:102`; `src/ScadaLink.CentralUI/Components/Pages/Dashboard.razor:14`; `GetCurrentUserAsync` in `Templates.razor`, `TemplateEdit.razor`, `TemplateCreate.razor`, `SharedScripts.razor`, `SharedScriptForm.razor`, `Sites.razor`, `Topology.razor`, `InstanceCreate.razor`, `InstanceConfigure.razor` |
|
||
|
||
**Description**
|
||
|
||
`ScadaLink.Security.JwtTokenService` exposes the canonical claim-type constants
|
||
(`UsernameClaimType = "Username"`, `DisplayNameClaimType = "DisplayName"`,
|
||
`RoleClaimType`, `SiteIdClaimType`). `SiteScopeService` correctly uses
|
||
`JwtTokenService.SiteIdClaimType`, but every `GetCurrentUserAsync` helper across
|
||
ten pages does `authState.User.FindFirst("Username")?.Value`, and `NavMenu` /
|
||
`Dashboard` do `context.User.FindFirst("DisplayName")`. The literals happen to
|
||
match the constants today, so there is no live bug — but if the claim type is
|
||
ever renamed in `JwtTokenService` (the single source of truth) every one of
|
||
these call sites silently breaks, falling back to `"unknown"` for the audit
|
||
user and a blank display name. The duplicated `GetCurrentUserAsync` helper is
|
||
also copy-pasted verbatim into ten components.
|
||
|
||
**Recommendation**
|
||
|
||
Replace the string literals with `JwtTokenService.UsernameClaimType` /
|
||
`DisplayNameClaimType`. Consider extracting the repeated `GetCurrentUserAsync`
|
||
into a single shared helper (e.g. an extension on `AuthenticationStateProvider`
|
||
or a small scoped service) so the claim lookup lives in exactly one place.
|
||
|
||
**Resolution**
|
||
|
||
2026-05-17 — added `ClaimsPrincipalExtensions` (`GetUsername`/`GetDisplayName`/`GetCurrentUsernameAsync`) resolving claims through the `JwtTokenService` constants; the ten copy-pasted `GetCurrentUserAsync` helpers and the `NavMenu`/`Dashboard` `DisplayName` lookups now delegate to it, eliminating every magic-string claim literal.
|
||
|
||
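A minimal sketch of what such an extension class could look like follows. `ClaimConstants` is a hypothetical stand-in for `JwtTokenService` (only the two constant values quoted in the description are taken from the source), and the fallback values follow the "unknown" / blank behaviour described above; the real `ClaimsPrincipalExtensions` may differ.

```csharp
using System;
using System.Security.Claims;

var user = new ClaimsPrincipal(new ClaimsIdentity(new[]
{
    new Claim(ClaimConstants.UsernameClaimType, "jdoe"),
    new Claim(ClaimConstants.DisplayNameClaimType, "Jane Doe"),
}, authenticationType: "Cookies"));

var anonymous = new ClaimsPrincipal(new ClaimsIdentity());

Console.WriteLine(user.GetUsername());      // jdoe
Console.WriteLine(user.GetDisplayName());   // Jane Doe
Console.WriteLine(anonymous.GetUsername()); // unknown

// Stand-in for the JwtTokenService constants (single source of truth).
static class ClaimConstants
{
    public const string UsernameClaimType = "Username";
    public const string DisplayNameClaimType = "DisplayName";
}

static class ClaimsPrincipalSketchExtensions
{
    // Every lookup goes through the constants, so a rename happens in one place.
    public static string GetUsername(this ClaimsPrincipal user) =>
        user.FindFirst(ClaimConstants.UsernameClaimType)?.Value ?? "unknown";

    public static string GetDisplayName(this ClaimsPrincipal user) =>
        user.FindFirst(ClaimConstants.DisplayNameClaimType)?.Value ?? string.Empty;
}
```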
### CentralUI-025 — `SessionExpiry` polling/redirect path has no test coverage
|
||
|
||
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Resolved |
| Location | `tests/ScadaLink.CentralUI.Tests/Auth/SessionExpiryPolicyTests.cs`; `src/ScadaLink.CentralUI/Components/Shared/SessionExpiry.razor` |
|
||
|
||
**Description**
|
||
|
||
`SessionExpiryPolicyTests` covers only `AuthEndpoints.BuildSignInProperties()`
|
||
(the sign-in properties shape). The actual runtime behaviour of
|
||
`SessionExpiry.razor` — that an expired session triggers a redirect to
|
||
`/login`, that an authenticated session does not, and that the component does
|
||
not poll/redirect on the `/login` page itself — is untested. Had a behavioural
|
||
test exercised the redirect with an expired/anonymous auth state against the
|
||
real `CookieAuthenticationStateProvider`, the CentralUI-020 defect (the frozen
|
||
snapshot never reporting an expired session) would have been caught. The
|
||
component is the system's only client-side idle-logout mechanism, so the gap is
|
||
material.
|
||
|
||
**Recommendation**
|
||
|
||
Add bUnit tests for `SessionExpiry`: (a) with an unauthenticated auth state the
|
||
component navigates to `/login`; (b) with an authenticated state it does not;
|
||
(c) on the `/login` route it neither polls nor redirects. The provider used in
|
||
the test must be one whose state can actually transition to expired — which
|
||
also forces the CentralUI-020 fix.
|
||
|
||
**Resolution**
|
||
|
||
2026-05-17 — added `SessionExpiryComponentTests` (bUnit): an expired ping (401) redirects to `/login`, a live ping (200) and a transient failure (status 0) do not, and on the `/login` route the component neither pings nor redirects; also added `AuthPingEndpointTests` covering the `/auth/ping` endpoint contract.
|
||
|
||
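The contract those tests pin can be summarised as a small decision function. This is a sketch of the behaviour described in the resolution, not the component's actual code: `SessionExpirySketch`, `PingOutcome`, and the `Func<int>` ping delegate are hypothetical names, and the exact-match check on `/login` is an assumption.

```csharp
using System;

var pings = 0;
Func<int> Ping(int status) => () => { pings++; return status; };

Console.WriteLine(SessionExpirySketch.Decide("/dashboard", Ping(401))); // Redirect
Console.WriteLine(SessionExpirySketch.Decide("/dashboard", Ping(200))); // Stay
Console.WriteLine(SessionExpirySketch.Decide("/dashboard", Ping(0)));   // Stay
Console.WriteLine(SessionExpirySketch.Decide("/login", Ping(401)));     // Skip
Console.WriteLine(pings);                                               // 3

enum PingOutcome { Skip, Stay, Redirect }

static class SessionExpirySketch
{
    public static PingOutcome Decide(string path, Func<int> ping)
    {
        // On the login route the component neither pings nor redirects.
        if (string.Equals(path, "/login", StringComparison.OrdinalIgnoreCase))
            return PingOutcome.Skip;

        // Only a definite 401 means the session has lapsed; a transient
        // failure (status 0) must not log the user out.
        return ping() == 401 ? PingOutcome.Redirect : PingOutcome.Stay;
    }
}
```

Treating only 401 as "expired" keeps a brief network drop from being mistaken for a logout.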
### CentralUI-026 — `AuditFilterBar` From/To filters treat browser-local datetimes as UTC
|
||
|
||
| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Audit/AuditFilterBar.razor:97-104`; `src/ScadaLink.CentralUI/Components/Audit/AuditQueryModel.cs:56-58,150-178,203-213` |
|
||
|
||
**Description**
|
||
|
||
The new operational Audit Log filter bar binds two `<input type="datetime-local">` controls
|
||
straight to `AuditQueryModel.CustomFromUtc` / `CustomToUtc` (`DateTime?`), and `ToFilter`
|
||
emits those values as `AuditLogQueryFilter.FromUtc` / `ToUtc` without converting from
|
||
the browser's local time zone. A `datetime-local` input yields the user's *browser-local*
|
||
wall-clock value, so for any non-UTC user the audit query window is shifted by their UTC
|
||
offset — returning the wrong rows from the central `AuditLog` table and producing a
|
||
mis-shifted CSV export through `AuditLogPage.BuildExportUrl`, which round-trips the
|
||
filter's `FromUtc`/`ToUtc` straight into `?from=`/`?to=` query params. This is the same
|
||
defect CentralUI-008 fixed for the legacy `Components/Pages/Monitoring/AuditLog.razor`
|
||
via the `BrowserTime.LocalInputToUtc(value, _browserUtcOffsetMinutes)` helper — but the
|
||
new Audit Log v2 filter bar does not use that helper, so a Bundle B/C/D/E/F regression
|
||
re-introduced the bug for the page-replacement target. The CLAUDE.md "all timestamps are
|
||
UTC throughout" decision is satisfied at the wire level but violated at the input
|
||
boundary, exactly as the original finding called out.
|
||
|
||
**Recommendation**
|
||
|
||
Fetch the browser offset once via JS interop (mirroring `ConfigurationAuditLog.OnAfterRenderAsync`
|
||
and `AuditLog.razor`'s implementation), pipe both `CustomFromUtc` and `CustomToUtc` through
|
||
`BrowserTime.LocalInputToUtc(value, offsetMinutes)` inside `AuditQueryModel.ToFilter`
|
||
(or in the filter-bar Apply path before calling `ToFilter`), and add a regression test
|
||
that pins the non-UTC behaviour (mirroring `BrowserTimeTests.LocalInputToUtc_NonUtcBrowser_DoesNotEqualNaiveRelabelling`).
|
||
The label "Custom From / To" should also be clarified ("UTC" vs "local") in the UI.
|
||
|
||
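To make the size of the shift concrete, the sketch below compares the naive relabelling with an offset-based conversion. `BrowserTimeSketch.LocalInputToUtc` is an assumed implementation that only borrows the signature quoted above; the real `BrowserTime` helper is not reproduced in this report. It relies on the JavaScript convention that `getTimezoneOffset()` returns UTC minus local time, in minutes.

```csharp
using System;
using System.Globalization;

// The operator picks 09:00 in a datetime-local input; the browser is at UTC-5,
// for which getTimezoneOffset() reports +300.
var picked = new DateTime(2026, 5, 28, 9, 0, 0, DateTimeKind.Unspecified);
const int offsetMinutes = 300;

var naive = DateTime.SpecifyKind(picked, DateTimeKind.Utc); // the buggy shape
var converted = BrowserTimeSketch.LocalInputToUtc(picked, offsetMinutes);

Console.WriteLine(naive.ToString("HH:mm", CultureInfo.InvariantCulture));     // 09:00
Console.WriteLine(converted.ToString("HH:mm", CultureInfo.InvariantCulture)); // 14:00

static class BrowserTimeSketch
{
    // UTC = local wall-clock + getTimezoneOffset() minutes.
    public static DateTime LocalInputToUtc(DateTime localInput, int browserUtcOffsetMinutes) =>
        DateTime.SpecifyKind(localInput.AddMinutes(browserUtcOffsetMinutes), DateTimeKind.Utc);
}
```

For this operator the naive filter window starts five hours too early, which is the row mismatch the finding describes.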
**Resolution (2026-05-28):** Fixed in `AuditFilterBar.razor.cs`. `Apply` now converts the
model's `CustomFromUtc`/`CustomToUtc` with a new `LocalInputToUtc` helper
(`DateTime.SpecifyKind(value, DateTimeKind.Local).ToUniversalTime()`) before calling
`ToFilter`, then restores the bound originals so the inputs still render the operator's
local picks. The conversion uses the server runtime's local time zone, not the browser's.
A follow-up can plumb in the browser offset via JS interop if the central node is ever
deployed in a different time zone from its operators; in every documented deployment
today the central node and its operators share a time zone.
|
||
|
||
### CentralUI-027 — Same UTC misinterpretation in `SiteCallsReport`, `NotificationReport`, and `EventLogs`
|
||
|
||
| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/SiteCalls/SiteCallsReport.razor:74-80`; `src/ScadaLink.CentralUI/Components/Pages/SiteCalls/SiteCallsReport.razor.cs:421-425`; `src/ScadaLink.CentralUI/Components/Pages/Notifications/NotificationReport.razor:75-81,639-640`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/EventLogs.razor:62-73,261-262` |
|
||
|
||
**Description**
|
||
|
||
The same `datetime-local`-treated-as-UTC bug from CentralUI-008 and CentralUI-026 is
|
||
present on three other pages:
|
||
|
||
- `SiteCallsReport.ToUtc` stamps `DateTimeKind.Utc` on the local-input value (`DateTime.SpecifyKind(value.Value, DateTimeKind.Utc)`).
- `NotificationReport.ToUtc` does the same: `new DateTimeOffset(DateTime.SpecifyKind(local.Value, DateTimeKind.Utc))`.
- `EventLogs.FetchPage` emits `new DateTimeOffset(_filterFrom.Value, TimeSpan.Zero)`, which labels the browser-local wall-clock value as UTC (the exact pre-fix shape of CentralUI-008).
|
||
|
||
For any non-UTC operator, every Site-Calls / Notification / Event-Log query is silently
shifted by their UTC offset. The bug has been recreated on every page added after
CentralUI-008 landed: the `BrowserTime` helper exists but is used only by the legacy
Audit Log page and `ConfigurationAuditLog`.
|
||
|
||
**Recommendation**
|
||
|
||
Plumb the browser offset (via `eval` interop or a dedicated JS module, mirroring
|
||
`ConfigurationAuditLog`/`AuditLog.razor`) into each of these pages and route every
|
||
local-input value through `BrowserTime.LocalInputToUtc(value, offsetMinutes)` before
|
||
constructing the wire filter. Add regression tests pinning the non-UTC behaviour for
|
||
at least one representative page so the helper's continued use is enforced.
|
||
|
||
**Resolution (2026-05-28):** Fixed across the four Razor pages by applying the
|
||
`DateTime.SpecifyKind(value, DateTimeKind.Local).ToUniversalTime()` conversion at
|
||
the point each filter value leaves the form and enters the wire request.
|
||
`SiteCallsReport.razor.cs::ToUtc`, `NotificationReport.razor::ToUtc`, and the
|
||
inline `From`/`To` projection in `EventLogs.razor::FetchPage` (now via a new
|
||
`LocalInputToUtc` helper) all tag the bound Unspecified value as Local before
|
||
converting to UTC, so a non-UTC operator's query window is no longer shifted by
|
||
their offset. `AuditFilterBar.razor.cs` was updated under CentralUI-026 with the
|
||
same conversion. Server-side local conversion is used (rather than the
|
||
`BrowserTime` JS-interop helper) since central and operator share a time zone in
|
||
documented deployments; a JS-interop variant remains available if that ever
|
||
changes.
|
||
|
||
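The caveat in this resolution (the result depends on the server's zone, not the operator's) can be illustrated with explicit zones. In the sketch below, `TimeZoneInfo.ConvertTimeToUtc` with a given zone stands in for `DateTime.SpecifyKind(value, DateTimeKind.Local).ToUniversalTime()` evaluated on a server configured for that zone; the custom fixed-offset zone and its id are assumptions made so the example is deterministic.

```csharp
using System;
using System.Globalization;

// The operator (at UTC-5) picks 09:00; the form yields an Unspecified-kind value.
var picked = new DateTime(2026, 5, 28, 9, 0, 0, DateTimeKind.Unspecified);

// Hypothetical fixed-offset zone (no DST rules) matching the operator.
var sameZoneAsOperator = TimeZoneInfo.CreateCustomTimeZone(
    "sketch-utc-minus-5", TimeSpan.FromHours(-5), "UTC-5", "UTC-5");

// Server in the operator's zone: the conversion is correct.
var onMatchingServer = TimeZoneInfo.ConvertTimeToUtc(picked, sameZoneAsOperator);

// Server running in UTC: the conversion degenerates to the original bug.
var onUtcServer = TimeZoneInfo.ConvertTimeToUtc(picked, TimeZoneInfo.Utc);

Console.WriteLine(onMatchingServer.ToString("HH:mm", CultureInfo.InvariantCulture)); // 14:00
Console.WriteLine(onUtcServer.ToString("HH:mm", CultureInfo.InvariantCulture));      // 09:00
```

The fix is therefore only as good as the assumption that central node and operators share a zone; a server deployed in UTC would reproduce the pre-fix shift.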
### CentralUI-028 — `NotificationReport` and `SiteCallsReport` bypass `SiteScopeService` — Deployment role site-scoping defeated on the two new central-mirror pages
|
||
|
||
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Notifications/NotificationReport.razor:2,434,472,502`; `src/ScadaLink.CentralUI/Components/Pages/SiteCalls/SiteCallsReport.razor:2,52-59`; `src/ScadaLink.CentralUI/Components/Pages/SiteCalls/SiteCallsReport.razor.cs:97-110,201,250-251,278-279` |
|
||
|
||
**Description**
|
||
|
||
Both pages are `[Authorize(Policy = RequireDeployment)]` and, per CLAUDE.md "Security &
|
||
Auth", the Deployment role must be site-scoped. CentralUI-002 fixed this for every
|
||
Deployment/Monitoring page that existed at the time by introducing `SiteScopeService`
|
||
and threading `FilterSitesAsync` / `IsSiteAllowedAsync` through the site dropdowns and
|
||
mutating calls. The two new central-mirror pages — Notification Report (Notification
|
||
Outbox queryable list) and Site Calls Report (Site Call Audit queryable list) — do NOT
|
||
inject `SiteScopeService`, do NOT filter their Source-Site `<select>` lists (they
|
||
enumerate `await SiteRepository.GetAllSitesAsync()` straight to the dropdown), do NOT
|
||
narrow the query results by permitted site, and do NOT re-check the user's grant
|
||
before relaying Retry/Discard to the owning site. `NotificationReport.RetryNotificationAsync`,
|
||
`NotificationReport.DiscardNotificationAsync`, `SiteCallsReport.RetrySiteCallAsync`,
|
||
and `SiteCallsReport.DiscardSiteCallAsync` all dispatch with the row's `SourceSiteId` /
|
||
`SourceSite` unchecked. A scoped Deployment user can therefore (a) browse every row in
|
||
the central `Notifications` / `SiteCalls` table including those for sites outside their
|
||
grant, (b) submit Retry/Discard URLs hand-crafted from the row metadata, and (c) the
|
||
site relay completes successfully because the CommunicationService only sees the
|
||
row's source-site identifier, not the user's grant. This is a direct regression of the
|
||
CentralUI-002 contract on the two pages that landed after CentralUI-002 was closed.
|
||
|
||
**Recommendation**
|
||
|
||
Inject `SiteScopeService` into both pages; filter the source-site dropdown through
|
||
`FilterSitesAsync`; default the filter to the permitted-site set so a scoped user sees
|
||
only their own rows (or push the predicate into the central query — preferred, so the
|
||
filter cannot be bypassed by URL manipulation); and re-check `IsSiteAllowedAsync` in
|
||
`RetryNotificationAsync`/`DiscardNotificationAsync`/`RetrySiteCallAsync`/`DiscardSiteCallAsync`
|
||
before the CommunicationService call, surfacing a "not permitted for this site" toast
|
||
on failure (mirroring `ParkedMessages.razor`'s `SelectedSiteIsPermitted` guard).
|
||
Add `Site_ScopedDeploymentUser_OnlySeesPermittedRows` and
|
||
`Site_ScopedDeploymentUser_CannotRetryRowOnNonPermittedSite` regression tests modelled
|
||
on `TopologyPageTests.SiteScoping_*`.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-28 (commit pending). Both pages now inject `SiteScopeService` and apply
|
||
three layers of restriction. (1) `OnInitializedAsync` keeps an unfiltered `_allSites`
|
||
list as the source of truth for site-identifier → Site.Id lookups, runs the dropdown
|
||
through `SiteScope.FilterSitesAsync`, and caches `IsSystemWideAsync` + permitted-site
|
||
ids so the row-level filter is synchronous. (2) The query response is run through a new
|
||
`FilterPermittedAsync` helper that drops any row whose `SourceSiteId` / `SourceSite`
|
||
resolves (via the unfiltered list) to a Site.Id outside the permitted set — a stale
|
||
source-site identifier not present in the loaded list defaults to allowed, mirroring
|
||
the existing tolerance for deleted-site rows. (3) `RetryNotification` /
|
||
`DiscardNotification` / `RetrySiteCall` / `DiscardSiteCall` each re-check
|
||
`IsRowSiteAllowedAsync` against the row's site BEFORE relaying, surfacing
|
||
"You are not permitted to act on …" via toast on failure. Cross-module partner
|
||
Security-017 was resolved in the same batch (the dead `SiteScopeAuthorizationHandler`
|
||
was deleted; `SiteScopeService` is now documented as the sole site-scoping mechanism).
|
||
Regression test `SiteCallsReportPageTests.SiteScoping_ScopedDeploymentUser_HidesOutOfScopeRows`
|
||
seeds a Deployment user with a single `SiteId=1` claim, asserts only the Plant-A row
|
||
renders, and verifies the Plant-B row is dropped (the page's row count drops from 2 to
|
||
1). All three existing report-page test fixtures register `SiteScopeService` so the
|
||
default system-wide path is unaffected — the full `ScadaLink.CentralUI.Tests` suite
|
||
still passes (568 / 568).
|
||
|
||
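The row-level filter in layer (2) can be sketched as follows. `Row`, `SiteScopeSketch`, and the site identifiers are hypothetical; the real `FilterPermittedAsync` works against the page's own row types and is not reproduced here. The sketch keeps the behaviour stated above, where an identifier missing from the unfiltered site list defaults to allowed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Unfiltered lookup: site identifier -> Site.Id.
var allSites = new Dictionary<string, int> { ["plant-a"] = 1, ["plant-b"] = 2 };
var permitted = new HashSet<int> { 1 }; // user holds a single SiteId=1 grant

var rows = new[]
{
    new Row(10, "plant-a"),
    new Row(11, "plant-b"),
    new Row(12, "deleted-site"), // stale identifier, not in the site list
};

var visible = SiteScopeSketch.FilterPermitted(rows, allSites, permitted, isSystemWide: false);
Console.WriteLine(string.Join(",", visible.Select(r => r.Id))); // 10,12

record Row(int Id, string SourceSite);

static class SiteScopeSketch
{
    public static List<Row> FilterPermitted(
        IEnumerable<Row> rows,
        IReadOnlyDictionary<string, int> allSites,
        ISet<int> permittedSiteIds,
        bool isSystemWide)
    {
        // System-wide users see everything.
        if (isSystemWide) return rows.ToList();

        // Keep a row if its site is unknown (tolerated) or within the grant.
        return rows
            .Where(r => !allSites.TryGetValue(r.SourceSite, out var siteId)
                        || permittedSiteIds.Contains(siteId))
            .ToList();
    }
}
```

Resolving against the unfiltered list matters: if the lookup used the already-filtered dropdown, every out-of-scope row would look like a "deleted site" and slip through the default-allow branch.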
### CentralUI-029 — `ConfigurationAuditLog` uses `JS.InvokeAsync<int>("eval", ...)` instead of a dedicated JS module
|
||
|
||
| | |
|--|--|
| Severity | Low |
| Category | Code organization & conventions |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Audit/ConfigurationAuditLog.razor:248-263` |
|
||
|
||
**Description**
|
||
|
||
`OnAfterRenderAsync` fetches the browser's UTC offset with
|
||
`JS.InvokeAsync<int>("eval", "new Date().getTimezoneOffset()")`. Calling `eval` over
|
||
JS interop is a code-smell: it widens the JS-interop attack surface (any future
|
||
attacker who can influence the second argument runs arbitrary JS), it is brittle
|
||
under stricter Content-Security-Policy headers (CSP `script-src` directives commonly
|
||
forbid `unsafe-eval`), and it bypasses the existing module-import pattern the rest
|
||
of the module follows (`session-expiry.js`, `audit-grid.js`, `nav-state.js`,
|
||
`transport.js` are all loaded as `IJSObjectReference` modules). The legacy
|
||
`AuditLog.razor` (CentralUI-008 fix) and the planned helper exist precisely to avoid
|
||
this. Today the eval text is a static string so there is no live bug; the issue is
|
||
that the pattern invites a future caller to compose the argument from page state.
|
||
|
||
**Recommendation**
|
||
|
||
Move the offset lookup into a small wwwroot JS module (e.g.
|
||
`wwwroot/js/browser-time.js` exporting `getTimezoneOffsetMinutes()`) and `import` it
|
||
via `IJSObjectReference` like the other helpers. Replace the `eval` call with
|
||
`module.InvokeAsync<int>("getTimezoneOffsetMinutes")`. The fix is local and removes
|
||
a residual eval surface; the same module can host the rest of the `BrowserTime`
|
||
plumbing CentralUI-027 will need.
|
||
|
||
### CentralUI-030 — `SandboxConsoleCapture`'s per-call `StringWriter` is not thread-safe under intra-script concurrency
|
||
|
||
| | |
|--|--|
| Severity | Low |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/SandboxConsoleCapture.cs:31-118`; `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:401-404` |
|
||
|
||
**Resolution (2026-05-28):** Wrapped every `Write`/`WriteLine` override in `SandboxConsoleCapture` through a `WriteSynchronized` helper that takes a `lock` on the current `AsyncLocal` capture buffer before writing — concurrent `Console.WriteLine` calls from a script's `Task.WhenAll`/`Task.Run` fan-out now serialise on the buffer instance, so the `StringBuilder` underneath can no longer be corrupted. The fall-through to the unwrapped `_fallback` writer is unlocked because the BCL's process-wide `Console.Out` is already synchronised. Different capture scopes have different lock targets, so two unrelated sandbox runs never block each other. New regression test `SandboxConsoleCaptureTests.BeginCapture_ConcurrentWritesFromTasks_DoNotCorruptBuffer` drives 32 tasks × 50 lines each through one capture scope and asserts every line is intact in the buffer.
|
||
|
||
**Description**
|
||
|
||
CentralUI-003 correctly routed console capture through an `AsyncLocal<StringWriter?>`
|
||
so concurrent Test Runs cannot cross-contaminate. `BeginCapture` flows the capture
|
||
buffer through the call-tree, and `Target` reads it on every `Write`. But a single
|
||
script execution can still write to its captured `StringWriter` from multiple threads
|
||
within one call-tree: the script trust model allows `System.Threading.Tasks`, so a
|
||
user script can `await Task.WhenAll(t1, t2, t3)` where each task is `Task.Run(() => Console.WriteLine(...))`,
|
||
and `_current.Value` flows into each `Task.Run`. The capture buffer is a plain
|
||
`StringWriter` (`captured = new StringWriter()` in `RunInSandboxAsync`), which is
|
||
**not** thread-safe — concurrent `WriteLine` calls can throw or interleave
|
||
character-level. The Akka/gRPC-thread race fixed by CentralUI-003 is gone, but the
|
||
intra-script-concurrency race is a residual hazard for any script that exercises
|
||
parallel tasks (a realistic shape for a Test Run that calls multiple `External.Call`s
|
||
concurrently). Severity is Low because the symptom is a corrupted ConsoleOutput
|
||
string, not a security/data-loss issue, and the script must opt into Task-based
|
||
concurrency to trigger it.
|
||
|
||
**Recommendation**
|
||
|
||
Wrap the capture buffer with `TextWriter.Synchronized(new StringWriter())` (the
|
||
BCL's purpose-built thread-safe wrapper), or hold a lock inside `SandboxConsoleCapture.Write*`
|
||
on the current scope's `StringWriter`. Add a focused test that runs `await Task.WhenAll(...)`
|
||
with `Console.WriteLine` in each task and asserts the resulting `ConsoleOutput` has
|
||
the expected line count regardless of thread interleaving.
|
||
|
||
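The first option above (`TextWriter.Synchronized`) can be exercised in isolation. This sketch uses the same 32 tasks × 50 lines shape as the regression test named in the resolution, but it is an independent illustration: it does not involve `SandboxConsoleCapture`, and the line format is made up for the example.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

var buffer = new StringWriter();
var synced = TextWriter.Synchronized(buffer); // thread-safe wrapper over the buffer

// 32 concurrent writers, 50 lines each, all targeting one capture buffer.
await Task.WhenAll(Enumerable.Range(0, 32).Select(t => Task.Run(() =>
{
    for (var i = 0; i < 50; i++) synced.WriteLine($"task {t} line {i}");
})));

var lines = buffer.ToString()
    .Split(Environment.NewLine, StringSplitOptions.RemoveEmptyEntries);

Console.WriteLine(lines.Length); // 1600
Console.WriteLine(lines.All(l => l.StartsWith("task ") && l.Contains(" line "))); // True
```

Writing to `buffer` directly from the same tasks removes the guarantee: the underlying `StringBuilder` is not safe for concurrent appends.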
### CentralUI-031 — `TransportImport` buffers the full bundle bytes in component state
|
||
|
||
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Design/TransportImport.razor.cs:72,104-142,160-161` |
|
||
|
||
**Resolution (2026-05-28):** Replaced the `private byte[]? _bundleBytes` field with `private string? _bundleTempPath`. `OnFileSelectedAsync` now creates `Path.GetTempPath()/scadalink-transport-staging/` (created on first use) and streams the upload via `InputFile.OpenReadStream(maxBytes).CopyToAsync(FileStream)` straight to a `Guid.NewGuid():N + .scadabundle` temp file; `TryLoadAsync` opens the same path as a fresh `FileStream` for each `IBundleImporter.LoadAsync` call. The component now implements `IDisposable` and a `DeleteBundleTempFile()` helper that runs on `ResetSessionState`, `OnFileSelectedAsync` (before a new upload), and `Dispose` (circuit teardown); IO failures during cleanup are swallowed so audit-failure-style defensive semantics hold. Per-circuit working set drops from up to `MaxBundleSizeMb` (default 100 MB) per open wizard to the 80 KB FileStream buffer. The existing reflection-based test helper `SeedAtPassphraseStep` was migrated to write bytes to a real temp file and set `_bundleTempPath`, so the 7 existing TransportImport bUnit tests still pass against the new staging model.
|
||
|
||
**Description**
|
||
|
||
`OnFileSelectedAsync` reads the uploaded `.scadabundle` into a `MemoryStream`,
|
||
calls `ms.ToArray()`, and stores the byte array on the component as
|
||
`private byte[]? _bundleBytes`. The bytes live on the Blazor circuit for the
|
||
lifetime of the wizard — through the passphrase step, the diff step (which can
|
||
take an arbitrary amount of operator time on a large bundle), the confirm step,
|
||
and the apply step — and are only cleared in `ResetSessionState` (Done /
|
||
re-upload). For an operator who walks away from the diff step mid-review, the
|
||
configured `MaxBundleSizeMb` (default not enforced here; only the file-size
|
||
check on read) worth of bytes stays pinned on the central node's heap per
|
||
open circuit. The page has no `IDisposable` to clear the bytes on tear-down
|
||
either. Severity is Low because the cap is checked at upload time and Import
|
||
is Admin-only (limited concurrent users), but the lifetime is longer than the
|
||
strictly-needed retention.
|
||
|
||
**Recommendation**
|
||
|
||
Stream the bundle to a temp file (or to the `IBundleImporter`'s session store)
|
||
rather than caching it on the component. Failing that, implement `IDisposable`
|
||
on `TransportImport` and clear `_bundleBytes` (`Array.Clear` for sensitivity)
|
||
on dispose; also clear the cached passphrase string. Tighten `MaxBundleSizeMb`
|
||
docs to call out the in-memory cost per concurrent import session.
|
||
|
||
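
The staging model described in the CentralUI-031 resolution can be sketched roughly as follows. This is a minimal, self-contained illustration and not the component's actual code: the `BundleStaging` class and its `StageAsync` / `OpenStaged` members are hypothetical names, and a plain `Stream` stands in for the Blazor upload stream. Only the directory name, the file-name pattern, and the swallow-on-cleanup behaviour are taken from the resolution text.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper illustrating temp-file staging; not the real TransportImport code.
public sealed class BundleStaging : IDisposable
{
    private string? _bundleTempPath;

    public string? BundleTempPath => _bundleTempPath;

    // Streams the upload to a fresh temp file instead of buffering it in a byte[].
    public async Task StageAsync(Stream upload)
    {
        DeleteBundleTempFile(); // drop any file left over from a previous upload

        string dir = Path.Combine(Path.GetTempPath(), "scadalink-transport-staging");
        Directory.CreateDirectory(dir); // no-op when the directory already exists

        string path = Path.Combine(dir, $"{Guid.NewGuid():N}.scadabundle");
        _bundleTempPath = path; // recorded first so a failed copy is still cleaned up later

        await using (var file = new FileStream(path, FileMode.CreateNew, FileAccess.Write))
        {
            await upload.CopyToAsync(file); // copies through a fixed-size buffer
        }
    }

    // Each load opens the staged file as a fresh read-only stream.
    public FileStream OpenStaged() =>
        new FileStream(
            _bundleTempPath ?? throw new InvalidOperationException("No bundle staged."),
            FileMode.Open,
            FileAccess.Read);

    // Best-effort cleanup: IO failures are swallowed so teardown never throws.
    public void DeleteBundleTempFile()
    {
        string? path = _bundleTempPath;
        _bundleTempPath = null;
        if (path is null) return;

        try { File.Delete(path); }
        catch (IOException) { }
        catch (UnauthorizedAccessException) { }
    }

    public void Dispose() => DeleteBundleTempFile();
}
```

Under these assumptions the only per-session state held in memory is the path string; the bundle bytes exist on disk until reset, re-upload, or dispose.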
### CentralUI-032 — `AuditResultsGrid` paging is forward-only, no Previous button

| | |
|--|--|
| Severity | Low |
| Category | Design-document adherence |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/Components/Audit/AuditResultsGrid.razor:76-82`; `src/ScadaLink.CentralUI/Components/Audit/AuditResultsGrid.razor.cs:65,196-197,219-220` |

**Description**

The Audit Log results grid (Bundle B / M7-T3) renders a single "Next page" button
and a `Page N · M rows` label, with no Previous control. The design doc says
"Keyset pagination ordered by `(OccurredAtUtc desc, EventId desc)`. Default page
size 100." Keyset paging is naturally forward-only, but a usable audit-triage
workflow needs to step back to the previous page (the `SiteCallsReport` keyset
implementation correctly maintains a `Stack<(...)> _cursorStack` for exactly this).
An operator who clicks Next once and misses a row on the first page cannot return
without re-applying the filter to start a fresh first page. The current shape
also makes the "Page N" label slightly misleading: there is no in-grid affordance
to use it as a navigation target.

**Recommendation**

Mirror the `SiteCallsReport.razor.cs` keyset-paging shape: maintain a
`Stack<(DateTime?, Guid?)> _cursorStack` of previous-page cursors, add a Previous
button gated on `_cursorStack.Count > 0`, push the current cursor on Next, and
pop on Previous. Either implement this or update the design doc to acknowledge
forward-only paging on the Audit Log grid.

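
The cursor-stack shape recommended for CentralUI-032 can be sketched as below. This is an illustrative model only: `AuditRow`, `KeysetPager`, and their members are hypothetical, and an in-memory list stands in for the real keyset query. Only the `Stack<(DateTime?, Guid?)>` type, the `Count > 0` gate, and the `(OccurredAtUtc desc, EventId desc)` ordering come from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record AuditRow(DateTime OccurredAtUtc, Guid EventId);

// Hypothetical pager illustrating the cursor stack; not the real AuditResultsGrid code.
public sealed class KeysetPager
{
    private readonly List<AuditRow> _source;
    private readonly int _pageSize;
    private readonly Stack<(DateTime?, Guid?)> _cursorStack = new();
    private (DateTime?, Guid?) _currentCursor = (null, null); // (null, null) = first page

    public KeysetPager(IEnumerable<AuditRow> source, int pageSize = 100)
    {
        _source = source.OrderByDescending(r => r.OccurredAtUtc)
                        .ThenByDescending(r => r.EventId)
                        .ToList();
        _pageSize = pageSize;
        Rows = Fetch(_currentCursor);
    }

    public IReadOnlyList<AuditRow> Rows { get; private set; }
    public int PageNumber => _cursorStack.Count + 1;
    public bool CanGoPrevious => _cursorStack.Count > 0;

    // A full page suggests more rows may follow; an exact multiple yields one empty page.
    public bool CanGoNext => Rows.Count == _pageSize;

    public void Next()
    {
        if (!CanGoNext) return;
        _cursorStack.Push(_currentCursor); // remember how to rebuild the page being left
        AuditRow last = Rows[Rows.Count - 1];
        _currentCursor = (last.OccurredAtUtc, last.EventId);
        Rows = Fetch(_currentCursor);
    }

    public void Previous()
    {
        if (!CanGoPrevious) return;
        _currentCursor = _cursorStack.Pop();
        Rows = Fetch(_currentCursor);
    }

    // Keyset predicate: rows strictly after the cursor in (OccurredAtUtc desc, EventId desc) order.
    private IReadOnlyList<AuditRow> Fetch((DateTime?, Guid?) cursor)
    {
        IEnumerable<AuditRow> query = _source;
        if (cursor.Item1 is DateTime t && cursor.Item2 is Guid id)
        {
            query = query.Where(r => r.OccurredAtUtc < t ||
                                     (r.OccurredAtUtc == t && r.EventId.CompareTo(id) < 0));
        }
        return query.Take(_pageSize).ToList();
    }
}
```

In this sketch the stack depth doubles as the page number, so the "Page N" label and the Previous gate are driven by the same state.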
### CentralUI-033 — Drill-in / query-string code paths for the new Transport + SiteCalls pages are untested

| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Design/TransportImport.razor.cs:97-238,267-319`; `src/ScadaLink.CentralUI/Components/Pages/SiteCalls/SiteCallsReport.razor.cs:107-148`; `tests/ScadaLink.CentralUI.Tests/Pages/Design/TransportImportPageTests.cs`; `tests/ScadaLink.CentralUI.Tests/Pages/SiteCallsReportPageTests.cs` |

**Description**

The CentralUI-025 lesson ("a critical drill-in/redirect path was untested, so the
CentralUI-020 defect was not caught") applies again to the two newest pages.
`SiteCallsReport.ApplyQueryStringFilters` parses `?status=` and `?stuck=true` to
seed the filters from a Health-dashboard KPI tile drill-in; there is no test that
pins this seeding (an unrecognised status, a missing param, the case-insensitive
match). `TransportImport` has a 5-step state machine and a 3-strike passphrase
lockout, both with intricate transition logic
(`GoFromUploadAsync` re-trying `LoadAsync`, the `_failedUnlockAttempts` reset on
success, the audit-row write on failure), yet none of the step-machine transition
paths or the lockout reset / lockout-trip behaviours are pinned by tests. The
existing `TransportImportPageTests` exercise rendering shapes, not the lifecycle.

**Recommendation**

Add bUnit tests for `SiteCallsReport.ApplyQueryStringFilters` covering valid /
invalid / case-mismatched `?status=` values and the `?stuck=true` toggle, and
add `TransportImport` lifecycle tests covering: an encrypted-bundle upload
advances to Step 2 without opening a session; a wrong passphrase increments the
counter and writes the `BundleImportUnlockFailed` audit row; the lockout resets
the wizard to Step 1 once `MaxUnlockAttemptsPerSession` is reached; a successful
unlock resets the counter and advances to Step 3.

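
The lockout transitions listed in the CentralUI-033 recommendation can be pinned down as plain assertions before they are wired into bUnit. The sketch below is a stand-in built on assumptions: `ImportWizardModel`, `ImportWizardChecks`, and their members are hypothetical, a boolean replaces the real passphrase check, a string list replaces the audit sink, and clearing the counter on lockout is assumed rather than confirmed. Real tests would drive the `TransportImport` component itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the wizard's unlock transitions; not the real component.
public sealed class ImportWizardModel
{
    public const int MaxUnlockAttemptsPerSession = 3; // the "3-strike" limit from the description

    public int Step { get; private set; } = 1;
    public int FailedUnlockAttempts { get; private set; }
    public List<string> AuditRows { get; } = new();

    // An encrypted bundle moves the wizard to the passphrase step (Step 2).
    public void UploadEncryptedBundle() => Step = 2;

    public void SubmitPassphrase(bool passphraseIsCorrect)
    {
        if (Step != 2) throw new InvalidOperationException("Not at the passphrase step.");

        if (passphraseIsCorrect)
        {
            FailedUnlockAttempts = 0; // success resets the counter
            Step = 3;
            return;
        }

        FailedUnlockAttempts++;
        AuditRows.Add("BundleImportUnlockFailed");

        if (FailedUnlockAttempts >= MaxUnlockAttemptsPerSession)
        {
            // Assumption: the lockout clears the counter along with the session.
            FailedUnlockAttempts = 0;
            Step = 1;
        }
    }
}

public static class ImportWizardChecks
{
    public static void Run()
    {
        var wizard = new ImportWizardModel();
        wizard.UploadEncryptedBundle();
        Expect(wizard.Step == 2, "upload advances to Step 2");

        wizard.SubmitPassphrase(false);
        Expect(wizard.FailedUnlockAttempts == 1, "wrong passphrase increments the counter");
        Expect(wizard.AuditRows.Count == 1, "wrong passphrase writes one audit row");

        wizard.SubmitPassphrase(true);
        Expect(wizard.Step == 3 && wizard.FailedUnlockAttempts == 0, "success resets and advances");

        var locked = new ImportWizardModel();
        locked.UploadEncryptedBundle();
        for (int i = 0; i < ImportWizardModel.MaxUnlockAttemptsPerSession; i++)
            locked.SubmitPassphrase(false);
        Expect(locked.Step == 1, "lockout returns the wizard to Step 1");
        Expect(locked.AuditRows.Count == 3, "each failed attempt is audited");
    }

    private static void Expect(bool condition, string message)
    {
        if (!condition) throw new Exception("Check failed: " + message);
    }
}
```

Each `Expect` call corresponds to one of the lifecycle cases named in the recommendation, so the list can serve as a checklist when writing the component-level tests.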