scadalink-design/code-reviews/CentralUI/findings.md
Joseph Doherty 977d7369a7 docs: add code review process and baseline review of all 19 modules
Establishes a per-module code review workflow under code-reviews/ and
records the 2026-05-16 baseline review (commit 9c60592): 241 findings
across all src/ modules (6 Critical, 46 High, 100 Medium, 89 Low).
This is the clean starting point for remediation work.
2026-05-16 18:09:09 -04:00

Code Review — CentralUI

Field Value
Module src/ScadaLink.CentralUI
Design doc docs/requirements/Component-CentralUI.md
Status Reviewed
Last reviewed 2026-05-16
Reviewer claude-agent
Commit reviewed 9c60592
Open findings 19

Summary

The Central UI is a sizeable, generally well-structured Blazor Server module: custom Bootstrap components only (no third-party UI frameworks, as required), consistent list/form page patterns, careful disposal in most components, and a thoughtful Roslyn-backed script editor. The most serious problem is the Test Run sandbox (ScriptAnalysisService.RunInSandboxAsync): it compiles and executes arbitrary user C# in the central process with no enforcement of the documented script trust model — the forbidden-API list is only a Monaco editor diagnostic, never applied before execution — so a Design user can run System.IO/Process/Reflection code on the central node. Several other themes recur: (1) per-circuit security drift — site-scoped Deployment claims are written at login but never read, so site scoping is not enforced anywhere; (2) Blazor render-thread and disposal hazards — background Timer / Task.Delay callbacks and stream callbacks touch component state and @ref children that may already be disposed; (3) process-global mutation (Console.SetOut) shared across concurrent circuits; (4) drift from the design doc on session expiry and on the "deployment status pushes via SignalR" claim (the page actually polls). Testing coverage is thin for a module this large: only the script analyzer, TreeView, schema model, and a few data-connection pages have unit tests; most pages and the auth bridge are untested.

Checklist coverage

# Category Examined Notes
1 Correctness & logic bugs DebugView cap logic, audit-log timezone, toast race — see findings.
2 Akka.NET conventions Module is mostly UI; DebugStreamService actor usage reviewed (in Communication but driven from here). No actor-convention violations in CentralUI proper.
3 Concurrency & thread safety Console.SetOut global mutation, stream/timer callbacks on non-render threads, toast _ = Task.Delay.
4 Error handling & resilience Broad catch {} swallowing, dangling TaskCompletionSource on dialog disposal.
5 Security Sandbox not enforcing trust model (Critical); site scoping never enforced; auth bridge reads stale HttpContext; logout CSRF.
6 Performance & resource management N+1 site-connection query, repeated FilteredMessages recomputation, full-page paginators rendering all page buttons.
7 Design-document adherence Session expiry diverges from "15-min sliding + 30-min idle"; Deployments polls despite "push via SignalR"; nav exposes Deployment-only pages to all roles.
8 Code organization & conventions Generally good; options classes absent (no appsettings binding here); no major violations.
9 Testing coverage Auth, sandbox-run, DebugView, Health, ParkedMessages, most pages untested.
10 Documentation & comments Comments are accurate and helpful; a few stale claims noted.

Findings

CentralUI-001 — Test Run sandbox executes arbitrary C# with no trust-model enforcement

Severity Critical
Category Security
Status Open
Location src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:171-424

Description

RunInSandboxAsync compiles user-supplied script code with CSharpScript.Create and executes it (script.RunAsync) directly inside the central process. The "sandbox" applies only a wall-clock timeout and an output-size cap. It does not enforce the documented script trust model: the forbidden-API set (System.IO, System.Diagnostics/Process, System.Reflection, System.Net, threading) is checked only in FindForbiddenApiUsages, which feeds Monaco editor diagnostics — it is never consulted before RunInSandboxAsync executes. DefaultOptions references typeof(object).Assembly (the full BCL), so a Design-role user can submit System.IO.File.WriteAllText(...), System.Diagnostics.Process.Start(...), reflection, or raw socket code via POST /api/script-analysis/run and it runs with the central host process's full privileges. The endpoint is gated only by RequireDesign. This is a remote code execution path on the central cluster node.

Recommendation

Before executing, run the same forbidden-API analysis used for diagnostics and reject any script with a SCADA001/SCADA002 (severity-8) marker; additionally restrict the compilation's metadata references to the curated script API surface, and ideally execute in an isolated AssemblyLoadContext/process with constrained permissions. Treat the trust model as an execution-time gate, not an editor hint.
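
A minimal sketch of the execution-time gate described above, assuming the analyzer's findings can be handed over as a simple list. `ScriptMarker`, `ForbiddenApiException`, and `SandboxGate` are hypothetical names introduced for illustration; only the SCADA001/SCADA002 codes come from this review.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one analyzer marker; the real type used by
// ScriptAnalysisService may differ.
public sealed record ScriptMarker(string Code, int Severity, string Message);

public sealed class ForbiddenApiException : Exception
{
    public ForbiddenApiException(string message) : base(message) { }
}

public static class SandboxGate
{
    // Marker codes named in this finding as forbidden-API diagnostics.
    private static readonly HashSet<string> BlockingCodes =
        new(StringComparer.Ordinal) { "SCADA001", "SCADA002" };

    // Throws when any blocking marker is present, so the caller never
    // reaches the compile-and-run step for a rejected script.
    public static void EnsureAllowed(IEnumerable<ScriptMarker> markers)
    {
        var blocked = markers.Where(m => BlockingCodes.Contains(m.Code)).ToList();
        if (blocked.Count == 0) return;

        var details = string.Join("; ", blocked.Select(m => $"{m.Code}: {m.Message}"));
        throw new ForbiddenApiException($"Script rejected before execution: {details}");
    }
}
```

A gate like this would be called with the forbidden-API analysis results at the top of the run path. It complements, rather than replaces, restricting metadata references and isolating execution, since a text-level check alone can be evaded.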

Resolution

Unresolved.

CentralUI-002 — Site-scoped Deployment permissions are issued but never enforced

Severity High
Category Security
Status Open
Location src/ScadaLink.CentralUI/Auth/AuthEndpoints.cs:63-69; src/ScadaLink.CentralUI/Components/Pages/Deployment/*.razor

Description

Login adds SiteId claims (JwtTokenService.SiteIdClaimType) for non-system-wide Deployment users, and the design doc (Component-CentralUI "Responsibilities" and CLAUDE.md Security & Auth) requires the Deployment role to be site-scoped. A repo-wide search shows the SiteId claim is written at login and never read anywhere in CentralUI. Deployment pages — DebugView.razor, Deployments.razor, InstanceCreate.razor, InstanceConfigure.razor, Topology.razor, ParkedMessages.razor, EventLogs.razor — list and act on every site with no filtering by the user's permitted sites. A Deployment user scoped to one site can deploy to, debug, and manage instances at any site.

Recommendation

Enforce site scoping: filter site/instance lists by the user's SiteId claims (or treat the absence of SiteId claims as system-wide), and re-check the claim server-side before any mutating cross-site command (deploy, enable/disable/delete, debug stream, parked-message retry/discard). A shared helper that reads the claims from AuthenticationStateProvider and exposes "permitted site ids" would keep this consistent.
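
One possible shape for the shared "permitted site ids" helper, as a sketch only. It assumes site ids are integers and uses a placeholder claim type string; the actual value of `JwtTokenService.SiteIdClaimType` is not shown in this review. `SiteScope` and its members are hypothetical names.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

public static class SiteScope
{
    // Assumption: stands in for JwtTokenService.SiteIdClaimType.
    public const string SiteIdClaimType = "SiteId";

    // Returns null when the user carries no SiteId claims (treated as
    // system-wide); otherwise the set of site ids the user may act on.
    public static IReadOnlySet<int>? PermittedSiteIds(ClaimsPrincipal user)
    {
        var claims = user.FindAll(SiteIdClaimType).ToList();
        if (claims.Count == 0) return null;

        // Unparseable claim values are dropped, so a user whose claims are
        // all malformed ends up with an empty set rather than full access.
        var ids = new HashSet<int>();
        foreach (var claim in claims)
        {
            if (int.TryParse(claim.Value, out var id)) ids.Add(id);
        }
        return ids;
    }

    // Server-side check to run before any mutating cross-site command.
    public static bool CanAccessSite(ClaimsPrincipal user, int siteId)
    {
        var permitted = PermittedSiteIds(user);
        return permitted is null || permitted.Contains(siteId);
    }
}
```

In a page, the principal would come from the `AuthenticationStateProvider`; list queries would filter on `PermittedSiteIds` and each command handler would call `CanAccessSite` again rather than trusting the filtered list.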

Resolution

Unresolved.

CentralUI-003 — Console.SetOut/SetError mutates process-global state across concurrent circuits

Severity High
Category Concurrency & thread safety
Status Open
Location src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:359-423

Description

RunInSandboxAsync redirects Console.Out/Console.Error to a per-call StringWriter, runs the script, then restores them in finally. Console.Out is process-global. If two users (two Blazor circuits) run Test Run concurrently, their captured outputs interleave or cross over, and the finally of whichever finishes first restores Console.Out to the original writer while the other run is still executing — so the second run's script output is lost or written to the real console. RunInSandboxAsync is async and the script runs on a thread-pool thread, so concurrent execution is fully expected.

Recommendation

Do not redirect process-global Console. Provide console capture through the script globals surface (e.g. a TextWriter exposed on SandboxScriptHost that the sandbox API writes to), or serialize Test Run executions with a semaphore if global redirection must be kept. Capturing per-call without global mutation is the correct fix.
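
The per-call capture can be sketched as follows. This is an illustration under assumptions: the real `SandboxScriptHost` surface is not shown in this review, `Print` is a hypothetical member, and the `Func<SandboxScriptHost, Task>` delegate stands in for the compiled user script.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical globals object exposing a per-run writer to the script.
public sealed class SandboxScriptHost
{
    public SandboxScriptHost(TextWriter output) => Output = output;

    public TextWriter Output { get; }

    // Hypothetical convenience member the sandbox API could expose.
    public void Print(string line) => Output.WriteLine(line);
}

public static class SandboxRunner
{
    // Each call owns its own StringWriter, so concurrent runs cannot see
    // or restore each other's output; Console.Out is never touched.
    public static async Task<string> RunCapturedAsync(Func<SandboxScriptHost, Task> script)
    {
        using var buffer = new StringWriter();
        var host = new SandboxScriptHost(buffer);
        await script(host);
        return buffer.ToString();
    }
}
```

With this shape, a script that writes to `Console` directly is simply not captured, which may be acceptable if the documented script API routes output through the host.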

Resolution

Unresolved.

CentralUI-004 — CookieAuthenticationStateProvider reads HttpContext for the life of the circuit

Severity High
Category Security
Status Open
Location src/ScadaLink.CentralUI/Auth/CookieAuthenticationStateProvider.cs:22-28

Description

GetAuthenticationStateAsync returns _httpContextAccessor.HttpContext?.User. In Blazor Server, HttpContext is only valid during the initial HTTP request that establishes the circuit; for the lifetime of the long-lived SignalR circuit IHttpContextAccessor.HttpContext is null (or, worse, a stale/foreign context if the accessor's AsyncLocal leaks). Any later call to GetAuthenticationStateAsync — e.g. an <AuthorizeView> re-evaluating, or pages that call it directly (Sites.razor, Templates.razor) — then sees an unauthenticated principal and may render the wrong UI, or returns a stale identity that never reflects role changes. The class derives from ServerAuthenticationStateProvider, which is designed to be seeded once via SetAuthenticationState; overriding GetAuthenticationStateAsync to read HttpContext defeats that design.

Recommendation

Capture the authenticated principal once when the circuit is created (e.g. via the root component / AuthenticationStateProvider seeding pattern used by the Blazor Web App template) and store it on the scoped provider, instead of reading IHttpContextAccessor on every call. Do not depend on HttpContext after the circuit is established.

Resolution

Unresolved.

CentralUI-005 — Session expiry implementation diverges from the documented policy

Severity Medium
Category Design-document adherence
Status Open
Location src/ScadaLink.CentralUI/Auth/AuthEndpoints.cs:47-81; src/ScadaLink.CentralUI/Components/Shared/SessionExpiry.razor:18-30

Description

CLAUDE.md (Security & Auth) specifies "15-minute expiry with sliding refresh, 30-minute idle timeout." AuthEndpoints instead sets a single fixed expires_at = UtcNow + 30 minutes claim and a 30-minute cookie ExpiresUtc, with no sliding refresh and no separate idle vs absolute timeout. SessionExpiry.razor schedules a single hard redirect at that fixed time. The result is a hard 30-minute cap with no sliding renewal — an active user is logged out mid-session, and there is no 15-minute component at all.

Recommendation

Either implement the documented policy (sliding 15-minute token with refresh on activity, plus a 30-minute idle cutoff) or update the design docs to match the fixed 30-minute model. The code and the documented decision must agree.

Resolution

Unresolved.

CentralUI-006 — Deployment status page polls every 10s despite the documented SignalR-push design

Severity Medium
Category Design-document adherence
Status Open
Location src/ScadaLink.CentralUI/Components/Pages/Deployment/Deployments.razor:196-216

Description

Component-CentralUI "Real-Time Updates" states: "Deployment status: Pending/in-progress/success/failed transitions push to the UI immediately via SignalR (built into Blazor Server). No polling required for deployment tracking." Deployments.razor instead runs a Timer that reloads all deployment records and instance names from the database every 10 seconds. This is a full N-record + instance-map reload per tick for every open circuit, and contradicts the design. It also re-issues two repository round-trips on each tick regardless of whether anything changed.

Recommendation

Implement push-based updates (an injected event/observable raised by the Deployment Manager that the page subscribes to and renders via InvokeAsync(StateHasChanged)), or amend the design doc to acknowledge polling. If polling is kept as a fallback, fetch only changed/in-progress records.

Resolution

Unresolved.

CentralUI-007 — Nav and page authorization expose Deployment-only pages to all roles

Severity Medium
Category Correctness & logic bugs
Status Open
Location src/ScadaLink.CentralUI/Components/Layout/NavMenu.razor:69-78; src/ScadaLink.CentralUI/Components/Pages/Monitoring/EventLogs.razor:2; src/ScadaLink.CentralUI/Components/Pages/Monitoring/ParkedMessages.razor:2

Description

NavMenu renders the "Event Logs" and "Parked Messages" links inside the all-authenticated-users Monitoring section. The design doc classifies both the Site Event Log Viewer and Parked Message Management as Deployment Role. Two inconsistencies result: (a) an Admin- or Design-only user sees nav links they cannot use; (b) the pages themselves are annotated only [Authorize] (any authenticated user), not [Authorize(Policy = RequireDeployment)], so a non-Deployment user who follows the link is not blocked — they can query site event logs and retry/discard parked messages. The authorization attribute and the nav visibility both contradict the design.

Recommendation

Add [Authorize(Policy = AuthorizationPolicies.RequireDeployment)] to EventLogs.razor and ParkedMessages.razor, and move their nav links into a <AuthorizeView Policy="RequireDeployment"> block (consistent with the Topology / Deployments / Debug View links). Confirm Health Dashboard is intentionally all-roles (it is, per the design).

Resolution

Unresolved.

CentralUI-008 — Audit-log date filters treat browser-local datetimes as UTC

Severity Medium
Category Correctness & logic bugs
Status Open
Location src/ScadaLink.CentralUI/Components/Pages/Monitoring/AuditLog.razor:242-243

Description

The From/To filters bind <input type="datetime-local"> to DateTime? fields. A datetime-local input yields the value the user typed in their browser-local time zone. FetchPage converts them with new DateTimeOffset(_filterFrom.Value, TimeSpan.Zero) — i.e. it labels the local wall-clock value as UTC. For any non-UTC user the audit query window is shifted by their UTC offset, silently returning the wrong rows. CLAUDE.md mandates UTC throughout, but that requires converting the local input to UTC, not relabelling it.

Recommendation

Convert the picked local time to UTC before querying — capture the browser offset (JS interop) and apply it, or document the inputs as UTC and label them in the UI. The same issue should be checked in EventLogs.razor if it has time-range filters.
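
A sketch of the conversion, assuming the browser offset is obtained through JS interop from `Date.prototype.getTimezoneOffset()`, which returns UTC minus local time in minutes (so it is positive west of UTC). `LocalInputTime` is a hypothetical helper name.

```csharp
using System;

public static class LocalInputTime
{
    // localWallClock: the value bound from <input type="datetime-local">.
    // jsTimezoneOffsetMinutes: result of getTimezoneOffset() in the browser.
    public static DateTimeOffset ToUtc(DateTime localWallClock, int jsTimezoneOffsetMinutes)
    {
        // Strip any Kind so the DateTimeOffset constructor accepts an
        // arbitrary offset instead of validating against the server's zone.
        var unspecified = DateTime.SpecifyKind(localWallClock, DateTimeKind.Unspecified);

        // JavaScript reports UTC - local, .NET expects local - UTC.
        var browserOffset = TimeSpan.FromMinutes(-jsTimezoneOffsetMinutes);

        return new DateTimeOffset(unspecified, browserOffset).ToUniversalTime();
    }
}
```

For example, a user at UTC-5 (`getTimezoneOffset()` returns 300) who picks 09:00 gets a query bound of 14:00 UTC. One caveat: the offset should ideally be computed in the browser for the selected date rather than for "now", because daylight saving can make the two differ.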

Resolution

Unresolved.

CentralUI-009 — DebugView stream callbacks touch a possibly-disposed ToastNotification

Severity Medium
Category Concurrency & thread safety
Status Open
Location src/ScadaLink.CentralUI/Components/Pages/Deployment/DebugView.razor:400-409,538-544

Description

The onTerminated callback passed to DebugStreamService.StartStreamAsync captures _toast and this and runs on an Akka/gRPC thread. If the user navigates away, Dispose() calls StopStream, but a stream-termination event already in flight can still invoke onTerminated, which calls _toast.ShowError(...) and StateHasChanged() on a disposed component. The component does not guard callbacks with a disposed flag or a CancellationTokenSource. The same applies to the onEvent callbacks at lines 391-398 that call InvokeAsync(StateHasChanged).

Recommendation

Track a _disposed flag or a CancellationTokenSource on the component, check it at the top of every stream callback, and stop the stream synchronously before marking disposed. InvokeAsync after disposal can throw ObjectDisposedException (for example once the circuit itself has been torn down), so the callbacks should no-op once disposed.

Resolution

Unresolved.

CentralUI-010 — ToastNotification auto-dismiss continuation runs after component disposal

Severity Medium
Category Error handling & resilience
Status Open
Location src/ScadaLink.CentralUI/Components/Shared/ToastNotification.razor:62-71,90

Description

AddToast schedules Task.Delay(dismissMs).ContinueWith(...) with the result discarded (_ =). The continuation calls InvokeAsync(StateHasChanged). If the host page is disposed before the 5-second delay elapses (common: navigating away right after an action), the continuation runs against a disposed component, and InvokeAsync can throw ObjectDisposedException on a thread-pool thread with no catch, producing an unobserved task exception. Dispose() is an empty body and cancels nothing.

Recommendation

Hold a CancellationTokenSource, pass its token to Task.Delay, cancel it in Dispose(), and guard the continuation. Alternatively wrap the continuation body in a try/catch for ObjectDisposedException.
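
The cancellation pattern can be sketched outside Blazor as below. `AutoDismissScheduler` is a hypothetical stand-in for the relevant part of `ToastNotification`; the `onDismiss` delegate represents "remove the toast and call `InvokeAsync(StateHasChanged)`".

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class AutoDismissScheduler : IDisposable
{
    private readonly CancellationTokenSource _cts = new();

    // Starts the delayed dismissal; the returned task always completes
    // without throwing, so discarding it cannot produce an unobserved
    // task exception.
    public Task ScheduleAsync(TimeSpan delay, Func<Task> onDismiss)
        => RunAsync(delay, onDismiss, _cts.Token);

    private static async Task RunAsync(TimeSpan delay, Func<Task> onDismiss, CancellationToken token)
    {
        try
        {
            await Task.Delay(delay, token);
            if (!token.IsCancellationRequested) await onDismiss();
        }
        catch (OperationCanceledException)
        {
            // Component was disposed before the delay elapsed: nothing to do.
        }
        catch (ObjectDisposedException)
        {
            // Render target went away between the check and the callback.
        }
    }

    // Called from the component's Dispose(): cancels all pending dismissals.
    public void Dispose()
    {
        _cts.Cancel();
        _cts.Dispose();
    }
}
```

Using an awaited `Task.Delay` with a token instead of `ContinueWith` keeps the cancellation and exception handling in one place.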

Resolution

Unresolved.

CentralUI-011 — DiffDialog leaves a dangling TaskCompletionSource when disposed while open

Severity Medium
Category Error handling & resilience
Status Open
Location src/ScadaLink.CentralUI/Components/Shared/DiffDialog.razor:89-95,151-157

Description

OpenAsync creates _tcs and returns _tcs.Task to the caller, which typically awaits it. The task is completed only by Close(). If the user navigates away while the dialog is open, DisposeAsync runs but never completes _tcs, so the awaiting caller's continuation never resumes — a permanently suspended Task (and any using/cleanup after the await is skipped). The IDialogService.Confirm/Prompt path has the same shape but at least its host is a single long-lived DialogHost; DiffDialog is per-page.

Recommendation

In DisposeAsync, call _tcs?.TrySetResult(false) (or TrySetCanceled) so any awaiter completes deterministically.
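
A minimal sketch of that fix, reduced to the task-completion logic. `DialogHandle` is a hypothetical stand-in for `DiffDialog`; the real component's fields and signatures may differ.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

public sealed class DialogHandle : IAsyncDisposable
{
    private TaskCompletionSource<bool>? _tcs;

    // Returns a task the caller awaits until the dialog is closed.
    public Task<bool> OpenAsync()
    {
        _tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        return _tcs.Task;
    }

    public void Close(bool confirmed) => _tcs?.TrySetResult(confirmed);

    // Completing the source here guarantees that an awaiter resumes even
    // if the host page is torn down while the dialog is still open.
    // TrySetResult is a no-op when Close already completed the task.
    public ValueTask DisposeAsync()
    {
        _tcs?.TrySetResult(false);
        return default;
    }
}
```

Choosing `TrySetResult(false)` makes disposal look like a cancel click to the caller; `TrySetCanceled()` would instead surface as an exception at the await, which suits callers that distinguish "dismissed" from "page gone".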

Resolution

Unresolved.

CentralUI-012 — N+1 query loading data connections for the Sites page

Severity Medium
Category Performance & resource management
Status Open
Location src/ScadaLink.CentralUI/Components/Pages/Admin/Sites.razor:196-205

Description

LoadDataAsync fetches all sites, then issues SiteRepository.GetDataConnectionsBySiteIdAsync(site.Id) once per site in a loop. With N sites this is N+1 database round-trips on every page load and every post-delete refresh. The connection lists are only used for a small per-card summary.

Recommendation

Add a repository method that returns all data connections (or connections for a set of site ids) in one query and group them client-side, or project the small summary in a single query.
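
The in-memory grouping step might look like the sketch below, assuming a single repository call has already returned every connection. The `Site` and `DataConnection` records here are simplified, hypothetical shapes, not the module's entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical entity shapes for illustration.
public sealed record Site(int Id, string Name);
public sealed record DataConnection(int Id, int SiteId, string Name);

public static class SiteSummaries
{
    // Builds the per-site connection count from one flat list, replacing
    // the per-site repository call inside the loop.
    public static IReadOnlyDictionary<int, int> ConnectionCountsBySite(
        IEnumerable<Site> sites, IEnumerable<DataConnection> allConnections)
    {
        // ToLookup returns an empty sequence for sites with no connections.
        var bySite = allConnections.ToLookup(c => c.SiteId);
        return sites.ToDictionary(s => s.Id, s => bySite[s.Id].Count());
    }
}
```

This turns N+1 round-trips into two (sites, then connections). If only counts are needed, projecting the count in the database query would avoid loading the connection rows at all.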

Resolution

Unresolved.

CentralUI-013 — ScriptAnalysisService blocks on async shared-script lookups

Severity Medium
Category Concurrency & thread safety
Status Open
Location src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:951-952

Description

ResolveCalledShape calls _sharedScripts.GetShapesAsync().GetAwaiter().GetResult() to resolve a shared-script shape synchronously. GetShapesAsync ultimately hits SharedScriptService and its EF Core repository. Sync-over-async on a request thread risks thread-pool starvation under load and can deadlock if any awaited continuation needs a captured context. Hover and SignatureHelp (which call ResolveCalledShape) are themselves synchronous methods, so the blocking call is structural.

Recommendation

Make Hover and SignatureHelp async and await GetShapesAsync, or have the catalog expose a cached synchronous snapshot that is refreshed asynchronously. The IMemoryCache is already present — caching the shapes there and reading them synchronously would remove the blocking call.

Resolution

Unresolved.

CentralUI-014 — Test Run side effects (HTTP/SQL/SMTP) fire against production services

Severity Medium
Category Error handling & resilience
Status Open
Location src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:254-259; src/ScadaLink.CentralUI/ScriptAnalysis/SandboxHostHelpers.cs:26-117

Description

By design (documented in the XML comments) Test Run wires ExternalSystem, Database, and Notify to central's real IExternalSystemClient, IDatabaseGateway, and INotificationDeliveryService, so a Test Run that calls Notify.To(...).Send(...) actually emails recipients, Database.Connection(...) opens a real DB connection, and External.Call(...) makes real HTTP calls — with production-equivalent side effects. There is no dry-run mode, no confirmation, and (combined with CentralUI-001) no restriction on what a script can do. A Design user testing a draft script can dispatch real notifications or mutate external databases. The behaviour is intentional but the blast radius is not surfaced to the user.

Recommendation

At minimum, surface a clear warning in the Test Run UI that side effects are real, and require explicit opt-in for side-effecting calls. Preferably offer a dry-run mode that stubs the helpers, defaulting to dry-run.

Resolution

Unresolved.

CentralUI-015 — DialogService continuations resolve off the render thread

Severity Low
Category Concurrency & thread safety
Status Open
Location src/ScadaLink.CentralUI/ServiceCollectionExtensions.cs:24; src/ScadaLink.CentralUI/Components/Shared/DialogService.cs:18-69

Description

DialogService is AddScoped (one per circuit, correct) but ConfirmAsync/PromptAsync complete via ContinueWith(..., TaskScheduler.Default), so a caller awaiting them resumes on a thread-pool thread. Any subsequent component state mutation by the caller is then off the render thread unless the caller wraps it in InvokeAsync. Call sites are not consistently doing so, which can produce non-deterministic render glitches.

Recommendation

Either resolve continuations on the circuit's sync context or document that callers must InvokeAsync after awaiting ConfirmAsync/PromptAsync. Audit call sites for off-thread state mutation.

Resolution

Unresolved.

CentralUI-016 — Pagers render one button per page with no windowing

Severity Low
Category Performance & resource management
Status Open
Location src/ScadaLink.CentralUI/Components/Shared/DataTable.razor:62-68; src/ScadaLink.CentralUI/Components/Pages/Deployment/Deployments.razor:167-173

Description

The DataTable and Deployments paginators loop for i = 1..totalPages and emit a <li> button for every page. With a few thousand records at page size 25 that is hundreds of buttons rendered into the diff on every state change. It is not a correctness bug but degrades render performance and usability on large datasets.

Recommendation

Window the pager (first / prev / a few around current / next / last) or switch large lists to a "load more" / numeric jump input.
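
One way to compute a windowed pager is sketched below. `PagerWindow` is a hypothetical helper; the convention of returning `0` for an ellipsis gap is an assumption made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class PagerWindow
{
    // Returns the page numbers to render: first page, a window of
    // 'radius' pages either side of the current page, and the last page.
    // A 0 entry marks a gap to render as an ellipsis.
    public static IReadOnlyList<int> Build(int current, int totalPages, int radius = 2)
    {
        var pages = new List<int>();
        if (totalPages <= 0) return pages;

        current = Math.Clamp(current, 1, totalPages);
        int start = Math.Max(2, current - radius);
        int end = Math.Min(totalPages - 1, current + radius);

        pages.Add(1);
        if (start > 2) pages.Add(0);                      // gap after first page
        for (int p = start; p <= end; p++) pages.Add(p);
        if (end < totalPages - 1) pages.Add(0);           // gap before last page
        if (totalPages > 1) pages.Add(totalPages);
        return pages;
    }
}
```

With 200 pages and the user on page 10, `Build(10, 200)` yields nine entries (1, gap, 8 to 12, gap, 200) instead of 200 buttons, and the markup loop only has to special-case the `0` entries.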

Resolution

Unresolved.

CentralUI-017 — /auth/logout POST disables antiforgery, enabling logout CSRF

Severity Low
Category Security
Status Open
Location src/ScadaLink.CentralUI/Auth/AuthEndpoints.cs:127-138

Description

The POST /auth/logout endpoint calls .DisableAntiforgery(), and a plain GET /logout endpoint also signs the user out. Either can be triggered cross-site (an <img src="/logout"> or an auto-submitting form) to forcibly log a user out. Login itself reasonably disables antiforgery (pre-auth), but logout is a state-changing authenticated action and should be CSRF-protected.

Recommendation

Require an antiforgery token on POST /auth/logout (the NavMenu sign-out form can include the antiforgery token), and remove or protect the state-changing GET /logout route.

Resolution

Unresolved.

CentralUI-018 — Broad catch {} blocks swallow JS interop and storage errors silently

Severity Low
Category Error handling & resilience
Status Open
Location src/ScadaLink.CentralUI/Components/Shared/MonacoEditor.razor:116-118,123,142,164,170,176,182,189; src/ScadaLink.CentralUI/Components/Shared/TreeView.razor:129,139; src/ScadaLink.CentralUI/Components/Pages/Admin/Sites.razor:316-319

Description

Numerous try { ... } catch { } blocks swallow every exception with no logging. The prerender-time JS-unavailable case is legitimate, but these catches also hide real failures: a genuine Monaco init failure or a clipboard permission error becomes invisible. In TreeView.razor the storage-restore JsonSerializer.Deserialize (line 139) is not inside a try at all and would throw uncaught on a corrupt treeviewStorage payload. Debugging UI issues in production is then guesswork.

Recommendation

Catch the specific expected exception type (e.g. JSDisconnectedException, InvalidOperationException during prerender) and log anything else via ILogger. Wrap the TreeView storage Deserialize in its own guarded block.
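
For the TreeView storage restore, a guarded block could be sketched as follows. The payload shape (a JSON array of expanded-node keys) and the `TreeViewStorage` helper are assumptions for illustration; the `logWarning` callback stands in for an `ILogger` call.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class TreeViewStorage
{
    // Returns the stored expanded-node keys, or an empty set when the
    // payload is missing or corrupt. Only JsonException is caught, so
    // unrelated failures still surface to the caller.
    public static HashSet<string> RestoreExpandedKeys(string? json, Action<Exception>? logWarning = null)
    {
        if (string.IsNullOrWhiteSpace(json)) return new HashSet<string>();

        try
        {
            // Deserialize returns null for a literal JSON "null".
            return JsonSerializer.Deserialize<HashSet<string>>(json) ?? new HashSet<string>();
        }
        catch (JsonException ex)
        {
            logWarning?.Invoke(ex);
            return new HashSet<string>();
        }
    }
}
```

The same idea applies to the JS interop catches: name the expected exception type, log everything else, and fall back to a safe default.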

Resolution

Unresolved.

CentralUI-019 — Sparse unit-test coverage for a large module; critical paths untested

Severity Low
Category Testing coverage
Status Open
Location tests/ScadaLink.CentralUI.Tests/

Description

The module has ~65 source files but unit tests cover only the script analyzer, TreeView, schema model, and two data-connection pages. Untested critical paths include: the auth bridge (CookieAuthenticationStateProvider, AuthEndpoints), RunInSandboxAsync (timeout, recursion limit, error classification, side-effect wiring), DialogService resolution semantics, DebugView stream lifecycle and the UpsertWithCap cap logic, Health and Deployments timer behaviour, and SchemaBuilderModel round-tripping of nested schemas. Given findings CentralUI-001/003/009/010 sit on untested code, the gap is material. The Playwright suite covers login and navigation only.

Recommendation

Add bUnit/unit tests for the auth bridge, sandbox-run behaviour (including forbidden-API rejection once CentralUI-001 is fixed), dialog resolution, and the DebugView cap/lifecycle logic. Prioritise the paths named in the Critical/High findings.

Resolution

Unresolved.