docs(code-reviews): re-review batch 1 at 39d737e — CentralUI, CLI, ClusterInfrastructure, Commons, Communication

17 new findings: CentralUI-020..025, CLI-014..016, ClusterInfrastructure-009..010, Commons-013..014, Communication-012..015.
Joseph Doherty
2026-05-17 00:41:21 -04:00
parent 39d737ebd6
commit e49846603e
6 changed files with 842 additions and 52 deletions

@@ -5,10 +5,10 @@
| Module | `src/ScadaLink.CentralUI` |
| Design doc | `docs/requirements/Component-CentralUI.md` |
| Status | Reviewed |
| Last reviewed | 2026-05-16 |
| Last reviewed | 2026-05-17 |
| Reviewer | claude-agent |
| Commit reviewed | `9c60592` |
| Open findings | 0 |
| Commit reviewed | `39d737e` |
| Open findings | 6 |
## Summary
@@ -32,6 +32,24 @@ Testing coverage is thin for a module this large: only the script analyzer,
TreeView, schema model, and a few data-connection pages have unit tests; most
pages and the auth bridge are untested.
#### Re-review 2026-05-17 (commit `39d737e`)
All 19 findings from the 2026-05-16 review are confirmed closed. The resolution
batch (`a9bd7ee`..`34588ae`) substantially rewrote the auth bridge, the script
sandbox, several Deployment/Monitoring pages, and the shared component disposal
paths, so this re-review re-examined the post-fix state across all 10 checklist
categories. Six new findings (CentralUI-020 .. 025) were recorded. The most
important is **CentralUI-020**: the two prior fixes interact destructively — the
CentralUI-004 fix made `CookieAuthenticationStateProvider` return a frozen,
constructor-time auth-state snapshot, while the CentralUI-005 fix rewrote
`SessionExpiry.razor` to *poll* that same provider to detect a lapsed session.
Because the snapshot never changes for the life of the circuit, the idle-timeout
redirect can never fire, so the documented idle-logout behaviour is silently
defeated. The remaining new findings are a cross-thread `Dictionary` mutation in
`DebugView`, an unguarded `InvokeAsync` in the new `Deployments` push handler,
and three Low-severity items (residual bare `catch`, magic-string claim
lookups, and the untested `SessionExpiry` polling path).
## Checklist coverage
| # | Category | Examined | Notes |
@@ -47,6 +65,14 @@ pages and the auth bridge are untested.
| 9 | Testing coverage | ☑ | Auth, sandbox-run, DebugView, Health, ParkedMessages, most pages untested. |
| 10 | Documentation & comments | ☑ | Comments are accurate and helpful; a few stale claims noted. |
Re-review 2026-05-17 (`39d737e`): all 10 categories re-examined against the
post-fix source. New findings by category: category 3 (CentralUI-020, the
auth-snapshot vs session-poll interaction, which is also a design-adherence
regression; CentralUI-021, cross-thread `Dictionary` mutation; CentralUI-022,
unguarded `InvokeAsync`), category 4 (CentralUI-023, residual bare `catch`),
category 8 (CentralUI-024, magic-string claims), category 9 (CentralUI-025,
untested `SessionExpiry` poll). Categories 1, 2, 5, 6, 7, 10 produced no new
findings.
## Findings
### CentralUI-001 — Test Run sandbox executes arbitrary C# with no trust-model enforcement
@@ -961,3 +987,232 @@ it was failing on the baseline because `DeploymentService` had gained a
`DiffService` constructor dependency from a DeploymentManager contract change
that the test fixture had not been updated for; `DiffService` is now registered
in the fixture.)
### CentralUI-020 — Idle-session redirect never fires: `SessionExpiry` polls a frozen auth-state snapshot
| | |
|--|--|
| Severity | High |
| Category | Concurrency & thread safety |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/Components/Shared/SessionExpiry.razor:39-62`; `src/ScadaLink.CentralUI/Auth/CookieAuthenticationStateProvider.cs:29-43` |
**Description**
The CentralUI-004 fix and the CentralUI-005 fix interact destructively.
CentralUI-004 made `CookieAuthenticationStateProvider` snapshot the principal
**once** in its constructor into a cached `Task<AuthenticationState>` and serve
that exact task for the entire life of the SignalR circuit — it never re-reads
`HttpContext`, never calls `SetAuthenticationState`, and never raises
`NotifyAuthenticationStateChanged`. CentralUI-005 then rewrote
`SessionExpiry.razor` to *poll* `AuthStateProvider.GetAuthenticationStateAsync()`
once a minute and redirect to `/login` "once the sliding cookie has actually
lapsed server-side." But `GetAuthenticationStateAsync()` returns the same frozen
constructor-time snapshot on every call — `auth.User.Identity.IsAuthenticated`
is permanently `true` for the life of the circuit regardless of whether the
server-side cookie has expired. The poll loop therefore never observes an
expired session and the redirect never fires. An idle user whose cookie has
lapsed server-side keeps an authenticated-looking page open indefinitely; the
documented "30-minute idle timeout" is silently defeated for any user who
leaves a circuit open. (The cookie middleware would still reject the *next*
full HTTP request / new circuit, so this is a stale-UI / missed-logout exposure
rather than a full auth bypass — but the page continues to render
authenticated content and a SignalR circuit can stay alive for a long time.)
This is also a design-document-adherence regression against CLAUDE.md
"Security & Auth" (30-minute idle timeout) — recorded under Concurrency because
the root cause is the lifetime/staleness mismatch between the two components.
**Recommendation**
`SessionExpiry` must consult something that actually reflects the live
server-side session, not the circuit's frozen principal. Options: (a) have
`SessionExpiry` poll a lightweight authenticated server endpoint (e.g. a
`/auth/ping` minimal API that returns 401 once the cookie has lapsed) and
redirect on 401; or (b) give `CookieAuthenticationStateProvider` a refresh path
that re-validates the cookie and calls `SetAuthenticationState` /
`NotifyAuthenticationStateChanged` so the polled state can actually change.
Whichever is chosen, add a test that exercises the redirect path with an
expired session (see CentralUI-025).
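
A minimal sketch of the decision logic behind option (a), kept free of Blazor and HTTP plumbing so it can be unit-tested on its own. `SessionProbe`, `SessionProbeResult`, and the `ping` delegate are hypothetical names that do not exist in ScadaLink. The delegate is assumed to issue a cookie-authenticated request to something like the suggested `/auth/ping` endpoint and return the response status code.

```csharp
using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: none of these names exist in ScadaLink.
public enum SessionProbeResult { Active, Expired, Unknown }

public sealed class SessionProbe
{
    private readonly Func<CancellationToken, Task<HttpStatusCode>> _ping;

    // `ping` is assumed to call a cookie-authenticated endpoint
    // (e.g. /auth/ping) and return the HTTP status it answered with.
    public SessionProbe(Func<CancellationToken, Task<HttpStatusCode>> ping)
        => _ping = ping ?? throw new ArgumentNullException(nameof(ping));

    public async Task<SessionProbeResult> CheckAsync(CancellationToken ct = default)
    {
        try
        {
            var status = await _ping(ct).ConfigureAwait(false);
            var code = (int)status;

            // Only an explicit 401 counts as a lapsed session.
            if (status == HttpStatusCode.Unauthorized) return SessionProbeResult.Expired;
            if (code >= 200 && code < 300) return SessionProbeResult.Active;

            // Redirects, 5xx, etc. are inconclusive: do not log the user out.
            return SessionProbeResult.Unknown;
        }
        catch (OperationCanceledException)
        {
            throw; // let the poll loop observe cancellation
        }
        catch (Exception)
        {
            // A transient network failure is not proof of expiry.
            return SessionProbeResult.Unknown;
        }
    }
}
```

Under this sketch `SessionExpiry` would redirect to `/login` only on `Expired`. A redirect response from the server is classified as `Unknown`, so the endpoint would have to answer with 401 itself rather than bounce the request to the login page.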
**Resolution**
_Unresolved._
### CentralUI-021 — `DebugView` stream callback mutates `Dictionary` off the render thread
| | |
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/DebugView.razor:404-419,511-519,275-289` |
**Description**
The `onEvent` callback passed to `DebugStreamService.StartStreamAsync` runs on
an Akka/gRPC thread (as the design doc and the CentralUI-009 comments state). It
calls `UpsertWithCap(_attributeValues, …)` / `UpsertWithCap(_alarmStates, …)`
**directly on that thread** — the mutation is not marshalled through
`InvokeAsync`; only the subsequent `StateHasChanged` is. Meanwhile the render
thread evaluates `FilteredAttributeValues` / `FilteredAlarmStates`, which
enumerate `_attributeValues.Values` / `_alarmStates.Values` and call
`OrderBy(...).ToList()`. `Dictionary<TKey,TValue>` is not thread-safe: a write
on the Akka thread concurrent with an enumeration on the render thread can throw
`InvalidOperationException` ("Collection was modified; enumeration operation may
not execute") or corrupt the dictionary's internal buckets. The CentralUI-009
fix added a `_disposed` guard but did not address this data race — the guard
only prevents touching a *disposed* component, not concurrent access to a live
one. Under a busy debug stream this will intermittently fault the page.
**Recommendation**
Marshal the dictionary mutation onto the render thread too — move the
`UpsertWithCap` call inside the `SafeInvokeAsync`/`InvokeAsync` body so all
access to `_attributeValues`/`_alarmStates` happens on the renderer's
dispatcher. Alternatively guard both the writes and the `Filtered*` reads with a
lock, or use a concurrent collection. The cap-trim loop must be inside the same
critical section as the upsert.
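
The lock-based alternative could look roughly like the sketch below. `CappedStore` is a hypothetical stand-in: the real `UpsertWithCap` signature and its eviction order are not shown in this review, so oldest-inserted-first eviction is an assumption. The point it illustrates is that the upsert, the cap trim, and the read all take the same lock, and that readers receive a copy rather than a live view.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the DebugView state; eviction order
// (oldest inserted key first) is an assumption of this sketch.
public sealed class CappedStore<TKey, TValue> where TKey : notnull
{
    private readonly object _gate = new();
    private readonly Dictionary<TKey, TValue> _items = new();
    private readonly Queue<TKey> _insertionOrder = new();
    private readonly int _cap;

    public CappedStore(int cap)
    {
        if (cap <= 0) throw new ArgumentOutOfRangeException(nameof(cap));
        _cap = cap;
    }

    // Safe to call from the stream callback thread.
    public void Upsert(TKey key, TValue value)
    {
        lock (_gate)
        {
            // Track a key only the first time it is seen.
            if (!_items.ContainsKey(key)) _insertionOrder.Enqueue(key);
            _items[key] = value;

            // The trim runs inside the same critical section as the write.
            while (_items.Count > _cap)
                _items.Remove(_insertionOrder.Dequeue());
        }
    }

    // Safe to call from the render thread: returns a copy, so sorting
    // and enumeration happen outside the lock on private data.
    public List<TValue> Snapshot()
    {
        lock (_gate)
        {
            return _items.Values.ToList();
        }
    }
}
```

With this shape the `Filtered*` properties would sort the list returned by `Snapshot()` instead of enumerating the dictionary directly. Moving the mutation into the `InvokeAsync` body, the first option above, avoids the lock entirely and may be the simpler change.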
**Resolution**
_Unresolved._
### CentralUI-022 — `Deployments` push handler fires `InvokeAsync` with no disposal guard
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/Deployments.razor:221-229,317-322` |
**Description**
`OnDeploymentStatusChanged` is invoked by `IDeploymentStatusNotifier`, a process
singleton, on the DeploymentManager service thread. The handler does
`_ = InvokeAsync(async () => { await LoadDataAsync(); StateHasChanged(); })`,
discarding the returned task. `Dispose()` unsubscribes the handler, but there is
a race window: the notifier can read the subscriber list and begin invoking
`OnDeploymentStatusChanged` *just before* the component is disposed, so
`InvokeAsync` then runs against a disposed component and throws
`ObjectDisposedException` on the DeploymentManager thread — an unobserved task
exception (the task is fire-and-forget). The same hazard was explicitly fixed
for `DebugView` (CentralUI-009, `SafeInvokeAsync` + `_disposed` flag) and
`ToastNotification` (CentralUI-010), but the new push-based `Deployments`
handler introduced by the CentralUI-006 fix did not adopt the same guard.
Separately, every push event triggers two full repository reloads
(`GetAllInstancesAsync` + `GetAllDeploymentRecordsAsync`) in every open
circuit, so with N open circuits a burst of status changes costs 2×N repository
round-trips per event.
**Recommendation**
Add a `volatile bool _disposed` set first in `Dispose()`, have
`OnDeploymentStatusChanged` no-op when set, and wrap the `InvokeAsync` dispatch
to swallow `ObjectDisposedException` (mirror `DebugView.SafeInvokeAsync`).
Optionally coalesce bursts (debounce) and/or reload only the changed record
rather than the whole table on each event.
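
A framework-free sketch of the guard described above. `GuardedDispatcher` is a hypothetical name, and the `invokeAsync` delegate stands in for the component's own `InvokeAsync`; the actual body of `DebugView.SafeInvokeAsync` is not reproduced in this review, so this is an approximation of the pattern rather than a copy of it.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper modelling the _disposed flag + SafeInvokeAsync pattern.
public sealed class GuardedDispatcher : IDisposable
{
    private readonly Func<Func<Task>, Task> _invokeAsync;
    private volatile bool _disposed;

    // `invokeAsync` stands in for ComponentBase.InvokeAsync.
    public GuardedDispatcher(Func<Func<Task>, Task> invokeAsync)
        => _invokeAsync = invokeAsync ?? throw new ArgumentNullException(nameof(invokeAsync));

    public async Task SafeInvokeAsync(Func<Task> work)
    {
        if (_disposed) return; // cheap early exit on the caller's thread

        try
        {
            await _invokeAsync(async () =>
            {
                // Re-check on the dispatcher: disposal may have happened
                // between the outer check and this callback running.
                if (_disposed) return;
                await work();
            });
        }
        catch (ObjectDisposedException)
        {
            // The component was torn down mid-dispatch; nothing to update.
        }
    }

    // Set the flag first so in-flight notifications become no-ops.
    public void Dispose() => _disposed = true;
}
```

The push handler would then call `_ = SafeInvokeAsync(...)` in place of the bare `InvokeAsync`, so the discarded task can no longer fault with `ObjectDisposedException`. Exceptions thrown by the work itself still propagate into the task and would need their own handling.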
**Resolution**
_Unresolved._
### CentralUI-023 — Residual bare `catch {}` blocks swallow JS interop errors
| | |
|--|--|
| Severity | Low |
| Category | Error handling & resilience |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Monitoring/ParkedMessages.razor:690-698`; `src/ScadaLink.CentralUI/Components/Shared/DiffDialog.razor:107-116,118-130,104` |
**Description**
CentralUI-018 narrowed the bare `catch {}` blocks in `MonacoEditor`,
`TreeView`, and `Sites.razor`, but the same pattern survives elsewhere.
`ParkedMessages.CopyAsync` wraps `navigator.clipboard.writeText` in
`catch { _toast.ShowError("Copy failed."); }` — a real `JSException`
(clipboard permission denied) and an expected `JSDisconnectedException` are
treated identically and neither is logged. `DiffDialog.TryLockBodyAsync` /
`TryUnlockBodyAsync` each have a bare outer `catch` whose handler does another
JS call wrapped in a second bare `catch { /* swallow */ }`, and
`OnAfterRenderAsync`'s `_modalRef.FocusAsync()` is wrapped in a bare
`catch { /* prerender or detached: ignore */ }`. Genuine interop failures in
these paths are invisible in production logs.
**Recommendation**
Catch `JSDisconnectedException` silently and `JSException` (and
`InvalidOperationException` for the prerender focus case) with an `ILogger`
call, consistent with the CentralUI-018 fixes in the same module.
**Resolution**
_Unresolved._
### CentralUI-024 — Claim lookups use magic strings instead of `JwtTokenService` constants
| | |
|--|--|
| Severity | Low |
| Category | Code organization & conventions |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/Components/Layout/NavMenu.razor:102`; `src/ScadaLink.CentralUI/Components/Pages/Dashboard.razor:14`; `GetCurrentUserAsync` in `Templates.razor`, `TemplateEdit.razor`, `TemplateCreate.razor`, `SharedScripts.razor`, `SharedScriptForm.razor`, `Sites.razor`, `Topology.razor`, `InstanceCreate.razor`, `InstanceConfigure.razor` |
**Description**
`ScadaLink.Security.JwtTokenService` exposes the canonical claim-type constants
(`UsernameClaimType = "Username"`, `DisplayNameClaimType = "DisplayName"`,
`RoleClaimType`, `SiteIdClaimType`). `SiteScopeService` correctly uses
`JwtTokenService.SiteIdClaimType`, but every `GetCurrentUserAsync` helper across
ten pages does `authState.User.FindFirst("Username")?.Value`, and `NavMenu` /
`Dashboard` do `context.User.FindFirst("DisplayName")`. The literals happen to
match the constants today, so there is no live bug — but if the claim type is
ever renamed in `JwtTokenService` (the single source of truth) every one of
these call sites silently breaks, falling back to `"unknown"` for the audit
user and a blank display name. The duplicated `GetCurrentUserAsync` helper is
also copy-pasted verbatim into ten components.
**Recommendation**
Replace the string literals with `JwtTokenService.UsernameClaimType` /
`DisplayNameClaimType`. Consider extracting the repeated `GetCurrentUserAsync`
into a single shared helper (e.g. an extension on `AuthenticationStateProvider`
or a small scoped service) so the claim lookup lives in exactly one place.
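
One possible shape for the shared helper, sketched as extension methods on `ClaimsPrincipal`. `ClaimTypesSketch` is a hypothetical stand-in so the example compiles on its own; in the real module the two constants would be `JwtTokenService.UsernameClaimType` and `JwtTokenService.DisplayNameClaimType`. The method names are also assumptions. The fallbacks mirror the behaviour described above (`"unknown"` for the audit user, blank for the display name).

```csharp
using System.Security.Claims;

// Stand-in for the constants this review attributes to JwtTokenService.
public static class ClaimTypesSketch
{
    public const string Username = "Username";
    public const string DisplayName = "DisplayName";
}

// Hypothetical shared helper: the claim lookup lives in exactly one place.
public static class ClaimsPrincipalExtensions
{
    // Audit user name, falling back to "unknown" when the claim is absent.
    public static string GetUsernameOrUnknown(this ClaimsPrincipal user)
        => user.FindFirst(ClaimTypesSketch.Username)?.Value ?? "unknown";

    // Display name, falling back to an empty string when the claim is absent.
    public static string GetDisplayNameOrEmpty(this ClaimsPrincipal user)
        => user.FindFirst(ClaimTypesSketch.DisplayName)?.Value ?? string.Empty;
}
```

Each page-level `GetCurrentUserAsync` would then reduce to reading the auth state and calling `authState.User.GetUsernameOrUnknown()`, and a rename of the claim type would need to be made only where the constant is defined.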
**Resolution**
_Unresolved._
### CentralUI-025 — `SessionExpiry` polling/redirect path has no test coverage
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Open |
| Location | `tests/ScadaLink.CentralUI.Tests/Auth/SessionExpiryPolicyTests.cs`; `src/ScadaLink.CentralUI/Components/Shared/SessionExpiry.razor` |
**Description**
`SessionExpiryPolicyTests` covers only `AuthEndpoints.BuildSignInProperties()`
(the sign-in properties shape). The actual runtime behaviour of
`SessionExpiry.razor` — that an expired session triggers a redirect to
`/login`, that an authenticated session does not, and that the component does
not poll/redirect on the `/login` page itself — is untested. Had a behavioural
test exercised the redirect with an expired/anonymous auth state against the
real `CookieAuthenticationStateProvider`, the CentralUI-020 defect (the frozen
snapshot never reporting an expired session) would have been caught. The
component is the system's only client-side idle-logout mechanism, so the gap is
material.
**Recommendation**
Add bUnit tests for `SessionExpiry`: (a) with an unauthenticated auth state the
component navigates to `/login`; (b) with an authenticated state it does not;
(c) on the `/login` route it neither polls nor redirects. The provider used in
the test must be one whose state can actually transition to expired — which
also forces the CentralUI-020 fix.
**Resolution**
_Unresolved._